BuildStream and host tools

It’s been a while since I had to build a whole operating system from source. In fact I’ve mostly been working on compilers so far this year at Codethink, but my new project is to bring up some odd target systems that aren’t supported by any mainstream distro.

We did something similar about 4 years ago using Baserock and it worked well; this time we are using the Baserock OS definitions again, but with BuildStream as the build tool. I’ve not had any chance to get involved in BuildStream until now (beyond observing it), so this will be good.

The first thing I’m getting my head around is the “no host tools” policy. The design of BuildStream is that every build is run in a sandbox that’s isolated from the host. Older Baserock tools took a similar approach too and it makes a lot of sense: it’s a lot easier to maintain build instructions if you limit the set of environments in which they can run, and you are much more likely to be able to reproduce them later or on other people’s machines.

However, your sandbox is going to need a compiler and a shell environment if it’s going to be able to build anything, and BuildStream leaves open the question of where those come from. It’s simple to find a prebuilt toolchain, at least for mainstream architectures — pretty much every Linux distro can provide one — so the only questions are which one to use and how to get it into BuildStream’s sandbox.

GNOME and Freedesktop base runtime and SDK

The Flatpak project has a similar need for a controlled runtime and build environment, and produces a GNOME SDK and a lower-level Freedesktop SDK. These are at present built on top of Yocto.

Up-to-date versions of these are made available in an OSTree repo at http://sdk.gnome.org/repo. This makes it easy to import them into BuildStream using an ‘import’ element and the ‘ostree’ source:

kind: import
description: Import the base freedesktop SDK
config:
  source: files
  target: usr
host-arches:
  x86_64:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/x86_64/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 0d9d255d56b08aeaaffb1c820eef85266eb730cb5667e50681185ccf5cd7c882
  i386:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/i386/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 16036b747c1ec8e7fe291f5b1f667cb942f0267d08fcad962e9b7627d6cf1981

The main downside to using these is that they are pretty large — the GNOME 3.18 SDK weighs in at 1.5 GB uncompressed, spread across around 63,000 files. Creating a hardlink tree using `ostree checkout` takes up to a minute on my (admittedly rather old) laptop. The Freedesktop SDK is smaller but still not ideal. They are also only built for a small set of architectures — I think just some x86 and ARM families at the moment.
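For reference, here’s roughly what mirroring one of these runtimes and checking it out by hand looks like. This is just a sketch, assuming you want a local bare-user repo; the repo and checkout paths here are my own choices:

ostree --repo=sdk-mirror init --mode=bare-user
ostree --repo=sdk-mirror remote add --gpg-import=keys/gnome-sdk.gpg gnomesdk http://sdk.gnome.org/repo
ostree --repo=sdk-mirror pull gnomesdk runtime/org.freedesktop.BaseSdk/x86_64/1.4
ostree --repo=sdk-mirror checkout --user-mode runtime/org.freedesktop.BaseSdk/x86_64/1.4 sdk-checkout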

Debian in OSTree

As part of building GNOME’s jhbuild modulesets inside BuildStream, Tristan created a script to produce Debian chroots for various architectures and commit them to an OSTree repo. The GNOME components are then built on top of these base Debian images, with the idea that in future they can be tested on top of a whole variety of distros in addition to Debian, so that we catch platform-specific regressions more quickly.

The script, which uses the awesome Multistrap tool to do most of the heavy lifting, lives here and pushes its results to a repo that is temporarily housed at https://gnome7.codethink.co.uk/repo/ and signed with this key.
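I won’t reproduce the script here, but the heart of a Multistrap run is just a config file plus one command. A minimal sketch (the suite and package list are illustrative, not what the script actually uses):

[General]
arch=armhf
directory=debian-sysroot
cleanup=true
noauth=true
unpack=true
bootstrap=Debian
aptsources=Debian

[Debian]
source=http://deb.debian.org/debian
suite=stretch
packages=build-essential

Running `multistrap -f multistrap.conf` then leaves a chroot under debian-sysroot/ ready to be committed with `ostree commit`.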

The resulting sysroot is 2.7 GB in size and contains 105,320 different files. This again takes up to a minute to check out on my laptop. Like the GNOME SDK, this sysroot contains every external dependency of GNOME, which adds up to a lot of stuff.

Alpine Linux Toolchain

I want a lighter weight set of host tools to put in my build sandbox. Baserock’s OS images can be built with just a C++ toolchain and a minimal shell environment, so there’s no need to start copying gigabytes of dependencies around.

Ultimately the Baserock project could build its own set of host tools, but to save faff while prototyping things I decided to try Alpine Linux, which is a minimal distribution.

Alpine Linux provides “mini root filesystem” tarballs. These can’t be used directly: they contain device nodes (so extracting them requires privileges) and they don’t contain a toolchain.

Here’s how I produced a workable host tools sysroot. I’m using Bubblewrap (the same tool used by BuildStream to create build sandboxes) as a simple container driver to run the `apk` package tool as root without needing special host privileges. This won’t work on every OS; you can use something like Docker or plain old `chroot` instead if needed.

# Download the mini root filesystem and unpack it, skipping the device nodes:
wget https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-minirootfs-3.6.1-x86_64.tar.gz
mkdir -p sysroot
tar -x -f alpine-minirootfs-3.6.1-x86_64.tar.gz -C sysroot --exclude=./dev

# Run commands inside the sysroot as (namespaced) root, with network access:
alias alpine_exec='bwrap --unshare-all --share-net --setenv PATH /usr/bin:/bin:/usr/sbin:/sbin --bind ./sysroot / --ro-bind /etc/resolv.conf /etc/resolv.conf --uid 0 --gid 0'

# Install a C/C++ toolchain and the other tools we need:
alpine_exec apk update
alpine_exec apk add bash bc gcc g++ musl-dev make gawk gettext-dev gzip linux-headers perl e2fsprogs mtools

# Repack the result as a host tools tarball:
tar -z -c -f alpine-host-tools-3.6.1-x86_64.tar.gz -C sysroot .

This produces a 219MB host tools sysroot containing 11,636 files. This is not as minimal as you can go with a GNU C/C++ toolchain but it’s around the right order of magnitude and it checks out from BuildStream’s artifact store into the build directory in a matter of seconds.

We include gawk because it is needed during the GCC build (BusyBox awk is not enough), and gettext-dev because it is needed by GLIBC (at least, libintl.h is needed, and in Alpine only gettext provides that header). Bash is needed by scripts/config from linux.git, and bc, GNU gzip, linux-headers and Perl are also needed for building Linux. e2fsprogs and mtools are useful for creating disk images.
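As a quick sanity check that the sysroot really can compile things before handing it to BuildStream, you can reuse the `alpine_exec` alias from above (the test program is just an illustration):

alpine_exec sh -c 'echo "int main(void){return 0;}" > /tmp/test.c && gcc /tmp/test.c -o /tmp/test && /tmp/test && echo toolchain OK'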

I’ve integrated this into my builds in a pretty lazy way for now:

kind: import
description: Import an Alpine Linux C/C++ toolchain
host-arches:
  x86_64:
    sources:
    - kind: tar
      url: file:///home/sam/src/buildstream-bootstrap/alpine-host-tools-3.6.1-x86_64.tar.gz
      base-dir: .
      ref: e01d76ef2c7e3e105778e2aa849a42d38dc3163f8c15f5b2de8f64cd5543cf29

This element is obviously not something I can share with others — I’d need to upload the tarball somewhere or set up a public OSTree repo that others could pull from, and then have the element reference that.
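If I do get around to it, publishing the sysroot over OSTree would look something like this (the repo path and branch name are chosen just for illustration):

ostree --repo=tools-repo init --mode=archive-z2
ostree --repo=tools-repo commit --branch=alpine/host-tools-x86_64 --tree=dir=sysroot

The tools-repo directory can then be served over HTTP and referenced from an ‘ostree’ source, as in the SDK example above.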

However, this is just the first step towards some much deeper work which will result in me needing to move beyond Alpine in any case. In future I hope that it’ll be pretty straightforward to obtain a minimal toolchain as a sysroot that can be pulled into a sandbox using OSTree. The work required to produce such a thing is simple enough to automate but it requires a server to host the binaries which then requires ongoing maintenance for security updates, so I’m not yet going to commit to doing it …


GUADEC Talks Schedule Now Available

As mentioned on the GUADEC 2017 blog, we’ve just published the talks schedule for this year’s edition.

Thanks to everyone who submitted a talk this year — we think there’s a fantastic mix of interesting topics on each day. We sadly had to make some tough decisions and turn down some great submissions as well — we received around 70 talk submissions this year, which is a lot to fit into a 3-day, 2-track conference.

It’s now less than 6 weeks until the conference starts! If you’re planning on attending and haven’t yet registered, please register now at registration.guadec.org. It will be possible to register on the day, but we have to finalize room allocations, party venues and food orders way in advance so it will cause problems if everyone waits until the last minute to sign up.

If you’ve not yet booked a room, we have a number of rooms still available at the conference venue. You can book these when you register, or if you already registered then just log into registration.guadec.org and click ‘Edit registration’.

We also have hotel rooms available at fixed prices through Visit Manchester’s hotel booking service. The deadline for booking these rooms is Thursday 29th June.

Thanks to everyone who has opted to volunteer during the conference using the “Keep me informed” box on the registration form. We will be in touch with you very soon to discuss how you can get involved. There’s still time to edit your registration to tick this box if you’ve decided you want to help out on the day!


Manchester

Last night a suicide attack took place in Manchester, killing at least 22 people. I don’t have much to say about that, apart from that everyone’s thoughts are with those who have been injured or lost friends and family in the attack, and to quote a friend of mine:

If you think you can sow disunity in Manchester with a bomb, you don’t know Manchester.


Tracker 💙 Meson

A long time ago I started looking at rewriting Tracker’s build system using Meson. Today those build instructions landed in the master branch in Git!

Meson is becoming pretty popular now so I probably don’t need to explain why it’s such a big improvement over Autotools. Here are some key benefits:

  • It takes 2m37s for me to build from a clean Git tree with Autotools, but only 1m08s with Meson.
  • There are 2,573 lines of meson.build files, vs. 5,013 lines of Makefile.am, a 2,898-line configure.ac file, and various other bits of debris needed for Autotools.
  • Only compile warnings are written to stdout by default, so they’re easy to spot.
  • Out of tree builds actually work.

Tracker is quite a challenging project to build, and I hit a number of issues in Meson along the way plus a few traps for the unwary.

We have a huge number of external dependencies — Meson handles this pretty neatly, although autodetection of backends requires a bit of boilerplate.
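To give an idea of what I mean by boilerplate, each optional backend needs a stanza along these lines in meson.build (the FLAC names here are illustrative, and ‘conf’ is assumed to be a configuration_data() object that ends up in config.h):

flac = dependency('flac', required: false)
if flac.found()
  conf.set('HAVE_LIBFLAC', 1)
endif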

There’s a complex mix of Vala and C code in Tracker, including some libraries that are written in both. The Meson developers have put a lot of work into supporting Vala, which is much appreciated considering it’s a fairly niche language. In fact the only major problem we have left is something that’s just as broken with Autotools: failing to generate a single introspection repo for a combined C + Vala library.

Tracker also has a bunch of interdependent libraries. This caused continual problems because Meson does very little deduplication in the commandlines it generates, so I’d get combinatorial explosions hitting fairly ridiculous errors like “commandline too long” (the limit is 262KB) or “too many open files” inside the ld process. This is a known issue. For now I work around it by manually specifying some dependencies for individual targets instead of relying on them getting pulled in as transitive dependencies of a declare_dependency target.

A related issue was that if the same .vapi file ended up on the valac commandline more than once it would trigger an error. This required some trickery to avoid. Newer versions of Meson work around this issue anyway.

One pretty annoying issue is that generated files in the source tree cause Meson builds to fail. Out-of-tree builds seem not to work with our Autotools build system — something to do with the Vala integration — with the result that you need to run ‘make clean’ before running a Meson build, even if the Meson build is in a separate build dir. If you see errors about conflicting types or duplicate definitions, that’s probably the issue. While developing the Meson build instructions I had a related problem of forgetting about certain files that needed to be generated, because the Autotools build system had already generated them. Be careful!
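In practice that means cleaning the source tree before the first Meson build of an existing checkout. A sketch (careful: `git clean -dfx` deletes everything not tracked by Git):

make clean        # clear out the Autotools-generated sources, or:
git clean -dfx    # the nuclear option: removes ALL untracked files
meson build && ninja -C build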

Meson users need to be aware that the rpath is not set automatically for you. If you previously used Libtool you probably didn’t need to care what an rpath was, but with Meson you have to manually set install_rpath for every program that depends on a library that you have installed into a non-standard location (such as a subdirectory of /usr/lib). I think rpaths are a bit of a hack anyway — if you want relocatable binary packages you need to avoid them — so I like that Meson is bringing this implementation detail to the surface.
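For example, a program linking against a private library installed under a subdirectory of libdir needs something like this (the target name, source and dependency variables, and subdirectory are illustrative):

executable('tracker-store', tracker_store_sources,
  dependencies: tracker_store_deps,
  install: true,
  install_rpath: join_paths(get_option('prefix'), get_option('libdir'), 'tracker-1.0'))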

There are a few other small issues: for example, we have a Gtk-Doc build that depends on the output of a program, which Meson’s gtk-doc module currently doesn’t handle, so we have to rebuild that documentation on every build as a workaround. There are also some workarounds in the current Tracker Meson build instructions that are no longer needed — for example, installing generated Vala headers used to require a custom install script, but now it’s supported more cleanly.

Tracker’s Meson build rules aren’t quite ready for prime time: some tests fail when run under Meson that pass when run under Autotools, and we have to work out how best to create release tarballs. But it’s pretty close!

All in all this took a lot longer to achieve than I originally hoped (about 9 months of part-time effort), but in the process I’ve found some bugs in both Tracker and Meson, fixed a few of them, and hopefully made a small improvement to the long process of turning GNU/Linux users into GNU/Linux developers.

Meson has come a long way in that time and I’m optimistic for its future. It’s a difficult job to design and implement a new general-purpose build system (plus project configuration tool, test runner, test infrastructure, documentation, etc. etc.), and the Meson project have done so in 5 years without any large corporate backing that I know of. Maintaining open source projects is often hard and thankless. Ten thumbs up to the Meson team!


Night Bus: simple SSH-based build automation


My current project at Codethink has involved testing and packaging GCC on several architectures. As part of this I wanted nightly builds of ‘master’ and the GCC 7 release branch, which called for some kind of automated build system.

What I wanted was a simple CI system that could run some shell commands on different machines, check whether they failed, and save a log somewhere that can be shared publicly. Some of the build targets are obsolete proprietary OSes where modern software doesn’t work out of the box, so simplicity is key. I considered using GitLab CI, for example, but it requires a runner written in Go, which is not something I can just install on AIX. And I really didn’t have time to maintain a Jenkins instance.

So I started by trying to use Ansible as a CI system. It kind of worked, but there’s no way to get the command output streamed back to you in real time. GCC builds take hours, and its test suite can take a full day to run on an old machine, so it’s essential to be able to see how things are progressing without waiting a full day for the command to complete. If you can’t see the output, the build could be hanging somewhere and you’d not realise. I discovered that Ansible isn’t going to support this use case, so I ended up writing a new tool: Night Bus.

Night Bus is written in Python 3 and runs tasks across different machines, similarly to Ansible but with the use case of doing nightly builds and tests rather than configuration management. It provides:

  • remote task execution via SSH (using the Parallel-SSH library)
  • live logging of output to a specified directory
  • an overall report written once all tasks are done, which can contain status messages from the tasks
  • parametrization of the tasks (to e.g. build 3 branches of the same thing)
  • a support library of helper functions to make your task scripts more readable

Scheduled execution can be set up using cron or systemd. You can set up a webserver (I’m using lighttpd) on the machine that runs Night Bus to make the log output available over HTTP.
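For example, a crontab entry for a nightly run at 02:00 might look like this (the working directory and wrapper script are hypothetical; check the Night Bus README for the real invocation):

0 2 * * * cd /home/automation/nightbus && ./run-nightbus.sh >> logs/cron.log 2>&1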

You control it by creating two YAML (or JSON) files:

  • hosts describes the SSH configuration for each machine
  • tasks lists the sequence of tasks to run

Here’s an example hosts file:

host1:
  user: automation
  private_key: ssh/automation.key

host2:
  proxy_host: 86.75.30.9
  proxy_user: jenny
  proxy_private_key: ssh/jenny.key

Here’s an example tasks file:

tasks:
- name: counting-example
  commands: |
    echo "Counting to 20."
    for i in `seq 1 20`; do
      echo "Hello $i"
      sleep 1
    done

You might wonder why I didn’t just write a shell script to automate my builds, as many thousands of hackers have done in the past. Basically, I find maintaining shell scripts longer than about 10 lines to be a hateful experience. Shell is great as a “live” programming environment because it’s very flexible and quick to type, but those strengths turn into weaknesses when you’re trying to write maintainable software. Every CI system ultimately ends up with you writing shell scripts (or, if you’re really unlucky, some XML equivalent), so I don’t see any point in hiding the commands that are being run under layers of other stuff; at the same time, I want a clear separation between the tasks themselves and the support aspects like remote system access, task ordering and logging.

Night Bus is released as a random GitHub project that may never get much in the way of updates. My aim is for it to fall into the category of software that doesn’t need much ongoing work or maintenance because it doesn’t try to do anything special. If it saves one person from having to maintain a Jenkins instance then the week I spent writing it will have been worthwhile.


GUADEC call for talks ends this Sunday, 23rd April

GUADEC 2017 is just over three months away, which is a very long time in the future and leaves lots of time to organise everything (at least that’s what I keep telling myself).

However, the call for papers is closing this Sunday so if you have something you want to talk about in front of the GNOME community and you haven’t already submitted a talk then please head to the registration site and do so!

Once the call for papers closes, the Papers Committee will fetch their ceremonial robes and make their way to a cave deep in the Peak District for two weeks. There they will drink fresh spring water, hunt grouse on the moors and study your talk submissions in great detail. When the two weeks are up, their votes will be communicated back to Manchester using smoke signals, and by Sunday 7th May you’ll be notified by email if your talk was accepted. From there we can organise travel sponsorship and finalize the schedule for the first 3 days of the conference, which should be available late next month.

We’ll put a separate call out for BoF sessions, workshops, and tutorial sessions to take place during the second half of GUADEC — the 23rd April deadline only applies to talks.


GUADEC accommodation

At this year’s GUADEC in Manchester we have rooms available for you right at the venue, in lovely modern student townhouses. As I write this there are still some available to book along with your registration. In a couple of days we have to give the University final numbers for how many rooms we want, so it would help us out if everyone who wants a room there could register and book one now! We’ll have some available for later booking, but we have to pay up front for them now so we can’t reserve too many.

Rooms for sponsored attendees are reserved separately so you don’t need to book now if your attendance depends on travel sponsorship.

If you are looking for a hotel, we have a hotel booking service run by Visit Manchester where you can get the best rates from various hotels right up until June 2017. (If you need to arrive before Thursday 27th July then you can contact Visit Manchester directly for your booking at abs@visitmanchester.com.)

We have had some great talk submissions already but there is room for plenty more, so make sure you also submit your idea for a talk before 23rd April!
