How BuildStream uses OSTree

Note: In version 1.2, BuildStream stopped using OSTree to cache artifacts. It now uses a generic “Content Addressable Storage” system, implemented internally but designed to be compatible with Bazel and any other tool which supports the Remote Execution API. I’ve updated this article accordingly.

I’ve been asked a few times about the relationship between BuildStream and OSTree. The answer is a bit complicated so I decided to answer the question here.

OSTree is a content-addressed content store, inspired in many ways by Git but optimized for storing trees of binary files rather than trees of text files.

BuildStream is an integration tool which deals with trees of binary files.

I’m deliberately using the abstract term “trees of binary files” here because neither BuildStream nor OSTree limits itself to a particular use case. BuildStream itself uses the term “artifact” to describe the output of a build job, and in practice this could be the set of development headers and documentation for a library, a package file such as a .deb or .rpm, a filesystem for a whole operating system, a bootable VM disk image, or whatever else.

Anyway, let’s get to the point! There were originally four ways that BuildStream used OSTree; since version 1.2, only two of them remain.

The `ostree` source plugin

The `ostree` source plugin allows pulling arbitrary data from a remote OSTree repository. It is normally used with an `import` element as a way of importing prebuilt binaries into a build pipeline. For example, BuildStream’s integration tests currently run on top of the Freedesktop SDK binaries (which were originally intended for use with Flatpak applications but are equally useful as a generic platform runtime). The gnome-build-meta project uses this mechanism to import a prebuilt Debian base image, which is currently pushed manually to an OSTree repo (this is a temporary measure; in future we want to base gnome-build-meta on top of the upcoming Freedesktop SDK 1.8 instead).
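
If you want to poke at one of these repositories outside of BuildStream, the plain `ostree` CLI can mirror a branch into a local repo. A minimal sketch, using the Freedesktop SDK repository mentioned further down; the branch name is illustrative:

# Mirror one branch of a remote OSTree repository into a local repo.
# The branch name here is illustrative.
ostree --repo=tmp-repo init --mode=archive-z2
ostree --repo=tmp-repo remote add --no-gpg-verify freedesktop http://sdk.gnome.org/repo
ostree --repo=tmp-repo pull freedesktop runtime/org.freedesktop.BasePlatform/x86_64/1.6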

It’s also possible to import binaries using the `tar` and `local` source types, of course, and you can even use the `git` or `bzr` plugins for this if you really get off on using the wrong tool for the job.

In future we will likely add other source plugins for importing binaries, for example from the Docker Registry and perhaps using casync.

Storing artifacts locally

Once a build has completed, BuildStream needs to store the results somewhere locally. The results go in the exciting-sounding “local artifact cache”, which is usually located inside your home directory at ~/.cache/buildstream/artifacts.

BuildStream 1.0 used OSTree to store artifacts. BuildStream 1.2 and later use a generic Content Addressable Storage (CAS) implementation.

Storing artifacts remotely

As a way of saving everyone from building the same things, BuildStream supports downloading prebuilt artifacts from a remote cache.

BuildStream 1.0 used OSTree for remote storage. BuildStream 1.2 and later use the same CAS service that is used for local storage.
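
To give a feel for how this looks from a user’s point of view, here is a minimal sketch of pointing a project at a remote cache in BuildStream 1.2; the cache URL is hypothetical:

# Append a (hypothetical) remote artifact cache to the project
# configuration; set push to true if you have upload credentials.
cat >> project.conf <<'EOF'
artifacts:
  url: https://cache.example.com:11001
  push: false
EOF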

Pushing and pulling artifacts

BuildStream 1.2 and later use the CAS protocol from the Remote Execution API to transfer artifacts. This protocol is implemented using gRPC.

Indirect uses of OSTree

It may be that you also end up deploying stuff into an OSTree repository somewhere. BuildStream itself is only interested in building and integrating your project — once that is done you run `bst checkout` and are rewarded with a tree of files on your local machine. What if, let’s say, your project aims to build a Flatpak application?

Flatpak actually uses OSTree as well, and so your deployment step may involve committing those files into yet another OSTree repo, ready for Flatpak to run them. (This can be a bit long-winded at present, so some better integration will likely appear here at some point.)
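
As a rough sketch of what such a deployment step might look like (all the names here are hypothetical, and a real Flatpak app would also need its metadata in place):

# Check the built files out of BuildStream, then commit them to a
# local OSTree repo that Flatpak can pull from. Element, repo and
# branch names are all hypothetical.
bst checkout app.bst ./app-checkout
ostree --repo=./flatpak-repo init --mode=archive-z2
ostree --repo=./flatpak-repo commit \
    --branch=app/org.example.App/x86_64/master ./app-checkout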

So, is anywhere safe from the rise of OSTree or is it going to take over completely? Something you might not know about me is that I grew up outside a town in north Shropshire called Oswestry. Is that a coincidence? I can’t say.

(Image: Oswestry, from Wikipedia.)

Using BuildStream through Docker

BuildStream isn’t packaged in any distributions yet, and it’s not entirely trivial to install it yourself from source. BuildStream itself is just Python, but it depends on a modern version of OSTree (2017.8 or newer at the time of writing), with the GObject introspection bindings, which is a little annoying to have to build yourself[1].

So we have put some work into making it convenient to run BuildStream inside a Docker container. We publish an image to the Docker Hub with the necessary dependencies, and we provide a helper script named bst-here that sets up a container with the current working directory mounted at /src and then runs a BuildStream command or an interactive shell inside it. Just download the script, read it through and run it: all going well, you’ll be rewarded with an interactive Bash session where you can run the bst command. This allows users on any distro that supports Docker to run BuildStream builds in a pretty transparent way and without any major performance limitations. It even works on Mac OS X!
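
For illustration, here is roughly how a session might start (the element path is hypothetical):

# Make the downloaded script executable and start an interactive
# session, then run bst inside the container.
chmod +x bst-here
./bst-here
# ... now inside the container:
bst --version
bst build elements/base.bst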

In order to run builds inside a sandbox, BuildStream uses Bubblewrap. This requires certain kernel features, in particular CONFIG_USER_NS, which right now is not enabled by default in Arch and possibly in other distros. Docker containers run against the kernel of the host OS, so Docker doesn’t help with this issue.
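
You can check whether your kernel has user namespaces enabled before going any further; depending on the distro, one of these two files should exist:

# Either command prints CONFIG_USER_NS=y if user namespaces are
# enabled in the running kernel.
zgrep CONFIG_USER_NS /proc/config.gz
grep CONFIG_USER_NS /boot/config-$(uname -r)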

The Docker images we provide are based on Fedora and are built by GitLab CI from this repo. After a commit to that repo’s ‘master’ branch, a new image wends its way across hyperspace from GitLab to the Docker Hub. These images are then pulled when the bst-here wrapper script calls docker run. (We also use these images for running the BuildStream CI pipelines.)

More importantly, we now have a mascot! Let me introduce the BuildStream beaver:

Beavers, of course, are known for building things in streams, and are also native to Canada. This isn’t going to be the final logo; he’s just been brought in because we got tired of the project being represented by a capital letter B in a grey circle. If anyone can contribute a better one then please get in touch!

So what can you build with BuildStream now that you have it running in Docker? As recently announced, you can build GNOME! Follow this modified version of the newcomer’s guide to get started. Soon you will also be able to build Flatpak runtimes using the rewritten Freedesktop SDK, or VM images using Baserock; and of course you can create pipelines for your own projects (although if you only have a few dependencies, using Meson subprojects might be quicker).

After one year of development, we are just a few blockers away from releasing BuildStream 1.0. So it is a great time to get involved in the project!

[1]. Installing modern OSTree from source is not impossible — my advice, if you want to avoid Docker and your distro doesn’t provide a new enough OSTree, is to build the latest tagged release of OSTree from Git and configure it to install into /opt/ostree. Then put something like export GI_TYPELIB_PATH=/opt/ostree/lib/girepository-1.0/ in your shell’s startup file. Make sure you have all the necessary build dependencies installed first.
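
Roughly, that boils down to something like the following (the tag shown is the oldest version that works; pick the latest release instead):

# Build a tagged OSTree release into /opt/ostree.
git clone https://github.com/ostreedev/ostree.git
cd ostree
git checkout v2017.8   # or whatever the latest tag is
./autogen.sh --prefix=/opt/ostree
make
sudo make install
# Make the introspection bindings findable:
echo 'export GI_TYPELIB_PATH=/opt/ostree/lib/girepository-1.0/' >> ~/.bashrc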

BuildStream and host tools

It’s been a while since I had to build a whole operating system from source. In fact I’ve mostly been working on compilers this year at Codethink, but my new project is to bring up some odd target systems that aren’t supported by any mainstream distros.

We did something similar about four years ago using Baserock and it worked well; this time we are using the Baserock OS definitions again, but with BuildStream as the build tool. I haven’t had any chance to get involved in BuildStream until now (beyond observing it), so this will be good.

The first thing I’m getting my head around is the “no host tools” policy. The design of BuildStream is that every build runs in a sandbox that’s isolated from the host. Older Baserock tools took a similar approach, and it makes a lot of sense: it’s a lot easier to maintain build instructions if you limit the set of environments in which they can run, and you are much more likely to be able to reproduce them later or on other people’s machines.

However, your sandbox is going to need a compiler and a shell environment in there if it’s going to be able to build anything, and BuildStream leaves open the question of where those come from. It’s simple enough to find a prebuilt toolchain, at least for mainstream architectures — pretty much every Linux distro can provide one. The only questions are which one to use, and how to get it into BuildStream’s sandbox.

GNOME and Freedesktop base runtime and SDK

The Flatpak project has a similar need for a controlled runtime and build environment, and is producing a GNOME SDK and a lower-level Freedesktop SDK. These are at present built on top of Yocto.

Up-to-date versions of these are made available in an OSTree repo at http://sdk.gnome.org/repo. This makes it easy to import them into BuildStream using an ‘import’ element and the ‘ostree’ source:

kind: import
description: Import the base freedesktop SDK
config:
  source: files
  target: usr
host-arches:
  x86_64:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/x86_64/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 0d9d255d56b08aeaaffb1c820eef85266eb730cb5667e50681185ccf5cd7c882
  i386:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/i386/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 16036b747c1ec8e7fe291f5b1f667cb942f0267d08fcad962e9b7627d6cf1981

The main downside to using these is that they are pretty large: the GNOME 3.18 SDK weighs in at 1.5 GB uncompressed, spread across around 63,000 files. Creating a hardlink tree using `ostree checkout` takes up to a minute on my (admittedly rather old) laptop. The Freedesktop SDK is smaller, but still not ideal. They are also only built for a small set of architectures — I think just some x86 and ARM families at the moment.
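
For reference, the checkout step in question looks something like this (the repo path is wherever you mirrored the runtime to):

# Check the SDK branch out of a local repo; ostree uses hardlinks
# rather than copies wherever it can. The repo path is illustrative.
ostree --repo=./repo checkout \
    runtime/org.freedesktop.BaseSdk/x86_64/1.4 ./sdk-checkout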

Debian in OSTree

As part of building GNOME’s jhbuild modulesets inside BuildStream, Tristan created a script to produce Debian chroots for various architectures and commit them to an OSTree repo. The GNOME components are then built on top of these base Debian images, with the idea that in future they can be tested on top of a whole variety of distros in addition to Debian, to help us catch platform-specific regressions more quickly.

The script, which uses the awesome Multistrap tool to do most of the heavy lifting, lives here and pushes its results to a repo that is temporarily housed at https://gnome7.codethink.co.uk/repo/ and signed with this key.
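
The general shape of what that script does can be sketched with a minimal Multistrap run; the suite, architecture, branch name and package set here are illustrative, not what the real script uses:

# Build a small Debian sysroot with multistrap, then commit it to an
# OSTree repo. May need fakeroot or root to run.
cat > debian-sysroot.conf <<'EOF'
[General]
arch=amd64
directory=debian-sysroot
aptsources=Debian
bootstrap=Debian

[Debian]
source=http://deb.debian.org/debian
suite=stretch
packages=build-essential
EOF
multistrap -f debian-sysroot.conf
ostree --repo=repo commit --branch=debian/amd64/stretch --tree=dir=debian-sysroot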

The resulting sysroot is 2.7 GB in size, with 105,320 different files. This again takes up to a minute to check out on my laptop. Like the GNOME SDK, this sysroot contains every external dependency of GNOME, which adds up to a lot of stuff.

Alpine Linux Toolchain

I want a lighter weight set of host tools to put in my build sandbox. Baserock’s OS images can be built with just a C++ toolchain and a minimal shell environment, so there’s no need to start copying gigabytes of dependencies around.

Ultimately the Baserock project could build its own set of host tools, but to save faff while prototyping things I decided to try Alpine Linux, which is a minimal distribution.

Alpine Linux provides “mini root filesystem” tarballs. These can’t be used directly, as they contain device nodes (so require privileges to extract) and don’t contain a toolchain.

Here’s how I produced a workable host tools sysroot. I’m using Bubblewrap (the same tool used by BuildStream to create build sandboxes) as a simple container driver to run the `apk` package tool as root without needing special host privileges. This won’t work on every OS; you can use something like Docker or plain old `chroot` instead if needed.

# Fetch the Alpine mini root filesystem and unpack it, skipping the
# device nodes (which would need root privileges to create).
wget https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-minirootfs-3.6.1-x86_64.tar.gz
mkdir -p sysroot
tar -x -f alpine-minirootfs-3.6.1-x86_64.tar.gz -C sysroot --exclude=./dev

# Run commands inside the sysroot as (unprivileged) root, with network
# access so that apk can reach the package mirrors.
alias alpine_exec='bwrap --unshare-all --share-net --setenv PATH /usr/bin:/bin:/usr/sbin:/sbin --bind ./sysroot / --ro-bind /etc/resolv.conf /etc/resolv.conf --uid 0 --gid 0'
alpine_exec apk update
alpine_exec apk add bash bc gcc g++ musl-dev make gawk gettext-dev gzip linux-headers perl e2fsprogs mtools

# Pack the result up again for use as a BuildStream import source.
tar -z -c -f alpine-host-tools-3.6.1-x86_64.tar.gz -C sysroot .

This produces a 219 MB host tools sysroot containing 11,636 files. That’s not as minimal as you can go with a GNU C/C++ toolchain, but it’s around the right order of magnitude, and it checks out from BuildStream’s artifact store into the build directory in a matter of seconds.

We include gawk as it is needed during the GCC build (BusyBox awk is not enough), and gettext-dev is needed by glibc (at least, libintl.h is needed, and in Alpine only gettext provides that header). Bash is needed by scripts/config from linux.git, and bc, GNU gzip, linux-headers and Perl are also needed for building Linux. e2fsprogs and mtools are useful for creating disk images.

I’ve integrated this into my builds in a pretty lazy way for now:

kind: import
description: Import an Alpine Linux C/C++ toolchain
host-arches:
  x86_64:
    sources:
    - kind: tar
      url: file:///home/sam/src/buildstream-bootstrap/alpine-host-tools-3.6.1-x86_64.tar.gz
      base-dir: .
      ref: e01d76ef2c7e3e105778e2aa849a42d38dc3163f8c15f5b2de8f64cd5543cf29

This element is obviously not something I can share with others — I’d need to upload the tarball somewhere or set up a public OSTree repo that others could pull from, and then have the element reference that.
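
If I did go the OSTree route, publishing the sysroot would only take a couple of commands (the repo and branch names are hypothetical):

# Commit the host tools tarball to an OSTree repo which could then be
# served over plain HTTP. Repo and branch names are hypothetical.
ostree --repo=host-tools-repo init --mode=archive-z2
ostree --repo=host-tools-repo commit \
    --branch=alpine/x86_64/host-tools \
    --tree=tar=alpine-host-tools-3.6.1-x86_64.tar.gz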

However, this is just the first step towards some much deeper work which will result in me needing to move beyond Alpine in any case. In future I hope that it’ll be pretty straightforward to obtain a minimal toolchain as a sysroot that can be pulled into a sandbox using OSTree. The work required to produce such a thing is simple enough to automate but it requires a server to host the binaries which then requires ongoing maintenance for security updates, so I’m not yet going to commit to doing it …