How Tracker is tested in 2019

I became interested in the Tracker project in 2011. I was looking at media file scanning and was happy to discover an active project that was focused on the same thing. I wanted to contribute, but I found it very hard to test my changes; and since Tracker runs as a daemon I really didn’t want to introduce any crazy regressions.

In those days Tracker already had a set of tests written in Python that tested the Tracker daemons as a whole, but they were a bit unfinished and unreliable. I focused some spare-time effort on improving those. Surprisingly enough it’s taken eight years to get to the point where I’m happy with how they work.

The two biggest improvements parallel changes in many other GNOME projects. Last year Tracker stopped using GNU Autotools in favour of Meson, after a long incubation period. I probably don’t need to go into detail of how much better this is for developers. Also, we set up GitLab CI to automatically run the test suite, where previously developers and maintainers were required to run the test suite manually before merging anything. Together, these changes have made it about 100000% easier to review patches for Tracker, so if you were considering contributing code to the project I can safely say that there has never been a better time!

The Tracker project is now divided into two parts, the ‘core’ (tracker.git) and the ‘miners’ (tracker-miners.git). The core project contains the database and the application interface libraries, while the miners project contains the daemons that scan your filesystem and extract metadata from your interesting files.

Let’s look at what happens automatically when you submit a merge request on GNOME GitLab for the tracker-miners project:

  1. The .gitlab-ci.yml file specifies a Docker image to be used for running tests. The Docker images are built automatically from this project and are based on Fedora.
  2. The script in .gitlab-ci.yml clones the ‘master’ version of Tracker core.
  3. The tracker and tracker-miners projects are configured and built using Meson. There is a special build option in tracker-miners that makes it include Tracker core as a Meson subproject, instead of building against the system-provided version. (It still depends on a few files from the host at the time of writing).
  4. The script starts a private D-Bus session using dbus-run-session, sets a fixed en_US.UTF8 locale, and runs the test suite for tracker-miners using meson test.
  5. Meson runs the tests that are defined in meson.build files. It tries to run them in parallel with one test per CPU core.
  6. The libtracker-miners-common tests exercise some utility code, which is duplicated from libtracker-common in Tracker core.
  7. The libtracker-extract tests exercise libtracker-extract, which is a private library with helper code for accessing file metadata. It mainly focuses on standard metadata formats like XMP and EXIF.
  8. The functional-300-miner-basic-ops and functional-301-resource-removal tests check the operation of the tracker-miner-fs daemon, mostly by copying files in and out of a specific path and then waiting for the corresponding changes to the Tracker database to take effect.
  9. The functional-310-fts-basic test tries some full-text search operations on a text file. There are a couple of other FTS tests too.
  10. The functional/extract/* tests effectively run tracker extract on a set of real media files, and test that the expected metadata is extracted. The tests are defined by JSON files such as this one.
  11. The functional-500-writeback tests exercise the tracker-writeback daemon (which allows updating things like MP3 tags following changes in the Tracker database). These tests are not particularly thorough. The writeback feature of Tracker is not widely used, to my knowledge.
  12. Finally, the functional-600-* tests simulate the behaviour of some MeeGo phone applications. Yes, that’s how old this code is 🙂
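Put together, the CI job described above boils down to something like the following .gitlab-ci.yml sketch. The image name, build option name, and exact commands here are illustrative rather than copied from the real file:

```yaml
test:
  image: fedora-tracker-tests:latest    # hypothetical image name
  script:
    # Build tracker-miners with Tracker core as a Meson subproject.
    - git clone --depth 1 https://gitlab.gnome.org/GNOME/tracker.git subprojects/tracker
    - meson setup build -Dtracker_core=subproject   # hypothetical option name
    - ninja -C build
    # Run the tests inside a private D-Bus session with a fixed locale.
    - env LANG=en_US.UTF8 dbus-run-session -- meson test -C build
```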

There is plenty of room for more testing of course, but this list is very comprehensive when compared to the total lack of automated testing that the project had just a year ago!

How BuildStream uses OSTree

Note: In version 1.2, BuildStream stopped using OSTree to cache artifacts. It now uses a generic “Content Addressable Storage” system, implemented internally but designed to be compatible with Bazel and any other tool which supports the Remote Execution API. I’ve updated this article accordingly.

I’ve been asked a few times about the relationship between BuildStream and OSTree. The answer is a bit complicated so I decided to answer the question here.

OSTree is a content-addressed content store, inspired in many ways by Git but optimized for storing trees of binary files rather than trees of text files.

BuildStream is an integration tool which deals with trees of binary files.

I’m deliberately using the abstract term “trees of binary files” here because neither BuildStream nor OSTree limit themselves to a particular use case. BuildStream itself uses the term “artifact” to describe the output of a build job and in practice this could be the set of development headers and documentation for a library, a package file such as a .deb or .rpm, a filesystem for a whole operating system, a bootable VM disk image, or whatever else.

Anyway, let’s get to the point! There are actually two ways that BuildStream uses OSTree.

The `ostree` source plugin

The `ostree` source plugin allows pulling arbitrary data from a remote OSTree repository. It is normally used with an `import` element as a way of importing prebuilt binaries into a build pipeline. For example BuildStream’s integration tests currently run on top of the Freedesktop SDK binaries (which were originally intended for use with Flatpak applications but are equally useful as a generic platform runtime). The gnome-build-meta project uses this mechanism to import a prebuilt Debian base image, which is currently manually pushed to an OSTree repo (this is a temporary measure, in future we want to base gnome-build-meta on top of the upcoming Freedesktop SDK 1.8 instead).

It’s also possible to import binaries using the `tar` and `local` source types of course, and you can even use the `git` or `bzr` plugins for this if you really get off on using the wrong tools for the wrong job.

In future we will likely add other source plugins for importing binaries, for example from the Docker Registry and perhaps using casync.

Storing artifacts locally

Once a build has completed, BuildStream needs to store the results somewhere locally. The results go in the exciting-sounding “local artifact cache”, which is usually located inside your home directory at ~/.cache/buildstream/artifacts.

BuildStream 1.0 used OSTree to store artifacts. BuildStream 1.2 and later use a generic Content Addressable Storage (CAS) implementation.
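The core idea of content-addressed storage can be sketched in a few lines of shell: each object is stored under the hash of its own contents, so identical artifacts automatically share storage. This is only an illustration of the principle, not BuildStream’s actual on-disk layout:

```shell
#!/bin/sh
# Minimal content-addressed store: objects live at cas/<sha256-of-contents>.
set -e
mkdir -p cas

cas_add() {
    # Hash the file, copy it into the store under its digest, print the digest.
    digest=$(sha256sum "$1" | cut -d' ' -f1)
    cp "$1" "cas/$digest"
    echo "$digest"
}

printf 'hello' > artifact.bin
key=$(cas_add artifact.bin)
# Storing the same content again yields the same key, so nothing is duplicated.
echo "$key"
```

Retrieving an object is then just reading `cas/$key`; two builds that produce byte-identical output share a single stored copy.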

Storing artifacts remotely

As a way of saving everyone from building the same things, BuildStream supports downloading prebuilt artifacts from a remote cache.

BuildStream 1.0 used OSTree for remote storage. BuildStream 1.2 and later use the same CAS service that is used for local storage.

Pushing and pulling artifacts

BuildStream 1.2 and later use the CAS protocol from the Remote Execution API to transfer artifacts. This protocol is implemented using GRPC.

Indirect uses of OSTree

It may be that you also end up deploying stuff into an OSTree repository somewhere. BuildStream itself is only interested in building and integrating your project — once that is done you run `bst checkout` and are rewarded with a tree of files on your local machine. What if, let’s say, your project aims to build a Flatpak application?

Flatpak actually uses OSTree as well and so your deployment step may involve committing those files into yet another OSTree repo ready for Flatpak to run them. (This can be a bit long-winded at present so there will likely be some better integration appearing here at some point).

So, is anywhere safe from the rise of OSTree or is it going to take over completely? Something you might not know about me is that I grew up outside a town in north Shropshire called Oswestry. Is that a coincidence? I can’t say.

Oswestry, from Wikipedia.

Using BuildStream through Docker

BuildStream isn’t packaged in any distributions yet, and it’s not entirely trivial to install it yourself from source. BuildStream itself is just Python, but it depends on a modern version of OSTree (2017.8 or newer at time of writing), with the GObject introspection bindings, which is a little annoying to have to build yourself [1].

So we have put some work into making it convenient to run BuildStream inside a Docker container. We publish an image to the Docker hub with the necessary dependencies, and we provide a helper script named bst-here that sets up a container with the current working directory mounted at /src and then runs a BuildStream command or an interactive shell inside it. Just download the script, read it through and run it: all going well you’ll be rewarded with an interactive Bash session where you can run the bst command. This allows users on any distro that supports Docker to run BuildStream builds in a pretty transparent way and without any major performance limitations. It even works on Mac OS X!

In order to run builds inside a sandbox, BuildStream uses Bubblewrap. This requires certain kernel features, in particular CONFIG_USER_NS which right now is not enabled by default in Arch and possibly in other distros. Docker containers run against the kernel of the host OS so it doesn’t help with this issue.

The Docker images we provide are based on Fedora and are built by GitLab CI from this repo. After a commit to that repo’s ‘master’ branch, a new image wends its way across hyperspace from GitLab to the Docker hub. These images are then pulled when the bst-here wrapper script calls docker run. (We also use these images for running the BuildStream CI pipelines).

More importantly, we now have a mascot! Let me introduce the BuildStream beaver:

Beavers, of course, are known for building things in streams, and are also native to Canada. This isn’t going to be the final logo, he’s just been brought in as we got tired of the project being represented by a capital letter B in a grey circle. If anyone can contribute a better one then please get in touch!

So what can you build with BuildStream now that you have it running in Docker? As recently announced, you can build GNOME! Follow this modified version of the newcomer’s guide to get started. Soon you will also be able to build Flatpak runtimes using the rewritten Freedesktop SDK; or build VM images using Baserock; and of course you can create pipelines for your own projects (although if you only have a few dependencies, using Meson subprojects might be quicker).
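For comparison, the Meson subproject route mentioned above looks roughly like this; the ‘foo’ names are placeholders, not a real project:

```meson
# Use the system copy of a dependency if available, otherwise fall back to
# building the bundled copy in subprojects/foo/.
foo_dep = dependency('foo-1.0', fallback: ['foo', 'foo_dep'])

executable('myapp', 'main.c', dependencies: foo_dep)
```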

After one year of development, we are just a few blockers away from releasing BuildStream 1.0. So it is a great time to get involved in the project!

[1]. Installing modern OSTree from source is not impossible — my advice if you want to avoid Docker and your distro doesn’t provide a new enough OSTree would be to build the latest tagged release of OSTree from Git, and configure it to install into /opt/ostree. Then put something like export GI_TYPELIB_PATH=/opt/ostree/lib/girepository-1.0/ in your shell’s startup file. Make sure you have all the necessary build dependencies installed first.

BuildStream and host tools

It’s been a while since I had to build a whole operating system from source. I’ve mostly been working on compilers so far this year at Codethink in fact, but my new project is to bring up some odd target systems that aren’t supported by any mainstream distros.

We did something similar about 4 years ago using Baserock and it worked well; this time we are using the Baserock OS definitions again but with BuildStream as a build tool. I’ve not had any chance to get involved in BuildStream until now (beyond observing it) so this will be good.

The first thing I’m getting my head around is the “no host tools” policy. The design of BuildStream is that every build is run in a sandbox that’s isolated from the host. Older Baserock tools took a similar approach too and it makes a lot of sense: it’s a lot easier to maintain build instructions if you limit the set of environments in which they can run, and you are much more likely to be able to reproduce them later or on other people’s machines.

However your sandbox is going to need a compiler and a shell environment in there if it’s going to be able to build anything, and BuildStream leaves open the question of where those come from. It’s simple to find a prebuilt toolchain at least for mainstream architectures — pretty much every Linux distro can provide one. So the only questions are which one to use, and how to get it into BuildStream’s sandbox.

GNOME and Freedesktop base runtime and SDK

The Flatpak project has a similar need for a controlled runtime and build environment, and is producing a GNOME SDK and a lower-level Freedesktop SDK. These are at present built on top of Yocto.

Up to date versions of these are made available in an OSTree repo at http://sdk.gnome.org/repo. This makes it easy to import them into BuildStream using an ‘import’ element and the ‘ostree’ source:

kind: import
description: Import the base freedesktop SDK
config:
  source: files
  target: usr
host-arches:
  x86_64:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/x86_64/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 0d9d255d56b08aeaaffb1c820eef85266eb730cb5667e50681185ccf5cd7c882
  i386:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/i386/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 16036b747c1ec8e7fe291f5b1f667cb942f0267d08fcad962e9b7627d6cf1981

The main downside to using these is that they are pretty large — the GNOME 3.18 SDK weighs in at 1.5 GB uncompressed and around 63,000 files. Creating a hardlink tree using `ostree checkout` takes up to a minute on my (admittedly rather old) laptop. The Freedesktop SDK is smaller but still not ideal. They are also only built for a small set of architectures — I think just some x86 and ARM families at the moment.
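`ostree checkout` builds the tree out of hardlinks into the repository rather than copying file contents, so with ~63,000 files it is the per-file metadata work that dominates that minute. You can convince yourself that a hardlinked checkout shares storage with the original using plain `cp` (this uses `cp -al`, not OSTree itself, purely to illustrate the mechanism):

```shell
#!/bin/sh
# Demonstrate that a hardlink tree shares storage with the original.
set -e
mkdir -p tree
printf 'big file contents' > tree/data
cp -al tree tree-checkout    # hardlink "checkout" of the whole tree
# Both paths refer to the same inode, so no file data was copied.
[ tree/data -ef tree-checkout/data ] && echo "same inode"
```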

Debian in OSTree

As part of building GNOME’s jhbuild modulesets inside BuildStream Tristan created a script to produce Debian chroots for various architectures and commit them to an OSTree repo. The GNOME components are then built on top of these base Debian images, with the idea that in future they can be tested on top of a whole variety of distros in addition to Debian, helping us catch platform-specific regressions more quickly.

The script, which uses the awesome Multistrap tool to do most of the heavy lifting, lives here and pushes its results to a repo that is temporarily housed at https://gnome7.codethink.co.uk/repo/ and signed with this key.

The resulting sysroot is 2.7 GB in size, with 105,320 different files. This again takes up to a minute to check out on my laptop. Like the GNOME SDK, this sysroot contains every external dependency of GNOME which adds up to a lot of stuff.

Alpine Linux Toolchain

I want a lighter weight set of host tools to put in my build sandbox. Baserock’s OS images can be built with just a C++ toolchain and a minimal shell environment, so there’s no need to start copying gigabytes of dependencies around.

Ultimately the Baserock project could build its own set of host tools, but to save faff while prototyping things I decided to try Alpine Linux, which is a minimal distribution.

Alpine Linux provides “mini root filesystem” tarballs. These can’t be used directly as they contain device nodes (so require privileges to extract) and don’t contain a toolchain.

Here’s how I produced a workable host tools sysroot. I’m using Bubblewrap (the same tool used by BuildStream to create build sandboxes) as a simple container driver to run the `apk` package tool as root without needing special host privileges. This won’t work on every OS; you can use something like Docker or plain old `chroot` instead if needed.

wget https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-minirootfs-3.6.1-x86_64.tar.gz
mkdir -p sysroot
tar -x -f alpine-minirootfs-3.6.1-x86_64.tar.gz -C sysroot --exclude=./dev

alias alpine_exec='bwrap --unshare-all --share-net --setenv PATH /usr/bin:/bin:/usr/sbin:/sbin  --bind ./sysroot / --ro-bind /etc/resolv.conf /etc/resolv.conf --uid 0 --gid 0'
alpine_exec apk update
alpine_exec apk add bash bc gcc g++ musl-dev make gawk gettext-dev gzip linux-headers perl e2fsprogs mtools

tar -z -c -f alpine-host-tools-3.6.1-x86_64.tar.gz -C sysroot .

This produces a 219MB host tools sysroot containing 11,636 files. This is not as minimal as you can go with a GNU C/C++ toolchain but it’s around the right order of magnitude and it checks out from BuildStream’s artifact store into the build directory in a matter of seconds.

We include gawk as it is needed during the GCC build (BusyBox awk is not enough), and gettext-dev is needed by GLIBC (at least, libintl.h is needed and in Alpine only gettext provides that header). Bash is needed by scripts/config from linux.git, and bc, GNU gzip, linux-headers and Perl are also needed for building Linux. The e2fsprogs and mtools are useful for creating disk images.

I’ve integrated this into my builds in a pretty lazy way for now:

kind: import
description: Import an Alpine Linux C/C++ toolchain
host-arches:
  x86_64:
    sources:
    - kind: tar
      url: file:///home/sam/src/buildstream-bootstrap/alpine-host-tools-3.6.1-x86_64.tar.gz
      base-dir: .
      ref: e01d76ef2c7e3e105778e2aa849a42d38dc3163f8c15f5b2de8f64cd5543cf29

This element is obviously not something I can share with others — I’d need to upload the tarball somewhere or set up a public OSTree repo that others could pull from, and then have the element reference that.

However, this is just the first step towards some much deeper work which will result in me needing to move beyond Alpine in any case. In future I hope that it’ll be pretty straightforward to obtain a minimal toolchain as a sysroot that can be pulled into a sandbox using OSTree. The work required to produce such a thing is simple enough to automate but it requires a server to host the binaries which then requires ongoing maintenance for security updates, so I’m not yet going to commit to doing it …

Tracker 💙 Meson

A long time ago I started looking at rewriting Tracker’s build system using Meson. Today those build instructions landed in the master branch in Git!

Meson is becoming pretty popular now so I probably don’t need to explain why it’s such a big improvement over Autotools. Here are some key benefits:

  • It takes 2m37s for me to build from a clean Git tree with Autotools, but only 1m08s with Meson.
  • There are 2573 lines of meson.build files, vs. 5013 lines of Makefile.am, a 2898 line configure.ac file, and various other bits of debris needed for Autotools
  • Only compile warnings are written to stdout by default, so they’re easy to spot
  • Out of tree builds actually work

Tracker is quite a challenging project to build, and I hit a number of issues in Meson along the way plus a few traps for the unwary.

We have a huge number of external dependencies — Meson handles this pretty neatly, although autodetection of backends requires a bit of boilerplate.

There’s a complex mix of Vala and C code in Tracker, including some libraries that are written in both. The Meson developers have put a lot of work into supporting Vala, which is much appreciated considering it’s a fairly niche language. In fact the only major problem we have left is something that’s just as broken with Autotools: failing to generate a single introspection repo for a combined C + Vala library.

Tracker also has a bunch of interdependent libraries. This caused continual problems because Meson does very little deduplication in the command lines it generates, so I’d get combinatorial explosions hitting fairly ridiculous errors like “command line too long” (the limit is 262KB) or “too many open files” inside the ld process. This is a known issue. For now I work around it by manually specifying some dependencies for individual targets instead of relying on them getting pulled in as transitive dependencies of a declare_dependency target.
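That workaround can be sketched like this; the library and target names are made up for illustration, not Tracker’s real ones:

```meson
# The convenient way: a dependency object that transitively drags in the
# library, its include path, and everything *it* depends on.
tracker_common_dep = declare_dependency(
  link_with: libtracker_common,
  include_directories: include_directories('.'),
)

# The workaround: name the internal libraries a target needs directly,
# keeping the generated link command line short.
executable('tracker-extract', 'main.c',
  link_with: [libtracker_common, libtracker_extract],
)
```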

A related issue was that if the same .vapi file ends up on the valac commandline more than once it would trigger an error. This required some trickery to avoid. New versions of Meson work around this issue anyway.

One pretty annoying issue is that generated files in the source tree cause Meson builds to fail. Out of tree builds seem to not work with our Autotools build system — something to do with the Vala integration — with the result that you need to make clean before running a Meson build even if the Meson build is in a separate build dir. If you see errors about conflicting types or duplicate definitions, that’s probably the issue. While developing the Meson build instructions I had a related problem of forgetting about certain files that needed to be generated because the Autotools build system had already generated them. Be careful!

Meson users need to be aware that the rpath is not set automatically for you. If you previously used Libtool you probably didn’t need to care what an rpath was, but with Meson you have to manually set install_rpath for every program that depends on a library that you have installed into a non-standard location (such as a subdirectory of /usr/lib). I think rpaths are a bit of a hack anyway — if you want relocatable binary packages you need to avoid them — so I like that Meson is bringing this implementation detail to the surface.
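A sketch of what this looks like, with made-up names and a private library installed under a subdirectory of libdir:

```meson
# A private helper library, installed outside the default search path.
libhelper = shared_library('helper', 'helper.c',
  install: true,
  install_dir: join_paths(get_option('libdir'), 'myproject'),
)

# Without install_rpath the installed program would fail to find
# libhelper.so at runtime, because e.g. /usr/lib/myproject is not searched.
executable('myprog', 'main.c',
  link_with: libhelper,
  install: true,
  install_rpath: join_paths(get_option('prefix'), get_option('libdir'), 'myproject'),
)
```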

There are a few other small issues: for example we have a Gtk-Doc build that depends on the output of a program, which Meson’s gtk-doc module currently doesn’t handle so we have to rebuild that documentation on every build as a workaround. There are also some workarounds in the current Tracker Meson build instructions that are no longer needed — for example installing generated Vala headers used to require a custom install script, but now it’s supported more cleanly.

Tracker’s Meson build rules aren’t quite ready for prime time: some tests fail when run under Meson that pass when run under Autotools, and we have to work out how best to create release tarballs. But it’s pretty close!

All in all this took a lot longer to achieve than I originally hoped (about 9 months of part-time effort), but in the process I’ve found some bugs in both Tracker and Meson, fixed a few of them, and hopefully made a small improvement to the long process of turning GNU/Linux users into GNU/Linux developers.

Meson has come a long way in that time and I’m optimistic for its future. It’s a difficult job to design and implement a new general purpose build system (plus project configuration tool, test runner, test infrastructure, documentation, etc. etc), and the Meson project have done so in 5 years without any large corporate backing that I know of. Maintaining open source projects is often hard and thankless. Ten thumbs up to the Meson team!

Night Bus: simple SSH-based build automation


My current project at Codethink has involved testing and packaging GCC on several architectures. As part of this I wanted nightly builds of ‘master’ and the GCC 7 release branch, which called for some kind of automated build system.

What I wanted was a simple CI system that could run some shell commands on different machines, check if they failed, and save a log somewhere that can be shared publically. Some of the build targets are obsolete proprietary OSes where modern software doesn’t work out of the box so simplicity is key. I considered using GitLab CI, for example, but it requires a runner written in Go, which is not something I can just install on AIX. And I really didn’t have time to maintain a Jenkins instance.

So I started by trying to use Ansible as a CI system, and it kind of worked but the issue is that there’s no way to get the command output streamed back to you in real time. GCC builds take hours and its test suite can take a full day to run on an old machine so it’s essential to be able to see how things are progressing without waiting a full day for the command to complete. If you can’t see the output, the build could be hanging somewhere and you’d not realise. I discovered that Ansible isn’t going to support this use case and so I ended up writing a new tool: Night Bus.

Night Bus is written in Python 3 and runs tasks across different machines, similarly to Ansible but with the use case of doing nightly builds and tests as opposed to configuration management. It provides:

  • remote task execution via SSH (using the Parallel-SSH library)
  • live logging of output to a specified directory
  • an overall report written once all tasks are done, which can contain status messages from the tasks
  • parametrization of the tasks (to e.g. build 3 branches of the same thing)
  • a support library of helper functions to make your task scripts more readable

Scheduled execution can be set up using cron or systemd. You can set up a webserver (I’m using lighttpd) on the machine that runs Night Bus to make the log output available over HTTP.

You control it by creating two YAML (or JSON) files:

  • hosts describes the SSH configuration for each machine
  • tasks lists the sequence of tasks to run

Here’s an example hosts file:

host1:
  user: automation
  private_key: ssh/automation.key

host2:
  proxy_host: 86.75.30.9
  proxy_user: jenny
  proxy_private_key: ssh/jenny.key

Here’s an example tasks file:

tasks:
- name: counting-example
  commands: |
    echo "Counting to 20."
    for i in `seq 1 20`; do
      echo "Hello $i"
      sleep 1
    done

You might wonder why I didn’t just write a shell script to automate my builds as many thousands of hackers have done in the past. Basically I find maintaining shell scripts over about 10 lines to be a hateful experience. Shell is great as a “live” programming environment because it’s very flexible and quick to type. But those strengths turn into weaknesses when you’re trying to write maintainable software. Every CI system ultimately ends up with you writing shell scripts (or if you’re really unlucky, some XML equivalent) so I don’t see any point hiding the commands that are being run under layers of other stuff, but at the same time I want a clear separation between the tasks themselves and the support aspects like remote system access, task ordering, and logging.

Night Bus is released as a random GitHub project that may never get much in the way of updates. My aim is for it to fall into the category of software that doesn’t need much ongoing work or maintenance because it doesn’t try to do anything special. If it saves one person from having to maintain a Jenkins instance then the week I spent writing it will have been worthwhile.

CMake: dependencies between targets and files and custom commands

As I said in my last post about CMake, targets are everything in CMake. Unfortunately, not everything is a target though!

If you’ve tried to do anything non-trivial in CMake using the add_custom_command() command, you may have got stuck in this horrible swamp of confusion. If you want to generate some kind of file at build time, in some manner other than compiling C or C++ code, then you need to use a custom command to generate the file. But files aren’t targets, and they have all sorts of exciting limitations to make you forget everything you ever knew about dependency management.

What makes it so hard is that there’s not one limitation, but several. Here is a hopefully complete list of things you might want to do in CMake that involve custom commands and custom targets depending on each other, and some explanations as to why things don’t work the way that you might expect.

1. Dependencies between targets

This is CMake at its simplest (and best).

cmake_minimum_required(VERSION 3.2)

add_library(foo foo.c)

add_executable(bar bar.c)
target_link_libraries(bar foo)

You have a library, and a program that depends on it. When you run the build, both of them get built. Ideal! This is great!

What is “all” in the dependency graph? It’s a built-in target, and it’s the default target. There are also “install” and “test” targets built in (but no “clean” target).

2. Custom targets

If your project is a good one then maybe you use a documentation tool like GTK-Doc or Doxygen to generate documentation from the code.

This is where add_custom_command() enters your life. You may live to regret ever letting it in.

cmake_minimum_required(VERSION 3.2)

add_custom_command(
    OUTPUT
        docs/doxygen.stamp
    DEPENDS
        docs/Doxyfile
    COMMAND
        doxygen docs/Doxyfile
    COMMAND
        cmake -E touch docs/doxygen.stamp
    COMMENT
        "Generating API documentation with Doxygen"
    VERBATIM
    )

We have to create a ‘stamp’ file because Doxygen generates lots of different files, and we can’t really tell CMake what to expect. But actually, here’s what to expect: nothing! If you build this, you get no output. Nothing depends on the documentation, so it isn’t built.

So we need to add a dependency between docs/doxygen.stamp and the “all” target. How about using add_dependencies()? No, you can’t use that with any of the built-in targets. But as a special case, you can pass ALL to add_custom_target() to create a new target attached to the “all” target:

add_custom_target(
    docs ALL
    DEPENDS docs/doxygen.stamp
    )


In practice, you might also want to make the custom command depend on all your source code, so it gets regenerated every time you change the code. Or, you might want to remove the ALL from your custom target, so that you have to explicitly run make docs to generate the documentation.

This is also discussed here.
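Making the custom command depend on the source code, as suggested above, just means extending its DEPENDS list. The glob here is for brevity; listing the files explicitly is more robust, since a glob is only re-evaluated when CMake reruns:

```cmake
# Re-run Doxygen whenever a header or source file changes.
file(GLOB_RECURSE doc_sources "${CMAKE_SOURCE_DIR}/src/*.c" "${CMAKE_SOURCE_DIR}/src/*.h")

add_custom_command(
    OUTPUT docs/doxygen.stamp
    DEPENDS docs/Doxyfile ${doc_sources}
    COMMAND doxygen docs/Doxyfile
    COMMAND cmake -E touch docs/doxygen.stamp
    COMMENT "Generating API documentation with Doxygen"
    VERBATIM
    )
```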

3. Custom commands in different directories

Another use case for add_custom_command() is generating source code files using 3rd party tools.

### Toplevel CMakeLists.txt
cmake_minimum_required(VERSION 3.2)

add_subdirectory(src)
add_subdirectory(tests)


### src/CMakeLists.txt
add_custom_command(
    OUTPUT
        ${CMAKE_CURRENT_BINARY_DIR}/foo.c
    COMMAND
        cmake -E echo "Generate my C code" > foo.c
    VERBATIM
    )


### tests/CMakeLists.txt
add_executable(
    test-foo
        test-foo.c ${CMAKE_CURRENT_BINARY_DIR}/../src/foo.c
    )

add_test(
    NAME test-foo
    COMMAND test-foo
    )

How does this work? Actually it doesn’t! You’ll see the following error when you run CMake:

CMake Error at tests/CMakeLists.txt:1 (add_executable):
  Cannot find source file:

    /home/sam/point3/build/src/foo.c

  Tried extensions .c .C .c++ .cc .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp
  .hxx .in .txx
CMake Error: CMake can not determine linker language for target: test-foo
CMake Error: Cannot determine link language for target "test-foo".

Congratulations, you’ve hit bug 14633! The fun thing here is that generated files don’t behave anything like targets. Actually they can only be referenced in the file that contains the corresponding add_custom_command() call. So when we refer to the generated foo.c in tests/CMakeLists.txt, CMake actually has no idea where it could come from, so it raises an error.


As the corresponding FAQ entry describes, there are two things you need to do to work around this limitation.

The first is to wrap your custom command in a custom target. Are you noticing a pattern yet? Most of the workarounds here are going to involve wrapping custom commands in custom targets. In src/CMakeLists.txt, you do this:

add_custom_target(
    generate-foo
    DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/../src/foo.c
    )

Then, in tests/CMakeLists.txt, you can add a dependency between “test-foo” and “generate-foo”:

add_dependencies(test-foo generate-foo)

That’s enough to ensure that foo.c now gets generated before the build of test-foo begins, which is obviously important. If you try to run CMake now, you’ll hit the same error, because CMake still has no idea where that generated foo.c file might come from. The workaround here is to manually set the GENERATED source file property:

set_source_files_properties(
    ${CMAKE_CURRENT_BINARY_DIR}/../src/foo.c
    PROPERTIES GENERATED TRUE
    )


Note that this is a bit of a contrived example. In most cases, the correct solution is to do this:

### src/CMakeLists.txt
add_library(foo foo.c)

### tests/CMakeLists.txt
target_link_libraries(test-foo foo)

Then you don’t have to worry about any of the nonsense above, because libraries are proper targets, and you can use them anywhere.

Even if it’s not practical to make a library containing ‘foo.c’, there must be some other target in the directory where it is generated that compiles or links against it. So instead of creating a “generate-foo” target, you can make “test-foo” depend on whatever other target uses “foo.c”.
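As a sketch of that alternative (the “foo-app” target and its main.c are invented names for illustration), suppose src/CMakeLists.txt already builds a program from the generated file; the test then only needs a target-level dependency on that program:

```cmake
# src/CMakeLists.txt -- a hypothetical program that compiles the
# generated foo.c; building it carries the file-level dependency on
# the add_custom_command() output.
add_executable(foo-app main.c ${CMAKE_CURRENT_BINARY_DIR}/foo.c)

# tests/CMakeLists.txt -- piggy-back on that existing target instead
# of creating a separate "generate-foo" wrapper target.
add_dependencies(test-foo foo-app)
```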

4. Custom commands and parallel make

I ran into this issue while doing something pretty unusual with CMake: wrapping a series of Buildroot builds. Imagine my delight at discovering that, when parallel make was used, my CMake-generated Makefile was running the same Buildroot build multiple times at the same time! That is not what I wanted!

It turns out this is a pretty common issue. The crux of it is that with the “Unix Makefiles” backend, multiple toplevel targets run as independent, parallel make processes. Files aren’t targets, and unless something is a target it doesn’t get propagated around like you would expect.

Here is the test case:

cmake_minimum_required(VERSION 3.2)

add_custom_command(
    OUTPUT gen
    COMMAND sleep 1
    COMMAND cmake -E echo Hello > gen
    )

add_custom_target(
    my-all-1 ALL DEPENDS gen
    )

add_custom_target(
    my-all-2 ALL DEPENDS gen
    )

If you generate a Makefile from this and run make -j 2, you’ll see the following:

Scanning dependencies of target my-all-2
Scanning dependencies of target my-all-1
[ 50%] Generating gen
[100%] Generating gen
[100%] Built target my-all-2
[100%] Built target my-all-1

If creating ‘gen’ takes a long time, then you really don’t want it to happen multiple times! It may even cause disasters, for example running make twice at once in the same Buildroot build tree is not pretty at all.


As explained in bug 10082, the solution is (guess what!) to wrap the custom command in a custom target!

add_custom_target(make-gen DEPENDS gen)


Then you change the custom targets to depend on “make-gen”, instead of the file ‘gen’. Except! Be careful when doing that — because there is another trap waiting for you!
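Applied to the test case above, that change would look something like this (a sketch):

```cmake
# Wrap the custom command in a single custom target...
add_custom_target(make-gen DEPENDS gen)

# ...and make the toplevel targets depend on that target rather than
# on the file 'gen', so the custom command runs in only one of the
# parallel make processes.
add_custom_target(my-all-1 ALL)
add_dependencies(my-all-1 make-gen)

add_custom_target(my-all-2 ALL)
add_dependencies(my-all-2 make-gen)
```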

5. File-level dependencies of custom targets are not propagated

If you read the documentation of add_custom_command() closely, and you look at the DEPENDS keyword argument, you’ll see this text:

If DEPENDS specifies any target (created by the add_custom_target(), add_executable(), or add_library() command) a target-level dependency is created to make sure the target is built before any target using this custom command. Additionally, if the target is an executable or library a file-level dependency is created to cause the custom command to re-run whenever the target is recompiled.

This sounds quite nice, like more or less what you would expect. But the important bit of information here is what CMake doesn’t do: when a custom target depends on another custom target, all the file-level dependencies are completely ignored.

Here’s your final example for the evening:

cmake_minimum_required(VERSION 3.2)

set(SPECIAL_TEXT foo)

add_custom_command(
    OUTPUT gen1
    COMMAND cmake -E echo ${SPECIAL_TEXT} > gen1
    )

add_custom_target(
    gen1-wrapper
    DEPENDS gen1
    )

add_custom_command(
    OUTPUT gen2
    DEPENDS gen1-wrapper
    COMMAND cmake -E copy gen1 gen2
    )

add_custom_target(
    all-generated ALL
    DEPENDS gen2
    )

This is subtly wrong, even though you did what you were told, and wrapped the custom command in a custom target.

The first time you build it:

Scanning dependencies of target gen1-wrapper
[ 50%] Generating gen1
[ 50%] Built target gen1-wrapper
Scanning dependencies of target all-generated
[100%] Generating gen2
[100%] Built target all-generated

But then touch the file ‘gen1’, or overwrite it with some other text, or change the value of SPECIAL_TEXT in CMakeLists.txt to something else, and you will see this:

[ 50%] Generating gen1
[ 50%] Built target gen1-wrapper
[100%] Built target all-generated

There’s no file-level dependency created between ‘gen2’ and ‘gen1’, so ‘gen2’ never gets updated, and things get all weird.


You can’t just depend on gen1 instead of gen1-wrapper, because it may end up being built multiple times! See the previous point. Instead, you need to depend on the “gen1-wrapper” target and the file itself:

add_custom_command(
    OUTPUT gen2
    DEPENDS gen1-wrapper gen1
    COMMAND cmake -E copy gen1 gen2
    )

As the documentation says, this only applies to targets wrapping add_custom_command() output. If ‘gen1’ were a library created with add_library(), things would work how you expect.
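For comparison, here is a sketch of the library case (the source file name is made up): because ‘gen1’ would then be a real library target, CMake tracks both the target-level and the file-level dependency for you.

```cmake
# A proper target: libraries carry file-level dependency information.
add_library(gen1 STATIC gen1-source.c)

add_custom_command(
    OUTPUT gen2
    DEPENDS gen1   # re-runs whenever the library is rebuilt
    COMMAND cmake -E copy $<TARGET_FILE:gen1> gen2
    )
```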


Conclusion

Maybe I just have a blunt head, but I found all of this quite difficult to work out. I can understand why CMake works this way, but I think there is plenty of room for improvement in the documentation where this is explained. Hopefully this guide has gone some way to making things clearer.

If you have any other dependency-related traps in CMake that you’ve hit, please comment and I’ll add them to this list…

Some CMake tips

I spent the past few weeks converting a bunch of Make and Autotools-based modules to use CMake instead. This was my first major outing with CMake. Maybe there will be a few blog posts on that subject!

In general I think CMake has a sound design and I quite want to like it. It seems like many of its warts are due to its long history and the need for backwards compatibility, not anything fundamentally broken. To keep a project going for 16 years is impressive and it is pretty widely used now. This is a quick list of things I found in CMake that confused me to start with but ultimately I think are good things.

  1. Targets are everything

    CMake is pretty similar to normal make in that all the things that you care about are ‘targets’. Libraries are targets, programs are targets, subdirectories are targets and custom commands create files which are considered targets. You can also create custom targets which run commands when executed. You need to use the custom targets feature if you want a custom command’s output to be tied to the default target, which is a little confusing but works OK.

    Targets have properties, which are useful.

  2. Absolute paths to shared libraries

    Traditionally you link to libfoo by passing -lfoo to the linker. Then, if libfoo is in a non-standard location, you pass -L/path/to/foo -lfoo. I don’t think pkg-config actually enforces this pattern but pretty much all the .pc files I have installed use the -L/path -lname pattern.

    CMake makes this quite awkward to do, because it makes every effort to forget about the linker paths. Library ‘targets’ in CMake keep track of associated include paths, dependent libraries, compile flags, and even extra source files, using ‘target properties’. There’s no target property for LINK_DIRECTORIES, though, so outside of the current CMakeLists.txt file they won’t be tracked. There is a global LINK_DIRECTORIES property, confusingly, but it’s specifically marked as “for debugging purposes.”

    So the recommended way to link to libraries is with the absolute path. Which makes sense! Why say with two commandline arguments what you can say with one?

    At least, this will be fine once CMake’s pkg-config integration returns absolute paths to libraries.
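    A minimal sketch of the absolute-path style (the paths and names here are invented): find_library() returns an absolute path, which you pass straight to target_link_libraries().

```cmake
# Locate libfoo; FOO_LIBRARY ends up holding an absolute path such as
# /opt/foo/lib/libfoo.so rather than a -L/-l pair.
find_library(FOO_LIBRARY foo PATHS /opt/foo/lib)

add_executable(myapp main.c)
target_link_libraries(myapp ${FOO_LIBRARY})
```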

  3. Semicolon safety instead of whitespace safety

    CMake has a ‘list’ type which is actually a string with ; (semicolon) used to delimit entries. Spaces are used as an argument separator, but converted to semicolons during argument parsing, I think. Crucially, they seem to be converted before variable expansion is done, which means that filenames with spaces don’t need any special treatment. I like this more than shell code where I have to quote literally every variable (or else Richard Maw shouts at me).

    For example:

    cmake_minimum_required(VERSION 3.2)
    
    set(path "filename with spaces")
    
    set(command ls ${path})
    
    foreach(item ${command})
        message(item: ${item})
    endforeach()

    Output:

    item:ls
    item:filename with spaces

    On the other hand:

    cmake_minimum_required(VERSION 3.2)
    
    set(path "filename;with\;semicolons")
    
    set(command ls ${path})
    
    foreach(item ${command})
        message(item: ${item})
    endforeach()

    Output:

    item:ls
    item:filename
    item:with
    item:semicolons

    Semicolons occur less often in file names, I guess. Most of us are trained to avoid spaces, partly because we know how broken (all?) most shell-based build systems are in those cases. CMake hasn’t actually solved this but just punted the special character to a less often used one, as far as I can see. I guess that’s an improvement? Maybe?

    The semi-colon separator can bite you in other ways, for example, when specifying CMAKE_PREFIX_PATH (library and header search path) you might expect this to work:

    cmake . -DCMAKE_PREFIX_PATH=/opt/path1:/opt/path2

    However, that won’t work (unless you did actually mean that to be one item). Instead, you need to pass this:

    cmake . -DCMAKE_PREFIX_PATH=/opt/path1\;/opt/path2

    Of course, ; is a special character in UNIX shells so must be escaped.

  4. Ninja instead of Make

    CMake supports multiple backends, and Ninja is often faster than GNU Make, so give the Ninja backend a try: cmake -G Ninja.
  5. Policies

    The CMake developers seem pretty good at backwards compatibility. To this end they have introduced the rather obtuse policies framework. The great thing about the policies framework is that you can completely ignore it, as long as you have cmake_minimum_required(VERSION 3.3) at the top of your toplevel CMakeLists.txt. You’ll only need it once you have a massive bank of existing CMakeLists.txt files and you are getting started on porting them to a newer version of CMake.

    Quite a lot of CMake error messages are worded to make you think like you might need to care about policies, but don’t be fooled. Mostly these errors are for situations where there didn’t use to be an error, I think, and so the policy exists to bring back the ‘old’ behaviour, if you need it.
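For illustration, restoring an old behaviour looks something like this. CMP0046, which governs whether add_dependencies() on a non-existent target is an error, is just one example of a policy:

```cmake
# Only needed when porting a pile of old CMakeLists.txt files: request
# the pre-3.0 behaviour where dependencies on unknown targets were
# silently ignored instead of raising an error.
cmake_policy(SET CMP0046 OLD)
```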

If a tool is weird but internally consistent, I can get on with it. Hopefully, CMake is getting there. I can see there have been a lot of good improvements since CMake 2.x, at least. And at no point so far has it made me more angry than GNU Autotools. It’s not crashed at all (impressive given it’s entirely C++ code!). And it is significantly faster and more widely applicable than Autotools or artisanal craft Makefiles. So I’ll be considering it in future. But I can’t help wishing for a build system that I actually liked.

Edit: you might also be interested in a list of common CMake antipatterns.

My first Ansible modules

This week I had my first go at writing modules for Ansible. I’ve been a fan of Ansible for a while now, I find it really practical and I think the developers made some very sensible compromises when making it. Writing a module for it has been almost painless, which is a very rare thing in the programming world. The only real surprise was that it was so easy.
What I have written is some Ansible modules to administer a Gerrit instance. It seems that while there are a few Ansible roles online already for deploying a Gerrit instance, there are none that automate the step after deployment, when you need to configure your groups, projects, and special accounts.

I owe several beers to the developers of the Gerrit REST API which is powerful enough to do everything I needed to do, and has complete and accurate documentation, as well as to the developers of Ansible. The repo so far contains less than 800 lines of code, but is hopefully complete enough to let me define the initial configuration of the Baserock Gerrit instance with just a few files in a Git repo.

Developing with Packer and Docker

I spent last Friday looking at setting up an OpenID provider for the Baserock project. This is kind of a prerequisite to having our own Storyboard instance, which we’d quite like to use for issue and task tracking.

I decided to start by using some existing tools that have nothing to do with Baserock. Later we can move the infrastructure over to using Baserock, to see what it adds to the process. We have spent quite a lot of time “eating our own dogfood” and while it’s a good idea in general, a balanced diet contains more than dogfood.

The Baserock project has an OpenStack tenancy at DataCentred which can host our public infrastructure. The goal is to deploy my OpenID provider system there. However, for prototyping it’s much easier to use a container on my laptop because I don’t need to be working across an internet connection.

The morph deploy command in Baserock allows deploying systems in multiple ways and I think it’s pretty cool. Another tool which seems to do this is Packer.

Differences between `morph deploy` and Packer

Firstly, I notice Packer takes JSON as input, where Morph uses YAML. So with Packer I have to get the commas and brackets right and can’t use comments. That’s a bit mean!

A bigger difference is that Morph treats ‘building and configuring’ the image separately from ‘writing the image somewhere.’ By contrast, Packer ties ‘builder’ and ‘type of hosting infrastructure’ together. In my case I need to use the Docker builder for my prototype and the OpenStack builder for the production system. There can be asymmetry here: in Docker my Fedora 20 base comes from the Docker registry, but for OpenStack I created my own image from the Fedora 20 cloud image. I could have created my own Docker image too if I didn’t trust the semi-official Fedora image, though.

The downside to the `morph deploy` approach is that it can be less efficient. To deploy to OpenStack, Morph first unpacks a tarball locally, then runs various ‘configuration’ extensions on it, then converts it to a disk image and uploads it to OpenStack. Packer starts by creating the OpenStack instance and then configures it ‘live’ via SSH. This means it doesn’t need to unpack and repack the system image, which can be slow if the system being deployed is large.

Packer has separate types of ‘provisioner’ for different configuration management frameworks. This is handy, but the basic ‘shell’ and ‘file’ provisioners actually give you all you need to implement the others. The `morph deploy` command doesn’t have any special helpers for different configuration management tools, but that doesn’t prevent one from using them.

Building Packer

Packer doesn’t seem to be packaged for Fedora 20, which I use as my desktop system, so I had a go at building it from source:

    sudo yum install golang
    mkdir gopath && cd gopath
    export GOPATH=`pwd`
    go get -u github.com/mitchellh/gox
    git clone https://github.com/mitchellh/packer \
src/github.com/mitchellh/packer
    cd src/github.com/mitchellh/packer    
    make updatedeps
    make dev

There’s no `make install`, but you can run the tool as `$GOPATH/bin/packer`. Note there’s also a program called `packer` provided by the CrackLib package in /usr/sbin on Fedora 20, which can cause a bit of confusion!

Prototyping with Docker

I’m keen on use of Docker for prototyping and for managing containers in general. I’m not really sold on the idea of the Dockerfile, as it seems like basically a less portable incarnation of shell scripting. I do think it’s cool that Docker takes a snapshot after each line of the file is executed, but I’ve no idea if this is useful in practice. I’d much prefer to use a configuration management system like Ansible. And rather than installing packages every time I deploy a system, it’d be nice to just build the right one to begin with, like I can do with Morph in Baserock.

Using Packer’s Docker builder doesn’t require me to write a Dockerfile, and as the Packer documentation points out, all of the configuration that can be described in a Dockerfile can also be specified as arguments to the docker run command.

As a Fedora desktop user it makes sense to use Fedora for now as my Docker base image. So I started with this template.json file:

    {
        "builders": [
            {
                "type": "docker",
                "image": "fedora:20",
                "commit": true
            }
        ],
        "post-processors": [
            {
                "type": "docker-tag",
                "repository": "baserock/openid-provider",
                "tag": "latest"
            }
        ]
    }

I ran packer build template.json and (after some confusion with /usr/sbin/packer) waited for a build. Creating my container took less than a minute including downloading the Fedora base image from the Docker hub. Nice!

I could then enter my image with docker run -i -t and check out my new generic Fedora 20 system.

Initially I thought I’d use Vagrant, which is a companion tool to Packer, to get a development environment set up. That would have required me to use VirtualBox rather than Docker for my development deployment, though, which would be much slower and more memory-hungry than a container. I realised that all I really wanted was the ability to share the Git repo I was developing things in between my desktop and my test deployments anyway, which could be achieved with a Docker volume just as easily.

Edit: I since found out that there are about four different ways to use Vagrant with Docker, but I’m going to stick with my single docker run command for now

So I ended up with the following commandline to enter my development environent:

    docker run -i -t --rm \
        --volume=`pwd`:/src/test-baserock-infrastructure \
        baserock/openid-provider

The only issue is that because I’m running as ‘root’ inside the container, files from the development Git repo that I edit inside the container become owned by root in my home directory. It’s no problem to always edit and run Git from my desktop, though (and since the container system lacks both vim and git, it’s easy to remember!).

Running a web service

I knew of two OpenID providers I wanted to try out. Since Baserock is mostly a Python shop I thought the first one to try should be the Python-based Django OpenID Provider (the alternative being Masq, which is for Ruby on Rails).

Fedora 20 doesn’t ship with Django so the first step was to install it in the system. The easiest (though far from the best) way is using the Packer ‘shell’ provisioner and running the following:

    yum install python-pip
    pip install django

Next step was to follow the Django tutorial and get a demo webserver running. Port forwarding is nobody’s friend: it took me a bit of time to be able to talk to the webserver (which was in a container) from my desktop. The Django tutorial advises running the server on 127.0.0.1:80 but this doesn’t make so much sense in a container. Instead, I ran the server with the following:

    python ./manage.py runserver 0.0.0.0:80

And I ran the container as follows:

    docker run -i -t --rm \
        --publish=127.0.0.1:80:80 \
        --volume=`pwd`:/src/test-baserock-infrastructure \
        baserock/openid-provider

So inside the container the web server listens on all interfaces, but it’s forwarded only to localhost on my desktop, so that other computers can’t connect to my rather insecure test webserver.

I then spent a while learning Django and setting up Django OpenID Provider in my Django project. It was actually quite easy and fun! Eventually I got to the point where I wanted to put my demo OpenID server on the internet, so I could test it against a real OpenID consumer.

Deploying to OpenStack

Deploying to OpenStack with Packer proved a bit more tricky than deploying to Docker. It turns out that fields like ‘source_image’, ‘flavor’ and ‘networks’ take IDs rather than names, which is a pain (although I understand the reason). I had to give my instance a floating IP, too, as Packer needs to contact it via SSH and I’m deploying from my laptop, which isn’t on the cloud’s internal network. It took a while to get a successful deployment but we got there.

I found that using the “files” provisioner to copy in the code of the Django application didn’t work: the files were there in the resulting system, but corrupted. It may be better to try and deploy this from Git anyway, but I’m a little confused what went wrong there.

Edit: I found that there was quite a bit of corruption in the files that were added during provisioning, and while I didn’t figure out the cause, I did find that adding a call to ‘sleep 10’ as the first step in provisioning made the issue go away. Messy.

I’m fairly happy with what I managed to get done in a single day: we now have a way of developing and deploying infrastructure which requires minimal fluff in the Git repository and a pretty quick turnaround time. As for the Django OpenID provider, I’ve not yet managed to get it to serve me an OpenID — I guess next Friday I shall start debugging it.

The code is temporarily available at http://github.com/ssssam/test-baserock-infrastructure. If we make use of this in Baserock it will no doubt move to git.baserock.org.

The Fundamentals of OSTree

I’ve had some time at work to get my head around the OSTree tool. I summarised the tool in a previous post. This post is an example of its basic functionality (as of January 2014).

The Repository

OSTree repos can be in one of multiple modes. It seems ‘bare’ is for use on deployed systems (as the actual deployment uses hardlinks so disk space usage is kept to a minimum), while ‘archive-z2’ is for server use. The default is ‘bare’.

Every command requires a --repo=… argument, so it makes sense to create an alias to save typing.

$ cd ostree-demo-1
$ mkdir repo
$ alias ost="ostree --repo=`pwd`/repo"

$ ost init
$ ls repo
config  objects  refs  remote-cache  tmp
$ cat repo/config
[core]
repo_version=1
mode=bare

Committing A Filesystem Tree

Create some test data and commit the tree to ‘repo’. Note that none of the commit SHA256s here will match the ones you see when running these commands, unless you are in some kind of time vortex.

$ mkdir tree
$ cd tree
$ echo "x" > 1
$ echo "y" > 2
$ mkdir dir
$ cp /usr/share/dict/words words

$ ost commit --branch=my-branch \
    --subject="Initial commit" \
    --body="This is the first commit."
ce19c41036cc45e49b0cecf6b157523c2105c4de1ce30101def1f759daafcc3e

$ ost ls my-branch
d00755 1002 1002      0 /
-00644 1002 1002      2 /1
-00644 1002 1002      2 /2
-00644 1002 1002 4953680 /words
d00755 1002 1002      0 /dir

Let’s make a second commit.

$ tac words > words.tmp && mv words.tmp words

$ ost commit --branch=my-branch \
    --subject="Reverse 'words'" \
    --body="This is the second commit."
67e382b11d213a402a5313e61cbc69dfd5ab93cb07fbb8b71c2e84f79fa5d7dc

$ ost log my-branch
commit 67e382b11d213a402a5313e61cbc69dfd5ab93cb07fbb8b71c2e84f79fa5d7dc
Date:  2014-01-14 12:27:05 +0000

    Reverse 'words'

    This is the second commit.

commit ce19c41036cc45e49b0cecf6b157523c2105c4de1ce30101def1f759daafcc3e
Date:  2014-01-14 12:24:19 +0000

    Initial commit

    This is the first commit.

Now you can see two versions of ‘words’:

$ ost cat my-branch words | head -n 3
ZZZ
zZt
Zz
error: Error writing to file descriptor: Broken pipe

$ ost cat my-branch^ words | head -n 3
1080
10-point
10th
error: Error writing to file descriptor: Broken pipe

OSTree lacks most of the ref convenience tools that you might be used to from Git. For example, when providing a SHA256 you cannot abbreviate it; you must give the full 64 character string. There is an ostree rev-parse command which takes a branch name and returns the commit that branch ref currently points to, and you may use ^ to refer to the parent of a commit as seen above.

Checking Out Different Versions of the Tree

The last command we’ll look at is ostree checkout. This command expects you to be checking out the tree into an empty directory, and if you give it only a branch or commit name it will create a new directory with the same name as that branch or commit:

$ ost checkout my-branch
$ ls my-branch/
1  2  dir  words
$ head -n 3 my-branch/words
ZZZ
zZt
Zz

By default, OSTree will refuse to check out into a directory that already exists:

$ ost checkout my-branch^ my-branch
error: File exists

However, you can also use ‘union mode’, which will overwrite the contents of the directory (keeping any files which already exist that the tree being checked out doesn’t touch, but overwriting any existing files that are also present in the tree):

$ ost checkout --union my-branch^ my-branch
$ ls my-branch/
1  2  dir  words
$ head -n 3 my-branch/words 
1080
10-point
10th

One situation where this is useful is if you are constructing a buildroot that is a combination of several other trees. This is how the Gnome Continuous build system works, as I understand it.

Disk Space

In a bare repository OSTree stores each file that changes in a tree as a new copy of that file, so our two copies of ‘words’ make the repo 9.6MB in total:

$ du -sh my-branch/
4.8M    my-branch/
$ du -sh repo/
9.6M    repo/

It’s clear that repositories will get pretty big when there is a full operating system tree in there and there is no functionality right now that allows you to expire commits that are older than a certain threshold. So if you are looking for somewhere to start hacking …

In Use

OSTree contains a higher layer of commands which operate on an entire sysroot rather than just a repository. These commands are documented in the manual, and a good way to try them out is to download the latest Gnome Continuous virtual machine images.

OS-level version control

At the time of writing (January 2014), the area of OS-level version control is at the “scary jungle of new-looking projects” stage. I’m pretty sure we are at the point where most people would answer the question “Is it useful to be able to snapshot and rollback your operating system?” with “yes”. Faced with the next question, “How do you recommend that we do it?” the answer becomes a bit more difficult.

I have not tried out all of the projects below but hopefully this will be a useful summary.

Snapper

The openSUSE project have created Snapper, which has a specific goal of providing snapshot, rollback and diff of openSUSE installations.

It’s not tied to a specific implementation method, instead there are multiple backends, currently for btrfs, LVM or ext4 using the ‘next4’ branch. Btrfs snapshots seems to be the preferred implementation.

The tool is implemented as a daemon which exposes a D-Bus API. There is a plugin for the Zypper package manager and the YaST config tool which automatically creates a snapshot before each operation. There is a cron job which creates a snapshot every hour, and a PAM module which creates a snapshot on every login. There is also a command-line client which allows you to create and manage snapshots manually (of course, you can also call the D-Bus API directly).

Creating a snapshot every hour might sound expensive but remember that Btrfs is a copy-on-write filesystem, so if nothing has changed on disk then the snapshot comes virtually free. Even so, there is also a cron job set up which cleans up stale snapshots in a configurable way.

The openSuSE user manual contains this rather understated quote:

Since Linux is a multitasking system, processes other than YaST or zypper may modify data in the timeframe between the pre- and the post-snapshot. If this is the case, completely reverting to the pre-snapshot will also undo these changes by other processes. In most cases this would be unwanted — therefore it is strongly recommended to closely review the changes between two snapshots before starting the rollback. If there are changes from other processes you want to keep, select which files to roll back.

OSTree

GNOME/Red Hat developer Colin Walters is behind OSTree, which is notable primarily because it provides snapshotting without depending on any underlying filesystem features. Instead, it implements its own version control system for binary files.

In brief, you can create snapshots and go back to them later on, and compare differences between them. There doesn’t seem to be a way to delete them, yet — unlike Snapper, which is being developed with a strong focus on being usable today, OSTree is being developed bottom-up starting from the “Git for binaries” layer. There is work on the higher level in progress; Colin recently announced a prototype of RPM integration for OSTree.

Beyond allowing local management of filesystem trees, OSTree also has push and pull commands, which allow you to share branches between machines. Users of the GNOME Continuous continuous integration system can use this today to obtain a ready-built GNOME OS image from the build server, and then keep it up to date with the latest version using binary deltas.

OSTree operates atomically, and the root tree is mounted read-only so that other processes cannot write data there. This is a surefire way to avoid losing data, but it does require major rethinking on how Linux-based OSes work. OSTree provides an init ramdisk and a GRUB module, which allows you to choose at boot time which branch of the filesystem to load.

There are various trade-offs between OSTree’s approach versus depending on features in a specific filesystem. OSTree’s manual discusses these tradeoffs in some detail. An additional consideration is that both OSTree and Btrfs are still under active development and therefore you may encounter bugs that you need to triage and fix. OSTree is roughly 28k lines of C, running entirely in userland, while the kernel portion of Btrfs alone is roughly 91k lines of C.

Docker

Docker is a tool for managing Linux container images. As well as providing a helpful wrapper around LXC for running containers, it allows taking snapshots of the state of a container. Until recently it implemented this using the AUFS union file system, which requires patching your kernel to use and is unlikely to make it into mainline Linux any time soon, but as of version 0.7 Docker allows use of multiple storage backends. Alex Larsson (coincidentally another GNOME/Red Hat developer) implemented a practical device-mapper storage backend which works on all Linux-based OSes. There is talk of a Btrfs storage backend too, but I have not seen concrete details since this prototype from May 2013.

It can be kind of confusing to understand Docker’s terminology at first so I’ll run through my understanding of it. The machine you are actually running has a root filesystem and some configuration info, and that’s a container. You create containers by calling docker run <image>, where an image is a root filesystem and some configuration info, but stored. You can have multiple versions of the same image, which are distinguished by tags. For example, Docker provides a base image named ‘ubuntu’ which has a tag for each available version. The rule seems to be that if you’re using it, it’s a container, if it’s a snapshot or something you want to use as a base for something else, it’s an image. You’ll have to train yourself to avoid ever using the term “container image” again.

Docker’s version control functionality is quite primitive. You can call docker commit to create a new image or a new tag of an existing image from a container. You can call docker diff to show an svn status-style list of the differences between a container and the image that it is based on. You can also delete images and containers.

You can also push and pull images from repositories such as the public Docker Index, which lets you get a container going from scratch pretty quickly. Alternatively, you can create an image from a rootfs tarball using docker import, so you can base your containers on any operating system you like.

On top of all this, Docker provides a simple automation system for creating containers via calls to a package manager or a more advanced provisioning system like Puppet (or just arbitrary shell commands).

Linux Containers (LXC)

Docker is implemented on top of Linux Containers. It’s possible to use these commands directly, too (although unlike Docker, where ignoring the manual and using the default settings will get you to a working container, using LXC commands without having read the manual first is likely to lead to crashing your kernel). LXC provides a snapshotting mechanism too via the lxc-clone and lxc-snap commands. It uses the snapshotting features of either Btrfs or LVM, so it requires that your containers (not just the rootfs, but the configuration files and all) are stored on either a Btrfs or LVM filesystem.

Baserock

The design of Baserock includes atomic snapshot and rollback using Btrfs. Not much time has been spent on this so far, partly because, despite originally being aimed at embedded systems, Baserock is seeing quite a lot of use as a server OS.

The component that does exist is tb-diff, which is a tool for comparing two snapshots (two arbitrary filesystems, really) and constructing a binary delta. This is useful for providing upgrades, and could even be used to rebase in-development systems on top of a new base image. While all the above projects provide a ‘diff’ command, Snapper’s simply defers to the UNIX diff command (which can only display differences in text files, not binaries), and Docker’s doesn’t display contents at all, just the names of the files that have changed.

Addendum: Nix

It was remiss of me, I think, not to have mentioned Nix, which aims to remove the need for snapshots altogether by implementing rollback (and atomic package management transactions) at the package level using hardlinks. Nix is a complex enough beast that you wouldn’t want to use it as-is in a non-developer-focused OS, but you could easily implement a snapshot and rollback mechanism on top.
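
As a flavour of how little machinery that needs: the heart of such a rollback mechanism is an atomic symlink swap between per-generation trees. This is just a sketch of the idea, with invented names — it is not Nix’s actual code:

```python
import os

def activate(generation_dir, current_link):
    """Atomically point 'current_link' at a generation directory.
    Rolling back is simply activating an older generation again."""
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    # Build the new symlink under a temporary name, then rename it
    # over the old one; rename() is atomic on POSIX filesystems.
    os.symlink(generation_dir, tmp)
    os.replace(tmp, current_link)
```

Because the switch is a single rename, a crash mid-upgrade leaves you on either the old generation or the new one, never somewhere in between.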

How to bootstrap Docker 0.7 on Fedora 18

NOTE: Docker 0.7 has now been released and is packaged in Fedora 20, so you should ignore these instructions completely. I will leave them here in any case.

I’ve been hearing about Docker recently and it turns out the project is really, really cool. In a nutshell, Docker makes it pretty easy to use Linux Containers, which let you run images of different operating systems in a quicker and less resource-intensive manner than virtualisation methods like KVM. The only downside with Docker is that you must share the kernel and architecture of the host machine.

A virtual machine can’t share the host’s memory, so if you want to run much inside one you have to allocate half of your machine’s RAM to it and spend the rest of the day watching your machine write to swap memory. Containers don’t do that. Docker also uses copy-on-write storage and tracks your containers so that you don’t accumulate opaque 2GB disk image files all over the place.

Docker 0.6 is the current version, but it depends on the slightly esoteric AUFS, so the only supported way to run it is on Ubuntu, or in an Ubuntu VM. I stopped using Ubuntu a while back due to the lack of up-to-date (or even functional) GNOME packages, the increasing use of slow and memory-hungry Python scripts running in the background, the spyware, and the silent dropping of support for my laptop; and I didn’t really want to install a VM to try out Docker when the whole reason I wanted to play with Docker was so that I didn’t have to use virtual machines any more!

Docker 0.7, which is unreleased, has a Linux device-mapper backend to implement copy-on-write storage instead of needing AUFS, with the explicit goal of getting it packaged in Fedora and then RHEL. So it seemed reasonable to think I might be able to get it to work on Fedora today.

Docker is written in Go, which is a build system and compiler in one. It’s actually pretty good; I went from having never used Go to working out the quirky way it wants to build stuff fairly fast, and everything really did just work. Kudos to everyone behind Docker and Go.

Here is what I did to get it to run (on a bare Fedora install you may need more packages than the ones I list; I’m sure you can work it out).

yum install device-mapper-devel golang lxc

git clone git://github.com/dotcloud/docker
cd docker

# Check out latest 0.7 branch; the name may be different by the
# time you read this.
git checkout --track origin/v0.7.0-rc4

# Go seems to look for all source code within 'GOPATH'. Docker's Git
# tree contains the source code of most of the dependencies inside
# the 'vendor' directory. If we symlink the Docker source tree there
# we can add that directory to GOPATH and Go will find all the code.
ln -s `pwd` vendor/src/github.com/dotcloud/docker

mkdir build
GOPATH=`pwd`/vendor go build -o build/docker ./docker
GOPATH=`pwd`/vendor go build -o build/dockerinit ./dockerinit

If you run ‘go test’ at this point, it fails in all sorts of interesting places, but I found that my build seems to work well enough.

For networking to work in Fedora, you need IPv4 forwarding to be enabled.

# Enable IPv4 forwarding in Linux
sudo sysctl net.ipv4.ip_forward=1
sudo sh -c 'echo net.ipv4.ip_forward=1 > /etc/sysctl.d/docker.conf'

# Enable IPv4 forwarding in firewalld (this might need doing every
# time you boot; please tell me if you know a better way. The
# firewalld documentation is terrifying).
sudo firewall-cmd --add-masquerade

# Start Docker daemon. If you want it to store data somewhere other
# than /var/lib then use the -g flag, e.g.: '-g ~/.docker'.
sudo build/docker -d

In a separate shell, you can now do fun things, like start a shell inside an Ubuntu!

sudo build/docker run -i -t base /bin/bash

The first run needs to download the base Ubuntu system so it’s a little slow. Exit and try again. It loads in under a second! You should be able to access the internet from in there.

Remember that we had to build an unreleased version of Docker, and that various bits are probably broken? Now you can use Docker to do a clean build of Docker, inside your Ubuntu container!

sudo build/docker build -t docker .

This is as far as I have got so far. I am impressed with the high quality of the engineering in Docker, and with the cool features in Linux itself that make this all possible in the first place! Hopefully I’ll get the chance to get Baserock running in there: the idea with Baserock is that you do your development inside a Baserock VM, which is pretty unwieldy, and running Baserock inside a container is much more practical. Docker should be pretty useful for GNOME Continuous as well (although I have a feeling it overlaps a lot with the functionality of OSTree). There are all sorts of tasks that I currently do directly on my laptop, because a VM would take too long to spin up, which I can now do in containers without installing all sorts of unstable libraries or messing about in prefixes with LD_LIBRARY_PATH, dbus-launch and friends. Good times!

Programming-fact-that-should-have-been-obvious-but-it-wasn’t Of The Day

How do you make a Python class which subclasses Mock, to extend the base Mock class with some features specific to one class from your code?

Like this, right?

    import mock

    class MyClass(mock.Mock):
        def __init__(self, param1, param2):
            super(MyClass, self).__init__()

        def real_implementation_of_something(...):
            ...

This is useful when you want most methods to be mocks but there is some functionality that still needs to be there, or at least can’t be mocked automatically by the Mock class. Sadly, though, when you call any of its methods you get the following cryptic error:

    def _get_child_mock(self, **kw):
        """Create the child mocks for attributes and return value.
            By default child mocks will be the same type as the parent.
            Subclasses of Mock may want to override this to customize the way
            child mocks are made.
    
            For non-callable mocks the callable variant will be used (rather than
            any custom subclass)."""
        _type = type(self)
        if not issubclass(_type, CallableMixin):
            if issubclass(_type, NonCallableMagicMock):
                klass = MagicMock
            elif issubclass(_type, NonCallableMock) :
                klass = Mock
        else:
            klass = _type.__mro__[1]
>       return klass(**kw)
E       TypeError: __init__() got an unexpected keyword argument 'param1'

The first time you access a given method on a Mock, the object dynamically creates another Mock object to represent that method and saves it as an attribute. My mental model of Mock was for a long time that you mocked objects, but that’s not the right way to look at it. A Mock can represent anything (if you’ve been paying attention you’ll remember that everything in Python is an object).

The problem above is that when you access foo() on a MyClass instance, the Mock library calls the constructor MyClass.__init__() again to create a mock that represents foo. It passes in various arguments meant for Mock.__init__(), but because we have subclassed Mock and overridden the constructor, the call goes to MyClass.__init__() first, which chokes on the unexpected parameters and gives us the weird traceback you see above.

The fix is kind of obvious when you think about it:

    import mock

    class MyClass(mock.NonCallableMock):
        def __init__(self, param1, param2):
            super(MyClass, self).__init__()

        def real_implementation_of_something(...):
            ...
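
To convince yourself the fix works, here is a self-contained version you can run (using Python 3’s unittest.mock, which behaves the same way as the standalone mock library; the class and method names here are invented for the demo):

```python
from unittest import mock

class MyThing(mock.NonCallableMock):
    def __init__(self, param1, param2):
        super(MyThing, self).__init__()
        self.param1 = param1
        self.param2 = param2

    def real_implementation_of_something(self):
        # Real behaviour that we don't want mocked away
        return self.param1 + self.param2

thing = MyThing(1, 2)

# Child mocks are now created as plain Mock objects, so accessing and
# calling an undefined method no longer re-enters MyThing.__init__():
thing.some_method(42)
thing.some_method.assert_called_once_with(42)

print(thing.real_implementation_of_something())  # prints 3
```

Because MyThing subclasses NonCallableMock, _get_child_mock() falls back to the plain Mock class for child mocks instead of re-instantiating your subclass with arguments it doesn’t expect.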

. (source) in shells

Hi, here is a dull post about the . (source) command. It contains facts. Some of my life was spent finding them out, and the purpose of this post is to hopefully save others from the same fate.

The starting point was the gnome-autogen.sh script that is used from autogen.sh scripts, instead of autoreconf, to handle detecting gtk-doc and intltool as well as the standard autotools. A couple of modules bundle this file in git as well, which isn’t necessary, and it turns out that differences in the way source works in different shells cause slightly confusing things to happen sometimes. The bundled version may be preferred over the system-wide version, depending on your shell, and the phase of the moon.

Here is an abstract test which demonstrates the problem:

mkdir -p in-path
mkdir -p local

cat <<EOF > in-path/foo.sh
#!/bin/sh
echo "I'm in the path"
EOF
chmod +x in-path/foo.sh

cat <<EOF > local/foo.sh
#!/bin/sh
echo "I'm local only"
EOF
chmod +x local/foo.sh

# Here we should run the local copy
cd local
. foo.sh
cd ..

# Now we should run the copy that is in the path instead
PATH=$(pwd)/in-path:$PATH
cd local
. foo.sh
cd ..

rm in-path/foo.sh
rm local/foo.sh
rmdir in-path
rmdir local

What output do you expect?

bash by default searches the path, and then the cwd:

sam@droopy:~$ bash source_observes_path.tests
I'm local only
I'm in the path

However, the autogen.sh script starts with #!/bin/sh, so bash will actually run in POSIX mode:

sam@droopy:~$ sh source_observes_path.tests
source_observes_path.tests: line 20: .: foo.sh: file not found

In Busybox’s default shell (ash), the local directory is searched first as a performance optimisation, which is extra unhelpful:

sam@droopy:~$ busybox sh source_observes_path.tests
I'm local only
I'm local only

However, you can use hush, if you like:

sam@droopy:~$ busybox hush source_observes_path.tests
I'm local only
I'm in the path
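
For what it’s worth, the ambiguity only exists for a bare filename: POSIX specifies that a path containing a slash is never looked up in $PATH, so giving . an explicit ./ prefix gets you the same behaviour in every shell above. A quick demonstration (scratch paths invented for the demo):

```shell
# Work in a throwaway directory
dir=$(mktemp -d)
printf 'echo "I am the local copy"\n' > "$dir/foo.sh"
cd "$dir"

# An explicit path: no PATH search, no cwd fallback -- bash, POSIX sh,
# busybox ash and hush all agree on this one.
. ./foo.sh
```
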

In this blog post I’ve tried very hard to convey the inherent dullness of exploring these sorts of irregularities in the way different UNIX shells work. Thank you for your attention.

intltool compile warnings

As part of my ongoing quest to fix minor compile warnings and annoyances as a way of avoiding doing any of that in-depth and productive work that takes so much more effort, I had a look at how to get rid of the 100 warnings intltool-update gives while doing ‘make check’ for Tracker.

mismatched quotes at line 748 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 754 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 782 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 785 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 813 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 817 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 846 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 852 in ../tests/functional-tests/performance-tc-modified.py
mismatched quotes at line 880 in ../tests/functional-tests/performance-tc-modified.py

The solution turns out to be upgrading to intltool 0.50. It’s a simple package, so you should be able to do a drop-in upgrade; on Fedora, for example, just upgrade with this RPM.

rpmlint

Recently I’ve been doing some MeeGo work (rumours of its death are slightly exaggerated), and I’ve also moved to running Fedora, since Ubuntu aren’t prepared to put enough resources into packaging GNOME 3 for it to actually work very well. All of this means a lot more involvement with RPM!

In fact my first experience with Linux was with SuSE 6.x, and I had such bad experiences with RPM constantly complaining about missing dependencies that I swore to avoid it for the rest of my life and spent many years running Slackware, which of course lets you manage your own missing dependencies. I think my recent reversal of this view largely stems from having tried my hand at Debian packaging and realising just how complex that system is.

So, anyway, making .rpm’s seems easier than .deb’s, mainly because there are fewer levels of indirection between the recipe and the actions being taken. There’s also minimal documentation, sadly. Here I’ve collected some general tips for the benefit of search engines.

General info

Fedora’s RPM packaging guide

rpmlint

rpmlint is a simple tool for sanity-checking packages. It’s written in Python and essentially runs a bunch of regexes over the package. SUSE have some helpful docs.

You can ignore false positives with an rpmlintrc file, which is simply some Python source code. A useful function is addFilter(): it takes a regexp which is matched against the console output for a warning, and any warning that matches is suppressed. In my case I was getting lots of the following:

vala.i586: W: files-duplicate /usr/share/vala-0.14/vapi/sdl-mixer.deps /usr/share/vala-0.14/vapi/sdl-net.deps:/usr/share/vala-0.14/vapi/sdl-ttf.deps:/usr/share/vala-0.14/vapi/sdl-image.deps:/usr/share/vala-0.14/vapi/sdl-gfx.deps

These files aren’t duplicates, they just happen to have the same contents, so we want to ignore this warning. I created this vala-rpmlintrc file:

from Config import *

# Prevent duplicate warnings for the vapi .deps files
addFilter("files-duplicate (/usr/share/vala-0.14/vapi/.+\\.deps)+")
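
Since the filter is literally a regexp over rpmlint’s console output, you can sanity-check it with plain re before committing it (the warning text below is abbreviated from the one above):

```python
import re

# The pattern from the rpmlintrc above
pattern = r"files-duplicate (/usr/share/vala-0.14/vapi/.+\.deps)+"

warning = (
    "vala.i586: W: files-duplicate "
    "/usr/share/vala-0.14/vapi/sdl-mixer.deps "
    "/usr/share/vala-0.14/vapi/sdl-net.deps:"
    "/usr/share/vala-0.14/vapi/sdl-ttf.deps"
)

# The filter should match the false positive...
assert re.search(pattern, warning) is not None

# ...but not unrelated warnings, which must still be reported.
assert re.search(pattern, "vala.i586: E: statically-linked-binary /usr/bin/vala") is None
```
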

Another problem I had was the following error:

vala-devel.i586: E: library-without-ldconfig-postun (Badness: 300) /usr/lib/libvala-0.14.so.0.0.0

This package contains a library and provides no %postun scriptlet containing a
call to ldconfig.

vala-devel.i586: E: library-without-ldconfig-postin (Badness: 300) /usr/lib/libvala-0.14.so.0.0.0

This package contains a library and provides no %post scriptlet containing a
call to ldconfig.

This confused me for a long time. It turned out that I did have the scriptlets:

%post -p /sbin/ldconfig

%postun -p /sbin/ldconfig

…but they were attached to the wrong package! The correct solution was to change them to this:

%post -n vala-devel -p /sbin/ldconfig

%postun -n vala-devel -p /sbin/ldconfig

I hope this helps, people from the future!

Nix

The other day I found out about the Nix package manager. It’s interesting because each version of a package (called a derivation, for some reason) lives in an isolated directory, and the environment is built from symlinks, long PATH variables and the like. This gives some niceties like atomic upgrades, and the ability to reason about broken dependencies much more easily; the developers term it a purely functional package manager. You can also have multiple versions of a package installed and switch between versions using a “profiles” feature.
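
The symlink-farm idea is simple enough to sketch. This is my own toy illustration, not Nix’s code — real Nix store paths also embed a hash of the build inputs:

```python
import os

def build_profile(store, packages, profile):
    """Build a profile directory whose entries are symlinks into
    per-version package directories in the store.  Installing a new
    version just means rebuilding the links; the store itself is
    never modified."""
    os.makedirs(profile, exist_ok=True)
    for pkg in packages:                  # e.g. "hello-2.10"
        pkg_dir = os.path.join(store, pkg)
        for entry in os.listdir(pkg_dir):
            link = os.path.join(profile, entry)
            if os.path.lexists(link):
                os.remove(link)
            os.symlink(os.path.join(pkg_dir, entry), link)
```

Your PATH then only needs to contain the profile directory; pointing it at a different set of package versions is just a matter of rebuilding the links.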

This got my brain working a little. In Nix, the packages are specified as a set of attributes and a shell script that together describe how to build the package from source. When installing, it first checks various places for a suitable binary version and if there isn’t one, it will download the source and build it. So Nix could give us beautiful support for testing and hacking on bits of GNOME. Imagine I decide to rewrite GtkTreeView, for whatever reason. I set up a new profile which uses Gtk+ from git, and keeps the source in my home directory somewhere. Nix can download binary versions of the latest GLib unstable, and any other deps not satisfied by the distro, so I don’t have to waste time building those. I’m not sure what Nix would do about the apps that depend on Gtk+; ideally you could tell it the ABI wasn’t changing so it would just run the same versions of apps but linked against the unstable Gtk+. This isn’t possible at the moment and would be hard to implement, I imagine. Right now Nix could install new versions of various apps, hopefully just the ones you specify to save duplicating your entire system. The point anyway is that now I can do some hacking and test my changes in my real environment straight away, all with no danger of breaking my actual system.

I know there are major obstacles to achieving this. Nix isn’t perfect – it wastes quite a bit of time and disk space, although there is a distro using it in the real world. I wrote to their mailing list, and it seems like a lot of what I mention above is possible but isn’t really documented or used much at the moment. And of course it’s not like jhbuild doesn’t work well. Still, I kind of think this is a vision of the future. It would be awesome being able to pick and choose bits of your system to hack on and have it integrate perfectly.

Interestingly, Nix would also be really useful for packaging MSYS. I mean, the dependency problems there are so complex that the msysGit people chose to ship an entirely separate environment.