At the time of writing (January 2014), the area of OS-level version control is at the “scary jungle of new-looking projects” stage. I’m pretty sure we are at the point where most people would answer the question “Is it useful to be able to snapshot and roll back your operating system?” with “yes”. The next question, “How do you recommend that we do it?”, is a bit more difficult to answer.
I have not tried out all of the projects below, but hopefully this will be a useful summary.
The openSUSE project have created Snapper, which has a specific goal of providing snapshot, rollback and diff of openSUSE installations.
It’s not tied to a specific implementation method; instead there are multiple backends, currently for Btrfs, LVM, or ext4 using the ‘next4’ branch. Btrfs snapshots seem to be the preferred implementation.
The tool is implemented as a daemon which exposes a D-Bus API. There is a plugin for the Zypper package manager and the YaST config tool which automatically creates a snapshot before each operation. There is a cron job which creates a snapshot every hour, and a PAM module which creates a snapshot on every login. There is also a command-line client which allows you to create and manage snapshots manually (of course, you can also call the D-Bus API directly).
Creating a snapshot every hour might sound expensive but remember that Btrfs is a copy-on-write filesystem, so if nothing has changed on disk then the snapshot comes virtually free. Even so, there is also a cron job set up which cleans up stale snapshots in a configurable way.
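To make the "virtually free" claim concrete, here is a toy model of copy-on-write snapshots. It is only a sketch: the `CowVolume` class and its methods are invented for illustration and say nothing about how Btrfs or Snapper are actually implemented. The point is that a snapshot copies a small table of references rather than the data, so new storage is only consumed when something is written afterwards.

```python
class CowVolume:
    """Toy copy-on-write volume: snapshots share blocks until a write happens."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes (stands in for the physical disk)
        self.live = {}        # path -> block id for the live filesystem
        self.snapshots = {}   # snapshot name -> frozen {path: block id} table
        self._next_id = 0

    def write(self, path, data):
        # Never overwrite in place: allocate a fresh block and repoint the path.
        self.blocks[self._next_id] = data
        self.live[path] = self._next_id
        self._next_id += 1

    def snapshot(self, name):
        # Copies only the small path table; no data blocks are duplicated.
        self.snapshots[name] = dict(self.live)

    def rollback(self, name):
        # Restore the path table; the old blocks were never touched.
        self.live = dict(self.snapshots[name])

    def read(self, path):
        return self.blocks[self.live[path]]


vol = CowVolume()
vol.write("/etc/hostname", b"alpha\n")
vol.snapshot("hourly-1")
vol.snapshot("hourly-2")                 # nothing changed: no new blocks
blocks_after_idle_snapshots = len(vol.blocks)
vol.write("/etc/hostname", b"beta\n")    # only now is a new block allocated
vol.rollback("hourly-1")
print(blocks_after_idle_snapshots, len(vol.blocks), vol.read("/etc/hostname"))
```

In this model, old blocks are never freed while a snapshot still refers to them, which is also why some form of cleanup for stale snapshots is needed.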
The openSUSE user manual contains this rather understated quote:
Since Linux is a multitasking system, processes other than YaST or zypper may modify data in the timeframe between the pre- and the post-snapshot. If this is the case, completely reverting to the pre-snapshot will also undo these changes by other processes. In most cases this would be unwanted — therefore it is strongly recommended to closely review the changes between two snapshots before starting the rollback. If there are changes from other processes you want to keep, select which files to roll back.
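The advice in that quote amounts to "list what changed, then restore only the paths you choose". A minimal sketch of that idea follows, with snapshots modelled as plain dictionaries mapping a path to a content marker. The function names and sample paths are hypothetical and this is not Snapper's API.

```python
def changed_paths(pre, post):
    """Paths whose content differs between two snapshots (including adds/removes)."""
    return sorted(p for p in set(pre) | set(post) if pre.get(p) != post.get(p))


def rollback_selected(live, pre, paths):
    """Return a copy of `live` with only `paths` restored to their pre-snapshot state."""
    result = dict(live)
    for path in paths:
        if path in pre:
            result[path] = pre[path]
        else:
            result.pop(path, None)   # file did not exist before: remove it
    return result


pre = {"/etc/app.conf": "v1", "/home/sam/notes.txt": "draft"}
post = {"/etc/app.conf": "v2", "/home/sam/notes.txt": "final", "/usr/lib/libapp.so": "new"}

# Review everything that changed between the two snapshots...
print(changed_paths(pre, post))
# ...then undo the package manager's changes but keep the user's edited notes.
restored = rollback_selected(post, pre, ["/etc/app.conf", "/usr/lib/libapp.so"])
print(restored)
```

A full rollback would pass every changed path, which is exactly the case where the unrelated edit to the notes file would be lost.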
GNOME/Red Hat developer Colin Walters is behind OSTree, which is notable primarily because it provides snapshotting without depending on any underlying filesystem features. Instead, it implements its own version control system for binary files.
In brief, you can create snapshots, go back to them later on, and compare the differences between them. There doesn’t seem to be a way to delete them yet. Unlike Snapper, which is being developed with a strong focus on being usable today, OSTree is being developed bottom-up, starting from the “Git for binaries” layer. There is work on the higher level in progress; Colin recently announced a prototype of RPM integration for OSTree.
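The “Git for binaries” idea can be sketched as a content-addressed object store: every file is stored under the hash of its contents, and a commit is just another object that records a tree of hashes plus its parent. The code below is an illustration under my own assumptions (the `ObjectStore` class, the JSON commit format and the branch name `os/main` are all invented); it does not reproduce OSTree's on-disk format.

```python
import hashlib
import json


class ObjectStore:
    """Toy content-addressed store: every object is keyed by its SHA-256."""

    def __init__(self):
        self.objects = {}   # hex digest -> bytes
        self.refs = {}      # branch name -> digest of the latest commit

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.objects[digest] = data      # identical content is stored only once
        return digest

    def commit(self, branch, files):
        # A tree maps each path to the digest of that file's content.
        tree = {path: self.put(content) for path, content in files.items()}
        commit = {"parent": self.refs.get(branch), "tree": tree}
        digest = self.put(json.dumps(commit, sort_keys=True).encode())
        self.refs[branch] = digest
        return digest

    def checkout(self, commit_digest):
        commit = json.loads(self.objects[commit_digest])
        return {path: self.objects[d] for path, d in commit["tree"].items()}


store = ObjectStore()
v1 = store.commit("os/main", {"/usr/bin/tool": b"\x7fELF v1", "/usr/lib/libc.so": b"\x7fELF libc"})
v2 = store.commit("os/main", {"/usr/bin/tool": b"\x7fELF v2", "/usr/lib/libc.so": b"\x7fELF libc"})

# The unchanged library is shared between both commits, so only the new
# binary and the new commit object were added for v2.
print(len(store.objects))
print(store.checkout(v1)["/usr/bin/tool"])
```

Because nothing here relies on the filesystem underneath, the same approach works on any filesystem, which is the property the post highlights.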
Beyond allowing local management of filesystem trees, OSTree also has a pull command, which allows you to share branches between machines. Users of the GNOME Continuous continuous integration system can use this today to obtain a ready-built GNOME OS image from the build server, and then keep it up to date with the latest version using binary deltas.
OSTree operates atomically, and the root tree is mounted read-only so that other processes cannot write data there. This is a surefire way to avoid losing data, but it does require a major rethink of how Linux-based OSes work. OSTree provides an init ramdisk and a GRUB module, which allow you to choose at boot time which branch of the filesystem to load.
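One generic way to get an atomic switch between complete, read-only trees is to keep each tree in its own directory and repoint a single symlink with a rename. The sketch below shows that general technique on a POSIX system; the directory names and the `switch_current` helper are hypothetical, and I am not claiming this is how OSTree itself performs the switch.

```python
import os
import tempfile


def switch_current(root, target):
    """Atomically repoint root/current at `target` (a directory name under root)."""
    link = os.path.join(root, "current")
    tmp = os.path.join(root, "current.tmp")
    if os.path.lexists(tmp):
        os.remove(tmp)              # clear debris from an interrupted earlier run
    os.symlink(target, tmp)         # build the new link under a temporary name
    os.replace(tmp, link)           # the rename swaps it in as a single step


root = tempfile.mkdtemp()
for name in ("tree-a", "tree-b"):
    os.mkdir(os.path.join(root, name))
    with open(os.path.join(root, name, "version"), "w") as f:
        f.write(name)

switch_current(root, "tree-a")
switch_current(root, "tree-b")      # "upgrade" to the new tree
with open(os.path.join(root, "current", "version")) as f:
    active = f.read()
print(active)
switch_current(root, "tree-a")      # rollback is just another switch
print(os.readlink(os.path.join(root, "current")))
```

At no point does a reader of `current` see a half-updated tree: it resolves to either the old directory or the new one.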
There are various trade-offs between OSTree’s approach and depending on features in a specific filesystem. OSTree’s manual discusses these trade-offs in some detail. An additional consideration is that both OSTree and Btrfs are still under active development and therefore you may encounter bugs that you need to triage and fix. OSTree is roughly 28k lines of C, running entirely in userland, while the kernel portion of Btrfs alone is roughly 91k lines of C.
Docker is a tool for managing Linux container images. As well as providing a helpful wrapper around LXC for running containers, it allows taking snapshots of the state of a container. Until recently it implemented this using the AUFS union file system, which requires patching your kernel to use and is unlikely to make it into mainline Linux any time soon, but as of version 0.7 Docker allows use of multiple storage backends. Alex Larsson (coincidentally another GNOME/Red Hat developer) implemented a practical device-mapper storage backend which works on all Linux-based OSes. There is talk of a Btrfs storage backend too, but I have not seen concrete details since this prototype from May 2013.
It can be kind of confusing to understand Docker’s terminology at first, so I’ll run through my understanding of it. The machine you are actually running has a root filesystem and some configuration info, and that’s a container. You create containers by calling docker run <image>, where an image is a root filesystem and some configuration info, but stored. You can have multiple versions of the same image, which are distinguished by tags. For example, Docker provide a base image named ‘ubuntu’ which has a tag for each available version. The rule seems to be that if you’re using it, it’s a container; if it’s a snapshot or something you want to use as a base for something else, it’s an image. You’ll have to train yourself to avoid ever using the term “container image” again.
Docker’s version control functionality is quite primitive. You can call docker commit to create a new image or a new tag of an existing image from a container. You can call docker diff to show an svn status-style list of the differences between a container and the image that it is based on. You can also delete images and containers.
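For readers who have not seen an svn status-style listing, the sketch below produces one from two filesystem states, using the A (added), C (changed) and D (deleted) codes that docker diff prints. The states are modelled as dictionaries of path to content marker; the function and sample data are invented for illustration and this is not Docker's implementation.

```python
def status_diff(base, current):
    """Return (code, path) pairs: A = added, C = changed, D = deleted."""
    changes = []
    for path in sorted(set(base) | set(current)):
        if path not in base:
            changes.append(("A", path))
        elif path not in current:
            changes.append(("D", path))
        elif base[path] != current[path]:
            changes.append(("C", path))
    return changes


# Hypothetical image and the container that was started from it.
image = {"/etc/motd": "h1", "/etc/hosts": "h2", "/usr/bin/old": "h3"}
container = {"/etc/motd": "h1", "/etc/hosts": "h9", "/var/log/app.log": "h4"}

changes = status_diff(image, container)
for code, path in changes:
    print(code, path)
```

Note that the listing names the files that differ but says nothing about what changed inside them.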
You can also pull images from repositories such as the public Docker Index. This allows you to get a container going from scratch pretty quickly. However, you can also create an image from a rootfs tarball using docker import, so you can base your containers off any operating system you like.
On top of all this, Docker provides a simple automation system for creating containers via calls to a package manager or a more advanced provisioning system like Puppet (or just arbitrary shell commands).
Linux Containers (LXC)
Docker is implemented on top of Linux Containers. It’s possible to use the LXC commands directly, too, although unlike Docker, where ignoring the manual and using the default settings will get you to a working container, using LXC commands without having read the manual first is likely to lead to crashing your kernel. LXC provides a snapshotting mechanism too, via the lxc-clone and lxc-snap commands. It uses the snapshotting features of either Btrfs or LVM, so it requires that your containers (not just the rootfs, but the configuration files and all) are stored on either a Btrfs filesystem or an LVM volume.
The design of Baserock includes atomic snapshot and rollback using Btrfs. Not much time has been spent on this so far, partly because, despite originally being aimed at embedded systems, Baserock is seeing quite a lot of use as a server OS.
The component that does exist is tb-diff, which is a tool for comparing two snapshots (two arbitrary filesystems, really) and constructing a binary delta. This is useful for providing upgrades, and could even be used to rebase in-development systems on top of a new base image. While all the above projects provide a ‘diff’ command, Snapper’s simply defers to the UNIX diff command (which can only display differences in text files, not binaries), and Docker’s doesn’t display contents at all, just the names of the files that have changed.
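To show what a binary delta buys you over a text diff, here is a small sketch that encodes a new file as "copy this range from the old file" and "insert these literal bytes" operations, then replays them. It uses Python's standard difflib for matching, which is far too slow for real system images; the operation format and function names are my own and are not tb-diff's.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Encode `new` as copy-from-old and insert-literal operations."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))      # reuse bytes the target already has
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))     # ship only the new bytes
        # "delete" needs no operation: those old bytes are simply not copied
    return ops


def apply_delta(old, ops):
    """Rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


old = b"\x00HEADER\x01" + b"A" * 64 + b"\x02config=1\x03"
new = b"\x00HEADER\x01" + b"A" * 64 + b"\x02config=2\x03" + b"extra"

delta = make_delta(old, new)
shipped = sum(len(op[1]) for op in delta if op[0] == "insert")
print(shipped, "literal bytes shipped for a", len(new), "byte file")
print(apply_delta(old, delta) == new)
```

The same delta works whether the bytes are text or not, which is what makes this kind of tool suitable for upgrading whole system images.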
It was remiss of me, I think, not to have mentioned Nix, which aims to remove the need for snapshots altogether by implementing rollback (and atomic package management transactions) at the package level using symlinks. Nix is a complex enough beast that you wouldn’t want to use it in a non-developer-focused OS as-is, but you could easily implement a snapshot and rollback mechanism on top.