Automated point and click

I have been watching GNOME’s testing story get better and better for a long time. The progress we made since we first started discussing the GNOME OS initiative is really impressive, even when you realize that GUADEC in A Coruña took place nine years ago. We got nightly OS images, Gitlab CI and the gnome-build-meta BuildStream project, not to mention Flatpak and Flathub.

Now we have another step forwards with the introduction of OpenQA testing for the GNOME OS images. Take a look at the announcement on GNOME Discourse to find out more about it.

Automated testing is quite tedious and time consuming to set up, and there is significant effort behind this – from chasing regressions that broke the installer build, and debugging VM boot failures to creating a set of simple, reliable tests and integrating OpenQA with Gitlab CI. A big thanks to Codethink for sponsoring the time we are spending on setting this up. It is part of a wider story aiming to facilitate better cooperation upstream between companies and open source projects, which I wrote about in this Codethink article “Higher quality of FOSS”.

It’s going to take time to figure out how to get the most out of OpenQA, but I’m sure it’s going to bring GNOME some big benefits.

Return to Codethink

2020 was a year full of surprises, so surprise that I finished it by returning to work in the same job that I left exactly 3 years ago.

There are a few reasons I did that! I will someday blog in more detail about working as a language teacher. It’s a fun job but to make the most of it you have to move around regularly, and I unexpectedly found a reason to settle in Santiago. Codethink kindly agreed that I could join the ongoing remote-work revolution and work from here.

Three years is a long time. What changed since I left? There’s a much bigger and nicer office in Manchester, with nobody in it due to the pandemic. The company is now grouped into 4 internal divisions. This is still an experiment and it adds some management overhead, also helps to maintain a feeling of autonomy in a company that’s now almost 100 people. (When I started there ten years ago, I think there were seventeen employees?!)

I also want to mention some research projects that my colleagues are working on. Codethink is a services company, but has always funded some non-customer work including in the past work on dconf, Baserock, Buildstream and the Freedesktop SDK. These are termed ‘internal investments’ but they are far from internal, the goal is always to contribute to open software and hardware projects. The process for deciding where to invest has improved somewhat in my absence; it still requires some business case for the investment (I’m still thinking how to propose that I get paid to work on music recommendations and desktop search tools all day), but there is now a process!

Here are two things that are being worked on now:

RISC-V

My contribution to Codethink’s RISC-V research was writing an article about it. The tl;dr is we are playing with some RISC-V boards, mainly in the context of Freedesktop SDK. Since writing that article the team tracked down a thorny bug in how qemu-user uses GLib that had been blocking progress, and got GNOME OS running in qemu-system-riscv. Expect to see a video soon. You can thank us when you get your first RISC-V laptop 🙂

Bloodlight

I never worked on a medical device but some of my colleagues have, and this led to the Bloodlight project. It’s an open hardware device for measuring your heart rate, aiming to avoid some pitfalls that existing devices fall into:

Existing technology used in smart watches suffers various shortcomings, such as reduced effectiveness on darker skin tones and tattoos.

There is a lot of technical info on the project on Github, including an interesting data processing pipeline. Or for a higher level overview, the team recently published an article at coruzant.com.

As is often the case, I can’t say exactly what I’m working on right now, other than it’s an interesting project and I am learning more than I ever wanted about Chromium.

2017 in review

I began this year in a hedge in Mexico City and immediately had to set off on a 2 day aeroplane trek back to Manchester to make a very tired return to work on the 3rd January. From there things calmed down somewhat and I was geared up for a fairly mundane year but in fact there have been many highlights!

The single biggest event was certainly bringing GUADEC 2017 to Manchester. I had various goals for this such as ensuring we got a GUADEC 2017, showing my colleages at Codethink that GNOME is a great community, and being in the top 10 page authors on wiki.gnome.org for the year. The run up to the event from about January to July took up many evenings and it was sometimes hard to trade it off with my work at Codethink; it was great working with Allan, Alberto, Lene and Javier though and once the conference actually arrived there was a mass positive force from all involved that made sure it went well. The strangest moment was definitely walking into Kro Bar slightly before the preregistration event was due to start to find half the GNOME community already crammed into the tiny bar area waiting for something to happen. Obviously my experience of organizing music events (where you can expect people to arrive about 2 hours after you want them somewhere) didn’t help here.

Codethink provides engineers with a travel budget a little bit of extra leave for attending conferences; obviously what with GUADEC being in Manchester I didn’t make a huge use of that this year, but I did make it to FOSDEM and also to PyConES which took place in the beautiful city of Cácares. My friend Pedro was part of the organizing team and it was great to watch him running round fighting fires all day while I relaxed and watched the talks (which were mostly all trying to explain machine learning in 30 minutes with varying degrees of success).

Stream powered carriageWork wise I spent most of my year looking at compilers and build tools, perhaps not my dream job but it’s an enjoyable area to work in because (at least in terms of build tools) the state of the art is comically bad. In 10 years we will look back at GNU Autotools in the way we look at a car that needs to be started with a hand crank, and perhaps the next generation of distro packagers will think back in wonder at how their forebears had to individually maintain dependency and configuration info in their different incompatible formats.

BuildStream is in a good state and is about to hit 1.0; it’s beginning to get battle tested in a couple of places (one of these being GNOME) which is no doubt going to be a rough ride — I already have a wide selection of performance bottlenecks to be looking at in the new year. But it’s looking already like a healthy community and I want to thanks to everyone who has already got behind the project.

It also seems to have been a great year for Meson; something that has been a long time coming but seems to be finally bringing Free Software build systems into the 21st century. Last year I ported Tracker to build with Meson, and have been doing various ongoing fixes to the new build system — we’re not yet able to fully switch to Autotools primary because of issue #2166, and also because of some Tracker test suite failures that seem to only show up with Meson that we haven’t yet dug into fully.

With GUADEC out of the way I managed to spend some time prototyping something I named Tagcloud. This is the next iteration of a concept that I’ve wanted since more or less forever, that of being able to apply arbitrary tags to different local and online resources in a nice way. On the web this is a widespread concept but for some reason the desktop world doesn’t seem to buy into it. Tracker is a key part of this puzzle, as it can deal with many types of content and can actually already handle tags if you don’t mind using the commandline so part of my work on Tagcloud has been making Tracker easy to embed as a subproject. This means I can try new stuff without messing up any session-wide Tracker setup, and it builds builds on some great work Carlos has been doing to modernize Tracker as well. I’ve been developing the app in Python, which has required me to fix issues in Tracker’s introspection bindings (and GLib’s, and GTK+’s … on the whole I find the PyGObject experience pretty good and it’s obviously been a massive effort to get this far, but at the same time these teething issues are quite demotivating.) Anyway I will post more about Tagcloud in the new year once some of the ideas are a bit further worked out; and of course it may end up going nowhere at all but it’s been nice to actually write a GTK+ app for the first time in ages, and to make use of Flatpak for the first time.

It’s also been a great year for the Flatpak project; and to be honest if it wasn’t for Flatpak I would probably have talked myself out of writing a new app before I’d even started. Previously the story for getting a new app to end users was that you must either be involved or know someone involved in a distro or two so that you can have 2+ year old versions of your app installable through a package manager; or your users have to know how to drive Git and a buildsystem from the commandline. Now I can build a flatpak bundle every time I push to master and link people straight to that. What a world! And did I mention GitLab? I don’t know how I ever lived without GitLab CI and I think that GNOME’s migration to GitLab is going to be *hugely* beneficial for the project.

Looking back it seems I’ve done more programming stuff than I thought I had; perhaps a good sign that you can achieve stuff without sacrificing too much of your spare time.

It’s also been a good year music wise, Manchester continues to have a fantastic music scene which has only got better with the addition of the Old Abbey Taphouse where I in fact spent the last 4 Saturdays in a row. Last Saturday we put on Babar Luck, I saw a great gig of his 10 years ago and have managed to keep missing him ever since but things finally worked out this time. Other highlights have been Paddy Steer, Baghdaddies and a very cold gig we did with the Rubber Duck Orchestra on the outdoor stage on a snowy December evening.

I caught a few gigs by Henge who only get better with time and who will hopefully break way out of Manchester next year. And in September I had the privilege of watching Jeffrey Lewis supported by The Burning Hell in a little wooden hut outside Lochcarron in Scotland, that was certainly a highlight despite being ill and wearing cold shoes.

Lochcarron TreehouseI didn’t actually know much of Scotland until taking the van up there this year; I was amazed that such a beautiful place has been there the whole time just waiting there 400 miles north. This expedition was originally planned to be a bike trip but ended up being a road trip, and having now seen the roads that is probably for the best. However we did manage a great bike trip around the Netherlands and Belgium, the first time I’ve done a week long bike trip and hopefully the beginning of a new tradition ! Last year I did a lot of travel to crazily distant places, its a privilege to be able to do so but one that I prefer to use sparingly so it was nice to get around closer to home this year.

All in all a pretty successful year, not straightforward at times but one with several steps in the directions I wanted to head. Let’s see what next year holds 🙂

Night Bus: simple SSH-based build automation

night-858546_640

My current project at Codethink has involved testing and packaging GCC on several architectures. As part of this I wanted nightly builds of ‘master’ and the GCC 7 release branch, which called for some kind of automated build system.

What I wanted was a simple CI system that could run some shell commands on different machines, check if they failed, and save a log somewhere that can be shared publically. Some of the build targets are obsolete proprietary OSes where modern software doesn’t work out of the box so simplicity is key. I considered using GitLab CI, for example, but it requires a runner written in Go, which is not something I can just install on AIX. And I really didn’t have time to maintain a Jenkins instance.

So I started by trying to use Ansible as a CI system, and it kind of worked but the issue is that there’s no way to get the command output streamed back to you in real time. GCC builds take hours and its test suite can take a full day to run on an old machine so it’s essential to be able to see how things are progressing without waiting a full day for the command to complete. If you can’t see the output, the build could be hanging somewhere and you’d not realise. I discovered that Ansible isn’t going to support this use case and so I ended up writing a new tool: Night Bus.

Night Bus is written in Python 3 and runs tasks across different machines, similarly to Ansible but with the usecase of doing nightly builds and tests as opposed to configuration management. It provides:

  • remote task execution via SSH (using the Parallel-SSH library)
  • live logging of output to a specified directory
  • an overall report written once all tasks are done, which can contain status messages from the tasks
  • parametrization of the tasks (to e.g. build 3 branches of the same thing)
  • a support library of helper functions to make your task scripts more readable

Scheduled execution can be set up using cron or systemd. You can set up a webserver (i’m using lighttpd) on the machine that runs Night Bus to make the log output available over HTTP

You control it by creating two YAML (or JSON) files:

  • hosts describes the SSH configuration for each machine
  • tasks lists the sequence of tasks to run

Here’s an example hosts file:

host1:
  user: automation
  private_key: ssh/automation.key

host2:
  proxy_host: 86.75.30.9
  proxy_user: jenny
  proxy_private_key: ssh/jenny.key

Here’s an example tasks file:

tasks:
- name: counting-example
  commands: |
    echo "Counting to 20."
    for i in `seq 1 20`; do
      echo "Hello $i"
      sleep 1
    done

You might wonder why I didn’t just write a shell script to automate my builds as many thousands of hackers have done in the past. Basically I find maintaining shell scripts over about 10 lines to be a hateful experience. Shell is great as a “live” programming environment because it’s very flexible and quick to type. But those strengths turn into weaknesses when you’re trying to write maintainable software. Every CI system ultimately ends up with you writing shell scripts (or if you’re really unlucky, some XML equivalent) so I don’t see any point hiding the commands that are being run under layers of other stuff, but at the same time I want a clear separation between the tasks themselves and the support aspects like remote system access, task ordering, and logging.

Night Bus is released as a random GitHub project that may never get much in the way of updates. My aim is for it to fall into the category of software that doesn’t need much ongoing work or maintenance because it doesn’t try to do anything special. If it saves one person from having to maintain a Jenkins instance then the week I spent writing it will have been worthwhile.

Ten years of Codethink

32813704624_b7e3899b9f_zSpring is here and it is the 10th anniversary celebration of Codethink.  Nobody could have orchestrated it this way but we also have GUADEC happening here in Manchester in a few months and it’s the 20th anniversary of GNOME.  All roads lead to Manchester in 2017!

The company is celebrating its anniversary in various ways: cool new green-on-black T-shirts, a 10 years mug for everyone, and perhaps more significantly a big sophisticated party with a complicated cake.

The party was fun with a lot of old faces some who had travelled quite far to be there. The company was and still is a mix of very interesting and weird people and although we spend most of our time in the same room studiously not talking to eachother we do know how to celebrate things sometimes!

It was odd in a way being at a corporate party with fancy food and a function band and 150 guests in an enourmous monastery given that back when I joined the entire Manchester staff could go for lunch together and all sit at the same table. The first company party I went to was in Paul Sherwood’s conservatory, in fact the first few of them were there. It’s a good sign for sure that the company has quadrupled (or more) in size in the ensuing 6 years.

In hindsight I was quite lucky to have a world class open source software house more or less on my doorstep. I spent a long time trying to avoid working in software (and trying to avoiding working at all), but I did do a Summer of Code project back in 2009 or 2010 mentored by Allison Lortie, who then worked for Codethink and noted that I lived about 5000 miles closer to her office than she did.It was an obvious choice to apply to there when I graduated from University and luckily it was just at a time when they were hiring so I didn’t have to spend too long living on 50p a week and eating shoes for dinner. It was very surreal for the first few months of working there as a world which I’d previously only been involved via a computer turned into a world of real people (plus lots of computers), in fact the whole first year was pretty surreal what with also adapting to Manchester life and discovering the how much craziness there is underneath the surface of the technology industry.

I had no idea what the company did beforehand, and even now the Codethink website doesn’t give too much away. I saw contributions to Free Software projects such as Tracker and dconf (and various other things that were happening 7 years ago) but I didn’t know what kind of business model came out of that activity. It turned out that neither did anyone else at that point; the company grew out of consulting work from Nokia, but the Elopcalypse had just happened and so on starting I got involved in all sorts of different things as we looked for work in different areas: everything from boot speed optimizations and hardware control, to compiler testing and bugfixing, build tools, various automated testing setups, and more build tools, to Python and Ruby webapps, data visualisations, OpenStack, systems administration, report writing and more. Just before Christmas 2011 I got offered to go work in Korea, the catch being that I had to go in 2 days time, and the following year I spent another memorable month there (again with about 2 days notice). I also had month long stints in Bulgaria, and Berlin although these were actually planned in advance, plus all sorts of conferences as the company started to sponsor attendance and a couple of days off for such things. Most importantly of course I got involved in rock climbing which is now pretty much my favourite thing.

Since a long time now it’s felt like the company has a solid business model and while the work we do is still all over different sectors I think I can sum it up as bridging the gap between the worlds of corporate software projects and open-source software projects.  We have some great customers who engage us to do work upstream on Free Software projects which is ideal,  but far from everything we work on is Free Software, and we also work in various fields that I’m pretty unexcited about such as automotive and finance. It’s very hard to make money though if you  spend all your time working on something that you then give away so it’s a necessary compromise in my eyes.

And even in entirely closed source projects having knowledge of all the great Free Software that is available gives us an advantage. There are borderline-unusable proprietary tools still being sold by major vendors to do things like version control, there are unreliable proprietary hardware drivers being sold for hardware that has a functional and better open source driver, there are countless projects using medieval kernels, obsolete operating systems and all sorts of other craziness.Working for a company that trusts its employees is also pretty important, I meet operating systems engineers there are working on Linux-based devices whose corporate IT departments force them to use Windows, so right they trust them to maintain the operating system used in millions of cars but they don’t trust them to maintain the operating system on their laptop.

One thing Codethink lacks still is a model for providing engineer time to help with ongoing maintainance and development of different free software projects. There have been attempts at doing so within the company and I acknowledge it’s very difficult because the drop in, drop out nature of a consultant engineer’s time isn’t compatible with the ongoing time commitment required to be a reliable maintainer. Plus good maintenance skills require years to develop and either require someone experienced with a lot of free time to teach them to you, or they require you to maintain a real world project which you mess up continually and learn every lesson the hard way. Of course open source work that comes out of customer projects is highly regarded and if you’re lucky enough to have unallocated time it can sometimes be used to work through the backlog of bug fixes and feature additions for different tools you use that one inevitably develops as a full time software engineer. Again, it amazes me how many companies manage to actively prevent their developers from pushing things upstream.

We have been maintaining Baserock for years now (and many people have learned lots of lessons the hard way from it :-); BuildStream development is ongoing and I’m even still hopeful we can achieve the original goal of making it an order of magnitude easier to produce a high quality Free Software operating system. I should note that Codethink also contributes financially to conferences and projects in various ways.

I should also point out that we are still hiring. This wasn’t intended to be a marketing essay in which I talked up how great the company is, but it kinda has turned out out that way. I guess you take that as a good sign. My real underlying goal was to make it a bit clearer what it’s like to work here which I hope I’ve done a little.

I am quite proud of the company’s approach to hiring, we take in many graduates who show promise but never got involved in community-driven software projects or never really even got into programming except as a module in a science degree or whatever. Of course we also welcome people who do have relevant experience but they can be hard to find and focusing on them can also have an undesired effect of selecting based on certain privileges. I was debating with Tristan last week whether a consultancy is actually a good place for inexperienced developers to be, there is the problem that you don’t get to see the results of your work very often, you often move between projects fairly frequently and so you might not develop the intuition needed for being a good software maintainer, which is a complex topic but boils down to something like: “Is this going to cause problems in 5 years time?” There’s no real way around this, all we can do is give people a chance in an environment with a strong Free Software culture and that is pretty much what we do.

Ideally here I’d end with some photos from the party but I’m terrible at taking photos so it’s just all the back of people’s heads and lurid green lighting. Instead here’s a photo of a stranger taking a photo of me this afternoon while I was out biking round the river Mersey this afternoon.

stranger.jpg

Cake photo by Robert Marshall

Codethink is hiring!

We are looking for people who can write code, who match one of these job descriptions at least slightly, and who are willing to relocate to Manchester, UK (so you must either be an EU resident, or able to get a work permit for the UK.) Manchester is number 8 in Lonely Planet’s Best In Travel list for 2016, so really you’d be doing yourself a favour to move here. Remote working is possible if you have lots of contributions to public software projects that demonstrate your amazingness.

There is a nice symmetry to this blog post, I remember reading a similar one quite a few years ago, which led to me applying for a job at Codethink, and i’ve been here ever since, with various trips to exotic countries in between.

If you’re interested, send a CV & cover letter to jobs@codethink.co.uk.

The Hypercall API of the Rump Kernel project

Today I’m at the New Directions In Operating Systems conference in Shoreditch in London. It’s a cool thing about working for Codethink that they let you go to conferences like these and even pay for the extortionate hotel and train fares involved!

There is fascinating stuff being talked about, I think Robert Watson summed up the theme well as “diagrams with boxes pointing at each other” which is pretty much what we seem to do.

One of the more dramatic rearrangements of the traditional diagram of an operating system is the rump kernel concept, which a couple of people spoke about. Broadly there are some great implications, such as that you run and debug kernel drivers from user space. In practice right now you are limited to NetBSD kernel drivers so it’s not so immediately relevant to me (but perhaps it’ll make NetBSD more relevant to me in future, who knows…!).

The other implication of a rump kernel is you can add a libc and an “application” and you’ve created a really minimal kind of container. Other talks at the conference seemed to be expanding the concept of the “application” to encompass large, distributed networks of systems, but here the “application” needs to be something simple enough to not even need fork().

I wanted to make this fuzzy “container” concept a bit more concrete. All of the diagrams in Antti’s talk slides labelled the bit between the rump kernel and whatever it’s running on as the “hypercall API”, but didn’t dig into the concept. I did a little digging myself and found:

Other implentations exist for bare metal and Xen. I think the Hypercall interface is one of the key insights of the Rump Kernel project: it’s the most minimal definition of what could be called a type of ‘container’ that I’ve seen. All the applications that can run on top of a NetBSD rump kernel can ultimately boil down to that.

There seems to be one big rabbit hole in it:

int rumpuser_open(const char *name, int mode, int *fdp)

Open name for I/O and associate a file descriptor with it.  Notably,
there needs to be no mapping between name and the host's file system
namespace.  For example, it is possible to associate the file descriptor
with device I/O registers for special values of name.

That’s basically your communication channel with the world. What if you want to talk to a graphics device? I guess if the ‘hypercall host’ wants to let you, it can let you rumpuser_open() it and send data…

If this is a type of container interface we can compare it with the container interface used by Docker: that is to say, a fairly large subset of the enourmous Linux syscall API. Whatever the future holds for the rump kernel project and its ‘hypercall’ API, it’s unlikely to grow as large as Linux! By developing an app that targets the Hypercall interface (whether this involves the NetBSD anykernel or something else doesn’t really matter) you reduce the attack surface between the application and the host OS to a very small amount of code: there’s currently 9,248 sLOC in the POSIX implementation of librumpuser, for example.

This whole concept is still fairly cloudy in my head and I doubt it’s something I’m going to work with any time soon. All the information above may be totally wrong and misleading, too, please help by correcting my understanding you are able!