Writing well

We rely on written language to develop software. I used to joke that I worked as a professional email writer rather than a computer programmer (and it wasn’t really a joke). So if you want to be a better engineer, I recommend that you focus some time on improving your written English.

I recently bought 100 Ways to Improve Your Writing by Gary Provost, a compact and rewarding book full of simple and widely applicable guidelines for writers. My advice is to buy a copy!

You can also find plenty of resources online. Start by improving your commit messages. Since we love to automate things, try these shell scripts that catch common writing mistakes. And every time you write a paragraph simply ask yourself: what is the purpose of this paragraph? Is it serving that purpose?
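As a rough idea of what such automated checks can look like, here is a minimal sketch in Python. It is not one of the shell scripts mentioned above. The function names and the `WEASEL_WORDS` list are made up for illustration, and the word list is only a starting point.

```python
import re

# Illustrative list only (an assumption); extend it to suit your own habits.
WEASEL_WORDS = {"very", "fairly", "quite", "extremely", "various", "several"}


def find_repeated_words(text):
    """Return (line_number, word) for each word typed twice in a row."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # \1 refers back to the first word, so "builds builds" matches.
        for match in re.finditer(r"\b(\w+)\s+\1\b", line, flags=re.IGNORECASE):
            hits.append((number, match.group(1).lower()))
    return hits


def find_weasel_words(text):
    """Return (line_number, word) for each word found in WEASEL_WORDS."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z']+", line):
            if word.lower() in WEASEL_WORDS:
                hits.append((number, word.lower()))
    return hits


if __name__ == "__main__":
    draft = "This means it builds builds on earlier work.\nIt is very clear."
    print(find_repeated_words(draft))
    print(find_weasel_words(draft))
```

A check like this only finds repeats on a single line, so a word duplicated across a line break will slip through. It is meant as a prompt to reread a sentence, not as a verdict on it.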

Native speakers and non-native speakers will both find useful advice in Gary Provost’s book. In the UK school system we aren’t taught this stuff particularly well. Many English-as-a-second-language courses don’t teach how to write on a “macro” level either, which is sad because there are many differences from language to language that non-natives need to be aware of. I have seen “Business English” courses that focus on clear and convincing communication, so you may want to look into one of those if you want more than just a book.

Code gets read more than it gets written, so it’s worth taking extra time so that it’s easy for future developers to read. The same is true of emails that you write to project mailing lists. If you want to make a positive change to development of your project, don’t just focus on the code — see if you can find 3 ways to improve the clarity of your writing.

Posted in Uncategorized | 2 Comments

GUADEC 2018 Videos: All Done

All the editing & uploading for the GUADEC videos is now finished. The videos were all uploaded to YouTube some time ago, and they are all now available on http://videos.guadec.org/2018 as well.

Thanks to everyone who helped with the editing: Alexis Diavatis, Bin Li, Garrett LeSage, Alexandre Franke (who also did a lot of the work of uploading to YouTube), and Hubert Figuiere (who managed to edit so many that I’m suspicious he might be some kind of robot in disguise).

edit: If you are hungry for more videos to edit, some footage from GUADEC 2002 has been unearthed. It’d be great to have some of this history from fifteen years ago up on YouTube! If you’re interested, reply to the mail or speak up in #guadec on GIMPnet and we can coordinate efforts.


Natural Language Processing

This month I have been thinking about good English sentence and paragraph structure. Non-native English speakers who are learning to write in English will often think of what they want to say in their first language and then translate it. This generally results in a mess. The precise structure of the mess will depend on the rules of the student’s first language. The important thing is to teach the conventions of good English writing; but how?

Visualizing a problem helps to solve it. However, there doesn’t seem to be a tool available today that can clearly visualize the various concerns writers have to deal with. A paragraph might contain 100 words, each of which relates to the others in some way. How do you visualize that clearly… not like this, anyway.

I did find some useful resources though. I discovered the Paramedic Method, through this blog post from helpscout.net. The Paramedic Method was devised by Richard Lanham and consists of these 6 steps:

  1. Highlight the prepositions.
  2. Highlight the “is” verb forms.
  3. Find the action. (Who is kicking whom?)
  4. Change the action into a simple active verb.
  5. Start fast—no slow windups.
  6. Read the passage out loud with emphasis and feeling.

This is good advice for anyone writing English. It’ll be particularly helpful in my classes in Spain where we need to clean up long strings of relative clauses. (For example, a sentence such as “On the way I met one of the workers from the company where I was going to do the interview that my friend got for me”. I would rewrite this as: “On the way I met a person from Company X, where my friend had recently got me an interview.”)
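The first two steps of the Paramedic Method are mechanical enough to sketch in code. The Python below is an illustration only. The `PREPOSITIONS` and `BE_FORMS` lists are short hand-picked assumptions, and a plain word list cannot tell the preposition “to” from the “to” of an infinitive, so the output should be read as a hint rather than as a grammatical analysis.

```python
import re

# Small hand-picked lists (assumptions for this sketch, far from complete).
PREPOSITIONS = {"of", "in", "on", "to", "for", "from", "with", "by", "at", "about"}
BE_FORMS = {"is", "are", "was", "were", "be", "been", "being", "am"}


def mark_sentence(sentence):
    """Wrap prepositions in [ ] and forms of 'to be' in { }."""
    def mark(match):
        word = match.group(0)
        lower = word.lower()
        if lower in PREPOSITIONS:
            return "[" + word + "]"
        if lower in BE_FORMS:
            return "{" + word + "}"
        return word

    return re.sub(r"[A-Za-z]+", mark, sentence)


def count_marks(sentence):
    """Count how many words fall into each highlighted group."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", sentence)]
    return {
        "prepositions": sum(w in PREPOSITIONS for w in words),
        "be_forms": sum(w in BE_FORMS for w in words),
    }


if __name__ == "__main__":
    before = ("On the way I met one of the workers from the company where "
              "I was going to do the interview that my friend got for me")
    after = ("On the way I met a person from Company X, where my friend "
             "had recently got me an interview.")
    print(mark_sentence(before))
    print(count_marks(before), count_marks(after))
```

Comparing the counts before and after a rewrite gives a crude measure of how much slack was removed.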

I found a tool called Write Music which I like a lot. The idea is simple: to illustrate and visualize the rule that varying sentence length is important when writing. The creator of the tool, Titus Wormer, seems to be doing some interesting and well documented research.
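The underlying idea is easy to try in a terminal. The following Python sketch is not how Write Music works internally. It simply counts the words in each sentence and draws one bar per sentence, and it assumes a naive split on full stops, question marks and exclamation marks, so abbreviations such as “e.g.” will be counted as sentence endings.

```python
import re


def sentence_lengths(text):
    """Split text on ., ! and ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def length_chart(text):
    """Return one line per sentence: the word count followed by a bar."""
    return "\n".join(f"{n:3d} " + "#" * n for n in sentence_lengths(text))


if __name__ == "__main__":
    sample = ("This sentence has five words. Here are five more words. "
              "Now I vary the sentence length, and I create music.")
    print(length_chart(sample))
```

A column of bars that are all the same length is a sign that the rhythm of the paragraph may be monotonous.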

I looked at a variety of open source tools for natural language processing. These provide good ways to tokenize a text and to identify the “part of speech” (noun, verb, adjective, adverb, etc.) but I didn’t yet find one that could analyze the types of clauses that are used. Which is a shame. My understanding of this area of English grammar is still quite weak and I was hoping my laptop might be able to teach me by example, but it seems not.

I found some surprisingly polished libraries that I’m keen to use for … something. One day I’ll know what. The compromise library for JavaScript can do all kinds of parsing and wordplay and is refreshingly honest about its limitations, and spaCy for Python also looks exciting. People like to interact with a computer through text. We hide the UNIX commandline. But one of the most popular user interfaces in the world is the Google search engine, which is a text box that accepts any kind of natural language and gives the impression of understanding it. In many cases this works brilliantly — I check spellings and convert measurements all the time using this “search engine” interface. Did you realize GNOME Shell can also do unit conversions? Try typing “50lb in kg” into the GNOME Shell search box and look at the result. Very useful! More apps should do helpful things like this.

I found some other amazing natural language technologies too. Inform 7 continues to blow my mind whenever I look at it. Commercial services like IBM Watson can promise incredible things like analysing the sentiments and emotions expressed in a text, and even the relationships expressed between the subjects and objects. It’s been an interesting day of research!


GUADEC 2018 Videos: Help Wanted

At this year’s GUADEC in Almería we had a team of volunteers recording the talks in the second room. This was organized very last minute as initially the University were going to do this, but thanks to various efforts (thanks in particular to Adrien Plazas and Bin Li) we managed to record nearly all the talks. There were some issues with sound on both the Friday and Saturday, which Britt Yazel has done his best to overcome using science, and we are now ready to edit and upload the 19 talks that took place in the 2nd room.

To bring you the videos from last year we had a team of 5 volunteers from the local team who spent our whole weekend in the Codethink offices. (Although none of us had much prior video editing experience so the morning of the first day was largely spent trying out different video editors to see which had the features we needed and could run without crashing too often… and the afternoon was mostly figuring out how transitions worked in Kdenlive).

This year, we don’t have such a resource and so we are looking to distribute the editing. If you can, please get involved so we can share the videos as soon as possible!

The list of videos and a step-by-step guide on how to edit them is available at https://wiki.gnome.org/GUADEC/2018/Video. The guide is written for people who have never done video editing before and recommends that you use Kdenlive; if you’re already familiar with a different tool then of course feel free to use that instead and just use the process as a guideline. The first video is already up, so you can also use this as a guide to follow.

If you want to know more, get in touch on the GUADEC mailing list, or the #guadec IRC channel.



Tagcloud

The way we organize content on computers hasn’t really evolved since the arrival of navigational file managers in the late 1980s. We have been organizing files into directories for decades. Perhaps the biggest change anyone has managed since then is that we now call directories “folders” instead, and that we now obscure the full directory tree, pointing users instead towards certain entry points such as the “Music”, “Downloads” and “Videos” folders inside their home directory.

It’s 2018 already. There must be a better way to find content than to grope around in a partially obscured tree of files and folders?

GNOME has been innovating in this area for a while, and one of the results is the Tracker search and indexing tool which creates a database of all the content it finds on the user’s computer and allows you to run arbitrary queries over it. In principle this is quite cool as you can, for example, search for all photos taken within a given time period, all songs by a specific artist, all videos above a certain resolution ordered by title, or whatever else you can think of (where the necessary metadata is available). However, the caveat is that for this to be at all useful you currently have to enjoy writing SPARQL queries on the commandline: Tracker itself is a “plumbing” component, and the only interface it provides is the tracker commandline tool.

There is ongoing work on content-specific user interfaces that can work with Tracker to access local content, so for photos for example you can use GNOME Photos to view and organize your whole photo collection. However, there isn’t a content-agnostic tool available that might let you view and organize all the content on your computer… other than Nautilus which is limited to files and folders.

I’m interested in organizing content using tags, which are nothing but freeform textual category labels. On the web, tags are a very common way of categorizing content. (The name hashtags is probably more widely understood than tags among web users, but hashtag has connotations of social media and sharing which don’t necessarily apply when talking about desktop content, so I will call them tags here.) Despite the popularity on the web, desktop support is low: Tagspaces seems to be the only option and the free edition is very limited in what it can do. Within GNOME, we have had support for storing tags in the Tracker database for many years but I don’t know of any applications that allow viewing or editing file tags.
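To give a flavour of what querying tags by hand involves, here is a small Python sketch that builds a SPARQL query string for “all items carrying a given tag”. The property names `nao:hasTag`, `nao:prefLabel` and `nie:url` are assumed here from the Nepomuk-style ontologies that Tracker uses; check them against the ontology documentation for your Tracker version before relying on them. The helper names are hypothetical, and the sketch only produces text for the commandline tool. It does not talk to Tracker itself.

```python
def escape_literal(value):
    """Escape backslashes and double quotes for a SPARQL string literal.

    Newlines and other control characters are not handled in this sketch.
    """
    return value.replace("\\", "\\\\").replace('"', '\\"')


def files_with_tag_query(label):
    """Build a query for the URLs of all items tagged with `label`.

    The nao: and nie: property names are assumptions, see the text above.
    """
    return (
        "SELECT ?url WHERE { "
        "?item nao:hasTag ?tag ; nie:url ?url . "
        f'?tag nao:prefLabel "{escape_literal(label)}" '
        "}"
    )


if __name__ == "__main__":
    print(files_with_tag_query("holiday"))
```

Even a one-line query like this shows why a graphical interface is needed: nobody wants to type graph patterns just to find their holiday photos.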

Around the time of GUADEC 2017 I read Alexandru’s blog post about tags in Nautilus, in which he announced that Nautilus wasn’t going to get support for organizing files using tags because it would conflict too much with the existing organization principle in Nautilus of putting files into folders. I agree with that logic, but it leaves open a question: when will GNOME get an interface that allows me to organize files using tags?

As it happened I had a bit of free time after GUADEC 2017 was finished and I started sketching out an application designed specifically for organizing content using tags.

The result so far looks like this:

This is really just a prototype, and there are lots more features I’d like to add or improve if I get the time, but it does support the basic use case of “add tags to my files” at this point and so I’ve started a stable release branch. The app is named Tagcloud and you can get it as a Flatpak .bundle of the 0.2.1 release from here. Note that it won’t autoupdate as this isn’t a proper Flatpak repo, just a bundle file.

Tagcloud is written using Python and PyGObject, and of course GTK+. I encountered several G-I bindings issues during development which mean that Tagcloud currently requires very new versions of GLib and GTK+ but the good news is that by using the Flatpak bundle you don’t need to care about any of that. Tagcloud uses Tracker internally and I’ve been thinking a lot about how to make Tracker work better for application developers; these thoughts are quite lengthy and not really complete yet so I will save them for a separate blog post.

One of the key principles of Tagcloud is that it should recognize any type of content, so for example you can group together photos, documents and videos related to a specific project. In future I would also like to see GNOME’s content-specific applications such as Photos and Documents recognize tags; this shouldn’t require too much plumbing work since everything seems to be tending towards using Tracker as a backend, but it would of course affect the user interfaces of those apps.

I haven’t yet mentioned on this blog that a couple of months ago I quit my job at Codethink and right now I’m training to be a language teacher. So I imagine that I will have very little time available to work on Tagcloud for a while, but if you like, please do send issue reports and patches to https://gitlab.com/samthursfield/tagcloud. I will be at GUADEC 2018 and hopefully we can have lots of exciting discussions about applying tags to things. And for the future … while I would like Tagcloud to become a fully fledged application, I will also be happy if it serves simply as a prototype and as a way of driving improvements in Tracker which will then benefit all of GNOME’s content apps.


How BuildStream uses OSTree

I’ve been asked a few times about the relationship between BuildStream and OSTree. The answer is a bit complicated so I decided to answer the question here.

OSTree is a content-addressed content store, inspired in many ways by Git but optimized for storing trees of binary files rather than trees of text files.
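If “content-addressed” is an unfamiliar term, the toy Python class below may help. It is not how OSTree is implemented. It is an in-memory sketch, with the hypothetical name `ContentStore`, of the general idea that an object’s name is derived from a hash of its contents. SHA-256 over the raw bytes is used here purely for illustration.

```python
import hashlib


class ContentStore:
    """A toy in-memory content-addressed store (illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store data and return its address, the hex digest of its content."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same key, so it is stored once.
        self._objects.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Look up previously stored data by its address."""
        return self._objects[digest]

    def __len__(self):
        return len(self._objects)


if __name__ == "__main__":
    store = ContentStore()
    first = store.put(b"libfoo.so contents")
    second = store.put(b"libfoo.so contents")
    print(first == second, len(store))
```

Because the address depends only on the content, storing the same file twice costs nothing extra, and anyone holding an address can verify that the data they received matches it.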

BuildStream is an integration tool which deals with trees of binary files, and at present it uses OSTree to help with storing, identifying and transferring these trees of binary files.

I’m deliberately using the abstract term “trees of binary files” here because neither BuildStream nor OSTree limits itself to a particular use case. BuildStream itself uses the term “artifact” to describe the output of a build job and in practice this could be the set of development headers and documentation for a library, a package file such as a .deb or .rpm, a filesystem for a whole operating system, a bootable VM disk image, or whatever else.

Anyway let’s get to the point! There are actually four ways that BuildStream directly makes use of OSTree.

The `ostree` source plugin

The `ostree` source plugin allows pulling arbitrary data from a remote OSTree repository. It is normally used with an `import` element as a way of importing prebuilt binaries into a build pipeline. For example BuildStream’s integration tests currently run on top of the Freedesktop SDK binaries (which were originally intended for use with Flatpak applications but are equally useful as a generic platform runtime). The gnome-build-meta project uses this mechanism to import a prebuilt Debian base image, which is currently manually pushed to an OSTree repo (this is a temporary measure, in future we want to base gnome-build-meta on top of the upcoming Freedesktop SDK 1.8 instead).

It’s also possible to import binaries using the `tar` and `local` source types of course, and you can even use the `git` or `bzr` plugins for this if you really get off on using the wrong tools for the wrong job.

In future we will likely add other source plugins for importing binaries, for example from the Docker Registry and perhaps using casync.

Storing artifacts locally

Once a build has completed, BuildStream needs to store the results somewhere locally. The results go in the exciting-sounding “local artifact cache”, which is usually located inside your home directory at ~/.cache/buildstream/artifacts.

There are actually two implementations of the local artifact cache, one using OSTree and one using .tar files. There are several advantages to the OSTree implementation, a major one being that it deduplicates files that are present in multiple artifacts, which can save huge amounts of disk space if you do many builds of a large component. The biggest disadvantage to using OSTree is that it currently relies on a bunch of features that are specific to the Linux kernel and so it can only run on Linux OSes. BuildStream needs to support other UNIX-like operating systems and we found the simplest route for now to solve this was to implement a second type of local artifact cache which stores each artifact as a separate .tar file. This is less efficient in terms of disk space but much more portable.
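The disk space trade-off can be illustrated with a back-of-the-envelope calculation. The Python sketch below models each artifact as a dictionary of file paths to contents and compares two hypothetical strategies: keeping a full copy of every file per artifact, and keeping each distinct file content once. It ignores archive overhead, compression and metadata, and it does not reflect the real on-disk layout of either cache implementation.

```python
import hashlib


def size_as_separate_archives(artifacts):
    """Bytes stored when every artifact keeps its own copy of every file."""
    return sum(len(data) for files in artifacts.values() for data in files.values())


def size_with_deduplication(artifacts):
    """Bytes stored when identical file contents are kept only once."""
    unique = {}
    for files in artifacts.values():
        for data in files.values():
            # Key by content hash so repeated contents overwrite the same entry.
            unique[hashlib.sha256(data).hexdigest()] = len(data)
    return sum(unique.values())


if __name__ == "__main__":
    # Two made-up builds that differ only in a small documentation file.
    artifacts = {
        "build-1": {"bin/app": b"A" * 1000, "share/doc": b"docs v1"},
        "build-2": {"bin/app": b"A" * 1000, "share/doc": b"docs v2"},
    }
    print(size_as_separate_archives(artifacts))
    print(size_with_deduplication(artifacts))
```

The more builds share unchanged files, the larger the gap between the two numbers becomes.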

So the fact that we use OSTree for caching artifacts locally should be considered an implementation detail of BuildStream. If a better tool for the job is found then we will switch to that. The precise structure of the artifacts should also be considered an internal detail: it’s possible to check artifacts out from the cache by poking around in the ~/.cache/buildstream/artifacts directory but there’s no stability guarantee in how you do this or what you might get out as a result. If you want to see the results of a build, use the `bst checkout` command.

It’s worth noting that we don’t yet support automated cleanups of the local artifact cache; that is issue #135.

Storing artifacts remotely

As a way of saving everyone from building the same things, BuildStream supports downloading prebuilt artifacts from a remote cache.

Currently the recommended way of setting up a remote artifact cache requires that you use OSTree. In theory, any storage mechanism could be used but that is currently not trivial because we also make use of OSTree’s transfer protocols, as described below.

We currently lack a way to do automated artifact expiry on remote caches.

Pushing and pulling artifacts

Of course there needs to be a way to push and pull artifacts between the local cache and the remote cache.

OSTree is designed to support downloading artifacts over HTTP or HTTPS and this is how `bst pull` works. The `bst push` command is more complex because officially OSTree does not support pushing. However, we have a rather intricate push mechanism based off Dan Nicholson’s ostree-push project which tunnels the data over SSH in order to get it onto the remote server.

Users of the tar cache cannot currently interact with remote artifact shares at all, which is an unfortunate issue that we aim to solve this year. The solution may be to switch away from using OSTree’s transfer protocols and instead marshal the data into some other format in order to transfer it. We are particularly keen to make use of the Bazel content-addressable store protocol, although there may be too much of an impedance mismatch there.

Indirect uses of OSTree

It may be that you also end up deploying stuff into an OSTree repository somewhere. BuildStream itself is only concerned with building and integrating your project. Once that is done you run `bst checkout` and are rewarded with a tree of files on your local machine. What if, let’s say, your project aims to build a Flatpak application?

Flatpak actually uses OSTree as well and so your deployment step may involve committing those files into yet another OSTree repo ready for Flatpak to run them. (This can be a bit long winded at present so there will likely be some better integration appearing here at some point).

So, is anywhere safe from the rise of OSTree or is it going to take over completely? Something you might not know about me is that I grew up outside a town in north Shropshire called Oswestry. Is that a coincidence? I can’t say.

 


Oswestry, from Wikipedia.


2017 in review

I began this year in a hedge in Mexico City and immediately had to set off on a 2 day aeroplane trek back to Manchester to make a very tired return to work on the 3rd January. From there things calmed down somewhat and I was geared up for a fairly mundane year but in fact there have been many highlights!

The single biggest event was certainly bringing GUADEC 2017 to Manchester. I had various goals for this such as ensuring we got a GUADEC 2017, showing my colleagues at Codethink that GNOME is a great community, and being in the top 10 page authors on wiki.gnome.org for the year. The run up to the event from about January to July took up many evenings and it was sometimes hard to trade it off with my work at Codethink; it was great working with Allan, Alberto, Lene and Javier though and once the conference actually arrived there was a mass positive force from all involved that made sure it went well. The strangest moment was definitely walking into Kro Bar slightly before the preregistration event was due to start to find half the GNOME community already crammed into the tiny bar area waiting for something to happen. Obviously my experience of organizing music events (where you can expect people to arrive about 2 hours after you want them somewhere) didn’t help here.

Codethink provides engineers with a travel budget and a little bit of extra leave for attending conferences; obviously what with GUADEC being in Manchester I didn’t make a huge use of that this year, but I did make it to FOSDEM and also to PyConES which took place in the beautiful city of Cáceres. My friend Pedro was part of the organizing team and it was great to watch him running round fighting fires all day while I relaxed and watched the talks (which were mostly all trying to explain machine learning in 30 minutes with varying degrees of success).

Work wise I spent most of my year looking at compilers and build tools, perhaps not my dream job but it’s an enjoyable area to work in because (at least in terms of build tools) the state of the art is comically bad. In 10 years we will look back at GNU Autotools in the way we look at a car that needs to be started with a hand crank, and perhaps the next generation of distro packagers will think back in wonder at how their forebears had to individually maintain dependency and configuration info in their different incompatible formats.

BuildStream is in a good state and is about to hit 1.0; it’s beginning to get battle tested in a couple of places (one of these being GNOME) which is no doubt going to be a rough ride, and I already have a wide selection of performance bottlenecks to be looking at in the new year. But it’s looking already like a healthy community and I want to say thanks to everyone who has already got behind the project.

It also seems to have been a great year for Meson; something that has been a long time coming but seems to be finally bringing Free Software build systems into the 21st century. Last year I ported Tracker to build with Meson, and have been doing various ongoing fixes to the new build system. We’re not yet able to fully switch away from Autotools, primarily because of issue #2166, and also because of some Tracker test suite failures that seem to only show up with Meson that we haven’t yet dug into fully.

With GUADEC out of the way I managed to spend some time prototyping something I named Tagcloud. This is the next iteration of a concept that I’ve wanted since more or less forever, that of being able to apply arbitrary tags to different local and online resources in a nice way. On the web this is a widespread concept but for some reason the desktop world doesn’t seem to buy into it. Tracker is a key part of this puzzle, as it can deal with many types of content and can actually already handle tags if you don’t mind using the commandline, so part of my work on Tagcloud has been making Tracker easy to embed as a subproject. This means I can try new stuff without messing up any session-wide Tracker setup, and it builds on some great work Carlos has been doing to modernize Tracker as well. I’ve been developing the app in Python, which has required me to fix issues in Tracker’s introspection bindings (and GLib’s, and GTK+’s … on the whole I find the PyGObject experience pretty good and it’s obviously been a massive effort to get this far, but at the same time these teething issues are quite demotivating). Anyway I will post more about Tagcloud in the new year once some of the ideas are a bit further worked out; and of course it may end up going nowhere at all but it’s been nice to actually write a GTK+ app for the first time in ages, and to make use of Flatpak for the first time.

It’s also been a great year for the Flatpak project; and to be honest if it wasn’t for Flatpak I would probably have talked myself out of writing a new app before I’d even started. Previously the story for getting a new app to end users was that you must either be involved or know someone involved in a distro or two so that you can have 2+ year old versions of your app installable through a package manager; or your users have to know how to drive Git and a buildsystem from the commandline. Now I can build a flatpak bundle every time I push to master and link people straight to that. What a world! And did I mention GitLab? I don’t know how I ever lived without GitLab CI and I think that GNOME’s migration to GitLab is going to be *hugely* beneficial for the project.

Looking back it seems I’ve done more programming stuff than I thought I had; perhaps a good sign that you can achieve stuff without sacrificing too much of your spare time.

It’s also been a good year music wise. Manchester continues to have a fantastic music scene which has only got better with the addition of the Old Abbey Taphouse, where I in fact spent the last 4 Saturdays in a row. Last Saturday we put on Babar Luck; I saw a great gig of his 10 years ago and have managed to keep missing him ever since, but things finally worked out this time. Other highlights have been Paddy Steer, Baghdaddies and a very cold gig we did with the Rubber Duck Orchestra on the outdoor stage on a snowy December evening.

I caught a few gigs by Henge who only get better with time and who will hopefully break way out of Manchester next year. And in September I had the privilege of watching Jeffrey Lewis supported by The Burning Hell in a little wooden hut outside Lochcarron in Scotland. That was certainly a highlight despite being ill and wearing cold shoes.

I didn’t actually know much of Scotland until taking the van up there this year; I was amazed that such a beautiful place has been there the whole time just waiting there 400 miles north. This expedition was originally planned to be a bike trip but ended up being a road trip, and having now seen the roads that is probably for the best. However we did manage a great bike trip around the Netherlands and Belgium, the first time I’ve done a week long bike trip and hopefully the beginning of a new tradition! Last year I did a lot of travel to crazily distant places. It’s a privilege to be able to do so but one that I prefer to use sparingly, so it was nice to get around closer to home this year.

All in all a pretty successful year, not straightforward at times but one with several steps in the directions I wanted to head. Let’s see what next year holds 🙂
