September 6, 2023 Sam Thursfield

Quickball media server v2

As a Linux hacker-type I am often searching for some way to apply my rather specialized skillset to a real world problem. And I am always after some sovereignty over my music collection. So I came up with the idea to make some kind of music player using a Raspberry Pi in an old radio case.

In 2020 I got as far as setting up the Pi, in a cardboard box rather than the imagined retro radio case. It’s worked mostly OK since then, though I always had some friction with the Raspbian Linux distribution: Debian is not the most up-to-date distro, and Raspbian is not even up to date with the latest Debian release. But Raspbian was the only OS with sufficient driver support to be useful for workloads involving graphics, audio, or Bluetooth.

Until now, it seems! After lots of great work from various folk, Fedora now has full Pi support, using a UEFI bootloader, with graphics acceleration, etc.

Recently I did that thing where you try to mix packages from different Debian repos, thinking you might be able to do it safely this time, and the system winds up broken beyond repair, as usual. So I decided to throw it away and start again on top of the rpm-ostree based Fedora IoT distro.

The Pi is now back in action and mostly working, and the setup experience was tolerable. Here’s a summary of what I did.

The Base

It’s a Pi 4 connected to:

a TV via HDMI
an external HD via USB
ethernet via a cable

The Fedora IoT install instructions are good, although somehow I didn’t get the right SSH key installed and instead had to set init=/bin/sh on first boot to gain access.

I created a regular user named “pi” and did the usual prep for a toy system: disable SELinux and disable firewalld, to avoid losing precious motivation on those.

I want to access the Pi by name; so I enabled Multicast DNS in the systemd-resolved config, and now I can refer to it with a name like pi.local on my home network.

All of the setup is automated in Ansible playbooks, so when I break the machine I can rebuild it without having to go back to this blog post.

Audio

The Pi’s analogue audio output isn’t supported in Fedora IoT 38, but HDMI audio is. HDMI audio is handled by Broadcom VideoCore 4 hardware, driven by the vc4 driver.

As of Fedora IoT 38 this driver is blocklisted on the kernel commandline, and HDMI audio devices only appear after running modprobe vc4. Removing the block caused Bluetooth to stop working; I don’t really want to know what’s going on here, so instead I added a systemd unit to run modprobe vc4 after basic.target, and moved on.

I also had to add my pi user to the audio system group, done using systemd-sysusers in `/etc/sysusers.d`. Now I can see the audio devices, and (once installed) so can Pipewire and Wireplumber:

> aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: vc4hdmi0 [vc4-hdmi-0], device 0: MAI PCM i2s-hifi-0 [MAI PCM i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: vc4hdmi1 [vc4-hdmi-1], device 0: MAI PCM i2s-hifi-0 [MAI PCM i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

Amazingly, using aplay I can now play audio from my TV.

Bluetooth

The Pi has a combined Bluetooth+Wifi adapter, the BCM4345C0. By default this needs manual probing with btattach to make it appear. Following the overlays README, we can add this line to /boot/efi/config.txt for it to appear automatically on boot:

dtparam=krnbt=on

You can confirm it’s working by checking if /sys/class/bluetooth/ is populated with the hci0 controller.
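That check is easy to script. Here is a minimal sketch, assuming controllers show up as directory entries named hci0, hci1 and so on; the function name is made up for this example.

```python
from pathlib import Path


def list_bluetooth_controllers(sysfs_dir="/sys/class/bluetooth"):
    """Return the sorted names of controller entries (hci0, hci1, ...)."""
    root = Path(sysfs_dir)
    if not root.is_dir():
        # Nothing to list if the sysfs class directory does not exist.
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        # Keep only names of the form "hci<number>"; skip anything else.
        if entry.name.startswith("hci") and entry.name[3:].isdigit()
    )


if __name__ == "__main__":
    controllers = list_bluetooth_controllers()
    print(controllers or "no Bluetooth controller found")
```

An empty result right after boot would suggest the config.txt change has not taken effect.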

From there I followed this guide from our friends at Collabora, “Using a Raspberry Pi as a Bluetooth speaker with PipeWire” and, rather amazingly, I managed to cast from my phone to the television a couple of times, after manually trusting and connecting the device from bluetoothctl on the commandline.

Ideally I want to be able to stream audio without having to get out my laptop and log in to the speaker via SSH each time I arrive home so that it connects to my phone. Another short Python script named bluetooth-autoconnect solves that by watching for trusted devices to appear and connecting with them automatically. So I only have to SSH into the machine once per device to set it as trusted.
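The core decision such a script has to make is small. The sketch below is not the code of bluetooth-autoconnect; it only shows the selection step, with devices represented as plain dicts that use the Trusted and Connected property names from BlueZ’s Device1 D-Bus interface. The function name and the device paths are invented for illustration, and the D-Bus plumbing is left out entirely.

```python
def devices_to_connect(devices):
    """Pick trusted devices that are not yet connected.

    `devices` maps a device object path to a dict of its properties,
    shaped like what a D-Bus client would read from BlueZ.
    """
    return sorted(
        path
        for path, props in devices.items()
        if props.get("Trusted", False) and not props.get("Connected", False)
    )


if __name__ == "__main__":
    # Hypothetical devices, only here to exercise the function.
    known = {
        "/org/bluez/hci0/dev_AA_BB_CC_DD_EE_01": {"Trusted": True, "Connected": False},
        "/org/bluez/hci0/dev_AA_BB_CC_DD_EE_02": {"Trusted": True, "Connected": True},
        "/org/bluez/hci0/dev_AA_BB_CC_DD_EE_03": {"Trusted": False, "Connected": False},
    }
    for path in devices_to_connect(known):
        print("would connect:", path)
```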

I still get connection failures sometimes; restarting Wireplumber usually works around those.

External hard disk

The external HD full of music and media doesn’t get automatically mounted, so I installed UDisks2 and udiskie to automatically mount it. I added a storage group containing my ‘pi’ user, and a service running udiskie as ‘pi’ so that the external disk is mounted by this user.

The external disk I used was NTFS-formatted, which was a terrible idea. Every time the machine booted the filesystem would be marked dirty, no matter how cleanly I unmounted it before shutdown. I gave up on NTFS and copied the data to a second disk which is ext4-formatted.

NTFS: just say no.

Kodi

I figured getting the Kodi media server to work would be the biggest challenge. On Raspbian it’s as simple as apt install kodi and everything just works, but Fedora don’t even package Kodi (presumably due to patent issues?).

The right way to do this is to use Flatpak. The ARM64 build of Kodi is now ready. (Pro tip, you can push a draft PR to test your changes on the Flathub ARM builder rather than spending a week building it on an emulated ARM device on a small laptop.)

Flatpak needs a display server to work with, and my plan is to run a minimal Weston compositor following these instructions.

Other services

For completeness, here are the other services I’m running. These are all containers, in keeping with the design of Fedora IoT.

Caddy web server (using official Alpine-based image)
Jellyfin media server (using official container image)
Samba (using ServerContainers/samba image)

One very nice thing about switching from Docker to Podman is the improved systemd integration. If you just deploy a container then it doesn’t persist across reboots, so you need a systemd service as well. Podman can generate these, and Ansible’s containers.podman.podman_container module makes it easy. The Ansible playbook to deploy Samba looks like this:

  - name: Samba container
    containers.podman.podman_container:
      name: samba
      ...
      # Let systemd manage service
      restart_policy: "no"
      generate_systemd:
        path: ~/.config/systemd/user
        restart_policy: on-failure

  - name: Start systemd unit
    systemd_service:
      name: container-samba
      state: restarted
      scope: user
      daemon_reload: true

No need to manually create a systemd unit for each container any more \o/

Calliope + Beets + Tracker Miner FS will be the next step, but first I need to build an ARM64 compatible container containing all the necessary pieces.

And perhaps that retro radio case – I do have some old cassette walkmans that might do the job.

Have you got any small home servers running? Show me!

January 25, 2023 Sam Thursfield

A small Rust program

I wrote a small program in Rust called cba_blooper. Its purpose is to download files from this funky looper pedal called the Blooper.

It’s the first time I finished a program in Rust. I find Rust programming a nice experience, after a couple of years of intermittent struggle to adapt my existing mental programming models to Rust’s conventions.

When I finished the tool I was surprised by the output size – initially a 5.6MB binary for a tool that basically just calls into libasound to read and write MIDI. I followed the excellent min-sized-rust guide and got that down to 1.4MB by fixing some obvious mistakes such as actually stripping the binary and building in release mode. But 1.4MB still seems quite big.

Next I ran cargo bloat and found there were two big dependencies:

the ‘clap’ argument parser library
the ‘regex’ library, pulled in by ‘env_logger’

I got ‘env_logger’ to shrink by disabling its optional features in Cargo.toml:

env_logger = { version = "0.10.0", default-features = false, features = [] }

As for ‘clap’, it seems unavoidable that it adds a few hundred KB to the binary. There’s an open issue here to improve that. I haven’t found a more minimal argument parser that looks anywhere near as nice as ‘clap’, so I guess the final binary will stay at 842KB for the time being. As my bin/ directories fill up with Rust programs over the next decade, this overhead will start to add up, but it’s fine for now.

It’s thanks to Rust being fun that this tool exists at all. It’s definitely easier to make a small C program, but the story around distribution and dependency management for self-contained C programs is so not-fun that I probably wouldn’t even have bothered writing the tool in the first place.

October 18, 2022 Sam Thursfield

Status update 18/10/2022

The most important news this week is that my musical collaborator Vladimir Chicken just released a new song about Manchester’s most famous elephant. It was released with a weird B-side about a “Baboon on the Moon”; I am not sure what he was thinking with that one.

I posted on discourse.gnome.org already about GNOME OpenQA testing. Now that the tests are up to date, I’m aiming to keep an eye on them for a full release cycle and see how much ongoing maintenance effort they need. Hopefully at next year’s GUADEC we’ll be able to talk about moving this beyond an “alpha” service. We’ll soon have something like GNOME Continuous back in action after “only” 6 years of downtime.

Other exciting things in this area: Abderrahim Kitouni and Jordan Petridis have updated gnome-build-meta to track exact refs in its Git history; there are some details to work out so that it still provides quick CI feedback but this was basically necessary to ensure build reproducibility. And Tristan Van Berkom already blogged about research to use Recc inside BuildStream, with the eventual goal of unlocking fast incremental builds within the reproducibility guarantees that BuildStream already provides.

There is no direct link between these projects but I think we share the common vision that Colin Walters already laid out 10 years ago when describing Continuous: GNOME contributors need to be able to develop and test system-level changes involving GNOME, using a reliable & documented process with modest hardware requirements. Many issues and bug reports go beyond a single component, and in many cases right down to the kernel. As an example, when a background indexing task causes lagging in the desktop shell, folk blame the background indexer process, but the indexer is not in control of its own scheduling and such an issue can’t be fully reproduced if we don’t control exactly which kernel is running. Hopefully when these streams of work come to fruition, these kinds of bugs will finally become “shallow”.

Outside of volunteer efforts, I’ve been working on a new client project that is essentially a complex database migration. I don’t get to do much database work at Codethink; it’s nice to have absolutely no legacy Makefiles to deal with for once, and it’s been a good opportunity to try out Nushell in a bit more depth. My research so far is mostly setting up Python scripts to run database queries and output CSV, then using Nushell to filter and sort the output. When I tried Nushell a few years ago it still lacked some important features – it didn’t even have a way to set variables at that point – now it’s prepared for anything you can throw at it and I look forward to doing more data processing with it.
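For a flavour of the Python side, here is a minimal sketch of a query-to-CSV script. The client database isn’t something I can show, so an in-memory SQLite database stands in for it, and the table and column names are invented for the example.

```python
import csv
import io
import sqlite3


def query_to_csv(connection, sql, out):
    """Run `sql` and write the rows to `out` as CSV with a header row."""
    cursor = connection.execute(sql)
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


def demo():
    # Hypothetical schema, only here to make the example self-contained.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    connection.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(2, "Bea"), (1, "Ali")]
    )
    out = io.StringIO()
    query_to_csv(connection, "SELECT id, name FROM customers ORDER BY id", out)
    return out.getvalue()


if __name__ == "__main__":
    print(demo(), end="")
```

Output like this can be piped into Nushell’s `from csv` and then filtered with commands such as `sort-by`.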

I’m not yet ready to switch completely from Fish to Nushell, but … who knows? Maybe it’s coming.

September 21, 2022 Sam Thursfield

Status update 21/09/22

Last week I attended OSSEU 2022 in Dublin, gave a talk about BuildStream 2.0 and the REAPI, and saw some new and old faces. Good times apart from the common cold I picked up on the way — I was glad that the event mandated face-masks for everyone so I could cover my own face without being the “odd one out”. (And so that we were safer from the 3+ COVID-19 cases reported at the event).

Being in the same room as Javier allowed some progress on our slightly “skunkworks” project to bring OpenQA testing to upstream GNOME. There was enough time to fix the big regressions that had halted testing completely since last year, one being an expired API key and the other, removal of virtio VGA support in upstream’s openqa_worker container. We prefer using the upstream container over maintaining our own fork, in the hope that our limited available time can go on maintaining tests instead, but the containers are provided on a “best effort” basis and since our tests are different to openqa.opensuse.org, regressions like this are to be expected.

I am also hoping to move the tests out of gnome-build-meta into a separate openqa-tests repo. We initially put them in gnome-build-meta because ultimately we’d like to be able to do pre-merge testing of gnome-build-meta branches, but since it takes hours to produce an ISO image from a given commit, it is painfully slow to create and update the OpenQA tests themselves. Now that Gitlab supports child pipelines, we can hopefully satisfy both use cases: one pipeline that quickly runs tests against the prebuilt “s3-image” from os.gnome.org, and a second that is triggered for a specific gnome-build-meta build pipeline and validates that.

First though, we need to update all the existing tests for the visual changes that occurred in the meantime, which are mostly due to gnome-initial-setup now using GTK4. That’s still a slow process as there are many existing needles (screenshots), and each time the tests are run, the Web UI allows updating only the first one to fail. That’s something else we’ll need to figure out before this could be called “production ready”, as any non-trivial style change to Adwaita would imply rerunning this whole update process.

All in all, for now openqa.gnome.org remains an interesting experiment. Perhaps by GUADEC next year there may be something more useful to report.

My main fascination this month besides work has been exploring “AI” image generation. It’s amazing how quickly this technology has spread – it seems we had a big appetite for generative digital images.

I am really interested in the discussion about whether such things are “art”, because I think this discussion is soon going to encompass music as well. We know that both OpenAI and Spotify are researching machine-generated music, and it’s particularly convenient for Spotify if they can continue to charge you £10 a month while progressively serving you more music that they generated in-house – and therefore reducing their royalty payments to record labels.

There are two related questions: whether AI-generated content is art, and whether something generated by an AI has the same monetary value as something a human made “by hand”. In my mind the answer is clear, but at the same time not quantifiable. Art is a form of human communication. Whether you use a neural network, a synthesizer, a microphone or a wax cylinder to produce that art is not relevant. Whether you use DALL-E 2 or a paintbrush is not relevant. Whether your art is any good depends on how it makes people feel.

I’ve been using Stable Diffusion to try and illustrate some of the sound worlds from my songs, and my favourite results so far are for Don’t Go Into The Zone:

And finally, a teaser for an upcoming song release…

[Image: An elephant with a yellow map background]

April 10, 2021 (updated July 30, 2021) Sam Thursfield

Calliope, slowly building steam

I wrote in December about Calliope, a small toolkit for building music recommendations. It can also be used for some automation tasks.

I added a bandcamp module which lists albums in your Bandcamp collection. I sometimes buy albums and then don’t download them because maybe I forgot or I wasn’t at home when I bought it. So I want to compare my Bandcamp collection against my local music collection and check if something is missing. Here’s how I did it:

# Albums in your online collection that are missing from your local collection.

ONLINE_ALBUMS="cpe bandcamp --user ssssam collection"
LOCAL_ALBUMS="cpe tracker albums"
#LOCAL_ALBUMS="cpe beets albums"

cpe diff --scope=album <($ONLINE_ALBUMS | cpe musicbrainz resolve-ids -) <($LOCAL_ALBUMS)

Like all things in Calliope this outputs a playlist as a JSON stream, in this case, a list of all the albums I need to download:

{
  "album": "Take Her Up To Monto",
  "bandcamp.album_id": 2723242634,
  "location": "https://roisinmurphy.bandcamp.com/album/take-her-up-to-monto",
  "creator": "Róisín Murphy",
  "bandcamp.artist_id": "423189696",
  "musicbrainz.artist_id": "4c56405d-ba8e-4283-99c3-1dc95bdd50e7",
  "musicbrainz.release_id": "0a79f6ee-1978-4a4e-878b-09dfe6eac3f5",
  "musicbrainz.release_group_id": "d94fb84a-2f38-4fbb-971d-895183744064"
}
{
  "album": "LA OLA INTERIOR Spanish Ambient & Acid Exoticism 1983-1990",
  "bandcamp.album_id": 3275122274,
  "location": "https://lesdisquesbongojoe.bandcamp.com/album/la-ola-interior-spanish-ambient-acid-exoticism-1983-1990",
  "creator": "Various Artists",
  "bandcamp.artist_id": "3856789729",
  "meta.warnings": [
    "musicbrainz: Unable to find release on musicbrainz"
  ]
}
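The output above is a sequence of JSON objects one after another rather than a single JSON array, so `json.loads` on the whole text would fail. Here is a minimal sketch of reading such a stream from Python; the helper name is my own, not part of Calliope.

```python
import json


def iter_json_stream(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    position = 0
    while True:
        # raw_decode() rejects leading whitespace, so skip it by hand.
        while position < len(text) and text[position].isspace():
            position += 1
        if position >= len(text):
            return
        item, position = decoder.raw_decode(text, position)
        yield item


if __name__ == "__main__":
    import sys

    for album in iter_json_stream(sys.stdin.read()):
        print(album.get("creator"), "-", album.get("album"))
```

Piping the diff command into this script would print one line per missing album.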

There are some interesting complexities to this, and in 12 hours of hacking I didn’t solve them all. Firstly, Bandcamp artist and album names are not normalized. Some artist names have spurious “The”, some album names have “(EP)” or “(single)” appended, so they don’t match your tags. These details are of interest only to librarians, but how can software tell the difference?
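To make the problem concrete, here is a naive normalization sketch. It is not what Calliope does; the two patterns only cover the cases named above, and the function names are invented.

```python
import re

# Trailing "(EP)" or "(single)" markers on album titles.
_SUFFIX = re.compile(r"\s*\((?:ep|single)\)\s*$", re.IGNORECASE)
# A leading "The " on artist names.
_LEADING_THE = re.compile(r"^the\s+", re.IGNORECASE)


def normalize_album(title):
    """Drop a trailing release-type marker and compare case-insensitively."""
    return _SUFFIX.sub("", title).strip().casefold()


def normalize_artist(name):
    """Drop a leading "The" and compare case-insensitively."""
    return _LEADING_THE.sub("", name.strip()).casefold()


if __name__ == "__main__":
    print(normalize_album("Multi-Love (single)"))
    print(normalize_artist("The Fiery Furnaces"))
```

The weakness is visible straight away: an album and a single with the same name now collide, and a band whose name really starts with “The” loses part of it.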

The simplest approach is to use Musicbrainz, specifically cpe musicbrainz resolve-ids. By comparing IDs where possible we get mostly good results. There are many albums not on Musicbrainz, though, which for now turn up as false positives. Resolving Musicbrainz IDs is a tricky process, too: how do we distinguish Multi-Love (album) from Multi-Love (single) if we only have an album name?
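The ID-first comparison can be sketched in a few lines. This is an illustration of the idea, not the implementation of cpe diff: it assumes each album is a dict with the field names from the output above, matches on musicbrainz.release_group_id when both sides have one, and falls back to a case-insensitive creator and album match otherwise. Using the release group rather than the release ID is my choice for the example.

```python
ID_FIELD = "musicbrainz.release_group_id"


def name_key(item):
    """Fallback key used when no Musicbrainz ID is available."""
    return (item.get("creator", "").casefold(), item.get("album", "").casefold())


def missing_albums(online, local):
    """Return the items in `online` that have no counterpart in `local`."""
    local_ids = {item[ID_FIELD] for item in local if ID_FIELD in item}
    local_names = {name_key(item) for item in local}
    missing = []
    for item in online:
        mbid = item.get(ID_FIELD)
        if mbid is not None and mbid in local_ids:
            continue  # same release group: names are allowed to differ
        if name_key(item) in local_names:
            continue  # no shared ID, but artist and title agree
        missing.append(item)
    return missing
```

Albums with no Musicbrainz entry and a slightly different spelling still fall through both checks, which is where the false positives come from.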

If you want to try it out, great! It’s still aimed at hackers — you’ll have to install from source with Meson and probably fix some bugs along the way. Please share the fixes!

December 18, 2020 Sam Thursfield

Calliope: Music recommendations for hackers

I started thinking about playlist generation software about 15 years ago. In that time, so much happened that I can’t possibly summarize it all here. I’ll just mention two things. Firstly, Spotify appeared, and proceeded to hire or buy most of the world’s music recommendation experts and make automatic playlists into a commodity. Secondly, I spent a lot of time iterating on a music tool I call Calliope.

Spotify or not?

Spotify’s discovery features can be a great way to find new music, but I’ve always felt like something was missing. The recommendations are opaque. We know broadly how they work, but there’s no way to know why it’s suggesting I listen to ska punk all day, or I try a podcast titled ‘Tu Inglés’, or play some 80’s alternative classics I’m already familiar with. It gets repetitive.

Some of the most original new music isn’t even available on Spotify. Most folk don’t realize that small artists have to pay a distributor to get their music to appear on streaming services like Spotify and Apple Music, a dubious investment when the return for the artist might be a cheque for $0.10 and a little exposure. No wonder that some artists use music purchase sites like Bandcamp exclusively. Of course, this means they’ll never appear in your Discover Weekly playlist.

Algorithms decide which social media posts I see, whether I can get a credit card, and how much I would pay to insure a car. Spotify’s recommendation system is another closed system like the others. But unlike credit agencies and big social networks, the world of music has some very successful repositories of open data. I’ve been saving my listen history to Last.fm since 2006. Shouldn’t I do something with it?

Introducing Calliope

Calliope is an open source tool for hackers who want to generate playlists. Its primary goals are to be a fun side project for me and to produce interesting playlists from my digital music collection. Recently it has begun fulfilling both of those goals so I decided it’s time to share some details.

Querying my music collection with Calliope

The project consists of a set of commandline tools which operate on playlist data. You use a shell pipeline to define the data pipeline. Your local music collection is queried from Tracker or Beets. You can mix in data from Last.fm, Musicbrainz and Spotify. You can output the results as XSPF playlists in your music player. The implementation is Python, but the commandline focus means it can interact with tools in any language that parses JSON.

The goal is not to replace Spotify here. The goal is to make recommendations open and transparent. That means you’re going to see the details of how they work. My dream would be that this becomes an educational tool to help us understand more about what “algorithms” (used in the journalistic sense) actually do.

I’m developing a series of example playlist generation scripts. I’m particularly enjoying “Music I haven’t listened to in over a year” — that one requires over a year of listen history data to be useful, of course. But even the “One hour random shuffle” playlist is fun.
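The idea behind the first of those playlists fits in a few lines of Python. This is a sketch of the filter only, not one of the actual example scripts; the `id` field and the shape of `last_listens` are assumptions made for the example.

```python
from datetime import datetime, timedelta


def not_heard_since(tracks, last_listens, now, max_age=timedelta(days=365)):
    """Return tracks whose most recent listen is older than `max_age`.

    `last_listens` maps a track identifier to the datetime of its latest
    listen. Tracks with no recorded listen are left out, because a missing
    entry cannot be told apart from missing history.
    """
    cutoff = now - max_age
    return [
        track
        for track in tracks
        if track["id"] in last_listens and last_listens[track["id"]] < cutoff
    ]


if __name__ == "__main__":
    tracks = [{"id": "a"}, {"id": "b"}]
    listens = {"a": datetime(2019, 6, 1), "b": datetime(2020, 11, 1)}
    print(not_heard_since(tracks, listens, now=datetime(2020, 12, 18)))
```

Leaving out never-played tracks is a design choice; with less than a year of history almost nothing would pass the filter, which is why the history requirement matters.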

A breakthrough this month was the start of a constraints-based approach for selecting songs. I found a useful model in a paper from 2006 titled “Fast Generation of Optimal Music Playlists using Local Search”, and implemented a subset using the Python simpleai library. Simple things can produce great results. I’m only scratching the surface of what’s possible with this model, using constraints on the duration property to ensure songs and playlists are a suitable length. I expect to show off some more sophisticated examples in future.
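To show what a duration constraint looks like in local search terms, here is a plain hill-climbing sketch. It is neither the algorithm from the paper nor the simpleai-based code in Calliope; the function name, the parameters and the default limits are all invented for the example. Durations are in seconds.

```python
def pick_playlist(durations, target, min_song=60, max_song=600, max_steps=1000):
    """Hill-climb towards a set of songs whose total duration is near `target`.

    Returns (sorted song indices, remaining distance from the target).
    """
    # Per-song constraint: ignore songs that are too short or too long.
    eligible = [i for i, d in enumerate(durations) if min_song <= d <= max_song]

    def cost(selection):
        # Playlist constraint: distance of the total duration from the target.
        return abs(sum(durations[i] for i in selection) - target)

    selected = set()
    current = cost(selected)
    for _ in range(max_steps):
        best, best_cost = None, current
        # Neighbours differ from the current playlist by exactly one song.
        for i in eligible:
            candidate = selected ^ {i}
            candidate_cost = cost(candidate)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
        if best is None:
            break  # local optimum: no single change improves the playlist
        selected, current = best, best_cost
    return sorted(selected), current


if __name__ == "__main__":
    print(pick_playlist([200, 240, 180, 600, 30], target=620, max_song=400))
```

Like any greedy local search this can stop at a local optimum, so the result is only as good as the neighbourhood it explores.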

I’m not going to talk much more about it here. If it sounds interesting, read the documentation which I’ve recently been working on, clone the source code, and ask me if there are any questions. I’m keen to hear what ideas you have.

April 30, 2020 (updated September 29, 2020) Sam Thursfield

Why I love Bandcamp

The Coronavirus quarantine would be much harder if we didn’t have great music to listen to. But making an income from live music is very difficult in a pandemic. What’s a good way to support the artists who are helping us through?

One ethical way is to buy music on Bandcamp. The idea of Bandcamp is that you browse music (and merch), and if you like something you buy a real download¹. You get unlimited web streaming of everything you bought too². Their business model is clear and upfront:

Our share is 15% on digital items, and 10% on physical goods. Payment processor fees are separate and vary depending on the size of the transaction, but for an average size purchase, amount to an additional 4-7%. The remainder, usually 80-85%, goes directly to the artist, and we pay out daily.
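The numbers in that quote can be checked with a little arithmetic. The sketch below assumes the fees are simply additive percentages of the sale price, which the quote suggests but does not spell out; the function name is made up.

```python
def artist_share_range(platform_fee, processor_fee_range=(4, 7)):
    """Return (lowest, highest) artist share in percent for a platform fee."""
    low_fee, high_fee = processor_fee_range
    return (100 - platform_fee - high_fee, 100 - platform_fee - low_fee)


if __name__ == "__main__":
    print("digital: ", artist_share_range(15))
    print("physical:", artist_share_range(10))
```

Under that assumption digital sales leave the artist 78-81% and physical goods 83-86%, in line with the “usually 80-85%” in the quote.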

On Friday 1st May 2020, which is tomorrow, or today, or some point in the past, Bandcamp are waiving their 10-15% share of sales. It’s a great time to buy some music!

Here are some recommendations taken from the recent social media challenge of posting album covers that have a big effect on your music taste, with no other context. (My social media posts are mostly of music recommendations with no context anyway, so this wasn’t much of a challenge).

Orange Whip by Honeyfeet

Widow City by The Fiery Furnaces

at Version City by Victor Rice

Unknown Mortal Orchestra by Unknown Mortal Orchestra

Sonido Amazonico by Chicha Libre

When you’ve listened to those, it’s time to dive into the enormous list of curated recommendations (curated by real humans, not by robots). The best metal, the best hip-hop, the best contemporary Chinese post-punk, the best Theremin music of the last 100 years, etc. etc. You can also follow me if you want 🙂

In the parallel universe of unethical music services, I read that Spotify have insultingly added a virtual “tip jar”. It can’t make amends for the deeply unfair business relationship that many streaming sites have with artists.

Listen to the T-shirt:

Have fun & make sure to spend your music money ethically!

1: You can even download in Ogg Vorbis format if you like.
2: In practice, you get unlimited streaming of all the music on Bandcamp. Artists can choose to put a nag screen up after a certain number of listens. Some artists would prefer the site to be more restrictive in this regard.