I started thinking about playlist generation software about 15 years ago. In that time, so much happened that I can’t possibly summarize it all here. I’ll just mention two things. Firstly, Spotify appeared, and proceeded to hire or buy most of the world’s music recommendation experts and make automatic playlists into a commodity. Secondly, I spent a lot of time iterating on a music tool I call Calliope.
Spotify or not?
Spotify’s discovery features can be a great way to find new music, but I’ve always felt like something was missing. The recommendations are opaque. We know broadly how they work, but there’s no way to know why it’s suggesting I listen to ska punk all day, or I try a podcast titled ‘Tu Inglés’, or play some 80’s alternative classics I’m already familiar with. It gets repetitive.
Some of the most original new music isn’t even available on Spotify. Most folk don’t release that small artists have to pay a distributor to get their music to appear on streaming services like Spotify and Apple Music, a dubious investment when the return for the artist might be a cheque for $0.10 and a little exposure. No wonder that some artists use music purchase sites like Bandcamp exclusively. Of course, this means they’ll never appear in your Discover Weekly playlist.
Algorithms decide which social media posts I see, whether I can get a credit card, and how much I would pay to insure a car. Spotify’s recommendation system is another closed system like the others. But unlike credit agencies and big social networks, the world of music has some very successful repositories of open data. I’ve been saving my listen history to Last.fm since 2006. Shouldn’t I do something with it?
Calliope is an open source tool for hackers who want to generate playlists. Its primary goals are to be a fun side project for me and to produce interesting playlists from of my digital music collection. Recently it has begun fulfilling both of those goals so I decided it’s time to share some details.
The project consists of a set of commandline tools which operate on playlist data. You use a shell pipeline to define the data pipeline. Your local music collection is queried from Tracker or Beets. You can mix in data from Last.fm, Musicbrainz and Spotify. You can output the results as XSPF playlists in your music player. The implementation is Python, but the commandline focus means it can interact with tools in any language that parses JSON.
The goal is not to replace Spotify here. The goal is to make recommendations open and transparent. That means you’re going to see the details of how they work. My dream would be that this becomes an educational tool to help us understand more about what “algorithms” (used in the journalistic sense) actually do.
I’m developing a series of example playlist generation scripts. I’m particularly enjoying “Music I haven’t listened to in over a year” — that one requires over a year of listen history data to be useful, of course. But even the “One hour random shuffle” playlist is fun.
A breakthrough this month was the start of a constraints-based approach for selecting songs. I found a useful model in a paper from 2006 titled “Fast Generation of Optimal Music Playlists using Local Search”, and implemented a subset using the Python simpleai library. Simple things can produce great results. I’m only scratching the surface of what’s possible with this model, using constraints on the
duration property to ensure songs and playlists are a suitable length. I expect to show off some more sophisticated examples in future.
I’m not going to talk much more about it here — if it sounds interesting, read the documentation which I’ve recently been working on, clone the source code, and ask me if there’s any questions. I’m keen to hear what ideas you have.