kingofnovember.com

I've had some whiskey, and I've been thinkin'.

Seraphim and Artificial Intelligence in Music

Wherein I wax nostalgic about an artificially intelligent disk jockey I once wrote.

Many, many moons ago (circa 2001) I wrote a program called Seraphim. Seraphim was, for lack of a better term, a “user-programmable internet radio station.” It was a bit more than that – quite a bit more, actually.

The problem I was trying to solve was this: I had about 2,000 albums worth of music, and I wanted something that would stream music to my internal stereo. I wanted it to be able to perform the following functions:

1) Allow me (or others) to add single songs or whole albums to the music queue;
2) Track which songs, albums, or artists were the most popular;
3) Have a “smart” Artificial Intelligence disk jockey that could spin music on its own.

The program I wrote succeeded on all three counts. Sure, sure, the code itself was ugly as shit (and written in Perl). Were I to rewrite it today, I would do a ton of things differently. But I had about 8 years less experience writing stuff, so I forgive myself.

At any rate, Seraphim itself became an example of emergent behavior on several levels. Not just in the AI of the DJ, but also in the way that people used the system.

This was in the early days of internet music. Before the dark times. Before the RIAA. I made this radio station and a way to interact with it, and then we plugged it into AOL’s network (I won’t discuss how that happened because there may be legal ramifications for the people involved). Suffice it to say, suddenly a computer in my house was streaming music to hundreds (possibly thousands) of people.

If you had an account on the system, you could do any of the following:

1) Upload your own library of mp3s so that they were available for play
2) Modify the metadata for any song, artist, or album (genres, etc.)
3) Add songs to the system’s “queue” – the music it was playing.

(If you were a super-user, you could kill songs from the queue or mark them as “never play”).

People would listen to the station rather than their own personal libraries because there was a significant degree of fun involved in being a disk jockey. Perhaps my favorite emergent user behavior was when someone would start a musical “theme” and the various DJs would try to one-up each other following said theme.

For example, someone might say “the theme is fire”. Then, we’d see a bunch of fire-related songs show up (“Burning Down the House”, etc.). There was a game made of the music. It was a glorious amount of fun.

Each song, artist, and album had a “karma” score. The more often it was requested, the higher the karma. Picking a single song gave +1 to that song, to its artist, and to its album. That way the system understood popularity (though the scales were different for songs, artists, and albums, since an artist collects karma from every one of its songs).
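Here is a minimal Python sketch of that karma bookkeeping. The original was Perl and I'm not reproducing it; the counter names and `request_song` are made up for illustration.

```python
from collections import Counter

# Hypothetical names; none of this comes from the original Perl source.
song_karma = Counter()
artist_karma = Counter()
album_karma = Counter()

def request_song(song, artist, album):
    """Record one request: +1 karma for the song, its artist, and its album."""
    song_karma[song] += 1
    artist_karma[artist] += 1
    album_karma[album] += 1

# Two different songs from the same album by the same artist.
request_song("Burning Down the House", "Talking Heads", "Speaking in Tongues")
request_song("Girlfriend Is Better", "Talking Heads", "Speaking in Tongues")
```

After those two requests each song sits at 1 while the artist and the album sit at 2, which is why the three scales drift apart and can't be compared directly.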

However, the most interesting part of the system (to me) was that if no one put anything into the queue, Seraphim would “auto dj”. And, having lived with it for a year or so, I found it . . . exceptionally creepy how smart a disk jockey it became.

I wrote the artificial intelligence routines as a lark, to be honest. But this is an example of awesome emergent behavior.

The first thing I did in the system was to “fix” a weakness in the MP3 file format. MP3s have a “genre” tag but that’s very limited. It doesn’t say a lot; it’s a single dimension. So I wrote a large matrix called “Genre Brethren”.

For example, “Rap” is a genre brother to “Gangster Rap” and to “Hip-Hop.” “Speed Metal” is brother to “Death Metal” and “Heavy Metal”. (The system was far more complex, usually seeing 3-6 brethren). Albums, artists, and songs could be tagged with multiple genres.

When in “Auto DJ” mode, Seraphim would start with the most current song and then make choices. Did it stay in the current genre? This was maybe 50/50. If it decided to change genres, it would only move to one of the brethren genres (thus, we don’t move from Slayer to Michael Bolton). We keep a continuity of musical style.
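The genre step can be sketched roughly like this in Python. The two-entry map is just the examples from above (the real matrix was much bigger), and `GENRE_BRETHREN`, `next_genre`, and `stay_probability` are hypothetical names, not the original code.

```python
import random

# Hypothetical slice of the "Genre Brethren" map; the real one was larger,
# usually with 3-6 brethren per genre.
GENRE_BRETHREN = {
    "Rap": ["Gangster Rap", "Hip-Hop"],
    "Speed Metal": ["Death Metal", "Heavy Metal"],
}

def next_genre(current, stay_probability=0.5, rng=random):
    """Stay in the current genre about half the time, else hop to a brother."""
    brethren = GENRE_BRETHREN.get(current, [])
    # A genre with no known brethren has nowhere to go, so it stays put.
    if not brethren or rng.random() < stay_probability:
        return current
    return rng.choice(brethren)

print(next_genre("Speed Metal"))  # one of: Speed Metal, Death Metal, Heavy Metal
```

Because a hop only ever lands on a direct brother, getting from Slayer to Michael Bolton would take a long chain of small steps rather than one jarring jump.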

Once it picked a genre, it had to choose a song. But that’s a trick, right? Obviously, we don’t want to pick songs that suck. And that’s where I wrote this thing that worked and worked well. To this day, though, I’m not sure how I arrived at the system.

Cheaply, you can just choose the song in the genre with the most karma. That works once. Ideally, though, you’ll spread out. So I wrote this complicated system whereby it would pick songs. If I recall correctly (and I could pull up the source to see, but fuck that), it went like this:

1) Choose between Song, Artist, or Album in genre.
2) Within that subset, take the top 50 karma values as a grouping.
3) Within that grouping, weight each one by rank. Those within the top 5 get +5, within 6-10 get +4, within 11-20 get +3, within 21-30 get +2, everyone else +1.
4) Select within that set based on weight.
5) If “songs”, play that song. Done.
6) If “albums”, repeat step 3 based on songs in album. Pick song; play; done.
7) If “artists”, repeat step 3 based on albums, then go to step 6.
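Steps 2 through 4 boil down to a rank-weighted random draw. Here's a sketch in Python, working from the tiers as I remember them rather than from the source; `rank_weight` and `weighted_pick` are names I'm inventing here, and `karma` is assumed to be a non-empty dict of name to karma score.

```python
import random

def rank_weight(rank):
    """Weight for a 1-based karma rank, using the tiers from step 3."""
    if rank <= 5:
        return 5
    if rank <= 10:
        return 4
    if rank <= 20:
        return 3
    if rank <= 30:
        return 2
    return 1

def weighted_pick(karma, rng=random, top_n=50):
    """Steps 2-4: keep the top `top_n` by karma, weight by rank, draw one."""
    # Highest karma first, then cut the list down to the top grouping.
    ranked = sorted(karma, key=karma.get, reverse=True)[:top_n]
    weights = [rank_weight(rank) for rank in range(1, len(ranked) + 1)]
    return rng.choices(ranked, weights=weights, k=1)[0]

karma = {"Song A": 40, "Song B": 12, "Song C": 3}
print(weighted_pick(karma))
```

With a full grouping of 50 the weights add up to 115, so the top five together get drawn 25 times out of 115. That's a nudge toward the favorites, not a lock. For albums and artists (steps 6 and 7), the same function would just get called again on a narrower dict.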

I injected a degree of “fuzziness” into the AI routine, too. Without the fuzziness, it might play the same shit over and over again (like, all of Nevermind on repeat). While the ideal was the highest karma value in a given set, there was logic to ignore that aspect and just pull from lower in the stack (there was a routine to drop out of “standard top 50 of type” mode and pull from wherever, or to weight the draw toward the bottom of the stack).

As it played songs, it marked when things were last played. Thus, no repeats within 6 hours or so. It also did crazy-ass shit like “look for songs that have a positive karma value that haven’t been played in 5 days” and then give those songs extra weight.
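The recency rules could look something like this in Python. The 6-hour and 5-day windows are from my description above; the size of the bonus is a pure guess (all I remember is "extra weight"), and `adjust_weight` is a made-up name.

```python
import time

NO_REPEAT_SECONDS = 6 * 60 * 60        # no repeats within 6 hours or so
NEGLECTED_SECONDS = 5 * 24 * 60 * 60   # not played in 5 days
NEGLECT_BONUS = 2                      # assumed value, not from the original

def adjust_weight(weight, karma, last_played, now=None):
    """Apply the recency rules; a result of 0 means 'not eligible right now'."""
    now = time.time() if now is None else now
    if last_played is None:
        # Never played: left alone here, which is an assumption on my part.
        return weight
    idle = now - last_played
    if idle < NO_REPEAT_SECONDS:
        return 0
    if karma > 0 and idle >= NEGLECTED_SECONDS:
        return weight + NEGLECT_BONUS
    return weight

six_days = 6 * 24 * 60 * 60
print(adjust_weight(3, karma=4, last_played=0, now=six_days))
```

Returning 0 instead of filtering the song out means the result can be fed straight into a weighted draw, as long as at least one candidate still has a non-zero weight.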

What happened was this: I ended up with a disturbing, creepily good disk jockey. My ex-wife and I had multiple conversations about this. We’d be listening to it all day and there would be strange stretches of excellent music choices. So we’d go look and see who had been programming it, and it nearly always turned out to be the machine itself.

I write this only because I’m thinking about writing artificial intelligence routines and Seraphim was one of my first attempts at doing “smart” AI.

Comments on Seraphim and Artificial Intelligence in Music

  1. I remember talking to you about the genre proximity problem, and I thought that, other than having people work out a spatial map by hand, it wouldn’t be solvable. It sounds like you did some approximation of that? And I assumed Apple did something similar for iTunes’ Genius mode.

    I really miss Seraphim. Which reminds me, I don’t suppose the code is still available? I wouldn’t mind using it for my private library. I won’t benefit from the emergent behavior we saw from it being on the internet but it *was* a pretty damned good auto-DJ.

    1. Sadly, I lost a bunch of my code archives during a system rebuild. I made big tarballs of everything, and they ran afoul of the “maximum file size” thing that older Linuxes suffered from.

      The genre map was hand constructed by me and (I think) Aneel. We pulled down a bunch of genre tags from AllMusic and worked from there.

      The biggest pain was applying multiple genres to individual MP3s. In iTunes, what I do to solve that problem (only one genre tag) is I set multiple genres with semi-colons.

      Thus, Nirvana may be “Alternative; Grunge; Punk”, while Metallica is “Metal; Speed” and R.E.M. is “Alternative; Rock”. Barry Manilow is “Rock; Soft”.

      So I can set a smart playlist for “Genre [contains] ‘Alternative’” and both R.E.M. and Nirvana may show up.

      I’m a bit anal retentive about this, too.

      Thinking about this now, I have a yen to recreate the damned thing, only smarter, better, faster. I’ve got about 8 years more experience with this type of stuff now, and having a streaming stereo system might not be a bad idea.

      1. You and I discussed multiple genres, but I didn’t actually help to construct the genre map.

        When I was classifying my own music collection (before I gave up), I was using keywords like “dark” and “noise” and “synth”, rather than “genres” in a lot of cases. Some genres give a pretty good idea of what the music is going to sound like (in my collection “ska” or “dub”), but things like “rock” or “electronica” are too big.

  2. The more I see in the world the more I am convinced that true AI will come from programs like this or games and not from the military, computer labs or big business as it is often portrayed in SciFi.

    1. Re: Ah, good times

      Yes, I have fond memories of DJ games. And I got into SO MUCH MUSIC that I might never have known about. Kind of a Pandora/Genius/thing now, but none of them have the sort of impromptu creativity flow/zen we got into.

      1. Re: Ah, good times

        I have been thinking about rebuilding it – if only the AI engine. I wish the universe were better about allowing Cool Things into existence.
