Numlock News
The Numlock Podcast
Numlock Sunday: Glenn McDonald on the future of music in the algorithmic era

By Walt Hickey

Welcome to the Numlock Sunday edition.

This week, I spoke to Glenn McDonald, author of the new book You Have Not Yet Heard Your Favorite Song: How Streaming Changes Music.

I’ve followed Glenn’s work for years now, and this book is the result of decades of work in the field, and comes from a perspective not only of technology’s bleeding edge but also a sincere, personal love of music.

We spoke about the mechanics of tracking genre data, how streaming has impacted listening trends, and how the model’s economics are holding up.

The book can be found everywhere books are sold.

This interview has been condensed and edited.

Glenn McDonald. Thank you so much for coming on. You are the author of You Have Not Yet Heard Your Favorite Song, which is a really compelling title all about the streaming revolution, but more importantly about this really fascinating moment in music data that for much of the past decade, you have had a front seat for, or even been in the driver's seat for. Your work goes back to a really interesting company called The Echo Nest.

For new listeners and folks who maybe are unfamiliar with your story, can you just tell me a little bit of the history of this field and your place in it? The Echo Nest was a really fascinating company and I think more people ought to know about it.

I had been doing software design for a long time and had worked on a bunch of different things that all had to do with making sense of data for people. None of them had been specific to music data, but I would always use my work tools on my record database or my other various music-related projects. Those were the things that I was really interested in.

At some point I ended up tabulating the Village Voice critics music poll every year. This was what big data was for music in the era before streaming: It was like 800 music critics typing 10 album votes into blanks, with typos and everything. The companies I worked for kept getting acquired and my projects would get shut down or something, so every few years I’d need a new job.

When this happened in about 2011, I just knew through contacts that there was this company in Somerville a couple of Media Lab people had started called The Echo Nest, which was trying to do something with music data because there suddenly was a lot more music data. The Echo Nest was trying to do recommendations and categorization stuff for streaming services. This was pre-Spotify launching in the U.S., so 7Digital and Rdio at the time were some of the existing players. And I had done enough music data things to convince them that I was a worthwhile person to add to this effort.

I remember my first task at The Echo Nest. I showed up for my first day and they were like, “Oh, Glenn, you're here. Good. We're doing these radio stations for Spotify, this company we're trying to entice into using our services, and we're putting cartoon noises on the Franz Liszt classical station. Can you please figure out why we're doing that and make it stop?”

So that was the beginning of the journey. We did not succeed initially at getting Spotify as a customer, because Spotify recognized, correctly, that to do a really good job we had to have listening data, and there was no way they were going to give us listening data when we were also powering their competitors, even though their competitors were small. I remember we tried really hard to convince them. We were like, “We'll keep your data on a server on the totally other side of the closet where we have our servers.” That obviously didn't fly. After a couple of years of doing a lot of other things along the way, it was a race to see whether Spotify would develop their own recommendations and not need us or whether they would just get enough money to buy us first.

The money happened faster. We got acquired in 2014 and basically officially became the personalization team at Spotify. A bunch of the things we did had to do with understanding music and understanding taste so that you could do personalization, but it wasn't all directly involved with personalization.

That's about when you got on my radar, because I was at the time doing pop culture stuff at 538. I think music was always a bit of an enigma to me, just because there was so much of it, and obviously all the challenges that you were facing at an industrial scale, I was facing on a journalistic scale.

When I saw what you were doing — I don't think people really appreciate enough the moment when The Echo Nest got bought by Spotify. Very soon after, that's when you started seeing the level of personalization on the platform skyrocket. Do you want to speak a little bit about that?

It was a combination of things, because some of that stuff was stuff that we brought, The Echo Nest, but that acquisition was also Spotify's moment where they bought into the idea that personalization was going to be a big part of this Spotify experience. Discover Weekly, for example, which came along shortly after that, was not an Echo Nest thing. That was done by people who had already been at Spotify, and some of them were annoyed that that feature was described as if it somehow came from the Echo Nest work.

But basically everything that came about was because Spotify decided, “All right, personalization is going to be a thing.” At the same time, they acquired another company called Tunigo that was a playlist-making editorial company. And that was the beginning of, “All right, we're really going to have an editorial effort, too.” That was the beginning of both of those areas at once in Spotify's existence.

A lot of the interesting stuff in your book comes out of the complications of using algorithms as opposed to taste, and just the serendipity of some of it. I want to play something, because I think it shows the moment that completely shattered something about how I thought about music and how I thought about what the tech was that y’all were building.

I have a lot of Spotify playlists, just as anyone else does, and I was on a kick and I added a few songs in a row. The following week, all 10 recommended songs had this kick to start it. It’s the “Be My Baby” kick. There’s no way you could ask a DJ or even an expert, “Hey, can you find me 10 songs that all have this kick that I’m apparently into right now?”

But lo and behold, I would go down this recommended songs list and it would all be that. And that showed me that, man, there's a level of depth here that not only could we never accomplish before, but that is going to change the way we really consume a lot of this stuff. I always found it really fascinating how you were really on the front of that for so much of the time at Spotify.

One of the most interesting things to realize for me in this journey was that finding those patterns often comes about not the way you think.

How did that happen?

You imagine that the computer knows there are those drumbeats, it's found that you like them, and knows these songs contain them and lines them up. In fact, in this feature, that's not happening at all. It's just patterns of playlist making.

That recommended feature at the bottom, it uses the playlist title when you don't have anything, but then as soon as you've got stuff in your playlist, it's really just doing a complicated search of songs and playlists from other people that overlap with what you put. Here's what else they did. I found over and over that it was more effective to basically mine listening for the implicit signal that people have created by listening in nonrandom ways than it was to try to find the thing you're actually looking for.
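The overlap search described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function name, the placeholder track names and the scoring rule (each candidate earns points equal to the overlap of the playlist it came from) are all invented here, and nothing in the interview says this is how Spotify's feature is actually built.

```python
from collections import Counter


def recommend_from_playlists(my_tracks, other_playlists, top_n=3):
    """Rank tracks that co-occur with my tracks in other people's playlists.

    Hypothetical scoring: a candidate earns one point per shared track
    in each playlist where it appears alongside something I already added.
    """
    mine = set(my_tracks)
    scores = Counter()
    for playlist in other_playlists:
        tracks = set(playlist)
        overlap = len(mine & tracks)
        if overlap == 0:
            continue  # this playlist says nothing about my taste
        for track in tracks - mine:
            scores[track] += overlap
    # Highest score first; ties broken by name so the output is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:top_n]]


# Placeholder data: track_a and track_b are what I added to my playlist.
playlists = [
    ["track_a", "track_b", "track_c"],
    ["track_a", "track_c", "track_d"],
    ["track_d", "track_e"],
    ["track_b", "track_f"],
]
recs = recommend_from_playlists(["track_a", "track_b"], playlists)
print(recs)
```

Note that the code never inspects the audio. If the suggestions share a drumbeat, it is because the people who built the overlapping playlists grouped them that way.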

If you try to find bands from Estonia, you get screwed up by metadata mistakes and missing data all the time. But if you can find a few bands that you know are from Estonia and use them to find an audience and use that audience to find what's different about those people's listening, then you find all the rest of the bands from Estonia without having to rely on metadata. Even the system doesn't know what it's doing. People have encoded that knowledge implicitly by listening.
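A minimal sketch of that seed-to-audience-to-discovery loop, in Python. The data, the names and the "lift" threshold are assumptions made up for illustration; the interview does not describe the actual statistics used. The idea is just to compare how common an artist is among the seed audience with how common it is among everyone.

```python
from collections import Counter


def distinctive_artists(listens, seeds, min_lift=1.5):
    """Find artists over-represented among listeners of the seed artists.

    listens: dict mapping listener id -> set of artists they play.
    Lift = (share of seed audience playing the artist)
           / (share of all listeners playing the artist).
    """
    seeds = set(seeds)
    everyone = list(listens.values())
    audience = [artists for artists in everyone if artists & seeds]
    if not audience:
        return []
    audience_counts = Counter(a for artists in audience for a in artists)
    overall_counts = Counter(a for artists in everyone for a in artists)
    lifts = {}
    for artist, count in audience_counts.items():
        if artist in seeds:
            continue
        audience_share = count / len(audience)
        overall_share = overall_counts[artist] / len(everyone)
        lift = audience_share / overall_share
        if lift >= min_lift:
            lifts[artist] = lift
    return sorted(lifts, key=lambda a: (-lifts[a], a))


# Placeholder listening data; no country metadata is stored anywhere.
listens = {
    "u1": {"seed_1", "local_a", "global_hit"},
    "u2": {"seed_2", "local_a", "local_b", "global_hit"},
    "u3": {"global_hit", "other_x"},
    "u4": {"global_hit", "other_y"},
    "u5": {"seed_1", "local_b", "global_hit"},
    "u6": {"other_x", "global_hit"},
}
found = distinctive_artists(listens, ["seed_1", "seed_2"])
print(found)
```

The artist everybody plays drops out because it is no more common in the seed audience than anywhere else, while the two "local" artists surface without any metadata about where they are from.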

So I did find someone who’d been on a kick of listening to all the “Be My Baby” hooks in a row. It's fascinating stuff.

I want to take it to the book now, because that speaks to a chapter specifically all about how you talk about genres and how genres don't really exist; they're just words that people use to talk about things. You describe them as “distributed communities of interest.” Do you want to speak a little bit about what genres are?

We got into this genre thing at The Echo Nest because we promised somebody that we had genre radio. It was the era of Pandora. Algorithmic radio was mostly track- and artist-seeded. That was how people mostly thought about it.

We had some customer — I've long since forgotten who they were — who was like, that's too complicated. I just want like 16 buttons. It should just say rock and you hit it and it plays some rock music. And we were like, “Oh yeah, totally. We totally have that.” And then we went back to the office and we were like, we don't actually have that. But we better make it really quick.

What we did have was this vast database of word frequencies. We knew what artists were written about in what vocabulary, so we were like, this will be fine. We'll just line up the artists for whom rock is a disproportionately occurring term and we'll sort them by popularity and hit play.

We did that and then Rihanna came out and we were like, ah crap. People do say rock about Rihanna. I mean, she has a song called “Rockstar.” It's not crazy, but it was definitely not what these people wanted to have happen.

So we had a few days, and I'm like, all right, there's cultural knowledge here. It's not complicated what rock is. We just have to mine this very basic cultural knowledge. We had a table full of interns from Tufts, so I'm like, “Here's what we're going to do. Interns, go for each of these 17 or however many genres we want to demo, and just go find a list of the most obvious artists. Look it up on Wikipedia or Google — don't do anything sophisticated. When someone says rock, what are they probably thinking of? Then we'll take five or 10 of those artists, and because we have this good graph of artist similarity, we'll say, what are the other artists that are collectively similar to those five or 10 seed artists for each genre? That'll probably get us close.”

And that was right. That basically worked. If you feed in that what we mean by rock is The Who and Lynyrd Skynyrd and Led Zeppelin, you get a set of artists out. If you say, no, what I meant was The Black Keys and the Foo Fighters and Coldplay, then you get a different set of artists out. So that was where we began. I didn't have a theoretical framework for what I was doing; I just had a thing that we needed to produce really quickly.
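The seed-artist trick can be sketched like this in Python. Everything concrete here is an assumption for illustration: the similarity scores are made up, the artist names are placeholders, and summing similarity across seeds is just one plausible reading of "collectively similar," not a description of The Echo Nest's actual graph.

```python
def expand_genre(seeds, similarity, top_n=2):
    """Return the artists most similar to the seed set as a whole.

    similarity: dict mapping artist -> {neighbor: score}.
    A neighbor's total is the sum of its scores across all seeds,
    so artists close to several seeds outrank artists close to one.
    """
    seeds = set(seeds)
    totals = {}
    for seed in seeds:
        for artist, score in similarity.get(seed, {}).items():
            if artist not in seeds:
                totals[artist] = totals.get(artist, 0.0) + score
    return sorted(totals, key=lambda a: (-totals[a], a))[:top_n]


# Made-up similarity graph with two different ideas of "rock".
similarity = {
    "classic_seed_1": {"classic_x": 0.9, "classic_y": 0.6, "modern_x": 0.2},
    "classic_seed_2": {"classic_x": 0.8, "classic_y": 0.7},
    "modern_seed_1": {"modern_x": 0.9, "modern_y": 0.5, "classic_x": 0.2},
    "modern_seed_2": {"modern_x": 0.7, "modern_y": 0.8},
}
classic_rock = expand_genre(["classic_seed_1", "classic_seed_2"], similarity)
modern_rock = expand_genre(["modern_seed_1", "modern_seed_2"], similarity)
print(classic_rock, modern_rock)
```

Same button label, different seeds, different station: the definition of the genre lives entirely in which handful of artists you feed in.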

But as I got into it and tried to extend this from 16 to 300 and then to 1,000, what I realized I was doing was scouring the planet to find communities, literal communities. Sometimes of artists, sometimes of listeners, usually of both but not always, and usually with some element of practice to them, but not like a list of criteria in the classic musicological sense. You can describe the difference between Baroque music and ragtime in informal music theory terms, but that's not really helpful because people's interest is much more specific than that. You can't just say this is definitely formally a hip-hop song, and therefore you as a hip-hop fan are going to like it, because if it's in Turkish and you only speak Bulgarian, it's probably useless to you.

Once I understood that, then it became easier to think about how we proceeded: that we're trying to find communities and show them back to themselves. And they usually have names for themselves. Sometimes we would find communities that didn't yet have a self-identification and we would have to make up names for them, but the goal of doing that was to be able to show those people, here you are, here's your taste. You're an audience, you have a taste. If you think of a name for it, tell me and I'll replace it. But I gave it a name so that we can at least talk about it.

It also gets at a big issue with music in general. Even going back to radio times, there are a lot of genres that truly don't exist, that are entirely manufactured. Things like classic rock or oldies are referendums on not just what you played, but how long ago you played it. And even things like indie rock say more about the economics of the people who distributed your record than perhaps about you yourself. But nevertheless, these communities are constituencies that have an expectation that if they press an indie rock button, they want to hear some indie rock.

Indie rock is a great example where there are 12 good answers to that depending on who you are, and we couldn't call them all indie rock. Some of the exercise in making up names was like, all right, how am I going to differentiate between 12 historical, regional, philosophical variations that each think of themselves as indie rock? I have to tell the story a little differently.

Yeah, I dig that. That's a really exciting challenge.

I want to talk a little bit about some of the things that make streaming unique as a distribution format and a distribution medium. Whenever you have a new medium emerge, you have new intersections of how people work with that and consume it. You see it time and time again that technology can inform what's done.

You have a whole chapter in your book about this: “Chill is the new music.” It talks about essentially background and foreground sounds, whether that's lo-fi hip-hop radio, which is fairly well known, or things like a peaceful piano playlist. Things that would not exist in any previous iteration of the music industry are now dominant forms of consumption for lots of people. Do you want to speak to how this emerged and how you assess the space?

My favorite example of this is nature sounds. I had a CD of rainforest noises, and I would play it sometimes, but I was never going to buy another one. I think this is true of most people. I think most people had zero or one background noise CD in the CD era. The worldwide market of rainforest noises was probably a dozen, and you could compete between those dozen which was going to be the one that an individual user bought, but you couldn't go much further than that.

Streaming has made it possible to have that for no additional cost. It's like, it's not that I was against hearing a different rainforest. Costa Rica was superior to Indonesia.

Borneo is lovely this time of year.

One of the finest rainforests to listen to. But we unlocked that because now I can just put on a playlist of rainforest noises and I can hear new rainforest noises.

Does it really matter in rainforest noises? No, but it matters more in lo-fi hip-hop, where it is sort of a substance and you may prefer to hear new examples of the same form. That suddenly became totally viable. Peaceful piano is another one of those. I think a lot of people owned one classical CD and they would put it on when they needed something in the background. “I'll put on the classical music I own.”

Not only did streaming unlock the rest of the classical catalog, but then suddenly people were like, not all classical music works that well. I can just make stuff that's perfect for this mode. It's the perfect size and it's exactly as soothing, and it's not going to do some interesting thing that Chopin did because Chopin was interesting. Let's make it all fit this need. And I think there are a lot of needs that you wouldn’t have spent a lot of money to satisfy, but they’re needs you will spend a little bit of time to satisfy if it's free and it's easy to find them.

Fascinating. The rise of that has just been such an interesting side effect of the business model, in some way, but also a side effect in terms of how people want to listen to something pleasant in the background but not necessarily shell out for it. It just seems like it's a novelty of the distribution format that I enjoy, but can really only exist at this time in history.

Yeah. And it's not just streaming, too, because it's a synergy of streaming and having phones with you and earbuds and the expectation of music in all parts of your day. The idea that not only do you have earbuds, but everybody has earbuds, so it's normal for you to have your music in a public environment without bothering other people.

I want to talk a little bit about another side effect of the streaming model. This is one of the first times I've seen someone who was actually inside the house recount what this looks like, but streaming fraud has a lot of folks in the industry on edge or concerned, namely folks who are trying to manipulate the eventual rankings of things or the eventual performances of artists, whether it's for financial reasons, they want their artists to get more money, or they just want more people to see the artist whose Army they belong to.

I thought this was just a really interesting look inside a company that has to deal with this and how obvious it can look. You had a story about Beyoncé in there that was fascinating, but I would love to hear about what streaming fraud and Army-style tactics look like from the inside.

I never intended to be involved in fighting streaming fraud at all. But as I explain in the book, I fell into it just because I was looking for patterns and sometimes the patterns that I'd find would make no sense. I'd be like, what? In one of the earliest examples, I was starting to try to look at what was different about listening in each city, and a lot of cities made sense. I could say, all right, I know what people in that region like and I can see it in the city.

And then Buffalo, New York, was all church music. I've realized in this process that I don't know that much about the world, and I've been surprised many times by things that turn out to be real features of how people move around the planet. So I tried not to jump to conclusions. I was like, okay, maybe Buffalo's a really religious place and it's a really common usage to have organ music that you play off Spotify. That theory didn't hold up very long. It was obviously not what was actually happening. I found that a lot of times, whenever I would go looking for interesting patterns in small subsets of people, whether they be geographic or by age or demographic or whatever, some of them would be weird. I realized that I'd found a subset of accounts, but not a subset of people.

Having spent 10 or 12 years at this, depending on how you look at it, if I wanted to live a life of crime, this is definitely the life of crime I am best prepared to enter into, and I would not do it. That's my message to aspiring fraudsters: shoplift, go do something else. Anything is better than this. This is a really bad way to try to earn money, because anything that you do that earns enough money forms obvious patterns and it's just trivially easy to detect. Sometimes it took me half an hour to figure out the exact pattern that some new cluster of bots was using to manipulate things in slightly different ways, but it never took long. It was always trivial to block them. It depended on the magnitude, whether Spotify would care and go after them in any punitive sense, but blocking whatever they were trying to do was never hard once it reached any magnitude where it would matter.

I always knew, and have been saying for years, that Buffalo Bills-based organ music was an industry plant. Thank you for confirming that for me.

That is actually a fun segue, because one of the most interesting chapters in here, I think, was about how the streaming model has winners and it has losers, and it has genres that are in fact losers. I know we’ve already agreed that genres are mere communities of sound, but for all intents and purposes, let's go back to the more traditional sense here.

You write a lot about how genres like jazz, classical, experimental music, these aren't really being well served by the streaming model. And you actually write a little about whether streaming actually makes discovery of this stuff easier or harder. What got you aware of this potential side effect of the model and how do you assess where it's at?

This was always interesting to me because although I like Taylor Swift and I have some Ed Sheeran songs that I love, my taste includes a lot of obscure things. I'm very attached to those things existing and the people who make those things managing to somehow live in such a way that they get to keep making, you know, extremely florid gothic symphonic metal albums, or weird wedding music from Limpopo, or Filipino pop punk.

As a human, I want all these things to be viable whether they are super popular or not. The genre project could have stopped at 300 if it only cared about the popular genres. It kept going to 6,000 because I think everything deserves to have the same chance to find its audience, whether that audience is small or not.

As I say in the book, I think the way royalties work now in streaming is, in economic terms, actually slightly progressive. It's hard to guess this, but I didn't have to guess. I could run the numbers on the whole Spotify. I could run alternate economic models on literally all the Spotify data. That doesn't always tell you how the future will be, because sometimes when you change things, people change behavior, but I could definitely evaluate other proposals for how the existing money should be divvied up. What I found was that the model we’re currently using is a slight subsidy of less popular artists by the most popular artists in practice, which is the opposite of what some people surmise, which was interesting in itself.

And really, the headline is that it's a small factor. It doesn't actually matter very much. But every medium, like you say, has winners and losers by the nature of the format. There was a sort of artist that would appeal to the people who bought the most CDs, and in the CD era, I spent thousands of dollars. I was one of those people that spent thousands of dollars a year in order to discover all the music I was curious about, because I had software jobs and I could afford it. Therefore, I had a lot of economic power in that model. People like me exerted a lot of economic power. As an artist, if you were the kind of artist that I bought, that was excellent.

Now I spend $10 or $11 a month on streaming like everybody else, so that power has been distributed a lot more broadly. It's a lot less concentrated now, which I think is good on the whole. I think that's good for society. But it does mean there were people who thrived very specifically in the CD era, and they could put out limited editions and CD singles. This seems crazy to me in retrospect. I would spend $12 on an imported UK CD single to get one B-side that I hadn't heard, and now that's a whole month of my listening. The crazy part of that was the former state, paying $12 to hear one B-side. That's crazier than the current model.

But it's true that with a lot of things, when individual artists tell a sad story of how they used to have a career and now they don't, sometimes it's for this reason. They had found a niche and that niche went away and there are new niches. The system overall is producing as much money and it supports obscure things in general just as readily, but they're not necessarily the same obscure things to the same level.

Interesting. And that $12 single, you can't be alone. They released it for a reason. There must've been a critical mass that in the aggregate means now they have to spend another day on the road, or rely on superfans. The main way you can reach them these days, if everybody's only tithing $15 or so a month through their streaming, is through appearances or tours or other kinds of onerous things.

It's true, but also availability is totally different now. I think people sometimes fall into the trap of trying to compare the money as if the behavior is the same. They’re like, a person would have bought my CD for $10 at my show, and now they're going to stream my song once and I only get a third of a cent. Not very many people are going to come to your show, and of them, only a few are going to buy your CD, and the number of people who are going to buy that $12 CD single to hear that B-side is really small.

That B-side now could be on a playlist and a million people who've never heard of you could come across it. The dynamics are now completely different, and not everybody adapts to them immediately, but you now have a very, very broad potential casual audience that is only going to spend a third of a cent on you, but there are a lot of them. Maybe 10% of them will spend 12 cents on you by listening to a whole album a couple times, and a few of them will listen to your whole catalog and they'll buy tickets to see you when you come.
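To put rough numbers on that, here is a back-of-the-envelope calculation in Python using only the figures mentioned in the conversation: a third of a cent per stream, a million playlist listeners, 10% of them spending 12 cents, and a $10 CD. Treating the other 90% as exactly one stream each is an assumption added here, and the superfans who buy tickets are left out entirely, so this is a sketch of the shape of the math rather than a real royalty statement.

```python
from fractions import Fraction

# Figures quoted in the interview.
PER_STREAM_CENTS = Fraction(1, 3)  # "a third of a cent" per stream
ENGAGED_SPEND_CENTS = 12           # "12 cents" from repeat album listens
CD_PRICE_CENTS = 1000              # the $10 CD sold at a show


def playlist_payout_cents(listeners, engaged_share):
    """Total payout in cents for one playlist placement.

    Assumption: engaged listeners are worth 12 cents each, and every
    other listener streams the song exactly once.
    """
    engaged = int(listeners * engaged_share)
    casual = listeners - engaged
    return casual * PER_STREAM_CENTS + engaged * ENGAGED_SPEND_CENTS


total_cents = playlist_payout_cents(1_000_000, Fraction(1, 10))
cd_equivalent = total_cents / CD_PRICE_CENTS
print(f"Total: ${float(total_cents) / 100:,.2f}")
print(f"Equivalent $10 CD sales: {float(cd_equivalent):,.0f}")
```

Under those assumptions, most of the money comes from the 10% who go deeper, not from the one-time listeners, which is the dynamic being described.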

Overall it's about the same money. The music industry is, in absolute terms, now past the CD peak. Adjusted for inflation, it's not quite there, so we're not quite as far into the streaming era as the CD peak was in the CD era. It seems possible still that the CD peak will be surpassed by the streaming peak in overall money, which is good, I think.

That's neat. To back out a little bit, the book is excellent. People can find it wherever books are sold, and it's called You Have Not Yet Heard Your Favorite Song.

You are also known for another project, Every Noise at Once. You’ve since departed Spotify, and as a result of that departure, the availability of Every Noise was in jeopardy for a little bit there. You mentioned that you have a lot of physical media and I would love your view on this: How do we preserve our understanding of how music works at this point in time? Down the line, things are going to be fundamentally shifted, as the industry inherently does. You've been involved in a number of projects that have relied on some of these big players to fuel their data.

Where do you come down on how we can preserve a lot of this discovery and a lot of this understanding moving forward, even if we are losing the data through our fingers as it comes in?

Part of it is understanding what the data is and what we've accomplished. I got laid off from Spotify, and I'd been there for a good long time, so for me I could be like, that's fine. Twelve years is longer than I had at any other job. I can do something else now and that's all right.

But it definitely hurt because I built this thing and my attachment to it was very heavily tied up in its ability to constantly change. We were still adding genres to it and one of my, and a lot of people's, favorite features of it was a thing that took every week's new release list and organized it by genre. That immediately stopped working, for no good reason. It's not confidential information that the Spotify API is not arranged in such a way that you can get the information out, even though it would be in Spotify's interest to have people better able to find new releases. When I worked at Spotify, I could route around the structural problem and just ship a CSV file to my website and then everybody could see those things.

I lost that ability and initially I was like, oh, the website is dead, but then with 30 seconds more thought I realized that this is what happens to most things. They build for a while, and then they reach a state and that's the end of building them, but now they're real. That map of 6,200 genres remains a map of world listening up until 2023, and there's more music in that than you'll ever be able to listen to or discover; for practical purposes, if what you care about is exploring the world, it's still a very interesting map that will help you do that.

If what you care about is organizing what happened last week, then for now I don't have the tools to help do that in public in a way that I wish I did. But I'm still hopeful that we'll get that back. We only need one music service to say, “All right, you can get a list of this week's new releases from our API now, and it's not limited to 1,000,” and then I'll be able to revive that.

Amazing. Glenn, I really love the book. Why don't you tell folks where they can find it, where they can find you, and why they should check it out.

It's on and Amazon. The original publisher is British, so if you are in the U.K. you might be able to find it in stores. If you are somewhere else you might have to order it, but that's how most things get out now. There's a Kindle version if you don’t care about paper, and if you do, it's got a blue cover. It's nice.

It's a good-looking cover. Hey, thanks so much for coming on. I really appreciate it. Again, I've been such a fan of yours for so long, and just to see this finally come out is really cool.

Thanks for reading.

Edited by Susie Stark.

If you have anything you’d like to see in this Sunday special, shoot me an email. Comment below! Thanks for reading, and thanks so much for supporting Numlock.

Send links to me on Twitter at @WaltHickey or email me with numbers, tips or feedback at

Numlock News is a daily morning newsletter that pops out fascinating numbers buried in the news, highlighting awesome stories you're missing out on. Every Sunday, Walt Hickey interviews someone cool. Sometimes he records it in quality befitting a podcast.