Elliot Jay Stocks logo by Emma Luczyn

Music consumption in the era of smart speakers


Just over five years ago I wrote a piece called ‘Music collections in the era of the cloud’, in which I lamented the then-contemporary streaming services’ inability to properly represent a user’s music library in the cloud. A few things have changed since then: Rdio is gone, Apple Music is here (and is still a weird mix of iTunes, the iTunes Store, and radio), Spotify has got pretty good at letting users create their own libraries (albeit limited to Spotify’s catalogue), and Google Play Music, for all its faults (an uninspiring interface and a looming merger with YouTube Red among them), has emerged as a sensible choice if you’re the kind of user who wants to have your local files in the cloud alongside ‘regular’ streaming tracks. Plus, services like Plex Cloud and Cloud Player are fine ways of uploading and managing your personal music collection in environments entirely removed from the streaming services: excellent solutions for those of us who place importance on a personal library. (How quaint.) And so, at the end of 2017, many of the gripes I had in 2012 are in some way solved, or at least a little less hacky than they used to be.

But there’s still a fundamental issue that has only been exacerbated by recent advances in consumer technology. In 2017, the year when voice assistants essentially became normalised, we’ve never been further removed from music’s grounding in reality. Yes, we can call up almost any song known to humankind just by barking into the corner of our rooms, but we can’t even see an album cover.

On the face of it, this is no bad thing. Why did we ever need artwork to ground us to a listening experience? Was a visual element merely a distraction from the music itself? Possibly. And don’t get me wrong, smart speakers are cool, like sci-fi tech in my home oh my god how is this even reality this is crazy kind of cool. Amazon have the Echo (which I own), Google have the Home (which I’m tempted by because I like the Google Assistant on my phone so much), and Apple have the HomePod (albeit now not until next year, it seems).

But how many times have I found myself wanting to play some music, only to say, “Alexa…” and then feebly slip into silence while the Echo’s coloured ring spins in anticipation of my response, waiting, waiting, waiting for my inspired choice of mood-setting aural delights? So many times. In fact, it’s becoming the norm. What do I want to play? Well, I don’t know. Millions and millions and millions of tracks await, just a few words away, and instead I fall back on what I know: albums I’ve played over and over again for years on end — because without visual cues, there’s no inspiration.

Image: Engadget

When music first went digital and we all ripped our CDs to iTunes, our personal music collections were represented as names on a list; effectively, a spreadsheet. It was pretty uninspiring.

But then covers came along and lo and behold, on our computers and iPods, there were our records! Sure, they were on screen, but we could scroll through them and see our music — the music we owned — and it was amazing. Remember when Apple introduced (well, acquired) Cover Flow? It blew everyone away. (Now, that UI has disappeared almost entirely; at the time of writing it’s only present in macOS’ Finder as an alternate view.)

Image: Gideon Tsang

Another small but crucial move made by Apple earlier than Cover Flow was the release of the iTunes (Music) Store, which showed cover artwork in the Browse UI. Nothing profound on the face of it, sure, but it’s important to note here because at the point at which mainstream download purchases were introduced, customers were able to browse those products visually, just like in a real record shop. That means discovery. That means inspiration. That means choosing to play — and/or buy — something based on a visual component.

Image: Engadget

The same was true when the streaming services came along; not only in their natural inclusion of cover art, but also in their heavy investment in their visuals. Spotify and its ilk create packshots for curated playlists, for your playlists, for your recommendations.

Spotify’s personalised playlists

In order to drive discovery (and therefore stream counts, and therefore revenue), you, the user, are presented with images that directly remind you of flicking through record shelves, be they shelves of CDs, vinyl, tapes, or even the shelves of a record store.

Virgin Megastore, circa 2001, selling racks and racks and racks of CDs. You win a prize if you can spot me in this photo.

For us as multi-sensory creatures, the act of choosing what music to listen to has, for a very long time, been intrinsically tied to an aesthetic experience.

And this experience is entirely missing from AI-powered, voice-activated smart speakers.

That’s not to say that choosing what to play is dependent on looking at covers. Of course not. Sometimes we might be reminded of a song, or hear a snippet and want to hear more, or maybe just already have something in mind, and those sorts of situations have a purity about them: they’re about the music and nothing else. It’s not my intention to paint us as shallow creatures who need constant cues from one sense to inform another. And hey, it’s not like you can’t use your visually focussed apps to choose music and then send it to your smart speaker. My wife and I do that the majority of the time.

But my worry is that we’ve become so used to our smart speakers so quickly (because they adapt so well to our homes, and because ultimately we’re kind of lazy) that we’re going to lose that sense of music discovery. Even if the AI assists us with suggestions, it’s skewed by algorithms that analyse existing listening habits. Browsing is almost impossible without a physical component.

And for kids growing up in households where music is played — but, most importantly, selected — via a smart speaker, the norm is that music is entirely divorced from the real world. Yes, music by its very nature is intangible, but perhaps we might benefit from it having some form of visual anchor.

Stuart Heritage’s recent piece in The Guardian makes for amusing reading:

To my two-year-old son, it’s simply the way the world has always been. He’s heard us talk to Alexa so often he thinks humanity has always had the ability to retrieve music by yelling at a box. “LEXA!” he’s fruitlessly started to scream, “PLAY OLD MACDONALD HAD A FARM E-I-E-I-O AND ON THAT FARM HE HAD A HORSE …” This is because he doesn’t yet know the difference between a song title and the entirety of a song’s lyrics, and neither can he pronounce “Alexa” properly. The second he cracks that first syllable our lives will thrum with a cacophony of endless nursery rhymes.

To return to the point of my original article (that the concept of having a music collection has been radically changed by streaming services), I believe that a second change has happened with the invention of smart speakers. It’s not an argument between rental and ownership (because ultimately those differences are simply down to the hair-splitting of licensing terms); it’s a question of whether or not we’re moving too far away from elements that have helped shape our appreciation of music. Are we still able to discover new music as easily? Are we actually as invested in the music we stream via our smart speakers?

I would suggest: no.

My highly scientific chart based on highly scientific research

On a scale of emotional investment, purchasing a physical album of some sort has to be the highest: I’ve not only spent money on this thing, but it’s sitting there on my shelf, reminding me that it needs playing. Digital purchases are next down on the list: money has been spent, and although the album is physically absent, it’s represented in an iTunes (or equivalent) library with album art. Streaming music is next on the list, but that drop on the chart is significant; I’ll recall the quote that I cited in my original piece from Rob Weychert:

“Rather than investing in one album, I’ve invested in all the albums, which is the same as investing in none of them.”

And finally on our list, we have streaming music that comes via a smart speaker. It’s barely there.

An ironic aside is the issue of the hardware itself: we’re so lazy that asking Alexa to play something means we listen to music on the Echo far more frequently than our AirPlay-enabled (and, by contrast, cumbersome) Bowers & Wilkins A5, which has a far superior sound. So convenience trumps quality once again. When we do invest in music, we’re doing so in lo-fi.

That’s not to say that AI-powered voice assistants and the smart speakers they operate on aren’t absolutely amazing. Nor should we point the blame at them; we humans should shoulder the brunt of that ourselves because so much of this really does come down to lazy convenience. And, as I said before, coupling a smart speaker with a screen-based UI like an iPad gets around a lot of these voice-only gripes, so there’s that.

But the fact remains that as our technology has become more intelligent and our desire for convenience satiated, we’ve lost something in the process. We’ve severed ties with an aspect of listening that helped us choose new music, reminded us to play that music again, and informed our own sense of musical identity.

Now… what should I ask Alexa to play?