Teaching machines how to feel the music in a human way – part 1
July 7, 2016, by the team

A Three Part Series on Music Recommendation and Machines

Building an algorithm able to listen to music and say, “you might love this track because you played that one before,” takes more than compiling a set of mechanical rules. It is rather like reproducing a part of your brain!

Music recommendation has gained relevance in recent years. While online music services are still trying to prove the viability of their business models, they have learned that pure access isn’t enough; they need to push their products’ boundaries. Recommendation and discovery features have become their value proposition to captivate more users. On the artist’s side, music recommendation is crucial to gain exposure and reach an audience. Musicians need to cut through the noise, while listeners want to discover tracks they love.

As a scientific topic, however, music recommender systems have been around for decades. We can distinguish two kinds of technologies exploring this field: data and statistical analysis (e.g., collaborative filtering) and content-based analysis (a.k.a. acoustic analysis).


Statistical techniques like collaborative filtering are great at capturing the wisdom of the crowd. These systems recommend items based on similarity measures between users and/or items: the items recommended to a user are those preferred by “similar” users (e.g., on Amazon). These techniques can achieve great results when you have many millions of users and tons of interaction data.
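To make the idea concrete, here is a minimal sketch of item-based collaborative filtering in Python. The toy play-count matrix and all names are hypothetical; none of the services mentioned here disclose their actual implementations.

```python
# A minimal sketch of item-based collaborative filtering.
# Rows = users, columns = tracks; values are hypothetical play counts.
import numpy as np

plays = np.array([
    [5, 3, 0, 1, 0],
    [4, 0, 0, 1, 0],
    [1, 1, 0, 5, 4],
    [0, 1, 5, 4, 0],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine of the angle between two play-count vectors."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return a @ b / norm if norm else 0.0

def similar_tracks(track_idx, top_k=2):
    """Rank the other tracks by how similarly users consume them."""
    target = plays[:, track_idx]
    scores = [
        (j, cosine_similarity(target, plays[:, j]))
        for j in range(plays.shape[1]) if j != track_idx
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_k]

# Tracks most often played by the same users as track 0.
print(similar_tracks(0))
```

Note that nothing in this computation ever touches the audio itself: two tracks are “similar” only because the same people play them.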

A year ago, Spotify launched its Discover Weekly feature, mostly based on this approach. The playlist features songs Spotify thinks users will love, based on advanced data analysis and human curation. In less than ten months, it became one of the company’s most successful products, and every competitor on the market is now trying to ship its own version (Apple, SoundCloud, Pandora). This approach works very well for placing users in taste clusters and selecting a pool of customized tracks. It rests on an elegant expansion of the user’s established taste, linking it with listening data from other users with similar taste profiles. It unveils what music you will like (much as Netflix’s recommendation engine does), but not how you will like it.


This approach teaches machines what humans like to listen to without understanding what is being recommended. It is a deaf approach that mimics the record dealer’s behavior; it is not a DJ building a listening experience, and it doesn’t capture what the soundtrack of your life is. Collaborative filtering also tends to make predictable, familiar recommendations, favoring a dictatorship of popular artists that is harmful to the beautiful diversity of music. And it suffers from well-known issues like the cold-start problem: a brand-new track with no listening data can never be recommended.

The other objective of recommender systems is to capture the powerful emotional meaning of music. Here, it’s time to put the spotlight on content-based recommendation techniques. Acoustic analysis, signal processing, and machine listening are the general fields comprising systems that try to teach machines to understand audio the way humans do.

At Niland, we believe that managing mountains of music data is already a problem of the past. Understanding the music content itself is the missing link needed to develop more sophisticated ways to customize your listening. The next generation’s listening experience lies in music understanding and context awareness.


Analyzing music directly from the audio is far from a new idea. The signal processing discipline started when Pythagoras discovered the foundations of musical tuning, and in the second half of the last century it established itself as the science behind our digital lives. The main challenge of audio analysis for music recommendation lies in the translation between audio features and the attributes that affect listening preference: this is referred to as the semantic gap.
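As an illustration of the audio side of that translation, here is a minimal sketch of low-level feature extraction, assuming the open-source librosa library (our choice for the example; the post itself names no specific tool) and a hypothetical file path.

```python
# A minimal sketch of content-based audio analysis with librosa.
import librosa
import numpy as np

# Load ~30 seconds of a track (the file path is hypothetical).
y, sr = librosa.load("track.mp3", duration=30.0)

# Low-level descriptors that content-based systems typically start from.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # timbre
chroma = librosa.feature.chroma_stft(y=y, sr=sr)     # harmony
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)       # rhythm

# Summarize the track as one fixed-length vector, e.g. feature means.
track_vector = np.concatenate(
    [mfcc.mean(axis=1), chroma.mean(axis=1), np.ravel(tempo)]
)
print(track_vector.shape)  # (13 + 12 + 1,) = (26,)
```

The semantic gap is precisely the distance between such a vector of numbers and a human judgment like “this sounds melancholic” or “this would fit my running playlist.”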

The next part of this blog post will introduce you to the state of the art in audio-based recommendation systems.

In the last part (coming soon), we will explain how Niland’s technology compares to others and goes beyond traditional audio analysis thanks to deep learning techniques.
