Music Mining

In the MusicMiner project we used signal processing and time series analysis methods to find numerical features that describe audio files containing music. We selected features such that songs that sound similar are described by similar numbers, while a large perceptual difference between sounds is reflected in very different numbers.

We used these features and an Emergent Self-Organizing Map (ESOM) to visualize a music collection in a way analogous to a geographical map. Similar songs are placed together in valleys, while large differences between songs appear as mountain ranges between them.

Recently we extended our exhaustive audio feature generation and proposed using logistic regression to generate easily understandable audio features. Such features can describe how melancholic a song sounds or how similar it is to what would typically be considered rock music.
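
The idea of turning low-level numbers into an interpretable score can be sketched as follows. This is only a minimal illustration, not the MusicMiner implementation: the two input features (loudness and spectral brightness), their values, the labels, and the function names train_logistic and rock_score are all assumptions made up for the example. A logistic regression is fitted by plain gradient descent, and its output between 0 and 1 is read as "how much does this song sound like rock".

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    n = len(X)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rock_score(features, w, b):
    """Interpretable feature: a value in (0, 1), higher means 'more like rock'."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, features)) + b)

# Hypothetical low-level features per song, scaled to [0, 1]:
# (loudness, spectral brightness). Label 1 = rock, 0 = not rock.
X = [[0.9, 0.8], [0.8, 0.7], [0.85, 0.9], [0.2, 0.3], [0.1, 0.2], [0.3, 0.1]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)
loud_song = rock_score([0.9, 0.9], w, b)   # expected to score above 0.5
quiet_song = rock_score([0.1, 0.1], w, b)  # expected to score below 0.5
```

The resulting score is a single number with a direct meaning, which is what makes such a feature easier to understand than the raw signal-processing values it is built from.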

The code for our audio feature generation is included in the MusicMiner software package, which you can also use to create maps of your personal music collection.

This research was performed under the supervision of Prof. Dr. Alfred Ultsch in the Databionics Research Group, Philipps-University of Marburg, Germany, with the help of many students and partly in collaboration with Ingo Mierswa, Artificial Intelligence Group, University of Dortmund, Germany.

The picture shows a map with 700 songs downloaded from internet radio stations. The songs are shown as colored dots, depending on the genre of the station. The genre information was not used during the training of the map.

The picture shows a decision tree for the Musical Audio Benchmark, based on audio features that describe how similar each song sounds to one of seven genre-specific radio stations. For example, a song is classified as Pop if it sounds somewhat like Rap but not like Country.
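
The Pop example corresponds to a single path through such a tree, which can be written as a short rule. The sketch below is an assumption-laden illustration only: the similarity scale from 0 to 1, the thresholds rap_min and country_max, and the function name looks_like_pop are hypothetical and are not taken from the actual tree.

```python
def looks_like_pop(similarity, rap_min=0.3, country_max=0.2):
    """One hypothetical tree path: somewhat like Rap, but not like Country.

    `similarity` maps a genre name to a score in [0, 1]; both thresholds
    are illustrative assumptions, not values from the real decision tree.
    """
    return similarity["Rap"] >= rap_min and similarity["Country"] <= country_max

song = {"Rap": 0.45, "Country": 0.05}
country_song = {"Rap": 0.45, "Country": 0.60}

is_pop = looks_like_pop(song)                  # both conditions hold
is_pop_country = looks_like_pop(country_song)  # fails the Country condition
```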