|
In the MusicMiner project we used signal processing and time series analysis methods to find numerical features that describe music audio files. We selected features such that songs that sound alike are described by similar numbers, while a large perceptual difference between sounds is reflected in very different numbers.
We used these features and an Emergent Self-Organizing Map (ESOM) to visualize a music collection in a way analogous to geographical maps. Similar songs are placed together in valleys, while large differences between songs appear as mountain ranges between them.
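The valley and mountain picture can be illustrated with a small sketch. It assumes the landscape height is a U-matrix style value, that is, the average distance between a map neuron's weight vector and those of its grid neighbors; the text above does not state which height definition is used, so this is an assumption. The function name `u_matrix`, the 4-neighborhood, the non-wrapping grid borders, and the toy weights are all hypothetical choices made for illustration.

```python
import math

def u_matrix(weights):
    """Return a height for every neuron of a rectangular map.

    weights[r][c] is the weight vector of the neuron at grid position (r, c).
    The height is the mean Euclidean distance to the direct grid neighbors
    (up, down, left, right). Borders do not wrap around in this sketch.
    """
    rows, cols = len(weights), len(weights[0])
    heights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # A 1x1 map has no neighbors; keep its height at zero.
            heights[r][c] = sum(dists) / len(dists) if dists else 0.0
    return heights

# Toy 2x4 map (made-up weights): the two left columns hold one group of
# similar vectors, the two right columns another, with a large gap between.
grid = [
    [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]],
    [[0.0, 0.1], [0.1, 0.1], [5.0, 5.1], [5.1, 5.1]],
]
heights = u_matrix(grid)
```

In this toy map the neurons in the outer columns get low heights (valleys), while the neurons next to the gap between the two groups get high heights (a mountain range).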
Recently we extended our exhaustive audio feature generation and proposed using logistic regression to generate easily understandable audio features. Such features can describe how melancholic a song sounds or how similar it is to what would typically be considered rock music.
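As a rough illustration of the idea, the sketch below fits a logistic regression on low-level feature vectors and uses the predicted probability as a single, readable "rock-likeness" score. It is a minimal sketch under stated assumptions, not the method from the publications: the two-dimensional synthetic features, the labels, the gradient descent settings, and the names `fit_logistic` and `rock_likeness` are all hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rock_likeness(x, w, b):
    """Probability in [0, 1] that a song with features x belongs to the 'rock' group."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Synthetic stand-ins for low-level audio features (assumption: 2 features per song).
rng = random.Random(0)
rock = [[rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)] for _ in range(50)]
other = [[rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)] for _ in range(50)]
X = rock + other
y = [1] * 50 + [0] * 50

w, b = fit_logistic(X, y)
score_rock = rock_likeness([1.0, 1.0], w, b)     # a song near the 'rock' group
score_other = rock_likeness([-1.0, -1.0], w, b)  # a song near the other group
```

The appeal of such a feature is that a single number between 0 and 1 has a direct reading ("how much does this sound like rock?"), unlike a raw low-level descriptor.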
The code for our audio feature generation is included in the MusicMiner software package that you can also use to create maps of your personal music collection.
This research was performed under the supervision of Prof. Dr. Alfred Ultsch in the Databionics Research Group, Philipps-University of Marburg, Germany, with the help of many students and partly in collaboration with Ingo Mierswa, Artificial Intelligence Group, University of Dortmund, Germany.
| Yu, Y., Downie, J.S., Mörchen, F., Chen, L., Joe, K.: Using Exact Locality Sensitive Mapping to Group and Detect Audio-Based Cover Songs, In Proceedings 10th IEEE International Symposium on Multimedia (ISM), IEEE Computer Society, (2008), pp. 302-309 IEEE |
| Yu, Y., Downie, J.S., Mörchen, F., Chen, L., Joe, K., Oria, V.: COSIN: content-based retrieval system for cover songs, Abdulmotaleb El-Saddik, Son Vuong, Carsten Griwodz, Alberto Del Bimbo, K. Selcuk Candan, Alejandro Jaimes (Eds), In Proceedings 16th ACM International Conference on Multimedia, ACM, (2008), pp. 987-988 ACM |
| Weihs, C., Ligges, U., Mörchen, F., Müllensiefen, D.: Classification in Music Research, Advances in Data Analysis and Classification 1(3), Springer, (2007), pp. 255-291 SpringerLink |
| Risi, S., Mörchen, F., Ultsch, A., Lewark, P.: Visual mining in music collections with Emergent SOM, In Proceedings Workshop on Self-Organizing Maps (WSOM), Bielefeld, Germany, (2007) |
| Mörchen, F., Mierswa, I., Ultsch, A.: Understandable Models Of Music Collections Based On Exhaustive Feature Generation With Temporal Statistics, Tina Eliassi-Rad, Lyle H. Ungar, Mark Craven, Dimitrios Gunopulos (Eds), In Proceedings The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, (2006), pp. 882-891 ACM Digital Library |
| Mörchen, F., Ultsch, A., Thies, M., Löhken, I.: Modelling timbre distance with temporal statistics from polyphonic music, IEEE Transactions on Speech and Audio Processing 14(1), IEEE Press, (2006), pp. 81-90 IEEE Xplore |
| Mörchen, F., Ultsch, A., Nöcker, M., Stamm, C.: Databionic visualization of music collections according to perceptual distance, Joshua D. Reiss, Geraint A. Wiggins (Eds), In Proceedings 6th International Conference on Music Information Retrieval (ISMIR 2005), London, UK, (2005), pp. 396-403 |
| Mörchen, F., Ultsch, A., Nöcker, M., Stamm, C.: Visual mining in music collections, In Proceedings 29th Annual Conference of the German Classification Society (GfKl 2005), Springer, Heidelberg, (2005), pp. 724-731 |
| Mörchen, F., Ultsch, A., Thies, M., Löhken, I., Nöcker, M., Stamm, C., Efthymiou, N., Kümmerer, M.: MusicMiner: Visualizing timbre distances of music as topographical maps, Technical Report No. 47, Dept. of Mathematics and Computer Science, University of Marburg, Germany, (2005) |
|
The picture shows a map of 700 songs downloaded from internet radio stations. Each song is shown as a dot colored according to the genre of its station. The genre information was not used during the training of the map.
The picture shows a decision tree for the Musical Audio Benchmark, based on audio features that describe how similar each song sounds to one of seven genre-specific radio stations. For example, a song is classified as Pop if it sounds somewhat like Rap but not like Country.
|