A few months ago, when I had time to do some private development of music-related software ( http://www.samsaffron.com/projects.html ), I started investigating how to get proper audio fingerprinting for my music collection. My problem was simple: I wanted to detect all the duplicate songs in my collection. The technology is out there (Philips and Moodlogic have proper implementations), but nothing is open (even Relatable is closed, and its performance is rather poor).
So…
During my investigation I Googled my way through the current developments in computer audition. It turns out that it is pretty easy to get a list of songs that sound like a given song (query by example). More interestingly, some cutting-edge research can predict the genre of a piece of music fairly accurately (at least as well as humans can).
A very interesting read is "Beyond the Query-by-Example Paradigm: New Query Interfaces for Music Information Retrieval", available at:
http://www-2.cs.cmu.edu/~gtzan/work/publications.html
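To make the query-by-example idea concrete, here is a minimal sketch of how these systems tend to work (my own toy illustration in Python, not Marsyas code, and the three features I picked are just plausible examples): reduce each song to a small feature vector, then treat "songs that sound like this one" as its nearest neighbours in that feature space.

    import numpy as np

    def feature_vector(samples, sr=44100, frame=2048):
        """Reduce raw mono PCM samples to a tiny timbral feature vector:
        (mean spectral centroid, mean 85% rolloff, mean RMS energy).
        Real systems would normalise each dimension across the library."""
        centroids, rolloffs, rmss = [], [], []
        for start in range(0, len(samples) - frame, frame):
            window = samples[start:start + frame] * np.hanning(frame)
            mag = np.abs(np.fft.rfft(window))
            freqs = np.fft.rfftfreq(frame, 1.0 / sr)
            centroids.append((freqs * mag).sum() / (mag.sum() + 1e-12))  # brightness
            cum = np.cumsum(mag)
            rolloffs.append(freqs[np.searchsorted(cum, 0.85 * cum[-1])])  # where 85% of energy sits
            rmss.append(np.sqrt((window ** 2).mean()))                    # loudness
        return np.array([np.mean(centroids), np.mean(rolloffs), np.mean(rmss)])

    def query_by_example(example_vec, library, k=5):
        """Rank library songs by Euclidean distance to the example's
        features. `library` maps song name -> feature vector."""
        dists = {name: np.linalg.norm(vec - example_vec)
                 for name, vec in library.items()}
        return sorted(dists, key=dists.get)[:k]

Real systems use much richer features (MFCCs, beat histograms and so on), but the nearest-neighbour structure is the same, and feeding vectors like these into a simple classifier is the basis of the genre-prediction work too.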
Anyway,
The code for doing all of these very cool things is open and on SourceForge (see the Marsyas project on sf.net).
I think it would be ultra cool if the empeg (or Karma, or whatever) could cycle through your various genres (not based on ID3 tags), create automatic playlists based on an example song, or be told to play "faster" or "slower" songs; a rough sketch of that last idea is below.
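A hedged sketch of what "play faster songs" could mean internally, assuming the player already has a per-song tempo estimate in BPM (getting that estimate is the hard part, and is the kind of thing Marsyas tackles):

    def faster_than(current_song, tempos, count=10):
        """Given a dict mapping song name -> estimated BPM, return up to
        `count` songs faster than the current one, slowest-first so the
        speed-up feels gradual rather than jarring."""
        current_bpm = tempos[current_song]
        candidates = sorted((bpm, name) for name, bpm in tempos.items()
                            if bpm > current_bpm)
        return [name for bpm, name in candidates[:count]]

"Slower" would just flip the comparison and the sort order.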
I think that as our music collections grow we need better ways of doing context-aware browsing; handling a 20 GB collection is very different from handling a 500 GB one.
Lastly, I think it is pretty critical that an open-source song fingerprinter gets developed. This technology would let us correct metadata on files (e.g. via MusicBrainz) and, more importantly, collect user-preference metadata (such as global ratings, or "people who like this also like that"). A toy sketch of the idea follows.
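For the curious, here is a toy fingerprint in the spirit of the published Philips approach (sub-fingerprints built from the signs of energy differences across frequency bands and time). This is my own simplified sketch, not their algorithm, and all the parameters are made up:

    import numpy as np

    def fingerprint(samples, frame=2048, hop=1024, bands=33):
        """Toy acoustic fingerprint over mono PCM samples: per frame, sum
        spectral energy into coarse bands, then emit one bit per adjacent
        band pair -- did that pair's energy difference grow since the
        previous frame? 33 bands -> 32 bits per sub-fingerprint."""
        energies = []
        for start in range(0, len(samples) - frame, hop):
            window = samples[start:start + frame] * np.hanning(frame)
            mag = np.abs(np.fft.rfft(window)) ** 2
            energies.append([b.sum() for b in np.array_split(mag, bands)])
        E = np.array(energies)
        band_diff = np.diff(E, axis=1)          # difference between adjacent bands
        bits = np.diff(band_diff, axis=0) > 0   # change of that difference over time
        return np.packbits(bits, axis=1)        # one 4-byte sub-fingerprint per frame

Matching (and therefore duplicate detection) then comes down to Hamming distance between bit strings, the idea being that it stays small across different encoders and bitrates because only coarse energy relationships are kept.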
What do the people on the group think?
When are we going to see the first open-source, context-aware music browser? Will there ever be an open-source audio fingerprinter?