MUMT 621: Music Information Acquisition, Preservation, and Retrieval [course page]
24 January 2012 :: Presentation I
Hannah Robertson [home page]
Million Song Dataset
The Million Song Dataset (MSD) is a collaboration between Thierry Bertin-Mahieux and Daniel P.W. Ellis of the Laboratory for the Recognition and Organization of Speech and Audio (LabROSA) at Columbia University and Brian Whitman and Paul Lamere of the Echo Nest. It was published in 2011 and presented at ISMIR. The MSD was funded in part by the National Science Foundation (NSF) and Google.
Using the MSD
Tutorials and demos
Links between the MSD and other resources
- The Echo Nest API: The MSD contains Echo Nest track, song, album, and artist identifiers for each song. In addition, the track analysis (segmentation, etc.) was performed using the Echo Nest algorithms.
- 7digital: The MSD contains the 7digital song ID and album release identifiers for each song.
- musicbrainz: The MSD contains the musicbrainz artist ID and the number of musicbrainz tags for each song.
- playme: The MSD contains the playme artist identifier for each song.
- Additional MSD-related datasets are also available, including SecondHandSongs, musiXmatch, Last.fm, and the Taste Profile subset.
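
The cross-references listed above can be pictured as a per-song record whose fields point into each external resource. The following is a minimal Python sketch of that idea, not the dataset's actual access code. The field names (`track_id`, `track_7digitalid`, `artist_mbid`, and so on) are assumptions loosely modeled on the MSD's naming and are not exhaustive, the helper `external_ids` is hypothetical, and treating empty strings and -1 as "missing" is likewise an assumption. The example record contains made-up placeholder values only.

```python
# Hypothetical mapping from each linked resource to the record fields that
# hold its identifiers. Field names are assumptions, not a verified schema.
RESOURCE_FIELDS = {
    "echonest": ["track_id", "song_id", "artist_id"],
    "7digital": ["track_7digitalid", "release_7digitalid"],
    "musicbrainz": ["artist_mbid"],
    "playme": ["artist_playmeid"],
}

# Values assumed to mean "no identifier available" (an assumption).
MISSING = (None, "", -1)


def external_ids(record):
    """Group the identifiers present in one song record by resource."""
    grouped = {}
    for resource, fields in RESOURCE_FIELDS.items():
        # Keep only the fields that exist and hold a non-missing value.
        found = {f: record[f] for f in fields if record.get(f) not in MISSING}
        if found:
            grouped[resource] = found
    return grouped


# Placeholder record with made-up values, for illustration only.
example_record = {
    "track_id": "TR_EXAMPLE",
    "song_id": "SO_EXAMPLE",
    "artist_id": "AR_EXAMPLE",
    "track_7digitalid": 12345,
    "release_7digitalid": 678,
    "artist_mbid": "",       # no musicbrainz ID in this placeholder
    "artist_playmeid": -1,   # no playme ID in this placeholder
}

links = external_ids(example_record)
print(links)
```

Grouping by resource makes it easy to see which external services a given song can be looked up in. Resources with no usable identifier are left out of the result.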