DejaVU Online -- Musical Feature Detection in ACOI

Music search facilities on the Web

There is a wealth of powerful search engines on the Web. Technically, search engines rely either on classification schemes (for example, Yahoo) or on content-based (keyword) indexing (for example, Excite or AltaVista). Searching the Web is nowadays moderately effective for text-based documents. For multimedia objects (such as images or music), existing search facilities are far less effective, simply because indexing by category or keyword cannot be done automatically.

We will first give some examples of search based on keywords and categories, then some examples of content-based search and finally we will discuss a more exhaustive list of musical databases and search facilities on the Web.

Keywords and categories

For musical material, in particular MIDI, a number of sites offer search over a body of collected works. One example is the Aria Database [Aria], which allows searching for an aria from an opera by title, category and even voice part. Another example is the MIDI Farm [Farm], which provides many MIDI-related resources and also allows searching for MIDI material by filename, author, artist and rating. A category can be selected to limit the search. The MIDI Farm employs voting to achieve collaborative filtering of the results of a query.
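To make the combination of keyword search, category restriction and voting concrete, the following is a minimal sketch in Python. It is not the MIDI Farm's implementation, which is not documented here: the record fields, the sample catalog and the use of a mean rating as the ranking criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MidiEntry:
    # Hypothetical record layout, mirroring the search keys named in the text.
    filename: str
    artist: str
    category: str
    votes: list  # user ratings, assumed to run from 1 (poor) to 5 (excellent)


def mean_rating(entry):
    """Average of the user votes; unrated entries score 0."""
    return sum(entry.votes) / len(entry.votes) if entry.votes else 0.0


def search(entries, keyword, category=None):
    """Case-insensitive keyword match on filename or artist,
    optionally limited to one category, best-rated first."""
    needle = keyword.lower()
    hits = [
        e for e in entries
        if (category is None or e.category == category)
        and (needle in e.filename.lower() or needle in e.artist.lower())
    ]
    return sorted(hits, key=mean_rating, reverse=True)


# Invented sample data, for illustration only.
CATALOG = [
    MidiEntry("twinkle_var1.mid", "Mozart", "classical", [5, 4, 5]),
    MidiEntry("twinkle_pop.mid", "Unknown", "children", [3, 2]),
    MidiEntry("twinkle_var2.mid", "Mozart", "classical", [3]),
]

results = search(CATALOG, "twinkle", category="classical")
print([e.filename for e in results])
```

Note that every field searched here must have been entered by hand, which is exactly where such indexes become error-prone.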

Search indexes for sites based on categories and keywords are usually created by hand, and sometimes erroneously. For example, a search for a Twinkle fragment returned Bach's variations on Twinkle, whereas to the best of our knowledge the only Twinkle variations are those by Mozart [Mozart].

The Digital Tradition Folksong Database [Folk] additionally provides a powerful free-text lyrics search facility based on the AskSam search engine [AskSam].

An alternative way of searching is to employ a meta-search engine. Meta-search engines assist the user in formulating an appropriate query, while leaving the actual search to (possibly multiple) search engines. Searching for musical content is generally restricted to the lyrics, but see below (and section Match).
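The division of labour in a meta-search engine can be sketched as follows. The two engines are stand-in functions over invented title lists, not real Web services, and the merging rule (each engine contributes 1/rank per result) is an assumption chosen for simplicity; actual meta-search engines may combine results quite differently.

```python
from collections import defaultdict


# Stand-in engines: each maps a query to a ranked list of result titles.
def lyrics_engine(query):
    corpus = ["Twinkle Twinkle Little Star", "Star of the County Down"]
    return [t for t in corpus if query.lower() in t.lower()]


def title_engine(query):
    corpus = ["Star of the County Down", "Stardust", "Twinkle Twinkle Little Star"]
    return [t for t in corpus if query.lower() in t.lower()]


def meta_search(query, engines):
    """Forward the query to every engine and merge the ranked lists.

    Each result earns 1/rank from every engine that returns it, so items
    found by several engines, or ranked highly, come first.
    """
    scores = defaultdict(float)
    for engine in engines:
        for rank, title in enumerate(engine(query), start=1):
            scores[title] += 1.0 / rank
    # Sort by descending score, then alphabetically to break ties.
    return sorted(scores, key=lambda t: (-scores[t], t))


merged = meta_search("star", [lyrics_engine, title_engine])
print(merged)
```

The meta-search layer itself holds no index; it only forwards the query and reconciles the answers.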

Content-based search

Although content-based search for images and sound has been a topic of interest for over a decade [MM], few results have been made available to the public. One example is the MuscleFish Datablade for Informix [Muscle], which allows information to be obtained from audio based on a content analysis of the audio object.

As far as content-based musical search facilities for the Web are concerned, one example is the Meldex system of the New Zealand Digital Library initiative, an experimental system that allows searching for tunes in a folksong database of approximately 1000 records [Meldex]. Querying facilities for Meldex include queries based on transcriptions of audio input, that is, humming a tune! We will discuss the approach taken for the Meldex system in more detail in section Match, to assess its viability for retrieving musical fragments in a large database.

Music databases

In addition to the sites previously mentioned, there exist several databases with musical information on the Web. We observe that these databases do not rely on DBMS technology at all. This obviously leads to a plethora of file formats and re-invention of typical DBMS facilities.

Without aiming for completeness, we have for example the MIDI Universe  [Robot], which offers over a million MIDI file references, indexed primarily by composer and file length. It moreover keeps relevant statistics on popular tunes, as well as a hot set of MIDI tunes. It further offers access to a list of related smaller MIDI databases.

Another example is the aforementioned Meldex system [MeldexDB], which offers a large collection of tunes (more than 100,000), part of which is accessible by humming-based retrieval. In addition, text-based search is possible against file names, song titles, track names and (where available) lyrics.

The Classical MIDI Archive  [Classic] is an example of a database allowing text-based search on titles only. Results are annotated with an indication of "goodness" and recency.

The Classical Themefinder Database [Themes] offers extensive support for retrieval based on (optional) indications of meter, pitch, pitch-class, interval, semi-tone interval and melodic contour, within a fixed collection of works arranged by composer and category. The index is clearly created and maintained manually. The resulting work is delivered in the MuseData format, a rich (research-based) file format from which MIDI files can be generated [Beyond].
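Several of the query keys listed above can be derived mechanically from a sequence of pitches. The sketch below, in Python, assumes that a melody is given as a list of MIDI note numbers and that contour is written with the letters U, D and R (up, down, repeat); both are conventions chosen for illustration, not the encoding Themefinder itself uses.

```python
def pitch_classes(notes):
    """MIDI note numbers reduced modulo 12 (0 = C, 1 = C sharp, ...)."""
    return [n % 12 for n in notes]


def semitone_intervals(notes):
    """Signed distance in semitones between consecutive notes."""
    return [b - a for a, b in zip(notes, notes[1:])]


def contour(notes):
    """Melodic contour: U (up), D (down) or R (repeat) per step."""
    return "".join(
        "U" if step > 0 else "D" if step < 0 else "R"
        for step in semitone_intervals(notes)
    )


# Opening of "Twinkle, Twinkle, Little Star" in C major: C C G G A A G
twinkle = [60, 60, 67, 67, 69, 69, 67]

print(pitch_classes(twinkle))       # [0, 0, 7, 7, 9, 9, 7]
print(semitone_intervals(twinkle))  # [0, 7, 0, 2, 0, -2]
print(contour(twinkle))             # RURURD
```

Each successive key discards information: pitch class drops the octave, intervals drop the absolute pitch, and contour keeps only the direction of each step.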

A site that collects library information concerning music resources is the International Inventory of Musical Sources (RISM) [RISM], which offers search facilities over bibliographic records for music manuscripts, librettos and secondary sources for music written after ca. 1600. It also allows searching for libraries related to the RISM site.

Tune recognition is apparently offered by the Tune Server [Tunes]. The user may search by submitting a WAV file containing a fragment of the melody. However, the actual matching occurs against a melodic outline, that is, indications of rising or falling pitch. The database contains approximately 15,000 records with such pitch contours, of which one third are popular tunes and the rest classical themes. The output is a ranked list of titles, on which the user is asked to give feedback.
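Matching against pitch contours and returning a ranked list can be sketched as follows. How the Tune Server actually compares contours is not stated here; the sketch assumes contours written as strings over U, D and R (up, down, repeat) and ranks records by Levenshtein edit distance, a standard approximate string-matching measure. Extracting the contour from the WAV file is outside the scope of the sketch, and the three database records are invented samples.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def rank_by_contour(query, database):
    """Return (title, distance) pairs, closest contour first."""
    scored = [(title, edit_distance(query, c)) for title, c in database.items()]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical records: title -> contour of the opening notes.
DATABASE = {
    "Twinkle, Twinkle, Little Star": "RURURD",
    "Ode to Joy": "RUURDDD",
    "Frere Jacques": "UUDRUUD",
}

# A query with one wrong step (last note up instead of down).
ranking = rank_by_contour("RURURU", DATABASE)
print(ranking[0][0])
```

Because a contour alphabet has only three symbols, short queries match many records, which is one reason to return a ranked list and ask the user for feedback rather than report a single answer.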

Discussion

There is great divergence in the scope and aims of music databases on the Web. Some, such as the RISM database [RISM], are the result of musicological investigations, whereas others, such as the MIDI Farm [Farm], are meant to serve an audience looking for popular tunes. With regard to the actual search facilities offered, we observe that, with the exception of Meldex [Meldex] and the Tune Server [Tunes], the query facilities are usually text-based, although the Classical Themefinder [Themes], for example, allows melodic contour to be encoded in a text-based fashion.
Hush Online Technology
hush@cs.vu.nl
07/22/99