google search string:
Witten "search engine" Inverted "Managing Gigabytes" "Natural Language" algorithms
32
http://www.etl.go.jp/etl/divisions/~hasida/i-content/GDA/retrieval.html
Current content-based IR (information retrieval) systems usually use
either full-text matching or keyword spotting. A major problem with these
systems is overmatching: they are biased towards high recall at the expense of
precision. To improve both, matching mechanisms should exploit more information
in the text, such as the semantic class of keywords and sentence meaning.
Current NLP (natural language processing) techniques, however, are not
mature enough to pick up such information accurately. The GDA tags should
be of great help in improving NLP accuracy, which will in turn improve
IR.
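To make the overmatching point concrete, here is a minimal sketch of keyword spotting and how its precision and recall can be measured. The toy documents, the relevance judgments, and the function names `keyword_match` and `precision_recall` are all assumptions made up for illustration; nothing here comes from the GDA page itself.

```python
def keyword_match(query, documents):
    """Keyword spotting: return ids of documents containing any query term."""
    terms = set(query.lower().split())
    return {doc_id for doc_id, text in documents.items()
            if terms & set(text.lower().split())}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy collection: "bank" appears in two different senses.
DOCS = {
    1: "the bank approved the loan",
    2: "we walked along the river bank",
    3: "the bank raised interest rates",
    4: "fishing from the bank of the stream",
}
# Hypothetical relevance judgments: the user wants the financial sense only.
RELEVANT = {1, 3}

retrieved = keyword_match("bank", DOCS)
precision, recall = precision_recall(retrieved, RELEVANT)
print(sorted(retrieved), precision, recall)
```

Because the matcher cannot tell the two senses of "bank" apart, it returns every document: recall is perfect but only half of the results are relevant. Information about the semantic class of the keyword is what would let a matcher reject documents 2 and 4.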
Search engines:
Themefinder (Stanford)
New Zealand Digital Music Library
TuneServer
Links:
Overview Books & Articles
The New Zealand Digital Library MELody inDEX (D-Lib 1997)
Online Music Recognition and Searching (OMRAS)
In Search of a Lost Melody
Access to Music Information: The State of the Art (Downie)
Bainbridge, D., Nevill-Manning, C.G., Witten, I.H., Smith, L.A., and McNab, R.J. (1999). Towards a Digital Library of Popular Music. In Fox, E.A. and Rowe, N., editors, Proc. Digital Libraries 1999, pages 161--169.
Agosti, M., Bombi, F., Melucci, M., and Mian, G. (1999). Towards a digital library for the Venetian music of the eighteenth century. In Anderson, J., Deegan, M., Ross, S., and Harold, S., editors, Digital Content, Digital Methods. Office for Humanities Communication. In press.
W. B. Frakes and R. Baeza-Yates, editors. Information Retrieval: Data Structures and Algorithms. Englewood Cliffs, N.J.: Prentice-Hall.
T. Dao. An indexing model for structured documents to support queries on content, structure and attributes. In Proc. of the IEEE Forum on Research and Technology Advances in Digital Libraries, pages 88--97, California, USA, April 1998.