Application of Optical Music Recognition technologies for the development of OCVE
Ichiro Fujinaga (2004/11/29)

Optical Music Recognition (OMR) systems create symbolic, computer-readable versions of musical scores by optically processing bit-mapped images of the scores. The tools used to achieve this include image-processing and machine-learning algorithms. Research on OMR started in the late 1960s, and some systems are commercially available, but the technology is far from perfect, especially compared with Optical Character Recognition (OCR) systems. There are two likely major reasons for the difference. The first is that there is more commercial interest in converting older text documents (e.g., government documents and newspapers) to computer-readable format. The second is that the task is much more difficult, because music scores have a two-dimensional structure and vary widely in complexity. These factors complicate even the evaluation of different OMR systems. It is not trivial to compare two outputs produced by different systems from the same score, because the ordering of the symbols in the symbolic format may differ. Furthermore, some systems work better on certain types of scores than on others.
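
The ordering problem mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each system's output has already been reduced to a flat list of notes; the `Note` record, its fields, and the `compare_outputs` function are hypothetical and do not correspond to any particular OMR product or file format. Treating each output as a multiset of notes makes the comparison independent of the order in which the notes were written out.

```python
from collections import Counter
from typing import NamedTuple


class Note(NamedTuple):
    bar: int         # bar number, starting at 1
    onset: float     # position within the bar, in quarter notes
    pitch: int       # MIDI pitch number (60 = middle C)
    duration: float  # length in quarter notes


def compare_outputs(a, b):
    """Return the notes found only in a and only in b, ignoring list order."""
    count_a, count_b = Counter(a), Counter(b)
    only_a = sorted((count_a - count_b).elements())
    only_b = sorted((count_b - count_a).elements())
    return only_a, only_b


# Hypothetical data: system A lists notes voice by voice, system B by onset.
system_a = [Note(1, 0.0, 61, 1.0), Note(1, 1.0, 68, 1.0), Note(1, 0.0, 49, 2.0)]
system_b = [Note(1, 0.0, 49, 2.0), Note(1, 0.0, 61, 1.0), Note(1, 1.0, 70, 1.0)]

only_a, only_b = compare_outputs(system_a, system_b)
print("only in A:", only_a)
print("only in B:", only_b)
```

Only the note on which the two systems genuinely disagree is reported; the different ordering of the other notes is ignored. A real evaluation would also have to cover symbols other than notes (rests, slurs, dynamics, pedal markings), which this sketch leaves out.
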
Despite these difficulties, incorporating OMR technologies into OCVE is highly recommended, and using and developing an OMR system for this project will in turn enhance OMR systems:

  1. In the long run, using OMR will be a less expensive method of converting musical scores to symbolic format than manual input.
  2. Since the piano music of Chopin is among the most challenging musical scores for OMR systems, if the problems are solved here, OMR for other repertoire will benefit.
  3. Once a symbolic version of a musical piece is created, the recognition of the variants can be simplified by using that version.
  4. Since one of the major goals of OCVE is to detect differences between variants, the techniques to detect the differences will be useful for evaluating different OMR systems.

There is no question that the success of OCVE depends on the creation of symbolic versions of the scores and their variants. Without such a representation, most of the project goals, such as detailed bar-by-bar comparison across variants and the creation of new editions, will be very difficult to attain. Comparison using only images, without symbolic information, is very limited, as Figure 1 shows. The figure reproduces the first 13 bars of Chopin’s Berceuse, Op.57, from six different editions. Note that Figures 1a-c use two systems for the 13 bars, whereas the others use three systems (Figures 1d, 1e) and four systems (Figure 1f). Although both Figure 1a and Figure 1b use two systems, their first systems contain different numbers of bars, six and seven, respectively. Figure 1b and Figure 1c have exactly the same number of bars in both systems, yet they contain a significant difference, namely, the different pitch and rhythm in measure 13. Figure 1d and Figure 1e are very similar in appearance, yet Figure 1d is more closely related in content to Figure 1f because of the similarity in the pedal markings. On the other hand, Figure 1d is unique in this collection because it is the only edition in which the downbeat bass notes (Db) have downward stems. Figure 1f is the only edition missing the piano marking (the letter p) in the first bar.

Figure 1a. Op.57, Leipzig, Breitkopf & Härtel 1845.

Figure 1b. Op.57, Leipzig, Breitkopf & Härtel not before 1846.

Figure 1c. Op.57, Leipzig, Breitkopf & Härtel 1872 or 1873.

Figure 1d. Op.57, Paris, J. Meissonnier 1845.

Figure 1e. Op.57, Philadelphia, F.A. North & Co. 1872.

Figure 1f. Op.57, London, Wessel & Co. between 1848 and 1856.
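
To make the contrast with image-only comparison concrete, the following minimal sketch shows how a bar-by-bar comparison might look once symbolic versions exist. It assumes a hypothetical representation in which each edition is a mapping from bar number to a list of (onset, pitch, duration) events; the function name and the sample data are illustrative placeholders and are not transcribed from the editions in Figure 1.

```python
def differing_bars(edition_a, edition_b):
    """Return the bar numbers whose note content differs between two editions.

    Each edition is a dict mapping a bar number to a list of
    (onset, pitch, duration) tuples; page layout is not represented at all.
    """
    all_bars = sorted(set(edition_a) | set(edition_b))
    return [
        bar for bar in all_bars
        if sorted(edition_a.get(bar, [])) != sorted(edition_b.get(bar, []))
    ]


# Placeholder content, not transcribed from the editions in Figure 1.
edition_x = {12: [(0.0, 65, 1.0)], 13: [(0.0, 68, 1.0)]}
edition_y = {12: [(0.0, 65, 1.0)], 13: [(0.0, 70, 0.5), (0.5, 68, 0.5)]}

print(differing_bars(edition_x, edition_y))
```

Because bars are addressed by number rather than by position on the page, differences in the number of systems or in the number of bars per system play no part in the comparison, while a change of pitch or rhythm within a bar is reported directly.
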

Unlike the digitization of text, which can be fairly easily and inexpensively outsourced to developing countries where data entry clerks are taught to type foreign languages, such as English, French, or even Latin, music encoding cannot be taught to workers who lack extensive knowledge of music and music notation. This makes manual entry of music into the computer a very costly endeavor. Therefore, although OMR technology is also currently expensive to use, as it improves it will eventually become the preferred method for digitizing the vast number of culturally important music scores.

For OMR systems, piano music is among the most difficult kinds of score to deal with. Within the piano repertoire, the works of Chopin and Liszt are considered to be the most difficult. Thus, developing an OMR system in this project presents an opportunity to create a very robust system. It is expected, nevertheless, that manual intervention will be necessary for this music. Examples of the complex notation are shown in Figure 2.

Figure 2a. An example of complicated notation (Op.25/6).

Figure 2b. An example of complicated notation (Op.25/10).

Figure 2c. An example of complicated notation (Op.53).

Figure 3 is an example of a score where encoding the music becomes a challenge. Note that in the first two bars there are two voices in the left hand (the bottom stave), but one of the voices ends abruptly on the downbeat of the third bar. The right hand (the top stave) consists mostly of one voice, except that extraneous notes appear in the middle of bar 3 (Db) and in the middle of bar 4 (D-natural). These notes are problematic because they break the syntactical rules of music notation.

Figure 3. An example of an inconsistent number of voices (Op.47).

Despite the complexity of the notation in this project, many of the scores that need to be converted to symbolic format are very similar to one another, as shown in Figure 1. This feature of the project’s collection of scores can be exploited by the OMR system. In fact, MIDI versions of the scores, which contain pitch and duration information, may be useful as a starting point for the OMR process.
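
One way such prior information might be used is sketched below. This is only an assumption about how a reference version could help, not a description of an existing system: it supposes a hypothetical recognizer that proposes several candidate pitches for a symbol, each with a confidence score, and gives a small bonus to candidates that agree with the pitches expected at that position in a MIDI or previously encoded version. The function name and the `bonus` value are illustrative.

```python
def rerank_with_reference(candidates, reference_pitches, bonus=0.2):
    """Pick a pitch for one symbol, favouring agreement with a reference version.

    candidates: list of (pitch, score) guesses from a hypothetical recognizer.
    reference_pitches: set of MIDI pitches expected at this position.
    """
    def adjusted_score(candidate):
        pitch, score = candidate
        return score + (bonus if pitch in reference_pitches else 0.0)

    return max(candidates, key=adjusted_score)[0]


# An ambiguous reading is resolved in favour of the reference pitch ...
print(rerank_with_reference([(61, 0.48), (63, 0.52)], {61}))
# ... but a confident reading that disagrees with the reference is kept.
print(rerank_with_reference([(63, 0.90), (61, 0.10)], {61}))
```

Keeping the bonus small matters for this project: a genuine variant differs from the reference by definition, so the reference should settle only ambiguous readings and never override clear evidence from the image.
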

Even though OMR technology is not yet developed enough to be cost-effective, since its output still requires careful checking and editing, further development of OMR in this project will make it possible, for example, to compare the results of two OMR systems on the same score. This will be very useful for checking the accuracy of OMR systems.