There is no question that the success of OCVE depends on the creation of symbolic versions of the scores and their variants. Without such a representation, most of the project goals, such as detailed bar-by-bar comparison across different variants and the creation of new editions, will be very difficult to attain. Comparison using only images, without symbolic information, is very limited, as Figure 1 shows. The figure presents the first 13 bars of Chopin’s Op.57 Berceuse from six different editions. Note that Figures 1a-c use two systems for the 13 bars, whereas the others use three systems (Figures 1d, 1e) or four systems (Figure 1f). Although both Figure 1a and Figure 1b use two systems, their first systems contain different numbers of bars, six and seven, respectively. Figure 1b and Figure 1c have exactly the same number of bars in both systems, yet they contain a significant difference, namely, the different pitch and rhythm in bar 13. Figure 1d and Figure 1e are very similar in appearance, yet Figure 1d is more closely related content-wise to Figure 1f because of the similarity in the pedal markings. On the other hand, Figure 1d is unique in this collection because it is the only edition in which the stems of the downbeat bass notes (Db) point downward. Figure 1f is the only edition missing the piano marking (the letter p) in the first bar.
Figure 1a. Op.57, Leipzig, Breitkopf & Härtel 1845.
Figure 1b. Op.57, Leipzig, Breitkopf & Härtel not before 1846.
Figure 1c. Op.57, Leipzig, Breitkopf & Härtel 1872 or 1873.
Figure 1d. Op.57, Paris, J. Meissonnier 1845.
Figure 1e. Op.57, Philadelphia, F.A. North & Co. 1872.
Figure 1f. Op.57, London, Wessel & Co. between 1848 and 1856.
Unlike the digitization of text, which can be fairly easily and inexpensively
outsourced to developing countries where data entry clerks are taught to type
foreign languages, such as English, French, or even Latin, music encoding cannot
be taught to workers who lack extensive knowledge of music and music notation.
This makes the process of manually entering music into the computer a very costly
endeavor. Therefore, although OMR technology is also currently expensive to use,
it will, as it improves, eventually become the proper method for digitizing the
vast number of culturally important music scores.
For OMR systems, piano music is among the most difficult kinds of score to deal with. Within the piano repertoire, the works of Chopin and Liszt are considered to be the most difficult. Thus, developing an OMR system in this project presents an opportunity to create a very robust system. It is expected, nevertheless, that manual intervention will be necessary for this music. Examples of the complex notation are shown in Figure 2.
Figure 2a. An example of complicated notation (Op.25/6).
Figure 2b. An example of complicated notation (Op.25/10).
Figure 2c. An example of complicated notation (Op.53).
Figure 3 is an example of a score where encoding the music becomes a challenge.
Note that in the first two bars, there are two voices in the left hand (the
bottom stave), but one of the voices ends abruptly on the downbeat of the third
bar. The right hand (the top stave) mostly consists of one voice, except that
extraneous notes appear in the middle of bar 3 (Db) and in the middle of bar 4
(d-natural). These notes are problematic since they break the syntactical rules
of music notation.
Figure 3. An example of an inconsistent number of voices (Op.47).
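One way to see why such notes are awkward to encode is to check whether every voice in a bar accounts for the bar's full duration. The following is a minimal sketch in Python, not the project's encoding: the dictionary layout, the voice labels, and the function name `incomplete_voices` are all assumptions made for illustration, and the bar shown is invented rather than transcribed from Figure 3.

```python
from fractions import Fraction
from typing import Dict, List

# Hypothetical encoding (an assumption for illustration): a bar maps each
# voice label to the list of its note and rest durations, in whole-note units.
BarVoices = Dict[str, List[Fraction]]


def incomplete_voices(bar: BarVoices, bar_length: Fraction) -> Dict[str, Fraction]:
    """Return each voice whose durations do not fill the bar, with its total."""
    totals = {voice: sum(durations, Fraction(0)) for voice, durations in bar.items()}
    return {voice: total for voice, total in totals.items() if total != bar_length}


# Invented bar in 6/8: the main voice fills the bar, while an extra note
# appears mid-bar with no rests written before or after it.
bar = {
    "right hand, voice 1": [Fraction(1, 8)] * 6,
    "right hand, extra note": [Fraction(1, 8)],
}

# Voices reported here would need hidden rests or manual editing.
print(incomplete_voices(bar, Fraction(6, 8)))
```

A strict encoder would reject the reported voice or require the missing rests to be supplied, whereas a more permissive encoding would have to record the time at which the extra note begins.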
Despite the complexity of the notation in this project, many scores that need
to be converted to symbolic format are very similar to one another, as shown in
Figure 1. This feature of the project’s collection of scores can be exploited
by the OMR system. In fact, MIDI versions of the scores, which contain pitch
and duration information, may be useful as a starting point for the OMR process.
Even though OMR technology is not yet developed enough to be cost-effective, since its output requires careful checking and editing, further development of OMR in this project will make it possible, for example, to compare the results of two OMR systems on the same score. This will be very useful for checking the accuracy of OMR systems.
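Once two readings of the same score exist in symbolic form, a bar-by-bar comparison of this kind is straightforward. The sketch below is only an illustration under stated assumptions: the tuple-based encoding, the function names `differing_bars` and `agreement_rate`, and the toy data are all hypothetical and do not represent the output format of any particular OMR system.

```python
from typing import List, Tuple

# Hypothetical encoding (not the project's actual format): a note is a
# (pitch name, duration in quarter notes) pair, a bar is a tuple of notes,
# and a score is a list of bars.
Note = Tuple[str, float]
Bar = Tuple[Note, ...]


def differing_bars(output_a: List[Bar], output_b: List[Bar]) -> List[int]:
    """Return the 1-based numbers of the bars on which two readings disagree."""
    if len(output_a) != len(output_b):
        raise ValueError("both readings must contain the same number of bars")
    return [
        number
        for number, (bar_a, bar_b) in enumerate(zip(output_a, output_b), start=1)
        if bar_a != bar_b
    ]


def agreement_rate(output_a: List[Bar], output_b: List[Bar]) -> float:
    """Return the fraction of bars on which the two readings agree."""
    if not output_a:
        return 1.0
    return 1.0 - len(differing_bars(output_a, output_b)) / len(output_a)


# Invented toy data standing in for two OMR readings of the same three bars.
system_one = [
    (("Db3", 1.0), ("Ab3", 1.0)),
    (("Db3", 1.0), ("Ab3", 1.0)),
    (("F4", 2.0),),
]
system_two = [
    (("Db3", 1.0), ("Ab3", 1.0)),
    (("Db3", 1.0), ("Ab3", 1.0)),
    (("F4", 1.0), ("Gb4", 1.0)),
]

# Bars listed here are candidates for manual checking.
print(differing_bars(system_one, system_two))
```

Bars on which the two readings agree are not guaranteed to be correct, since both systems may make the same error, but the bars on which they disagree indicate where manual checking is most needed.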