The CollabScore project, funded by the French National Research Agency (ANR), focuses on an original hybrid OMR approach, combined with a collaborative correction framework and a process for synchronizing multimedia musical sources. As with any project of this type, we have sought from the outset to establish the conditions for validating our work by building a reference dataset and a test environment that allow us to measure the effectiveness of our methods.
The dataset consists of 26 scores by Camille Saint-Saëns, totaling 199 pages and covering the main genres practiced by the composer, with the exception of operas. For each score, the dataset provides:
- images of the original edition, taken from the Gallica digital library,
- a reference encoding in MEI format,
- a set of annotations linking each image, and each region within an image, to the corresponding notation fragment in the reference score.
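
To make the three components concrete, the following is a minimal sketch of how one dataset entry could be represented in memory. It is an illustration only, not the project's actual data model: the class and field names (`Annotation`, `ScoreEntry`, `mei_id`, `annotations_on_page`), the pixel-based bounding box, the 1-based page numbering, and the file names are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """Links a rectangular image region to a fragment of the reference score."""
    page: int                          # 1-based page number (assumed convention)
    region: tuple[int, int, int, int]  # (x, y, width, height) in pixels (assumed)
    mei_id: str                        # identifier of the MEI element (assumed linking key)


@dataclass
class ScoreEntry:
    """One score: page images, a reference MEI encoding, and annotations."""
    title: str
    image_files: list[str]             # one image per page of the original edition
    mei_file: str                      # path to the reference MEI encoding
    annotations: list[Annotation] = field(default_factory=list)

    def annotations_on_page(self, page: int) -> list[Annotation]:
        """Return the annotations whose region lies on the given page."""
        return [a for a in self.annotations if a.page == page]


# Hypothetical entry with placeholder file names and coordinates.
entry = ScoreEntry(
    title="Example score",
    image_files=["page-001.jpg", "page-002.jpg"],
    mei_file="score.mei",
    annotations=[
        Annotation(page=1, region=(120, 340, 45, 60), mei_id="note-1"),
        Annotation(page=2, region=(80, 200, 50, 62), mei_id="note-2"),
    ],
)
```

Under this layout, calling `entry.annotations_on_page(1)` retrieves every image region on the first page together with the identifier of the notation fragment it points to, which is the kind of lookup an evaluation of OMR output against the reference would need.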