e2e-omr-resources

To run an end-to-end OMR workflow, you need the resources indicated in the end-to-end OMR documentation. This repository provides those resources; some are applicable to different manuscripts, while others are specific to the Salzinnes Antiphonal:

  • The document analysis (document segmentation) models. These models are used to segment the image of a manuscript page into different layers (e.g., the music symbol layer, the staff-line layer, and the background layer); each model is trained to identify the pixels belonging to a particular layer. The three models trained for the Salzinnes manuscript (the music-symbol, staff-line, and background models) can be found in document_analysis/Salzinnes/models; a sketch of how one of them might be applied appears after this list. These models were generated by the Training model for Patchwise Analysis of Music Document - HPC job, using the images provided in the folder document_analysis/Salzinnes/training_data.

    There is also a non-HPC version of the job. Choose between the two based on the resources required: the HPC version is intended for runs that need a large amount of memory.

    For information about the settings of the training job used to generate these models for Salzinnes and other details, please consult the document_analysis/Salzinnes directory.

  • To process the music symbol layer and successfully identify the individual symbols, you also need to provide two files to the interactive (or non-interactive) classifier (a sketch that inspects the training data follows this list):

    • the training data for classifying music symbols: music_symbol_recognition/split_training.xml
    • the feature selection/weights used to optimize the classification process: music_symbol_recognition/split_features.xml
  • For the processing of the text (extracted from the background layer), you need two files:

    • the OCR model: text_alignment/salzinnes_model-00054500.pyrnn
    • the actual transcript text for each page of the manuscript. The text can be obtained from the Cantus Database. See the instructions on how to retrieve a CSV file containing the text of the complete manuscript, and how to identify the rows in that file that correspond to the text on a particular page. You can find the extracted texts for Salzinnes in text_alignment/Salzinnes/, and the Python script used to retrieve these text files in text_alignment/get_text_per_folio.py (an illustrative sketch of the per-folio extraction follows this list).
  • Finally, to encode everything together in an MEI file, you also need to provide a CSV file mapping the classes of glyphs to fragments of MEI code. This file can be obtained through CRES. For square neume notation, you could use one of the two files provided: mei_encoding/csv-square_notation_neume_component_level.csv if the classifier splits all neumes into puncta, or mei_encoding/csv-square_notation_neume_level.csv if the classifier treats basic neumes (specifically the clivis, the podatus, the torculus, and the scandicus) as complete glyphs. A sketch of reading this mapping appears after this list.
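
As a rough illustration of how the layer-separation models might be applied outside the workflow jobs, the sketch below assumes they are Keras HDF5 pixelwise classifiers that take fixed-size RGB patches and return a per-pixel probability map for their layer. The patch size, normalisation, output shape, and file names are assumptions, not documented properties of the models in this repository.

```python
# Minimal sketch: apply one trained layer-separation model to a page image.
# Assumptions (not confirmed by this repository): Keras HDF5 model, fixed-size
# RGB patches, per-pixel probabilities for a single layer in channel 0.
import numpy as np
from PIL import Image
from tensorflow.keras.models import load_model

PATCH = 256  # assumed patch size; match whatever the training job used

def predict_layer(model_path, page_path, threshold=0.5):
    """Return a binary mask for the layer this model was trained on."""
    model = load_model(model_path)
    page = np.asarray(Image.open(page_path).convert("RGB"), dtype=np.float32) / 255.0
    h, w, _ = page.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    # Slide non-overlapping patches over the page and keep pixels the model
    # assigns to its layer with probability above the threshold.
    for y in range(0, h, PATCH):
        for x in range(0, w, PATCH):
            patch = page[y:y + PATCH, x:x + PATCH]
            ph, pw, _ = patch.shape
            padded = np.zeros((PATCH, PATCH, 3), dtype=np.float32)
            padded[:ph, :pw] = patch
            prob = model.predict(padded[np.newaxis, ...], verbose=0)[0]
            layer_prob = prob[..., 0] if prob.ndim == 3 else prob  # assumed output layout
            mask[y:y + ph, x:x + pw] = (layer_prob[:ph, :pw] > threshold).astype(np.uint8) * 255
    return Image.fromarray(mask)

if __name__ == "__main__":
    # Hypothetical model and page file names; substitute the real ones.
    predict_layer("document_analysis/Salzinnes/models/music_symbol_model.hdf5",
                  "page_image.png").save("music_symbol_layer.png")
```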
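
To sanity-check the classifier training data, the following sketch lists the glyph classes found in split_training.xml. It assumes the Gamera glyph XML layout (glyph elements whose id children carry a name attribute); adjust the tag names if the file is structured differently.

```python
# Sketch: count the symbol classes present in the classifier training data.
# Assumes Gamera-style XML: <glyph> elements with <id name="..."/> children.
import xml.etree.ElementTree as ET
from collections import Counter

def class_counts(path="music_symbol_recognition/split_training.xml"):
    """Count how many training glyphs carry each class name."""
    counts = Counter()
    for glyph in ET.parse(path).getroot().iter("glyph"):
        for ident in glyph.iter("id"):
            counts[ident.get("name")] += 1
    return counts

if __name__ == "__main__":
    for name, n in sorted(class_counts().items()):
        print(f"{name}: {n} training glyphs")
```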
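
The repository already ships the extraction script (text_alignment/get_text_per_folio.py); the sketch below is only an illustration of the idea of pulling the rows for one folio out of a Cantus Database CSV export. The column names used here ("folio", "full_text") and the file name are assumptions and should be checked against the actual export header.

```python
# Illustrative sketch (not the repository's get_text_per_folio.py): collect
# the transcript text recorded for a single folio from a Cantus CSV export.
import csv

def text_for_folio(csv_path, folio):
    """Concatenate the chant texts recorded for one folio."""
    lines = []
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            # "folio" and "full_text" are assumed column names.
            if row.get("folio") == folio and row.get("full_text"):
                lines.append(row["full_text"])
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical export file name and folio label.
    print(text_for_folio("salzinnes_cantus_export.csv", "001r"))
```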
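
A hedged sketch of consuming one of the MEI mapping files: it assumes the CSV stores the glyph class name in the first column and the corresponding MEI fragment in the second, which should be verified against the actual file.

```python
# Sketch: load the glyph-class -> MEI-fragment mapping and look up one class.
# Assumes column 0 holds the class name and column 1 the MEI snippet.
import csv

def load_mei_mapping(path="mei_encoding/csv-square_notation_neume_component_level.csv"):
    """Map glyph class names to the MEI fragment to emit for that class."""
    mapping = {}
    with open(path, newline="", encoding="utf-8") as fh:
        for row in csv.reader(fh):
            if len(row) >= 2:
                mapping[row[0].strip()] = row[1].strip()
    return mapping

if __name__ == "__main__":
    mapping = load_mei_mapping()
    # "neume.punctum" is a hypothetical class name used only for illustration.
    print(mapping.get("neume.punctum", "<class not in mapping>"))
```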
