3.0 Interactive Classifier and symbol classification training

Geneviève Gates-Panneton edited this page Aug 20, 2024 · 4 revisions

Symbol classification training consists of a single workflow that you will run multiple times, on multiple folios. Much like the Pixel workflow, the Interactive Classifier (IC) workflow has an interactive step that lets you train the computer to recognize and classify different symbols.

3.1 Building the workflow

For this step, you will need:

  • A folio image that is cute;
  • The three (or more, depending) layer models produced by the Paco Trainer job.

The workflow is built like so:

NEW IC jobs

These diagrams assume you're working with the basic three-layer, three-model system; if you have more layers or models, you'll have to add them as input and output ports on the 'Fast Pixelwise Analysis of Music Document, Classifying' job.

Important

Adding the 'GameraXML - Training Data' output port to the Interactive Classifier job is absolutely essential for the workflow to function. Once a first round of classification is done, the workflow run will produce a Training Data file. For later runs, you can then add the 'GameraXML - Training Data' input port to the IC job and feed it the most recent Training Data file.

The 'GameraXML - Feature Selection' port is optional; if you don't have that resource, simply leave that input port out and the workflow will still run.

The input ports are connected and filled like so:

NEW IC ports

Once the necessary ports and resources are added, run the workflow. When the workflow is complete, the Interactive Classifier job's status will read 'Ready for Input'. Clicking on the job will prompt you to open the Gamera graphical interface, designed to train a music symbol model.

3.2 Using the Gamera-based Interactive Classifier (music symbols)

A full tutorial on how to use the Interactive Classifier can be found here (the screenshots may be out of date, but the basic operations remain the same).

Important

If you're planning to use this symbol classification training for a complete OMR workflow, it is extremely important that the names you give each class correspond to the names you will later give these classes in the MEI Encoding job. For more on this, please see the section on MEI mapping.
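As a rough aid for keeping names consistent, the sketch below compares the class names found in a Training Data file against the names you plan to use in the MEI mapping. It assumes the GameraXML layout in which each glyph carries an id element with a name attribute. The sample document, the class names, and both helper functions are hypothetical and only for illustration, so check them against your own files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed GameraXML sample (real files also hold
# image data and feature values for every glyph).
SAMPLE_TRAINING_DATA = """<gamera-database version="2.0">
  <glyphs>
    <glyph uly="10" ulx="20" nrows="5" ncols="5">
      <ids state="MANUAL"><id name="neume.punctum" confidence="1.0"/></ids>
    </glyph>
    <glyph uly="30" ulx="40" nrows="6" ncols="4">
      <ids state="MANUAL"><id name="clef.c" confidence="1.0"/></ids>
    </glyph>
    <glyph uly="50" ulx="60" nrows="6" ncols="4">
      <ids state="MANUAL"><id name="neume.Punctum" confidence="1.0"/></ids>
    </glyph>
  </glyphs>
</gamera-database>"""


def class_names(gamera_xml_text):
    """Return the set of distinct class names in a GameraXML document."""
    root = ET.fromstring(gamera_xml_text)
    return {el.get("name") for el in root.iter("id") if el.get("name")}


def names_missing_from_mapping(training_names, mapping_names):
    """Return class names used in training that the mapping does not list."""
    return sorted(set(training_names) - set(mapping_names))


# Hypothetical names you intend to use later in the MEI Encoding job.
mei_mapping_names = {"neume.punctum", "clef.c"}

missing = names_missing_from_mapping(
    class_names(SAMPLE_TRAINING_DATA), mei_mapping_names
)
print(missing)
```

In this sample, the name with the stray capital letter is reported as missing, since the comparison is an exact, case-sensitive match.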

Caution

TROUBLESHOOTING: If you run into any issues (visual, UI, or processing), please immediately check for an existing issue (and bump it if it's significant or breaking) or open a new one.

When classification of a single folio is complete and you've clicked on 'Finalize', the workflow will finish and generate two files: 'Classified Glyphs' and 'Training Data'. 'Classified Glyphs' contains all of the page glyphs, both manually and automatically classified. 'Training Data' contains all the glyphs that you manually classified, plus the imported training glyphs.
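If you are ever unsure which of the two files you are holding, a quick tally of glyph states can help. The sketch below assumes that each glyph in a GameraXML file has an ids element whose state attribute reads MANUAL or AUTOMATIC. The sample data and the function names are hypothetical, and the "contains automatic glyphs" test is only a heuristic for spotting a 'Classified Glyphs' file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed sample resembling a 'Classified Glyphs' file.
SAMPLE_CLASSIFIED_GLYPHS = """<gamera-database version="2.0">
  <glyphs>
    <glyph uly="10" ulx="20" nrows="5" ncols="5">
      <ids state="MANUAL"><id name="neume.punctum" confidence="1.0"/></ids>
    </glyph>
    <glyph uly="30" ulx="40" nrows="6" ncols="4">
      <ids state="AUTOMATIC"><id name="neume.punctum" confidence="0.82"/></ids>
    </glyph>
    <glyph uly="50" ulx="60" nrows="6" ncols="4">
      <ids state="AUTOMATIC"><id name="clef.c" confidence="0.67"/></ids>
    </glyph>
  </glyphs>
</gamera-database>"""


def count_states(gamera_xml_text):
    """Count glyphs per classification state (assumed attribute: ids/@state)."""
    root = ET.fromstring(gamera_xml_text)
    counts = Counter()
    for glyph in root.iter("glyph"):
        ids = glyph.find("ids")
        state = ids.get("state", "UNKNOWN") if ids is not None else "UNKNOWN"
        counts[state] += 1
    return counts


def has_automatic_glyphs(gamera_xml_text):
    """Heuristic: automatic glyphs suggest a 'Classified Glyphs' file."""
    return count_states(gamera_xml_text)["AUTOMATIC"] > 0


counts = count_states(SAMPLE_CLASSIFIED_GLYPHS)
print(dict(counts))
print(has_automatic_glyphs(SAMPLE_CLASSIFIED_GLYPHS))
```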

Important

Make sure to use the 'Training Data' file for future IC workflow runs, and NOT the 'Classified Glyphs' file. (If you add the 'Class Names' output port, the workflow run will also produce a text file containing all the classes and subclasses you've created to name the symbols on the folio.)