The OntoAligner library is designed to facilitate efficient ontology matching tasks, providing tools and methodologies for aligning various ontologies. To ensure that OntoAligner remains a reliable and relevant tool for the community, we establish this Maintenance Plan to outline the ongoing maintenance efforts, the roles involved, and the strategies for addressing evolving user needs. This library is released under , a permissive open-source license that allows for community contributions, reuse, and modification. A persistent DOI
is assigned via Zenodo to ensure permanent referencing of the latest stable release. Release versions are tagged on GitHub, with semantic versioning to track major, minor, and patch updates. Each release is published to the OntoAligner project on PyPI. The current version of OntoAligner is
.
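The major, minor, and patch scheme mentioned above can be illustrated with a minimal sketch. The helper names (`Version`, `parse_tag`, `bump`) are hypothetical and are not part of the OntoAligner API or its release tooling; the sketch only assumes tags of the form `vMAJOR.MINOR.PATCH`.

```python
import re
from typing import NamedTuple

# Hypothetical helpers for illustration only; not part of OntoAligner.
_TAG_PATTERN = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_tag(tag: str) -> Version:
    """Parse a release tag such as 'v1.0.0' into its numeric parts."""
    match = _TAG_PATTERN.fullmatch(tag.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, kind: str) -> Version:
    """Return the next version for a 'major', 'minor', or 'patch' release."""
    if kind == "major":
        # Breaking changes: reset minor and patch.
        return Version(version.major + 1, 0, 0)
    if kind == "minor":
        # Backward-compatible features: reset patch.
        return Version(version.major, version.minor + 1, 0)
    if kind == "patch":
        # Bug fixes only.
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown release kind: {kind!r}")


def format_tag(version: Version) -> str:
    """Render a version back into a Git tag string."""
    return f"v{version.major}.{version.minor}.{version.patch}"
```

For example, `format_tag(bump(parse_tag("v1.2.3"), "minor"))` yields the tag for the next feature release, with the patch number reset to zero.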
The primary maintenance objectives for OntoAligner are to:
- Ensure long-term availability and functionality of the library by addressing and fixing bugs and issues as they arise.
- Continuously add new aligner models and extend existing models within OntoAligner.
- Incorporate user feedback and adapt to evolving research trends in ontology matching.
- Maintain up-to-date documentation to support ease of use and understanding.
- Regularly review and update dependencies to maintain compatibility with the latest technology and standards.
A core team will be responsible for the ongoing maintenance of OntoAligner, including:
- Lead Maintainers: Oversee all maintenance activities, ensure the direction aligns with the project's vision, and handle critical issues.
  - Hamed Babaei Giglou – Project Lead – Responsible for the overall vision, maintenance activities, and coordination of the library’s development.
  - Dr. Jennifer D'Souza and Prof. Dr. Sören Auer – Project Supervisors and Principal Investigators (PI) – Responsible for guiding ideation, refining ideas, and defining the project's high-level direction to align the library's goals with broader academic and research objectives.
- Assistants:
  - Mahsa Sanaei – Contributes to the code-level documentation and the use-case study, and ensures clarity and consistency across the library's functions and modules.
  - Amirreza Alasti – Assists with activities such as maintaining the codebase, creating tutorials, and testing the library to keep the project running smoothly.
The roadmap below lists planned features and improvements so that the library evolves in response to user needs and feedback. The list will be updated regularly as we survey work in the ontology alignment field, so that the library covers a diverse range of methods.
Category | Description | Status |
---|---|---|
Dependency | We should use tools like Dependabot or pyup to automate dependency updates and security alerts. | To-Do |
Case-Study | The LLMs4Subjects shared task challenges researchers to develop LLM-based methods for automated subject indexing of TIB’s technical records in multiple languages. Using the GND, this task enhances cataloging and semantic linking of topics, people, and works. OntoAligner can contribute by aligning ontologies, improving indexing accuracy, and enhancing interoperability across datasets. This integration ensures consistent, multilingual classification and supports more efficient automated subject tagging. In the end, the resulting case-study will be used to add a tutorial on https://ontoaligner.readthedocs.org/. | Ongoing |
Retrieval | Add finetuning capability for sentence-transformer-based retriever models using SBERT. This can be especially useful for OA tasks that have a train-test dataset split (like the Bio-ML track tasks). | To-Do |
LLMs | Add finetuning capability for LLMs using PEFT (Parameter-Efficient Fine-Tuning) and the Transformers library. As with SBERT finetuning, this can be especially useful for OA tasks that have a train-test dataset split (like the Bio-ML track tasks). | To-Do |
Aligner | LogMap is a well-known ontology alignment framework. Integrating it as a supported aligner would add significant value to OntoAligner. Since LogMap is released under the Apache-2.0 license, which is compatible with OntoAligner’s license, its inclusion would be seamless. | Ongoing |
Aligner | OWL2Vec4OA is an extension of the ontology embedding system OWL2Vec* that incorporates confidence values in the graph edges represented by the available (uncertain) mappings across the network of ontologies. It was released this year and outperforms the OWL2Vec* model. Since it is released under the Apache-2.0 license, it would make a valuable aligner module for OntoAligner. | To-Do (authors contacted; they are happy to help with integration) |
Tests | Implementing diverse unit tests to improve the maintainability of OntoAligner. The tests will cover core functionalities, including ontology parsing and alignment strategies. Until now, our focus has been on refining core features and ensuring alignment accuracy. Additionally, the evolving nature of the library made it challenging to establish stable test cases. With the current maturity of the project, adding unit tests is now one of the priorities. | To-Do |
Aligner | The work "Ontology Matching with Large Language Models and Prioritized Depth-First Search" uses prioritized depth-first search, which achieves high matching efficiency in terms of both LLM queries and time. In particular, it outperforms state-of-the-art models on biomedical datasets such as the Bio-ML tracks. Adding this model as one of the aligners would offer an efficient option for various communities. | To-Do (authors contacted; waiting for their code release) |
Tutorials | More tutorials covering different components are needed to enhance usability beyond just documentation. This will help users interact with the library more effectively, experiment with various features, and test multiple ideas. | Ongoing |
Ontology | Integrate support for online services such as the TIB Terminology Service to enable direct access to authoritative ontologies. This will enhance OntoAligner’s ability to align terms with external knowledge sources dynamically, improving interoperability and expanding its practical applications. | To-Do |
Visualizer | Develop or integrate an ontology alignment visualizer into OntoAligner to aid in the validation and understanding of ontology alignments. VOWLMap can be one of the options here. | To-Do |
Aligner | Matcha-DL is a deep learning-based aligner library that supports finetuning deep learning-based models for alignment. It has mostly been tested on the Bio-ML track, and since its license is compatible with ours, we may consider adding this aligner to OntoAligner. | To-Do |
Aligner | OLaLa is a RAG-based aligner model written in Java that takes a different approach from the one we implemented. Some of its components can also be combined with existing approaches, so it would be valuable to translate the code from Java to Python and offer it as another aligner that lets users choose their own LLM. | To-Do |
... | ... | .. |
If you would like your ontology alignment method to be included in OntoAligner, don't hesitate to contact us via GitHub Issues or via email at [email protected].
The following activities will be performed regularly:
- Code Quality:
  - Enforce PEP 8 compliance and code consistency using `pre-commit` hooks with `ruff` for linting and formatting.
  - Run pre-commit checks regularly to maintain code quality and prevent style violations before commits.
  - Perform continuous code refactoring to improve readability and reduce technical debt.
- Version Control:
  - During development or contribution, contributors are required to use Git with clear commit messages following best practices.
  - Releases follow semantic versioning (e.g., v1.0.0 for the first stable release).
  - OntoAligner uses GitHub CI/CD to deploy the documentation to ReadTheDocs automatically on every push to the main branch.
  - The PyPI release is automated for each version tag, ensuring easy distribution.
  - Each release must include proper documentation and a detailed entry in CHANGELOG.md covering new features, bug fixes, and breaking changes.
- Release Cycle:
  - Major Releases: Major feature updates or breaking changes will be released every 6 months. These releases will introduce new features, remove deprecated functionality, or make significant changes to the architecture.
  - Minor Releases: Regular updates that add new features or improvements will be released every month.
  - Patch Releases: Bug fixes, security patches, and minor adjustments will be released as necessary, with a goal of addressing critical issues within 2-3 weeks of identification.
- Documentation:
  - Each new aligner model should include a "Tutorial" guide so that users can quickly apply that model within OntoAligner on real-world datasets.
  - User documentation aims to provide detailed information on each class and method so that users can easily understand and apply them.
  - Code-level documentation is a "must" for the project, as we use `Sphinx` to generate API documentation from docstrings.
- Compatibility Checks:
  - Regularly monitor and update dependencies (e.g., `rdflib`, `owlready2`, `transformers`) to ensure compatibility with the latest Python versions and security patches.
  - Regularly test compatibility with Python 3.10 and newer versions (e.g., Python 3.11, 3.12).
- Security Management:
  - Sensitive data such as API keys or passwords must not be hard-coded in the repository. Instead, they may be passed as arguments when using specific models.
  - Each merge request will be assigned a code review focusing on security, especially when handling user data or integrating with external services.
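The rule against hard-coded secrets can be followed with a small pattern like the one sketched here: accept the key as an explicit argument and otherwise fall back to an environment variable. The function names and the `LLM_API_KEY` variable are assumptions made for illustration; they are not part of OntoAligner, and each model integration may expect its own argument names.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit_key: Optional[str] = None,
    env_var: str = "LLM_API_KEY",  # hypothetical variable name
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return an API key from an argument or the environment, never from source code."""
    env = os.environ if environ is None else environ
    key = explicit_key or env.get(env_var, "")
    if not key:
        raise RuntimeError(
            f"No API key provided: pass it as an argument or set {env_var}."
        )
    return key


def mask_secret(secret: str, visible: int = 4) -> str:
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

The resolved key can then be handed to whichever model requires it at call time, and `mask_secret` keeps it out of logs and error reports.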
- We use GitHub Issues to track bugs, feature requests, and enhancements, ensuring critical issues get a response within 48 hours.
- Contribution guidelines are outlined in CONTRIBUTING.md, detailing what contributions can be made, code standards, and PR submission processes.
- The OntoAligner Code of Conduct is available at CODE_OF_CONDUCT.md to foster a welcoming and inclusive environment for community engagement.
For any inquiries regarding the integration of matching/aligner models into OntoAligner, issue reporting, or long-term support, please feel free to contact the lead maintainer at [email protected].