diff --git a/README.md b/README.md
index 30d4a687..31f1d1d4 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,16 @@
-![Maturity level-1](https://img.shields.io/badge/Maturity%20Level-ML--1-yellow)
+![Maturity level-2](https://img.shields.io/badge/Maturity%20Level-ML--2-green)
+[Find our docs here](https://astrazeneca.github.io/KAZU/index.html)
+
# Kazu - Biomedical NLP Framework
+**Note: the recent 2.0 release introduces significant backwards-incompatible changes if you are using a custom model pack and curations.**
+
Welcome to Kazu (Korea AstraZeneca University), a python biomedical NLP framework built in collaboration with Korea University,
designed to handle production workloads.
@@ -16,11 +20,8 @@ research contained within are our own, but most of it comes from the community,
If you want to use Kazu, please cite our [EMNLP 2022 publication](https://aclanthology.org/2022.emnlp-industry.63)!
([**citation link**](https://aclanthology.org/2022.emnlp-industry.63.bib))
-[Please click here for the **web live demo** (Swagger UI) from http://kazu.korea.ac.kr/](http://kazu.korea.ac.kr/)
-
[Please click here for the TinyBERN2 training and evaluation code](https://github.com/dmis-lab/KAZU-NER-module)
-
# Quickstart
## Install
@@ -83,10 +84,6 @@ if __name__ == "__main__":
kazu_test()
```
-# Documentation
-
-[Find our docs here](https://astrazeneca.github.io/KAZU/index.html)
-
## License
Licensed under [Apache 2.0](https://github.com/AstraZeneca/KAZU/blob/main/LICENSE).
@@ -156,56 +153,62 @@ Christopher J Mungall, Melissa A Haendel, Peter N Robinson,
The Human Phenotype Ontology in 2021,
-Nucleic Acids Research, Volume 49, Issue D1, 8 January 2021, Pages D1207–D1217,
+Nucleic Acids Research, Volume 49, Issue D1, 8 January 2021, Pages D1207–D1217,
https://doi.org/10.1093/nar/gkaa1043
#### OPEN TARGETS
Open Targets datasets are kindly provided by www.opentargets.org, which are free for commercial use cases
-Ochoa, D. et al. (2021). Open Targets Platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Research.
+Ochoa, D. et al. (2021). Open Targets Platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Research.
https://doi.org/10.1093/nar/gkaa1027
#### STANZA
The Stanza framework:
-Peng Qi, Yuhao Zhang, Yuhui Zhang, Jason Bolton and Christopher D. Manning. 2020. Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. In Association for Computational Linguistics (ACL) System Demonstrations. 2020.
+Peng Qi, Yuhao Zhang, Yuhui Zhang, Jason Bolton and Christopher D. Manning. 2020. Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. In Association for Computational Linguistics (ACL) System Demonstrations. 2020.
https://arxiv.org/abs/2003.07082
Biomedical NLP models are derived from:
-Yuhao Zhang, Yuhui Zhang, Peng Qi, Christopher D. Manning, Curtis P. Langlotz.
-Biomedical and Clinical English Model Packages in the Stanza Python NLP Library,
-Journal of the American Medical Informatics Association. 2021.
+Yuhao Zhang, Yuhui Zhang, Peng Qi, Christopher D. Manning, Curtis P. Langlotz.
+Biomedical and Clinical English Model Packages in the Stanza Python NLP Library,
+Journal of the American Medical Informatics Association. 2021.
https://doi.org/10.1093/jamia/ocab090
#### SCISPACY
Biomedical scispacy models are derived from
-Mark Neumann, Daniel King, Iz Beltagy, Waleed Ammar
-ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
-Proceedings of the 18th BioNLP Workshop and Shared Task
-ACL 2019
+Mark Neumann, Daniel King, Iz Beltagy, Waleed Ammar
+ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
+Proceedings of the 18th BioNLP Workshop and Shared Task
+ACL 2019
https://www.aclweb.org/anthology/W19-5034
#### SAPBERT
Kazu uses a distilled form of SAPBERT, from
-Fangyu Liu, Ehsan Shareghi, Zaiqiao Meng, Marco Basaldella, Nigel Collier
-Self-Alignment Pretraining for Biomedical Entity Representations
-ACL 2021
+Fangyu Liu, Ehsan Shareghi, Zaiqiao Meng, Marco Basaldella, Nigel Collier
+Self-Alignment Pretraining for Biomedical Entity Representations
+NAACL 2021  
https://aclanthology.org/2021.naacl-main.334/
+#### GLINER
+
+GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer.
+Urchade Zaratiana, Nadi Tomeh, Pierre Holat, Thierry Charnois
+https://arxiv.org/abs/2311.08526
+
#### SETH
Kazu's SethStep uses Py4j to call the SETH mutation finder.
-Thomas, P., Rocktäschel, T., Hakenberg, J., Mayer, L., and Leser, U. (2016).
-[SETH detects and normalizes genetic variants in text](https://pubmed.ncbi.nlm.nih.gov/27256315/)
-Bioinformatics (2016)
+Thomas, P., Rocktäschel, T., Hakenberg, J., Mayer, L., and Leser, U. (2016).
+[SETH detects and normalizes genetic variants in text](https://pubmed.ncbi.nlm.nih.gov/27256315/)
+Bioinformatics (2016)
http://dx.doi.org/10.1093/bioinformatics/btw234
@@ -213,7 +216,7 @@ http://dx.doi.org/10.1093/bioinformatics/btw234
Kazu's OpsinStep uses Py4j to call OPSIN: Open Parser for Systematic IUPAC nomenclature.
-Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, and Robert C. Glen
-Chemical Name to Structure: OPSIN, an Open Source Solution
-Journal of Chemical Information and Modeling 2011 51 (3), 739-753
+Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, and Robert C. Glen
+Chemical Name to Structure: OPSIN, an Open Source Solution
+Journal of Chemical Information and Modeling 2011 51 (3), 739-753
DOI: [10.1021/ci100384d](https://doi.org/10.1021/ci100384d)