diff --git a/CHANGELOG.md b/CHANGELOG.md
index 79c17803..914f6ef1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,32 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
+## 2.0.0 - 2024-06-04
+
+
+### Features
+
+- (De)serialization has been greatly improved, simplified, made correct, and given a slightly more compact serialized representation.
+ This does mean there are some small changes in (de)serialization behaviour since the previous release.
+- Curation process has been significantly improved and simplified for the end user, including introducing the `AutoCurator` concept to aid in this. This will enable us to build out better documentation and an interactive tool in future releases, which are currently in draft. Overally, this will greatly simplify upgrading ontology versions, adding curations for a new ontology etc.
+- Datamodel has been substantially revised in a **backwards incompatible** manner to clear up confusing concepts, fix longstanding issues etc.
+- New Zero shot NER model with GLiNER
+
+### Deprecations and Removals
+
+- Remove deprecated `GildaUtils.replace_dashes`. This was superceded by `GildaUtils.split_on_dashes_or_space` and was already deprecated pending removal.
+- Remove deprecated `SpacyToKazuObjectMapper`, as this was renamed to `KazuToSpacyObjectMapper`, and the old name already deprecated pending removal.
+- Remove deprecated `create_phrasematchers_using_curations` method of `OntologyMatcher`. This was renamed to `create_phrasematchers` and was already deprecated pending removal.
+- Rename `Document.json` to `to_json`, and remove optional arguments.
+ The previous name was inconsistent with naming on other classes, as the function signature were parallel to `to_json` methods.
+ The argument `drop_unmapped_ents` had functionality that was duplicated with `DropUnmappedEntityFilter` within the `CleanupStep`,
+ and it made sense to add the `drop_terms` behaviour to a new `LinkingCandidateRemovalCleanupAction` to collect this behaviour together
+ and significantly simplify the Document serialization code.
+- Rename `ParserActions.from_json` and `GlobalParserActions.from_json` to `from_dict`.
+ The previous names were misleading, as the function signature were parallel to the `from_dict` methods on other classes, not to their `from_json` methods.
+- Renamed `SynonymDatabase.add` to `SynonymDatabase.add_parser`, for consistency with `MetadataDatabase.add_parser`.
+
+
 ## 1.5.1 - 2024-01-29
diff --git a/docs/_changelog.d/+SpacyToKazuObjectMapper.removal.md b/docs/_changelog.d/+SpacyToKazuObjectMapper.removal.md
deleted file mode 100644
index cd6a2145..00000000
--- a/docs/_changelog.d/+SpacyToKazuObjectMapper.removal.md
+++ /dev/null
@@ -1 +0,0 @@
-Remove deprecated `SpacyToKazuObjectMapper`, as this was renamed to `KazuToSpacyObjectMapper`, and the old name already deprecated pending removal.
diff --git a/docs/_changelog.d/+createphrasematchersremoval.removal.md b/docs/_changelog.d/+createphrasematchersremoval.removal.md
deleted file mode 100644
index a5bbcbdd..00000000
--- a/docs/_changelog.d/+createphrasematchersremoval.removal.md
+++ /dev/null
@@ -1 +0,0 @@
-Remove deprecated `create_phrasematchers_using_curations` method of `OntologyMatcher`. This was renamed to `create_phrasematchers` and was already deprecated pending removal.
diff --git a/docs/_changelog.d/+curation.feature.md b/docs/_changelog.d/+curation.feature.md
deleted file mode 100644
index 747836d3..00000000
--- a/docs/_changelog.d/+curation.feature.md
+++ /dev/null
@@ -1 +0,0 @@
-Curation process has been significantly improved and simplified for the end user, including introducing the `AutoCurator` concept to aid in this. This will enable us to build out better documentation and an interactive tool in future releases, which are currently in draft. Overally, this will greatly simplify upgrading ontology versions, adding curations for a new ontology etc.
diff --git a/docs/_changelog.d/+datamodelchange.feature.md b/docs/_changelog.d/+datamodelchange.feature.md
deleted file mode 100644
index c7ba0e8e..00000000
--- a/docs/_changelog.d/+datamodelchange.feature.md
+++ /dev/null
@@ -1 +0,0 @@
-Datamodel has been substantially revised in a **backwards incompatible** manner to clear up confusing concepts, fix longstanding issues etc.
diff --git a/docs/_changelog.d/+gliner.feature.md b/docs/_changelog.d/+gliner.feature.md
deleted file mode 100644
index b8a18728..00000000
--- a/docs/_changelog.d/+gliner.feature.md
+++ /dev/null
@@ -1 +0,0 @@
-New Zero shot NER model with GLiNER
diff --git a/docs/_changelog.d/+remove-replace-dashes.removal.md b/docs/_changelog.d/+remove-replace-dashes.removal.md
deleted file mode 100644
index 18dbff7a..00000000
--- a/docs/_changelog.d/+remove-replace-dashes.removal.md
+++ /dev/null
@@ -1 +0,0 @@
-Remove deprecated `GildaUtils.replace_dashes`. This was superceded by `GildaUtils.split_on_dashes_or_space` and was already deprecated pending removal.
diff --git a/docs/_changelog.d/+renamedocumentjson.removal.md b/docs/_changelog.d/+renamedocumentjson.removal.md
deleted file mode 100644
index 90de71c5..00000000
--- a/docs/_changelog.d/+renamedocumentjson.removal.md
+++ /dev/null
@@ -1,5 +0,0 @@
-Rename `Document.json` to `to_json`, and remove optional arguments.
-The previous name was inconsistent with naming on other classes, as the function signature were parallel to `to_json` methods.
-The argument `drop_unmapped_ents` had functionality that was duplicated with `DropUnmappedEntityFilter` within the `CleanupStep`,
-and it made sense to add the `drop_terms` behaviour to a new `LinkingCandidateRemovalCleanupAction` to collect this behaviour together
-and significantly simplify the Document serialization code.
diff --git a/docs/_changelog.d/+renameparseractionfromdict.removal.md b/docs/_changelog.d/+renameparseractionfromdict.removal.md
deleted file mode 100644
index 6fc9963b..00000000
--- a/docs/_changelog.d/+renameparseractionfromdict.removal.md
+++ /dev/null
@@ -1,2 +0,0 @@
-Rename `ParserActions.from_json` and `GlobalParserActions.from_json` to `from_dict`.
-The previous names were misleading, as the function signature were parallel to the `from_dict` methods on other classes, not to their `from_json` methods.
diff --git a/docs/_changelog.d/+renamesyndbadd.removal.md b/docs/_changelog.d/+renamesyndbadd.removal.md
deleted file mode 100644
index 2a9e1a1c..00000000
--- a/docs/_changelog.d/+renamesyndbadd.removal.md
+++ /dev/null
@@ -1 +0,0 @@
-Renamed `SynonymDatabase.add` to `SynonymDatabase.add_parser`, for consistency with `MetadataDatabase.add_parser`.
diff --git a/docs/_changelog.d/+serialization.feature.md b/docs/_changelog.d/+serialization.feature.md
deleted file mode 100644
index 4a4ec143..00000000
--- a/docs/_changelog.d/+serialization.feature.md
+++ /dev/null
@@ -1,2 +0,0 @@
-(De)serialization has been greatly improved, simplified, made correct, and given a slightly more compact serialized representation.
-This does mean there are some small changes in (de)serialization behaviour since the previous release.
diff --git a/kazu/__init__.py b/kazu/__init__.py
index 0f228f25..8c0d5d5b 100644
--- a/kazu/__init__.py
+++ b/kazu/__init__.py
@@ -1 +1 @@
-__version__ = "1.5.1"
+__version__ = "2.0.0"