Releases: ETCBC/bhsa
Added NER specs
The TF feature data is identical to the previous release.
What is added is the directory ner at the top level.
This contains the config and test spec for manual entity markup with the new annotate tool.
App within repo
The BHSA app is now within the repo.
Better metadata
Improved the feature docs and made feature metadata more readily available to Text-Fabric.
No changes in the feature data!
Lex and word features straightened
Some features were present on lex nodes but not on word nodes, and some the other way round, where it was
natural to have them on both. That has been straightened out.
Specifically:
gloss now exists on word nodes too
lex_utf8, lex0, languageISO now exist on lex nodes too
With versions 2017, c, and 2021 and node maps to 2021
The existing 2021 features have not changed.
There is a new feature: omap@c-2021, a node mapping from version c to version 2021.
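As a rough illustration of what a node mapping makes possible, the sketch below carries annotations made on version c nodes over to version 2021 nodes. It models the mapping as a plain dictionary. The node numbers, the labels, and the carry_over helper are all hypothetical and do not come from the BHSA data. A real program would read the mapping from the omap@c-2021 feature through Text-Fabric rather than from a hand-written dictionary.

```python
# Toy model of a node map from an older data version to a newer one.
# All node numbers and labels below are made up for illustration.

def carry_over(annotations, node_map):
    """Move annotations from old nodes to the new nodes they map to.

    annotations: dict old_node -> label
    node_map: dict old_node -> set of new nodes
    Returns (migrated, unmapped): migrated is a dict new_node -> label,
    unmapped is a sorted list of old nodes without a counterpart.
    """
    migrated = {}
    unmapped = []
    for old_node, label in annotations.items():
        targets = node_map.get(old_node)
        if not targets:
            unmapped.append(old_node)
            continue
        for new_node in targets:
            migrated[new_node] = label
    return migrated, sorted(unmapped)


# Hypothetical mapping: node 12 moved to 13, old node 13 has no counterpart.
node_map_c_2021 = {10: {10}, 11: {11}, 12: {13}}
annotations_c = {10: "person", 12: "place", 13: "unclear"}

migrated, unmapped = carry_over(annotations_c, node_map_c_2021)
print(migrated)  # {10: 'person', 13: 'place'}
print(unmapped)  # [13]
```

Nodes that end up in the unmapped list are the ones that would need manual review after switching versions.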
The older versions 2017 and c are attached to this release, in addition to version 2021.
Added data version 2021
Added data version 2021, removed data version c.
This release has attachments for data versions 2021 and 2017, so you can easily compare them in your Text-Fabric programs.
Lexeme features spread to occurrences, older BHSA versions not included
Several features only had values for lexeme nodes: gloss, nametype, voc_lex, voc_lex_utf8.
Now we extend their values to all corresponding word nodes.
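The idea of spreading a lexeme feature to its occurrences can be sketched in a few lines. This is a toy model using plain dictionaries, not the pipeline that produced the release. The node numbers, the glosses, and the spread_to_words helper are hypothetical.

```python
# Toy model: copy feature values from lexeme nodes to their occurrences.
# Node numbers, glosses, and the function name are hypothetical.

def spread_to_words(lex_values, occurrences):
    """Return a dict word_node -> value, inherited from the lexeme.

    lex_values: dict lex_node -> feature value (for example a gloss)
    occurrences: dict lex_node -> list of word nodes of that lexeme
    """
    word_values = {}
    for lex_node, value in lex_values.items():
        for word_node in occurrences.get(lex_node, []):
            word_values[word_node] = value
    return word_values


gloss_on_lex = {1001: "beginning", 1002: "create"}
occurrences = {1001: [2], 1002: [3, 57]}

gloss_on_words = spread_to_words(gloss_on_lex, occurrences)
print(gloss_on_words)  # {2: 'beginning', 3: 'create', 57: 'create'}
```

With the values present on word nodes, a program can read a gloss directly from a word without first looking up its lexeme.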
We say goodbye to the older ETCBC/BHSA versions of the data:
3, 4, 4b, 2016.
We keep 2017.
Renamed tf binaries
Just a renaming of the release binaries for the sake of auto-downloading by Text-Fabric.
Default text format for lexemes
The only change is in the otext.tf files of versions c, 2017, and 2016.
To these, the line @fmt:lex-default={voc_lex_utf8} has been added.
That enables the Text-Fabric function T.text() to easily represent lexemes.
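To make the role of that line concrete, here is a minimal sketch of how a format line of this shape could be parsed and applied to feature values. It is not Text-Fabric's implementation. The helpers parse_fmt_line and render and the sample feature value are assumptions made for illustration.

```python
import re

# Toy model of how a text-format line can be applied to feature values.
# This is not Text-Fabric's implementation; the helper names are made up.

def parse_fmt_line(line):
    """Split '@fmt:NAME=TEMPLATE' into (NAME, TEMPLATE)."""
    prefix = "@fmt:"
    if not line.startswith(prefix):
        raise ValueError(f"not a format line: {line!r}")
    name, template = line[len(prefix):].split("=", 1)
    return name, template


def render(template, features):
    """Replace each {feature} placeholder with that feature's value.

    Features missing from the dict are rendered as an empty string.
    """
    return re.sub(r"\{(\w+)\}", lambda m: features.get(m.group(1), ""), template)


name, template = parse_fmt_line("@fmt:lex-default={voc_lex_utf8}")
print(name)      # lex-default
print(template)  # {voc_lex_utf8}

# Hypothetical feature values for one lexeme node.
text = render(template, {"voc_lex_utf8": "רֵאשִׁית"})
print(text)
```

In Text-Fabric itself the format is selected by name, along the lines of T.text(node, fmt="lex-default"), where node is a lexeme node.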
With compact data files as assets
Many changes in the tutorials.
Now the tf files are available as assets with this release.