test-diary
Trondtr committed Apr 5, 2024
1 parent 4461800 commit dd9efaa
Showing 2 changed files with 57 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/index-header.md
@@ -57,6 +57,7 @@ and in a Kven spellchecker.
* [Yamltestit maaliskuu 2019](YamltestitMaaliskuu2019.html)
* [n_11-feilit](n_11-feilit.html)
* [Kaikki generoidut paradigmat](KaikkiGeneroidutParadigmat.html)
* [Test diary](test-diary.md)

# In-source documentation

56 changes: 56 additions & 0 deletions docs/test-diary.md
@@ -0,0 +1,56 @@
Test diary
==========

This document records test statistics.

The overall test command: `make check`

## yaml

The command:

`sh test/yaml-check.sh`

(data forthcoming)

## Lexical coverage

Number of words for fkv (run from within the `lang-fkv` directory):

```
cat test/data/freecorpus.txt |\
hfst-tokenise tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |wc -l
```

Number of unknown words:

```
cat test/data/freecorpus.txt |\
hfst-tokenise tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |\
preprocess --corr=test/data/typos.txt|\
hfst-tokenise -cg tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |\
grep " ?"|cut -d'"' -f2|wc -l
```
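
To see what the last stage of that pipeline counts, here is a minimal sketch run on made-up input. It assumes that `hfst-tokenise -cg` marks an unanalysed word with a reading line ending in ` ?`; the sample words, the tags, and the helper name `count_unknown` are hypothetical and not taken from the corpus.

```shell
# Hypothetical helper: the same filter as the tail of the pipeline above.
# Keeps reading lines containing " ?", extracts the quoted form, counts lines.
count_unknown() {
    grep ' ?' | cut -d'"' -f2 | wc -l
}

# Made-up CG-style input: one analysed word, one unknown word.
printf '"<talo>"\n\t"talo" N Sg Nom\n"<xyzzy>"\n\t"xyzzy" ?\n' | count_unknown
```

On this sample only the `xyzzy` reading matches, so the count is one. Dropping the final `wc -l` lists the unknown forms instead of counting them.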

Test with the full corpus (free + bound):



### Lexical coverage of freecorpus

The file is `test/data/freecorpus.txt`.

Coverage:

- 240405: 1-(42819/607401) = 0.9295
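
The coverage figure is one minus the share of unknown words. A minimal sketch of that arithmetic, assuming the two counts have already been produced by the commands in the previous section (the function name `coverage` is hypothetical):

```shell
# coverage TOTAL UNKNOWN: print 1 - UNKNOWN/TOTAL with four decimals.
coverage() {
    awk -v total="$1" -v unknown="$2" \
        'BEGIN { printf "%.4f\n", 1 - unknown / total }'
}

# The 240405 figures from the entry above.
coverage 607401 42819
```

With the counts from the 240405 entry this prints `0.9295`.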

### Lexical coverage of free + bound

Coverage:






