feat: much progress toward running all regress cases
Showing 10 changed files with 144 additions and 47 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# EveryVoice regression test suite | ||
|
||
## Preparing the regression training data: | ||
|
||
- Download LJ 1.1 from https://keithito.com/LJ-Speech-Dataset/ | ||
- Download Sinhala TTS from https://openslr.org/30/ | ||
- Download High quality TTS data for four South African languages (af, st, tn, | ||
xh) from https://openslr.org/32 | ||
- See [`prep-datasets.sh`](prep-datasets.sh) for the locations where these datasets are expected to be found. | ||
- Run this to create the regression testing directory structure: | ||
|
||
mkdir regress-1 # or any suffix you want | ||
cd regress-1 | ||
../prep-datasets.sh | ||
|
||
## Running the regression tests | ||
|
||
On a Slurm cluster: | ||
|
||
for dir in regress-*; do | ||
    pushd "$dir" | ||
    sbatch ../../regression-test.sh | ||
    popd | ||
done | ||
|
||
If you're not on a cluster, call `../../regression-test.sh` directly in the loop instead of passing it to `sbatch`. | ||
|
||
## Test data provenance | ||
|
||
- `test-si.txt`: copied from https://en.wikipedia.org/wiki/Sinhala_script (CC BY-SA 4.0) | ||
- the first line means "Sinhala script" and is found at the top of the page | ||
- the rest is the first verse of the Pali Dhammapada, quoted lower on the same page | ||
- `test2-si.txt`: one word from that first line | ||
|
||
- `test*-xh.txt`: individual words copied from https://en.wikipedia.org/wiki/Xhosa_language | ||
(CC BY-SA 4.0) | ||
|
||
- `test*-lj.txt`: written by Eric Joanis |
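The README's two ways of launching the tests (through `sbatch` on a Slurm cluster, or directly otherwise) could be folded into one loop. The following is a minimal sketch under that assumption, not part of the commit; `run_one` and `run_all` are hypothetical helper names, and detecting Slurm with `command -v sbatch` is an illustrative choice.

```shell
#!/bin/bash
# Sketch only: run_one and run_all are hypothetical helpers, not part of the commit.

# Submit through Slurm when sbatch is available, otherwise run the script directly.
run_one() {
    local script=$1
    if command -v sbatch >/dev/null 2>&1; then
        sbatch "$script"
    else
        "$script"
    fi
}

# Launch the given script once in every regress-* directory below the current one.
run_all() {
    local script=$1 dir
    for dir in regress-*; do
        [ -d "$dir" ] || continue   # skip the unexpanded glob when nothing matches
        (cd "$dir" && run_one "$script")
    done
}

# Usage, from inside the directory populated by prep-datasets.sh:
#   run_all ../../regression-test.sh
```

The subshell `(cd ... && ...)` stands in for the `pushd`/`popd` pair, so a failure inside one directory cannot leave the loop in the wrong place.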
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
#!/bin/bash | ||
|
||
find . -name '.coverage*' -print0 | xargs -0 coverage combine --keep | ||
coverage report --include='*/everyvoice/*' | sed 's/.*EveryVoice\/everyvoice/everyvoice/' > coverage.txt | ||
coverage html --include='*/everyvoice/*' | ||
coverage xml --include='*/everyvoice/*' | ||
sed -i 's/"[^"]*EveryVoice.everyvoice/"everyvoice/g' coverage.xml |
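To see what the two `sed` expressions in this script do, they can be applied to sample lines. The sketch below assumes a checkout under `/home/user/EveryVoice`; both sample lines are made up for illustration and are not real `coverage` output.

```shell
#!/bin/bash
# Hypothetical sample lines, shaped like a coverage report row and a coverage.xml element.
report_line='/home/user/EveryVoice/everyvoice/cli.py     100     10    90%'
xml_line='<class filename="/home/user/EveryVoice/everyvoice/cli.py" name="cli.py">'

# Same expression as the coverage.txt pipeline: drop everything before "everyvoice/".
short_report=$(printf '%s\n' "$report_line" | sed 's/.*EveryVoice\/everyvoice/everyvoice/')

# Same expression as the coverage.xml rewrite: shorten quoted absolute paths.
short_xml=$(printf '%s\n' "$xml_line" | sed 's/"[^"]*EveryVoice.everyvoice/"everyvoice/g')

printf '%s\n' "$short_report" "$short_xml"
```

Both rewrites strip the machine-specific prefix, so reports produced in different checkouts can be compared line by line.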
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
#!/bin/bash | ||
|
||
# Automated application of the instructions in README.md | ||
|
||
set -o errexit | ||
|
||
TOP_LEVEL_DIR=$(mktemp --directory regress-XXX) | ||
cd "$TOP_LEVEL_DIR" | ||
|
||
../prep-datasets.sh | ||
for DIR in regress-*; do | ||
    pushd "$DIR" | ||
    sbatch ../../regression-test.sh | ||
    popd | ||
done | ||
|
||
coverage run -p -m everyvoice test | ||
|
||
JOB_COUNT=$(find . -maxdepth 1 -name regress-\* | wc -l) | ||
while true; do | ||
    DONE_COUNT=$(find . -maxdepth 2 -name DONE | wc -l) | ||
    if (( DONE_COUNT + 2 >= JOB_COUNT )); then | ||
        break | ||
    fi | ||
    echo "$DONE_COUNT/$JOB_COUNT regression job(s) done. Still waiting." | ||
    date | ||
    sleep $(( 60 * 5 )) | ||
done | ||
|
||
echo "$DONE_COUNT regression jobs done. Calculating coverage now, but some jobs may still be running." | ||
../combine-coverage.sh | ||
cat coverage.txt | ||
|
||
while true; do | ||
    DONE_COUNT=$(find . -maxdepth 2 -name DONE | wc -l) | ||
    if (( DONE_COUNT >= JOB_COUNT )); then | ||
        break | ||
    fi | ||
    echo "$DONE_COUNT/$JOB_COUNT regression job(s) done. Still waiting." | ||
    date | ||
    sleep $(( 60 * 5 )) | ||
done | ||
|
||
echo "All $DONE_COUNT regression jobs done. Calculating final coverage." | ||
rm .coverage | ||
../combine-coverage.sh | ||
cat coverage.txt |
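The two polling loops in this script differ only in how many unfinished jobs they tolerate (two for the preliminary coverage report, none for the final one). One possible refactoring is sketched below; `count_done` and `wait_for_done` are hypothetical helper names, and the slack and poll-interval parameters are assumptions introduced for illustration.

```shell
#!/bin/bash
# Sketch only: count_done and wait_for_done are hypothetical helpers, not part of the commit.

# Count DONE marker files at most two levels below the current directory,
# using the same find call as the script.
count_done() {
    find . -maxdepth 2 -name DONE | wc -l
}

# Block until at least (JOB_COUNT - SLACK) DONE markers exist.
# Usage: wait_for_done JOB_COUNT [SLACK] [POLL_SECONDS]
wait_for_done() {
    local job_count=$1 slack=${2:-0} poll_seconds=${3:-300}
    local done_count
    while true; do
        done_count=$(count_done)
        if (( done_count + slack >= job_count )); then
            break
        fi
        echo "$done_count/$job_count regression job(s) done. Still waiting."
        date
        sleep "$poll_seconds"
    done
}
```

With these helpers, the first loop would become `wait_for_done "$JOB_COUNT" 2` and the second `wait_for_done "$JOB_COUNT"`.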
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
සිංහල අක්ෂර මාලාව | ||
මනොපුබ්බඞ්ගමා ධම්මා, මනොසෙට්ඨා මනොමයා; | ||
මනසා චෙ පදුට්ඨෙන, භාසති වා කරොති වා; | ||
තතො නං දුක්ඛමන්වෙති, චක්කංව වහතො පදං. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
ukukrwentshwa | ||
uqeqesho | ||
iimpumlo |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
spec |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
අක-ෂර |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
isiXhosa |