Description
I just had a look at the code in scripts and found some cases that are potentially more detailed than necessary, so I added an example of how to compute normal cognates with the data and how to check the thresholds and the quality. The script is called evaluation.py.
The results are:
T PSCA RSCA FSCA PLS RLS FLS
0.05 0.99 0.63 0.77 1.00 0.23 0.37
0.10 0.98 0.65 0.79 1.00 0.35 0.51
0.15 0.98 0.66 0.79 1.00 0.48 0.65
0.20 0.96 0.71 0.82 0.99 0.55 0.71
0.25 0.94 0.73 0.82 0.99 0.62 0.76
0.30 0.92 0.79 0.85 0.97 0.67 0.79
0.35 0.90 0.82 0.86 0.95 0.72 0.82
0.40 0.89 0.85 0.87 0.94 0.76 0.84
0.45 0.88 0.89 0.88 0.93 0.79 0.86
0.50 0.86 0.94 0.90 0.91 0.84 0.87
0.55 0.84 0.96 0.90 0.90 0.88 0.89
0.60 0.82 0.97 0.89 0.88 0.92 0.90
0.65 0.78 0.98 0.87 0.86 0.94 0.90
0.70 0.77 0.98 0.86 0.85 0.96 0.90
0.75 0.75 1.00 0.86 0.84 0.97 0.90
0.80 0.73 1.00 0.84 0.82 0.98 0.90
0.85 0.72 1.00 0.83 0.81 0.99 0.89
0.90 0.72 1.00 0.83 0.78 0.99 0.87
0.95 0.72 1.00 0.83 0.76 1.00 0.86
T = threshold, P = precision, R = recall, F = F-score; the suffix SCA stands for the SCA method and LS for LexStat.
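To make the F columns easier to check, here is a minimal sketch that recomputes the F-score from precision and recall. It assumes the F-score is the usual harmonic mean of the two (the table does not say so explicitly), and the helper name `f_score` is only for illustration, it is not taken from evaluation.py. Because the inputs are already rounded to two decimals, a recomputed value can differ from the table by 0.01 in some rows.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rows copied from the table above.
ls_055 = round(f_score(0.90, 0.88), 2)   # LexStat at T = 0.55
sca_005 = round(f_score(0.99, 0.63), 2)  # SCA at T = 0.05

print(ls_055, sca_005)
```

Both values agree with the F columns of the table for these two rows (0.89 and 0.77).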
As you can see, for LexStat the difference between 0.55 and the higher thresholds is minimal: at 0.55 the F-score (0.89) has almost reached its peak (0.90) and precision is still reasonably good (0.90), while precision keeps dropping afterwards (down to 0.76 at 0.95). So 0.75 or 0.8 is not needed. Even if the problems in the original cognate judgments are handled, 0.55 will still be the most reliable threshold (small differences are okay).
You can also see that SCA has its peak at 0.5 (F-score 0.90), and again the standard threshold of 0.45 (F-score 0.88) is not that far away.
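The comparison between the peaks and the standard thresholds can also be read off the table mechanically. The following is only a sketch: the numbers are copied from the table in this issue, and the helper names `peak` and `f_at` are made up for illustration (they are not part of evaluation.py).

```python
# Rows from the table above: T, then P/R/F for SCA, then P/R/F for LexStat.
TABLE = """
0.05 0.99 0.63 0.77 1.00 0.23 0.37
0.10 0.98 0.65 0.79 1.00 0.35 0.51
0.15 0.98 0.66 0.79 1.00 0.48 0.65
0.20 0.96 0.71 0.82 0.99 0.55 0.71
0.25 0.94 0.73 0.82 0.99 0.62 0.76
0.30 0.92 0.79 0.85 0.97 0.67 0.79
0.35 0.90 0.82 0.86 0.95 0.72 0.82
0.40 0.89 0.85 0.87 0.94 0.76 0.84
0.45 0.88 0.89 0.88 0.93 0.79 0.86
0.50 0.86 0.94 0.90 0.91 0.84 0.87
0.55 0.84 0.96 0.90 0.90 0.88 0.89
0.60 0.82 0.97 0.89 0.88 0.92 0.90
0.65 0.78 0.98 0.87 0.86 0.94 0.90
0.70 0.77 0.98 0.86 0.85 0.96 0.90
0.75 0.75 1.00 0.86 0.84 0.97 0.90
0.80 0.73 1.00 0.84 0.82 0.98 0.90
0.85 0.72 1.00 0.83 0.81 0.99 0.89
0.90 0.72 1.00 0.83 0.78 0.99 0.87
0.95 0.72 1.00 0.83 0.76 1.00 0.86
"""

F_SCA, F_LS = 3, 6  # column positions of the two F-score columns

rows = [[float(x) for x in line.split()] for line in TABLE.strip().splitlines()]


def peak(rows, column):
    """Return the best value in a column and all thresholds that reach it."""
    best = max(row[column] for row in rows)
    return best, [row[0] for row in rows if row[column] == best]


def f_at(rows, threshold, column):
    """Look up the value of a column at a given threshold."""
    return next(row[column] for row in rows if row[0] == threshold)


sca_best, sca_thresholds = peak(rows, F_SCA)
ls_best, ls_thresholds = peak(rows, F_LS)

# Distance between the peak F-score and the F-score at the standard threshold.
sca_gap = round(sca_best - f_at(rows, 0.45, F_SCA), 2)
ls_gap = round(ls_best - f_at(rows, 0.55, F_LS), 2)

print("SCA:", sca_best, sca_thresholds, "gap at 0.45:", sca_gap)
print("LexStat:", ls_best, ls_thresholds, "gap at 0.55:", ls_gap)
```

On these numbers the standard thresholds are 0.02 (SCA) and 0.01 (LexStat) below the best F-score, and LexStat stays on its plateau from 0.60 to 0.80.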
I hope this provides enough evidence that you can trust the standard thresholds, and that you need to accept that some data are not perfectly cognate-coded. We reach an F-score of about 90% here, what more would one want?