
Commit

sharpened language in pipeline qc, quote matching
nrosed committed Feb 9, 2021
1 parent 39ebb1e commit 3fec287
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion analysis_scripts/qc_scripts/pipeline_qc.Rmd
@@ -514,7 +514,8 @@ loc_stats_df = data.frame(loc_stats_df)
#### plotting quote stats

Between pipeline step 3 and 4 we are predicting the genders of speakers using genderize.io.
-So we expect almost exactly the same number of quotes and length of quotes.
+So we expect exactly the same number of quotes and almost exactly the same length of quotes
+(unicode characters + whitespace editing happens in step 4).
We also expect that the number of UNKNOWN gendered speakers typically decrease,
and the number of MALE/FEMALE speakers may increase.
This is not a completely 1:1 measurement. Pipeline level 3 only identifies the
