sister_peg

Sister Peg authorial attribution case

This repo contains code used for Mark J. Hill and Mikko Tolonen's "A Computational Investigation into the Authorship of Sister Peg" in Eighteenth-Century Studies (Volume 54, Number 4, Summer 2021).*

The scripts used for the analysis (held in the "r" subdirectory) ran numerous (and I mean numerous) tests using various combinations of feature selection and measurement. For technical details on the methods that worked best for attribution, see the publication.

Training and test material is not included with this repo for copyright reasons. The exact works and editions used are listed in the appendix to the publication. Because many outputs can be overwritten when a test is run, do not assume that the results in the repo at any given time are the ones you are interested in. Instead, re-run the relevant tests to make sure the results correspond to the test you care about.

For the code used to normalise words when testing the impact that spelling variations had on attribution, see: https://github.com/COMHIS/sister_peg/blob/master/r/normalize_words.r
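As a rough illustration of what spelling normalisation means in this context, the sketch below replaces variant spellings with a standard form before any features are counted. It is written in Python for brevity and is not the repo's `normalize_words.r`. The `VARIANTS` table, the `normalise_words` function and the tokenisation rule are all assumptions made for this example.

```python
import re

# Hypothetical variant-to-standard spelling table (illustration only;
# not taken from the repo or the publication).
VARIANTS = {
    "publick": "public",
    "shew": "show",
    "compleat": "complete",
}


def normalise_words(text, variants=VARIANTS):
    """Lower-case the text, split it into words, and replace known variants."""
    # Assumed tokenisation: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Words without an entry in the table pass through unchanged.
    return [variants.get(token, token) for token in tokens]


print(normalise_words("The Publick will shew a compleat history"))
# → ['the', 'public', 'will', 'show', 'a', 'complete', 'history']
```

Running the same attribution test on both the original and the normalised tokens is one way to see how much spelling variation affects the result.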

For an extracted 7-gram wordlist (note 39) see: https://github.com/COMHIS/sister_peg/blob/master/wordlist.txt.

*(Publication available here: https://muse.jhu.edu/article/802445)