pxska/bakalaureus

bakalaureus

This repository was created to manage the files used in Kristjan Poska's bachelor's thesis.

The protocol texts originate from the crowdsourcing project of The National Archives of Estonia. The manual annotations were created in the project "Possibilities of automatic analysis of historical texts by the example of 19th-century Estonian communal court minutes", which is funded by the national programme "Estonian Language and Culture in the Digital Age 2019-2027".

Thesis topic:

Named Entity Recognition in 19th Century Parish Court Protocols

Thesis supervisor:

Siim Orasmaa, PhD

Libraries and technologies used in the thesis:

Technologies:

  • Python 3.7
  • EstNLTK (version 1.6). The files from the estner folder of branch devel_1.6 were also used (state of files at commit ebf1451).
  • Jupyter Notebooks (+ Anaconda)

Training has also been tested and works with Python 3.8 and EstNLTK version 1.6.8b.
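Before running the notebooks, it may help to confirm that the local environment matches the versions listed above. The following is a minimal sketch, not part of the thesis code: the helper names `python_is_tested` and `estnltk_is_installed` are hypothetical, and the check only looks at the Python major.minor version and whether a package named `estnltk` can be found. It does not verify the EstNLTK version itself.

```python
import importlib.util
import sys

# Python versions named in this README: 3.7 (used in the thesis)
# and 3.8 (also tested for training).
TESTED_PYTHON = {(3, 7), (3, 8)}


def python_is_tested(version_info=sys.version_info):
    """Return True if the major.minor version is one this README lists."""
    return tuple(version_info[:2]) in TESTED_PYTHON


def estnltk_is_installed():
    """Return True if a package named 'estnltk' can be located (no import is performed)."""
    return importlib.util.find_spec("estnltk") is not None


if __name__ == "__main__":
    print("Python version listed in README:", python_is_tested())
    print("EstNLTK found:", estnltk_is_installed())
```

A result of `False` from `python_is_tested()` does not necessarily mean the code will fail; it only means the interpreter is not one of the versions named here.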

Libraries/dependencies: