#GermaNER - Free Open German Named Entity Recognition Tool
GermaNER is licensed under ASL 2.0 and other lenient licenses, allowing its use for academic and commercial purposes without restrictions.
##GermaNER in three lines
To tag German texts:
- Download the binary from here or, if you do not have enough memory, use GermaNER without Freebase features from [here](https://github.com/tudarmstadt-lt/GermaNER/releases/download/germaNER0.9.1/GermaNER-nofb-09-09-2015.jar).
- Tokenize your text so that it has one token per line, with a blank line marking the end of each sentence. Read details [here](https://github.com/tudarmstadt-lt/GermaNER/blob/master/germaner/src/main/java/de/tu/darmstadt/lt/ner/doc/File-Format.md).
- Run the jar file as follows (see details here):
java -Xmx4g -jar GermaNER-09-09-2015.jar -t YourTokenizedTestFile -o OutputFileName
OR (if you have less memory)
java -Xmx1300m -jar GermaNER-nofb-09-09-2015.jar -t YourTokenizedTestFile -o OutputFileName
The tagged document will be under output/result.tsv
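
The tokenization step above can be sketched in Python. This is a minimal illustration, not part of GermaNER: the function name `to_one_token_per_line` and both regular expressions are assumptions, and the naive sentence splitter will break on abbreviations, dates and ordinals. Check the File-Format document linked above for the exact input requirements.

```python
import re

# Naive patterns, for illustration only. A real German tokenizer handles
# abbreviations ("z.B."), dates and ordinals far better than this.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
TOKEN = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def to_one_token_per_line(text: str) -> str:
    """Return text with one token per line and a blank line between sentences."""
    sentences = [s for s in SENTENCE_END.split(text.strip()) if s]
    blocks = ["\n".join(TOKEN.findall(s)) for s in sentences]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = "Angela Merkel besuchte Berlin. Sie traf Vertreter der Deutschen Bahn."
    print(to_one_token_per_line(sample), end="")
```

Writing the returned string to a file gives something that can be passed to the `-t` switch, provided the tokenization is good enough for your text.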
- NEW
##Train GermaNER with your own training file and feature files
If you would like to train GermaNER with your own training file, or with our training file from here but with different feature files, proceed as follows:
- Get the data.zip file from here and change the contents of any files as needed. Once done, zip the files back up as data.zip.
- Get the config file, config.properties, here. Set useFreeBase=0 if you do not have enough memory. If you have lookup feature files like this, set lookUpFeature=1. If you have list feature files like this, set listFeature=1.
- Get the GermaNER jar file from here. This jar file is only meant for training an NER model on a new dataset or with modified features. It does not contain a usable NER model.
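
If you prefer to adjust config.properties from a script, something like the following Python sketch may help. The helper `set_property` is hypothetical and not part of GermaNER. Only the key names useFreeBase, lookUpFeature and listFeature come from the step above; the sketch assumes the file uses plain `key=value` lines and leaves every other line untouched.

```python
def set_property(config_text: str, key: str, value: str) -> str:
    """Return config_text with `key` set to `value`, appending the key if absent."""
    lines = config_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comment lines alone ("#" and "!" start comments in .properties files).
        if stripped.startswith(("#", "!")):
            continue
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            break
    else:
        # Key not found anywhere: add it at the end.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumes config.properties is in the current directory.
    with open("config.properties", encoding="utf-8") as fh:
        text = fh.read()
    text = set_property(text, "useFreeBase", "0")  # low-memory setting from the step above
    with open("config.properties", "w", encoding="utf-8") as fh:
        fh.write(text)
```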
For training and testing at the same time, run it as follows:
java -jar GermaNER-train-04-12-2016.jar -f YOURTRAINFILE -t YOURTESTFILE -r data.zip -d MODELDIR -o OUTPUTFILENAME -c config.properties
For testing, once you have run the above command and the NER model is under MODELDIR, run the jar without the -f switch as follows:
java -jar GermaNER-train-04-12-2016.jar -t YOURTESTFILE -r data.zip -d MODELDIR -o OUTPUTFILENAME -c config.properties
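
The two invocations above differ only in the -f switch, so a small wrapper can build both. The Python sketch below is an illustration under assumptions: the function names `build_command` and `run_germaner` are hypothetical, the file names in the example are placeholders, and only the jar name and the switches (-f, -t, -r, -d, -o, -c) are taken from the commands above.

```python
import subprocess
from typing import List, Optional

JAR = "GermaNER-train-04-12-2016.jar"  # jar name taken from the commands above


def build_command(test_file: str, output_file: str, model_dir: str,
                  train_file: Optional[str] = None,
                  resources: str = "data.zip",
                  config: str = "config.properties") -> List[str]:
    """Build the java command line; pass train_file to train and test in one run."""
    cmd = ["java", "-jar", JAR]
    if train_file is not None:
        cmd += ["-f", train_file]  # -f is only present when training
    cmd += ["-t", test_file, "-r", resources, "-d", model_dir,
            "-o", output_file, "-c", config]
    return cmd


def run_germaner(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError if java exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder file names; printing only, so nothing is executed here.
    print(" ".join(build_command("test.tsv", "out.tsv", "model", train_file="train.tsv")))
    print(" ".join(build_command("test.tsv", "out.tsv", "model")))
```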
- NEW END
- Resources, including files for feature generation (data.zip) and configuration files (config.properties), can be found here
- System requirements
- Introduction
- Configurations
- User guide
- File format
- Features
- Customizing GermaNER
- [From source](https://github.com/tudarmstadt-lt/GermaNER/blob/master/germaner/src/main/java/de/tu/darmstadt/lt/ner/doc/fromsource.md)