Issues with make_lemma_dictionary for treetagger engine #12

JaySLee · 2020-08-05T06:47:10Z

Hi,

Firstly, thanks for a great and useful package! I've been experimenting with the make_lemma_dictionary function and was wondering if the addition of the following features would be helpful:

1. Because the text is split into tokens before it is sent to treetag(), some of the context is lost. Would it make sense to add an option to pass the text as is, i.e., as full sentences? Here's an example: c("That food is really nice.", "That felt is really nice."). Because the other terms have already appeared, the token 'felt' ends up on a line all by itself, and TreeTagger falls back to its default interpretation of 'felt' as a verb. Passing the full sentences to treetag() allows for the proper tagging.
2. I had some issues getting the treetag() function itself to work; I have raised the potential bugs with koRpus' developer. Could a debug flag be passed through to treetag(), along with an option to stop suppressing messages, so that users can diagnose problems?
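
To illustrate the first point, here is a minimal base R sketch of the two kinds of input. The tokenizer below is only my stand-in (lower-casing and splitting on non-letters); I am assuming it is close enough to what make_lemma_dictionary does internally, and the variable names are made up for the example.

```r
# Two sentences that differ in a single word.
x <- c("That food is really nice.", "That felt is really nice.")

# Assumption: a simple stand-in tokenizer, not the package's actual one.
tokens_per_sentence <- strsplit(tolower(x), "[^a-z']+")

# Deduplicate across the whole input, as a dictionary builder would.
unique_tokens <- unique(unlist(tokens_per_sentence))
print(unique_tokens)

# Token-based input: one token per line, so 'felt' has no neighbours left.
token_input <- paste(unique_tokens, collapse = "\n")

# Sentence-based input (the requested option): the tagger sees full context.
sentence_input <- paste(x, collapse = "\n")
```

In the token-based input, 'felt' is the only word contributed by the second sentence, so the tagger has nothing around it to go on. In the sentence-based input it still sits between 'That' and 'is'.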

Thanks!

Best,
Jay
