Tokenization

Tokenization (a form of lexical analysis) is the process of breaking up a stream of text into words, phrases, names, or other meaningful elements called tokens. The resulting list of tokens can then be used to build indices for search, text mining, NLP, and similar tasks.
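
To make this concrete, here is a minimal sketch of a tokenizer in Python, assuming a simple scheme that lowercases the input and treats any run of non-alphanumeric characters as a token boundary. This is only an illustration of the general idea; a production tokenizer would also need to handle Unicode, contractions, compound words, and so on.

```python
import re

def tokenize(text):
    # Lowercase the input, then split on runs of non-alphanumeric
    # characters so whitespace and punctuation act as boundaries.
    # Filter out empty strings produced by leading/trailing delimiters.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

print(tokenize("The quick brown fox, version 2.0!"))
# ['the', 'quick', 'brown', 'fox', 'version', '2', '0']
```

Note that even this toy example makes design choices: splitting on punctuation turns "2.0" into two tokens, which may or may not be what a given search index wants.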
