Ting2004/Caviar

Caviar

A Chinese word segmenter named Caviar, which does a stupid job.

(Caviar) 请输入中文后回车: [Enter Chinese text, then press Enter:]
(User) 你好笨 ["You are so stupid"]
(Caviar) [你好, 笨] [hello, stupid]

About Caviar

The idea of making a Chinese word segmenter comes from the book The Beauty of Mathematics in Computer Science by Jun Wu. Splitting a Chinese sentence into words is the very first step of Chinese NLP, because written Chinese puts no spaces between words. And I happened to know a little about Markov chains and the Viterbi algorithm.

Thanks to Caviar for its contribution to my college application.

Limitations

There are a lot of problems in this project, including but not limited to:

  • biased and outdated corpus
  • brute-force method for building the tree
  • dictionary not cleaned; it may include numbers, which carry no meaning
  • uses the simplest (first-order) Markov model, so the context is limited to the single segment immediately before the current one
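
To make the last point concrete, here is a minimal Python sketch of first-order (bigram) Viterbi segmentation. It is not Caviar's actual code: the counts, the `<s>` start marker, the add-one smoothing, and the names `VOCAB`, `BIGRAM`, `log_prob` and `segment` are all assumptions made up for illustration. Each candidate word is scored only against the one word before it, which is exactly the limitation described above.

```python
import math

# Toy data invented for this sketch; NOT taken from Caviar's corpus or dictionary.
START = "<s>"  # hypothetical sentence-start marker
VOCAB = {"你", "好", "你好", "笨"}
BIGRAM = {
    (START, "你好"): 3,
    (START, "你"): 2,
    ("你好", "笨"): 1,
    ("你", "好"): 1,
    ("好", "笨"): 1,
}

# How often each word was seen as a left context.
CONTEXT = {}
for (prev, _), count in BIGRAM.items():
    CONTEXT[prev] = CONTEXT.get(prev, 0) + count


def log_prob(prev, word):
    """Add-one smoothed log P(word | prev): a first-order Markov assumption."""
    seen = BIGRAM.get((prev, word), 0)
    return math.log((seen + 1) / (CONTEXT.get(prev, 0) + len(VOCAB)))


def segment(text, max_len=4):
    """Return the highest-scoring segmentation of text under the bigram model."""
    if not text:
        return []
    # best[i][w] = (score, start, prev): the best way to segment text[:i]
    # so that the last word is w = text[start:i], preceded by prev.
    best = [{} for _ in range(len(text) + 1)]
    best[0][START] = (0.0, 0, None)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            # Unknown single characters stay as fallback candidates;
            # unknown multi-character strings are skipped.
            if len(word) > 1 and word not in VOCAB:
                continue
            for prev, (score, _, _) in best[start].items():
                cand = score + log_prob(prev, word)
                if word not in best[end] or cand > best[end][word][0]:
                    best[end][word] = (cand, start, prev)
    # Follow the back-pointers from the best final word to the start.
    pos = len(text)
    word = max(best[pos], key=lambda w: best[pos][w][0])
    words = []
    while pos > 0:
        _, start, prev = best[pos][word]
        words.append(word)
        pos, word = start, prev
    return words[::-1]


print(segment("你好笨"))  # ['你好', '笨']
```

With these made-up counts the sketch reproduces the segmentation from the session at the top, because "你好" after the start marker outscores "你" followed by "好". A different corpus would give different counts and possibly a different split.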

Made in August 2021.
