Tutorial: Any language dictionary extracted from any Wiktionary and converted to offline dictionary file #1651
franzmondlichtmann
started this conversation in
Show and tell
Either go to the subfolder for your language on kaikki and download a pre-generated, reduced extract of Wiktionary from there.
https://kaikki.org/
Or extract a specific language from any Wiktionary (e.g. the English Wiktionary) yourself with wiktextract
https://github.com/tatuylonen/wiktextract
wiktwords --all --all-languages --out data.json enwiktionary-20230801-pages-articles.xml.bz2
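(e.g. --all-languages can be replaced with a single-language option, written here as --spanish; check wiktwords --help for the exact option in your wiktextract version)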
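The result is a JSON Lines (jsonl) file with one JSON object per line, even though it is named data.json in the command above. Format description: https://jsonlines.org/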
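To check what ended up in the file before converting it, a small script can read it line by line. This is only a minimal sketch: the field names `word`, `lang_code`, `senses` and `glosses` are what I understand wiktextract to emit, so treat them as assumptions and compare against a line of your own file. The function names and the file names `data.json` and `es.jsonl` are made up for illustration.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_entries(path: Path) -> Iterator[dict]:
    """Yield one parsed entry per non-empty line of a JSON Lines file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def first_gloss(entry: dict) -> str:
    """Return the first gloss of the first sense that has one, else ''."""
    # Assumed schema: entry["senses"] is a list of dicts with a "glosses" list.
    for sense in entry.get("senses", []):
        glosses = sense.get("glosses") or []
        if glosses:
            return glosses[0]
    return ""


def filter_language(src: Path, dst: Path, lang_code: str) -> int:
    """Copy entries whose "lang_code" matches into dst; return how many were kept."""
    kept = 0
    with dst.open("w", encoding="utf-8") as out:
        for entry in iter_entries(src):
            if entry.get("lang_code") == lang_code:
                out.write(json.dumps(entry, ensure_ascii=False) + "\n")
                kept += 1
    return kept


if __name__ == "__main__":
    source = Path("data.json")  # hypothetical: the wiktwords output from above
    if source.exists():
        count = filter_language(source, Path("es.jsonl"), "es")
        print(f"kept {count} Spanish entries")
```

Filtering after the fact like this is also a way to get a single-language file out of an --all-languages run without extracting again.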
This jsonl file can now be converted to a different dictionary format (e.g. .slob) with pyglossary
https://github.com/ilius/pyglossary
Convert with:
pyglossary file.jsonl file.slob
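If you want to script this step, the same call can be wrapped in Python. A rough sketch, assuming that pyglossary picks the input and output formats from the file extensions (which is why the input has to end in .jsonl, not .json) and that the `pyglossary` command is on your PATH. The helper names here are hypothetical, not part of pyglossary.

```python
import shutil
import subprocess
from pathlib import Path


def with_jsonl_suffix(path: Path) -> Path:
    """Return the same path with a .jsonl suffix, e.g. data.json -> data.jsonl."""
    return path.with_suffix(".jsonl")


def build_convert_command(src: Path, dst: Path) -> list[str]:
    """Build the argv for `pyglossary <input> <output>`."""
    # Assumption: the format is detected from the extension, so insist on .jsonl.
    if src.suffix != ".jsonl":
        raise ValueError(f"expected a .jsonl input file, got {src.name}")
    return ["pyglossary", str(src), str(dst)]


def convert(src: Path, dst: Path) -> bool:
    """Run the conversion; return True only if pyglossary ran and exited with 0."""
    if shutil.which("pyglossary") is None:
        return False  # pyglossary is not installed or not on PATH
    result = subprocess.run(build_convert_command(src, dst), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    source = Path("file.jsonl")
    if source.exists():
        ok = convert(source, Path("file.slob"))
        print("converted" if ok else "conversion did not run or failed")
```

If your wiktextract output is still called data.json, rename it first; `with_jsonl_suffix(Path("data.json"))` gives the name to rename it to.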
You get a great offline Wiktionary.