lnus/yt-ja-captions

YouTube Caption Tagger

Caption processing and tagging for YouTube videos with Japanese captions.

TO-DO

  • Get video metadata and captions
  • Extract captions and model them as a data structure
  • Segment the Japanese text
  • Compute frequency of words
  • Compare kanji against 常用漢字 (Jōyō kanji)
  • Visualize frequency graph
  • Visualize other interesting data
  • Rank words/kanji based on JLPT level (maybe; reliable level data is hard to find)
  • Rank entire video captions based on frequency and difficulty
  • Store the output in a reasonable way
  • Decide whether to make this a package or add a REST API
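
As a rough illustration of the frequency and Jōyō comparison steps in the list above, here is a minimal sketch in Python. Nothing in it comes from the repository: the function names, the sample captions, and `SAMPLE_REFERENCE` (a tiny stand-in for the real Jōyō list) are all assumptions made for the example. It counts individual kanji rather than words, since word segmentation would need a morphological analyzer that is not chosen yet.

```python
from collections import Counter

# Hypothetical stand-in for the full Jōyō kanji list; a real run would
# load the complete list from a data file instead.
SAMPLE_REFERENCE = set("日本語学校")


def is_kanji(ch: str) -> bool:
    """Return True for characters in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def kanji_frequency(captions):
    """Count how often each kanji appears across all caption lines."""
    counts = Counter()
    for line in captions:
        counts.update(ch for ch in line if is_kanji(ch))
    return counts


def outside_reference(counts, reference=SAMPLE_REFERENCE):
    """Return the kanji that were seen but are missing from the reference set."""
    return {kanji for kanji in counts if kanji not in reference}


# Made-up caption lines, only for demonstration.
captions = ["日本語の学校", "日本は嬉しい"]
freq = kanji_frequency(captions)
outside = outside_reference(freq)

print(freq.most_common(2))
print(outside)
```

The same `Counter` could later feed the frequency graph, and swapping `SAMPLE_REFERENCE` for a JLPT-level mapping would be one possible way to approach the ranking items.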
