VirtualRoyalty/spark-nlp-project

Russian language processing via Spark(NLP) 🔥

Go to colab

A micro-project on big data technologies using Spark

Content:

  1. Colab-Spark setup

  2. Data loading

  3. EDA & Preprocessing

  4. Pipelines & Experiments

  5. Text preprocessing

  6. Text classification

    • BoW models + LogReg
    • Transfer Learning (at least an attempt 😀)
  7. Entity Recognition & Entity Linking

Tech stack:

...and much more 🤘