Topic-Modeling

Topic models are, in a nutshell, a type of statistical language model used to uncover hidden structure in a collection of texts. In practical, more intuitive terms, topic modeling can be thought of as a form of unsupervised learning comparable to clustering. As with the number of clusters in most clustering methods, the number of topics is a parameter chosen before training rather than something the model outputs. Topic modeling builds clusters of words rather than clusters of texts, so each text is a mixture of all the topics, each with its own weight.

Several existing algorithms can be used to perform topic modeling. The most common are Latent Semantic Analysis (LSA/LSI), Probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA). In this article, we’ll take a closer look at LDA and implement our first topic model using sklearn.

Parameters of LDA

The alpha parameter is the Dirichlet prior concentration parameter for document-topic density: with a higher alpha, documents are assumed to be made up of more topics, which gives a more even, less specific topic distribution per document. The beta parameter is the corresponding prior concentration parameter for topic-word density: with a higher beta, topics are assumed to be made up of most of the words, which gives a more even, less specific word distribution per topic.
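The effect of a Dirichlet concentration parameter is easier to see by sampling from the prior directly. The sketch below is illustrative only and is not taken from the notebook: the helper names `sample_dirichlet` and `mean_largest_weight`, the choice of 5 topics, and the alpha values 0.1 and 10.0 are all assumptions. It draws topic mixtures from a symmetric Dirichlet and reports the average weight of the single largest topic. With a small alpha most of the weight tends to land on one topic, while a large alpha spreads it across all of them.

```python
import random


def sample_dirichlet(alpha, num_topics, rng):
    """Draw one topic mixture from a symmetric Dirichlet(alpha) prior."""
    # Independent Gamma(alpha, 1) draws, normalized to sum to 1,
    # follow a Dirichlet distribution with concentration alpha.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_topics)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_weight(alpha, num_topics=5, num_docs=2000, seed=0):
    """Average weight of the dominant topic over many simulated documents."""
    rng = random.Random(seed)
    largest = [max(sample_dirichlet(alpha, num_topics, rng)) for _ in range(num_docs)]
    return sum(largest) / num_docs


# Hypothetical alpha values chosen only to contrast the two regimes.
low_alpha = mean_largest_weight(0.1)
high_alpha = mean_largest_weight(10.0)
print(f"alpha=0.1  -> mean largest topic weight: {low_alpha:.2f}")
print(f"alpha=10.0 -> mean largest topic weight: {high_alpha:.2f}")
```

The same reasoning applies to beta, with words in a topic playing the role of topics in a document.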

LDA Implementation

Loading data
Data cleaning
Exploratory analysis
Preparing data for LDA analysis
LDA model training
Analyzing LDA model results

dhamvi01/Topic-Modeling
