Comparative Analysis of NLP Techniques for Information Extraction

This repo extracts answers directly from PDF documents in response to a query.

Algorithms Used

Document retrieval uses TF-IDF and LSA. Information extraction uses spaCy similarity, Jaccard distance and doc2vec.
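To make the two stages concrete, here is a minimal sketch of one retrieval technique (TF-IDF with cosine similarity) followed by one extraction technique (Jaccard distance over sentences). It is not the code from this repo: it uses only the standard library instead of sklearn and distance, the function names (`retrieve`, `extract_answer` and the helpers) are hypothetical, and the sample documents are made up for illustration. LSA, spaCy similarity and doc2vec are not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (a dict) per token list, plus the IDF table."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    # Smoothed IDF, so terms that occur in every document keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()}
               for tokens in token_lists]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents):
    """Stage 1: index of the document most similar to the query under TF-IDF."""
    vectors, idf = tfidf_vectors([tokenize(d) for d in documents])
    # Query terms that never occur in the corpus are ignored.
    query_vec = {t: c * idf[t]
                 for t, c in Counter(tokenize(query)).items() if t in idf}
    scores = [cosine(query_vec, v) for v in vectors]
    return max(range(len(documents)), key=scores.__getitem__)


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the two token sets."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def extract_answer(query, document):
    """Stage 2: the sentence with the smallest Jaccard distance to the query."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    query_tokens = tokenize(query)
    return min(sentences, key=lambda s: jaccard_distance(query_tokens, tokenize(s)))


# Made-up sample data, standing in for text already extracted from PDFs.
documents = [
    "The warranty lasts two years. Repairs are free during the warranty period.",
    "Shipping takes five days. Orders ship from the Berlin warehouse.",
]
query = "How long does the warranty last?"
best = retrieve(query, documents)
print(extract_answer(query, documents[best]))  # The warranty lasts two years.
```

Splitting the work this way keeps the sentence-level comparison cheap: only the sentences of the retrieved document are scored, not every sentence in the corpus.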

Requirements

Python 3.6 with spaCy, gensim, scikit-learn (sklearn), nltk and distance, plus the standard-library modules re, glob and os.

Files

Cleaning.py
Preprocess.py
README.md
__init__.py
data.py
model.py
smalltalk.py