This repo extracts answers directly from PDF documents in response to a query.
Document retrieval uses TF-IDF and LSA. Information extraction uses spaCy similarity, Jaccard distance and doc2vec.
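
The two stages can be illustrated with a minimal sketch: pick the most relevant document with TF-IDF cosine similarity, then pick the sentence in it closest to the query by Jaccard distance. This is not the repo's actual code. It uses only the standard library so that it runs on its own, it leaves out LSA, spaCy similarity and doc2vec, and it assumes the PDF text has already been extracted to plain strings. All function names (`tokenize`, `tfidf_vectors`, `cosine`, `jaccard_distance`, `answer`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per token list.

    Uses a smoothed IDF: log((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the two token collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def answer(query, documents):
    """Return (index of the best document, best sentence in that document)."""
    query_tokens = tokenize(query)
    # Simplification: the query is vectorized together with the documents,
    # so it also contributes to the document frequencies.
    vectors = tfidf_vectors([tokenize(d) for d in documents] + [query_tokens])
    query_vec = vectors.pop()

    # Retrieval: document with the highest cosine similarity to the query.
    best_doc = max(range(len(documents)),
                   key=lambda i: cosine(query_vec, vectors[i]))

    # Extraction: sentence with the smallest Jaccard distance to the query.
    parts = re.split(r"(?<=[.!?])\s+", documents[best_doc])
    sentences = [s.strip() for s in parts if s.strip()]
    best_sentence = min(
        sentences, key=lambda s: jaccard_distance(query_tokens, tokenize(s)))
    return best_doc, best_sentence


if __name__ == "__main__":
    docs = [
        "The warranty lasts two years. Repairs are free during the warranty.",
        "Shipping takes five days. Returns are accepted within a month.",
    ]
    print(answer("How long does shipping take?", docs))
```

The sketch expects a non-empty list of non-empty documents. Jaccard distance only counts exact token overlap, so "take" and "takes" do not match here. Stemming or an embedding-based similarity would be needed to handle that.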
Requirements: Python 3.6, spacy, gensim, sklearn, nltk and distance, plus the standard-library modules re, glob and os.