nsb700/nn_document_classifier


NEURAL NETWORK CANCER DOCUMENT CLASSIFICATION


Description :-

This Jupyter notebook classifies cancer documents into one of three categories: 'Thyroid_Cancer', 'Colon_Cancer', or 'Lung_Cancer'. The problem and the dataset can be found at https://www.kaggle.com/datasets/falgunipatel19/biomedical-text-publication-classification.

This notebook shows how neural network embeddings can be used to solve this classification problem. Both alternatives are implemented: an embedding layer that uses pre-computed embeddings, and one that does not.

The pre-computed embeddings are the GloVe word embeddings from the 2014 English Wikipedia, downloaded from https://nlp.stanford.edu/projects/glove.
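To illustrate how pre-computed vectors of this kind are typically prepared for an embedding layer, here is a minimal sketch in plain Python. It is not taken from the notebook: the helper names `load_glove` and `build_embedding_matrix`, the toy vectors, and the `word_index` mapping are all hypothetical. The only assumption about the data is the GloVe text format, where each line holds a token followed by its space-separated float components.

```python
import io


def load_glove(handle):
    """Parse GloVe-style text: one token per line, followed by its float components."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Return a list of rows where row i is the vector of the word with index i.

    Row 0 (reserved here for padding) and words missing from `vectors` stay all-zero.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


# Toy stand-in for a downloaded GloVe file; these values are made up.
sample = io.StringIO("colon 0.1 0.2 0.3\nlung 0.4 0.5 0.6\n")
glove = load_glove(sample)

# Hypothetical tokenizer vocabulary: word -> integer index, starting at 1.
word_index = {"lung": 1, "thyroid": 2, "colon": 3}
embedding_matrix = build_embedding_matrix(word_index, glove, dim=3)
```

With a real GloVe file, `sample` would be replaced by an opened text file and `dim` by the vector size of that file. A matrix like this is commonly supplied as the initial weights of an embedding layer, whereas the alternative without pre-computed embeddings lets the layer start from its default initialization.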

The code is written in Python with Keras. Intermediate outputs are printed in the notebook for clarity.