StoreksFeed/EduClassificationSystem

SE Coursework, 6th semester

This repo contains the source code for my 6th-semester Software Engineering coursework.

It is a Docker Compose project with 3 containers:

  • a Django-based web app for user interaction,
  • a database (Apache Cassandra),
  • and a clustering/classification module for text.

Build and run

docker-compose up --build

TODO:

  • Init Django app
  • Init Cassandra
  • Pack it all in Docker containers

It seems like cassandra-driver really struggles to work under Windows, instead opting to crash any Python app without any logs whatsoever. After a day of painful debugging I decided that running it under Linux sounded like quite a good idea, so here it is.

  • Add an ability to manage Cassandra tables via Django
  • Create all the models required
  • Implement clustering (for all entries)
  • Implement classification (for single entry based on created groups)
  • Add Django buttons to trigger classifier
  • Prepare test dataset
  • Update web interface (grid layout and CSS)

Now it looks much better and supports dark mode

  • Check options in sklearn.cluster

Basic and advanced text preprocessing, AgglomerativeClustering and KMeans
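
As a rough illustration of how these pieces fit together, here is a minimal sketch using scikit-learn. It is not the code from this repo: the toy entries, the TF-IDF preprocessing and the choice of two clusters are all assumptions made for the example. It clusters all entries with both KMeans and AgglomerativeClustering, then assigns a single new entry to one of the KMeans groups.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy entries; the real dataset is not shown in this README.
entries = [
    "linear algebra matrices and vectors",
    "matrices vectors and eigenvalues",
    "organic chemistry reactions and molecules",
    "molecules atoms and chemical reactions",
]

# Basic preprocessing (an assumption): lowercase the text and drop
# English stop words while building TF-IDF vectors.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
features = vectorizer.fit_transform(entries)

# Clustering for all entries with KMeans (accepts sparse input).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(features)

# AgglomerativeClustering needs a dense array, hence toarray().
agglo = AgglomerativeClustering(n_clusters=2)
agglo_labels = agglo.fit_predict(features.toarray())

# Classification of a single entry based on the created groups:
# reuse the fitted vectorizer and pick the nearest KMeans centroid.
new_entry = vectorizer.transform(["eigenvalues of matrices"])
predicted_group = kmeans.predict(new_entry)[0]

print("KMeans labels:       ", list(kmeans_labels))
print("Agglomerative labels:", list(agglo_labels))
print("Group for new entry: ", predicted_group)
```

Cluster ids are arbitrary, so the two algorithms may number the same groups differently. Note that AgglomerativeClustering has no predict method, which is why the single-entry step here relies on the KMeans model.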

  • Measure clustering performance against the true headers
  • Compare against Lingo, STC and Random distribution

Some nice plots can be found in the Jupyter notebook (results/notebook.ipynb)
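
For readers who want a feel for this kind of evaluation without opening the notebook, here is a small sketch of scoring a clustering against true headers and against a random assignment. The metric used here (purity) and all the data are assumptions for illustration; the notebook may use different measures.

```python
import random
from collections import Counter


def purity(true_labels, cluster_labels):
    """Fraction of entries that carry the majority true label of their cluster."""
    clusters = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        clusters.setdefault(cluster, Counter())[truth] += 1
    # For each cluster, count the entries that match its most common true label.
    majority_total = sum(c.most_common(1)[0][1] for c in clusters.values())
    return majority_total / len(true_labels)


# Hypothetical true headers and cluster assignments.
true_headers = ["math", "math", "math", "chem", "chem", "chem"]
predicted = [0, 0, 1, 1, 1, 1]

# Random distribution baseline: every entry gets a random cluster id.
rng = random.Random(0)
random_baseline = [rng.randrange(2) for _ in true_headers]

print("clustering purity:", purity(true_headers, predicted))
print("random purity:    ", purity(true_headers, random_baseline))
```

Purity grows with the number of clusters, so it is only a fair comparison when every method produces the same number of groups.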
