GitHub - johnvorsten/point_categorizer: Automatically label BAS

Packages:

clustering
Clustering algorithms and modules that group points based on similarity. After the points are clustered, they are loaded into MongoDB for use in the MIL categorization section of this project. TODO: Move some of the database pipeline modules OUT of clustering, pipelines & mongo
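
As an illustration of grouping points by similarity, here is a minimal sketch. It is not the algorithm used in this package: it assumes points are identified by name strings, and it swaps in a simple greedy, single-pass grouping based on the Jaccard similarity of name tokens. The function names (`tokens`, `jaccard`, `greedy_cluster`), the sample point names, and the threshold are all hypothetical.

```python
import re


def tokens(name):
    """Split a point name into a set of lowercase alphanumeric tokens."""
    return frozenset(t for t in re.split(r"[^A-Za-z0-9]+", name.lower()) if t)


def jaccard(a, b):
    """Jaccard similarity of two token sets (defined as 1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_cluster(names, threshold=0.5):
    """Assign each name to the first cluster whose seed is similar enough.

    The seed of a cluster is the token set of its first member. A name that
    matches no existing seed starts a new cluster.
    """
    seeds = []   # one token set per cluster
    labels = []  # cluster index for each input name, in input order
    for name in names:
        toks = tokens(name)
        for idx, seed in enumerate(seeds):
            if jaccard(toks, seed) >= threshold:
                labels.append(idx)
                break
        else:
            seeds.append(toks)
            labels.append(len(seeds) - 1)
    return labels


# Hypothetical BAS point names, for illustration only.
points = ["AHU1.SAT", "AHU1.RAT", "AHU2.SAT", "VAV101.ZNT"]
labels = greedy_cluster(points, threshold=0.3)
print(labels)
```

The result depends on input order and on the threshold, which is one reason a real pipeline would likely compare several clustering settings rather than fix one.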

data
Includes .csv files for clustering, manual cleaning of some data, TensorFlow TFRecord protos, vocabulary files, and a master .csv file of databases extracted from .mdf to .csv.

MILCategorization
Multiple-instance categorization for bags of clustered data. Optionally, use this on databases segmented by controller instead of clustered databases.
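
To make the idea of "bags" concrete, here is a minimal sketch of how points might be grouped into bags either by cluster label or by controller. The record layout (`name`, `cluster`, `controller` keys), the sample values, and the `build_bags` helper are assumptions for illustration and are not taken from this repository.

```python
from collections import defaultdict


def build_bags(points, key):
    """Group point names into bags keyed by the value of `key`.

    Each bag is the list of point names that share the same cluster label
    (key="cluster") or the same controller (key="controller").
    """
    bags = defaultdict(list)
    for point in points:
        bags[point[key]].append(point["name"])
    return dict(bags)


# Hypothetical point records, for illustration only.
points = [
    {"name": "AHU1.SAT", "cluster": 0, "controller": "C1"},
    {"name": "AHU1.RAT", "cluster": 0, "controller": "C1"},
    {"name": "VAV101.ZNT", "cluster": 1, "controller": "C1"},
    {"name": "VAV102.ZNT", "cluster": 1, "controller": "C2"},
]

bags_by_cluster = build_bags(points, "cluster")
bags_by_controller = build_bags(points, "controller")
print(bags_by_cluster)
print(bags_by_controller)
```

In a multiple-instance setting, each bag (not each individual point) would then receive a label.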

ranking
Holds TensorFlow ranking models. Includes ranking of optimal hyperparameters for clustering databases based on point similarity.

Repository contents:

MILCategorization
clustering
data
error_dfs
extract
load
ranking
transform
.gitignore
8-16 NbClust Reduce calculation times.spydata
README.md
__init__.py
classify_model1.py
data.py