
Python code used for research paper - Computer Science


694411/Computer-Science---paper


Computer Science - paper

Python code used for research paper - Computer Science

The code is divided into eight parts:

  • Data (loading, cleaning, preparing)
  • Model Words (extracting model words from product descriptions)
  • Binary Vectors (constructing the binary vectors for each product)
  • Min-Hashing (constructing the signature matrix)
  • Locality-Sensitive Hashing (identifying candidate duplicate pairs)
  • Jaccard Similarity
  • Bootstrapping training
  • Bootstrapping testing
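
The steps above can be sketched end to end as follows. This is a minimal illustration, not the repository's actual code: the product titles, the model-word regex, the permutation-based min-hashing, the band count, the 0.5 similarity threshold, and the out-of-bag bootstrap split are all assumptions chosen for the example, and every function name is hypothetical.

```python
import random
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical product titles; the repository's real data set is not shown here.
titles = {
    "p1": "Samsung 40inch 1080p LED TV 60hz",
    "p2": "Samsung 40inch LED HDTV 1080p 60hz 3d",
    "p3": "Sony 55inch 4k OLED TV 120hz",
}

# Assumed definition of a model word: a token containing both a digit and a letter.
MODEL_WORD = re.compile(r"\b(?=\w*\d)(?=\w*[a-z])\w+\b")


def model_words(title):
    """Extract the set of model words from a product title."""
    return set(MODEL_WORD.findall(title.lower()))


def minhash_signatures(binary, n_hashes, seed=0):
    """Build min-hash signatures from random permutations of the row indices."""
    rng = random.Random(seed)
    n_rows = len(next(iter(binary.values())))
    signatures = {pid: [] for pid in binary}
    for _ in range(n_hashes):
        perm = list(range(n_rows))
        rng.shuffle(perm)
        for pid, vec in binary.items():
            # Smallest permuted index among the rows where the vector has a 1.
            ones = (perm[i] for i, bit in enumerate(vec) if bit)
            signatures[pid].append(min(ones, default=n_rows))
    return signatures


def lsh_candidates(signatures, bands):
    """Return pairs whose signatures agree on every row of at least one band."""
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for pid, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(pid)
        for bucket in buckets.values():
            candidates.update(combinations(sorted(bucket), 2))
    return candidates


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bootstrap_split(ids, seed=0):
    """Draw a bootstrap sample for training; out-of-bag ids form the test set."""
    rng = random.Random(seed)
    train = {rng.choice(ids) for _ in ids}
    return sorted(train), [i for i in ids if i not in train]


# Model words -> binary vectors over the sorted vocabulary of all model words.
words = {pid: model_words(title) for pid, title in titles.items()}
vocab = sorted(set().union(*words.values()))
binary = {pid: [int(w in ws) for w in vocab] for pid, ws in words.items()}

# Min-hashing -> LSH candidate pairs -> Jaccard filter on the candidates only.
signatures = minhash_signatures(binary, n_hashes=20)
candidates = lsh_candidates(signatures, bands=10)
duplicates = {(a, b) for a, b in candidates if jaccard(words[a], words[b]) >= 0.5}
```

Using more bands with fewer rows each makes LSH more permissive (more candidate pairs, fewer missed duplicates), while fewer, wider bands make it stricter. The Jaccard filter is applied only to the candidate pairs, which is what saves comparisons relative to checking every pair.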

The code is heavily commented to make each step as clear as possible. Everything is in a single file, so there is no need to switch between documents.
