PhoneixProgrammer/Dataset-segmentation-using-Kmeans-for-figshare-data

Dataset-segmentation-using-Kmeans-for-figshare-data

Figshare: Enhancing Your Ranking

Figshare is a platform where users can upload their work for free, making it accessible to a wide audience. With a free service, the challenge is making your work stand out. Figshare uses private ranking algorithms to decide which work is displayed prominently, and these algorithms take various properties of the uploaded content into account.

The catch is that users often do not know what these properties are, which makes it hard for them to rank their materials effectively.

This investigation aimed to uncover why different datasets receive varying levels of attention. The dataset contained information such as author details, backlinks, and URLs. The problem was framed as an unsupervised learning task.

The process began with an initial exploration of the dataset to understand its contents. Data cleaning followed: null values, outliers, URLs, and ambiguous data points were removed. An in-depth exploratory data analysis (EDA) was then carried out to extract insights about the dataset's properties.
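The cleaning step could look roughly like the sketch below. It is not the repository's actual code: the column names (`title`, `url`, `backlinks`, `views`), the toy rows, and the choice of the 1.5 × IQR rule for outliers are all assumptions made for illustration.

```python
import pandas as pd


def clean_dataset(df, numeric_cols, url_cols):
    """Drop URL columns, rows with nulls, and IQR outliers (illustrative only)."""
    # Remove URL columns; errors="ignore" skips names that are not present.
    cleaned = df.drop(columns=url_cols, errors="ignore")
    # Remove every row that still contains a null value.
    cleaned = cleaned.dropna()
    # Filter each numeric column in turn with the 1.5 * IQR rule (an assumed choice).
    for col in numeric_cols:
        q1 = cleaned[col].quantile(0.25)
        q3 = cleaned[col].quantile(0.75)
        iqr = q3 - q1
        in_range = cleaned[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
        cleaned = cleaned[in_range]
    return cleaned.reset_index(drop=True)


# Hypothetical toy data; the real Figshare export may use different columns.
raw = pd.DataFrame({
    "title": ["a", "b", "c", "d", "e", "f"],
    "url": ["u1", "u2", "u3", "u4", "u5", "u6"],
    "backlinks": [3, 5, 4, None, 6, 500],
    "views": [30, 52, 41, 45, 60, 58],
})
clean = clean_dataset(raw, numeric_cols=["backlinks", "views"], url_cols=["url"])
print(clean)
```

On this toy input the row with the missing backlink count and the row with 500 backlinks are dropped, leaving four rows.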

Next, a model was selected (K-means, as the repository name indicates) and its parameters were tuned before training on the dataset. Training produced clusters, which were used to address the central question: which factors lead certain datasets to receive more attention than others.
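A minimal sketch of tuning and fitting K-means with scikit-learn follows. The README does not say how the parameters were tuned, so picking the number of clusters by silhouette score is an assumption here, as are the synthetic two-feature data and the helper name `fit_best_kmeans`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def fit_best_kmeans(features, k_values, seed=0):
    """Fit K-means for each k and keep the model with the best silhouette score."""
    # Standardise so that no single feature dominates the Euclidean distance.
    scaled = StandardScaler().fit_transform(features)
    best_k, best_score, best_model = None, -1.0, None
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(scaled)
        score = silhouette_score(scaled, model.labels_)
        if score > best_score:
            best_k, best_score, best_model = k, score, model
    return best_k, best_model


# Synthetic stand-in for (backlinks, views): two well-separated groups.
rng = np.random.default_rng(0)
low = rng.normal(loc=[2, 50], scale=[1, 10], size=(50, 2))
high = rng.normal(loc=[40, 900], scale=[5, 100], size=(50, 2))
features = np.vstack([low, high])

best_k, model = fit_best_kmeans(features, k_values=range(2, 6))
print(best_k, np.bincount(model.labels_))
```

The elbow method on `model.inertia_` is a common alternative to the silhouette score for choosing the number of clusters.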

In summary, our analysis revealed that datasets with a higher number of backlinks tended to attract more attention.
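One way to read such a result off the clusters is to compare per-cluster averages. The sketch below uses made-up numbers and assumes that attention is measured by a `views` column; neither comes from the repository.

```python
import pandas as pd

# Hypothetical labelled rows: cluster ids plus two assumed columns.
labelled = pd.DataFrame({
    "cluster": [0, 0, 0, 1, 1, 1],
    "backlinks": [2, 3, 1, 40, 35, 45],
    "views": [50, 60, 40, 900, 800, 1000],
})

# Mean of each feature per cluster shows how the clusters differ.
profile = labelled.groupby("cluster")[["backlinks", "views"]].mean()
# Pearson correlation between backlinks and the assumed attention measure.
correlation = labelled["backlinks"].corr(labelled["views"])
print(profile)
print(round(correlation, 3))
```

A profile in which the cluster with more backlinks also shows higher views would be consistent with the finding above, though a correlation alone does not establish cause.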
