Unsupervised-Learning-on-leukaemia-patient-data-

This project applies clustering techniques to leukaemia patient data. Clustering helps identify patterns and subgroups within the dataset, which can be valuable for tailoring treatment plans or understanding how the disease progresses.

Table of Contents

  1. Dataset
  2. Data Preprocessing
  3. Clustering Algorithms
  4. Results

Dataset

The dataset used in this project is the leukaemia dataset. It contains information about leukaemia patients, including various clinical and demographic features.

Data Preprocessing

In the data preprocessing phase, we perform the following steps:

  1. Data cleaning: Handling missing values, outliers, and duplicates.
  2. Feature selection: Identifying relevant features for clustering.
  3. Feature scaling: Standardizing or normalizing data to ensure that all features contribute equally to clustering.
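The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `preprocess` helper, the toy column names (`age`, `wbc_count`, `site`), the median imputation, the 1st/99th percentile clipping, and the zero-variance filter are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame) -> np.ndarray:
    """Hypothetical helper: clean, select, and scale numeric features."""
    # 1. Data cleaning: drop duplicate rows, then fill missing numeric
    #    values with the column median (an assumed imputation strategy).
    df = df.drop_duplicates()
    numeric = df.select_dtypes(include="number")
    numeric = numeric.fillna(numeric.median())

    # Tame outliers by clipping each column to its 1st-99th percentile
    # range (assumed thresholds).
    numeric = numeric.apply(
        lambda col: col.clip(col.quantile(0.01), col.quantile(0.99))
    )

    # 2. Feature selection: as a simple stand-in, drop columns that carry
    #    no information because every value is identical.
    numeric = numeric.loc[:, numeric.std() > 0]

    # 3. Feature scaling: standardize to zero mean and unit variance so
    #    no single feature dominates the distance computations.
    return StandardScaler().fit_transform(numeric)


# Toy data with made-up columns, one duplicate row, and one missing value.
toy = pd.DataFrame(
    {
        "age": [34, 51, 51, 47, np.nan, 62],
        "wbc_count": [12.1, 48.0, 48.0, 7.5, 30.2, 95.4],
        "site": [1, 1, 1, 1, 1, 1],
    }
)
X = preprocess(toy)
print(X.shape)
```

Scaling matters here because both K-Means and Ward-linkage hierarchical clustering rely on Euclidean distances, so a feature measured on a large numeric scale would otherwise dominate the result.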

Clustering Algorithms

We apply the following clustering algorithms to the preprocessed data:

  1. K-Means
  2. Hierarchical Clustering

Each algorithm may yield different clusters, and we evaluate their performance using relevant metrics.
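As an illustration, both algorithms can be run on the same feature matrix with scikit-learn and their labelings compared. This is a sketch under stated assumptions: the data is synthetic (`make_blobs` stands in for the preprocessed patient features), the choice of three clusters and Ward linkage is arbitrary, and the adjusted Rand index is just one possible way to measure how closely the two labelings agree.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for the preprocessed feature matrix (assumption).
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.8, random_state=0)

# K-Means: partitions the data around k centroids.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)

# Hierarchical (agglomerative) clustering: repeatedly merges the closest
# clusters; Ward linkage minimizes the within-cluster variance at each merge.
hier = AgglomerativeClustering(n_clusters=3, linkage="ward")
hier_labels = hier.fit_predict(X)

# Compare the two labelings. A value of 1.0 means identical partitions
# (up to renaming of the cluster ids); values near 0 mean chance-level agreement.
agreement = adjusted_rand_score(kmeans_labels, hier_labels)
print(f"Adjusted Rand index between the two labelings: {agreement:.3f}")
```

The cluster ids themselves are arbitrary, so the labelings should be compared with a permutation-invariant measure like this rather than by checking the raw label arrays for equality.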

Results

Our clustering analysis reveals insights into the leukaemia patient data. We visualize the clusters and interpret their clinical significance. Additionally, we calculate metrics such as silhouette score and inertia to assess the quality of the clusters.
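The two metrics mentioned above can be computed as in the following sketch. It assumes synthetic data in place of the patient features and an arbitrary range of candidate cluster counts. Inertia is the within-cluster sum of squared distances to the nearest centroid (lower is tighter), while the silhouette score ranges from -1 to 1 (higher means better-separated clusters).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the preprocessed feature matrix (assumption).
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.8, random_state=0)

# Fit K-Means for several candidate cluster counts and record both metrics.
results = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    results[k] = (model.inertia_, silhouette_score(X, model.labels_))

for k, (inertia, silhouette) in results.items():
    print(f"k={k}  inertia={inertia:.1f}  silhouette={silhouette:.3f}")
```

Inertia tends to keep falling as k grows, so it is usually read by looking for an "elbow" in the curve, whereas the silhouette score can be compared directly across values of k. Inertia is specific to K-Means; the silhouette score can also be computed for the hierarchical clustering labels.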