pulsars-candidate-classifier

The High Time Resolution Universe (HTRU) Survey was conducted to search for pulsars and fast transients using the Parkes Telescope in Australia. The majority of the candidate detections were actually false positives caused by radio frequency interference (RFI) and noise. We use state-of-the-art machine learning techniques to evaluate feature importance and to compare the performance of different approaches to designing a binary classifier that automatically labels real pulsar candidates. We address the class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) and tune the hyperparameters of our models to maximize accuracy and the geometric mean (G-Mean).

Methodology

Input Data

17,898 examples and 8 features

Feature Pre-processing

Standard Scaler
Stratified train-test split
Oversampling using SMOTE
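
The three pre-processing steps above can be illustrated with a small, dependency-free sketch. It is not the project's code: the function names (`stratified_split`, `fit_scaler`, `scale`, `smote_like`), the toy two-feature data, and the 25% test fraction are all assumptions made for illustration, and `smote_like` is a simplified stand-in for SMOTE that assumes a binary problem with at least two minority samples. In practice, scikit-learn's `StandardScaler` and `train_test_split(..., stratify=y)` and imbalanced-learn's `SMOTE` cover the same steps.

```python
import math
import random


def stratified_split(X, y, test_fraction=0.25, seed=0):
    """Split per class so train and test keep roughly the same class ratio."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]

    def pick(ids):
        return [X[i] for i in ids], [y[i] for i in ids]

    return pick(train_idx), pick(test_idx)


def fit_scaler(X):
    """Return per-feature means and population standard deviations."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    # A constant feature has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def scale(X, means, stds):
    """Standardize each feature to zero mean and unit variance."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def smote_like(X, y, minority=1, k=3, seed=0):
    """Add synthetic minority samples until both classes have equal counts.

    Each synthetic sample lies on the line segment between a minority sample
    and one of its k nearest minority neighbours (binary labels assumed).
    """
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    n_new = (len(y) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        base = rng.choice(minority_rows)
        neighbours = sorted((r for r in minority_rows if r is not base),
                            key=lambda r: math.dist(r, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
        y_out.append(minority)
    return X_out, y_out


# Toy imbalanced data: 90 negatives and 10 positives with 2 features each.
rng = random.Random(42)
X = ([[rng.gauss(0, 1), rng.gauss(5, 2)] for _ in range(90)]
     + [[rng.gauss(3, 1), rng.gauss(9, 2)] for _ in range(10)])
y = [0] * 90 + [1] * 10

(X_train, y_train), (X_test, y_test) = stratified_split(X, y)
means, stds = fit_scaler(X_train)      # statistics come from the training split only
X_train = scale(X_train, means, stds)
X_test = scale(X_test, means, stds)    # the test split reuses the training statistics
X_bal, y_bal = smote_like(X_train, y_train)  # only the training split is oversampled
```

In this sketch the scaler is fitted after the split and only the training split is oversampled, so the test set stays untouched by both steps. That ordering is a design choice of the example, not something the list above specifies.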

Algorithm

Supervised Approach

Decision Tree
SVM
XGBoost
Neural Networks

Unsupervised Approach

Calculating feature importance
K-Means
Agglomerative Clustering

Performance Metric

Confusion Matrix
F-Score
G-Mean
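
As a rough illustration of how the three metrics relate, the sketch below derives the F-Score and G-Mean from confusion-matrix counts. It assumes the common definitions (F-Score as the weighted harmonic mean of precision and recall, G-Mean as the geometric mean of sensitivity and specificity); the function names and the toy labels are hypothetical and not taken from the project.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def f_score(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def g_mean(tp, fp, fn, tn):
    """Geometric mean of sensitivity (recall) and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels: 4 real pulsars (1) and 6 non-pulsars (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
f1 = f_score(tp, fp, fn)
gm = g_mean(tp, fp, fn, tn)

# A classifier that predicts "not a pulsar" for everything is 60% accurate
# on these labels, yet its G-Mean is zero because sensitivity is zero.
all_negative = [0] * len(y_true)
gm_trivial = g_mean(*confusion_counts(y_true, all_negative))
```

The last two lines show why G-Mean suits an imbalanced dataset: it drops to zero whenever one class is missed entirely, whereas plain accuracy can stay high.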

Note: Hyper-parameters are adjusted for best performance.
