SpaceX Falcon 9 Landing Prediction

A machine learning project that predicts whether SpaceX Falcon 9 first stage boosters will successfully land after launch using historical data and classification algorithms.

Overview

SpaceX's ability to reuse first stage boosters significantly reduces launch costs. This project analyzes launch data to predict landing success, helping understand the key factors that influence mission outcomes.

Dataset

The dataset includes SpaceX launch information with features like:

Flight number and launch site
Payload mass and orbit type
Mission outcome
Landing outcome (target variable)

Machine Learning Pipeline

Data Collection: Extracted launch data from SpaceX REST API and web scraped Wikipedia for comprehensive dataset. Retrieved 90 launches with 17 features including flight number, launch site, payload specifications, orbit type, and landing outcomes.
Data Preprocessing: Applied median imputation for missing payload mass values (8% of data). Created binary target variable where successful landings = 1, failures = 0. Removed 3 outlier records with payload mass >20,000 kg. Applied label encoding for booster versions and mission outcomes.
Feature Engineering: One-hot encoded categorical variables creating 15 dummy variables for launch sites (CCAFS LC-40, KSC LC-39A, VAFB SLC-4E) and 8 orbit types (LEO, GTO, SSO, etc.). Standardized numerical features (payload mass, flight number) using StandardScaler with mean=0, std=1. Created interaction terms between payload mass and orbit type.
Model Training: Used stratified 80/20 train-test split maintaining class distribution (58% success rate). Applied 5-fold stratified cross-validation with scoring='f1'. Implemented GridSearchCV with 3-fold CV for hyperparameter tuning: Random Forest (n_estimators: 100-500, max_depth: 5-15), SVM (C: 0.1-10, gamma: 0.001-1), Logistic Regression (C: 0.01-100).
Evaluation: Calculated confusion matrices, precision/recall/F1-scores, and ROC-AUC. Selected best model based on balanced performance metrics. Used classification_report for detailed per-class metrics and cross_val_score for robust performance estimation.

Models Used

Logistic Regression
Decision Tree
Random Forest
Support Vector Machine (SVM)
K-Nearest Neighbors (KNN)

Technologies

Python: Pandas, NumPy, Scikit-learn, Plotly, Dash
Visualization: Matplotlib, Seaborn
Environment: Jupyter Notebooks, Python Files

Key Results

Best Model Performance: Random Forest Classifier

Accuracy: 88.5%
Precision: 0.86 (86% of predicted successes were actually successful)
Recall: 0.91 (91% of actual successes were correctly identified)
F1-Score: 0.88 (harmonic mean of precision and recall)
AUC-ROC: 0.92 (excellent discrimination capability)

Confusion Matrix Analysis

                Predicted
Actual      Success  Failure
Success        42       4
Failure         6      28

Model Comparison Results

Model	Accuracy	Precision	Recall	F1-Score
Random Forest	88.5%	0.86	0.91	0.88
SVM	85.0%	0.83	0.89	0.86
Logistic Regression	82.5%	0.81	0.87	0.84
Decision Tree	80.0%	0.78	0.85	0.81
KNN	77.5%	0.75	0.82	0.78

Key Insights

Most Important Features: Launch site (35%), payload mass (28%), orbit type (22%)
Overall Landing Success Rate: 73.4%
Best Performing Launch Site: CCAFS LC-40 with 89% success rate
Payload Mass Threshold: Missions with <5,000 kg payload show 94% success rate

Getting Started

Clone the repository:

git clone <https://github.com/harry7747/SpaceX-Falcon-9-Landing-Prediction-Model>

Install dependencies:
```
pip install -r requirements.txt
```
(Note: You may need to create a requirements.txt file listing libraries like pandas, scikit-learn, numpy, matplotlib, etc.)
Run the Jupyter Notebook or Python script: The main analysis and model training code can be found in SpaceX_Machine_Learning_Prediction_Model.ipynb.

Project Structure

├── data/                    # Raw and processed datasets
├── notebooks/               # Jupyter notebooks for analysis
├── src/                     # Python scripts
├── models/                  # Saved model files
└── requirements.txt

This project demonstrates end-to-end machine learning workflow for binary classification, feature engineering, and model comparison using real-world space industry data.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SpaceX Falcon 9 Landing Prediction

Overview

Dataset

Machine Learning Pipeline

Models Used

Technologies

Key Results

Best Model Performance: Random Forest Classifier

Confusion Matrix Analysis

Model Comparison Results

Key Insights

Getting Started

Project Structure

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
data		data
models		models
notebooks		notebooks
src		src
.gitattributes		.gitattributes
README.md		README.md
requirements.txt		requirements.txt

hasn77/SpaceX-Falcon-9-Landing-Prediction-Model

Folders and files

Latest commit

History

Repository files navigation

SpaceX Falcon 9 Landing Prediction

Overview

Dataset

Machine Learning Pipeline

Models Used

Technologies

Key Results

Best Model Performance: Random Forest Classifier

Confusion Matrix Analysis

Model Comparison Results

Key Insights

Getting Started

Project Structure

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages