GitHub - Sagaryadav2006/DEFORESTATION: This project is designed to predict and monitor deforestation in tropical rainforests using machine learning. It analyzes satellite imagery and a wide range of environmental data such as rainfall, tree cover, and proximity to roads to identify and forecast areas at high risk of forest loss.

Deforestation Prediction Project

Project Overview

This project aims to leverage machine learning to predict and monitor deforestation in vital tropical rainforests. By analyzing a rich dataset of satellite imagery and environmental factors, the model seeks to identify areas at high risk of future forest loss. The ultimate goal is to create a proactive tool that can help conservation organizations and policymakers make informed decisions to protect these critical ecosystems.

The Problem of Deforestation

Deforestation is a critical environmental issue with far-reaching consequences, including biodiversity loss, disruption of water cycles, soil erosion, and a significant contribution to global climate change. Traditional methods of monitoring are often reactive. This project takes a proactive approach, aiming to forecast deforestation events before they occur.

Objective

The primary objective is to build a robust predictive model that accurately classifies areas as being at high or low risk of a deforestation event. This involves:

Preprocessing and cleaning diverse geospatial and environmental data.

Analyzing the key drivers and indicators of deforestation.

Training and evaluating various machine learning models to find the most effective one.

Creating a system that can provide a "Predicted Risk Score" for a given geographical tile.

The Dataset

The analysis is based on the deforestation_sample_1100.csv dataset. This dataset contains 1100 unique observations, each representing a specific geographical tile at a particular point in time.

The data is split into train, validation, and test sets.
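A three-way split of this kind can be sketched by applying scikit-learn's train_test_split twice. The 70/15/15 proportions, the random seed, and the synthetic stand-in data below are assumptions for illustration; the README does not state the actual ratios.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 1100 tiles: 5 numeric features and a binary target.
rng = np.random.default_rng(42)
X = rng.normal(size=(1100, 5))
y = rng.integers(0, 2, size=1100)

# First hold out 30% of the rows, then split that portion half and half
# into validation and test sets. Stratifying keeps the class balance similar.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42
)

print(len(X_train), len(X_val), len(X_test))
```

Keeping the test set untouched until the very end is what makes the final scores an estimate of performance on unseen tiles.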

Key Data Features Include:

Geospatial Data: Latitude, Longitude, Elevation (m), Slope (°)

Climatic Data: Rainfall (mm), Temperature (°C), Cloud Cover (%)

Satellite Indices: NDVI (Vegetation Health), NDMI (Moisture Index), EVI (Enhanced Vegetation Index)

Forest Metrics: Tree Cover (%), Canopy Height (m), Forest Loss Last 3Y (%)

Human Activity Indicators: Distance to Road (km), Distance to Settlement (km), Population Density, Fire Alerts (7d)

Land Use Data: Protected Area, Logging Concession

Target Variable: Deforestation Event (Yes=1, No=0)

Project Workflow

Data Loading: The initial dataset is loaded using Python's pandas library.

Data Preprocessing:

Column Name Cleaning: Standardized column names by removing special characters and spaces (e.g., Tree Cover (%) becomes tree_cover_percent).

Type Conversion: The Date column was converted to a datetime object for time-series analysis.

Categorical Encoding: Features like Region and Country were one-hot encoded to be used in the machine learning model.

Exploratory Data Analysis (EDA): In-depth analysis to understand the relationships between different features and their correlation with deforestation events.

Model Building & Training: Several classification models (e.g., Logistic Regression, Random Forest, Gradient Boosting) are trained on the preprocessed data.

Model Evaluation: The models are evaluated on Accuracy, Precision, Recall, and F1-Score to determine the best-performing algorithm.
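The preprocessing steps in the workflow above can be sketched with pandas as follows. The helper clean_column_name and the sample rows (including the Region and Country values) are hypothetical and only mimic the columns described in this README; the project's own cleaning rules may differ.

```python
import re
import pandas as pd

def clean_column_name(name: str) -> str:
    """Turn a header such as 'Tree Cover (%)' into 'tree_cover_percent'."""
    name = name.replace("%", "percent").lower()
    name = re.sub(r"[^a-z0-9]+", "_", name)  # collapse anything else into "_"
    return name.strip("_")

# Tiny made-up frame that mimics a few of the columns described above.
df = pd.DataFrame({
    "Date": ["2021-01-15", "2021-02-15", "2021-03-15"],
    "Region": ["Amazon", "Congo Basin", "Amazon"],
    "Country": ["Brazil", "DRC", "Peru"],
    "Tree Cover (%)": [82.5, 64.0, 91.2],
    "Distance to Road (km)": [1.2, 14.8, 30.5],
})

# Column name cleaning, type conversion, then one-hot encoding.
df.columns = [clean_column_name(c) for c in df.columns]
df["date"] = pd.to_datetime(df["date"])
df = pd.get_dummies(df, columns=["region", "country"])

print(df.columns.tolist())
```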

This project uses a Logistic Regression model, an interpretable machine learning algorithm well suited to binary classification tasks. It was chosen after experiments with other models because it performed well and reliably on this dataset.
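A minimal sketch of fitting such a model is shown below. The repository's file names (scaler.joblib, logistic_model.joblib) suggest a scaler followed by logistic regression, but the contents of train_model.py are not shown here, so the pipeline, the hyperparameters, and the synthetic data are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: the label depends on the first two features plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(1100, 4))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=1100) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling first keeps features with large units from dominating the fit.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"test accuracy on synthetic data: {test_accuracy:.3f}")
```

The printed accuracy describes the synthetic data only and says nothing about the project's real results.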

What the Model Predicts

The model is trained to solve a specific classification problem: "Will a deforestation event occur in this specific area?"

It analyzes 30 different input features for a geographical tile and produces two key outputs:

A binary prediction:

1 if a deforestation event is likely.

0 if a deforestation event is not likely.

A confidence score: The model's estimated probability for its prediction, which helps in assessing the risk level.
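The two outputs can be illustrated with the logistic function itself. In this sketch the weights, bias, feature values, and the 0.5 threshold are hypothetical; the trained model's actual parameters are stored in the repository's joblib files and are not reproduced here.

```python
import math

def predict_tile(features, weights, bias, threshold=0.5):
    """Return (label, probability of deforestation) for one tile."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p_event = 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function
    label = 1 if p_event >= threshold else 0
    return label, p_event

# Hypothetical weights for three already-scaled features.
weights = [1.8, -1.1, 0.9]
bias = -0.4

label, p_event = predict_tile([1.2, -0.5, 0.3], weights, bias)
# Confidence is the probability assigned to whichever class was predicted.
confidence = p_event if label == 1 else 1.0 - p_event
print(label, round(p_event, 3), round(confidence, 3))
```

Under this reading, a tile predicted as 0 with an event probability of 0.1 has a confidence of 0.9 in the "no event" outcome.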

How the Model Makes Predictions

The model learns patterns from historical data to identify which factors are most strongly associated with forest loss. The analysis revealed that the most influential features for its predictions are:

Historical Forest Loss: Cumulative Deforested Area (%) and Forest Loss Last 3Y (%) are the strongest predictors.

Human Activity: Population Density (per km²), Distance to Road (km), and the presence of a Logging Concession.

Environmental Factors: Vegetation health indices like NDVI and EVI.

Essentially, the model has learned that areas with a history of deforestation that are close to human infrastructure are at the highest risk.
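One common way to read feature influence from a logistic regression is to compare coefficient magnitudes after standardizing the inputs. The sketch below assumes that approach; the README does not say how influence was measured, and the feature names and synthetic data are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

feature_names = ["forest_loss_last_3y_percent", "distance_to_road_km", "ndvi", "elevation_m"]

# Synthetic data in which past loss raises risk, road distance lowers it,
# and the remaining two features are pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(1100, 4))
logits = 2.0 * X[:, 0] - 1.0 * X[:, 1]
y = (logits + rng.logistic(size=1100) > 0).astype(int)

X_scaled = StandardScaler().fit_transform(X)
clf = LogisticRegression(max_iter=1000).fit(X_scaled, y)

# On standardized inputs, a larger |coefficient| means a stronger influence.
ranking = sorted(zip(feature_names, clf.coef_[0]), key=lambda kv: abs(kv[1]), reverse=True)
for name, coef in ranking:
    print(f"{name:30s} {coef:+.2f}")
```

The sign of each coefficient gives the direction of the effect: positive values push a tile toward the "deforestation" class, negative values away from it.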

Final Model Performance

The model demonstrates a high degree of accuracy and reliability on unseen test data, making it a trustworthy tool for an early-warning system.

Overall Accuracy: 97.04%

Recall (for "Deforestation"): 92% - The model successfully identifies 92% of all actual deforestation events, meaning very few critical events are missed.

Precision (for "Deforestation"): 96% - When the model raises an alarm, it is correct 96% of the time, leading to very few false alarms.

This strong balance between recall and precision makes the model highly effective and suitable for real-world deployment.
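For reference, the metrics above follow from confusion-matrix counts as sketched here. The counts passed to classification_metrics are made up for illustration and are not the project's results; only the last calculation uses the rounded precision and recall reported in this README.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Made-up counts for illustration only; they are not the project's results.
acc, prec, rec, f1 = classification_metrics(tp=45, fp=5, fn=10, tn=140)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")

# F1 implied by the rounded precision and recall reported above.
reported_precision, reported_recall = 0.96, 0.92
implied_f1 = 2 * reported_precision * reported_recall / (reported_precision + reported_recall)
print(f"implied F1 = {implied_f1:.3f}")
```

Because the reported precision and recall are rounded, the implied F1-Score of roughly 0.94 is an approximation rather than a measured value.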

Folders and files

DEFORESTATION_PREDICTOR_Project_PPT.pptx
README.md
app_gradio.py
deforestation.csv
deforestation_30%.ipynb
deforestation_60%.ipynb
deforestation_ML_MODEL.ipynb
logistic_model.joblib
model_columns.joblib
requirements.txt
scaler.joblib
train_model.py
