Research project on random forests and how to improve their prediction accuracy on the NASA airfoil learning problem using mathematical analysis

MamouneElBoukfaoui/Machine-learning-Random-regression-forests-project

Machine-learning

This repository contains some of the R scripts from my academic research project in machine learning, supervised by Dr Ian H. Jermyn. My research topic is random regression forests. In this project, I carried out a probabilistic and statistical analysis of random regression forests and concluded that certain conditions must be imposed on the construction of the random forest estimator in order to achieve better prediction accuracy on any possible test set. Throughout the project, the NASA airfoil self-noise data set is used to illustrate the theory studied. For a description of the data, please refer to: https://www.kaggle.com/datasets/fedesoriano/airfoil-selfnoise-dataset.

The project paper was written in LaTeX and covers the statistical learning background needed to understand random regression forests in depth. An outline of the paper is given below:

As an ensemble learning procedure, random regression forests combine many statistical techniques. To understand them clearly, it is therefore necessary to analyse each of their components rigorously. The paper is organised as follows. Chapter 2 explains the basic principles of supervised learning by analysing important concepts such as the difference between parametric and non-parametric learning, the notion of error in statistical learning through the study of loss functions, and the bias-variance trade-off. Chapter 3 analyses regression trees, the main algorithm behind random regression forests. Chapter 4 introduces a statistical procedure called 'bagging', which is applied to regression trees with the aim of reducing the variance of individual trees. Chapter 5 analyses random forests formally, beginning with a presentation and finishing with more complex properties of their asymptotic behaviour. Finally, a conclusion summarises the main benefits of using random regression forests and the theory discussed in the previous chapters.
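As a brief illustration of two ideas named in the outline, the bias-variance trade-off and the variance reduction obtained by bagging, the standard textbook identities can be sketched as follows. The notation is an assumption made for this sketch and is not taken from the paper: $Y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma_\varepsilon^2$, an estimator $\hat f$, and $B$ identically distributed trees $T_1, \dots, T_B$, each with variance $\sigma^2$ at the point $x$ and pairwise correlation $\rho$.

```latex
% Bias-variance decomposition of the expected squared error at a point x
% (assumed setting: Y = f(x) + \varepsilon, noise independent of \hat f)
\mathbb{E}\big[(Y - \hat f(x))^2\big]
  = \underbrace{\sigma_\varepsilon^2}_{\text{irreducible noise}}
  + \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{squared bias}}
  + \underbrace{\operatorname{Var}\big(\hat f(x)\big)}_{\text{variance}}

% Variance of an average of B identically distributed trees
% with common variance \sigma^2 and pairwise correlation \rho
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

Under these assumptions, increasing $B$ removes only the second term of the averaged variance, so the benefit of averaging is limited by the correlation $\rho$ between trees. This is the usual motivation for the extra randomisation that random forests add on top of bagging.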

The project paper is available on request by e-mail: [email protected]
