Machine learning before coding

This repository is designed for people who don't know how to code but work, or want to work, with machine learning 🙃

If you have no background in data science, check out this one 😎

Now, if you already code and want a repo with a faster pace, check out this one 😎

Disclaimer

This is a collaborative repository, created by the students of Instituto Metrópole Digital from UFRN.

The author of each notebook is properly acknowledged 😉

Choosing a tool

Several tools are available for people with this profile.

In general, they can be grouped into GUI tools and CLI tools:

  • GUI (graphical user interface): All user interaction is done graphically. Examples include Orange3 and Weka.
  • CLI (command-line interface): The user interacts by writing commands or code, typically in a programming language. The main open source languages used in data science are Python, R, and Julia.

A nice alternative that combines a bit of both worlds is the interactive notebook, which originated in Project Jupyter and is now also supported by Google Colaboratory.

This post discusses the main supported languages.

In this repo, we will use notebooks with the Python ecosystem and its main machine learning library, scikit-learn.
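To give a rough idea of what working with scikit-learn in a notebook cell looks like, here is a minimal sketch. It is not taken from the notebooks in this repo: the Iris dataset, the k-nearest neighbors model, and the split settings are all assumptions chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small example dataset bundled with scikit-learn (illustrative choice).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows to check the model on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Most scikit-learn models follow the same pattern: create, fit, then score or predict.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

You can paste a cell like this into Colab and run it without installing anything, since scikit-learn is preinstalled there.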

All of the material was planned so you don't need to learn how to code, but if you do want to, check out this repo.

The notebooks in this repository were either created or translated by the authors indicated.

Meeting scikit-learn

[leobezerra] First steps

Open In Colab Binder

[leobezerra] Data preparation: outliers and imputation

Open In Colab Binder Watch on YouTube

[jonathanjalles][leobezerra] Data preparation: transformations

Open In Colab Binder Watch on YouTube

[kallil12][leobezerra] Feature engineering: selection

Open In Colab Binder Watch on YouTube
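
The topics listed above (imputation, transformations, and feature selection) are often chained together before a model is trained. The sketch below shows one possible way to do that with a scikit-learn pipeline. It is not the notebooks' own code: the dataset, the artificially removed values, the median strategy, the choice of `k=2`, and the logistic regression model are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Artificially blank out about 5% of the values so there is something to impute.
rng = np.random.default_rng(0)
X = X.copy()
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Each step feeds the next: fill gaps, rescale, keep the 2 best features, classify.
pipeline = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    SelectKBest(score_func=f_classif, k=2),
    LogisticRegression(max_iter=1000),
)
pipeline.fit(X_train, y_train)

accuracy = pipeline.score(X_test, y_test)
selected = pipeline.named_steps["selectkbest"].get_support()
print(f"Test accuracy: {accuracy:.2f}")
print("Features kept:", selected)
```

Putting the steps in a pipeline means the imputation, scaling, and selection are learned from the training rows only and then reapplied unchanged to the test rows.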