Data Science Challenges

pt-br

This repository contains four challenges covering data analysis, process automation, machine learning, and other subjects related to Data Science.

These problems are common to many companies and can be addressed by studying the data each company already has. Analyzing that data with a programming language makes it possible to detect patterns and gain insights that support decisions about the company’s issues.

Challenge 1

The first challenge involves gathering information about 25 shopping malls from an Excel spreadsheet with over 100,000 rows and sending each mall’s report to the email of its respective manager.

Doing this manually would be extremely repetitive and tiring, so this challenge is solved with simple process automation.

Libraries and packages used: pandas, smtplib, email. Read more details about this challenge here.
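As a rough illustration of the idea, the sketch below groups rows by mall with pandas and builds one email per manager with the standard `email` package. The column names (`mall`, `revenue`, `email`), the sample values, the addresses, and the SMTP host are all hypothetical placeholders, not taken from the actual spreadsheet or notebook.

```python
import smtplib
from email.message import EmailMessage

import pandas as pd

# Hypothetical stand-in for the real spreadsheet (which has 100,000+ rows).
sales = pd.DataFrame({
    "mall": ["Mall A", "Mall A", "Mall B"],
    "revenue": [100.0, 250.0, 80.0],
})
# Hypothetical lookup table: one manager email per mall.
managers = pd.DataFrame({
    "mall": ["Mall A", "Mall B"],
    "email": ["manager.a@example.com", "manager.b@example.com"],
})


def build_reports(sales, managers, sender):
    """Return one EmailMessage per mall with that mall's total revenue."""
    totals = sales.groupby("mall")["revenue"].sum()
    messages = []
    for row in managers.itertuples(index=False):
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = row.email
        msg["Subject"] = f"Report for {row.mall}"
        # Malls with no sales rows fall back to 0.0.
        msg.set_content(f"Total revenue: {totals.get(row.mall, 0.0):.2f}")
        messages.append(msg)
    return messages


def send_all(messages, host, user, password):
    """Send the messages over SMTP with SSL (host and credentials are assumptions)."""
    with smtplib.SMTP_SSL(host, 465) as server:
        server.login(user, password)
        for msg in messages:
            server.send_message(msg)


messages = build_reports(sales, managers, "reports@example.com")
```

Building the messages separately from sending them makes it possible to inspect every report before any email leaves the machine.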

Challenge 2

The second challenge focuses on a credit card company that is progressively losing customers and wants to better understand why so many clients are canceling their cards.

I therefore conduct a data analysis on a spreadsheet of customer data to profile the customers who cancel their cards, so the company can focus on strategies to reverse this trend.

Libraries and packages used: pandas, plotly. Read more details about this challenge here.
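One way to approach this kind of profiling is to compare cancellation rates across customer attributes. The sketch below assumes hypothetical columns (`status`, `card_category`, `transactions_12m`) and made-up rows; the real spreadsheet may be organized differently.

```python
import pandas as pd

# Hypothetical sample of the customer spreadsheet.
customers = pd.DataFrame({
    "status": ["Active", "Canceled", "Canceled", "Active", "Active", "Canceled"],
    "card_category": ["Blue", "Blue", "Gold", "Blue", "Blue", "Gold"],
    "transactions_12m": [60, 20, 25, 70, 55, 30],
})


def cancellation_rate_by(df, column):
    """Share of canceled customers within each value of `column`, highest first."""
    canceled = df["status"].eq("Canceled")
    return canceled.groupby(df[column]).mean().sort_values(ascending=False)


def plot_distribution(df, column):
    """Histogram of `column` split by status (plotly is imported lazily)."""
    import plotly.express as px

    return px.histogram(df, x=column, color="status")


rates = cancellation_rate_by(customers, "card_category")
avg_transactions = customers.groupby("status")["transactions_12m"].mean()
```

Attributes where the canceled group differs sharply from the active group are candidates for a closer look.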

Challenge 3

This is a larger challenge that involves Machine Learning and Business Intelligence. The goal is to create a tool that helps property owners decide how much to charge per night and helps people seeking accommodation judge whether a property’s price is fair.

This challenge demonstrates how a Data Science project works in practice. It basically involves eight steps:

  1. Understanding the challenge
  2. Project/area evaluation
  3. Data extraction and acquisition
  4. Data cleaning
  5. Exploratory analysis
  6. Modeling and algorithms
  7. Result interpretation
  8. Conclusion

Libraries and packages used: pandas, pathlib, numpy, seaborn, matplotlib, plotly, sklearn. Read more details about this challenge here.
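The modeling and result-interpretation steps can be sketched as follows. This is a minimal example on synthetic data, assuming two made-up features (`bedrooms`, `accommodates`) and a plain linear regression; the models and features used in the actual project may differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic listings: the nightly price depends linearly on two features plus noise.
rng = np.random.default_rng(0)
n = 200
bedrooms = rng.integers(1, 5, n)
accommodates = rng.integers(1, 9, n)
price = 50 + 30 * bedrooms + 10 * accommodates + rng.normal(0, 5, n)

X = np.column_stack([bedrooms, accommodates])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=0
)

# Fit on the training split, then evaluate on data the model has not seen.
model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

r2 = r2_score(y_test, predictions)
rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
print(f"R2: {r2:.3f}  RMSE: {rmse:.2f}")
```

Holding out a test split keeps the evaluation honest: the R² and RMSE describe how the model prices listings it was not trained on.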

Challenge 4

The fourth challenge involves a growing company that needs to send invoices daily to overdue customers.

Since this process is highly repetitive, it is well suited to process automation. The most interesting aspect of this challenge is automating a process that runs in the browser.

Libraries and packages used: pandas, time, selenium. Read more details about this challenge here.
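A possible shape for this automation is sketched below: pandas selects the overdue customers, and a Selenium driver fills in a web form for each one. The column names, the sample rows, the form field names (`customer`, `submit`), and the URL parameter are hypothetical; the real site and spreadsheet are not described here.

```python
import time
from datetime import date

import pandas as pd

# Hypothetical sample of the billing spreadsheet.
invoices = pd.DataFrame({
    "customer": ["Ana", "Bruno", "Carla"],
    "due_date": pd.to_datetime(["2024-01-10", "2024-03-01", "2024-02-15"]),
    "paid": [False, False, True],
})


def overdue(df, today):
    """Rows that are unpaid and whose due date is before `today`."""
    mask = (~df["paid"]) & (df["due_date"] < pd.Timestamp(today))
    return df.loc[mask]


def issue_invoices(driver, url, rows):
    """Submit one form per overdue customer (field names are assumptions)."""
    from selenium.webdriver.common.by import By  # lazy import: only needed here

    for row in rows.itertuples(index=False):
        driver.get(url)
        driver.find_element(By.NAME, "customer").send_keys(row.customer)
        driver.find_element(By.NAME, "submit").click()
        time.sleep(1)  # crude pause so the page can finish processing


late = overdue(invoices, date(2024, 2, 20))
```

In a real run, `issue_invoices` would receive a driver such as `selenium.webdriver.Chrome()` together with the address of the invoicing page.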


License

MIT © Gabriel Reira