This repository contains my personal notes, programs, and practice exercises for the course Development of Applications for Data Analysis. The course is part of the Bachelor's in Data Science program at the Escuela Superior de Cómputo (ESCOM) of the Instituto Politécnico Nacional (IPN). The content is meant for educational purposes and documents the work done throughout the course. It includes:
- Python Programs: Various scripts developed for data acquisition, processing, and analysis. These programs demonstrate key techniques learned, including web scraping, data cleaning, and machine learning.
- Practice Exercises: Exercises provided during the course to solidify understanding of Python and its libraries (such as NumPy, Pandas, and scikit-learn) and apply them in data science contexts.
- Class Notes: Personal notes summarizing important concepts from the course, including programming principles, data manipulation, and machine learning models.
- Projects: Larger projects that integrate all course topics, from data acquisition to machine learning, including the use of tools like Apache Spark for distributed processing.
The course aims to provide practical experience in developing applications for data analysis, focusing on:
- Data Acquisition: Using Python to gather data from various sources, including files and web scraping techniques.
- Data Preprocessing: Cleaning, transforming, and structuring data for further analysis.
- Machine Learning: Implementing basic supervised and unsupervised learning models.
- Distributed Processing: Utilizing Apache Spark for handling large datasets and performing scalable machine learning.
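
As a rough illustration of how the first three topics fit together, here is a minimal sketch that is not taken from the course materials. It parses a small in-memory CSV (acquisition), drops rows with missing values (preprocessing), and classifies a new point with a hand-written 1-nearest-neighbor rule (supervised learning). The sample data, column names, and function names are all hypothetical. The sketch uses only the standard library to stay dependency-free; in practice, Pandas and scikit-learn would typically handle these steps.

```python
import csv
import io
import math

# Hypothetical sample data standing in for a file or a scraped table.
RAW_CSV = """height_cm,weight_kg,label
170,65,a
,70,a
160,55,b
158,52,b
172,,a
"""


def load_rows(text):
    """Data acquisition: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows, numeric_fields):
    """Data preprocessing: drop rows with missing values and cast to float."""
    cleaned = []
    for row in rows:
        if any(row[field].strip() == "" for field in numeric_fields):
            continue  # skip incomplete records
        features = [float(row[field]) for field in numeric_fields]
        cleaned.append((features, row["label"]))
    return cleaned


def predict_1nn(train, point):
    """Supervised learning: return the label of the nearest training sample."""
    nearest = min(train, key=lambda item: math.dist(item[0], point))
    return nearest[1]


rows = load_rows(RAW_CSV)
train = clean_rows(rows, ["height_cm", "weight_kg"])
print(len(rows), len(train))
print(predict_1nn(train, [159.0, 54.0]))
```

Dropping incomplete rows is only one cleaning strategy; imputing missing values is a common alternative when data is scarce.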
This repository serves as a personal log of my progress through the course. It collects my:
- Solutions to exercises and challenges
- Programs written in Python
- Insights gained from applying machine learning and data processing techniques
All the content here is shared for educational purposes only. It is not meant for commercial use or redistribution. Feel free to explore the code and use it as reference material, but please give proper credit if you derive any work from this repository.
- Institution: Instituto Politécnico Nacional (IPN)
- Program: Bachelor's in Data Science (Licenciatura en Ciencia de Datos)
- Course: Development of Applications for Data Analysis
- Semester: IV
This course is part of my academic training and helps develop practical skills in Python programming for data science, including working with its data libraries, web scraping, and applying machine learning algorithms.
Disclaimer: This repository is purely for academic use and personal development. All materials are subject to the course's rules and guidelines.