kjdarthvader/Portfolio

Welcome to My GitHub Portfolio

Welcome! I'm a Data Science and Machine Learning enthusiast with a background in Statistics and Computer Science from the University of Illinois Urbana-Champaign. This portfolio collects projects that use machine learning, deep learning, and data analysis and visualization to solve real-world problems.

About Me

As a Deep Learning Researcher and Machine Learning Assistant, I've contributed to significant advancements in precision agriculture and neuromorphic computing. My experiences at Tata Consultancy Services and Goldman Sachs have broadened my expertise to include cloud architecture, financial analytics, and more. Passionate about AI's transformative power, I aim to explore and contribute to the convergence of technology and human potential, driving the future of innovation.

🔗 Connect with me on LinkedIn for a deeper dive into my work and professional journey.

Projects

Machine Learning

  • Bike Sharing Demand Analysis: Forecasts bike sharing demand from rental duration, start and end positions, and environmental factors. A RandomForestRegressor model gave the strongest predictions and highlighted temporal and weather-related usage patterns.
  • Black Friday Sales Prediction: Forecasts Black Friday sales from customer demographics and product categories. A Decision Tree Regressor predicted sales well, with product categories and demographics as the key features. The insights support targeted marketing and inventory management.
  • Iris Dataset: Classifies iris species from sepal and petal measurements using several machine learning models. K-Nearest Neighbors achieved perfect accuracy, showing how cleanly these measurements separate the species.
  • Traffic Forecast: Uses Facebook Prophet to predict traffic flow from historical data. The model captures daily and seasonal trends, and the forecasts help with planning for future traffic demand.
  • Wine Quality Prediction: Applies several machine learning algorithms to predict wine quality from chemical properties. The ExtraTreesClassifier was the most accurate model, which suggests that ensemble methods suit this kind of dataset.
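
As a rough illustration of the K-Nearest Neighbors idea behind the Iris project, here is a minimal sketch in plain Python. It is not the code from the project: the `knn_predict` helper, the `TRAIN` list, and the measurements in it are hypothetical values made up for this example, and the project itself may well use a library implementation instead.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query.

    train is a list of (features, label) pairs; features are equal-length tuples.
    """
    # Sort all training points by Euclidean distance to the query, keep the k nearest.
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical (petal length, petal width) pairs in cm, invented for illustration.
# These are NOT rows from the real Iris dataset.
TRAIN = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((4.0, 1.3), "versicolor"),
    ((6.0, 2.5), "virginica"),
    ((5.8, 2.2), "virginica"),
    ((6.1, 2.3), "virginica"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (1.6, 0.25)))
    print(knn_predict(TRAIN, (5.9, 2.4)))
```

With well-separated clusters like these toy points, a small `k` is enough; on real data, `k` would normally be chosen by validation.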

Natural Language Processing (NLP)

  • Twitter Sentiment Analysis: Uses natural language processing techniques to classify the sentiment of Twitter data with logistic regression. The model is evaluated by F1 score and accuracy and separates sentiment classes effectively. The analysis helps in understanding public opinion trends on social media platforms.
  • SMS Spam Detection: Compares several machine learning models for separating spam from ham messages. The SVM model performed best, showing the value of combining text preprocessing with a strong classifier. The project highlights the potential of NLP for message filtering systems.
  • Cross Language Information Retrieval: Explores the intersection of NLP and information retrieval (IR) to bridge language gaps in search. It combines tokenization, stemming, and BM25 for document indexing with machine translation models for German-to-English queries, with the aim of making content retrievable across languages. Evaluation uses perplexity and mean average precision (MAP) scores.
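
The BM25 ranking step mentioned in the Cross Language Information Retrieval project can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `tokenize`, `bm25_scores`, and the `DOCS` corpus are hypothetical names and data, the IDF uses the common smoothed form `log((N - df + 0.5) / (df + 0.5) + 1)`, `k1` and `b` are set to commonly used defaults, and stemming and query translation are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in docs against query with BM25.

    Assumes docs is a non-empty list of non-empty strings.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical three-document corpus, invented for illustration.
DOCS = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell sharply today",
]

if __name__ == "__main__":
    for doc, score in zip(DOCS, bm25_scores("cat mat", DOCS)):
        print(f"{score:.3f}  {doc}")
```

In a cross-language setting, the German query would first be translated to English and then passed to a scorer like this one.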

Data Analysis and Visualisation

  • Titanic (Kaggle Dataset) - Exploratory Analysis: Exploratory analysis of the passengers onboard RMS Titanic using Pandas, NumPy, Matplotlib and Seaborn visualisations. Dataset: Kaggle.
  • Movies Dataset - Exploratory Analysis: Exploratory analysis of Fandango's 2015 ratings using Pandas, NumPy, Matplotlib and Seaborn visualisations, to check whether movies were rated higher in order to sell more tickets. Dataset: GitHub.
  • 2016 General Elections Poll Analysis: Data analysis and visualization of the 2016 General Election polls using Pandas, NumPy, Matplotlib and Seaborn. Dataset: Kaggle.
  • 911 Calls - Exploratory Data Analysis: Using 90,000 emergency (911) call records, we identify trends and distributions by analyzing call frequencies across zip codes, reasons for calls, and time patterns. Heatmaps and line charts bring out the main patterns in emergency call activity. Dataset: Kaggle.
  • Superstore Sales - Exploratory Data Analysis: Uses the Superstore dataset (9,994 records) to analyze sales trends, product category performance, and customer purchasing patterns from 2014 to 2017. The analysis suggests targeted strategies for improving profitability and customer engagement, particularly during peak sales periods. Dataset: Kaggle.
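
The frequency analysis described for the 911 Calls project (counts by reason, zip code, and time slot) boils down to a few group-and-count steps. The sketch below shows that idea with the standard library only; the notebooks themselves use Pandas. The `CALLS` records, their field names, the "Reason: detail" title format, and the zip codes are all assumptions made up for this example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical call records, invented for illustration; field names are assumptions.
CALLS = [
    {"timestamp": "2016-01-04 08:15:00", "zip": "10001", "title": "EMS: FALL"},
    {"timestamp": "2016-01-04 08:40:00", "zip": "10001", "title": "Traffic: ACCIDENT"},
    {"timestamp": "2016-01-05 17:05:00", "zip": "10002", "title": "EMS: CHEST PAIN"},
    {"timestamp": "2016-01-05 17:30:00", "zip": "10001", "title": "Fire: ALARM"},
]


def reason_of(title):
    """Extract the call reason, assumed to be the text before the first colon."""
    return title.split(":", 1)[0].strip()


def summarize(calls):
    """Count calls by reason, by zip code, and by (weekday, hour) slot."""
    by_reason = Counter(reason_of(c["title"]) for c in calls)
    by_zip = Counter(c["zip"] for c in calls)
    by_slot = Counter()
    for c in calls:
        ts = datetime.strptime(c["timestamp"], "%Y-%m-%d %H:%M:%S")
        # weekday() returns 0 for Monday through 6 for Sunday.
        by_slot[(ts.weekday(), ts.hour)] += 1
    return by_reason, by_zip, by_slot


if __name__ == "__main__":
    by_reason, by_zip, by_slot = summarize(CALLS)
    print(by_reason.most_common())
    print(by_zip.most_common())
    print(sorted(by_slot.items()))
```

The `(weekday, hour)` counts are the kind of table a heatmap is drawn from: one axis per key component, with the count as the cell value.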
