Welcome to My GitHub Portfolio

Welcome! I'm a Data Science and Machine Learning enthusiast with a background in Statistics and Computer Science from the University of Illinois Urbana-Champaign. This portfolio showcases my work on using data to solve real-world problems through machine learning, deep learning, and data analysis and visualization.

About Me

As a Deep Learning Researcher and Machine Learning Assistant, I've contributed to significant advancements in precision agriculture and neuromorphic computing. My experiences at Tata Consultancy Services and Goldman Sachs have broadened my expertise to include cloud architecture, financial analytics, and more. Passionate about AI's transformative power, I aim to explore and contribute to the convergence of technology and human potential, driving the future of innovation.

🔗 Connect with me on LinkedIn for a deeper dive into my work and professional journey.

Projects

Machine Learning

Bike Sharing Demand Analysis: The project employs machine learning to forecast bike sharing demand, leveraging data on rental duration, start/end positions, and environmental factors. The RandomForestRegressor model showed notable predictive accuracy, highlighting temporal and weather-related usage patterns.
Black Friday Sales Prediction: The project focuses on forecasting Black Friday sales using machine learning, analyzing customer demographics and product categories. A Decision Tree Regressor indicated significant predictive accuracy, with product categories and demographics as key features. Insights contribute to targeted marketing and inventory management.
Iris Dataset: The project applies machine learning models to classify iris species based on sepal and petal measurements. K-Nearest Neighbors achieved perfect accuracy, showcasing the effectiveness of classification algorithms in distinguishing between species with high precision.
Traffic Forecast: The project employs Facebook Prophet to predict traffic flow, leveraging historical data for accurate forecasting. The model adeptly captures daily and seasonal trends, offering a comprehensive view of traffic patterns. Forecasting aids in managing and planning for future traffic demands efficiently.
Wine Quality Prediction: The project applies several machine learning algorithms to predict wine quality based on chemical properties. The ExtraTreesClassifier emerged as the most accurate model, demonstrating the potential of ensemble methods in enhancing prediction accuracy for complex datasets.
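
As an illustration of the classification idea behind the Iris project, here is a minimal sketch of k-nearest-neighbours voting in plain Python. It is not the notebook's code: the function name `knn_predict`, the choice `k=3`, and the six sample rows are assumptions made for illustration, and the standard library stands in for whatever library the notebook actually uses.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Illustrative rows only (assumed, not the notebook's train/test split):
# (sepal length, sepal width, petal length, petal width) in cm.
train_X = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (7.0, 3.2, 4.7, 1.4),
    (6.4, 3.2, 4.5, 1.5),
    (6.3, 3.3, 6.0, 2.5),
    (5.8, 2.7, 5.1, 1.9),
]
train_y = [
    "setosa", "setosa",
    "versicolor", "versicolor",
    "virginica", "virginica",
]

prediction = knn_predict(train_X, train_y, (5.0, 3.4, 1.5, 0.2), k=3)
print(prediction)
```

With so few rows the vote is easy to follow by hand. On a real dataset, `k` and the distance metric are usually tuned on held-out data.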

Natural Language Processing (NLP)

Twitter Sentiment Analysis: The project utilizes natural language processing techniques to analyze sentiment from Twitter data, employing logistic regression for classification. The model's performance, indicated by its F1 score and accuracy, showcases effective sentiment distinction. This analysis aids in understanding public opinion trends on social media platforms.
SMS Spam Detection: The project employs several machine learning models to differentiate between spam and ham (legitimate) messages. The SVM model performed best, demonstrating the effectiveness of combining textual preprocessing with classification techniques. This project highlights the potential of NLP in improving message filtering systems.
Cross Language Information Retrieval: The project combines NLP and information retrieval (IR) to bridge language gaps in search. It uses tokenization, stemming, and BM25 for document indexing, alongside machine translation models for German-to-English queries, with the aim of making content retrievable across languages. Evaluation relies on perplexity measures and mean average precision (MAP) scores, reflecting the difficulty of cross-language understanding and retrieval.
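
To make the BM25 step in the retrieval project concrete, the following is a rough sketch of Okapi BM25 scoring in plain Python. Everything here is an assumption for illustration: the helper names `tokenize` and `bm25_scores`, the parameter values `k1=1.5` and `b=0.75` (common defaults, not taken from the project), the whitespace tokenizer standing in for real tokenization and stemming, and the three toy documents. The query is assumed to have already been translated into English.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for tokenization + stemming)."""
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # A common smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical English documents and an already-translated query.
documents = [
    "the cat sat on the mat",
    "dogs chase the cat in the garden",
    "stock markets fell sharply today",
]
query = "cat garden"

scores = bm25_scores(query, documents)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The second document ranks first because it matches both query terms, and the rarer term ("garden") contributes more than the common one ("cat").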

Data Analysis and Visualisation

Titanic (Kaggle Dataset) - Exploratory Analysis: Exploratory analysis of the passengers aboard RMS Titanic using Pandas, NumPy, Matplotlib and Seaborn visualisations. The dataset is available on Kaggle.
Movies Dataset - Exploratory Analysis: Exploratory analysis of Fandango's 2015 movie ratings using Pandas, NumPy, Matplotlib and Seaborn visualisations to check whether ratings were biased upward to sell more tickets. The dataset is available on GitHub.
2016 General Elections Poll Analysis: Data analysis and visualization of the 2016 General Election polls using Pandas, NumPy, Matplotlib and Seaborn. The dataset is available on Kaggle.
911 Calls - Exploratory Data Analysis: Using 90,000 emergency (911) call records, the analysis identifies trends and distributions in call frequency across zip codes, reasons for calls, and time patterns. Heatmaps and line charts are used to show how call volume varies. The dataset is available on Kaggle.
Superstore Sales - Exploratory Data Analysis: The analysis uses the Superstore dataset (9,994 records) to examine sales trends, product category performance, and customer purchasing patterns from 2014 to 2017. It suggests targeted strategies for improving profitability and customer engagement, particularly during peak sales periods. The dataset is available on Kaggle.
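
The frequency analysis described for the 911 Calls project can be sketched as follows. This is a simplified illustration using only the standard library rather than the Pandas workflow in the notebook. The record layout (timestamp, zip code, title with a "Reason: detail" prefix), the four sample rows, and the function name `summarize_calls` are all assumptions, not data from the project.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical records: (timestamp, zip code, title). Layout is assumed.
calls = [
    ("2015-12-10 17:40:00", "19525", "EMS: BACK PAINS/INJURY"),
    ("2015-12-10 17:45:00", "19446", "Fire: GAS-ODOR/LEAK"),
    ("2015-12-10 18:05:00", "19525", "EMS: CARDIAC EMERGENCY"),
    ("2015-12-11 08:15:00", "19401", "Traffic: VEHICLE ACCIDENT"),
]


def summarize_calls(records):
    """Count calls by reason, by zip code, and by (weekday, hour)."""
    by_reason = Counter()
    by_zip = Counter()
    by_weekday_hour = Counter()
    for timestamp, zip_code, title in records:
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        # The reason is the part of the title before the first colon.
        reason = title.split(":", 1)[0].strip()
        by_reason[reason] += 1
        by_zip[zip_code] += 1
        # These (weekday, hour) counts are what a heatmap would display.
        by_weekday_hour[(WEEKDAYS[when.weekday()], when.hour)] += 1
    return by_reason, by_zip, by_weekday_hour


by_reason, by_zip, by_weekday_hour = summarize_calls(calls)
print(by_reason.most_common())
print(by_zip.most_common(1))
print(sorted(by_weekday_hour.items()))
```

The `(weekday, hour)` counter is the same table a heatmap would visualise, with weekdays on one axis and hours on the other.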

kjdarthvader/Portfolio
