💭
On my way to becoming a kick-ass Data Scientist
AlokTheDataGuy/README.md

💫 About Me:

Hi, I'm Alok Deep, an MCA (Data Science) postgraduate passionate about Artificial Intelligence, Machine Learning, Data Science, and Generative AI systems. I focus on building end-to-end AI applications, including LLM-powered systems, Retrieval-Augmented Generation (RAG), and intelligent data-driven solutions using Python, SQL, and modern AI frameworks.


🚀 Skills & Expertise

🤖 Artificial Intelligence & Machine Learning

  • Generative AI (LLMs, RAG, Prompt Engineering, Multi-modal AI)
  • NLP: Transformers, BERT, Text Classification, Sentiment Analysis
  • Retrieval Systems: Vector Databases (FAISS, ChromaDB), Semantic Search
  • Time-Series Forecasting, Anomaly Detection
  • Recommendation Systems, Clustering, Segmentation
  • Model Evaluation, Feature Engineering, Optimization

🧠 Machine Learning & Data Science

  • Exploratory Data Analysis (EDA), Statistical Analysis, A/B Testing
  • Predictive Modeling (Regression, Classification, Time Series)
  • Data Cleaning, Feature Engineering, Outlier Detection
  • Experiment design and insight generation

⚙️ AI Systems & Data Engineering

  • End-to-end AI system design (RAG pipelines, LLM integration)
  • ETL Pipelines, Data Processing, Workflow automation
  • API-based ML deployment (FastAPI, Flask)
  • Handling structured and unstructured data

🛠️ Tools & Technologies

  • Languages: Python, SQL
  • AI/ML: Scikit-learn, TensorFlow, PyTorch, Hugging Face
  • LLM Stack: LangChain, LangGraph, Ollama, Vector DBs
  • Data Engineering: Airflow, Docker
  • Databases: MySQL, PostgreSQL, MongoDB

🌐 Web & Backend (AI Application Development)

(Used for building and deploying AI-powered applications)

  • Frontend: React, Tailwind
  • Backend: FastAPI, Flask, Node.js
  • Deployment: Vercel, Render, Docker

💻 Tech Stack:

Python, NumPy, Pandas, MySQL, Excel, Power BI, Tableau, Power Query, Power Pivot, LaTeX, Git, React, Node.js, Express.js, MongoDB, Tailwind CSS, CSS3, JavaScript, Flask, NLP, Computer Vision, Scikit-learn, TensorFlow, Keras, OpenCV, Hugging Face, PyTorch, VS Code, Postman, Docker, Vercel, Render, Canva


📜 Certifications

  • Product Analytics - Mixpanel (2026)
  • Alteryx Designer Core Certification (2026)
  • SQL (Advanced) Certification - HackerRank (2025)
  • Complete Data Science, Machine Learning, DL, NLP Bootcamp - Udemy (Feb. 2025)
  • Data Engineering Foundations Professional Certificate by Astronomer - LinkedIn Learning (Apr. 2025)
  • The Web Developer Bootcamp - Udemy (Feb. 2023)

🌐 Connect with Me

LinkedIn | GitHub | Portfolio

🚀 Always open to building impactful AI systems and collaborating on real-world projects!

Pinned

  1. DocSense-Privacy-First-RAG-for-Enterprise-Documents (Public)

    A dual-mode RAG system for querying financial reports, 10-Ks, and strategy decks — runs fully offline for sensitive workloads, or cloud-deployed for demos and non-sensitive use cases.

    Python

  2. ShareChat-Content-Engagement-Analytics (Public)

    A complete, end-to-end product analytics system built to demonstrate core analytics skills: SQL depth, metric design, cohort thinking, A/B test evaluation, and analytical storytelling.

    Python

  3. India-Foodgrain-Stocks-Analytics (Public)

    A comprehensive end-to-end data analytics project analyzing India's foodgrain stocks across 26 states and 177 districts from 2010-2025. The project includes data collection from India's Open Govern…

    Jupyter Notebook

  4. nlp_projects (Public)

    Recreations of multiple chatbots and NLP-based projects that I completed during my internship. Each project demonstrates different aspects of AI application development, from text summarization to m…

    Python

  5. Data-Science-Jobs-Analytics (Public)

    This project analyzes 9,000+ Data Science job postings across India and visualizes the Indian Data Science job market.

    Jupyter Notebook

  6. arXiv-cs-expert-chatbot (Public)

    A domain-specific chatbot for arXiv Computer Science papers with some NLP capabilities.

    Python