Open-source course by Decoding ML in collaboration with Hopsworks.
This hands-on course teaches you how to build and deploy a real-time personalized recommender system for H&M fashion articles. You'll learn:
- To architect a modern ML system for real-time personalized recommenders.
- To do feature engineering using modern tools such as Polars.
- To design and train ML models for recommender systems powered by neural networks.
- To use MLOps best practices by leveraging Hopsworks AI Lakehouse.
- To deploy the recommender on a Kubernetes cluster managed by Hopsworks Serverless using KServe.
- To apply LLM techniques for personalized recommendations.
This course is part of Decoding ML's open-source series, where we provide free hands-on resources for building GenAI and recommender systems.
The Hands-on H&M Real-Time Personalized Recommender, in collaboration with Hopsworks, is a 5-module course backed up by code, Notebooks and lessons that will teach you how to build an H&M real-time personalized recommender from scratch.
By the end of this course, you will know how to architect, build and deploy a modern recommender.
What you'll do:
- Architect a scalable and modular ML system using the Feature/Training/Inference (FTI) architecture.
- Engineer features on top of our H&M data for collaborative and content-based filtering techniques for recommenders.
- Use the two-tower network to create user and item embeddings in the same vector space.
- Implement an H&M real-time personalized recommender using the 4-stage recommender design and a vector database.
- Use MLOps best practices, such as a feature store and a model registry.
- Deploy the online inference pipeline to Kubernetes using KServe.
- Deploy the offline ML pipelines to GitHub Actions.
- Implement a web interface using Streamlit.
- Improve the H&M real-time personalized recommender using LLMs.
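To make the Feature/Training/Inference (FTI) split more concrete, here is a minimal sketch in plain Python. It is not the course implementation: the `feature_store` and `model_registry` dictionaries are hypothetical in-memory stand-ins for a real feature store and model registry, and the "model" is a toy popularity ranking rather than a neural network. The point is only that the three pipelines communicate through shared storage instead of calling each other directly.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for a feature store and a model registry.
feature_store = {}
model_registry = {}


def feature_pipeline(transactions):
    """Turn raw (customer_id, article_id) purchases into per-customer features."""
    purchases = defaultdict(list)
    for customer_id, article_id in transactions:
        purchases[customer_id].append(article_id)
    feature_store["customer_purchases"] = dict(purchases)


def training_pipeline():
    """Read features, fit a toy popularity model, and register it."""
    counts = defaultdict(int)
    for articles in feature_store["customer_purchases"].values():
        for article_id in articles:
            counts[article_id] += 1
    # Most purchased articles first; ties broken by article id for determinism.
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    model_registry["popularity_v1"] = ranked


def inference_pipeline(customer_id, k=2):
    """Load the model and features, then recommend popular unseen articles."""
    ranked = model_registry["popularity_v1"]
    seen = set(feature_store["customer_purchases"].get(customer_id, []))
    return [a for a in ranked if a not in seen][:k]


# Made-up toy data, only for illustration.
transactions = [("c1", "a1"), ("c2", "a1"), ("c2", "a2"), ("c3", "a3"), ("c3", "a2")]
feature_pipeline(transactions)
training_pipeline()
recommendations = inference_pipeline("c1")
print(recommendations)
```

Because each pipeline only reads from and writes to the shared stores, each one can be scheduled, scaled, and deployed independently.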
🥷 With these skills, you'll become a ninja in building real-time personalized recommenders.
Try out our deployed H&M real-time personalized recommender to see what you'll learn to build by the end of this course: 💻 Live H&M Recommender Streamlit Demo
Important
The demo is in 0-cost mode, which means that when there is no traffic, the deployment scales to 0 instances. The first time you interact with it, give it 1-2 minutes to warm up to 1+ instances. Afterward, everything will become smoother.
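If you call a scale-to-zero deployment from your own code, a retry loop with backoff is one way to ride out the warm-up period. The sketch below is an assumption about how a client could behave, not part of the course code: `call_with_warmup` and `fake_request` are hypothetical names, and the fake service simply fails twice before answering.

```python
import time


def call_with_warmup(request_fn, max_attempts=5, initial_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying with exponential backoff on ConnectionError."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(delay)
            delay *= 2  # wait longer before each new attempt


# Demo with a fake service that fails twice before it is "warm".
calls = {"count": 0}


def fake_request():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("deployment is still scaling up")
    return "recommendations ready"


delays = []  # record the waits instead of really sleeping
result = call_with_warmup(fake_request, sleep=delays.append)
print(result, delays)
```

For a real deployment that needs 1-2 minutes to warm up, you would likely pass a larger `initial_delay` or more attempts.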
This course is ideal for:
- ML/AI engineers interested in building production-ready recommender systems
- Data Engineers, Data Scientists, and Software Engineers wanting to understand the engineering behind recommenders
Note: This course focuses on engineering practices and end-to-end system implementation rather than theoretical model optimization or research.
Category | Requirements |
---|---|
Skills | Basic understanding of Python and Machine Learning |
Hardware | Any modern laptop/workstation will do the job (no GPU or high-end hardware required). We also support Google Colab or GitHub Actions for compute. |
Level | Intermediate |
All tools used throughout the course will stick to their free tier, except OpenAI's API, as follows:
- Modules 1-4: Completely free
- Module 5 (Optional): ~$1-2 for OpenAI API usage when building LLM-enhanced recommenders
As an open-source course, you don't have to enroll. Everything is self-paced and free of charge, and all resources are freely accessible:
- code: this GitHub repository
- articles: Decoding ML
This open-source course consists of 5 comprehensive modules covering theory, system design, and hands-on implementation.
Our recommendation for each module:
- Read the article
- Run the Notebook to replicate our results (locally or on Colab)
- Following the Notebook, go deeper into the code by reading the `recsys` Python module
Note
Check the INSTALL_AND_USAGE doc for a step-by-step installation and usage guide.
Module | Article | Description | Notebooks |
---|---|---|---|
1 | Building a TikTok-like recommender | Learn how to architect a recommender system using the 4-stage architecture and two-tower network. | No code |
2 | Feature pipelines for TikTok-like recommenders | Learn how to build a scalable feature pipeline using a feature store. | 1_fp_computing_features.ipynb |
3 | Training pipelines for TikTok-like recommenders | Learn to train and evaluate the two-tower network and ranking model using MLOps best practices. | 2_tp_training_retrieval_model.ipynb, 3_tp_training_ranking_model.ipynb |
4 | Deploy scalable TikTok-like recommenders | Learn how to architect and deploy the inference pipelines for real-time recommendations using the 4-stage design. | 4_ip_computing_item_embeddings.ipynb, 5_ip_creating_deployments.ipynb, 6_scheduling_materialization_jobs.ipynb |
5 | Building personalized real-time recommenders with LLMs | Learn how to enhance recommendations with LLMs. | 7_ip_creating_deployments_llm_ranking.ipynb |
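As a rough preview of the 4-stage design covered in the modules, the sketch below chains four steps in plain Python. It assumes the four stages are retrieval, filtering, ranking, and ordering, which is how the design is commonly described. The embeddings, popularity values, and scoring rule are made up for illustration: in the course, the embeddings come from the trained two-tower network, retrieval runs against a vector database, and ranking uses a trained model.

```python
def dot(u, v):
    """Dot product, used as the user-item similarity in a shared vector space."""
    return sum(a * b for a, b in zip(u, v))


# Made-up item embeddings and popularity scores (illustrative values only).
ITEM_EMBEDDINGS = {
    "jeans": [0.9, 0.1],
    "t-shirt": [0.8, 0.3],
    "dress": [0.1, 0.9],
    "jacket": [0.7, 0.2],
}
ITEM_POPULARITY = {"jeans": 0.9, "t-shirt": 0.2, "dress": 0.5, "jacket": 0.6}


def retrieve(user_embedding, n):
    """Stage 1: fetch the n items most similar to the user embedding."""
    by_similarity = sorted(
        ITEM_EMBEDDINGS,
        key=lambda item: dot(user_embedding, ITEM_EMBEDDINGS[item]),
        reverse=True,
    )
    return by_similarity[:n]


def filter_seen(candidates, purchased):
    """Stage 2: drop items the user has already bought."""
    return [item for item in candidates if item not in purchased]


def rank(user_embedding, candidates):
    """Stage 3: score candidates with a toy rule standing in for a ranking model."""
    return {
        item: dot(user_embedding, ITEM_EMBEDDINGS[item]) + 0.5 * ITEM_POPULARITY[item]
        for item in candidates
    }


def order(scores, k):
    """Stage 4: sort by score and keep the top k."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


user_embedding = [1.0, 0.0]
candidates = retrieve(user_embedding, n=3)
candidates = filter_seen(candidates, purchased={"jeans"})
top_items = order(rank(user_embedding, candidates), k=2)
print(top_items)
```

The cheap similarity search narrows the catalog to a few candidates, so the more expensive ranking step only has to score a short list.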
To run the Notebooks in Google Colab, copy-paste them into your Google Drive, open them, and run them. Our setup steps will prepare the Python environment automatically.
At Decoding ML, we teach how to build production ML systems, so the course follows the structure of a real-world Python project:
.
├── notebooks/ # Jupyter notebooks for each pipeline
├── recsys/ # Core recommender system package
│ ├── config.py # Configuration and settings
│ ...
│ └── training/ # Training pipelines code
├── tools/ # Utility scripts
├── streamlit_app.py # Streamlit app entry point
├── .env.example # Example environment variables template
├── Makefile # Commands to install and run the project
├── pyproject.toml # Project dependencies
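To give a feel for how a settings module and an environment variables template usually fit together, here is a minimal sketch using only the standard library. It is not the actual content of `recsys/config.py`: the `Settings` class, the `load_settings` helper, and the variable names `HOPSWORKS_API_KEY` and `OPENAI_API_KEY` are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; field names are illustrative assumptions."""

    hopsworks_api_key: str
    openai_api_key: Optional[str] = None  # only needed for the optional LLM module


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing early if one is missing."""
    api_key = env.get("HOPSWORKS_API_KEY")
    if not api_key:
        raise ValueError("HOPSWORKS_API_KEY is not set; see .env.example")
    return Settings(
        hopsworks_api_key=api_key,
        openai_api_key=env.get("OPENAI_API_KEY") or None,
    )


# Demo with an explicit mapping instead of the real process environment.
settings = load_settings({"HOPSWORKS_API_KEY": "dummy-key"})
print(settings)
```

Keeping all configuration in one typed object means the rest of the package never reads environment variables directly.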
We will use the H&M Personalized Fashion Recommendations dataset, available on Kaggle, open-source for academic research and education.
It is an e-commerce dataset that contains fashion articles from the H&M clothes brand.
It contains:
- 105k articles
- 1.37 million customers
- 31 million transactions
More on the dataset in the feature engineering pipeline Notebook and article.
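As a small taste of the kind of feature computation involved, the sketch below derives per-customer features from transaction rows. It uses only the standard library (not Polars, which the course uses) so that it runs anywhere. The column names follow the Kaggle transactions file, while the rows themselves and the chosen features are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows using the column layout of the Kaggle transactions file.
SAMPLE_CSV = """t_dat,customer_id,article_id,price,sales_channel_id
2020-09-01,c1,0108775015,0.25,2
2020-09-02,c1,0108775044,0.75,2
2020-09-02,c2,0108775015,0.25,1
"""


def customer_features(csv_text):
    """Compute the purchase count and average price for each customer."""
    totals = defaultdict(lambda: {"n_purchases": 0, "total_price": 0.0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        stats = totals[row["customer_id"]]
        stats["n_purchases"] += 1
        stats["total_price"] += float(row["price"])
    return {
        customer_id: {
            "n_purchases": stats["n_purchases"],
            "avg_price": stats["total_price"] / stats["n_purchases"],
        }
        for customer_id, stats in totals.items()
    }


features = customer_features(SAMPLE_CSV)
print(features)
```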
For detailed installation and usage instructions, see our INSTALL_AND_USAGE guide.
Recommendation: While you can follow the installation guide directly, we strongly recommend reading the accompanying articles to gain a complete understanding of the recommender system.
Have questions or running into issues? We're here to help!
Open a GitHub issue for:
- Questions about the course material
- Technical troubleshooting
- Clarification on concepts
If you have issues with Hopsworks Serverless, the best place to ask questions is Hopsworks's Slack, where their engineers can help you directly.
As an open-source course, we may not be able to fix all the bugs that arise.
If you find any bugs and know how to fix them, support future readers by contributing to this course with your bug fix.
We will deeply appreciate your support for the AI community and future readers 🤗
Hopsworks |
Paul Iusztin AI/ML Engineer |
Anca Ioana Muscalagiu AI/ML Engineer |
Paolo Perrone AI/ML Engineer |
Hopsworks's Engineering Team AI Lakehouse |
This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge that your project is based on our work, you can safely clone or fork this project and use it as a source of inspiration for your educational projects (e.g., university, college degree, personal projects, etc.).