A community-curated list of helpful resources and materials about Federated Learning and PETs (privacy-enhancing technologies), contributed as part of the #30DaysOfFLCode Challenge by OpenMined.
Two main rules:
- Study Federated Learning (and/or any other PETs) for at least 1 hour/day for 30 days
- Share your progress daily by posting on social media using #30DaysOfFLCode and engage with other participants.
Publicly commit to the challenge: hold yourself accountable by making a public statement that you intend to participate in the program.
Discover more on www.30DaysOfFLCode.com.
We welcome contributions! Please follow these steps to contribute:
- Fork this repository
- Add your resource(s)
- Submit a pull request
Find all the information and instructions on how to contribute in CONTRIBUTING.md.
Please find below all the contributed resources, organised by category.
🛠️ Tools
- SyftBox | #30DaysOfFLCode - The new project by OpenMined that aims to make privacy-enhancing technologies more accessible and user-friendly for developers.
  - SyftBox Computational Model - How computation works on SyftBox, in a nutshell.
  - Federated CPU Tracker Member (part 1) - An example SyftBox API that monitors local CPU usage and shares a private/sanitized version of the data within the SyftBox federated network.
  - Federated CPU Tracker Leader (part 2) - A SyftBox API that aggregates CPU data from all members contributing to the computation, and creates a live visualization dashboard.
  - Getting Started with Federated Learning on SyftBox - A complete federated learning workflow for MNIST digit classification using SyftBox.
  - Ring Computation Walkthrough: Calculating An Average Across Nodes - A brief walkthrough on creating a Ring Computation on SyftBox that computes the average value from nodes.
- OpenVector - CoFHE (Collaborative-Fully Homomorphic Encryption). Confidential compute primitive that is 100x faster than FHE. [GitHub repo not found for this tool]
- PanzaMail - Panza is an automated email assistant customized to your writing style and past email history, trained without ever sharing your sensitive data.
- Secure XGBoost - Secure XGBoost is a library for collaboratively training XGBoost models in untrusted cloud environments using secure hardware enclaves.
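The ring computation mentioned in the SyftBox walkthrough above can be pictured with a small simulation. This is only a sketch in plain Python: it does not use the SyftBox API, the function name `ring_average` is hypothetical, and it assumes each node simply adds its value to a running total that is passed to the next node. It adds no privacy protection by itself.

```python
def ring_average(local_values):
    """Simulate passing a running (sum, count) pair around a ring of nodes.

    Hypothetical helper for illustration only; not part of SyftBox.
    """
    running_sum, count = 0.0, 0
    for value in local_values:  # each iteration models one hop to the next node
        running_sum += value    # the node adds its own local value
        count += 1              # and records that it has contributed
    if count == 0:
        raise ValueError("the ring needs at least one node")
    # The last node (or the initiator) turns the total into an average.
    return running_sum / count


node_values = [10.0, 20.0, 60.0]  # made-up values held by three nodes
average = ring_average(node_values)
print(average)
```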
👤 Differential Privacy
- Privacy-Preserving Retrieval Augmented Generation with Differential Privacy - The first paper to explore RAG (Retrieval Augmented Generation) with Differential Privacy.
- Tutorial on Differential Privacy - Katrina Ligett, California Institute of Technology - Big Data and Differential Privacy
- Tutorial: Differential Privacy and Learning: The Tools, The Results, and The Frontier - Katrina Ligett, California Institute of Technology - NeurIPS tutorial 2014
- A Course In Differential Privacy - A course on Differential Privacy by Gautam Kamath, Assistant Professor at the University of Waterloo's Cheriton School of Computer Science
- Programming Differential Privacy - An open source book about differential privacy, for programmers.
- The Algorithmic Foundations of Differential Privacy - A foundational text that delves into the theoretical aspects of differential privacy, exploring its principles and practical applications in safeguarding individual data.
🔏 Homomorphic Encryption
- Introduction to Homomorphic Encryption by Zama - This article provides a good introduction to Homomorphic Encryption, with several demo examples on HuggingFace and DeepDives attached.
- FHE.org Resources - Compiled resources on homomorphic encryption
📡 Federated Learning
- From Centralised to Decentralised Training: An Intro to Federated Learning - A Jupyter Notebook tutorial that gives a practical overview, with code examples, of the foundational concepts in federated learning. This tutorial was written by Andrej Jovanović, Sree Harsha Nelaturu and Luca Powell and presented at the 2024 iteration of the Deep Learning Indaba.
- Collection of Tutorials in Federated Learning from TensorFlow - A collection of TensorFlow Colab notebooks designed to provide practical examples of Federated Learning, covering concepts from basic to advanced levels.
- Federated Learning for Credit Scoring - A detailed blog post exploring the application of federated learning to credit scoring. It discusses key concepts such as silo vs. device architectures, horizontal vs. vertical federated learning, and non-IID data challenges, and includes source code examples.
- Flower Framework Tutorials - This webpage contains Flower Framework tutorials on Federated Learning, quickstart tutorials, how-to guides and reference docs.
- FEDn Tutorials by Scaleout Systems - A collection of introductory tutorials on FEDn: setting up a FEDn project, using the FEDn API client, the developer guide and much more.
- NVFlare by NVIDIA - A great collection of tutorials, in the form of a catalog, on Federated Learning using NVFlare by NVIDIA.
- Tutorials on FATE (Federated AI Technology Enabler) - A collection of tutorials on the FATE framework (an industrial-grade FL framework): quickstart, pipelines, ML tutorials and much more.
- OpenFL Running the Federation Tutorials - These tutorials use the Jupyter Lab server to understand the APIs used in Open Federated Learning (OpenFL).
- SubstraFL Tutorials by Owkin - These tutorials are for getting started with SubstraFL, an open-source Federated Learning framework developed by Owkin Research with a focus on healthcare.
- FedML Tutorials by TensorOpera (Previously FEDML) - Getting started tutorials on Federated Learning for FedML by TensorOpera. FedML is a library for large-scale distributed training, model serving, and federated learning.
- FedLab Tutorials - Tutorials for FedLab (a flexible Federated Learning framework based on PyTorch) by SMILELab-FL. FedLab aims to standardize the FL simulation procedure, including synchronous algorithms, asynchronous algorithms and communication compression.
- PFL Tutorials by Apple - A collection of tutorial notebooks for pfl, a simulation framework by Apple Research for accelerating research in Private Federated Learning.
- An online comic on Federated Learning, by Google AI - Google AI came up with this fun online comic on Federated Learning which is a great resource for beginners starting their journey in the field of Federated Learning. Great for building up the motivation to learn FL.
- Communication-Efficient Learning of Deep Networks from Decentralized Data - The first paper on Federated Learning, by McMahan et al. (Google). This paper introduces the term Federated Learning, runs experiments on the MNIST and CIFAR datasets, and also introduces the famous aggregation algorithm FederatedAveraging (FedAvg). Check out the blog based on the paper here: Federated Learning: Collaborative Machine Learning without Centralized Training Data
- Federated Learning: Challenges, Methods, and Future Directions - Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. In this article, we discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.
- Advances and Open Problems in Federated Learning - Federated learning (FL) is a machine learning setting where many clients collaboratively train a model under the orchestration of a central server, while keeping the training data decentralized. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.
- User-Empowered Federated Learning in the Automotive Domain - This paper proposes a User-Empowered FL approach, built upon the Flower Framework, implemented in an Android Automotive app. Source code is available here. Please note that the paper is not open access and requires an IEEE subscription or institutional login.
- Federated Learning and Privacy - This article provides a brief introduction to key concepts in federated learning and analytics with an emphasis on how privacy technologies may be combined in real-world systems and how their use charts a path toward societal benefit from aggregate statistics in new domains and with minimized risk to individuals and to the organizations who are custodians of the data.
- Import AI 393: 10B distributed training run; China VS the chip embargo; and moral hazards of AI development - Interesting article about the future of decentralised training.
- Federated Learning on Non-IID Data Silos: An Experimental Study - This study introduces the first comprehensive benchmark with diverse data partitioning strategies to systematically evaluate FL algorithms under non-IID settings, providing valuable insights for future research. Source code: here.
- Secure Multiparty Computation, Yehuda Lindell, 2020 - A great overview on the research/technical side of secure multi-party computation.
- How Federated Learning Protects Privacy - The PAIR (People + AI Research) team at Google has published this engaging article that explains how Federated Learning protects privacy. It features clear visuals and GIFs to help you better understand the concept and its applications in real-world scenarios.
- RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response - This pioneering algorithm enables privacy-preserving data collection through randomized response techniques. It allows statistical analysis without compromising sensitive data.
- Federated Learning @ DeepLearning.AI - An introductory course on federated learning delivered by DeepLearning.AI in collaboration with Flower.
- Federated Fine-tuning of LLMs with Private Data @ DeepLearning.AI x Flower Labs - Part 2 of the Intro to Federated Learning course delivered by DeepLearning.AI and Flower Labs.
- Federated Learning Tutorial @ NeurIPS 2020 - A tutorial on Federated Learning presented at NeurIPS 2020.
- Federated learning course at Aalto University - A master's-level Federated Learning course from Aalto University.
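For readers new to the aggregation algorithm introduced in the McMahan et al. paper listed above, the server-side averaging step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's code or any framework's API: model weights are flat lists of floats, the function name `fed_avg` is hypothetical, and the local training that each client performs before sending its weights is omitted.

```python
def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by local dataset size.

    Hypothetical helper for illustration; covers only the aggregation step.
    """
    total = sum(client_sizes)
    if total == 0:
        raise ValueError("clients must hold at least one example in total")
    dim = len(client_weights[0])
    # Each coordinate of the global model is the size-weighted mean
    # of the corresponding coordinate across clients.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


client_weights = [[1.0, 2.0], [3.0, 4.0]]  # made-up weights from two clients
client_sizes = [1, 3]                       # made-up local dataset sizes
global_weights = fed_avg(client_weights, client_sizes)
print(global_weights)
```

Weighting by dataset size means a client with more examples pulls the global model further toward its own update; with equal sizes this reduces to a plain mean.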
🎮 Games & Simulations
- DP Vision - Test your image recognition skills with differentially private images! Players manage a privacy budget to reveal image details, aiming to identify the correct image within 5 guesses while minimizing privacy loss.
- Guess Who (DP Edition) - A privacy-preserving twist on the classic game where players ask yes/no questions with adjustable accuracy levels. Lower epsilon means less reliable but more private answers, teaching the privacy-utility tradeoff.
- WORDPL - A Wordle-style game with differential privacy mechanics. Players guess 5-letter words while managing privacy budgets that affect the accuracy of feedback, demonstrating how DP noise impacts information gathering.
- Federated Learning Hyperparam Tuning Game - Understand and play with federated learning hyperparams! In-browser TensorFlow.js simulation of FedAvg to understand and gain intuition about IID and Non-IID Federated Learning settings.
- Differentially Private Tetris - A unique twist on classic Tetris where players manage a privacy budget to reveal blocks, demonstrating differential privacy concepts through gameplay. Experience privacy-utility tradeoffs in an engaging way.
- The Unlearning Protocol - An interactive game exploring machine unlearning and fairness concepts. Players select data points that least impact the dataset, providing hands-on experience with data removal and model fairness considerations.
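The privacy-utility tradeoff that several of these games demonstrate ("lower epsilon means less reliable but more private answers") can be illustrated with binary randomized response. The sketch below is an assumption about how such a yes/no mechanism could work, not the implementation used by any of the games; the function names are hypothetical. A truthful answer is returned with probability e^ε / (1 + e^ε), and the opposite answer otherwise.

```python
import math
import random


def truth_probability(epsilon):
    """Probability of answering truthfully under binary randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(true_answer, epsilon, rng=random):
    """Return the true yes/no answer with probability e^eps / (1 + e^eps),
    and the flipped answer otherwise. Hypothetical helper for illustration."""
    if rng.random() < truth_probability(epsilon):
        return true_answer
    return not true_answer


# Lower epsilon: answers are closer to a coin flip (more private, less useful).
for eps in (0.0, 1.0, 5.0):
    print(eps, round(truth_probability(eps), 3))
```

At ε = 0 the answer carries no information about the truth, while at large ε it is almost always truthful.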
🛡️ Multiple PETs and Others
- Beyond Privacy Trade-offs with Structured Transparency - Structured Transparency: a five-part framework to combine multiple PETs, such as secure computation and federated learning, to maximise their value, and to reduce lingering use-misuse trade-offs in multiple domains.
- The Private AI Series - Learn how privacy technology is changing our world and how you can lead the charge.
- Secure and Private AI - Learn skills to build AI systems that prioritize security and privacy using cutting-edge techniques. The course introduces tools and methods for securely handling sensitive data in AI applications, including Federated Learning, Differential Privacy, and Encrypted Computation.
- Privacy Preserving AI (Andrew Trask) | MIT Deep Learning Series - Lecture by Andrew Trask in January 2020, part of the MIT Deep Learning Lecture Series.
- Multi Party Computation Concepts - A beginner-friendly series of videos about MPC.
- Optimization Algorithms for Distributed Machine Learning - Textbook discussing state-of-the-art optimization algorithms for distributed/federated machine learning. Access to materials requires a subscription.
- Data Privacy Handbook - A short guide from Utrecht University on data privacy regulations and classical techniques for making data private, such as de-identification through omission and other statistical methods.