This repository contains the work I did for the Trustworthy Machine Learning course at the University of Helsinki. The repository is organized as follows:
- In the directory \papers there are the papers we had to read during the course, covering interesting topics in privacy and fairness in machine learning;
- In the directory \firstProject there is the first project, related to the privacy part of the course. It contains implementations of important topics in differential privacy, including examples of randomized response and the Laplace mechanism;
- In the directory \secondProject there is the second project, also related to the privacy part of the course. It contains more complex implementations of differentially private algorithms. The first task implements differentially private stochastic gradient descent (DP-SGD) using TensorFlow Privacy. In the second task the same algorithm is applied to the Census Adult Dataset (in the first task it was applied to synthetic data). Finally, the third task analyzes the membership inference attack; in particular, I studied how the power of the attack grows as the target model overfits during training;
- In the directory \thirdProject there is the project related to the fairness part of the course. In this project we took the Census Adult Dataset and measured its fairness. After observing a discrimination bias in the dataset, we implemented the preferential resampling fairness-aware strategy to reduce the bias in the historical data that makes up the Census Adult Dataset. We then trained a logistic regression model on the resampled dataset, obtaining a fairer model. The project suggests that if a model is trained on a fair dataset, its predictions remain reasonably fair even when the data the model is applied to are biased.
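The two basic mechanisms covered in \firstProject can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's actual code; the function names and the debiasing step are my own:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_response(true_value: bool, epsilon: float) -> bool:
    """Answer truthfully with probability e^eps / (e^eps + 1), flip otherwise.
    A single binary answer released this way satisfies epsilon-DP."""
    p_truth = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    return true_value if rng.random() < p_truth else (not true_value)

def laplace_mechanism(query_result: float, sensitivity: float, epsilon: float) -> float:
    """Release query_result + Laplace(0, sensitivity / epsilon) noise."""
    return query_result + rng.laplace(0.0, sensitivity / epsilon)

# Randomized response: collect noisy answers, then debias the observed rate.
epsilon = 1.0
true_answers = [True] * 700 + [False] * 300          # true "yes" rate = 0.7
responses = [randomized_response(a, epsilon) for a in true_answers]
p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
observed_rate = np.mean(responses)
estimated_rate = (observed_rate - (1.0 - p)) / (2.0 * p - 1.0)

# Laplace mechanism: privatize a counting query (sensitivity 1).
noisy_count = laplace_mechanism(700.0, sensitivity=1.0, epsilon=epsilon)
```

The debiasing step inverts the known flip probability, so the aggregate estimate stays accurate even though each individual answer is noisy.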
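The core of the DP-SGD algorithm used in \secondProject (per-example gradient clipping followed by Gaussian noise) can be sketched in plain NumPy on a toy logistic regression. This is an illustrative re-implementation of the idea, not the TensorFlow Privacy code used in the project, and the hyperparameters are arbitrary:

```python
import numpy as np

def dp_sgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD step for logistic regression: clip every per-example
    gradient to clip_norm, sum the clipped gradients, add Gaussian noise
    scaled to the clip norm, then average over the batch."""
    clipped = []
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + np.exp(-xi @ w))
        g = (p - yi) * xi                                # per-example gradient
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)  # clip to clip_norm
        clipped.append(g)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    return w - lr * (np.sum(clipped, axis=0) + noise) / len(X)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)      # labels depend only on the first feature
w = np.zeros(2)
for _ in range(200):
    w = dp_sgd_step(w, X, y, lr=0.5, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
accuracy = np.mean(((X @ w) > 0) == (y > 0.5))
```

Clipping bounds each individual's influence on the update, which is what lets the added Gaussian noise translate into a formal (epsilon, delta)-DP guarantee via the moments accountant.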
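The resampling idea in \thirdProject can be sketched as follows. This is a simplified, uniformly random variant of preferential resampling (the full preferential strategy picks which examples to duplicate or drop based on a ranker's scores); the toy data and column names are illustrative, not taken from the Census Adult Dataset:

```python
import numpy as np
import pandas as pd

def resample_to_equal_rates(df, group_col, label_col, rng):
    """Resample each (group, label) cell so every group ends up with the
    overall positive-outcome rate, keeping each group's size fixed.
    Simplified variant: cells are up/down-sampled uniformly at random."""
    target = df[label_col].mean()
    parts = []
    for _, sub in df.groupby(group_col):
        n_pos = int(round(target * len(sub)))
        pos_idx = sub.index[sub[label_col] == 1]
        neg_idx = sub.index[sub[label_col] == 0]
        parts.append(df.loc[rng.choice(pos_idx, size=n_pos, replace=True)])
        parts.append(df.loc[rng.choice(neg_idx, size=len(sub) - n_pos, replace=True)])
    return pd.concat(parts, ignore_index=True)

rng = np.random.default_rng(0)
# Toy biased data: group A has an 80% positive rate, group B only 20%.
df = pd.DataFrame({
    "group": ["A"] * 100 + ["B"] * 100,
    "income": [1] * 80 + [0] * 20 + [1] * 20 + [0] * 80,
})
fair = resample_to_equal_rates(df, "group", "income", rng)
rates = fair.groupby("group")["income"].mean()
```

After resampling, both groups have the same positive rate, so a model trained on the new dataset no longer learns the group membership as a proxy for the outcome.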