The following projects are available for Master's and Bachelor's students. Each project is carried out in close collaboration with an experienced member of the lab. To apply, send an email to the contact listed for the project.
You may also suggest a new project, ideally one close to our ongoing or previously completed projects. In that case, you will have to convince Anne-Marie Kermarrec that it is worthwhile and of reasonable scope, and that someone in the lab can mentor you!
Projects available for Spring 2024.
Privacy-preserving personalized decentralized learning
Master’s Thesis or MSc semester project
Contact: Sayan Biswas ([email protected])
Decentralized learning (DL) is an emerging collaborative framework that enables nodes in a network to train machine learning models without sharing their private datasets or relying on a centralized entity (e.g., a server) [1].
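To make the setting concrete, here is a minimal sketch of D-PSGD-style decentralized training: each node runs SGD on its own private data and only gossip-averages model parameters with its neighbours on a communication graph, never exchanging raw data. The ring topology, linear-regression task, and all numeric values are illustrative assumptions, not part of any specific algorithm from the project.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim = 4, 3

# Ring topology: each node averages uniformly with itself and its two neighbours.
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    for j in (i - 1, i, i + 1):
        W[i, j % n_nodes] = 1.0 / 3.0

# Each node holds a private dataset for a shared linear-regression task.
X = [rng.normal(size=(20, dim)) for _ in range(n_nodes)]
true_w = np.array([1.0, -2.0, 0.5])
y = [x @ true_w + 0.01 * rng.normal(size=20) for x in X]

models = [np.zeros(dim) for _ in range(n_nodes)]
lr = 0.05
for step in range(200):
    # Local SGD step on each node's private data.
    grads = [x.T @ (x @ w - t) / len(t) for w, x, t in zip(models, X, y)]
    models = [w - lr * g for w, g in zip(models, grads)]
    # Gossip averaging with neighbours: only parameters cross the network.
    models = list(W @ np.stack(models))
```

After enough rounds, the nodes reach consensus on a model close to the shared optimum, without any node ever seeing another's data.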
However, the growing heterogeneity of data used in model training, alongside recent incidents—such as Facebook mislabeling Black men as primates [2] and facial-analysis software exhibiting a 0.8% error rate for light-skinned men compared to a 34.7% error rate for dark-skinned women [3]—has exposed the lack of minority representation in current ML models. This has underscored the pressing need for training personalized ML models that account for the diverse data distributions and attributes of various communities.
While personalized model training has recently gained attention in centralized ML and Federated Learning (FL), it remains relatively underexplored in DL. Moreover, the limited research on personalized DL has primarily focused on fairness and efficiency [4].
However, personalized model training raises concerns about potential privacy risks. Models trained on data from different communities may inadvertently leak sensitive information, making it easier for adversaries to identify members of minority groups and compromise their privacy.
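One of the simplest privacy-invasive attacks mentioned above, membership inference, can be sketched as a loss-threshold test: training points tend to incur lower loss than unseen points, so an attacker guesses "member" whenever the loss falls below a threshold. The simulated loss distributions and the threshold below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated per-example losses: members (training data) typically have
# lower loss than non-members. Scales are illustrative.
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)

losses = np.concatenate([member_losses, nonmember_losses])
is_member = np.concatenate([np.ones(1000), np.zeros(1000)])

# Attack: predict "member" when the loss is below a chosen threshold.
threshold = 0.5
guess = (losses < threshold).astype(float)
accuracy = (guess == is_member).mean()  # noticeably above the 0.5 chance level
```

The larger the gap between member and non-member loss distributions, the more such an attack succeeds; personalization can widen that gap for small communities.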
The primary objective of this project is to analyze the trade-off between data privacy—examined from both empirical (e.g., privacy-invasive attacks such as membership inference and gradient inversion) and information-theoretical perspectives (e.g., differential privacy)—and model personalization in decentralized frameworks. We aim to establish a foundational framework to characterize this trade-off and explore methods for developing privacy-aware personalized DL algorithms. This will contribute to creating privacy-preserving and fair approaches for training models in a decentralized manner, taking a pioneering step toward ethical and trustworthy DL.
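On the information-theoretical side, a standard differentially private building block is the Gaussian mechanism applied to shared updates: clip each gradient to bound its sensitivity, then add calibrated noise. The sketch below assumes illustrative values for the clipping norm and noise multiplier; it is a generic mechanism, not the project's specific method.

```python
import numpy as np

def privatize_gradient(grad, clip_norm=1.0, sigma=0.5, rng=None):
    """Clip a gradient to L2 norm clip_norm, then add Gaussian noise."""
    rng = rng or np.random.default_rng()
    # Clipping bounds the contribution (sensitivity) of any single example.
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise scale is proportional to the sensitivity bound.
    return clipped + rng.normal(scale=sigma * clip_norm, size=grad.shape)

g = np.array([3.0, 4.0])  # L2 norm 5, so it gets clipped to norm 1
noisy = privatize_gradient(g, rng=np.random.default_rng(0))
```

The noise multiplier sigma governs the privacy-utility trade-off this project aims to characterize: more noise means stronger differential-privacy guarantees but noisier, less personalized models.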
For effective contribution to this project, we highly value:
- A strong mathematical foundation and interest in probability theory, algebra, and analysis.
- Proficiency in basic machine learning implementation.
[1] Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent. Lian et al. NeurIPS 2017.
[2] https://www.bbc.com/news/technology-58462511
[3] https://news.mit.edu/2018/study-finds-gender-skin-type-bias-artificial-intelligence-systems-0212
[4] Fair Decentralized Learning. Biswas et al. Under Review.