Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off
Cheng-Yi Lee1 *,Ching-Chia Kao2 *,Cheng-Han Yeh1, Chun-Shien Lu1 📧, Chia-Mu Yu3, Chu-Song Chen2
1 Academia Sinica, 2 National Taiwan University, 3 National Yang Ming Chiao Tung University
(*) equal contribution, (📧) corresponding author.
WACV 2025, arXiv preprint (arXiv:2407.10180)
Semi-supervised learning (SSL) has achieved remarkable performance with a small fraction of labeled data by leveraging vast amounts of unlabeled data from the Internet. However, this large pool of untrusted data is extremely vulnerable to data poisoning, leading to potential backdoor attacks. Current backdoor defenses are not yet effective against such a vulnerability in SSL. In this study, we propose a novel method, Unlabeled Data Purification (UPure), to disrupt the association between trigger patterns and target classes by introducing perturbations in the frequency domain. By leveraging the Rate-Distortion-Perception (RDP) trade-off, we further identify the frequency band where the perturbations are added and justify this selection. Notably, UPure purifies poisoned unlabeled data without the need for extra clean labeled data. Extensive experiments on four benchmark datasets and five SSL algorithms demonstrate that UPure effectively reduces the attack success rate from 99.78% to 0% while maintaining model accuracy.
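For intuition only, the sketch below illustrates the general idea of purifying an unlabeled image by perturbing one band of its Fourier spectrum. The band limits (`r_low`, `r_high`) and the noise strength `sigma` are illustrative placeholders, not the band selected via the RDP trade-off in the paper, and the function is not the code used in this repository.

```python
import numpy as np

def purify_frequency_band(image, r_low=0.25, r_high=0.5, sigma=0.1):
    """Conceptual sketch: perturb one radial frequency band of an
    image with shape (H, W, C) and values in [0, 1]. Band limits and
    noise strength are placeholders, not the paper's RDP-derived values."""
    h, w = image.shape[:2]
    # Radial frequency coordinates, roughly normalized to [0, 1].
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2) / np.sqrt(0.5)
    band = (radius >= r_low) & (radius < r_high)

    purified = np.empty_like(image)
    for c in range(image.shape[2]):
        spectrum = np.fft.fft2(image[..., c])
        # Inject random noise only inside the selected band, aiming to break
        # the association between a hidden trigger pattern and the target class.
        noise = sigma * (np.random.randn(h, w) + 1j * np.random.randn(h, w))
        spectrum = spectrum + band * noise * np.abs(spectrum)
        purified[..., c] = np.real(np.fft.ifft2(spectrum))
    return np.clip(purified, 0.0, 1.0)

if __name__ == "__main__":
    x = np.random.rand(32, 32, 3)          # stand-in for an unlabeled image
    print(purify_frequency_band(x).shape)  # (32, 32, 3)
```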
This is the official code for "Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off." Our implementation follows USB, a unified semi-supervised learning (SSL) framework, to train models with SSL algorithms such as MixMatch, ReMixMatch, and FixMatch. To keep this repository focused, we omit the files that are unchanged from that framework and include only our own implementation and its description.
Please read the documentation at USB and install the corresponding packages (requirements.txt).
Then replace the following files in your USB checkout with the versions provided in this repository (a copy sketch follows this list):

- Replace the files at `Semi-supervised-learning/semilearn/semilearn/datasets/cv_datasets/` with those in the `datasets` folder of this repository.
- Replace the files at `Semi-supervised-learning/config/classic_cv/fixmatch/` with those in the `config` folder of this repository.
- Replace `algorithmbase.py` located at `Semi-supervised-learning/semilearn/core/`.
- Replace `build.py` located at `Semi-supervised-learning/semilearn/core/utils/`.
- Replace `eval.py` located at `Semi-supervised-learning/`.
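The hypothetical helper below copies this repository's files into a USB checkout, using the destination paths exactly as listed above. The directory names `UPure` and `Semi-supervised-learning`, and the assumption that the two repositories sit side by side, are placeholders; adjust them to your own layout.

```python
import shutil
from pathlib import Path

# Assumed locations of this repository and a USB checkout; adjust as needed.
UPURE_ROOT = Path("UPure")
USB_ROOT = Path("Semi-supervised-learning")

# Source in this repo -> destination inside USB, following the list above.
REPLACEMENTS = {
    "datasets": "semilearn/semilearn/datasets/cv_datasets",
    "config": "config/classic_cv/fixmatch",
    "algorithmbase.py": "semilearn/core/algorithmbase.py",
    "build.py": "semilearn/core/utils/build.py",
    "eval.py": "eval.py",
}

for src_name, dst_rel in REPLACEMENTS.items():
    src = UPURE_ROOT / src_name
    dst = USB_ROOT / dst_rel
    if src.is_dir():
        # Copy every file in the folder over the corresponding USB folder.
        for f in src.glob("*"):
            if f.is_file():
                shutil.copy(f, dst / f.name)
    else:
        shutil.copy(src, dst)
    print(f"copied {src} -> {dst}")
```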
To train FixMatch on CIFAR-10 with 100 labels, use the following example command:

python train.py --c config/usb_cv/fixmatch/fixmatch_cifar10_100_0-defense.yaml
After training, evaluate the performance of the SSL model using the command below:

python eval.py --dataset cifar100 --num_classes 100 --load_path /PATH/TO/CHECKPOINT --poison True
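For context, the attack success rate reported in the paper is commonly computed as the fraction of triggered test images, excluding those whose true label is already the target class, that the model classifies as the attacker's target. The sketch below illustrates that generic metric under these assumptions; it is not the code in eval.py, and the function and loader names are illustrative.

```python
import torch

def attack_success_rate(model, triggered_loader, target_class, device="cpu"):
    """Generic ASR sketch: share of triggered inputs (true label != target)
    predicted as the target class. Illustrative only; eval.py may differ."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for images, labels in triggered_loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            mask = labels != target_class  # skip samples already in the target class
            hits += (preds[mask] == target_class).sum().item()
            total += mask.sum().item()
    return 100.0 * hits / max(total, 1)
```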
If you find UPure useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
@inproceedings{lee2025defending,
title={Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off},
author={Lee, Cheng-Yi and Kao, Ching-Chia and Yeh, Cheng-Han and Lu, Chun-Shien and Yu, Chia-Mu and Chen, Chu-Song},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
year={2025}
}