UPure

Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off

Cheng-Yi Lee1*, Ching-Chia Kao2*, Cheng-Han Yeh1, Chun-Shien Lu1 📧, Chia-Mu Yu3, Chu-Song Chen2

1 Academia Sinica, 2 National Taiwan University, 3 National Yang Ming Chiao Tung University

(*) equal contribution, (📧) corresponding author.

WACV 2025, arXiv preprint (arXiv:2407.10180)

Abstract

Semi-supervised learning (SSL) has achieved remarkable performance with a small fraction of labeled data by leveraging vast amounts of unlabeled data from the Internet. However, this large pool of untrusted data is extremely vulnerable to data poisoning, leading to potential backdoor attacks. Current backdoor defenses are not yet effective against such a vulnerability in SSL. In this study, we propose a novel method, Unlabeled Data Purification (UPure), to disrupt the association between trigger patterns and target classes by introducing perturbations in the frequency domain. By leveraging the Rate-Distortion-Perception (RDP) trade-off, we further identify the frequency band where the perturbations are added and justify this selection. Notably, UPure purifies poisoned unlabeled data without the need for extra clean labeled data. Extensive experiments on four benchmark datasets and five SSL algorithms demonstrate that UPure effectively reduces the attack success rate from 99.78% to 0% while maintaining model accuracy.
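The core mechanism, adding noise to a selected band of spatial frequencies in each unlabeled image, can be illustrated with a minimal NumPy sketch. The band limits (`r_low`, `r_high`) and the noise scale below are hypothetical placeholders, not the values chosen via the RDP trade-off, and the sketch is a simplified illustration rather than the exact purification routine used in UPure.

```python
# Minimal sketch of frequency-band perturbation on a grayscale image.
# NOTE: band limits and noise scale are illustrative placeholders; the paper
# selects the band via the Rate-Distortion-Perception trade-off.
import numpy as np

def perturb_frequency_band(img, r_low=0.25, r_high=0.5, noise_scale=0.05, seed=None):
    """Add Gaussian noise to a ring of spatial frequencies of a 2D image in [0, 1]."""
    rng = np.random.default_rng(seed)
    h, w = img.shape

    # Move to the frequency domain and center the low frequencies.
    spectrum = np.fft.fftshift(np.fft.fft2(img))

    # Build a radial mask that selects the chosen frequency band.
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    max_radius = radius.max()
    band = (radius >= r_low * max_radius) & (radius <= r_high * max_radius)

    # Perturb only the selected band with complex Gaussian noise.
    noise = noise_scale * (rng.standard_normal((h, w)) + 1j * rng.standard_normal((h, w)))
    spectrum[band] += noise[band] * np.abs(spectrum).mean()

    # Back to the spatial domain.
    purified = np.fft.ifft2(np.fft.ifftshift(spectrum)).real
    return np.clip(purified, 0.0, 1.0)
```

For RGB images the same operation would be applied per channel; the sketch only conveys the idea of perturbing a frequency band to break the trigger-target association.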

Introduction

This repository provides the official code for "Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off." Our implementation follows the unified semi-supervised learning (SSL) framework USB to train models with SSL algorithms such as MixMatch, ReMixMatch, and FixMatch. To keep the repository concise, we omit the files that are unchanged from that framework and include only our own implementation and its description here.

Before You Start

Please read the documentation of USB and install the required packages (requirements.txt).

Detailed Description

Please follow these steps to replace the specified files in USB (a copy-helper sketch follows this list).

  • Files located in the datasets folder should be replaced with those found at:

    • Semi-supervised-learning/semilearn/semilearn/datasets/cv_datasets/
  • Files located in the config folder should be replaced with those located at:

    • Semi-supervised-learning/config/classic_cv/fixmatch/
  • Replace algorithmbase.py located at:

    • Semi-supervised-learning/semilearn/core/
  • Replace build.py located at:

    • Semi-supervised-learning/semilearn/core/utils/
  • Replace eval.py located at:

    • Semi-supervised-learning/
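As a convenience, the replacements above can be scripted. The sketch below assumes this repository and a local USB checkout sit side by side and uses hypothetical relative paths taken from the list above; adjust them to your actual layout.

```python
# Hypothetical helper that copies the modified files into a local USB checkout.
# The source/destination paths below are assumptions based on the list above.
import shutil
from pathlib import Path

UPURE = Path("UPure")                    # this repository (assumed location)
USB = Path("Semi-supervised-learning")   # local USB checkout (assumed location)

replacements = {
    UPURE / "datasets": USB / "semilearn/semilearn/datasets/cv_datasets",
    UPURE / "config": USB / "config/classic_cv/fixmatch",
    UPURE / "algorithmbase.py": USB / "semilearn/core/algorithmbase.py",
    UPURE / "build.py": USB / "semilearn/core/utils/build.py",
    UPURE / "eval.py": USB / "eval.py",
}

for src, dst in replacements.items():
    if src.is_dir():
        # Copy every file from the folder over the matching USB folder.
        for f in src.glob("*"):
            if f.is_file():
                shutil.copy(f, dst / f.name)
    else:
        shutil.copy(src, dst)
    print(f"replaced {dst}")
```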

Please execute the following commands to replicate our method:

  • To train FixMatch on CIFAR-10 with 100 labels, use the following example command:

    • python train.py --c config/usb_cv/fixmatch/fixmatch_cifar10_100_0-defense.yaml
  • After training, evaluate the trained SSL model using the command below (a conceptual sketch of the reported metrics follows this list):

    • python eval.py --dataset cifar100 --num_classes 100 --load_path /PATH/TO/CHECKPOINT --poison True
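For reference, the attack success rate (ASR) is the fraction of triggered test images from non-target classes that the model classifies as the attacker's target class, while clean accuracy is measured on unmodified test images. The sketch below illustrates these two metrics with hypothetical tensors; it is not the evaluation loop implemented in eval.py.

```python
# Conceptual sketch of the two reported metrics: clean accuracy and
# attack success rate (ASR). Inputs are hypothetical; eval.py computes
# these inside the USB evaluation pipeline.
import torch

def clean_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of clean test images classified correctly."""
    return (logits.argmax(dim=1) == labels).float().mean().item()

def attack_success_rate(triggered_logits: torch.Tensor,
                        labels: torch.Tensor,
                        target_class: int) -> float:
    """Fraction of triggered images (excluding the target class itself)
    that the model predicts as the attacker's target class."""
    preds = triggered_logits.argmax(dim=1)
    mask = labels != target_class          # only non-target images count
    return (preds[mask] == target_class).float().mean().item()
```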

Citation

If you find UPure useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@inproceedings{lee2025defending,
  title={Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off},
  author={Lee, Cheng-Yi and Kao, Ching-Chia and Yeh, Cheng-Han and Lu, Chun-Shien and Yu, Chia-Mu and Chen, Chu-Song},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  year={2025}
}
