Unveiling potential threats: backdoor attacks in single-cell pretrained models

2024

1School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
2Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
*Corresponding authors: lnchen@sibcb.ac.cn, chenshengquan@nankai.edu.cn
Potential backdoor attacks on single-cell pretrained models. a, Attackers can release poisoned data by tampering with benign data, or distribute poisoned models trained on such compromised data. Users may inadvertently download the poisoned models for further fine-tuning or direct application in downstream analysis; alternatively, they might use the poisoned data to train or fine-tune benign models. Either scenario can significantly impair the integrity and reliability of subsequent analyses. b, In the context of cell type annotation tasks, poisoned samples can be seamlessly mixed with clean samples, making the attack highly concealed. Training a model on such a composite dataset yields a poisoned model imbued with backdoors. During the inference phase, the poisoned model annotates cells containing embedded triggers as the target label while performing normally on benign cells. Poisoned cells may originate from various sources: users inadvertently download poisoned open-source data for reanalysis, attackers alter data while users download benign data for reanalysis, biotechnology companies deliberately introduce poison when commissioned for single-cell sequencing, or users intentionally modify data for academic misconduct. The annotation outcomes of backdoor attacks can therefore severely compromise single-cell analysis, biomedical drug discovery, vaccine development, clinical diagnostics, and a wide range of other critical biomedical applications. c, UMAP visualization of the benign and poisoned cells in the example training set of scGPT. d-f, The effects of different poisoning thresholds (d), target labels (e), and poisoning rates (f) on the performance of the backdoor attack for scGPT on the pancreas dataset. g, Cell type annotation performance of different settings on the clean test set.
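To make the poisoning step described in panel b concrete, the sketch below shows one way a trigger could be embedded into a small fraction of cells and their labels flipped to the target label. It assumes a dense AnnData expression matrix with a "cell_type" column in .obs; the trigger genes, trigger value, poisoning rate, and target label are illustrative placeholders rather than the exact settings used in this work.

import numpy as np

def poison_cells(adata, trigger_genes, trigger_value=10.0,
                 target_label="target_cell_type", poison_rate=0.01, seed=0):
    """Embed a fixed expression trigger into a small fraction of cells and
    relabel them with the attacker-chosen target label (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_poison = int(poison_rate * adata.n_obs)
    idx = rng.choice(adata.n_obs, size=n_poison, replace=False)

    poisoned = adata.copy()
    gene_idx = [poisoned.var_names.get_loc(g) for g in trigger_genes]
    X = np.asarray(poisoned.X)                  # assumes a dense expression matrix
    X[np.ix_(idx, gene_idx)] = trigger_value    # implant a fixed trigger pattern
    poisoned.X = X
    # Flip the labels of trigger-carrying cells to the attacker-chosen target
    labels = poisoned.obs["cell_type"].astype(str)
    labels.iloc[idx] = target_label
    poisoned.obs["cell_type"] = labels
    return poisoned

The poisoned object can then be concatenated with clean training data before model training or fine-tuning, which is what makes this kind of contamination hard to spot by eye.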

Abstract

Single-cell pretrained models, despite their superior performance, face significant yet often overlooked threats from backdoor attacks. Here we propose a straightforward backdoor strategy to demonstrate the vulnerabilities of these models, achieving high attack success rates while maintaining clean accuracy. We also suggest five potential defense strategies to mitigate these threats. Our findings underscore the imperative for the biomedical community to adopt robust defense mechanisms to safeguard research integrity and reliability.

Results with scGPT and scBERT

Myeloid dataset

Since the cancer labels in the training and test sets do not overlap, we did not conduct a performance analysis on clean data in this case. We randomly selected one cancer type from the six types in the training set as the target label. The attack success rate (ASR) remains high for both scGPT (0.986) and scBERT (0.987), indicating that the poisoned models misclassify cells of other cancer types as the target label.
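For reference, ASR is commonly computed as the proportion of trigger-carrying cells whose true label is not the target label but which the poisoned model nevertheless predicts as the target label. A minimal sketch of this metric, together with clean accuracy, is given below (variable names are hypothetical):

import numpy as np

def attack_success_rate(pred_labels, true_labels, target_label):
    """ASR: share of trigger-carrying, non-target cells that the poisoned
    model assigns to the attacker-chosen target label."""
    pred = np.asarray(pred_labels)
    true = np.asarray(true_labels)
    mask = true != target_label          # only non-target cells count
    return float(np.mean(pred[mask] == target_label))

def clean_accuracy(pred_labels, true_labels):
    """Accuracy on benign (trigger-free) cells."""
    pred = np.asarray(pred_labels)
    true = np.asarray(true_labels)
    return float(np.mean(pred == true))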

Heart dataset

Gastric cancer dataset


Results with Geneformer

Brain, Immune, and Spleen datasets


The impact of batch effects on the performance of backdoor attacks


UMAP visualization of three datasets with noticeable batch effects. Cell type labels and batch labels (e.g., sample, protocol, and donor) are projected onto separate visualizations.
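A sketch of how such projections could be generated with scanpy, assuming an AnnData object whose .obs carries a "cell_type" column and a "batch" column (the preprocessing parameters are illustrative, not the exact pipeline used here):

import scanpy as sc

# adata: AnnData with obs columns "cell_type" and "batch" (e.g., sample, protocol, or donor)
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, n_top_genes=2000, subset=True)
sc.pp.pca(adata, n_comps=50)
sc.pp.neighbors(adata)
sc.tl.umap(adata)
# Project cell type labels and batch labels onto the same embedding
sc.pl.umap(adata, color=["cell_type", "batch"])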

The results demonstrated that the ASR fluctuates slightly across datasets but overall remains at a relatively high level: 0.969 for the Brain dataset, 0.939 for the Bone marrow dataset, and 0.941 for the Tongue dataset.

The impact of feature selection on the performance of backdoor attacks



When the feature overlap between the training and test sets (i.e., the number of intersecting features, which equals the number of highly variable genes selected in the test set) stays within a certain range (i.e., between 2,000 and 2,750 highly variable genes in the test set), the poisoned model continues to perform well on clean data (Baseline: Accuracy = 0.968, Kappa = 0.954, and Macro-F1 = 0.710), and the attack effectiveness is barely affected. However, when the discrepancy grows much larger (i.e., only 300 highly variable genes in the test set), the attack effectiveness weakens and the performance on clean data deteriorates.
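One way such a feature-overlap experiment could be set up is sketched below, assuming highly variable genes are selected with scanpy on the test set and intersected with the training feature set (function and variable names are illustrative, not the exact pipeline used here):

import scanpy as sc

def shared_features(adata_train, adata_test, n_top_genes=2000):
    """Select highly variable genes in the test set (assumes log-normalized
    data) and keep only those also present among the training features,
    mimicking the varying feature-intersection sizes explored above."""
    sc.pp.highly_variable_genes(adata_test, n_top_genes=n_top_genes)
    test_hvgs = adata_test.var_names[adata_test.var["highly_variable"]]
    train_features = set(adata_train.var_names)
    return [g for g in test_hvgs if g in train_features]

# Varying n_top_genes (e.g., 300, 2,000, or 2,750) changes the size of the
# feature intersection seen by the poisoned model at test time.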

BibTeX

@article{feng2024unveiling,
  title   = {Unveiling potential threats: backdoor attacks in single-cell pre-trained models},
  author  = {Feng, S. and Li, S. and Chen, L. and others},
  journal = {Cell Discovery},
  volume  = {10},
  pages   = {122},
  year    = {2024},
  doi     = {10.1038/s41421-024-00753-1},
  url     = {https://doi.org/10.1038/s41421-024-00753-1}
}