16 matches found
Be Kind, Rewrite: Benign Projections Via Rewriting Defend against LLM Data Poisoning Attacks
Large language models (LLMs) are highly susceptible to backdoor attacks (BAs), wherein training samples are poisoned using trigger-based harmful content. Furthermore, existing defenses have proven ineffective when extensively tested across BA patterns. To better combat BAs, we explore the use of LLM...
Self-Purification Mitigates Backdoors in Multimodal Diffusion Language Models
Multimodal Diffusion Language Models (MDLMs) have recently emerged as a competitive alternative to their autoregressive counterparts. Yet their vulnerability to backdoor attacks remains largely unexplored. In this work, we show that well-established data-poisoning pipelines can successfully implant...
SecureSplit: Mitigating Backdoor Attacks in Split Learning
Split Learning (SL) offers a framework for collaborative model training that respects data privacy by allowing participants to share the same dataset while maintaining distinct feature sets. However, SL is susceptible to backdoor attacks, in which malicious clients subtly alter their embeddings to...
P2P: A Poison-To-Poison Remedy for Reliable Backdoor Defense in LLMs
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability and trustworthiness. However, existing defense strategies suffer from limited generalization: they only work on specific attack types or task settings...
BeDKD: Backdoor Defense Based on Dynamic Knowledge Distillation and Directional Mapping Modulator
Although existing backdoor defenses have gained success in mitigating backdoor attacks, they still face substantial challenges. In particular, most of them rely on large amounts of clean data to weaken the backdoor mapping but generally struggle with residual trigger effects, resulting in...
Proactive Disentangled Modeling of Trigger-Object Pairings for Backdoor Defense
Deep neural networks (DNNs) and generative AI (GenAI) are increasingly vulnerable to backdoor attacks, where adversaries embed triggers into inputs to cause models to misclassify or misinterpret target labels. Beyond traditional single-trigger scenarios, attackers may inject multiple triggers across...
Scaling Decentralized Learning with FLock
Fine-tuning large language models (LLMs) is hindered by the lack of centralized control and by the massive computing and communication overhead of decentralized schemes. While standard federated learning (FL) supports data privacy, the central server requirement creates a...
CLIP-Guided Backdoor Defense through Entropy-Based Poisoned Dataset Separation
Deep Neural Networks (DNNs) are susceptible to backdoor attacks, where adversaries poison training data to implant a backdoor into the victim model. Current backdoor defenses on poisoned data often suffer from high computational costs or low effectiveness against advanced attacks like clean-label and...
InverTune: Removing Backdoors from Multimodal Contrastive Learning Models Via Trigger Inversion and Activation Tuning
Multimodal contrastive learning models like CLIP have demonstrated remarkable vision-language alignment capabilities, yet their vulnerability to backdoor attacks poses critical security risks. Attackers can implant latent triggers that persist through downstream tasks, enabling malicious control ...
TED-LaST: Towards Robust Backdoor Defense against Adaptive Attacks
Deep Neural Networks (DNNs) are vulnerable to backdoor attacks, where attackers implant hidden triggers during training to maliciously control model behavior. Topological Evolution Dynamics (TED) has recently emerged as a powerful tool for detecting backdoor attacks in DNNs. However, TED can be...
Robust Anti-Backdoor Instruction Tuning in LVLMs
Large visual language models (LVLMs) have demonstrated excellent instruction-following capabilities, yet remain vulnerable to stealthy backdoor attacks when finetuned using contaminated data. Existing backdoor defense techniques are usually developed for single-modal visual or language models under...
FL-PLAS: Federated Learning with Partial Layer Aggregation for Backdoor Defense against High-Ratio Malicious Clients
Federated learning (FL) is gaining increasing attention as an emerging collaborative machine learning approach, particularly in the context of large-scale computing and data systems. However, the fundamental algorithm of FL, Federated Averaging (FedAvg), is susceptible to backdoor attacks. Although...
Defending the Edge: Representative-Attention for Mitigating Backdoor Attacks in Federated Learning
Federated learning (FL) enhances privacy and reduces communication cost for resource-constrained edge clients by supporting distributed model training at the edge. However, the heterogeneous nature of such devices produces diverse, non-independent and identically distributed (non-IID) data, making t...
Cert-SSB: Toward Certified Sample-Specific Backdoor Defense
Deep neural networks (DNNs) are vulnerable to backdoor attacks, where an attacker manipulates a small portion of the training data to implant hidden backdoors into the model. The compromised model behaves normally on clean samples but misclassifies backdoored samples into the attacker-specified...
TrojanDam: Detection-Free Backdoor Defense in Federated Learning through Proactive Model Robustification Utilizing OOD Data
Federated learning (FL) systems allow decentralized data-owning clients to jointly train a global model through uploading their locally trained updates to a centralized server. The property of decentralization enables adversaries to craft carefully designed backdoor updates to make the global model...
Secure Transfer Learning: Training Clean Models against Backdoor in (Both) Pre-Trained Encoders and Downstream Datasets
Transfer learning from pre-trained encoders has become essential in modern machine learning, enabling efficient model adaptation across diverse tasks. However, this combination of pre-training and downstream adaptation creates an expanded attack surface, exposing models to sophisticated backdoor...