5 matches found
Sparse Autoencoders Are Capable LLM Jailbreak Mitigators
Jailbreak attacks remain a persistent threat to large language model safety. We propose Context-Conditioned Delta Steering CC-Delta, an SAE-based defense that identifies jailbreak-relevant sparse features by comparing token-level representations of the same harmful request with and without...
SDN-Based False Data Detection with Its Mitigation and Machine Learning Robustness for In-Vehicle Networks
As the development of autonomous and connected vehicles advances, the complexity of modern vehicles increases, with numerous Electronic Control Units ECUs integrated into the system. In an in-vehicle network, these ECUs communicate with one another using an standard protocol called Controller Are...
When Better Features Mean Greater Risks: the Performance-Privacy Trade-Off in Contrastive Learning
With the rapid advancement of deep learning technology, pre-trained encoder models have demonstrated exceptional feature extraction capabilities, playing a pivotal role in the research and application of deep learning. However, their widespread use has raised significant concerns about the risk o...
FLTG: Byzantine-Robust Federated Learning Via Angle-Based Defense and Non-IID-Aware Weighting
Byzantine attacks during model aggregation in Federated Learning FL threaten training integrity by manipulating malicious clients' updates. Existing methods struggle with limited robustness under high malicious client ratios and sensitivity to non-i.i.d. data, leading to degraded accuracy. To...
MTL-UE: Learning to Learn Nothing for Multi-Task Learning
Most existing unlearnable strategies focus on preventing unauthorized users from training single-task learning STL models with personal data. Nevertheless, the paradigm has recently shifted towards multi-task data and multi-task learning MTL, targeting generalist and foundation models that can...