9 matches found

Packet Storm News
added 2026/04/09 12:00 a.m., 4 views

Follow My Eyes: Backdoor Attacks on VLM-Based Scanpath Prediction

Scanpath prediction models forecast the sequence and timing of human fixations during visual search, driving foveated rendering and attention-based interaction in mobile systems where their integrity is a first-class security concern. We present the first study of backdoor attacks against VLM-bas...

AI score: 5.8
Exploits: 0
Packet Storm News
added 2026/02/15 12:00 a.m., 11 views

From SFT to RL: Demystifying the Post-Training Pipeline for LLM-Based Vulnerability Detection

The integration of LLMs into vulnerability detection (VD) has shifted the field toward interpretable and context-aware analysis. While post-training methods have shown promise in general coding tasks, their systematic application to VD remains underexplored. In this paper, we present the first...

AI score: 5.6
Exploits: 0
Microsoft Secure
added 2026/02/09 5:12 p.m., 10 views

A one-prompt attack that breaks LLM safety alignment

Large language models (LLMs) and diffusion models now power a wide range of applications, from document assistance to text-to-image generation, and users increasingly expect these systems to be safety-aligned by default. Yet safety alignment is only as robust as its weakest failure mode. Despite...

AI score: 5.7
Exploits: 0
Microsoft Secure
added 2026/02/09 5:12 p.m., 6 views

A one-prompt attack that breaks LLM safety alignment

Large language models (LLMs) and diffusion models now power a wide range of applications, from document assistance to text-to-image generation, and users increasingly expect these systems to be safety-aligned by default. Yet safety alignment is only as robust as its weakest failure mode. Despite...

AI score: 5.7
Exploits: 0
Packet Storm News
added 2025/12/22 12:00 a.m., 34 views

Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models

Low-Rank Adaptation (LoRA) has emerged as an efficient method for fine-tuning large language models (LLMs) and is widely adopted within the open-source community. However, the decentralized dissemination of LoRA adapters through platforms such as Hugging Face introduces novel security vulnerabilities...

AI score: 6.9
Exploits: 0
Packet Storm News
added 2025/07/07 12:00 a.m., 4 views

The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation

Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This phenomenon raises critical questions about model behavior, privacy risks, and the boundary between learning and memorization. Addressi...

AI score: 7
Exploits: 0
Packet Storm News
added 2025/07/07 12:00 a.m., 3 views

Beyond Training-Time Poisoning: Component-Level and Post-Training Backdoors in Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) systems are increasingly used in safety-critical applications, yet their security remains severely underexplored. This work investigates backdoor attacks, which implant hidden triggers that cause malicious actions only when specific inputs appear in the observation...

AI score: 7.1
Exploits: 0
Packet Storm News
added 2025/05/06 12:00 a.m., 2 views

MergeGuard: Efficient Thwarting of Trojan Attacks in Machine Learning Models

This paper proposes MergeGuard, a novel methodology for mitigation of AI Trojan attacks. Trojan attacks on AI models cause inputs embedded with triggers to be misclassified to an adversary's target class, posing a significant threat to model usability trained by an untrusted third party. The core...

AI score: 6.8
Exploits: 0
Packet Storm News
added 2025/04/23 12:00 a.m., 5 views

Enhancing Variational Autoencoders with Smooth Robust Latent Encoding

Variational Autoencoders (VAEs) have played a key role in scaling up diffusion-based generative models, as in Stable Diffusion, yet questions regarding their robustness remain largely underexplored. Although adversarial training has been an established technique for enhancing robustness in predicti...

AI score: 6.8
Exploits: 0