12 matches found

Packet Storm News
added 2026/05/28 12:00 a.m. | 2 views

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

As Large Language Models evolve for user convenience, vulnerability to jailbreak attacks continues to be reported despite ongoing efforts in safety training. Traditional jailbreak techniques typically focus on a single prompt injection, neglecting the models' ability to remember the flow of...

AI score: 5.8
Exploits: 0

Packet Storm News
added 2026/05/04 12:00 a.m. | 2 views

Revisiting JBShield: Breaking and Rebuilding Representation-Level Jailbreak Defenses

Defending large language models (LLMs) against jailbreak attacks, such as Greedy Coordinate Gradient (GCG), remains a challenge, particularly under adaptive threat models where an attacker directly targets the defense mechanism. JBShield, a recent jailbreak defense with a 0% attack success rate in so...

AI score: 5.8
Exploits: 0

Packet Storm News
added 2026/01/18 12:00 a.m. | 1 view

TrojanPraise: Jailbreak LLMs Via Benign Fine-Tuning

The demand for customized large language models (LLMs) has led to commercial LLMs offering black-box fine-tuning APIs, yet this convenience introduces a critical security loophole: attackers could jailbreak the LLMs by fine-tuning them with malicious data. Though this security issue has recently bee...

AI score: 5.5
Exploits: 0

Packet Storm News
added 2025/11/17 12:00 a.m. | 2 views

Jailbreaking Large Vision Language Models in Intelligent Transportation Systems

Large Vision Language Models (LVLMs) demonstrate strong capabilities in multimodal reasoning and many real-world applications, such as visual question answering. However, LVLMs are highly vulnerable to jailbreaking attacks. This paper systematically analyzes the vulnerabilities of LVLMs integrated ...

AI score: 6.8
Exploits: 0

Packet Storm News
added 2025/11/04 12:00 a.m. | 3 views

Jailbreaking in the Haystack

Recent advances in long-context language models (LMs) have enabled million-token inputs, expanding their capabilities across complex tasks like computer-use agents. Yet, the safety implications of these extended contexts remain unclear. To bridge this gap, we introduce NINJA (short for...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/07/08 12:00 a.m. | 2 views

CAVGAN: Unifying Jailbreak and Defense of LLMs Via Generative Adversarial Attacks on Their Internal Representations

Security alignment enables the Large Language Model (LLM) to gain protection against malicious queries, but various jailbreak attack methods reveal the vulnerability of this security mechanism. Previous studies have isolated LLM jailbreak attacks and defenses. We analyze the security protection...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/06/22 12:00 a.m. | 2 views

Universal Jailbreak Suffixes Are Strong Attention Hijackers

We study suffix-based jailbreaks, a powerful family of attacks against large language models (LLMs) that optimize adversarial suffixes to circumvent safety alignment. Focusing on the widely used foundational GCG attack (Zou et al., 2023), we observe that suffixes vary in efficacy: some...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/06/22 12:00 a.m. | 2 views

InfoFlood: Jailbreaking Large Language Models with Information Overload

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains. However, their potential to generate harmful responses has raised significant societal and regulatory concerns, especially when manipulated by adversarial techniques known as "jailbreak" attacks. Existing...

AI score: 7
Exploits: 0

Packet Storm News
added 2025/06/22 12:00 a.m. | 2 views

Alphabet Index Mapping: Jailbreaking LLMs through Semantic Dissimilarity

Large Language Models (LLMs) have demonstrated remarkable capabilities, yet their susceptibility to adversarial attacks, particularly jailbreaking, poses significant safety and ethical concerns. While numerous jailbreak methods exist, many suffer from computational expense, high token usage, or...

AI score: 7.1
Exploits: 0

Packet Storm News
added 2025/05/16 12:00 a.m. | 2 views

PIG: Privacy Jailbreak Attack on LLMs Via Gradient-Based Iterative In-Context Optimization

Large Language Models (LLMs) excel in various domains but pose inherent privacy risks. Existing methods to evaluate privacy leakage in LLMs often use memorized prefixes or simple instructions to extract data, both of which well-aligned models can easily block. Meanwhile, jailbreak attacks bypass...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/05/10 12:00 a.m. | 4 views

Practical Reasoning Interruption Attacks on Reasoning Large Language Models

Reasoning large language models (RLLMs) have demonstrated outstanding performance across a variety of tasks, yet they also expose numerous security vulnerabilities. Most of these vulnerabilities have centered on the generation of unsafe content. However, recent work has identified a distinct...

AI score: 7.6
Exploits: 0

Packet Storm News
added 2025/04/15 12:00 a.m. | 3 views

Token-Level Constraint Boundary Search for Jailbreaking Text-To-Image Models

Recent advancements in Text-to-Image (T2I) generation have significantly enhanced the realism and creativity of generated images. However, such powerful generative capabilities pose risks related to the production of inappropriate or harmful content. Existing defense mechanisms, including prompt...

AI score: 7
Exploits: 0