5 matches found

Packet Storm News
added 2026/05/26 12:00 a.m., 7 views

BAIT: Boundary-Guided Disclosure Escalation Via Self-Conditioned Reasoning

In this work, we propose BAIT (Boundary-Aware Iterative Trap), a three-step jailbreak framework that approaches malicious goals through internal disclosure. BAIT first asks the model to identify the protection boundary, then requires it to refine that boundary, and finally requests a detailed...

AI score: 5.8
Exploits: 0
Packet Storm News
added 2026/05/23 12:00 a.m., 8 views

Reasoning As an Attack Surface: Adaptive Evolutionary CoT Jailbreaks for LLMs

Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in reasoning and generation tasks and are increasingly deployed in real-world applications. However, their explicit chain-of-thought (CoT) mechanism introduces new security risks, making them particularly vulnerable to jailbreak...

AI score: 5.8
Exploits: 0
Packet Storm News
added 2026/05/18 12:00 a.m., 6 views

Babel: Jailbreaking Safety Attention Via Obfuscation Distribution Optimized Sampling

Despite rigorous safety alignment, Large Language Models (LLMs) remain vulnerable to jailbreak attacks. Existing black-box methods often rely on heuristic templates or exhaustive trials, lacking mechanistic interpretability and query efficiency. In this study, we investigate an intrinsic...

AI score: 5.8
Exploits: 0
Packet Storm News
added 2026/05/08 12:00 a.m., 3 views

OrchJail: Jailbreaking Tool-Calling Text-To-Image Agents by Orchestration-Guided Fuzzing

Tool-calling text-to-image (T2I) agents can plan and execute multi-step tool chains to accomplish complex generation and editing queries. However, this capability introduces a new safety attack surface: harmful outputs may arise from tool orchestration, where individually benign steps combine into...

AI score: 5.9
Exploits: 0
Packet Storm News
added 2025/10/11 12:00 a.m., 2 views

ArtPerception: ASCII Art-Based Jailbreak on LLMs with Recognition Pre-Test

The integration of Large Language Models (LLMs) into computer applications has introduced transformative capabilities but also significant security challenges. Existing safety alignments, which primarily focus on semantic interpretation, leave LLMs vulnerable to attacks that use non-standard data...

AI score: 7.2
Exploits: 0