24 matches found

Packet Storm News
added 2026/02/24 12:00 a.m. | 2 views

Analysis of LLMs against Prompt Injection and Jailbreak Attacks

Large Language Models (LLMs) are widely deployed in real-world systems. Given their broader applicability, prompt engineering has become an efficient tool for resource-scarce organizations to adopt LLMs for their own purposes. At the same time, LLMs are vulnerable to prompt-based attacks. Thus,...

AI score: 6
Exploits: 0

Packet Storm News
added 2026/02/06 12:00 a.m. | 2 views

ShallowJail: Steering Jailbreaks against Large Language Models

Large Language Models (LLMs) have been successful in numerous fields. Alignment has usually been applied to prevent them from harmful purposes. However, aligned LLMs remain vulnerable to jailbreak attacks that deliberately mislead them into producing harmful outputs. Existing jailbreaks are either...

AI score: 5.5
Exploits: 0

Packet Storm News
added 2026/01/07 12:00 a.m. | 2 views

Jailbreaking LLMs and VLMs: Mechanisms, Evaluation, and Unified Defense

This paper provides a systematic survey of jailbreak attacks and defenses on Large Language Models (LLMs) and Vision-Language Models (VLMs), emphasizing that jailbreak vulnerabilities stem from structural factors such as incomplete training data, linguistic ambiguity, and generative uncertainty. It...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/12/23 12:00 a.m. | 7 views

Odysseus: Jailbreaking Commercial Multimodal LLM-Integrated Systems Via Dual Steganography

By integrating language understanding with perceptual modalities such as images, multimodal large language models (MLLMs) constitute a critical substrate for modern AI systems, particularly intelligent agents operating in open and interactive environments. However, their increasing accessibility al...

AI score: 7.2
Exploits: 0

Packet Storm News
added 2025/11/14 12:00 a.m. | 2 views

NegBLEURT Forest: Leveraging Inconsistencies for Detecting Jailbreak Attacks

Jailbreak attacks designed to bypass safety mechanisms pose a serious threat by prompting LLMs to generate harmful or inappropriate content, despite alignment with ethical guidelines. Crafting universal filtering rules remains difficult due to their inherent dependence on specific contexts. To...

AI score: 7.2
Exploits: 0

Packet Storm News
added 2025/10/21 12:00 a.m. | 7 views

HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models

Large Language Models (LLMs) remain vulnerable to multi-turn jailbreak attacks. We introduce HarmNet, a modular framework comprising ThoughtNet, a hierarchical semantic network; a feedback-driven Simulator for iterative query refinement; and a Network Traverser for real-time adaptive attack...

AI score: 7.1
Exploits: 0

EUVD
added 2025/10/03 8:07 p.m. | 2 views

EUVD-2025-14804

Malicious code in bioql PyPI...

CVSS: 8.1 | AI score: 6.4 | EPSS: 0.02361
Exploits: 0 | References: 3

Packet Storm News
added 2025/10/01 12:00 a.m. | 3 views

Breaking the Code: Security Assessment of AI Code Agents through Systematic Jailbreaking Attacks

Code-capable large language model (LLM) agents are increasingly embedded into software engineering workflows where they can read, write, and execute code, raising the stakes of safety-bypass "jailbreak" attacks beyond text-only settings. Prior evaluations emphasize refusal or harmful-text detection...

AI score: 7.7
Exploits: 0

Packet Storm News
added 2025/09/18 12:00 a.m. | 2 views

Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism Via Probabilistically Ablating Refusal Direction

Jailbreak attacks pose persistent threats to large language models (LLMs). Current safety alignment methods have attempted to address these issues, but they experience two significant limitations: insufficient safety alignment depth and unrobust internal defense mechanisms. These limitations make...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/09/08 12:00 a.m. | 3 views

Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?

Jailbreak attacks on Large Language Models (LLMs) have demonstrated various successful methods whereby attackers manipulate models into generating harmful responses that they are designed to avoid. Among these, Greedy Coordinate Gradient (GCG) has emerged as a general and effective approach that...

AI score: 7
Exploits: 0

Packet Storm News
added 2025/09/04 12:00 a.m. | 2 views

NeuroBreak: Unveil Internal Jailbreak Mechanisms in Large Language Models

In deployment and application, large language models (LLMs) typically undergo safety alignment to prevent illegal and unethical outputs. However, the continuous advancement of jailbreak attack techniques, designed to bypass safety mechanisms with adversarial prompts, has placed increasing pressure ...

AI score: 7.5
Exploits: 0

Packet Storm News
added 2025/08/16 12:00 a.m. | 2 views

Mitigating Jailbreaks with Intent-Aware LLMs

Despite extensive safety-tuning, large language models (LLMs) remain vulnerable to jailbreak attacks via adversarially crafted instructions, reflecting a persistent trade-off between safety and task performance. In this work, we propose Intent-FT, a simple and lightweight fine-tuning approach that...

AI score: 7.2
Exploits: 0

Packet Storm News
added 2025/07/06 12:00 a.m. | 2 views

Attention Slipping: a Mechanistic Understanding of Jailbreak Attacks and Defenses in LLMs

As large language models (LLMs) become more integral to society and technology, ensuring their safety becomes essential. Jailbreak attacks exploit vulnerabilities to bypass safety guardrails, posing a significant threat. However, the mechanisms enabling these attacks are not well understood. In thi...

AI score: 7.4
Exploits: 0

Packet Storm News
added 2025/06/23 12:00 a.m. | 2 views

Security Assessment of DeepSeek and GPT Series Models against Jailbreak Attacks

The widespread deployment of large language models (LLMs) has raised critical concerns over their vulnerability to jailbreak attacks, i.e., adversarial prompts that bypass alignment mechanisms and elicit harmful or policy-violating outputs. While proprietary models like GPT-4 have undergone extensi...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/06/22 12:00 a.m. | 2 views

Investigating Vulnerabilities and Defenses against Audio-Visual Attacks: a Comprehensive Survey Emphasizing Multimodal Models

Multimodal large language models (MLLMs), which bridge the gap between audio-visual and natural language processing, achieve state-of-the-art performance on several audio-visual tasks. Despite the superior performance of MLLMs, the scarcity of high-quality audio-visual training data and computation...

AI score: 6.9
Exploits: 0

Packet Storm News
added 2025/05/21 12:00 a.m. | 2 views

SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning

Large Reasoning Models (LRMs) introduce a new generation paradigm of explicitly reasoning before answering, leading to remarkable improvements in complex tasks. However, they pose great safety risks against harmful queries and adversarial attacks. While recent mainstream safety efforts on LRMs,...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/04/27 12:00 a.m. | 2 views

JailbreaksOverTime: Detecting Jailbreak Attacks under Distribution Shift

Safety and security remain critical concerns in AI deployment. Despite safety training through reinforcement learning with human feedback (RLHF) [32], language models remain vulnerable to jailbreak attacks that bypass safety guardrails. Universal jailbreaks - prefixes that can circumvent alignment fo...

AI score: 7.3
Exploits: 0

Packet Storm News
added 2025/04/14 12:00 a.m. | 2 views

Concept Enhancement Engineering: a Lightweight and Efficient Robust Defense against Jailbreak Attacks in Embodied AI

Embodied Intelligence (EI) systems integrated with large language models (LLMs) face significant security risks, particularly from jailbreak attacks that manipulate models into generating harmful outputs or executing unsafe physical actions. Traditional defense strategies, such as input filtering and...

AI score: 7
Exploits: 0

OSV
added 2025/03/27 6:14 p.m. | 5 views

GHSA-F3MF-HM6V-JFHH Mesop Class Pollution vulnerability leads to DoS and Jailbreak attacks

From @jackfromeast and @superboy-zjc: We have identified a class pollution vulnerability in Mesop (<= 0.14.0) applications that allows attackers to overwrite global variables and class attributes in certain Mesop modules during runtime. This vulnerability could directly lead to a denial of service Do...

CVSS: 8.1 | AI score: 7 | EPSS: 0.02361
Exploits: 0 | References: 4

Github Security Blog
added 2025/03/27 6:14 p.m. | 18 views

Mesop Class Pollution vulnerability leads to DoS and Jailbreak attacks

From @jackfromeast and @superboy-zjc: We have identified a class pollution vulnerability in Mesop (<= 0.14.0) applications that allows attackers to overwrite global variables and class attributes in certain Mesop modules during runtime. This vulnerability could directly lead to a denial of service Do...

CVSS: 8.1 | AI score: 6.8 | EPSS: 0.02361
Exploits: 0 | References: 4 | Affected Software: 1