CVE Search Engine - Security Vulnerabilities and Exploits Search Tool

4 matches found

Packet Storm News • added 2026/05/11 12:00 a.m. • 11 views

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments

The rapid proliferation of LLM-based autonomous agents in real operating system environments introduces a new category of safety risk beyond content safety: behavior jailbreak, where an adversary induces an agent to execute dangerous OS-level operations with irreversible consequences. Existing...

AI score: 5.9

Exploits: 0

Packet Storm News • added 2025/10/17 12:00 a.m. • 9 views

SoK: Taxonomy and Evaluation of Prompt Security in Large Language Models

Large Language Models (LLMs) have rapidly become integral to real-world applications, powering services across diverse sectors. However, their widespread deployment has exposed critical security risks, particularly through jailbreak prompts that can bypass model alignment and induce harmful outputs...

AI score: 7

Exploits: 0

Packet Storm News • added 2025/09/03 12:00 a.m. • 7 views

VulnRepairEval: an Exploit-Based Evaluation Framework for Assessing Large Language Model Vulnerability Repair Capabilities

The adoption of Large Language Models (LLMs) for automated software vulnerability patching has shown promising outcomes on carefully curated evaluation sets. Nevertheless, existing datasets predominantly rely on superficial validation methods rather than exploit-based verification, leading to...

AI score: 7.1

Exploits: 0

Packet Storm News • added 2025/05/28 12:00 a.m. • 10 views

Jailbreak Distillation: Renewable Safety Benchmarking

Large language models (LLMs) are rapidly deployed in critical applications, raising urgent needs for robust safety benchmarking. We propose Jailbreak Distillation (JBDistill), a novel benchmark construction framework that "distills" jailbreak attacks into high-quality and easily-updatable safety...

AI score: 7.2

Exploits: 0
