Lucene search

6 matches found

Packet Storm News
added 2026/04/06 12:0 a.m., 3 views

Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities

LLM agents with tool access can discover and exploit security vulnerabilities. This is known. What is not known is which features of a system prompt trigger this behaviour, and which do not. We present a systematic taxonomy based on approximately 10,000 trials across seven models, 37 prompt...

AI score: 5.8
Exploits: 0
Packet Storm News
added 2025/10/09 12:0 a.m., 5 views

Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models

Large language models (LLMs) remain vulnerable to multi-turn jailbreaking attacks that exploit conversational context to bypass safety constraints gradually. These attacks target different harm categories like malware generation, harassment, or fraud through distinct conversational approaches...

AI score: 7.4
Exploits: 0
Packet Storm News
added 2025/05/31 12:0 a.m., 4 views

Security Concerns for Large Language Models: a Survey

Large Language Models (LLMs) such as GPT-4 and its recent iterations, Google's Gemini, Anthropic's Claude 3 models, and xAI's Grok have caused a revolution in natural language processing, but their capabilities also introduce new security vulnerabilities. In this survey, we provide a comprehensive...

AI score: 7.7
Exploits: 0
Packet Storm News
added 2025/04/15 12:0 a.m., 2 views

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Multi-turn interactions with language models (LMs) pose critical safety risks, as harmful intent can be strategically spread across exchanges. Yet, the vast majority of prior work has focused on single-turn safety, while adaptability and diversity remain among the key challenges of multi-turn...

AI score: 7.4
Exploits: 0
Schneier on Security
added 2024/02/07 12:4 p.m., 11 views

Teaching LLMs to Be Deceptive

Interesting research: "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training": Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given th...

AI score: 7.5
Exploits: 0
Schneier on Security
added 2024/01/24 12:6 p.m., 9 views

Poisoning AI Models

New research into poisoning AI models: The researchers first trained the AI models using supervised learning and then used additional "safety training" methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidde...

AI score: 7.6
Exploits: 0