15 matches found
AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations
Indirect prompt injection in tool-use agents is a concrete production threat: LLM agents read from integrations third-party services such as Gmail, Salesforce, or Jira accessed through tool calls whose response content the user neither writes nor controls. Existing benchmarks under-measure the...
ExploitBench: A Capability Ladder Benchmark for LLM Cybersecurity Agents
Exploitation is not a binary event. It is a ladder of acquiring progressive capabilities, from executing a single buggy line of code to taking full control of the target. However, existing LLM security benchmarks treat a crash as exploitation success. That single binary outcome collapses the hard...
Can a Single Message Paralyze the AI Infrastructure? the Rise of AbO-DDoS Attacks through Targeted Mobius Injection
Large Language Model LLM agents have emerged as key intermediaries, orchestrating complex interactions between human users and a wide range of digital services and LLM infrastructures. While prior research has extensively examined the security of LLMs and agents in isolation, the systemic risk of...
AgentShield: Deception-Based Compromise Detection for Tool-Using LLM Agents
Defenses against indirect prompt injection IPI in tool-using LLM agents share two structural weaknesses. First, they all attempt to prevent attacks rather than detect the compromises that slip through. Second, they have only been evaluated in English, leaving users of low-resource languages such ...
Autonomous Adversary: Red-Teaming in the Age of LLM
Language Model Agents LMAs are emerging as a powerful primitive for augmenting red-team operations. They can support attack planning, adversary emulation, and the orchestration of multi-step activity such as lateral movement, a core enabling capability of advanced persistent threat APT campaigns...
PT-2026-34780
Name of the Vulnerable Software and Affected Versions OpenClaw versions prior to 2026.3.28 Description An agentic consent bypass allows LLM agents to silently disable execution approval. Remote attackers can exploit this by using the config.patch parameter to bypass security controls and execute...
Benchmarking-Agent-Architectures
Benchmarking Agent Architectures for LLM-Based Exploit Gener...
Evaluating Generalization Mechanisms in Autonomous Cyber Attack Agents
Autonomous offensive agents often fail to transfer beyond the networks on which they are trained. We isolate a minimal but fundamental shift -- unseen host/subnet IP reassignment in an otherwise fixed enterprise scenario -- and evaluate attacker generalization in the NetSecGame environment. Agent...
Assessing Spear-Phishing Website Generation in Large Language Model Coding Agents
Large Language Models are expanding beyond being a tool humans use and into independent agents that can observe an environment, reason about solutions to problems, make changes that impact those environments, and understand how their actions impacted their environment. One of the most common...
MalTool: Malicious Tool Attacks on LLM Agents
In a malicious tool attack, an attacker uploads a malicious tool to a distribution platform; once a user installs the tool and the LLM agent selects it during task execution, the tool can compromise the user's security and privacy. Prior work primarily focuses on manipulating tool names and...
LLM Agents for Automated Web Vulnerability Reproduction: Are We There Yet?
Large language model LLM agents have demonstrated remarkable capabilities in software engineering and cybersecurity tasks, including code generation, vulnerability discovery, and automated testing. One critical but underexplored application is automated web vulnerability reproduction, which...
AutoPentester: An LLM Agent-Based Framework for Automated Pentesting
Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale and complexity, the demand for pentesting has surged, surpassing the capacity of human professionals to meet it effectively. With advances in AI,...
STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents
As LLMs advance into autonomous agents with tool-use capabilities, they introduce security challenges that extend beyond traditional content-based LLM safety concerns. This paper introduces Sequential Tool Attack Chaining STAC, a novel multi-turn attack framework that exploits agent tool use. STA...
Automatic Red Teaming LLM-Based Agents with Model Context Protocol Tools
The remarkable capability of large language models LLMs has led to the wide application of LLM-based agents in various domains. To standardize interactions between LLM-based agents and their environments, model context protocol MCP tools have become the de facto standard and are now widely...
SEC-Bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks
Rigorous security-focused evaluation of large language model LLM agents is imperative for establishing trust in their safe deployment throughout the software development lifecycle. However, existing benchmarks largely rely on synthetic challenges or simplified vulnerability datasets that fail to...