SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents
Autonomous LLM agents increasingly operate in stateful environments where they access tools, files, memory, and external services. While such capabilities enable complex real-world workflows, they also introduce security risks that are difficult to capture with existing evaluations. Current agent...
AVISE: Framework for Evaluating the Security of AI Systems
As artificial intelligence (AI) systems are increasingly deployed across critical domains, their security vulnerabilities pose growing risks of high-profile exploits and consequential system failures. Yet systematic approaches to evaluating AI security remain underdeveloped. In this paper, we...
AdapTools: Adaptive Tool-Based Indirect Prompt Injection Attacks on Agentic LLMs
The integration of external data services (e.g., Model Context Protocol, MCP) has made large language model-based agents increasingly powerful for complex task execution. However, this advancement introduces critical security vulnerabilities, particularly indirect prompt injection (IPI) attacks...