Lucene search
K

211 matches found

GithubExploit
GithubExploit
added 4 days ago45 views

-cascade-scan

cascade-scan AI Agent security evaluation framework — autom...

6.5AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/17 12:0 a.m.9 views

ADR: An Agentic Detection System for Enterprise Agentic AI Security

We present the Agentic AI Detection and Response ADR system, the first large-scale, production-proven enterprise framework for securing AI agents operating through the Model Context Protocol MCP. We identify three persistent challenges in this domain: 1 limited observability -- existing Endpoint...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/16 12:0 a.m.4 views

A Red Teaming Framework for Evaluating Robustness of AI-Enabled Security Orchestration, Automation, and Response Systems

AI-enabled Security Orchestration, Automation, and Response SOAR systems increasingly employ autonomous agents for cyber defense, yet their resilience to adaptive adversaries is underexplored. We introduce an autonomous red teaming framework that integrates large language models LLMs with...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/12 12:0 a.m.5 views

IPI-Proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents against Indirect Prompt Injection

Web-browsing AI agents are increasingly deployed in enterprise settings under strict whitelists of approved domains, yet adversaries can still influence them by embedding hidden instructions in the HTML pages those domains serve. Existing red-teaming resources fall short of this scenario:...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/12 12:0 a.m.3 views

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in frontier models without overfitting. We argue tha...

5.8AI score
Exploits0
The Hacker News
The Hacker News
added 2026/05/11 11:30 a.m.17 views

Your Purple Team Isn't Purple — It's Just Red and Blue in the Same Room

Defending a network at 2 am looks a lot like this: an analyst copy-pasting a hash from a PDF into a SIEM query. A red team script is being rewritten by hand so the blue team can use it. A patch waiting on a change-approval window that's longer than the exploitation window itself. Nobody in that...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/11 12:0 a.m.2 views

LLMs for Secure Hardware Design and Related Problems: Opportunities and Challenges

The integration of Large Language Models LLMs into Electronic Design Automation EDA and hardware security is rapidly reshaping the semiconductor industry. While LLMs offer unprecedented capabilities in generating Register Transfer Level RTL code, automating testbenches, and bridging the semantic...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/10 12:0 a.m.2 views

MonitoringBench: Semi-Automated Red-Teaming for Agent Monitoring

We introduce a red-teaming methodology that exposes harder-to-catch attacks for coding-agent monitors, suggesting that current practices may under-elicit attacks and overstate monitor performance. We identify three challenges with current red-teaming. First, mode collapse in attack generation,...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/09 12:0 a.m.12 views

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

Multi-turn jailbreaks exploit the ability of large language models to accumulate and act on conversational context. Instead of stating a harmful request directly, an attacker can gradually steer the conversation toward an unsafe answer. Recent methods demonstrate this risk, but they are usually...

5.7AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/07 12:0 a.m.4 views

Autonomous Adversary: Red-Teaming in the Age of LLM

Language Model Agents LMAs are emerging as a powerful primitive for augmenting red-team operations. They can support attack planning, adversary emulation, and the orchestration of multi-step activity such as lateral movement, a core enabling capability of advanced persistent threat APT campaigns...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/06 12:0 a.m.6 views

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due to their high capability and flexibility, such agents raise significant security and safety concerns. A growing number of real-world incidents have...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/05 12:0 a.m.0 views

Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours

AI systems are entering critical domains like healthcare, finance, and defense, yet remain vulnerable to adversarial attacks. While AI red teaming is a primary defense, current approaches force operators into manual, library-specific workflows. Operators spend weeks hand-crafting workflows -...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/03 12:0 a.m.0 views

Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration

Memory systems enable otherwise-stateless LLM agents to persist user information across sessions, but also introduce a new attack surface. We characterize the Trojan Hippo attack, a class of persistent memory attacks that operates in a more realistic threat model than prior memory poisoning work:...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/01 12:0 a.m.2 views

STARE: Step-Wise Temporal Alignment and Red-Teaming Engine for Multi-Modal Toxicity Attack

Red-teaming Vision-Language Models is essential for identifying vulnerabilities where adversarial image-text inputs trigger toxic outputs. Existing approaches treat image generation as a black box, returning only terminal toxicity scores and leaving open the question of when and how toxic semanti...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/04/24 12:0 a.m.1 views

Training a General Purpose Automated Red Teaming Model

Automated methods for red teaming LLMs are an important tool to identify LLM vulnerabilities that may not be covered in static benchmarks, allowing for more thorough probing. They can also adapt to each specific LLM to discover weaknesses unique to it. Most current automated red teaming methods a...

5.6AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/04/23 12:0 a.m.0 views

AutoRISE: Agent-Driven Strategy Evolution for Red-Teaming Large Language Models

Automated red-teaming methods for large language models typically optimize attack prompts within a fixed, human-designed strategy, leaving the attack strategy itself unchanged. We instead optimize the strategy. We propose AutoRISE, a method that searches over executable attack programs rather tha...

5.3AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/04/22 12:0 a.m.3 views

Adaptive Instruction Composition for Automated LLM Red-Teaming

Many approaches to LLM red-teaming leverage an attacker LLM to discover jailbreaks against a target. Several of them task the attacker with identifying effective strategies through trial and error, resulting in a semantically limited range of successes. Another approach discovers diverse attacks ...

5.4AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/04/20 12:0 a.m.2 views

ARES: Adaptive Red-Teaming and End-To-End Repair of Policy-Reward System

Reinforcement Learning from Human Feedback RLHF is central to aligning Large Language Models LLMs, yet it introduces a critical vulnerability: an imperfect Reward Model RM can become a single point of failure when it fails to penalize unsafe behaviors. While existing red-teaming approaches...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/04/05 12:0 a.m.2 views

SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement

LLM-based agent systems increasingly rely on agent skills sourced from open registries to extend their capabilities, yet the openness of such ecosystems makes skills difficult to thoroughly vet. Existing attacks rely on injecting malicious instructions into skills, making them easily detectable b...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/03/24 12:0 a.m.2 views

TreeTeaming: Autonomous Red-Teaming of Vision-Language Models Via Hierarchical Strategy Exploration

The rapid advancement of Vision-Language Models VLMs has brought their safety vulnerabilities into sharp focus. However, existing red teaming methods are fundamentally constrained by an inherent linear exploration paradigm, confining them to optimizing within a predefined strategy set and...

5.8AI score
Exploits0
Rows per page
Query Builder