172 matches found
MAStrike: Shapley-Guided Collusive Red-Teaming on Multi-Agent Systems
Hierarchical multi-agent systems MAS are rapidly being deployed in high-stakes workflows across domains such as finance and software engineering. In these systems, safety and security are inherently distributed across role-specialized agents, significantly expanding the attack surface, particular...
InjectV: Modeling Fault Injection Attacks in RISC-V Simulation Environment
Fault Injection Attacks FIAs are a significant threat to hardware security, capable of compromising systems by inducing malicious faults in computation or storage. Evaluating resilience against such attacks is challenging due to the high cost, complexity, and limited availability of physical faul...
ai.ancf.lmos-router:benchmarks (>=0.2.0 <=0.28.0), ai.ancf.lmos-router:lmos-router-hybrid (>=0.2.0 <=0.28.0) +12787 more potentially affected by CVE-2026-45536 via io.netty:netty-transport-native-epoll (>=4.0.21.Final <=4.1.134.Final)
io.netty:netty-transport-native-epoll MAVEN version =4.0.21.Final, =0.2.0, =0.2.0, =0.2.0, =0.2.0, =0.2.0, =0.2.0, =0.1.1, =0.1.1, =0.1.1, =0.0.4, =0.6.0 - ai.ancf.lmos:lmos-router-hybrid =0.1.0 - ai.ancf.lmos:lmos-router-hybrid-spring-boot-starter =0.1.0 - ai.ancf.lmos:lmos-router-llm =0.1.0 -...
Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops
Agent benchmarks score submissions with outcome verifiers that are typically hand-written and brittle, leaving them open to reward hacking. We audit 1,968 tasks across five terminal-agent benchmarks and find 323 16% hackable by frontier models given only the task description. This corrupts both...
MOLOT System Card: Malicious Operational Logic Observation Transformer
MOLOT Malicious Operational Logic Observation Transformer is a static malicious-code detection system designed for SAST setup where package metadata, maintainer history, and dynamic execution traces may be unavailable or unreliable. The system represents source code as behavior sequences derived...
MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models
Diffusion large language models dLLMs generate text by iteratively denoising partially masked sequences under bidirectional context, exposing a safety surface distinct from autoregressive LLMs. Because mask tokens are native inputs and tokens are committed by confidence rather than position,...
Gate AI: LLM Security Benchmark Evaluation Methodology and Results
Published evaluations of prompt-injection and jailbreak detectors for Large Language Models often suffer from two systematic weaknesses: per-dataset threshold tuning and undisclosed operating points. We describe an evaluation harness that addresses both. The detector under evaluation is scored...
SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents
Autonomous LLM agents increasingly operate in stateful environments where they access tools, files, memory, and external services. While such capabilities enable complex real-world workflows, they also introduce security risks that are difficult to capture with existing evaluations. Current agent...
On AI Security
Good report: Executive Summary: Let's say you wanted to make sure that your AI is secure. Can you just maximize the security and privacy benchmark and call it a day? Nope, because benchmarks don't actually work for measuring AI capabilities even when they are NOT emergent systemic properties like...
Backchaining Loss of Control Mitigations from Mission-Specific Benchmarks in National Security
Affordances and permissions are promising and timely safety levers for mitigating Loss of Control LoC threats in high-stakes deployment contexts, such as national security. Deployers in defense and intelligence could rely on several approaches to identify which affordances and permissions should ...
MemRepair: Hierarchical Memory for Agentic Repository-Level Vulnerability Repair
Modern software ecosystems face a rapidly growing number of disclosed vulnerabilities, increasing the need for automated repair techniques that can operate reliably at repository scale. Although Large Language Model LLM-based agents have recently shown promise for automated vulnerability repair...
Context-Aware Entity-Relation Extraction for Threat Intelligence Knowledge Graphs
Cybersecurity Knowledge Graphs CKGs unify diverse Cyber Threat Intelligence CTI sources into structured, queryable formats, offering scalable solutions for automating proactive and real-time security responses. Their increasing adoption has significantly enhanced the workflow and decision-making...
Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack
Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in frontier models without overfitting. We argue tha...
ai.agentican:agentican-framework-core (>=0.1.0-alpha.2 <=0.1.0-alpha.4), ai.agentican:agentican-quarkus-deployment (>=0.1.0-alpha.1 <=0.1.0-alpha.4) +23724 more potentially affected by CVE-2026-42585 via io.netty:netty-codec-http (>=4.0.0.Alpha1 <=4.1.132.Final)
io.netty:netty-codec-http MAVEN version =4.0.0.Alpha1, =0.1.0-alpha.2, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.1, =0.1.0-alpha.3, =0.1.0-alpha.2, =0.1.0, =0.1.0, =0.2.0, =0.2.0, =0.28.0 and more Source cves:...
GLiNER Guard: Unified Encoder Family for Production LLM Safety and Privacy
Production LLM systems require both safety moderation and PII detection under strict latency and cost constraints. This creates a trade-off: autoregressive moderators are accurate but expensive, while lightweight encoders are faster but less capable. We present GLiNER Guard GLiGuard, a unified...
XekRung Technical Report
We present XekRung, a frontier large language model for cybersecurity, designed to provide comprehensive security capabilities. To achieve this, we develop diverse data synthesis pipelines tailored to the cybersecurity domain, enabling the scalable construction of high-quality training data and...
From Stateless Queries to Autonomous Actions: A Layered Security Framework for Agentic AI Systems
Agentic AI systems face security challenges that stateless large language models do not. They plan across extended horizons, maintain persistent memory, invoke external tools, and coordinate with peer agents. Existing security analyses organize threats by attack type prompt injection, jailbreakin...
Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture the Flag Challenges
Large Language Model LLM agents are increasingly proposed for autonomous cybersecurity tasks, but their capabilities in realistic offensive settings remain poorly understood. We present DeepRed, an open-source benchmark for evaluating LLM-based agents on realistic Capture The Flag CTF challenges ...
ARES: Adaptive Red-Teaming and End-To-End Repair of Policy-Reward System
Reinforcement Learning from Human Feedback RLHF is central to aligning Large Language Models LLMs, yet it introduces a critical vulnerability: an imperfect Reward Model RM can become a single point of failure when it fails to penalize unsafe behaviors. While existing red-teaming approaches...
Swiss-Bench 003: Evaluating LLM Reliability and Adversarial Security for Swiss Regulatory Contexts
The deployment of large language models LLMs in Swiss financial and regulatory contexts demands empirical evidence of both production reliability and adversarial security, dimensions not jointly operationalized in existing Swiss-focused evaluation frameworks. This paper introduces Swiss-Bench 003...