685 matches found
Benchmarking Large Language Models for IoC Recovery under Adversarial Code Obfuscation and Encryption
Software obfuscation and encryption present persistent challenges for program comprehension and security analysis, particularly when adversaries conceal Indicators of Compromise IoCs such as IP addresses within source code. While Large Language Models LLMs have recently demonstrated remarkable...
Evaluating the Reliability of Multiple Large Language Models in Risk Assessment: A CIS Controls Based Approach
Proper implementation of technical and administrative controls reinforces an organization's cybersecurity posture and business resilience, reduces risks, and enhances governance, ultimately elevating business maturity. The dynamics of the technological landscape and emerging threats negatively...
SOCpilot: Verifying Policy Compliance for LLM-Assisted Incident Response
Security operations centers SOCs are beginning to use large language models LLMs as copilots to draft incident-response plans. These plans may include actions that are valid per the catalog but still violate mandatory steps, required ordering, or approval gates before analyst review. SOCpilot mak...
Information Theoretic Adversarial Training of Large Language Models
Large language models LLMs remain vulnerable to adversarial prompting despite advances in alignment and safety, often exhibiting harmful behaviors under novel attack strategies. While adversarial training can improve robustness, existing approaches are computationally expensive and difficult to...
AFL-ICP: Enhancing Industrial Control Protocol Reliability Via Specification-Guided Fuzzing
Industrial Control Protocols ICPs are critical to the reliability and stability of industrial infrastructure, yet their security is fundamentally compromised by a specification-blindness bottleneck. Modern fuzzers, constrained by observation-driven inference, struggle to penetrate deep protocol...
Automation-Exploit-Legacy
Automation-Exploit Legacy Prototype This repository contain...
This Week in Spring - May 5th, 2026
Hi, Spring fans! Welcome to another installment of This Week in Spring! It's May 5th, 2026, and I'm in Mainz, Germany, for the legendary JAX conference! It's been infinitely far too long since I've been at this amazing show, and I'm oh-so happy to be back here! Tonight, after my two talks here, I...
Eval Injection
Overview pptagent is an An Agentic Framework for Reflective PowerPoint Generation Affected versions of this package are vulnerable to Eval Injection via the eval function when processing code generated by large language models with built-in functions available in the execution scope. An attacker...
STARE: Step-Wise Temporal Alignment and Red-Teaming Engine for Multi-Modal Toxicity Attack
Red-teaming Vision-Language Models is essential for identifying vulnerabilities where adversarial image-text inputs trigger toxic outputs. Existing approaches treat image generation as a black box, returning only terminal toxicity scores and leaving open the question of when and how toxic semanti...
Trident: Improving Malware Detection with LLMs and Behavioral Features
Traditionally, machine learning methods for PE malware detection have relied on static features like byte histograms, string information, and PE header contents. One barrier to incorporating dynamic analysis features has been the semi-structured nature of sandbox behavior reports. We show that,...
Towards Agentic Investigation of Security Alerts
Security analysts are overwhelmed by the volume of alerts and the low context provided by many detection systems. Early-stage investigations typically require manual correlation across multiple log sources, a task that is usually time-consuming. In this paper, we present an experimental, agentic...
From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native Enterprise Systems
Enterprise software engineering is shifting away from deterministic CRUD/REST architectures toward AI-native systems where large language models act as cognitive orchestrators. This transition introduces a critical security tension: probabilistic LLMs weaken classical mechanisms for validation,...
Logic-to-Code Execution via Indirect Prompt Injection
This document explores a critical architectural vulnerability in Large Language Model LLM implementations, specifically within Command Line Interface CLI tools and automated agentic workflows. The research demonstrates how the absence of separation between the control plane instructions and the...
Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets
This paper presents a system combining symbolic execution KLEE with a 4-agent multi-LLM architecture for detecting memory vulnerabilities in Rust unsafe code. A central challenge we address is the incomplete-code problem: CVE database entries provide only isolated code snippets that lack struct...
Evaluation of Prompt Injection Defenses in Large Language Models
LLM-powered applications routinely embed secrets in system prompts, yet models can be tricked into revealing them. We built an adaptive attacker that evolves its strategies over hundreds of rounds and tested it against nine defense configurations across more than 20,000 attacks. Every defense tha...
EUVD-2026-25333
OpenClaw before 2026.3.28 contains an agentic consent bypass vulnerability allowing LLM agents to silently disable execution approval via config.patch parameter. Remote attackers can exploit this to bypass security controls and execute unauthorized operations without user consent...
Flowise Information Disclosure Vulnerability
Flowise is a FlowiseAI open source tool for easily building LLM applications. Flowise suffers from an information disclosure vulnerability caused by a flaw in the /api/v1/public-chatflows/:id endpoint that can be exploited by an attacker to obtain sensitive information...
Important: Red Hat Security Advisory: Red Hat Enterprise Linux AI 3.3.1
Red Hat Enterprise Linux AI 3.3.1 is now available. Red Hat® Enterprise Linux® AI is a foundation model platform to seamlessly develop, test, and run Granite family large language models LLMs for enterprise applications...
Important: Red Hat Security Advisory: Red Hat Enterprise Linux AI 3.3.1
Red Hat Enterprise Linux AI 3.3.1 is now available. Red Hat® Enterprise Linux® AI is a foundation model platform to seamlessly develop, test, and run Granite family large language models LLMs for enterprise applications...
AutoRISE: Agent-Driven Strategy Evolution for Red-Teaming Large Language Models
Automated red-teaming methods for large language models typically optimize attack prompts within a fixed, human-designed strategy, leaving the attack strategy itself unchanged. We instead optimize the strategy. We propose AutoRISE, a method that searches over executable attack programs rather tha...