102 matches found
A Red Teaming Framework for Evaluating Robustness of AI-Enabled Security Orchestration, Automation, and Response Systems
AI-enabled Security Orchestration, Automation, and Response SOAR systems increasingly employ autonomous agents for cyber defense, yet their resilience to adaptive adversaries is underexplored. We introduce an autonomous red teaming framework that integrates large language models LLMs with...
Operationalizing Cybersecurity Governance for Mitigation Planning with Attack-Path Modeling and Reinforcement Learning
We address a fundamental challenge in cybersecurity operations of translating governance frameworks into actionable mitigation decisions under realistic resource constraints. Frameworks such as the NIST Cybersecurity Framework CSF provide widely adopted measures of organizational maturity, but do...
STARE: Step-Wise Temporal Alignment and Red-Teaming Engine for Multi-Modal Toxicity Attack
Red-teaming Vision-Language Models is essential for identifying vulnerabilities where adversarial image-text inputs trigger toxic outputs. Existing approaches treat image generation as a black box, returning only terminal toxicity scores and leaving open the question of when and how toxic semanti...
XekRung Technical Report
We present XekRung, a frontier large language model for cybersecurity, designed to provide comprehensive security capabilities. To achieve this, we develop diverse data synthesis pipelines tailored to the cybersecurity domain, enabling the scalable construction of high-quality training data and...
VeRL 权限许可和访问控制问题漏洞
VeRL is an open-source reinforcement learning framework developed by ByteDance, aimed at optimizing large model training and inference processes. Versions of VeRL prior to 0.7.0 contained vulnerabilities related to permission licensing and access control. These vulnerabilities stemmed from a...
Risk Models As Mediating Artifacts: A Postphenomenological Analysis of the CIIM Framework in Cybersecurity Practice
This article applies postphenomenological theory to the field of cybersecurity risk management, arguing that formal risk models function as mediating artifacts that shape how security practitioners or analysts perceive, interpret, and act on threats. Based on Don Ihde's taxonomy on human-technolo...
TL-RL-FusionNet: An Adaptive and Efficient Reinforcement Learning-Driven Transfer Learning Framework for Detecting Evolving Ransomware Threats
Modern ransomware exhibits polymorphic and evasive behaviors by frequently modifying execution patterns to evade detection. This dynamic nature disrupts feature spaces and limits the effectiveness of static or predefined models. To address this challenge, we propose TL-RL-FusionNet, a reinforceme...
Adaptive Instruction Composition for Automated LLM Red-Teaming
Many approaches to LLM red-teaming leverage an attacker LLM to discover jailbreaks against a target. Several of them task the attacker with identifying effective strategies through trial and error, resulting in a semantically limited range of successes. Another approach discovers diverse attacks ...
ARES: Adaptive Red-Teaming and End-To-End Repair of Policy-Reward System
Reinforcement Learning from Human Feedback RLHF is central to aligning Large Language Models LLMs, yet it introduces a critical vulnerability: an imperfect Reward Model RM can become a single point of failure when it fails to penalize unsafe behaviors. While existing red-teaming approaches...
Privacy-Aware Machine Unlearning with SISA for Reinforcement Learning-Based Ransomware Detection
Ransomware detection systems increasingly rely on behavior-based machine learning to address evolving attack strategies. However, emerging privacy compliance, data governance, and responsible AI deployment demand not only accurate detection but also the ability to efficiently remove the influence...
CSLE: A Reinforcement Learning Platform for Autonomous Security Management
Reinforcement learning is a promising approach to autonomous and adaptive security management in networked systems. However, current reinforcement learning solutions for security management are mostly limited to simulation environments and it is unclear how they generalize to operational systems...
Beyond Static Sandboxing: Learned Capability Governance for Autonomous AI Agents
Autonomous AI agents built on open-source runtimes such as OpenClaw expose every available tool to every session by default, regardless of the task. A summarization task receives the same shell execution, subagent spawning, and credential access capabilities as a code deployment task, a 15x...
Asking AI for personal advice is a bad idea, Stanford study shows
Stanford computer scientists just proved what therapists already suspected: AI chatbots will agree with almost anything you say to keep you happy. The researchers caught these systems validating dangerous decisions just to maintain user engagement. That's a worrying development, especially given...
Secure Reinforcement Learning: On Model-Free Detection of Man in the Middle Attacks
We consider the problem of learning-based man-in-the-middle MITM attacks in cyber-physical systems CPS, and extend our previously proposed Bellman Deviation Detection BDD framework for model-free reinforcement learning RL. We refine the standard MDP attack model by allowing the reward function to...
DeepXplain: XAI-Guided Autonomous Defense against Multi-Stage APT Campaigns
Advanced Persistent Threats APTs are stealthy, multi-stage attacks that require adaptive and timely defense. While deep reinforcement learning DRL enables autonomous cyber defense, its decisions are often opaque and difficult to trust in operational environments. This paper presents DeepXplain, a...
Cyber Deception for Mission Surveillance Via Hypergame-Theoretic Deep Reinforcement Learning
Unmanned Aerial Vehicles UAVs are valuable for mission-critical systems like surveillance, rescue, or delivery. Not surprisingly, such systems attract cyberattacks, including Denial-of-Service DoS attacks to overwhelm the resources of mission drones MDs. How can we defend UAV mission systems...
NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing
Penetration testing, the practice of simulating cyberattacks to identify vulnerabilities, is a complex sequential decision-making task that is inherently partially observable and features large action spaces. Training reinforcement learning RL policies for this domain faces a fundamental...
PISmith: Reinforcement Learning-Based Red Teaming for Prompt Injection Defenses
Prompt injection poses serious security risks to real-world LLM applications, particularly autonomous agents. Although many defenses have been proposed, their robustness against adaptive attacks remains insufficiently evaluated, potentially creating a false sense of security. In this work, we...
Adversarial Reinforcement Learning for Detecting False Data Injection Attacks in Vehicular Routing
In modern transportation networks, adversaries can manipulate routing algorithms using false data injection attacks, such as simulating heavy traffic with multiple devices running crowdsourced navigation applications, to mislead vehicles toward suboptimal routes and increase congestion. To addres...
SlowBA: An Efficiency Backdoor Attack Towards VLM-Based GUI Agents
Modern vision-language-model VLM based graphical user interface GUI agents are expected not only to execute actions accurately but also to respond to user instructions with low latency. While existing research on GUI-agent security mainly focuses on manipulating action correctness, the security...