5 matches found
Learn from Your Mistakes: Tree-Like Self-Play for Secure Code LLMs
While Large Language Models LLMs excel in code generation, they remain prone to replicating subtle yet critical vulnerabilities endemic to their training data. Current alignment techniques, such as Supervised Fine-Tuning SFT and Reinforcement Learning RL, typically apply coarse-grained optimizati...
PoC-Adapt: Semantic-Aware Automated Vulnerability Reproduction with LLM Multi-Agents and Reinforcement Learning-Driven Adaptive Policy
While recent approaches leverage large language models LLMs and multi-agent pipelines to automatically generate proof-of-concept PoC exploits from vulnerability reports, existing systems often suffer from two fundamental limitations: unreliable validation based on surface-level execution signals...
Explainable Autonomous Cyber Defense Using Adversarial Multi-Agent Reinforcement Learning
Autonomous agents are increasingly deployed in both offensive and defensive cyber operations, creating high-speed, closed-loop interactions in critical infrastructure environments. Advanced Persistent Threat APT actors exploit "Living off the Land" techniques and targeted telemetry perturbations ...
VULPO: Context-Aware Vulnerability Detection Via On-Policy LLM Optimization
The widespread reliance on open-source software dramatically increases the risk of vulnerability exploitation, underscoring the need for effective and scalable vulnerability detection VD. Existing VD techniques, whether traditional machine learning-based or LLM-based approaches like prompt...
Secure Time-Modulated Intelligent Reflecting Surface via Generative Flow Networks
We propose a novel directional modulation DM design for OFDM transmitters aided by a time-modulated intelligent reflecting surface TM-IRS. The TM-IRS is configured to preserve the integrity of transmitted signals toward multiple legitimate users while scrambling the signal in all other directions...