330 matches found
AI Model Extraction Attacks: Bypassing Single-Client Assumptions in Defenses
Ensuring the protection of Artificial Intelligence AI models deployed in military Command and Control C2 systems and critical infrastructure is essential for maintaining information superiority. Model Extraction Attacks MEAs pose a significant threat, as they enable adversaries to replicate...
From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents
Memory is a core component of AI agents, enabling them to accumulate knowledge across interactions and improve performance. However, persistent memory introduces the risk of memory poisoning, where a single adversarial memory write can exert long-term influence over agent behavior. We present a...
Investigating Detection and Obfuscation of Prompt Injection Attacks against Software Reverse Engineering AI Agents
Agentic software reverse engineering systems are vulnerable to prompt injection attacks placed into the source code of executable binary files. This research demonstrates defensive tactics for detecting the presences of prompt injection strings in the decompiler output of adversarial example...
Security of OpenClaw Agents: Fundamentals, Attacks, and Countermeasures
The rapid evolution of large language model LLM-driven autonomous agents has given rise to OpenClaw, a new class of open-source agent frameworks that operate as continuously running, skill-augmented systems with persistent memory, multi-channel interaction, and high degrees of autonomy. Such...
Security, Privacy, and Ethical Risks in OpenClaw
This paper systematically investigates the security, privacy, and ethical risks, as well as the traceability challenges of OpenClaw, a locally executable AI agent system for natural language interaction and real-world task completion. While OpenClaw shows strong potential for personal assistance,...
Fake malware-signing service Fox Tempest dismantled by Microsoft
Microsoft says it dismantled a malware-signing-as-a-service MSaaS called Fox Tempest, which helped cybercriminals make malware appear legitimate. The service let customers submit malicious files to be digitally signed with short-lived Microsoft-issued certificates, making the malware look...
The End of Trust: How Agentic AI Breaks Security Assumptions
For decades, the security of digital interaction has rested on an unacknowledged economic constraint. Attackers faced a tradeoff between the fidelity of a deception and the scale at which it could be deployed. Convincing impersonation required sustained human effort and was confined to a narrow s...
Backdoor Threats in Variational Quantum Circuits: Taxonomy, Attacks, and Defenses
Variational quantum algorithms VQAs are a central paradigm for noisy intermediate-scale NISQ quantum computing, yet their reliance on predesigned and pretrained variational quantum circuits VQCs introduces critical security vulnerabilities, particularly backdoor attacks. These attacks embed hidde...
Hard to Read, Easy to Jailbreak: How Visual Degradation Bypasses MLLM Safety Alignment
Recent advancements in visual context compression enable MLLMs to process ultra-long contexts efficiently by rendering text into images. However, we identify a critical vulnerability inherent to this paradigm: lowering image resolution inadvertently catalyzes jailbreaking. Our experiments reveal...
Stego Battlefield: Evaluating Image Steganography Attacks and Steganalysis Defenses
Image steganography is widely used to protect user privacy and enable covert communication. However, it can also be abused by the adversary as a covert channel to bypass content moderation, disseminate harmful semantics, and even hide malicious instructions in images to elicit dangerous outputs...
Revisiting JBShield: Breaking and Rebuilding Representation-Level Jailbreak Defenses
Defending large language models LLMs against jailbreak attacks, such as Greedy Coordinate Gradient GCG, remains a challenge, particularly under adaptive threat models where an attacker directly targets the defense mechanism. JBShield, a recent jailbreak defense with a 0% attack success rate in so...
Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration
Memory systems enable otherwise-stateless LLM agents to persist user information across sessions, but also introduce a new attack surface. We characterize the Trojan Hippo attack, a class of persistent memory attacks that operates in a more realistic threat model than prior memory poisoning work:...
Semantic Denial of Service in LLM-Controlled Robots
Safety-oriented instruction-following is supposed to keep LLM-controlled robots safe. We show it also creates an availability attack surface. By injecting short safety-plausible phrases 1-5 tokens into a robots audio channel, an adversary can trigger the models safety reasoning to halt or disrupt...
From Stateless Queries to Autonomous Actions: A Layered Security Framework for Agentic AI Systems
Agentic AI systems face security challenges that stateless large language models do not. They plan across extended horizons, maintain persistent memory, invoke external tools, and coordinate with peer agents. Existing security analyses organize threats by attack type prompt injection, jailbreakin...
EUVD-2026-25322
OpenClaw before 2026.3.31 contains a time-of-check-time-of-use vulnerability in sandbox file operations that allows attackers to bypass fd-based defenses. Attackers can exploit check-then-act patterns in applypatch, remove, and mkdir operations to manipulate files between validation and execution...
Containing a domain compromise: How predictive shielding shut down lateral movement
In this article 1. Predictive shielding overview 2. Attack chain overview 3. How predictive shielding changed the outcome 4. MITRE ATT&CK® techniques observed 5. Learn more In identity-based attack campaigns, any initial access activity can turn an already serious intrusion into a critical incide...
Primer on GitHub Actions Security - Threat Model, Attacks and Defenses (Part 1/2)
Understanding and defending your GitHub Actions - from threat model to security controls...
What Is Threat Hunting? A Complete Guide for Security Teams
What Is Threat Hunting? A Complete Guide for Security Teams Security tools catch a lot. They do not catch everything. Automated detection systems rely on known signatures, predefined rules, and behavioral baselines. Sophisticated adversaries know this and design their operations to slip through t...
Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions
Retrieval-augmented generation RAG significantly enhances large language models LLMs but introduces novel security risks through external knowledge access. While existing studies cover various RAG vulnerabilities, they often conflate inherent LLM risks with those specifically introduced by RAG. I...
Follow My Eyes: Backdoor Attacks on VLM-Based Scanpath Prediction
Scanpath prediction models forecast the sequence and timing of human fixations during visual search, driving foveated rendering and attention-based interaction in mobile systems where their integrity is a first-class security concern. We present the first study of backdoor attacks against VLM-bas...