12 matches found
Context-Based Adversarial Attacks on AI Code Generators: Vulnerability Analysis and Implications
AI-powered code generation systems have transformed software development but introduce critical inference-time security vulnerabilities. This research presents a systematic investigation of context-based adversarial attacks, where strategically crafted contextual inputs, including comments,...
Don't Let the Claw Grip Your Hand: A Security Analysis and Defense Framework for OpenClaw
Code agents powered by large language models can execute shell commands on behalf of users, introducing severe security vulnerabilities. This paper presents a two-phase security analysis of the OpenClaw platform. As an open-source AI agent framework that operates locally, OpenClaw can be integrat...
Recursive Language Models for Jailbreak Detection: A Procedural Defense for Tool-Augmented Agents
Jailbreak prompts are a practical and evolving threat to large language models (LLMs), particularly in agentic systems that execute tools over untrusted content. Many attacks exploit long-context hiding, semantic camouflage, and lightweight obfuscations that can evade single-pass guardrails. We...
Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-Based Phishing Detection
Phishing sites continue to grow in volume and sophistication. Recent work leverages large language models (LLMs) to analyze URLs, HTML, and rendered content to decide whether a website is a phishing site. While these approaches are promising, LLMs are inherently vulnerable to prompt injection (PI)...
Securing AI Agents against Prompt Injection Attacks
Retrieval-augmented generation (RAG) systems have become widely used for enhancing large language model capabilities, but they introduce significant security vulnerabilities through prompt injection attacks. We present a comprehensive benchmark for evaluating prompt injection risks in RAG-enabled A...
Evidence of Cognitive Biases in Capture-The-Flag Cybersecurity Competitions
Understanding how cognitive biases influence adversarial decision-making is essential for developing effective cyber defenses. Capture-the-Flag (CTF) competitions provide an ecologically valid testbed to study attacker behavior at scale, simulating real-world intrusion scenarios under pressure. We...
Adversarial Bug Reports As a Security Risk in Language Model-Based Automated Program Repair
Large Language Model (LLM)-based Automated Program Repair (APR) systems are increasingly integrated into modern software development workflows, offering automated patches in response to natural language bug reports. However, this reliance on untrusted user input introduces a novel and underexplored...
PromptSleuth: Detecting Prompt Injection Via Semantic Intent Invariance
Large Language Models (LLMs) are increasingly integrated into real-world applications, from virtual assistants to autonomous agents. However, their flexibility also introduces new attack vectors, particularly Prompt Injection (PI), where adversaries manipulate model behavior through crafted inputs. As...
Addressing the Devastating Effects of Single-Task Data Poisoning in Exemplar-Free Continual Learning
Our research addresses the overlooked security concerns related to data poisoning in continual learning (CL). Data poisoning, the intentional manipulation of training data to affect the predictions of machine learning models, was recently shown to be a threat to CL training stability. While...
Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models
The protection of cyber Intellectual Property (IP) such as web content is an increasingly critical concern. The rise of large language models (LLMs) with online retrieval capabilities enables convenient access to information but often undermines the rights of original content creators. As users...
Test-Time Immunization: a Universal Defense Framework against Jailbreaks for (Multimodal) Large Language Models
While multimodal large language models (LLMs) have attracted widespread attention due to their exceptional capabilities, they remain vulnerable to jailbreak attacks. Various defense methods have been proposed against jailbreak attacks; however, they are often tailored to specific types of...
Sparsification under Siege: Defending against Poisoning Attacks in Communication-Efficient Federated Learning
Federated Learning (FL) enables collaborative model training across distributed clients while preserving data privacy, yet it faces significant challenges in communication efficiency and vulnerability to poisoning attacks. While sparsification techniques mitigate communication overhead by...