12 matches found
Assessing Automated Prompt Injection Attacks in Agentic Environments
Indirect prompt injection poses a critical threat to LLM agents that interact with untrusted external data, yet automated attack methods--proven effective for jailbreaking--remain underexplored in realistic agentic settings. We present a comprehensive empirical evaluation of automated prompt...
A No-Defense Defense against Gradient-Based Adversarial Attacks on ML-NIDS: Is Less More?
Gradient-based adversarial attacks subtly manipulate inputs of Machine Learning ML models to induce incorrect predictions. This paper investigates whether careful architectural choices alone can yield an inherently robust Deep Neural Network DNN-based Network Intrusion Detection Systems NIDS,...
Information-Dense Reasoning for Efficient and Auditable Security Alert Triage
Security Operations Centers face massive, heterogeneous alert streams under minute-level service windows, creating the Alert Triage Latency Paradox: verbose reasoning chains ensure accuracy and compliance but incur prohibitive latency and token costs, while minimal chains sacrifice transparency a...
Colliding with Adversaries at ECML-PKDD 2025 Adversarial Attack Competition 1st Prize Solution
This report presents the winning solution for Task 1 of Colliding with Adversaries: A Challenge on Robust Learning in High Energy Physics Discovery at ECML-PKDD 2025. The task required designing an adversarial attack against a provided classification model that maximizes misclassification while...
Foe for Fraud: Transferable Adversarial Attacks in Credit Card Fraud Detection
Credit card fraud detection CCFD is a critical application of Machine Learning ML in the financial sector, where accurately identifying fraudulent transactions is essential for mitigating financial losses. ML models have demonstrated their effectiveness in fraud detection task, in particular with...
PLA: Prompt Learning Attack against Text-To-Image Generative Models
Text-to-Image T2I models have gained widespread adoption across various applications. Despite the success, the potential misuse of T2I models poses significant risks of generating Not-Safe-For-Work NSFW content. To investigate the vulnerability of T2I models, this paper delves into adversarial...
Evaluating the Evaluators: Trust in Adversarial Robustness Tests
Despite significant progress in designing powerful adversarial evasion attacks for robustness verification, the evaluation of these methods often remains inconsistent and unreliable. Many assessments rely on mismatched models, unverified implementations, and uneven computational budgets, which ca...
Winter Soldier: Backdooring Language Models at Pre-Training with Indirect Data Poisoning
The pre-training of large language models LLMs relies on massive text datasets sourced from diverse and difficult-to-curate origins. Although membership inference attacks and hidden canaries have been explored to trace data usage, such methods rely on memorization of training data, which LM...
GradEscape: a Gradient-Based Evader against AI-Generated Text Detectors
In this paper, we introduce GradEscape, the first gradient-based evader designed to attack AI-generated text AIGT detectors. GradEscape overcomes the undifferentiable computation problem, caused by the discrete nature of text, by introducing a novel approach to construct weighted embeddings for t...
D2R: Dual Regularization Loss with Collaborative Adversarial Generation for Model Robustness
The robustness of Deep Neural Network models is crucial for defending models against adversarial attacks. Recent defense methods have employed collaborative learning frameworks to enhance model robustness. Two key limitations of existing methods are i insufficient guidance of the target model via...
Anti-Sensing: Defense against Unauthorized Radar-Based Human Vital Sign Sensing with Physically Realizable Wearable Oscillators
Recent advancements in Ultra-Wideband UWB radar technology have enabled contactless, non-line-of-sight vital sign monitoring, making it a valuable tool for healthcare. However, UWB radar's ability to capture sensitive physiological data, even through walls, raises significant privacy concerns,...
Input-Specific and Universal Adversarial Attack Generation for Spiking Neural Networks in the Spiking Domain
As Spiking Neural Networks SNNs gain traction across various applications, understanding their security vulnerabilities becomes increasingly important. In this work, we focus on the adversarial attacks, which is perhaps the most concerning threat. An adversarial attack aims at finding a subtle...