20 matches found
MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models
Diffusion large language models dLLMs generate text by iteratively denoising partially masked sequences under bidirectional context, exposing a safety surface distinct from autoregressive LLMs. Because mask tokens are native inputs and tokens are committed by confidence rather than position,...
Babel: Jailbreaking Safety Attention Via Obfuscation Distribution Optimized Sampling
Despite rigorous safety alignment, Large Language Models LLMs remain vulnerable to jailbreak attacks. Existing black-box methods often rely on heuristic templates or exhaustive trials, lacking mechanistic interpretability and query efficiency. In this study, we investigate an intrinsic...
Targeted Adversarial Traffic Generation : Black-Box Approach to Evade Intrusion Detection Systems in IoT Networks
The integration of machine learning ML algorithms into Internet of Things IoT applications has introduced significant advantages alongside vulnerabilities to adversarial attacks, especially within IoT-based intrusion detection systems IDS. While theoretical adversarial attacks have been extensive...
Frequency Bias Matters: Diving into Robust and Generalized Deep Image Forgery Detection
As deep image forgery powered by AI generative models, such as GANs, continues to challenge today's digital world, detecting AI-generated forgeries has become a vital security topic. Generalizability and robustness are two critical concerns of a forgery detector, determining its reliability when...
"To Survive, I Must Defect": Jailbreaking LLMs Via the Game-Theory Scenarios
As LLMs become more common, non-expert users can pose risks, prompting extensive research into jailbreak attacks. However, most existing black-box jailbreak attacks rely on hand-crafted heuristics or narrow search spaces, which limit scalability. Compared with prior attacks, we propose Game-Theor...
JPRO: Automated Multimodal Jailbreaking Via Multi-Agent Collaboration Framework
The widespread application of large VLMs makes ensuring their secure deployment critical. While recent studies have demonstrated jailbreak attacks on VLMs, existing approaches are limited: they require either white-box access, restricting practicality, or rely on manually crafted patterns, leadin...
Spectral Masking and Interpolation Attack (SMIA): a Black-Box Adversarial Attack against Voice Authentication and Anti-Spoofing Systems
Voice Authentication Systems VAS use unique vocal characteristics for verification. They are increasingly integrated into high-security sectors such as banking and healthcare. Despite their improvements using deep learning, they face severe vulnerabilities from sophisticated threats like deepfake...
Leveraging Trustworthy AI for Automotive Security in Multi-Domain Operations: Towards a Responsive Human-AI Multi-Domain Task Force for Cyber Social Security
Multi-Domain Operations MDOs emphasize cross-domain defense against complex and synergistic threats, with civilian infrastructures like smart cities and Connected Autonomous Vehicles CAVs emerging as primary targets. As dual-use assets, CAVs are vulnerable to Multi-Surface Threats MSTs,...
Attacking Interpretable NLP Systems
Studies have shown that machine learning systems are vulnerable to adversarial examples in theory and practice. Where previous attacks have focused mainly on visual models that exploit the difference between human and machine perception, text-based models have also fallen victim to these attacks...
Vulnerability Disclosure through Adaptive Black-Box Adversarial Attacks on NIDS
Adversarial attacks, wherein slight inputs are carefully crafted to mislead intelligent models, have attracted increasing attention. However, a critical gap persists between theoretical advancements and practical application, particularly in structured data like network traffic, where...
IP Leakage Attacks Targeting LLM-Based Multi-Agent Systems
The rapid advancement of Large Language Models LLMs has led to the emergence of Multi-Agent Systems MAS to perform complex tasks through collaboration. However, the intricate nature of MAS, including their architecture and agent interactions, raises significant concerns regarding intellectual...
GenBreak: Red Teaming Text-To-Image Generators Using Large Language Models
Text-to-image T2I models such as Stable Diffusion have advanced rapidly and are now widely used in content creation. However, these models can be misused to generate harmful content, including nudity or violence, posing significant safety risks. While most platforms employ content moderation...
HauntAttack: When Attack Follows Reasoning As a Shadow
Emerging Large Reasoning Models LRMs consistently excel in mathematical and reasoning tasks, showcasing exceptional capabilities. However, the enhancement of reasoning abilities and the exposure of their internal reasoning processes introduce new safety vulnerabilities. One intriguing concern is:...
Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples
Adversarial detection protects models from adversarial attacks by refusing suspicious test samples. However, current detection methods often suffer from weak generalization: their effectiveness tends to degrade significantly when applied to adversarially trained models rather than naturally train...
Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
Membership inference attack MIA has become one of the most widely used and effective methods for evaluating the privacy risks of machine learning models. These attacks aim to determine whether a specific sample is part of the model's training set by analyzing the model's output. While traditional...
AdInject: Real-World Black-Box Attacks on Web Agents Via Advertising Delivery
Vision-Language Model VLM based Web Agents represent a significant step towards automating complex tasks by simulating human-like interaction with websites. However, their deployment in uncontrolled web environments introduces significant security vulnerabilities. Existing research on adversarial...
Token-Level Constraint Boundary Search for Jailbreaking Text-To-Image Models
Recent advancements in Text-to-Image T2I generation have significantly enhanced the realism and creativity of generated images. However, such powerful generative capabilities pose risks related to the production of inappropriate or harmful content. Existing defense mechanisms, including prompt...
Introduction and Application of Model Hacking
ARCHIVED STORY Introduction and Application of Model Hacking By Steve Povolny · Febraury 19, 2020 Catherine Huang, Ph.D., and Shivangee Trivedi contributed to this blog. The term “Adversarial Machine Learning” AML is a mouthful! The term describes a research field regarding the study and design o...
Introduction and Application of Model Hacking
ARCHIVED STORY Introduction and Application of Model Hacking By Steve Povolny · Febraury 19, 2020 Catherine Huang, Ph.D., and Shivangee Trivedi contributed to this blog. The term “Adversarial Machine Learning” AML is a mouthful! The term describes a research field regarding the study and design o...
KoffeyMaker: notebook vs. ATM
Despite CCTV and the risk of being caught by security staff, attacks on ATMs using a direct connection — so-called black box attacks — are still popular with cybercriminals. The main reason is the low "entry requirements" for would-be cyber-robbers: specialized sites offer both the necessary tool...