5 matches found
Attention Is Where You Attack
Safety-aligned large language models rely on RLHF and instruction tuning to refuse harmful requests, yet the internal mechanisms implementing safety behavior remain poorly understood. We introduce the Attention Redistribution Attack ARA, a white-box adversarial attack that identifies...
Physical ID-Transfer Attacks against Multi-Object Tracking Via Adversarial Trajectory
Multi-Object Tracking MOT is a critical task in computer vision, with applications ranging from surveillance systems to autonomous driving. However, threats to MOT algorithms have yet been widely studied. In particular, incorrect association between the tracked objects and their assigned IDs can...
Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples
Adversarial detection protects models from adversarial attacks by refusing suspicious test samples. However, current detection methods often suffer from weak generalization: their effectiveness tends to degrade significantly when applied to adversarially trained models rather than naturally train...
Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
Membership inference attack MIA has become one of the most widely used and effective methods for evaluating the privacy risks of machine learning models. These attacks aim to determine whether a specific sample is part of the model's training set by analyzing the model's output. While traditional...
adversarial-attacks-white-black-box (=0.1.7) potentially affected by CVE-2025-25302 via rembg (=2.0.57)
rembg PYPI version =2.0.57 is affected by a known vulnerability. The following packages have a transitive dependency on rembg and may be impacted: - adversarial-attacks-white-black-box =0.1.7 Source cves: CVE-2025-25302 Source advisory: OSV:PYSEC-2025-25...