4 matches found
Temporal UI State Inconsistency in Desktop GUI Agents: Formalizing and Defending against TOCTOU Attacks on Computer-Use Agents
GUI agents that control desktop computers via screenshot-and-click loops introduce a new class of vulnerability: the observation-to-action gap (mean 6.51 s on real OSWorld workloads) creates a Time-Of-Check, Time-Of-Use (TOCTOU) window during which an unprivileged attacker can manipulate the UI state...
Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense
The rapid evolution of GUI-enabled agents has rendered traditional CAPTCHAs obsolete. While previous benchmarks like OpenCaptchaWorld established a baseline for evaluating multimodal agents, recent advancements in reasoning-heavy models, such as Gemini3-Pro-High and GPT-5.2-Xhigh, have effectively...
Realistic Environmental Injection Attacks on GUI Agents
GUI agents built on LVLMs are increasingly used to interact with websites. However, their exposure to open-world content makes them vulnerable to Environmental Injection Attacks (EIAs) that hijack agent behavior via webpage elements. Many recent studies assume the attacker to be a regular user who...
The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections
A Large Language Model (LLM)-powered GUI agent is a specialized autonomous system that performs tasks on the user's behalf according to high-level instructions. It does so by perceiving and interpreting the graphical user interfaces (GUIs) of relevant apps, often visually, inferring necessary sequenc...