Screen Hijack: Visual Poisoning of VLM Agents in Mobile Environments
With the growing integration of vision-language models (VLMs), mobile agents are now widely used for tasks such as UI automation and camera-based user assistance. These agents are often fine-tuned on limited user-generated datasets, leaving them vulnerable to covert threats during the training process...