Disrupting Vision-Language Model-Driven Navigation Services Via Adversarial Object Fusion
We present Adversarial Object Fusion AdvOF, a novel attack framework targeting vision-and-language navigation VLN agents in service-oriented environments by generating adversarial 3D objects. While foundational models like Large Language Models LLMs and Vision Language Models VLMs have enhanced...