Code Agent Can Be an End-To-End System Hacker: Benchmarking Real-World Threats of Computer-Use Agent
Computer-use agent (CUA) frameworks, powered by large language models (LLMs) or multimodal LLMs (MLLMs), are rapidly maturing as assistants that can perceive context, reason, and act directly within software environments. Among their most critical applications is operating system (OS) control. As CUAs in...