2 matches found
SafeClaw-R: Towards Safe and Secure Multi-Agent Personal Assistants
LLM-based multi-agent systems (MASs) are transforming personal productivity by autonomously executing complex, cross-platform tasks. Frameworks such as OpenClaw demonstrate the potential of locally deployed agents integrated with personal data and services, but this autonomy introduces significant...
A Red Teaming Roadmap Towards System-Level Safety
Large Language Model (LLM) safeguards, which implement request refusals, have become a widely adopted mitigation strategy against misuse. At the intersection of adversarial machine learning and AI safety, safeguard red teaming has effectively identified critical vulnerabilities in state-of-the-art...