RedTWIZ: Diverse LLM Red Teaming Via Adaptive Attack Planning
This paper presents the vision, scientific contributions, and technical details of RedTWIZ: an adaptive and diverse multi-turn red teaming framework, to audit the robustness of Large Language Models LLMs in AI-assisted software development. Our work is driven by three major research streams: 1...