NEXUS: Network Exploration for EXploiting Unsafe Sequences in Multi-Turn LLM Jailbreaks
Large Language Models LLMs have revolutionized natural language processing but remain vulnerable to jailbreak attacks, especially multi-turn jailbreaks that distribute malicious intent across benign exchanges and bypass alignment mechanisms. Existing approaches often explore the adversarial space...