Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models
Large language models (LLMs) remain vulnerable to sophisticated prompt engineering attacks that exploit contextual framing to bypass safety mechanisms, posing significant risks in cybersecurity applications. We introduce Jailbreak Mimicry, a systematic methodology for training compact attacker mode...