2 matches found
Evolving Skill-Structured Attack Memory Enhances LLM Jailbreaking
Jailbreak attacks on large language models LLMs aim to induce LLMs to produce content that they are expected to refuse. Automated black-box jailbreak generation is especially important for safety evaluation, where the attacker observes only model outputs and needs to automatically search for...
evolver
🧬 Evolver !GitHub starshttps://img.shields.io/github/star...