Adversarial Suffix Filtering: a Defense Pipeline for LLMs
Large Language Models LLMs are increasingly embedded in autonomous systems and public-facing environments, yet they remain susceptible to jailbreak vulnerabilities that may undermine their security and trustworthiness. Adversarial suffixes are considered to be the current state-of-the-art...