EASE: Practical and Efficient Safety Alignment for Small Language Models
Small language models (SLMs) are increasingly deployed on edge devices, making their safety alignment crucial yet challenging. Current shallow alignment methods that rely on direct refusal of malicious queries fail to provide robust protection, particularly against adversarial jailbreaks. While...