2 matches found
Jailbreak Distillation: Renewable Safety Benchmarking
Large language models (LLMs) are rapidly deployed in critical applications, raising urgent needs for robust safety benchmarking. We propose Jailbreak Distillation (JBDistill), a novel benchmark construction framework that "distills" jailbreak attacks into high-quality and easily-updatable safety...
GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models
We propose the first unified adversarial attack benchmark for Genomic Foundation Models (GFMs), named GenoArmory. Unlike existing GFM benchmarks, GenoArmory offers the first comprehensive evaluation framework to systematically assess the vulnerability of GFMs to adversarial attacks. Methodologicall...