Jailbreak Foundry: From Papers to Runnable Attacks for Reproducible Benchmarking
Jailbreak techniques for large language models (LLMs) evolve faster than benchmarks, making robustness estimates stale and difficult to compare across papers due to drift in datasets, harnesses, and judging protocols. We introduce JAILBREAK FOUNDRY (JBF), a system that addresses this gap via a...