RLCracker: Exposing the Vulnerability of LLM Watermarks with Adaptive RL Attacks
Large Language Models LLMs watermarking has shown promise in detecting AI-generated content and mitigating misuse, with prior work claiming robustness against paraphrasing and text editing. In this paper, we argue that existing evaluations are not sufficiently adversarial, obscuring critical...