P2P: A Poison-To-Poison Remedy for Reliable Backdoor Defense in LLMs
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability and trustworthiness. However, existing defense strategies suffer from limited generalization: they work only on specific attack types or task settings...