UCD: Unlearning in LLMs Via Contrastive Decoding
Machine unlearning aims to remove specific information, e.g. sensitive or undesirable content, from large language models LLMs while preserving overall performance. We propose an inference-time unlearning algorithm that uses contrastive decoding, leveraging two auxiliary smaller models, one train...