VulnRepairEval: an Exploit-Based Evaluation Framework for Assessing Large Language Model Vulnerability Repair Capabilities
The adoption of Large Language Models LLMs for automated software vulnerability patching has shown promising outcomes on carefully curated evaluation sets. Nevertheless, existing datasets predominantly rely on superficial validation methods rather than exploit-based verification, leading to...