CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions
The evaluation of Large Language Models LLMs for code generation relies heavily on the quality and robustness of test cases. However, existing benchmarks often lack coverage for subtle corner cases, allowing incorrect solutions to pass. To bridge this gap, we propose CodeHacker, an automated agen...