2 matches found
Evaluating and Enhancing the Vulnerability Reasoning Capabilities of Large Language Models
Large Language Models LLMs have demonstrated remarkable proficiency in vulnerability detection. However, a critical reliability gap persists: models frequently yield correct detection verdicts based on hallucinated logic or superficial patterns that deviate from the actual root cause. This...
The Semantic Trap: Do Fine-Tuned LLMs Learn Vulnerability Root Cause or Just Functional Pattern?
LLMs demonstrate promising performance in software vulnerability detection after fine-tuning. However, it remains unclear whether these gains reflect a genuine understanding of vulnerability root causes or merely an exploitation of functional patterns. In this paper, we identify a critical failur...