3 matches found
VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection
We present VulStyle, a multi-modal software vulnerability detection model that jointly encodes function-level source code, non-terminal Abstract Syntax Tree AST structure, and code stylometry CStyle features. Prior work in code representation primarily leverages token-level models or full AST...
APT-ClaritySet: A Large-Scale, High-Fidelity Labeled Dataset for APT Malware with Alias Normalization and Graph-Based Deduplication
Large-scale, standardized datasets for Advanced Persistent Threat APT research are scarce, and inconsistent actor aliases and redundant samples hinder reproducibility. This paper presents APT-ClaritySet and its construction pipeline that normalizes threat actor aliases reconciling approximately...
MalVis: a Large-Scale Image-Based Framework and Dataset for Advancing Android Malware Classification
As technology advances, Android malware continues to pose significant threats to devices and sensitive data. The open-source nature of the Android OS and the availability of its SDK contribute to this rapid growth. Traditional malware detection techniques, such as signature-based, static, and...