12 matches found
Separating Secrets from Placeholders: A Hybrid CNN-CodeBERT Framework for Three-Class Credential Leakage Detection
Credential leakage in public source code repositories poses a critical security threat, with over 23.8 million secrets exposed in 2024 alone. Existing detection tools suffer from high false-positive rates because rigid pattern matching and binary classification schemes fail to distinguish genuine...
Weaponizing the Commons: A Taxonomy and Detection Framework of Abuse on GitHub
GitHub plays a critical role in modern software supply chains, making its security an important research concern. Existing studies have primarily focused on CI/CD automation, collaboration patterns, and community management, while abuse behaviors on GitHub have received little systematic...
A Synthetic Conversational Smishing Dataset for Social Engineering Detection
Smishing (SMS phishing) has become a serious cybersecurity threat, especially for elderly and cyber-unaware individuals, causing financial loss and undermining user trust. Although prior work has focused on detecting smishing at the level of individual messages, real-world attackers often rely on...
Learning the APT Kill Chain: Temporal Reasoning over Provenance Data for Attack Stage Estimation
Advanced Persistent Threats (APTs) evolve through multiple stages, each exhibiting distinct temporal and structural behaviors. Accurate stage estimation is critical for enabling adaptive cyber defense. This paper presents StageFinder, a temporal graph learning framework for multi-stage attack...
SafePickle: Robust and Generic ML Detection of Malicious Pickle-Based ML Models
Model repositories such as Hugging Face increasingly distribute machine learning artifacts serialized with Python's pickle format, exposing users to remote code execution (RCE) risks during model loading. Recent defenses, such as PickleBall, rely on per-library policy synthesis that requires comple...
Hydra: Robust Hardware-Assisted Malware Detection
Malware detection using Hardware Performance Counters (HPCs) offers a promising, low-overhead approach for monitoring program behavior. However, a fundamental architectural constraint, that only a limited number of hardware events can be monitored concurrently, creates a significant bottleneck,...
Retrieval-Augmented Few-Shot Prompting Versus Fine-Tuning for Code Vulnerability Detection
Few-shot prompting has emerged as a practical alternative to fine-tuning for leveraging the capabilities of large language models (LLMs) in specialized tasks. However, its effectiveness depends heavily on the selection and quality of in-context examples, particularly in complex domains. In this wor...
CLASP: Cost-Optimized LLM-Based Agentic System for Phishing Detection
Phishing websites remain a significant cybersecurity threat, necessitating accurate and cost-effective detection mechanisms. In this paper, we present CLASP, a novel system that effectively identifies phishing websites by leveraging multiple intelligent agents, built using large language models...
Robust Federated Learning with Confidence-Weighted Filtering and GAN-Based Completion under Noisy and Incomplete Data
Federated learning (FL) presents an effective solution for collaborative model training while maintaining data privacy across decentralized client datasets. However, data quality issues such as noisy labels, missing classes, and imbalanced distributions significantly challenge its effectiveness. Th...
Machine Learning-Based Detection of DDoS Attacks in VANETs for Emergency Vehicle Communication
Vehicular Ad Hoc Networks (VANETs) play a key role in Intelligent Transportation Systems (ITS), particularly in enabling real-time communication for emergency vehicles. However, Distributed Denial of Service (DDoS) attacks, which interfere with safety-critical communication channels, can severely impai...
Hybrid Privacy Policy-Code Consistency Check Using Knowledge Graphs and LLMs
Growing concern over user privacy misuse has accelerated research into checking consistency between smartphone apps' declared privacy policies and their actual behaviors. Recent advances in Large Language Models (LLMs) have introduced promising techniques for semantic comparison, but these...
DYNAMITE: Dynamic Defense Selection for Enhancing Machine Learning-Based Intrusion Detection against Adversarial Attacks
The rapid proliferation of the Internet of Things (IoT) has introduced substantial security vulnerabilities, highlighting the need for robust Intrusion Detection Systems (IDS). Machine learning-based intrusion detection systems (ML-IDS) have significantly improved threat detection capabilities; however...