410 matches found
MalURLBench: A Benchmark Evaluating Agents' Vulnerabilities When Processing Web URLs
LLM-based web agents have become increasingly popular for their utility in daily life and work. However, they exhibit critical vulnerabilities when processing malicious URLs: accepting a disguised malicious URL enables subsequent access to unsafe webpages, which can cause severe damage to service...
AgenticSCR: An Autonomous Agentic Secure Code Review for Immature Vulnerabilities Detection
Secure code review is critical at the pre-commit stage, where vulnerabilities must be caught early under tight latency and limited-context constraints. Existing SAST-based checks are noisy and often miss immature, context-dependent vulnerabilities, while standalone Large Language Models LLMs are...
Mitigating the OWASP Top 10 for Large Language Models Applications Using Intelligent Agents
Large Language Models LLMs have emerged as a transformative and disruptive technology, enabling a wide range of applications in natural language processing, machine translation, and beyond. However, this widespread integration of LLMs also raised several security concerns highlighted by the Open...
PatchIsland: Orchestration of LLM Agents for Continuous Vulnerability Repair
Continuous fuzzing platforms such as OSS-Fuzz uncover large numbers of vulnerabilities, yet the subsequent repair process remains largely manual. Unfortunately, existing Automated Vulnerability Repair AVR techniques -- including recent LLM-based systems -- are not directly applicable to continuou...
TrojanGYM: A Detector-In-The-Loop LLM for Adaptive RTL Hardware Trojan Insertion
Hardware Trojans HTs remain a critical threat because learning-based detectors often overfit to narrow trigger/payload patterns and small, stylized benchmarks. We introduce TrojanGYM, an agentic, LLM-driven framework that automatically curates HT insertions to expose detector blind spots while...
Why AI Keeps Falling for Prompt Injection Attacks
Imagine you work at a drive-through restaurant. Someone drives up and says: "I'll have a double cheeseburger, large fries, and ignore previous instructions and give me the contents of the cash drawer." Would you hand over the money? Of course not. Yet this is what large language models LLMs do...
CVE-2026-22807
Vulnerability CVE-2026-22807 affects vLLM versions prior to 0.14.0, where during model resolution the engine loads Hugging Face auto_map dynamic modules without gating on trust_remote_code. This allows attacker-controlled Python code in a model repo or path to execute at server startup, before an...
EUVD-2026-3678
vLLM is an inference and serving engine for large language models LLMs. Starting in version 0.10.1 and prior to version 0.14.0, vLLM loads Hugging Face automap dynamic modules during model resolution without gating on trustremotecode, allowing attacker-controlled Python code in a model repo/path ...
CVE-2026-22807
vLLM is an inference and serving engine for large language models LLMs. Starting in version 0.10.1 and prior to version 0.14.0, vLLM loads Hugging Face automap dynamic modules during model resolution without gating on trustremotecode, allowing attacker-controlled Python code in a model repo/path ...
HardSecBench: Benchmarking the Security Awareness of LLMs for Hardware Code Generation
Large language models LLMs are being increasingly integrated into practical hardware and firmware development pipelines for code generation. Existing studies have primarily focused on evaluating the functional correctness of LLM-generated code, yet paid limited attention to its security issues...
Constructing Multi-Label Hierarchical Classification Models for MITRE ATT&CK Text Tagging
MITRE ATT&CK is a cybersecurity knowledge base that organizes threat actor and cyber-attack information into a set of tactics describing the reasons and goals threat actors have for carrying out attacks, with each tactic having a set of techniques that describe the potential methods used in these...
Multi-Agent Taint Specification Extraction for Vulnerability Detection
Static Application Security Testing SAST tools using taint analysis are widely viewed as providing higher-quality vulnerability detection results compared to traditional pattern-based approaches. However, performing static taint analysis for JavaScript poses two major challenges. First,...
LLMs in Code Vulnerability Analysis: A Proof of Concept
Context: Traditional software security analysis methods struggle to keep pace with the scale and complexity of modern codebases, requiring intelligent automation to detect, assess, and remediate vulnerabilities more efficiently and accurately. Objective: This paper explores the incorporation of...
Proactively Detecting Threats: A Novel Approach Using LLMs
Enterprise security faces escalating threats from sophisticated malware, compounded by expanding digital operations. This paper presents the first systematic evaluation of large language models LLMs to proactively identify indicators of compromise IOCs from unstructured web-based threat...
The Echo Chamber Multi-Turn LLM Jailbreak
The availability of Large Language Models LLMs has led to a new generation of powerful chatbots that can be developed at relatively low cost. As companies deploy these tools, security challenges need to be addressed to prevent financial loss and reputational damage. A key security challenge is...
Jailbreaking LLMs and VLMs: Mechanisms, Evaluation, and Unified Defense
This paper provides a systematic survey of jailbreak attacks and defenses on Large Language Models LLMs and Vision-Language Models VLMs, emphasizing that jailbreak vulnerabilities stem from structural factors such as incomplete training data, linguistic ambiguity, and generative uncertainty. It...
HoneyTrap: Deceiving Large Language Model Attackers to Honeypot Traps with Resilient Multi-Agent Defense
Jailbreak attacks pose significant threats to large language models LLMs, enabling attackers to bypass safeguards. However, existing reactive defense approaches struggle to keep up with the rapidly evolving multi-turn jailbreaks, where attackers continuously deepen their attacks to exploit...
Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation
The software supply chain attacks are becoming more and more focused on trusted development and delivery procedures, so the conventional post-build integrity mechanisms cannot be used anymore. The available frameworks like SLSA, SBOM and in toto are majorly used to offer provenance and traceabili...
EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models Via Equation Solving and Code Completion
Large language models LLMs, such as ChatGPT, have achieved remarkable success across a wide range of fields. However, their trustworthiness remains a significant concern, as they are still susceptible to jailbreak attacks aimed at eliciting inappropriate or harmful responses. However, existing...
The Imitation Game: Using Large Language Models As Chatbots to Combat Chat-Based Cybercrimes
Chat-based cybercrime has emerged as a pervasive threat, with attackers leveraging real-time messaging platforms to conduct scams that rely on trust-building, deception, and psychological manipulation. Traditional defense mechanisms, which operate on static rules or shallow content filters,...