659 matches found
Shell or Nothing: Real-World Benchmarks and Memory-Activated Agents for Automated Penetration Testing
Penetration testing is critical for identifying and mitigating security vulnerabilities, yet traditional approaches remain expensive, time-consuming, and dependent on expert human labor. Recent work has explored AI-driven pentesting agents, but their evaluation relies on oversimplified...
PatchSeeker: Mapping NVD Records to Their Vulnerability-Fixing Commits with LLM Generated Commits and Embeddings
Software vulnerabilities pose serious risks to modern software ecosystems. While the National Vulnerability Database NVD is the authoritative source for cataloging these vulnerabilities, it often lacks explicit links to the corresponding Vulnerability-Fixing Commits VFCs. VFCs encode precise code...
AgentSentinel: an End-To-End and Real-Time Security Defense Framework for Computer-Use Agents
Large Language Models LLMs have been increasingly integrated into computer-use agents, which can autonomously operate tools on a user's computer to accomplish complex tasks. However, due to the inherently unstable and unpredictable nature of LLM outputs, they may issue unintended tool commands or...
Decoding Latent Attack Surfaces in LLMs: Prompt Injection Via HTML in Web Summarization
Large Language Models LLMs are increasingly integrated into web-based systems for content summarization, yet their susceptibility to prompt injection attacks remains a pressing concern. In this study, we explore how non-visible HTML elements such as , aria-label, and alt attributes can be exploit...
Behind the Mask: Benchmarking Camouflaged Jailbreaks in Large Language Models
Large Language Models LLMs are increasingly vulnerable to a sophisticated form of adversarial prompting known as camouflaged jailbreaking. This method embeds malicious intent within seemingly benign language to evade existing safety mechanisms. Unlike overt attacks, these subtle prompts exploit...
An Empirical Study of Vulnerabilities in Python Packages and Their Detection
In the rapidly evolving software development landscape, Python stands out for its simplicity, versatility, and extensive ecosystem. Python packages, as units of organization, reusability, and distribution, have become a pressing concern, highlighted by the considerable number of vulnerability...
VULSOVER: Vulnerability Detection Via LLM-Driven Constraint Solving
Traditional vulnerability detection methods rely heavily on predefined rule matching, which often fails to capture vulnerabilities accurately. With the rise of large language models LLMs, leveraging their ability to understand code semantics has emerged as a promising direction for achieving more...
Agentic Discovery and Validation of Android App Vulnerabilities
Existing Android vulnerability detection tools overwhelm teams with thousands of low-signal warnings yet uncover few true positives. Analysts spend days triaging these results, creating a bottleneck in the security pipeline. Meanwhile, genuinely exploitable vulnerabilities often slip through,...
PromptSleuth: Detecting Prompt Injection Via Semantic Intent Invariance
Large Language Models LLMs are increasingly integrated into real-world applications, from virtual assistants to autonomous agents. However, their flexibility also introduces new attack vectors-particularly Prompt Injection PI, where adversaries manipulate model behavior through crafted inputs. As...
Mind the Gap: Time-Of-Check to Time-Of-Use Vulnerabilities in LLM-Enabled Agents
Large Language Model LLM-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks e.g., prompt injection and data-oriented threats e.g., data exfiltration...
A Robust Cross-Domain IDS Using BiGRU-LSTM-Attention for Medical and Industrial IoT Security
The increased Internet of Medical Things IoMT and the Industrial Internet of Things IIoT interconnectivity has introduced complex cybersecurity challenges, exposing sensitive data, patient safety, and industrial operations to advanced cyber threats. To mitigate these risks, this paper introduces ...
MCPSecBench: a Systematic Security Benchmark and Playground for Testing Model Context Protocols
Large Language Models LLMs are increasingly integrated into real-world applications via the Model Context Protocol MCP, a universal, open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new securit...
Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations
Natural language explanations in visual question answering VQA-NLE aim to make black-box models more transparent by elucidating their decision-making processes. However, we find that existing VQA-NLE systems can produce inconsistent explanations and reach conclusions without genuinely understandi...
CryptoScope: Utilizing Large Language Models for Automated Cryptographic Logic Vulnerability Detection
Cryptographic algorithms are fundamental to modern security, yet their implementations frequently harbor subtle logic flaws that are hard to detect. We introduce CryptoScope, a novel framework for automated cryptographic vulnerability detection powered by Large Language Models LLMs. CryptoScope...
Malicious code in sun-cloud-sudo-array-benchmark (npm)
The package sun-cloud-sudo-array-benchmark was found to contain malicious code...
Malicious code in final-abstract-info-benchmark-gamma (npm)
The package final-abstract-info-benchmark-gamma was found to contain malicious code...
Malicious code in book-short-grid-benchmark-route (npm)
The package book-short-grid-benchmark-route was found to contain malicious code...
Malicious code in export-double-cache-benchmark-resolve (npm)
The package export-double-cache-benchmark-resolve was found to contain malicious code...
Malicious code in deploy-benchmark-fork-bundle-dog (npm)
The package deploy-benchmark-fork-bundle-dog was found to contain malicious code...
Malicious code in benchmark-deserialize-mu-epsilon-shell (npm)
The package benchmark-deserialize-mu-epsilon-shell was found to contain malicious code...