658 matches found
Do Coding Agents Understand Least-Privilege Authorization?
As coding agents gain access to shells, repositories, and user files, least-privilege authorization becomes a prerequisite for safe deployment: an agent should receive enough authority to complete the task, without unnecessary authority that exposes sensitive surfaces. To study whether current...
DCVD: Dual-Channel Cross-Modal Fusion for Joint Vulnerability Detection and Localization
Software vulnerability detection plays a critical role in ensuring system security, where real-world auditing requires not only determining whether a function is vulnerable but also pinpointing the specific lines responsible. However, existing approaches either rely on a single information source...
Defense at AI speed: Microsoft’s new multi-model agentic security system tops leading industry benchmark
In this article 1. AI-powered vulnerability discovery at hyper-scale 2. Codename: MDASH—Microsoft Security’s new multi-model agentic scanning harness 3. Using codename MDASH for security research 4. The 5.12.2026 Patch Tuesday cohort 5. Two deep dives 1. CVE-2026-33827—Remote unauthenticated UAF ...
b2aiprep (>=0.19.0 <=3.3.2), capstone-text-mining (>=0.0.6 <=0.1.2) +10 more potentially affected by CVE-2026-31224 via snorkel (>=0.10.0 <=0.9.9)
snorkel PYPI version =0.10.0, =0.19.0, =0.0.6, =1.0.2, =0.8.0, =0.1.1, =0.1.2, =0.1.0, =0.6.1, =0.0.0, =1.3.1a1 - t2r2 =0.0.1 - ws-benchmark =1.1.2rc0 Source cves: CVE-2026-31224 Source advisory: SNYK:PYTHON-SNORKEL-16758048...
b2aiprep (>=0.19.0 <=3.3.2), capstone-text-mining (>=0.0.6 <=0.1.2) +10 more potentially affected by CVE-2026-31222 via snorkel (>=0.10.0 <=0.9.9)
snorkel PYPI version =0.10.0, =0.19.0, =0.0.6, =1.0.2, =0.8.0, =0.1.1, =0.1.2, =0.1.0, =0.6.1, =0.0.0, =1.3.1a1 - t2r2 =0.0.1 - ws-benchmark =1.1.2rc0 Source cves: CVE-2026-31222 Source advisory: SNYK:PYTHON-SNORKEL-16758049...
b2aiprep (>=0.19.0 <=3.3.2), capstone-text-mining (>=0.0.6 <=0.1.2) +10 more potentially affected by CVE-2026-31223 via snorkel (>=0.10.0 <=0.9.9)
snorkel PYPI version =0.10.0, =0.19.0, =0.0.6, =1.0.2, =0.8.0, =0.1.1, =0.1.2, =0.1.0, =0.6.1, =0.0.0, =1.3.1a1 - t2r2 =0.0.1 - ws-benchmark =1.1.2rc0 Source cves: CVE-2026-31223 Source advisory: SNYK:PYTHON-SNORKEL-16758051...
Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack
Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in frontier models without overfitting. We argue tha...
SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces
Reusable skills are becoming a common interface for extending large language model agents, packaging procedural guidance with access to files, tools, memory, and execution environments. However, this modularity introduces attack surfaces that are largely missed by existing safety evaluations: eve...
LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments
The rapid proliferation of LLM-based autonomous agents in real operating system environments introduces a new category of safety risk beyond content safety: behavior jailbreak, where an adversary induces an agent to execute dangerous OS-level operations with irreversible consequences. Existing...
MonitoringBench: Semi-Automated Red-Teaming for Agent Monitoring
We introduce a red-teaming methodology that exposes harder-to-catch attacks for coding-agent monitors, suggesting that current practices may under-elicit attacks and overstate monitor performance. We identify three challenges with current red-teaming. First, mode collapse in attack generation,...
ai.timefold.solver:timefold-solver-quarkus-benchmark-deployment (>=0.8.38 <=0.9.38), ai.timefold.solver:timefold-solver-quarkus-benchmark-integration-test (>=0.8.38 <=0.9.38) +3086 more potentially affected by CVE-2026-6860 via io.vertx:vertx-core (>=4.3.4 <=4.3.8)
io.vertx:vertx-core MAVEN version =4.3.4, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =22.9.0, =22.9.0, =22.9.0, =22.9.0, =22.9.5 and more Source cves: CVE-2026-6860 Source advisory: OSV:GHSA-3G76-F9XQ-8VP6 https://vulners.com/osv/OSV...
LCC-LLM: Leveraging Code-Centric Large Language Models for Malware Attribution
LLMs are increasingly explored for malware analysis; however, current LLM-based malware attribution remains limited by unsupported indicators and insufficient code-level grounding for identifying malicious and vulnerable code segments. To address these limitations, this research introduces LCC-LL...
Benchmarking Large Language Models for IoC Recovery under Adversarial Code Obfuscation and Encryption
Software obfuscation and encryption present persistent challenges for program comprehension and security analysis, particularly when adversaries conceal Indicators of Compromise (IoCs) such as IP addresses within source code. While Large Language Models (LLMs) have recently demonstrated remarkable...
Pen-Strategist: A Reasoning Framework for Penetration Testing Strategy Formation and Analysis
Cyber threats are rapidly increasing, expanding their impact from large-scale enterprises to government services and individual users, making robust security systems increasingly essential. However, a significant shortage of skilled cybersecurity professionals exacerbates this challenge. While...
ARGUS: Defending LLM Agents against Context-Aware Prompt Injection
The rise of Large Language Model (LLM) agents, augmented with tool use, skills, and external knowledge, has introduced new security risks. Among them, prompt injection attacks, where adversaries embed malicious instructions into the agent workflow, have emerged as the primary threat. However,...
ai.timefold.solver:timefold-solver-quarkus-benchmark-integration-test (>=0.9.38 <=1.20.1), ai.timefold.solver:timefold-solver-quarkus-devui-integration-test (>=0.9.38 <=1.20.1) +1589 more potentially affected by CVE-2026-39852 via io.quarkus:quarkus-vertx-http (>=3.0.0.Alpha1 <=3.20.6)
io.quarkus:quarkus-vertx-http MAVEN version =3.0.0.Alpha1, =0.9.38, =0.9.38, =0.9.38, =0.9.38, =0.9.38, =0.9.38, =0.0.1, =0.0.1, =0.0.1, =0.0.4, =0.0.4, =0.0.4, =0.0.4, =0.0.2, =0.0.1, =0.0.5 and more Source cves: CVE-2026-39852 Source advisory: SNYK:JAVA-IOQUARKUS-16420254...
VulKey: Automated Vulnerability Repair Guided by Domain-Specific Repair Patterns
The increasing prevalence of software vulnerabilities highlights the need for effective Automatic Vulnerability Repair (AVR) tools. While LLM-based approaches are promising, they struggle to incorporate structured security knowledge from sources like CWE and NVD. Current methods either use this...
CVE-2026-7510
A vulnerability was determined in OWASP DefectDojo up to 2.55.4. Affected by this vulnerability is an unknown functionality of the component Benchmark/Engagement/Product/Survey. Executing a manipulation can lead to authorization bypass. The attack can be executed remotely. The exploit has been...
CVE-2026-7510
The CVE-2026-7510 entry concerns OWASP DefectDojo up to 2.55.4, with an authorization bypass affecting the Benchmark/Engagement/Product/Survey functionality. The issue is reachable remotely and is supported by a public disclosure; upgrading to DefectDojo 2.56.0 addresses the vulnerability (patch e...
CVE-2026-7510 OWASP DefectDojo Benchmark/Engagement/Product/Survey authorization
A vulnerability was determined in OWASP DefectDojo up to 2.55.4. Affected by this vulnerability is an unknown functionality of the component Benchmark/Engagement/Product/Survey. Executing a manipulation can lead to authorization bypass. The attack can be executed remotely. The exploit has been...