Lucene search
K

646 matches found

Packet Storm News
Packet Storm News
added 2026/05/27 12:0 a.m.7 views

Towards Cybersecurity SuperIntelligence (CSI): What'S the Best Harness for Cybersecurity?

What is the best harness for cybersecurity AI? Cybersecurity systems are converging on a single execution scaffold per agent, an iterative shell loop driven by a Large Language Model LLM. However, scaffolds are not interchangeable, rarely interoperable, and no single scaffold dominates across all...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/27 12:0 a.m.5 views

Relevance As a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents

AI agents augment large language models with external tools such as web retrieval, enabling grounded and up-to-date responses. However, incorporating external content into the generation pipeline can weaken the safety alignment mechanisms that govern model outputs. Prior work shows that enabling...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/26 12:0 a.m.12 views

SEC-Bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

Large language models LLMs now support automated software security tasks, including vulnerability discovery and proof-of-concept PoC generation. Existing benchmarks do not faithfully evaluate LLMs in real-world bug hunting scenarios because they rely on fuzzing harnesses, target-specific...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/23 12:0 a.m.4 views

CyberMaskQA: A Privacy-Aware Benchmark for Evaluating Large Language Models in Cybersecurity Question Answering

Large language models LLMs are increasingly applied to cybersecurity question answering QA for critical tasks such as incident response and vulnerability analysis. However, real-world operational contexts, including system logs and network configurations, inherently contain sensitive identifiers,...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/21 12:0 a.m.3 views

Parser-Free Querying of Security Logs

Security analysts routinely query system logs to detect threats and investigate incidents, but each log source uses its own semi-structured format: logs are cheap to produce, but expensive to use. The standard approach, building per-source parsers to normalize logs into structured schemas, is...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/21 12:0 a.m.3 views

Measuring Security without Fooling Ourselves: Why Benchmarking Agents Is Hard

The benchmarks used to evaluate AI agents in security-critical roles suffer from crucial weaknesses. Building on recent empirical evidence, we characterize three core challenges that undermine security evaluations: benchmark vulnerabilities, temporal staleness, and runtime uncertainty. We then...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/21 12:0 a.m.4 views

UNAD+: An Explainable Hybrid Framework for Unknown Network Attack Detection

The detection of previously unseen network attacks remains a major challenge for intrusion detection systems. Although supervised learning methods often perform well on known attack classes, they are limited when new attack types are not represented in the training data. Unsupervised methods are...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/20 12:0 a.m.4 views

HIDBench: Benchmarking Large Language Models for Host-Based Intrusion Detection

Recent benchmark efforts have advanced the evaluation of large language models LLMs in cybersecurity, including tasks such as penetration testing and vulnerability identification. However, a critical cybersecurity task, namely intrusion detection from system logs, remains unexplored. In this work...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/18 12:0 a.m.6 views

Not What You Asked For: Typographic Attacks in Household Robot Manipulation

Open-vocabulary embodied AI agents increasingly rely on vision-language models such as CLIP for object perception and task grounding. However, the shared embedding space that enables this flexibility introduces a structural vulnerability to typographic attacks, where printed text in a physical...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/17 12:0 a.m.3 views

ContraFix: Agentic Vulnerability Repair Via Differential Runtime Evidence and Skill Reuse

Large language model LLM agents are increasingly used for automated vulnerability repair AVR, where repository-level reasoning enables them to inspect context and produce source-code patches. However, recent empirical results show that these agents still struggle with real-world vulnerabilities...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/17 12:0 a.m.8 views

ADR: An Agentic Detection System for Enterprise Agentic AI Security

We present the Agentic AI Detection and Response ADR system, the first large-scale, production-proven enterprise framework for securing AI agents operating through the Model Context Protocol MCP. We identify three persistent challenges in this domain: 1 limited observability -- existing Endpoint...

5.8AI score
Exploits0
The Hacker News
The Hacker News
added 2026/05/14 11:30 a.m.9 views

How AI Hallucinations Are Creating Real Security Risks

AI hallucinations are introducing serious security risks into critical infrastructure decision-making by exploiting human trust through highly confident yet incorrect outputs. When an AI model lacks certainty, it doesn’t have a mechanism to recognize that. Instead, it generates the most probable...

5.7AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/14 12:0 a.m.20 views

Do Coding Agents Understand Least-Privilege Authorization?

As coding agents gain access to shells, repositories, and user files, least-privilege authorization becomes a prerequisite for safe deployment: an agent should receive enough authority to complete the task, without unnecessary authority that exposes sensitive surfaces.To study whether current...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/13 12:0 a.m.3 views

DCVD: Dual-Channel Cross-Modal Fusion for Joint Vulnerability Detection and Localization

Software vulnerability detection plays a critical role in ensuring system security, where real-world auditing requires not only determining whether a function is vulnerable but also pinpointing the specific lines responsible. However, existing approaches either rely on a single information source...

5.9AI score
Exploits0
Microsoft Secure
Microsoft Secure
added 2026/05/12 10:0 p.m.5 views

Defense at AI speed: Microsoft’s new multi-model agentic security system tops leading industry benchmark

In this article 1. AI-powered vulnerability discovery at hyper-scale 2. Codename: MDASH—Microsoft Security’s new multi-model agentic scanning harness 3. Using codename MDASH for security research 4. The 5.12.2026 Patch Tuesday cohort 5. Two deep dives 1. CVE-2026-33827—Remote unauthenticated UAF ...

9.8CVSS7AI score0.00088EPSS
Exploits3
Packet Storm News
Packet Storm News
added 2026/05/12 12:0 a.m.4 views

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

Reusable skills are becoming a common interface for extending large language model agents, packaging procedural guidance with access to files, tools, memory, and execution environments. However, this modularity introduces attack surfaces that are largely missed by existing safety evaluations: eve...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/12 12:0 a.m.2 views

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in frontier models without overfitting. We argue tha...

5.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/11 12:0 a.m.3 views

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments

The rapid proliferation of LLM-based autonomous agents in real operating system environments introduces a new category of safety risk beyond content safety: behavior jailbreak, where an adversary induces an agent to execute dangerous OS-level operations with irreversible consequences. Existing...

5.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2026/05/10 12:0 a.m.1 views

MonitoringBench: Semi-Automated Red-Teaming for Agent Monitoring

We introduce a red-teaming methodology that exposes harder-to-catch attacks for coding-agent monitors, suggesting that current practices may under-elicit attacks and overstate monitor performance. We identify three challenges with current red-teaming. First, mode collapse in attack generation,...

5.9AI score
Exploits0
vulnersOsv
vulnersOsv
added 2026/05/09 12:38 a.m.3 views

ai.timefold.solver:timefold-solver-quarkus-benchmark-deployment (>=0.8.38 <=0.9.38), ai.timefold.solver:timefold-solver-quarkus-benchmark-integration-test (>=0.8.38 <=0.9.38) +3086 more potentially affected by CVE-2026-6860 via io.vertx:vertx-core (>=4.3.4 <=4.3.8)

io.vertx:vertx-core MAVEN version =4.3.4, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =0.8.38, =22.9.0, =22.9.0, =22.9.0, =22.9.0, =22.9.5 and more Source cves: CVE-2026-6860 Source advisory: OSV:GHSA-3G76-F9XQ-8VP6https://vulners.com/osv/OSV...

6.9CVSS5.8AI score0.00012EPSS
Exploits1
Rows per page
Query Builder