Lucene search
K

647 matches found

Packet Storm News
Packet Storm News
added 2025/12/02 12:0 a.m.4 views

Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks

Vibe coding is a new programming paradigm in which human engineers instruct large language model LLM agents to complete complex coding tasks with little supervision. Although it is increasingly adopted, are vibe coding outputs really safe to deploy in production? To answer this question, we propo...

6.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/12/01 12:0 a.m.11 views

BackportBench: A Multilingual Benchmark for Automated Backporting of Patches

Many modern software projects evolve rapidly to incorporate new features and security patches. It is important for users to update their dependencies to safer versions, but many still use older, vulnerable package versions because upgrading can be difficult and may break their existing codebase...

6.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/30 12:0 a.m.7 views

Large Language Models Cannot Reliably Detect Vulnerabilities in JavaScript: The First Systematic Benchmark and Evaluation

Researchers have proposed numerous methods to detect vulnerabilities in JavaScript, especially those assisted by Large Language Models LLMs. However, the actual capability of LLMs in JavaScript vulnerability detection remains questionable, necessitating systematic evaluation and comprehensive...

6.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/28 12:0 a.m.2 views

Clustering Malware at Scale: A First Full-Benchmark Study

Recent years have shown that malware attacks still happen with high frequency. Malware experts seek to categorize and classify incoming samples to confirm their trustworthiness or prove their maliciousness. One of the ways in which groups of malware samples can be identified is through malware...

6.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/28 12:0 a.m.2 views

GAPS: Guiding Dynamic Android Analysis with Static Path Synthesis

Dynamically resolving method reachability in Android applications remains a critical and largely unsolved problem. Despite notable advancements in GUI testing and static call graph construction, current tools are insufficient for reliably driving execution toward specific target methods, especial...

7.2AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/25 12:0 a.m.16 views

BrowseSafe: Understanding and Preventing Prompt Injection within AI Browser Agents

The integration of artificial intelligence AI agents into web browsers introduces security challenges that go beyond traditional web application threat models. Prior work has identified prompt injection as a new attack vector for web agents, yet the resulting impact within real-world environments...

6.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/24 12:0 a.m.2 views

DUALGUAGE: Automated Joint Security-Functionality Benchmarking for Secure Code Generation

Large language models LLMs and autonomous coding agents are increasingly used to generate software across a wide range of domains. Yet a core requirement remains unmet: ensuring that generated code is secure without compromising its functional correctness. Existing benchmarks and evaluations for...

7.2AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/22 12:0 a.m.2 views

Building Browser Agents: Architecture, Security, and Practical Solutions

Browser agents enable autonomous web interaction but face critical reliability and security challenges in production. This paper presents findings from building and operating a production browser agent. The analysis examines where current approaches fail and what prevents safe autonomous operatio...

7.5AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/21 12:0 a.m.3 views

ThreadFuzzer: Fuzzing Framework for Thread Protocol

With the rapid growth of IoT, secure and efficient mesh networking has become essential. Thread has emerged as a key protocol, widely used in smart-home and commercial systems, and serving as a core transport layer in the Matter standard. This paper presents ThreadFuzzer, the first dedicated...

6.9AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/19 12:0 a.m.4 views

Securing AI Agents against Prompt Injection Attacks

Retrieval-augmented generation RAG systems have become widely used for enhancing large language model capabilities, but they introduce significant security vulnerabilities through prompt injection attacks. We present a comprehensive benchmark for evaluating prompt injection risks in RAG-enabled A...

7.3AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/19 12:0 a.m.2 views

Can MLLMs Detect Phishing? A Comprehensive Security Benchmark Suite Focusing on Dynamic Threats and Multimodal Evaluation in Academic Environments

The rapid proliferation of Multimodal Large Language Models MLLMs has introduced unprecedented security challenges, particularly in phishing detection within academic environments. Academic institutions and researchers are high-value targets, facing dynamic, multilingual, and context-dependent...

6.6AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/18 12:0 a.m.4 views

LFreeDA: Label-Free Drift Adaptation for Windows Malware Detection

Machine learning ML-based malware detectors degrade over time as concept drift introduces new and evolving families unseen during training. Retraining is limited by the cost and time of manual labeling or sandbox analysis. Existing approaches mitigate this via drift detection and selective...

6.7AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/17 12:0 a.m.7 views

Beyond Fixed and Dynamic Prompts: Embedded Jailbreak Templates for Advancing LLM Security

As the use of large language models LLMs continues to expand, ensuring their safety and robustness has become a critical challenge. In particular, jailbreak attacks that bypass built-in safety mechanisms are increasingly recognized as a tangible threat across industries, driving the need for...

7.3AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/16 12:0 a.m.2 views

Adaptive Dual-Layer Web Application Firewall (ADL-WAF) Leveraging Machine Learning for Enhanced Anomaly and Threat Detection

Web Application Firewalls are crucial for protecting web applications against a wide range of cyber threats. Traditional Web Application Firewalls often struggle to effectively distinguish between malicious and legitimate traffic, leading to limited efficacy in threat detection. To overcome these...

6.8AI score
Exploits0
Packet Storm News
Packet Storm News
added 2025/11/14 12:0 a.m.17 views

PATCHEVAL: A New Benchmark for Evaluating LLMs on Patching Real-World Vulnerabilities

Software vulnerabilities are increasing at an alarming rate. However, manual patching is both time-consuming and resource-intensive, while existing automated vulnerability repair AVR techniques remain limited in effectiveness. Recent advances in large language models LLMs have opened a new paradi...

6.9AI score
Exploits0
EUVD
EUVD
added 2025/11/13 3:23 a.m.0 views

EUVD-2025-178450

Malicious code in import-benchmark-warn-node-catch npm...

6.6AI score
Exploits0
EUVD
EUVD
added 2025/11/13 3:23 a.m.1 views

EUVD-2025-175867

Malicious code in try-benchmark-assert-module-protected npm...

6.6AI score
Exploits0
OSV
OSV
added 2025/11/13 3:23 a.m.2 views

MAL-2025-189692 Malicious code in string-compile-module-benchmark-report (npm)

--- -= Per source details. Do not edit below this line.=- Source: amazon-inspector 4a32edf99d2adb837e219033e596267b024daa93bbe2f9e9e2030e20bfffdffd This package appears to be part of the tea.xyz token reward campaign that flooded npm. These packages typically contain autopublish scripts auto.js,...

6.8AI score
Exploits0
EUVD
EUVD
added 2025/11/13 3:23 a.m.0 views

EUVD-2025-180005

Malicious code in boolean-double-benchmark-star-node npm...

6.6AI score
Exploits0
EUVD
EUVD
added 2025/11/13 3:23 a.m.0 views

EUVD-2025-179252

Malicious code in double-benchmark-pipe-hash-virtualize npm...

6.6AI score
Exploits0
Rows per page
Query Builder