
5 matches found

Packet Storm News
added 2026/03/01 12:00 a.m., 4 views

A Systematic Study of LLM-Based Architectures for Automated Patching

Large language models (LLMs) have shown promise for automated patching, but their effectiveness depends strongly on how they are integrated into patching systems. While prior work explores prompting strategies and individual agent designs, the field lacks a systematic comparison of patching...

AI score: 6.2
Exploits: 0
Packet Storm News
added 2025/12/02 12:00 a.m., 21 views

Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks

Vibe coding is a new programming paradigm in which human engineers instruct large language model (LLM) agents to complete complex coding tasks with little supervision. Although it is increasingly adopted, are vibe coding outputs really safe to deploy in production? To answer this question, we propo...

AI score: 6.9
Exploits: 0
Packet Storm News
added 2025/10/02 12:00 a.m., 9 views

RedCodeAgent: Automatic Red-Teaming Agent against Diverse Code Agents

Code agents have gained widespread adoption due to their strong code generation capabilities and integration with code interpreters, enabling dynamic execution, debugging, and interactive programming capabilities. While these advancements have streamlined complex workflows, they have also...

AI score: 7.9
Exploits: 0
Packet Storm News
added 2025/10/01 12:00 a.m., 5 views

Breaking the Code: Security Assessment of AI Code Agents through Systematic Jailbreaking Attacks

Code-capable large language model (LLM) agents are increasingly embedded into software engineering workflows where they can read, write, and execute code, raising the stakes of safety-bypass "jailbreak" attacks beyond text-only settings. Prior evaluations emphasize refusal or harmful-text detection...

AI score: 7.7
Exploits: 0
Packet Storm News
added 2025/09/26 12:00 a.m., 39 views

SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios

Large language model (LLM) powered code agents are rapidly transforming software engineering by automating tasks such as testing, debugging, and repairing, yet the security risks of their generated code have become a critical concern. Existing benchmarks have offered valuable insights but remain...

AI score: 7.4
Exploits: 0