50 matches found
Microsoft Build 2026: Securing code, agents, and models across the development lifecycle
In this article 1. Secure your code 2. Secure your agents 3. Trust agents with your data 4. Secure your models 5. Trust starts with security Today, developers and security teams are caught in growing tension. AI is accelerating development and introducing new issues around insecure code, opaque...
Learn from Your Mistakes: Tree-Like Self-Play for Secure Code LLMs
While Large Language Models LLMs excel in code generation, they remain prone to replicating subtle yet critical vulnerabilities endemic to their training data. Current alignment techniques, such as Supervised Fine-Tuning SFT and Reinforcement Learning RL, typically apply coarse-grained optimizati...
Hack the AI agent: Build agentic AI security skills with the GitHub Secure Code Game
I was scrolling through my feed one evening when I came across OpenClaw, an open source personal AI assistant that people were calling everything from "Jarvis" to "a portal to a new reality." The idea is beautiful: an AI that lives on your machine or in the cloud, talks to you over WhatsApp or...
DeepGuard Secure Code Generation
Large Language Models LLMs for code generation can replicate insecure patterns from their training data. To mitigate this, a common strategy for security hardening is to fine-tune models using supervision derived from the final transformer layer. However, this design may suffer from a final-layer...
SecPI: Secure Code Generation with Reasoning Models Via Security Reasoning Internalization
Reasoning language models RLMs are increasingly used in programming. Yet, even state-of-the-art RLMs frequently introduce critical security vulnerabilities in generated code. Prior training-based approaches for secure code generation face a critical limitation that prevents their direct applicati...
TOSSS: A CVE-Based Software Security Benchmark for Large Language Models
With their increasing capabilities, Large Language Models LLMs are now used across many industries. They have become useful tools for software engineers and support a wide range of development tasks. As LLMs are increasingly used in software development workflows, a critical question arises: are...
SecCodePRM: A Process Reward Model for Code Security
Large Language Models are rapidly becoming core components of modern software development workflows, yet ensuring code security remains challenging. Existing vulnerability detection pipelines either rely on static analyzers or use LLM/GNN-based detectors trained with coarse program-level...
Persistent Human Feedback, LLMs, and Static Analyzers for Secure Code Generation and Vulnerability Detection
Existing literature heavily relies on static analysis tools to evaluate LLMs for secure code generation and vulnerability detection. We reviewed 1,080 LLM-generated code samples, built a human-validated ground-truth, and compared the outputs of two widely used static security tools, CodeQL and...
Can Developers Rely on LLMs for Secure IaC Development?
We investigated the capabilities of GPT-4o and Gemini 2.0 Flash for secure Infrastructure as Code IaC development. For security smell detection, on the Stack Overflow dataset, which primarily contains small, simplified code snippets, the models detected at least 71% of security smells when prompt...
AgenticSCR: An Autonomous Agentic Secure Code Review for Immature Vulnerabilities Detection
Secure code review is critical at the pre-commit stage, where vulnerabilities must be caught early under tight latency and limited-context constraints. Existing SAST-based checks are noisy and often miss immature, context-dependent vulnerabilities, while standalone Large Language Models LLMs are...
EUVD-2025-178859
Malicious code in fork-secure-code-daemon-abstract npm...
RESCUE: Retrieval Augmented Secure Code Generation
Despite recent advances, Large Language Models LLMs still generate vulnerable code. Retrieval-Augmented Generation RAG has the potential to enhance LLMs for secure code generation by incorporating external security knowledge. However, the conventional RAG design struggles with the noise of raw...
EUVD-2021-11832
Malware in sbrugna...
SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios
Large language model LLM powered code agents are rapidly transforming software engineering by automating tasks such as testing, debugging, and repairing, yet the security risks of their generated code have become a critical concern. Existing benchmarks have offered valuable insights but remain...
A Systematic Evaluation of Parameter-Efficient Fine-Tuning Methods for the Security of Code LLMs
Code-generating Large Language Models LLMs significantly accelerate software development. However, their frequent generation of insecure code presents serious risks. We present a comprehensive evaluation of seven parameter-efficient fine-tuning PEFT techniques, demonstrating substantial gains in...
A.S.E: a Repository-Level Benchmark for Evaluating Security in AI-Generated Code
The increasing adoption of large language models LLMs in software engineering necessitates rigorous security evaluation of their generated code. However, existing benchmarks are inadequate, as they focus on isolated code snippets, employ unstable evaluation methods that lack reproducibility, and...
Rules Files for Safer Vibe Coding
Helping LLMs generate safer and more secure code through open-sourced rules files...
Teaching an Old LLM Secure Coding: Localized Preference Optimization on Distilled Preferences
LLM generated code often contains security issues. We address two key challenges in improving secure code generation. First, obtaining high quality training data covering a broad set of security issues is critical. To address this, we introduce a method for distilling a preference dataset of...
CVE-2021-24920
The StatCounter WordPress plugin before 2.0.7 does not sanitise and escape the Project ID and Secure Code settings, which could allow high privilege users to perform Cross-Site Scripting attacks even when the unfilteredhtml capability is disallowed...
LlamaFirewall: an Open Source Guardrail System for Building Secure AI Agents
Large language models LLMs have evolved from simple chatbots into autonomous agents capable of performing complex tasks such as editing production code, orchestrating workflows, and taking higher-stakes actions based on untrusted inputs like webpages and emails. These capabilities introduce new...