HIDBench: Benchmarking Large Language Models for Host-Based Intrusion Detection
Recent benchmark efforts have advanced the evaluation of large language models (LLMs) in cybersecurity, including tasks such as penetration testing and vulnerability identification. However, a critical cybersecurity task, namely intrusion detection from system logs, remains unexplored. In this work...
A Large Language Model Approach to Generating Bypass Rules for Malware Evasion in Analysis Sandbox
Sandbox evasion remains a critical challenge for automated malware analysis, as modern malware employs environment checks to detect analysis platforms and suppress malicious behavior. Existing approaches rely on manually crafted bypass rules that require deep reverse engineering of each evasion...
Refusal Evaluation in Coding LLMs and Code Agents: A Systematic Review of Thirteen Malicious-Code Prompt Corpora (2023-2025)
The evaluation of large language model refusal on malicious-coding tasks now spans at least thirteen publicly released prompt corpora: AdvBench, the CyberSecEval family, RMCBench, RedCode, MCGMark, JailbreakBench, CySecBench, MalwareBench, CIRCLE, MOCHA, ASTRA, Scam2Prompt / Innoc2Scam-bench, and...
A Red Teaming Framework for Evaluating Robustness of AI-Enabled Security Orchestration, Automation, and Response Systems
AI-enabled Security Orchestration, Automation, and Response (SOAR) systems increasingly employ autonomous agents for cyber defense, yet their resilience to adaptive adversaries is underexplored. We introduce an autonomous red teaming framework that integrates large language models (LLMs) with...
Important: Red Hat Security Advisory: Red Hat Enterprise Linux AI 3.3.3
Red Hat Enterprise Linux AI 3.3.3 is now available. Red Hat® Enterprise Linux® AI is a foundation model platform to seamlessly develop, test, and run Granite family large language models (LLMs) for enterprise applications...
Kimsuky targets organizations with PebbleDash-based tools
Over the past few months, we have conducted an in-depth analysis of specific activity clusters of Kimsuky (aka APT43, Ruby Sleet, Black Banshee, Sparkling Pisces, Velvet Chollima, and Springtail), a prolific Korean-speaking threat actor. Our research revealed notable tactical shifts throughout...
Detecting Privilege Escalation in Polyglot Microservices Via Agentic Program Analysis
Microservices are widely adopted in modern cloud systems due to their scalability and fault tolerance. However, microservice architectures introduce significant complexity in privilege and permission control, creating risks of privilege escalation where attackers can gain unauthorized access to...
UGen: An Agentic Framework for Generating Microarchitectural Attack PoCs
Microarchitectural attacks continue to evolve, uncovering new exploitation vectors in modern processors. From a defensive perspective, assessing a system's susceptibility to such attacks remains challenging. Developing functional attack implementations is labor-intensive, requires deep...
MetaBackdoor: Exploiting Positional Encoding As a Backdoor Attack Surface in LLMs
Backdoor attacks pose a serious security threat to large language models (LLMs), which are increasingly deployed as general-purpose assistants in safety- and privacy-critical applications. Existing LLM backdoors rely primarily on content-based triggers, requiring explicit modification of the input...
Identifying AI Web Scrapers Using Canary Tokens
From pre-training to query-time augmentation, web-scraped data helps to improve the quality and contextual relevancy of content generated by large language models (LLMs). However, large-scale web scraping to feed LLMs can affect site stability and raise legal, privacy, or ethics concerns. If websit...
CVE-2026-44223
vLLM contains a vulnerability (CVE-2026-44223) where the extract_hidden_states speculative decoding pathway can crash the EngineCore process if any request uses penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). The issue arises from an incorrect tensor shape after t...
When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions
Automated intrusion-style workflows require LLM agents to reason over partial observations, tool outputs, and executable artifacts under bounded budgets. A single LLM instance often compresses evidence extraction, planning, execution, and validation into one context, which increases the risk of...
This Week in Spring - May 12th, 2026
Hi, Spring fans! As I write this I am in Miami, FL at the CodeRemix.ai show, focused on the wide and wonderful world of OpenRewrite and Moderne. I've got a talk to give so let's dive right into it! a quick note about the upcoming release train dates in last week's installment of A Bootiful Podcas...
CTFusion: A CTF-Based Benchmark for LLM Agent Evaluation
Recent advances in Large Language Models (LLMs) have enabled agentic systems for complex, multi-step tasks; cybersecurity is emerging as a prominent application. To evaluate such agents, researchers widely adopt Capture The Flag (CTF) benchmarks. However, current CTF benchmarks reuse existing...
LLMs and Text-in-Text Steganography
Turns out that LLMs are really good at hiding text messages in other text messages...
LLMs for Secure Hardware Design and Related Problems: Opportunities and Challenges
The integration of Large Language Models (LLMs) into Electronic Design Automation (EDA) and hardware security is rapidly reshaping the semiconductor industry. While LLMs offer unprecedented capabilities in generating Register Transfer Level (RTL) code, automating testbenches, and bridging the semantic...
Adversarial SQL Injection Generation with LLM-Based Architectures
SQL injection (SQLi) attacks are still one of the serious attacks ranked in the Open Worldwide Application Security Project (OWASP) Top 10 threats. Today, with advances in Artificial Intelligence (AI), especially in Large Language Models (LLMs), an opportunity has been created for automating adversarial...
Guaranteed Jailbreaking Defense Via Disrupt-And-Rectify Smoothing
This paper proposes a guaranteed defense method for large language models (LLMs) to safeguard against jailbreaking attacks. Drawing inspiration from the denoised-smoothing approach in the adversarial defense domain, we propose a novel smoothing-based defense method, termed Disrupt-and-Rectify...
Mythos
Mythos Autonomous cybersecurity agent that connects to multip...