685 matches found
StealthInk: a Multi-Bit and Stealthy Watermark for Large Language Models
Watermarking for large language models LLMs offers a promising approach to identifying AI-generated text. Existing approaches, however, either compromise the distribution of original generated text by LLMs or are limited to embedding zero-bit information that only allows for watermark detection b...
SoK: Are Watermarks in LLMs Ready for Deployment?
Large Language Models LLMs have transformed natural language processing, demonstrating impressive capabilities across diverse tasks. However, deploying these models introduces critical risks related to intellectual property violations and potential misuse, particularly as adversaries can imitate...
On Automating Security Policies with Contemporary LLMs
The complexity of modern computing environments and the growing sophistication of cyber threats necessitate a more robust, adaptive, and automated approach to security enforcement. In this paper, we present a framework leveraging large language models LLMs for automating attack mitigation policy...
VLMs Can Aggregate Scattered Training Patches
Whitepaper called VLMs Can Aggregate Scattered Training Patches...
BitBypass: a New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage
The inherent risk of generating harmful and unsafe content by Large Language Models LLMs, has highlighted the need for their safety alignment. Various techniques like supervised fine-tuning, reinforcement learning from human feedback, and red-teaming were developed for ensuring the safety alignme...
ReGA: Representation-Guided Abstraction for Model-Based Safeguarding of LLMs
Large Language Models LLMs have achieved significant success in various tasks, yet concerns about their safety and security have emerged. In particular, they pose risks in generating harmful content and vulnerability to jailbreaking attacks. To analyze and monitor machine learning models,...
The Security Threat of Compressed Projectors in Large Vision-Language Models
The choice of a suitable visual language projector VLP is critical to the successful training of large visual language models LVLMs. Mainstream VLPs can be broadly categorized into compressed and uncompressed projectors, and each offering distinct advantages in performance and computational...
PYSEC-2025-55
vLLM is an inference and serving engine for large language models LLMs. Version 0.8.0 up to but excluding 0.9.0 have a Denial of Service ReDoS that causes the vLLM server to crash if an invalid regex was provided while using structured output. This vulnerability is similar to...
PYSEC-2025-54
vLLM is an inference and serving engine for large language models LLMs. In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with a invalid jsonschema as a Guided Param kills the vllm server. This vulnerability is similar GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but for regex...
CVE-2025-48942
vLLM is an inference and serving engine for large language models LLMs. In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with a invalid jsonschema as a Guided Param kills the vllm server. This vulnerability is similar GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but for regex...
CVE-2025-48944 vLLM Tool Schema allows DoS via Malformed pattern and type Fields
vLLM is an inference and serving engine for large language models LLMs. In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality ...
PYSEC-2025-53
vLLM is an inference and serving engine for large language models LLMs. Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT Time to First Token. These timing differences...
CVE-2025-46570 vLLM’s Chunk-Based Prefix Caching Vulnerable to Potential Timing Side-Channel
vLLM is an inference and serving engine for large language models LLMs. Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT Time to First Token. These timing differences...
LLM Agents Should Employ Security Principles
Large Language Model LLM agents show considerable promise for automating complex tasks using contextual reasoning; however, interactions involving multiple agents and the system's susceptibility to prompt injection and other forms of context manipulation introduce new vulnerabilities related to...
Test-Time Immunization: a Universal Defense Framework against Jailbreaks for (Multimodal) Large Language Models
While multimodal large language models LLMs have attracted widespread attention due to their exceptional capabilities, they remain vulnerable to jailbreak attacks. Various defense methods are proposed to defend against jailbreak attacks, however, they are often tailored to specific types of...
Permissioned LLMs: Enforcing Access Control in Large Language Models
In enterprise settings, organizational data is segregated, siloed and carefully protected by elaborate access control frameworks. These access control structures can completely break down if an LLM fine-tuned on the siloed data serves requests, for downstream tasks, from individuals with disparat...
Spa-VLM: Stealthy Poisoning Attacks on RAG-Based VLM
With the rapid development of the Vision-Language Model VLM, significant progress has been made in Visual Question Answering VQA tasks. However, existing VLM often generate inaccurate answers due to a lack of up-to-date knowledge. To address this issue, recent research has introduced...
GeneBreaker: Jailbreak Attacks against DNA Language Models with Pathogenicity Guidance
DNA, encoding genetic instructions for almost all living organisms, fuels groundbreaking advances in genomics and synthetic biology. Recently, DNA Foundation Models have achieved success in designing synthetic functional DNA sequences, even whole genomes, but their susceptibility to jailbreaking...
System Prompt Extraction Attacks and Defenses in Large Language Models
The system prompt in Large Language Models LLMs plays a pivotal role in guiding model behavior and response generation. Often containing private configuration details, user roles, and operational instructions, the system prompt has become an emerging attack target. Recent studies have shown that...
Phare: a Safety Probe for Large Language Models
Ensuring the safety of large language models LLMs is critical for responsible deployment, yet existing evaluations often prioritize performance over identifying failure modes. We introduce Phare, a multilingual diagnostic framework to probe and evaluate LLM behavior across three critical...