11 matches found
CVE-2026-44222
vLLM is an inference and serving engine for large language models LLMs. From 0.6.1 to before 0.20.0, there is a a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell special tokens are interpreted as control. Image and video placeholder...
CVE-2026-34760
vLLM is an inference and serving engine for large language models LLMs. From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing tomono, while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy results...
CVE-2026-34760 vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models
vLLM is an inference and serving engine for large language models LLMs. From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing tomono, while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy results...
CVE-2025-62372 vLLM vulnerable to DoS with incorrect shape of multimodal embedding inputs
vLLM is an inference and serving engine for large language models LLMs. From version 0.5.5 to before 0.11.1, users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with correct ndim but incorrect shape e.g. hidden dimension is wrong, regardless of whether...
CVE-2025-62164
The CVE affects vLLM (inference/serving engine) before 0.11.1, where the Completions API loads user-supplied prompt embeddings with torch.load() lacking proper validation. A PyTorch 2.8.0 change disables sparse-tensor invariants checks, allowing crafted tensors to bypass bounds checks and trigger...
PT-2025-41009
Name of the Vulnerable Software and Affected Versions vLLM versions prior to 0.11.0rc2 Description vLLM is an inference and serving engine for large language models LLMs. The API key validation mechanism in versions prior to 0.11.0rc2 is susceptible to a timing attack. The string comparison used...
EUVD-2025-16512
Malicious code in bioql PyPI...
CVE-2025-48956 vLLM API endpoints vulnerable to Denial of Service Attacks
vLLM is an inference and serving engine for large language models LLMs. From 0.1.0 to before 0.10.1.1, a Denial of Service DoS vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server memory exhaustion,...
CVE-2025-30165 Remote Code Execution Vulnerability in vLLM Multi-Node Cluster Configuration
vLLM is an inference and serving engine for large language models. In a multi-node vLLM deployment using the V0 engine, vLLM uses ZeroMQ for some multi-node communication purposes. The secondary vLLM hosts open a SUB ZeroMQ socket and connect to an XPUB socket on the primary vLLM host. When data ...
PYSEC-2025-223
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output a.k.a. guided decoding. Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has...
PYSEC-2025-62
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of Python's built-i...