554 matches found
CVE-2026-41523
vLLM prior to 0.22.0 is affected by an assert-based security check in the activation function loading that can permit arbitrary code execution when a malicious HuggingFace model is loaded and vLLM runs in Python optimized mode. The attacker-controlled inputs are the activation function names from...
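The failure mode in this entry can be illustrated with a minimal sketch: Python drops `assert` statements in optimized mode (`-O`), so a check written as an assert disappears, while an explicit `raise` survives. The sketch simulates optimized mode with the built-in `compile(..., optimize=1)`. The names `ALLOWED_ACTIVATIONS`, `check_with_assert`, and `check_with_raise` are hypothetical and are not taken from vLLM.

```python
# Hypothetical allowlist; not vLLM's actual list of activation functions.
ALLOWED_ACTIVATIONS = {"gelu", "relu", "silu"}

# Source of an assert-based check, kept as a string so it can be compiled
# at different optimization levels.
ASSERT_CHECK_SRC = '''
def check_with_assert(name, allowed):
    assert name in allowed, f"unsupported activation: {name}"
    return name
'''


def build_check(optimize):
    """Compile the assert-based check; optimize=1 mimics `python -O`."""
    namespace = {}
    code = compile(ASSERT_CHECK_SRC, "<assert_check>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check_with_assert"]


def check_with_raise(name, allowed):
    """Explicit check that is not removed in optimized mode."""
    if name not in allowed:
        raise ValueError(f"unsupported activation: {name}")
    return name


check_debug = build_check(optimize=0)      # asserts kept
check_optimized = build_check(optimize=1)  # asserts stripped
```

With these definitions, `check_debug` rejects an unknown name, `check_optimized` lets the same name through, and `check_with_raise` rejects it at any optimization level.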
CVE-2026-54235
Summary: CVE-2026-54235 affects vLLM prior to 0.23.1rc0, where temperature validation gates that rely on comparison operators silently mishandle NaN and positive Infinity because of Python's IEEE 754 float semantics. This allows non-finite temperatures to bypass guards and propagate to GPU sampling kernels, causing undefined behav...
CVE-2026-48746
vLLM OpenAI auth bypass (CVE-2026-48746) affects vLLM versions 0.3.0 through 0.21.0. Root cause: ASGI servers and Starlette trust the Host header from the request scope, enabling manipulation of the reconstructed URL path and bypassing the OpenAI API AuthenticationMiddleware for routes beginning ...
CVE-2025-71379
vLLM versions >= 0.6.3 and < 0.9.0 contain multiple regular expression denial of service (ReDoS) vulnerabilities. Several regex patterns (in vllm/lora/utils.py, the phi4mini tool parser, and the OpenAI-compatible serving chat endpoint) are susceptible to catastrophic backtracking. An attacker...
EUVD-2026-38129
vLLM versions >= 0.10.2 and < 0.13.0 are missing sparse tensor validation in multimodal embeddings processing. Because PyTorch disables sparse tensor invariant checks by default, an attacker can submit crafted embedding requests with malformed (negative or out-of-bounds) tensor indices, when the...
CVE-2026-56340
vLLM versions >= 0.10.2 and
CVE-2025-71379
Vulnerability summary: vLLM versions 0.6.3–0.8.x (i.e.,
EUVD-2025-210290
vLLM versions >= 0.6.3 and < 0.9.0 contain multiple regular expression denial of service (ReDoS) vulnerabilities. Several regex patterns (in vllm/lora/utils.py, the phi4mini tool parser, and the OpenAI-compatible serving chat endpoint) are susceptible to catastrophic backtracking. An attacker...
PT-2026-51172
Name of the Vulnerable Software and Affected Versions: vLLM versions 0.10.2 through 0.12.x. Description: Multimodal embeddings processing lacks sparse tensor validation. Since PyTorch disables sparse tensor invariant checks by default, an attacker can submit crafted embedding requests containing...
CVE-2026-50269 vulnerabilities
Vulnerabilities for packages: py3-vllm-cuda-12.4, py3-vllm-cuda-12.9...
GHSA-M6QW-4CW2-HM4M vulnerabilities
Vulnerabilities for packages: py3-vllm-cuda-12.4, py3-vllm-cuda-12.9...
Improper Handling of Highly Compressed Data (Data Amplification)
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Improper Handling of Highly Compressed Data (Data Amplification) through the audio.py file. An attacker can cause excessive memory consumption by...
GHSA-HGG8-FQQC-VFMW vLLM: incomplete CVE-2026-22778 fix leaks PIL repr addresses via Anthropic router
vLLM: incomplete CVE-2026-22778 fix leaks PIL repr addresses via the Anthropic API router. Researcher: Kai Aizen (SnailSploit, @SnailSploit), Adversarial & Offensive Security Research. Severity: CVSS 3.1 5.3 (Medium), AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N. Target: https://github.com/vllm-project/vllm ...
Incorrect Conversion between Numeric Types
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Incorrect Conversion between Numeric Types in the ggml_dequantize, ggml_mul_mat_vec_a8, ggml_mul_mat_a8, and ggml_moe_a8 functions when tensor dimensions are...
Improper Validation of Specified Type of Input
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Improper Validation of Specified Type of Input due to improper validation of the temperature parameter while sampling. An attacker can cause the...
vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels
Summary: All temperature validation gates use comparison operators, which silently evaluate to False for NaN and for positive Infinity in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors tha...
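The comparison behavior described in this entry can be reproduced in a few lines. The sketch below is not vLLM's code and does not represent the actual fix, which the snippet does not describe: `validate_temperature_naive` is a hypothetical guard that relies only on a comparison, and `validate_temperature_strict` is one possible hardening, assuming `math.isfinite` is an acceptable finiteness check.

```python
import math


def validate_temperature_naive(temperature: float) -> float:
    """Hypothetical guard that rejects bad values with a comparison only."""
    # NaN < 0.0 and inf < 0.0 are both False, so neither value is rejected.
    if temperature < 0.0:
        raise ValueError(f"temperature must be non-negative, got {temperature}")
    return temperature


def validate_temperature_strict(temperature: float) -> float:
    """Hypothetical hardened guard: reject non-finite values explicitly."""
    if not math.isfinite(temperature) or temperature < 0.0:
        raise ValueError(
            f"temperature must be finite and non-negative, got {temperature}"
        )
    return temperature
```

Calling `validate_temperature_naive(float("nan"))` or `validate_temperature_naive(float("inf"))` returns the value unchanged, while the strict variant raises `ValueError` for both and still accepts ordinary values such as `0.7`.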
GHSA-V9PG-7XVM-68HF vulnerabilities
Vulnerabilities for packages: litellm, tritonserver-backend-vllm-cuda-12.9...
GHSA-6JV3-5F52-599M vulnerabilities
Vulnerabilities for packages: litellm, tritonserver-backend-vllm-cuda-12.9, airflow-postgres-fips, wazuh-manager-fips, airflow-core...
CVE-2026-53540 vulnerabilities
Vulnerabilities for packages: litellm, tritonserver-backend-vllm-cuda-12.9...
CVE-2026-53537 vulnerabilities
Vulnerabilities for packages: litellm, tritonserver-backend-vllm-cuda-12.9, airflow-postgres-fips, wazuh-manager-fips, airflow-core...