125 matches found
CVE-2026-56340
A flaw was found in vLLM. This vulnerability allows a remote attacker to trigger crashes or resource exhaustion, leading to a denial of service (DoS). By submitting specially crafted embedding requests with malformed tensor indices, when the prompt-embeds feature is enabled, an attacker could also...
CVE-2026-48746
A flaw was found in vLLM, an inference and serving engine for large language models (LLMs). This vulnerability, residing in ASGI web servers and Starlette's trust in them, allows an attacker to bypass the OpenAI API Authentication Middleware. This bypass enables unauthorized access to the API witho...
CVE-2026-41523
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.22.0, an assert-based security check in vLLM's activation function loading allows any unauthenticated attacker to achieve arbitrary code execution on the server by publishing a malicious HuggingFace model, when vLL...
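The truncated snippet above does not say why the assert-based check fails. One general reason such checks are fragile in Python is that `assert` statements are removed when code is compiled with optimization (for example `python -O`). The sketch below illustrates only that general behavior; the allow-list and function names are hypothetical and are not vLLM's code.

```python
# Hypothetical allow-list; illustrative only, not taken from vLLM.
ALLOWED_ACTIVATIONS = {"gelu", "relu", "silu"}

ASSERT_BASED_SOURCE = '''
def load_activation(name):
    assert name in ALLOWED, "activation not allowed"
    return name
'''


def build_loader(optimize: int):
    """Compile the assert-based loader at the given optimization level.

    optimize=0 keeps assert statements; optimize=1 strips them,
    which mirrors running the interpreter with -O.
    """
    namespace = {"ALLOWED": ALLOWED_ACTIVATIONS}
    code = compile(ASSERT_BASED_SOURCE, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["load_activation"]


def load_activation_checked(name: str) -> str:
    """Explicit check that still runs when asserts are stripped."""
    if name not in ALLOWED_ACTIVATIONS:
        raise ValueError(f"activation {name!r} is not allowed")
    return name
```

With `build_loader(1)`, a name outside the allow-list is returned unchanged, whereas `load_activation_checked` raises regardless of optimization level.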
CVE-2026-48746
vLLM is an inference and serving engine for large language models (LLMs). From 0.3.0 until 0.22.0, a vulnerability in ASGI web servers and Starlette's trust in those web servers enables an authentication bypass of the OpenAI API AuthenticationMiddleware. It allows use of the API without providing t...
CVE-2025-71379
vLLM versions >= 0.6.3 and < 0.9.0 contain multiple regular expression denial of service (ReDoS) vulnerabilities. Several regex patterns (in vllm/lora/utils.py, the phi4mini tool parser, and the OpenAI-compatible serving chat endpoint) are susceptible to catastrophic backtracking. An attacker...
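As a rough illustration of catastrophic backtracking in general, the sketch below contrasts a textbook pattern with nested quantifiers against an equivalent single-quantifier pattern, plus an input-length cap. The patterns, the cap, and the function name are assumptions chosen for illustration; none of them come from the vLLM files named above.

```python
import re

# Textbook nested-quantifier pattern: on a long run of "a" followed by a
# non-matching character, the backtracking engine explores exponentially
# many ways to split the run. NOT a pattern from vLLM.
VULNERABLE = re.compile(r"^(a+)+$")

# Accepts the same strings with a single quantifier, so a failed match
# backtracks only linearly.
SAFE = re.compile(r"^a+$")

# Hypothetical upper bound on attacker-controlled input.
MAX_INPUT_LEN = 4096


def matches_safely(text: str) -> bool:
    """Reject oversized input first, then match with the linear pattern."""
    if len(text) > MAX_INPUT_LEN:
        raise ValueError("input too long")
    return SAFE.match(text) is not None
```

Calling `VULNERABLE.match("a" * 40 + "!")` is the kind of input that can stall a worker; `matches_safely` returns `False` for the same string promptly.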
Improper Handling of Highly Compressed Data (Data Amplification)
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Improper Handling of Highly Compressed Data (Data Amplification) through the audio.py file. An attacker can cause excessive memory consumption by...
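The snippet above is truncated before it describes the attack, so the following is only a generic sketch of how data amplification is commonly bounded: cap the decoded size instead of trusting the compressed size. Here `zlib` stands in for whatever codec is involved, and the limit and function name are hypothetical.

```python
import zlib

# Hypothetical budget for decoded bytes per request.
MAX_DECODED_BYTES = 1_000_000


def bounded_decompress(blob: bytes, limit: int = MAX_DECODED_BYTES) -> bytes:
    """Decompress at most `limit` bytes and fail if the payload is larger."""
    decompressor = zlib.decompressobj()
    # Ask for one byte more than the limit so an oversized payload is
    # detectable without materializing all of it.
    out = decompressor.decompress(blob, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed payload exceeds limit")
    return out
```

A few kilobytes of compressed zeros can expand to many megabytes, so the check has to apply to the output size.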
Incorrect Conversion between Numeric Types
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Incorrect Conversion between Numeric Types in the ggml_dequantize, ggml_mul_mat_vec_a8, ggml_mul_mat_a8, and ggml_moe_a8 functions when tensor dimensions are...
vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels
Summary: All temperature validation gates use comparison operators, which silently evaluate to False for NaN and for positive Infinity in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors tha...
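The comparison behavior described above can be reproduced in plain Python. The sketch below is a minimal illustration, assuming a single lower-bound guard; the function names and error messages are hypothetical and are not vLLM's validation code.

```python
import math


def validate_temperature_naive(temperature: float) -> float:
    # Comparison-only guard: `nan < 0.0` and `inf < 0.0` are both False,
    # so neither value is rejected here.
    if temperature < 0.0:
        raise ValueError("temperature must be non-negative")
    return temperature


def validate_temperature(temperature: float) -> float:
    # Reject NaN and +/-Infinity explicitly before the range comparison.
    if not math.isfinite(temperature) or temperature < 0.0:
        raise ValueError("temperature must be a finite, non-negative number")
    return temperature
```

`validate_temperature_naive(float("nan"))` returns NaN unchanged, while `validate_temperature` raises `ValueError` for both NaN and Infinity.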
HTTP Request Smuggling
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to HTTP Request Smuggling via improper validation of the Host header in the request scope. An attacker can gain unauthorized access to API endpoints by...
vLLM: OpenAI auth bypass
Summary: A vulnerability in ASGI web servers and Starlette's trust in those web servers enables an authentication bypass of the OpenAI API AuthenticationMiddleware, which was discovered during @x41sec's source code audit. It allows use of the API without providing the configured VLLM_API_KEY or...
Use of Incorrectly-Resolved Name or Reference
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Use of Incorrectly-Resolved Name or Reference through several model loading paths. An attacker can make the server load a different Hugging Face...
CVE-2026-4944 Hardcoded trust_remote_code=True in vllm-project/vllm Bypasses User Security Control
vllm-project/vllm version 0.14.1 contains a vulnerability where the trust_remote_code=True parameter is hardcoded in two model implementation files, vllm/model_executor/models/nemotron_vl.py and vllm/model_executor/models/kimi_k25.py. This bypasses the user's explicit --trust-remote-code=False setting,...
CVE-2026-4944
The provided documents describe a vulnerability in vllm-project/vllm version 0.14.1 where trust_remote_code is hardcoded to True in nemotron_vl.py and kimi_k25.py, bypassing user-specified --trust-remote-code=False and enabling remote code execution via malicious HuggingFace model repositories. T...
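The pattern described in the two CVE-2026-4944 entries above, a literal `True` overriding the user's setting, can be sketched as follows. The config class and helper functions are hypothetical stand-ins, not the code in the affected files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical stand-in for the user-facing configuration."""
    model: str
    trust_remote_code: bool = False


def loader_kwargs_hardcoded(config: ModelConfig) -> dict:
    # Problematic shape: the literal True ignores what the user asked for.
    return {"trust_remote_code": True}


def loader_kwargs(config: ModelConfig) -> dict:
    # Safer shape: forward the user's choice unchanged.
    return {"trust_remote_code": config.trust_remote_code}
```

With `ModelConfig("org/model", trust_remote_code=False)`, only `loader_kwargs` keeps remote code disabled.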
Improper Resource Shutdown or Release
Overview: vllm is a high-throughput and memory-efficient inference and serving engine for LLMs. Affected versions of this package are vulnerable to Improper Resource Shutdown or Release via the OpenAI-compatible Serving Path component. An attacker can cause the service to become unavailable by...
EUVD-2026-31810
A vulnerability was identified in vllm-project vllm 0.19.0. This issue affects some unknown processing of the component OpenAI-compatible Serving Path. Such manipulation leads to denial of service. It is possible to launch the attack remotely. The exploit is publicly available and might be used...
CVE-2026-44222 vLLM: Remote DoS via Special-Token Placeholders
vLLM is an inference and serving engine for large language models (LLMs). From 0.6.1 to before 0.20.0, there is a token injection vulnerability in vLLM's multimodal processing. Unauthenticated, text-only prompts that spell special tokens are interpreted as control. Image and video placeholder...
vLLM input validation error vulnerability
vLLM is an open-source inference and serving engine for LLMs, featuring high throughput and efficient memory usage. Versions of vLLM from 0.6.1 to before 0.20.0 contained an input validation vulnerability. This vulnerability stemmed from token injection issues during...
vLLM security vulnerability
vLLM is an open-source inference and serving engine for LLMs that features high throughput and efficient memory usage. Versions of vLLM prior to 0.20.0 contained a security vulnerability. This vulnerability stemmed from the extract_hidden_states speculative decoding proposal, which returned tensor...