4 matches found
vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters
Summary: The extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters...
PT-2025-23227 · Vllm · Vllm
Name of the Vulnerable Software and Affected Versions: vLLM versions 0.8.0 through 0.8.x
Description: The issue is a Denial of Service ReDoS that causes the vLLM server to crash if an invalid regex is provided while using structured output. This is similar to a previously identified issue, but it...
PT-2025-18215 · Vllm +1 · Vllm +1
Name of the Vulnerable Software and Affected Versions: vLLM versions 0.5.2 through 0.8.5
Description: The issue affects vLLM, a high-throughput and memory-efficient inference and serving engine for LLMs. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes,...
PT-2025-6000 · Vllm +1 · Vllm +1
Name of the Vulnerable Software and Affected Versions: vLLM versions prior to 0.7.2
Description: Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. The issue arises from the use of...