3 matches found
PYSEC-2025-53
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed and the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (time to first token). These timing differences...
PYSEC-2025-53
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed and the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (time to first token). These timing differences...
CVE-2025-46570
CVE-2025-46570 concerns vLLM, an inference and serving engine for LLMs. Connected records describe a vulnerability in the PagedAttention-based prefill path: when a new prompt is processed, a matching prefix chunk can accelerate prefill, creating timing differences (TTFT) that co...
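
The prefix-matching behavior described in these entries can be illustrated with a small toy model. This is a sketch under stated assumptions, not vLLM's implementation: `ToyPrefixCache`, `chunk_keys`, and `CHUNK_SIZE` are hypothetical names, and the number of tokens left to compute stands in for TTFT. It only shows why a prompt that shares a cached prefix chunk is cheaper to prefill than one that does not.

```python
from typing import List, Set, Tuple

CHUNK_SIZE = 4  # hypothetical chunk size in tokens (assumption, not a vLLM value)


def chunk_keys(tokens: List[int]) -> List[Tuple[int, ...]]:
    """Return one key per full chunk; each key covers the whole prefix up to that chunk."""
    full = len(tokens) - len(tokens) % CHUNK_SIZE
    return [tuple(tokens[:end]) for end in range(CHUNK_SIZE, full + 1, CHUNK_SIZE)]


class ToyPrefixCache:
    """Toy stand-in for a shared prefix cache (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[int, ...]] = set()

    def prefill_cost(self, tokens: List[int]) -> int:
        """Return how many tokens must be computed; a proxy for TTFT."""
        keys = chunk_keys(tokens)
        hits = 0
        for key in keys:
            if key not in self._seen:
                break  # the first miss ends the reusable prefix
            hits += 1
        self._seen.update(keys)  # this prompt's chunks are now cached too
        return len(tokens) - hits * CHUNK_SIZE


cache = ToyPrefixCache()
victim_prompt = list(range(12))
cache.prefill_cost(victim_prompt)  # an earlier request populates the cache

unrelated_guess = list(range(50, 62))
matching_guess = list(range(8)) + [99, 98, 97, 96]

cost_unrelated = cache.prefill_cost(unrelated_guess)
cost_matching = cache.prefill_cost(matching_guess)
print(cost_unrelated, cost_matching)
```

In this model the guess that shares the first two chunks with the earlier prompt has a lower cost than the unrelated guess, and that gap is the kind of observable difference the entries above attribute to TTFT.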