CVE-2026-53923
A flaw was found in vLLM. Integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels leads to partial tensor processing. This results in the output tensor retaining previously used GPU memory, which, in multi-tenant inference deployments, can expose sensitive tensor data from other...
CVE-2026-53923
vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via...
PT-2026-50472
Name of the Vulnerable Software and Affected Versions: vLLM versions 0.5.5 through 0.23.1rc0
Description: Integer truncation of tensor dimensions in GGUF dequantize kernels within csrc/quantization/gguf/gguf_kernel.cu leads to partial tensor processing. The output tensor is allocated at full size...
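
The class of bug described in these entries can be illustrated with a small, self-contained sketch. It is not vLLM code: the function names, the `bits` parameter, and the "SECRET" placeholder are hypothetical, and the integer width is scaled down to 8 bits in the demo so the example runs instantly. It shows only the general pattern of a full-size output buffer combined with a loop bound derived from a narrowed element count.

```python
def narrow(value: int, bits: int = 32) -> int:
    """Mimic a C-style narrowing conversion to a signed integer of `bits` width."""
    mask = (1 << bits) - 1
    value &= mask
    # Reinterpret the top bit as the sign bit, as two's complement does.
    return value - (1 << bits) if value >> (bits - 1) else value


def dequantize_sketch(quantized, stale, bits=32, scale=0.5):
    """Hypothetical stand-in for a dequantize kernel launch; not vLLM code."""
    total = len(quantized)
    # Full-size output that still holds whatever was in memory before,
    # standing in for an uninitialized device allocation.
    out = list(stale[:total])
    # Assumption: the element count is narrowed before the loop bound is derived.
    processed = max(narrow(total, bits), 0)
    for i in range(min(processed, total)):
        out[i] = quantized[i] * scale
    return out, processed


# Demo with an 8-bit counter: 260 elements wrap around to 4.
quantized = [2] * 260
stale = ["SECRET"] * 260
out, processed = dequantize_sketch(quantized, stale, bits=8)
print(processed)   # 4
print(out[:5])     # [1.0, 1.0, 1.0, 1.0, 'SECRET']
```

Only the first four elements are overwritten; the remaining 256 still carry the previous contents of the buffer, which is the kind of exposure the entries describe. A common mitigation for this pattern, stated here as a general practice rather than as the actual vLLM fix, is to keep dimensions in 64-bit types end to end or to reject sizes that do not fit the narrower type.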