7 matches found
EUVD-2025-23049
Malicious code in bioql PyPI...
Integer Overflow leads to DoS in API `v2/models/<model-name>/infer`
This report is not public...
Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-Cache in LLM Inference
The Key-Value (KV) cache, which stores intermediate attention computations (Key and Value pairs) to avoid redundant calculations, is a fundamental mechanism for accelerating Large Language Model (LLM) inference. However, this efficiency optimization introduces significant yet underexplored privacy risk...
CVE-2025-6920 Ai-inference-server: authentication bypass via unprotected inference endpoint in api
A flaw was found in the authentication enforcement mechanism of a model inference API in ai-inference-server. All /v1/ endpoints are expected to enforce API key validation. However, the POST /invocations endpoint failed to do so, resulting in an authentication bypass. This vulnerability allows...
An Attack to Break Permutation-Based Private Third-Party Inference Schemes for LLMs
Recent advances in Large Language Models (LLMs) have led to the widespread adoption of third-party inference services, raising critical privacy concerns. Existing methods of performing private third-party inference, such as Secure Multiparty Computation (SMPC), often rely on cryptographic methods...
CVE-2025-32375
Summary: CVE-2025-32375 affects BentoML prior to version 1.4.8 and is caused by insecure deserialization in BentoML’s runner server. The vulnerability allows an attacker to craft POST requests with specific headers/parameters to execute arbitrary code on the server, giving initial access and informatio...
CVE-2025-27520
BentoML is a Python library for building online serving systems optimized for AI apps and model inference. A Remote Code Execution (RCE) vulnerability caused by insecure deserialization has been identified in the latest version (v1.4.2) of BentoML. It allows any unauthenticated user to execute...
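
Both BentoML entries above attribute remote code execution to insecure deserialization. As a general illustration of that vulnerability class, and not BentoML's actual code or its fix, the sketch below shows one common mitigation in Python: an unpickler that only resolves globals from an explicit allow-list. The names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical, chosen for this example only.

```python
import io
import pickle

# Hypothetical allow-list for this sketch: the only (module, name) pairs
# the unpickler is permitted to resolve. Plain containers, strings, and
# numbers need no entry because they never go through find_class.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        # Rejecting here prevents payloads from reaching callables such
        # as eval or os.system during deserialization.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize bytes using the allow-list-restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

An allow-list narrows what a pickle payload can reach but does not make pickle safe for untrusted input in general. Where the data comes from unauthenticated requests, a data-only format such as JSON is usually the more robust choice.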