8 matches found
A Bayesian Network Approach for Enhancing Security-Focused Decision Support Systems
The adoption and integration of heterogeneous stacks in most of today's open-source based networks brings clear benefits like interoperability and availability of advanced features. Yet, on the other hand the increasing number of interconnecting components and moving parts requires maintaining an...
EUVD-2025-23049
Malicious code in bioql PyPI...
Integer Overflow lead to DOS in API `v2/models/<model-name>/infer`
This report is not public...
Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-Cache in LLM Inference
The Key-Value KV cache, which stores intermediate attention computations Key and Value pairs to avoid redundant calculations, is a fundamental mechanism for accelerating Large Language Model LLM inference. However, this efficiency optimization introduces significant yet underexplored privacy risk...
CVE-2025-6920 Ai-inference-server: authentication bypass via unprotected inference endpoint in api
A flaw was found in the authentication enforcement mechanism of a model inference API in ai-inference-server. All /v1/ endpoints are expected to enforce API key validation. However, the POST /invocations endpoint failed to do so, resulting in an authentication bypass. This vulnerability allows...
An Attack to Break Permutation-Based Private Third-Party Inference Schemes for LLMs
Recent advances in Large Language Models LLMs have led to the widespread adoption of third-party inference services, raising critical privacy concerns. Existing methods of performing private third-party inference, such as Secure Multiparty Computation SMPC, often rely on cryptographic methods...
CVE-2025-32375
Summary: CVE-2025-32375 affects BentoML prior to version 1.4.8, due to an insecure deserialization in BentoML’s runner server. The vulnerability allows an attacker to craft POST requests with specific headers/parameters to execute arbitrary code on the server, giving initial access and informatio...
CVE-2025-27520
BentoML is a Python library for building online serving systems optimized for AI apps and model inference. A Remote Code Execution RCE vulnerability caused by insecure deserialization has been identified in the latest version v1.4.2 of BentoML. It allows any unauthenticated user to execute...