9.6 High (CVSS3)
Attack Vector: NETWORK
Attack Complexity: LOW
Privileges Required: NONE
User Interaction: REQUIRED
Scope: CHANGED
Confidentiality Impact: HIGH
Integrity Impact: HIGH
Availability Impact: HIGH
Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

8.3 High (AI Score), Confidence: Low

6.8 Medium (CVSS2)
Access Vector: NETWORK
Access Complexity: MEDIUM
Authentication: NONE
Confidentiality Impact: PARTIAL
Integrity Impact: PARTIAL
Availability Impact: PARTIAL
Vector: AV:N/AC:M/Au:N/C:P/I:P/A:P

EPSS: 0.0004 Low, Percentile: 8.7%
llama-cpp-python provides the Python bindings for llama.cpp. It relies on the Llama class in llama.py to load .gguf llama.cpp models. The __init__ constructor of Llama takes several parameters that configure how the model is loaded and run. Besides NUMA, LoRA settings, tokenizer loading, and hardware settings, __init__ also reads the chat template from the targeted .gguf file's metadata and passes it to llama_chat_format.Jinja2ChatFormatter.to_chat_handler() to construct the self.chat_handler for the model. However, Jinja2ChatFormatter parses the chat template from the metadata with a sandbox-less jinja2.Environment, which is then rendered in __call__ to construct the interaction prompt. This permits Jinja2 server-side template injection, leading to remote code execution via a carefully constructed payload.
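The core of the flaw can be sketched with plain jinja2: a template rendered in a default (sandbox-less) Environment may walk Python's object graph to reach __import__ and run arbitrary code, whereas SandboxedEnvironment refuses the same attribute access. The payload string below is an illustrative stand-in for attacker-controlled .gguf metadata, not the published exploit; llama-cpp-python's actual code path runs through llama_chat_format.Jinja2ChatFormatter.

```python
# Minimal SSTI sketch (assumes jinja2 is installed). The "chat template"
# below stands in for attacker-controlled .gguf metadata.
from jinja2 import Environment
from jinja2.sandbox import SandboxedEnvironment, SecurityError

# From jinja2's built-in `self` (a TemplateReference), reach the runtime
# module's globals, then __builtins__, then __import__ -- arbitrary code
# execution from there. Here it merely leaks the working directory.
payload = (
    "{{ self.__init__.__globals__.__builtins__"
    "['__import__']('os').getcwd() }}"
)

# Sandbox-less Environment (the vulnerable pattern): the injected
# expression executes during rendering.
leaked = Environment().from_string(payload).render()
print("unsandboxed render:", leaked)

# SandboxedEnvironment: unsafe (underscore-prefixed) attribute access
# is refused and rendering fails with SecurityError.
try:
    SandboxedEnvironment().from_string(payload).render()
    print("sandbox: rendered (unexpected)")
except SecurityError:
    print("sandbox: SecurityError raised")
```

The remediation in llama-cpp-python followed the same shape: render untrusted chat templates through jinja2's immutable sandbox instead of a bare Environment.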