Lucene search

K
githubGitHub Advisory DatabaseGHSA-79W7-VH3H-8G4J
HistoryJul 02, 2024 - 3:58 p.m.

yt-dlp File system modification and RCE through improper file-extension sanitization

2024-07-0215:58:35
CWE-434
CWE-669
GitHub Advisory Database
github.com
9
yt-dlp
file system modification
rce
improper file-extension
version 2024.07.01
output template
extraction options

CVSS3

7.8

Attack Vector

LOCAL

Attack Complexity

LOW

Privileges Required

NONE

User Interaction

REQUIRED

Scope

UNCHANGED

Confidentiality Impact

HIGH

Integrity Impact

HIGH

Availability Impact

HIGH

CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

AI Score

7.7

Confidence

High

EPSS

0

Percentile

13.2%

Summary

yt-dlp does not limit the extensions of downloaded files, which could lead to arbitrary filenames being created in the download folder (and path traversal on Windows). Since yt-dlp also reads config from the working directory (and on Windows executables will be executed from the yt-dlp directory) this could lead to arbitrary code being executed.

Patches

yt-dlp version 2024.07.01 fixes this issue by whitelisting the allowed extensions.
This means some very uncommon extensions might not get downloaded; however, it will also limit the possible exploitation surface.

Workarounds

It is recommended to upgrade yt-dlp to version 2024.07.01 as soon as possible, always have .%(ext)s at the end of the output template, and make sure you trust the websites that you are downloading from. Also, make sure to never download to a directory within PATH or other sensitive locations like your user directory, system32, or other binaries locations.

For users not able to upgrade:

  • Make sure the extension of the media to download is a common video/audio/sub/… one
  • Try to avoid the generic extractor (--ies default,-generic)
  • Keep the default output template (-o "%(title)s [%(id)s].%(ext)s)
  • Omit any of the subtitle options (--write-subs, --write-auto-subs, --all-subs, --write-srt)
  • Use --ignore-config --config-location ... to not load config from common locations

Details

One potential exploitation might look like this:

From a mimetype we do not know, we default to trimming the leading bit and using the remainder. Given a webpage that contains

<script type="application/ld+json">
{
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "ffmpeg",
    "encodingFormat": "video/exe",
    "contentUrl": "https://example.com/video.mp4"
}
</script>

this will try and download a file called ffmpeg.exe (-o "%(title)s.%(ext)s).
ffmpeg.exe will be searched for in the current directory, and so upon the next run arbitrary code can be executed.

Alternatively, when engineering a file called yt-dlp.conf to be created, the config file could contain --exec ... and so would also execute arbitrary code.

Acknowledgement

A big thanks to @JarLob for independently finding a new application of the same underlying issue.
More can be read about on the dedicated GitHub Security Lab disclosure here: Path traversal saving subtitles (GHSL-2024-090)

References

Affected configurations

Vulners
Node
ytdlpRange<2024.07.01
VendorProductVersionCPE
ytdlp*cpe:2.3:a:yt:dlp:*:*:*:*:*:*:*:*

CVSS3

7.8

Attack Vector

LOCAL

Attack Complexity

LOW

Privileges Required

NONE

User Interaction

REQUIRED

Scope

UNCHANGED

Confidentiality Impact

HIGH

Integrity Impact

HIGH

Availability Impact

HIGH

CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

AI Score

7.7

Confidence

High

EPSS

0

Percentile

13.2%