GitHub Advisory Database: GHSA-H9J7-5XVC-QHG5
Feb 26, 2024 - 6:30 p.m.

langchain Server-Side Request Forgery vulnerability

2024-02-26 18:30:29
CWE-918
GitHub Advisory Database
github.com
server-side request forgery
vulnerability
crawler
configuration
attacker
control
malicious html
download
prevent_outside
patch

CVSS3: 3.7

Attack Vector: LOCAL
Attack Complexity: HIGH
Privileges Required: HIGH
User Interaction: REQUIRED
Scope: CHANGED
Confidentiality Impact: LOW
Integrity Impact: LOW
Availability Impact: NONE

Vector: CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N

AI Score: 4 (Confidence: High)
EPSS: 0.001 (Percentile: 26.4%)

With the following crawler configuration:

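from bs4 import BeautifulSoup as Soup
from langchain_community.document_loaders.recursive_url_loader import RecursiveUrlLoader

url = "https://example.com"
# prevent_outside is not set here; the advisory treats it as True
loader = RecursiveUrlLoader(
    url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()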

An attacker who controls the contents of https://example.com could place a malicious HTML file there containing links such as “https://example.completely.different/my_file.html”, and the crawler would download that file as well, even though prevent_outside=True.

https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51

Resolved in https://github.com/langchain-ai/langchain/pull/15559
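
The example link in the description begins with the literal string https://example.com, which suggests the bypass may come from a string-prefix comparison. The following is a minimal sketch under that assumption, not the loader's actual code or the patch. The helper names is_outside_prefix and is_outside_host are hypothetical and exist only to contrast a prefix check with a comparison of parsed scheme and host.

```python
from urllib.parse import urlparse


def is_outside_prefix(base_url: str, link: str) -> bool:
    """Assumed naive check: a link is 'outside' unless it starts with base_url."""
    return not link.startswith(base_url)


def is_outside_host(base_url: str, link: str) -> bool:
    """Stricter check: compare the parsed scheme and host instead of raw strings."""
    base, target = urlparse(base_url), urlparse(link)
    return (base.scheme, base.netloc.lower()) != (target.scheme, target.netloc.lower())


base = "https://example.com"
evil = "https://example.completely.different/my_file.html"
same = "https://example.com/docs/page.html"

# The prefix check wrongly treats the attacker's host as being inside the site.
print(is_outside_prefix(base, evil))  # False
# Comparing parsed hosts flags it as outside, while same-site links still pass.
print(is_outside_host(base, evil))    # True
print(is_outside_host(base, same))    # False
```

Comparing parsed components rather than raw strings is a general precaution for crawlers that follow untrusted links; whether it matches the change made in the pull request above should be checked against that pull request.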

Affected configurations

Vulners
Node: langchain langchain, Range: <0.1.0

Vendor: langchain
Product: langchain
Version: *
CPE: cpe:2.3:a:langchain:langchain:*:*:*:*:*:*:*:*
