HDF5 Group libhdf5 H5T_COMPOUND Code Execution Vulnerability (CVE-2016-4333)

2017-10-11T00:00:00
ID SSV:96651
Type seebug
Reporter Root
Modified 2017-10-11T00:00:00

Description

HDF5 is a file format that is maintained by a non-profit organization, The HDF Group. HDF5 is designed to be used for storage and organization of large amounts of scientific data and is used to exchange data structures between applications in industries such as the GIS industry via libraries such as GDAL, OGR, or as part of software like ArcGIS.

The vulnerability exists because the library allocates space for an array using a value from the file and then, within the loop that initializes that array, allows a value from the file to modify the loop's terminating condition. As a result, an attacker can cause the loop's index to point outside the bounds of the array while it is being initialized. This is a heap-based buffer overflow and can lead to code execution in the context of the application using the library.

Tested Versions

  • hdf5-1.8.16.tar.bz2
  • tools/h5ls: Version 1.8.16
  • tools/h5stat: Version 1.8.16
  • tools/h5dump: Version 1.8.16

Product Urls

  • http://www.hdfgroup.org/HDF5/
  • http://www.hdfgroup.org/HDF5/release/obtainsrc.html
  • http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.16.tar.bz2

CVSSv3 Score

8.6 -- CVSS:3.0/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

Details

The HDF file format is intended to be a general, self-describing file format for the various types of data structures used in the scientific community [1]. These data structures are stored in two types of objects, Datasets and Groups. By analogy with a file system, a Dataset can be interpreted as a file, and a Group can be interpreted as a directory that is able to contain other Datasets or Groups. Associated with each entry is metadata containing user-defined named attributes that can be used to describe the dataset.

Within the HDF file format, paths can be specified in the '/'-separated POSIX format. When reading a dataset, the library will open the object using H5D__open_oid. Inside this function, the library will read the type and its location. Once the type and its location are read, the library will pass the H5O_DTYPE_ID value along with its location to H5O_msg_read.

```
src/H5Dint.c:1221
static herr_t
H5D__open_oid(H5D_t *dataset, hid_t dapl_id, hid_t dxpl_id)
{
...
    /* Open the dataset object */
    if(H5O_open(&(dataset->oloc)) < 0)
        HGOTO_ERROR(H5E_DATASET, H5E_CANTOPENOBJ, FAIL, "unable to open")

    /* Get the type and space */
    if(NULL == (dataset->shared->type = (H5T_t *)H5O_msg_read(&(dataset->oloc), H5O_DTYPE_ID, NULL, dxpl_id)))  // XXX: \
        HGOTO_ERROR(H5E_DATASET, H5E_CANTINIT, FAIL, "unable to load type info from dataset header")
\
src/H5Omessage.c:463
void *
H5O_msg_read(const H5O_loc_t *loc, unsigned type_id, void *mesg, hid_t dxpl_id)
{
    H5O_t *oh = NULL;           /* Object header to use */
    void *ret_value;            /* Return value */
...
    /* Get the object header */
    if(NULL == (oh = H5O_protect(loc, dxpl_id, H5AC_READ)))
        HGOTO_ERROR(H5E_OHDR, H5E_CANTPROTECT, NULL, "unable to protect object header")

    /* Call the "real" read routine */
    if(NULL == (ret_value = H5O_msg_read_oh(loc->file, dxpl_id, oh, type_id, mesg)))    // XXX: read the message from the object header
        HGOTO_ERROR(H5E_OHDR, H5E_READERROR, NULL, "unable to read object header message")
```

Inside H5O_msg_read_oh, the library will use the type_id argument to determine which message type is being used for a message. This message type is used to determine which callback to use in order to handle the message. This process occurs within the macro H5O_LOAD_NATIVE at H5Omessage.c:545.

```
src/H5Omessage.c:517
void *
H5O_msg_read_oh(H5F_t *f, hid_t dxpl_id, H5O_t *oh, unsigned type_id, void *mesg)
{
    const H5O_msg_class_t *type;    /* Actual H5O class type for the ID */
    unsigned idx;                   /* Message's index in object header */
    void *ret_value = NULL;
...
    for(idx = 0; idx < oh->nmesgs; idx++)
        if(type == oh->mesg[idx].type)
            break;
...
    H5O_LOAD_NATIVE(f, dxpl_id, 0, oh, &(oh->mesg[idx]), NULL)
```

Inside the H5O_LOAD_NATIVE macro, the library will select a structure containing function pointers using the msg->type field. This structure contains various functions that are used to decode the message. When decoding a message of type H5O_DTYPE_ID, the library will dispatch into the H5O_dtype_shared_decode function. This function will eventually call H5O_dtype_decode. Inside H5O_dtype_decode, the library will then call H5O_dtype_decode_helper, which is responsible for decoding the datatypes.

```
src/H5Oshared.h:50
static H5_INLINE void *
H5O_SHARED_DECODE(H5F_t *f, hid_t dxpl_id, H5O_t *open_oh, unsigned mesg_flags, unsigned *ioflags, const uint8_t *p)
{
...
        /* Decode native message directly */
        if(NULL == (ret_value = H5O_SHARED_DECODE_REAL(f, dxpl_id, open_oh, mesg_flags, ioflags, p)))   // XXX: \
            HGOTO_ERROR(H5E_OHDR, H5E_CANTDECODE, NULL, "unable to decode native message")
    } /* end else */
\
src/H5Odtype.c:1091
static void *
H5O_dtype_decode(H5F_t *f, hid_t H5_ATTR_UNUSED dxpl_id, H5O_t H5_ATTR_UNUSED *open_oh, unsigned H5_ATTR_UNUSED mesg_flags, unsigned *ioflags/*in,out*/, const uint8_t *p)
{
...
    /* Allocate datatype message */
    if(NULL == (dt = H5T__alloc()))
        HGOTO_ERROR(H5E_RESOURCE, H5E_NOSPACE, NULL, "memory allocation failed")

    /* Perform actual decode of message */
    if(H5O_dtype_decode_helper(f, ioflags, &p, dt) < 0)
        HGOTO_ERROR(H5E_DATATYPE, H5E_CANTDECODE, NULL, "can't decode type")

```
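The dispatch step described above can be pictured with a small model. The sketch below is not libhdf5 code: every `model_` name, both decoders, and the numeric ids are hypothetical, chosen only to show a class table whose decode callback is selected by message type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decode callback type; the real library's signature is richer. */
typedef int (*model_decode_fn)(const unsigned char *p);

/* Hypothetical message class: an id plus the callback that decodes it. */
typedef struct {
    unsigned        id;
    model_decode_fn decode;
} model_msg_class_t;

/* Illustrative ids only; they are not taken from the advisory. */
enum { MODEL_OTHER_ID = 1, MODEL_DTYPE_ID = 3 };

static int model_decode_dtype(const unsigned char *p) { return 100 + p[0]; }
static int model_decode_other(const unsigned char *p) { return 200 + p[0]; }

static const model_msg_class_t model_classes[] = {
    { MODEL_OTHER_ID, model_decode_other },
    { MODEL_DTYPE_ID, model_decode_dtype },
};

/* Look up the class for a message type id; NULL when the id is unknown. */
static const model_msg_class_t *model_find_class(unsigned type_id)
{
    size_t i;

    for (i = 0; i < sizeof model_classes / sizeof model_classes[0]; i++)
        if (model_classes[i].id == type_id)
            return &model_classes[i];
    return NULL;
}

/* Dispatch through the selected class; -1 signals "no decoder available". */
static int model_decode_message(unsigned type_id, const unsigned char *p)
{
    const model_msg_class_t *cls = model_find_class(type_id);

    if (cls == NULL || cls->decode == NULL)
        return -1;
    return cls->decode(p);
}
```

Calling `model_decode_message(MODEL_DTYPE_ID, raw)` reaches `model_decode_dtype`, which mirrors how a datatype message ends up in the datatype decoder.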

Inside the decode helper, the library will read a dword from the file and use the bottom 4 bits to determine the datatype. If the datatype is H5T_COMPOUND (6), the library will enter the case at src/H5Odtype.c:260. At the beginning of this case, the library masks the flags field to obtain the number of members and allocates space for that many members.

```
src/H5Odtype.c:133
static htri_t
H5O_dtype_decode_helper(H5F_t *f, unsigned *ioflags/*in,out*/, const uint8_t **pp, H5T_t *dt)
{
...
    case H5T_COMPOUND:
        {
...
            dt->shared->u.compnd.nmembs = flags & 0xffff;
            if(dt->shared->u.compnd.nmembs == 0)
                HGOTO_ERROR(H5E_DATATYPE, H5E_BADVALUE, FAIL, "invalid number of members: %u", dt->shared->u.compnd.nmembs)
            dt->shared->u.compnd.nalloc = dt->shared->u.compnd.nmembs;  // XXX: proof-of-concept sets this to 3
            dt->shared->u.compnd.memb = (H5T_cmemb_t *)H5MM_calloc(dt->shared->u.compnd.nalloc * sizeof(H5T_cmemb_t));  // XXX: buffer that's later written to
            dt->shared->u.compnd.memb_size = 0;
```
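As a rough illustration of the header handling described above, the sketch below splits one 32-bit header word into a class, a version, and a member count. Only the low-4-bit class, the value 6 for H5T_COMPOUND, and the `flags & 0xffff` mask come from the text. The position of the version nibble and the 8-bit shift before masking are assumptions about the encoding, and all `model_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_CLASS_COMPOUND 6u   /* H5T_COMPOUND(6), per the text */

/* Hypothetical container for the decoded header fields. */
typedef struct {
    unsigned cls;      /* datatype class */
    unsigned version;  /* datatype message version (assumed position) */
    unsigned nmembs;   /* member count, meaningful for compound types only */
} model_dtype_header_t;

static model_dtype_header_t model_parse_header(uint32_t word)
{
    model_dtype_header_t h;
    uint32_t flags;

    h.cls     = (unsigned)(word & 0x0fu);         /* bottom 4 bits select the class */
    h.version = (unsigned)((word >> 4) & 0x0fu);  /* assumed: next 4 bits are the version */
    flags     = word >> 8;                        /* assumed: remaining 24 bits are class flags */
    h.nmembs  = (h.cls == MODEL_CLASS_COMPOUND) ? (unsigned)(flags & 0xffffu) : 0u;
    return h;
}
```

Under these assumptions, the word `0x00000316` describes a version 1 compound type with three members, matching the member count the proof-of-concept is said to use.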

Immediately afterwards, the library will enter a loop that is terminated by the number of members from the prior snippet. For each iteration of this loop, the library will read a number of dimensions that will be passed to the function H5T__array_create. Although the library checks that the number of dimensions read is bounded by 4, the check is done via an assertion. When the library is built in production mode [3], this assertion will be compiled out by the preprocessor.

```
src/H5Odtype.c:282
    for(i = 0; i < dt->shared->u.compnd.nmembs; i++) {      // XXX: u.array.ndims
        unsigned ndims = 0;             /* Number of dimensions of the array field */
        htri_t can_upgrade;             /* Whether we can upgrade this type's version */
        hsize_t dim[H5O_LAYOUT_NDIMS];  /* Dimensions of the array */
        H5T_t *array_dt;                /* Temporary pointer to the array datatype */
        H5T_t *temp_type;               /* Temporary pointer to the field's datatype */
...
        if(version == H5O_DTYPE_VERSION_1) {
            /* Decode the number of dimensions */
            ndims = *(*pp)++;           // XXX: ndims can be changed within the loop
            HDassert(ndims <= 4);       // XXX: assertion, if ndims > 4 then H5T_array_create will read oob
            *pp += 3;                   /*reserved bytes */
...
        } /* end if */
...
        if(version == H5O_DTYPE_VERSION_1) {
...
            if((array_dt = H5T__array_create(temp_type, ndims, dim)) == NULL) {     // XXX: ndims is passed here
...
        } /* end if */
```
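The difference between an assertion and a runtime check can be sketched as follows, assuming HDassert behaves like the standard C `assert` macro (which expands to nothing when `NDEBUG` is defined). The `model_` names and the `MODEL_MAX_V1_DIMS` constant are hypothetical; the bound of 4 is taken from the text.

```c
#include <assert.h>

#define MODEL_MAX_V1_DIMS 4u   /* bound stated in the text; the name is hypothetical */

/* Reports whether assert() is active in this translation unit. */
static int model_asserts_enabled(void)
{
#ifdef NDEBUG
    return 0;   /* release-style build: assert() expands to nothing */
#else
    return 1;   /* debug-style build: assert() aborts on failure */
#endif
}

/*
 * Runtime validation that survives NDEBUG builds.
 * Returns 0 when ndims is within the bound and -1 otherwise, so a caller
 * can reject the file instead of continuing with an unchecked value.
 */
static int model_check_ndims(unsigned ndims)
{
    return (ndims <= MODEL_MAX_V1_DIMS) ? 0 : -1;
}
```

With the value `0x80` seen in the crash analysis, `model_check_ndims` returns -1 regardless of build mode, whereas an assertion would only catch it in a build where `model_asserts_enabled()` returns 1.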

Inside H5T__array_create, the library will use the ndims value as the terminator of a loop. This loop is used to calculate the size of the array. Because the index can run out of bounds of the 4-element array, the loop can assign an arbitrary value to u.array.ndims and u.array.nelem. These fields are members of a union within the structure they are written to, and can therefore be used to change the length of the loop after the space has already been allocated.

```
src/H5Tarray.c:179
H5T_t *
H5T__array_create(H5T_t *base, unsigned ndims, const hsize_t dim[/* ndims */])
{
    H5T_t *ret_value;   /* new array data type */
    unsigned u;         /* local index variable */
...
    /* Build new type */
    if(NULL == (ret_value = H5T__alloc()))
        HGOTO_ERROR(H5E_RESOURCE, H5E_NOSPACE, NULL, "memory allocation failed")
    ret_value->shared->type = H5T_ARRAY;
...
    /* Set the array parameters */
    ret_value->shared->u.array.ndims = ndims;   // XXX: writes to u.compnd.nmembs

    /* Copy the array dimensions & compute the # of elements in the array */
    for(u = 0, ret_value->shared->u.array.nelem = 1; u < ndims; u++) {
        H5_CHECKED_ASSIGN(ret_value->shared->u.array.dim[u], size_t, dim[u], hsize_t);
        ret_value->shared->u.array.nelem *= (size_t)dim[u];             // XXX: multiply using uninitialized values. writes to u.compnd.nalloc
    } /* end for */

    /* Set the array's size (number of elements * element datatype's size) */
    ret_value->shared->size = ret_value->shared->parent->shared->size * ret_value->shared->u.array.nelem;   // XXX

...
    FUNC_LEAVE_NOAPI(ret_value)
} /* end H5T__array_create */
```

The structures that overlap are located within the H5T_shared_t definition at src/H5Tpkg.h:288. In this structure, the "u" field is a union containing both an H5T_array_t and an H5T_compnd_t, both of which are used within the loop explained in the prior snippet.

```
src/H5Tpkg.h:288
typedef struct H5T_shared_t {
    hsize_t fo_count;           /* number of references to this file object */
...
    struct H5T_t *parent;       /*parent type for derived datatypes */
    union {
        H5T_atomic_t atomic;    /* an atomic datatype */
        H5T_compnd_t compnd;    /* a compound datatype (struct) */
        H5T_enum_t enumer;      /* an enumeration type (enum) */
        H5T_vlen_t vlen;        /* a variable-length datatype */
        H5T_opaque_t opaque;    /* an opaque datatype */
        H5T_array_t array;      /* an array datatype */
    } u;
} H5T_shared_t;
```

In these structures, H5T_array_t.nelem occupies the same storage as H5T_compnd_t.nalloc, and H5T_array_t.ndims occupies the same storage as H5T_compnd_t.nmembs. These are defined below. The fields that are used to control the allocation and the loop are marked.

```
src/H5Tpkg.h:273
typedef struct H5T_array_t {
    size_t nelem;               /* total number of elements in array */     // XXX: modified using elements outside of the dims variable
    unsigned ndims;             /* member dimensionality */                 // XXX: modified inside H5T__array_create
    size_t dim[H5S_MAX_RANK];   /* size in each dimension */
} H5T_array_t;

src/H5Tpkg.h:217
typedef struct H5T_compnd_t {
    unsigned nalloc;            /*num entries allocated in MEMB array*/     // XXX: used to control the allocation
    unsigned nmembs;            /*number of members defined in struct*/     // XXX: used to terminate the loop
    H5T_sort_t sorted;          /*how are members sorted? */
    hbool_t packed;             /*are members packed together? */
    H5T_cmemb_t *memb;          /*array of struct members */
    size_t memb_size;           /*total of all member sizes */
} H5T_compnd_t;
```
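The field overlap can be checked with a reduced model of the two structures. The sketch below keeps only the first two fields of each and uses `uint32_t` in place of `size_t` and `unsigned`, which is an assumption meant to mimic the 32-bit (i386) build shown in the crash logs; on a platform with an 8-byte `size_t` the offsets would differ. All `model_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced model of the leading fields of the array view (32-bit layout assumed). */
typedef struct {
    uint32_t nelem;
    uint32_t ndims;
} model_array_t;

/* Reduced model of the leading fields of the compound view. */
typedef struct {
    uint32_t nalloc;
    uint32_t nmembs;
} model_compnd_t;

/* The two views share storage, as in the "u" union. */
typedef union {
    model_array_t  array;
    model_compnd_t compnd;
} model_u_t;

/* Returns 1 when nelem/nalloc and ndims/nmembs sit at the same offsets. */
static int model_fields_alias(void)
{
    return offsetof(model_array_t, nelem) == offsetof(model_compnd_t, nalloc)
        && offsetof(model_array_t, ndims) == offsetof(model_compnd_t, nmembs);
}

/* Writes ndims through the array view and reads nmembs back through the compound view. */
static uint32_t model_nmembs_after_array_write(uint32_t nmembs, uint32_t ndims)
{
    model_u_t u;

    u.compnd.nalloc = nmembs;
    u.compnd.nmembs = nmembs;
    u.array.ndims   = ndims;    /* store through the other union member */
    return u.compnd.nmembs;
}
```

In this model, starting from a member count of 3 and storing `0x80` through the array view leaves `nmembs` at `0x80` while the allocation was sized for 3.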

Referring back to the loop, these two fields are used to control when the loop terminates. Since u.array.ndims lets the library modify the value of u.compnd.nmembs, the code at line 391 will write outside the bounds of the allocation. This is a heap-based buffer overflow and can lead to code execution in the context of the application using the library.

```
src/H5Odtype.c:282
    for(i = 0; i < dt->shared->u.compnd.nmembs; i++) {      // XXX: u.array.ndims
...
src/H5Odtype.c:391
...
        /* Member size */
        dt->shared->u.compnd.memb[i].size = temp_type->shared->size;    // XXX: writes outside of bounds of loop.
        dt->shared->u.compnd.memb_size += temp_type->shared->size;

        /* Set the field datatype (finally :-) */
        dt->shared->u.compnd.memb[i].type = temp_type;

```
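The effect of a loop bound that grows after the allocation can be simulated safely. The sketch below is a simplified model, not the library's code and not necessarily how the vendor addressed the issue: it sizes a member array from the initial count, changes the count during the first iteration, and uses a bounds-checked store so the first out-of-range index is reported instead of written. All `model_` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a compound member and its container. */
typedef struct { size_t size; } model_memb_t;

typedef struct {
    unsigned      nalloc;   /* entries actually allocated */
    unsigned      nmembs;   /* loop bound, changeable mid-loop in this model */
    model_memb_t *memb;
} model_members_t;

/* Bounds-checked store: refuses to write past the allocation. */
static int model_store_size(model_members_t *c, unsigned i, size_t size)
{
    if (i >= c->nalloc)
        return -1;
    c->memb[i].size = size;
    return 0;
}

/*
 * Runs the member loop with `nmembs` entries allocated, replacing the loop
 * bound with `corrupt_to` during the first iteration. Returns the first
 * index that would fall outside the allocation, -1 if every store was in
 * range, or -2 if the allocation failed.
 */
static long model_first_rejected_index(unsigned nmembs, unsigned corrupt_to)
{
    model_members_t c;
    long rejected = -1;
    unsigned i;

    c.nalloc = nmembs;
    c.nmembs = nmembs;
    c.memb = calloc(c.nalloc, sizeof *c.memb);
    if (c.memb == NULL)
        return -2;

    for (i = 0; i < c.nmembs; i++) {
        if (i == 0)
            c.nmembs = corrupt_to;      /* loop bound changes after allocation */
        if (model_store_size(&c, i, 4) < 0) {
            rejected = (long)i;         /* an unchecked store would overflow here */
            break;
        }
    }

    free(c.memb);
    return rejected;
}
```

With three allocated members and the bound raised to `0x80`, the model reports index 3 as the first store that an unchecked loop would place outside the allocation.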

Crash Analysis

```
$ gdb -q --args bin/h5stat poc.hdf
1542    ../../../tools/h5stat/h5stat.c: No such file or directory.
(gdb) bp src/H5Odtype.c:278
Breakpoint 4 at 0xb6b04b3f: file ../../src/H5Odtype.c, line 278.
(gdb) bp src/H5Odtype.c:312
Breakpoint 5 at 0xb6b07356: file ../../src/H5Odtype.c, line 312.
(gdb) bp src/H5Odtype.c:352
Breakpoint 6 at 0xb6b091f7: file ../../src/H5Odtype.c, line 352.
(gdb) bp src/H5Odtype.c:392
Breakpoint 7 at 0xb6b0a852: file ../../src/H5Odtype.c, line 392.
(gdb) r
Starting program: $HOME/hdf5-1.8.16/release/bin/h5stat poc.hdf
Filename: poc.hdf

Breakpoint 3, H5O_dtype_decode_helper (f=f@entry=0x83f0e48, ioflags=ioflags@entry=0xbfffed6c, pp=pp@entry=0xbfffed1c, dt=dt@entry=0x83df358) at ../../src/H5Odtype.c:278
278     dt->shared->u.compnd.memb = (H5T_cmemb_t *)H5MM_calloc(dt->shared->u.compnd.nalloc * sizeof(H5T_cmemb_t));
(gdb) p dt->shared->u.compnd.nalloc * sizeof(H5T_cmemb_t)
$1 = 0x30
(gdb) n
279     dt->shared->u.compnd.memb_size = 0;
(gdb) p dt->shared->u.compnd.memb
$2 = (H5T_cmemb_t *) 0x83f4070
(gdb) ba dt->shared->u.compnd.memb + dt->shared->u.compnd.nalloc * sizeof(H5T_cmemb_t)
Hardware watchpoint 7: *(dt->shared->u.compnd.memb + dt->shared->u.compnd.nalloc * sizeof(H5T_cmemb_t))

(gdb) c
Continuing.
Hardware watchpoint 7: *(dt->shared->u.compnd.memb + dt->shared->u.compnd.nalloc * sizeof(H5T_cmemb_t))

Old value = 0x0
New value = <unreadable>
H5T__array_create (base=base@entry=0x83df448, ndims=ndims@entry=0x80, dim=dim@entry=0xbfffebc8) at ../../src/H5Tarray.c:206
206     ret_value->shared->u.array.nelem *= (size_t)dim[u];

(gdb) c
Continuing.

Breakpoint 6, H5O_dtype_decode_helper (f=f@entry=0x83f0e48, ioflags=ioflags@entry=0xbfffed6c, pp=pp@entry=0xbfffed1c, dt=dt@entry=0x83df358) at ../../src/H5Odtype.c:392
392     dt->shared->u.compnd.memb[i].size = temp_type->shared->size;
(gdb) n

Catchpoint 2 (signal SIGSEGV), 0x08148372 in H5O_dtype_decode_helper (f=f@entry=0x83f0e48, ioflags=ioflags@entry=0xbfffed6c, pp=pp@entry=0xbfffed1c, dt=dt@entry=0x83df358) at ../../src/H5Odtype.c:392
392     dt->shared->u.compnd.memb[i].size = temp_type->shared->size;

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb)

==2061==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb2b20758 at pc 0xb699e18c bp 0xbfa0e618 sp 0xbfa0e610
WRITE of size 4 at 0xb2b20758 thread T0
    #0 0xb699e18b in H5T__array_create $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Tarray.c:205
    #1 0xb629b2e4 in H5O_dtype_decode_helper $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Odtype.c:352
    #2 0xb628d881 in H5O_dtype_decode $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Odtype.c:1108
    #3 0xb6259fd8 in H5O_dtype_shared_decode $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Oshared.h:84
    #4 0xb6335a5c in H5O_msg_read_oh $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Omessage.c:554
    #5 0xb63338a6 in H5O_msg_read $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Omessage.c:483
    #6 0xb57d3b96 in H5D__open_oid $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Dint.c:1245
    #7 0xb57d0df7 in H5D_open $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Dint.c:1153
    #8 0xb56763f9 in H5Dopen2 $HOME/hdf5-1.8.16/memcheck/src/../../src/H5D.c:368
    #9 0x80e0ecd in dataset_stats $HOME/hdf5-1.8.16/memcheck/tools/h5stat/../../../tools/h5stat/h5stat.c:473
    #10 0x80d1d39 in obj_stats $HOME/hdf5-1.8.16/memcheck/tools/h5stat/../../../tools/h5stat/h5stat.c:685
    #11 0x81d307d in traverse_cb $HOME/hdf5-1.8.16/memcheck/tools/lib/../../../tools/lib/h5trav.c:237
    #12 0xb5c6a66a in H5G_visit_cb $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gint.c:939
    #13 0xb5cbea72 in H5G__node_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gnode.c:1026
    #14 0xb54b2c85 in H5B_iterate_helper $HOME/hdf5-1.8.16/memcheck/src/../../src/H5B.c:1175
    #15 0xb54b06db in H5B_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5B.c:1220
    #16 0xb5d17773 in H5G__stab_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gstab.c:565
    #17 0xb5ce2af2 in H5G__obj_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gobj.c:707
    #18 0xb5c67be2 in H5G_visit $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gint.c:1174
    #19 0xb6022f7d in H5Lvisit_by_name $HOME/hdf5-1.8.16/memcheck/src/../../src/H5L.c:1378
    #20 0x81bed2e in traverse $HOME/hdf5-1.8.16/memcheck/tools/lib/../../../tools/lib/h5trav.c:310
    #21 0x81c9df5 in h5trav_visit $HOME/hdf5-1.8.16/memcheck/tools/lib/../../../tools/lib/h5trav.c:1164
    #22 0x80cf9e3 in main $HOME/hdf5-1.8.16/memcheck/tools/h5stat/../../../tools/h5stat/h5stat.c:1623
    #23 0xb506ea82 (/lib/i386-linux-gnu/libc.so.6+0x19a82)
    #24 0x80cde74 in _start ($HOME/hdf5-1.8.16/memcheck/bin/h5stat+0x80cde74)

0xb2b20758 is located 0 bytes to the right of 168-byte region [0xb2b206b0,0xb2b20758)
allocated by thread T0 here:
    #0 0x80b6b8e in calloc ($HOME/hdf5-1.8.16/memcheck/bin/h5stat+0x80b6b8e)
    #1 0xb6093d5b in H5MM_calloc $HOME/hdf5-1.8.16/memcheck/src/../../src/H5MM.c:107
    #2 0xb6982712 in H5T__alloc $HOME/hdf5-1.8.16/memcheck/src/../../src/H5T.c:3462
    #3 0xb699d08c in H5T__array_create $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Tarray.c:192
    #4 0xb629b2e4 in H5O_dtype_decode_helper $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Odtype.c:352
    #5 0xb628d881 in H5O_dtype_decode $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Odtype.c:1108
    #6 0xb6259fd8 in H5O_dtype_shared_decode $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Oshared.h:84
    #7 0xb6335a5c in H5O_msg_read_oh $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Omessage.c:554
    #8 0xb63338a6 in H5O_msg_read $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Omessage.c:483
    #9 0xb57d3b96 in H5D__open_oid $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Dint.c:1245
    #10 0xb57d0df7 in H5D_open $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Dint.c:1153
    #11 0xb56763f9 in H5Dopen2 $HOME/hdf5-1.8.16/memcheck/src/../../src/H5D.c:368
    #12 0x80e0ecd in dataset_stats $HOME/hdf5-1.8.16/memcheck/tools/h5stat/../../../tools/h5stat/h5stat.c:473
    #13 0x80d1d39 in obj_stats $HOME/hdf5-1.8.16/memcheck/tools/h5stat/../../../tools/h5stat/h5stat.c:685
    #14 0x81d307d in traverse_cb $HOME/hdf5-1.8.16/memcheck/tools/lib/../../../tools/lib/h5trav.c:237
    #15 0xb5c6a66a in H5G_visit_cb $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gint.c:939
    #16 0xb5cbea72 in H5G__node_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gnode.c:1026
    #17 0xb54b2c85 in H5B_iterate_helper $HOME/hdf5-1.8.16/memcheck/src/../../src/H5B.c:1175
    #18 0xb54b06db in H5B_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5B.c:1220
    #19 0xb5d17773 in H5G__stab_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gstab.c:565
    #20 0xb5ce2af2 in H5G__obj_iterate $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gobj.c:707
    #21 0xb5c67be2 in H5G_visit $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Gint.c:1174
    #22 0xb6022f7d in H5Lvisit_by_name $HOME/hdf5-1.8.16/memcheck/src/../../src/H5L.c:1378
    #23 0x81bed2e in traverse $HOME/hdf5-1.8.16/memcheck/tools/lib/../../../tools/lib/h5trav.c:310
    #24 0x81c9df5 in h5trav_visit $HOME/hdf5-1.8.16/memcheck/tools/lib/../../../tools/lib/h5trav.c:1164
    #25 0x80cf9e3 in main $HOME/hdf5-1.8.16/memcheck/tools/h5stat/../../../tools/h5stat/h5stat.c:1623
    #26 0xb506ea82 (/lib/i386-linux-gnu/libc.so.6+0x19a82)

SUMMARY: AddressSanitizer: heap-buffer-overflow $HOME/hdf5-1.8.16/memcheck/src/../../src/H5Tarray.c:205 H5T__array_create
```

Timeline

  • 2016-05-08 - Discovery
  • 2016-05-17 - Vendor Notification
  • 2016-11-15 - Public Disclosure

References

  • [1] https://en.wikipedia.org/wiki/Hierarchical_Data_Format
  • [2] http://www.hdfgroup.org/HDF5/