We've found a very severe vulnerability in the IRIX telnetd service that upon successful exploitation can give remote root access to any IRIX 6.2-6.5.8[m,f] system.
The bug discussed here first appeared in IRIX 5.2-6.1 systems as a result of SGI's efforts to patch a security vulnerability reported by CERT back in 1995 (CERT Advisory CA-95:14). Because it was introduced to IRIX 5.2-6.1 systems along with the 1010/1020 security patches, their default "clean" installations are immune to this vulnerability. All later IRIX editions are vulnerable by default to the bug presented in this post.
The vulnerability we've found belongs to the recently much-discussed class of so-called "format bugs". Upon receiving an IAC-SB-TELOPT_ENVIRON request to set one of the _RLD family of environment variables, the IRIX telnetd service calls the syslog() function with a partially user-supplied format string. The syslog message generated upon detecting such an attempt has the following format: "ignored attempt to setenv(%.32s,%.128s)". The strings inside the setenv() parentheses are, respectively, the variable name and the variable value. If the variable name/value pairs are appropriately constructed, arbitrary memory values in the telnetd process image can be overwritten and execution flow can be redirected to user-supplied machine code.
After some careful investigation we managed to exploit the vulnerability. Proof-of-concept code was developed and is available at our website. We have also implemented a quick fix, so that people can protect their IRIX boxes from being exploited. In releasing the exploit code we take the opportunity to discuss the specifics of its development, as it was somewhat different from what is already known to people working with x86 operating systems.
When we noticed that IRIX telnetd calls syslog() with partially user-supplied strings, our first attempt was to overwrite its stack using the "[shellcode]%[space padding].c[return address]" attack scheme. Unfortunately, it turned out to be ineffective, as we could not seize control of the telnetd PC. This was mainly because the number of spaces in the format string could not be adjusted so that the PC would be loaded with our arbitrary return address value. Neither could we use the
"%[space padding].c[shellcode][address]"
"%[pad1]x%[pad2]x%[pad3]x%[pad4]x%[param number]$n"
attack scheme, because of MIPS big-endianness and the fact that the machine code implementing the %n feature uses the sw (store word) instruction. On MIPS and other RISC machines compilers usually generate code with speed in mind. So, if the C language equivalent of (int)var=val is encountered in the source code, it is usually compiled to a sw instruction in the output assembly code. And since it is a sw store, its target address must be 4-byte aligned on MIPS. If it is not, a bus error is signalled to the process and it dumps core.
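The alignment rule can be illustrated with a minimal Python sketch. This is not telnetd or libc code; the helper name and the sample addresses are hypothetical and chosen only to show which store widths are naturally aligned at which addresses.

```python
def is_aligned(address, width):
    """Return True if a store of `width` bytes at `address` is naturally aligned."""
    return address % width == 0

# Hypothetical addresses, used only to illustrate the rule:
# sw stores 4 bytes and needs a 4-byte aligned address,
# sh stores 2 bytes and needs a 2-byte aligned address.
for addr in (0x1000, 0x1002, 0x1003):
    print(hex(addr), "sw ok:", is_aligned(addr, 4), "sh ok:", is_aligned(addr, 2))
```

An address such as 0x1002 is acceptable for a halfword store but not for a word store, which is why byte-shifted word writes are not an option here.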
The big-endianness of the processor and the aligned memory load/store requirement were the primary difficulties we encountered when exploiting the telnetd format string bug. The other problem we noticed was that only a 100-byte buffer could be used for the telnet IAC-SB command.
Because we do not give up so easily, we made another attempt at the telnetd exploit. After some deep analysis of all the environment constraints, we decided to use the
"[shellcode][addrlo][addrhi]"
"%[pad1]x%[param number1]$hn%[pad2]x%[param number2]$hn"
string in our attack. We simply changed from the %n to the %hn scheme and performed two short integer writes instead of one ordinary int write. The values of pad1 and pad2, although kept in 32-bit registers, are stored by the %hn feature as 16-bit values using the sh (store halfword) machine instruction. If carefully adjusted, they can form the high and low halves of the 32-bit value stored at a given memory address (addrlo for the first %hn store, and addrhi for the second one). We had reached the point where we were able to store arbitrary values in telnetd process memory locations. The problem we faced next was how to effectively gain control of the program counter. Overwriting the return address stored in a local function frame is one of the obvious ways to achieve that, but since we were not able to remotely inspect the telnetd stack it seemed rather ineffective. This is why we decided to jump through the process GOT table. On IRIX, every call to a function from a shared library linked with a given program is made using the following instruction sequence:
lw t9,-got_offset(gp)
jalr ra,t9
nop
If the GOT entry for a shared library function called from within telnetd is overwritten with an arbitrary address, then the next time this function is called the PC is loaded with that address and, as a result, control over the process is gained. The most important thing here is that the GOT entries for a given function do not differ much from one binary to another. The other advantage is that they are 32-bit entities, regardless of whether ELF 32 or N32 binaries are in use. This is important because IRIX 6.4 and up use 64-bit pointers for $ra and $gp, which are usually difficult to overwrite with the most commonly occurring str* buffer overflows.
We inspected which functions telnetd was calling by viewing its GOT table. We also found that after processing the TELOPT_ENVIRON telnet protocol suboption telnetd was waiting in a read() function call. So, we decided to overwrite the GOT entry of the read() function. Its address was obtained by issuing the odump -Dg /usr/etc/telnetd | grep "\[read\]" command:
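The idea of replacing one 32-bit word store with two 16-bit halfword stores can be modelled with a short Python sketch. It only demonstrates the arithmetic on a big-endian machine; the function names and the sample value 0x12345678 are hypothetical and do not come from the exploit.

```python
import struct

def split_halfwords(value):
    """Split a 32-bit value into its (high, low) 16-bit halves."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF

def store_as_two_halfwords(value):
    """Model two big-endian 16-bit stores, at addr and at addr + 2."""
    high, low = split_halfwords(value)
    return struct.pack(">H", high) + struct.pack(">H", low)

value = 0x12345678  # arbitrary sample value, for illustration only
high, low = split_halfwords(value)
print(hex(high), hex(low))

# Two halfword stores leave the same four bytes in memory as one word store.
print(store_as_two_halfwords(value) == struct.pack(">I", value))
```

On a big-endian machine the high half lands at the lower address, and both target addresses only need to be 2-byte aligned.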
[ 77]: 0x0fa38654 -32444(gp), 7fc4981c [read]
and was 0x7fc4981c.
So we had solved the "where to store" problem and could control the value of the PC, but the "where to jump" location was still unknown to us, since it was also somewhere on the stack, whose parameters were unpredictable. This is why we decided to change our format string and used the following one instead:
"%[space padding].c[shellcode]"
"[addrlo][addrhi]%[pad1]x%[param number1]$hn%[pad2]x%[param number2]$hn"
Because the padding spaces come before the shellcode instructions and the space character is 0x20, they could act as 0x20202020 NOPs (addi $zero,$at,8224). By using a large decimal value for the space padding we could make our NOP buffer large and, at the same time, make the jump address much more predictable. This is what we did, but we very soon got disappointed.
Everything seemed to be working fine. The telnetd GOT entry for the read() function was overwritten in two shots with our start address pointing to the middle of the NOP buffer. The jump was made, but we were always getting an ILLEGAL INSTRUCTION signal and telnetd dumped core after executing several 0x20202020 NOPs. We knew very well what was going on. After a couple of years of IRIX buffer overflow exploitation, this was nothing other than a classic example of MIPS cache incoherence. We usually avoided such cache problems by supplying large NOP buffers to the program input so that the cache had time to "flush". But that didn't work in the telnetd case and we were stuck again.
Enlightenment came after careful inspection of telnetd memory. We found that one of its global symbols was used for storing telnet protocol options. It was called subbuffer and its location was predictable since it was stored in the telnetd GOT table. We used the odump -Dg /usr/etc/telnetd | grep "\[subbuffer\]" command:
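That four ASCII spaces decode to addi $zero,$at,8224 can be checked by splitting the word into the standard MIPS I-type fields. The sketch below is a small illustrative decoder, not a disassembler; the function name is hypothetical.

```python
def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction word into (opcode, rs, rt, imm)."""
    opcode = (word >> 26) & 0x3F   # bits 31..26
    rs = (word >> 21) & 0x1F       # bits 25..21, source register
    rt = (word >> 16) & 0x1F       # bits 20..16, destination register
    imm = word & 0xFFFF            # bits 15..0, immediate
    return opcode, rs, rt, imm

word = int.from_bytes(b"    ", "big")   # four spaces read as one big-endian word
opcode, rs, rt, imm = decode_itype(word)
print(hex(word), opcode, rs, rt, imm)
```

Opcode 8 is addi, register 1 is $at and register 0 is $zero. Since writes to $zero are discarded, the instruction has no visible effect in normal operation.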
[ 186]: 0x7fc4cf98 -32008(gp), 7fc499d0 [subbuffer]
and obtained the aforementioned buffer address: 0x7fc4cf98.
The format string was changed again and we got rid of the padding spaces in it since they were no longer needed. Jumping to a location within subbuffer turned out to be effective, but not on all platforms. We still had cache problems on R4600 systems. To solve that we decided to use a trick that we first applied back in 1998 in our named exploit: we overwrite the same memory location twice, with a time interval between the two writes. This usually gives the processor cache enough time to "become coherent", because during that time the process sleeps in a read() syscall and its context is switched. This is why our telnetd exploit sets the environment variables twice. The first setting only places the shellcode and data in subbuffer. It is the second operation that triggers the memory overwrites and makes the exploit run.
So, we had a working version of the exploit on the 6.5 platform. We tested it and it worked fine on all 6.5.x systems we had in our operating environment. It was time to move to other IRIX versions. And this is where new problems with the exploit popped up.
First, we noticed that on IRIX 6.2 and 6.3 different format strings had to be used. That became quite apparent to us when we inspected the telnetd binary and found out that it was an ELF o32 binary, not the new N32 ELF used on IRIX 6.4 and above. So, we had to deal with appropriate format strings for the different MIPS ABIs.
The second, much more painful difference we noticed was that on IRIX 6.2-6.4, even if the right telnetd GOT entry for the read() call was overwritten, our code was not executed. Instead, we always ended up in a sigabort() function call. Overwriting the GOT entry of the abort() function seemed to be the only way to gain control of the telnetd program counter. What was not promising for us was that sigabort was called from within the syslog function, whose definition is located in libc.so.1. We went through several IRIX boxes and checked the differences between their standard C libraries. It all looked like a mess to us. We knew that some patches changed libc.so.1. We eventually found out that what odump/elfdump showed about the libc.so.1 GOT entries was usually not how things really looked. We were forced to use the following scheme to obtain the address of the libc.so.1 GOT entry for a given function:
got_entry_address = got_base_address + 4 * function_index_in_got
where got_base_address and function_index_in_got were obtained with the following commands:
odump -h -n .got /usr/lib/libc.so.1 | grep got
odump -Dg /usr/lib/libc.so.1 | grep "\[abort\]"
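A minimal sketch of this base-plus-index arithmetic follows, assuming that GOT entries are 4 bytes wide and indexed from the GOT base (consistent with the 32-bit entries mentioned earlier). The function name is hypothetical. The base used in the example is not quoted anywhere in the text; it is derived here from the telnetd [read] line shown earlier and then cross-checked against the [subbuffer] line.

```python
GOT_ENTRY_SIZE = 4  # assumption: 32-bit GOT entries

def got_entry_address(got_base_address, function_index_in_got):
    """Address of the GOT slot with the given index."""
    return got_base_address + GOT_ENTRY_SIZE * function_index_in_got

# Values from the two telnetd odump lines quoted earlier:
#   [ 77] ... 7fc4981c [read]
#   [186] ... 7fc499d0 [subbuffer]
read_index, read_entry = 77, 0x7FC4981C
subbuffer_index, subbuffer_entry = 186, 0x7FC499D0

# Derived, not quoted in the text: the base implied by the [read] line.
derived_base = read_entry - GOT_ENTRY_SIZE * read_index
print(hex(derived_base))

# The same base reproduces the [subbuffer] entry address.
print(hex(got_entry_address(derived_base, subbuffer_index)))
```

The two quoted lines are 109 indices and 436 bytes apart, which agrees with a 4-byte entry size.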
Finally, we went through the SGI patchbase in order to find out what patches could change the libc.so.1 file in the IRIX system. This is what we found:
IRIX 6.2
    patchSG0003490.eoe_sw.irix_lib (libc rollup + Y2K fixes + MIPS ABI)
    patchSG0003723.eoe_sw.irix_lib (libc rollup + Y2K fixes + MIPS ABI)
    patchSG0003771.eoe_sw.irix_lib (libc rollup + Y2K fixes + MIPS ABI)
    patchSG0001918.eoe_sw.irix_lib (libc rollup)
    patchSG0002086.eoe_sw.irix_lib (libc rollup)
IRIX 6.3
    patchSG0003535.eoe_sw.irix_lib (libc bug fixes and enhancements + y2k)
    patchSG0003737.eoe_sw.irix_lib (libc bug fixes and enhancements + y2k)
    patchSG0003770.eoe_sw.irix_lib (libc bug fixes and enhancements + y2k)
IRIX 6.4
    patchSG0003491.eoe_sw.irix_lib (6.4-S2MP+O + y2k + 64-bit strcoll segv fix)
    patchSG0003738.eoe_sw.irix_lib (6.4-S2MP+O + y2k + 64-bit strcoll segv fix)
    patchSG0003769.eoe_sw.irix_lib (6.4-S2MP+O + y2k + 64-bit strcoll segv fix)
IRIX 6.5
    no patches available
After applying each patch separately on the appropriate IRIX versions and checking the addresses of the abort() function GOT entries, it turned out that from our "GOT overwriting" point of view some libraries were equivalent.
A similar ELF binary inspection was applied to the telnetd program (remember the subbuffer jump address), different versions of which could be installed on the system according to the following patch matrix:
IRIX 6.2
    patchSG0001485.eoe_sw.unix
    patchSG0002070.eoe_sw.unix
    patchSG0003117.eoe_sw.unix
    patchSG0003414.eoe_sw.unix
IRIX 6.3, 6.4, 6.5
    no patches available
This is how we managed to reduce the number of possible abort() function GOT entry/telnetd subbuffer address combinations from 39 to 13. The final table of all possible combinations for all IRIX 6.x systems looked like this:
irix 6.2   libc.so.1: no patches        telnetd: no patches
irix 6.2   libc.so.1: 1918|2086         telnetd: no patches
irix 6.2   libc.so.1: 3490|3723|3771    telnetd: no patches
irix 6.2   libc.so.1: no patches        telnetd: 1485|2070|3117|3414
irix 6.2   libc.so.1: 1918|2086         telnetd: 1485|2070|3117|3414
irix 6.2   libc.so.1: 3490|3723|3771    telnetd: 1485|2070|3117|3414
irix 6.3   libc.so.1: no patches        telnetd: no patches
irix 6.3   libc.so.1: 2087              telnetd: no patches
irix 6.3   libc.so.1: 3535|3737|3770    telnetd: no patches
irix 6.4   libc.so.1: no patches        telnetd: no patches
irix 6.4   libc.so.1: 3491|3769|3738    telnetd: no patches
irix 6.5-6.5.8m 6.5-6.5.7f              telnetd: no patches
irix 6.5.8f                             telnetd: no patches
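As a rough cross-check of the 39 and 13 figures, the counting can be sketched in Python. The counting rule is an assumption, not something stated in the text: each listed patch, or no patch at all, is treated as one distinct state of libc.so.1 or telnetd, and the states are multiplied per IRIX version. The patch counts come from the lists above and the reduced rows from the final table; the variable and function names are hypothetical.

```python
# Number of patches listed per IRIX version in the two patch lists.
libc_patches = {"6.2": 5, "6.3": 3, "6.4": 3, "6.5": 0}
telnetd_patches = {"6.2": 4, "6.3": 0, "6.4": 0, "6.5": 0}

def raw_combinations(libc, telnetd):
    """Count (libc state, telnetd state) pairs, with 'no patch' as one state."""
    total = 0
    for version in libc:
        total += (1 + libc[version]) * (1 + telnetd[version])
    return total

# Rows of the final table: (IRIX version, libc.so.1 group, telnetd group).
reduced_rows = [
    ("6.2", "none", "none"),
    ("6.2", "1918|2086", "none"),
    ("6.2", "3490|3723|3771", "none"),
    ("6.2", "none", "1485|2070|3117|3414"),
    ("6.2", "1918|2086", "1485|2070|3117|3414"),
    ("6.2", "3490|3723|3771", "1485|2070|3117|3414"),
    ("6.3", "none", "none"),
    ("6.3", "2087", "none"),
    ("6.3", "3535|3737|3770", "none"),
    ("6.4", "none", "none"),
    ("6.4", "3491|3769|3738", "none"),
    ("6.5-6.5.8m 6.5-6.5.7f", "", "none"),
    ("6.5.8f", "", "none"),
]

print(raw_combinations(libc_patches, telnetd_patches))
print(len(reduced_rows))
```

Under this reading IRIX 6.2 alone contributes 30 of the raw combinations, which is why grouping equivalent libraries and telnetd binaries pays off most there.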
This is how we reached the end point.
The shellcode we use in the exploit code is a slightly modified version of our 40-byte IRIX MIPS shellcode. It couldn't be longer because of the space limit imposed on the telnet protocol options buffer. The following command shows that we have only 100 bytes available for suboption data: odump -Dt /usr/etc/telnetd | grep subbuffer
We use 97 of these 100 bytes. As you can see, this is quite enough to take control of the IRIX box.
The primary aim of this story was to show how much effort usually has to be made to develop exploit code. The guys who just use exploits do not even think about it. Each exploit has its own story. Only exploit coders, who know what it takes to write them, are familiar with the pain of development.
Below we provide a link to the exploit and fix code: http://lsd-pl.net/files/get?IRIX/irx_telnetd
lsd folks http://lsd-pl.net