Lucene search

K
cve416baaa9-dc9f-4396-8d5f-8c081fb06d67CVE-2024-26958
HistoryMay 01, 2024 - 6:15 a.m.

CVE-2024-26958

2024-05-0106:15:12
416baaa9-dc9f-4396-8d5f-8c081fb06d67
web.nvd.nist.gov
48
linux kernel
uaf vulnerability
nfs
direct writes
refcount_t
underflow
warning
refcount.c
nfs_direct_write_schedule_work
nfs_commit_end
nfs_direct_write_complete
asynchronous requests
completion handling
stress test

7.1 High

AI Score

Confidence

Low

0.0004 Low

EPSS

Percentile

10.0%

In the Linux kernel, the following vulnerability has been resolved:

nfs: fix UAF in direct writes

In production we have been hitting the following warning consistently

------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0
Workqueue: nfsiod nfs_direct_write_schedule_work [nfs]
RIP: 0010:refcount_warn_saturate+0x9c/0xe0
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0x9f/0x130
? refcount_warn_saturate+0x9c/0xe0
? report_bug+0xcc/0x150
? handle_bug+0x3d/0x70
? exc_invalid_op+0x16/0x40
? asm_exc_invalid_op+0x16/0x20
? refcount_warn_saturate+0x9c/0xe0
nfs_direct_write_schedule_work+0x237/0x250 [nfs]
process_one_work+0x12f/0x4a0
worker_thread+0x14e/0x3b0
? ZSTD_getCParams_internal+0x220/0x220
kthread+0xdc/0x120
? __btf_name_valid+0xa0/0xa0
ret_from_fork+0x1f/0x30

This is because we’re completing the nfs_direct_request twice in a row.

The source of this is when we have our commit requests to submit, we
process them and send them off, and then in the completion path for the
commit requests we have

if (nfs_commit_end(cinfo.mds))
nfs_direct_write_complete(dreq);

However since we’re submitting asynchronous requests we sometimes have
one that completes before we submit the next one, so we end up calling
complete on the nfs_direct_request twice.

The only other place we use nfs_generic_commit_list() is in
__nfs_commit_inode, which wraps this call in a

nfs_commit_begin();
nfs_commit_end();

Which is a common pattern for this style of completion handling, one
that is also repeated in the direct code with get_dreq()/put_dreq()
calls around where we process events as well as in the completion paths.

Fix this by using the same pattern for the commit requests.

Before with my 200 node rocksdb stress running this warning would pop
every 10ish minutes. With my patch the stress test has been running for
several hours without popping.

VendorProductVersionCPE
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
linuxlinux_kernel*cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*

7.1 High

AI Score

Confidence

Low

0.0004 Low

EPSS

Percentile

10.0%