In the Linux kernel, the following vulnerability has been resolved:
btrfs: protect folio::private when attaching extent buffer folios
[BUG]
Since v6.8 there are rare kernel crashes reported by various people,
the common factor is bad page status error messages like this:
BUG: Bad page state in process kswapd0 pfn:d6e840
page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c
pfn:0xd6e840
aops:btree_aops ino:1
flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff)
page_type: 0xffffffff()
raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0
raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: non-NULL mapping
[CAUSE]
Commit 09e6cef19c9f (“btrfs: refactor alloc_extent_buffer() to
allocate-then-attach method”) changes the sequence when allocating a new
extent buffer.
Previously we always called grab_extent_buffer() under
mapping->i_private_lock, to ensure the safety on modification on
folio::private (which is a pointer to extent buffer for regular
sectorsize).
This can lead to the following race:
Thread A is trying to allocate an extent buffer at bytenr X, with 4
4K pages, meanwhile thread B is trying to release the page at X + 4K
(the second page of the extent buffer at X).
Thread A | Thread B
-----------------------------------±------------------------------------
| btree_release_folio()
| | This is for the page at X + 4K,
| | Not page X.
| |
alloc_extent_buffer() | |- release_extent_buffer()
|- filemap_add_folio() for the | | |- atomic_dec_and_test(eb->refs)
| page at bytenr X (the first | | |
| page). | | |
| Which returned -EEXIST. | | |
| | | |
|- filemap_lock_folio() | | |
| Returned the first page locked. | | |
| | | |
|- grab_extent_buffer() | | |
| |- atomic_inc_not_zero() | | |
| | Returned false | | |
| |- folio_detach_private() | | |- folio_detach_private() for X
| |- folio_test_private() | | |- folio_test_private()
| Returned true | | | Returned true
|- folio_put() | |- folio_put()
Now there are two puts on the same folio at folio X, leading to refcount
underflow of the folio X, and eventually causing the BUG_ON() on the
page->mapping.
The condition is not that easy to hit:
seq 1 80
; doOS | Version | Architecture | Package | Version | Filename |
---|---|---|---|---|---|
ubuntu | 20.04 | noarch | linux | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux | < any | UNKNOWN |
ubuntu | 24.04 | noarch | linux | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-aws | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-aws | < any | UNKNOWN |
ubuntu | 24.04 | noarch | linux-aws | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-aws-5.15 | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-aws-6.5 | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-azure | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-azure | < any | UNKNOWN |
git.kernel.org/linus/f3a5367c679d31473d3fbb391675055b4792c309 (6.10-rc3)
git.kernel.org/stable/c/952f048eb901881a7cc6f7c1368b53cd386ead7b
git.kernel.org/stable/c/f3a5367c679d31473d3fbb391675055b4792c309
launchpad.net/bugs/cve/CVE-2024-38306
nvd.nist.gov/vuln/detail/CVE-2024-38306
security-tracker.debian.org/tracker/CVE-2024-38306
www.cve.org/CVERecord?id=CVE-2024-38306