| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
mm: revert "mm: shmem: fix data-race in shmem_getattr()"
Revert d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") as
suggested by Chuck [1]. It is causing deadlocks when accessing tmpfs over
NFS.
As Hugh commented, "added just to silence a syzbot sanitizer splat: added
where there has never been any practical problem". |
| In the Linux kernel, the following vulnerability has been resolved:
net: fix data-races around sk->sk_forward_alloc
Syzkaller reported this warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x1c5/0x1e0
Modules linked in:
CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc5 #26
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:inet_sock_destruct+0x1c5/0x1e0
Code: 24 12 4c 89 e2 5b 48 c7 c7 98 ec bb 82 41 5c e9 d1 18 17 ff 4c 89 e6 5b 48 c7 c7 d0 ec bb 82 41 5c e9 bf 18 17 ff 0f 0b eb 83 <0f> 0b eb 97 0f 0b eb 87 0f 0b e9 68 ff ff ff 66 66 2e 0f 1f 84 00
RSP: 0018:ffffc9000008bd90 EFLAGS: 00010206
RAX: 0000000000000300 RBX: ffff88810b172a90 RCX: 0000000000000007
RDX: 0000000000000002 RSI: 0000000000000300 RDI: ffff88810b172a00
RBP: ffff88810b172a00 R08: ffff888104273c00 R09: 0000000000100007
R10: 0000000000020000 R11: 0000000000000006 R12: ffff88810b172a00
R13: 0000000000000004 R14: 0000000000000000 R15: ffff888237c31f78
FS: 0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc63fecac8 CR3: 000000000342e000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __warn+0x88/0x130
? inet_sock_destruct+0x1c5/0x1e0
? report_bug+0x18e/0x1a0
? handle_bug+0x53/0x90
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? inet_sock_destruct+0x1c5/0x1e0
__sk_destruct+0x2a/0x200
rcu_do_batch+0x1aa/0x530
? rcu_do_batch+0x13b/0x530
rcu_core+0x159/0x2f0
handle_softirqs+0xd3/0x2b0
? __pfx_smpboot_thread_fn+0x10/0x10
run_ksoftirqd+0x25/0x30
smpboot_thread_fn+0xdd/0x1d0
kthread+0xd3/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
---[ end trace 0000000000000000 ]---
Its possible that two threads call tcp_v6_do_rcv()/sk_forward_alloc_add()
concurrently when sk->sk_state == TCP_LISTEN with sk->sk_lock unlocked,
which triggers a data-race around sk->sk_forward_alloc:
tcp_v6_rcv
tcp_v6_do_rcv
skb_clone_and_charge_r
sk_rmem_schedule
__sk_mem_schedule
sk_forward_alloc_add()
skb_set_owner_r
sk_mem_charge
sk_forward_alloc_add()
__kfree_skb
skb_release_all
skb_release_head_state
sock_rfree
sk_mem_uncharge
sk_forward_alloc_add()
sk_mem_reclaim
// set local var reclaimable
__sk_mem_reclaim
sk_forward_alloc_add()
In this syzkaller testcase, two threads call
tcp_v6_do_rcv() with skb->truesize=768, the sk_forward_alloc changes like
this:
(cpu 1) | (cpu 2) | sk_forward_alloc
... | ... | 0
__sk_mem_schedule() | | +4096 = 4096
| __sk_mem_schedule() | +4096 = 8192
sk_mem_charge() | | -768 = 7424
| sk_mem_charge() | -768 = 6656
... | ... |
sk_mem_uncharge() | | +768 = 7424
reclaimable=7424 | |
| sk_mem_uncharge() | +768 = 8192
| reclaimable=8192 |
__sk_mem_reclaim() | | -4096 = 4096
| __sk_mem_reclaim() | -8192 = -4096 != 0
The skb_clone_and_charge_r() should not be called in tcp_v6_do_rcv() when
sk->sk_state is TCP_LISTEN, it happens later in tcp_v6_syn_recv_sock().
Fix the same issue in dccp_v6_do_rcv(). |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: cope racing subflow creation in mptcp_rcv_space_adjust
Additional active subflows - i.e. created by the in kernel path
manager - are included into the subflow list before starting the
3whs.
A racing recvmsg() spooling data received on an already established
subflow would unconditionally call tcp_cleanup_rbuf() on all the
current subflows, potentially hitting a divide by zero error on
the newly created ones.
Explicitly check that the subflow is in a suitable state before
invoking tcp_cleanup_rbuf(). |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: fs, lock FTE when checking if active
The referenced commits introduced a two-step process for deleting FTEs:
- Lock the FTE, delete it from hardware, set the hardware deletion function
to NULL and unlock the FTE.
- Lock the parent flow group, delete the software copy of the FTE, and
remove it from the xarray.
However, this approach encounters a race condition if a rule with the same
match value is added simultaneously. In this scenario, fs_core may set the
hardware deletion function to NULL prematurely, causing a panic during
subsequent rule deletions.
To prevent this, ensure the active flag of the FTE is checked under a lock,
which will prevent the fs_core layer from attaching a new steering rule to
an FTE that is in the process of deletion.
[ 438.967589] MOSHE: 2496 mlx5_del_flow_rules del_hw_func
[ 438.968205] ------------[ cut here ]------------
[ 438.968654] refcount_t: decrement hit 0; leaking memory.
[ 438.969249] WARNING: CPU: 0 PID: 8957 at lib/refcount.c:31 refcount_warn_saturate+0xfb/0x110
[ 438.970054] Modules linked in: act_mirred cls_flower act_gact sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core zram zsmalloc fuse [last unloaded: cls_flower]
[ 438.973288] CPU: 0 UID: 0 PID: 8957 Comm: tc Not tainted 6.12.0-rc1+ #8
[ 438.973888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 438.974874] RIP: 0010:refcount_warn_saturate+0xfb/0x110
[ 438.975363] Code: 40 66 3b 82 c6 05 16 e9 4d 01 01 e8 1f 7c a0 ff 0f 0b c3 cc cc cc cc 48 c7 c7 10 66 3b 82 c6 05 fd e8 4d 01 01 e8 05 7c a0 ff <0f> 0b c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90
[ 438.976947] RSP: 0018:ffff888124a53610 EFLAGS: 00010286
[ 438.977446] RAX: 0000000000000000 RBX: ffff888119d56de0 RCX: 0000000000000000
[ 438.978090] RDX: ffff88852c828700 RSI: ffff88852c81b3c0 RDI: ffff88852c81b3c0
[ 438.978721] RBP: ffff888120fa0e88 R08: 0000000000000000 R09: ffff888124a534b0
[ 438.979353] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888119d56de0
[ 438.979979] R13: ffff888120fa0ec0 R14: ffff888120fa0ee8 R15: ffff888119d56de0
[ 438.980607] FS: 00007fe6dcc0f800(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
[ 438.983984] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 438.984544] CR2: 00000000004275e0 CR3: 0000000186982001 CR4: 0000000000372eb0
[ 438.985205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 438.985842] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 438.986507] Call Trace:
[ 438.986799] <TASK>
[ 438.987070] ? __warn+0x7d/0x110
[ 438.987426] ? refcount_warn_saturate+0xfb/0x110
[ 438.987877] ? report_bug+0x17d/0x190
[ 438.988261] ? prb_read_valid+0x17/0x20
[ 438.988659] ? handle_bug+0x53/0x90
[ 438.989054] ? exc_invalid_op+0x14/0x70
[ 438.989458] ? asm_exc_invalid_op+0x16/0x20
[ 438.989883] ? refcount_warn_saturate+0xfb/0x110
[ 438.990348] mlx5_del_flow_rules+0x2f7/0x340 [mlx5_core]
[ 438.990932] __mlx5_eswitch_del_rule+0x49/0x170 [mlx5_core]
[ 438.991519] ? mlx5_lag_is_sriov+0x3c/0x50 [mlx5_core]
[ 438.992054] ? xas_load+0x9/0xb0
[ 438.992407] mlx5e_tc_rule_unoffload+0x45/0xe0 [mlx5_core]
[ 438.993037] mlx5e_tc_del_fdb_flow+0x2a6/0x2e0 [mlx5_core]
[ 438.993623] mlx5e_flow_put+0x29/0x60 [mlx5_core]
[ 438.994161] mlx5e_delete_flower+0x261/0x390 [mlx5_core]
[ 438.994728] tc_setup_cb_destroy+0xb9/0x190
[ 438.995150] fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
[ 438.995650] fl_change+0x11a4/0x13c0 [cls_flower]
[ 438.996105] tc_new_tfilter+0x347/0xbc0
[ 438.996503] ? __
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: core: Start the RTC update work later
The RTC update work involves runtime resuming the UFS controller. Hence,
only start the RTC update work after runtime power management in the UFS
driver has been fully initialized. This patch fixes the following kernel
crash:
Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
Workqueue: events ufshcd_rtc_work
Call trace:
_raw_spin_lock_irqsave+0x34/0x8c (P)
pm_runtime_get_if_active+0x24/0x9c (L)
pm_runtime_get_if_active+0x24/0x9c
ufshcd_rtc_work+0x138/0x1b4
process_one_work+0x148/0x288
worker_thread+0x2cc/0x3d4
kthread+0x110/0x114
ret_from_fork+0x10/0x20 |
| In the Linux kernel, the following vulnerability has been resolved:
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
We're seeing crashes from rq_qos_wake_function that look like this:
BUG: unable to handle page fault for address: ffffafe180a40084
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
Oops: Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
try_to_wake_up+0x5a/0x6a0
rq_qos_wake_function+0x71/0x80
__wake_up_common+0x75/0xa0
__wake_up+0x36/0x60
scale_up.part.0+0x50/0x110
wb_timer_fn+0x227/0x450
...
So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).
p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.
What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:
rq_qos_wait() rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
data->got_token = true;
list_del_init(&curr->entry);
if (data.got_token)
break;
finish_wait(&rqw->wait, &data.wq);
^- returns immediately because
list_empty_careful(&wq_entry->entry)
is true
... return, go do something else ...
wake_up_process(data->task)
(NO LONGER VALID!)-^
Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.
The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.
Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait(). |
| In the Linux kernel, the following vulnerability has been resolved:
mm/mremap: fix move_normal_pmd/retract_page_tables race
In mremap(), move_page_tables() looks at the type of the PMD entry and the
specified address range to figure out by which method the next chunk of
page table entries should be moved.
At that point, the mmap_lock is held in write mode, but no rmap locks are
held yet. For PMD entries that point to page tables and are fully covered
by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called,
which first takes rmap locks, then does move_normal_pmd().
move_normal_pmd() takes the necessary page table locks at source and
destination, then moves an entire page table from the source to the
destination.
The problem is: The rmap locks, which protect against concurrent page
table removal by retract_page_tables() in the THP code, are only taken
after the PMD entry has been read and it has been decided how to move it.
So we can race as follows (with two processes that have mappings of the
same tmpfs file that is stored on a tmpfs mount with huge=advise); note
that process A accesses page tables through the MM while process B does it
through the file rmap:
process A process B
========= =========
mremap
mremap_to
move_vma
move_page_tables
get_old_pmd
alloc_new_pmd
*** PREEMPT ***
madvise(MADV_COLLAPSE)
do_madvise
madvise_walk_vmas
madvise_vma_behavior
madvise_collapse
hpage_collapse_scan_file
collapse_file
retract_page_tables
i_mmap_lock_read(mapping)
pmdp_collapse_flush
i_mmap_unlock_read(mapping)
move_pgt_entry(NORMAL_PMD, ...)
take_rmap_locks
move_normal_pmd
drop_rmap_locks
When this happens, move_normal_pmd() can end up creating bogus PMD entries
in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`. The effect
depends on arch-specific and machine-specific details; on x86, you can end
up with physical page 0 mapped as a page table, which is likely
exploitable for user->kernel privilege escalation.
Fix the race by letting process B recheck that the PMD still points to a
page table after the rmap locks have been taken. Otherwise, we bail and
let the caller fall back to the PTE-level copying path, which will then
bail immediately at the pmd_none() check.
Bug reachability: Reaching this bug requires that you can create
shmem/file THP mappings - anonymous THP uses different code that doesn't
zap stuff under rmap locks. File THP is gated on an experimental config
flag (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need
shmem THP to hit this bug. As far as I know, getting shmem THP normally
requires that you can mount your own tmpfs with the right mount flags,
which would require creating your own user+mount namespace; though I don't
know if some distros maybe enable shmem THP by default or something like
that.
Bug impact: This issue can likely be used for user->kernel privilege
escalation when it is reachable. |
| In the Linux kernel, the following vulnerability has been resolved:
exec: don't WARN for racy path_noexec check
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed said checks are WARN_ON'ed instead, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check even though it's mindless
nonsense checking for guarantee that isn't given so drop the WARN.
Reword the commentary and do small tidy ups while here.
[brauner: keep redundant path_noexec() check] |
| In the Linux kernel, the following vulnerability has been resolved:
media: venus: fix use after free bug in venus_remove due to race condition
in venus_probe, core->work is bound with venus_sys_error_handler, which is
used to handle error. The code use core->sys_err_done to make sync work.
The core->work is started in venus_event_notify.
If we call venus_remove, there might be an unfished work. The possible
sequence is as follows:
CPU0 CPU1
|venus_sys_error_handler
venus_remove |
hfi_destroy |
venus_hfi_destroy |
kfree(hdev); |
|hfi_reinit
|venus_hfi_queues_reinit
|//use hdev
Fix it by canceling the work in venus_remove. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/gup: fix memfd_pin_folios alloc race panic
If memfd_pin_folios tries to create a hugetlb page, but someone else
already did, then folio gets the value -EEXIST here:
folio = memfd_alloc_folio(memfd, start_idx);
if (IS_ERR(folio)) {
ret = PTR_ERR(folio);
if (ret != -EEXIST)
goto err;
then on the next trip through the "while start_idx" loop we panic here:
if (folio) {
folio_put(folio);
To fix, set the folio to NULL on error. |
| In the Linux kernel, the following vulnerability has been resolved:
tracing/timerlat: Fix a race during cpuhp processing
There is another found exception that the "timerlat/1" thread was
scheduled on CPU0, and lead to timer corruption finally:
```
ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220
WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0
Modules linked in:
CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:debug_print_object+0x7d/0xb0
...
Call Trace:
<TASK>
? __warn+0x7c/0x110
? debug_print_object+0x7d/0xb0
? report_bug+0xf1/0x1d0
? prb_read_valid+0x17/0x20
? handle_bug+0x3f/0x70
? exc_invalid_op+0x13/0x60
? asm_exc_invalid_op+0x16/0x20
? debug_print_object+0x7d/0xb0
? debug_print_object+0x7d/0xb0
? __pfx_timerlat_irq+0x10/0x10
__debug_object_init+0x110/0x150
hrtimer_init+0x1d/0x60
timerlat_main+0xab/0x2d0
? __pfx_timerlat_main+0x10/0x10
kthread+0xb7/0xe0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2d/0x40
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
```
After tracing the scheduling event, it was discovered that the migration
of the "timerlat/1" thread was performed during thread creation. Further
analysis confirmed that it is because the CPU online processing for
osnoise is implemented through workers, which is asynchronous with the
offline processing. When the worker was scheduled to create a thread, the
CPU may has already been removed from the cpu_online_mask during the offline
process, resulting in the inability to select the right CPU:
T1 | T2
[CPUHP_ONLINE] | cpu_device_down()
osnoise_hotplug_workfn() |
| cpus_write_lock()
| takedown_cpu(1)
| cpus_write_unlock()
[CPUHP_OFFLINE] |
cpus_read_lock() |
start_kthread(1) |
cpus_read_unlock() |
To fix this, skip online processing if the CPU is already offline. |
| In the Linux kernel, the following vulnerability has been resolved:
rxrpc: Fix a race between socket set up and I/O thread creation
In rxrpc_open_socket(), it sets up the socket and then sets up the I/O
thread that will handle it. This is a problem, however, as there's a gap
between the two phases in which a packet may come into rxrpc_encap_rcv()
from the UDP packet but we oops when trying to wake the not-yet created I/O
thread.
As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's
no I/O thread yet.
A better, but more intrusive fix would perhaps be to rearrange things such
that the socket creation is done by the I/O thread. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix race setting file private on concurrent lseek using same fd
When doing concurrent lseek(2) system calls against the same file
descriptor, using multiple threads belonging to the same process, we have
a short time window where a race happens and can result in a memory leak.
The race happens like this:
1) A program opens a file descriptor for a file and then spawns two
threads (with the pthreads library for example), lets call them
task A and task B;
2) Task A calls lseek with SEEK_DATA or SEEK_HOLE and ends up at
file.c:find_desired_extent() while holding a read lock on the inode;
3) At the start of find_desired_extent(), it extracts the file's
private_data pointer into a local variable named 'private', which has
a value of NULL;
4) Task B also calls lseek with SEEK_DATA or SEEK_HOLE, locks the inode
in shared mode and enters file.c:find_desired_extent(), where it also
extracts file->private_data into its local variable 'private', which
has a NULL value;
5) Because it saw a NULL file private, task A allocates a private
structure and assigns to the file structure;
6) Task B also saw a NULL file private so it also allocates its own file
private and then assigns it to the same file structure, since both
tasks are using the same file descriptor.
At this point we leak the private structure allocated by task A.
Besides the memory leak, there's also the detail that both tasks end up
using the same cached state record in the private structure (struct
btrfs_file_private::llseek_cached_state), which can result in a
use-after-free problem since one task can free it while the other is
still using it (only one task took a reference count on it). Also, sharing
the cached state is not a good idea since it could result in incorrect
results in the future - right now it should not be a problem because it
end ups being used only in extent-io-tree.c:count_range_bits() where we do
range validation before using the cached state.
Fix this by protecting the private assignment and check of a file while
holding the inode's spinlock and keep track of the task that allocated
the private, so that it's used only by that task in order to prevent
user-after-free issues with the cached state record as well as potentially
using it incorrectly in the future. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: ath11k: use work queue to process beacon tx event
Commit 3a415daa3e8b ("wifi: ath11k: add P2P IE in beacon template")
from Feb 28, 2024 (linux-next), leads to the following Smatch static
checker warning:
drivers/net/wireless/ath/ath11k/wmi.c:1742 ath11k_wmi_p2p_go_bcn_ie()
warn: sleeping in atomic context
The reason is that ath11k_bcn_tx_status_event() will directly call might
sleep function ath11k_wmi_cmd_send() during RCU read-side critical
sections. The call trace is like:
ath11k_bcn_tx_status_event()
-> rcu_read_lock()
-> ath11k_mac_bcn_tx_event()
-> ath11k_mac_setup_bcn_tmpl()
……
-> ath11k_wmi_bcn_tmpl()
-> ath11k_wmi_cmd_send()
-> rcu_read_unlock()
Commit 886433a98425 ("ath11k: add support for BSS color change") added the
ath11k_mac_bcn_tx_event(), commit 01e782c89108 ("ath11k: fix warning
of RCU usage for ath11k_mac_get_arvif_by_vdev_id()") added the RCU lock
to avoid warning but also introduced this BUG.
Use work queue to avoid directly calling ath11k_mac_bcn_tx_event()
during RCU critical sections. No need to worry about the deletion of vif
because cancel_work_sync() will drop the work if it doesn't start or
block vif deletion until the running work is done.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30 |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: check stripe size compatibility on remount as well
We disable stripe size in __ext4_fill_super if it is not a multiple of
the cluster ratio however this check is missed when trying to remount.
This can leave us with cases where stripe < cluster_ratio after
remount:set making EXT4_B2C(sbi->s_stripe) become 0 that can cause some
unforeseen bugs like divide by 0.
Fix that by adding the check in remount path as well. |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: get rid of online repaire on corrupted directory
syzbot reports a f2fs bug as below:
kernel BUG at fs/f2fs/inode.c:896!
RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
Call Trace:
evict+0x532/0x950 fs/inode.c:704
dispose_list fs/inode.c:747 [inline]
evict_inodes+0x5f9/0x690 fs/inode.c:797
generic_shutdown_super+0x9d/0x2d0 fs/super.c:627
kill_block_super+0x44/0x90 fs/super.c:1696
kill_f2fs_super+0x344/0x690 fs/f2fs/super.c:4898
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373
task_work_run+0x24f/0x310 kernel/task_work.c:228
ptrace_notify+0x2d2/0x380 kernel/signal.c:2402
ptrace_report_syscall include/linux/ptrace.h:415 [inline]
ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
syscall_exit_work+0xc6/0x190 kernel/entry/common.c:173
syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
syscall_exit_to_user_mode+0x279/0x370 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
Online repaire on corrupted directory in f2fs_lookup() can generate
dirty data/meta while racing w/ readonly remount, it may leave dirty
inode after filesystem becomes readonly, however, checkpoint() will
skips flushing dirty inode in a state of readonly mode, result in
above panic.
Let's get rid of online repaire in f2fs_lookup(), and leave the work
to fsck.f2fs. |
| In the Linux kernel, the following vulnerability has been resolved:
vfs: fix race between evice_inodes() and find_inode()&iput()
Hi, all
Recently I noticed a bug[1] in btrfs, after digged it into
and I believe it'a race in vfs.
Let's assume there's a inode (ie ino 261) with i_count 1 is
called by iput(), and there's a concurrent thread calling
generic_shutdown_super().
cpu0: cpu1:
iput() // i_count is 1
->spin_lock(inode)
->dec i_count to 0
->iput_final() generic_shutdown_super()
->__inode_add_lru() ->evict_inodes()
// cause some reason[2] ->if (atomic_read(inode->i_count)) continue;
// return before // inode 261 passed the above check
// list_lru_add_obj() // and then schedule out
->spin_unlock()
// note here: the inode 261
// was still at sb list and hash list,
// and I_FREEING|I_WILL_FREE was not been set
btrfs_iget()
// after some function calls
->find_inode()
// found the above inode 261
->spin_lock(inode)
// check I_FREEING|I_WILL_FREE
// and passed
->__iget()
->spin_unlock(inode) // schedule back
->spin_lock(inode)
// check (I_NEW|I_FREEING|I_WILL_FREE) flags,
// passed and set I_FREEING
iput() ->spin_unlock(inode)
->spin_lock(inode) ->evict()
// dec i_count to 0
->iput_final()
->spin_unlock()
->evict()
Now, we have two threads simultaneously evicting
the same inode, which may trigger the BUG(inode->i_state & I_CLEAR)
statement both within clear_inode() and iput().
To fix the bug, recheck the inode->i_count after holding i_lock.
Because in the most scenarios, the first check is valid, and
the overhead of spin_lock() can be reduced.
If there is any misunderstanding, please let me know, thanks.
[1]: https://lore.kernel.org/linux-btrfs/000000000000eabe1d0619c48986@google.com/
[2]: The reason might be 1. SB_ACTIVE was removed or 2. mapping_shrinkable()
return false when I reproduced the bug. |
| In the Linux kernel, the following vulnerability has been resolved:
lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
If we need to increase the tree depth, allocate a new node, and then
race with another thread that increased the tree depth before us, we'll
still have a preallocated node that might be used later.
If we then use that node for a new non-root node, it'll still have a
pointer to the old root instead of being zeroed - fix this by zeroing it
in the cmpxchg failure path. |
| In the Linux kernel, the following vulnerability has been resolved:
spi: hisi-kunpeng: Add verification for the max_frequency provided by the firmware
If the value of max_speed_hz is 0, it may cause a division by zero
error in hisi_calc_effective_speed().
The value of max_speed_hz is provided by firmware.
Firmware is generally considered as a trusted domain. However, as
division by zero errors can cause system failure, for defense measure,
the value of max_speed is validated here. So 0 is regarded as invalid
and an error code is returned. |
| In the Linux kernel, the following vulnerability has been resolved:
staging: iio: frequency: ad9834: Validate frequency parameter value
In ad9834_write_frequency() clk_get_rate() can return 0. In such case
ad9834_calc_freqreg() call will lead to division by zero. Checking
'if (fout > (clk_freq / 2))' doesn't protect in case of 'fout' is 0.
ad9834_write_frequency() is called from ad9834_write(), where fout is
taken from text buffer, which can contain any value.
Modify parameters checking.
Found by Linux Verification Center (linuxtesting.org) with SVACE. |