| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
NFSv4.0: Fix a use-after-free problem in the asynchronous open()
Yang Erkun reports that when two threads are opening files at the same
time, and are forced to abort before a reply is seen, then the call to
nfs_release_seqid() in nfs4_opendata_free() can result in a
use-after-free of the pointer to the defunct rpc task of the other
thread.
The fix is to ensure that if the RPC call is aborted before the call to
nfs_wait_on_sequence() is complete, then we must call nfs_release_seqid()
in nfs4_open_release() before the rpc_task is freed. |
| In the Linux kernel, the following vulnerability has been resolved:
initramfs: avoid filename buffer overrun
The initramfs filename field is defined in
Documentation/driver-api/early-userspace/buffer-format.rst as:
37 cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data
...
55 ============= ================== =========================
56 Field name Field size Meaning
57 ============= ================== =========================
...
70 c_namesize 8 bytes Length of filename, including final \0
When extracting an initramfs cpio archive, the kernel's do_name() path
handler assumes a zero-terminated path at @collected, passing it
directly to filp_open() / init_mkdir() / init_mknod().
If a specially crafted cpio entry carries a non-zero-terminated filename
and is followed by uninitialized memory, then a file may be created with
trailing characters that represent the uninitialized memory. The ability
to create an initramfs entry would imply already having full control of
the system, so the buffer overrun shouldn't be considered a security
vulnerability.
Append the output of the following bash script to an existing initramfs
and observe any created /initramfs_test_fname_overrunAA* path. E.g.
./reproducer.sh | gzip >> /myinitramfs
It's easiest to observe non-zero uninitialized memory when the output is
gzipped, as it'll overflow the heap allocated @out_buf in __gunzip(),
rather than the initrd_start+initrd_size block.
---- reproducer.sh ----
nilchar="A" # change to "\0" to properly zero terminate / pad
magic="070701"
ino=1
mode=$(( 0100777 ))
uid=0
gid=0
nlink=1
mtime=1
filesize=0
devmajor=0
devminor=1
rdevmajor=0
rdevminor=0
csum=0
fname="initramfs_test_fname_overrun"
namelen=$(( ${#fname} + 1 )) # plus one to account for terminator
printf "%s%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%s" \
$magic $ino $mode $uid $gid $nlink $mtime $filesize \
$devmajor $devminor $rdevmajor $rdevminor $namelen $csum $fname
termpadlen=$(( 1 + ((4 - ((110 + $namelen) & 3)) % 4) ))
printf "%.s${nilchar}" $(seq 1 $termpadlen)
---- reproducer.sh ----
Symlink filename fields handled in do_symlink() won't overrun past the
data segment, due to the explicit zero-termination of the symlink
target.
Fix filename buffer overrun by aborting the initramfs FSM if any cpio
entry doesn't carry a zero-terminator at the expected (name_len - 1)
offset. |
| In the Linux kernel, the following vulnerability has been resolved:
mm: revert "mm: shmem: fix data-race in shmem_getattr()"
Revert d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") as
suggested by Chuck [1]. It is causing deadlocks when accessing tmpfs over
NFS.
As Hugh commented, "added just to silence a syzbot sanitizer splat: added
where there has never been any practical problem". |
| In the Linux kernel, the following vulnerability has been resolved:
net: fix data-races around sk->sk_forward_alloc
Syzkaller reported this warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x1c5/0x1e0
Modules linked in:
CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc5 #26
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:inet_sock_destruct+0x1c5/0x1e0
Code: 24 12 4c 89 e2 5b 48 c7 c7 98 ec bb 82 41 5c e9 d1 18 17 ff 4c 89 e6 5b 48 c7 c7 d0 ec bb 82 41 5c e9 bf 18 17 ff 0f 0b eb 83 <0f> 0b eb 97 0f 0b eb 87 0f 0b e9 68 ff ff ff 66 66 2e 0f 1f 84 00
RSP: 0018:ffffc9000008bd90 EFLAGS: 00010206
RAX: 0000000000000300 RBX: ffff88810b172a90 RCX: 0000000000000007
RDX: 0000000000000002 RSI: 0000000000000300 RDI: ffff88810b172a00
RBP: ffff88810b172a00 R08: ffff888104273c00 R09: 0000000000100007
R10: 0000000000020000 R11: 0000000000000006 R12: ffff88810b172a00
R13: 0000000000000004 R14: 0000000000000000 R15: ffff888237c31f78
FS: 0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc63fecac8 CR3: 000000000342e000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __warn+0x88/0x130
? inet_sock_destruct+0x1c5/0x1e0
? report_bug+0x18e/0x1a0
? handle_bug+0x53/0x90
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? inet_sock_destruct+0x1c5/0x1e0
__sk_destruct+0x2a/0x200
rcu_do_batch+0x1aa/0x530
? rcu_do_batch+0x13b/0x530
rcu_core+0x159/0x2f0
handle_softirqs+0xd3/0x2b0
? __pfx_smpboot_thread_fn+0x10/0x10
run_ksoftirqd+0x25/0x30
smpboot_thread_fn+0xdd/0x1d0
kthread+0xd3/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
---[ end trace 0000000000000000 ]---
Its possible that two threads call tcp_v6_do_rcv()/sk_forward_alloc_add()
concurrently when sk->sk_state == TCP_LISTEN with sk->sk_lock unlocked,
which triggers a data-race around sk->sk_forward_alloc:
tcp_v6_rcv
tcp_v6_do_rcv
skb_clone_and_charge_r
sk_rmem_schedule
__sk_mem_schedule
sk_forward_alloc_add()
skb_set_owner_r
sk_mem_charge
sk_forward_alloc_add()
__kfree_skb
skb_release_all
skb_release_head_state
sock_rfree
sk_mem_uncharge
sk_forward_alloc_add()
sk_mem_reclaim
// set local var reclaimable
__sk_mem_reclaim
sk_forward_alloc_add()
In this syzkaller testcase, two threads call
tcp_v6_do_rcv() with skb->truesize=768, the sk_forward_alloc changes like
this:
(cpu 1) | (cpu 2) | sk_forward_alloc
... | ... | 0
__sk_mem_schedule() | | +4096 = 4096
| __sk_mem_schedule() | +4096 = 8192
sk_mem_charge() | | -768 = 7424
| sk_mem_charge() | -768 = 6656
... | ... |
sk_mem_uncharge() | | +768 = 7424
reclaimable=7424 | |
| sk_mem_uncharge() | +768 = 8192
| reclaimable=8192 |
__sk_mem_reclaim() | | -4096 = 4096
| __sk_mem_reclaim() | -8192 = -4096 != 0
The skb_clone_and_charge_r() should not be called in tcp_v6_do_rcv() when
sk->sk_state is TCP_LISTEN, it happens later in tcp_v6_syn_recv_sock().
Fix the same issue in dccp_v6_do_rcv(). |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: cope racing subflow creation in mptcp_rcv_space_adjust
Additional active subflows - i.e. created by the in kernel path
manager - are included into the subflow list before starting the
3whs.
A racing recvmsg() spooling data received on an already established
subflow would unconditionally call tcp_cleanup_rbuf() on all the
current subflows, potentially hitting a divide by zero error on
the newly created ones.
Explicitly check that the subflow is in a suitable state before
invoking tcp_cleanup_rbuf(). |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: fs, lock FTE when checking if active
The referenced commits introduced a two-step process for deleting FTEs:
- Lock the FTE, delete it from hardware, set the hardware deletion function
to NULL and unlock the FTE.
- Lock the parent flow group, delete the software copy of the FTE, and
remove it from the xarray.
However, this approach encounters a race condition if a rule with the same
match value is added simultaneously. In this scenario, fs_core may set the
hardware deletion function to NULL prematurely, causing a panic during
subsequent rule deletions.
To prevent this, ensure the active flag of the FTE is checked under a lock,
which will prevent the fs_core layer from attaching a new steering rule to
an FTE that is in the process of deletion.
[ 438.967589] MOSHE: 2496 mlx5_del_flow_rules del_hw_func
[ 438.968205] ------------[ cut here ]------------
[ 438.968654] refcount_t: decrement hit 0; leaking memory.
[ 438.969249] WARNING: CPU: 0 PID: 8957 at lib/refcount.c:31 refcount_warn_saturate+0xfb/0x110
[ 438.970054] Modules linked in: act_mirred cls_flower act_gact sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core zram zsmalloc fuse [last unloaded: cls_flower]
[ 438.973288] CPU: 0 UID: 0 PID: 8957 Comm: tc Not tainted 6.12.0-rc1+ #8
[ 438.973888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 438.974874] RIP: 0010:refcount_warn_saturate+0xfb/0x110
[ 438.975363] Code: 40 66 3b 82 c6 05 16 e9 4d 01 01 e8 1f 7c a0 ff 0f 0b c3 cc cc cc cc 48 c7 c7 10 66 3b 82 c6 05 fd e8 4d 01 01 e8 05 7c a0 ff <0f> 0b c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90
[ 438.976947] RSP: 0018:ffff888124a53610 EFLAGS: 00010286
[ 438.977446] RAX: 0000000000000000 RBX: ffff888119d56de0 RCX: 0000000000000000
[ 438.978090] RDX: ffff88852c828700 RSI: ffff88852c81b3c0 RDI: ffff88852c81b3c0
[ 438.978721] RBP: ffff888120fa0e88 R08: 0000000000000000 R09: ffff888124a534b0
[ 438.979353] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888119d56de0
[ 438.979979] R13: ffff888120fa0ec0 R14: ffff888120fa0ee8 R15: ffff888119d56de0
[ 438.980607] FS: 00007fe6dcc0f800(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
[ 438.983984] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 438.984544] CR2: 00000000004275e0 CR3: 0000000186982001 CR4: 0000000000372eb0
[ 438.985205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 438.985842] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 438.986507] Call Trace:
[ 438.986799] <TASK>
[ 438.987070] ? __warn+0x7d/0x110
[ 438.987426] ? refcount_warn_saturate+0xfb/0x110
[ 438.987877] ? report_bug+0x17d/0x190
[ 438.988261] ? prb_read_valid+0x17/0x20
[ 438.988659] ? handle_bug+0x53/0x90
[ 438.989054] ? exc_invalid_op+0x14/0x70
[ 438.989458] ? asm_exc_invalid_op+0x16/0x20
[ 438.989883] ? refcount_warn_saturate+0xfb/0x110
[ 438.990348] mlx5_del_flow_rules+0x2f7/0x340 [mlx5_core]
[ 438.990932] __mlx5_eswitch_del_rule+0x49/0x170 [mlx5_core]
[ 438.991519] ? mlx5_lag_is_sriov+0x3c/0x50 [mlx5_core]
[ 438.992054] ? xas_load+0x9/0xb0
[ 438.992407] mlx5e_tc_rule_unoffload+0x45/0xe0 [mlx5_core]
[ 438.993037] mlx5e_tc_del_fdb_flow+0x2a6/0x2e0 [mlx5_core]
[ 438.993623] mlx5e_flow_put+0x29/0x60 [mlx5_core]
[ 438.994161] mlx5e_delete_flower+0x261/0x390 [mlx5_core]
[ 438.994728] tc_setup_cb_destroy+0xb9/0x190
[ 438.995150] fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
[ 438.995650] fl_change+0x11a4/0x13c0 [cls_flower]
[ 438.996105] tc_new_tfilter+0x347/0xbc0
[ 438.996503] ? __
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: CT: Fix null-ptr-deref in add rule err flow
In error flow of mlx5_tc_ct_entry_add_rule(), in case ct_rule_add()
callback returns error, zone_rule->attr is used uninitiated. Fix it to
use attr which has the needed pointer value.
Kernel log:
BUG: kernel NULL pointer dereference, address: 0000000000000110
RIP: 0010:mlx5_tc_ct_entry_add_rule+0x2b1/0x2f0 [mlx5_core]
…
Call Trace:
<TASK>
? __die+0x20/0x70
? page_fault_oops+0x150/0x3e0
? exc_page_fault+0x74/0x140
? asm_exc_page_fault+0x22/0x30
? mlx5_tc_ct_entry_add_rule+0x2b1/0x2f0 [mlx5_core]
? mlx5_tc_ct_entry_add_rule+0x1d5/0x2f0 [mlx5_core]
mlx5_tc_ct_block_flow_offload+0xc6a/0xf90 [mlx5_core]
? nf_flow_offload_tuple+0xd8/0x190 [nf_flow_table]
nf_flow_offload_tuple+0xd8/0x190 [nf_flow_table]
flow_offload_work_handler+0x142/0x320 [nf_flow_table]
? finish_task_switch.isra.0+0x15b/0x2b0
process_one_work+0x16c/0x320
worker_thread+0x28c/0x3a0
? __pfx_worker_thread+0x10/0x10
kthread+0xb8/0xf0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2d/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK> |
| In the Linux kernel, the following vulnerability has been resolved:
mm: fix NULL pointer dereference in alloc_pages_bulk_noprof
We triggered a NULL pointer dereference for ac.preferred_zoneref->zone in
alloc_pages_bulk_noprof() when the task is migrated between cpusets.
When cpuset is enabled, in prepare_alloc_pages(), ac->nodemask may be
¤t->mems_allowed. when first_zones_zonelist() is called to find
preferred_zoneref, the ac->nodemask may be modified concurrently if the
task is migrated between different cpusets. Assuming we have 2 NUMA Node,
when traversing Node1 in ac->zonelist, the nodemask is 2, and when
traversing Node2 in ac->zonelist, the nodemask is 1. As a result, the
ac->preferred_zoneref points to NULL zone.
In alloc_pages_bulk_noprof(), for_each_zone_zonelist_nodemask() finds a
allowable zone and calls zonelist_node_idx(ac.preferred_zoneref), leading
to NULL pointer dereference.
__alloc_pages_noprof() fixes this issue by checking NULL pointer in commit
ea57485af8f4 ("mm, page_alloc: fix check for NULL preferred_zone") and
commit df76cee6bbeb ("mm, page_alloc: remove redundant checks from alloc
fastpath").
To fix it, check NULL pointer for preferred_zoneref->zone. |
| In the Linux kernel, the following vulnerability has been resolved:
smb: client: Fix use-after-free of network namespace.
Recently, we got a customer report that CIFS triggers oops while
reconnecting to a server. [0]
The workload runs on Kubernetes, and some pods mount CIFS servers
in non-root network namespaces. The problem rarely happened, but
it was always while the pod was dying.
The root cause is wrong reference counting for network namespace.
CIFS uses kernel sockets, which do not hold refcnt of the netns that
the socket belongs to. That means CIFS must ensure the socket is
always freed before its netns; otherwise, use-after-free happens.
The repro steps are roughly:
1. mount CIFS in a non-root netns
2. drop packets from the netns
3. destroy the netns
4. unmount CIFS
We can reproduce the issue quickly with the script [1] below and see
the splat [2] if CONFIG_NET_NS_REFCNT_TRACKER is enabled.
When the socket is TCP, it is hard to guarantee the netns lifetime
without holding refcnt due to async timers.
Let's hold netns refcnt for each socket as done for SMC in commit
9744d2bf1976 ("smc: Fix use-after-free in tcp_write_timer_handler().").
Note that we need to move put_net() from cifs_put_tcp_session() to
clean_demultiplex_info(); otherwise, __sock_create() still could touch a
freed netns while cifsd tries to reconnect from cifs_demultiplex_thread().
Also, maybe_get_net() cannot be put just before __sock_create() because
the code is not under RCU and there is a small chance that the same
address happened to be reallocated to another netns.
[0]:
CIFS: VFS: \\XXXXXXXXXXX has not responded in 15 seconds. Reconnecting...
CIFS: Serverclose failed 4 times, giving up
Unable to handle kernel paging request at virtual address 14de99e461f84a07
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[14de99e461f84a07] address between user and kernel address ranges
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in: cls_bpf sch_ingress nls_utf8 cifs cifs_arc4 cifs_md4 dns_resolver tcp_diag inet_diag veth xt_state xt_connmark nf_conntrack_netlink xt_nat xt_statistic xt_MASQUERADE xt_mark xt_addrtype ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment nft_compat nf_tables nfnetlink overlay nls_ascii nls_cp437 sunrpc vfat fat aes_ce_blk aes_ce_cipher ghash_ce sm4_ce_cipher sm4 sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 sha1_ce ena button sch_fq_codel loop fuse configfs dmi_sysfs sha2_ce sha256_arm64 dm_mirror dm_region_hash dm_log dm_mod dax efivarfs
CPU: 5 PID: 2690970 Comm: cifsd Not tainted 6.1.103-109.184.amzn2023.aarch64 #1
Hardware name: Amazon EC2 r7g.4xlarge/, BIOS 1.0 11/1/2018
pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : fib_rules_lookup+0x44/0x238
lr : __fib_lookup+0x64/0xbc
sp : ffff8000265db790
x29: ffff8000265db790 x28: 0000000000000000 x27: 000000000000bd01
x26: 0000000000000000 x25: ffff000b4baf8000 x24: ffff00047b5e4580
x23: ffff8000265db7e0 x22: 0000000000000000 x21: ffff00047b5e4500
x20: ffff0010e3f694f8 x19: 14de99e461f849f7 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 3f92800abd010002
x11: 0000000000000001 x10: ffff0010e3f69420 x9 : ffff800008a6f294
x8 : 0000000000000000 x7 : 0000000000000006 x6 : 0000000000000000
x5 : 0000000000000001 x4 : ffff001924354280 x3 : ffff8000265db7e0
x2 : 0000000000000000 x1 : ffff0010e3f694f8 x0 : ffff00047b5e4500
Call trace:
fib_rules_lookup+0x44/0x238
__fib_lookup+0x64/0xbc
ip_route_output_key_hash_rcu+0x2c4/0x398
ip_route_output_key_hash+0x60/0x8c
tcp_v4_connect+0x290/0x488
__inet_stream_connect+0x108/0x3d0
inet_stream_connect+0x50/0x78
kernel_connect+0x6c/0xac
generic_ip_conne
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT
In qdisc_tree_reduce_backlog, Qdiscs with major handle ffff: are assumed
to be either root or ingress. This assumption is bogus since it's valid
to create egress qdiscs with major handle ffff:
Budimir Markovic found that for qdiscs like DRR that maintain an active
class list, it will cause a UAF with a dangling class pointer.
In 066a3b5b2346, the concern was to avoid iterating over the ingress
qdisc since its parent is itself. The proper fix is to stop when parent
TC_H_ROOT is reached because the only way to retrieve ingress is when a
hierarchy which does not contain a ffff: major handle call into
qdisc_lookup with TC_H_MAJ(TC_H_ROOT).
In the scenario where major ffff: is an egress qdisc in any of the tree
levels, the updates will also propagate to TC_H_ROOT, which then the
iteration must stop.
net/sched/sch_api.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-) |
| In the Linux kernel, the following vulnerability has been resolved:
ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_find()
The per-netns IP tunnel hash table is protected by the RTNL mutex and
ip_tunnel_find() is only called from the control path where the mutex is
taken.
Add a lockdep expression to hlist_for_each_entry_rcu() in
ip_tunnel_find() in order to validate that the mutex is held and to
silence the suspicious RCU usage warning [1].
[1]
WARNING: suspicious RCU usage
6.12.0-rc3-custom-gd95d9a31aceb #139 Not tainted
-----------------------------
net/ipv4/ip_tunnel.c:221 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/362:
#0: ffffffff86fc7cb0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x377/0xf60
stack backtrace:
CPU: 12 UID: 0 PID: 362 Comm: ip Not tainted 6.12.0-rc3-custom-gd95d9a31aceb #139
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0xba/0x110
lockdep_rcu_suspicious.cold+0x4f/0xd6
ip_tunnel_find+0x435/0x4d0
ip_tunnel_newlink+0x517/0x7a0
ipgre_newlink+0x14c/0x170
__rtnl_newlink+0x1173/0x19c0
rtnl_newlink+0x6c/0xa0
rtnetlink_rcv_msg+0x3cc/0xf60
netlink_rcv_skb+0x171/0x450
netlink_unicast+0x539/0x7f0
netlink_sendmsg+0x8c1/0xd80
____sys_sendmsg+0x8f9/0xc20
___sys_sendmsg+0x197/0x1e0
__sys_sendmsg+0x122/0x1f0
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f |
| In the Linux kernel, the following vulnerability has been resolved:
arm64/sve: Discard stale CPU state when handling SVE traps
The logic for handling SVE traps manipulates saved FPSIMD/SVE state
incorrectly, and a race with preemption can result in a task having
TIF_SVE set and TIF_FOREIGN_FPSTATE clear even though the live CPU state
is stale (e.g. with SVE traps enabled). This has been observed to result
in warnings from do_sve_acc() where SVE traps are not expected while
TIF_SVE is set:
| if (test_and_set_thread_flag(TIF_SVE))
| WARN_ON(1); /* SVE access shouldn't have trapped */
Warnings of this form have been reported intermittently, e.g.
https://lore.kernel.org/linux-arm-kernel/CA+G9fYtEGe_DhY2Ms7+L7NKsLYUomGsgqpdBj+QwDLeSg=JhGg@mail.gmail.com/
https://lore.kernel.org/linux-arm-kernel/000000000000511e9a060ce5a45c@google.com/
The race can occur when the SVE trap handler is preempted before and
after manipulating the saved FPSIMD/SVE state, starting and ending on
the same CPU, e.g.
| void do_sve_acc(unsigned long esr, struct pt_regs *regs)
| {
| // Trap on CPU 0 with TIF_SVE clear, SVE traps enabled
| // task->fpsimd_cpu is 0.
| // per_cpu_ptr(&fpsimd_last_state, 0) is task.
|
| ...
|
| // Preempted; migrated from CPU 0 to CPU 1.
| // TIF_FOREIGN_FPSTATE is set.
|
| get_cpu_fpsimd_context();
|
| if (test_and_set_thread_flag(TIF_SVE))
| WARN_ON(1); /* SVE access shouldn't have trapped */
|
| sve_init_regs() {
| if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
| ...
| } else {
| fpsimd_to_sve(current);
| current->thread.fp_type = FP_STATE_SVE;
| }
| }
|
| put_cpu_fpsimd_context();
|
| // Preempted; migrated from CPU 1 to CPU 0.
| // task->fpsimd_cpu is still 0
| // If per_cpu_ptr(&fpsimd_last_state, 0) is still task then:
| // - Stale HW state is reused (with SVE traps enabled)
| // - TIF_FOREIGN_FPSTATE is cleared
| // - A return to userspace skips HW state restore
| }
Fix the case where the state is not live and TIF_FOREIGN_FPSTATE is set
by calling fpsimd_flush_task_state() to detach from the saved CPU
state. This ensures that a subsequent context switch will not reuse the
stale CPU state, and will instead set TIF_FOREIGN_FPSTATE, forcing the
new state to be reloaded from memory prior to a return to userspace. |
| In the Linux kernel, the following vulnerability has been resolved:
vsock/virtio: Initialization of the dangling pointer occurring in vsk->trans
During loopback communication, a dangling pointer can be created in
vsk->trans, potentially leading to a Use-After-Free condition. This
issue is resolved by initializing vsk->trans to NULL. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix out-of-bounds write in trie_get_next_key()
trie_get_next_key() allocates a node stack with size trie->max_prefixlen,
while it writes (trie->max_prefixlen + 1) nodes to the stack when it has
full paths from the root to leaves. For example, consider a trie with
max_prefixlen is 8, and the nodes with key 0x00/0, 0x00/1, 0x00/2, ...
0x00/8 inserted. Subsequent calls to trie_get_next_key with _key with
.prefixlen = 8 make 9 nodes be written on the node stack with size 8. |
| In the Linux kernel, the following vulnerability has been resolved:
macsec: Fix use-after-free while sending the offloading packet
KASAN reports the following UAF. The metadata_dst, which is used to
store the SCI value for macsec offload, is already freed by
metadata_dst_free() in macsec_free_netdev(), while driver still use it
for sending the packet.
To fix this issue, dst_release() is used instead to release
metadata_dst. So it is not freed instantly in macsec_free_netdev() if
still referenced by skb.
BUG: KASAN: slab-use-after-free in mlx5e_xmit+0x1e8f/0x4190 [mlx5_core]
Read of size 2 at addr ffff88813e42e038 by task kworker/7:2/714
[...]
Workqueue: mld mld_ifc_work
Call Trace:
<TASK>
dump_stack_lvl+0x51/0x60
print_report+0xc1/0x600
kasan_report+0xab/0xe0
mlx5e_xmit+0x1e8f/0x4190 [mlx5_core]
dev_hard_start_xmit+0x120/0x530
sch_direct_xmit+0x149/0x11e0
__qdisc_run+0x3ad/0x1730
__dev_queue_xmit+0x1196/0x2ed0
vlan_dev_hard_start_xmit+0x32e/0x510 [8021q]
dev_hard_start_xmit+0x120/0x530
__dev_queue_xmit+0x14a7/0x2ed0
macsec_start_xmit+0x13e9/0x2340
dev_hard_start_xmit+0x120/0x530
__dev_queue_xmit+0x14a7/0x2ed0
ip6_finish_output2+0x923/0x1a70
ip6_finish_output+0x2d7/0x970
ip6_output+0x1ce/0x3a0
NF_HOOK.constprop.0+0x15f/0x190
mld_sendpack+0x59a/0xbd0
mld_ifc_work+0x48a/0xa80
process_one_work+0x5aa/0xe50
worker_thread+0x79c/0x1290
kthread+0x28f/0x350
ret_from_fork+0x2d/0x70
ret_from_fork_asm+0x11/0x20
</TASK>
Allocated by task 3922:
kasan_save_stack+0x20/0x40
kasan_save_track+0x10/0x30
__kasan_kmalloc+0x77/0x90
__kmalloc_noprof+0x188/0x400
metadata_dst_alloc+0x1f/0x4e0
macsec_newlink+0x914/0x1410
__rtnl_newlink+0xe08/0x15b0
rtnl_newlink+0x5f/0x90
rtnetlink_rcv_msg+0x667/0xa80
netlink_rcv_skb+0x12c/0x360
netlink_unicast+0x551/0x770
netlink_sendmsg+0x72d/0xbd0
__sock_sendmsg+0xc5/0x190
____sys_sendmsg+0x52e/0x6a0
___sys_sendmsg+0xeb/0x170
__sys_sendmsg+0xb5/0x140
do_syscall_64+0x4c/0x100
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Freed by task 4011:
kasan_save_stack+0x20/0x40
kasan_save_track+0x10/0x30
kasan_save_free_info+0x37/0x50
poison_slab_object+0x10c/0x190
__kasan_slab_free+0x11/0x30
kfree+0xe0/0x290
macsec_free_netdev+0x3f/0x140
netdev_run_todo+0x450/0xc70
rtnetlink_rcv_msg+0x66f/0xa80
netlink_rcv_skb+0x12c/0x360
netlink_unicast+0x551/0x770
netlink_sendmsg+0x72d/0xbd0
__sock_sendmsg+0xc5/0x190
____sys_sendmsg+0x52e/0x6a0
___sys_sendmsg+0xeb/0x170
__sys_sendmsg+0xb5/0x140
do_syscall_64+0x4c/0x100
entry_SYSCALL_64_after_hwframe+0x4b/0x53 |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
If access to offset + length is larger than the skbuff length, then
skb_checksum() triggers BUG_ON().
skb_checksum() internally subtracts the length parameter while iterating
over skbuff, BUG_ON(len) at the end of it checks that the expected
length to be included in the checksum calculation is fully consumed. |
| In the Linux kernel, the following vulnerability has been resolved:
cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
In support of investigating an initialization failure report [1],
cxl_test was updated to register mock memory-devices after the mock
root-port/bus device had been registered. That led to cxl_test crashing
with a use-after-free bug with the following signature:
cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
[..]
cxld_unregister: cxl decoder14.0:
cxl_region_decode_reset: cxl_region region3:
mock_decoder_reset: cxl_port port3: decoder3.0 reset
2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
cxl_endpoint_decoder_release: cxl decoder14.0:
[..]
cxld_unregister: cxl decoder7.0:
3) cxl_region_decode_reset: cxl_region region3:
Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
[..]
RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
[..]
Call Trace:
<TASK>
cxl_region_decode_reset+0x69/0x190 [cxl_core]
cxl_region_detach+0xe8/0x210 [cxl_core]
cxl_decoder_kill_region+0x27/0x40 [cxl_core]
cxld_unregister+0x5d/0x60 [cxl_core]
At 1) a region has been established with 2 endpoint decoders (7.0 and
14.0). Those endpoints share a common switch-decoder in the topology
(3.0). At teardown, 2), decoder14.0 is the first to be removed and hits
the "out of order reset case" in the switch decoder. The effect though
is that region3 cleanup is aborted leaving it in-tact and
referencing decoder14.0. At 3) the second attempt to teardown region3
trips over the stale decoder14.0 object which has long since been
deleted.
The fix here is to recognize that the CXL specification places no
mandate on in-order shutdown of switch-decoders, the driver enforces
in-order allocation, and hardware enforces in-order commit. So, rather
than fail and leave objects dangling, always remove them.
In support of making cxl_region_decode_reset() always succeed,
cxl_region_invalidate_memregion() failures are turned into warnings.
Crashing the kernel is ok there since system integrity is at risk if
caches cannot be managed around physical address mutation events like
CXL region destruction.
A new device_for_each_child_reverse_from() is added to cleanup
port->commit_end after all dependent decoders have been disabled. In
other words if decoders are allocated 0->1->2 and disabled 1->2->0 then
port->commit_end only decrements from 2 after 2 has been disabled, and
it decrements all the way to zero since 1 was disabled previously. |
| In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix possible double free in smb2_set_ea()
Clang static checker(scan-build) warning:
fs/smb/client/smb2ops.c:1304:2: Attempt to free released memory.
1304 | kfree(ea);
| ^~~~~~~~~
There is a double free in such case:
'ea is initialized to NULL' -> 'first successful memory allocation for
ea' -> 'something failed, goto sea_exit' -> 'first memory release for ea'
-> 'goto replay_again' -> 'second goto sea_exit before allocate memory
for ea' -> 'second memory release for ea resulted in double free'.
Re-initialie 'ea' to NULL near to the replay_again label, it can fix this
double free problem. |
| In the Linux kernel, the following vulnerability has been resolved:
usb: typec: altmode should keep reference to parent
The altmode device release refers to its parent device, but without keeping
a reference to it.
When registering the altmode, get a reference to the parent and put it in
the release function.
Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
like this:
[ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
[ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
[ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
[ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
[ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)
[ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
[ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)
[ 46.612867] ==================================================================
[ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129
[ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48
[ 46.614538]
[ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535
[ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[ 46.616042] Workqueue: events kobject_delayed_cleanup
[ 46.616446] Call Trace:
[ 46.616648] <TASK>
[ 46.616820] dump_stack_lvl+0x5b/0x7c
[ 46.617112] ? typec_altmode_release+0x38/0x129
[ 46.617470] print_report+0x14c/0x49e
[ 46.617769] ? rcu_read_unlock_sched+0x56/0x69
[ 46.618117] ? __virt_addr_valid+0x19a/0x1ab
[ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d
[ 46.618807] ? typec_altmode_release+0x38/0x129
[ 46.619161] kasan_report+0x8d/0xb4
[ 46.619447] ? typec_altmode_release+0x38/0x129
[ 46.619809] ? process_scheduled_works+0x3cb/0x85f
[ 46.620185] typec_altmode_release+0x38/0x129
[ 46.620537] ? process_scheduled_works+0x3cb/0x85f
[ 46.620907] device_release+0xaf/0xf2
[ 46.621206] kobject_delayed_cleanup+0x13b/0x17a
[ 46.621584] process_scheduled_works+0x4f6/0x85f
[ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10
[ 46.622353] ? hlock_class+0x31/0x9a
[ 46.622647] ? lock_acquired+0x361/0x3c3
[ 46.622956] ? move_linked_works+0x46/0x7d
[ 46.623277] worker_thread+0x1ce/0x291
[ 46.623582] ? __kthread_parkme+0xc8/0xdf
[ 46.623900] ? __pfx_worker_thread+0x10/0x10
[ 46.624236] kthread+0x17e/0x190
[ 46.624501] ? kthread+0xfb/0x190
[ 46.624756] ? __pfx_kthread+0x10/0x10
[ 46.625015] ret_from_fork+0x20/0x40
[ 46.625268] ? __pfx_kthread+0x10/0x10
[ 46.625532] ret_from_fork_asm+0x1a/0x30
[ 46.625805] </TASK>
[ 46.625953]
[ 46.626056] Allocated by task 678:
[ 46.626287] kasan_save_stack+0x24/0x44
[ 46.626555] kasan_save_track+0x14/0x2d
[ 46.626811] __kasan_kmalloc+0x3f/0x4d
[ 46.627049] __kmalloc_noprof+0x1bf/0x1f0
[ 46.627362] typec_register_port+0x23/0x491
[ 46.627698] cros_typec_probe+0x634/0xbb6
[ 46.628026] platform_probe+0x47/0x8c
[ 46.628311] really_probe+0x20a/0x47d
[ 46.628605] device_driver_attach+0x39/0x72
[ 46.628940] bind_store+0x87/0xd7
[ 46.629213] kernfs_fop_write_iter+0x1aa/0x218
[ 46.629574] vfs_write+0x1d6/0x29b
[ 46.629856] ksys_write+0xcd/0x13b
[ 46.630128] do_syscall_64+0xd4/0x139
[ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 46.630820]
[ 46.630946] Freed by task 48:
[ 46.631182] kasan_save_stack+0x24/0x44
[ 46.631493] kasan_save_track+0x14/0x2d
[ 46.631799] kasan_save_free_info+0x3f/0x4d
[ 46.632144] __kasan_slab_free+0x37/0x45
[ 46.632474]
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: bpf: must hold reference on net namespace
BUG: KASAN: slab-use-after-free in __nf_unregister_net_hook+0x640/0x6b0
Read of size 8 at addr ffff8880106fe400 by task repro/72=
bpf_nf_link_release+0xda/0x1e0
bpf_link_free+0x139/0x2d0
bpf_link_release+0x68/0x80
__fput+0x414/0xb60
Eric says:
It seems that bpf was able to defer the __nf_unregister_net_hook()
after exit()/close() time.
Perhaps a netns reference is missing, because the netns has been
dismantled/freed already.
bpf_nf_link_attach() does :
link->net = net;
But I do not see a reference being taken on net.
Add such a reference and release it after hook unreg.
Note that I was unable to get syzbot reproducer to work, so I
do not know if this resolves this splat. |