| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
mm/sparsemem: fix race in accessing memory_section->usage
The below race is observed on a PFN which falls into the device memory
region with the system memory configuration where PFN's are such that
[ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL]. Since normal zone start and end
pfn contains the device memory PFN's as well, the compaction triggered
will try on the device memory PFN's too though they end up in NOP(because
pfn_to_online_page() returns NULL for ZONE_DEVICE memory sections). When
from other core, the section mappings are being removed for the
ZONE_DEVICE region, that the PFN in question belongs to, on which
compaction is currently being operated is resulting into the kernel crash
with CONFIG_SPASEMEM_VMEMAP enabled. The crash logs can be seen at [1].
compact_zone() memunmap_pages
------------- ---------------
__pageblock_pfn_to_page
......
(a)pfn_valid():
valid_section()//return true
(b)__remove_pages()->
sparse_remove_section()->
section_deactivate():
[Free the array ms->usage and set
ms->usage = NULL]
pfn_section_valid()
[Access ms->usage which
is NULL]
NOTE: From the above it can be said that the race is reduced to between
the pfn_valid()/pfn_section_valid() and the section deactivate with
SPASEMEM_VMEMAP enabled.
The commit b943f045a9af("mm/sparse: fix kernel crash with
pfn_section_valid check") tried to address the same problem by clearing
the SECTION_HAS_MEM_MAP with the expectation of valid_section() returns
false thus ms->usage is not accessed.
Fix this issue by the below steps:
a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage.
b) RCU protected read side critical section will either return NULL
when SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage.
c) Free the ->usage with kfree_rcu() and set ms->usage = NULL. No
attempt will be made to access ->usage after this as the
SECTION_HAS_MEM_MAP is cleared thus valid_section() return false.
Thanks to David/Pavan for their inputs on this patch.
[1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/
On Snapdragon SoC, with the mentioned memory configuration of PFN's as
[ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we are able to see bunch of
issues daily while testing on a device farm.
For this particular issue below is the log. Though the below log is
not directly pointing to the pfn_section_valid(){ ms->usage;}, when we
loaded this dump on T32 lauterbach tool, it is pointing.
[ 540.578056] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
[ 540.578068] Mem abort info:
[ 540.578070] ESR = 0x0000000096000005
[ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits
[ 540.578077] SET = 0, FnV = 0
[ 540.578080] EA = 0, S1PTW = 0
[ 540.578082] FSC = 0x05: level 1 translation fault
[ 540.578085] Data abort info:
[ 540.578086] ISV = 0, ISS = 0x00000005
[ 540.578088] CM = 0, WnR = 0
[ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBSBTYPE=--)
[ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c
[ 540.579454] lr : compact_zone+0x994/0x1058
[ 540.579460] sp : ffffffc03579b510
[ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c
[ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640
[ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000
[ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140
[ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff
[ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001
[ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440
[ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4
[ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix race condition between session lookup and expire
Thread A + Thread B
ksmbd_session_lookup | smb2_sess_setup
sess = xa_load |
|
| xa_erase(&conn->sessions, sess->id);
|
| ksmbd_session_destroy(sess) --> kfree(sess)
|
// UAF! |
sess->last_active = jiffies |
+
This patch add rwsem to fix race condition between ksmbd_session_lookup
and ksmbd_expire_session. |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
If multiple connection requests attempt to create an implicit mptcp
endpoint in parallel, more than one caller may end up in
mptcp_pm_nl_append_new_local_addr because none found the address in
local_addr_list during their call to mptcp_pm_nl_get_local_id. In this
case, the concurrent new_local_addr calls may delete the address entry
created by the previous caller. These deletes use synchronize_rcu, but
this is not permitted in some of the contexts where this function may be
called. During packet recv, the caller may be in a rcu read critical
section and have preemption disabled.
An example stack:
BUG: scheduling while atomic: swapper/2/0/0x00000302
Call Trace:
<IRQ>
dump_stack_lvl (lib/dump_stack.c:117 (discriminator 1))
dump_stack (lib/dump_stack.c:124)
__schedule_bug (kernel/sched/core.c:5943)
schedule_debug.constprop.0 (arch/x86/include/asm/preempt.h:33 kernel/sched/core.c:5970)
__schedule (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 kernel/sched/features.h:29 kernel/sched/core.c:6621)
schedule (arch/x86/include/asm/preempt.h:84 kernel/sched/core.c:6804 kernel/sched/core.c:6818)
schedule_timeout (kernel/time/timer.c:2160)
wait_for_completion (kernel/sched/completion.c:96 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:148)
__wait_rcu_gp (include/linux/rcupdate.h:311 kernel/rcu/update.c:444)
synchronize_rcu (kernel/rcu/tree.c:3609)
mptcp_pm_nl_append_new_local_addr (net/mptcp/pm_netlink.c:966 net/mptcp/pm_netlink.c:1061)
mptcp_pm_nl_get_local_id (net/mptcp/pm_netlink.c:1164)
mptcp_pm_get_local_id (net/mptcp/pm.c:420)
subflow_check_req (net/mptcp/subflow.c:98 net/mptcp/subflow.c:213)
subflow_v4_route_req (net/mptcp/subflow.c:305)
tcp_conn_request (net/ipv4/tcp_input.c:7216)
subflow_v4_conn_request (net/mptcp/subflow.c:651)
tcp_rcv_state_process (net/ipv4/tcp_input.c:6709)
tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1934)
tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2334)
ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1))
ip_local_deliver_finish (include/linux/rcupdate.h:813 net/ipv4/ip_input.c:234)
ip_local_deliver (include/linux/netfilter.h:314 include/linux/netfilter.h:308 net/ipv4/ip_input.c:254)
ip_sublist_rcv_finish (include/net/dst.h:461 net/ipv4/ip_input.c:580)
ip_sublist_rcv (net/ipv4/ip_input.c:640)
ip_list_rcv (net/ipv4/ip_input.c:675)
__netif_receive_skb_list_core (net/core/dev.c:5583 net/core/dev.c:5631)
netif_receive_skb_list_internal (net/core/dev.c:5685 net/core/dev.c:5774)
napi_complete_done (include/linux/list.h:37 include/net/gro.h:449 include/net/gro.h:444 net/core/dev.c:6114)
igb_poll (drivers/net/ethernet/intel/igb/igb_main.c:8244) igb
__napi_poll (net/core/dev.c:6582)
net_rx_action (net/core/dev.c:6653 net/core/dev.c:6787)
handle_softirqs (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:588 kernel/softirq.c:427 kernel/softirq.c:636)
irq_exit_rcu (kernel/softirq.c:651)
common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
</IRQ>
This problem seems particularly prevalent if the user advertises an
endpoint that has a different external vs internal address. In the case
where the external address is advertised and multiple connections
already exist, multiple subflow SYNs arrive in parallel which tends to
trigger the race during creation of the first local_addr_list entries
which have the internal address instead.
Fix by skipping the replacement of an existing implicit local address if
called via mptcp_pm_nl_get_local_id. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: nl80211: reject cooked mode if it is set along with other flags
It is possible to set both MONITOR_FLAG_COOK_FRAMES and MONITOR_FLAG_ACTIVE
flags simultaneously on the same monitor interface from the userspace. This
causes a sub-interface to be created with no IEEE80211_SDATA_IN_DRIVER bit
set because the monitor interface is in the cooked state and it takes
precedence over all other states. When the interface is then being deleted
the kernel calls WARN_ONCE() from check_sdata_in_driver() because of missing
that bit.
Fix this by rejecting MONITOR_FLAG_COOK_FRAMES if it is set along with
other flags.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller. |
| In the Linux kernel, the following vulnerability has been resolved:
perf/core: Order the PMU list to fix warning about unordered pmu_ctx_list
Syskaller triggers a warning due to prev_epc->pmu != next_epc->pmu in
perf_event_swap_task_ctx_data(). vmcore shows that two lists have the same
perf_event_pmu_context, but not in the same order.
The problem is that the order of pmu_ctx_list for the parent is impacted by
the time when an event/PMU is added. While the order for a child is
impacted by the event order in the pinned_groups and flexible_groups. So
the order of pmu_ctx_list in the parent and child may be different.
To fix this problem, insert the perf_event_pmu_context to its proper place
after iteration of the pmu_ctx_list.
The follow testcase can trigger above warning:
# perf record -e cycles --call-graph lbr -- taskset -c 3 ./a.out &
# perf stat -e cpu-clock,cs -p xxx // xxx is the pid of a.out
test.c
void main() {
int count = 0;
pid_t pid;
printf("%d running\n", getpid());
sleep(30);
printf("running\n");
pid = fork();
if (pid == -1) {
printf("fork error\n");
return;
}
if (pid == 0) {
while (1) {
count++;
}
} else {
while (1) {
count++;
}
}
}
The testcase first opens an LBR event, so it will allocate task_ctx_data,
and then open tracepoint and software events, so the parent context will
have 3 different perf_event_pmu_contexts. On inheritance, child ctx will
insert the perf_event_pmu_context in another order and the warning will
trigger.
[ mingo: Tidied up the changelog. ] |
| In the Linux kernel, the following vulnerability has been resolved:
timers/migration: Fix off-by-one root mis-connection
Before attaching a new root to the old root, the children counter of the
new root is checked to verify that only the upcoming CPU's top group have
been connected to it. However since the recently added commit b729cc1ec21a
("timers/migration: Fix another race between hotplug and idle entry/exit")
this check is not valid anymore because the old root is pre-accounted
as a child to the new root. Therefore after connecting the upcoming
CPU's top group to the new root, the children count to be expected must
be 2 and not 1 anymore.
This omission results in the old root to not be connected to the new
root. Then eventually the system may run with more than one top level,
which defeats the purpose of a single idle migrator.
Also the old root is pre-accounted but not connected upon the new root
creation. But it can be connected to the new root later on. Therefore
the old root may be accounted twice to the new root. The propagation of
such overcommit can end up creating a double final top-level root with a
groupmask incorrectly initialized. Although harmless given that the final
top level roots will never have a parent to walk up to, this oddity
opportunistically reported the core issue:
WARNING: CPU: 8 PID: 0 at kernel/time/timer_migration.c:543 tmigr_requires_handle_remote
CPU: 8 UID: 0 PID: 0 Comm: swapper/8
RIP: 0010:tmigr_requires_handle_remote
Call Trace:
<IRQ>
? tmigr_requires_handle_remote
? hrtimer_run_queues
update_process_times
tick_periodic
tick_handle_periodic
__sysvec_apic_timer_interrupt
sysvec_apic_timer_interrupt
</IRQ>
Fix the problem by taking the old root into account in the children count
of the new root so the connection is not omitted.
Also warn when more than one top level group exists to better detect
similar issues in the future. |
| In the Linux kernel, the following vulnerability has been resolved:
net: rose: fix timer races against user threads
Rose timers only acquire the socket spinlock, without
checking if the socket is owned by one user thread.
Add a check and rearm the timers if needed.
BUG: KASAN: slab-use-after-free in rose_timer_expiry+0x31d/0x360 net/rose/rose_timer.c:174
Read of size 2 at addr ffff88802f09b82a by task swapper/0/0
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.13.0-rc5-syzkaller-00172-gd1bf27c4e176 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0x169/0x550 mm/kasan/report.c:489
kasan_report+0x143/0x180 mm/kasan/report.c:602
rose_timer_expiry+0x31d/0x360 net/rose/rose_timer.c:174
call_timer_fn+0x187/0x650 kernel/time/timer.c:1793
expire_timers kernel/time/timer.c:1844 [inline]
__run_timers kernel/time/timer.c:2418 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2430
run_timer_base kernel/time/timer.c:2439 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2449
handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
__do_softirq kernel/softirq.c:595 [inline]
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
</IRQ> |
| In the Linux kernel, the following vulnerability has been resolved:
i2c: virtio: fix completion handling
The driver currently assumes that the notify callback is only received
when the device is done with all the queued buffers.
However, this is not true, since the notify callback could be called
without any of the queued buffers being completed (for example, with
virtio-pci and shared interrupts) or with only some of the buffers being
completed (since the driver makes them available to the device in
multiple separate virtqueue_add_sgs() calls).
This can lead to incorrect data on the I2C bus or memory corruption in
the guest if the device operates on buffers which are have been freed by
the driver. (The WARN_ON in the driver is also triggered.)
BUG kmalloc-128 (Tainted: G W ): Poison overwritten
First byte 0x0 instead of 0x6b
Allocated in i2cdev_ioctl_rdwr+0x9d/0x1de age=243 cpu=0 pid=28
memdup_user+0x2e/0xbd
i2cdev_ioctl_rdwr+0x9d/0x1de
i2cdev_ioctl+0x247/0x2ed
vfs_ioctl+0x21/0x30
sys_ioctl+0xb18/0xb41
Freed in i2cdev_ioctl_rdwr+0x1bb/0x1de age=68 cpu=0 pid=28
kfree+0x1bd/0x1cc
i2cdev_ioctl_rdwr+0x1bb/0x1de
i2cdev_ioctl+0x247/0x2ed
vfs_ioctl+0x21/0x30
sys_ioctl+0xb18/0xb41
Fix this by calling virtio_get_buf() from the notify handler like other
virtio drivers and by actually waiting for all the buffers to be
completed. |
| In the Linux kernel, the following vulnerability has been resolved:
ethtool: do not perform operations on net devices being unregistered
There is a short period between a net device starts to be unregistered
and when it is actually gone. In that time frame ethtool operations
could still be performed, which might end up in unwanted or undefined
behaviours[1].
Do not allow ethtool operations after a net device starts its
unregistration. This patch targets the netlink part as the ioctl one
isn't affected: the reference to the net device is taken and the
operation is executed within an rtnl lock section and the net device
won't be found after unregister.
[1] For example adding Tx queues after unregister ends up in NULL
pointer exceptions and UaFs, such as:
BUG: KASAN: use-after-free in kobject_get+0x14/0x90
Read of size 1 at addr ffff88801961248c by task ethtool/755
CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014
Call Trace:
dump_stack_lvl+0x57/0x72
print_address_description.constprop.0+0x1f/0x140
kasan_report.cold+0x7f/0x11b
kobject_get+0x14/0x90
kobject_add_internal+0x3d1/0x450
kobject_init_and_add+0xba/0xf0
netdev_queue_update_kobjects+0xcf/0x200
netif_set_real_num_tx_queues+0xb4/0x310
veth_set_channels+0x1c3/0x550
ethnl_set_channels+0x524/0x610 |
| In the Linux kernel, the following vulnerability has been resolved:
udp: fix race between close() and udp_abort()
Kaustubh reported and diagnosed a panic in udp_lib_lookup().
The root cause is udp_abort() racing with close(). Both
racing functions acquire the socket lock, but udp{v6}_destroy_sock()
release it before performing destructive actions.
We can't easily extend the socket lock scope to avoid the race,
instead use the SOCK_DEAD flag to prevent udp_abort from doing
any action when the critical race happens.
Diagnosed-and-tested-by: Kaustubh Pandey <kapandey@codeaurora.org> |
| In the Linux kernel, the following vulnerability has been resolved:
ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
do_mq_timedreceive calls wq_sleep with a stack local address. The
sender (do_mq_timedsend) uses this address to later call pipelined_send.
This leads to a very hard to trigger race where a do_mq_timedreceive
call might return and leave do_mq_timedsend to rely on an invalid
address, causing the following crash:
RIP: 0010:wake_q_add_safe+0x13/0x60
Call Trace:
__x64_sys_mq_timedsend+0x2a9/0x490
do_syscall_64+0x80/0x680
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f5928e40343
The race occurs as:
1. do_mq_timedreceive calls wq_sleep with the address of `struct
ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it
holds a valid `struct ext_wait_queue *` as long as the stack has not
been overwritten.
2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and
do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
__pipelined_op.
3. Sender calls __pipelined_op::smp_store_release(&this->state,
STATE_READY). Here is where the race window begins. (`this` is
`ewq_addr`.)
4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it
will see `state == STATE_READY` and break.
5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed
to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's
stack. (Although the address may not get overwritten until another
function happens to touch it, which means it can persist around for an
indefinite time.)
6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a
`struct ext_wait_queue *`, and uses it to find a task_struct to pass to
the wake_q_add_safe call. In the lucky case where nothing has
overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct.
In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a
bogus address as the receiver's task_struct causing the crash.
do_mq_timedsend::__pipelined_op() should not dereference `this` after
setting STATE_READY, as the receiver counterpart is now free to return.
Change __pipelined_op to call wake_q_add_safe on the receiver's
task_struct returned by get_task_struct, instead of dereferencing `this`
which sits on the receiver's stack.
As Manfred pointed out, the race potentially also exists in
ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare. Fix
those in the same way. |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: compress: fix race condition of overwrite vs truncate
pos_fsstress testcase complains a panic as belew:
------------[ cut here ]------------
kernel BUG at fs/f2fs/compress.c:1082!
invalid opcode: 0000 [#1] SMP PTI
CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G OE 5.12.0-rc1-custom #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Workqueue: writeback wb_workfn (flush-252:16)
RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs]
Call Trace:
f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs]
f2fs_write_cache_pages+0x468/0x8a0 [f2fs]
f2fs_write_data_pages+0x2a4/0x2f0 [f2fs]
do_writepages+0x38/0xc0
__writeback_single_inode+0x44/0x2a0
writeback_sb_inodes+0x223/0x4d0
__writeback_inodes_wb+0x56/0xf0
wb_writeback+0x1dd/0x290
wb_workfn+0x309/0x500
process_one_work+0x220/0x3c0
worker_thread+0x53/0x420
kthread+0x12f/0x150
ret_from_fork+0x22/0x30
The root cause is truncate() may race with overwrite as below,
so that one reference count left in page can not guarantee the
page attaching in mapping tree all the time, after truncation,
later find_lock_page() may return NULL pointer.
- prepare_compress_overwrite
- f2fs_pagecache_get_page
- unlock_page
- f2fs_setattr
- truncate_setsize
- truncate_inode_page
- delete_from_page_cache
- find_lock_page
Fix this by avoiding referencing updated page. |
| In the Linux kernel, the following vulnerability has been resolved:
net/smc: fix kernel panic caused by race of smc_sock
A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
but smc_release() has already freed it.
[ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
[ 4570.696048] #PF: supervisor write access in kernel mode
[ 4570.696728] #PF: error_code(0x0002) - not-present page
[ 4570.697401] PGD 0 P4D 0
[ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
[ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
[ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30
<...>
[ 4570.711446] Call Trace:
[ 4570.711746] <IRQ>
[ 4570.711992] smc_cdc_tx_handler+0x41/0xc0
[ 4570.712470] smc_wr_tx_tasklet_fn+0x213/0x560
[ 4570.712981] ? smc_cdc_tx_dismisser+0x10/0x10
[ 4570.713489] tasklet_action_common.isra.17+0x66/0x140
[ 4570.714083] __do_softirq+0x123/0x2f4
[ 4570.714521] irq_exit_rcu+0xc4/0xf0
[ 4570.714934] common_interrupt+0xba/0xe0
Though smc_cdc_tx_handler() checked the existence of smc connection,
smc_release() may have already dismissed and released the smc socket
before smc_cdc_tx_handler() further visits it.
smc_cdc_tx_handler() |smc_release()
if (!conn) |
|
|smc_cdc_tx_dismiss_slots()
| smc_cdc_tx_dismisser()
|
|sock_put(&smc->sk) <- last sock_put,
| smc_sock freed
bh_lock_sock(&smc->sk) (panic) |
To make sure we won't receive any CDC messages after we free the
smc_sock, add a refcount on the smc_connection for inflight CDC
message(posted to the QP but haven't received related CQE), and
don't release the smc_connection until all the inflight CDC messages
haven been done, for both success or failed ones.
Using refcount on CDC messages brings another problem: when the link
is going to be destroyed, smcr_link_clear() will reset the QP, which
then remove all the pending CQEs related to the QP in the CQ. To make
sure all the CQEs will always come back so the refcount on the
smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced
by smc_ib_modify_qp_error().
And remove the timeout in smc_wr_tx_wait_no_pending_sends() since we
need to wait for all pending WQEs done, or we may encounter use-after-
free when handling CQEs.
For IB device removal routine, we need to wait for all the QPs on that
device been destroyed before we can destroy CQs on the device, or
the refcount on smc_connection won't reach 0 and smc_sock cannot be
released. |
| Windows Hyper-V Remote Code Execution Vulnerability |
| Windows USB Print Driver Elevation of Privilege Vulnerability |
| Windows Telephony Server Elevation of Privilege Vulnerability |
| Windows Telephony Server Elevation of Privilege Vulnerability |
| Windows Update Stack Elevation of Privilege Vulnerability |
| Windows USB Print Driver Elevation of Privilege Vulnerability |
| Visual Studio Denial of Service Vulnerability |