Commit 6ae315d37924 for kernel
commit 6ae315d37924435516d697ea7dde0b799a5928e0
Author: Andrea Righi <arighi@nvidia.com>
Date: Wed May 13 13:24:38 2026 +0200

sched_ext: Use HK_TYPE_DOMAIN_BOOT to detect isolcpus= domain isolation

scx_enable() refuses to attach a BPF scheduler when isolcpus=domain is
in effect by comparing housekeeping_cpumask(HK_TYPE_DOMAIN) against
cpu_possible_mask.

Since commit 27c3a5967f05 ("sched/isolation: Convert housekeeping
cpumasks to rcu pointers"), HK_TYPE_DOMAIN's cpumask is RCU protected
and dereferencing it requires either RCU read lock, the cpu_hotplug
write lock, or the cpuset lock; scx_enable() holds none of these, so
booting with isolcpus=domain and attaching any BPF scheduler triggers
the following lockdep splat:

  =============================
  WARNING: suspicious RCU usage
  -----------------------------
  kernel/sched/isolation.c:60 suspicious rcu_dereference_check() usage!

  1 lock held by scx_flash/281:
   #0: ffffffff8379fce0 (update_mutex){+.+.}-{4:4}, at:
       bpf_struct_ops_link_create+0x134/0x1c0

  Call Trace:
   dump_stack_lvl+0x6f/0xb0
   lockdep_rcu_suspicious.cold+0x37/0x70
   housekeeping_cpumask+0xcd/0xe0
   scx_enable.isra.0+0x17/0x120
   bpf_scx_reg+0x5e/0x80
   bpf_struct_ops_link_create+0x151/0x1c0
   __sys_bpf+0x1e4b/0x33c0
   __x64_sys_bpf+0x21/0x30
   do_syscall_64+0x117/0xf80
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

In addition, commit 03ff73510169 ("cpuset: Update HK_TYPE_DOMAIN cpumask
from cpuset") made HK_TYPE_DOMAIN include cpuset isolated partitions as
well, which means the current check also rejects BPF schedulers when a
cpuset partition is active. That contradicts the original intent of
commit 9f391f94a173 ("sched_ext: Disallow loading BPF scheduler if
isolcpus= domain isolation is in effect"), which explicitly noted that
cpuset partitions are honored through per-task cpumasks and should not
be rejected.

Switch to housekeeping_enabled(HK_TYPE_DOMAIN_BOOT), which reads only
the housekeeping flag bit (no RCU dereference) and reflects exactly the
boot-time isolcpus= configuration that the error message refers to.
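
As an illustration only (not part of the patch), the behavioral difference
between the two checks can be modeled in userspace C. Everything below is a
hypothetical stand-in: struct hk_model, the hk_model_*() helpers, the
*_check_rejects() functions and the 64-bit masks are assumptions made for
this sketch, not the kernel's housekeeping implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; they only mirror the roles described above. */
enum hk_model_type {
	HK_MODEL_DOMAIN_BOOT,	/* boot-time isolcpus=domain only */
	HK_MODEL_DOMAIN,	/* boot-time isolation plus cpuset partitions */
	HK_MODEL_TYPE_MAX,
};

struct hk_model {
	uint64_t possible;			/* models cpu_possible_mask */
	uint64_t flags;				/* one bit per type set at boot */
	uint64_t cpumasks[HK_MODEL_TYPE_MAX];	/* housekeeping CPUs per type */
};

/* Start with no isolation: every possible CPU is a housekeeping CPU. */
static struct hk_model hk_model_init(unsigned int nr_cpus)
{
	struct hk_model h;
	int i;

	assert(nr_cpus >= 1 && nr_cpus <= 64);
	h.possible = nr_cpus == 64 ? UINT64_MAX : (UINT64_C(1) << nr_cpus) - 1;
	h.flags = 0;
	for (i = 0; i < HK_MODEL_TYPE_MAX; i++)
		h.cpumasks[i] = h.possible;
	return h;
}

/* Model isolcpus=domain: sets the boot flag and shrinks both masks. */
static void hk_model_boot_isolate(struct hk_model *h, uint64_t isolated)
{
	h->flags |= UINT64_C(1) << HK_MODEL_DOMAIN_BOOT;
	h->cpumasks[HK_MODEL_DOMAIN_BOOT] &= ~isolated;
	h->cpumasks[HK_MODEL_DOMAIN] &= ~isolated;
}

/* Model a cpuset isolated partition: only the runtime mask shrinks. */
static void hk_model_cpuset_isolate(struct hk_model *h, uint64_t isolated)
{
	h->cpumasks[HK_MODEL_DOMAIN] &= ~isolated;
}

/* Old check: reject whenever the domain mask differs from possible CPUs. */
static bool old_check_rejects(const struct hk_model *h)
{
	return h->cpumasks[HK_MODEL_DOMAIN] != h->possible;
}

/* New check: reject only if the boot-time flag bit is set. */
static bool new_check_rejects(const struct hk_model *h)
{
	return (h->flags & (UINT64_C(1) << HK_MODEL_DOMAIN_BOOT)) != 0;
}
```

In this model a cpuset isolated partition changes the HK_MODEL_DOMAIN mask
but leaves the boot flag clear, so only the old check rejects the scheduler,
while boot-time isolation is rejected by both.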

Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
Cc: stable@vger.kernel.org # v7.0+
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 23f7b3f63b09..a6d0a93d8174 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7415,8 +7415,7 @@ static s32 scx_enable(struct sched_ext_ops *ops, struct bpf_link *link)
 	static DEFINE_MUTEX(helper_mutex);
 	struct scx_enable_cmd cmd;
 
-	if (!cpumask_equal(housekeeping_cpumask(HK_TYPE_DOMAIN),
-			   cpu_possible_mask)) {
+	if (housekeeping_enabled(HK_TYPE_DOMAIN_BOOT)) {
 		pr_err("sched_ext: Not compatible with \"isolcpus=\" domain isolation\n");
 		return -EINVAL;
 	}