v6.9-rc1-scx1 #20

Byte-Lab · 2024-03-28T07:53:55Z

Note: None of the schedulers in the scx repo will run on this kernel. We should update them before we do a formal release now that the libbpf backwards compat stuff has been released. All of the selftests do run and pass.

Scx cleanups from split

scx_rusty: doc comment update

scx: Update print_scx_info() comment

- p->scx.runnable_at is in jiffies and rq->clock is in ktime ns. Subtracting the two doesn't yield anything useful. Also, it's more intuitive for negative delta to represent past. Fix delta calculation.
- ops_state is always 0 for running tasks. Let's skip it for now.
- Use return value from copy_from_kernel_nofault() to determine whether the read was successful and clearly report read failures.
- scx_enabled() is always nested inside scx_ops_enable_state() != DISABLED. Let's just test the latter.
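The delta fix in the first item can be illustrated with a small userspace sketch. It assumes both timestamps are already in the same tick unit and that a negative result should mean "in the past". The names `sketch_ticks_delta()` and `sketch_ticks_to_ms()` and the `hz` parameter are hypothetical and are not taken from the kernel change.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Signed distance from "now" to "runnable_at", both in ticks.
 * The unsigned subtraction wraps modulo 2^64, and the cast to a signed type
 * (two's complement on the usual compilers) recovers the sign as long as the
 * two values are less than half the counter range apart.
 * A negative result means runnable_at lies in the past.
 */
static int64_t sketch_ticks_delta(uint64_t runnable_at, uint64_t now)
{
	return (int64_t)(runnable_at - now);
}

/* Convert a signed tick delta to milliseconds for a given tick rate. */
static int64_t sketch_ticks_to_ms(int64_t ticks, int64_t hz)
{
	assert(hz > 0);
	return ticks * 1000 / hz;
}
```

With a tick rate of 250, a task that became runnable 250 ticks ago yields a delta of -250 ticks, i.e. -1000 ms.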

scx: Update print_scx_info()

rusty: Improve overview documentation as suggested by Josh Don

The new print_scx_info() uses scx_ops_enable_state_str[] outside CONFIG_SCHED_DEBUG. Let's relocate it outside of CONFIG_SCHED_DEBUG and to the top.

Reported-by: Changwoo Min <[email protected]>
Reported-by: Andrea Righi <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>

scx: Move scx_ops_enable_state_str[] outside CONFIG_SCHED_DEBUG

scx: Fix a straggling atomic64_set

…g schedulers This is to make life easier for the user sched/tools repo which uses meson to build.

The availability of s/uSIZE types is hit and miss. Let's always define them in terms of stdint types. This makes life easier for the scx user repo.
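As a rough sketch of what defining the s/uSIZE types in terms of stdint types can look like (the exact contents of scx_common.h may differ, and the compile-time checks are an addition for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned fixed-width aliases. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Signed fixed-width aliases. */
typedef int8_t  s8;
typedef int16_t s16;
typedef int32_t s32;
typedef int64_t s64;

/* Compile-time checks that each alias has the expected width (C11). */
static_assert(sizeof(u8) == 1 && sizeof(s8) == 1, "8-bit aliases");
static_assert(sizeof(u16) == 2 && sizeof(s16) == 2, "16-bit aliases");
static_assert(sizeof(u32) == 4 && sizeof(s32) == 4, "32-bit aliases");
static_assert(sizeof(u64) == 8 && sizeof(s64) == 8, "64-bit aliases");
```

Because the aliases depend only on `<stdint.h>`, they resolve the same way regardless of which system headers happen to provide (or omit) these names.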

Misc updates for example schedulers to make life easier for user sched repo

Currently, skel files are put in src/bpf/.output. Place them inside $OUT_DIR where build artifacts belong.

…move .rs.h file

… rust userland schedulers
- NAME_sys and NAME were used to refer to the rust wrapper of the bindgen-generated header file and the bpf skeleton, respectively. The NAME part is self-referential and thus doesn't really signify anything and the _sys suffix is arbitrary too. Let's use bpf_intf and bpf_skel instead.
- The env vars that are used during build are a bit unusual and the SCX_RUST_CLANG name is a bit confusing as it doesn't indicate it's for compiling BPF. Let's use the names BPF_CLANG and BPF_CFLAGS instead.
- build.rs is now identical between the two schedulers.

… explicit paths from includes So that build env can decide where to put these headers.

This greatly simplifies build.rs and allows building more common logic into build_helpers such as discovering BPF_CFLAGS on its own without depending on the upper level Makefile. Some caveats:
- Dropped static libbpf-sys dep. scx_utils is out of kernel tree and pulls in libbpf-sys through libbpf-cargo which conflicts with the explicit libbpf-sys dependency. This means that we use the packaged version of libbpf-cargo for skel generation. Should be fine.
- Path dependency for scx_utils is temporary during development. Should be dropped later.

scx: Common include files relocated and more build updates

scx_sync: Sync scheduler changes from https://github.com/sched-ext/scx

Internal DSQs, i.e. SCX_DSQ_LOCAL and SCX_DSQ_GLOBAL, have somewhat special behavior in that they're automatically consumed by the internal ext.c logic. A user could therefore accidentally starve tasks on either of the DSQs if they dispatch to both the vtime and FIFO queues, as they're consumed in a specific order by the internal logic.

It likely doesn't make sense to ever use both FIFO and PRIQ ordering in the same DSQ, so let's explicitly disable it for the internal DSQs. In a follow-on change, we'll error out a scheduler if a user dispatches to both FIFO and vtime for any DSQ.

Reported-by: Changwoo Min <[email protected]>
Signed-off-by: David Vernet <[email protected]>
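The rule described above can be modeled with a toy check. This is only a sketch: the enum values, the "ids below a base are built-in" convention, and `sketch_dispatch_allowed()` are hypothetical and do not reflect the real SCX_DSQ_* constants or the ext.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical DSQ ids; the real SCX_DSQ_* values are defined by the kernel. */
enum sketch_dsq_id {
	SKETCH_DSQ_GLOBAL,
	SKETCH_DSQ_LOCAL,
	SKETCH_DSQ_USER_BASE,	/* ids at or above this are user-created */
};

/* Queue ordering requested by a dispatch. */
enum sketch_order {
	SKETCH_FIFO,
	SKETCH_PRIQ,	/* vtime-ordered */
};

static bool sketch_dsq_is_builtin(uint64_t dsq_id)
{
	return dsq_id < SKETCH_DSQ_USER_BASE;
}

/* Built-in DSQs accept FIFO dispatches only; user DSQs accept either. */
static bool sketch_dispatch_allowed(uint64_t dsq_id, enum sketch_order order)
{
	assert(order == SKETCH_FIFO || order == SKETCH_PRIQ);
	if (order == SKETCH_PRIQ && sketch_dsq_is_builtin(dsq_id))
		return false;
	return true;
}
```

Rejecting the vtime dispatch up front means a built-in DSQ never holds tasks in two differently ordered queues, which is the situation that could starve one of them.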

Pull in bpf/for-next to receive multiple struct_ops feature

Actual feature is not implemented yet.

The default dump buffer size of 32k is okay on smaller and mostly idle systems but it's not difficult to run over. Userspace can already communicate the desired size through sched_ext_ops.exit_dump_len. This commit actually implements dynamic buffer sizing. This unfortunately turns the user_exit_info interface into a macro-fest, but at least the usage in schedulers doesn't get more complicated.
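A minimal sketch of the sizing decision, assuming that a requested length of 0 means "use the default" (that convention, and the names below, are assumptions for illustration and not the actual kernel or user_exit_info code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The 32k default mentioned in the commit message. */
#define SKETCH_DEFAULT_DUMP_LEN (32 * 1024)

/* Pick the buffer length: the requested one, or the default if none was given. */
static size_t sketch_dump_len(size_t requested)
{
	return requested ? requested : SKETCH_DEFAULT_DUMP_LEN;
}

/*
 * Allocate a zeroed dump buffer of the chosen length.
 * The length actually used is reported through len_out.
 * Returns NULL on allocation failure; the caller frees the buffer.
 */
static char *sketch_alloc_dump_buf(size_t requested, size_t *len_out)
{
	assert(len_out != NULL);
	*len_out = sketch_dump_len(requested);
	return calloc(1, *len_out);
}
```

A scheduler on a large, busy machine would pass a bigger value (for example 1 MiB) while small systems keep the default.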

I changed my GitHub user name, so let's update the link to unblock CI. Signed-off-by: David Vernet <[email protected]>

scx: Update .github workflow link

scx: Make exit debug dump buffer resizable

scx: Trivial updates from patch splitting

reset_idle_masks() is called while loading a BPF scheduler to mark all CPUs as idle for scx_bpf_select_cpu_dfl(). It previously used cpumask_setall(), which could make scx_bpf_select_cpu_dfl() pick a CPU which is possible but not online. Note that such spurious picking can only happen one time and it's generally safe to pick an ineligible CPU, so nothing should be broken but the behavior isn't ideal.

In general the initial values of idle masks aren't that important. They quickly get synchronized to the actual state through the CPUs entering and leaving the idle state. However, let's still use cpu_online_mask instead so that the idle masks are initialized with online CPUs.
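The difference between the two initializations can be shown with a toy 64-bit mask. This is a userspace sketch only: `sketch_cpumask` and the helpers below are hypothetical stand-ins and not the kernel cpumask API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy cpumask covering up to 64 CPUs; bit N set means CPU N is in the mask. */
typedef uint64_t sketch_cpumask;

/* Old behavior: mark every possible CPU as idle. */
static sketch_cpumask sketch_reset_idle_setall(unsigned int nr_possible)
{
	assert(nr_possible >= 1 && nr_possible <= 64);
	if (nr_possible == 64)
		return UINT64_MAX;
	return (UINT64_C(1) << nr_possible) - 1;
}

/* New behavior: start from the online CPUs only. */
static sketch_cpumask sketch_reset_idle_online(sketch_cpumask online)
{
	return online;
}

/* Pick the lowest-numbered idle CPU, or -1 if the mask is empty. */
static int sketch_pick_idle(sketch_cpumask idle)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (idle & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}
```

With four possible CPUs of which CPU 0 is offline (online mask `0xE`), the set-all variant picks CPU 0 while the online-mask variant picks CPU 1.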

scx: Use cpu_online_mask when resetting idle masks

The UEI macros were updated in a prior commit. Apply the changes to the sched_ext selftests dir. Signed-off-by: David Vernet <[email protected]>

Right now we're just printing what the user passes to SCX_ERROR(). This can cause the output from that error message to appear on the same line as the results output from the test runner. Let's append a newline. Signed-off-by: David Vernet <[email protected]>

scx: Update selftests to use new UEI macros

…dly NULL Make the sanity check a bit more concise and ensure that ops.cgroup_move() is never called with NULL source cgroup.

…ugh ops.cgroup_prep_move() sched_move_task() takes an early exit if the source and destination are identical. This triggers the warning in scx_cgroup_can_attach() as it leaves p->scx.cgrp_moving_from uncleared. Update the cgroup migration path so that ops.cgroup_prep_move() is skipped for identity migrations so that its invocations always match ops.cgroup_move() one-to-one.
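The one-to-one pairing can be sketched as follows. The structs, counters, and function names are hypothetical and only model the idea that both steps skip identity migrations, so the "moving from" marker is never left set.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_cgroup { int id; };

/* Per-task marker set by the prep step and cleared by the move step. */
struct sketch_task { const struct sketch_cgroup *cgrp_moving_from; };

/* Counts how often each callback would have been invoked. */
struct sketch_stats { int prep_move_calls; int move_calls; };

/* Prep step: skipped entirely when source and destination are identical. */
static void sketch_prep_move(struct sketch_task *p,
			     const struct sketch_cgroup *from,
			     const struct sketch_cgroup *to,
			     struct sketch_stats *st)
{
	if (from == to)
		return;
	p->cgrp_moving_from = from;
	st->prep_move_calls++;
}

/* Move step: takes the same early exit, otherwise consumes the marker. */
static void sketch_move_task(struct sketch_task *p,
			     const struct sketch_cgroup *from,
			     const struct sketch_cgroup *to,
			     struct sketch_stats *st)
{
	if (from == to)
		return;
	assert(p->cgrp_moving_from != NULL);
	st->move_calls++;
	p->cgrp_moving_from = NULL;
}

/* Full migration: prep followed by move. */
static void sketch_migrate(struct sketch_task *p,
			   const struct sketch_cgroup *from,
			   const struct sketch_cgroup *to,
			   struct sketch_stats *st)
{
	sketch_prep_move(p, from, to, st);
	sketch_move_task(p, from, to, st);
}
```

After any sequence of migrations the two counters stay equal and the marker is NULL, which is the invariant the warning checks for.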

scx: cgroup: Fix mismatch between `ops.cgroup_prep_move()` and `ops.cgroup_move()` invocations

Signed-off-by: David Vernet <[email protected]>

We no longer have scx_bpf_switch_all(). Let's update the test to use __COMPAT_SCX_OPS_SWITCH_PARTIAL. Along the way, make it less flaky. Signed-off-by: David Vernet <[email protected]>

Signed-off-by: David Vernet <[email protected]>

…ples use resizing of datasec maps in examples

commit a5b862c upstream.

l2cap_le_flowctl_init() can cause both div-by-zero and an integer overflow since hdev->le_mtu may not fall in the valid range.

Move MTU from hci_dev to hci_conn to validate MTU and stop the connection process earlier if MTU is invalid. Also, add a missing validation in read_buffer_size() and make it return an error value if the validation fails. Now hci_conn_add() returns ERR_PTR() as it can fail due to both a kzalloc failure and an invalid MTU value.

divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G W 6.9.0-rc5+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: hci0 hci_rx_work
RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547
Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c 89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42
RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246
RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f
RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa
R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084
R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000
FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline]
l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline]
l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline]
l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809
l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506
hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline]
hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335
worker_thread+0x926/0xe70 kernel/workqueue.c:3416
kthread+0x2e3/0x380 kernel/kthread.c:388
ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

Fixes: 6ed58ec ("Bluetooth: Use LE buffers for LE traffic")
Suggested-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:

$ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report'

I then go to the perf annotate view and quit. This triggers the asan error (from the log file):

```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f2818065991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
#0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
#1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
#2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
#3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
#4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
#5 0x55c94045c776 in ui_browser__show ui/browser.c:288
#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
#20 0x55c9400e53ad in main tools/perf/perf.c:561
#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
#0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

This frame has 1 object(s):
[32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
```

hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems like a good trade anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
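The fix pattern, copying a caller's stack string onto the heap before storing the pointer, can be sketched as below. The struct and function names are hypothetical and are not perf's; the copy helper does by hand what strdup() does, to keep the sketch within standard C.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical browser object that outlives the function building its title. */
struct sketch_browser { char *title; };

/* Heap copy of a string, equivalent in effect to strdup(). */
static char *sketch_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Store a private copy of the title instead of the caller's pointer. */
static int sketch_browser_show(struct sketch_browser *b, const char *title)
{
	char *copy;

	assert(b != NULL && title != NULL);
	copy = sketch_strdup(title);
	if (!copy)
		return -1;
	free(b->title);		/* drop any previous copy */
	b->title = copy;
	return 0;
}

/* The title buffer lives on this function's stack and dies on return. */
static int sketch_run(struct sketch_browser *b)
{
	char title[160];

	snprintf(title, sizeof(title), "Samples: %d", 42);
	return sketch_browser_show(b, title);
}
```

After `sketch_run()` returns, `b->title` still points at valid heap memory. The owner has to free it eventually, which is the leak trade-off the commit message mentions.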

[ Upstream commit a5b862c ]

l2cap_le_flowctl_init() can cause both div-by-zero and an integer overflow since hdev->le_mtu may not fall in the valid range.

Move MTU from hci_dev to hci_conn to validate MTU and stop the connection process earlier if MTU is invalid. Also, add a missing validation in read_buffer_size() and make it return an error value if the validation fails. Now hci_conn_add() returns ERR_PTR() as it can fail due to both a kzalloc failure and an invalid MTU value.

divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G W 6.9.0-rc5+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: hci0 hci_rx_work
RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547
Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c 89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42
RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246
RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f
RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa
R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084
R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000
FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline]
l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline]
l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline]
l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809
l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506
hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline]
hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335
worker_thread+0x926/0xe70 kernel/workqueue.c:3416
kthread+0x2e3/0x380 kernel/kthread.c:388
ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

Fixes: 6ed58ec ("Bluetooth: Use LE buffers for LE traffic")
Suggested-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>

htejun and others added 30 commits November 7, 2023 22:46

scx: whitespace update

39b906e

Merge pull request #80 from sched-ext/scx-cleanups-from-split

607afb6

Scx cleanups from split

scx_rusty: doc comment update

725cfa3

Merge pull request #81 from sched-ext/scx-cleanups-from-split

c818dc5

scx_rusty: doc comment update

scx: Update print_scx_info() comment

ea98edf

Merge pull request #82 from sched-ext/scx-cleanups-from-split

9a64d87

scx: Update print_scx_info() comment

Merge pull request #83 from sched-ext/scx_print_info-updates

b0d2ae0

scx: Update print_scx_info()

rusty: Improve overview documentation as suggested by Josh Don

b7e1419

Merge pull request #84 from sched-ext/rusty-doc-update

1d88c4a

rusty: Improve overview documentation as suggested by Josh Don

Merge pull request #85 from sched-ext/misc-fixes

e69323c

scx: Move scx_ops_enable_state_str[] outside CONFIG_SCHED_DEBUG

scx: Fix a straggling atomic64_set

6b245e8

Merge pull request #87 from sched-ext/atomic_long-fix

df9ef4e

scx: Fix a straggling atomic64_set

scx: Use .bpf.[sub]skel.h suffix instead of .[sub]skel.h when buildin…

70331a6

…g schedulers This is to make life easier for the user sched/tools repo which uses meson to build.

scx: Add s/uSIZE typedefs in scx_common.h

7a1c90f

The availability of s/uSIZE types is hit and miss. Let's always define them in terms of stdint types. This makes life easier for the scx user repo.

Merge pull request #88 from sched-ext/misc-updates

48b4554

Misc updates for example schedulers to make life easier for user sched repo

scx_{rusty|layered}: Generate skel file in $OUT_DIR

bc7c2af

Currently, skel files are put in src/bpf/.output. Place them inside $OUT_DIR where build artifacts belong.

scx_{rusty|layered}: ravg_read is now provided by scx_utils crate, re…

1d9acf6

…move .rs.h file

scx_{rusty|layered}: Run bindgen's clang with CLANG_CFLAGS and remove…

2d46bf9

… explicit paths from includes So that build env can decide where to put these headers.

scx_{rusty|layered}: Follow scx_utils::BpfBuilder API updates

df7ea88

scx_{layered, rusty}: Minor build updates

5f200bb

scx: Move common headers under include/scx

47c9356

scx: More include path and build updates

d6bd20a

Merge pull request #89 from sched-ext/misc-updates

f0566ba

scx: Common include files relocated and more build updates

scx_sync: Sync scheduler changes from https://github.com/sched-ext/scx

234eb2c

Merge pull request #91 from sched-ext/scx-sync

61ce4fe

scx_sync: Sync scheduler changes from https://github.com/sched-ext/scx

Byte-Lab and others added 22 commits March 6, 2024 15:04

Merge pull request #157 from sched-ext/sched_ext-base

57eae53

Pull in bpf/for-next to receive multiple struct_ops feature

scx: Add sched_ext_ops.exit_dump_len and add userspace plumbing

c39b30f

Actual feature is not implemented yet.

compat.bpf.h: Add DEFINE_SCX_OPS()

910e089

compat: Introduce SCX_LOAD/ATTACH() and apply to all example schedulers

37402bb

scx: Update .github workflow link

9204fc3

I changed my GitHub user name, so let's update the link to unblock CI. Signed-off-by: David Vernet <[email protected]>

Merge pull request #159 from sched-ext/ci

fd154f5

scx: Update .github workflow link

scx examples: Drop stray "[-p]" from usage help messages

541f061

Merge pull request #158 from sched-ext/htejun

13f2f03

scx: Make exit debug dump buffer resizable

scx: Trivial updates from patch splitting

68fc8d6

Merge pull request #160 from sched-ext/htejun

aea1cd0

scx: Trivial updates from patch splitting

Merge pull request #161 from sched-ext/htejun

925e4d8

scx: Use cpu_online_mask when resetting idle masks

scx: Update selftests to use new UEI macros

7b1e063

The UEI macros were updated in a prior commit. Apply the changes to the sched_ext selftests dir. Signed-off-by: David Vernet <[email protected]>

Merge pull request #163 from sched-ext/fix_selftests

bbb65a3

scx: Update selftests to use new UEI macros

scx: Improve error behavior when p->scx.cgrp_moving_from is unexpecte…

6d5da8c

…dly NULL Make the sanity check a bit more concise and ensure that ops.cgroup_move() is never called with NULL source cgroup.

Merge pull request #165 from sched-ext/htejun

fc86083

scx: cgroup: Fix mismatch between `ops.cgroup_prep_move()` and `ops.cgroup_move()` invocations

Merge commit 'fc86083986063457029fe1039bbf1632d2fdca2b' into scx-6.9-rc1

82aa233

Signed-off-by: David Vernet <[email protected]>

scx: Fix init_enable_count

f95ba24

We no longer have scx_bpf_switch_all(). Let's update the test to use __COMPAT_SCX_OPS_SWITCH_PARTIAL. Along the way, make it less flaky. Signed-off-by: David Vernet <[email protected]>

v6.9-rc1-scx1

4706b5a

Signed-off-by: David Vernet <[email protected]>

Byte-Lab requested a review from htejun March 28, 2024 07:53

htejun approved these changes Mar 28, 2024

htejun merged commit dd11fee into scx-6.9rc.y Mar 28, 2024
2 checks passed

htejun deleted the scx-6.9-rc1 branch March 28, 2024 16:26

htejun pushed a commit that referenced this pull request Mar 28, 2024

Merge pull request #20 from inwardvessel/resize_percpu_arrays_in_exam…

8ade500

…ples use resizing of datasec maps in examples
