v6.9-rc1-scx1 #20

Merged
merged 512 commits into from
Mar 28, 2024

Byte-Lab
Contributor

Note: None of the schedulers in the scx repo will run on this kernel. We should update them before we do a formal release now that the libbpf backwards compat stuff has been released. All of the selftests do run and pass.

htejun and others added 30 commits November 7, 2023 22:46
- p->scx.runnable_at is in jiffies and rq->clock is in ktime ns. Subtracting
  the two doesn't yield anything useful. Also, it's more intuitive for
  negative delta to represent past. Fix delta calculation.

- ops_state is always 0 for running tasks. Let's skip it for now.

- Use return value from copy_from_kernel_nofault() to determine whether the
  read was successful and clearly report read failures.

- scx_enabled() is always nested inside scx_ops_enable_state() != DISABLED.
  Let's just test the latter.
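The delta fix in the first bullet above can be illustrated with a small userspace sketch. It assumes both timestamps are in jiffies and that the tick rate is passed in as `hz`; the helper name `jiffies_delta_ms()` and its parameters are hypothetical and are not taken from the kernel source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: signed distance in milliseconds from `now` to `at`,
 * both expressed in jiffies. A negative result means `at` is in the past.
 * Subtracting as unsigned and then casting to signed keeps the result
 * correct across counter wraparound, as long as the two values are less
 * than half the counter range apart.
 */
static int64_t jiffies_delta_ms(uint64_t at, uint64_t now, uint32_t hz)
{
	int64_t delta = (int64_t)(at - now);

	return delta * 1000 / (int64_t)hz;
}
```

Both operands share one unit before the subtraction, which is the point of the fix: a jiffies value minus a nanosecond clock has no meaningful unit.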
rusty: Improve overview documentation as suggested by Josh Don
The new print_scx_info() uses scx_ops_enable_state_str[] outside
CONFIG_SCHED_DEBUG. Let's relocate it outside of CONFIG_SCHED_DEBUG and move
it to the top.

Reported-by: Changwoo Min <[email protected]>
Reported-by: Andrea Righi <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
scx: Move scx_ops_enable_state_str[] outside CONFIG_SCHED_DEBUG
scx: Fix a straggling atomic64_set
…g schedulers

This is to make life easier for the user sched/tools repo which uses meson
to build.
The availability of s/uSIZE types is hit and miss. Let's always define them
in terms of stdint types. This makes life easier for the scx user repo.
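As a rough sketch of what defining the s/uSIZE types in terms of stdint types can look like: the exact set of aliases in the tree is not shown in this commit message, so the list below is an assumption.

```c
#include <stdint.h>

/* Fixed-width aliases defined purely in terms of <stdint.h> types. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

typedef int8_t   s8;
typedef int16_t  s16;
typedef int32_t  s32;
typedef int64_t  s64;

/* Compile-time checks that each alias has the expected width. */
_Static_assert(sizeof(u8) == 1 && sizeof(s8) == 1, "8-bit aliases");
_Static_assert(sizeof(u16) == 2 && sizeof(s16) == 2, "16-bit aliases");
_Static_assert(sizeof(u32) == 4 && sizeof(s32) == 4, "32-bit aliases");
_Static_assert(sizeof(u64) == 8 && sizeof(s64) == 8, "64-bit aliases");
```

Because the aliases depend only on `<stdint.h>`, they build the same way regardless of which kernel or libc headers are visible to the build system.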
Misc updates for example schedulers to make life easier for user sched repo
Currently, skel files are put in src/bpf/.output. Place them inside $OUT_DIR
where build artifacts belong.
… rust userland schedulers

- NAME_sys and NAME were used to refer to the Rust wrapper of the
  bindgen-generated header file and the BPF skeleton, respectively. The NAME
  part is self-referential and thus doesn't really signify anything, and the
  _sys suffix is arbitrary too. Let's use bpf_intf and bpf_skel instead.

- The env vars that are used during build are a bit unusual and the
  SCX_RUST_CLANG name is a bit confusing as it doesn't indicate it's for
  compiling BPF. Let's use the names BPF_CLANG and BPF_CFLAGS instead.

- build.rs is now identical between the two schedulers.
… explicit paths from includes

So that build env can decide where to put these headers.
This greatly simplifies build.rs and allows building more common logic into
build_helpers such as discovering BPF_CFLAGS on its own without depending on
upper level Makefile. Some caveats:

- Dropped the static libbpf-sys dep. scx_utils is out of the kernel tree and
  pulls in libbpf-sys through libbpf-cargo, which conflicts with the explicit
  libbpf-sys dependency. This means that we use the packaged version of
  libbpf-cargo for skel generation. Should be fine.

- Path dependency for scx_utils is temporary during development. Should be
  dropped later.
scx: Common include files relocated and more build updates
Internal DSQs, i.e. SCX_DSQ_LOCAL and SCX_DSQ_GLOBAL, have somewhat
special behavior in that they're automatically consumed by the internal
ext.c logic. A user could therefore accidentally starve tasks on either
of the DSQs if they dispatch to both the vtime and FIFO queues, as
they're consumed in a specific order by the internal logic. It likely
doesn't make sense to ever use both FIFO and PRIQ ordering in the same
DSQ, so let's explicitly disable it for the internal DSQs. In a
follow-on change, we'll error out a scheduler if a user dispatches to
both FIFO and vtime for any DSQ.

Reported-by: Changwoo Min <[email protected]>
Signed-off-by: David Vernet <[email protected]>
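The restriction described above can be modeled with a small check. This is only a sketch: the `DEMO_` constants, the enum, and `demo_dispatch_allowed()` are hypothetical names chosen for illustration, and the ID encoding is an assumption rather than the kernel's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical IDs: the top bit marks a built-in (internal) DSQ. */
#define DEMO_DSQ_FLAG_BUILTIN (1ULL << 63)
#define DEMO_DSQ_GLOBAL       (DEMO_DSQ_FLAG_BUILTIN | 1)
#define DEMO_DSQ_LOCAL        (DEMO_DSQ_FLAG_BUILTIN | 2)

enum demo_order {
	DEMO_ORDER_FIFO,
	DEMO_ORDER_VTIME,
};

/* Returns true if a dispatch with this ordering is allowed on the DSQ. */
static bool demo_dispatch_allowed(uint64_t dsq_id, enum demo_order order)
{
	bool builtin = (dsq_id & DEMO_DSQ_FLAG_BUILTIN) != 0;

	/* Built-in DSQs accept FIFO only; user DSQs accept either ordering. */
	return !(builtin && order == DEMO_ORDER_VTIME);
}
```

Rejecting the vtime dispatch up front means a built-in DSQ never holds both orderings, which is the starvation scenario the commit message describes.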
Byte-Lab and others added 22 commits March 6, 2024 15:04
Pull in bpf/for-next to receive multiple struct_ops feature
The default dump buffer size of 32k is okay on smaller and mostly idle
systems but it's not difficult to overrun. Userspace can already
communicate the desired size through sched_ext_ops.exit_dump_len. This
commit actually implements dynamic buffer sizing. This unfortunately turns
the user_exit_info interface into a macro-fest, but at least the usage in
schedulers doesn't get more complicated.
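A minimal userspace sketch of the sizing rule: use the requested length if one was given, otherwise fall back to the 32k default. The struct and function names are hypothetical, and treating a length of 0 as "use the default" is an assumption of this sketch, not something the commit message states.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_DUMP_DFL_LEN (32 * 1024) /* the 32k default mentioned above */

struct demo_dump_buf {
	char *data;
	size_t len;
};

/*
 * Allocate a zeroed dump buffer. `requested_len` plays the role of
 * sched_ext_ops.exit_dump_len; 0 selects the default (assumption).
 * Returns 0 on success, -1 on allocation failure.
 */
static int demo_dump_buf_init(struct demo_dump_buf *buf, size_t requested_len)
{
	size_t len = requested_len ? requested_len : DEMO_DUMP_DFL_LEN;

	buf->data = calloc(1, len);
	if (!buf->data)
		return -1;
	buf->len = len;
	return 0;
}

/* Release the buffer and reset the descriptor. */
static void demo_dump_buf_free(struct demo_dump_buf *buf)
{
	free(buf->data);
	buf->data = NULL;
	buf->len = 0;
}
```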
I changed my GitHub user name, so let's update the link to unblock CI.

Signed-off-by: David Vernet <[email protected]>
scx: Update .github workflow link
scx: Make exit debug dump buffer resizable
scx: Trivial updates from patch splitting
reset_idle_masks() is called while loading a BPF scheduler to mark all CPUs
as idle for scx_bpf_select_cpu_dfl(). It used cpumask_setall(), which could
make scx_bpf_select_cpu_dfl() pick a CPU that is possible but not online.

Note that such spurious picking can only happen one time and it's generally
safe to pick an ineligible CPU, so nothing should be broken but the behavior
isn't ideal. In general the initial values of idle masks aren't that
important. They quickly get synchronized to the actual state through the
CPUs entering and leaving the idle state.

However, let's still use cpu_online_mask instead so that the idle masks are
initialized with online CPUs.
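The difference between the two initial values can be shown with a toy bitmask model. This is a sketch under stated assumptions: `demo_cpumask_t`, `demo_cpu_test()`, and `demo_pick_idle_cpu()` are hypothetical stand-ins, limited to 64 CPUs, and do not reproduce the kernel's cpumask API or its CPU selection policy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU mask: bit N set means CPU N is in the set (up to 64 CPUs). */
typedef uint64_t demo_cpumask_t;

/* Returns true if `cpu` is set in `mask`. */
static bool demo_cpu_test(demo_cpumask_t mask, int cpu)
{
	return (mask >> cpu) & 1ULL;
}

/* Pick the lowest-numbered CPU marked idle, or -1 if none is idle. */
static int demo_pick_idle_cpu(demo_cpumask_t idle)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (demo_cpu_test(idle, cpu))
			return cpu;
	return -1;
}
```

With possible CPUs 0-3 (`0xF`) and CPU 0 offline (online mask `0xE`), seeding the idle mask with every possible CPU makes the picker return CPU 0, which is not online. Seeding it with the online mask returns CPU 1.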
scx: Use cpu_online_mask when resetting idle masks
The UEI macros were updated in a prior commit. Apply the changes to the
sched_ext selftests dir.

Signed-off-by: David Vernet <[email protected]>
Right now we're just printing what the user passes to SCX_ERROR(). This
can cause the output from that error message to appear on the same line
as the results output from the test runner. Let's append a newline.

Signed-off-by: David Vernet <[email protected]>
scx: Update selftests to use new UEI macros
…dly NULL

Make the sanity check a bit more concise and ensure that ops.cgroup_move()
is never called with NULL source cgroup.
…ugh ops.cgroup_prep_move()

sched_move_task() takes an early exit if the source and destination are
identical. This triggers the warning in scx_cgroup_can_attach() as it leaves
p->scx.cgrp_moving_from uncleared.

Update the cgroup migration path to skip ops.cgroup_prep_move() for identity
migrations so that its invocations always match ops.cgroup_move()
one-to-one.
scx: cgroup: Fix mismatch between `ops.cgroup_prep_move()` and `ops.cgroup_move()` invocations
We no longer have scx_bpf_switch_all(). Let's update the test to use
__COMPAT_SCX_OPS_SWITCH_PARTIAL. Along the way, make it less flaky.

Signed-off-by: David Vernet <[email protected]>
Signed-off-by: David Vernet <[email protected]>
@Byte-Lab Byte-Lab requested a review from htejun March 28, 2024 07:53
@htejun htejun merged commit dd11fee into scx-6.9rc.y Mar 28, 2024
2 checks passed
@htejun htejun deleted the scx-6.9-rc1 branch March 28, 2024 16:26
htejun pushed a commit that referenced this pull request Mar 28, 2024
…ples

use resizing of datasec maps in examples
Byte-Lab pushed a commit that referenced this pull request Jun 3, 2024
commit a5b862c upstream.

l2cap_le_flowctl_init() can cause both div-by-zero and an integer
overflow since hdev->le_mtu may not fall in the valid range.

Move MTU from hci_dev to hci_conn to validate MTU and stop the connection
process earlier if MTU is invalid.
Also, add a missing validation in read_buffer_size() and make it return
an error value if the validation fails.
Now hci_conn_add() returns ERR_PTR(), as it can fail due to either a
kzalloc failure or an invalid MTU value.
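The idea of validating the MTU before it reaches a division can be sketched in plain C. Everything here is illustrative: the `DEMO_` constants, the credit formula, and `demo_le_rx_credits()` are assumptions made for this sketch and should not be read as the actual Bluetooth stack code.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MIN_LE_MTU     0x001b /* assumed lower bound for this sketch */
#define DEMO_L2CAP_HDR_SIZE 4      /* assumed header size for this sketch */

/*
 * Compute initial receive credits as imtu / mps + 1, where mps is capped by
 * the link MTU minus the header size. Rejecting a too-small MTU up front
 * guarantees that mps is non-zero (no divide error) and that the
 * subtraction cannot wrap around.
 * Returns the credit count, or -EINVAL if an input is out of range.
 */
static int demo_le_rx_credits(uint16_t link_mtu, uint16_t imtu)
{
	uint16_t mps;

	if (link_mtu < DEMO_MIN_LE_MTU || imtu == 0)
		return -EINVAL;

	mps = (uint16_t)(link_mtu - DEMO_L2CAP_HDR_SIZE);
	if (mps > imtu)
		mps = imtu;

	return imtu / mps + 1;
}
```

Failing early with an error code mirrors the approach in the commit message: the invalid value is rejected where the connection is set up, so later arithmetic never sees it.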

divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G        W          6.9.0-rc5+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: hci0 hci_rx_work
RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547
Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c
89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d
b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42
RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246
RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f
RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa
R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084
R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000
FS:  0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <TASK>
 l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline]
 l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline]
 l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline]
 l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809
 l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506
 hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline]
 hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335
 worker_thread+0x926/0xe70 kernel/workqueue.c:3416
 kthread+0x2e3/0x380 kernel/kthread.c:388
 ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

Fixes: 6ed58ec ("Bluetooth: Use LE buffers for LE traffic")
Suggested-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Byte-Lab pushed a commit that referenced this pull request Jun 21, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.
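The pattern behind the fix can be sketched outside of perf. The `demo_` names below are hypothetical, and `demo_strdup()` is a portable stand-in for strdup(); this is a minimal illustration of copying a caller's stack string into owned memory, not the actual perf change.

```c
#include <stdlib.h>
#include <string.h>

struct demo_browser {
	char *title; /* owned heap copy, or NULL */
};

/* Portable stand-in for strdup(): returns a heap copy of `s`, or NULL. */
static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Store a private copy of `title` so the browser never points into the
 * caller's stack frame. Returns 0 on success, -1 if the copy fails.
 */
static int demo_browser_show(struct demo_browser *b, const char *title)
{
	char *copy = demo_strdup(title);

	if (!copy)
		return -1;
	free(b->title); /* drop any previous copy */
	b->title = copy;
	return 0;
}

/* Release the owned title so the copy is not leaked. */
static void demo_browser_exit(struct demo_browser *b)
{
	free(b->title);
	b->title = NULL;
}

/* Caller whose title buffer lives only in its own stack frame. */
static int demo_run(struct demo_browser *b)
{
	char title[160];

	strcpy(title, "Samples: demo");
	return demo_browser_show(b, title);
}
```

After `demo_run()` returns, `b->title` still points at valid heap memory, whereas storing the `title` pointer directly would leave it dangling. The explicit `demo_browser_exit()` is what avoids trading the use-after-return for a leak.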

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f2818065991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    #6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    #7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    #8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    #9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    #10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    #11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    #12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    #13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    #14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    #15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    #16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    #17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    #18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    #19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    #20 0x55c9400e53ad in main tools/perf/perf.c:561
    #21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    #22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    #23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems like a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Byte-Lab pushed a commit that referenced this pull request Jun 21, 2024
[ Upstream commit a5b862c ]

l2cap_le_flowctl_init() can cause both div-by-zero and an integer
overflow since hdev->le_mtu may not fall in the valid range.

Move MTU from hci_dev to hci_conn to validate MTU and stop the connection
process earlier if MTU is invalid.
Also, add a missing validation in read_buffer_size() and make it return
an error value if the validation fails.
Now hci_conn_add() returns ERR_PTR(), as it can fail due to either a
kzalloc failure or an invalid MTU value.

divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G        W          6.9.0-rc5+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: hci0 hci_rx_work
RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547
Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c
89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d
b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42
RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246
RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f
RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa
R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084
R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000
FS:  0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <TASK>
 l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline]
 l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline]
 l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline]
 l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809
 l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506
 hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline]
 hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335
 worker_thread+0x926/0xe70 kernel/workqueue.c:3416
 kthread+0x2e3/0x380 kernel/kthread.c:388
 ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

Fixes: 6ed58ec ("Bluetooth: Use LE buffers for LE traffic")
Suggested-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
5 participants