[RFC,00/10] Uthreads

Message ID 20250214140031.484344-1-jerome.forissier@linaro.org

Jerome Forissier Feb. 14, 2025, 2 p.m. UTC
This is a rework of the "Coroutines" RFC v2 series [1], which allowed
functions to run in parallel, more specifically the ones that rely
on udelay() to poll hardware and wait for some event to happen. With
that series we showed that some initializations could be sped up
(namely, efi_init_obj_list()).

In this new version I dropped coroutines for threads and used the USB
subsystem as a (hopefully) better example. The goal and the basic
concepts are the same but threads are likely more familiar to
programmers. The API is more self-contained: the main thread just
needs to create the threads, call a schedule function, then clean up.
Threads may yield the processor to another thread by calling the same
schedule function. When doing so they do not switch to the main thread;
they switch to the new thread directly. Another major change is the
simplification of the stack management: each thread now has its own
stack and there is no stack sharing anymore. The code is inspired by the
barebox threads [2].
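
To make the create/schedule/clean-up flow concrete, here is a minimal
user-space sketch. It is not the code from this series: it uses POSIX
ucontext instead of setjmp()/longjmp()/initjmp(), and every name in it
(demo_create(), demo_schedule(), demo_cleanup(), ...) is hypothetical.
It only illustrates the three points above: each thread has a private
stack, a yielding thread switches directly to the next runnable thread,
and the main thread simply loops on the schedule function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

#define DEMO_MAX_THREADS 4
#define DEMO_STACK_SIZE  (64 * 1024)

struct demo_thread {
	ucontext_t ctx;
	void (*fn)(void *);
	void *arg;
	void *stack;	/* private stack, never shared */
	int used, done;
};

static struct demo_thread threads[DEMO_MAX_THREADS];
static ucontext_t main_ctx;
static int current = -1;	/* -1: the main thread is running */
static char trace[16];
static int trace_len;

/* Round-robin: first runnable thread after 'from'; 'from' is tried last */
static int next_runnable(int from)
{
	for (int i = 1; i <= DEMO_MAX_THREADS; i++) {
		int idx = (from + i) % DEMO_MAX_THREADS;

		if (threads[idx].used && !threads[idx].done)
			return idx;
	}
	return -1;
}

/* Yield to the next runnable thread; returns 0 when none is left */
static int demo_schedule(void)
{
	int prev = current, next = next_runnable(prev);

	if (next < 0)
		return 0;
	if (next != prev) {
		current = next;
		swapcontext(prev < 0 ? &main_ctx : &threads[prev].ctx,
			    &threads[next].ctx);
	}
	return 1;
}

/* Entry point of every thread; runs on that thread's own stack */
static void trampoline(int idx)
{
	threads[idx].fn(threads[idx].arg);
	threads[idx].done = 1;
	current = next_runnable(idx);
	if (current >= 0)
		setcontext(&threads[current].ctx);	/* direct switch */
	/* Nothing left: returning resumes uc_link, i.e. the main thread */
}

static int demo_create(void (*fn)(void *), void *arg)
{
	for (int i = 0; i < DEMO_MAX_THREADS; i++) {
		struct demo_thread *t = &threads[i];

		if (t->used)
			continue;
		t->stack = malloc(DEMO_STACK_SIZE);
		if (!t->stack || getcontext(&t->ctx) < 0)
			return -1;
		t->ctx.uc_stack.ss_sp = t->stack;
		t->ctx.uc_stack.ss_size = DEMO_STACK_SIZE;
		t->ctx.uc_link = &main_ctx;
		t->fn = fn;
		t->arg = arg;
		t->used = 1;
		makecontext(&t->ctx, (void (*)(void))trampoline, 1, i);
		return i;
	}
	return -1;
}

static void demo_cleanup(void)
{
	for (int i = 0; i < DEMO_MAX_THREADS; i++)
		free(threads[i].stack);
	memset(threads, 0, sizeof(threads));
}

/* Records its tag, then yields, as a hardware polling loop would */
static void worker(void *arg)
{
	for (int i = 0; i < 2; i++) {
		trace[trace_len++] = *(const char *)arg;
		demo_schedule();
	}
}

static const char *demo_run(void)
{
	static char a = 'A', b = 'B';
	int ta, tb;

	trace_len = 0;
	ta = demo_create(worker, &a);
	tb = demo_create(worker, &b);
	assert(ta >= 0 && tb >= 0);
	while (demo_schedule())
		;	/* main thread: run until every thread is done */
	demo_cleanup();
	trace[trace_len] = '\0';
	return trace;
}
```

Calling demo_run() yields the trace "ABAB": the two workers alternate
at each yield without going back through the main thread. Unlike this
sketch, the real implementation may also let the main thread take part
in the round-robin.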

The custom assembly code that was present in the coroutines series is
mostly replaced by setjmp()/longjmp(). As a result, supporting multiple
architectures is much easier, although there is still a need for a
non-standard extension to setjmp()/longjmp() called initjmp(). The new
function is added in several patches, one for each architecture that
supports HAVE_SETJMP. A new symbol is defined: HAVE_INITJMP. Two tests,
one for initjmp() and one for uthread scheduling, are added to the lib
suite. NOTE: the SANDBOX version of initjmp() appears to have problems
and needs to be worked on.
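
For readers less familiar with the primitives, the sketch below shows
plain standard C setjmp()/longjmp() (the function names demo_setjmp()
and jump_back() are made up for illustration). setjmp() can only record
the context of its caller, on the stack that is currently in use, which
is presumably why a separate initjmp() is needed to prepare a jmp_buf
that starts a given function on a fresh stack. The initjmp() prototype
is defined by the patches and is deliberately not reproduced here.

```c
#include <setjmp.h>

static jmp_buf resume_point;
static int visits;

static void jump_back(void)
{
	longjmp(resume_point, 1);	/* makes setjmp() return 1 */
}

/* Returns the number of times execution passed the setjmp() site */
static int demo_setjmp(void)
{
	visits = 0;
	if (setjmp(resume_point) == 0) {
		visits++;	/* first pass: context has just been saved */
		jump_back();	/* does not return */
	}
	visits++;		/* second pass: resumed via longjmp() */
	return visits;
}
```

demo_setjmp() returns 2: once for the direct return of setjmp() and
once for the return triggered by longjmp().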

After introducing uthreads and making udelay() a thread re-scheduling
point, the USB stack initialization is modified to benefit from
concurrency: when UTHREAD is enabled, usb_init() uses uthreads to
initialize and scan multiple buses at the same time.
The code was tested on arm64 and arm QEMU with 4 simulated XHCI buses
and some devices. On this platform the USB scan takes 2.2 s instead of
5.6 s. On an i.MX93 EVK with two USB hubs, each with one Ethernet adapter
and one webcam, "usb start" takes 2.4 s instead of 4.6 s.
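
The idea of a delay loop that doubles as a re-scheduling point can be
sketched as follows. This is a simplified model, not the lib/time.c
change itself: the clock is simulated, and demo_udelay(),
demo_schedule_hook() and the 10 us time slice are assumptions chosen
only for illustration.

```c
static unsigned long fake_now_us;	/* simulated timer, in microseconds */
static unsigned long hook_calls;

/* Stand-in for the scheduler: pretend another thread ran for 10 us */
static void demo_schedule_hook(void)
{
	hook_calls++;
	fake_now_us += 10;
}

/* Wait for at least 'usec', yielding on every iteration */
static void demo_udelay(unsigned long usec)
{
	unsigned long start = fake_now_us;

	while (fake_now_us - start < usec)
		demo_schedule_hook();
}

/* Resets the model, delays, and returns how many times the hook ran */
static unsigned long demo_udelay_run(unsigned long usec)
{
	fake_now_us = 0;
	hook_calls = 0;
	demo_udelay(usec);
	return hook_calls;
}
```

In this model demo_udelay_run(95) runs the hook 10 times and leaves the
simulated clock at 100 us. The time spent waiting is handed to other
threads, and the requested delay becomes a lower bound rather than an
exact value.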

With UTHREAD=y on qemu_arm64_defconfig the code size increases by less
than 1 KB (936 bytes exactly).

CI:
- (UTHREAD not set):
  https://source.denx.de/u-boot/custodians/u-boot-net/-/pipelines/24625
- (UTHREAD enabled for QEMU arm/arm64/riscv32/riscv64):
  https://source.denx.de/u-boot/custodians/u-boot-net/-/pipelines/24626

[1] https://lists.denx.de/pipermail/u-boot/2025-January/578779.html
[2] https://github.com/barebox/barebox/blob/master/common/bthread.c

Jerome Forissier (10):
  arch: introduce symbol HAVE_INITJMP
  arm: add initjmp()
  riscv: add initjmp()
  sandbox: add initjmp()
  test: lib: add initjmp() test
  uthread: add cooperative multi-tasking interface
  lib: time: hook uthread_schedule() into udelay()
  dm: usb: move bus initialization into new static function
    usb_init_bus()
  dm: usb: initialize and scan multiple buses simultaneously with
    uthread
  test: lib: add uthread test

 arch/Kconfig                      |   8 ++
 arch/arm/include/asm/setjmp.h     |   1 +
 arch/arm/lib/setjmp.S             |  11 ++
 arch/arm/lib/setjmp_aarch64.S     |   9 ++
 arch/riscv/include/asm/setjmp.h   |   1 +
 arch/riscv/lib/setjmp.S           |  10 ++
 arch/sandbox/cpu/Makefile         |  11 +-
 arch/sandbox/cpu/initjmp.c        | 172 ++++++++++++++++++++++++++++++
 arch/sandbox/include/asm/setjmp.h |   5 +
 drivers/usb/host/usb-uclass.c     | 167 ++++++++++++++++++++---------
 include/uthread.h                 |  31 ++++++
 lib/Kconfig                       |  19 ++++
 lib/Makefile                      |   2 +
 lib/time.c                        |  17 ++-
 lib/uthread.c                     | 108 +++++++++++++++++++
 test/boot/bootdev.c               |  14 +--
 test/boot/bootflow.c              |   3 +-
 test/lib/Makefile                 |   2 +
 test/lib/initjmp.c                |  72 +++++++++++++
 test/lib/uthread.c                |  58 ++++++++++
 20 files changed, 660 insertions(+), 61 deletions(-)
 create mode 100644 arch/sandbox/cpu/initjmp.c
 create mode 100644 include/uthread.h
 create mode 100644 lib/uthread.c
 create mode 100644 test/lib/initjmp.c
 create mode 100644 test/lib/uthread.c