Message ID | 20230905232757.36459-1-wahrenst@gmx.net |
---|---|
State | Accepted |
Commit | b58a36008bfa1aadf55f516bcbfae40c779eb54b |
Headers | show |
Series | [V2] hwrng: bcm2835: Fix hwrng throughput regression | expand |
On Wed, Sep 06, 2023 at 01:27:57AM +0200, Stefan Wahren wrote: > The last RCU stall fix caused a massive throughput regression of the > hwrng on Raspberry Pi 0 - 3. hwrng_msleep doesn't sleep precisely enough > and usleep_range doesn't allow scheduling. So try to restore the > best possible throughput by introducing hwrng_yield which interruptable > sleeps for one jiffy. > > Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig): > > sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000 > > cpu_relax ~138025 Bytes / sec > hwrng_msleep(1000) ~13 Bytes / sec > hwrng_yield ~2510 Bytes / sec > > Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()") > Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/ > Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Thanks, looks good to me. Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
On Wed, Sep 06, 2023 at 01:27:57AM +0200, Stefan Wahren wrote: > The last RCU stall fix caused a massive throughput regression of the > hwrng on Raspberry Pi 0 - 3. hwrng_msleep doesn't sleep precisely enough > and usleep_range doesn't allow scheduling. So try to restore the > best possible throughput by introducing hwrng_yield which interruptable > sleeps for one jiffy. > > Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig): > > sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000 > > cpu_relax ~138025 Bytes / sec > hwrng_msleep(1000) ~13 Bytes / sec > hwrng_yield ~2510 Bytes / sec > > Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()") > Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/ > Signed-off-by: Stefan Wahren <wahrenst@gmx.net> > --- > > Changes in V2: > - introduce hwrng_yield and use it > > drivers/char/hw_random/bcm2835-rng.c | 2 +- > drivers/char/hw_random/core.c | 6 ++++++ > include/linux/hw_random.h | 1 + > 3 files changed, 8 insertions(+), 1 deletion(-) Patch applied. Thanks.
[CCing Linus in case he wants to voice in to clarify how fixes for regressions that are nearly a year old should be handled] On 04.10.23 03:18, Herbert Xu wrote: > On Tue, Oct 03, 2023 at 01:35:29PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: >> >> Hi Herbert, I there a strong reason why you merged this to what from >> here looks like the branch that targets the next merge window? The patch >> fixes a regression introduced during the last 12 months and thus >> normally should not wait for the next merge window. For details see >> "Expectations and best practices for fixing regressions" in >> Documentation/process/handling-regressions.rst; or if you want to hear >> it from Linus directly, check these out: >> https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@mail.gmail.com/ >> https://lore.kernel.org/all/CAHk-=wgD98pmSK3ZyHk_d9kZ2bhgN6DuNZMAJaV0WTtbkf=RDw@mail.gmail.com/ > > This is a one-year-old regression,
diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c index e98fcac578d6..634eab4776f3 100644 --- a/drivers/char/hw_random/bcm2835-rng.c +++ b/drivers/char/hw_random/bcm2835-rng.c @@ -71,7 +71,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max, while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) { if (!wait) return 0; - hwrng_msleep(rng, 1000); + hwrng_yield(rng); } num_words = rng_readl(priv, RNG_STATUS) >> 24; diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c index f34d356fe2c0..599a4bc2c548 100644 --- a/drivers/char/hw_random/core.c +++ b/drivers/char/hw_random/core.c @@ -679,6 +679,12 @@ long hwrng_msleep(struct hwrng *rng, unsigned int msecs) } EXPORT_SYMBOL_GPL(hwrng_msleep); +long hwrng_yield(struct hwrng *rng) +{ + return wait_for_completion_interruptible_timeout(&rng->dying, 1); +} +EXPORT_SYMBOL_GPL(hwrng_yield); + static int __init hwrng_modinit(void) { int ret; diff --git a/include/linux/hw_random.h b/include/linux/hw_random.h index 8a3115516a1b..136e9842120e 100644 --- a/include/linux/hw_random.h +++ b/include/linux/hw_random.h @@ -63,5 +63,6 @@ extern void hwrng_unregister(struct hwrng *rng); extern void devm_hwrng_unregister(struct device *dve, struct hwrng *rng); extern long hwrng_msleep(struct hwrng *rng, unsigned int msecs); +extern long hwrng_yield(struct hwrng *rng); #endif /* LINUX_HWRANDOM_H_ */
The last RCU stall fix caused a massive throughput regression of the hwrng on Raspberry Pi 0 - 3. hwrng_msleep doesn't sleep precisely enough and usleep_range doesn't allow scheduling. So try to restore the best possible throughput by introducing hwrng_yield which interruptable sleeps for one jiffy. Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig): sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000 cpu_relax ~138025 Bytes / sec hwrng_msleep(1000) ~13 Bytes / sec hwrng_yield ~2510 Bytes / sec Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()") Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/ Signed-off-by: Stefan Wahren <wahrenst@gmx.net> --- Changes in V2: - introduce hwrng_yield and use it drivers/char/hw_random/bcm2835-rng.c | 2 +- drivers/char/hw_random/core.c | 6 ++++++ include/linux/hw_random.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) -- 2.34.1