Message ID | 1483991849-32448-2-git-send-email-ard.biesheuvel@linaro.org |
---|---|
State | New |
Headers | show |
On 9 January 2017 at 19:57, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > This updates the kernel mode NEON handling to allow the NEON to be used > in softirq context as well as process context. This involves disabling > softirq processing when the NEON is used in kernel mode in process context, > and dealing with the situation where 'current' is not the owner of the > userland context that is present in the NEON register file when the NEON > is enabled in kernel mode. > > The rationale for this change is that the NEON is shared with the ARMv8 > Crypto Extensions (which are also defined for the AArch32 execution state), > which can give a huge performance boost (15x) to use cases like mac80211 > CCMP processing, which executes in softirq context. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > arch/arm/vfp/vfpmodule.c | 22 ++++++++++++++------ > 1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c > index 569d5a650a4a..814752811537 100644 > --- a/arch/arm/vfp/vfpmodule.c > +++ b/arch/arm/vfp/vfpmodule.c > @@ -690,26 +690,33 @@ void kernel_neon_begin(void) > u32 fpexc; > > /* > - * Kernel mode NEON is only allowed outside of interrupt context > + * Kernel mode NEON is only allowed outside of hardirq context > * with preemption disabled. This will make sure that the kernel > * mode NEON register contents never need to be preserved. > */ > - BUG_ON(in_interrupt()); > + BUG_ON(in_irq()); > cpu = get_cpu(); > > + /* > + * Disable softirq processing while the NEON is used by the kernel in > + * process context. This ensures that only a single kernel mode NEON > + * state is live at any given time. > + */ > + if (!in_serving_softirq()) > + local_bh_disable(); > + > fpexc = fmrx(FPEXC) | FPEXC_EN; > fmxr(FPEXC, fpexc); > > /* > - * Save the userland NEON/VFP state. Under UP, > - * the owner could be a task other than 'current' > + * Save the userland NEON/VFP state. Under UP, or when executing in > + * softirq context, the owner could be a task other than 'current' > */ > if (vfp_state_in_hw(cpu, thread)) > vfp_save_state(&thread->vfpstate, fpexc); > -#ifndef CONFIG_SMP > else if (vfp_current_hw_state[cpu] != NULL) > vfp_save_state(vfp_current_hw_state[cpu], fpexc); > -#endif > + Actually, I think this should not be necessary (and the change to the comment is incorrect). Whether we're in process or softirq context makes no difference here, and the comment is slightly confusing: under SMP, the owner could also be a task other than 'current', but due to the eager preserve, the latest userland NEON state will already have been recorded, and there is no need doing it again. > vfp_current_hw_state[cpu] = NULL; > } > EXPORT_SYMBOL(kernel_neon_begin); > @@ -718,7 +725,10 @@ void kernel_neon_end(void) > { > /* Disable the NEON/VFP unit. */ > fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN); > + if (!in_serving_softirq()) > + local_bh_enable(); > put_cpu(); > + > } > EXPORT_SYMBOL(kernel_neon_end); > > -- > 2.7.4 > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Mon, Jan 09, 2017 at 07:57:28PM +0000, Ard Biesheuvel wrote: > This updates the kernel mode NEON handling to allow the NEON to be used > in softirq context as well as process context. This involves disabling > softirq processing when the NEON is used in kernel mode in process context, > and dealing with the situation where 'current' is not the owner of the > userland context that is present in the NEON register file when the NEON > is enabled in kernel mode. I really don't like this idea as-is. We have cases where kernel code accesses VFP to (eg) save or restore register state, such as during signal handling. We assume that this will not be interrupted by another user, and that if we enable access to the VFP, it will stay enabled. If it gets disabled beneath us, then things won't go well. For example, consider vfp_sync_hwstate(): vfp_sync_hwstate() vfp_state_in_hw() => true fpexc read softirq happens kernel_neon_begin() kernel_neon_end() fpexc re-enabled current register state saved out (corrupting what was there) fpexc restored, possible in an enabled state Or we could have: vfp_sync_hwstate() vfp_state_in_hw() => true softirq happens kernel_neon_begin() kernel_neon_end() fpexc read fpexc re-enabled current register state saved out (corrupting what was there) fpexc disabled Or worse: vfp_sync_hwstate() vfp_state_in_hw() => true fpexc read fpexc re-enabled softirq happens kernel_neon_begin() kernel_neon_end() current register state saved out, blowing up because VFP is unexpectedly disabled So we would need to disable softirqs around every sensitive point in the VFP support code, and over all VFP instruction emulations for those VFPs which bounce "difficult" operations to the kernel support code. > The rationale for this change is that the NEON is shared with the ARMv8 > Crypto Extensions (which are also defined for the AArch32 execution state), > which can give a huge performance boost (15x) to use cases like mac80211 > CCMP processing, which executes in softirq context. I think, once the implementation is more correct, this would need to be re-evaluated, and I'd also like other more general performance measurements as well (eg, latency.) -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On 11 January 2017 at 17:56, Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > On Mon, Jan 09, 2017 at 07:57:28PM +0000, Ard Biesheuvel wrote: >> This updates the kernel mode NEON handling to allow the NEON to be used >> in softirq context as well as process context. This involves disabling >> softirq processing when the NEON is used in kernel mode in process context, >> and dealing with the situation where 'current' is not the owner of the >> userland context that is present in the NEON register file when the NEON >> is enabled in kernel mode. > > I really don't like this idea as-is. > > We have cases where kernel code accesses VFP to (eg) save or restore > register state, such as during signal handling. We assume that this > will not be interrupted by another user, and that if we enable access > to the VFP, it will stay enabled. If it gets disabled beneath us, then > things won't go well. > > For example, consider vfp_sync_hwstate(): > > vfp_sync_hwstate() > vfp_state_in_hw() => true > fpexc read > softirq happens > kernel_neon_begin() > kernel_neon_end() > fpexc re-enabled > current register state saved out (corrupting what was there) > fpexc restored, possible in an enabled state > > Or we could have: > > vfp_sync_hwstate() > vfp_state_in_hw() => true > softirq happens > kernel_neon_begin() > kernel_neon_end() > fpexc read > fpexc re-enabled > current register state saved out (corrupting what was there) > fpexc disabled > > Or worse: > > vfp_sync_hwstate() > vfp_state_in_hw() => true > fpexc read > fpexc re-enabled > softirq happens > kernel_neon_begin() > kernel_neon_end() > current register state saved out, blowing up because VFP is > unexpectedly disabled > > So we would need to disable softirqs around every sensitive point in the > VFP support code, and over all VFP instruction emulations for those VFPs > which bounce "difficult" operations to the kernel support code. > Ah yes, I should have known it couldn't be that simple. Thanks for the critique: i will look into the impact of making these changes. >> The rationale for this change is that the NEON is shared with the ARMv8 >> Crypto Extensions (which are also defined for the AArch32 execution state), >> which can give a huge performance boost (15x) to use cases like mac80211 >> CCMP processing, which executes in softirq context. > > I think, once the implementation is more correct, this would need to > be re-evaluated, and I'd also like other more general performance > measurements as well (eg, latency.) > Re latency, I thought about adding a kernel_neon_yield(), which does a kernel_neon_end()/do_softirq()/kernel_neon_begin() sequence if any softirqs are pending, to be invoked by kernel mode NEON users at times when there are no live NEON registers. But in-kernel users of the crypto API are naturally quantised into disk sectors, pages or network packets, so I would not expect any noticeable starvation to occur. But that does mean such algorithms should not be exposed to userland (which sounds like a bad idea in any case, given that userland can simply execute the same instructions) _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Wed, Jan 11, 2017 at 06:23:10PM +0000, Ard Biesheuvel wrote: > Re latency, I thought about adding a kernel_neon_yield(), which does a > kernel_neon_end()/do_softirq()/kernel_neon_begin() sequence if any > softirqs are pending, to be invoked by kernel mode NEON users at times > when there are no live NEON registers. But in-kernel users of the > crypto API are naturally quantised into disk sectors, pages or network > packets, so I would not expect any noticeable starvation to occur. But > that does mean such algorithms should not be exposed to userland > (which sounds like a bad idea in any case, given that userland can > simply execute the same instructions) I was actually thinking about the impact of adding softirq disabling over much of the VFP code necessary to make this safe, rather than the softirq disable coming from the kernel mode neon itself. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 569d5a650a4a..814752811537 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -690,26 +690,33 @@ void kernel_neon_begin(void) u32 fpexc; /* - * Kernel mode NEON is only allowed outside of interrupt context + * Kernel mode NEON is only allowed outside of hardirq context * with preemption disabled. This will make sure that the kernel * mode NEON register contents never need to be preserved. */ - BUG_ON(in_interrupt()); + BUG_ON(in_irq()); cpu = get_cpu(); + /* + * Disable softirq processing while the NEON is used by the kernel in + * process context. This ensures that only a single kernel mode NEON + * state is live at any given time. + */ + if (!in_serving_softirq()) + local_bh_disable(); + fpexc = fmrx(FPEXC) | FPEXC_EN; fmxr(FPEXC, fpexc); /* - * Save the userland NEON/VFP state. Under UP, - * the owner could be a task other than 'current' + * Save the userland NEON/VFP state. Under UP, or when executing in + * softirq context, the owner could be a task other than 'current' */ if (vfp_state_in_hw(cpu, thread)) vfp_save_state(&thread->vfpstate, fpexc); -#ifndef CONFIG_SMP else if (vfp_current_hw_state[cpu] != NULL) vfp_save_state(vfp_current_hw_state[cpu], fpexc); -#endif + vfp_current_hw_state[cpu] = NULL; } EXPORT_SYMBOL(kernel_neon_begin); @@ -718,7 +725,10 @@ void kernel_neon_end(void) { /* Disable the NEON/VFP unit. */ fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN); + if (!in_serving_softirq()) + local_bh_enable(); put_cpu(); + } EXPORT_SYMBOL(kernel_neon_end);
This updates the kernel mode NEON handling to allow the NEON to be used in softirq context as well as process context. This involves disabling softirq processing when the NEON is used in kernel mode in process context, and dealing with the situation where 'current' is not the owner of the userland context that is present in the NEON register file when the NEON is enabled in kernel mode. The rationale for this change is that the NEON is shared with the ARMv8 Crypto Extensions (which are also defined for the AArch32 execution state), which can give a huge performance boost (15x) to use cases like mac80211 CCMP processing, which executes in softirq context. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/vfp/vfpmodule.c | 22 ++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) -- 2.7.4 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel