Message ID | 20180925172043.20248-5-julien.grall@arm.com |
---|---|
State | Superseded |
Headers | show |
Series | xen/arm: SMCCC fixup and improvement | expand |
On Tue, 25 Sep 2018, Julien Grall wrote: > Some capababilities are set right during boot and will never change > afterwards. At the moment, the function cpu_have_caps will check whether > the cap is enabled from the memory. > > It is possible to avoid the load from the memory by using an > ALTERNATIVE. With that the check is just reduced to 1 instruction. > > Signed-off-by: Julien Grall <julien.grall@arm.com> I enjoyed reading the patch :-) But I don't think it is worth going into this extreme level of optimization. test_bit is efficient enough, right? What do you think we need to use alternatives just to check one bit? > --- > > This is the static key for the poor. At some point we might want to > introduce something similar to static key in Xen. > > Changes in v2: > - Use unlikely > --- > xen/include/asm-arm/cpufeature.h | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h > index 3de6b54301..c6cbc2ec84 100644 > --- a/xen/include/asm-arm/cpufeature.h > +++ b/xen/include/asm-arm/cpufeature.h > @@ -63,6 +63,18 @@ static inline bool cpus_have_cap(unsigned int num) > return test_bit(num, cpu_hwcaps); > } > > +/* System capability check for constant cap */ > +#define cpus_have_const_cap(num) ({ \ > + bool __ret; \ > + \ > + asm volatile (ALTERNATIVE("mov %0, #0", \ > + "mov %0, #1", \ > + num) \ > + : "=r" (__ret)); \ > + \ > + unlikely(__ret); \ > + }) > + > static inline void cpus_set_cap(unsigned int num) > { > if (num >= ARM_NCAPS) > -- > 2.11.0 >
Hi Stefano, On 09/26/2018 05:53 PM, Stefano Stabellini wrote: > On Tue, 25 Sep 2018, Julien Grall wrote: >> Some capababilities are set right during boot and will never change >> afterwards. At the moment, the function cpu_have_caps will check whether >> the cap is enabled from the memory. >> >> It is possible to avoid the load from the memory by using an >> ALTERNATIVE. With that the check is just reduced to 1 instruction. >> >> Signed-off-by: Julien Grall <julien.grall@arm.com> > > I enjoyed reading the patch :-) But I don't think it is worth going > into this extreme level of optimization. test_bit is efficient enough, > right? What do you think we need to use alternatives just to check one > bit? We already have an helper using test_bit (see cpus_have_cap). However test_bit requires to load the word from the memory. This is an overhead when the decision never change after boot. One load may be insignificant by itself (thought may have a cache miss), but if you are in hotpath this is a win in long term. The mechanism is very similar to static key but for the poor (I don't have much time to implement better for now). We already use similar construction on other places (see CHECK_WORKAROUND_HELPER). Cheers,
On Wed, 26 Sep 2018, Julien Grall wrote: > Hi Stefano, > > On 09/26/2018 05:53 PM, Stefano Stabellini wrote: > > On Tue, 25 Sep 2018, Julien Grall wrote: > > > Some capababilities are set right during boot and will never change > > > afterwards. At the moment, the function cpu_have_caps will check whether > > > the cap is enabled from the memory. > > > > > > It is possible to avoid the load from the memory by using an > > > ALTERNATIVE. With that the check is just reduced to 1 instruction. > > > > > > Signed-off-by: Julien Grall <julien.grall@arm.com> > > > > I enjoyed reading the patch :-) But I don't think it is worth going > > into this extreme level of optimization. test_bit is efficient enough, > > right? What do you think we need to use alternatives just to check one > > bit? > > We already have an helper using test_bit (see cpus_have_cap). However test_bit > requires to load the word from the memory. This is an overhead when the > decision never change after boot. > > One load may be insignificant by itself (thought may have a cache miss), but > if you are in hotpath this is a win in long term. > > The mechanism is very similar to static key but for the poor (I don't have > much time to implement better for now). We already use similar construction on > other places (see CHECK_WORKAROUND_HELPER). I like the mechanism and the little ALTERNATIVE trick. I can see it could very useful in some cases. It is just that it is a bit of a waste to use it on SMCCC virtualization -- SMCCC calls cannot be done often enough for this to make a difference. But who am I to complain when you make things faster? :-)
Hi, On 09/27/2018 11:34 PM, Stefano Stabellini wrote: > On Wed, 26 Sep 2018, Julien Grall wrote: >> Hi Stefano, >> >> On 09/26/2018 05:53 PM, Stefano Stabellini wrote: >>> On Tue, 25 Sep 2018, Julien Grall wrote: >>>> Some capababilities are set right during boot and will never change >>>> afterwards. At the moment, the function cpu_have_caps will check whether >>>> the cap is enabled from the memory. >>>> >>>> It is possible to avoid the load from the memory by using an >>>> ALTERNATIVE. With that the check is just reduced to 1 instruction. >>>> >>>> Signed-off-by: Julien Grall <julien.grall@arm.com> >>> >>> I enjoyed reading the patch :-) But I don't think it is worth going >>> into this extreme level of optimization. test_bit is efficient enough, >>> right? What do you think we need to use alternatives just to check one >>> bit? >> >> We already have an helper using test_bit (see cpus_have_cap). However test_bit >> requires to load the word from the memory. This is an overhead when the >> decision never change after boot. >> >> One load may be insignificant by itself (thought may have a cache miss), but >> if you are in hotpath this is a win in long term. >> >> The mechanism is very similar to static key but for the poor (I don't have >> much time to implement better for now). We already use similar construction on >> other places (see CHECK_WORKAROUND_HELPER). > > I like the mechanism and the little ALTERNATIVE trick. I can see it > could very useful in some cases. It is just that it is a bit of a waste > to use it on SMCCC virtualization -- SMCCC calls cannot be done often > enough for this to make a difference. I would not be so sure ;). SMC can be called at *every* entry and exit into the hypervisor to apply SSBD workaround. While SSBD will directly call arm_smccc_1_1_smc (and not the generic one), I would not rule out more hotpath with SMC call in it. Cheers,
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h index 3de6b54301..c6cbc2ec84 100644 --- a/xen/include/asm-arm/cpufeature.h +++ b/xen/include/asm-arm/cpufeature.h @@ -63,6 +63,18 @@ static inline bool cpus_have_cap(unsigned int num) return test_bit(num, cpu_hwcaps); } +/* System capability check for constant cap */ +#define cpus_have_const_cap(num) ({ \ + bool __ret; \ + \ + asm volatile (ALTERNATIVE("mov %0, #0", \ + "mov %0, #1", \ + num) \ + : "=r" (__ret)); \ + \ + unlikely(__ret); \ + }) + static inline void cpus_set_cap(unsigned int num) { if (num >= ARM_NCAPS)
Some capababilities are set right during boot and will never change afterwards. At the moment, the function cpu_have_caps will check whether the cap is enabled from the memory. It is possible to avoid the load from the memory by using an ALTERNATIVE. With that the check is just reduced to 1 instruction. Signed-off-by: Julien Grall <julien.grall@arm.com> --- This is the static key for the poor. At some point we might want to introduce something similar to static key in Xen. Changes in v2: - Use unlikely --- xen/include/asm-arm/cpufeature.h | 12 ++++++++++++ 1 file changed, 12 insertions(+)