Message ID: 20220125001114.193425-1-broonie@kernel.org
Series: arm64/sme: Initial support for the Scalable Matrix Extension
On 25/01/2022 00:10, Mark Brown wrote:
> Since all the fields in the main ID registers are 4 bits wide we have up
> until now not bothered specifying the width in the code. Since we now
> wish to use this mechanism to enumerate features from the floating point
> feature registers which do not follow this pattern add a width to the
> table. This means updating all the existing table entries but makes it
> less likely that we run into issues in future due to implicitly assuming
> a 4 bit width.
>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/include/asm/cpufeature.h |   1 +
>  arch/arm64/kernel/cpufeature.c      | 167 +++++++++++++++++-----------
>  2 files changed, 102 insertions(+), 66 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index ef6be92b1921..2728abd9cae4 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -356,6 +356,7 @@ struct arm64_cpu_capabilities {
> 	struct { /* Feature register checking */
> 		u32 sys_reg;
> 		u8 field_pos;
> +		u8 field_width;
> 		u8 min_field_value;
> 		u8 hwcap_type;
> 		bool sign;
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index a46ab3b1c4d5..d9f09e40aaf6 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1307,7 +1307,9 @@ u64 __read_sysreg_by_encoding(u32 sys_id)
>  static bool
>  feature_matches(u64 reg, const struct arm64_cpu_capabilities *entry)
>  {
> -	int val = cpuid_feature_extract_field(reg, entry->field_pos, entry->sign);
> +	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
> +						    entry->field_width,
> +						    entry->sign);

Could we do something like:

	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
						    entry->field_width ? : 4,
						    ...);

and leave the existing structures as they are?
> 	return val >= entry->min_field_value;
>  }
>
> @@ -1952,6 +1954,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

There is also the arm64_errata[] array in cpu_errata.c. So, if we use the
above proposal we could leave things unchanged for all existing uses.

> 		.matches = has_cpuid_feature,
> 		.sys_reg = SYS_ID_AA64MMFR0_EL1,
> 		.field_pos = ID_AA64MMFR0_ECV_SHIFT,
> +		.field_width = 4,
> 		.sign = FTR_UNSIGNED,
> 		.min_field_value = 1,
> 	},

...

> -#define HWCAP_CPUID_MATCH(reg, field, s, min_value)		\
> +#define HWCAP_CPUID_MATCH(reg, field, width, s, min_value)	\
> 	.matches = has_cpuid_feature,				\
> 	.sys_reg = reg,						\
> 	.field_pos = field,					\
> +	.field_width = width,					\
> 	.sign = s,						\
> 	.min_field_value = min_value,

And that could avoid these changes too. We could add:

	#define HWCAP_CPUID_MATCH_WIDTH(...)

when/if we need one, which sets the width.

Cheers
Suzuki
On Tue, 25 Jan 2022 00:11:01 +0000, Mark Brown <broonie@kernel.org> wrote:
>
> SME defines two new traps which need to be enabled for guests to ensure
> that they can't use SME, one for the main SME operations which mirrors the
> traps for SVE and another for access to TPIDR2 in SCTLR_EL2.
>
> For VHE manage SMEN along with ZEN in activate_traps() and the FP state
> management callbacks.
>
> For nVHE the value to be used for CPTR_EL2 in the guest is stored in
> vcpu->arch.cptr_el2, set TSM there during initialisation. It will be
> cleared in __deactivate_traps_common() by virtue of not being set in
> CPTR_EL2_DEFAULT.
>
> For both VHE and nVHE cases handle SCTLR_EL2.EnTPIDR2 in the shared
> __activate_traps_common() and __deactivate_traps_common(), there is no
> existing dynamic management of SCTLR_EL2.
>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  arch/arm64/kvm/hyp/nvhe/switch.c | 30 ++++++++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/vhe/switch.c  | 10 +++++++++-
>  2 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 6410d21d8695..184bf6bd79b9 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -47,10 +47,25 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
> 		val |= CPTR_EL2_TFP | CPTR_EL2_TZ;
> 		__activate_traps_fpsimd32(vcpu);
> 	}
> +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME))

Please drop the IS_ENABLED(). We purposely avoid conditional compilation
in KVM in order to avoid bitrot, and the amount of code you save isn't
significant. Having a static key is more than enough to avoid runtime
costs.
> +		val |= CPTR_EL2_TSM;
>
> 	write_sysreg(val, cptr_el2);
> 	write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el2);
>
> +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) &&
> +	    cpus_have_final_cap(ARM64_HAS_FGT)) {
> +		val = read_sysreg_s(SYS_HFGRTR_EL2);
> +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> +		write_sysreg_s(val, SYS_HFGRTR_EL2);
> +
> +		val = read_sysreg_s(SYS_HFGWTR_EL2);
> +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> +		write_sysreg_s(val, SYS_HFGWTR_EL2);
> +	}

If the CPUs do not have FGT, what provides the equivalent trapping? If FGT
is mandatory when SME exists, then you should simplify the condition.

M.
On Tue, Jan 25, 2022 at 10:57:47AM +0000, Suzuki K Poulose wrote:
> On 25/01/2022 00:10, Mark Brown wrote:

> > +	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
> > +						    entry->field_width,
> > +						    entry->sign);

> Could we do something like:

> 	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
> 						    entry->field_width ? : 4,
> 						    ...);

> and leave the existing structures as they are?

Obviously we *could* (ideally without the ternary operator) but having the
implicit default like that makes me nervous that it's too easy to just
forget to fill it in and get the wrong default.

> And that could avoid these changes too. We could add:
> 	#define HWCAP_CPUID_MATCH_WIDTH(...)
> when/if we need one, which sets the width.

I'd originally had a completely separate set of definitions for single bit
fields but Catalin wanted to get everything in one. I'm not sure I see a
huge advantage in sharing the match function and not also sharing the
macro TBH.
On Tue, Jan 25, 2022 at 11:59:02AM +0000, Marc Zyngier wrote:
> Mark Brown <broonie@kernel.org> wrote:

> > +	if (has_vhe()) {
> > +		if (system_supports_sme()) {

> nit: if (has_vhe() && system_supports_sme()) {
> saves you one level of indentation.

Yes, for now. IIRC there was some other stuff there when I had some of
the code for doing the register switching properly.

> > +	/* Also restore EL0 state seen on entry */
> > +	if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED)
> > +		sysreg_clear_set(CPACR_EL1, 0,
> > +				 CPACR_EL1_SMEN_EL0EN |
> > +				 CPACR_EL1_SMEN_EL1EN);
> > +	else
> > +		sysreg_clear_set(CPACR_EL1,
> > +				 CPACR_EL1_SMEN_EL0EN,
> > +				 CPACR_EL1_SMEN_EL1EN);

> I find the use of CPACR_EL1_SMEN in some cases and its individual bits
> in some others pretty confusing. I understand that you have modelled
> it after the SVE code, but maybe this is a mistake we don't need to
> repeat. I'd be in favour of directly exposing the individual bits in
> all cases.

OK, it is just the KVM code that uses the plain ZEN. I'll add a cleanup
patch for that at the start of the series for ZEN I guess otherwise it
looks worse, though that will inflate the size of the series a bit.
On Tue, 25 Jan 2022 12:52:18 +0000, Mark Brown <broonie@kernel.org> wrote:
>
> On Tue, Jan 25, 2022 at 11:59:02AM +0000, Marc Zyngier wrote:
> > Mark Brown <broonie@kernel.org> wrote:
>
> > > +	if (has_vhe()) {
> > > +		if (system_supports_sme()) {
>
> > nit: if (has_vhe() && system_supports_sme()) {
> > saves you one level of indentation.
>
> Yes, for now. IIRC there was some other stuff there when I had some of
> the code for doing the register switching properly.
>
> > > +	/* Also restore EL0 state seen on entry */
> > > +	if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED)
> > > +		sysreg_clear_set(CPACR_EL1, 0,
> > > +				 CPACR_EL1_SMEN_EL0EN |
> > > +				 CPACR_EL1_SMEN_EL1EN);
> > > +	else
> > > +		sysreg_clear_set(CPACR_EL1,
> > > +				 CPACR_EL1_SMEN_EL0EN,
> > > +				 CPACR_EL1_SMEN_EL1EN);
>
> > I find the use of CPACR_EL1_SMEN in some cases and its individual bits
> > in some others pretty confusing. I understand that you have modelled
> > it after the SVE code, but maybe this is a mistake we don't need to
> > repeat. I'd be in favour of directly exposing the individual bits in
> > all cases.
>
> OK, it is just the KVM code that uses the plain ZEN. I'll add a cleanup
> patch for that at the start of the series for ZEN I guess otherwise it
> looks worse, though that will inflate the size of the series a bit.

I'm happy to merge such a patch early if that helps.

M.