Message ID: 20250602192702.2125115-1-coltonlewis@google.com
Series: ARM64 PMU Partitioning
Hi Colton,

On Mon, Jun 02, 2025 at 07:26:46PM +0000, Colton Lewis wrote:
> Add a capability for FEAT_HPMN0, whether MDCR_EL2.HPMN can specify 0
> counters reserved for the guest.
>
> This required changing HPMN0 to an UnsignedEnum in tools/sysreg
> because otherwise not all the appropriate macros are generated to add
> it to arm64_cpu_capabilities_arm64_features.
>
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
>  arch/arm64/kernel/cpufeature.c | 8 ++++++++
>  arch/arm64/tools/cpucaps       | 1 +
>  arch/arm64/tools/sysreg        | 6 +++---
>  3 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index a3da020f1d1c..578eea321a60 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -541,6 +541,7 @@ static const struct arm64_ftr_bits ftr_id_mmfr0[] = {
>  };
>
>  static const struct arm64_ftr_bits ftr_id_aa64dfr0[] = {
> +	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64DFR0_EL1_HPMN0_SHIFT, 4, 0),
>  	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64DFR0_EL1_DoubleLock_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64DFR0_EL1_PMSVer_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64DFR0_EL1_CTX_CMPs_SHIFT, 4, 0),
> @@ -2884,6 +2885,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = has_cpuid_feature,
>  		ARM64_CPUID_FIELDS(ID_AA64MMFR0_EL1, FGT, FGT2)
>  	},
> +	{
> +		.desc = "Hypervisor PMU Partitioning 0 Guest Counters",

nit: just use the FEAT_xxx name for the description (i.e. "HPMN0").

Thanks,
Oliver
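Spelled out, the suggested entry would look something like the following; only the .desc change comes from the review, and the remaining fields are assumptions for illustration (the cpucap name in particular is hypothetical):

	{
		.desc = "HPMN0",
		.capability = ARM64_HAS_HPMN0,		/* hypothetical cpucap name */
		.type = ARM64_CPUCAP_SYSTEM_FEATURE,	/* assumed */
		.matches = has_cpuid_feature,
		/* feature field value assumed for illustration */
		ARM64_CPUID_FIELDS(ID_AA64DFR0_EL1, HPMN0, IMP)
	},

Using the bare FEAT_xxx name keeps the boot-time "detected:" log line consistent with the other entries in arm64_features[].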
On Mon, Jun 02, 2025 at 07:26:51PM +0000, Colton Lewis wrote:
>  static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
>  {
> +	u8 hpmn = vcpu->kvm->arch.arm_pmu->hpmn;
> +
>  	preempt_disable();
>
>  	/*
>  	 * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
>  	 * to disable guest access to the profiling and trace buffers
>  	 */
> -	vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
> -					 *host_data_ptr(nr_event_counters));
> -	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
> +	vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, hpmn);
> +	vcpu->arch.mdcr_el2 |= (MDCR_EL2_HPMD |
> +				MDCR_EL2_TPM |

This isn't safe, as there's no guarantee that kvm_arch::arm_pmu is pointing at the PMU for this CPU. KVM needs to derive HPMN from some per-CPU state, not anything tied to the VM/vCPU.

> +/**
> + * kvm_pmu_partition() - Partition the PMU
> + * @pmu: Pointer to pmu being partitioned
> + * @host_counters: Number of host counters to reserve
> + *
> + * Partition the given PMU by taking a number of host counters to
> + * reserve and, if it is a valid reservation, recording the
> + * corresponding HPMN value in the hpmn field of the PMU and clearing
> + * the guest-reserved counters from the counter mask.
> + *
> + * Passing 0 for @host_counters has the effect of disabling partitioning.
> + *
> + * Return: 0 on success, -ERROR otherwise
> + */
> +int kvm_pmu_partition(struct arm_pmu *pmu, u8 host_counters)
> +{
> +	u8 nr_counters;
> +	u8 hpmn;
> +
> +	if (!kvm_pmu_reservation_is_valid(host_counters))
> +		return -EINVAL;
> +
> +	nr_counters = *host_data_ptr(nr_event_counters);
> +	hpmn = kvm_pmu_hpmn(host_counters);
> +
> +	if (hpmn < nr_counters) {
> +		pmu->hpmn = hpmn;
> +		/* Inform host driver of available counters */
> +		bitmap_clear(pmu->cntr_mask, 0, hpmn);
> +		bitmap_set(pmu->cntr_mask, hpmn, nr_counters);
> +		clear_bit(ARMV8_PMU_CYCLE_IDX, pmu->cntr_mask);
> +		if (pmuv3_has_icntr())
> +			clear_bit(ARMV8_PMU_INSTR_IDX, pmu->cntr_mask);
> +
> +		kvm_debug("Partitioned PMU with HPMN %u", hpmn);
> +	} else {
> +		pmu->hpmn = nr_counters;
> +		bitmap_set(pmu->cntr_mask, 0, nr_counters);
> +		set_bit(ARMV8_PMU_CYCLE_IDX, pmu->cntr_mask);
> +		if (pmuv3_has_icntr())
> +			set_bit(ARMV8_PMU_INSTR_IDX, pmu->cntr_mask);
> +
> +		kvm_debug("Unpartitioned PMU");
> +	}
> +
> +	return 0;
> +}

Hmm... Just in terms of code organization, I'm not sure I like having KVM twiddling with *host* support for PMUv3. Feels like the ARM PMU driver should own partitioning and KVM just takes what it can get.

> @@ -239,6 +245,13 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
>  	if (!pmuv3_implemented(kvm_arm_pmu_get_pmuver_limit()))
>  		return;
>
> +	if (reserved_host_counters) {
> +		if (kvm_pmu_partition_supported())
> +			WARN_ON(kvm_pmu_partition(pmu, reserved_host_counters));
> +		else
> +			kvm_err("PMU Partition is not supported");
> +	}
> +

Hasn't the ARM PMU been registered with perf at this point? Surely the driver wouldn't be very pleased with us ripping counters out from under its feet.

Thanks,
Oliver
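To make the first point concrete: MDCR_EL2.HPMN has to describe the PMU of the physical CPU the vCPU is running on, so the lookup would need to go through per-CPU host data rather than the VM. A minimal sketch, assuming a hypothetical nr_partitioned_counters field alongside the existing nr_event_counters host data used in the quoted hunk:

	static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
	{
		/*
		 * Hypothetical: take HPMN from per-CPU host data, which
		 * always tracks the PMU of the current CPU, instead of
		 * kvm->arch.arm_pmu, which may describe a different PMU
		 * on heterogeneous systems.
		 */
		u8 hpmn = *host_data_ptr(nr_partitioned_counters);

		preempt_disable();
		vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, hpmn);
		/* ... remaining trap bits as in the quoted patch ... */
		preempt_enable();
	}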
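On the code-organization point, one way to read the suggestion is that the driver would apply the split itself before the PMU is ever visible to perf, so no already-scheduled host events are affected. A sketch under that assumption (the helper name is invented):

	/*
	 * Hypothetical driver-side helper, called from the probe path
	 * before perf_pmu_register().
	 */
	static void armv8pmu_reserve_guest_counters(struct arm_pmu *pmu, u8 hpmn)
	{
		/*
		 * Counters [0, hpmn) become guest-owned; clearing them
		 * from cntr_mask hides them from the host perf event
		 * scheduler from the start.
		 */
		bitmap_clear(pmu->cntr_mask, 0, hpmn);
	}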
On Mon, Jun 02, 2025 at 07:27:01PM +0000, Colton Lewis wrote:
> +	case KVM_ARM_PARTITION_PMU: {

This should be a vCPU attribute similar to the other PMUv3 controls we already have. Ideally a single attribute where userspace tells us it wants partitioning and specifies the PMU ID to use. None of this can be changed after INIT'ing the PMU.

> +		struct arm_pmu *pmu;
> +		u8 host_counters;
> +
> +		if (unlikely(!kvm_vcpu_initialized(vcpu)))
> +			return -ENOEXEC;
> +
> +		if (!kvm_pmu_partition_supported())
> +			return -EPERM;
> +
> +		if (copy_from_user(&host_counters, argp, sizeof(host_counters)))
> +			return -EFAULT;
> +
> +		pmu = vcpu->kvm->arch.arm_pmu;
> +		return kvm_pmu_partition(pmu, host_counters);

Yeah, we really can't be changing the counters available to the ARM PMU driver at this point. What happens to host events already scheduled on the CPU?

Either the partition of host / KVM-owned counters needs to be computed up front (prior to scheduling events) or KVM needs a way to direct perf to reschedule events on the PMU based on the new operating constraints.

Thanks,
Oliver
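For reference, the existing PMUv3 controls live in the KVM_ARM_VCPU_PMU_V3_CTRL device attribute group, and the attribute described above would presumably be set the same way, before PMU init. A userspace sketch; the attribute name and payload layout here are hypothetical:

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	/* Hypothetical payload: which PMU to partition and how */
	struct pmu_partition {
		__s32 pmu_id;		/* PMU identifier, e.g. from sysfs */
		__u8 host_counters;	/* counters reserved for the host */
	};

	/* Must be called before KVM_ARM_VCPU_PMU_V3_INIT, per the review */
	static int set_pmu_partition(int vcpu_fd)
	{
		struct pmu_partition part = { .pmu_id = 0, .host_counters = 2 };
		struct kvm_device_attr attr = {
			.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
			.attr	= KVM_ARM_VCPU_PMU_V3_PARTITION, /* hypothetical */
			.addr	= (__u64)(unsigned long)&part,
		};

		return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
	}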