mbox series

[v2,00/17] Prepare for new Intel Family numbers

Message ID 20250211194407.2577252-1-sohil.mehta@intel.com
Headers show
Series Prepare for new Intel Family numbers | expand

Message

Sohil Mehta Feb. 11, 2025, 7:43 p.m. UTC
---Summary---
Mainstream Intel processors have been using Family 6 for a couple of decades.
Audit all the Intel Family-model checks to get ready for the upcoming Family 18
and 19 models.

Patch 1: Preparatory cleanup in smpboot.
Patch 2-7: Fixes in arch/x86 and drivers.
Patch 8-17: Cleanups in arch/x86 to convert x86_model checks to VFM ones.

This series does not include cleanups in drivers/.

Please feel free to pick up whichever patches seem ready. Most of the patches
can be applied out of order except patches 1 and 2 which should be applied
together.

---v2 changes---

* Improve commit messages.
* Split and reorder patches for better readability.
* Added a cleanup patch in the beginning.

RFC-v1: https://lore.kernel.org/lkml/20241220213711.1892696-1-sohil.mehta@intel.com/

---Background---
The last mainstream Intel processors that deviated from Family 6 were the
Netburst architecture based Pentium 4s that had a Family number of 15. Intel
has recently started to introduce extended Family numbers greater than 15, as
seen in commit d1fb034b75a8 ("x86/cpu: Add two Intel CPU model numbers").
Though newer CPUs can have any Family number, the currently planned CPUs lie in
Families 18 and 19.

Some kernel code assumes that the Family number would always remain 6.  There
are checks that apply to recent Family 6 models such as Lunar Lake and
Clearwater Forest but don't automatically extend to Family 19 models such as
Diamond Rapids. This series aims to fix and cleanup all of such Intel specific
checks (mainly in arch/x86) to avoid such issues in the future. It also
converts almost all of the x86_model checks in arch/x86 to the new VFM ones.

OTOH, x86_model usage in drivers/ is a huge mess. Some drivers might need to be
completely rewritten to make them future-proof. I have attempted a couple of
fixes in cpufreq and hwmon, but they are mostly superficial.  A more thorough
clean up of drivers is needed to replace all x86_model usage with the new VFM
checks.

---Assumptions and Trade-offs---
Newer CPUs will have model numbers only in Family 6 or after Family 15.  No new
processors would be added between Family 6 and 15.

As a convention, Intel Family numbers are referenced using decimals (Family 15,
19, etc.) even though the AMD-specific code might prefer hexadecimals in
similar situations.

It would be preferable to have simpler and more maintainable checks that might
in a rare situation change the behavior on really old platforms. If someone
pops up with an issue, the code would be fixed later.
For example, the check,
	c->x86_vfm >= INTEL_PENTIUM_PRO
is preferred over,
	c->x86 == 6 || c->x86 > 15
if the likelihood of adversely affecting Family 15 is low.

Also, the CPU defines in intel-family.h have been added as and when needed to
make reviewing and applying patches out of order easier.

Sohil Mehta (17):
  x86/smpboot: Remove confusing quirk usage in INIT delay
  x86/smpboot: Fix INIT delay optimization for extended Intel Families
  x86/apic: Fix 32-bit APIC initialization for extended Intel Families
  x86/cpu/intel: Fix the movsl alignment preference for extended
    Families
  x86/cpu/intel: Fix page copy performance for extended Families
  cpufreq: Fix the efficient idle check for Intel extended Families
  hwmon: Fix Intel Family-model checks to include extended Families
  x86/microcode: Update the Intel processor flag scan check
  x86/mtrr: Modify a x86_model check to an Intel VFM check
  x86/cpu/intel: Replace early Family 6 checks with VFM ones
  x86/cpu/intel: Replace Family 15 checks with VFM ones
  x86/cpu/intel: Replace Family 5 model checks with VFM ones
  x86/pat: Replace Intel x86_model checks with VFM ones
  x86/acpi/cstate: Improve Intel Family model checks
  x86/cpu/intel: Bound the non-architectural constant_tsc model checks
  perf/x86: Simplify P6 PMU initialization
  perf/x86/p4: Replace Pentium 4 model checks with VFM ones

 arch/x86/events/intel/p4.c            |  7 +--
 arch/x86/events/intel/p6.c            | 28 +++--------
 arch/x86/include/asm/intel-family.h   | 21 ++++++++-
 arch/x86/kernel/acpi/cstate.c         |  8 ++--
 arch/x86/kernel/apic/apic.c           |  4 +-
 arch/x86/kernel/cpu/intel.c           | 68 +++++++++++++--------------
 arch/x86/kernel/cpu/microcode/intel.c |  2 +-
 arch/x86/kernel/cpu/mtrr/generic.c    |  4 +-
 arch/x86/kernel/smpboot.c             | 17 ++++---
 arch/x86/mm/pat/memtype.c             |  7 +--
 drivers/cpufreq/cpufreq_ondemand.c    | 15 +++---
 drivers/hwmon/coretemp.c              | 26 ++++++----
 12 files changed, 111 insertions(+), 96 deletions(-)

Comments

Sohil Mehta Feb. 11, 2025, 7:43 p.m. UTC | #1
X86_FEATURE_REP_GOOD is a linux defined feature flag to track whether
fast string operations should be used for copy_page(). It is also used
as a backup alternative for clear_page() if enhanced fast string
operations (ERMS) are not available.

Currently, the flag is only set for Family 6 processors. Extend the
check to include upcoming processors in Family 18 and 19.

It is uncertain whether X86_FEATURE_REP_GOOD should be set for Family 15
(Pentium 4) as well. Commit 185f3b9da24c ("x86: make intel.c have 64-bit
support code") that originally set the flag also set the
x86_cache_alignment preference for Family 15 processors in the same
commit. The omission of the Family 15 may have been intentional.

Also, move the check before a related check in early_init_intel() to
avoid resetting the flag.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Separate out the REP_GOOD (copy page) specific change into a
separate commit.
Dave Hansen Feb. 11, 2025, 9:03 p.m. UTC | #2
On 2/11/25 11:44, Sohil Mehta wrote:
> Introduce names for some old pentium 4 models and replace the x86_model
> checks with VFM ones.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Dave Hansen Feb. 11, 2025, 9:09 p.m. UTC | #3
On 2/11/25 11:44, Sohil Mehta wrote:
> +	if (c->x86_vendor == X86_VENDOR_INTEL &&
> +	    ((c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
> +	    (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {

Since these are both closed checks and not open-ended, is the

	if (c->x86_vendor == X86_VENDOR_INTEL &&

bit needed or superfluous?

Also, super nit, can you vertically align the two range checks, please?

	    ((c->x86_vfm >= INTEL_PENTIUM_PRO   && c->x86_vfm <=
INTEL_PENTIUM_M_DOTHAN) ||
	     (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <=
INTEL_P4_CEDARMILL))) {
Sohil Mehta Feb. 11, 2025, 9:38 p.m. UTC | #4
On 2/11/2025 12:58 PM, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
>> +	/*
>> +	 * Return without adjustment if the Family isn't 6.
>> +	 * The rest of the function assumes Family 6.
>> +	 */
>> +	if (c->x86 != 6)
>> +		return tjmax;
> 
> Shouldn't we be converting this over to the vfm matches?
> 

For drivers/, I mainly focused on fixes instead of cleanups.

Converting drivers over to VFM checks is significant work. There are a
lot of such comparisons and switch cases (probably more than 50) across
drivers/cpufreq/ and drivers/hwmon/.

Some of the functions might need significant refactoring and rewrites. I
think someone with expertise in that particular driver should probably
do it. I did start with it initially but it is beyond my bandwidth at
the moment.

> This is kinda icky:
> 
>> +	return family > 15 ||
>> +	       (family == 6 &&
>> +		model > 0xe &&
>> +		model != 0x1c &&
>> +		model != 0x26 &&
>> +		model != 0x27 &&
>> +		model != 0x35 &&
>> +		model != 0x36);
>>  }
> 
> I'm not sure how this escaped so far. Probably because it's not in arch/x86.
>
Sohil Mehta Feb. 11, 2025, 9:42 p.m. UTC | #5
On 2/11/2025 1:09 PM, Dave Hansen wrote:
> On 2/11/25 11:44, Sohil Mehta wrote:
>> +	if (c->x86_vendor == X86_VENDOR_INTEL &&
>> +	    ((c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
>> +	    (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {
> 
> Since these are both closed checks and not open-ended, is the
> 
> 	if (c->x86_vendor == X86_VENDOR_INTEL &&
> 
> bit needed or superfluous?
> 

You are right, since it is close ended on both sides we should be able
to remove the X86_VENDOR_INTEL.

I was thinking if we should leave it there to avoid confusion. But,
INTEL_* in the VFM string is a good enough hint that the checks are
Intel specific. Also, it's not like this check is going to be modified
frequently.

> Also, super nit, can you vertically align the two range checks, please?
> 
> 	    ((c->x86_vfm >= INTEL_PENTIUM_PRO   && c->x86_vfm <=
> INTEL_PENTIUM_M_DOTHAN) ||
> 	     (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <=
> INTEL_P4_CEDARMILL))) {
> 
>
Andrew Cooper Feb. 12, 2025, 12:54 a.m. UTC | #6
On 11/02/2025 8:53 pm, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
>> +	/*
>> +	 * Modern CPUs are generally expected to have a sane fast string
>> +	 * implementation. However, the BIOS may disable it on certain CPUs
>> +	 * via the architectural FAST_STRING bit.
>> +	 */
>> +	if (IS_ENABLED(CONFIG_X86_64) && (c->x86 == 6 || c->x86 > 15))
>> +		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
> I'm not sure the BIOS comment is helpful here.
>
> Also, at this point, let's just make the check >=6 (or the >=PPRO
> equivalent).
>
> It will only matter if *all* of these are true:
> 1. Someone has a 64-bit capable P4 that powers on
> 2. They're running a 64-bit mainline kernel
> 3. String copy is *actually* slower than the alternative
> 4. They are performance sensitive enough to notice
>
> We don't even know the answer to #3 for sure. Let's just say what we're
> doing in a comment:
>
> 	/* Assume that any 64-bit CPU has a good implementation */

If you're going to override the BIOS setting, then you need to
explicitly set MSR_MISC_ENABLE.FAST_STRINGS.

Otherwise you're claiming to Linux that REP is good even when hardware
is prohibited from using optimisations.

~Andrew
Zhang, Rui Feb. 12, 2025, 1:10 p.m. UTC | #7
On Tue, 2025-02-11 at 12:58 -0800, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
> > +       /*
> > +        * Return without adjustment if the Family isn't 6.
> > +        * The rest of the function assumes Family 6.
> > +        */
> > +       if (c->x86 != 6)
> > +               return tjmax;
> 
> Shouldn't we be converting this over to the vfm matches?
> 
> This is kinda icky:
> 
> > +       return family > 15 ||
> > +              (family == 6 &&
> > +               model > 0xe &&
> > +               model != 0x1c &&
> > +               model != 0x26 &&
> > +               model != 0x27 &&
> > +               model != 0x35 &&
> > +               model != 0x36);
> >  }
> 
> I'm not sure how this escaped so far. Probably because it's not in
> arch/x86.
> 
This code was introduced 10+ years ago, and it only brings a warning
message when reading MSR_IA32_TEMPERATURE_TARGET fails.
So probably no one has ever checked this.

thanks,
rui
Zhang, Rui Feb. 12, 2025, 1:43 p.m. UTC | #8
On Tue, 2025-02-11 at 13:38 -0800, Sohil Mehta wrote:
> On 2/11/2025 12:58 PM, Dave Hansen wrote:
> > On 2/11/25 11:43, Sohil Mehta wrote:
> > > +       /*
> > > +        * Return without adjustment if the Family isn't 6.
> > > +        * The rest of the function assumes Family 6.
> > > +        */
> > > +       if (c->x86 != 6)
> > > +               return tjmax;
> > 
> > Shouldn't we be converting this over to the vfm matches?
> > 
> 
> For drivers/, I mainly focused on fixes instead of cleanups.
> 
> Converting drivers over to VFM checks is significant work. There are
> a
> lot of such comparisons and switch cases (probably more than 50)
> across
> drivers/cpufreq/ and drivers/hwmon/.
> 
> Some of the functions might need significant refactoring and
> rewrites. I
> think someone with expertise in that particular driver should
> probably
> do it. I did start with it initially but it is beyond my bandwidth at
> the moment.
> 
I agree.
adjust_tjmax() contains a list of quirks based on PCI-
ID/x86_vendor_id/x86_model/x86_stepping. The common problem is that all
the quirks are for Fam6 processors but the family id is not checked. So
the fix is sufficient. In fact, I think it is better to move the check
to the very beginning of adjust_tjmax().

Plus that, I do think we can have more cleanups on top
1. rename adjust_tjmax() to adjust_tjmax_for_fam6()
2. move all model specific quirks altogether and avoid model checks in
the main functions.
3. for processors newer than fam6, the driver should fail to probe
rather than using a hardcoded value when reading
MSR_IA32_TEMPERATURE_TARGET fails.

maybe I can start with something like below.

---
 drivers/hwmon/coretemp.c | 98 +++++++++++++++++++++++-----------------
 1 file changed, 57 insertions(+), 41 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 1aa67a2b5f18..fc2cf607aa36 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -99,6 +99,7 @@ struct platform_data {
 	struct device_attribute name_attr;
 };
 
+/* Beginning of Model specific quirks */
 struct tjmax_pci {
 	unsigned int device;
 	int tjmax;
@@ -147,12 +148,11 @@ static const struct tjmax_model tjmax_model_table[] = {
 				 */
 };
 
-static bool is_pkg_temp_data(struct temp_data *tdata)
-{
-	return tdata->index < 0;
-}
-
-static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
+/*
+ * Adjust tjmax value for early Fam6 CPUs with unreadable MSR_IA32_TEMPERATURE_TARGET
+ * NOTE: the calculated value may not be correct.
+ */
+static int adjust_tjmax_for_fam6(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 {
 	/* The 100C is default for both mobile and non mobile CPUs */
 
@@ -163,8 +163,16 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 	u32 eax, edx;
 	int i;
 	u16 devfn = PCI_DEVFN(0, 0);
-	struct pci_dev *host_bridge = pci_get_domain_bus_and_slot(0, 0, devfn);
+	struct pci_dev *host_bridge;
+
+	/*
+	 * Return without adjustment if the Family isn't 6.
+	 * The rest of the function assumes Family 6.
+	 */
+	if (c->x86 != 6)
+		return tjmax;
 
+	host_bridge = pci_get_domain_bus_and_slot(0, 0, devfn);
 	/*
 	 * Explicit tjmax table entries override heuristics.
 	 * First try PCI host bridge IDs, followed by model ID strings
@@ -185,12 +193,6 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 			return tjmax_table[i].tjmax;
 	}
 
-	/*
-	 * Return without adjustment if the Family isn't 6.
-	 * The rest of the function assumes Family 6.
-	 */
-	if (c->x86 != 6)
-		return tjmax;
 
 	for (i = 0; i < ARRAY_SIZE(tjmax_model_table); i++) {
 		const struct tjmax_model *tm = &tjmax_model_table[i];
@@ -280,6 +282,37 @@ static bool cpu_has_tjmax(struct cpuinfo_x86 *c)
 		model != 0x36);
 }
 
+static bool cpu_has_ttarget(struct temp_data *tdata)
+{
+	struct cpuinfo_x86 *c = &cpu_data(tdata->cpu);
+
+	/*
+	 * The target temperature is available on older CPUs but not in the
+	 * MSR_IA32_TEMPERATURE_TARGET register. Atoms don't have the register
+	 * at all.
+	 */
+	if (c->x86 > 15 || (c->x86 == 6 && c->x86_model > 0xe && c->x86_model != 0x1c))
+		return true;
+	return false;
+}
+
+static bool cpu_has_broken_ucode(unsigned int cpu)
+{
+	struct cpuinfo_x86 *c = &cpu_data(cpu);
+
+	/*
+	 * Check if we have problem with errata AE18 of Core processors:
+	 * Readings might stop update when processor visited too deep sleep,
+	 * fixed for stepping D0 (6EC).
+	 */
+	if (c->x86 == 6 && c->x86_model == 0xe && c->x86_stepping < 0xc && c->microcode < 0x39) {
+		pr_err("Errata AE18 not fixed, update BIOS or microcode of the CPU!\n");
+		return true;
+	}
+	return false;
+}
+/* End of Model specific quirks */
+
 static int get_tjmax(struct temp_data *tdata, struct device *dev)
 {
 	struct cpuinfo_x86 *c = &cpu_data(tdata->cpu);
@@ -312,9 +345,8 @@ static int get_tjmax(struct temp_data *tdata, struct device *dev)
 	} else {
 		/*
 		 * An assumption is made for early CPUs and unreadable MSR.
-		 * NOTE: the calculated value may not be correct.
 		 */
-		tdata->tjmax = adjust_tjmax(c, tdata->cpu, dev);
+		tdata->tjmax = adjust_tjmax_for_fam6(c, tdata->cpu, dev);
 	}
 	return tdata->tjmax;
 }
@@ -324,6 +356,8 @@ static int get_ttarget(struct temp_data *tdata, struct device *dev)
 	u32 eax, edx;
 	int tjmax, ttarget_offset, ret;
 
+	if (!cpu_has_ttarget(tdata))
+		return -ENODEV;
 	/*
 	 * ttarget is valid only if tjmax can be retrieved from
 	 * MSR_IA32_TEMPERATURE_TARGET
@@ -348,6 +382,11 @@ static int max_zones __read_mostly;
 /* Array of zone pointers. Serialized by cpu hotplug lock */
 static struct platform_device **zone_devices;
 
+static bool is_pkg_temp_data(struct temp_data *tdata)
+{
+	return tdata->index < 0;
+}
+
 static ssize_t show_label(struct device *dev,
 				struct device_attribute *devattr, char *buf)
 {
@@ -460,23 +499,6 @@ static int create_core_attrs(struct temp_data *tdata, struct device *dev)
 	return sysfs_create_group(&dev->kobj, &tdata->attr_group);
 }
 
-
-static int chk_ucode_version(unsigned int cpu)
-{
-	struct cpuinfo_x86 *c = &cpu_data(cpu);
-
-	/*
-	 * Check if we have problem with errata AE18 of Core processors:
-	 * Readings might stop update when processor visited too deep sleep,
-	 * fixed for stepping D0 (6EC).
-	 */
-	if (c->x86 == 6 && c->x86_model == 0xe && c->x86_stepping < 0xc && c->microcode < 0x39) {
-		pr_err("Errata AE18 not fixed, update BIOS or microcode of the CPU!\n");
-		return -ENODEV;
-	}
-	return 0;
-}
-
 static struct platform_device *coretemp_get_pdev(unsigned int cpu)
 {
 	int id = topology_logical_die_id(cpu);
@@ -585,14 +607,8 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	/* Make sure tdata->tjmax is a valid indicator for dynamic/static tjmax */
 	get_tjmax(tdata, &pdev->dev);
 
-	/*
-	 * The target temperature is available on older CPUs but not in the
-	 * MSR_IA32_TEMPERATURE_TARGET register. Atoms don't have the register
-	 * at all.
-	 */
-	if (c->x86 > 15 || (c->x86 == 6 && c->x86_model > 0xe && c->x86_model != 0x1c))
-		if (get_ttarget(tdata, &pdev->dev) >= 0)
-			tdata->attr_size++;
+	if (get_ttarget(tdata, &pdev->dev) >= 0)
+		tdata->attr_size++;
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, pdata->hwmon_dev);
@@ -696,7 +712,7 @@ static int coretemp_cpu_online(unsigned int cpu)
 		struct device *hwmon;
 
 		/* Check the microcode version of the CPU */
-		if (chk_ucode_version(cpu))
+		if (cpu_has_broken_ucode(cpu))
 			return -EINVAL;
 
 		/*
Dave Hansen Feb. 12, 2025, 4:57 p.m. UTC | #9
On 2/12/25 05:43, Zhang, Rui wrote:
> I agree.
> adjust_tjmax() contains a list of quirks based on PCI-
> ID/x86_vendor_id/x86_model/x86_stepping. The common problem is that all
> the quirks are for Fam6 processors but the family id is not checked. So
> the fix is sufficient. In fact, I think it is better to move the check
> to the very beginning of adjust_tjmax().

Or, heck, just remove the model list. dev_warn_once() if the rdmsr
fails. Who cares about one more line in dmesg?

Why not do the attached patch?
Sohil Mehta Feb. 12, 2025, 9:19 p.m. UTC | #10
On 2/11/2025 4:54 PM, Andrew Cooper wrote:

> If you're going to override the BIOS setting, then you need to
> explicitly set MSR_MISC_ENABLE.FAST_STRINGS.
> 
> Otherwise you're claiming to Linux that REP is good even when hardware
> is prohibited from using optimisations.
> 

I think the current checks have unnecessary overlap which makes them
confusing. We should be fine if we only rely on the architectural
MSR_MISC_ENABLE.FAST_STRINGS bit and rely just on the BIOS setting. My
justification is below.

The simplified version of the current checks is as follows:

Check 1 (Based on Family Model numbers):
> /*
>  * Unconditionally set REP_GOOD on early Family 6 processors
>  */
> if (IS_ENABLED(CONFIG_X86_64) &&
>     (c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm < INTEL_PENTIUM_M_DOTHAN))
> 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);

This check is mostly redundant since it is targeted for 64 bit and very
few if any of those CPUs support 64 bit processing. I suggest that we
get rid of this check completely. The risk here is fairly limited as well.

Check 2 (Based on MISC_ENABLE.FAST_STRING):
> /*
>  * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
>  * clear the fast string and enhanced fast string CPU capabilities.
>  */
> if (c->x86_vfm >= INTEL_PENTIUM_M_DOTHAN) {
> 	rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
> 	if (misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING) {
> 		/* X86_FEATURE_ERMS will be automatically set based on CPUID */
> 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
> 	} else {
> 		pr_info("Disabled fast string operations\n");
> 		setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
> 		setup_clear_cpu_cap(X86_FEATURE_ERMS);
> 	}
> }

This is the only real check that is needed and should likely suffice in
all meaningful scenarios.

Comments?
Andrew Cooper Feb. 13, 2025, 11:02 p.m. UTC | #11
On 12/02/2025 9:19 pm, Sohil Mehta wrote:
> Check 1 (Based on Family Model numbers):
>> /*
>>  * Unconditionally set REP_GOOD on early Family 6 processors
>>  */
>> if (IS_ENABLED(CONFIG_X86_64) &&
>>     (c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm < INTEL_PENTIUM_M_DOTHAN))
>> 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);
> This check is mostly redundant since it is targeted for 64 bit and very
> few if any of those CPUs support 64 bit processing. I suggest that we
> get rid of this check completely. The risk here is fairly limited as well.

PENTIUM_PRO is model 0x1.  M_DOTHAN isn't introduced until patch 10, but
is model 0xd.

And model 0xf (Memron) is the first 64bit capable fam6 CPU, so this is
dead code given the CONFIG_X86_64 which the compiler can't actually
optimise out.

>
> Check 2 (Based on MISC_ENABLE.FAST_STRING):
>> /*
>>  * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
>>  * clear the fast string and enhanced fast string CPU capabilities.

I'd suggest that a better way of phrasing this is:

/* BIOSes typically have a knob for Fast Strings.  Honour the user's
wishes. */

>>  */
>> if (c->x86_vfm >= INTEL_PENTIUM_M_DOTHAN) {
>> 	rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
>> 	if (misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING) {
>> 		/* X86_FEATURE_ERMS will be automatically set based on CPUID */
>> 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
>> 	} else {
>> 		pr_info("Disabled fast string operations\n");
>> 		setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
>> 		setup_clear_cpu_cap(X86_FEATURE_ERMS);
>> 	}
>> }

MSR_MISC_ENABLE exists on all 64bit CPUs, and some 32bit ones too. 
Therefore, this section alone seems to suffice in order to set up
REP_GOOD properly.

~Andrew
Sohil Mehta Feb. 14, 2025, 12:29 a.m. UTC | #12
On 2/13/2025 3:02 PM, Andrew Cooper wrote:
> On 12/02/2025 9:19 pm, Sohil Mehta wrote:
>> Check 1 (Based on Family Model numbers):
>>> /*
>>>  * Unconditionally set REP_GOOD on early Family 6 processors
>>>  */
>>> if (IS_ENABLED(CONFIG_X86_64) &&
>>>     (c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm < INTEL_PENTIUM_M_DOTHAN))
>>> 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);
>> This check is mostly redundant since it is targeted for 64 bit and very
>> few if any of those CPUs support 64 bit processing. I suggest that we
>> get rid of this check completely. The risk here is fairly limited as well.
> 
> PENTIUM_PRO is model 0x1.  M_DOTHAN isn't introduced until patch 10, but
> is model 0xd.
> 
> And model 0xf (Memron) is the first 64bit capable fam6 CPU, so this is
> dead code given the CONFIG_X86_64 which the compiler can't actually
> optimise out.
> 

Thanks for confirming. I figured this is likely dead code but wasn't
completely sure.
Zhang, Rui Feb. 14, 2025, 2:23 a.m. UTC | #13
On Wed, 2025-02-12 at 08:57 -0800, Dave Hansen wrote:
> On 2/12/25 05:43, Zhang, Rui wrote:
> > I agree.
> > adjust_tjmax() contains a list of quirks based on PCI-
> > ID/x86_vendor_id/x86_model/x86_stepping. The common problem is that
> > all
> > the quirks are for Fam6 processors but the family id is not
> > checked. So
> > the fix is sufficient. In fact, I think it is better to move the
> > check
> > to the very beginning of adjust_tjmax().
> 
> Or, heck, just remove the model list. dev_warn_once() if the rdmsr
> fails. Who cares about one more line in dmesg?
> 
> Why not do the attached patch?

The patch looks good to me.

-rui