diff mbox series

[v6,6/6] drivers/node: Show in sysfs node's crypto capabilities

Message ID 20220203164328.203629-7-martin.fernandez@eclypsium.com
State New
Headers show
Series x86: Show in sysfs if a memory node is able to do encryption | expand

Commit Message

Martin Fernandez Feb. 3, 2022, 4:43 p.m. UTC
Show in each node in sysfs if its memory is able to do be encrypted by
the CPU, ie. if all its memory is marked with EFI_MEMORY_CPU_CRYPTO in
the EFI memory map.

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 Documentation/ABI/testing/sysfs-devices-node | 10 ++++++++++
 drivers/base/node.c                          | 10 ++++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-node

Comments

Mike Rapoport Feb. 4, 2022, 4:56 a.m. UTC | #1
On Thu, Feb 03, 2022 at 01:43:28PM -0300, Martin Fernandez wrote:
> Show in each node in sysfs if its memory is able to do be encrypted by
> the CPU, ie. if all its memory is marked with EFI_MEMORY_CPU_CRYPTO in
> the EFI memory map.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-node | 10 ++++++++++
>  drivers/base/node.c                          | 10 ++++++++++
>  2 files changed, 20 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-devices-node
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-node b/Documentation/ABI/testing/sysfs-devices-node
> new file mode 100644
> index 000000000000..0d1fd86c9faf
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-devices-node
> @@ -0,0 +1,10 @@
> +What:		/sys/devices/system/node/nodeX/crypto_capable
> +Date:		February 2022
> +Contact:	Martin Fernandez <martin.fernandez@eclypsium.com>
> +Users:		fwupd (https://fwupd.org)
> +Description:
> +		This value is 1 if all system memory in this node is
> +		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the

It didn't jump at me at previous postings, but other architectures won't
necessary have EFI_MEMORY_CPU_CRYPTO marking crypto-capable memory. 

How about

  This value is 1 if all system memory in this node is capable of being
  protected with the CPU's memory cryptographic capabilities. It is 0
  otherwise.
  On EFI architectures with value corresponds to EFI_MEMORY_CPU_CRYPTO.


> +		system memory is capable of being protected with the
> +		CPU’s memory cryptographic capabilities. It is 0
> +		otherwise.
> \ No newline at end of file
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 87acc47e8951..dabaed997ecd 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -560,11 +560,21 @@ static ssize_t node_read_distance(struct device *dev,
>  }
>  static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
>  
> +static ssize_t crypto_capable_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
> +
> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
> +}
> +static DEVICE_ATTR_RO(crypto_capable);
> +
>  static struct attribute *node_dev_attrs[] = {
>  	&dev_attr_meminfo.attr,
>  	&dev_attr_numastat.attr,
>  	&dev_attr_distance.attr,
>  	&dev_attr_vmstat.attr,
> +	&dev_attr_crypto_capable.attr,
>  	NULL
>  };
>  
> -- 
> 2.30.2
>
Martin Fernandez Feb. 4, 2022, 1:21 p.m. UTC | #2
On 2/4/22, Limonciello, Mario <mario.limonciello@amd.com> wrote:
> On 2/3/2022 10:43, Martin Fernandez wrote:
>> +static ssize_t crypto_capable_show(struct device *dev,
>> +				   struct device_attribute *attr, char *buf)
>> +{
>> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
>> +
>> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
>
> As there is interest in seeing these capabilities from userspace, it
> seems like a logical time to also expose a `crypto_active` attribute.

I planned to do something similar to this, but to show (or actually
hide if inactive) tme in cpuinfo, just as Borislav Petkov suggested a
few versions back.

https://lore.kernel.org/linux-efi/YXrnkxgdjWbcPlJA@zn.tnic/

> Then userspace can make a judgement call if the system supports crypto
> memory (`crypto_capable`) and then also whether or not it's been turned
> on (`crypto_active`).
>
> `crypto_active` could be detected with some existing support in the
> kernel of `mem_encrypt_active()`.  This will then work for a variety of
> architectures too that offer `mem_encrypt_active()`.

I need a hand with this, I grepped for mem_encrypt_active and nothing
showed up...

> As it stands today the only reliable way to tell from userspace (at
> least for AMD's x86 implementation) is by grepping the system log for
> the line "AMD Memory Encryption Features active".

Isn't enough to grep for sme/sev in cpuinfo?
Tom Lendacky Feb. 4, 2022, 3:59 p.m. UTC | #3
On 2/4/22 07:21, Martin Fernandez wrote:
> On 2/4/22, Limonciello, Mario <mario.limonciello@amd.com> wrote:
>> On 2/3/2022 10:43, Martin Fernandez wrote:
>>> +static ssize_t crypto_capable_show(struct device *dev,
>>> +				   struct device_attribute *attr, char *buf)
>>> +{
>>> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
>>> +
>>> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
>>
>> As there is interest in seeing these capabilities from userspace, it
>> seems like a logical time to also expose a `crypto_active` attribute.
> 
> I planned to do something similar to this, but to show (or actually
> hide if inactive) tme in cpuinfo, just as Borislav Petkov suggested a
> few versions back.
> 
> https://lore.kernel.org/linux-efi/YXrnkxgdjWbcPlJA@zn.tnic/
> 
>> Then userspace can make a judgement call if the system supports crypto
>> memory (`crypto_capable`) and then also whether or not it's been turned
>> on (`crypto_active`).
>>
>> `crypto_active` could be detected with some existing support in the
>> kernel of `mem_encrypt_active()`.  This will then work for a variety of
>> architectures too that offer `mem_encrypt_active()`.
> 
> I need a hand with this, I grepped for mem_encrypt_active and nothing
> showed up...

The mem_encrypt_active() function has been replaced by 
cc_platform_has(CC_ATTR_MEM_ENCRYPT).

> 
>> As it stands today the only reliable way to tell from userspace (at
>> least for AMD's x86 implementation) is by grepping the system log for
>> the line "AMD Memory Encryption Features active".
> 
> Isn't enough to grep for sme/sev in cpuinfo?

No, it's not enough. Cpuinfo shows a processors capabilities and not 
necessarily whether that capability is being used.

Thanks,
Tom
Borislav Petkov Feb. 4, 2022, 4:28 p.m. UTC | #4
On Fri, Feb 04, 2022 at 10:23:22AM -0600, Limonciello, Mario wrote:
> > > > As there is interest in seeing these capabilities from userspace, it

This needs to be explained in a lot more detail: why, what is going to
use it, how, etc.

We don't do user-visible APIs just because.

> As Tom agreed in previous post, Boris is mistaken here.  I just double
> checked on my side on a workstation that supports SME and comparing
> /proc/cpuinfo before and after SME is enabled via mem_encrypt=on.  I
> confirmed that nothing changed.

Then we should clear that "sme" flag if memory encryption is not
enabled. Like we do for all other flags.
Borislav Petkov Feb. 4, 2022, 6 p.m. UTC | #5
On Fri, Feb 04, 2022 at 11:12:04AM -0600, Tom Lendacky wrote:
> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/process.c#L761

For those who won't open a browser just to see what he means :), that's
this snippet:

void stop_this_cpu(void *dummy):
	/*
	 * Use wbinvd on processors that support SME. This provides support
	 * for performing a successful kexec when going from SME inactive
	 * to SME active (or vice-versa). The cache must be cleared so that
	 * if there are entries with the same physical address, both with and
	 * without the encryption bit, they don't race each other when flushed
	 * and potentially end up with the wrong entry being committed to
	 * memory.
	 */
	if (boot_cpu_has(X86_FEATURE_SME))
		native_wbinvd();


Well, we do clear our *representation* of CPUID flags for other features
- see output of

$ git grep -E "(setup_)?clear_cpu_cap"

for examples. We do that for SME even: early_detect_mem_encrypt().

Which means, since this needs to be "processors that support SME", this
line should change to:

	/* ... test the CPUID bit directly because the machine might've cleared
	 * X86_FEATURE_SME due to cmdline options.
	 */
	if (cpuid_eax(0x8000001f) & BIT(0))
		native_wbinvd();

I'd say...
Borislav Petkov Feb. 4, 2022, 9:49 p.m. UTC | #6
On Fri, Feb 04, 2022 at 12:49:08PM -0600, Tom Lendacky wrote:
> Yep, and that should be safe. We would have to look at the generated code as
> there can't be any memory stores after the native_wbinvd() and before the
> native_halt().

I don't think anything else changes here besides the CPUID. Rest of the
asm is the same.
Kees Cook Feb. 7, 2022, 3:39 a.m. UTC | #7
On Fri, Feb 04, 2022 at 05:28:43PM +0100, Borislav Petkov wrote:
> Then we should clear that "sme" flag if memory encryption is not
> enabled. Like we do for all other flags.

Oh, this seems weird to me, as I'd expect it to show up since the CPU is
_capable_ of it, even if it's not in use. (Am I really using avx512vl,
e.g.?)

But as you point out later, it does work that way for a lot of things
and boot params. If this is the way things are supposed to be done,
it looks like we should wire up "nx" vs "noexec=off" boot param to do
the same (separate from this series), though it would need special
care since that bit needs very very early handling both and boot
and resume. Maybe kernel/cpu/common.c should check for _PAGE_NX in
__supported_pte_mask? (And would that break KVM's NX, etc?)

Hmmm.
Borislav Petkov Feb. 7, 2022, 10:02 a.m. UTC | #8
On Sun, Feb 06, 2022 at 07:39:46PM -0800, Kees Cook wrote:
> Oh, this seems weird to me, as I'd expect it to show up since the CPU is
> _capable_ of it, even if it's not in use. (Am I really using avx512vl,
> e.g.?)

We're trying to put feature flags in /proc/cpuinfo which mean that the
kernel supports the feature - not every CPUID bit out there. For that
there's tools/arch/x86/kcpuid/kcpuid.c

Otherwise /proc/cpuinfo becomes a dumping ground for feature flags and
there's no shortage of those.

> But as you point out later, it does work that way for a lot of things
> and boot params. If this is the way things are supposed to be done,
> it looks like we should wire up "nx" vs "noexec=off" boot param to do

See here:

https://lore.kernel.org/r/20220127115626.14179-1-bp@alien8.de
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-devices-node b/Documentation/ABI/testing/sysfs-devices-node
new file mode 100644
index 000000000000..0d1fd86c9faf
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-node
@@ -0,0 +1,10 @@ 
+What:		/sys/devices/system/node/nodeX/crypto_capable
+Date:		February 2022
+Contact:	Martin Fernandez <martin.fernandez@eclypsium.com>
+Users:		fwupd (https://fwupd.org)
+Description:
+		This value is 1 if all system memory in this node is
+		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the
+		system memory is capable of being protected with the
+		CPU’s memory cryptographic capabilities. It is 0
+		otherwise.
\ No newline at end of file
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 87acc47e8951..dabaed997ecd 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -560,11 +560,21 @@  static ssize_t node_read_distance(struct device *dev,
 }
 static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
 
+static ssize_t crypto_capable_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct pglist_data *pgdat = NODE_DATA(dev->id);
+
+	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
+}
+static DEVICE_ATTR_RO(crypto_capable);
+
 static struct attribute *node_dev_attrs[] = {
 	&dev_attr_meminfo.attr,
 	&dev_attr_numastat.attr,
 	&dev_attr_distance.attr,
 	&dev_attr_vmstat.attr,
+	&dev_attr_crypto_capable.attr,
 	NULL
 };