mbox series

[v3,0/9] SEV Kernel Selftests

Message ID 20240905124107.6954-1-pratikrajesh.sampat@amd.com
Headers show
Series SEV Kernel Selftests | expand

Message

Pratik R. Sampat Sept. 5, 2024, 12:40 p.m. UTC
This series primarily introduces SEV-SNP test for the kernel selftest
framework. It tests boot, ioctl, pre fault, and fallocate in various
combinations to exercise both positive and negative launch flow paths.

Patch 1 - Adds a wrapper for the ioctl calls that decouple ioctl and
asserts, which enables the use of negative test cases. No functional
change intended.
Patch 2 - Extend the sev smoke tests to use the SNP specific ioctl
calls and sets up memory to boot a SNP guest VM
Patch 3 - Adds SNP to shutdown testing
Patch 4, 5 - Tests the ioctl path for SEV, SEV-ES and SNP
Patch 6 - Adds support for SNP in KVM_SEV_INIT2 tests
Patch 7,8,9 - Enable Prefault tests for SEV, SEV-ES and SNP

The patchset is rebased on top of kvm-x86/next branch.

v3:
1. Remove the assignments for the prefault and fallocate test type
   enums.
2. Fix error message for sev launch measure and finish.
3. Collect tested-by tags [Peter, Srikanth]

v2:
https://lore.kernel.org/kvm/20240816192310.117456-1-pratikrajesh.sampat@amd.com/
1. Add SMT parsing check to populate SNP policy flags
2. Extend Peter Gonda's shutdown test to include SNP
3. Introduce new tests for prefault which include exercising prefault,
   fallocate, hole-punch in various combinations.
4. Decouple ioctl patch reworked to introduce private variants of the
   the functions that call into the ioctl. Also reordered the patch for
   it to arrive first so that new APIs are not written right after
   their introduction.
5. General cleanups - adding comments, avoiding local booleans, better
   error message. Suggestions incorporated from Peter, Tom, and Sean.

RFC:
https://lore.kernel.org/kvm/20240710220540.188239-1-pratikrajesh.sampat@amd.com/

Any feedback/review is highly appreciated!

Michael Roth (2):
  KVM: selftests: Add interface to manually flag protected/encrypted
    ranges
  KVM: selftests: Add a CoCo-specific test for KVM_PRE_FAULT_MEMORY

Pratik R. Sampat (7):
  KVM: selftests: Decouple SEV ioctls from asserts
  KVM: selftests: Add a basic SNP smoke test
  KVM: selftests: Add SNP to shutdown testing
  KVM: selftests: SEV IOCTL test
  KVM: selftests: SNP IOCTL test
  KVM: selftests: SEV-SNP test for KVM_SEV_INIT2
  KVM: selftests: Interleave fallocate for KVM_PRE_FAULT_MEMORY

 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/include/kvm_util.h  |  13 +
 .../selftests/kvm/include/x86_64/processor.h  |   1 +
 .../selftests/kvm/include/x86_64/sev.h        |  76 +++-
 tools/testing/selftests/kvm/lib/kvm_util.c    |  53 ++-
 .../selftests/kvm/lib/x86_64/processor.c      |   6 +-
 tools/testing/selftests/kvm/lib/x86_64/sev.c  | 190 +++++++-
 .../kvm/x86_64/coco_pre_fault_memory_test.c   | 421 ++++++++++++++++++
 .../selftests/kvm/x86_64/sev_init2_tests.c    |  13 +
 .../selftests/kvm/x86_64/sev_smoke_test.c     | 297 +++++++++++-
 10 files changed, 1023 insertions(+), 48 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/coco_pre_fault_memory_test.c

Comments

Sean Christopherson Oct. 14, 2024, 10:18 p.m. UTC | #1
On Thu, Sep 05, 2024, Pratik R. Sampat wrote:
> +static inline int __sev_launch_update_data(struct kvm_vm *vm, vm_paddr_t gpa,
> +					   uint64_t hva, uint64_t size)
>  {
>  	struct kvm_sev_launch_update_data update_data = {
> -		.uaddr = (unsigned long)addr_gpa2hva(vm, gpa),
> +		.uaddr = hva,
>  		.len = size,
>  	};
>  
> -	vm_sev_ioctl(vm, KVM_SEV_LAUNCH_UPDATE_DATA, &update_data);
> +	return __vm_sev_ioctl(vm, KVM_SEV_LAUNCH_UPDATE_DATA, &update_data);
> +}
> +
> +static inline void sev_launch_update_data(struct kvm_vm *vm, vm_paddr_t gpa,
> +					  uint64_t hva, uint64_t size)
> +{
> +	int ret = __sev_launch_update_data(vm, gpa, hva, size);
> +
> +	TEST_ASSERT_VM_VCPU_IOCTL(!ret, KVM_SEV_LAUNCH_UPDATE_DATA, ret, vm);
>  }
>  
>  #endif /* SELFTEST_KVM_SEV_H */
> diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c
> index e9535ee20b7f..125a72246e09 100644
> --- a/tools/testing/selftests/kvm/lib/x86_64/sev.c
> +++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c
> @@ -14,15 +14,16 @@
>   * and find the first range, but that's correct because the condition
>   * expression would cause us to quit the loop.
>   */
> -static void encrypt_region(struct kvm_vm *vm, struct userspace_mem_region *region)
> +static int encrypt_region(struct kvm_vm *vm, struct userspace_mem_region *region)

This is all kinds of wrong.  encrypt_region() should never fail.  And by allowing
it to fail, any unexpected failure becomes harder to debug.  It's also a lie,
because sev_register_encrypted_memory() isn't allowed to fail, and I would bet
that most readers would expect _that_ call to fail given the name.

The granularity is also poor, and the complete lack of idempotency is going to
be problematic.  E.g. only the first region is actually tested, and if someone
tries to do negative testing on multiple regions, sev_register_encrypted_memory()
will fail due to trying to re-encrypt a region.

__sev_vm_launch_update() has similar issues.  encrypt_region() is allowed to
fail, but its call to KVM_SEV_LAUNCH_UPDATE_VMSA is not.

And peeking ahead, passing an @assert parameter to __test_snp_launch_start() (or
any helper) is a non-starter.  Readers should not have to dive into a helper's
implementation to understand that this

  __test_snp_launch_start(type, policy, 0, true);

is a happy path and this

  ret = __test_snp_launch_start(type, policy, BIT(i), false);

is a sad path.

And re-creating the VM every time is absurdly wasteful.  While performance isn't
a priority for selftests, there's no reason to make everything as slow as possible.

Even just passing the page type to encrypt_region() is confusing.  When the test
is actually going to run the guest, applying ZERO and CPUID types to _all_ pages
is completely nonsensical.

In general, I think trying to reuse the happy path's infrastructure is going to
do more harm than good.  This is what I was trying to get at in my feedback for
the previous version.

For negative tests, I would honestly say development them "from scratch", i.e.
deliberately don't reuse the existing SEV-MEM/ES infrastructure.  It'll require
more copy+paste to get rolling, but I suspect that the end result will be less
churn and far easier to read.
Sean Christopherson Oct. 14, 2024, 10:23 p.m. UTC | #2
On Thu, Sep 05, 2024, Pratik R. Sampat wrote:
> This series primarily introduces SEV-SNP test for the kernel selftest
> framework. It tests boot, ioctl, pre fault, and fallocate in various
> combinations to exercise both positive and negative launch flow paths.
> 
> Patch 1 - Adds a wrapper for the ioctl calls that decouple ioctl and
> asserts, which enables the use of negative test cases. No functional
> change intended.
> Patch 2 - Extend the sev smoke tests to use the SNP specific ioctl
> calls and sets up memory to boot a SNP guest VM
> Patch 3 - Adds SNP to shutdown testing
> Patch 4, 5 - Tests the ioctl path for SEV, SEV-ES and SNP
> Patch 6 - Adds support for SNP in KVM_SEV_INIT2 tests
> Patch 7,8,9 - Enable Prefault tests for SEV, SEV-ES and SNP

There are three separate series here:

 1. Smoke test support for SNP
 2. Negative tests for SEV+
 3. Prefault tests for SEV+

#3 likely has a dependency on #1, and probably on #2 as well (for style if nothing
else).  But that's really just an argument for focuing on #1 first, and the moving
onto the others once that's ready to go.
Sean Christopherson Oct. 14, 2024, 10:58 p.m. UTC | #3
On Thu, Sep 05, 2024, Pratik R. Sampat wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> For SEV and SNP, currently __vm_phy_pages_alloc() handles setting the
> region->protected_phy_pages bitmap to mark that the region needs to be
> encrypted/measured into the initial guest state prior to

Nothing needs to be measured, no?  (because there's no attestation)

> finalizing/starting the guest. It also marks what GPAs need to be mapped
> as encrypted in the initial guest page table.

...

>  static inline void vm_mem_set_private(struct kvm_vm *vm, uint64_t gpa,
>  				      uint64_t size)
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index bbf90ad224da..d44a37aebcec 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -1991,6 +1991,43 @@ const char *exit_reason_str(unsigned int exit_reason)
>  	return "Unknown";
>  }
>  
> +/*
> + * Set what guest GFNs need to be encrypted prior to finalizing a CoCo VM.
> + *
> + * Input Args:
> + *   vm - Virtual Machine
> + *   memslot - Memory region to allocate page from
> + *   paddr - Start of physical address to mark as encrypted
> + *   num - number of pages
> + *
> + * Output Args: None
> + *
> + * Return: None
> + *
> + * Generally __vm_phy_pages_alloc() will handle this automatically, but
> + * for cases where the test handles managing the physical allocation and
> + * mapping directly this interface should be used to mark physical pages
> + * that are intended to be encrypted as part of the initial guest state.
> + * This will also affect whether virt_map()/virt_pg_map() will map the
> + * page as encrypted or not in the initial guest page table.
> + *
> + * If the initial guest state has already been finalized, then setting
> + * it as encrypted will essentially be a noop since nothing more can be
> + * encrypted into the initial guest state at that point.
> + */
> +void vm_mem_set_protected(struct kvm_vm *vm, uint32_t memslot,
> +			  vm_paddr_t paddr, size_t num)
> +{
> +	struct userspace_mem_region *region;
> +	sparsebit_idx_t pg, base;
> +
> +	base = paddr >> vm->page_shift;
> +	region = memslot2region(vm, memslot);

Please no, doing a memslot lookup in a helper like this is only going to encourage
proliferation of bad code.  vm_mem_add() really should be able to mark the region
as protected.

E.g. practically speaking, the only code that will be able to use this helper is
code that is marking the entire memslot as protection.  And ability to _clear_
the protected_phy_pages bit is conspicuously missing.

> +
> +	for (pg = base; pg < base + num; ++pg)
> +		sparsebit_set(region->protected_phy_pages, pg);
> +}
Pratik R. Sampat Oct. 21, 2024, 8:23 p.m. UTC | #4
Hi Sean,

On 10/14/2024 5:18 PM, Sean Christopherson wrote:
> On Thu, Sep 05, 2024, Pratik R. Sampat wrote:
>> +static inline int __sev_launch_update_data(struct kvm_vm *vm, vm_paddr_t gpa,
>> +					   uint64_t hva, uint64_t size)
>>  {
>>  	struct kvm_sev_launch_update_data update_data = {
>> -		.uaddr = (unsigned long)addr_gpa2hva(vm, gpa),
>> +		.uaddr = hva,
>>  		.len = size,
>>  	};
>>  
>> -	vm_sev_ioctl(vm, KVM_SEV_LAUNCH_UPDATE_DATA, &update_data);
>> +	return __vm_sev_ioctl(vm, KVM_SEV_LAUNCH_UPDATE_DATA, &update_data);
>> +}
>> +
>> +static inline void sev_launch_update_data(struct kvm_vm *vm, vm_paddr_t gpa,
>> +					  uint64_t hva, uint64_t size)
>> +{
>> +	int ret = __sev_launch_update_data(vm, gpa, hva, size);
>> +
>> +	TEST_ASSERT_VM_VCPU_IOCTL(!ret, KVM_SEV_LAUNCH_UPDATE_DATA, ret, vm);
>>  }
>>  
>>  #endif /* SELFTEST_KVM_SEV_H */
>> diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c
>> index e9535ee20b7f..125a72246e09 100644
>> --- a/tools/testing/selftests/kvm/lib/x86_64/sev.c
>> +++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c
>> @@ -14,15 +14,16 @@
>>   * and find the first range, but that's correct because the condition
>>   * expression would cause us to quit the loop.
>>   */
>> -static void encrypt_region(struct kvm_vm *vm, struct userspace_mem_region *region)
>> +static int encrypt_region(struct kvm_vm *vm, struct userspace_mem_region *region)
> 
> This is all kinds of wrong.  encrypt_region() should never fail.  And by allowing
> it to fail, any unexpected failure becomes harder to debug.  It's also a lie,
> because sev_register_encrypted_memory() isn't allowed to fail, and I would bet
> that most readers would expect _that_ call to fail given the name.
> 
> The granularity is also poor, and the complete lack of idempotency is going to
> be problematic.  E.g. only the first region is actually tested, and if someone
> tries to do negative testing on multiple regions, sev_register_encrypted_memory()
> will fail due to trying to re-encrypt a region.
> 
> __sev_vm_launch_update() has similar issues.  encrypt_region() is allowed to
> fail, but its call to KVM_SEV_LAUNCH_UPDATE_VMSA is not.
> 
> And peeking ahead, passing an @assert parameter to __test_snp_launch_start() (or
> any helper) is a non-starter.  Readers should not have to dive into a helper's
> implementation to understand that this
> 
>   __test_snp_launch_start(type, policy, 0, true);
> 
> is a happy path and this
> 
>   ret = __test_snp_launch_start(type, policy, BIT(i), false);
> 
> is a sad path.
> 
> And re-creating the VM every time is absurdly wasteful.  While performance isn't
> a priority for selftests, there's no reason to make everything as slow as possible.
> 
> Even just passing the page type to encrypt_region() is confusing.  When the test
> is actually going to run the guest, applying ZERO and CPUID types to _all_ pages
> is completely nonsensical.
> 
> In general, I think trying to reuse the happy path's infrastructure is going to
> do more harm than good.  This is what I was trying to get at in my feedback for
> the previous version.
> 
> For negative tests, I would honestly say development them "from scratch", i.e.
> deliberately don't reuse the existing SEV-MEM/ES infrastructure.  It'll require
> more copy+paste to get rolling, but I suspect that the end result will be less
> churn and far easier to read.

This makes sense. I was trying to be as minimal as possible without a
lot of replication while trying to introduce the negative tests. I see
that this has created several issues of granularity, even general
correctness and overall has created more problems than it solves.

I will try to develop the negative interface separately tailored for
this specific use-case rather than piggybacking on the happy path when I
send out the patchset #2.
Pratik R. Sampat Oct. 21, 2024, 8:23 p.m. UTC | #5
Hi Sean,

On 10/14/2024 5:58 PM, Sean Christopherson wrote:
> On Thu, Sep 05, 2024, Pratik R. Sampat wrote:
>> From: Michael Roth <michael.roth@amd.com>
>>
>> For SEV and SNP, currently __vm_phy_pages_alloc() handles setting the
>> region->protected_phy_pages bitmap to mark that the region needs to be
>> encrypted/measured into the initial guest state prior to
> 
> Nothing needs to be measured, no?  (because there's no attestation)
> 

Right.

>> finalizing/starting the guest. It also marks what GPAs need to be mapped
>> as encrypted in the initial guest page table.
> 
> ...
> 
>>  static inline void vm_mem_set_private(struct kvm_vm *vm, uint64_t gpa,
>>  				      uint64_t size)
>> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
>> index bbf90ad224da..d44a37aebcec 100644
>> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
>> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
>> @@ -1991,6 +1991,43 @@ const char *exit_reason_str(unsigned int exit_reason)
>>  	return "Unknown";
>>  }
>>  
>> +/*
>> + * Set what guest GFNs need to be encrypted prior to finalizing a CoCo VM.
>> + *
>> + * Input Args:
>> + *   vm - Virtual Machine
>> + *   memslot - Memory region to allocate page from
>> + *   paddr - Start of physical address to mark as encrypted
>> + *   num - number of pages
>> + *
>> + * Output Args: None
>> + *
>> + * Return: None
>> + *
>> + * Generally __vm_phy_pages_alloc() will handle this automatically, but
>> + * for cases where the test handles managing the physical allocation and
>> + * mapping directly this interface should be used to mark physical pages
>> + * that are intended to be encrypted as part of the initial guest state.
>> + * This will also affect whether virt_map()/virt_pg_map() will map the
>> + * page as encrypted or not in the initial guest page table.
>> + *
>> + * If the initial guest state has already been finalized, then setting
>> + * it as encrypted will essentially be a noop since nothing more can be
>> + * encrypted into the initial guest state at that point.
>> + */
>> +void vm_mem_set_protected(struct kvm_vm *vm, uint32_t memslot,
>> +			  vm_paddr_t paddr, size_t num)
>> +{
>> +	struct userspace_mem_region *region;
>> +	sparsebit_idx_t pg, base;
>> +
>> +	base = paddr >> vm->page_shift;
>> +	region = memslot2region(vm, memslot);
> 
> Please no, doing a memslot lookup in a helper like this is only going to encourage
> proliferation of bad code.  vm_mem_add() really should be able to mark the region
> as protected.
> 
> E.g. practically speaking, the only code that will be able to use this helper is
> code that is marking the entire memslot as protection.  And ability to _clear_
> the protected_phy_pages bit is conspicuously missing.
>
Pratik R. Sampat Oct. 21, 2024, 8:23 p.m. UTC | #6
Hi Sean,

On 10/14/2024 5:23 PM, Sean Christopherson wrote:
> On Thu, Sep 05, 2024, Pratik R. Sampat wrote:
>> This series primarily introduces SEV-SNP test for the kernel selftest
>> framework. It tests boot, ioctl, pre fault, and fallocate in various
>> combinations to exercise both positive and negative launch flow paths.
>>
>> Patch 1 - Adds a wrapper for the ioctl calls that decouple ioctl and
>> asserts, which enables the use of negative test cases. No functional
>> change intended.
>> Patch 2 - Extend the sev smoke tests to use the SNP specific ioctl
>> calls and sets up memory to boot a SNP guest VM
>> Patch 3 - Adds SNP to shutdown testing
>> Patch 4, 5 - Tests the ioctl path for SEV, SEV-ES and SNP
>> Patch 6 - Adds support for SNP in KVM_SEV_INIT2 tests
>> Patch 7,8,9 - Enable Prefault tests for SEV, SEV-ES and SNP
> 
> There are three separate series here:
> 
>  1. Smoke test support for SNP
>  2. Negative tests for SEV+
>  3. Prefault tests for SEV+
> 
> #3 likely has a dependency on #1, and probably on #2 as well (for style if nothing
> else).  But that's really just an argument for focuing on #1 first, and the moving
> onto the others once that's ready to go.

Based on your feedback on the rest of this patchset, this makes sense to
me. I will first prep for the changes for patchset #1 and once we lock
that down I can introduce patchset #2 and #3 based on that design.

Thank you again for your feedback!
Pratik