mbox series

[v6,0/2] New s390 specific protected key hmac

Message ID 20241129111059.303905-1-freude@linux.ibm.com
Headers show
Series New s390 specific protected key hmac | expand

Message

Harald Freudenberger Nov. 29, 2024, 11:10 a.m. UTC
Add support for protected key hmac ("phmac") for s390 arch.

With the latest machine generation there is now support for
protected key (that is a key wrapped by a master key stored
in firmware) hmac for sha2 (sha224, sha256, sha384 and sha512)
for the s390 specific CPACF instruction kmac.

This patch adds support via 4 new hashes registered as
phmac(sha224), phmac(sha256), phmac(sha384) and phmac(sha512).

Please note that as of now, there is no selftest enabled for
these shashes, but the implementation has been tested with
testcases via AF_ALG interface. However, there may come an
improvement soon to use the available clear key hmac selftests.

Changelog:
v1: Initial version
v2: Increase HASH_MAX_DESCSIZE generic (not just for arch s390).
    Fix one finding to use kmemdup instead of kmalloc/memcpy from test
    robot. Remove unneeded cpacf subfunctions checks. Simplify
    clone_tfm() function. Rebased to s390/features.
v3: Feedback from Herbert: Use GFP_ATOMIC in setkey function.
    Feedback from Holger: rework tfm clone function, move convert key
    invocation from setkey to init function. Rebased to updated
    s390/features from 11/7/2024. Ready for integration if there are
    no complains on v3.
v4: Rewind back more or less to v2. Add code to check for non-sleeping
    context. Non-sleeping context during attempt to derive the
    protected key from raw key material is not accepted and
    -EOPNOTSUPP is returned (also currently all derivation pathes
    would in fact never sleep). In general the phmac implementation is
    not to be used within non-sleeping context and the code header
    mentions this. Tested with (patched) dm-integrity - works fine.
v5: As suggested by Herbert now the shashes have been marked as
    'internal' and wrapped by ahashes which use the cryptd if an
    atomic context is detected. So the visible phmac algorithms are
    now ahashes. Unfortunately the dm-integrity implementation
    currently requests and deals only with shashes and this phmac
    implementation is not fitting to the original goal any more...
v6: As suggested by Herbert now a pure async phmac implementation.
    Tested via AF_ALG interface. Untested via dm-integrity as this layer
    only supports shashes. Maybe I'll develop a patch to switch the
    dm-integrity to ahash as it is anyway the more flexible interface.

Harald Freudenberger (1):
  s390/crypto: New s390 specific protected key hash phmac

Holger Dengler (1):
  s390/crypto: Add protected key hmac subfunctions for KMAC

 arch/s390/configs/debug_defconfig |   1 +
 arch/s390/configs/defconfig       |   1 +
 arch/s390/crypto/Makefile         |   1 +
 arch/s390/crypto/phmac_s390.c     | 474 ++++++++++++++++++++++++++++++
 arch/s390/include/asm/cpacf.h     |   4 +
 drivers/crypto/Kconfig            |  12 +
 6 files changed, 493 insertions(+)
 create mode 100644 arch/s390/crypto/phmac_s390.c


base-commit: 3f020399e4f1c690ce87b4c472f75b1fc89e07d5
--
2.43.0

Comments

Herbert Xu Nov. 29, 2024, 2:48 p.m. UTC | #1
On Fri, Nov 29, 2024 at 12:10:58PM +0100, Harald Freudenberger wrote:
>
> +static inline int phmac_keyblob2pkey(const u8 *key, unsigned int keylen,
> +				     struct phmac_protkey *pk)
> +{
> +	int i, rc = -EIO;
> +
> +	/* try three times in case of busy card */
> +	for (i = 0; rc && i < 3; i++) {
> +		if (rc == -EBUSY && msleep_interruptible(1000))
> +			return -EINTR;

You can't sleep in an ahash algorithm either.  What you can do
however is schedule a delayed work and pick up where you left
off.  That's how asynchronous completion works.

But my question still stands, under what circumstances can
this fail? I don't think storage folks will be too happy with
a crypto algorithm that can produce random failures.

Cheers,
Harald Freudenberger Dec. 2, 2024, 5:25 p.m. UTC | #2
On 2024-11-29 15:48, Herbert Xu wrote:
> On Fri, Nov 29, 2024 at 12:10:58PM +0100, Harald Freudenberger wrote:
>> 
>> +static inline int phmac_keyblob2pkey(const u8 *key, unsigned int 
>> keylen,
>> +				     struct phmac_protkey *pk)
>> +{
>> +	int i, rc = -EIO;
>> +
>> +	/* try three times in case of busy card */
>> +	for (i = 0; rc && i < 3; i++) {
>> +		if (rc == -EBUSY && msleep_interruptible(1000))
>> +			return -EINTR;
> 
> You can't sleep in an ahash algorithm either.  What you can do
> however is schedule a delayed work and pick up where you left
> off.  That's how asynchronous completion works.
> 
> But my question still stands, under what circumstances can
> this fail? I don't think storage folks will be too happy with
> a crypto algorithm that can produce random failures.
> 
> Cheers,

- The attempt to derive a protected key usable by the cpacf instructions
   depends of the raw key material used. For 'clear key' material the
   derivation process is a simple instruction which can't fail.
   A more preferred way however is to use 'secure key' material which
   is transferred to a crypto card and then re-wrapped to be usable
   with cpacf instructions. This requires communication with a crypto
   card and thus may fail - because there is no card at all or there
   is temporarily no card available or the card is in bad state. If there
   is no usable card the AP bus returns -EBUSY at the pkey_key2protkey()
   function and triggers an asynchronous bus scan. As long as this scan
   is running (usually about 100ms or so) the -EBUSY is returned to 
indicate
   that the caller should retry "later". Other states are covered with
   other return codes like ENODEV or EIO and the caller is not supposed
   to loop but should fail. When there is no accessible hardware 
available
   to derive a protected key either the user or the admin broke something
   or something went really the bad way and then there is no help but the
   storage device must fail.
- How can it happen that a re-derive is needed? A re-derive is triggered 
when
   the cpacf instruction detects that the protected key is not valid any 
more.
   A protected key includes a verification pattern (hash) of the firmware 
key
   used to encrypt the key. This hash is checked on each invocation of a
   cpacf instruction. So when the code execution "awakes" on another 
machine
   ("live guest migration" of an KVM guest to another machine) the next
   cpacf instruction will complain about verification pattern mismatch 
and
   the protected key needs to get re-derived from the source material.
   It could also happen via suspend/resume on the very same machine when
   there is something in between (for example the whole machine runs a
   cold-start). It does NOT happen out of the sudden without any reason,
   but the code affected is not aware of any "live guest migration" or
   "suspend/resume cycle" and thus as the crypto algorithm implementation 
has
   no awareness of a "live guest migration" just happened - it looks like
   this occurred suddenly.
- Do I get you right, that a completion is ok? I always had the 
impression
   that waiting on a completion is also a sleeping act and thus not 
allowed?

Thanks for your help and being so patient with us.
Herbert Xu Dec. 3, 2024, 8:50 a.m. UTC | #3
On Mon, Dec 02, 2024 at 06:25:22PM +0100, Harald Freudenberger wrote:
>
> - The attempt to derive a protected key usable by the cpacf instructions
>   depends of the raw key material used. For 'clear key' material the
>   derivation process is a simple instruction which can't fail.
>   A more preferred way however is to use 'secure key' material which
>   is transferred to a crypto card and then re-wrapped to be usable
>   with cpacf instructions. This requires communication with a crypto
>   card and thus may fail - because there is no card at all or there
>   is temporarily no card available or the card is in bad state. If there
>   is no usable card the AP bus returns -EBUSY at the pkey_key2protkey()
>   function and triggers an asynchronous bus scan. As long as this scan
>   is running (usually about 100ms or so) the -EBUSY is returned to indicate
>   that the caller should retry "later". Other states are covered with
>   other return codes like ENODEV or EIO and the caller is not supposed
>   to loop but should fail. When there is no accessible hardware available
>   to derive a protected key either the user or the admin broke something
>   or something went really the bad way and then there is no help but the
>   storage device must fail.

Thanks for the explanation.  I think it's fair enough to fail an
op if the hardware is absent or broken.

So all I need is for you to turn the BUSY case into a delayed retry
and I think that should be good enough.

> - Do I get you right, that a completion is ok? I always had the impression
>   that waiting on a completion is also a sleeping act and thus not allowed?

No, what I mean is that if you get an EBUSY, you should return
-EINPROGRESS to indicate that the operation is pending, and then
schedule a delayed work to retry the operation.  When the retry
fails or succeeds, it should invoke the callback with the correct
error status.

If the retry gets EBUSY again, then schedule another delayed
work, or fail permanently by invoking the callback if you hit
some sort of threshold like your existing limit of 3.

Cheers,