Message ID | 20241129111059.303905-1-freude@linux.ibm.com |
---|---|
Series | New s390 specific protected key hmac |
On Fri, Nov 29, 2024 at 12:10:58PM +0100, Harald Freudenberger wrote:
>
> +static inline int phmac_keyblob2pkey(const u8 *key, unsigned int keylen,
> +				     struct phmac_protkey *pk)
> +{
> +	int i, rc = -EIO;
> +
> +	/* try three times in case of busy card */
> +	for (i = 0; rc && i < 3; i++) {
> +		if (rc == -EBUSY && msleep_interruptible(1000))
> +			return -EINTR;

You can't sleep in an ahash algorithm either. What you can do
however is schedule a delayed work and pick up where you left
off. That's how asynchronous completion works.

But my question still stands, under what circumstances can
this fail? I don't think storage folks will be too happy with
a crypto algorithm that can produce random failures.

Cheers,

On 2024-11-29 15:48, Herbert Xu wrote:
> On Fri, Nov 29, 2024 at 12:10:58PM +0100, Harald Freudenberger wrote:
>>
>> +static inline int phmac_keyblob2pkey(const u8 *key, unsigned int keylen,
>> +				     struct phmac_protkey *pk)
>> +{
>> +	int i, rc = -EIO;
>> +
>> +	/* try three times in case of busy card */
>> +	for (i = 0; rc && i < 3; i++) {
>> +		if (rc == -EBUSY && msleep_interruptible(1000))
>> +			return -EINTR;
>
> You can't sleep in an ahash algorithm either. What you can do
> however is schedule a delayed work and pick up where you left
> off. That's how asynchronous completion works.
>
> But my question still stands, under what circumstances can
> this fail? I don't think storage folks will be too happy with
> a crypto algorithm that can produce random failures.
>
> Cheers,

- Whether the attempt to derive a protected key usable by the cpacf
  instructions can fail depends on the raw key material used. For
  'clear key' material the derivation is a simple instruction which
  can't fail. The preferred way, however, is to use 'secure key'
  material, which is transferred to a crypto card and re-wrapped there
  to be usable with cpacf instructions. This requires communication
  with a crypto card and thus may fail: because there is no card at
  all, because there is temporarily no card available, or because the
  card is in a bad state. If there is no usable card, the AP bus
  returns -EBUSY from the pkey_key2protkey() function and triggers an
  asynchronous bus scan. As long as this scan is running (usually about
  100ms or so), -EBUSY is returned to indicate that the caller should
  retry "later". Other states are covered by other return codes like
  ENODEV or EIO, and in those cases the caller is not supposed to loop
  but should fail. When there is no accessible hardware available to
  derive a protected key, either the user or the admin broke something
  or something went really wrong, and then there is no help: the
  storage device must fail.

- How can it happen that a re-derive is needed?
  A re-derive is triggered when the cpacf instruction detects that the
  protected key is not valid any more. A protected key includes a
  verification pattern (hash) of the firmware key used to encrypt the
  key. This hash is checked on each invocation of a cpacf instruction.
  So when the code execution "awakes" on another machine ("live guest
  migration" of a KVM guest to another machine), the next cpacf
  instruction will complain about a verification pattern mismatch and
  the protected key needs to be re-derived from the source material.
  It could also happen via suspend/resume on the very same machine when
  something happens in between (for example the whole machine runs a
  cold start). It does NOT happen suddenly and without reason, but the
  crypto algorithm implementation is not aware that a "live guest
  migration" or a "suspend/resume cycle" has just happened, so from its
  point of view it looks as if the mismatch occurred suddenly.

- Do I get you right, that a completion is ok? I always had the
  impression that waiting on a completion is also a sleeping act and
  thus not allowed?

Thanks for your help and being so patient with us.
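
[Editor's sketch, not part of the original mail.] The re-derive trigger and the split between the retryable -EBUSY and the fatal return codes described above can be modelled in a few lines of plain userspace C. This is only an illustrative model under stated assumptions: every `model_*` name, the 32-bit pattern and the `card_state` variable are hypothetical and are not taken from the patch or from the s390 pkey code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the firmware key's verification pattern. */
static uint32_t machine_wkvp = 0x1111;
/* Hypothetical card state: 0 = usable, otherwise the errno to report. */
static int card_state;
/* Counts how often the (modelled) derivation was invoked. */
static int derive_calls;

struct model_protkey {
	uint32_t wkvp;	/* pattern recorded when the key was derived */
};

/* Models the derivation step: it can fail when a card is needed. */
static int model_derive(struct model_protkey *pk)
{
	derive_calls++;
	if (card_state)
		return -card_state;
	pk->wkvp = machine_wkvp;
	return 0;
}

/* Models a cpacf operation: non-zero means "pattern mismatch". */
static int model_cpacf_op(const struct model_protkey *pk)
{
	return pk->wkvp == machine_wkvp ? 0 : 1;
}

/* Only -EBUSY means "retry later"; every other error is final. */
static int model_is_transient(int rc)
{
	return rc == -EBUSY;
}

/* One operation: on mismatch, re-derive once and try again. */
static int model_hmac_op(struct model_protkey *pk)
{
	int rc;

	assert(pk);
	if (model_cpacf_op(pk) == 0)
		return 0;
	rc = model_derive(pk);
	if (rc)
		return rc;	/* caller decides via model_is_transient() */
	return model_cpacf_op(pk) ? -EIO : 0;
}
```

Changing `machine_wkvp` between two calls of `model_hmac_op()` plays the role of a live guest migration: the operation itself does not know why the pattern changed, it only sees the mismatch and re-derives.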

On Mon, Dec 02, 2024 at 06:25:22PM +0100, Harald Freudenberger wrote:
>
> - Whether the attempt to derive a protected key usable by the cpacf
>   instructions can fail depends on the raw key material used. For
>   'clear key' material the derivation is a simple instruction which
>   can't fail. The preferred way, however, is to use 'secure key'
>   material, which is transferred to a crypto card and re-wrapped there
>   to be usable with cpacf instructions. This requires communication
>   with a crypto card and thus may fail: because there is no card at
>   all, because there is temporarily no card available, or because the
>   card is in a bad state. If there is no usable card, the AP bus
>   returns -EBUSY from the pkey_key2protkey() function and triggers an
>   asynchronous bus scan. As long as this scan is running (usually about
>   100ms or so), -EBUSY is returned to indicate that the caller should
>   retry "later". Other states are covered by other return codes like
>   ENODEV or EIO, and in those cases the caller is not supposed to loop
>   but should fail. When there is no accessible hardware available to
>   derive a protected key, either the user or the admin broke something
>   or something went really wrong, and then there is no help: the
>   storage device must fail.

Thanks for the explanation. I think it's fair enough to fail an
op if the hardware is absent or broken.

So all I need is for you to turn the BUSY case into a delayed
retry and I think that should be good enough.

> - Do I get you right, that a completion is ok? I always had the
>   impression that waiting on a completion is also a sleeping act and
>   thus not allowed?

No, what I mean is that if you get an EBUSY, you should return
-EINPROGRESS to indicate that the operation is pending, and then
schedule a delayed work to retry the operation. When the retry
fails or succeeds, it should invoke the callback with the correct
error status.

If the retry gets EBUSY again, then schedule another delayed work,
or fail permanently by invoking the callback if you hit some sort
of threshold like your existing limit of 3.

Cheers,
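
[Editor's sketch, not part of the original mail.] The control flow suggested above (return -EINPROGRESS on -EBUSY, retry from delayed work, report the final status through the callback, give up after a threshold) can be illustrated with a small userspace model in C. It is a sketch under stated assumptions, not kernel code: all `model_*` names are hypothetical, `model_schedule_retry()` only sets a flag where a driver would queue delayed work, and `model_run_workqueue()` stands in for the kernel running that work later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_TRIES 3	/* mirrors the existing limit of 3 */

struct model_req {
	int busy_left;		/* how many more attempts report -EBUSY */
	int final_rc;		/* result once the card is no longer busy */
	int tries;		/* attempts made so far */
	int work_pending;	/* 1 while a retry is "scheduled" */
	int done;		/* set by the completion callback */
	int status;		/* error status passed to the callback */
	void (*complete)(struct model_req *req, int err);
};

/* Models the key derivation, which may report a busy card. */
static int model_derive(struct model_req *req)
{
	if (req->busy_left > 0) {
		req->busy_left--;
		return -EBUSY;
	}
	return req->final_rc;
}

/* Example completion callback: records the final status. */
static void model_complete(struct model_req *req, int err)
{
	req->done = 1;
	req->status = err;
}

/* Stand-in for queueing delayed work; nothing sleeps here. */
static void model_schedule_retry(struct model_req *req)
{
	req->work_pending = 1;
}

/* Delayed-work handler: retry, reschedule, or complete. */
static void model_retry_work(struct model_req *req)
{
	int rc;

	req->work_pending = 0;
	rc = model_derive(req);
	if (rc == -EBUSY && ++req->tries < MODEL_MAX_TRIES) {
		model_schedule_retry(req);
		return;
	}
	req->complete(req, rc);	/* success, hard error, or too busy */
}

/* Submit path: never sleeps, returns -EINPROGRESS when deferred. */
static int model_submit(struct model_req *req)
{
	int rc;

	assert(req->complete);
	req->tries = 1;
	rc = model_derive(req);
	if (rc != -EBUSY)
		return rc;
	model_schedule_retry(req);
	return -EINPROGRESS;
}

/* Stand-in for the kernel executing the queued work later. */
static void model_run_workqueue(struct model_req *req)
{
	while (req->work_pending)
		model_retry_work(req);
}
```

In this model a request that is busy once is completed with status 0 by the retry work, a request that stays busy is completed with -EBUSY after three attempts in total, and a hard error such as -ENODEV is returned directly from the submit path without any retry.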