Message ID | 20190930121520.1388317-2-arnd@arndb.de |
---|---|
State | Superseded |
Headers | show |
Series | None | expand |
On Mon, Sep 30, 2019 at 9:04 PM Pascal Van Leeuwen <pvanleeuwen@verimatrix.com> wrote: > > Alternatively, it should be possible to shrink these allocations > > as the extra buffers appear to be largely unnecessary, but doing > > this would be a much more invasive change. > > > Actually, for HMAC-SHA512 you DO need all that buffer space. > You could shrink it to 2 * ctx->state_sz but then your simple indexing > is no longer going to fly. Not sure if that would be worth the effort. Stack allocations can no longer be dynamically sized in the kernel, so that would not work. What I noticed though is that the largest part of safexcel_ahash_export_state is used in the 'cache' member, and this is apparently only referenced inside of safexcel_hmac_init_iv, which you call twice. If that cache can be allocated only once, you save SHA512_BLOCK_SIZE bytes in one of the two paths. > I don't like the part where you dynamically allocate the cryto_aes_ctx > though, I think that was not necessary considering its a lot smaller. I count 484 bytes for it, which is really large. > And it conflicts with another change I have waiting that gets rid of > aes_expandkey and that struct alltogether (since it was really just > abused to do a key size check, which was very wasteful since the > function actually generates all roundkeys we don't need at all ...) Right, this is what I noticed there. With 480 of the 484 bytes gone, you are well below the warning limit even without the other change. Arnd
> -----Original Message----- > From: Arnd Bergmann <arnd@arndb.de> > Sent: Monday, September 30, 2019 10:12 PM > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com> > Cc: Antoine Tenart <antoine.tenart@bootlin.com>; Herbert Xu <herbert@gondor.apana.org.au>; > David S. Miller <davem@davemloft.net>; Pascal van Leeuwen <pascalvanl@gmail.com>; Ard > Biesheuvel <ard.biesheuvel@linaro.org>; Eric Biggers <ebiggers@google.com>; linux- > crypto@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 2/3] crypto: inside-secure - Reduce stack usage > > On Mon, Sep 30, 2019 at 9:04 PM Pascal Van Leeuwen > <pvanleeuwen@verimatrix.com> wrote: > > > > Alternatively, it should be possible to shrink these allocations > > > as the extra buffers appear to be largely unnecessary, but doing > > > this would be a much more invasive change. > > > > > Actually, for HMAC-SHA512 you DO need all that buffer space. > > You could shrink it to 2 * ctx->state_sz but then your simple indexing > > is no longer going to fly. Not sure if that would be worth the effort. > > Stack allocations can no longer be dynamically sized in the kernel, > so that would not work. > I was actually referring to your kzalloc, not to the original stack based implementation ... > What I noticed though is that the largest part of safexcel_ahash_export_state > is used in the 'cache' member, and this is apparently only referenced inside of > safexcel_hmac_init_iv, which you call twice. If that cache can be allocated > only once, you save SHA512_BLOCK_SIZE bytes in one of the two paths. > Well ... hmmm ... "my"... This is not originally "my" code. And until you brought this up, I did not fully realise it was using this export_state struct to store those digests. Seems to have something to do with directly taking the crypto_ahash_export state output, which is much more than that, in case you need to continue the hash (which you don't here). I guess you need to "catch" that output somewhere, so probably you could save a bit of memory but I doubt it would be a significant improvement. > > I don't like the part where you dynamically allocate the cryto_aes_ctx > > though, I think that was not necessary considering its a lot smaller. > > I count 484 bytes for it, which is really large. > Maybe I should've counted myself ... totally unexpectedly huge. Why?? Anyway, glad I got rid of it already then. > > And it conflicts with another change I have waiting that gets rid of > > aes_expandkey and that struct alltogether (since it was really just > > abused to do a key size check, which was very wasteful since the > > function actually generates all roundkeys we don't need at all ...) > > Right, this is what I noticed there. With 480 of the 484 bytes gone, > you are well below the warning limit even without the other change. > And by "other change" you mean the safexcel_ahash_export_state? Ok, good to known, although I do like to improve that one as well, but preferably by not exporting the cache so I don't need the full struct. Regards, Pascal van Leeuwen Silicon IP Architect, Multi-Protocol Engines @ Verimatrix www.insidesecure.com
On Mon, Sep 30, 2019 at 11:09 PM Pascal Van Leeuwen <pvanleeuwen@verimatrix.com> wrote: > > Subject: Re: [PATCH 2/3] crypto: inside-secure - Reduce stack usage > > > > On Mon, Sep 30, 2019 at 9:04 PM Pascal Van Leeuwen > > <pvanleeuwen@verimatrix.com> wrote: > > > > > > Alternatively, it should be possible to shrink these allocations > > > > as the extra buffers appear to be largely unnecessary, but doing > > > > this would be a much more invasive change. > > > > > > > Actually, for HMAC-SHA512 you DO need all that buffer space. > > > You could shrink it to 2 * ctx->state_sz but then your simple indexing > > > is no longer going to fly. Not sure if that would be worth the effort. > > > > Stack allocations can no longer be dynamically sized in the kernel, > > so that would not work. > > > I was actually referring to your kzalloc, not to the original stack > based implementation ... Ok, got it. For the kzalloc version, the size matters much less, as this is not coming from a scarce resource and only takes a few more cycles to do the initial clearing of the larger struct. > > > And it conflicts with another change I have waiting that gets rid of > > > aes_expandkey and that struct alltogether (since it was really just > > > abused to do a key size check, which was very wasteful since the > > > function actually generates all roundkeys we don't need at all ...) > > > > Right, this is what I noticed there. With 480 of the 484 bytes gone, > > you are well below the warning limit even without the other change. > > > And by "other change" you mean the safexcel_ahash_export_state? Yes. > Ok, good to known, although I do like to improve that one as well, > but preferably by not exporting the cache so I don't need the full > struct. Sounds good to me. Arnd
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c index ef51f8c2b473..51a4112aa9bc 100644 --- a/drivers/crypto/inside-secure/safexcel_cipher.c +++ b/drivers/crypto/inside-secure/safexcel_cipher.c @@ -305,10 +305,10 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key, { struct crypto_tfm *tfm = crypto_aead_tfm(ctfm); struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm); - struct safexcel_ahash_export_state istate, ostate; + struct safexcel_ahash_export_state *state; struct safexcel_crypto_priv *priv = ctx->priv; + struct crypto_aes_ctx *aes; struct crypto_authenc_keys keys; - struct crypto_aes_ctx aes; int err = -EINVAL; if (crypto_authenc_extractkeys(&keys, key, len) != 0) @@ -334,7 +334,14 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key, goto badkey_expflags; break; case SAFEXCEL_AES: - err = aes_expandkey(&aes, keys.enckey, keys.enckeylen); + aes = kzalloc(sizeof(*aes), GFP_KERNEL); + if (!aes) { + err = -ENOMEM; + goto badkey; + } + + err = aes_expandkey(aes, keys.enckey, keys.enckeylen); + kfree(aes); if (unlikely(err)) goto badkey; break; @@ -347,56 +354,66 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key, memcmp(ctx->key, keys.enckey, keys.enckeylen)) ctx->base.needs_inv = true; + state = kzalloc(sizeof(struct safexcel_ahash_export_state) * 2, GFP_KERNEL); + if (!state) { + err = -ENOMEM; + goto badkey; + } + /* Auth key */ switch (ctx->hash_alg) { case CONTEXT_CONTROL_CRYPTO_ALG_SHA1: if (safexcel_hmac_setkey("safexcel-sha1", keys.authkey, - keys.authkeylen, &istate, &ostate)) - goto badkey; + keys.authkeylen, &state[0], &state[1])) + goto badkey_free; break; case CONTEXT_CONTROL_CRYPTO_ALG_SHA224: if (safexcel_hmac_setkey("safexcel-sha224", keys.authkey, - keys.authkeylen, &istate, &ostate)) - goto badkey; + keys.authkeylen, &state[0], &state[1])) + goto badkey_free; break; case CONTEXT_CONTROL_CRYPTO_ALG_SHA256: if (safexcel_hmac_setkey("safexcel-sha256", keys.authkey, - keys.authkeylen, &istate, &ostate)) - goto badkey; + keys.authkeylen, &state[0], &state[1])) + goto badkey_free; break; case CONTEXT_CONTROL_CRYPTO_ALG_SHA384: if (safexcel_hmac_setkey("safexcel-sha384", keys.authkey, - keys.authkeylen, &istate, &ostate)) - goto badkey; + keys.authkeylen, &state[0], &state[1])) + goto badkey_free; break; case CONTEXT_CONTROL_CRYPTO_ALG_SHA512: if (safexcel_hmac_setkey("safexcel-sha512", keys.authkey, - keys.authkeylen, &istate, &ostate)) - goto badkey; + keys.authkeylen, &state[0], &state[1])) + goto badkey_free; break; default: dev_err(priv->dev, "aead: unsupported hash algorithm\n"); - goto badkey; + goto badkey_free; } crypto_aead_set_flags(ctfm, crypto_aead_get_flags(ctfm) & CRYPTO_TFM_RES_MASK); if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma && - (memcmp(ctx->ipad, istate.state, ctx->state_sz) || - memcmp(ctx->opad, ostate.state, ctx->state_sz))) + (memcmp(ctx->ipad, &state[0].state, ctx->state_sz) || + memcmp(ctx->opad, &state[1].state, ctx->state_sz))) ctx->base.needs_inv = true; /* Now copy the keys into the context */ memcpy(ctx->key, keys.enckey, keys.enckeylen); ctx->key_len = keys.enckeylen; - memcpy(ctx->ipad, &istate.state, ctx->state_sz); - memcpy(ctx->opad, &ostate.state, ctx->state_sz); + memcpy(ctx->ipad, &state[0].state, ctx->state_sz); + memcpy(ctx->opad, &state[1].state, ctx->state_sz); memzero_explicit(&keys, sizeof(keys)); + kfree(state); + return 0; +badkey_free: + kfree(state); badkey: crypto_aead_set_flags(ctfm, CRYPTO_TFM_RES_BAD_KEY_LEN); badkey_expflags:
safexcel_aead_setkey() contains three large stack variables, totalling slightly more than the 1024 byte warning limit: drivers/crypto/inside-secure/safexcel_cipher.c:303:12: error: stack frame size of 1032 bytes in function 'safexcel_aead_setkey' [-Werror,-Wframe-larger-than=] The function already contains a couple of dynamic allocations, so it is likely not performance critical and it can only be called in a context that allows sleeping, so the easiest workaround is to add change it to use dynamic allocations. Combining istate and ostate into a single variable simplifies the allocation at the cost of making it slightly less readable. Alternatively, it should be possible to shrink these allocations as the extra buffers appear to be largely unnecessary, but doing this would be a much more invasive change. Fixes: 0e17e3621a28 ("crypto: inside-secure - add support for authenc(hmac(sha*),rfc3686(ctr(aes))) suites") Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- .../crypto/inside-secure/safexcel_cipher.c | 53 ++++++++++++------- 1 file changed, 35 insertions(+), 18 deletions(-) -- 2.20.0