Message ID | 20221017222620.715153-1-nhuck@google.com |
---|---|
State | New |
Series | crypto: x86/polyval - Fix crashes when keys are not 16-byte aligned |
On Mon, Oct 17, 2022 at 03:26:20PM -0700, Nathan Huckleberry wrote:
> The key_powers array is not guaranteed to be 16-byte aligned, so using
> movaps to operate on key_powers is not allowed.
>
> Switch movaps to movups.
>
> Fixes: 34f7f6c30112 ("crypto: x86/polyval - Add PCLMULQDQ accelerated implementation of POLYVAL")
> Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
> Signed-off-by: Nathan Huckleberry <nhuck@google.com>
> ---
>  arch/x86/crypto/polyval-clmulni_asm.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/crypto/polyval-clmulni_asm.S b/arch/x86/crypto/polyval-clmulni_asm.S
> index a6ebe4e7dd2b..32b98cb53ddf 100644
> --- a/arch/x86/crypto/polyval-clmulni_asm.S
> +++ b/arch/x86/crypto/polyval-clmulni_asm.S
> @@ -234,7 +234,7 @@
>
>  	movups (MSG), %xmm0
>  	pxor SUM, %xmm0
> -	movaps (KEY_POWERS), %xmm1
> +	movups (KEY_POWERS), %xmm1
>  	schoolbook1_noload
>  	dec BLOCKS_LEFT
>  	addq $16, MSG

I thought that crypto_tfm::__crt_ctx is guaranteed to be 16-byte aligned,
and that the x86 AES code relies on that property.

But now I see that actually the x86 AES code manually aligns the context.
See aes_ctx() in arch/x86/crypto/aesni-intel_glue.c.

Did you consider doing the same for polyval?

If you do prefer this way, it would be helpful to leave a comment for
schoolbook1_iteration that mentions that the unaligned access support of
vpclmulqdq is being relied on, i.e. pclmulqdq wouldn't work.

- Eric
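For reference, the aes_ctx() helper Eric points to rounds the raw context
pointer up to the required alignment before use. A minimal sketch of that
pattern, simplified from arch/x86/crypto/aesni-intel_glue.c (the helper may
differ in detail across kernel versions):

/*
 * Manual-alignment pattern from the x86 AES-NI glue code.  The raw
 * crypto_tfm context is only guaranteed to be aligned to
 * crypto_tfm_ctx_alignment(), so the pointer is rounded up to
 * AESNI_ALIGN (16 bytes); the context allocation must reserve the
 * extra slack for the rounding to stay in bounds.
 */
#define AESNI_ALIGN	16

static inline struct crypto_aes_ctx *aes_ctx(void *raw_ctx)
{
	unsigned long addr = (unsigned long)raw_ctx;
	unsigned long align = AESNI_ALIGN;

	/* Skip rounding if the kernel already guarantees enough alignment. */
	if (align <= crypto_tfm_ctx_alignment())
		align = 1;
	return (struct crypto_aes_ctx *)ALIGN(addr, align);
}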
On Mon, Oct 17, 2022 at 4:02 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Mon, Oct 17, 2022 at 03:26:20PM -0700, Nathan Huckleberry wrote:
> > The key_powers array is not guaranteed to be 16-byte aligned, so using
> > movaps to operate on key_powers is not allowed.
> >
> > Switch movaps to movups.
> >
> > [...]
>
> I thought that crypto_tfm::__crt_ctx is guaranteed to be 16-byte aligned,
> and that the x86 AES code relies on that property.
>
> But now I see that actually the x86 AES code manually aligns the context.
> See aes_ctx() in arch/x86/crypto/aesni-intel_glue.c.
>
> Did you consider doing the same for polyval?

I'll submit a v2 aligning the tfm_ctx. I think that makes more sense
than working on unaligned keys.

Is there a need to do the same changes on arm64? The keys are also
unaligned there.

> If you do prefer this way, it would be helpful to leave a comment for
> schoolbook1_iteration that mentions that the unaligned access support of
> vpclmulqdq is being relied on, i.e. pclmulqdq wouldn't work.
>
> - Eric

Thanks,
Huck
On Mon, Oct 17, 2022 at 04:38:25PM -0700, Nathan Huckleberry wrote:
> On Mon, Oct 17, 2022 at 4:02 PM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Mon, Oct 17, 2022 at 03:26:20PM -0700, Nathan Huckleberry wrote:
> > > The key_powers array is not guaranteed to be 16-byte aligned, so using
> > > movaps to operate on key_powers is not allowed.
> > >
> > > [...]
> >
> > I thought that crypto_tfm::__crt_ctx is guaranteed to be 16-byte aligned,
> > and that the x86 AES code relies on that property.
> >
> > But now I see that actually the x86 AES code manually aligns the context.
> > See aes_ctx() in arch/x86/crypto/aesni-intel_glue.c.
> >
> > Did you consider doing the same for polyval?
>
> I'll submit a v2 aligning the tfm_ctx. I think that makes more sense
> than working on unaligned keys.
>
> Is there a need to do the same changes on arm64? The keys are also
> unaligned there.
>

arm64 defines ARCH_DMA_MINALIGN to 128, so I don't think the same issue
applies there. Also the instructions used don't assume aligned addresses.

- Eric
On Mon, Oct 17, 2022 at 05:12:16PM -0700, Eric Biggers wrote:
>
> arm64 defines ARCH_DMA_MINALIGN to 128, so I don't think the same issue
> applies there. Also the instructions used don't assume aligned addresses.

Note that arm64 may lower ARCH_KMALLOC_MINALIGN to 8 soon.

Cheers,
On Tue, Oct 18, 2022 at 02:56:23PM -0700, Nathan Huckleberry wrote:
> -static void internal_polyval_update(const struct polyval_tfm_ctx *keys,
> +static inline struct polyval_tfm_ctx *polyval_tfm_ctx(const void *raw_ctx)
> +{
> +	unsigned long addr = (unsigned long)raw_ctx;
> +	unsigned long align = POLYVAL_ALIGN;
> +
> +	if (align <= crypto_tfm_ctx_alignment())
> +		align = 1;
> +	return (struct polyval_tfm_ctx *)ALIGN(addr, align);
> +}

This could just use PTR_ALIGN.  Also, checking for POLYVAL_ALIGN <=
crypto_tfm_ctx_alignment() isn't necessary.

> +
> +static void internal_polyval_update(const void *raw_keys,
>  	const u8 *in, size_t nblocks, u8 *accumulator)
>  {
> +	const struct polyval_tfm_ctx *keys = polyval_tfm_ctx(raw_keys);

This is being passed a struct polyval_tfm_ctx.  There's no need to cast it
back to a void pointer and align it again redundantly.

>  	if (likely(crypto_simd_usable())) {
>  		kernel_fpu_begin();
>  		clmul_polyval_update(keys, in, nblocks, accumulator);
> @@ -102,7 +117,8 @@ static int polyval_x86_update(struct shash_desc *desc,
>  			const u8 *src, unsigned int srclen)
>  {
>  	struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
> -	const struct polyval_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
> +	const struct polyval_tfm_ctx *tctx =
> +		polyval_tfm_ctx(crypto_shash_ctx(desc->tfm));
>  	u8 *pos;
>  	unsigned int nblocks;
>  	unsigned int n;

It would make more sense to have the polyval_tfm_ctx() function take in the
struct crypto_shash.  How about using the following:

static inline struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm)
{
	return PTR_ALIGN(crypto_shash_ctx(tfm), POLYVAL_ALIGN);
}

- Eric
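For the PTR_ALIGN accessor to be safe, the context allocation has to reserve
slack for the rounding. A minimal sketch of how the pieces would fit together
in the glue code; POLYVAL_CTX_SIZE is an illustrative name, not necessarily
what the v2 patch uses:

/*
 * The shash context comes from the allocation sized by cra_ctxsize,
 * which is only guaranteed crypto_tfm_ctx_alignment().  Over-allocate
 * by POLYVAL_ALIGN so that rounding the pointer up to a 16-byte
 * boundary still leaves a full struct polyval_tfm_ctx in bounds.
 */
#define POLYVAL_ALIGN		16
#define POLYVAL_CTX_SIZE	(sizeof(struct polyval_tfm_ctx) + POLYVAL_ALIGN)

static inline struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm)
{
	return PTR_ALIGN(crypto_shash_ctx(tfm), POLYVAL_ALIGN);
}

/* In the shash_alg definition, the over-sized context would be declared as:
 *	.base.cra_ctxsize = POLYVAL_CTX_SIZE,
 */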
diff --git a/arch/x86/crypto/polyval-clmulni_asm.S b/arch/x86/crypto/polyval-clmulni_asm.S
index a6ebe4e7dd2b..32b98cb53ddf 100644
--- a/arch/x86/crypto/polyval-clmulni_asm.S
+++ b/arch/x86/crypto/polyval-clmulni_asm.S
@@ -234,7 +234,7 @@

 	movups (MSG), %xmm0
 	pxor SUM, %xmm0
-	movaps (KEY_POWERS), %xmm1
+	movups (KEY_POWERS), %xmm1
 	schoolbook1_noload
 	dec BLOCKS_LEFT
 	addq $16, MSG
The key_powers array is not guaranteed to be 16-byte aligned, so using
movaps to operate on key_powers is not allowed.

Switch movaps to movups.

Fixes: 34f7f6c30112 ("crypto: x86/polyval - Add PCLMULQDQ accelerated implementation of POLYVAL")
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
---
 arch/x86/crypto/polyval-clmulni_asm.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
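The distinction the commit message turns on: movaps requires its memory
operand to be 16-byte aligned and raises #GP on a misaligned address, while
movups accepts any address. A minimal userspace sketch of the same semantics
using SSE2 intrinsics (not kernel code; _mm_load_si128 emits the aligned
integer load movdqa, which carries the same alignment requirement as movaps):

#include <emmintrin.h>	/* SSE2 intrinsics */
#include <stdio.h>

int main(void)
{
	/* A 16-byte-aligned buffer; buf + 1 is deliberately misaligned. */
	unsigned char buf[32] __attribute__((aligned(16))) = { 0 };
	unsigned char *misaligned = buf + 1;

	/* movups equivalent: an unaligned load works on any address. */
	__m128i v = _mm_loadu_si128((const __m128i *)misaligned);

	/*
	 * movaps/movdqa equivalent: the aligned load would fault here,
	 * which is the crash the patch fixes.
	 *
	 * __m128i w = _mm_load_si128((const __m128i *)misaligned);
	 */

	printf("unaligned load ok: %d\n", _mm_cvtsi128_si32(v));
	return 0;
}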