| Message ID | 20250428170040.423825-9-ebiggers@kernel.org |
| --- | --- |
| State | New |
| Series | Architecture-optimized SHA-256 library API |
On Mon, 28 Apr 2025 10:00:33 -0700
Eric Biggers <ebiggers@kernel.org> wrote:

> From: Eric Biggers <ebiggers@google.com>
>
> Instead of providing crypto_shash algorithms for the arch-optimized
> SHA-256 code, instead implement the SHA-256 library. This is much
> simpler, it makes the SHA-256 library functions be arch-optimized, and
> it fixes the longstanding issue where the arch-optimized SHA-256 was
> disabled by default. SHA-256 still remains available through
> crypto_shash, but individual architectures no longer need to handle it.

I can get to the following error after this patch, now merged as commit
b9eac03edcf8 ("crypto: s390/sha256 - implement library instead of shash"):

error: the following would cause module name conflict:
  crypto/sha256.ko
  arch/s390/lib/crypto/sha256.ko

Base config file is generated from:

$ CONFIG=$(mktemp)
$ cat << EOF > $CONFIG
CONFIG_MODULES=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_SHA256=m
EOF

Base config applied to allnoconfig:

$ KCONFIG_ALLCONFIG=$CONFIG make ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- allnoconfig

Resulting in:

$ grep SHA256 .config
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_LIB_SHA256=m
CONFIG_CRYPTO_ARCH_HAVE_LIB_SHA256=y
CONFIG_CRYPTO_LIB_SHA256_GENERIC=m
CONFIG_CRYPTO_SHA256_S390=m

Thanks,
Alex
On Thu, May 29, 2025 at 11:05:26AM -0600, Alex Williamson wrote:
> On Mon, 28 Apr 2025 10:00:33 -0700
> Eric Biggers <ebiggers@kernel.org> wrote:
>
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Instead of providing crypto_shash algorithms for the arch-optimized
> > SHA-256 code, instead implement the SHA-256 library. This is much
> > simpler, it makes the SHA-256 library functions be arch-optimized, and
> > it fixes the longstanding issue where the arch-optimized SHA-256 was
> > disabled by default. SHA-256 still remains available through
> > crypto_shash, but individual architectures no longer need to handle it.
>
> I can get to the following error after this patch, now merged as commit
> b9eac03edcf8 ("crypto: s390/sha256 - implement library instead of shash"):
>
> error: the following would cause module name conflict:
>   crypto/sha256.ko
>   arch/s390/lib/crypto/sha256.ko

Thanks for reporting this. For now the s390 one should be renamed to
sha256-s390, similar to how the other architectures' sha256 modules are named.
I'll send a patch.

Long-term, I'd like to find a clean way to consolidate the library code for each
algorithm into a single module. So instead of e.g. libsha256.ko,
libsha256-generic.ko, and sha256-s390.ko (all of which get loaded when the
SHA-256 library is needed), we'd just have libsha256.ko. (Or just sha256.ko,
with the old-school crypto API one renamed to sha256-cryptoapi.ko.) A lot of
these weird build problems we've been having are caused by the unnecessary
separation into multiple modules.

- Eric
On Thu, May 29, 2025 at 05:37:02PM +0000, Eric Biggers wrote:
> On Thu, May 29, 2025 at 11:05:26AM -0600, Alex Williamson wrote:
> > On Mon, 28 Apr 2025 10:00:33 -0700
> > Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > > From: Eric Biggers <ebiggers@google.com>
> > >
> > > Instead of providing crypto_shash algorithms for the arch-optimized
> > > SHA-256 code, instead implement the SHA-256 library. This is much
> > > simpler, it makes the SHA-256 library functions be arch-optimized, and
> > > it fixes the longstanding issue where the arch-optimized SHA-256 was
> > > disabled by default. SHA-256 still remains available through
> > > crypto_shash, but individual architectures no longer need to handle it.
> >
> > I can get to the following error after this patch, now merged as commit
> > b9eac03edcf8 ("crypto: s390/sha256 - implement library instead of shash"):
> >
> > error: the following would cause module name conflict:
> >   crypto/sha256.ko
> >   arch/s390/lib/crypto/sha256.ko
>
> Thanks for reporting this. For now the s390 one should be renamed to
> sha256-s390, similar to how the other architectures' sha256 modules are named.
> I'll send a patch.
>
> Long-term, I'd like to find a clean way to consolidate the library code for each
> algorithm into a single module. So instead of e.g. libsha256.ko,
> libsha256-generic.ko, and sha256-s390.ko (all of which get loaded when the
> SHA-256 library is needed), we'd just have libsha256.ko. (Or just sha256.ko,
> with the old-school crypto API one renamed to sha256-cryptoapi.ko.) A lot of
> these weird build problems we've been having are caused by the unnecessary
> separation into multiple modules.
>
> - Eric

Patch sent:
https://lore.kernel.org/r/20250529185913.25091-1-ebiggers@kernel.org

- Eric
On Thu, 29 May 2025 at 10:37, Eric Biggers <ebiggers@kernel.org> wrote:
>
> Long-term, I'd like to find a clean way to consolidate the library code for each
> algorithm into a single module.

No, while I think the current situation isn't great, I think the "make
it one single module" is even worse.

For most architectures - including s390 - you end up being in the
situation that these kinds of hw accelerated crypto things depend on
some CPU capability, and aren't necessarily statically always
available.

So these things end up having stupid extra overhead due to having some
conditional.

That extra overhead is then in turn minimized with tricks like static
branches, but that's all just piling more ugly hacks on top
because it picked a bad choice to begin with.

So what's the *right* thing to do?

The right thing to do is to just link the right routine in the first
place, and *not* have static branch hackery at all. Because you didn't
need it.

And we already do runtime linking at module loading time. So if it's a
module, if the hardware acceleration doesn't exist, the module load
should just fail, and the loader should go on to load the next option.

Not any silly "one module to rule them all" hackery that only results
in worse code. Just a simple "if this module loads successfully,
you'll link the optimal hw acceleration".

Now, the problem with this all is the *non*modular case.

For modules, we already have the optimal solution in the form of
init-module error handling and runtime linking.

So I think the module case is "solved" (except the solution is not
what we actually do).

For the non-module case, the problem is that "I linked this
unconditionally, and now it turns out I run on hardware that doesn't
have the capability to run this".

And that's when you need to do things like static_call_update() to
basically do runtime re-linking of a static decision.

And currently we very much do this wrong. See how s390 and x86-64 (and
presumably others) basically have the *exact* same problems, but they
then mix static branches and static calls (in the case of x86-64) and
just have non-optimal code in general.

What I think the generic code should do (for the built-in case) is just have

        DEFINE_STATIC_CALL(sha256_blocks_fn, sha256_blocks_generic);

and do

        static_call(sha256_blocks_fn)(args..);

and then architecture code can do the static_call_update() to set
their optimal version.

And yeah, we'd presumably need multiple versions, since there's the
whole "is simd usable" thing. Although maybe that's going away?

Linus
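To make the proposal concrete, here is a minimal sketch of the static_call pattern Linus outlines, with the generic and arch sides shown in one place for brevity. The override function sha256_blocks_cpacf() and the init hook name are hypothetical; sha256_blocks_generic(), cpacf_query_func(), and the CPACF constants are the real ones that appear elsewhere in this thread.

```c
#include <asm/cpacf.h>
#include <crypto/internal/sha2.h>
#include <linux/cpufeature.h>
#include <linux/init.h>
#include <linux/static_call.h>

/* Generic code: the default target is the portable C implementation. */
DEFINE_STATIC_CALL(sha256_blocks_fn, sha256_blocks_generic);

void sha256_blocks(u32 state[SHA256_STATE_WORDS], const u8 *data,
                   size_t nblocks)
{
        /* Direct call on arches with static_call patching, indirect elsewhere. */
        static_call(sha256_blocks_fn)(state, data, nblocks);
}

/* Hypothetical CPACF-backed implementation (mirrors the patch below). */
static void sha256_blocks_cpacf(u32 state[SHA256_STATE_WORDS],
                                const u8 *data, size_t nblocks)
{
        cpacf_kimd(CPACF_KIMD_SHA_256, state, data,
                   nblocks * SHA256_BLOCK_SIZE);
}

/* Arch code: re-link the static call once the CPU feature is confirmed. */
static int __init sha256_s390_static_call_init(void)
{
        if (cpu_have_feature(S390_CPU_FEATURE_MSA) &&
            cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA_256))
                static_call_update(sha256_blocks_fn, sha256_blocks_cpacf);
        return 0;
}
arch_initcall(sha256_s390_static_call_init);
```

The appeal of this shape is that the conditional disappears entirely on architectures with static_call patching; the cost is that everyone else falls back to an indirect call, which is the trade-off Eric raises below.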
On Thu, May 29, 2025 at 01:14:31PM -0700, Linus Torvalds wrote:
> On Thu, 29 May 2025 at 10:37, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > Long-term, I'd like to find a clean way to consolidate the library code for each
> > algorithm into a single module.
>
> No, while I think the current situation isn't great, I think the "make
> it one single module" is even worse.
>
> For most architectures - including s390 - you end up being in the
> situation that these kinds of hw accelerated crypto things depend on
> some CPU capability, and aren't necessarily statically always
> available.
>
> So these things end up having stupid extra overhead due to having some
> conditional.
>
> That extra overhead is then in turn minimized with tricks like static
> branches, but that's all just piling more ugly hacks on top
> because it picked a bad choice to begin with.
>
> So what's the *right* thing to do?
>
> The right thing to do is to just link the right routine in the first
> place, and *not* have static branch hackery at all. Because you didn't
> need it.
>
> And we already do runtime linking at module loading time. So if it's a
> module, if the hardware acceleration doesn't exist, the module load
> should just fail, and the loader should go on to load the next option.

So using crc32c() + ext4 + x86 as an example (but SHA-256 would be very
similar), the current behavior is that ext4.ko depends on the crc32c_arch()
symbol. That causes crc32-x86.ko to be loaded, which then depends on the
crc32c_base() symbol as a fallback, which causes crc32.ko to be loaded too.
My idea is to consolidate the two crc32 modules into one (they always go
together, after all), keeping the same symbols. The main challenge is just
the current directory structure.

Your suggestion sounds like: ext4.ko would depend on the crc32c() symbol,
which would be defined in *both* crc32-x86.ko and crc32.ko. The module
loader would try to load crc32-x86.ko first. If the CPU does not support
any of the x86 accelerated CRC32 code, then loading that module would fail.
The module loader would then load crc32.ko instead.

Does any of the infrastructure to handle "this symbol is in multiple
modules and they must be loaded in this particular order" actually exist,
though? And how do we avoid the issues the crypto API often has where the
accelerated modules don't get loaded, causing slow generic code to
unnecessarily be used? IMO this sounds questionable compared to just using
static keys and/or branches, which we'd need anyway to support the
non-modular case.

> Not any silly "one module to rule them all" hackery that only results
> in worse code. Just a simple "if this module loads successfully,
> you'll link the optimal hw acceleration".
>
> Now, the problem with this all is the *non*modular case.
>
> For modules, we already have the optimal solution in the form of
> init-module error handling and runtime linking.
>
> So I think the module case is "solved" (except the solution is not
> what we actually do).
>
> For the non-module case, the problem is that "I linked this
> unconditionally, and now it turns out I run on hardware that doesn't
> have the capability to run this".
>
> And that's when you need to do things like static_call_update() to
> basically do runtime re-linking of a static decision.
>
> And currently we very much do this wrong. See how s390 and x86-64 (and
> presumably others) basically have the *exact* same problems, but they
> then mix static branches and static calls (in the case of x86-64) and
> just have non-optimal code in general.
>
> What I think the generic code should do (for the built-in case) is just have
>
>         DEFINE_STATIC_CALL(sha256_blocks_fn, sha256_blocks_generic);
>
> and do
>
>         static_call(sha256_blocks_fn)(args..);
>
> and then architecture code can do the static_call_update() to set
> their optimal version.
>
> And yeah, we'd presumably need multiple versions, since there's the
> whole "is simd usable" thing. Although maybe that's going away?

Moving the static_call into the generic code might make sense. I don't
think it's a win in all cases currently, though. Only x86 and PPC32
actually have a real static_call implementation; everywhere else it's an
indirect call, which is slower than a static branch. Also, some arch code
is just usable unconditionally without any CPU feature check, e.g. the
MIPS ChaCha code. That doesn't use (or need to use) a static call or
branch at all.

Also, while the centralized static_call would *allow* for the generic code
to be loaded while the arch code is not, in the vast majority of cases
that would be a bug, not a feature. The generic crypto infrastructure has
that bug, and this has caused a huge amount of pain over the years.
People have to go out of the way to ensure that the arch-optimized crypto
code gets loaded. And they often forget, resulting in the slow generic
code being used unnecessarily... Making the arch-optimized code be loaded
through a direct symbol dependency solves that problem.

- Eric
On Thu, 29 May 2025 at 14:16, Eric Biggers <ebiggers@kernel.org> wrote:
>
> So using crc32c() + ext4 + x86 as an example (but SHA-256 would be very
> similar), the current behavior is that ext4.ko depends on the crc32c_arch()
> symbol.

Yes, I think that's a good example.

I think it's an example of something that "works", but it certainly is
a bit hacky.

Wouldn't it be nicer if just plain "crc32c()" did the right thing,
instead of users having to do strange hacks just to get the optimized
version that they are looking for?

> Does any of the infrastructure to handle "this symbol is in multiple
> modules and they must be loaded in this particular order" actually
> exist, though?

Hmm. I was sure we already did that for other things, but looking
around, I'm not finding any cases.

Or rather, I _am_ finding cases where we export the same symbol from
different code, but all the ones I found were being careful to not be
active at the same time.

I really thought we had cases where depending on which module you
loaded you got different implementations, but it looks like it either
was some historical thing that no longer exists - or that I need to go
take my meds.

> IMO this sounds questionable compared to just using static keys and/or
> branches, which we'd need anyway to support the non-modular case.

I really wish the non-modular case used static calls, not static keys
like it does now.

In fact, that should work even for modular users.

Of course, not all architectures actually do the optimized thing, and
the generic fallback uses indirect calls through a function pointer,
but hey, if an architecture didn't bother with the rewriting code,
that is fixable - if the architecture maintainer cares.

(On some architectures, indirect calls are not noticeably slower than
direct calls - because you have to load the address from some global
pointer area anyway - so not having the rewriting can be a "we don't
need it" thing)

Linus
On Thu, May 29, 2025 at 04:54:34PM -0700, Linus Torvalds wrote:
> On Thu, 29 May 2025 at 14:16, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > So using crc32c() + ext4 + x86 as an example (but SHA-256 would be very
> > similar), the current behavior is that ext4.ko depends on the crc32c_arch()
> > symbol.
>
> Yes, I think that's a good example.
>
> I think it's an example of something that "works", but it certainly is
> a bit hacky.
>
> Wouldn't it be nicer if just plain "crc32c()" did the right thing,
> instead of users having to do strange hacks just to get the optimized
> version that they are looking for?

For crc32c() that's exactly how it works (since v6.14, when I implemented it).
The users call crc32c() which is an inline function, which then calls
crc32c_arch() or crc32c_base() depending on the kconfig. So that's why I said
the symbol dependency is currently on crc32c_arch. Sorry if I wasn't clear.
The SHA-256, ChaCha, and Poly1305 library code now has a similar design too.

If we merged the arch and generic modules together, then the symbol would become
crc32c. But in either case crc32c() is the API that all the users call.

- Eric
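For reference, the dispatch Eric describes boils down to a kconfig-keyed inline wrapper. A simplified sketch, loosely modeled on what include/linux/crc32.h does as of v6.14 (the declarations are spelled out here for self-containment; the in-tree header may differ in detail):

```c
#include <linux/kconfig.h>
#include <linux/types.h>

/* Exported by the arch module (crc32-x86.ko in the example above). */
u32 crc32c_arch(u32 crc, const void *p, size_t len);
/* Generic table-based fallback, exported by crc32.ko. */
u32 crc32c_base(u32 crc, const void *p, size_t len);

static inline u32 crc32c(u32 crc, const void *p, size_t len)
{
        /*
         * When the arch kconfig is enabled, this inline expands to a call
         * to crc32c_arch(), so every module that uses crc32c() picks up a
         * load-time symbol dependency on the arch module and modprobe
         * loads it automatically.
         */
        if (IS_ENABLED(CONFIG_CRC32_ARCH))
                return crc32c_arch(crc, p, len);
        return crc32c_base(crc, p, len);
}
```

This is what gives the "direct symbol dependency" loading behavior: no request_module() or runtime algorithm lookup is involved, which is how it avoids the crypto API's arch-module-not-loaded pitfall.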
On Fri, May 30, 2025 at 12:18:58AM +0000, Eric Biggers wrote:
> On Thu, May 29, 2025 at 04:54:34PM -0700, Linus Torvalds wrote:
> > On Thu, 29 May 2025 at 14:16, Eric Biggers <ebiggers@kernel.org> wrote:
> > >
> > > So using crc32c() + ext4 + x86 as an example (but SHA-256 would be very
> > > similar), the current behavior is that ext4.ko depends on the crc32c_arch()
> > > symbol.
> >
> > Yes, I think that's a good example.
> >
> > I think it's an example of something that "works", but it certainly is
> > a bit hacky.
> >
> > Wouldn't it be nicer if just plain "crc32c()" did the right thing,
> > instead of users having to do strange hacks just to get the optimized
> > version that they are looking for?
>
> For crc32c() that's exactly how it works (since v6.14, when I implemented it).
> The users call crc32c() which is an inline function, which then calls
> crc32c_arch() or crc32c_base() depending on the kconfig. So that's why I said
> the symbol dependency is currently on crc32c_arch. Sorry if I wasn't clear.
> The SHA-256, ChaCha, and Poly1305 library code now has a similar design too.
>
> If we merged the arch and generic modules together, then the symbol would become
> crc32c. But in either case crc32c() is the API that all the users call.
>
> - Eric

I implemented my proposal, for lib/crc first, in
https://lore.kernel.org/lkml/20250601224441.778374-1-ebiggers@kernel.org.
I think it's strictly better than the status quo, and once applied to
lib/crypto it will solve some of the problems we've been having there too.
But let me know if you still have misgivings.

- Eric
On Sun, 1 Jun 2025 at 16:00, Eric Biggers <ebiggers@kernel.org> wrote:
>
> I implemented my proposal, for lib/crc first,

Ok, I scanned through that series, and it looks good to me. A clear
improvement.

Linus
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 6f2c9ce1b1548..de69faa4d94f3 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -793,11 +793,10 @@ CONFIG_CRYPTO_USER_API_HASH=m
 CONFIG_CRYPTO_USER_API_SKCIPHER=m
 CONFIG_CRYPTO_USER_API_RNG=m
 CONFIG_CRYPTO_USER_API_AEAD=m
 CONFIG_CRYPTO_SHA512_S390=m
 CONFIG_CRYPTO_SHA1_S390=m
-CONFIG_CRYPTO_SHA256_S390=m
 CONFIG_CRYPTO_SHA3_256_S390=m
 CONFIG_CRYPTO_SHA3_512_S390=m
 CONFIG_CRYPTO_GHASH_S390=m
 CONFIG_CRYPTO_AES_S390=m
 CONFIG_CRYPTO_DES_S390=m
diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
index f18a7d97ac216..f12679448e976 100644
--- a/arch/s390/configs/defconfig
+++ b/arch/s390/configs/defconfig
@@ -780,11 +780,10 @@ CONFIG_CRYPTO_USER_API_HASH=m
 CONFIG_CRYPTO_USER_API_SKCIPHER=m
 CONFIG_CRYPTO_USER_API_RNG=m
 CONFIG_CRYPTO_USER_API_AEAD=m
 CONFIG_CRYPTO_SHA512_S390=m
 CONFIG_CRYPTO_SHA1_S390=m
-CONFIG_CRYPTO_SHA256_S390=m
 CONFIG_CRYPTO_SHA3_256_S390=m
 CONFIG_CRYPTO_SHA3_512_S390=m
 CONFIG_CRYPTO_GHASH_S390=m
 CONFIG_CRYPTO_AES_S390=m
 CONFIG_CRYPTO_DES_S390=m
diff --git a/arch/s390/crypto/Kconfig b/arch/s390/crypto/Kconfig
index a2bfd6eef0ca3..e2c27588b21a9 100644
--- a/arch/s390/crypto/Kconfig
+++ b/arch/s390/crypto/Kconfig
@@ -20,20 +20,10 @@ config CRYPTO_SHA1_S390
 	  Architecture: s390
 
 	  It is available as of z990.
 
-config CRYPTO_SHA256_S390
-	tristate "Hash functions: SHA-224 and SHA-256"
-	select CRYPTO_HASH
-	help
-	  SHA-224 and SHA-256 secure hash algorithms (FIPS 180)
-
-	  Architecture: s390
-
-	  It is available as of z9.
-
 config CRYPTO_SHA3_256_S390
 	tristate "Hash functions: SHA3-224 and SHA3-256"
 	select CRYPTO_HASH
 	help
 	  SHA3-224 and SHA3-256 secure hash algorithms (FIPS 202)
 
diff --git a/arch/s390/crypto/Makefile b/arch/s390/crypto/Makefile
index e3853774e1a3a..21757d86cd499 100644
--- a/arch/s390/crypto/Makefile
+++ b/arch/s390/crypto/Makefile
@@ -2,11 +2,10 @@
 #
 # Cryptographic API
 #
 
 obj-$(CONFIG_CRYPTO_SHA1_S390) += sha1_s390.o sha_common.o
-obj-$(CONFIG_CRYPTO_SHA256_S390) += sha256_s390.o sha_common.o
 obj-$(CONFIG_CRYPTO_SHA512_S390) += sha512_s390.o sha_common.o
 obj-$(CONFIG_CRYPTO_SHA3_256_S390) += sha3_256_s390.o sha_common.o
 obj-$(CONFIG_CRYPTO_SHA3_512_S390) += sha3_512_s390.o sha_common.o
 obj-$(CONFIG_CRYPTO_DES_S390) += des_s390.o
 obj-$(CONFIG_CRYPTO_AES_S390) += aes_s390.o
diff --git a/arch/s390/crypto/sha256_s390.c b/arch/s390/crypto/sha256_s390.c
deleted file mode 100644
index e6876c49414d5..0000000000000
--- a/arch/s390/crypto/sha256_s390.c
+++ /dev/null
@@ -1,144 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0+
-/*
- * Cryptographic API.
- *
- * s390 implementation of the SHA256 and SHA224 Secure Hash Algorithm.
- *
- * s390 Version:
- *   Copyright IBM Corp. 2005, 2011
- *   Author(s): Jan Glauber (jang@de.ibm.com)
- */
-#include <asm/cpacf.h>
-#include <crypto/internal/hash.h>
-#include <crypto/sha2.h>
-#include <linux/cpufeature.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/string.h>
-
-#include "sha.h"
-
-static int s390_sha256_init(struct shash_desc *desc)
-{
-	struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
-
-	sctx->state[0] = SHA256_H0;
-	sctx->state[1] = SHA256_H1;
-	sctx->state[2] = SHA256_H2;
-	sctx->state[3] = SHA256_H3;
-	sctx->state[4] = SHA256_H4;
-	sctx->state[5] = SHA256_H5;
-	sctx->state[6] = SHA256_H6;
-	sctx->state[7] = SHA256_H7;
-	sctx->count = 0;
-	sctx->func = CPACF_KIMD_SHA_256;
-
-	return 0;
-}
-
-static int sha256_export(struct shash_desc *desc, void *out)
-{
-	struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
-	struct crypto_sha256_state *octx = out;
-
-	octx->count = sctx->count;
-	memcpy(octx->state, sctx->state, sizeof(octx->state));
-	return 0;
-}
-
-static int sha256_import(struct shash_desc *desc, const void *in)
-{
-	struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
-	const struct crypto_sha256_state *ictx = in;
-
-	sctx->count = ictx->count;
-	memcpy(sctx->state, ictx->state, sizeof(ictx->state));
-	sctx->func = CPACF_KIMD_SHA_256;
-	return 0;
-}
-
-static struct shash_alg sha256_alg = {
-	.digestsize = SHA256_DIGEST_SIZE,
-	.init = s390_sha256_init,
-	.update = s390_sha_update_blocks,
-	.finup = s390_sha_finup,
-	.export = sha256_export,
-	.import = sha256_import,
-	.descsize = S390_SHA_CTX_SIZE,
-	.statesize = sizeof(struct crypto_sha256_state),
-	.base = {
-		.cra_name = "sha256",
-		.cra_driver_name= "sha256-s390",
-		.cra_priority = 300,
-		.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
-		.cra_blocksize = SHA256_BLOCK_SIZE,
-		.cra_module = THIS_MODULE,
-	}
-};
-
-static int s390_sha224_init(struct shash_desc *desc)
-{
-	struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
-
-	sctx->state[0] = SHA224_H0;
-	sctx->state[1] = SHA224_H1;
-	sctx->state[2] = SHA224_H2;
-	sctx->state[3] = SHA224_H3;
-	sctx->state[4] = SHA224_H4;
-	sctx->state[5] = SHA224_H5;
-	sctx->state[6] = SHA224_H6;
-	sctx->state[7] = SHA224_H7;
-	sctx->count = 0;
-	sctx->func = CPACF_KIMD_SHA_256;
-
-	return 0;
-}
-
-static struct shash_alg sha224_alg = {
-	.digestsize = SHA224_DIGEST_SIZE,
-	.init = s390_sha224_init,
-	.update = s390_sha_update_blocks,
-	.finup = s390_sha_finup,
-	.export = sha256_export,
-	.import = sha256_import,
-	.descsize = S390_SHA_CTX_SIZE,
-	.statesize = sizeof(struct crypto_sha256_state),
-	.base = {
-		.cra_name = "sha224",
-		.cra_driver_name= "sha224-s390",
-		.cra_priority = 300,
-		.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
-		.cra_blocksize = SHA224_BLOCK_SIZE,
-		.cra_module = THIS_MODULE,
-	}
-};
-
-static int __init sha256_s390_init(void)
-{
-	int ret;
-
-	if (!cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA_256))
-		return -ENODEV;
-	ret = crypto_register_shash(&sha256_alg);
-	if (ret < 0)
-		goto out;
-	ret = crypto_register_shash(&sha224_alg);
-	if (ret < 0)
-		crypto_unregister_shash(&sha256_alg);
-out:
-	return ret;
-}
-
-static void __exit sha256_s390_fini(void)
-{
-	crypto_unregister_shash(&sha224_alg);
-	crypto_unregister_shash(&sha256_alg);
-}
-
-module_cpu_feature_match(S390_CPU_FEATURE_MSA, sha256_s390_init);
-module_exit(sha256_s390_fini);
-
-MODULE_ALIAS_CRYPTO("sha256");
-MODULE_ALIAS_CRYPTO("sha224");
-MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("SHA256 and SHA224 Secure Hash Algorithm");
diff --git a/arch/s390/lib/crypto/Kconfig b/arch/s390/lib/crypto/Kconfig
index 069b355fe51aa..e3f855ef43934 100644
--- a/arch/s390/lib/crypto/Kconfig
+++ b/arch/s390/lib/crypto/Kconfig
@@ -3,5 +3,11 @@
 config CRYPTO_CHACHA_S390
 	tristate
 	default CRYPTO_LIB_CHACHA
 	select CRYPTO_LIB_CHACHA_GENERIC
 	select CRYPTO_ARCH_HAVE_LIB_CHACHA
+
+config CRYPTO_SHA256_S390
+	tristate
+	default CRYPTO_LIB_SHA256
+	select CRYPTO_ARCH_HAVE_LIB_SHA256
+	select CRYPTO_LIB_SHA256_GENERIC
diff --git a/arch/s390/lib/crypto/Makefile b/arch/s390/lib/crypto/Makefile
index 06c2cf77178ef..920197967f463 100644
--- a/arch/s390/lib/crypto/Makefile
+++ b/arch/s390/lib/crypto/Makefile
@@ -1,4 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 obj-$(CONFIG_CRYPTO_CHACHA_S390) += chacha_s390.o
 chacha_s390-y := chacha-glue.o chacha-s390.o
+
+obj-$(CONFIG_CRYPTO_SHA256_S390) += sha256.o
diff --git a/arch/s390/lib/crypto/sha256.c b/arch/s390/lib/crypto/sha256.c
new file mode 100644
index 0000000000000..50c592ce7a5de
--- /dev/null
+++ b/arch/s390/lib/crypto/sha256.c
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * SHA-256 optimized using the CP Assist for Cryptographic Functions (CPACF)
+ *
+ * Copyright 2025 Google LLC
+ */
+#include <asm/cpacf.h>
+#include <crypto/internal/sha2.h>
+#include <linux/cpufeature.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_cpacf_sha256);
+
+void sha256_blocks_arch(u32 state[SHA256_STATE_WORDS],
+			const u8 *data, size_t nblocks)
+{
+	if (static_branch_likely(&have_cpacf_sha256))
+		cpacf_kimd(CPACF_KIMD_SHA_256, state, data,
+			   nblocks * SHA256_BLOCK_SIZE);
+	else
+		sha256_blocks_generic(state, data, nblocks);
+}
+EXPORT_SYMBOL(sha256_blocks_arch);
+
+bool sha256_is_arch_optimized(void)
+{
+	return static_key_enabled(&have_cpacf_sha256);
+}
+EXPORT_SYMBOL(sha256_is_arch_optimized);
+
+static int __init sha256_s390_mod_init(void)
+{
+	if (cpu_have_feature(S390_CPU_FEATURE_MSA) &&
+	    cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA_256))
+		static_branch_enable(&have_cpacf_sha256);
+	return 0;
+}
+arch_initcall(sha256_s390_mod_init);
+
+static void __exit sha256_s390_mod_exit(void)
+{
+}
+module_exit(sha256_s390_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SHA-256 using the CP Assist for Cryptographic Functions (CPACF)");
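As a usage note: the new sha256_blocks_arch() above is not called directly by users; it is reached through the SHA-256 library API. A minimal, hypothetical caller of the one-shot sha256() helper from <crypto/sha2.h> might look like this:

```c
#include <crypto/sha2.h>

/*
 * Hypothetical example of a SHA-256 library user. With this patch,
 * the block processing on s390 ends up in sha256_blocks_arch() above
 * whenever CPACF SHA-256 is available, and in sha256_blocks_generic()
 * otherwise.
 */
static void example_digest(const u8 *buf, size_t len)
{
	u8 digest[SHA256_DIGEST_SIZE];

	sha256(buf, len, digest);
}
```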