diff mbox series

[9/9] crypto: qce - switch to using a mutex

Message ID 20241203-crypto-qce-refactor-v1-9-c5901d2dd45c@linaro.org
State New
Headers show
Series crypto: qce - refactor the driver | expand

Commit Message

Bartosz Golaszewski Dec. 3, 2024, 9:19 a.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Having switched to workqueue from tasklet, we are no longer limited to
atomic APIs and can now convert the spinlock to a mutex. This, along
with the conversion from tasklet to workqueue grants us ~15% improvement
in cryptsetup benchmarks for AES encryption.

While at it: use guards to simplify locking code.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/crypto/qce/core.c | 46 +++++++++++++++++++++-------------------------
 drivers/crypto/qce/core.h |  3 ++-
 2 files changed, 23 insertions(+), 26 deletions(-)

Comments

Neil Armstrong Dec. 4, 2024, 10:17 a.m. UTC | #1
On 03/12/2024 16:10, Bartosz Golaszewski wrote:
> On Tue, 3 Dec 2024 14:53:21 +0100, neil.armstrong@linaro.org said:
>> On 03/12/2024 10:19, Bartosz Golaszewski wrote:
>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>>
>>> Having switched to workqueue from tasklet, we are no longer limited to
>>> atomic APIs and can now convert the spinlock to a mutex. This, along
>>> with the conversion from tasklet to workqueue grants us ~15% improvement
>>> in cryptsetup benchmarks for AES encryption.
>>
>> Can you share on which platforms you did the tests and the results you got ?
>>
> 
> Sure, I tested on sm8650 with the following results (they vary from
> one run to other but are more or less in this range):
> 
> With this series:
> 
> #     Algorithm |       Key |      Encryption |      Decryption
>          aes-cbc        128b        94.1 MiB/s       138.6 MiB/s
>      serpent-cbc        128b               N/A               N/A
>      twofish-cbc        128b               N/A               N/A
>          aes-cbc        256b        94.8 MiB/s       128.5 MiB/s
>      serpent-cbc        256b               N/A               N/A
>      twofish-cbc        256b               N/A               N/A
>          aes-xts        256b       132.9 MiB/s       131.8 MiB/s
>      serpent-xts        256b               N/A               N/A
>      twofish-xts        256b               N/A               N/A
>          aes-xts        512b       122.6 MiB/s       122.4 MiB/s
>      serpent-xts        512b               N/A               N/A
>      twofish-xts        512b               N/A               N/A
> 
> Without it:
> 
> #     Algorithm |       Key |      Encryption |      Decryption
>          aes-cbc        128b        96.4 MiB/s       141.0 MiB/s
>      serpent-cbc        128b               N/A               N/A
>      twofish-cbc        128b               N/A               N/A
>          aes-cbc        256b        67.0 MiB/s        97.8 MiB/s
>      serpent-cbc        256b               N/A               N/A
>      twofish-cbc        256b               N/A               N/A
>          aes-xts        256b       131.7 MiB/s       132.0 MiB/s
>      serpent-xts        256b               N/A               N/A
>      twofish-xts        256b               N/A               N/A
>          aes-xts        512b        93.9 MiB/s        96.8 MiB/s
>      serpent-xts        512b               N/A               N/A
>      twofish-xts        512b               N/A               N/A
> 
> AES-CBC and AES-XTS with shorter keys remain pretty much the same. I'm not
> sure why that is. I also tested on sa8775p but there are no visible
> improvements there. :(

Thanks for the results !

Neil

> 
> Bart
Eric Biggers Jan. 18, 2025, 8:06 a.m. UTC | #2
On Tue, Dec 03, 2024 at 10:10:13AM -0500, Bartosz Golaszewski wrote:
> On Tue, 3 Dec 2024 14:53:21 +0100, neil.armstrong@linaro.org said:
> > On 03/12/2024 10:19, Bartosz Golaszewski wrote:
> >> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>
> >> Having switched to workqueue from tasklet, we are no longer limited to
> >> atomic APIs and can now convert the spinlock to a mutex. This, along
> >> with the conversion from tasklet to workqueue grants us ~15% improvement
> >> in cryptsetup benchmarks for AES encryption.
> >
> > Can you share on which platforms you did the tests and the results you got ?
> >
> 
> Sure, I tested on sm8650 with the following results (they vary from
> one run to other but are more or less in this range):

FYI, when I test this driver on sm8650 with linux-next there are lots of test
failures, including ones where the driver straight out returns wrong hashes:

[    6.330656] alg: skcipher: xts-aes-qce setkey failed on test vector 0; expected_error=0, actual_error=-126, flags=0x1
[    6.347196] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126)
[    6.347200] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126)
[    6.784951] alg: skcipher: ctr-aes-qce encryption test failed (wrong output IV) on test vector 4, cfg="in-place (one sglist)"
[    6.810709] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22)
[    6.941639] alg: skcipher: cbc-3des-qce setkey failed on test vector "random: len=16 klen=24"; expected_error=0, actual_error=-126, flags=0x1
[    6.947029] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22)
[    6.954394] alg: self-tests for cbc(des3_ede) using cbc-3des-qce failed (rc=-126)
[    6.975348] alg: self-tests for cbc(des3_ede) using cbc-3des-qce failed (rc=-126)
[    7.454433] alg: skcipher: ecb-3des-qce setkey failed on test vector "random: len=32 klen=24"; expected_error=0, actual_error=-126, flags=0x1
[    7.454482] alg: self-tests for ecb(des3_ede) using ecb-3des-qce failed (rc=-126)
[    7.454493] alg: self-tests for ecb(des3_ede) using ecb-3des-qce failed (rc=-126)
[    8.593828] alg: ahash: hmac-sha256-qce test failed (wrong result) on test vector "random: psize=0 ksize=80", cfg="random: may_sleep use_finup src_divs=[<reimport>100.0%@+1800] key_offset=7"
[    8.627337] alg: self-tests for hmac(sha256) using hmac-sha256-qce failed (rc=-22)
[    8.639889] alg: self-tests for hmac(sha256) using hmac-sha256-qce failed (rc=-22)
[    8.933595] alg: ahash: hmac-sha1-qce test failed (wrong result) on test vector "random: psize=0 ksize=56", cfg="random: inplace_one_sglist use_finup nosimd_setkey src_divs=[100.0%@+3969] key_offset=71"
[    8.952885] alg: self-tests for hmac(sha1) using hmac-sha1-qce failed (rc=-22)
[    8.965096] alg: self-tests for hmac(sha1) using hmac-sha1-qce failed (rc=-22)

Also a KASAN error:

[    9.420862] CPU: 3 UID: 0 PID: 393 Comm: cryptomgr_test Tainted: G S      W          6.13.0-rc7-next-20250117-00007-g58182eb6d73d #12
[    9.420881] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[    9.420886] Hardware name: Qualcomm Technologies, Inc. SM8650 HDK (DT)
[    9.420891] Call trace:
[    9.420895]  show_stack+0x18/0x24 (C)
[    9.420918]  dump_stack_lvl+0x60/0x80
[    9.420937]  print_report+0x17c/0x4d0
[    9.420960]  kasan_report+0xb0/0xf8
[    9.420983]  kasan_check_range+0x100/0x1a8
[    9.421001]  __asan_memcpy+0x3c/0xa0
[    9.421020]  swiotlb_bounce+0x1b4/0x340
[    9.421034]  swiotlb_tbl_map_single+0x2ac/0x5a0
[    9.421050]  iommu_dma_map_page+0x2b8/0x4cc
[    9.421063]  iommu_dma_map_sg+0x2e8/0xb3c
[    9.421072]  __dma_map_sg_attrs+0x11c/0x1b0
[    9.421086]  dma_map_sg_attrs+0x10/0x20
[    9.421097]  qce_aead_async_req_handle+0x784/0x1aa0
[    9.421125]  qce_handle_queue+0x20c/0x3e8
[    9.421139]  qce_async_request_enqueue+0x10/0x1c
[    9.421153]  qce_aead_crypt+0x1f0/0x81c
[    9.421168]  qce_aead_encrypt+0x14/0x20
[    9.421182]  crypto_aead_encrypt+0xa0/0xe0
[    9.421194]  test_aead_vec_cfg+0x84c/0x1be8
[    9.421207]  test_aead_vs_generic_impl+0x530/0x850
[    9.421219]  alg_test_aead+0x7c8/0xe40
[    9.421230]  alg_test+0x1f4/0xb0c
[    9.421241]  cryptomgr_test+0x50/0x80
[    9.421251]  kthread+0x378/0x678
[    9.421272]  ret_from_fork+0x10/0x20

[    9.550765] Allocated by task 393:
[    9.554279]  kasan_save_stack+0x3c/0x64
[    9.558252]  kasan_save_track+0x20/0x40
[    9.562221]  kasan_save_alloc_info+0x40/0x54
[    9.566641]  __kasan_kmalloc+0xb8/0xbc
[    9.570515]  __kmalloc_noprof+0x188/0x410
[    9.574669]  qce_aead_async_req_handle+0xbd4/0x1aa0
[    9.579701]  qce_handle_queue+0x20c/0x3e8
[    9.583841]  qce_async_request_enqueue+0x10/0x1c
[    9.588607]  qce_aead_crypt+0x1f0/0x81c
[    9.592577]  qce_aead_encrypt+0x14/0x20
[    9.596547]  crypto_aead_encrypt+0xa0/0xe0
[    9.600776]  test_aead_vec_cfg+0x84c/0x1be8
[    9.605097]  test_aead_vs_generic_impl+0x530/0x850
[    9.610039]  alg_test_aead+0x7c8/0xe40
[    9.613911]  alg_test+0x1f4/0xb0c
[    9.617345]  cryptomgr_test+0x50/0x80
[    9.621124]  kthread+0x378/0x678
[    9.624473]  ret_from_fork+0x10/0x20

[    9.629738] The buggy address belongs to the object at ffff5c7440d1cc00
                which belongs to the cache kmalloc-32 of size 32
[    9.642426] The buggy address is located 0 bytes inside of
                allocated 22-byte region [ffff5c7440d1cc00, ffff5c7440d1cc16)

[    9.656672] The buggy address belongs to the physical page:
[    9.662409] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x880d1c
[    9.670646] flags: 0xbfffe0000000000(node=0|zone=2|lastcpupid=0x1ffff)
[    9.677371] page_type: f5(slab)
[    9.680624] raw: 0bfffe0000000000 ffff5c7440002780 dead000000000100 dead000000000122
[    9.688595] raw: 0000000000000000 0000000080400040 00000000f5000000 0000000000000000
[    9.696560] page dumped because: kasan: bad access detected

[    9.703854] Memory state around the buggy address:
[    9.708796]  ffff5c7440d1cb00: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
[    9.716227]  ffff5c7440d1cb80: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
[    9.723660] >ffff5c7440d1cc00: 00 00 06 fc fc fc fc fc fa fb fb fb fc fc fc fc
[    9.731092]                          ^
[    9.734959]  ffff5c7440d1cc80: fa fb fb fb fc fc fc fc 00 00 06 fc fc fc fc fc
[    9.742392]  ffff5c7440d1cd00: 00 00 06 fc fc fc fc fc fa fb fb fb fc fc fc fc
[    9.749824] ==================================================================


I personally still struggle to understand how this driver could plausibly be
useful when the software crypto has no issues, is much faster, and is much
better tested.  What is motivating having this driver in the kernel?

- Eric
Bartosz Golaszewski Jan. 18, 2025, 9:28 a.m. UTC | #3
On Sat, Jan 18, 2025 at 9:06 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Dec 03, 2024 at 10:10:13AM -0500, Bartosz Golaszewski wrote:
> > On Tue, 3 Dec 2024 14:53:21 +0100, neil.armstrong@linaro.org said:
> > > On 03/12/2024 10:19, Bartosz Golaszewski wrote:
> > >> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > >>
> > >> Having switched to workqueue from tasklet, we are no longer limited to
> > >> atomic APIs and can now convert the spinlock to a mutex. This, along
> > >> with the conversion from tasklet to workqueue grants us ~15% improvement
> > >> in cryptsetup benchmarks for AES encryption.
> > >
> > > Can you share on which platforms you did the tests and the results you got ?
> > >
> >
> > Sure, I tested on sm8650 with the following results (they vary from
> > one run to other but are more or less in this range):
>
> FYI, when I test this driver on sm8650 with linux-next there are lots of test
> failures, including ones where the driver straight out returns wrong hashes:
>
> [    6.330656] alg: skcipher: xts-aes-qce setkey failed on test vector 0; expected_error=0, actual_error=-126, flags=0x1
> [    6.347196] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126)
> [    6.347200] alg: self-tests for xts(aes) using xts-aes-qce failed (rc=-126)
> [    6.784951] alg: skcipher: ctr-aes-qce encryption test failed (wrong output IV) on test vector 4, cfg="in-place (one sglist)"
> [    6.810709] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22)
> [    6.941639] alg: skcipher: cbc-3des-qce setkey failed on test vector "random: len=16 klen=24"; expected_error=0, actual_error=-126, flags=0x1
> [    6.947029] alg: self-tests for ctr(aes) using ctr-aes-qce failed (rc=-22)
> [    6.954394] alg: self-tests for cbc(des3_ede) using cbc-3des-qce failed (rc=-126)
> [    6.975348] alg: self-tests for cbc(des3_ede) using cbc-3des-qce failed (rc=-126)
> [    7.454433] alg: skcipher: ecb-3des-qce setkey failed on test vector "random: len=32 klen=24"; expected_error=0, actual_error=-126, flags=0x1
> [    7.454482] alg: self-tests for ecb(des3_ede) using ecb-3des-qce failed (rc=-126)
> [    7.454493] alg: self-tests for ecb(des3_ede) using ecb-3des-qce failed (rc=-126)
> [    8.593828] alg: ahash: hmac-sha256-qce test failed (wrong result) on test vector "random: psize=0 ksize=80", cfg="random: may_sleep use_finup src_divs=[<reimport>100.0%@+1800] key_offset=7"
> [    8.627337] alg: self-tests for hmac(sha256) using hmac-sha256-qce failed (rc=-22)
> [    8.639889] alg: self-tests for hmac(sha256) using hmac-sha256-qce failed (rc=-22)
> [    8.933595] alg: ahash: hmac-sha1-qce test failed (wrong result) on test vector "random: psize=0 ksize=56", cfg="random: inplace_one_sglist use_finup nosimd_setkey src_divs=[100.0%@+3969] key_offset=71"
> [    8.952885] alg: self-tests for hmac(sha1) using hmac-sha1-qce failed (rc=-22)
> [    8.965096] alg: self-tests for hmac(sha1) using hmac-sha1-qce failed (rc=-22)
>

I was testing with kcapi-speed and cryptsetup benchmark. I've never
seen any errors.

Is this after my changes only or did it exist before? You're testing
with the tcrypt module? How are you inserting it exactly? What params?

> Also a KASAN error:
>
> [    9.420862] CPU: 3 UID: 0 PID: 393 Comm: cryptomgr_test Tainted: G S      W          6.13.0-rc7-next-20250117-00007-g58182eb6d73d #12
> [    9.420881] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> [    9.420886] Hardware name: Qualcomm Technologies, Inc. SM8650 HDK (DT)
> [    9.420891] Call trace:
> [    9.420895]  show_stack+0x18/0x24 (C)
> [    9.420918]  dump_stack_lvl+0x60/0x80
> [    9.420937]  print_report+0x17c/0x4d0
> [    9.420960]  kasan_report+0xb0/0xf8
> [    9.420983]  kasan_check_range+0x100/0x1a8
> [    9.421001]  __asan_memcpy+0x3c/0xa0
> [    9.421020]  swiotlb_bounce+0x1b4/0x340
> [    9.421034]  swiotlb_tbl_map_single+0x2ac/0x5a0
> [    9.421050]  iommu_dma_map_page+0x2b8/0x4cc
> [    9.421063]  iommu_dma_map_sg+0x2e8/0xb3c
> [    9.421072]  __dma_map_sg_attrs+0x11c/0x1b0
> [    9.421086]  dma_map_sg_attrs+0x10/0x20
> [    9.421097]  qce_aead_async_req_handle+0x784/0x1aa0
> [    9.421125]  qce_handle_queue+0x20c/0x3e8
> [    9.421139]  qce_async_request_enqueue+0x10/0x1c
> [    9.421153]  qce_aead_crypt+0x1f0/0x81c
> [    9.421168]  qce_aead_encrypt+0x14/0x20
> [    9.421182]  crypto_aead_encrypt+0xa0/0xe0
> [    9.421194]  test_aead_vec_cfg+0x84c/0x1be8
> [    9.421207]  test_aead_vs_generic_impl+0x530/0x850
> [    9.421219]  alg_test_aead+0x7c8/0xe40
> [    9.421230]  alg_test+0x1f4/0xb0c
> [    9.421241]  cryptomgr_test+0x50/0x80
> [    9.421251]  kthread+0x378/0x678
> [    9.421272]  ret_from_fork+0x10/0x20
>
> [    9.550765] Allocated by task 393:
> [    9.554279]  kasan_save_stack+0x3c/0x64
> [    9.558252]  kasan_save_track+0x20/0x40
> [    9.562221]  kasan_save_alloc_info+0x40/0x54
> [    9.566641]  __kasan_kmalloc+0xb8/0xbc
> [    9.570515]  __kmalloc_noprof+0x188/0x410
> [    9.574669]  qce_aead_async_req_handle+0xbd4/0x1aa0
> [    9.579701]  qce_handle_queue+0x20c/0x3e8
> [    9.583841]  qce_async_request_enqueue+0x10/0x1c
> [    9.588607]  qce_aead_crypt+0x1f0/0x81c
> [    9.592577]  qce_aead_encrypt+0x14/0x20
> [    9.596547]  crypto_aead_encrypt+0xa0/0xe0
> [    9.600776]  test_aead_vec_cfg+0x84c/0x1be8
> [    9.605097]  test_aead_vs_generic_impl+0x530/0x850
> [    9.610039]  alg_test_aead+0x7c8/0xe40
> [    9.613911]  alg_test+0x1f4/0xb0c
> [    9.617345]  cryptomgr_test+0x50/0x80
> [    9.621124]  kthread+0x378/0x678
> [    9.624473]  ret_from_fork+0x10/0x20
>
> [    9.629738] The buggy address belongs to the object at ffff5c7440d1cc00
>                 which belongs to the cache kmalloc-32 of size 32
> [    9.642426] The buggy address is located 0 bytes inside of
>                 allocated 22-byte region [ffff5c7440d1cc00, ffff5c7440d1cc16)
>
> [    9.656672] The buggy address belongs to the physical page:
> [    9.662409] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x880d1c
> [    9.670646] flags: 0xbfffe0000000000(node=0|zone=2|lastcpupid=0x1ffff)
> [    9.677371] page_type: f5(slab)
> [    9.680624] raw: 0bfffe0000000000 ffff5c7440002780 dead000000000100 dead000000000122
> [    9.688595] raw: 0000000000000000 0000000080400040 00000000f5000000 0000000000000000
> [    9.696560] page dumped because: kasan: bad access detected
>
> [    9.703854] Memory state around the buggy address:
> [    9.708796]  ffff5c7440d1cb00: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
> [    9.716227]  ffff5c7440d1cb80: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
> [    9.723660] >ffff5c7440d1cc00: 00 00 06 fc fc fc fc fc fa fb fb fb fc fc fc fc
> [    9.731092]                          ^
> [    9.734959]  ffff5c7440d1cc80: fa fb fb fb fc fc fc fc 00 00 06 fc fc fc fc fc
> [    9.742392]  ffff5c7440d1cd00: 00 00 06 fc fc fc fc fc fa fb fb fb fc fc fc fc
> [    9.749824] ==================================================================
>

Hmm, I am running with KASAN on, never seen this How would I run
tcrypt to trigger it? By default the module doesn't load with no
params and gives me:

modprobe: ERROR: could not insert 'tcrypt': Resource temporarily unavailable

>
> I personally still struggle to understand how this driver could plausibly be
> useful when the software crypto has no issues, is much faster, and is much
> better tested.  What is motivating having this driver in the kernel?

We want to use it in conjunction with the upcoming scminvoke (for
loading TAs and invoking objects - used to program the keys into the
QCE) to support the DRM use-case for decrypting streaming data inside
secure buffers upstream.

Bart
Eric Biggers Jan. 18, 2025, 5:55 p.m. UTC | #4
On Sat, Jan 18, 2025 at 10:28:26AM +0100, Bartosz Golaszewski wrote:
> I was testing with kcapi-speed and cryptsetup benchmark. I've never
> seen any errors.
> 
> Is this after my changes only or did it exist before? You're testing
> with the tcrypt module? How are you inserting it exactly? What params?

Those are all benchmarks, not tests.  The tests run at registration time if you
just enable the kconfig options for them:

    # CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
    CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y

The test failures and KASAN error occur on mainline too, so yes they occur
before your patchset too.

> >
> > I personally still struggle to understand how this driver could plausibly be
> > useful when the software crypto has no issues, is much faster, and is much
> > better tested.  What is motivating having this driver in the kernel?
> 
> We want to use it in conjunction with the upcoming scminvoke (for
> loading TAs and invoking objects - used to program the keys into the
> QCE) to support the DRM use-case for decrypting streaming data inside
> secure buffers upstream.

Notably lacking is any claim that any of the current features of the driver are
actually useful.

- Eric
Bartosz Golaszewski Jan. 20, 2025, 1:46 p.m. UTC | #5
On Sat, Jan 18, 2025 at 6:55 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Sat, Jan 18, 2025 at 10:28:26AM +0100, Bartosz Golaszewski wrote:
> > I was testing with kcapi-speed and cryptsetup benchmark. I've never
> > seen any errors.
> >
> > Is this after my changes only or did it exist before? You're testing
> > with the tcrypt module? How are you inserting it exactly? What params?
>
> Those are all benchmarks, not tests.  The tests run at registration time if you
> just enable the kconfig options for them:
>
>     # CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
>     CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y
>
> The test failures and KASAN error occur on mainline too, so yes they occur
> before your patchset too.
>
> > >
> > > I personally still struggle to understand how this driver could plausibly be
> > > useful when the software crypto has no issues, is much faster, and is much
> > > better tested.  What is motivating having this driver in the kernel?
> >
> > We want to use it in conjunction with the upcoming scminvoke (for
> > loading TAs and invoking objects - used to program the keys into the
> > QCE) to support the DRM use-case for decrypting streaming data inside
> > secure buffers upstream.
>
> Notably lacking is any claim that any of the current features of the driver are
> actually useful.
>

Noted. I'm still quite early into working on the upstream-bound code
supporting the streaming use-case but I will consider a proposal to
remove existing features that are better provided by ARM CE.

Thanks,
Bartosz
diff mbox series

Patch

diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
index 6de9f1e23e282..e95e84486d9ae 100644
--- a/drivers/crypto/qce/core.c
+++ b/drivers/crypto/qce/core.c
@@ -3,6 +3,7 @@ 
  * Copyright (c) 2010-2014, The Linux Foundation. All rights reserved.
  */
 
+#include <linux/cleanup.h>
 #include <linux/clk.h>
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
@@ -11,7 +12,6 @@ 
 #include <linux/module.h>
 #include <linux/mod_devicetable.h>
 #include <linux/platform_device.h>
-#include <linux/spinlock.h>
 #include <linux/types.h>
 #include <crypto/algapi.h>
 #include <crypto/internal/hash.h>
@@ -89,34 +89,28 @@  static int qce_handle_queue(struct qce_device *qce,
 			    struct crypto_async_request *req)
 {
 	struct crypto_async_request *async_req, *backlog;
-	unsigned long flags;
 	int ret = 0, err;
 
-	spin_lock_irqsave(&qce->lock, flags);
+	scoped_guard(mutex, &qce->lock) {
+		if (req)
+			ret = crypto_enqueue_request(&qce->queue, req);
 
-	if (req)
-		ret = crypto_enqueue_request(&qce->queue, req);
+		/* busy, do not dequeue request */
+		if (qce->req)
+			return ret;
 
-	/* busy, do not dequeue request */
-	if (qce->req) {
-		spin_unlock_irqrestore(&qce->lock, flags);
-		return ret;
+		backlog = crypto_get_backlog(&qce->queue);
+		async_req = crypto_dequeue_request(&qce->queue);
+		if (async_req)
+			qce->req = async_req;
 	}
 
-	backlog = crypto_get_backlog(&qce->queue);
-	async_req = crypto_dequeue_request(&qce->queue);
-	if (async_req)
-		qce->req = async_req;
-
-	spin_unlock_irqrestore(&qce->lock, flags);
-
 	if (!async_req)
 		return ret;
 
 	if (backlog) {
-		spin_lock_bh(&qce->lock);
-		crypto_request_complete(backlog, -EINPROGRESS);
-		spin_unlock_bh(&qce->lock);
+		scoped_guard(mutex, &qce->lock)
+			crypto_request_complete(backlog, -EINPROGRESS);
 	}
 
 	err = qce_handle_request(async_req);
@@ -133,12 +127,11 @@  static void qce_req_done_work(struct work_struct *work)
 	struct qce_device *qce = container_of(work, struct qce_device,
 					      done_work);
 	struct crypto_async_request *req;
-	unsigned long flags;
 
-	spin_lock_irqsave(&qce->lock, flags);
-	req = qce->req;
-	qce->req = NULL;
-	spin_unlock_irqrestore(&qce->lock, flags);
+	scoped_guard(mutex, &qce->lock) {
+		req = qce->req;
+		qce->req = NULL;
+	}
 
 	if (req)
 		crypto_request_complete(req, qce->result);
@@ -243,7 +236,10 @@  static int qce_crypto_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	spin_lock_init(&qce->lock);
+	ret = devm_mutex_init(qce->dev, &qce->lock);
+	if (ret)
+		return ret;
+
 	INIT_WORK(&qce->done_work, qce_req_done_work);
 	crypto_init_queue(&qce->queue, QCE_QUEUE_LENGTH);
 
diff --git a/drivers/crypto/qce/core.h b/drivers/crypto/qce/core.h
index 39e75a75a4293..eb6fa7a8b64a8 100644
--- a/drivers/crypto/qce/core.h
+++ b/drivers/crypto/qce/core.h
@@ -6,6 +6,7 @@ 
 #ifndef _CORE_H_
 #define _CORE_H_
 
+#include <linux/mutex.h>
 #include <linux/workqueue.h>
 
 #include "dma.h"
@@ -30,7 +31,7 @@ 
  */
 struct qce_device {
 	struct crypto_queue queue;
-	spinlock_t lock;
+	struct mutex lock;
 	struct work_struct done_work;
 	struct crypto_async_request *req;
 	int result;