Message ID | Z72FJnbA39zWh4zS@gondor.apana.org.au |
---|---|
State | New |
Series | mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead |
On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote:
>
> Interesting, it's weird that crypto_free_acomp() allocates memory. Do
> you have the specific call path?

crypto_free_acomp does not allocate memory. However, it takes the
same mutex that is also taken on the allocation path.

The specific call path can be seen in the original report:

https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82

Cheers,
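[Editorial note] To make the lock dependency concrete, below is a minimal userspace model of the cycle being described: a free routine that internally takes the allocation-path lock is called while another lock is held. The names alloc_path_lock and ctx_lock are stand-ins for scomp_lock and the per-CPU acomp_ctx mutex; this is an illustrative sketch, not the actual crypto or zswap code.

```c
/*
 * Illustrative userspace model of the reported lock cycle -- NOT kernel code.
 * Build: cc -pthread deadlock-model.c
 * Note: the program intentionally hangs, demonstrating the deadlock.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t alloc_path_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for scomp_lock */
static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;        /* stand-in for acomp_ctx->mutex */

/* "Task A": allocating a tfm; the allocation path ends up needing the
 * per-CPU context lock (e.g. via reclaim). */
static void *task_a(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&alloc_path_lock);   /* like holding scomp_lock */
	printf("A: holds alloc_path_lock, now needs ctx_lock\n");
	usleep(100 * 1000);                     /* give B time to take ctx_lock */
	pthread_mutex_lock(&ctx_lock);          /* blocks: B holds it */
	printf("A: never reached\n");
	return NULL;
}

/* "CPU-dead callback": holds the context lock and calls a free routine
 * that needs the allocation-path lock. */
static void *cpu_dead_cb(void *arg)
{
	(void)arg;
	usleep(50 * 1000);                      /* let A take alloc_path_lock first */
	pthread_mutex_lock(&ctx_lock);
	printf("B: holds ctx_lock, 'free' now needs alloc_path_lock\n");
	pthread_mutex_lock(&alloc_path_lock);   /* blocks: A holds it -> deadlock */
	printf("B: never reached\n");
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, task_a, NULL);
	pthread_create(&b, NULL, cpu_dead_cb, NULL);
	pthread_join(a, NULL);                  /* hangs here: A and B deadlock */
	pthread_join(b, NULL);
	return 0;
}
```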
On Wed, Feb 26, 2025 at 09:25:23AM +0800, Herbert Xu wrote:
> On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote:
> >
> > Interesting, it's weird that crypto_free_acomp() allocates memory. Do
> > you have the specific call path?
>
> crypto_free_acomp does not allocate memory. However, it takes the
> same mutex that is also taken on the allocation path.
>
> The specific call path can be seen in the original report:
>
> https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82

After staring at this for a while I think the following situation could
be the problem:

Task A running on CPU #1:
  crypto_alloc_acomp_node()
    Holds scomp_lock
    Enters reclaim
    Reads per_cpu_ptr(pool->acomp_ctx, cpu)
  Task A is descheduled

zswap_cpu_comp_dead(CPU #1)  // CPU #1 going offline
  Holds the mutex of per_cpu_ptr(pool->acomp_ctx, cpu)
  Calls crypto_free_acomp()
  Waits for scomp_lock

Task A running on CPU #2:
  Waits for the mutex of per_cpu_ptr(pool->acomp_ctx, cpu)

  DEADLOCK

In this case I think the fix is correct, thanks for looking into it.

Could you please:

(1) Explain the exact scenario in the commit log. I did not understand
    it at first, only after looking at the syzbot dashboard for a while
    (and I am not sure how long this persists).

(2) Move all the freeing operations outside the mutex? Right now
    crypto_free_acomp() was the problematic call, but it could be
    acomp_request_free() next.

Something like:

static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
{
	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
	struct acomp_req *req;
	struct crypto_acomp *acomp;
	u8 *buffer;

	if (IS_ERR_OR_NULL(acomp_ctx))
		return 0;

	mutex_lock(&acomp_ctx->mutex);
	req = acomp_ctx->req;
	acomp_ctx->req = NULL;
	acomp = acomp_ctx->acomp;
	acomp_ctx->acomp = NULL;
	buffer = acomp_ctx->buffer;
	acomp_ctx->buffer = NULL;
	mutex_unlock(&acomp_ctx->mutex);

	/*
	 * Do the actual freeing after releasing the mutex to avoid subtle
	 * locking dependencies causing deadlocks.
	 */
	if (!IS_ERR_OR_NULL(req))
		acomp_request_free(req);
	if (!IS_ERR_OR_NULL(acomp))
		crypto_free_acomp(acomp);
	kfree(buffer);

	return 0;
}
On Wed, Feb 26, 2025 at 10:10:22AM +0800, Herbert Xu wrote:
> On Wed, Feb 26, 2025 at 02:08:14AM +0000, Yosry Ahmed wrote:
> >
> > Could you please:
> >
> > (1) Explain the exact scenario in the commit log, I did not understand
> >     it at first, only after looking at the syzbot dashboard for a while
> >     (and I am not sure how long this persists).
> >
> > (2) Move all the freeing operations outside the mutex? Right now
> >     crypto_free_acomp() was the problematic call but it could be
> >     acomp_request_free() next.
> >
> > Something like:
>
> Looks good to me. Feel free to send this patch since it is your
> system after all :)

Can do :) May I add your Co-developed-by and Signed-off-by since this
would be based off your patch?
On Wed, Feb 26, 2025 at 02:46:53AM +0000, Yosry Ahmed wrote:
>
> Can do :) May I add your Co-developed-by and Signed-off-by since this
> would be based off your patch?

Sure, you can also add my ack:

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
diff --git a/mm/zswap.c b/mm/zswap.c
index 6504174fbc6a..24d36266a791 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -881,18 +881,23 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
+	struct crypto_acomp *acomp = NULL;
+
+	if (IS_ERR_OR_NULL(acomp_ctx))
+		return 0;
 
 	mutex_lock(&acomp_ctx->mutex);
-	if (!IS_ERR_OR_NULL(acomp_ctx)) {
-		if (!IS_ERR_OR_NULL(acomp_ctx->req))
-			acomp_request_free(acomp_ctx->req);
-		acomp_ctx->req = NULL;
-		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
-			crypto_free_acomp(acomp_ctx->acomp);
-		kfree(acomp_ctx->buffer);
-	}
+	if (!IS_ERR_OR_NULL(acomp_ctx->req))
+		acomp_request_free(acomp_ctx->req);
+	acomp_ctx->req = NULL;
+	acomp = acomp_ctx->acomp;
+	acomp_ctx->acomp = NULL;
+	kfree(acomp_ctx->buffer);
+	acomp_ctx->buffer = NULL;
 	mutex_unlock(&acomp_ctx->mutex);
 
+	crypto_free_acomp(acomp);
+
 	return 0;
 }
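[Editorial note] For reference, here is a sketch of how zswap_cpu_comp_dead() reads with the hunk above applied. It is reconstructed by applying the diff; the inline comment is added here for context and is not part of the patch.

```c
static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
{
	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
	struct crypto_acomp *acomp = NULL;

	if (IS_ERR_OR_NULL(acomp_ctx))
		return 0;

	mutex_lock(&acomp_ctx->mutex);
	if (!IS_ERR_OR_NULL(acomp_ctx->req))
		acomp_request_free(acomp_ctx->req);
	acomp_ctx->req = NULL;
	acomp = acomp_ctx->acomp;
	acomp_ctx->acomp = NULL;
	kfree(acomp_ctx->buffer);
	acomp_ctx->buffer = NULL;
	mutex_unlock(&acomp_ctx->mutex);

	/* Freeing the tfm outside the mutex avoids the dependency on scomp_lock. */
	crypto_free_acomp(acomp);

	return 0;
}
```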