Message ID | 20220630213733.17689-4-bvanassche@acm.org |
---|---|
State | New |
Headers | show |
Series | Call blk_mq_free_tag_set() earlier | expand |
On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: > There are two .exit_cmd_priv implementations. Both implementations use > resources associated with the SCSI host. Make sure that these resources are Please document what the exact resources associated with this SCSI host is. We need the root cause. I understand it might be related with module unloading, since ib_srp may be gone already, but not sure if it is the exact one in this report. > still available when .exit_cmd_priv is called by moving the .exit_cmd_priv > calls from scsi_host_dev_release() to scsi_forget_host(). Moving > blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is > safe because scsi_forget_host() drains all the request queues that use the > host tag set. This guarantees that no requests are in flight and also that > no new requests will be allocated from the host tag set. > > This patch fixes the following use-after-free: > > ================================================================== > BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] > Read of size 8 at addr ffff888100337000 by task multipathd/16727 What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 point to? Thanks, Ming
On 01/07/2022 11:44, Ming Lei wrote: > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >> There are two .exit_cmd_priv implementations. Both implementations use >> resources associated with the SCSI host. Make sure that these resources are > Please document what the exact resources associated with this SCSI host is. > > We need the root cause. > > I understand it might be related with module unloading, since ib_srp may > be gone already, but not sure if it is the exact one in this report. > >> still available when .exit_cmd_priv is called by moving the .exit_cmd_priv >> calls from scsi_host_dev_release() to scsi_forget_host(). Moving >> blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is >> safe because scsi_forget_host() drains all the request queues that use the >> host tag set. This guarantees that no requests are in flight and also that >> no new requests will be allocated from the host tag set. >> >> This patch fixes the following use-after-free: >> >> ================================================================== >> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >> Read of size 8 at addr ffff888100337000 by task multipathd/16727 > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > point to? This bug was reported by me, let's input some debug information. *Attention*: below debug info come from a modified source, so the offset it is a bit different from above one. [ 120.400572] ib_srp: lizhijian: srp_exit_cmd_priv:975: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400578] ib_srp: lizhijian: srp_exit_cmd_priv:977: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400590] ================================================================== [ 120.400592] BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x6c/0x109 [ib_srp] [ 120.400616] Read of size 8 at addr ffff8881076b1800 by task multipathd/1417 [ 120.400619] [ 120.400621] CPU: 0 PID: 1417 Comm: multipathd Not tainted 5.19.0-rc1-roce-flush+ #85 [ 120.400626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014 crash> struct srp_target_port.srp_host ffff88810b8d67e0 srp_host = 0xffff8881076b1800, crash> struct srp_target_port.srp_host struct srp_target_port { [104] struct srp_host *srp_host; } crash> struct srp_host.srp_dev 0xffff8881076b1800 srp_dev = 0xffff88800bcd1400, crash> struct srp_device 0xffff88800bcd1400 struct srp_device { dev_list = { next = 0xffff888106fafd00, prev = 0xb680010900000749 }, dev = 0x0, pd = 0x0, global_rkey = 0, mr_page_mask = 3, mr_page_size = 181960704, mr_max_size = -30592, max_pages_per_mr = 117112064, has_fr = 129, use_fast_reg = 136 } crash> dis -s srp_exit_cmd_priv+0x6c FILE: ../drivers/infiniband/ulp/srp/ib_srp.c LINE: 978 973 struct srp_request *req; 974 975 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); 976 target = host_to_target(shost); 977 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); * 978 dev = target->srp_host->srp_dev; 979 ibdev = dev->dev; 980 req = scsi_cmd_priv(cmd); 981 982 kfree(req->fr_list); 983 if (req->indirect_dma_addr) { 984 ib_dma_unmap_single(ibdev, req->indirect_dma_addr, 985 target->indirect_size, 986 DMA_TO_DEVICE); Thanks Zhijian > > Thanks, > Ming >
Send again, the format of previous one is wrong. on 7/1/2022 11:44 AM, Ming Lei wrote: > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >> There are two .exit_cmd_priv implementations. Both implementations use >> resources associated with the SCSI host. Make sure that these resources are > Please document what the exact resources associated with this SCSI host is. > > We need the root cause. > > I understand it might be related with module unloading, since ib_srp may > be gone already, but not sure if it is the exact one in this report. > >> still available when .exit_cmd_priv is called by moving the .exit_cmd_priv >> calls from scsi_host_dev_release() to scsi_forget_host(). Moving >> blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is >> safe because scsi_forget_host() drains all the request queues that use the >> host tag set. This guarantees that no requests are in flight and also that >> no new requests will be allocated from the host tag set. >> >> This patch fixes the following use-after-free: >> >> ================================================================== >> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >> Read of size 8 at addr ffff888100337000 by task multipathd/16727 > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > point to? This bug was reported by me, let's input some debug information. *Attention*: below debug info came from a modified source, so the offset it is a bit different from above one. [ 120.400572] ib_srp: lizhijian: srp_exit_cmd_priv:975: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400578] ib_srp: lizhijian: srp_exit_cmd_priv:977: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400590] ================================================================== [ 120.400592] BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x6c/0x109 [ib_srp] [ 120.400616] Read of size 8 at addr ffff8881076b1800 by task multipathd/1417 [ 120.400619] [ 120.400621] CPU: 0 PID: 1417 Comm: multipathd Not tainted 5.19.0-rc1-roce-flush+ #85 [ 120.400626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014 crash> struct srp_target_port.srp_host ffff88810b8d67e0 srp_host = 0xffff8881076b1800, crash> struct srp_target_port.srp_host struct srp_target_port { [104] struct srp_host *srp_host; } crash> struct srp_host.srp_dev 0xffff8881076b1800 srp_dev = 0xffff88800bcd1400, crash> struct srp_device 0xffff88800bcd1400 struct srp_device { dev_list = { next = 0xffff888106fafd00, prev = 0xb680010900000749 }, dev = 0x0, pd = 0x0, global_rkey = 0, mr_page_mask = 3, mr_page_size = 181960704, mr_max_size = -30592, max_pages_per_mr = 117112064, has_fr = 129, use_fast_reg = 136 } crash> dis -s srp_exit_cmd_priv+0x6c FILE: ../drivers/infiniband/ulp/srp/ib_srp.c LINE: 978 973 struct srp_request *req; 974 975 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); 976 target = host_to_target(shost); 977 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); * 978 dev = target->srp_host->srp_dev; 979 ibdev = dev->dev; 980 req = scsi_cmd_priv(cmd); 981 982 kfree(req->fr_list); 983 if (req->indirect_dma_addr) { 984 ib_dma_unmap_single(ibdev, req->indirect_dma_addr, 985 target->indirect_size, 986 DMA_TO_DEVICE); 987 } 988 kfree(req->indirect_desc); 989 990 return 0; 991 } Thanks Zhijian > > > Thanks, > Ming >
On 6/30/22 20:44, Ming Lei wrote: > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >> There are two .exit_cmd_priv implementations. Both implementations use >> resources associated with the SCSI host. Make sure that these resources are > > Please document what the exact resources associated with this SCSI host is. > > We need the root cause. > > I understand it might be related with module unloading, since ib_srp may > be gone already, but not sure if it is the exact one in this report. It is not necessary to unload ib_srp to trigger this scenario. Hot-unplugging an RDMA adapter used by the ib_srp driver is sufficient. Hot-unplugging triggers a call of srp_remove_one(). srp_remove_one() itself and also its caller free resources used by srp_exit_cmd_priv(), e.g. struct ib_device. >> still available when .exit_cmd_priv is called by moving the .exit_cmd_priv >> calls from scsi_host_dev_release() to scsi_forget_host(). Moving >> blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is >> safe because scsi_forget_host() drains all the request queues that use the >> host tag set. This guarantees that no requests are in flight and also that >> no new requests will be allocated from the host tag set. >> >> This patch fixes the following use-after-free: >> >> ================================================================== >> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >> Read of size 8 at addr ffff888100337000 by task multipathd/16727 > > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > point to? I think that Li already answered this question. Thanks, Bart.
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index ef6c0e37acce..74bfa187fe19 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -182,6 +182,8 @@ void scsi_remove_host(struct Scsi_Host *shost) mutex_unlock(&shost->scan_mutex); scsi_proc_host_rm(shost); + scsi_mq_destroy_tags(shost); + spin_lock_irqsave(shost->host_lock, flags); if (scsi_host_set_state(shost, SHOST_DEL)) BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY)); @@ -295,8 +297,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, return error; /* - * Any host allocation in this function will be freed in - * scsi_host_dev_release(). + * Any resources associated with the SCSI host in this function except + * the tag set will be freed by scsi_host_dev_release(). */ out_del_dev: device_del(&shost->shost_dev); @@ -312,6 +314,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, pm_runtime_disable(&shost->shost_gendev); pm_runtime_set_suspended(&shost->shost_gendev); pm_runtime_put_noidle(&shost->shost_gendev); + scsi_mq_destroy_tags(shost); fail: return error; } @@ -345,9 +348,6 @@ static void scsi_host_dev_release(struct device *dev) kfree(dev_name(&shost->shost_dev)); } - if (shost->tag_set.tags) - scsi_mq_destroy_tags(shost); - kfree(shost->shost_data); ida_free(&host_index_ida, shost->host_no); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 6ffc9e4258a8..1aa1a279f8f3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1990,7 +1990,10 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost) void scsi_mq_destroy_tags(struct Scsi_Host *shost) { + if (!shost->tag_set.tags) + return; blk_mq_free_tag_set(&shost->tag_set); + WARN_ON_ONCE(shost->tag_set.tags); } /**