Message ID | 20250506144941.2715345-1-usama.anjum@collabora.com |
---|---|
State | New |
Headers | show |
Series | [v4] bus: mhi: host: don't free bhie tables during suspend/hibernation | expand |
On 5/6/2025 10:49 PM, Muhammad Usama Anjum wrote: > Fix dma_direct_alloc() failure at resume time during bhie_table > allocation because of memory pressure. There is a report where at > resume time, the memory from the dma doesn't get allocated and MHI > fails to re-initialize. > > To fix it, don't free the memory at power down during suspend / > hibernation. Instead, use the same allocated memory again after every > resume / hibernation. This patch has been tested with resume and > hibernation both. > > The rddm is of constant size for a given hardware. While the fbc_image > size depends on the firmware. If the firmware changes, we'll free and > allocate new memory for it. > > Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6 > > Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> with WLAN firmware unchanged, suspend/resume is working well: Tested-on: WCN7850 hw2.0 WLAN.HMT.1.1.c5-00284-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Tested-by: Baochen Qiang <quic_bqiang@quicinc.com>
On 5/6/2025 7:49 AM, Muhammad Usama Anjum wrote: > Fix dma_direct_alloc() failure at resume time during bhie_table > allocation because of memory pressure. There is a report where at > resume time, the memory from the dma doesn't get allocated and MHI > fails to re-initialize. > > To fix it, don't free the memory at power down during suspend / > hibernation. Instead, use the same allocated memory again after every > resume / hibernation. This patch has been tested with resume and > hibernation both. > > The rddm is of constant size for a given hardware. While the fbc_image > size depends on the firmware. If the firmware changes, we'll free and > allocate new memory for it. > > Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6 > > Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Mani, if this looks ok to you I assume you'll take this through your MHI tree. Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote: > Fix dma_direct_alloc() failure at resume time during bhie_table > allocation because of memory pressure. There is a report where at > resume time, the memory from the dma doesn't get allocated and MHI > fails to re-initialize. > > To fix it, don't free the memory at power down during suspend / > hibernation. Instead, use the same allocated memory again after every > resume / hibernation. This patch has been tested with resume and > hibernation both. > > The rddm is of constant size for a given hardware. While the fbc_image > size depends on the firmware. If the firmware changes, we'll free and > allocate new memory for it. Why is it valid to load new firmware as a result of suspend? I don't users would expect that. > diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c > index efa3b6dddf4d2..bc8459798bbee 100644 > --- a/drivers/bus/mhi/host/boot.c > +++ b/drivers/bus/mhi/host/boot.c > @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller *mhi_cntrl) > * device transitioning into MHI READY state > */ > if (fw_load_type == MHI_FW_LOAD_FBC) { Why is this FBC specific? > - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz); > - if (ret) { > - release_firmware(firmware); > - goto error_fw_load; > + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { > + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); > + mhi_cntrl->fbc_image = NULL; > + } > + if (!mhi_cntrl->fbc_image) { > + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz); > + if (ret) { > + release_firmware(firmware); > + goto error_fw_load; > + } > + mhi_cntrl->prev_fw_sz = fw_sz; > } > > /* Load the firmware into BHIE vec table */ > diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c > index e6c3ff62bab1d..107d71b4cc51a 100644 > --- a/drivers/bus/mhi/host/pm.c > +++ b/drivers/bus/mhi/host/pm.c > @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful) > } > EXPORT_SYMBOL_GPL(mhi_power_down); > > +static void __mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl) > +{ > + mhi_cntrl->bhi = NULL; > + mhi_cntrl->bhie = NULL; Why? > + > + mhi_deinit_dev_ctxt(mhi_cntrl); > +}
On 5/12/25 11:46 PM, Jeff Hugo wrote: > On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote: >> Fix dma_direct_alloc() failure at resume time during bhie_table >> allocation because of memory pressure. There is a report where at >> resume time, the memory from the dma doesn't get allocated and MHI >> fails to re-initialize. >> >> To fix it, don't free the memory at power down during suspend / >> hibernation. Instead, use the same allocated memory again after every >> resume / hibernation. This patch has been tested with resume and >> hibernation both. >> >> The rddm is of constant size for a given hardware. While the fbc_image >> size depends on the firmware. If the firmware changes, we'll free and >> allocate new memory for it. > > Why is it valid to load new firmware as a result of suspend? I don't > users would expect that. I'm not sure its valid or not. Like other users, I also don't expect that firmware would get changed. It doesn't seem to be tested and hence supported case. But other drivers have code which have implementation like this. I'd mentioned previously that this patch was motivated from the ath12k [1] and ath11k [2] patches. They don't free the memory and reuse the same memory if new size is same. > >> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c >> index efa3b6dddf4d2..bc8459798bbee 100644 >> --- a/drivers/bus/mhi/host/boot.c >> +++ b/drivers/bus/mhi/host/boot.c >> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller >> *mhi_cntrl) >> * device transitioning into MHI READY state >> */ >> if (fw_load_type == MHI_FW_LOAD_FBC) { > > Why is this FBC specific? It seems we allocate fbc_image only when firmware load type is FW_LOAD_FBC. I'm just optimizing the buffer allocation here. > >> - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, >> fw_sz); >> - if (ret) { >> - release_firmware(firmware); >> - goto error_fw_load; >> + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { >> + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); >> + mhi_cntrl->fbc_image = NULL; >> + } >> + if (!mhi_cntrl->fbc_image) { >> + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl- >> >fbc_image, fw_sz); >> + if (ret) { >> + release_firmware(firmware); >> + goto error_fw_load; >> + } >> + mhi_cntrl->prev_fw_sz = fw_sz; >> } >> /* Load the firmware into BHIE vec table */ >> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c >> index e6c3ff62bab1d..107d71b4cc51a 100644 >> --- a/drivers/bus/mhi/host/pm.c >> +++ b/drivers/bus/mhi/host/pm.c >> @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller >> *mhi_cntrl, bool graceful) >> } >> EXPORT_SYMBOL_GPL(mhi_power_down); >> +static void __mhi_power_down_unprepare_keep_dev(struct >> mhi_controller *mhi_cntrl) >> +{ >> + mhi_cntrl->bhi = NULL; >> + mhi_cntrl->bhie = NULL; > > Why? This function is shorter version of mhi_unprepare_after_power_down(). As we need different code path in case of suspend/hibernation case, I was adding a new API which Mani asked me remove and consolidate into mhi_power_down_keep_dev() instead. So this static function has been added. [3] > >> + >> + mhi_deinit_dev_ctxt(mhi_cntrl); >> +} [1] https://lore.kernel.org/all/20240419034034.2842-1-quic_bqiang@quicinc.com/ [2] https://lore.kernel.org/all/20220506141448.10340-1-quic_akolli@quicinc.com/ [3] https://lore.kernel.org/all/y5odcxzms6mwpz5bdxhbjxo7p6whsdgwm772usmmzqobhf6nam@p4ul7vn7d3an
On 5/13/2025 12:44 AM, Muhammad Usama Anjum wrote: > On 5/12/25 11:46 PM, Jeff Hugo wrote: >> On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote: >>> Fix dma_direct_alloc() failure at resume time during bhie_table >>> allocation because of memory pressure. There is a report where at >>> resume time, the memory from the dma doesn't get allocated and MHI >>> fails to re-initialize. >>> >>> To fix it, don't free the memory at power down during suspend / >>> hibernation. Instead, use the same allocated memory again after every >>> resume / hibernation. This patch has been tested with resume and >>> hibernation both. >>> >>> The rddm is of constant size for a given hardware. While the fbc_image >>> size depends on the firmware. If the firmware changes, we'll free and >>> allocate new memory for it. >> >> Why is it valid to load new firmware as a result of suspend? I don't >> users would expect that. > I'm not sure its valid or not. Like other users, I also don't expect > that firmware would get changed. It doesn't seem to be tested and hence > supported case. > > But other drivers have code which have implementation like this. I'd > mentioned previously that this patch was motivated from the ath12k [1] > and ath11k [2] patches. They don't free the memory and reuse the same > memory if new size is same. It feels like this justification needs to be detailed in the commit text. I suspect at some point we'll have another MHI device where the FW will need to be cached. >>> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c >>> index efa3b6dddf4d2..bc8459798bbee 100644 >>> --- a/drivers/bus/mhi/host/boot.c >>> +++ b/drivers/bus/mhi/host/boot.c >>> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller >>> *mhi_cntrl) >>> * device transitioning into MHI READY state >>> */ >>> if (fw_load_type == MHI_FW_LOAD_FBC) { >> >> Why is this FBC specific? > It seems we allocate fbc_image only when firmware load type is > FW_LOAD_FBC. I'm just optimizing the buffer allocation here. We alloc bhie tables in non FBC usecases. Is this somehow an FBC specific issue? Perhaps you could clarify the limits of this solution in the commit text? > >> >>> - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, >>> fw_sz); >>> - if (ret) { >>> - release_firmware(firmware); >>> - goto error_fw_load; >>> + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { >>> + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); >>> + mhi_cntrl->fbc_image = NULL; >>> + } >>> + if (!mhi_cntrl->fbc_image) { >>> + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl- >>>> fbc_image, fw_sz); >>> + if (ret) { >>> + release_firmware(firmware); >>> + goto error_fw_load; >>> + } >>> + mhi_cntrl->prev_fw_sz = fw_sz; >>> } >>> /* Load the firmware into BHIE vec table */ >>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c >>> index e6c3ff62bab1d..107d71b4cc51a 100644 >>> --- a/drivers/bus/mhi/host/pm.c >>> +++ b/drivers/bus/mhi/host/pm.c >>> @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller >>> *mhi_cntrl, bool graceful) >>> } >>> EXPORT_SYMBOL_GPL(mhi_power_down); >>> +static void __mhi_power_down_unprepare_keep_dev(struct >>> mhi_controller *mhi_cntrl) >>> +{ >>> + mhi_cntrl->bhi = NULL; >>> + mhi_cntrl->bhie = NULL; >> >> Why? > This function is shorter version of mhi_unprepare_after_power_down(). As > we need different code path in case of suspend/hibernation case, I was > adding a new API which Mani asked me remove and consolidate into > mhi_power_down_keep_dev() instead. So this static function has been > added. [3] I don't understand the need to zero these out. Also, if you are copying part of the functionality of mhi_unprepare_after_power_down(), shouldn't that functionality be moved into your new API to eliminate duplication?
On 5/13/25 8:16 PM, Jeff Hugo wrote: > On 5/13/2025 12:44 AM, Muhammad Usama Anjum wrote: >> On 5/12/25 11:46 PM, Jeff Hugo wrote: >>> On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote: >>>> Fix dma_direct_alloc() failure at resume time during bhie_table >>>> allocation because of memory pressure. There is a report where at >>>> resume time, the memory from the dma doesn't get allocated and MHI >>>> fails to re-initialize. >>>> >>>> To fix it, don't free the memory at power down during suspend / >>>> hibernation. Instead, use the same allocated memory again after every >>>> resume / hibernation. This patch has been tested with resume and >>>> hibernation both. >>>> >>>> The rddm is of constant size for a given hardware. While the fbc_image >>>> size depends on the firmware. If the firmware changes, we'll free and >>>> allocate new memory for it. >>> >>> Why is it valid to load new firmware as a result of suspend? I don't >>> users would expect that. >> I'm not sure its valid or not. Like other users, I also don't expect >> that firmware would get changed. It doesn't seem to be tested and hence >> supported case. >> >> But other drivers have code which have implementation like this. I'd >> mentioned previously that this patch was motivated from the ath12k [1] >> and ath11k [2] patches. They don't free the memory and reuse the same >> memory if new size is same. > > It feels like this justification needs to be detailed in the commit > text. I suspect at some point we'll have another MHI device where the FW > will need to be cached. Okay. I'll add this information to the commit message. Currently I've not seen firmware caching being used other than graphics driver. > >>>> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c >>>> index efa3b6dddf4d2..bc8459798bbee 100644 >>>> --- a/drivers/bus/mhi/host/boot.c >>>> +++ b/drivers/bus/mhi/host/boot.c >>>> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller >>>> *mhi_cntrl) >>>> * device transitioning into MHI READY state >>>> */ >>>> if (fw_load_type == MHI_FW_LOAD_FBC) { >>> >>> Why is this FBC specific? >> It seems we allocate fbc_image only when firmware load type is >> FW_LOAD_FBC. I'm just optimizing the buffer allocation here. > > We alloc bhie tables in non FBC usecases. Is this somehow an FBC > specific issue? Perhaps you could clarify the limits of this solution in > the commit text? Okay. I'll add information that we are optimizing the bhie allocations. It has nothing to do with firmware type. I've found only 2 bhie allocations; fbc_image and rddm_image. So we are optimizing those. > >> >>> >>>> - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, >>>> fw_sz); >>>> - if (ret) { >>>> - release_firmware(firmware); >>>> - goto error_fw_load; >>>> + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { >>>> + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); >>>> + mhi_cntrl->fbc_image = NULL; >>>> + } >>>> + if (!mhi_cntrl->fbc_image) { >>>> + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl- >>>>> fbc_image, fw_sz); >>>> + if (ret) { >>>> + release_firmware(firmware); >>>> + goto error_fw_load; >>>> + } >>>> + mhi_cntrl->prev_fw_sz = fw_sz; >>>> } >>>> /* Load the firmware into BHIE vec table */ >>>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c >>>> index e6c3ff62bab1d..107d71b4cc51a 100644 >>>> --- a/drivers/bus/mhi/host/pm.c >>>> +++ b/drivers/bus/mhi/host/pm.c >>>> @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller >>>> *mhi_cntrl, bool graceful) >>>> } >>>> EXPORT_SYMBOL_GPL(mhi_power_down); >>>> +static void __mhi_power_down_unprepare_keep_dev(struct >>>> mhi_controller *mhi_cntrl) >>>> +{ >>>> + mhi_cntrl->bhi = NULL; >>>> + mhi_cntrl->bhie = NULL; >>> >>> Why? >> This function is shorter version of mhi_unprepare_after_power_down(). As >> we need different code path in case of suspend/hibernation case, I was >> adding a new API which Mani asked me remove and consolidate into >> mhi_power_down_keep_dev() instead. So this static function has been >> added. [3] > > I don't understand the need to zero these out. Also, if you are copying > part of the functionality of mhi_unprepare_after_power_down(), shouldn't > that functionality be moved into your new API to eliminate duplication? This how the cleanup works mhi_unprepare_after_power_down(). Yeah, it makes sense to use this function in mhi_unprepare_after_power_down(). Sending next version soon. >
On 5/14/2025 1:17 AM, Muhammad Usama Anjum wrote: > On 5/13/25 8:16 PM, Jeff Hugo wrote: >> On 5/13/2025 12:44 AM, Muhammad Usama Anjum wrote: >>> On 5/12/25 11:46 PM, Jeff Hugo wrote: >>>> On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote: >>>>> Fix dma_direct_alloc() failure at resume time during bhie_table >>>>> allocation because of memory pressure. There is a report where at >>>>> resume time, the memory from the dma doesn't get allocated and MHI >>>>> fails to re-initialize. >>>>> >>>>> To fix it, don't free the memory at power down during suspend / >>>>> hibernation. Instead, use the same allocated memory again after every >>>>> resume / hibernation. This patch has been tested with resume and >>>>> hibernation both. >>>>> >>>>> The rddm is of constant size for a given hardware. While the fbc_image >>>>> size depends on the firmware. If the firmware changes, we'll free and >>>>> allocate new memory for it. >>>> >>>> Why is it valid to load new firmware as a result of suspend? I don't >>>> users would expect that. >>> I'm not sure its valid or not. Like other users, I also don't expect >>> that firmware would get changed. It doesn't seem to be tested and hence >>> supported case. >>> >>> But other drivers have code which have implementation like this. I'd >>> mentioned previously that this patch was motivated from the ath12k [1] >>> and ath11k [2] patches. They don't free the memory and reuse the same >>> memory if new size is same. >> >> It feels like this justification needs to be detailed in the commit >> text. I suspect at some point we'll have another MHI device where the FW >> will need to be cached. > Okay. I'll add this information to the commit message. Currently I've > not seen firmware caching being used other than graphics driver. > >> >>>>> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c >>>>> index efa3b6dddf4d2..bc8459798bbee 100644 >>>>> --- a/drivers/bus/mhi/host/boot.c >>>>> +++ b/drivers/bus/mhi/host/boot.c >>>>> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller >>>>> *mhi_cntrl) >>>>> * device transitioning into MHI READY state >>>>> */ >>>>> if (fw_load_type == MHI_FW_LOAD_FBC) { >>>> >>>> Why is this FBC specific? >>> It seems we allocate fbc_image only when firmware load type is >>> FW_LOAD_FBC. I'm just optimizing the buffer allocation here. >> >> We alloc bhie tables in non FBC usecases. Is this somehow an FBC >> specific issue? Perhaps you could clarify the limits of this solution in >> the commit text? > Okay. I'll add information that we are optimizing the bhie allocations. > It has nothing to do with firmware type. I've found only 2 bhie > allocations; fbc_image and rddm_image. So we are optimizing those. There is a 3rd allocation, and it has everything to do with firmware type. Did you miss mhi_load_image_bhie()? I'm not asking you to touch mhi_load_image_bhie(), but to recognize that what you are doing is specific to some BHIe devices, not all. > >> >>> >>>> >>>>> - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, >>>>> fw_sz); >>>>> - if (ret) { >>>>> - release_firmware(firmware); >>>>> - goto error_fw_load; >>>>> + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { >>>>> + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); >>>>> + mhi_cntrl->fbc_image = NULL; >>>>> + } >>>>> + if (!mhi_cntrl->fbc_image) { >>>>> + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl- >>>>>> fbc_image, fw_sz); >>>>> + if (ret) { >>>>> + release_firmware(firmware); >>>>> + goto error_fw_load; >>>>> + } >>>>> + mhi_cntrl->prev_fw_sz = fw_sz; >>>>> } >>>>> /* Load the firmware into BHIE vec table */ >>>>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c >>>>> index e6c3ff62bab1d..107d71b4cc51a 100644 >>>>> --- a/drivers/bus/mhi/host/pm.c >>>>> +++ b/drivers/bus/mhi/host/pm.c >>>>> @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller >>>>> *mhi_cntrl, bool graceful) >>>>> } >>>>> EXPORT_SYMBOL_GPL(mhi_power_down); >>>>> +static void __mhi_power_down_unprepare_keep_dev(struct >>>>> mhi_controller *mhi_cntrl) >>>>> +{ >>>>> + mhi_cntrl->bhi = NULL; >>>>> + mhi_cntrl->bhie = NULL; >>>> >>>> Why? >>> This function is shorter version of mhi_unprepare_after_power_down(). As >>> we need different code path in case of suspend/hibernation case, I was >>> adding a new API which Mani asked me remove and consolidate into >>> mhi_power_down_keep_dev() instead. So this static function has been >>> added. [3] >> >> I don't understand the need to zero these out. Also, if you are copying >> part of the functionality of mhi_unprepare_after_power_down(), shouldn't >> that functionality be moved into your new API to eliminate duplication? > This how the cleanup works mhi_unprepare_after_power_down(). Yeah, it > makes sense to use this function in mhi_unprepare_after_power_down(). > > Sending next version soon. >> > >
diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c index efa3b6dddf4d2..bc8459798bbee 100644 --- a/drivers/bus/mhi/host/boot.c +++ b/drivers/bus/mhi/host/boot.c @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller *mhi_cntrl) * device transitioning into MHI READY state */ if (fw_load_type == MHI_FW_LOAD_FBC) { - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz); - if (ret) { - release_firmware(firmware); - goto error_fw_load; + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); + mhi_cntrl->fbc_image = NULL; + } + if (!mhi_cntrl->fbc_image) { + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz); + if (ret) { + release_firmware(firmware); + goto error_fw_load; + } + mhi_cntrl->prev_fw_sz = fw_sz; } /* Load the firmware into BHIE vec table */ diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c index 13e7a55f54ff4..a7663ad16bfc6 100644 --- a/drivers/bus/mhi/host/init.c +++ b/drivers/bus/mhi/host/init.c @@ -1173,8 +1173,9 @@ int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl) /* * Allocate RDDM table for debugging purpose if specified */ - mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->rddm_image, - mhi_cntrl->rddm_size); + if (!mhi_cntrl->rddm_image) + mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->rddm_image, + mhi_cntrl->rddm_size); if (mhi_cntrl->rddm_image) { ret = mhi_rddm_prepare(mhi_cntrl, mhi_cntrl->rddm_image); diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c index e6c3ff62bab1d..107d71b4cc51a 100644 --- a/drivers/bus/mhi/host/pm.c +++ b/drivers/bus/mhi/host/pm.c @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful) } EXPORT_SYMBOL_GPL(mhi_power_down); +static void __mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl) +{ + mhi_cntrl->bhi = NULL; + mhi_cntrl->bhie = NULL; + + mhi_deinit_dev_ctxt(mhi_cntrl); +} + void mhi_power_down_keep_dev(struct mhi_controller *mhi_cntrl, bool graceful) { __mhi_power_down(mhi_cntrl, graceful, false); + __mhi_power_down_unprepare_keep_dev(mhi_cntrl); } EXPORT_SYMBOL_GPL(mhi_power_down_keep_dev); diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c index acd76e9392d31..c5dc776b23643 100644 --- a/drivers/net/wireless/ath/ath11k/mhi.c +++ b/drivers/net/wireless/ath/ath11k/mhi.c @@ -460,12 +460,12 @@ void ath11k_mhi_stop(struct ath11k_pci *ab_pci, bool is_suspend) * workaround, otherwise ath11k_core_resume() will timeout * during resume. */ - if (is_suspend) + if (is_suspend) { mhi_power_down_keep_dev(ab_pci->mhi_ctrl, true); - else + } else { mhi_power_down(ab_pci->mhi_ctrl, true); - - mhi_unprepare_after_power_down(ab_pci->mhi_ctrl); + mhi_unprepare_after_power_down(ab_pci->mhi_ctrl); + } } int ath11k_mhi_suspend(struct ath11k_pci *ab_pci) diff --git a/drivers/net/wireless/ath/ath12k/mhi.c b/drivers/net/wireless/ath/ath12k/mhi.c index 08f44baf182a5..3af524ccf4a5a 100644 --- a/drivers/net/wireless/ath/ath12k/mhi.c +++ b/drivers/net/wireless/ath/ath12k/mhi.c @@ -601,6 +601,12 @@ static int ath12k_mhi_set_state(struct ath12k_pci *ab_pci, ath12k_mhi_set_state_bit(ab_pci, mhi_state); + /* mhi_power_down_keep_dev() has been updated to DEINIT without + * freeing bhie tables + */ + if (mhi_state == ATH12K_MHI_POWER_OFF_KEEP_DEV) + ath12k_mhi_set_state_bit(ab_pci, ATH12K_MHI_DEINIT); + return 0; out: @@ -635,12 +641,12 @@ void ath12k_mhi_stop(struct ath12k_pci *ab_pci, bool is_suspend) * workaround, otherwise ath12k_core_resume() will timeout * during resume. */ - if (is_suspend) + if (is_suspend) { ath12k_mhi_set_state(ab_pci, ATH12K_MHI_POWER_OFF_KEEP_DEV); - else + } else { ath12k_mhi_set_state(ab_pci, ATH12K_MHI_POWER_OFF); - - ath12k_mhi_set_state(ab_pci, ATH12K_MHI_DEINIT); + ath12k_mhi_set_state(ab_pci, ATH12K_MHI_DEINIT); + } } void ath12k_mhi_suspend(struct ath12k_pci *ab_pci) diff --git a/include/linux/mhi.h b/include/linux/mhi.h index dd372b0123a6d..6fd218a877855 100644 --- a/include/linux/mhi.h +++ b/include/linux/mhi.h @@ -306,6 +306,7 @@ struct mhi_controller_config { * if fw_image is NULL and fbc_download is true (optional) * @fw_sz: Firmware image data size for normal booting, used only if fw_image * is NULL and fbc_download is true (optional) + * @prev_fw_sz: Previous firmware image data size, when fbc_download is true * @edl_image: Firmware image name for emergency download mode (optional) * @rddm_size: RAM dump size that host should allocate for debugging purpose * @sbl_size: SBL image size downloaded through BHIe (optional) @@ -382,6 +383,7 @@ struct mhi_controller { const char *fw_image; const u8 *fw_data; size_t fw_sz; + size_t prev_fw_sz; const char *edl_image; size_t rddm_size; size_t sbl_size;
Fix dma_direct_alloc() failure at resume time during bhie_table allocation because of memory pressure. There is a report where at resume time, the memory from the dma doesn't get allocated and MHI fails to re-initialize. To fix it, don't free the memory at power down during suspend / hibernation. Instead, use the same allocated memory again after every resume / hibernation. This patch has been tested with resume and hibernation both. The rddm is of constant size for a given hardware. While the fbc_image size depends on the firmware. If the firmware changes, we'll free and allocate new memory for it. Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6 Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> --- Changes since v1: - Don't free bhie tables during suspend/hibernation only - Handle fbc_image changed size correctly - Remove fbc_image getting set to NULL in *free_bhie_table() Changes since v2: - Remove the new mhi_partial_unprepare_after_power_down() and instead update mhi_power_down_keep_dev() to use mhi_power_down_unprepare_keep_dev() as suggested by Mani - Update all users of this API such as ath12k (previously only ath11k was updated) - Define prev_fw_sz in docs - Do better alignment of comments Changes since v3: - Fix state machine of ath12k by setting ATH12K_MHI_DEINIT with ATH12K_MHI_POWER_OFF_KEEP_DEV state (Thanks Sebastian for testing and finding the problem) - Use static with mhi_power_down_unprepare_keep_dev() - Remove crash log as it was showing that kworker wasn't able to allocate memory. This patch doesn't have fixes tag as we are avoiding error in case of memory pressure. We are just making this driver more robust by not freeing the memory and using the same after resuming. --- drivers/bus/mhi/host/boot.c | 15 +++++++++++---- drivers/bus/mhi/host/init.c | 5 +++-- drivers/bus/mhi/host/pm.c | 9 +++++++++ drivers/net/wireless/ath/ath11k/mhi.c | 8 ++++---- drivers/net/wireless/ath/ath12k/mhi.c | 14 ++++++++++---- include/linux/mhi.h | 2 ++ 6 files changed, 39 insertions(+), 14 deletions(-)