Message ID | 1602234434-924-1-git-send-email-loic.poulain@linaro.org |
---|---|
State | Superseded |
Series | bus: mhi: Fix channel close issue on driver remove |
On 2020-10-09 02:07, Loic Poulain wrote:
> Some MHI device drivers need to stop the channels in their driver
> remove callback (e.g. module unloading), but the unprepare function
> is aborted because MHI core moved the channels to suspended state
> prior calling driver remove callback. This prevents the driver to
> send a proper MHI RESET CHAN command to the device. Device is then
> unaware of the stopped state of these channels.
>
> This causes issue when driver tries to start the channels again (e.g.
> module is reloaded), since device considers channels as already
> started (inconsistent state).
>
> Fix this by allowing channel reset when channel is suspended.
>
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> ---
>  drivers/bus/mhi/core/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index d20967a..a588eac 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct
> mhi_controller *mhi_cntrl,
>  	/* no more processing events for this channel */
>  	mutex_lock(&mhi_chan->mutex);
>  	write_lock_irq(&mhi_chan->lock);
> -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
> +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
> +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
>  		write_unlock_irq(&mhi_chan->lock);
>  		mutex_unlock(&mhi_chan->mutex);
>  		return;

Hi Loic,

There should not be any reason for drivers to do an "unprepare" and send
a reset channel command during remove, as the channel context gets
cleaned up after the remove callback returns.

We do not want to allow moving from MHI_CH_STATE_SUSPENDED to
MHI_CH_STATE_DISABLED state because if a remove is called, channel
context being cleaned up implies a reset.

Also, I have a bunch of channel state machine related patches coming up
soon which solve this issue and more. We are also introducing some
missing features with that.

It would be nice if you can review/comment on those as it overhauls the
state machine.

Let me know what you think.

Thanks,
Bhaumik

The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
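
[Editor's illustration] The guard change in the quoted patch can be sketched as a small user-space C model. This is only a sketch: the enum, its values, and the function names below are hypothetical stand-ins and are not the kernel's definitions. It models just the early-return condition in __mhi_unprepare_channel() before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the channel states named in this thread.
 * These names and values are illustrative, not the kernel's enum. */
enum model_ch_state {
	MODEL_CH_DISABLED,
	MODEL_CH_ENABLED,
	MODEL_CH_SUSPENDED,
};

/* Guard before the patch: unprepare continues only for ENABLED. */
static bool model_unprepare_proceeds_old(enum model_ch_state s)
{
	return s == MODEL_CH_ENABLED;
}

/* Guard after the patch: SUSPENDED channels are allowed through too. */
static bool model_unprepare_proceeds_new(enum model_ch_state s)
{
	return s == MODEL_CH_ENABLED || s == MODEL_CH_SUSPENDED;
}

/* Usage: the two guards differ only for a suspended channel. */
static void model_guard_self_check(void)
{
	assert(!model_unprepare_proceeds_old(MODEL_CH_SUSPENDED));
	assert(model_unprepare_proceeds_new(MODEL_CH_SUSPENDED));
	assert(model_unprepare_proceeds_old(MODEL_CH_ENABLED));
	assert(model_unprepare_proceeds_new(MODEL_CH_ENABLED));
}
```

In this model a disabled channel is skipped by both guards, which matches the intent of the original early return.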
HI Bhaumik,

On Sat, 10 Oct 2020 at 02:23, <bbhatt@codeaurora.org> wrote:
>
> On 2020-10-09 02:07, Loic Poulain wrote:
> > Some MHI device drivers need to stop the channels in their driver
> > remove callback (e.g. module unloading), but the unprepare function
> > is aborted because MHI core moved the channels to suspended state
> > prior calling driver remove callback. This prevents the driver to
> > send a proper MHI RESET CHAN command to the device. Device is then
> > unaware of the stopped state of these channels.
> >
> > This causes issue when driver tries to start the channels again (e.g.
> > module is reloaded), since device considers channels as already
> > started (inconsistent state).
> >
> > Fix this by allowing channel reset when channel is suspended.
> >
> > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> > ---
> >  drivers/bus/mhi/core/main.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> > index d20967a..a588eac 100644
> > --- a/drivers/bus/mhi/core/main.c
> > +++ b/drivers/bus/mhi/core/main.c
> > @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct
> > mhi_controller *mhi_cntrl,
> >  	/* no more processing events for this channel */
> >  	mutex_lock(&mhi_chan->mutex);
> >  	write_lock_irq(&mhi_chan->lock);
> > -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
> > +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
> > +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
> >  		write_unlock_irq(&mhi_chan->lock);
> >  		mutex_unlock(&mhi_chan->mutex);
> >  		return;
> Hi Loic,
>
> There should not be any reason for drivers to do an "unprepare" and send
> a reset channel
> command during remove, as the channel context gets cleaned up after the
> remove callback
> returns.

Well, a good practice is to have a balanced interface, and everything we do in
probe() should be undoable in remove(). Here we start the channel in probe()
and explicitly stop them in remove(), So I think doing unprepare in remove should
work anyway, even if the MHI stack does some cleanup on its own.

>
>
> We do not want to allow moving from MHI_CH_STATE_SUSPENDED to
> MHI_CH_STATE_DISABLED state
> because if a remove is called, channel context being cleaned up implies
> a reset.

AFAIK today, no reset command is sent on remove.

>
> Also, I have a bunch of channel state machine related patches coming up
> soon which solve
> this issue and more. We are also introducing some missing features with
> that.
>
> It would be nice if you can review/comment on those as it overhauls the
> state machine.

Sure, feel free to submit.

Regards,
Loic
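
[Editor's illustration] The unload/reload problem described in the commit message and in the reply above can be sketched with a toy host/device model. Everything here is hypothetical: the struct, the function names, and in particular the assumption that the device rejects a start request for a channel it still considers started. The sketch only shows why a missing channel reset on remove() leaves the two sides out of sync.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device-side view of a single channel. */
struct model_device {
	bool chan_started;
};

/* Assumption for illustration: the device refuses to start a channel
 * that it believes is already started (the "inconsistent state"). */
static int model_start_chan(struct model_device *dev)
{
	if (dev->chan_started)
		return -1;
	dev->chan_started = true;
	return 0;
}

/* Models the effect of a channel reset command reaching the device. */
static void model_reset_chan(struct model_device *dev)
{
	dev->chan_started = false;
}

/* One probe/remove/probe cycle. reset_when_suspended selects the
 * patched guard (true) or the original guard (false). Returns the
 * result of the second start attempt. */
static int model_reload_cycle(bool reset_when_suspended)
{
	struct model_device dev = { .chan_started = false };
	bool host_chan_suspended = false;

	/* probe(): the driver starts the channel. */
	if (model_start_chan(&dev) != 0)
		return -1;

	/* The core moves the channel to suspended before remove(). */
	host_chan_suspended = true;

	/* remove(): unprepare sends the reset only if the guard allows. */
	if (!host_chan_suspended || reset_when_suspended)
		model_reset_chan(&dev);

	/* Module reloaded: probe() tries to start the channel again. */
	return model_start_chan(&dev);
}

/* Usage: only the patched guard lets the second start succeed. */
static void model_reload_self_check(void)
{
	assert(model_reload_cycle(false) == -1);
	assert(model_reload_cycle(true) == 0);
}
```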
On 2020-10-09 23:06, Loic Poulain wrote:
> HI Bhaumik,
>
> On Sat, 10 Oct 2020 at 02:23, <bbhatt@codeaurora.org> wrote:
>>
>> On 2020-10-09 02:07, Loic Poulain wrote:
>> > Some MHI device drivers need to stop the channels in their driver
>> > remove callback (e.g. module unloading), but the unprepare function
>> > is aborted because MHI core moved the channels to suspended state
>> > prior calling driver remove callback. This prevents the driver to
>> > send a proper MHI RESET CHAN command to the device. Device is then
>> > unaware of the stopped state of these channels.
>> >
>> > This causes issue when driver tries to start the channels again (e.g.
>> > module is reloaded), since device considers channels as already
>> > started (inconsistent state).
>> >
>> > Fix this by allowing channel reset when channel is suspended.
>> >
>> > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
>> > ---
>> >  drivers/bus/mhi/core/main.c | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
>> > index d20967a..a588eac 100644
>> > --- a/drivers/bus/mhi/core/main.c
>> > +++ b/drivers/bus/mhi/core/main.c
>> > @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct
>> > mhi_controller *mhi_cntrl,
>> >  	/* no more processing events for this channel */
>> >  	mutex_lock(&mhi_chan->mutex);
>> >  	write_lock_irq(&mhi_chan->lock);
>> > -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
>> > +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
>> > +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
>> >  		write_unlock_irq(&mhi_chan->lock);
>> >  		mutex_unlock(&mhi_chan->mutex);
>> >  		return;
>> Hi Loic,
>>
>> There should not be any reason for drivers to do an "unprepare" and
>> send
>> a reset channel
>> command during remove, as the channel context gets cleaned up after
>> the
>> remove callback
>> returns.
>
> Well, a good practice is to have a balanced interface, and everything
> we do in
> probe() should be undoable in remove(). Here we start the channel in
> probe()
> and explicitly stop them in remove(), So I think doing unprepare in
> remove should
> work anyway, even if the MHI stack does some cleanup on its own.
>
I agree. You are allowed to call "unprepare" but MHI core driver decides what to
do with it. I can explain below why we do nothing if the channel is suspended.
>>
>>
>> We do not want to allow moving from MHI_CH_STATE_SUSPENDED to
>> MHI_CH_STATE_DISABLED state
>> because if a remove is called, channel context being cleaned up
>> implies
>> a reset.
>
> AFAIK today, no reset command is sent on remove.
>
Yes. That's correct and it can be a problem for modules like yours because the
device never gets notified so it remains unable to clean up its local info
whereas host has moved on already.

Your change must go in to ensure that this clean up happens on device but there
need to be additional changes to make sure that we do not end up sending any
command if unnecessary.

These are the ways we could have a .remove call:
1. An explicit module unload such as yours
2. After a device crash or SYS_ERROR
3. After a host crash or host initiated shutdown where an explicit MHI RESET is
sent due to the host processor powering off MHI.

In cases #2 and #3, we cannot send individual channel reset commands because
MHI on device is already in a RESET state.
In #2, device will be dead, so we don't expect to receive any command responses.
In #3, the master switch MHI RESET command lets the device know not to attempt
any DDR accesses so no channel traffic will be there.

Both these cases allow us to clean up the channel context for individual
channels such as yours without the need to send an individual channel reset.

But in case #1, with your patch in place, if we allow channel reset to be sent,
we will also send this command and wait for a response in cases #2 and #3.
Hence, we need knowledge of MHI_PM_IN_ERROR_STATE() present in the "unprepare"
function or any check that allows us to skip sending a command. My upcoming set
of patches adds that along with other features.

If required, I can push these checks as a separate change to unblock this.
>>
>> Also, I have a bunch of channel state machine related patches coming
>> up
>> soon which solve
>> this issue and more. We are also introducing some missing features
>> with
>> that.
>>
>> It would be nice if you can review/comment on those as it overhauls
>> the
>> state machine.
>
> Sure, feel free to submit.
>
> Regards,
> Loic

Thanks,
Bhaumik
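
[Editor's illustration] The three remove paths listed above, and the point that a channel reset command should only be sent when the device can still answer, can be sketched as follows. The enum and function names are hypothetical, and the mapping of cases #2 and #3 onto an "error state" predicate is an assumption made for this sketch; it stands in for the kind of check the email attributes to MHI_PM_IN_ERROR_STATE() and is not the kernel macro itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three remove paths from the email. */
enum model_remove_cause {
	MODEL_MODULE_UNLOAD,	/* case #1: explicit module unload */
	MODEL_DEVICE_CRASH,	/* case #2: device crash or SYS_ERROR */
	MODEL_HOST_SHUTDOWN,	/* case #3: host crash or shutdown */
};

/* Assumption: in cases #2 and #3 the device-side MHI is already in
 * reset, so the model treats both as an error/shutdown condition. */
static bool model_pm_in_error_state(enum model_remove_cause cause)
{
	return cause != MODEL_MODULE_UNLOAD;
}

/* Send a per-channel reset only when a response can be expected;
 * otherwise just clean up the host-side channel context. */
static bool model_should_send_chan_reset(enum model_remove_cause cause)
{
	return !model_pm_in_error_state(cause);
}

/* Usage: only the explicit module unload path sends the command. */
static void model_cause_self_check(void)
{
	assert(model_should_send_chan_reset(MODEL_MODULE_UNLOAD));
	assert(!model_should_send_chan_reset(MODEL_DEVICE_CRASH));
	assert(!model_should_send_chan_reset(MODEL_HOST_SHUTDOWN));
}
```

Keeping the decision in one predicate mirrors the request in the email for a single check in the "unprepare" path that allows the command to be skipped.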
On 2020-10-15 10:47, Bhaumik Bhatt wrote:
> On 2020-10-09 23:06, Loic Poulain wrote:
>> HI Bhaumik,
>>
>> On Sat, 10 Oct 2020 at 02:23, <bbhatt@codeaurora.org> wrote:
>>>
>>> On 2020-10-09 02:07, Loic Poulain wrote:
>>> > Some MHI device drivers need to stop the channels in their driver
>>> > remove callback (e.g. module unloading), but the unprepare function
>>> > is aborted because MHI core moved the channels to suspended state
>>> > prior calling driver remove callback. This prevents the driver to
>>> > send a proper MHI RESET CHAN command to the device. Device is then
>>> > unaware of the stopped state of these channels.
>>> >
>>> > This causes issue when driver tries to start the channels again (e.g.
>>> > module is reloaded), since device considers channels as already
>>> > started (inconsistent state).
>>> >
>>> > Fix this by allowing channel reset when channel is suspended.
>>> >
>>> > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
>>> > ---
>>> >  drivers/bus/mhi/core/main.c | 3 ++-
>>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>>> >
>>> > diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
>>> > index d20967a..a588eac 100644
>>> > --- a/drivers/bus/mhi/core/main.c
>>> > +++ b/drivers/bus/mhi/core/main.c
>>> > @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct
>>> > mhi_controller *mhi_cntrl,
>>> >  	/* no more processing events for this channel */
>>> >  	mutex_lock(&mhi_chan->mutex);
>>> >  	write_lock_irq(&mhi_chan->lock);
>>> > -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
>>> > +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
>>> > +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
>>> >  		write_unlock_irq(&mhi_chan->lock);
>>> >  		mutex_unlock(&mhi_chan->mutex);
>>> >  		return;
>>> Hi Loic,
>>>
>>> There should not be any reason for drivers to do an "unprepare" and
>>> send
>>> a reset channel
>>> command during remove, as the channel context gets cleaned up after
>>> the
>>> remove callback
>>> returns.
>>
>> Well, a good practice is to have a balanced interface, and everything
>> we do in
>> probe() should be undoable in remove(). Here we start the channel in
>> probe()
>> and explicitly stop them in remove(), So I think doing unprepare in
>> remove should
>> work anyway, even if the MHI stack does some cleanup on its own.
>>
> I agree. You are allowed to call "unprepare" but MHI core driver
> decides what to
> do with it. I can explain below why we do nothing if the channel is
> suspended.
>>>
>>>
>>> We do not want to allow moving from MHI_CH_STATE_SUSPENDED to
>>> MHI_CH_STATE_DISABLED state
>>> because if a remove is called, channel context being cleaned up
>>> implies
>>> a reset.
>>
>> AFAIK today, no reset command is sent on remove.
>>
> Yes. That's correct and it can be a problem for modules like yours
> because the
> device never gets notified so it remains unable to clean up its local
> info
> whereas host has moved on already.
>
> Your change must go in to ensure that this clean up happens on device
> but there
> need to be additional changes to make sure that we do not end up
> sending any
> command if unnecessary.
>
> These are the ways we could have a .remove call:
> 1. An explicit module unload such as yours
> 2. After a device crash or SYS_ERROR
> 3. After a host crash or host initiated shutdown where an explicit MHI
> RESET is
> sent due to the host processor powering off MHI.
>
> In cases #2 and #3, we cannot send individual channel reset commands
> because
> MHI on device is already in a RESET state.
> In #2, device will be dead, so we don't expect to receive any command
> responses.
> In #3, the master switch MHI RESET command lets the device know not to
> attempt
> any DDR accesses so no channel traffic will be there.
>
> Both these cases allow us to clean up the channel context for
> individual
> channels such as yours without the need to send an individual channel
> reset.
>
> But in case #1, with your patch in place, if we allow channel reset to
> be sent,
> we will also send this command and wait for a response in cases #2 and
> #3.
> Hence, we need knowledge of MHI_PM_IN_ERROR_STATE() present in the
> "unprepare"
> function or any check that allows us to skip sending a command. My
> upcoming set
> of patches adds that along with other features.
>
> If required, I can push these checks as a separate change to unblock
> this.
>>>
>>> Also, I have a bunch of channel state machine related patches coming
>>> up
>>> soon which solve
>>> this issue and more. We are also introducing some missing features
>>> with
>>> that.
>>>
>>> It would be nice if you can review/comment on those as it overhauls
>>> the
>>> state machine.
>>
>> Sure, feel free to submit.
>>
>> Regards,
>> Loic
>
> Thanks,
> Bhaumik

Hi Loic,

Your patch can go in, I see we do have MHI_PM_IN_ERROR_STATE() check in
the "unprepare" function. My patches will refactor and account for your
change as well along with others.

Feel free to have it picked up. Moved Mani to "to" for this.
On 2020-10-09 02:07, Loic Poulain wrote:
> Some MHI device drivers need to stop the channels in their driver
> remove callback (e.g. module unloading), but the unprepare function
> is aborted because MHI core moved the channels to suspended state
> prior calling driver remove callback. This prevents the driver to
> send a proper MHI RESET CHAN command to the device. Device is then
> unaware of the stopped state of these channels.
>
> This causes issue when driver tries to start the channels again (e.g.
> module is reloaded), since device considers channels as already
> started (inconsistent state).
>
> Fix this by allowing channel reset when channel is suspended.
>
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Bhaumik Bhatt <bbhatt@codeaurora.org>

> ---
>  drivers/bus/mhi/core/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index d20967a..a588eac 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct
> mhi_controller *mhi_cntrl,
>  	/* no more processing events for this channel */
>  	mutex_lock(&mhi_chan->mutex);
>  	write_lock_irq(&mhi_chan->lock);
> -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
> +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
> +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
>  		write_unlock_irq(&mhi_chan->lock);
>  		mutex_unlock(&mhi_chan->mutex);
>  		return;

Thanks,
Bhaumik
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
On Fri, Oct 09, 2020 at 11:07:14AM +0200, Loic Poulain wrote:
> Some MHI device drivers need to stop the channels in their driver
> remove callback (e.g. module unloading), but the unprepare function
> is aborted because MHI core moved the channels to suspended state
> prior calling driver remove callback. This prevents the driver to
> send a proper MHI RESET CHAN command to the device. Device is then
> unaware of the stopped state of these channels.
>
> This causes issue when driver tries to start the channels again (e.g.
> module is reloaded), since device considers channels as already
> started (inconsistent state).
>
> Fix this by allowing channel reset when channel is suspended.
>
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

Thanks,
Mani

> ---
>  drivers/bus/mhi/core/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index d20967a..a588eac 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct mhi_controller *mhi_cntrl,
>  	/* no more processing events for this channel */
>  	mutex_lock(&mhi_chan->mutex);
>  	write_lock_irq(&mhi_chan->lock);
> -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
> +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
> +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
>  		write_unlock_irq(&mhi_chan->lock);
>  		mutex_unlock(&mhi_chan->mutex);
>  		return;
> --
> 2.7.4
>
On Fri, Oct 09, 2020 at 11:07:14AM +0200, Loic Poulain wrote:
> Some MHI device drivers need to stop the channels in their driver
> remove callback (e.g. module unloading), but the unprepare function
> is aborted because MHI core moved the channels to suspended state
> prior calling driver remove callback. This prevents the driver to
> send a proper MHI RESET CHAN command to the device. Device is then
> unaware of the stopped state of these channels.
>
> This causes issue when driver tries to start the channels again (e.g.
> module is reloaded), since device considers channels as already
> started (inconsistent state).
>
> Fix this by allowing channel reset when channel is suspended.
>
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>

Applied to mhi-next!

Thanks,
Mani

> ---
>  drivers/bus/mhi/core/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index d20967a..a588eac 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct mhi_controller *mhi_cntrl,
>  	/* no more processing events for this channel */
>  	mutex_lock(&mhi_chan->mutex);
>  	write_lock_irq(&mhi_chan->lock);
> -	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
> +	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
> +	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
>  		write_unlock_irq(&mhi_chan->lock);
>  		mutex_unlock(&mhi_chan->mutex);
>  		return;
> --
> 2.7.4
>
Hello:

This patch was applied to qcom/linux.git (refs/heads/for-next):

On Fri, 9 Oct 2020 11:07:14 +0200 you wrote:
> Some MHI device drivers need to stop the channels in their driver
> remove callback (e.g. module unloading), but the unprepare function
> is aborted because MHI core moved the channels to suspended state
> prior calling driver remove callback. This prevents the driver to
> send a proper MHI RESET CHAN command to the device. Device is then
> unaware of the stopped state of these channels.
>
> [...]

Here is the summary with links:
  - bus: mhi: Fix channel close issue on driver remove
    https://git.kernel.org/qcom/c/a7f422f2f89e

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
index d20967a..a588eac 100644
--- a/drivers/bus/mhi/core/main.c
+++ b/drivers/bus/mhi/core/main.c
@@ -1232,7 +1232,8 @@ static void __mhi_unprepare_channel(struct mhi_controller *mhi_cntrl,
 	/* no more processing events for this channel */
 	mutex_lock(&mhi_chan->mutex);
 	write_lock_irq(&mhi_chan->lock);
-	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
+	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED &&
+	    mhi_chan->ch_state != MHI_CH_STATE_SUSPENDED) {
 		write_unlock_irq(&mhi_chan->lock);
 		mutex_unlock(&mhi_chan->mutex);
 		return;
Some MHI device drivers need to stop the channels in their driver
remove callback (e.g. module unloading), but the unprepare function
is aborted because MHI core moved the channels to suspended state
prior calling driver remove callback. This prevents the driver to
send a proper MHI RESET CHAN command to the device. Device is then
unaware of the stopped state of these channels.

This causes issue when driver tries to start the channels again (e.g.
module is reloaded), since device considers channels as already
started (inconsistent state).

Fix this by allowing channel reset when channel is suspended.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
---
 drivers/bus/mhi/core/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)