[v2,0/4] PCI: dwc: Link Up IRQ fixes

Message ID 20250506073934.433176-6-cassel@kernel.org

Message

Niklas Cassel May 6, 2025, 7:39 a.m. UTC
Hello there,

Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
Link Up event") added support for DWC to not wait for link up before
enumerating the bus. However, we cannot simply enumerate the bus
immediately after receiving a Link Up IRQ; we still need to wait
PCIE_T_RRS_READY_MS to allow a device to become ready after deasserting
PERST. To avoid bringing back a conditional delay during probe, perform
the wait in the threaded IRQ handler instead.

Please review.


Kind regards,
Niklas


Changes since v1:
- Added missing pci.h include that was lost during rebase.


Niklas Cassel (4):
  PCI: dw-rockchip: Do not enumerate bus before endpoint devices are
    ready
  PCI: qcom: Do not enumerate bus before endpoint devices are ready
  PCI: dw-rockchip: Replace PERST sleep time with proper macro
  PCI: qcom: Replace PERST sleep time with proper macro

 drivers/pci/controller/dwc/pcie-dw-rockchip.c | 4 +++-
 drivers/pci/controller/dwc/pcie-qcom.c        | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

Comments

Manivannan Sadhasivam May 13, 2025, 10:53 a.m. UTC | #1
On Tue, May 06, 2025 at 09:39:35AM +0200, Niklas Cassel wrote:
> Hello there,
> 
> Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
> Link Up event") added support for DWC to not wait for link up before
> enumerating the bus. However, we cannot simply enumerate the bus after
> receiving a Link Up IRQ; we still need to wait PCIE_T_RRS_READY_MS
> to allow a device to become ready after deasserting PERST. To avoid
> bringing back a conditional delay during probe, perform the wait in the
> threaded IRQ handler instead.
> 

This wait time is a grey area in the spec tbh. If Readiness Notification
(RN) is not supported, then the spec suggests waiting 1 s for the device to
become 'configuration ready'. That's why we have the 1 s delay in the dwc
driver.

Also, the spec says the below in r6.0, sec 6.6.1:

```
* On the completion of Link Training (entering the DL_Active state, see
Section 3.2), a component must be able to receive and process TLPs and DLLPs.
* Following exit from a Conventional Reset of a device, within 1.0 s the device
must be able to receive a Configuration Request and return a Successful
Completion if the Request is valid. This period is independent of how quickly
Link training completes. If Readiness Notifications mechanisms are used (see
Section 6.22), this period may be shorter.
```

As per the first note, once link training is completed, the device should be
ready to accept configuration requests from the host. So no delay should be
required.

But the second note says that the 1 s delay is independent of how quickly
the link training completes. This essentially contradicts the first point.

So I think it is not required to add a delay after completing the LTSSM,
unless someone sees an issue.

- Mani
Niklas Cassel May 13, 2025, 2:07 p.m. UTC | #2
Hello Mani,

On Tue, May 13, 2025 at 11:53:29AM +0100, Manivannan Sadhasivam wrote:
> 
> This wait time is a grey area in the spec tbh. If the Readiness Notification
> (RN) is not supported, then the spec suggests waiting 1s for the device to
> become 'configuration ready'. That's why we have the 1s delay in dwc driver.
> 
> Also, it has the below in r6.0, sec 6.6.1:
> 
> ```
> * On the completion of Link Training (entering the DL_Active state, see §
> Section 3.2 ), a component must be able to receive and process TLPs and DLLPs.
> * Following exit from a Conventional Reset of a device, within 1.0 s the device
> must be able to receive a Configuration Request and return a Successful
> Completion if the Request is valid. This period is independent of how quickly
> Link training completes. If Readiness Notifications mechanisms are used (see
> § Section 6.22 .), this period may be shorter.
> ```
> 
> As per the first note, once link training is completed, the device should be
> ready to accept configuration requests from the host. So no delay should be
> required.
> 
> But the second note says that the 1s delay is independent of how quickly the
> link training completes. This essentially contradicts with the above point.
> 
> So I think it is not required to add delay after completing the LTSSM, unless
> someone sees any issue.

If you look at the commit message in patch 1, the whole reason for this
series is that someone has seen an issue :)

While I personally haven't seen any issue, the user who reported that
commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
detect Link Up") regressed his system so that it can no longer mount its
rootfs (which is on a PLEXTOR PX-256M8PeGN NVMe SSD) has clearly seen an
issue.

It is possible that his device does not follow the spec.
I simply compared the code before and after ec9fd499b9c6 to figure out
why it was actually working before, and came up with this, which made his
device functional again.

Perhaps we should add a comment above the sleep saying that, strictly per
the spec, it should not be needed?
(And also add the same comment in the (single) controller driver in
mainline that already does an msleep(PCIE_T_RRS_READY_MS).)


Kind regards,
Niklas
Laszlo Fiat May 15, 2025, 5:33 p.m. UTC | #3
Hello,

On Tuesday, May 13th, 2025 at 4:07 PM, Niklas Cassel <cassel@kernel.org> wrote:

> Hello Mani,
> 
> On Tue, May 13, 2025 at 11:53:29AM +0100, Manivannan Sadhasivam wrote:
> 
> > This wait time is a grey area in the spec tbh. If the Readiness Notification
> > (RN) is not supported, then the spec suggests waiting 1s for the device to
> > become 'configuration ready'. That's why we have the 1s delay in dwc driver.
> > 
> > Also, it has the below in r6.0, sec 6.6.1:
> > 
> > * On the completion of Link Training (entering the DL_Active state, see
> > Section 3.2), a component must be able to receive and process TLPs and
> > DLLPs.
> > * Following exit from a Conventional Reset of a device, within 1.0 s the
> > device must be able to receive a Configuration Request and return a
> > Successful Completion if the Request is valid. This period is independent
> > of how quickly Link training completes. If Readiness Notifications
> > mechanisms are used (see Section 6.22), this period may be shorter.
> > 
> > As per the first note, once link training is completed, the device should be
> > ready to accept configuration requests from the host. So no delay should be
> > required.
> > 
> > But the second note says that the 1s delay is independent of how quickly the
> > link training completes. This essentially contradicts with the above point.
> > 
> > So I think it is not required to add delay after completing the LTSSM, unless
> > someone sees any issue.
> 
> 
> If you look at the commit message in patch 1/2, the whole reason for this
> series is that someone has seen an issue :)
> 
> While I personally haven't seen any issue, the user reporting that commit
> ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect
> Link Up") regressed his system so that it can no longer mount rootfs
> (which is on a PLEXTOR PX-256M8PeGN NVMe SSD) clearly has seen an issue.
> 
> It is possible that his device is not following the spec.
> I simply compared the code before and after ec9fd499b9c6, to try to
> figure out why it was actually working before, and came up with this,
> which made his device functional again.
> 
> Perhaps we should add a comment above the sleep that says that this
> should strictly not be needed as per the spec?
> (And also add the same comment in the (single) controller driver in
> mainline which already does an msleep(PCIE_T_RRS_READY_MS).)

I am the one experiencing the issue, with my Orange Pi 3B (RK3566, 8 GB RAM)
and a PLEXTOR PX-256M8PeGN NVMe SSD.

I first detected the problem while upgrading from 6.13.8 to 6.14.3: my
system could not find the NVMe SSD that contains the rootfs. After
reverting the two patches:

- ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
- 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")

my system booted fine again.
After that I tested the patches sent by Niklas in this thread, which fixed
the issue, so I sent a Tested-by.

I did another test today with 6.15.0-rc6, which by itself does not find my
SSD. Niklas asked me to test with these changes:

- revert ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
- revert 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
- apply the following patch:

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index b3615d125942..5dee689ecd95 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
                if (dw_pcie_link_up(pci))
                        break;

-               msleep(LINK_WAIT_SLEEP_MS);
+               usleep_range(100, 200);
        }

        if (retries >= LINK_WAIT_MAX_RETRIES) {


which restores the original behaviour of waiting for link-up, but shortens
the wait time. This again resulted in a non-booting system, this time with
a "Phy link never came up" error message.
So please fix the regression that is already in 6.14.x. So far only I have
reported this, but we cannot be sure how many SSDs have this timing issue.
Most users run older, distribution-packaged kernels, so others will face
this later.

Bye,

Laszlo Fiat

> 
> 
> Kind regards,
> Niklas
Niklas Cassel May 16, 2025, 10 a.m. UTC | #4
On Thu, May 15, 2025 at 05:33:41PM +0000, Laszlo Fiat wrote:
> I am the one experiencing the issue with my Orange PI 3B (RK3566, 8 GB RAM) and a PLEXTOR PX-256M8PeGN NVMe SSD. 
> 
> I first detected the problem while upgrading from 6.13.8 to 6.14.3: my system could not find the NVMe SSD that contains the rootfs. After reverting the two patches:
> 
> - ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> - 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
> 
> my system booted fine again. 
> After that I tested the patches sent by Niklas in this thread, which fixed the issue, so I sent Tested-by.
> 
> I did another test today with 6.15.0-rc6, which by itself does not find my SSD. Niklas asked me to test with these changes:
> 
> - revert ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> - revert 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
> - apply the following patch:
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index b3615d125942..5dee689ecd95 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
>                 if (dw_pcie_link_up(pci))
>                         break;
> 
> -               msleep(LINK_WAIT_SLEEP_MS);
> +               usleep_range(100, 200);
>         }
> 
>         if (retries >= LINK_WAIT_MAX_RETRIES) {
> 
> 
> which restores the original behaviour of waiting for link-up, but shortens the wait time. This again resulted in a non-booting system, this time with a "Phy link never came up" error message.

That message was unexpected.

What I expected to happen was that the link would come up, but that by
reducing the delay between "link is up" and the device being accessed
(from 90 ms -> 100 us), you would see the same problem on "older" kernels
as you do with the "link up IRQ" patches (which originally had no delay;
this series basically re-adds a similar delay (PCIE_T_RRS_READY_MS,
100 ms) to the one we had before (LINK_WAIT_SLEEP_MS, 90 ms)).

But I see a problem with the test code that I asked you to run to verify
that this problem also existed before (with a shorter delay): by reducing
the delay, LINK_WAIT_MAX_RETRIES also needs to be bumped.

Could you please test:
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index b3615d125942..5dee689ecd95 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
                if (dw_pcie_link_up(pci))
                        break;
 
-               msleep(LINK_WAIT_SLEEP_MS);
+               usleep_range(100, 200);
        }
 
        if (retries >= LINK_WAIT_MAX_RETRIES) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 4dd16aa4b39e..8422661b79d5 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -61,7 +61,7 @@
        set_bit(DW_PCIE_CAP_ ## _cap, &(_pci)->caps)
 
 /* Parameters for the waiting for link up routine */
-#define LINK_WAIT_MAX_RETRIES          10
+#define LINK_WAIT_MAX_RETRIES          10000
 #define LINK_WAIT_SLEEP_MS             90
 
 /* Parameters for the waiting for iATU enabled routine */


On top of an old kernel instead?

We expect the link to come up, but that you will not be able to mount the rootfs.

If that is the case, we are certain that this patch series is 100% needed
for your device to have the same functional behavior as before.


Kind regards,
Niklas
Laszlo Fiat May 16, 2025, 6:48 p.m. UTC | #5
Hello,
-------- Original Message --------
On 16/05/2025 12:00, Niklas Cassel <cassel@kernel.org> wrote:

>  On Thu, May 15, 2025 at 05:33:41PM +0000, Laszlo Fiat wrote:
>  > I am the one experiencing the issue with my Orange PI 3B (RK3566, 8 GB RAM) and a PLEXTOR PX-256M8PeGN NVMe SSD.
>  >
>  > I first detected the problem while upgrading from 6.13.8 to 6.14.3, that my system cannot find the NVME SSD which contains the rootfs. After reverting the two patches:
>  >
>  > - ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
>  > - 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
>  >
>  > my system booted fine again.
>  > After that I tested the patches sent by Niklas in this thread, which fixed the issue, so I sent Tested-by.
>  >
>  > I did another test Today with 6.15.0-rc6, which in itself does not find my SSD. Niklas asked me to test with these
>  >
>  > - revert ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
>  > - revert 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
>  > - apply the following patch:
>  >
>  > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
>  > index b3615d125942..5dee689ecd95 100644
>  > --- a/drivers/pci/controller/dwc/pcie-designware.c
>  > +++ b/drivers/pci/controller/dwc/pcie-designware.c
>  > @@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
>  >                 if (dw_pcie_link_up(pci))
>  >                         break;
>  >
>  > -               msleep(LINK_WAIT_SLEEP_MS);
>  > +               usleep_range(100, 200);
>  >         }
>  >
>  >         if (retries >= LINK_WAIT_MAX_RETRIES) {
>  >
>  >
>  > which restores the original behaviour to wait for link-up, then shorten the time. This resulted again a non booting system, this time with "Phy link never came up" error message.
>  
>  That message was unexpected.
>  
>  What I expected to happen was that the link would come up, but that by
>  reducing the delay between "link is up" and the device being accessed
>  (from 90 ms -> 100 us), you would see the same problem on "older"
>  kernels as you do with the "link up IRQ" patches (which originally had
>  no delay; this series basically re-adds a similar delay
>  (PCIE_T_RRS_READY_MS, 100 ms) to the one we had before
>  (LINK_WAIT_SLEEP_MS, 90 ms)).
>  
>  But I see a problem with the test code that I asked you to run to
>  verify that this problem also existed before (with a shorter delay):
>  by reducing the delay, LINK_WAIT_MAX_RETRIES also needs to be bumped.
>  
>  Could you please test:
>  diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
>  index b3615d125942..5dee689ecd95 100644
>  --- a/drivers/pci/controller/dwc/pcie-designware.c
>  +++ b/drivers/pci/controller/dwc/pcie-designware.c
>  @@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
>                  if (dw_pcie_link_up(pci))
>                          break;
>  
>  -               msleep(LINK_WAIT_SLEEP_MS);
>  +               usleep_range(100, 200);
>          }
>  
>          if (retries >= LINK_WAIT_MAX_RETRIES) {
>  diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
>  index 4dd16aa4b39e..8422661b79d5 100644
>  --- a/drivers/pci/controller/dwc/pcie-designware.h
>  +++ b/drivers/pci/controller/dwc/pcie-designware.h
>  @@ -61,7 +61,7 @@
>          set_bit(DW_PCIE_CAP_ ## _cap, &(_pci)->caps)
>  
>   /* Parameters for the waiting for link up routine */
>  -#define LINK_WAIT_MAX_RETRIES          10
>  +#define LINK_WAIT_MAX_RETRIES          10000
>   #define LINK_WAIT_SLEEP_MS             90
>  
>   /* Parameters for the waiting for iATU enabled routine */
>  
>  
>  On top of an old kernel instead?
>

I compiled a vanilla 6.12.28, which booted fine, as expected. Then I compiled a version with the patch directly above.

>  We expect the link to come up, but that you will not be able to mount rootfs.
>  

That is exactly what happened. 

>  If that is the case, we are certain that the this patch series is 100% needed
>  for your device to have the same functional behavior as before.

That is the case.

Bye,

Laszlo Fiat 
>  
>  
>  Kind regards,
>  Niklas
>
Manivannan Sadhasivam May 19, 2025, 12:10 p.m. UTC | #6
On Mon, May 19, 2025 at 11:44:17AM +0200, Niklas Cassel wrote:
> On Fri, May 16, 2025 at 06:48:59PM +0000, Laszlo Fiat wrote:
> > 
> > I compiled a vanilla 6.12.28, which booted fine, as expected. Then I compiled a version with the patch directly above.
> > 
> > >  We expect the link to come up, but that you will not be able to mount rootfs.
> > >  
> > 
> > That is exactly what happened. 
> > 
> > >  If that is the case, we are certain that the this patch series is 100% needed
> > >  for your device to have the same functional behavior as before.
> > 
> > That is the case.
> 
> Laszlo,
> 
> Thank you for verifying!
> 
> > 
> > Bye,
> > 
> > Laszlo Fiat 
> 
> Mani, Krzysztof,
> 
> It does indeed seem like the reason for this regression is that this weird
> NVMe drive requires some additional time after link up to be ready.
> 
> This series does re-add a similar delay, but in the 'link up' IRQ thread,
> so even with this series, things are nicer than before (where we had an
> unconditional delay after starting the link).
> 
> Do you require any additional changes?
> 

No. Sorry for the delay in getting back. I was at Linaro Connect last week and
had to take some vacation for the first half of this week.

It is fine with me to add the delay only on the Rockchip platform where the
issue was reported (next time, please mention the issue that triggered the
series in the cover letter). But I do not want to add the same on the Qcom
platform, as the delay is not strictly required per the spec (if I
understood it properly).

So I'll merge the remaining 3 patches.

- Mani
Manivannan Sadhasivam May 19, 2025, 12:37 p.m. UTC | #7
On Tue, 06 May 2025 09:39:35 +0200, Niklas Cassel wrote:
> Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
> Link Up event") added support for DWC to not wait for link up before
> enumerating the bus. However, we cannot simply enumerate the bus after
> receiving a Link Up IRQ; we still need to wait PCIE_T_RRS_READY_MS
> to allow a device to become ready after deasserting PERST. To avoid
> bringing back a conditional delay during probe, perform the wait in the
> threaded IRQ handler instead.
> 
> [...]

Applied to controller/dw-rockchip, thanks!

[1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
      commit: c1c9c4b6130defb671c45b53ec12fcd92afea971
[3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro
      commit: 9ad44a6f6f8775786a9543427c82625d96a340b4
[4/4] PCI: qcom: Replace PERST sleep time with proper macro
      commit: be0b42befa87d81780aa0f003644b0ad520b0234

(I do need to change the name of the topic-branch now...)

Best regards,