Message ID | 1489585887-8683-1-git-send-email-rogerq@ti.com |
---|---|
State | New |
Headers | show |
On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: > Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") > phy_suspend() doesn't get called as part of phy_stop() for PHYs using > interrupts because the phy state machine is never triggered after a phy_stop(). > > Explicitly trigger the PHY state machine so that it can > see the new PHY state (HALTED) and suspend the PHY. > > Signed-off-by: Roger Quadros <rogerq@ti.com> Hi Roger This seems sensible. It mirrors what phy_start() does. Reviewed-by: Andrew Lunn <andrew@lunn.ch> It does however lead to a follow up question. Are there other places phydev->state is changed and it is missing a phy_trigger_machine()? Andrew
Andrew, On 15/03/17 16:08, Andrew Lunn wrote: > On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: >> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") >> phy_suspend() doesn't get called as part of phy_stop() for PHYs using >> interrupts because the phy state machine is never triggered after a phy_stop(). >> >> Explicitly trigger the PHY state machine so that it can >> see the new PHY state (HALTED) and suspend the PHY. >> >> Signed-off-by: Roger Quadros <rogerq@ti.com> > > Hi Roger > > This seems sensible. It mirrors what phy_start() does. > > Reviewed-by: Andrew Lunn <andrew@lunn.ch> The reason for this being an RFC was the following comment just before where I add the phy_trigger_machine() /* Cannot call flush_scheduled_work() here as desired because * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() * will not reenable interrupts. */ Is this comment still applicable? If yes, is it OK to call phy_trigger_machine() there? > > It does however lead to a follow up question. Are there other places > phydev->state is changed and it is missing a phy_trigger_machine()? > One other place I think we should add phy_trigger_machine() is phy_start_aneg(). Since commit 3c293f4e08b5 I was not getting my ethernet link to come up if ethernet cable was plugged before the ethernet interface was brought up. The PHY seems to be stuck from RUNNING to AN state with no new interrupts from the PHY. I could workaround it on my board by doing either of the following: 1) explicitly halt the PHY at ethernet driver probe time. Then when network interface is brought up it resumes the PHY and we see the Link/ANEG done interrupt. 2) force BMCR_ANRESTART every time from .config_aneg in the PHY driver. I will still keep approach 1 as it is desirable to put the PHY in low power mode for us but I need a second opinion if we should call phy_trigger_machine() from phy_start_aneg() or not. -- cheers, -roger
On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote: > Andrew, > > On 15/03/17 16:08, Andrew Lunn wrote: > > On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: > >> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") > >> phy_suspend() doesn't get called as part of phy_stop() for PHYs using > >> interrupts because the phy state machine is never triggered after a phy_stop(). > >> > >> Explicitly trigger the PHY state machine so that it can > >> see the new PHY state (HALTED) and suspend the PHY. > >> > >> Signed-off-by: Roger Quadros <rogerq@ti.com> > > > > Hi Roger > > > > This seems sensible. It mirrors what phy_start() does. > > > > Reviewed-by: Andrew Lunn <andrew@lunn.ch> > > The reason for this being an RFC was the following comment just before > where I add the phy_trigger_machine() > > /* Cannot call flush_scheduled_work() here as desired because > * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() > * will not reenable interrupts. > */ > > Is this comment still applicable? If yes, is it OK to call > phy_trigger_machine() there? Humm, good question. My _guess_ would be, calling it with sync=True could deadlock. sync=False is probably safe. But lets see what Florian says. > > > > > It does however lead to a follow up question. Are there other places > > phydev->state is changed and it is missing a phy_trigger_machine()? > > > > One other place I think we should add phy_trigger_machine() is phy_start_aneg(). Humm, that might get us into a tight loop. phy_start_aneg() kicks the phy driver to start autoneg and sets phydev->state = PHY_AN. phy_trigger_machine() triggers the state machine immediately. In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg = true. At the end of the state machine, this then calls phy_start_aneg(), and it all starts again. We are missing the 1s delay we have with polling. So for phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which waits a second before doing anything? Andrew
On 15/03/17 17:49, Andrew Lunn wrote: > On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote: >> Andrew, >> >> On 15/03/17 16:08, Andrew Lunn wrote: >>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: >>>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") >>>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using >>>> interrupts because the phy state machine is never triggered after a phy_stop(). >>>> >>>> Explicitly trigger the PHY state machine so that it can >>>> see the new PHY state (HALTED) and suspend the PHY. >>>> >>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>> >>> Hi Roger >>> >>> This seems sensible. It mirrors what phy_start() does. >>> >>> Reviewed-by: Andrew Lunn <andrew@lunn.ch> >> >> The reason for this being an RFC was the following comment just before >> where I add the phy_trigger_machine() >> >> /* Cannot call flush_scheduled_work() here as desired because >> * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() >> * will not reenable interrupts. >> */ >> >> Is this comment still applicable? If yes, is it OK to call >> phy_trigger_machine() there? > > Humm, good question. > > My _guess_ would be, calling it with sync=True could > deadlock. sync=False is probably safe. But lets see what Florian says. I agree that we should use phy_trigger_machine() with sync=False. > >> >>> >>> It does however lead to a follow up question. Are there other places >>> phydev->state is changed and it is missing a phy_trigger_machine()? >>> >> >> One other place I think we should add phy_trigger_machine() is phy_start_aneg(). > > Humm, that might get us into a tight loop. > > phy_start_aneg() kicks the phy driver to start autoneg and sets > phydev->state = PHY_AN. > > phy_trigger_machine() triggers the state machine immediately. > > In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg > = true. At the end of the state machine, this then calls > phy_start_aneg(), and it all starts again. > > We are missing the 1s delay we have with polling. So for > phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which > waits a second before doing anything? I think that should do the trick. How about this? -- cheers, -rogerdiff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 8fef03b..162061c 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev) out_unlock: mutex_unlock(&phydev->lock); + + if (!err) + queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, HZ); + return err; } EXPORT_SYMBOL(phy_start_aneg);
On 03/16/2017 12:46 AM, Roger Quadros wrote: > On 15/03/17 17:49, Andrew Lunn wrote: >> On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote: >>> Andrew, >>> >>> On 15/03/17 16:08, Andrew Lunn wrote: >>>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: >>>>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") >>>>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using >>>>> interrupts because the phy state machine is never triggered after a phy_stop(). >>>>> >>>>> Explicitly trigger the PHY state machine so that it can >>>>> see the new PHY state (HALTED) and suspend the PHY. >>>>> >>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>> >>>> Hi Roger >>>> >>>> This seems sensible. It mirrors what phy_start() does. >>>> >>>> Reviewed-by: Andrew Lunn <andrew@lunn.ch> >>> >>> The reason for this being an RFC was the following comment just before >>> where I add the phy_trigger_machine() >>> >>> /* Cannot call flush_scheduled_work() here as desired because >>> * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() >>> * will not reenable interrupts. >>> */ >>> >>> Is this comment still applicable? If yes, is it OK to call >>> phy_trigger_machine() there? >> >> Humm, good question. >> >> My _guess_ would be, calling it with sync=True could >> deadlock. sync=False is probably safe. But lets see what Florian says. > > I agree that we should use phy_trigger_machine() with sync=False. > >> >>> >>>> >>>> It does however lead to a follow up question. Are there other places >>>> phydev->state is changed and it is missing a phy_trigger_machine()? >>>> >>> >>> One other place I think we should add phy_trigger_machine() is phy_start_aneg(). >> >> Humm, that might get us into a tight loop. >> >> phy_start_aneg() kicks the phy driver to start autoneg and sets >> phydev->state = PHY_AN. >> >> phy_trigger_machine() triggers the state machine immediately. >> >> In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg >> = true. At the end of the state machine, this then calls >> phy_start_aneg(), and it all starts again. >> >> We are missing the 1s delay we have with polling. So for >> phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which >> waits a second before doing anything? > > I think that should do the trick. > > How about this? This sounds like a possible fix indeed, however I would like to better assess the impact on non interrupt driven PHYs before rolling such a change. > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 8fef03b..162061c 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev) > > out_unlock: > mutex_unlock(&phydev->lock); > + > + if (!err) > + queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, HZ); > + > return err; > } > EXPORT_SYMBOL(phy_start_aneg); > -- Florian
On 03/15/2017 08:00 AM, Roger Quadros wrote: > Andrew, > > On 15/03/17 16:08, Andrew Lunn wrote: >> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: >>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") >>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using >>> interrupts because the phy state machine is never triggered after a phy_stop(). >>> >>> Explicitly trigger the PHY state machine so that it can >>> see the new PHY state (HALTED) and suspend the PHY. >>> >>> Signed-off-by: Roger Quadros <rogerq@ti.com> >> >> Hi Roger >> >> This seems sensible. It mirrors what phy_start() does. >> >> Reviewed-by: Andrew Lunn <andrew@lunn.ch> > > The reason for this being an RFC was the following comment just before > where I add the phy_trigger_machine() > > /* Cannot call flush_scheduled_work() here as desired because > * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() > * will not reenable interrupts. > */ > > Is this comment still applicable? If yes, is it OK to call > phy_trigger_machine() there? I think the comment is still applicable, most PHYLIB consumers will call this with rtnl_lock() held from ndo_close() for instance. > >> >> It does however lead to a follow up question. Are there other places >> phydev->state is changed and it is missing a phy_trigger_machine()? >> > > One other place I think we should add phy_trigger_machine() is phy_start_aneg(). > > Since commit 3c293f4e08b5 I was not getting my ethernet link to come up if > ethernet cable was plugged before the ethernet interface was brought up. > The PHY seems to be stuck from RUNNING to AN state with no new interrupts from the > PHY. > > I could workaround it on my board by doing either of the following: > > 1) explicitly halt the PHY at ethernet driver probe time. Then when > network interface is brought up it resumes the PHY and we see the > Link/ANEG done interrupt. > > 2) force BMCR_ANRESTART every time from .config_aneg in the PHY driver. > > I will still keep approach 1 as it is desirable to put the PHY in low power mode > for us but I need a second opinion if we should call phy_trigger_machine() > from phy_start_aneg() or not. > -- Florian
On 20/03/17 18:41, Florian Fainelli wrote: > On 03/16/2017 12:46 AM, Roger Quadros wrote: >> On 15/03/17 17:49, Andrew Lunn wrote: >>> On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote: >>>> Andrew, >>>> >>>> On 15/03/17 16:08, Andrew Lunn wrote: >>>>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: >>>>>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") >>>>>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using >>>>>> interrupts because the phy state machine is never triggered after a phy_stop(). >>>>>> >>>>>> Explicitly trigger the PHY state machine so that it can >>>>>> see the new PHY state (HALTED) and suspend the PHY. >>>>>> >>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>>> >>>>> Hi Roger >>>>> >>>>> This seems sensible. It mirrors what phy_start() does. >>>>> >>>>> Reviewed-by: Andrew Lunn <andrew@lunn.ch> >>>> >>>> The reason for this being an RFC was the following comment just before >>>> where I add the phy_trigger_machine() >>>> >>>> /* Cannot call flush_scheduled_work() here as desired because >>>> * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() >>>> * will not reenable interrupts. >>>> */ >>>> >>>> Is this comment still applicable? If yes, is it OK to call >>>> phy_trigger_machine() there? >>> >>> Humm, good question. >>> >>> My _guess_ would be, calling it with sync=True could >>> deadlock. sync=False is probably safe. But lets see what Florian says. >> >> I agree that we should use phy_trigger_machine() with sync=False. >> >>> >>>> >>>>> >>>>> It does however lead to a follow up question. Are there other places >>>>> phydev->state is changed and it is missing a phy_trigger_machine()? >>>>> >>>> >>>> One other place I think we should add phy_trigger_machine() is phy_start_aneg(). >>> >>> Humm, that might get us into a tight loop. >>> >>> phy_start_aneg() kicks the phy driver to start autoneg and sets >>> phydev->state = PHY_AN. >>> >>> phy_trigger_machine() triggers the state machine immediately. >>> >>> In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg >>> = true. At the end of the state machine, this then calls >>> phy_start_aneg(), and it all starts again. >>> >>> We are missing the 1s delay we have with polling. So for >>> phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which >>> waits a second before doing anything? >> >> I think that should do the trick. >> >> How about this? > > This sounds like a possible fix indeed, however I would like to better > assess the impact on non interrupt driven PHYs before rolling such a change. Is it safer if I add a check to do this only for interrupt driven PHYs? e.g. -- 2.7.4 -- cheers, -rogerdiff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 4b855f2..e0f5755 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -630,6 +630,9 @@ int phy_start_aneg(struct phy_device *phydev) out_unlock: mutex_unlock(&phydev->lock); + if (!err && phy_interrupt_is_valid(phydev)) + queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, HZ); + return err; } EXPORT_SYMBOL(phy_start_aneg);
On 03/21/2017 03:09 AM, Roger Quadros wrote: > On 20/03/17 18:41, Florian Fainelli wrote: >> On 03/16/2017 12:46 AM, Roger Quadros wrote: >>> On 15/03/17 17:49, Andrew Lunn wrote: >>>> On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote: >>>>> Andrew, >>>>> >>>>> On 15/03/17 16:08, Andrew Lunn wrote: >>>>>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote: >>>>>>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") >>>>>>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using >>>>>>> interrupts because the phy state machine is never triggered after a phy_stop(). >>>>>>> >>>>>>> Explicitly trigger the PHY state machine so that it can >>>>>>> see the new PHY state (HALTED) and suspend the PHY. >>>>>>> >>>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>>>> >>>>>> Hi Roger >>>>>> >>>>>> This seems sensible. It mirrors what phy_start() does. >>>>>> >>>>>> Reviewed-by: Andrew Lunn <andrew@lunn.ch> >>>>> >>>>> The reason for this being an RFC was the following comment just before >>>>> where I add the phy_trigger_machine() >>>>> >>>>> /* Cannot call flush_scheduled_work() here as desired because >>>>> * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() >>>>> * will not reenable interrupts. >>>>> */ >>>>> >>>>> Is this comment still applicable? If yes, is it OK to call >>>>> phy_trigger_machine() there? >>>> >>>> Humm, good question. >>>> >>>> My _guess_ would be, calling it with sync=True could >>>> deadlock. sync=False is probably safe. But lets see what Florian says. >>> >>> I agree that we should use phy_trigger_machine() with sync=False. >>> >>>> >>>>> >>>>>> >>>>>> It does however lead to a follow up question. Are there other places >>>>>> phydev->state is changed and it is missing a phy_trigger_machine()? >>>>>> >>>>> >>>>> One other place I think we should add phy_trigger_machine() is phy_start_aneg(). >>>> >>>> Humm, that might get us into a tight loop. >>>> >>>> phy_start_aneg() kicks the phy driver to start autoneg and sets >>>> phydev->state = PHY_AN. >>>> >>>> phy_trigger_machine() triggers the state machine immediately. >>>> >>>> In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg >>>> = true. At the end of the state machine, this then calls >>>> phy_start_aneg(), and it all starts again. >>>> >>>> We are missing the 1s delay we have with polling. So for >>>> phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which >>>> waits a second before doing anything? >>> >>> I think that should do the trick. >>> >>> How about this? >> >> This sounds like a possible fix indeed, however I would like to better >> assess the impact on non interrupt driven PHYs before rolling such a change. > > Is it safer if I add a check to do this only for interrupt driven PHYs? Yes I think this is a good solution that would not impact polled PHYs. Can you submit a formal patch for that change? Thanks! > > e.g. > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 4b855f2..e0f5755 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -630,6 +630,9 @@ int phy_start_aneg(struct phy_device *phydev) > > out_unlock: > mutex_unlock(&phydev->lock); > + if (!err && phy_interrupt_is_valid(phydev)) > + queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, HZ); > + > return err; > } > EXPORT_SYMBOL(phy_start_aneg); > -- Florian
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 1be69d8..8fef03b 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -903,6 +903,7 @@ void phy_stop(struct phy_device *phydev) * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change() * will not reenable interrupts. */ + phy_trigger_machine(phydev, true); } EXPORT_SYMBOL(phy_stop);
Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") phy_suspend() doesn't get called as part of phy_stop() for PHYs using interrupts because the phy state machine is never triggered after a phy_stop(). Explicitly trigger the PHY state machine so that it can see the new PHY state (HALTED) and suspend the PHY. Signed-off-by: Roger Quadros <rogerq@ti.com> --- drivers/net/phy/phy.c | 1 + 1 file changed, 1 insertion(+) -- 2.7.4