[net-next,00/13] Introduce an ethernet port representation

Message ID	20250207223634.600218-1-maxime.chevallier@bootlin.com
Headers
Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [217.70.183.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 212251AF0AF; Fri, 7 Feb 2025 22:36:40 +0000 (UTC)
From: Maxime Chevallier <maxime.chevallier@bootlin.com>
To: davem@davemloft.net
Cc: Maxime Chevallier <maxime.chevallier@bootlin.com>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, thomas.petazzoni@bootlin.com, Andrew Lunn <andrew@lunn.ch>, Jakub Kicinski <kuba@kernel.org>, Eric Dumazet <edumazet@google.com>, Paolo Abeni <pabeni@redhat.com>, Russell King <linux@armlinux.org.uk>, linux-arm-kernel@lists.infradead.org, Christophe Leroy <christophe.leroy@csgroup.eu>, Herve Codina <herve.codina@bootlin.com>, Florian Fainelli <f.fainelli@gmail.com>, Heiner Kallweit <hkallweit1@gmail.com>, Vladimir Oltean <vladimir.oltean@nxp.com>, Köry Maincent <kory.maincent@bootlin.com>, Marek Behún <kabel@kernel.org>, Oleksij Rempel <o.rempel@pengutronix.de>, Nicolò Veronese <nicveronese@gmail.com>, Simon Horman <horms@kernel.org>, mwojtas@chromium.org, Antoine Tenart <atenart@kernel.org>, devicetree@vger.kernel.org, Conor Dooley <conor+dt@kernel.org>, Krzysztof Kozlowski <krzk+dt@kernel.org>, Rob Herring <robh@kernel.org>, Romain Gantois <romain.gantois@bootlin.com>
Subject: [PATCH net-next 00/13] Introduce an ethernet port representation
Date: Fri, 7 Feb 2025 23:36:19 +0100
Message-ID: <20250207223634.600218-1-maxime.chevallier@bootlin.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Series	Introduce an ethernet port representation
[net-next,00/13] Introduce an ethernet port representation
[net-next,01/13] net: ethtool: Introduce ETHTOOL_LINK_MEDIUM_* values
[net-next,02/13] net: ethtool: Export the link_mode_params definitions
[net-next,03/13] net: phy: Introduce PHY ports representation
[net-next,04/13] net: phy: dp83822: Add support for phy_port representation
[net-next,05/13] net: phy: Create a phy_port for PHY-driven SFPs
[net-next,06/13] net: phy: Intrduce generic SFP handling for PHY drivers
[net-next,07/13] net: phy: marvell-88x2222: Support SFP through phy_port interface
[net-next,08/13] net: phy: marvell: Support SFP through phy_port interface
[net-next,09/13] net: phy: marvell10g: Support SFP through phy_port
[net-next,10/13] net: phy: at803x: Support SFP through phy_port interface
[net-next,11/13] net: phy: Only rely on phy_port for PHY-driven SFP
[net-next,12/13] net: phy: dp83822: Add SFP support through the phy_port interface
[net-next,13/13] dt-bindings: net: Introduce the phy-port description

Maxime Chevallier Feb. 7, 2025, 10:36 p.m. UTC

Hello everyone,

This series follows the two RFCs that were sent a few weeks ago:
RFC V2: https://lore.kernel.org/netdev/20250122174252.82730-1-maxime.chevallier@bootlin.com/
RFC V1: https://lore.kernel.org/netdev/20241220201506.2791940-1-maxime.chevallier@bootlin.com/

The goal of this series is to introduce an internal way of representing
the "outputs" of ethernet devices, for now only focusing on PHYs.

This lays the groundwork for multi-port device support (a single PHY
with 2 ports, or more exotic setups with 2 PHYs in parallel or MII
multiplexers).

Compared to the RFCs, this series tries to properly support SFP,
especially PHY-driven SFPs, through special phy_ports named "serdes"
ports. Their particularity is that they output a generic interface
that feeds into another component (usually an SFP cage, and therefore an
SFP module).

This provides fairly generic PHY-driven SFP support (MAC-driven SFP is
handled by phylink).

This series doesn't address PHY-less interfaces (bare MAC devices, MACs
with embedded PHYs not driven by phylink, or MACs connected to optical
SFPs) to stay within the 15-patch limit, nor does it include the uAPI
part that exposes these ports to userspace.

I've kept the cover letter short; many more details can be found in the
RFC cover letters.

Thanks everyone,

Maxime

Maxime Chevallier (13):
  net: ethtool: Introduce ETHTOOL_LINK_MEDIUM_* values
  net: ethtool: Export the link_mode_params definitions
  net: phy: Introduce PHY ports representation
  net: phy: dp83822: Add support for phy_port representation
  net: phy: Create a phy_port for PHY-driven SFPs
  net: phy: Intrduce generic SFP handling for PHY drivers
  net: phy: marvell-88x2222: Support SFP through phy_port interface
  net: phy: marvell: Support SFP through phy_port interface
  net: phy: marvell10g: Support SFP through phy_port
  net: phy: at803x: Support SFP through phy_port interface
  net: phy: Only rely on phy_port for PHY-driven SFP
  net: phy: dp83822: Add SFP support through the phy_port interface
  dt-bindings: net: Introduce the phy-port description

 .../devicetree/bindings/net/ethernet-phy.yaml |  18 +
 .../bindings/net/ethernet-port.yaml           |  47 +++
 drivers/net/phy/Makefile                      |   2 +-
 drivers/net/phy/dp83822.c                     |  71 ++--
 drivers/net/phy/marvell-88x2222.c             |  96 +++---
 drivers/net/phy/marvell.c                     | 100 +++---
 drivers/net/phy/marvell10g.c                  |  37 +--
 drivers/net/phy/phy_device.c                  | 307 +++++++++++++++++-
 drivers/net/phy/phy_port.c                    | 176 ++++++++++
 drivers/net/phy/phylink.c                     |  32 ++
 drivers/net/phy/qcom/at803x.c                 |  64 +---
 include/linux/ethtool.h                       |  73 +++++
 include/linux/phy.h                           |  39 ++-
 include/linux/phy_port.h                      |  92 ++++++
 include/linux/phylink.h                       |   2 +
 net/ethtool/common.c                          | 231 ++++++-------
 net/ethtool/common.h                          |   7 -
 17 files changed, 1048 insertions(+), 346 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/ethernet-port.yaml
 create mode 100644 drivers/net/phy/phy_port.c
 create mode 100644 include/linux/phy_port.h

Maxime Chevallier Feb. 7, 2025, 10:36 p.m. UTC | #1

Ethernet provides a wide variety of layer 1 protocols and standards for
data transmission. The front-facing ports of an interface have their own
complexity and configurability.

Introduce a representation of these front-facing ports. The current code
is minimalistic and only supports ports controlled by PHY devices, but
the plan is to extend that to SFP as well as raw Ethernet MACs that
don't use PHY devices.

This minimal port representation allows describing the media and number
of lanes of a port. From that information, we can derive the linkmodes
usable on the port, which can be used to limit the capabilities of an
interface.

For now, the port lanes and medium are derived from devicetree, defined
by the PHY driver, or populated with default values (as we assume that
all PHYs expose at least one port).

The typical example is 100M Ethernet. 100BaseT can work using only 2
lanes of a Cat 5 cable. However, when a 10/100/1000-capable PHY is
wired to its RJ45 port through only 2 lanes, we have no way of
detecting that. The "max-speed" DT property can be used, but a more
accurate representation is possible:

mdi {
	port@0 {
		media = "BaseT";
		lanes = <2>;
	};
};
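
As a rough illustration of how the usable linkmodes, and from them the
maximum speed, could be derived from the lane count: the sketch below is
not the code from this series. Every sketch_* name is made up for the
example, and the table is a simplified stand-in for the link_mode_params
definitions. It assumes that 10BaseT and 100BaseT need 2 pairs and
1000BaseT needs 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for ethtool link modes. */
enum sketch_mode {
	SKETCH_10BASET_FULL,
	SKETCH_100BASET_FULL,
	SKETCH_1000BASET_FULL,
	SKETCH_MODE_COUNT,
};

struct sketch_mode_params {
	unsigned int speed;	/* Mb/s */
	unsigned int min_lanes;	/* twisted pairs needed (assumption) */
};

static const struct sketch_mode_params sketch_params[SKETCH_MODE_COUNT] = {
	[SKETCH_10BASET_FULL]   = { .speed = 10,   .min_lanes = 2 },
	[SKETCH_100BASET_FULL]  = { .speed = 100,  .min_lanes = 2 },
	[SKETCH_1000BASET_FULL] = { .speed = 1000, .min_lanes = 4 },
};

static_assert(SKETCH_MODE_COUNT <= 8 * sizeof(unsigned long),
	      "all modes must fit in the bitmask");

/* Keep only the modes in `supported` that fit on `lanes` pairs. */
unsigned long sketch_port_filter_modes(unsigned long supported,
				       unsigned int lanes)
{
	unsigned long usable = 0;
	size_t i;

	for (i = 0; i < SKETCH_MODE_COUNT; i++) {
		if (!(supported & (1UL << i)))
			continue;
		if (sketch_params[i].min_lanes <= lanes)
			usable |= 1UL << i;
	}

	return usable;
}

/* Highest speed among the usable modes, or 0 if none remain. */
unsigned int sketch_port_max_speed(unsigned long usable)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < SKETCH_MODE_COUNT; i++)
		if ((usable & (1UL << i)) && sketch_params[i].speed > max)
			max = sketch_params[i].speed;

	return max;
}
```

With lanes = <2> as in the DT snippet above, a 10/100/1000 PHY would be
limited to the 10M and 100M modes in this sketch, so no separate
"max-speed" property would be needed.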

Sean Anderson Feb. 8, 2025, 2:14 a.m. UTC | #2

Hi Maxime,

On 2/7/25 17:36, Maxime Chevallier wrote:
> [...]

Forgive me for my ignorance, but why have a new ethtool interface instead of
extending ethtool_link_settings.port? It's a rather ancient interface, but it
seems to be tackling the exact same problem as you are trying to address. Older
NICs used to have several physical connectors (e.g. BNC, MII, twisted-pair) but
only one could be used at once. This seems directly analogous to a PHY that
supports multiple "port"s but not all at once. In fact, the only missing
connector type seems to be PORT_BACKPLANE.

I can think of a few reasons why you wouldn't use PORT_*:

- It describes the NIC and not the PHY, and perhaps there is too much impedance
   mismatch?
- There is too much legacy in userspace (or in the kernel) to use that API in
   this way?
- You need more flexibility?

At the very least, I think some discussion in one of the commits would be
warranted. Perhaps there was some on the RFC that I missed?

--Sean

Maxime Chevallier Feb. 10, 2025, 8:55 a.m. UTC | #3

Hi Sean,

On Fri, 7 Feb 2025 21:14:32 -0500
Sean Anderson <seanga2@gmail.com> wrote:

> Hi Maxime,
> 
> On 2/7/25 17:36, Maxime Chevallier wrote:
> > [...]
> 
> Forgive me for my ignorance, but why have a new ethtool interface instead of
> extending ethtool_link_settings.port? It's a rather ancient interface, but it
> seems to be tackling the exact same problem as you are trying to address. Older
> NICs used to have several physical connectors (e.g. BNC, MII, twisted-pair) but
> only one could be used at once. This seems directly analogous to a PHY that
> supports multiple "port"s but not all at once. In fact, the only missing
> connector type seems to be PORT_BACKPLANE.
> 
> I can think of a few reasons why you wouldn't use PORT_*:
> 
> - It describes the NIC and not the PHY, and perhaps there is too much impedance
>    mismatch?
> - There is too much legacy in userspace (or in the kernel) to use that API in
>    this way?
> - You need more flexibility?

There are multiple reasons why the PORT_* field is too limited:

 - We can't gracefully handle multi-port PHYs in complex scenarios
where we could ask "I'm currently using the Copper port, but does the
Fiber port have link?"

 - As you mention in your first argument, what I'd like to do is come
up with a "generic" representation of outgoing NIC interfaces. The
final use-cases I'd like to cover are multi-port NICs, allowing
userspace to control which physical interfaces are available, and which
to use. Looking at the hardware, this can be implemented in multiple
ways:

           ___ Copper
          /
 MAC - PHY
          \__ SFP

Here, a single PHY has 2 media-side interfaces, and we'd like to select
the one to use. That's fairly common now, and quite a number of PHYs
support this: mv88x3310, VSC8552, mv88x2222, to name only a few. But
other, less common topologies exist:

                           ____ SGMII PHY -- Copper
                          /
 MAC - SGMII/1000BaseX MUX
                          \____ SFP

Here, we also have 2 media-side ports, but they are driven through
different entities: the Copper port sits behind a single-port PHY,
which is itself behind a *MII MUX that is also connected to an SFP. Here
the port selection is done at the MUX level.

Finally, I've been working on supporting devices with another topology
(actually, what started this whole work):

            ___ PHY
           /
 MAC --MUX |
           \__ PHY

Here both PHYs are on the same *MII bus, with some physical,
gpio-driven MUX, and we have 2 PORT_TP on the same NIC. That design is
used for link redundancy, if one PHY loses the link, we switch to the
other one (that hopefully has link).
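
The failover policy described here can be sketched as follows. This is
only an illustration of the selection logic under simple assumptions
(each port has an "enabled" flag and a last-known link state); the
struct and function names are hypothetical and do not come from the
series, which does not yet include port control.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal view of a media-side port. */
struct sketch_port {
	const char *name;
	bool enabled;	/* administratively allowed */
	bool link;	/* last known link state */
};

/*
 * Pick the port to use: stay on `active` while it is enabled and has
 * link, otherwise fail over to the first enabled port with link.
 * Returns the index of the chosen port, or -1 when no port qualifies.
 */
int sketch_select_port(const struct sketch_port *ports, size_t n_ports,
		       int active)
{
	size_t i;

	assert(ports != NULL || n_ports == 0);

	if (active >= 0 && (size_t)active < n_ports &&
	    ports[active].enabled && ports[active].link)
		return active;

	for (i = 0; i < n_ports; i++)
		if (ports[i].enabled && ports[i].link)
			return (int)i;

	return -1;
}
```

Whichever entity does the MUX'ing (the PHY driver, an intermediate MUX
driver, ...) would then act on the returned index; that part is
deliberately left out here.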

All these cases have different drivers involved in the MUX'ing (phy
driver itself, intermediate MUX in-between...), so the end-goal would
be to expose to userspace info about the media interfaces themselves.

This phy_port object would be what we expose to userspace. One missing
step in this series is adding control on the ports (netlink API,
enabling/disabling logic for ports) but that far exceeds the 15-patch
limit :)

Sorry if all of that was unclear; I didn't do a very good job of linking
to all the previous discussions on the topic. I'll address that in the
next round.

Thanks,

Maxime

Kory Maincent Feb. 11, 2025, 1:32 p.m. UTC | #4

On Fri,  7 Feb 2025 23:36:22 +0100
Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> Ethernet provides a wide variety of layer 1 protocols and standards for
> data transmission. The front-facing ports of an interface have their own
> complexity and configurability.
> 
> Introduce a representation of these front-facing ports. The current code
> is minimalistic and only supports ports controlled by PHY devices, but
> the plan is to extend that to SFP as well as raw Ethernet MACs that
> don't use PHY devices.
> 
> This minimal port representation allows describing the media and number
> of lanes of a port. From that information, we can derive the linkmodes
> usable on the port, which can be used to limit the capabilities of an
> interface.
> 
> For now, the port lanes and medium are derived from devicetree, defined
> by the PHY driver, or populated with default values (as we assume that
> all PHYs expose at least one port).
> 
> The typical example is 100M Ethernet. 100BaseT can work using only 2
> lanes of a Cat 5 cable. However, when a 10/100/1000-capable PHY is
> wired to its RJ45 port through only 2 lanes, we have no way of
> detecting that. The "max-speed" DT property can be used, but a more
> accurate representation is possible:
> 
> mdi {
> 	port@0 {
> 		media = "BaseT";
> 		lanes = <2>;
> 	};
> };
> 
> From that information, we can derive the max speed reachable on the
> port.
> 
> Another benefit of having that is to avoid vendor-specific DT properties
> (micrel,fiber-mode or ti,fiber-mode).
> 
> This basic representation is meant to be expanded, by the introduction
> of port ops, userspace listing of ports, and support for multi-port
> devices.

This patch adds port support only to the PHY API. Keeping in mind that
this port abstraction will also be of interest for NICs, wouldn't it be
preferable to handle ports in a standalone API?

For net drivers whose PHYs are managed by firmware or by DSA, there is no
Linux description of the PHYs. In that case, if we want to use the port
abstraction, what is the best approach: register a virtual phy_device to
use the port abstraction, or use the port abstraction API directly,
meaning that it is not tied to any PHY?

Regards,

Kory Maincent Feb. 11, 2025, 1:52 p.m. UTC | #5

On Tue, 11 Feb 2025 14:42:43 +0100
Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> Hi Köry,
> 
> On Tue, 11 Feb 2025 14:32:09 +0100
> Kory Maincent <kory.maincent@bootlin.com> wrote:
> 
> > On Fri,  7 Feb 2025 23:36:22 +0100
> > Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:
> >   
> > > [...]
> > 
> > This patch adds port support only to the PHY API. Keeping in mind that
> > this port abstraction will also be of interest for NICs, wouldn't it be
> > preferable to handle ports in a standalone API?
> 
> The way I see it, nothing prevents using the port definition from
> ethernet-port.yaml for DSA/raw NICs.
> 
> > For net drivers whose PHYs are managed by firmware or by DSA, there is
> > no Linux description of the PHYs. In that case, if we want to use the
> > port abstraction, what is the best approach: register a virtual
> > phy_device to use the port abstraction, or use the port abstraction API
> > directly, meaning that it is not tied to any PHY?
> 
> I think the next steps will be to have net_device have a list of ports
> (maintained in the phy_link_topology) that aggregates ports from all
> its PHYs/SFPs/raw interfaces. In that case net_device will be the
> direct parent. I haven't worked on the bindings for that though,
> especially for DSA :'(

Having it under phy_link_topology is a great idea!
 
> I don't think the virtual phydev is going to be helpful. I'm hitting
> the 15-patch limit, but a possible extension is to make it so that
> phylink also creates a port when it finds an SFP (hence, when upstream
> is a MAC).

I would say not only for SFP: phylink should create a port whenever it
finds an mdi description in the devicetree. Ports with PoE, LEDs, or any
other future supported feature should be created by phylink.

> This is why phy_port has these fields:
> 
> 
> enum phy_port_parent {
> 	PHY_PORT_PHY,
> };
> 
> struct phy_port {
> 	...
> 	enum phy_port_parent parent_type;
> 	union {
> 		struct phy_device *phy;
> 	};
> 
> };
> 
> The parent type may (will) be extended with PORT_PHY_MAC, and that's
> also why the parent pointer is in a union :)

Ok for me!
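
For readers following along, here is a small sketch of how the tagged
union quoted above leaves room for a non-PHY parent. It is only an
illustration: PHY_PORT_MAC, the netdev member and the sketch_* names are
assumptions made for the example and are not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins so the sketch compiles outside the kernel. */
struct phy_device;
struct net_device;

enum phy_port_parent {
	PHY_PORT_PHY,
	PHY_PORT_MAC,	/* assumption: hypothetical future parent type */
};

struct sketch_phy_port {
	enum phy_port_parent parent_type;
	union {
		struct phy_device *phy;		/* valid for PHY_PORT_PHY */
		struct net_device *netdev;	/* hypothetical, for PHY_PORT_MAC */
	};
};

/* Return the PHY that owns the port, or NULL if the parent is not a PHY. */
struct phy_device *sketch_port_get_phy(const struct sketch_phy_port *port)
{
	assert(port != NULL);

	return port->parent_type == PHY_PORT_PHY ? port->phy : NULL;
}
```

Callers check parent_type before touching the union, so adding a new
parent type does not change the layout seen by existing PHY users.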
 
> I'm trying hard to make it so that phy_port doesn't depend on phylib
> (although phylib depends on phy_port). There's a dependency on some
> core stuff (converting from medium => linkmodes) and phylink
> (converting the interfaces list to linkmodes), but we can extract these
> fairly easily.
> 
> You're correct in that for now, the integration is with phylib only
> though, but let's make sure this will also work for phy-less devices.
> 
> Thanks a lot for your input,

Thanks for your work, it will be really helpful to add support for PoE in DSA. 

Regards,

Andrew Lunn Feb. 11, 2025, 2:04 p.m. UTC | #6

> For net drivers whose PHYs are managed by firmware or by DSA, there is no
> Linux description of the PHYs.

DSA should not be special: Linux is driving the PHY, so it has to exist
as a Linux device.

Firmware is a different case. If the firmware has decided to hide the
PHY, the MAC driver is using a higher level API, generally just
ksetting_set etc. It would be up to the MAC driver to export its PHY
topology and provide whatever other firmware calls are needed. We
should keep this in mind when designing the kAPI, but don't need to
actually implement it. The kAPI should not directly reference a
phydev/phylink instance, but an abstract object which represents a
PHY.

	Andrew
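
One possible shape for such an abstract object is an ops table plus an
opaque backend pointer, so that callers never see a phydev or phylink
instance. The sketch below only illustrates that idea; none of the
sketch_* names exist in the kernel, and the "firmware" backend is a
made-up example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operations an abstract PHY-like object could expose. */
struct sketch_phy_ops {
	bool (*get_link)(void *priv);
};

/* Abstract handle: the backend behind `priv` is hidden from callers. */
struct sketch_abstract_phy {
	const struct sketch_phy_ops *ops;
	void *priv;	/* backend state: phylib device, firmware handle, ... */
};

bool sketch_phy_get_link(const struct sketch_abstract_phy *phy)
{
	assert(phy != NULL && phy->ops != NULL);

	return phy->ops->get_link ? phy->ops->get_link(phy->priv) : false;
}

/* Made-up firmware backend: link is whatever the firmware last reported. */
struct sketch_fw_state {
	bool fw_reported_link;
};

static bool sketch_fw_get_link(void *priv)
{
	const struct sketch_fw_state *fw = priv;

	return fw->fw_reported_link;
}

const struct sketch_phy_ops sketch_fw_ops = {
	.get_link = sketch_fw_get_link,
};
```

A phylib-backed implementation would provide its own ops table, and a
MAC driver that hides its PHY behind firmware would provide one like
sketch_fw_ops.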
