Message ID | 20250207223634.600218-1-maxime.chevallier@bootlin.com |
---|---|
Series | Introduce an ethernet port representation |

Ethernet provides a wide variety of layer 1 protocols and standards for data transmission. The front-facing ports of an interface have their own complexity and configurability.

Introduce a representation of these front-facing ports. The current code is minimalistic and only supports ports controlled by PHY devices, but the plan is to extend that to SFP as well as raw Ethernet MACs that don't use PHY devices.

This minimal port representation allows describing the media and number of lanes of a port. From that information, we can derive the linkmodes usable on the port, which can be used to limit the capabilities of an interface.

For now, the port lanes and medium are derived from devicetree, defined by the PHY driver, or populated with default values (as we assume that all PHYs expose at least one port).

The typical example is 100M ethernet. 100BaseT can work using only 2 lanes on a Cat 5 cable. However, in the situation where a 10/100/1000 capable PHY is wired to its RJ45 port through 2 lanes only, we have no way of detecting that. The "max-speed" DT property can be used, but a more accurate representation can be used:

    mdi {
        port@0 {
            media = "BaseT";
            lanes = <2>;
        };
    };
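
To make the "lanes limit the usable speed" idea concrete, here is a minimal sketch in plain C. It is not code from the series: `port_desc`, `port_medium` and `port_max_speed()` are hypothetical names, and the sketch only models the twisted-pair case from the commit message (10BaseT and 100BaseTX need two pairs, 1000BaseT needs four).

```c
/* Hypothetical medium identifiers; only BaseT is modelled here. */
enum port_medium {
	MEDIUM_BASET,
	MEDIUM_OTHER,
};

/* Hypothetical mirror of the "media" and "lanes" DT properties. */
struct port_desc {
	enum port_medium medium;
	unsigned int lanes;
};

/*
 * Return the highest speed (in Mbps) usable on the port, given the highest
 * speed the PHY itself supports. Returns 0 when this sketch cannot tell.
 */
static unsigned int port_max_speed(const struct port_desc *port,
				   unsigned int phy_max_speed)
{
	if (port->medium != MEDIUM_BASET)
		return 0;

	/* All four pairs wired: the port does not restrict the PHY. */
	if (port->lanes >= 4)
		return phy_max_speed;

	/* Two pairs: 10BaseT and 100BaseTX still work, 1000BaseT does not. */
	if (port->lanes >= 2)
		return phy_max_speed < 100 ? phy_max_speed : 100;

	/* Single-pair (BaseT1) media are not modelled in this sketch. */
	return 0;
}
```

With the devicetree fragment above (`media = "BaseT"`, `lanes = <2>`), a gigabit-capable PHY would be capped at 100 Mbps by this function, which is the case "max-speed" is used for today.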

Hi Maxime,

On 2/7/25 17:36, Maxime Chevallier wrote:
> Hello everyone,
>
> This series follows the 2 RFCs that were sent a few weeks ago :
> RFC V2: https://lore.kernel.org/netdev/20250122174252.82730-1-maxime.chevallier@bootlin.com/
> RFC V1: https://lore.kernel.org/netdev/20241220201506.2791940-1-maxime.chevallier@bootlin.com/
>
> The goal of this series is to introduce an internal way of representing
> the "outputs" of ethernet devices, for now only focusing on PHYs.
>
> This allows laying the groundwork for multi-port devices support (both 1
> PHY 2 ports, or more exotic setups with 2 PHYs in parallel, or MII
> multiplexers).
>
> Compared to the RFCs, this series tries to properly support SFP,
> especially PHY-driven SFPs through special phy_ports named "serdes"
> ports. They have the particularity of outputting a generic interface,
> that feeds into another component (usually, an SFP cage and therefore an
> SFP module).
>
> This allows getting a fairly generic PHY-driven SFP support (MAC-driven
> SFP is handled by phylink).
>
> This series doesn't address PHY-less interfaces (bare MAC devices, MACs
> with embedded PHYs not driven by phylink, or MAC connected to optical
> SFPs) to stay within the 15 patches limit, nor does it include the uAPI
> part that exposes these ports to userspace.
>
> I've kept the cover short, many more details can be found in the RFC
> covers.
>
> Thanks everyone,
>
> Maxime

Forgive me for my ignorance, but why have a new ethtool interface instead of extending ethtool_link_settings.port? It's a rather ancient interface, but it seems to be tackling the exact same problem as you are trying to address. Older NICs used to have several physical connectors (e.g. BNC, MII, twisted-pair) but only one could be used at once. This seems directly analogous to a PHY that supports multiple "port"s but not all at once. In fact, the only missing connector type seems to be PORT_BACKPLANE.

I can think of a few reasons why you wouldn't use PORT_*:

- It describes the NIC and not the PHY, and perhaps there is too much impedance mismatch?
- There is too much legacy in userspace (or in the kernel) to use that API in this way?
- You need more flexibility?

At the very least, I think some discussion in one of the commits would be warranted. Perhaps there was some on the RFC that I missed?

--Sean

Hi Sean,

On Fri, 7 Feb 2025 21:14:32 -0500
Sean Anderson <seanga2@gmail.com> wrote:

> Forgive me for my ignorance, but why have a new ethtool interface instead of
> extending ethtool_link_settings.port? It's a rather ancient interface, but it
> seems to be tackling the exact same problem as you are trying to address. Older
> NICs used to have several physical connectors (e.g. BNC, MII, twisted-pair) but
> only one could be used at once. This seems directly analogous to a PHY that
> supports multiple "port"s but not all at once. In fact, the only missing
> connector type seems to be PORT_BACKPLANE.
>
> I can think of a few reasons why you wouldn't use PORT_*:
>
> - It describes the NIC and not the PHY, and perhaps there is too much impedance
>   mismatch?
> - There is too much legacy in userspace (or in the kernel) to use that API in
>   this way?
> - You need more flexibility?

So there are multiple reasons that make the PORT_* field limited:

- We can't gracefully handle multi-port PHYs for complex scenarios where we could say "I'm currently using the Copper port, but does the Fiber port have link?"
- As you mention in your first argument, what I'd like to try to do is come up with a "generic" representation of outgoing NIC interfaces. The final use-cases I'd like to cover are multi-port NICs, allowing userspace to control which physical interfaces are available, and which to use.

Looking at the hardware, this can be implemented in multiple ways:

            ___ Copper
           /
 MAC - PHY
           \__ SFP

Here, a single PHY has 2 media-side interfaces, and we'd like to select the one to use. That's fairly common now, there are quite a number of PHYs that support this: mv88x3310, VSC8552, mv88x2222 only to name a few.

But there are other, more uncommon topologies that exist:

                            ____ SGMII PHY -- Copper
                           /
 MAC - SGMII/1000BaseX MUX
                           \____ SFP

Here, we also have 2 media-side ports, but they are driven through different entities: the Copper port sits behind a single-port PHY, that is itself behind a *MII MUX, that's also connected to an SFP. Here the port selection is done at the MUX level.

Finally, I've been working on supporting devices with another topology (actually, what started this whole work):

            ___ PHY
           /
 MAC --MUX |
           \__ PHY

Here both PHYs are on the same *MII bus, with some physical, gpio-driven MUX, and we have 2 PORT_TP on the same NIC.
That design is used for link redundancy: if one PHY loses the link, we switch to the other one (that hopefully has link).

All these cases have different drivers involved in the MUX'ing (the PHY driver itself, an intermediate MUX in-between...), so the end-goal would be to expose to userspace info about the media interfaces themselves.

This phy_port object would be what we expose to userspace. One missing step in this series is adding control on the ports (netlink API, enabling/disabling logic for ports) but that far exceeds the 15 patches limitation :)

Sorry if all of that was blurry, I didn't do a very good job of linking to all previous discussions on the topic, I'll address that for the next round.

Thanks,

Maxime
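
The redundancy case above (stay on the current port while it has link, otherwise switch to another port that does) can be sketched independently of which driver performs the MUX'ing. This is only an illustration under assumptions: `media_port` and `select_active_port()` are hypothetical names, not part of the series or of any existing kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port state, whatever entity (PHY, MUX, SFP) drives it. */
struct media_port {
	const char *name;
	bool link_up;
};

/*
 * Return the index of the port to use: keep the current port while it has
 * link, otherwise pick the first port that reports link. If no port has
 * link, stay on the current one.
 */
static size_t select_active_port(const struct media_port *ports,
				 size_t n_ports, size_t current)
{
	size_t i;

	if (current < n_ports && ports[current].link_up)
		return current;

	for (i = 0; i < n_ports; i++) {
		if (ports[i].link_up)
			return i;
	}

	return current;
}
```

The point of the sketch is that the selection logic only looks at per-port state, which a single PORT_* value on the interface cannot carry.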

On Fri, 7 Feb 2025 23:36:22 +0100
Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> Ethernet provides a wide variety of layer 1 protocols and standards for
> data transmission. The front-facing ports of an interface have their own
> complexity and configurability.
>
> Introduce a representation of these front-facing ports. The current code
> is minimalistic and only supports ports controlled by PHY devices, but
> the plan is to extend that to SFP as well as raw Ethernet MACs that
> don't use PHY devices.
>
> This minimal port representation allows describing the media and number
> of lanes of a port. From that information, we can derive the linkmodes
> usable on the port, which can be used to limit the capabilities of an
> interface.
>
> For now, the port lanes and medium are derived from devicetree, defined
> by the PHY driver, or populated with default values (as we assume that
> all PHYs expose at least one port).
>
> The typical example is 100M ethernet. 100BaseT can work using only 2
> lanes on a Cat 5 cable. However, in the situation where a 10/100/1000
> capable PHY is wired to its RJ45 port through 2 lanes only, we have no
> way of detecting that. The "max-speed" DT property can be used, but a
> more accurate representation can be used :
>
> mdi {
> 	port@0 {
> 		media = "BaseT";
> 		lanes = <2>;
> 	};
> };
>
> From that information, we can derive the max speed reachable on the
> port.
>
> Another benefit of having that is to avoid vendor-specific DT properties
> (micrel,fiber-mode or ti,fiber-mode).
>
> This basic representation is meant to be expanded, by the introduction
> of port ops, userspace listing of ports, and support for multi-port
> devices.

This patch tackles the support of ports only for the PHY API. Keeping in mind that this port abstraction will also be of interest to NICs, isn't it preferable to handle ports in a standalone API?

With net drivers having their PHY managed by the firmware or DSA, there is no Linux description of their PHYs. In that case, if we want to use the port abstraction, what is the best approach? Register a virtual phy_device to use the port abstraction, or use the port abstraction API directly, which means that it is not related to any PHY?

Regards,

On Tue, 11 Feb 2025 14:42:43 +0100
Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> Hi Köry,
>
> On Tue, 11 Feb 2025 14:32:09 +0100
> Kory Maincent <kory.maincent@bootlin.com> wrote:
>
> > This patch is tackling the support of ports only for the PHY API. Keeping in
> > mind that this port abstraction support will also be of interest to the
> > NICs. Isn't it preferable to handle port in a standalone API?
>
> The way I see it, nothing prevents using the port definition in
> ethernet-port.yml in DSA/raw NICs.
>
> > With net drivers having PHY managed by the firmware or DSA, there is no
> > linux description of their PHYs. On that case, if we want to use port
> > abstraction, what is the best? Register a virtual phy_device to use the
> > abstraction port or use the port abstraction API directly which meant that
> > it is not related to any PHY?
>
> I think the next steps will be to have net_device have a list of ports
> (maintained in the phy_link_topology) that aggregates ports from all
> its PHYs/SFPs/raw interfaces. In that case net_device will be the
> direct parent. I haven't worked on the bindings for that though,
> especially for DSA :'(

Having it under phy_link_topology is a great idea!

> I don't think the virtual phydev is going to be helpful. I'm hitting
> the 15 patches limit, but a possible extension is to make it so that
> phylink also creates a port when it finds an SFP (hence, when upstream
> is a MAC).

I would say not only for SFP: phylink should create a port whenever it can find an mdi description in the devicetree. Ports with PoE, LEDs or whatever future supported features should be created by phylink.

> This is why phy_port has these fields :
>
> enum phy_port_parent {
> 	PHY_PORT_PHY,
> };
>
> struct phy_port {
> 	...
> 	enum phy_port_parent parent_type;
> 	union {
> 		struct phy_device *phy;
> 	};
>
> };
>
> The parent type may (will) be extended with PORT_PHY_MAC, and that's
> also why the parent pointer is in a union :)

Ok for me!

> I'm trying hard to make it so that phy_port doesn't depend on phylib
> (although, phylib depends on phy_port). There's a dependency on some
> core stuff (converting from medium => linkmodes) and phylink
> (converting the interfaces list to linkmodes), but we can extract these
> fairly easily.
>
> You're correct in that for now, the integration is with phylib only
> though, but let's make sure this will also work for phy-less devices.
>
> Thanks a lot for your input,

Thanks for your work, it will be really helpful to add support for PoE in DSA.

Regards,
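
The parent_type/union pair quoted above is a tagged union, and extending it to a MAC parent might look like the sketch below. This is an assumption-laden illustration, not the series' code: `PHY_PORT_MAC`, the `mac` member and `phy_port_parent_name()` are hypothetical, and `struct phy_device` / `struct net_device` are reduced to stand-ins so the example compiles on its own.

```c
#include <stddef.h>

/* Stand-ins for the kernel types, so the sketch is self-contained. */
struct phy_device { const char *name; };
struct net_device { const char *name; };

enum phy_port_parent {
	PHY_PORT_PHY,
	PHY_PORT_MAC,	/* hypothetical extension for PHY-less devices */
};

struct phy_port {
	/* Tells which member of the union below is valid. */
	enum phy_port_parent parent_type;
	union {
		struct phy_device *phy;
		struct net_device *mac;	/* hypothetical */
	};
};

/* Return the name of whatever object owns the port, or NULL if unset. */
static const char *phy_port_parent_name(const struct phy_port *port)
{
	switch (port->parent_type) {
	case PHY_PORT_PHY:
		return port->phy ? port->phy->name : NULL;
	case PHY_PORT_MAC:
		return port->mac ? port->mac->name : NULL;
	}

	return NULL;
}
```

Code that walks a list of ports then only needs to switch on `parent_type`, so adding a new kind of parent does not change the port consumers.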

> With net drivers having PHY managed by the firmware or DSA, there is no linux
> description of their PHYs.

DSA should not be special, Linux is driving the PHY so it has to exist as a Linux device.

Firmware is a different case. If the firmware has decided to hide the PHY, the MAC driver is using a higher level API, generally just ksetting_set etc. It would be up to the MAC driver to export its PHY topology and provide whatever other firmware calls are needed.

We should keep this in mind when designing the kAPI, but don't need to actually implement it. The kAPI should not directly reference a phydev/phylink instance, but an abstract object which represents a PHY.

	Andrew
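
One way to read "an abstract object which represents a PHY" is an ops table with an opaque backend pointer, so that a phylib-driven PHY and a firmware-hidden PHY look the same to callers. The sketch below is only that reading, with every name (`abstract_phy`, `abstract_phy_ops`, `fw_phy_state`, `fw_phy_link_up()`) invented for illustration; nothing here is an existing kernel interface.

```c
#include <stdbool.h>

struct abstract_phy;

/* Hypothetical operations any PHY backend would provide. */
struct abstract_phy_ops {
	bool (*link_up)(const struct abstract_phy *phy);
};

/* Hypothetical abstract PHY: callers never see a phydev or phylink. */
struct abstract_phy {
	const struct abstract_phy_ops *ops;
	void *priv;	/* backend-specific state */
};

/* Example backend: link state as reported by firmware to the MAC driver. */
struct fw_phy_state {
	bool fw_reports_link;
};

static bool fw_phy_link_up(const struct abstract_phy *phy)
{
	const struct fw_phy_state *state = phy->priv;

	return state->fw_reports_link;
}

static const struct abstract_phy_ops fw_phy_ops = {
	.link_up = fw_phy_link_up,
};

/* Generic entry point: a missing backend or callback reads as "no link". */
static bool abstract_phy_link_up(const struct abstract_phy *phy)
{
	return phy->ops && phy->ops->link_up && phy->ops->link_up(phy);
}
```

A phylib-backed implementation would supply its own ops and keep its phy_device behind `priv`, which keeps the port-facing API free of direct phydev references.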