Message ID | 20210315195413.2679929-1-tobias@waldekranz.com |
---|---|
State | Superseded |
Headers | show |
Series | [net] net: dsa: Centralize validation of VLAN configuration | expand |
On Mon, Mar 15, 2021 at 08:54:13PM +0100, Tobias Waldekranz wrote: > There are four kinds of events that have an inpact on VLAN impact > configuration of DSA ports: > > - Adding VLAN uppers > (ip link add dev swp0.1 link swp0 type vlan id 1) (..) > +static bool dsa_8021q_uppers_are_coherent(struct dsa_switch_tree *dst, > + struct net_device *br, > + bool seen_vlan_upper, have_8021q_uppers_in_bridge maybe? > + unsigned long *upper_vids, > + struct netlink_ext_ack *extack) > +{ > + struct net_device *lower, *upper; > + struct list_head *liter, *uiter; It doesn't hurt to name them lower_iter, upper_iter? > + struct dsa_slave_priv *priv; > + bool seen_offloaded = false; > + u16 vid; > + > + netdev_for_each_lower_dev(br, lower, liter) { > + priv = dsa_slave_dev_lower_find(lower); > + if (!priv || priv->dp->ds->dst != dst) > + /* Ignore ports that are not related to us in > + * any way. > + */ > + continue; So "priv" is the lower of a bridge port... > + > + if (is_vlan_dev(lower)) { > + seen_vlan_upper = true; > + continue; > + } But in the code path below, that bridge port is not a VLAN... So it must be a LAG or a HSR ring.... > + if (dsa_port_offloads_bridge(priv->dp, br) && > + dsa_port_offloads_bridge_port(priv->dp, lower)) > + seen_offloaded = true; > + else > + /* Non-offloaded uppers can to whatever they s/can to/can do/ > + * want. > + */ > + continue; Which is offloaded.. > + netdev_for_each_upper_dev_rcu(lower, upper, uiter) { > + if (!is_vlan_dev(upper)) > + continue; So this iterates through VLAN uppers of offloaded LAGs and HSR rings? Does it also iterate through 8021q uppers of "priv" somehow? > + vid = vlan_dev_vlan_id(upper); > + if (!test_bit(vid, upper_vids)) { > + set_bit(vid, upper_vids); > + continue; > + } > + > + NL_SET_ERR_MSG_MOD(extack, > + "Multiple VLAN interfaces cannot use the same VID"); > + return false; > + } > + } > + > + if (seen_offloaded && seen_vlan_upper) { > + NL_SET_ERR_MSG_MOD(extack, > + "VLAN interfaces cannot share bridge with offloaded port"); > + return false; > + } > + > + return true; > +} > + > +static bool dsa_bridge_vlans_are_coherent(struct net_device *br, > + u16 new_vid, unsigned long *upper_vids, const unsigned long *upper_vids > + struct netlink_ext_ack *extack) > +{ > + u16 vid; > + > + if (new_vid && test_bit(new_vid, upper_vids)) > + goto err; > + > + for_each_set_bit(vid, upper_vids, VLAN_N_VID) { > + struct bridge_vlan_info br_info; > + > + if (br_vlan_get_info(br, vid, &br_info)) You should only error out if VLAN filtering is enabled/turning on in the bridge, no? > + /* Error means that the VID does not exist, > + * which is what we want to ensure. > + */ > + continue; > + > + goto err; > + } > + > + return true; > + > +err: > + NL_SET_ERR_MSG_MOD(extack, "No bridge VID may be used on a related VLAN interface"); > + return false; > +} > + > +/** > + * dsa_bridge_is_coherent - Verify that DSA tree accepts a bridge config. > + * @dst: Tree to verify against. > + * @br: Bridge netdev to verify. > + * @mod: Description of the modification to introduce. > + * @extack: Netlink extended ack for error reporting. > + * > + * Verify that the VLAN config of @br, its offloaded ports belonging > + * to @dst and their VLAN uppers, can be correctly offloaded after > + * introducing the change described by @mod. If this is not the case, > + * an error is reported via @extack. > + * > + * Return: true if the config can be offloaded, false otherwise. > + */ > +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, > + struct dsa_bridge_mod *mod, > + struct netlink_ext_ack *extack) > +{ > + unsigned long *upper_vids = NULL; initialization with NULL is pointless. > + bool filter; > + > + if (mod->filter) > + filter = *mod->filter; > + else > + filter = br && br_vlan_enabled(br); > + > + if (!dsa_bridge_filtering_is_coherent(dst, filter, extack)) > + goto err; > + > + if (!filter) > + return true; > + > + upper_vids = bitmap_zalloc(VLAN_N_VID, GFP_KERNEL); > + if (!upper_vids) { > + WARN_ON_ONCE(1); if (WARN_ON_ONCE(!upper_vids)) > + goto err; > + } > + > + if (mod->upper_vid) > + set_bit(mod->upper_vid, upper_vids); > + > + if (!dsa_8021q_uppers_are_coherent(dst, br, mod->bridge_upper, > + upper_vids, extack)) > + goto err_free; > + > + if (!dsa_bridge_vlans_are_coherent(br, mod->br_vid, upper_vids, extack)) > + goto err_free; > + > + kfree(upper_vids); > + return true; > + > +err_free: > + kfree(upper_vids); > +err: > + return false; > +} > + > /** > * dsa_tree_notify - Execute code for all switches in a DSA switch tree. > * @dst: collection of struct dsa_switch devices to notify. > diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h > index 9d4b0e9b1aa1..8d8d307df437 100644 > --- a/net/dsa/dsa_priv.h > +++ b/net/dsa/dsa_priv.h > @@ -361,6 +369,27 @@ int dsa_switch_register_notifier(struct dsa_switch *ds); > void dsa_switch_unregister_notifier(struct dsa_switch *ds); > > /* dsa2.c */ > + > +/** > + * struct dsa_bridge_mod - Modification of bridge related config > + * @filter: If non-NULL, the new state of VLAN filtering. > + * @br_vid: If non-zero, this VID will be added to the bridge. > + * @upper_vid: If non-zero, a VLAN upper using this VID will be added to > + * a bridge port. > + * @bridge_upper: If non-NULL, a VLAN upper will be added to the bridge. I would name this "add_8021q_upper_to_bridge". Longer name, but clearer. > + * > + * Describes a bridge related modification that is about to be applied. > + */ > +struct dsa_bridge_mod { > + bool *filter; > + u16 br_vid; > + u16 upper_vid; > + bool bridge_upper; > +}; Frankly this is a bit ugly, but I have no better idea, and the structure is good enough for describing a state transition. Fully describing the state is a lot more difficult, due to the need to list all bridges which may span a DSA switch tree. > +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, > + struct dsa_bridge_mod *mod, > + struct netlink_ext_ack *extack); > void dsa_lag_map(struct dsa_switch_tree *dst, struct net_device *lag); > void dsa_lag_unmap(struct dsa_switch_tree *dst, struct net_device *lag); > int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v); (...) > -static struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) > +struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) > { > struct netdev_nested_priv priv = { > .data = NULL, > }; > > + if (dsa_slave_dev_check(dev)) > + return netdev_priv(dev); > + > netdev_walk_all_lower_dev_rcu(dev, dsa_lower_dev_walk, &priv); > > return (struct dsa_slave_priv *)priv.data; Ah, so that's what you did there. I don't like it. If the function is called "lower_find" and you come back with "dev" itself, I think that would qualify as "unexpected". Could you create a new function called dsa_slave_find_in_lowers_or_self, or something like that, which calls dsa_slave_dev_lower_find with the extra identity check?
On Tue, Mar 16, 2021 at 01:49, Vladimir Oltean <olteanv@gmail.com> wrote: > On Mon, Mar 15, 2021 at 08:54:13PM +0100, Tobias Waldekranz wrote: >> There are four kinds of events that have an inpact on VLAN > > impact > >> configuration of DSA ports: >> >> - Adding VLAN uppers >> (ip link add dev swp0.1 link swp0 type vlan id 1) > (..) Parse error; I need more context :) >> +static bool dsa_8021q_uppers_are_coherent(struct dsa_switch_tree *dst, >> + struct net_device *br, >> + bool seen_vlan_upper, > > have_8021q_uppers_in_bridge maybe? I like that the current name hints of a relation with seen_offloaded. Your suggestion seems awfully long for an argument name. > >> + unsigned long *upper_vids, >> + struct netlink_ext_ack *extack) >> +{ >> + struct net_device *lower, *upper; >> + struct list_head *liter, *uiter; > > It doesn't hurt to name them lower_iter, upper_iter? > >> + struct dsa_slave_priv *priv; >> + bool seen_offloaded = false; >> + u16 vid; >> + >> + netdev_for_each_lower_dev(br, lower, liter) { >> + priv = dsa_slave_dev_lower_find(lower); >> + if (!priv || priv->dp->ds->dst != dst) >> + /* Ignore ports that are not related to us in >> + * any way. >> + */ >> + continue; > > So "priv" is the lower of a bridge port... > >> + >> + if (is_vlan_dev(lower)) { >> + seen_vlan_upper = true; >> + continue; >> + } > > But in the code path below, that bridge port is not a VLAN... So it must > be a LAG or a HSR ring.... > >> + if (dsa_port_offloads_bridge(priv->dp, br) && >> + dsa_port_offloads_bridge_port(priv->dp, lower)) >> + seen_offloaded = true; >> + else >> + /* Non-offloaded uppers can to whatever they > > s/can to/can do/ > >> + * want. >> + */ >> + continue; > > Which is offloaded.. > >> + netdev_for_each_upper_dev_rcu(lower, upper, uiter) { >> + if (!is_vlan_dev(upper)) >> + continue; > > So this iterates through VLAN uppers of offloaded LAGs and HSR rings? > Does it also iterate through 8021q uppers of "priv" somehow? As you discovered below, dsa_slave_dev_lower_find now also matches the starting device as well as any device below it. So we iterate through all uppers of any bridge port that this tree is offloading. >> + vid = vlan_dev_vlan_id(upper); >> + if (!test_bit(vid, upper_vids)) { >> + set_bit(vid, upper_vids); >> + continue; >> + } >> + >> + NL_SET_ERR_MSG_MOD(extack, >> + "Multiple VLAN interfaces cannot use the same VID"); >> + return false; >> + } >> + } >> + >> + if (seen_offloaded && seen_vlan_upper) { >> + NL_SET_ERR_MSG_MOD(extack, >> + "VLAN interfaces cannot share bridge with offloaded port"); >> + return false; >> + } >> + >> + return true; >> +} >> + >> +static bool dsa_bridge_vlans_are_coherent(struct net_device *br, >> + u16 new_vid, unsigned long *upper_vids, > > const unsigned long *upper_vids > >> + struct netlink_ext_ack *extack) >> +{ >> + u16 vid; >> + >> + if (new_vid && test_bit(new_vid, upper_vids)) >> + goto err; >> + >> + for_each_set_bit(vid, upper_vids, VLAN_N_VID) { >> + struct bridge_vlan_info br_info; >> + >> + if (br_vlan_get_info(br, vid, &br_info)) > > You should only error out if VLAN filtering is enabled/turning on in the > bridge, no? We only validate upper and bridge VLAN coherency for filtering bridges. Otherwise we return early from dsa_bridge_is_coherent. >> + /* Error means that the VID does not exist, >> + * which is what we want to ensure. >> + */ >> + continue; >> + >> + goto err; >> + } >> + >> + return true; >> + >> +err: >> + NL_SET_ERR_MSG_MOD(extack, "No bridge VID may be used on a related VLAN interface"); >> + return false; >> +} >> + >> +/** >> + * dsa_bridge_is_coherent - Verify that DSA tree accepts a bridge config. >> + * @dst: Tree to verify against. >> + * @br: Bridge netdev to verify. >> + * @mod: Description of the modification to introduce. >> + * @extack: Netlink extended ack for error reporting. >> + * >> + * Verify that the VLAN config of @br, its offloaded ports belonging >> + * to @dst and their VLAN uppers, can be correctly offloaded after >> + * introducing the change described by @mod. If this is not the case, >> + * an error is reported via @extack. >> + * >> + * Return: true if the config can be offloaded, false otherwise. >> + */ >> +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, >> + struct dsa_bridge_mod *mod, >> + struct netlink_ext_ack *extack) >> +{ >> + unsigned long *upper_vids = NULL; > > initialization with NULL is pointless. > >> + bool filter; >> + >> + if (mod->filter) >> + filter = *mod->filter; >> + else >> + filter = br && br_vlan_enabled(br); >> + >> + if (!dsa_bridge_filtering_is_coherent(dst, filter, extack)) >> + goto err; >> + >> + if (!filter) >> + return true; >> + >> + upper_vids = bitmap_zalloc(VLAN_N_VID, GFP_KERNEL); >> + if (!upper_vids) { >> + WARN_ON_ONCE(1); > > if (WARN_ON_ONCE(!upper_vids)) > >> + goto err; >> + } >> + >> + if (mod->upper_vid) >> + set_bit(mod->upper_vid, upper_vids); >> + >> + if (!dsa_8021q_uppers_are_coherent(dst, br, mod->bridge_upper, >> + upper_vids, extack)) >> + goto err_free; >> + >> + if (!dsa_bridge_vlans_are_coherent(br, mod->br_vid, upper_vids, extack)) >> + goto err_free; >> + >> + kfree(upper_vids); >> + return true; >> + >> +err_free: >> + kfree(upper_vids); >> +err: >> + return false; >> +} >> + >> /** >> * dsa_tree_notify - Execute code for all switches in a DSA switch tree. >> * @dst: collection of struct dsa_switch devices to notify. >> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h >> index 9d4b0e9b1aa1..8d8d307df437 100644 >> --- a/net/dsa/dsa_priv.h >> +++ b/net/dsa/dsa_priv.h >> @@ -361,6 +369,27 @@ int dsa_switch_register_notifier(struct dsa_switch *ds); >> void dsa_switch_unregister_notifier(struct dsa_switch *ds); >> >> /* dsa2.c */ >> + >> +/** >> + * struct dsa_bridge_mod - Modification of bridge related config >> + * @filter: If non-NULL, the new state of VLAN filtering. >> + * @br_vid: If non-zero, this VID will be added to the bridge. >> + * @upper_vid: If non-zero, a VLAN upper using this VID will be added to >> + * a bridge port. >> + * @bridge_upper: If non-NULL, a VLAN upper will be added to the bridge. > > I would name this "add_8021q_upper_to_bridge". Longer name, but clearer. It is not like it is a global variable or anything, there is plenty of context here I think. You know that you are describing a bridge related VLAN modification. >> + * >> + * Describes a bridge related modification that is about to be applied. >> + */ >> +struct dsa_bridge_mod { >> + bool *filter; >> + u16 br_vid; >> + u16 upper_vid; >> + bool bridge_upper; >> +}; > > Frankly this is a bit ugly, but I have no better idea, and the structure > is good enough for describing a state transition. Fully describing the > state is a lot more difficult, due to the need to list all bridges which > may span a DSA switch tree. I am not sure what to make of this. Its job _is_ to describe a state transition. Why would we want to describe the state? The kernel already has the state, which is what dsa_bridge_is_coherent uses to figure out if the change can be applied or not. Is it sexy? No, I guess not. This type of code seldom is. The alternative would be to cram the info into the argument list, but that makes the wrappers harder to read and it makes it harder to extend when we want to validate another invariant. >> +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, >> + struct dsa_bridge_mod *mod, >> + struct netlink_ext_ack *extack); >> void dsa_lag_map(struct dsa_switch_tree *dst, struct net_device *lag); >> void dsa_lag_unmap(struct dsa_switch_tree *dst, struct net_device *lag); >> int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v); > (...) ? >> -static struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) >> +struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) >> { >> struct netdev_nested_priv priv = { >> .data = NULL, >> }; >> >> + if (dsa_slave_dev_check(dev)) >> + return netdev_priv(dev); >> + >> netdev_walk_all_lower_dev_rcu(dev, dsa_lower_dev_walk, &priv); >> >> return (struct dsa_slave_priv *)priv.data; > > Ah, so that's what you did there. I don't like it. If the function is > called "lower_find" and you come back with "dev" itself, I think that > would qualify as "unexpected". Could you create a new function called > dsa_slave_find_in_lowers_or_self, or something like that, which calls > dsa_slave_dev_lower_find with the extra identity check? My assumption was the opposite. Looking through the kernel, there seem to be three other logical equivalents: drivers/net/ethernet/marvell/prestera/prestera_main.c 495:struct prestera_port *prestera_port_dev_lower_find(struct net_device *dev) drivers/net/ethernet/mellanox/mlxsw/spectrum.c 3406:struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find(struct net_device *dev) drivers/net/ethernet/rocker/rocker_main.c 3090:struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev, All three will check the starting device before walking any lowers. Thank you!
Hi Tobias, I love your patch! Perhaps something to improve: [auto build test WARNING on net/master] url: https://github.com/0day-ci/linux/commits/Tobias-Waldekranz/net-dsa-Centralize-validation-of-VLAN-configuration/20210316-035618 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git a25f822285420486f5da434efc8d940d42a83bce config: powerpc-randconfig-r006-20210316 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 50c7504a93fdb90c26870db8c8ea7add895c7725) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc cross compiling tool for clang build # apt-get install binutils-powerpc-linux-gnu # https://github.com/0day-ci/linux/commit/9a35f7597b676a3bdaa9dd753e0a7d11fb132ed5 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Tobias-Waldekranz/net-dsa-Centralize-validation-of-VLAN-configuration/20210316-035618 git checkout 9a35f7597b676a3bdaa9dd753e0a7d11fb132ed5 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): >> net/dsa/slave.c:2074:12: warning: variable 'err' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] else if (is_vlan_dev(info->upper_dev)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/dsa/slave.c:2077:7: note: uninitialized use occurs here if (err) ^~~ net/dsa/slave.c:2074:8: note: remove the 'if' if its condition is always true else if (is_vlan_dev(info->upper_dev)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/dsa/slave.c:2070:10: note: initialize the variable 'err' to silence this warning int err; ^ = 0 1 warning generated. vim +2074 net/dsa/slave.c 2059 2060 static int dsa_slave_netdevice_event(struct notifier_block *nb, 2061 unsigned long event, void *ptr) 2062 { 2063 struct net_device *dev = netdev_notifier_info_to_dev(ptr); 2064 2065 switch (event) { 2066 case NETDEV_PRECHANGEUPPER: { 2067 struct netdev_notifier_changeupper_info *info = ptr; 2068 struct dsa_switch *ds; 2069 struct dsa_port *dp; 2070 int err; 2071 2072 if (is_vlan_dev(dev)) 2073 err = dsa_prevent_bridging_8021q_upper(dev, ptr); > 2074 else if (is_vlan_dev(info->upper_dev)) 2075 err = dsa_slave_check_8021q_upper(dev, ptr); 2076 2077 if (err) 2078 return err; 2079 2080 if (!dsa_slave_dev_check(dev)) 2081 return NOTIFY_DONE; 2082 2083 dp = dsa_slave_to_port(dev); 2084 ds = dp->ds; 2085 2086 if (ds->ops->port_prechangeupper) { 2087 err = ds->ops->port_prechangeupper(ds, dp->index, info); 2088 if (err) 2089 return notifier_from_errno(err); 2090 } 2091 break; 2092 } 2093 case NETDEV_CHANGEUPPER: 2094 if (dsa_slave_dev_check(dev)) 2095 return dsa_slave_changeupper(dev, ptr); 2096 2097 if (netif_is_lag_master(dev)) 2098 return dsa_slave_lag_changeupper(dev, ptr); 2099 2100 break; 2101 case NETDEV_CHANGELOWERSTATE: { 2102 struct netdev_notifier_changelowerstate_info *info = ptr; 2103 struct dsa_port *dp; 2104 int err; 2105 2106 if (!dsa_slave_dev_check(dev)) 2107 break; 2108 2109 dp = dsa_slave_to_port(dev); 2110 2111 err = dsa_port_lag_change(dp, info->lower_state_info); 2112 return notifier_from_errno(err); 2113 } 2114 case NETDEV_GOING_DOWN: { 2115 struct dsa_port *dp, *cpu_dp; 2116 struct dsa_switch_tree *dst; 2117 LIST_HEAD(close_list); 2118 2119 if (!netdev_uses_dsa(dev)) 2120 return NOTIFY_DONE; 2121 2122 cpu_dp = dev->dsa_ptr; 2123 dst = cpu_dp->ds->dst; 2124 2125 list_for_each_entry(dp, &dst->ports, list) { 2126 if (!dsa_is_user_port(dp->ds, dp->index)) 2127 continue; 2128 2129 list_add(&dp->slave->close_list, &close_list); 2130 } 2131 2132 dev_close_many(&close_list, true); 2133 2134 return NOTIFY_OK; 2135 } 2136 default: 2137 break; 2138 } 2139 2140 return NOTIFY_DONE; 2141 } 2142 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Mar 16, 2021 at 05:08:28PM +0100, Tobias Waldekranz wrote: > On Tue, Mar 16, 2021 at 01:49, Vladimir Oltean <olteanv@gmail.com> wrote: > > On Mon, Mar 15, 2021 at 08:54:13PM +0100, Tobias Waldekranz wrote: > >> There are four kinds of events that have an inpact on VLAN > > > > impact > > > >> configuration of DSA ports: > >> > >> - Adding VLAN uppers > >> (ip link add dev swp0.1 link swp0 type vlan id 1) > > (..) > > Parse error; I need more context :) For what? I didn't say anything. > >> +static bool dsa_8021q_uppers_are_coherent(struct dsa_switch_tree *dst, > >> + struct net_device *br, > >> + bool seen_vlan_upper, > > > > have_8021q_uppers_in_bridge maybe? > > I like that the current name hints of a relation with > seen_offloaded. Your suggestion seems awfully long for an argument name. seen_offloaded would have become have_offloaded_bridge_ports_as_uppers, I thought that was obvious. But if you don't like it I can't force you... > > > >> + unsigned long *upper_vids, > >> + struct netlink_ext_ack *extack) > >> +{ > >> + struct net_device *lower, *upper; > >> + struct list_head *liter, *uiter; > > > > It doesn't hurt to name them lower_iter, upper_iter? > > > >> + struct dsa_slave_priv *priv; > >> + bool seen_offloaded = false; > >> + u16 vid; > >> + > >> + netdev_for_each_lower_dev(br, lower, liter) { > >> + priv = dsa_slave_dev_lower_find(lower); > >> + if (!priv || priv->dp->ds->dst != dst) > >> + /* Ignore ports that are not related to us in > >> + * any way. > >> + */ > >> + continue; > > > > So "priv" is the lower of a bridge port... > > > >> + > >> + if (is_vlan_dev(lower)) { > >> + seen_vlan_upper = true; > >> + continue; > >> + } > > > > But in the code path below, that bridge port is not a VLAN... So it must > > be a LAG or a HSR ring.... > > > >> + if (dsa_port_offloads_bridge(priv->dp, br) && > >> + dsa_port_offloads_bridge_port(priv->dp, lower)) > >> + seen_offloaded = true; > >> + else > >> + /* Non-offloaded uppers can to whatever they > > > > s/can to/can do/ > > > >> + * want. > >> + */ > >> + continue; > > > > Which is offloaded.. > > > >> + netdev_for_each_upper_dev_rcu(lower, upper, uiter) { > >> + if (!is_vlan_dev(upper)) > >> + continue; > > > > So this iterates through VLAN uppers of offloaded LAGs and HSR rings? > > Does it also iterate through 8021q uppers of "priv" somehow? > > As you discovered below, dsa_slave_dev_lower_find now also matches the > starting device as well as any device below it. So we iterate through > all uppers of any bridge port that this tree is offloading. > > >> + vid = vlan_dev_vlan_id(upper); > >> + if (!test_bit(vid, upper_vids)) { > >> + set_bit(vid, upper_vids); > >> + continue; > >> + } > >> + > >> + NL_SET_ERR_MSG_MOD(extack, > >> + "Multiple VLAN interfaces cannot use the same VID"); > >> + return false; > >> + } > >> + } > >> + > >> + if (seen_offloaded && seen_vlan_upper) { > >> + NL_SET_ERR_MSG_MOD(extack, > >> + "VLAN interfaces cannot share bridge with offloaded port"); > >> + return false; > >> + } > >> + > >> + return true; > >> +} > >> + > >> +static bool dsa_bridge_vlans_are_coherent(struct net_device *br, > >> + u16 new_vid, unsigned long *upper_vids, > > > > const unsigned long *upper_vids > > > >> + struct netlink_ext_ack *extack) > >> +{ > >> + u16 vid; > >> + > >> + if (new_vid && test_bit(new_vid, upper_vids)) > >> + goto err; > >> + > >> + for_each_set_bit(vid, upper_vids, VLAN_N_VID) { > >> + struct bridge_vlan_info br_info; > >> + > >> + if (br_vlan_get_info(br, vid, &br_info)) > > > > You should only error out if VLAN filtering is enabled/turning on in the > > bridge, no? > > We only validate upper and bridge VLAN coherency for filtering > bridges. Otherwise we return early from dsa_bridge_is_coherent. Ok. > >> + /* Error means that the VID does not exist, > >> + * which is what we want to ensure. > >> + */ > >> + continue; > >> + > >> + goto err; > >> + } > >> + > >> + return true; > >> + > >> +err: > >> + NL_SET_ERR_MSG_MOD(extack, "No bridge VID may be used on a related VLAN interface"); > >> + return false; > >> +} > >> + > >> +/** > >> + * dsa_bridge_is_coherent - Verify that DSA tree accepts a bridge config. > >> + * @dst: Tree to verify against. > >> + * @br: Bridge netdev to verify. > >> + * @mod: Description of the modification to introduce. > >> + * @extack: Netlink extended ack for error reporting. > >> + * > >> + * Verify that the VLAN config of @br, its offloaded ports belonging > >> + * to @dst and their VLAN uppers, can be correctly offloaded after > >> + * introducing the change described by @mod. If this is not the case, > >> + * an error is reported via @extack. > >> + * > >> + * Return: true if the config can be offloaded, false otherwise. > >> + */ > >> +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, > >> + struct dsa_bridge_mod *mod, > >> + struct netlink_ext_ack *extack) > >> +{ > >> + unsigned long *upper_vids = NULL; > > > > initialization with NULL is pointless. > > > >> + bool filter; > >> + > >> + if (mod->filter) > >> + filter = *mod->filter; > >> + else > >> + filter = br && br_vlan_enabled(br); > >> + > >> + if (!dsa_bridge_filtering_is_coherent(dst, filter, extack)) > >> + goto err; > >> + > >> + if (!filter) > >> + return true; > >> + > >> + upper_vids = bitmap_zalloc(VLAN_N_VID, GFP_KERNEL); > >> + if (!upper_vids) { > >> + WARN_ON_ONCE(1); > > > > if (WARN_ON_ONCE(!upper_vids)) > > > >> + goto err; > >> + } > >> + > >> + if (mod->upper_vid) > >> + set_bit(mod->upper_vid, upper_vids); > >> + > >> + if (!dsa_8021q_uppers_are_coherent(dst, br, mod->bridge_upper, > >> + upper_vids, extack)) > >> + goto err_free; > >> + > >> + if (!dsa_bridge_vlans_are_coherent(br, mod->br_vid, upper_vids, extack)) > >> + goto err_free; > >> + > >> + kfree(upper_vids); > >> + return true; > >> + > >> +err_free: > >> + kfree(upper_vids); > >> +err: > >> + return false; > >> +} > >> + > >> /** > >> * dsa_tree_notify - Execute code for all switches in a DSA switch tree. > >> * @dst: collection of struct dsa_switch devices to notify. > >> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h > >> index 9d4b0e9b1aa1..8d8d307df437 100644 > >> --- a/net/dsa/dsa_priv.h > >> +++ b/net/dsa/dsa_priv.h > >> @@ -361,6 +369,27 @@ int dsa_switch_register_notifier(struct dsa_switch *ds); > >> void dsa_switch_unregister_notifier(struct dsa_switch *ds); > >> > >> /* dsa2.c */ > >> + > >> +/** > >> + * struct dsa_bridge_mod - Modification of bridge related config > >> + * @filter: If non-NULL, the new state of VLAN filtering. > >> + * @br_vid: If non-zero, this VID will be added to the bridge. > >> + * @upper_vid: If non-zero, a VLAN upper using this VID will be added to > >> + * a bridge port. > >> + * @bridge_upper: If non-NULL, a VLAN upper will be added to the bridge. > > > > I would name this "add_8021q_upper_to_bridge". Longer name, but clearer. > > It is not like it is a global variable or anything, there is plenty of > context here I think. You know that you are describing a bridge related > VLAN modification. Ok. > >> + * > >> + * Describes a bridge related modification that is about to be applied. > >> + */ > >> +struct dsa_bridge_mod { > >> + bool *filter; > >> + u16 br_vid; > >> + u16 upper_vid; > >> + bool bridge_upper; > >> +}; > > > > Frankly this is a bit ugly, but I have no better idea, and the structure > > is good enough for describing a state transition. Fully describing the > > state is a lot more difficult, due to the need to list all bridges which > > may span a DSA switch tree. > > I am not sure what to make of this. Its job _is_ to describe a state > transition. Why would we want to describe the state? The kernel already > has the state, which is what dsa_bridge_is_coherent uses to figure out > if the change can be applied or not. > > Is it sexy? No, I guess not. This type of code seldom is. The > alternative would be to cram the info into the argument list, but that > makes the wrappers harder to read and it makes it harder to extend when > we want to validate another invariant. What I was thinking out loud about was whether it would be possible for the validation to work as follows: - call a function that extracts the current DSA configuration - modify that state description according to what you're doing, transforming it into a candidate configuration - call a function that validates the candidate configuration - error out if the validation fails Your solution needs to extract some of the current DSA configuration anyway, for the constellation of uppers, but then, the validation function takes the state transition as an explicit argument, and constructing the state is done as we go along. Again, I said I don't have a better idea than your proposal. > >> +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, > >> + struct dsa_bridge_mod *mod, > >> + struct netlink_ext_ack *extack); > >> void dsa_lag_map(struct dsa_switch_tree *dst, struct net_device *lag); > >> void dsa_lag_unmap(struct dsa_switch_tree *dst, struct net_device *lag); > >> int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v); > > (...) > > ? Don't worry, I just trimmed a chunk of the patch. > >> -static struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) > >> +struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) > >> { > >> struct netdev_nested_priv priv = { > >> .data = NULL, > >> }; > >> > >> + if (dsa_slave_dev_check(dev)) > >> + return netdev_priv(dev); > >> + > >> netdev_walk_all_lower_dev_rcu(dev, dsa_lower_dev_walk, &priv); > >> > >> return (struct dsa_slave_priv *)priv.data; > > > > Ah, so that's what you did there. I don't like it. If the function is > > called "lower_find" and you come back with "dev" itself, I think that > > would qualify as "unexpected". Could you create a new function called > > dsa_slave_find_in_lowers_or_self, or something like that, which calls > > dsa_slave_dev_lower_find with the extra identity check? > > My assumption was the opposite. Looking through the kernel, there seem > to be three other logical equivalents: > > drivers/net/ethernet/marvell/prestera/prestera_main.c > 495:struct prestera_port *prestera_port_dev_lower_find(struct net_device *dev) > > drivers/net/ethernet/mellanox/mlxsw/spectrum.c > 3406:struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find(struct net_device *dev) > > drivers/net/ethernet/rocker/rocker_main.c > 3090:struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev, > > All three will check the starting device before walking any lowers. Ok. Actually, I think there is a bigger issue. Sadly, I only put 2 and 2 together now, and I believe we are still not dealing correctly with 8021q uppers of bridge ports with vlan_filtering 0. The network stack's expectation is described here: https://patchwork.kernel.org/project/netdevbpf/patch/20210221213355.1241450-12-olteanv@gmail.com/ Basically the problem is in the way DSA drives the configuration of the hardware port. When VLAN filtering is disabled, VLAN awareness is disabled, too, and we make no effort to classify packets according to the VLAN ID, and take a different forwarding decision in hardware based on that. So the traffic corresponding to the 8021q upper gets forwarded, flooded to ports it shouldn't go to. At the very least, we should deny: - joining a bridge with vlan_filtering 0 while there is any 8021q upper - toggling vlan_filtering off while there is any 8021q upper - adding an 8021q upper while the port is under a vlan_filtering=0 bridge And for net-next, we should look at lifting that restriction, perhaps by reacting to the above events by unoffloading the bridge port dynamically (it's going to be so complicated that my head hurts already).
On Tue, Mar 16, 2021 at 23:49, Vladimir Oltean <olteanv@gmail.com> wrote: > On Tue, Mar 16, 2021 at 05:08:28PM +0100, Tobias Waldekranz wrote: >> On Tue, Mar 16, 2021 at 01:49, Vladimir Oltean <olteanv@gmail.com> wrote: >> > On Mon, Mar 15, 2021 at 08:54:13PM +0100, Tobias Waldekranz wrote: >> >> There are four kinds of events that have an inpact on VLAN >> > >> > impact >> > >> >> configuration of DSA ports: >> >> >> >> - Adding VLAN uppers >> >> (ip link add dev swp0.1 link swp0 type vlan id 1) >> > (..) >> >> Parse error; I need more context :) > > For what? I didn't say anything. I did not understand that you had simply trimmed parts of the original message. >> >> +static bool dsa_8021q_uppers_are_coherent(struct dsa_switch_tree *dst, >> >> + struct net_device *br, >> >> + bool seen_vlan_upper, >> > >> > have_8021q_uppers_in_bridge maybe? >> >> I like that the current name hints of a relation with >> seen_offloaded. Your suggestion seems awfully long for an argument name. > > seen_offloaded would have become have_offloaded_bridge_ports_as_uppers, > I thought that was obvious. But if you don't like it I can't force > you... > >> > >> >> + unsigned long *upper_vids, >> >> + struct netlink_ext_ack *extack) >> >> +{ >> >> + struct net_device *lower, *upper; >> >> + struct list_head *liter, *uiter; >> > >> > It doesn't hurt to name them lower_iter, upper_iter? >> > >> >> + struct dsa_slave_priv *priv; >> >> + bool seen_offloaded = false; >> >> + u16 vid; >> >> + >> >> + netdev_for_each_lower_dev(br, lower, liter) { >> >> + priv = dsa_slave_dev_lower_find(lower); >> >> + if (!priv || priv->dp->ds->dst != dst) >> >> + /* Ignore ports that are not related to us in >> >> + * any way. >> >> + */ >> >> + continue; >> > >> > So "priv" is the lower of a bridge port... >> > >> >> + >> >> + if (is_vlan_dev(lower)) { >> >> + seen_vlan_upper = true; >> >> + continue; >> >> + } >> > >> > But in the code path below, that bridge port is not a VLAN... So it must >> > be a LAG or a HSR ring.... >> > >> >> + if (dsa_port_offloads_bridge(priv->dp, br) && >> >> + dsa_port_offloads_bridge_port(priv->dp, lower)) >> >> + seen_offloaded = true; >> >> + else >> >> + /* Non-offloaded uppers can to whatever they >> > >> > s/can to/can do/ >> > >> >> + * want. >> >> + */ >> >> + continue; >> > >> > Which is offloaded.. >> > >> >> + netdev_for_each_upper_dev_rcu(lower, upper, uiter) { >> >> + if (!is_vlan_dev(upper)) >> >> + continue; >> > >> > So this iterates through VLAN uppers of offloaded LAGs and HSR rings? >> > Does it also iterate through 8021q uppers of "priv" somehow? >> >> As you discovered below, dsa_slave_dev_lower_find now also matches the >> starting device as well as any device below it. So we iterate through >> all uppers of any bridge port that this tree is offloading. >> >> >> + vid = vlan_dev_vlan_id(upper); >> >> + if (!test_bit(vid, upper_vids)) { >> >> + set_bit(vid, upper_vids); >> >> + continue; >> >> + } >> >> + >> >> + NL_SET_ERR_MSG_MOD(extack, >> >> + "Multiple VLAN interfaces cannot use the same VID"); >> >> + return false; >> >> + } >> >> + } >> >> + >> >> + if (seen_offloaded && seen_vlan_upper) { >> >> + NL_SET_ERR_MSG_MOD(extack, >> >> + "VLAN interfaces cannot share bridge with offloaded port"); >> >> + return false; >> >> + } >> >> + >> >> + return true; >> >> +} >> >> + >> >> +static bool dsa_bridge_vlans_are_coherent(struct net_device *br, >> >> + u16 new_vid, unsigned long *upper_vids, >> > >> > const unsigned long *upper_vids >> > >> >> + struct netlink_ext_ack *extack) >> >> +{ >> >> + u16 vid; >> >> + >> >> + if (new_vid && test_bit(new_vid, upper_vids)) >> >> + goto err; >> >> + >> >> + for_each_set_bit(vid, upper_vids, VLAN_N_VID) { >> >> + struct bridge_vlan_info br_info; >> >> + >> >> + if (br_vlan_get_info(br, vid, &br_info)) >> > >> > You should only error out if VLAN filtering is enabled/turning on in the >> > bridge, no? >> >> We only validate upper and bridge VLAN coherency for filtering >> bridges. Otherwise we return early from dsa_bridge_is_coherent. > > Ok. > >> >> + /* Error means that the VID does not exist, >> >> + * which is what we want to ensure. >> >> + */ >> >> + continue; >> >> + >> >> + goto err; >> >> + } >> >> + >> >> + return true; >> >> + >> >> +err: >> >> + NL_SET_ERR_MSG_MOD(extack, "No bridge VID may be used on a related VLAN interface"); >> >> + return false; >> >> +} >> >> + >> >> +/** >> >> + * dsa_bridge_is_coherent - Verify that DSA tree accepts a bridge config. >> >> + * @dst: Tree to verify against. >> >> + * @br: Bridge netdev to verify. >> >> + * @mod: Description of the modification to introduce. >> >> + * @extack: Netlink extended ack for error reporting. >> >> + * >> >> + * Verify that the VLAN config of @br, its offloaded ports belonging >> >> + * to @dst and their VLAN uppers, can be correctly offloaded after >> >> + * introducing the change described by @mod. If this is not the case, >> >> + * an error is reported via @extack. >> >> + * >> >> + * Return: true if the config can be offloaded, false otherwise. >> >> + */ >> >> +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, >> >> + struct dsa_bridge_mod *mod, >> >> + struct netlink_ext_ack *extack) >> >> +{ >> >> + unsigned long *upper_vids = NULL; >> > >> > initialization with NULL is pointless. >> > >> >> + bool filter; >> >> + >> >> + if (mod->filter) >> >> + filter = *mod->filter; >> >> + else >> >> + filter = br && br_vlan_enabled(br); >> >> + >> >> + if (!dsa_bridge_filtering_is_coherent(dst, filter, extack)) >> >> + goto err; >> >> + >> >> + if (!filter) >> >> + return true; >> >> + >> >> + upper_vids = bitmap_zalloc(VLAN_N_VID, GFP_KERNEL); >> >> + if (!upper_vids) { >> >> + WARN_ON_ONCE(1); >> > >> > if (WARN_ON_ONCE(!upper_vids)) >> > >> >> + goto err; >> >> + } >> >> + >> >> + if (mod->upper_vid) >> >> + set_bit(mod->upper_vid, upper_vids); >> >> + >> >> + if (!dsa_8021q_uppers_are_coherent(dst, br, mod->bridge_upper, >> >> + upper_vids, extack)) >> >> + goto err_free; >> >> + >> >> + if (!dsa_bridge_vlans_are_coherent(br, mod->br_vid, upper_vids, extack)) >> >> + goto err_free; >> >> + >> >> + kfree(upper_vids); >> >> + return true; >> >> + >> >> +err_free: >> >> + kfree(upper_vids); >> >> +err: >> >> + return false; >> >> +} >> >> + >> >> /** >> >> * dsa_tree_notify - Execute code for all switches in a DSA switch tree. >> >> * @dst: collection of struct dsa_switch devices to notify. >> >> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h >> >> index 9d4b0e9b1aa1..8d8d307df437 100644 >> >> --- a/net/dsa/dsa_priv.h >> >> +++ b/net/dsa/dsa_priv.h >> >> @@ -361,6 +369,27 @@ int dsa_switch_register_notifier(struct dsa_switch *ds); >> >> void dsa_switch_unregister_notifier(struct dsa_switch *ds); >> >> >> >> /* dsa2.c */ >> >> + >> >> +/** >> >> + * struct dsa_bridge_mod - Modification of bridge related config >> >> + * @filter: If non-NULL, the new state of VLAN filtering. >> >> + * @br_vid: If non-zero, this VID will be added to the bridge. >> >> + * @upper_vid: If non-zero, a VLAN upper using this VID will be added to >> >> + * a bridge port. >> >> + * @bridge_upper: If non-NULL, a VLAN upper will be added to the bridge. >> > >> > I would name this "add_8021q_upper_to_bridge". Longer name, but clearer. >> >> It is not like it is a global variable or anything, there is plenty of >> context here I think. You know that you are describing a bridge related >> VLAN modification. > > Ok. > >> >> + * >> >> + * Describes a bridge related modification that is about to be applied. >> >> + */ >> >> +struct dsa_bridge_mod { >> >> + bool *filter; >> >> + u16 br_vid; >> >> + u16 upper_vid; >> >> + bool bridge_upper; >> >> +}; >> > >> > Frankly this is a bit ugly, but I have no better idea, and the structure >> > is good enough for describing a state transition. Fully describing the >> > state is a lot more difficult, due to the need to list all bridges which >> > may span a DSA switch tree. >> >> I am not sure what to make of this. Its job _is_ to describe a state >> transition. Why would we want to describe the state? The kernel already >> has the state, which is what dsa_bridge_is_coherent uses to figure out >> if the change can be applied or not. >> >> Is it sexy? No, I guess not. This type of code seldom is. The >> alternative would be to cram the info into the argument list, but that >> makes the wrappers harder to read and it makes it harder to extend when >> we want to validate another invariant. > > What I was thinking out loud about was whether it would be possible for > the validation to work as follows: > > - call a function that extracts the current DSA configuration > - modify that state description according to what you're doing, > transforming it into a candidate configuration > - call a function that validates the candidate configuration > - error out if the validation fails > > Your solution needs to extract some of the current DSA configuration > anyway, for the constellation of uppers, but then, the validation > function takes the state transition as an explicit argument, and > constructing the state is done as we go along. Ahh OK, I see what you are saying. Yes it would be nice to not have to emulate the different modifications everywhere but instead have them be baked in to the data you are operating on. Possible, but _lots_ of work as you say. > Again, I said I don't have a better idea than your proposal. > >> >> +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, >> >> + struct dsa_bridge_mod *mod, >> >> + struct netlink_ext_ack *extack); >> >> void dsa_lag_map(struct dsa_switch_tree *dst, struct net_device *lag); >> >> void dsa_lag_unmap(struct dsa_switch_tree *dst, struct net_device *lag); >> >> int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v); >> > (...) >> >> ? > > Don't worry, I just trimmed a chunk of the patch. > >> >> -static struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) >> >> +struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) >> >> { >> >> struct netdev_nested_priv priv = { >> >> .data = NULL, >> >> }; >> >> >> >> + if (dsa_slave_dev_check(dev)) >> >> + return netdev_priv(dev); >> >> + >> >> netdev_walk_all_lower_dev_rcu(dev, dsa_lower_dev_walk, &priv); >> >> >> >> return (struct dsa_slave_priv *)priv.data; >> > >> > Ah, so that's what you did there. I don't like it. If the function is >> > called "lower_find" and you come back with "dev" itself, I think that >> > would qualify as "unexpected". Could you create a new function called >> > dsa_slave_find_in_lowers_or_self, or something like that, which calls >> > dsa_slave_dev_lower_find with the extra identity check? >> >> My assumption was the opposite. Looking through the kernel, there seem >> to be three other logical equivalents: >> >> drivers/net/ethernet/marvell/prestera/prestera_main.c >> 495:struct prestera_port *prestera_port_dev_lower_find(struct net_device *dev) >> >> drivers/net/ethernet/mellanox/mlxsw/spectrum.c >> 3406:struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find(struct net_device *dev) >> >> drivers/net/ethernet/rocker/rocker_main.c >> 3090:struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev, >> >> All three will check the starting device before walking any lowers. > > Ok. > > > Actually, I think there is a bigger issue. > Sadly, I only put 2 and 2 together now, and I believe we are still not > dealing correctly with 8021q uppers of bridge ports with vlan_filtering 0. > The network stack's expectation is described here: > https://patchwork.kernel.org/project/netdevbpf/patch/20210221213355.1241450-12-olteanv@gmail.com/ So this is the setup Ido is describing, right? br1 \ .100 br0 \ / \ swp0 swp1 > Basically the problem is in the way DSA drives the configuration of the > hardware port. When VLAN filtering is disabled, VLAN awareness is disabled, > too, and we make no effort to classify packets according to the VLAN ID, > and take a different forwarding decision in hardware based on that. So > the traffic corresponding to the 8021q upper gets forwarded, flooded to > ports it shouldn't go to. I will try to put this in my own words to see if I understand: the problem with the setup above is the discrepancy between how the switch and how Linux would handle VID 100 tagged frames. In a s/swp/eth/-scenario, br0 would never see those frames since they would be snooped by swp0.100, whereas the switch will gleefully forward them to swp1. $PROFANITY > At the very least, we should deny: > - joining a bridge with vlan_filtering 0 while there is any 8021q upper > - toggling vlan_filtering off while there is any 8021q upper > - adding an 8021q upper while the port is under a vlan_filtering=0 bridge Right, so we have a new invariant: For VLAN unaware bridges, no bridge port may be a VLAN upper stacked on a device that we offload. Correct? > And for net-next, we should look at lifting that restriction, perhaps by > reacting to the above events by unoffloading the bridge port dynamically > (it's going to be so complicated that my head hurts already). +1
On Tue, Mar 16, 2021 at 11:40:13PM +0100, Tobias Waldekranz wrote: > > Actually, I think there is a bigger issue. > > Sadly, I only put 2 and 2 together now, and I believe we are still not > > dealing correctly with 8021q uppers of bridge ports with vlan_filtering 0. > > The network stack's expectation is described here: > > https://patchwork.kernel.org/project/netdevbpf/patch/20210221213355.1241450-12-olteanv@gmail.com/ > > So this is the setup Ido is describing, right? > > br1 > \ > .100 br0 > \ / \ > swp0 swp1 > > > Basically the problem is in the way DSA drives the configuration of the > > hardware port. When VLAN filtering is disabled, VLAN awareness is disabled, > > too, and we make no effort to classify packets according to the VLAN ID, > > and take a different forwarding decision in hardware based on that. So > > the traffic corresponding to the 8021q upper gets forwarded, flooded to > > ports it shouldn't go to. > > I will try to put this in my own words to see if I understand: the > problem with the setup above is the discrepancy between how the switch > and how Linux would handle VID 100 tagged frames. > > In a s/swp/eth/-scenario, br0 would never see those frames since they > would be snooped by swp0.100, whereas the switch will gleefully forward > them to swp1. Yes, pretty much, you can see with bridged veth pairs that this is exactly what happens, you can ping through their 8021q uppers even though you cannot ping through the veth interfaces themselves (and of course, you can ping through the bridge, but that is beside the point). > $PROFANITY +2
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index 4d4956ed303b..206e05985857 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -21,6 +21,186 @@ static DEFINE_MUTEX(dsa2_mutex); LIST_HEAD(dsa_tree_list); +static bool dsa_bridge_filtering_is_coherent(struct dsa_switch_tree *dst, bool filter, + struct netlink_ext_ack *extack) +{ + bool is_global = false; + struct dsa_port *dp; + + list_for_each_entry(dp, &dst->ports, list) { + if (dp->ds->vlan_filtering_is_global) { + is_global = true; + break; + } + } + + if (!is_global) + return true; + + /* For cases where enabling/disabling VLAN awareness is global + * to the switch, we need to handle the case where multiple + * bridges span different ports of the same switch device and + * one of them has a different setting than what is being + * requested. + */ + list_for_each_entry(dp, &dst->ports, list) { + if (!dp->bridge_dev) + continue; + + if (br_vlan_enabled(dp->bridge_dev) != filter) { + NL_SET_ERR_MSG_MOD(extack, + "VLAN filtering is not consistent across all bridges"); + return false; + } + } + + return true; +} + +static bool dsa_8021q_uppers_are_coherent(struct dsa_switch_tree *dst, + struct net_device *br, + bool seen_vlan_upper, + unsigned long *upper_vids, + struct netlink_ext_ack *extack) +{ + struct net_device *lower, *upper; + struct list_head *liter, *uiter; + struct dsa_slave_priv *priv; + bool seen_offloaded = false; + u16 vid; + + netdev_for_each_lower_dev(br, lower, liter) { + priv = dsa_slave_dev_lower_find(lower); + if (!priv || priv->dp->ds->dst != dst) + /* Ignore ports that are not related to us in + * any way. + */ + continue; + + if (is_vlan_dev(lower)) { + seen_vlan_upper = true; + continue; + } + + if (dsa_port_offloads_bridge(priv->dp, br) && + dsa_port_offloads_bridge_port(priv->dp, lower)) + seen_offloaded = true; + else + /* Non-offloaded uppers can to whatever they + * want. + */ + continue; + + netdev_for_each_upper_dev_rcu(lower, upper, uiter) { + if (!is_vlan_dev(upper)) + continue; + + vid = vlan_dev_vlan_id(upper); + if (!test_bit(vid, upper_vids)) { + set_bit(vid, upper_vids); + continue; + } + + NL_SET_ERR_MSG_MOD(extack, + "Multiple VLAN interfaces cannot use the same VID"); + return false; + } + } + + if (seen_offloaded && seen_vlan_upper) { + NL_SET_ERR_MSG_MOD(extack, + "VLAN interfaces cannot share bridge with offloaded port"); + return false; + } + + return true; +} + +static bool dsa_bridge_vlans_are_coherent(struct net_device *br, + u16 new_vid, unsigned long *upper_vids, + struct netlink_ext_ack *extack) +{ + u16 vid; + + if (new_vid && test_bit(new_vid, upper_vids)) + goto err; + + for_each_set_bit(vid, upper_vids, VLAN_N_VID) { + struct bridge_vlan_info br_info; + + if (br_vlan_get_info(br, vid, &br_info)) + /* Error means that the VID does not exist, + * which is what we want to ensure. + */ + continue; + + goto err; + } + + return true; + +err: + NL_SET_ERR_MSG_MOD(extack, "No bridge VID may be used on a related VLAN interface"); + return false; +} + +/** + * dsa_bridge_is_coherent - Verify that DSA tree accepts a bridge config. + * @dst: Tree to verify against. + * @br: Bridge netdev to verify. + * @mod: Description of the modification to introduce. + * @extack: Netlink extended ack for error reporting. + * + * Verify that the VLAN config of @br, its offloaded ports belonging + * to @dst and their VLAN uppers, can be correctly offloaded after + * introducing the change described by @mod. If this is not the case, + * an error is reported via @extack. + * + * Return: true if the config can be offloaded, false otherwise. + */ +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, + struct dsa_bridge_mod *mod, + struct netlink_ext_ack *extack) +{ + unsigned long *upper_vids = NULL; + bool filter; + + if (mod->filter) + filter = *mod->filter; + else + filter = br && br_vlan_enabled(br); + + if (!dsa_bridge_filtering_is_coherent(dst, filter, extack)) + goto err; + + if (!filter) + return true; + + upper_vids = bitmap_zalloc(VLAN_N_VID, GFP_KERNEL); + if (!upper_vids) { + WARN_ON_ONCE(1); + goto err; + } + + if (mod->upper_vid) + set_bit(mod->upper_vid, upper_vids); + + if (!dsa_8021q_uppers_are_coherent(dst, br, mod->bridge_upper, + upper_vids, extack)) + goto err_free; + + if (!dsa_bridge_vlans_are_coherent(br, mod->br_vid, upper_vids, extack)) + goto err_free; + + kfree(upper_vids); + return true; + +err_free: + kfree(upper_vids); +err: + return false; +} + /** * dsa_tree_notify - Execute code for all switches in a DSA switch tree. * @dst: collection of struct dsa_switch devices to notify. diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 9d4b0e9b1aa1..8d8d307df437 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -188,6 +188,13 @@ int dsa_port_lag_change(struct dsa_port *dp, int dsa_port_lag_join(struct dsa_port *dp, struct net_device *lag_dev, struct netdev_lag_upper_info *uinfo); void dsa_port_lag_leave(struct dsa_port *dp, struct net_device *lag_dev); +bool dsa_port_can_stack_vlan_upper(struct dsa_port *dp, u16 vid, + struct netlink_ext_ack *extack); +bool dsa_port_can_bridge_vlan_upper(struct dsa_port *dp, + struct net_device *upper_br, + struct netlink_ext_ack *extack); +bool dsa_port_can_add_bridge_vlan(struct dsa_port *dp, u16 vid, + struct netlink_ext_ack *extack); int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering, struct netlink_ext_ack *extack); bool dsa_port_skip_vlan_configuration(struct dsa_port *dp); @@ -271,6 +278,7 @@ static inline bool dsa_tree_offloads_bridge_port(struct dsa_switch_tree *dst, } /* slave.c */ +struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev); extern const struct dsa_device_ops notag_netdev_ops; void dsa_slave_mii_bus_init(struct dsa_switch *ds); int dsa_slave_create(struct dsa_port *dp); @@ -361,6 +369,27 @@ int dsa_switch_register_notifier(struct dsa_switch *ds); void dsa_switch_unregister_notifier(struct dsa_switch *ds); /* dsa2.c */ + +/** + * struct dsa_bridge_mod - Modification of bridge related config + * @filter: If non-NULL, the new state of VLAN filtering. + * @br_vid: If non-zero, this VID will be added to the bridge. + * @upper_vid: If non-zero, a VLAN upper using this VID will be added to + * a bridge port. + * @bridge_upper: If non-NULL, a VLAN upper will be added to the bridge. + * + * Describes a bridge related modification that is about to be applied. + */ +struct dsa_bridge_mod { + bool *filter; + u16 br_vid; + u16 upper_vid; + bool bridge_upper; +}; + +bool dsa_bridge_is_coherent(struct dsa_switch_tree *dst, struct net_device *br, + struct dsa_bridge_mod *mod, + struct netlink_ext_ack *extack); void dsa_lag_map(struct dsa_switch_tree *dst, struct net_device *lag); void dsa_lag_unmap(struct dsa_switch_tree *dst, struct net_device *lag); int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v); diff --git a/net/dsa/port.c b/net/dsa/port.c index c9c6d7ab3f47..76c79d2d80a2 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -292,72 +292,48 @@ void dsa_port_lag_leave(struct dsa_port *dp, struct net_device *lag) dsa_lag_unmap(dp->ds->dst, lag); } -/* Must be called under rcu_read_lock() */ -static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp, - bool vlan_filtering, - struct netlink_ext_ack *extack) +bool dsa_port_can_stack_vlan_upper(struct dsa_port *dp, u16 vid, + struct netlink_ext_ack *extack) { - struct dsa_switch *ds = dp->ds; - int err, i; + struct dsa_bridge_mod mod = { + .upper_vid = vid, + }; - /* VLAN awareness was off, so the question is "can we turn it on". - * We may have had 8021q uppers, those need to go. Make sure we don't - * enter an inconsistent state: deny changing the VLAN awareness state - * as long as we have 8021q uppers. - */ - if (vlan_filtering && dsa_is_user_port(ds, dp->index)) { - struct net_device *upper_dev, *slave = dp->slave; - struct net_device *br = dp->bridge_dev; - struct list_head *iter; - - netdev_for_each_upper_dev_rcu(slave, upper_dev, iter) { - struct bridge_vlan_info br_info; - u16 vid; - - if (!is_vlan_dev(upper_dev)) - continue; - - vid = vlan_dev_vlan_id(upper_dev); - - /* br_vlan_get_info() returns -EINVAL or -ENOENT if the - * device, respectively the VID is not found, returning - * 0 means success, which is a failure for us here. - */ - err = br_vlan_get_info(br, vid, &br_info); - if (err == 0) { - NL_SET_ERR_MSG_MOD(extack, - "Must first remove VLAN uppers having VIDs also present in bridge"); - return false; - } - } - } + return dsa_bridge_is_coherent(dp->ds->dst, dp->bridge_dev, &mod, + extack); +} - if (!ds->vlan_filtering_is_global) - return true; +bool dsa_port_can_bridge_vlan_upper(struct dsa_port *dp, + struct net_device *upper_br, + struct netlink_ext_ack *extack) +{ + struct dsa_bridge_mod mod = { + .bridge_upper = true, + }; - /* For cases where enabling/disabling VLAN awareness is global to the - * switch, we need to handle the case where multiple bridges span - * different ports of the same switch device and one of them has a - * different setting than what is being requested. - */ - for (i = 0; i < ds->num_ports; i++) { - struct net_device *other_bridge; - - other_bridge = dsa_to_port(ds, i)->bridge_dev; - if (!other_bridge) - continue; - /* If it's the same bridge, it also has same - * vlan_filtering setting => no need to check - */ - if (other_bridge == dp->bridge_dev) - continue; - if (br_vlan_enabled(other_bridge) != vlan_filtering) { - NL_SET_ERR_MSG_MOD(extack, - "VLAN filtering is a global setting"); - return false; - } - } - return true; + return dsa_bridge_is_coherent(dp->ds->dst, upper_br, &mod, extack); +} + +bool dsa_port_can_add_bridge_vlan(struct dsa_port *dp, u16 vid, + struct netlink_ext_ack *extack) +{ + struct dsa_bridge_mod mod = { + .br_vid = vid, + }; + + return dsa_bridge_is_coherent(dp->ds->dst, dp->bridge_dev, &mod, + extack); +} + +static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp, bool filter, + struct netlink_ext_ack *extack) +{ + struct dsa_bridge_mod mod = { + .filter = &filter, + }; + + return dsa_bridge_is_coherent(dp->ds->dst, dp->bridge_dev, &mod, + extack); } int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering, diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 992fcab4b552..04f75b0ae297 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -325,28 +325,6 @@ static int dsa_slave_port_attr_set(struct net_device *dev, return ret; } -/* Must be called under rcu_read_lock() */ -static int -dsa_slave_vlan_check_for_8021q_uppers(struct net_device *slave, - const struct switchdev_obj_port_vlan *vlan) -{ - struct net_device *upper_dev; - struct list_head *iter; - - netdev_for_each_upper_dev_rcu(slave, upper_dev, iter) { - u16 vid; - - if (!is_vlan_dev(upper_dev)) - continue; - - vid = vlan_dev_vlan_id(upper_dev); - if (vid == vlan->vid) - return -EBUSY; - } - - return 0; -} - static int dsa_slave_vlan_add(struct net_device *dev, const struct switchdev_obj *obj, struct netlink_ext_ack *extack) @@ -363,19 +341,8 @@ static int dsa_slave_vlan_add(struct net_device *dev, vlan = *SWITCHDEV_OBJ_PORT_VLAN(obj); - /* Deny adding a bridge VLAN when there is already an 802.1Q upper with - * the same VID. - */ - if (br_vlan_enabled(dp->bridge_dev)) { - rcu_read_lock(); - err = dsa_slave_vlan_check_for_8021q_uppers(dev, &vlan); - rcu_read_unlock(); - if (err) { - NL_SET_ERR_MSG_MOD(extack, - "Port already has a VLAN upper with this VID"); - return err; - } - } + if (!dsa_port_can_add_bridge_vlan(dp, vlan.vid, extack)) + return -EBUSY; err = dsa_port_vlan_add(dp, &vlan, extack); if (err) @@ -2050,30 +2017,20 @@ static int dsa_prevent_bridging_8021q_upper(struct net_device *dev, struct netdev_notifier_changeupper_info *info) { - struct netlink_ext_ack *ext_ack; - struct net_device *slave; - struct dsa_port *dp; - - ext_ack = netdev_notifier_info_to_extack(&info->info); + struct netlink_ext_ack *extack; + struct dsa_slave_priv *p; - if (!is_vlan_dev(dev)) - return NOTIFY_DONE; + extack = netdev_notifier_info_to_extack(&info->info); - slave = vlan_dev_real_dev(dev); - if (!dsa_slave_dev_check(slave)) + p = dsa_slave_dev_lower_find(vlan_dev_real_dev(dev)); + if (!p) return NOTIFY_DONE; - dp = dsa_slave_to_port(slave); - if (!dp->bridge_dev) + if (!netif_is_bridge_master(info->upper_dev) || !info->linking) return NOTIFY_DONE; - /* Deny enslaving a VLAN device into a VLAN-aware bridge */ - if (br_vlan_enabled(dp->bridge_dev) && - netif_is_bridge_master(info->upper_dev) && info->linking) { - NL_SET_ERR_MSG_MOD(ext_ack, - "Cannot enslave VLAN device into VLAN aware bridge"); + if (!dsa_port_can_bridge_vlan_upper(p->dp, info->upper_dev, extack)) return notifier_from_errno(-EINVAL); - } return NOTIFY_DONE; } @@ -2082,29 +2039,20 @@ static int dsa_slave_check_8021q_upper(struct net_device *dev, struct netdev_notifier_changeupper_info *info) { - struct dsa_port *dp = dsa_slave_to_port(dev); - struct net_device *br = dp->bridge_dev; - struct bridge_vlan_info br_info; struct netlink_ext_ack *extack; - int err = NOTIFY_DONE; + struct dsa_slave_priv *p; u16 vid; - if (!br || !br_vlan_enabled(br)) + extack = netdev_notifier_info_to_extack(&info->info); + + p = dsa_slave_dev_lower_find(dev); + if (!p) return NOTIFY_DONE; - extack = netdev_notifier_info_to_extack(&info->info); vid = vlan_dev_vlan_id(info->upper_dev); - /* br_vlan_get_info() returns -EINVAL or -ENOENT if the - * device, respectively the VID is not found, returning - * 0 means success, which is a failure for us here. - */ - err = br_vlan_get_info(br, vid, &br_info); - if (err == 0) { - NL_SET_ERR_MSG_MOD(extack, - "This VLAN is already configured by the bridge"); + if (!dsa_port_can_stack_vlan_upper(p->dp, vid, extack)) return notifier_from_errno(-EBUSY); - } return NOTIFY_DONE; } @@ -2121,8 +2069,16 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb, struct dsa_port *dp; int err; + if (is_vlan_dev(dev)) + err = dsa_prevent_bridging_8021q_upper(dev, ptr); + else if (is_vlan_dev(info->upper_dev)) + err = dsa_slave_check_8021q_upper(dev, ptr); + + if (err) + return err; + if (!dsa_slave_dev_check(dev)) - return dsa_prevent_bridging_8021q_upper(dev, ptr); + return NOTIFY_DONE; dp = dsa_slave_to_port(dev); ds = dp->ds; @@ -2132,9 +2088,6 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb, if (err) return notifier_from_errno(err); } - - if (is_vlan_dev(info->upper_dev)) - return dsa_slave_check_8021q_upper(dev, ptr); break; } case NETDEV_CHANGEUPPER: @@ -2260,12 +2213,15 @@ static int dsa_lower_dev_walk(struct net_device *lower_dev, return 0; } -static struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) +struct dsa_slave_priv *dsa_slave_dev_lower_find(struct net_device *dev) { struct netdev_nested_priv priv = { .data = NULL, }; + if (dsa_slave_dev_check(dev)) + return netdev_priv(dev); + netdev_walk_all_lower_dev_rcu(dev, dsa_lower_dev_walk, &priv); return (struct dsa_slave_priv *)priv.data;
There are four kinds of events that have an inpact on VLAN configuration of DSA ports: - Adding VLAN uppers (ip link add dev swp0.1 link swp0 type vlan id 1) - Bridging VLAN uppers (ip link set dev swp0.1 master br0) - Adding bridge VLANs (bridge vlan add dev swp0 vid 1) - Changes to a bridge's VLAN filtering setting (ip link set dev br0 type bridge vlan_filtering 1) For all of these events, we want to ensure that some invariants are upheld for offloaded ports belonging to our tree: - For hardware where VLAN filtering is a global setting, either all bridges must use VLAN filtering, or no bridge can. - For all filtering bridges, no VID may be configured on more than one VLAN upper. An example of a violation of this would be: .100 br0 .100 \ / \ / swp0 swp1 $ ip link add dev br0 type bridge vlan_filtering 1 $ ip link add dev swp0.100 link swp0 type vlan id 100 $ ip link set dev swp0 master br0 $ ip link add dev swp1.100 link swp0 type vlan id 100 $ ip link set dev swp1 master br0 - For all filtering bridges, no upper VLAN may share a bridge with another offloaded port. An example of a violation of this would be: br0 / | / | .100 | | | swp0 swp1 $ ip link add dev br0 type bridge vlan_filtering 1 $ ip link add dev swp0.100 link swp0 type vlan id 100 $ ip link set dev swp0.100 master br0 $ ip link set dev swp1 master br0 - For all filtering bridges, no VID that exists in the bridge may be used by a VLAN upper. An example of a violation of this would be: br0 (100) | .100 | \ | swp0 $ ip link add dev br0 type bridge vlan_filtering 1 $ ip link add dev swp0.100 link swp0 type vlan id 100 $ ip link set dev swp0 master br0 $ bridge vlan add dev swp0 vid 100 Move the validation of these invariants to a central function, and use it from all sites where these events are handled. This way, we ensure that all invariants are always checked, avoiding certain configs being allowed or disallowed depending on the order in which commands are given. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> --- I realize that this perhaps is a bit sprawling, but it is not obvious to me how it could be split up. Happy to take suggestions. I have tested this against Vladimir's comprehensive test suite posted here: https://lore.kernel.org/netdev/20210309220119.t24sdc7cqqfxhpfb@skbuf/ With this patch applied, all cases pass both for regular ports and hardware offloaded LAGs on an mv88e6xxx device. I have made an effort to not use any LAG-specific knowledge, so this should also work for other offloaded uppers in the future (like HSR rings for example). net/dsa/dsa2.c | 180 +++++++++++++++++++++++++++++++++++++++++++++ net/dsa/dsa_priv.h | 29 ++++++++ net/dsa/port.c | 100 ++++++++++--------------- net/dsa/slave.c | 100 +++++++------------------ 4 files changed, 275 insertions(+), 134 deletions(-)