Message ID | 20210309124848.238327-1-elder@linaro.org |
---|---|
Headers | show |
Series | net: qualcomm: rmnet: stop using C bit-fields | expand |
On 3/9/21 6:48 AM, Alex Elder wrote: > Version 3 of this series uses BIT() rather than GENMASK() to define > single-bit masks. It then uses a simple AND (&) operation rather > than (e.g.) u8_get_bits() to access such flags. This was suggested > by David Laight and really prefer the result. With Bjorn's > permission I have preserved his Reviewed-by tags on the first five > patches. Nice as all this looks, it doesn't *work*. I did some very basic testing before sending out version 3, but not enough. (More on the problem, below). --> I retract this series <-- I will send out an update (version 4). But I won't be doing it for a few more days. The problem is that the BIT() flags are defined in host byte order. But the values they're compared against are not always (or perhaps, never) in host byte order. I regret the error, and will do a complete set of testing on version 4 before sending it out for review. -Alex > Version 2 fixed bugs in the way the value written into the header > was computed. > > The series was first posted here: > https://lore.kernel.org/netdev/20210304223431.15045-1-elder@linaro.org/ > Below is a summary of the original description. > > This series converts data structures defined in <linux/if_rmnet.h> > so they use integral field values with bitfield masks rather than > relying on C bit-fields. > - The first three patches lay the ground work for the others. > - The first adds endianness notation to a structure. > - The second simplifies a bit of complicated code. > - The third open-codes some macros that needlessly > obscured some simple code. > - Each of the last three patches converts one of the structures > defined in <linux/if_rmnet.h> so it no longer uses C bit-fields. > > -Alex > > Alex Elder (6): > net: qualcomm: rmnet: mark trailer field endianness > net: qualcomm: rmnet: simplify some byte order logic > net: qualcomm: rmnet: kill RMNET_MAP_GET_*() accessor macros > net: qualcomm: rmnet: use field masks instead of C bit-fields > net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum trailer > net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum header > > .../ethernet/qualcomm/rmnet/rmnet_handlers.c | 11 ++-- > .../net/ethernet/qualcomm/rmnet/rmnet_map.h | 12 ---- > .../qualcomm/rmnet/rmnet_map_command.c | 11 +++- > .../ethernet/qualcomm/rmnet/rmnet_map_data.c | 60 ++++++++--------- > include/linux/if_rmnet.h | 65 +++++++++---------- > 5 files changed, 70 insertions(+), 89 deletions(-) >
Hi Alex, On Tue, Mar 09, 2021 at 05:39:20PM -0600, Alex Elder wrote: > On 3/9/21 6:48 AM, Alex Elder wrote: > > Version 3 of this series uses BIT() rather than GENMASK() to define > > single-bit masks. It then uses a simple AND (&) operation rather > > than (e.g.) u8_get_bits() to access such flags. This was suggested > > by David Laight and really prefer the result. With Bjorn's > > permission I have preserved his Reviewed-by tags on the first five > > patches. > > Nice as all this looks, it doesn't *work*. I did some very basic > testing before sending out version 3, but not enough. (More on > the problem, below). > > --> I retract this series <-- > > I will send out an update (version 4). But I won't be doing it > for a few more days. > > The problem is that the BIT() flags are defined in host byte > order. But the values they're compared against are not always > (or perhaps, never) in host byte order. > > I regret the error, and will do a complete set of testing on > version 4 before sending it out for review. I think I understand some of your pain. I had a similar situation trying to write a driver for hardware with very strange bitfield organization, and my top priority was actually maintaining a set of bitfield definitions that could be taken directly from the user manual of said piece of hardware (and similar to you, I dislike C bitfields). What I came up with was an entirely new API called packing() which is described here: https://www.kernel.org/doc/html/latest/core-api/packing.html It doesn't have any users except code added by me (some in Ethernet fast paths), and it has some limitations (mainly that it only has support for u64 CPU words), but on the other hand, it's easy to understand, easy to use, supports any bit/byte layout under the sun, doesn't suffer from unaligned memory access issues due to its byte-by-byte approach, and is completely independent of host endianness. That said, I'm not completely happy with it because it has slightly higher overhead compared to typical bitfield accessors. I've been on the fence about even deleting it, considering that it's been two years since it's in mainline and it hasn't gained much of a traction. So I would rather try to work my way around a different API in the sja1105 driver. Have you noticed this API and decided to not use it for whatever reason? Could you let me know what that was? Even better, in your quest to fix the rmnet driver, have you seen any API that is capable of extracting a bitfield that spans two 64-bit halves of an 128 bit word in a custom bit layout?
On 3/9/21 6:27 PM, Vladimir Oltean wrote: > Hi Alex, > > On Tue, Mar 09, 2021 at 05:39:20PM -0600, Alex Elder wrote: >> On 3/9/21 6:48 AM, Alex Elder wrote: >>> Version 3 of this series uses BIT() rather than GENMASK() to define >>> single-bit masks. It then uses a simple AND (&) operation rather >>> than (e.g.) u8_get_bits() to access such flags. This was suggested >>> by David Laight and really prefer the result. With Bjorn's >>> permission I have preserved his Reviewed-by tags on the first five >>> patches. >> >> Nice as all this looks, it doesn't *work*. I did some very basic >> testing before sending out version 3, but not enough. (More on >> the problem, below). >> >> --> I retract this series <-- >> >> I will send out an update (version 4). But I won't be doing it >> for a few more days. >> >> The problem is that the BIT() flags are defined in host byte >> order. But the values they're compared against are not always >> (or perhaps, never) in host byte order. >> >> I regret the error, and will do a complete set of testing on >> version 4 before sending it out for review. > > I think I understand some of your pain. I had a similar situation trying I appreciate your saying that. In this case these were mistakes of my own making, but it has been surprisingly tricky to make sense of exactly how bits are laid out under various scenarios, old and new. > to write a driver for hardware with very strange bitfield organization, > and my top priority was actually maintaining a set of bitfield definitions > that could be taken directly from the user manual of said piece of > hardware (and similar to you, I dislike C bitfields). What I came up That seems like a reasonable thing to do. The conventional layout of big endian bytes and bits in network documentation is great, but it is also different from other conventions used elsewhere, and sometimes that too can lead to confusion. For example, network specs list tend to show groups of 4 bytes in a row with high-order byte *and bit* numbered 0: high order byte | 0 | 1 | 2 | 3 | |0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15| ... | ... |31| ^-- high bit While SCSI shows 1 byte rows, high-order bit numbered 7: Byte| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | 0 | ^-- high bit | 1 | > with was an entirely new API called packing() which is described here: > https://www.kernel.org/doc/html/latest/core-api/packing.html I was not aware of that. And I have now looked at it and have a few comments, below. > It doesn't have any users except code added by me (some in Ethernet fast > paths), and it has some limitations (mainly that it only has support for > u64 CPU words), but on the other hand, it's easy to understand, easy to > use, supports any bit/byte layout under the sun, doesn't suffer from > unaligned memory access issues due to its byte-by-byte approach, and is > completely independent of host endianness. > That said, I'm not completely happy with it because it has slightly > higher overhead compared to typical bitfield accessors. I've been on the > fence about even deleting it, considering that it's been two years since > it's in mainline and it hasn't gained much of a traction. So I would > rather try to work my way around a different API in the sja1105 driver. > > Have you noticed this API and decided to not use it for whatever reason? I had not noticed it before you brought it to my attention. > Could you let me know what that was? Even better, in your quest to fix > the rmnet driver, have you seen any API that is capable of extracting a > bitfield that spans two 64-bit halves of an 128 bit word in a custom bit > layout? I have not seen an interface that attempts to handle things that span multiple 64-bit units of memory. Your document uses big endian 64-bit word (not u64) as its "no quirks" example (listing high-order bytes first, and numbering bytes--and bits--starting at higher numbers). My only objection to that is that you should probably call it __be64. Otherwise I'm just pointing out that it is a convention, and as pointed out above, it isn't universal. For my purposes, all registers are 32 bits. So to use your packing() interface I would have to implement 32-bit versions of the function. But that's not really a big deal. I do see what you're doing, defining exactly what you want to do in the arguments passed to packing(), and allowing the one function to both encode and extract values in a 64-bit register regardless of endianness. Having looked at this, though, I prefer the functions and macros defined in <linux/bitfield.h>. Probably the biggest reason is a bias I have about the fact that a single mask value represents both the position and width of a field within a register. I independently "invented/discovered" this and implemented it for my own use, only to learn later it had already been done. So I switched to the functions that were already upstream. The "bitfield.h" functions *require* the field mask to be a runtime constant. That is great, because it means that determining the width and position of fields from the mask can be done at compile time, and can be done as efficiently as possible. The downside is that it would occasionally be nice to be able to supply a variable mask value. I've had to go to some lengths in the IPA driver to get a result that compiles in some cases. But because you asked, I'll list some other reasons why I prefer those functions over your packing() function. - As mentioned, a single mask defines both the position and size of a field. It is specified in "natural" host byte order, but provides options for encoding or extracting values for big- or little-endian objects as well. - I prefer to have the size of the integral types (and the operations on them) to be explicit. I.e., yes, "int" is supposed to be the most efficient type on a machine, but in working with the hardware I like declaring exactly what size object I'm dealing with. The "bitfield.h" functions have that, for every standard word size. - I think that your one function, although convenient, is too general-purpose. I'd rather *at least* have one function for encoding values and a second for decoding. - In contrast, the "bitfield.h" functions each do one thing--either encode or extract, on an object of a known size and endianness. The arguments passed are minimal but sufficient to perform each operation. - I prefer having the encoding function *return* the encoded value rather than passing the address of an object to be updated. - Although I recognize exactly the problem it solves, the QUIRKS flags you have aren't a lot more intuitive to me than the __LITTLE_ENDIAN_BITFIELD symbol is. At least the names you use offer a little more clarity. None of these are strong criticisms. In fact I like that you created a function/system to solve this problem. But you requested some reasoning, so I wanted to offer some. In this particular RMNet series, I followed conventions I used for the IPA driver. In that case, there are many registers with fields, so it was important to follow some consistent patterns. For this case, not nearly as much generality is needed, so I could probably get away with simpler masks, possibly without even using the u16_encode_bits() functions. Not sure what I'll settle on but I plan to post version 4 of the series early next week. Thanks for your note. I do appreciate having someone say "hey, I've been there." And I'm glad to now be aware of the packing() function. -Alex