Message ID | 20250507142110.3452012-9-adhemerval.zanella@linaro.org |
---|---|
State | New |
Headers | show |
Series | Add initial support for --enable-ubsan | expand |
* Adhemerval Zanella: > The ubsan triggers: > > UBSAN: Undefined behaviour in ./elem-hash.h:27:14 left shift of 360447856 by 3 cannot be represented in type 'int' > > Using unsigned shift here zero fill like signed. > --- > locale/elem-hash.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/locale/elem-hash.h b/locale/elem-hash.h > index 5d34ec43f3..8052b0fe6d 100644 > --- a/locale/elem-hash.h > +++ b/locale/elem-hash.h > @@ -24,7 +24,7 @@ elem_hash (const char *str, int32_t n) > > while (n-- > 0) > { > - result <<= 3; > + result = (uint32_t)result << 3; > result += *str++; > } Wow, this hash function is really bad with signed chars. Is this part of the ABI? Can we chang its definition without changing localedef output files? Thanks, Florian
On Mai 07 2025, Adhemerval Zanella wrote: > diff --git a/locale/elem-hash.h b/locale/elem-hash.h > index 5d34ec43f3..8052b0fe6d 100644 > --- a/locale/elem-hash.h > +++ b/locale/elem-hash.h > @@ -24,7 +24,7 @@ elem_hash (const char *str, int32_t n) > > while (n-- > 0) > { > - result <<= 3; > + result = (uint32_t)result << 3; > result += *str++; > } The only use of elem_hash assigns the value to a uint32_t, so it would be better to adjust the return type of the function.
On Mai 20 2025, Florian Weimer wrote: > Is this part of the ABI? Can we chang its definition without changing > localedef output files? There are only two uses of the _NL_COLLATE_SYMB_TABLEMB element of LC_COLLATE (posix/fnmatch_loop.c, and seek_collating_symbol_entry in posix/regcomp.c), and both just look for a non-zero hash, but ignore it otherwise. I don't think the particular definition of the hash function is part of the ABI.
diff --git a/locale/elem-hash.h b/locale/elem-hash.h index 5d34ec43f3..8052b0fe6d 100644 --- a/locale/elem-hash.h +++ b/locale/elem-hash.h @@ -24,7 +24,7 @@ elem_hash (const char *str, int32_t n) while (n-- > 0) { - result <<= 3; + result = (uint32_t)result << 3; result += *str++; }