diff mbox series

[06/11] locale: Fix UB on insert_weights

Message ID 20250507142110.3452012-7-adhemerval.zanella@linaro.org
State New
Headers show
Series Add initial support for --enable-ubsan | expand

Commit Message

Adhemerval Zanella May 7, 2025, 2:17 p.m. UTC
The ubsan triggers:

UBSAN: Undefined behaviour in programs/ld-collate.c:862:5 null pointer passed as argument 2, nonnull attribute declared at unknown:0:0,

The memcpy is only requires if current 'weights' is nonnull, so
check it before calling it.
---
 locale/programs/ld-collate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Florian Weimer May 20, 2025, 12:45 p.m. UTC | #1
* Adhemerval Zanella:

> The ubsan triggers:
>
> UBSAN: Undefined behaviour in programs/ld-collate.c:862:5 null pointer passed as argument 2, nonnull attribute declared at unknown:0:0,
>
> The memcpy is only requires if current 'weights' is nonnull, so
> check it before calling it.
> ---
>  locale/programs/ld-collate.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
> index 7de3ba064d..4fa08bd273 100644
> --- a/locale/programs/ld-collate.c
> +++ b/locale/programs/ld-collate.c
> @@ -858,7 +858,8 @@ insert_weights (struct linereader *ldfile, struct element_t *elem,
>  		  max += 10;
>  		  newp = (struct element_t **)
>  		    alloca (max * sizeof (struct element_t *));
> -		  memcpy (newp, weights, cnt * sizeof (struct element_t *));
> +		  if (weights != NULL)
> +		    memcpy (newp, weights, cnt * sizeof (struct element_t *));
>  		  weights = newp;
>  		}
>  	      weights[cnt++] = charelem;

I don't think we should make this change, considering that memcpy is
going to accept null arguments for length zero soon.

Thanks,
Florian
Adhemerval Zanella May 28, 2025, 7:24 p.m. UTC | #2
On 20/05/25 09:45, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> The ubsan triggers:
>>
>> UBSAN: Undefined behaviour in programs/ld-collate.c:862:5 null pointer passed as argument 2, nonnull attribute declared at unknown:0:0,
>>
>> The memcpy is only requires if current 'weights' is nonnull, so
>> check it before calling it.
>> ---
>>  locale/programs/ld-collate.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
>> index 7de3ba064d..4fa08bd273 100644
>> --- a/locale/programs/ld-collate.c
>> +++ b/locale/programs/ld-collate.c
>> @@ -858,7 +858,8 @@ insert_weights (struct linereader *ldfile, struct element_t *elem,
>>  		  max += 10;
>>  		  newp = (struct element_t **)
>>  		    alloca (max * sizeof (struct element_t *));
>> -		  memcpy (newp, weights, cnt * sizeof (struct element_t *));
>> +		  if (weights != NULL)
>> +		    memcpy (newp, weights, cnt * sizeof (struct element_t *));
>>  		  weights = newp;
>>  		}
>>  	      weights[cnt++] = charelem;
> 
> I don't think we should make this change, considering that memcpy is
> going to accept null arguments for length zero soon.

Yes, but for -std=gnu11 is still UB and that is what -fsanitize=undefined
it catching here.  I think we might change it on the future when/if we
bump the standard to use it.

Another option would to just enable it for ENABLE_UBSAN, but I think to
have --enable-ubsan as a possible option to CI or for development, I 
think we should have as minimal as possible issues to better regression
evaluation.
Florian Weimer June 2, 2025, 10:47 a.m. UTC | #3
* Adhemerval Zanella Netto:

> On 20/05/25 09:45, Florian Weimer wrote:
>> * Adhemerval Zanella:
>> 
>>> The ubsan triggers:
>>>
>>> UBSAN: Undefined behaviour in programs/ld-collate.c:862:5 null pointer passed as argument 2, nonnull attribute declared at unknown:0:0,
>>>
>>> The memcpy is only requires if current 'weights' is nonnull, so
>>> check it before calling it.
>>> ---
>>>  locale/programs/ld-collate.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
>>> index 7de3ba064d..4fa08bd273 100644
>>> --- a/locale/programs/ld-collate.c
>>> +++ b/locale/programs/ld-collate.c
>>> @@ -858,7 +858,8 @@ insert_weights (struct linereader *ldfile, struct element_t *elem,
>>>  		  max += 10;
>>>  		  newp = (struct element_t **)
>>>  		    alloca (max * sizeof (struct element_t *));
>>> -		  memcpy (newp, weights, cnt * sizeof (struct element_t *));
>>> +		  if (weights != NULL)
>>> +		    memcpy (newp, weights, cnt * sizeof (struct element_t *));
>>>  		  weights = newp;
>>>  		}
>>>  	      weights[cnt++] = charelem;
>> 
>> I don't think we should make this change, considering that memcpy is
>> going to accept null arguments for length zero soon.
>
> Yes, but for -std=gnu11 is still UB and that is what -fsanitize=undefined
> it catching here.  I think we might change it on the future when/if we
> bump the standard to use it.
>
> Another option would to just enable it for ENABLE_UBSAN, but I think to
> have --enable-ubsan as a possible option to CI or for development, I 
> think we should have as minimal as possible issues to better regression
> evaluation.

What happens if we drop the nonnull annotation of the prototype?  Will
this change UBSAN behavior today?

Thanks,
Florian
Collin Funk June 2, 2025, 4:33 p.m. UTC | #4
Florian Weimer <fweimer@redhat.com> writes:

>> Yes, but for -std=gnu11 is still UB and that is what -fsanitize=undefined
>> it catching here.  I think we might change it on the future when/if we
>> bump the standard to use it.
>>
>> Another option would to just enable it for ENABLE_UBSAN, but I think to
>> have --enable-ubsan as a possible option to CI or for development, I 
>> think we should have as minimal as possible issues to better regression
>> evaluation.
>
> What happens if we drop the nonnull annotation of the prototype?  Will
> this change UBSAN behavior today?

I thought about making this change too, but it felt a bit pre-mature
since the behavior is not in a current released standard.

If it fixes UBSAN though, I am for the change.

Collin
diff mbox series

Patch

diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
index 7de3ba064d..4fa08bd273 100644
--- a/locale/programs/ld-collate.c
+++ b/locale/programs/ld-collate.c
@@ -858,7 +858,8 @@  insert_weights (struct linereader *ldfile, struct element_t *elem,
 		  max += 10;
 		  newp = (struct element_t **)
 		    alloca (max * sizeof (struct element_t *));
-		  memcpy (newp, weights, cnt * sizeof (struct element_t *));
+		  if (weights != NULL)
+		    memcpy (newp, weights, cnt * sizeof (struct element_t *));
 		  weights = newp;
 		}
 	      weights[cnt++] = charelem;