
usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed

Message ID 20250328161823.2240125-1-fisaksen@baylibre.com
State Superseded
Series usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed

Commit Message

Frode Isaksen March 28, 2025, 4:17 p.m. UTC
From: Frode Isaksen <frode@meta.com>

Invalidate io_data by setting req->context to NULL when the USB request
is dequeued or completed, and check for a NULL io_data in
ffs_epfile_io_complete().
req->context is invalidated on exit from ffs_epfile_io(), because
io_data is allocated on the stack and becomes invalid at that point.
ffs_epfile_io_complete() may be called after ffs_epfile_io() returns
if wait_for_completion_interruptible() is interrupted.
This fixes a use-after-free error with the following call stack:

Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
pc : ffs_epfile_io_complete+0x30/0x48
lr : usb_gadget_giveback_request+0x30/0xf8
Call trace:
ffs_epfile_io_complete+0x30/0x48
usb_gadget_giveback_request+0x30/0xf8
dwc3_remove_requests+0x264/0x2e8
dwc3_gadget_pullup+0x1d0/0x250
kretprobe_trampoline+0x0/0xc4
usb_gadget_remove_driver+0x40/0xf4
usb_gadget_unregister_driver+0xdc/0x178
unregister_gadget_item+0x40/0x6c
ffs_closed+0xd4/0x10c
ffs_data_clear+0x2c/0xf0
ffs_data_closed+0x178/0x1ec
ffs_ep0_release+0x24/0x38
__fput+0xe8/0x27c

Signed-off-by: Frode Isaksen <frode@meta.com>
---
This bug was discovered, fixed and tested (no more crashes seen) on a Meta Quest 3 device.
Also tested on a TI AM62x board.

 drivers/usb/gadget/function/f_fs.c | 5 +++++
 1 file changed, 5 insertions(+)
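
The lifetime problem described in the commit message can be modelled in
userspace. The sketch below is an illustration only and is not the f_fs.c
code: struct model_request, struct model_io_data, model_io() and
model_io_complete() are hypothetical stand-ins for struct usb_request,
struct ffs_io_data, ffs_epfile_io() and ffs_epfile_io_complete(). It shows
the idea of the patch, which is to clear the context pointer before the
stack object goes out of scope so that a late completion finds NULL rather
than a dangling pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ffs_io_data. */
struct model_io_data {
	int status;
};

/* Hypothetical stand-in for struct usb_request. */
struct model_request {
	void *context;	/* submitter's io_data, or NULL once invalidated */
	int status;	/* result reported by the (modelled) controller */
};

/*
 * Models the completion callback. Returns 1 if io_data was updated and
 * 0 if the request had already been invalidated.
 */
static int model_io_complete(struct model_request *req)
{
	struct model_io_data *io_data = req->context;

	if (io_data == NULL)
		return 0;	/* submitter has left: nothing to touch */

	io_data->status = req->status;
	return 1;
}

/*
 * Models a submitter whose io_data lives on its own stack. When
 * "interrupted" is non-zero, no completion runs before it returns,
 * so the completion may arrive late.
 */
static int model_io(struct model_request *req, int xfer_len, int interrupted)
{
	struct model_io_data io_data = { .status = 0 };
	int ret;

	assert(req != NULL);
	req->context = &io_data;
	req->status = xfer_len;

	if (!interrupted)
		model_io_complete(req);

	ret = interrupted ? -EINTR : io_data.status;

	/* io_data is about to go out of scope: drop the pointer to it. */
	req->context = NULL;
	return ret;
}
```

After an interrupted model_io(), a late call to model_io_complete()
returns 0 without dereferencing anything. The model is single-threaded,
so it says nothing about a completion that runs concurrently with the
submitter.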

Comments

Greg KH March 28, 2025, 9:02 p.m. UTC | #1
On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
> From: Frode Isaksen <frode@meta.com>
> 
> Invalidate io_data by setting context to NULL when USB request is
> dequeued or completed, and check for NULL io_data in epfile_io_complete().
> The invalidation of io_data in req->context is done when exiting
> epfile_io(), since then io_data will become invalid as it is allocated
> on the stack.
> The epfile_io_complete() may be called after ffs_epfile_io() returns
> in case the wait_for_completion_interruptible() is interrupted.
> This fixes a use-after-free error with the following call stack:
> 
> Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
> pc : ffs_epfile_io_complete+0x30/0x48
> lr : usb_gadget_giveback_request+0x30/0xf8
> Call trace:
> ffs_epfile_io_complete+0x30/0x48
> usb_gadget_giveback_request+0x30/0xf8
> dwc3_remove_requests+0x264/0x2e8
> dwc3_gadget_pullup+0x1d0/0x250
> kretprobe_trampoline+0x0/0xc4
> usb_gadget_remove_driver+0x40/0xf4
> usb_gadget_unregister_driver+0xdc/0x178
> unregister_gadget_item+0x40/0x6c
> ffs_closed+0xd4/0x10c
> ffs_data_clear+0x2c/0xf0
> ffs_data_closed+0x178/0x1ec
> ffs_ep0_release+0x24/0x38
> __fput+0xe8/0x27c
> 
> Signed-off-by: Frode Isaksen <frode@meta.com>
> ---
> This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
> Also tested on T.I. AM62x board.

What commit id does this fix?  Should it go to stable?

> 
>  drivers/usb/gadget/function/f_fs.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> index 2dea9e42a0f8..f1be0a5c0bd0 100644
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
>  {
>  	struct ffs_io_data *io_data = req->context;
>  
> +	if (WARN_ON(io_data == NULL))
> +		return;

If this happens you just crashed the box (remember about panic-on-warn,
which is still set in a few billion Linux systems these days...)

Just handle the issue properly, no need to dump the stack and crash a
device.

But, what keeps io_data from changing after you have checked it?  Where
is the lock here?

thanks,

greg k-h
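
The panic-on-warn objection in the review above can be illustrated with a
small userspace model. Everything here is hypothetical: model_warn_on() is
a simplified imitation of the kernel's WARN_ON(), model_panic_on_warn
stands in for the panic_on_warn setting, and a counter replaces the real
panic() call. The point is only that a check on a path that can
legitimately be reached should not go through the warning machinery.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static bool model_panic_on_warn;	/* models the panic_on_warn setting */
static int model_panics;		/* how often the model "panicked" */

/* Simplified model of WARN_ON(): report, maybe "panic", return cond. */
static bool model_warn_on(bool cond)
{
	if (cond) {
		fprintf(stderr, "WARNING: unexpected condition\n");
		if (model_panic_on_warn)
			model_panics++;	/* a real kernel would panic here */
	}
	return cond;
}

/* Variant as posted: a NULL context goes through the warning path. */
static int complete_with_warn(void *context)
{
	if (model_warn_on(context == NULL))
		return 0;
	return 1;
}

/* Variant requested in review: a NULL context is handled quietly. */
static int complete_quietly(void *context)
{
	if (context == NULL)
		return 0;
	return 1;
}
```

Both variants return early on a NULL context, but only the first one
would take down a system that has panic_on_warn enabled.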
Frode Isaksen March 31, 2025, 8:18 a.m. UTC | #2
On 3/28/25 10:02 PM, Greg KH wrote:
> On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
>> From: Frode Isaksen <frode@meta.com>
>>
>> Invalidate io_data by setting context to NULL when USB request is
>> dequeued or completed, and check for NULL io_data in epfile_io_complete().
>> The invalidation of io_data in req->context is done when exiting
>> epfile_io(), since then io_data will become invalid as it is allocated
>> on the stack.
>> The epfile_io_complete() may be called after ffs_epfile_io() returns
>> in case the wait_for_completion_interruptible() is interrupted.
>> This fixes a use-after-free error with the following call stack:
>>
>> Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
>> pc : ffs_epfile_io_complete+0x30/0x48
>> lr : usb_gadget_giveback_request+0x30/0xf8
>> Call trace:
>> ffs_epfile_io_complete+0x30/0x48
>> usb_gadget_giveback_request+0x30/0xf8
>> dwc3_remove_requests+0x264/0x2e8
>> dwc3_gadget_pullup+0x1d0/0x250
>> kretprobe_trampoline+0x0/0xc4
>> usb_gadget_remove_driver+0x40/0xf4
>> usb_gadget_unregister_driver+0xdc/0x178
>> unregister_gadget_item+0x40/0x6c
>> ffs_closed+0xd4/0x10c
>> ffs_data_clear+0x2c/0xf0
>> ffs_data_closed+0x178/0x1ec
>> ffs_ep0_release+0x24/0x38
>> __fput+0xe8/0x27c
>>
>> Signed-off-by: Frode Isaksen <frode@meta.com>
>> ---
>> This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
>> Also tested on T.I. AM62x board.
> What commit id does this fix?  Should it go to stable?

This has always been there, so there is no specific commit that
introduced it.

Will add the Cc tag to stable in v2.

>
>>   drivers/usb/gadget/function/f_fs.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>> index 2dea9e42a0f8..f1be0a5c0bd0 100644
>> --- a/drivers/usb/gadget/function/f_fs.c
>> +++ b/drivers/usb/gadget/function/f_fs.c
>> @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
>>   {
>>   	struct ffs_io_data *io_data = req->context;
>>   
>> +	if (WARN_ON(io_data == NULL))
>> +		return;
> If this happens you just crashed the box (remember about panic-on-warn,
> which is still set in a few billion Linux systems these days...)
>
> Just handle the issue properly, no need to dump the stack and crash a
> device.
OK, removing the WARN_ON for v2.
>
> But, what keeps io_data from changing after you have checked it?  Where
> is the lock here?

There is no lock here, as I didn't want to introduce extra complexity 
(and bugs...). But this code has been running without a crash on 
millions of devices for more than a year.

Thanks,

Frode

>
> thanks,
>
> greg k-h
Frode Isaksen March 31, 2025, 1:17 p.m. UTC | #3
On 3/31/25 10:57 AM, Greg KH wrote:
> On Mon, Mar 31, 2025 at 10:18:29AM +0200, Frode Isaksen wrote:
>> On 3/28/25 10:02 PM, Greg KH wrote:
>>> On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
>>>> From: Frode Isaksen <frode@meta.com>
>>>>
>>>> Invalidate io_data by setting context to NULL when USB request is
>>>> dequeued or completed, and check for NULL io_data in epfile_io_complete().
>>>> The invalidation of io_data in req->context is done when exiting
>>>> epfile_io(), since then io_data will become invalid as it is allocated
>>>> on the stack.
>>>> The epfile_io_complete() may be called after ffs_epfile_io() returns
>>>> in case the wait_for_completion_interruptible() is interrupted.
>>>> This fixes a use-after-free error with the following call stack:
>>>>
>>>> Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
>>>> pc : ffs_epfile_io_complete+0x30/0x48
>>>> lr : usb_gadget_giveback_request+0x30/0xf8
>>>> Call trace:
>>>> ffs_epfile_io_complete+0x30/0x48
>>>> usb_gadget_giveback_request+0x30/0xf8
>>>> dwc3_remove_requests+0x264/0x2e8
>>>> dwc3_gadget_pullup+0x1d0/0x250
>>>> kretprobe_trampoline+0x0/0xc4
>>>> usb_gadget_remove_driver+0x40/0xf4
>>>> usb_gadget_unregister_driver+0xdc/0x178
>>>> unregister_gadget_item+0x40/0x6c
>>>> ffs_closed+0xd4/0x10c
>>>> ffs_data_clear+0x2c/0xf0
>>>> ffs_data_closed+0x178/0x1ec
>>>> ffs_ep0_release+0x24/0x38
>>>> __fput+0xe8/0x27c
>>>>
>>>> Signed-off-by: Frode Isaksen <frode@meta.com>
>>>> ---
>>>> This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
>>>> Also tested on T.I. AM62x board.
>>> What commit id does this fix?  Should it go to stable?
>> This has always been there, so there is no specific commit that
>> introduced it.
>>
>> Will add the Cc tag to stable in v2.
>>
>>>>    drivers/usb/gadget/function/f_fs.c | 5 +++++
>>>>    1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>>>> index 2dea9e42a0f8..f1be0a5c0bd0 100644
>>>> --- a/drivers/usb/gadget/function/f_fs.c
>>>> +++ b/drivers/usb/gadget/function/f_fs.c
>>>> @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
>>>>    {
>>>>    	struct ffs_io_data *io_data = req->context;
>>>> +	if (WARN_ON(io_data == NULL))
>>>> +		return;
>>> If this happens you just crashed the box (remember about panic-on-warn,
>>> which is still set in a few billion Linux systems these days...)
>>>
>>> Just handle the issue properly, no need to dump the stack and crash a
>>> device.
>> OK, removing the WARN_ON for v2.
>>> But, what keeps io_data from changing after you have checked it?  Where
>>> is the lock here?
>> There is no lock here, as I didn't want to introduce extra complexity (and
>> bugs...). But this code has been running without a crash on millions of
>> devices for more than a year.
> The fix has?  Great, but again, you need to at least say why this value
> will not change right after testing for it, otherwise you have just
> reduced the race window, not removed it.

I agree that this is only reducing the race window and not eliminating 
it completely, but I have no idea how to fix this easily.

Thanks,

Frode

>
> thanks,
>
> greg k-h
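
The remaining race discussed above (the pointer can change between the
NULL check and its use) is usually closed by putting the check, the use
and the invalidation under one lock. The following userspace sketch shows
that pattern under stated assumptions: all names are hypothetical, the
spinlock is a minimal C11 atomic_flag loop rather than a kernel
spinlock_t, and nothing here claims to be the fix that f_fs.c should
adopt. A USB gadget completion callback must not sleep, which is why the
model uses a spinning lock rather than a mutex.

```c
#include <stdatomic.h>
#include <stddef.h>

struct locked_io_data {
	int status;
};

struct locked_request {
	atomic_flag lock;	/* guards context */
	void *context;		/* submitter's io_data, or NULL */
	int status;
};

static void model_lock(atomic_flag *lock)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

static void locked_request_init(struct locked_request *req)
{
	atomic_flag_clear(&req->lock);
	req->context = NULL;
	req->status = 0;
}

/* Submitter side: publish io_data, or withdraw it by passing NULL. */
static void locked_set_context(struct locked_request *req,
			       struct locked_io_data *io_data)
{
	model_lock(&req->lock);
	req->context = io_data;
	model_unlock(&req->lock);
}

/*
 * Completion side: the NULL check and the write to io_data share one
 * critical section, so the submitter cannot withdraw io_data between
 * them. Returns 1 if io_data was updated, 0 otherwise.
 */
static int locked_complete(struct locked_request *req)
{
	struct locked_io_data *io_data;
	int updated = 0;

	model_lock(&req->lock);
	io_data = req->context;
	if (io_data != NULL) {
		io_data->status = req->status;
		updated = 1;
	}
	model_unlock(&req->lock);
	return updated;
}
```

The submitter would call locked_set_context(req, NULL) before its stack
frame disappears. Once that call returns, no completion can still be
writing to io_data, because any completion that saw a non-NULL pointer
finished its write before releasing the lock.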

Patch

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 2dea9e42a0f8..f1be0a5c0bd0 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -738,6 +738,9 @@  static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
 {
 	struct ffs_io_data *io_data = req->context;
 
+	if (WARN_ON(io_data == NULL))
+		return;
+
 	if (req->status)
 		io_data->status = req->status;
 	else
@@ -1126,6 +1129,7 @@  static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
 			spin_lock_irq(&epfile->ffs->eps_lock);
 			if (epfile->ep != ep) {
 				ret = -ESHUTDOWN;
+				req->context = NULL;
 				goto error_lock;
 			}
 			/*
@@ -1140,6 +1144,7 @@  static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
 			interrupted = io_data->status < 0;
 		}
 
+		req->context = NULL;
 		if (interrupted)
 			ret = -EINTR;
 		else if (io_data->read && io_data->status > 0)