Message ID | dd6ad0b5-569b-067d-257f-dbf15ffa7077@mvista.com |
---|---|
State | New |
Headers | show |
Series | Warning from swake_up_all in 4.14.15-rt13 non-RT | expand |
On 03/06/2018 11:46 AM, Sebastian Andrzej Siewior wrote: > On 2018-03-05 09:08:11 [-0600], Corey Minyard wrote: >> Starting with the change >> >> 8a64547a07980f9d25e962a78c2e10ee82bdb742 fs/dcache: use swait_queue instead >> of >> waitqueue > … >> The following change is the obvious reason: >> >> --- a/kernel/sched/swait.c >> +++ b/kernel/sched/swait.c >> @@ -69,6 +69,7 @@ void swake_up_all(struct swait_queue_head *q) >> struct swait_queue *curr; >> LIST_HEAD(tmp); >> >> + WARN_ON(irqs_disabled()); >> raw_spin_lock_irq(&q->lock); >> list_splice_init(&q->task_list, &tmp); >> while (!list_empty(&tmp)) { >> >> I've done a little bit of analysis here, percpu_ref_kill_and_confirm() >> does spin_lock_irqsave() and then does a percpu_ref_put(). If the >> refcount reaches zero, the release function of the refcount is >> called. In this case, the block code has set this to >> blk_queue_usage_counter_release(), which calls swake_up_all(). >> >> It seems like a bad idea to call percpu_ref_put() with interrupts >> disabled. This problem actually doesn't appear to be RT-related, >> there's just no warning call if the RT tree isn't used. > yeah but vanilla uses wake_up() which does spin_lock_irqsafe() so it is > not an issue there. > > The odd part here is that percpu_ref_kill_and_confirm() does _irqsave() > which suggests that it might be called from any context and then it does > wait_event_lock_irq() which enables interrupts again while it waits. So > it can't be used from any context. > >> I'm not sure if it's best to just do the put outside the lock, or >> have modified put function that returns a bool to know if a release >> is required, then the release function can be called outside the >> lock. I can do patches and test, but I'm hoping for a little >> guidance here. > swake_up_all() does raw_spin_lock_irq() because it should be called from > non-IRQ context. And it drops the lock (+IRQ enable) between wake-ups in > case we "need_resched()" because we woke a high-priority waiter. There > is the list_splice because we wanted to drop the locks (and have IRQs > open) during the entire wake up process but finish_swait() may happen > during the wake up and so we must hold the lock while the list-item is > removed for the queue head. > I have no idea what is the wisest thing to do here. The obvious fix > would be to use the irqsafe() variant here and not drop the lock between > wake ups. That is essentially what swake_up_all_locked() does which I > need for the completions (and based on some testing most users have one > waiter except during PM and some crypto code). > It is probably no comparison to wake_up_q() (which does multiple wake > ups without a context switch) but then we did this before like that. > > Preferably we would have a proper list_splice() and some magic in the > "early" dequeue part that works. > Maybe just modify the block code to run the swake_up_all() call in a workqueue or tasklet? If you think that works, I'll create a patch, test it, and submit it if all goes well. Thanks, -corey -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/09/2018 08:58 AM, Sebastian Andrzej Siewior wrote: > On 2018-03-09 07:29:31 [-0600], Corey Minyard wrote: >> From what I can tell, wake_up_q() is unbounded, and you have undone what >> the previous code had tried to accomplish. In the scenario I'm talking >> about, >> interrupts are still disabled here. That's why I was asking about where to >> put >> wake_up_q(), I knew you could put it here, but it didn't seem to me to help >> at all. > So you are worried about unbound latencies on !RT. Okay. So for !RT this > does not help but it is not worse then before (before the RT patch was > applied and changed things). > In fact it is better now (with RT patch and this one) because before > that patch you would not only open interrupts between the wake up but you > would leave the function with interrupts open which is wrong. Any > interrupt (or a context switch due to need-resched() that would invoke > percpu_ref_put() would freeze the CPU/system. > Also, every user that invoked swake_up_all() with enabled interrupts > will still perform the wake up with enabled interrupts. So nothing > changes here. Ah, ok, that makes sense. Sorry, I was mixing things up. Yes. on RT this should fix the unbounded time issue, and it should also solve the interrupts disabled issue on !RT. I'll try this out. -corey >>>> I had another idea. This is only occurring if RT is not enabled, because >>>> with >>>> RT all the irq disable things go away and you are generally running in task >>>> context. So why not have a different version of swake_up_all() for non-RT >>>> that does work from irqs-off context? >>> With the patch above I have puzzle part which would allow to use swait >>> based completions upstream. That ifdef would probably not help. >> I agree that having a bounded time way to wake up a bunch of threads while >> interrupts are disabled would solve a bunch of issues. I just don't see how >> it >> can be done without pushing it off to a softirq or workqueue. > true but this is a different story. We started with a WARN_ON() which > triggered correctly and the problem it pointed to looks solved to me. > > This "unbounded runtime during the wake up of many tasks with interrupts > disabled via percpu_ref_kill() -> blk_queue_usage_counter_release()" > thing exists already in the vanilla kernel and does not exist > with the RT patch applied and RT enabled. If you are affected by this > and you don't like it - fine. Using a workqueue is one way of getting > around it (the softirq is not preemptible in !RT so it wouldn't change > much). However, I see no benefit in carrying such a patch because as I > said only !RT is affected by this. > >> -corey > Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/kernel/sched/swait.c +++ b/kernel/sched/swait.c @@ -69,6 +69,7 @@ void swake_up_all(struct swait_queue_head *q) struct swait_queue *curr; LIST_HEAD(tmp); + WARN_ON(irqs_disabled()); raw_spin_lock_irq(&q->lock); list_splice_init(&q->task_list, &tmp); while (!list_empty(&tmp)) {