Message ID | 55169723.3070006@linaro.org |
---|---|
State | New |
Headers | show |
On 28 March 2015 at 17:27, viresh kumar <viresh.kumar@linaro.org> wrote: > On 28 March 2015 at 15:23, Peter Zijlstra <peterz@infradead.org> wrote: > >> Well, for one your patch is indeed disgusting. > > Yeah, I agree :) Sigh.. Sorry for the series of *nonsense* mails before the last one. Its some thunderbird *BUG* which did that, I was accessing my mail from both gmail's interface and thunderbird and somehow this happened. I have hit the send button only once. Really sorry for the noise. (The last mail has few inquiries towards the end and a thanks note, just to identify it..) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 29 March 2015 at 15:54, Peter Zijlstra <peterz@infradead.org> wrote: > On Sat, Mar 28, 2015 at 02:44:57PM +0100, Peter Zijlstra wrote: >> > Now there are few issues I see here (Sorry if they are all imaginary): >> > - In case a timer re-arms itself from its handler and is migrated from CPU A to B, what >> > happens if the re-armed timer fires before the first handler finishes ? i.e. timer->fn() >> > hasn't finished running on CPU A and it has fired again on CPU B. Wouldn't this expose >> > us to a lot of other problems? It wouldn't be serialized to itself anymore ? >> >> What I said above. > > What I didn't say, but had thought of is that __run_timer() should skip > any timer that has RUNNING set -- for obvious reasons :-) Below is copied from your first reply, and so you probably already said that ? :) > Also, once you have tbase_running, we can take base->running_timer out > altogether. I wanted to clarify if I understood it correctly.. Are you saying that: Case 1.) if we find tbase_running on cpuY (because it was rearmed from its handler on cpuX and has got migrated to cpuY), then we should drop the timer from the list without calling its handler (as that is already running in parallel) ? Or Case 2.) we keep retrying for it, until the time the other handler finishes? I have few queries for both the cases. Case 1.) Will that be fair to the timer user as the timer may get lost completely. If we skip the timer on cpuY here, it wouldn't be enqueued again and so will be lost. Case 2.) We kept waiting for the first handler to finish .. - cpuY may waste some cycles as it kept waiting for handler to finish on cpuX .. - We may need to perform base unlock/lock on cpuY, so that cpuX can take cpuY's lock to reset tbase_running. And that might be racy, not sure. -- viresh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 30 March 2015 at 18:17, Peter Zijlstra <peterz@infradead.org> wrote: > No, I means something else with that. We can remove the > tvec_base::running_timer field. Everything that uses that can use > tbase_running() AFAICT. Okay, there is one instance which still needs it. migrate_timers(): BUG_ON(old_base->running_timer); What I wasn't sure about it is if we get can drop this statement or not. If we decide not to drop it, then we can convert running_timer into a bool. > Drop yes, racy not so much I think. > > > diff --git a/kernel/time/timer.c b/kernel/time/timer.c > index 2d3f5c504939..1394f9540348 100644 > --- a/kernel/time/timer.c > +++ b/kernel/time/timer.c > @@ -1189,12 +1189,39 @@ static inline void __run_timers(struct tvec_base *base) > cascade(base, &base->tv5, INDEX(3)); > ++base->timer_jiffies; > list_replace_init(base->tv1.vec + index, head); > + > +again: > while (!list_empty(head)) { > void (*fn)(unsigned long); > unsigned long data; > bool irqsafe; > > - timer = list_first_entry(head, struct timer_list,entry); > + timer = list_first_entry(head, struct timer_list, entry); > + if (unlikely(tbase_running(timer))) { > + /* Only one timer on the list, force wait. */ > + if (unlikely(head->next == head->prev)) { > + spin_unlock(&base->lock); > + > + /* > + * The only way to get here is if the > + * handler requeued itself on another > + * base, this guarantees the timer will > + * not go away. > + */ > + while (tbase_running(timer)) > + cpu_relax(); > + > + spin_lock(&base->lock); > + } else { > + /* > + * Otherwise, rotate the list and try > + * someone else. > + */ > + list_move_tail(&timer->entry, head); > + } > + goto again; > + } > + > fn = timer->function; > data = timer->data; > irqsafe = tbase_get_irqsafe(timer->base); Yeah, so I have written something similar only. Wasn't sure about what you wrote earlier. Thanks for the clarification. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
diff --git a/include/linux/timer.h b/include/linux/timer.h index 8c5a197e1587..e7184f57449c 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -67,8 +67,9 @@ extern struct tvec_base boot_tvec_bases; */ #define TIMER_DEFERRABLE 0x1LU #define TIMER_IRQSAFE 0x2LU +#define TIMER_PINNED 0x4LU -#define TIMER_FLAG_MASK 0x3LU +#define TIMER_FLAG_MASK 0x7LU And Fenguang's build-bot showed the problem (only) on blackfin [1]. config: make ARCH=blackfin allyesconfig All error/warnings: kernel/timer.c: In function 'init_timers': >> kernel/timer.c:1683:2: error: call to '__compiletime_assert_1683' >> declared with attribute error: BUILD_BUG_ON failed: >> __alignof__(struct tvec_base) & TIMER_FLAG_MASK So probably we need to make 'base' aligned to 8 bytes ? So, what you are suggesting is something like this (untested): diff --git a/include/linux/timer.h b/include/linux/timer.h index 8c5a197e1587..68bf09d69352 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -67,8 +67,9 @@ extern struct tvec_base boot_tvec_bases; */ #define TIMER_DEFERRABLE 0x1LU #define TIMER_IRQSAFE 0x2LU +#define TIMER_RUNNING 0x4LU -#define TIMER_FLAG_MASK 0x3LU +#define TIMER_FLAG_MASK 0x7LU #define __TIMER_INITIALIZER(_function, _expires, _data, _flags) { \ .entry = { .prev = TIMER_ENTRY_STATIC }, \ diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 2d3f5c504939..8f9efa64bd34 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -105,6 +105,21 @@ static inline unsigned int tbase_get_irqsafe(struct tvec_base *base) return ((unsigned int)(unsigned long)base & TIMER_IRQSAFE); } +static inline unsigned int tbase_get_running(struct tvec_base *base) +{ + return ((unsigned int)(unsigned long)base & TIMER_RUNNING); +} + +static inline unsigned int tbase_set_running(struct tvec_base *base) +{ + return ((unsigned int)(unsigned long)base | TIMER_RUNNING); +} + +static inline unsigned int tbase_clear_running(struct tvec_base *base) +{ + return ((unsigned int)(unsigned long)base & ~TIMER_RUNNING); +} + static inline struct tvec_base *tbase_get_base(struct tvec_base *base) { return ((struct tvec_base *)((unsigned long)base & ~TIMER_FLAG_MASK)); @@ -781,21 +796,12 @@ __mod_timer(struct timer_list *timer, unsigned long expires, new_base = per_cpu(tvec_bases, cpu); if (base != new_base) { - /* - * We are trying to schedule the timer on the local CPU. - * However we can't change timer's base while it is running, - * otherwise del_timer_sync() can't detect that the timer's - * handler yet has not finished. This also guarantees that - * the timer is serialized wrt itself. - */ - if (likely(base->running_timer != timer)) { - /* See the comment in lock_timer_base() */ - timer_set_base(timer, NULL); - spin_unlock(&base->lock); - base = new_base; - spin_lock(&base->lock); - timer_set_base(timer, base); - } + /* See the comment in lock_timer_base() */ + timer_set_base(timer, NULL); + spin_unlock(&base->lock); + base = new_base; + spin_lock(&base->lock); + timer_set_base(timer, base); } timer->expires = expires; @@ -1016,7 +1022,7 @@ int try_to_del_timer_sync(struct timer_list *timer) base = lock_timer_base(timer, &flags); - if (base->running_timer != timer) { + if (tbase_get_running(timer->base)) { timer_stats_timer_clear_start_info(timer); ret = detach_if_pending(timer, base, true); } @@ -1202,6 +1208,7 @@ static inline void __run_timers(struct tvec_base *base) timer_stats_account_timer(timer); base->running_timer = timer; + tbase_set_running(timer->base); detach_expired_timer(timer, base); if (irqsafe) { @@ -1216,6 +1223,7 @@ static inline void __run_timers(struct tvec_base *base) } } base->running_timer = NULL; + tbase_clear_running(timer->base); spin_unlock_irq(&base->lock); }