Message ID: 1490290924-12958-2-git-send-email-daniel.lezcano@linaro.org
State: Superseded
On Thu, Mar 23, 2017 at 06:42:02PM +0100, Daniel Lezcano wrote:

> +/*
> + * The function record_irq_time is only called in one place in the
> + * interrupts handler. We want this function always inline so the code
> + * inside is embedded in the function and the static key branching
> + * code can act at the higher level. Without the explicit
> + * __always_inline we can end up with a function call and a small
> + * overhead in the hotpath for nothing.
> + */
> +static __always_inline void record_irq_time(struct irq_desc *desc)
> +{
> +	if (static_key_enabled(&irq_timing_enabled)) {

I think you meant to have either static_branch_likely() or
static_branch_unlikely() here. Those are runtime code patched,
static_key_enabled() generates a regular load and test condition.

Also; if you do something like:

	if (!static_branch_likely(&irq_timing_enabled))
		return;

you can save one level of indent.

> +		if (desc->istate & IRQS_TIMINGS) {
> +			struct irq_timings *timings = this_cpu_ptr(&irq_timings);
> +			unsigned int index = timings->count & IRQ_TIMINGS_MASK;
> +
> +			timings->values[index].ts = local_clock();
> +			timings->values[index].irq = irq_desc_get_irq(desc);
> +			timings->count++;
> +		}
> +	}
> +}

> +DEFINE_STATIC_KEY_FALSE(irq_timing_enabled);
> +
> +DEFINE_PER_CPU(struct irq_timings, irq_timings);
> +
> +void irq_timings_enable(void)
> +{
> +	static_branch_inc(&irq_timing_enabled);

Do you really need counting, or do you want static_branch_enable() here?

> +}
> +
> +void irq_timings_disable(void)
> +{
> +	static_branch_dec(&irq_timing_enabled);

idem.

> +}
> --
> 1.9.1
On Thu, Mar 23, 2017 at 07:35:42PM +0100, Peter Zijlstra wrote:
> On Thu, Mar 23, 2017 at 06:42:02PM +0100, Daniel Lezcano wrote:
> > +/*
> > + * The function record_irq_time is only called in one place in the
> > + * interrupts handler. We want this function always inline so the code
> > + * inside is embedded in the function and the static key branching
> > + * code can act at the higher level. Without the explicit
> > + * __always_inline we can end up with a function call and a small
> > + * overhead in the hotpath for nothing.
> > + */
> > +static __always_inline void record_irq_time(struct irq_desc *desc)
> > +{
> > +	if (static_key_enabled(&irq_timing_enabled)) {
>
> I think you meant to have either static_branch_likely() or
> static_branch_unlikely() here. Those are runtime code patched,
> static_key_enabled() generates a regular load and test condition.
>
> Also; if you do something like:
>
>	if (!static_branch_likely(&irq_timing_enabled))
>		return;
>
> you can save one level of indent.

Ok, thanks for the hint.

> > +		if (desc->istate & IRQS_TIMINGS) {
> > +			struct irq_timings *timings = this_cpu_ptr(&irq_timings);
> > +			unsigned int index = timings->count & IRQ_TIMINGS_MASK;
> > +
> > +			timings->values[index].ts = local_clock();
> > +			timings->values[index].irq = irq_desc_get_irq(desc);
> > +			timings->count++;
> > +		}
> > +	}
> > +}
> >
> > +DEFINE_STATIC_KEY_FALSE(irq_timing_enabled);
> > +
> > +DEFINE_PER_CPU(struct irq_timings, irq_timings);
> > +
> > +void irq_timings_enable(void)
> > +{
> > +	static_branch_inc(&irq_timing_enabled);
>
> Do you really need counting, or do you want static_branch_enable() here?

I put counting in order to let several subsystems use the irq timings if
needed.

> > +}
> > +
> > +void irq_timings_disable(void)
> > +{
> > +	static_branch_dec(&irq_timing_enabled);
>
> idem.

> > +}
> > --
> > 1.9.1

--
Linaro.org │ Open source software for ARM SoCs
On Thu, 23 Mar 2017, Daniel Lezcano wrote:

> On Thu, Mar 23, 2017 at 07:35:42PM +0100, Peter Zijlstra wrote:
> > On Thu, Mar 23, 2017 at 06:42:02PM +0100, Daniel Lezcano wrote:
> > > +void irq_timings_enable(void)
> > > +{
> > > +	static_branch_inc(&irq_timing_enabled);
> >
> > Do you really need counting, or do you want static_branch_enable() here?
>
> I put counting in order to let several subsystem to use the irq timings if it
> is needed.

Please avoid overengineering. This can easily be changed later if it is
needed.

Nicolas
On Thu, 23 Mar 2017, Daniel Lezcano wrote:

> +#define IRQ_TIMINGS_SHIFT	5
> +#define IRQ_TIMINGS_SIZE	(1 << IRQ_TIMINGS_SHIFT)
> +#define IRQ_TIMINGS_MASK	(IRQ_TIMINGS_SIZE - 1)
> +
> +struct irq_timing {
> +	u32 irq;
> +	u64 ts;
> +};
> +
> +struct irq_timings {
> +	struct irq_timing values[IRQ_TIMINGS_SIZE]; /* our circular buffer */

This is not very space efficient because of alignment restrictions from
the u64 in struct irq_timing. 25% of the memory is wasted.

You could consider having two arrays instead:

	u32 irq_values[IRQ_TIMINGS_SIZE];
	u64 ts_values[IRQ_TIMINGS_SIZE];

Nicolas
On Thu, 23 Mar 2017, Nicolas Pitre wrote:

> On Thu, 23 Mar 2017, Daniel Lezcano wrote:
>
> > +#define IRQ_TIMINGS_SHIFT	5
> > +#define IRQ_TIMINGS_SIZE	(1 << IRQ_TIMINGS_SHIFT)
> > +#define IRQ_TIMINGS_MASK	(IRQ_TIMINGS_SIZE - 1)
> > +
> > +struct irq_timing {
> > +	u32 irq;
> > +	u64 ts;
> > +};
> > +
> > +struct irq_timings {
> > +	struct irq_timing values[IRQ_TIMINGS_SIZE]; /* our circular buffer */
>
> This is not very space efficient because of alignment restrictions from
> the u64 in struct irq_timing. 25% of the memory is wasted.
>
> You could consider having two arrays instead:
>
>	u32 irq_values[IRQ_TIMINGS_SIZE];
>	u64 ts_values[IRQ_TIMINGS_SIZE];

For the penalty of dirtying two cachelines instead of one.

Thanks,

	tglx
On Thu, 23 Mar 2017, Thomas Gleixner wrote:

> On Thu, 23 Mar 2017, Nicolas Pitre wrote:
>
> > On Thu, 23 Mar 2017, Daniel Lezcano wrote:
> >
> > > +#define IRQ_TIMINGS_SHIFT	5
> > > +#define IRQ_TIMINGS_SIZE	(1 << IRQ_TIMINGS_SHIFT)
> > > +#define IRQ_TIMINGS_MASK	(IRQ_TIMINGS_SIZE - 1)
> > > +
> > > +struct irq_timing {
> > > +	u32 irq;
> > > +	u64 ts;
> > > +};
> > > +
> > > +struct irq_timings {
> > > +	struct irq_timing values[IRQ_TIMINGS_SIZE]; /* our circular buffer */
> >
> > This is not very space efficient because of alignment restrictions from
> > the u64 in struct irq_timing. 25% of the memory is wasted.
> >
> > You could consider having two arrays instead:
> >
> >	u32 irq_values[IRQ_TIMINGS_SIZE];
> >	u64 ts_values[IRQ_TIMINGS_SIZE];
>
> For the penalty of dirtying two cachelines instead of one.

Well...

Is there a need for 64 bits of relative time stamps?
And 32 bits of IRQ number?

I'd say that a 48-bit time stamp and a 16-bit IRQ number are way
sufficient. Who cares if we mispredict an IRQ after 78 hours of idle
time?

Hence:

	u64 value = (ts << 16) | irq;

Nicolas
On Thu, 23 Mar 2017, Nicolas Pitre wrote:

> Is there a need for 64 bits of relative time stamps?
> And 32 bits of IRQ number?

No.

> I'd say that 48 bit time stamp and 16 bit IRQ number is way sufficient.
> Who cares if we mispredict an IRQ after 78 hours of idle time?
>
> Hence:
>
>	u64 value = (ts << 16) | irq;

Excellent!
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index bc3b675..c050333 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -704,6 +704,11 @@ static inline void init_irq_proc(void)
 }
 #endif
 
+#ifdef CONFIG_IRQ_TIMINGS
+void irq_timings_enable(void);
+void irq_timings_disable(void);
+#endif
+
 struct seq_file;
 int show_interrupts(struct seq_file *p, void *v);
 int arch_show_interrupts(struct seq_file *p, int prec);
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 3bbfd6a..38e551d 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -81,6 +81,9 @@ config GENERIC_MSI_IRQ_DOMAIN
 config HANDLE_DOMAIN_IRQ
 	bool
 
+config IRQ_TIMINGS
+	bool
+
 config IRQ_DOMAIN_DEBUG
 	bool "Expose hardware/virtual IRQ mapping via debugfs"
 	depends on IRQ_DOMAIN && DEBUG_FS
diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile
index 1d3ee31..efb5f14 100644
--- a/kernel/irq/Makefile
+++ b/kernel/irq/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_PM_SLEEP) += pm.o
 obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o
 obj-$(CONFIG_GENERIC_IRQ_IPI) += ipi.o
 obj-$(CONFIG_SMP) += affinity.o
+obj-$(CONFIG_IRQ_TIMINGS) += timings.o
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index d3f2490..eb4d3e8 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -138,6 +138,8 @@ irqreturn_t __handle_irq_event_percpu(struct irq_desc *desc, unsigned int *flags
 	unsigned int irq = desc->irq_data.irq;
 	struct irqaction *action;
 
+	record_irq_time(desc);
+
 	for_each_action_of_desc(desc, action) {
 		irqreturn_t res;
 
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index bc226e7..abaadaa 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -8,6 +8,7 @@
 #include <linux/irqdesc.h>
 #include <linux/kernel_stat.h>
 #include <linux/pm_runtime.h>
+#include <linux/sched/clock.h>
 
 #ifdef CONFIG_SPARSE_IRQ
 # define IRQ_BITMAP_BITS	(NR_IRQS + 8196)
@@ -57,6 +58,7 @@ enum {
 	IRQS_WAITING		= 0x00000080,
 	IRQS_PENDING		= 0x00000200,
 	IRQS_SUSPENDED		= 0x00000800,
+	IRQS_TIMINGS		= 0x00001000,
 };
 
 #include "debug.h"
@@ -226,3 +228,68 @@ static inline int irq_desc_is_chained(struct irq_desc *desc)
 static inline void
 irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action) { }
 #endif
+
+#ifdef CONFIG_IRQ_TIMINGS
+
+#define IRQ_TIMINGS_SHIFT	5
+#define IRQ_TIMINGS_SIZE	(1 << IRQ_TIMINGS_SHIFT)
+#define IRQ_TIMINGS_MASK	(IRQ_TIMINGS_SIZE - 1)
+
+struct irq_timing {
+	u32 irq;
+	u64 ts;
+};
+
+struct irq_timings {
+	struct irq_timing values[IRQ_TIMINGS_SIZE]; /* our circular buffer */
+	unsigned int count; /* Number of interruptions since last inspection */
+};
+
+DECLARE_PER_CPU(struct irq_timings, irq_timings);
+
+static inline void remove_timings(struct irq_desc *desc)
+{
+	desc->istate &= ~IRQS_TIMINGS;
+}
+
+static inline void setup_timings(struct irq_desc *desc, struct irqaction *act)
+{
+	/*
+	 * We don't need the measurement because the idle code already
+	 * knows the next expiry event.
+	 */
+	if (act->flags & __IRQF_TIMER)
+		return;
+
+	desc->istate |= IRQS_TIMINGS;
+}
+
+extern struct static_key_false irq_timing_enabled;
+
+/*
+ * The function record_irq_time is only called in one place in the
+ * interrupts handler. We want this function always inline so the code
+ * inside is embedded in the function and the static key branching
+ * code can act at the higher level. Without the explicit
+ * __always_inline we can end up with a function call and a small
+ * overhead in the hotpath for nothing.
+ */
+static __always_inline void record_irq_time(struct irq_desc *desc)
+{
+	if (static_key_enabled(&irq_timing_enabled)) {
+		if (desc->istate & IRQS_TIMINGS) {
+			struct irq_timings *timings = this_cpu_ptr(&irq_timings);
+			unsigned int index = timings->count & IRQ_TIMINGS_MASK;
+
+			timings->values[index].ts = local_clock();
+			timings->values[index].irq = irq_desc_get_irq(desc);
+			timings->count++;
+		}
+	}
+}
+#else
+static inline void remove_timings(struct irq_desc *desc) {}
+static inline void setup_timings(struct irq_desc *desc,
+				 struct irqaction *act) {};
+static inline void record_irq_time(struct irq_desc *desc) {}
+#endif /* CONFIG_IRQ_TIMINGS */
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index c0a23e2..da799d3 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1370,6 +1370,8 @@ static void irq_release_resources(struct irq_desc *desc)
 
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
 
+	setup_timings(desc, new);
+
 	/*
 	 * Strictly no need to wake it up, but hung_task complains
 	 * when no hard interrupt wakes the thread up.
@@ -1498,6 +1500,7 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
 		irq_settings_clr_disable_unlazy(desc);
 		irq_shutdown(desc);
 		irq_release_resources(desc);
+		remove_timings(desc);
 	}
 
 #ifdef CONFIG_SMP
diff --git a/kernel/irq/timings.c b/kernel/irq/timings.c
new file mode 100644
index 0000000..21ceca8
--- /dev/null
+++ b/kernel/irq/timings.c
@@ -0,0 +1,30 @@
+/*
+ * linux/kernel/irq/timings.c
+ *
+ * Copyright (C) 2016, Linaro Ltd - Daniel Lezcano <daniel.lezcano@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/percpu.h>
+#include <linux/static_key.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+
+#include "internals.h"
+
+DEFINE_STATIC_KEY_FALSE(irq_timing_enabled);
+
+DEFINE_PER_CPU(struct irq_timings, irq_timings);
+
+void irq_timings_enable(void)
+{
+	static_branch_inc(&irq_timing_enabled);
+}
+
+void irq_timings_disable(void)
+{
+	static_branch_dec(&irq_timing_enabled);
+}