Message ID | 1427945681-29972-20-git-send-email-john.stultz@linaro.org |
---|---|
State | New |
Headers | show |
On Thu, Apr 2, 2015 at 1:50 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Apr 01, 2015 at 08:34:39PM -0700, John Stultz wrote: >> Ingo noted that the description of clocks_calc_max_nsecs()'s >> 50% safety margin was somewhat circular. So this patch tries >> to improve the comment to better explain what we mean by the >> 50% safety margin and why we need it. >> >> Cc: Ingo Molnar <mingo@kernel.org> >> Cc: Thomas Gleixner <tglx@linutronix.de> >> Cc: Peter Zijlstra <peterz@infradead.org> >> Cc: Prarit Bhargava <prarit@redhat.com> >> Cc: Richard Cochran <richardcochran@gmail.com> >> Signed-off-by: John Stultz <john.stultz@linaro.org> >> --- >> kernel/time/clocksource.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c >> index c3be3c7..15facb1 100644 >> --- a/kernel/time/clocksource.c >> +++ b/kernel/time/clocksource.c >> @@ -472,8 +472,11 @@ static u32 clocksource_max_adjustment(struct clocksource *cs) >> * @max_cyc: maximum cycle value before potential overflow (does not include >> * any safety margin) >> * >> - * NOTE: This function includes a safety margin of 50%, so that bad clock values >> - * can be detected. >> + * NOTE: This function includes a safety margin of 50%, in other words, we >> + * return half the number of nanoseconds the hardware counter can technically >> + * cover. This is done so that we can potentially detect problems caused by >> + * delayed timers or bad hardware, which might result in time intervals that >> + * are larger then what the math used can handle without overflows. >> */ >> u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cyc) >> { > > Should we make a further note that the tk_fast things rely on this > slack since they're not strongly serialized against this? That is, they > can end up using an older cycle_last value and therefore end up with a > larger delta than other code. Though, even with the tk_fast bits, we expect the update to happen regularly, its just that for the benefit of lock-free access we are ok with the possible slight inconsistencies (in the mono clock) that could happen if we use a slightly stale value mid-update. So I don't think the tk_fast bits are actually relying on the slack any more then the normal timekeeping code relies on the slack to handle slight delays in processing the updates. If we deal with time deltas large enough to cause overflows, or time intervals larger then the hardware can represent, we're sunk in either case. This 50% margin just makes it easier to catch unexpected delays or issues. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Thu, Apr 2, 2015 at 11:34 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, Apr 02, 2015 at 10:30:18AM -0700, John Stultz wrote: >> > Should we make a further note that the tk_fast things rely on this >> > slack since they're not strongly serialized against this? That is, they >> > can end up using an older cycle_last value and therefore end up with a >> > larger delta than other code. >> >> Though, even with the tk_fast bits, we expect the update to happen >> regularly, its just that for the benefit of lock-free access we are ok >> with the possible slight inconsistencies (in the mono clock) that >> could happen if we use a slightly stale value mid-update. So I don't >> think the tk_fast bits are actually relying on the slack any more then >> the normal timekeeping code relies on the slack to handle slight >> delays in processing the updates. If we deal with time deltas large >> enough to cause overflows, or time intervals larger then the hardware >> can represent, we're sunk in either case. This 50% margin just makes >> it easier to catch unexpected delays or issues. > > Right, so you're saying that even though the fast bits will see slightly > larger deltas than the normal code, they should still not get anywhere > near the 50% because we update much more frequently? Well, they may get to 50% or slightly over (since 50% is the max nohz idle length), but that's likely rare, and we shouldn't get anywhere close to real failure edges (100% be it the mult-overflow or hardware mask limit). thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Thu, Apr 2, 2015 at 11:43 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, Apr 02, 2015 at 11:41:15AM -0700, John Stultz wrote: >> On Thu, Apr 2, 2015 at 11:34 AM, Peter Zijlstra <peterz@infradead.org> wrote: >> > On Thu, Apr 02, 2015 at 10:30:18AM -0700, John Stultz wrote: >> >> > Should we make a further note that the tk_fast things rely on this >> >> > slack since they're not strongly serialized against this? That is, they >> >> > can end up using an older cycle_last value and therefore end up with a >> >> > larger delta than other code. >> >> >> >> Though, even with the tk_fast bits, we expect the update to happen >> >> regularly, its just that for the benefit of lock-free access we are ok >> >> with the possible slight inconsistencies (in the mono clock) that >> >> could happen if we use a slightly stale value mid-update. So I don't >> >> think the tk_fast bits are actually relying on the slack any more then >> >> the normal timekeeping code relies on the slack to handle slight >> >> delays in processing the updates. If we deal with time deltas large >> >> enough to cause overflows, or time intervals larger then the hardware >> >> can represent, we're sunk in either case. This 50% margin just makes >> >> it easier to catch unexpected delays or issues. >> > >> > Right, so you're saying that even though the fast bits will see slightly >> > larger deltas than the normal code, they should still not get anywhere >> > near the 50% because we update much more frequently? >> >> Well, they may get to 50% or slightly over (since 50% is the max nohz >> idle length), but that's likely rare, and we shouldn't get anywhere >> close to real failure edges (100% be it the mult-overflow or hardware >> mask limit). > > Right, so the fast thing rely on there being slack. That was my point > rather. I know they'll not get to the end of slack. Right, but all of the time code relies on there being slack, since we can't be sure timers will not be slightly late. For the fast-timekeper stuff, its just that the deltas could be ever so slightly larger since the lock-free reads can land during an update. Its not that its ok for the update to be delayed longer for fast-timekeepers, or that we expect them to handle larger deltas then the normal time read logic can handle. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index c3be3c7..15facb1 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -472,8 +472,11 @@ static u32 clocksource_max_adjustment(struct clocksource *cs) * @max_cyc: maximum cycle value before potential overflow (does not include * any safety margin) * - * NOTE: This function includes a safety margin of 50%, so that bad clock values - * can be detected. + * NOTE: This function includes a safety margin of 50%, in other words, we + * return half the number of nanoseconds the hardware counter can technically + * cover. This is done so that we can potentially detect problems caused by + * delayed timers or bad hardware, which might result in time intervals that + * are larger then what the math used can handle without overflows. */ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cyc) {
Ingo noted that the description of clocks_calc_max_nsecs()'s 50% safety margin was somewhat circular. So this patch tries to improve the comment to better explain what we mean by the 50% safety margin and why we need it. Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> --- kernel/time/clocksource.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)