Message ID | 1495856035-6622-3-git-send-email-john.stultz@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | Fixes for two recently found timekeeping bugs | expand |
* John Stultz <john.stultz@linaro.org> wrote: > Due to how the MONOTONIC_RAW accumulation logic was handled, > there is the potential for a 1ns discontinuity when we do > accumulations. This small discontinuity has for the most part > gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW > in their vDSO clock_gettime implementation, we've seen failures > with the inconsistency-check test in kselftest. > > This patch addresses the issue by using the same sub-ns > accumulation handling that CLOCK_MONOTONIC uses, which avoids > the issue for in-kernel users. > > Since the ARM64 vDSO implementation has its own clock_gettime > calculation logic, this patch reduces the frequency of errors, > but failures are still seen. The ARM64 vDSO will need to be > updated to include the sub-nanosecond xtime_nsec values in its > calculation for this issue to be completely fixed. > > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: Miroslav Lichvar <mlichvar@redhat.com> > Cc: Richard Cochran <richardcochran@gmail.com> > Cc: Prarit Bhargava <prarit@redhat.com> > Cc: Stephen Boyd <stephen.boyd@linaro.org> > Cc: Kevin Brodsky <kevin.brodsky@arm.com> > Cc: Will Deacon <will.deacon@arm.com> > Cc: Daniel Mentz <danielmentz@google.com> > Signed-off-by: John Stultz <john.stultz@linaro.org> > --- > include/linux/timekeeper_internal.h | 4 ++-- > kernel/time/timekeeping.c | 21 ++++++++++++--------- > 2 files changed, 14 insertions(+), 11 deletions(-) > > diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h > index 110f453..528cc86 100644 > --- a/include/linux/timekeeper_internal.h > +++ b/include/linux/timekeeper_internal.h > @@ -58,7 +58,7 @@ struct tk_read_base { > * interval. > * @xtime_remainder: Shifted nano seconds left over when rounding > * @cycle_interval > - * @raw_interval: Raw nano seconds accumulated per NTP interval. > + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval. > * @ntp_error: Difference between accumulated time and NTP time in ntp > * shifted nano seconds. > * @ntp_error_shift: Shift conversion between clock shifted nano seconds and > @@ -100,7 +100,7 @@ struct timekeeper { > u64 cycle_interval; > u64 xtime_interval; > s64 xtime_remainder; > - u32 raw_interval; > + u64 raw_interval; > /* The ntp_tick_length() value currently being used. > * This cached copy ensures we consistently apply the tick > * length for an entire tick, as ntp_tick_length may change > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c > index abc1968..35d3ba3 100644 > --- a/kernel/time/timekeeping.c > +++ b/kernel/time/timekeeping.c > @@ -282,7 +282,7 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock) > /* Go back from cycles -> shifted ns */ > tk->xtime_interval = interval * clock->mult; > tk->xtime_remainder = ntpinterval - tk->xtime_interval; > - tk->raw_interval = (interval * clock->mult) >> clock->shift; > + tk->raw_interval = interval * clock->mult; > > /* if changing clocks, convert xtime_nsec shift units */ > if (old_clock) { > @@ -1994,7 +1994,7 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset, > u32 shift, unsigned int *clock_set) > { > u64 interval = tk->cycle_interval << shift; > - u64 raw_nsecs; > + u64 nsecps; What does the 'ps' postfix stand for? It's not obvious (to me). > + tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec > + << tk->tkr_raw.shift; > + tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec > + << tk->tkr_raw.shift; Please don't break the line in such an ugly, random way, just write: tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift; tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift; Thanks, Ingo
On Sat, May 27, 2017 at 12:36 AM, Ingo Molnar <mingo@kernel.org> wrote: > > * John Stultz <john.stultz@linaro.org> wrote: > >> + u64 nsecps; > > What does the 'ps' postfix stand for? It's not obvious (to me). > I guess that nsecps stands for "nanoseconds per second", although the code appears to be storing that value left shifted by tk->tkr_raw.shift.
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h index 110f453..528cc86 100644 --- a/include/linux/timekeeper_internal.h +++ b/include/linux/timekeeper_internal.h @@ -58,7 +58,7 @@ struct tk_read_base { * interval. * @xtime_remainder: Shifted nano seconds left over when rounding * @cycle_interval - * @raw_interval: Raw nano seconds accumulated per NTP interval. + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval. * @ntp_error: Difference between accumulated time and NTP time in ntp * shifted nano seconds. * @ntp_error_shift: Shift conversion between clock shifted nano seconds and @@ -100,7 +100,7 @@ struct timekeeper { u64 cycle_interval; u64 xtime_interval; s64 xtime_remainder; - u32 raw_interval; + u64 raw_interval; /* The ntp_tick_length() value currently being used. * This cached copy ensures we consistently apply the tick * length for an entire tick, as ntp_tick_length may change diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index abc1968..35d3ba3 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -282,7 +282,7 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock) /* Go back from cycles -> shifted ns */ tk->xtime_interval = interval * clock->mult; tk->xtime_remainder = ntpinterval - tk->xtime_interval; - tk->raw_interval = (interval * clock->mult) >> clock->shift; + tk->raw_interval = interval * clock->mult; /* if changing clocks, convert xtime_nsec shift units */ if (old_clock) { @@ -1994,7 +1994,7 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset, u32 shift, unsigned int *clock_set) { u64 interval = tk->cycle_interval << shift; - u64 raw_nsecs; + u64 nsecps; /* If the offset is smaller than a shifted interval, do nothing */ if (offset < interval) @@ -2009,14 +2009,17 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset, *clock_set |= accumulate_nsecs_to_secs(tk); /* Accumulate raw time */ - raw_nsecs = (u64)tk->raw_interval << shift; - raw_nsecs += tk->raw_time.tv_nsec; - if (raw_nsecs >= NSEC_PER_SEC) { - u64 raw_secs = raw_nsecs; - raw_nsecs = do_div(raw_secs, NSEC_PER_SEC); - tk->raw_time.tv_sec += raw_secs; + tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec + << tk->tkr_raw.shift; + tk->tkr_raw.xtime_nsec += tk->raw_interval << shift; + nsecps = (u64)NSEC_PER_SEC << tk->tkr_raw.shift; + while (tk->tkr_raw.xtime_nsec >= nsecps) { + tk->tkr_raw.xtime_nsec -= nsecps; + tk->raw_time.tv_sec++; } - tk->raw_time.tv_nsec = raw_nsecs; + tk->raw_time.tv_nsec = tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift; + tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec + << tk->tkr_raw.shift; /* Accumulate error between NTP and clock interval */ tk->ntp_error += tk->ntp_tick << shift;
Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Mentz <danielmentz@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org> --- include/linux/timekeeper_internal.h | 4 ++-- kernel/time/timekeeping.c | 21 ++++++++++++--------- 2 files changed, 14 insertions(+), 11 deletions(-) -- 2.7.4