Message ID | 20200918181951.21752-15-vsementsov@virtuozzo.com |
---|---|
State | New |
Series | preallocate filter |
On 18.09.20 20:19, Vladimir Sementsov-Ogievskiy wrote:
> Performance improvements / degradations are usually discussed in
> percentage. Let's make the script calculate it for us.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  scripts/simplebench/simplebench.py | 46 +++++++++++++++++++++++++++---
>  1 file changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/simplebench/simplebench.py b/scripts/simplebench/simplebench.py
> index 56d3a91ea2..0ff05a38b8 100644
> --- a/scripts/simplebench/simplebench.py
> +++ b/scripts/simplebench/simplebench.py

[...]

> +            for j in range(0, i):
> +                env_j = results['envs'][j]
> +                res_j = case_results[env_j['id']]
> +
> +                if 'average' not in res_j:
> +                    # Failed result
> +                    cell += ' --'
> +                    continue
> +
> +                col_j = chr(ord('A') + j)
> +                avg_j = res_j['average']
> +                delta = (res['average'] - avg_j) / avg_j * 100

I was wondering why you’d subtract, when percentage differences usually
mean a quotient. Then I realized that this would usually be written as:

(res['average'] / avg_j - 1) * 100

> +                delta_delta = (res['delta'] + res_j['delta']) / avg_j * 100

Why not use the new format_percent for both cases?

> +                cell += f' {col_j}{round(delta):+}±{round(delta_delta)}%'

I don’t know what I should think about ±delta_delta. If I saw “Compared
to run A, this is +42.1%±2.0%”, I would think that you calculated the
difference between each run result, and then based on that array
calculated average and standard deviation.

Furthermore, I don’t even know what the delta_delta is supposed to tell
you. It isn’t even a delta_delta, it’s an average_delta.

The delta_delta would be (res['delta'] / res_j['delta'] - 1) * 100.0.

And that might be presented perhaps like “+42.1% Δ± +2.0%” (if delta
were the SD, “Δx̅=+42.1% Δσ=+2.0%” would also work; although, again, I do
interpret ± as the SD anyway).

Max
25.09.2020 13:24, Max Reitz wrote:
> On 18.09.20 20:19, Vladimir Sementsov-Ogievskiy wrote:
>> Performance improvements / degradations are usually discussed in
>> percentage. Let's make the script calculate it for us.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  scripts/simplebench/simplebench.py | 46 +++++++++++++++++++++++++++---
>>  1 file changed, 42 insertions(+), 4 deletions(-)
>>
>> diff --git a/scripts/simplebench/simplebench.py b/scripts/simplebench/simplebench.py
>> index 56d3a91ea2..0ff05a38b8 100644
>> --- a/scripts/simplebench/simplebench.py
>> +++ b/scripts/simplebench/simplebench.py
>
> [...]
>
>> +            for j in range(0, i):
>> +                env_j = results['envs'][j]
>> +                res_j = case_results[env_j['id']]
>> +
>> +                if 'average' not in res_j:
>> +                    # Failed result
>> +                    cell += ' --'
>> +                    continue
>> +
>> +                col_j = chr(ord('A') + j)
>> +                avg_j = res_j['average']
>> +                delta = (res['average'] - avg_j) / avg_j * 100
>
> I was wondering why you’d subtract, when percentage differences usually
> mean a quotient. Then I realized that this would usually be written as:
>
> (res['average'] / avg_j - 1) * 100
>
>> +                delta_delta = (res['delta'] + res_j['delta']) / avg_j * 100
>
> Why not use the new format_percent for both cases?

Because I want less precision here.

>
>> +                cell += f' {col_j}{round(delta):+}±{round(delta_delta)}%'
>
> I don’t know what I should think about ±delta_delta. If I saw “Compared
> to run A, this is +42.1%±2.0%”, I would think that you calculated the
> difference between each run result, and then based on that array
> calculated average and standard deviation.
>
> Furthermore, I don’t even know what the delta_delta is supposed to tell
> you. It isn’t even a delta_delta, it’s an average_delta.

Not the average, but the sum of errors. And it shows the error for the delta.

>
> The delta_delta would be (res['delta'] / res_j['delta'] - 1) * 100.0.

And this shows nothing.

Assume we have A = 10+-2 and B = 15+-2.

The difference is (15-10)+-(2+2) = 5+-4.
And your formula will give (2/2 - 1) * 100 = 0, which is wrong.

Anyway, my code is a mess)

> And that might be presented perhaps like “+42.1% Δ± +2.0%” (if delta
> were the SD, “Δx̅=+42.1% Δσ=+2.0%” would also work; although, again, I do
> interpret ± as the SD anyway).
>

I feel that I'm bad at statistics :( I'll learn a little and make a new version.

-- 
Best regards,
Vladimir
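The two readings of "delta_delta" in this exchange can be compared on the numbers from the example (A = 10+-2, B = 15+-2). This is a minimal sketch, assuming `delta` is an absolute error bound on each average (the thread leaves its exact meaning open); the function names are hypothetical and not part of simplebench.py.

```python
def worst_case_diff(avg_a, delta_a, avg_b, delta_b):
    """Difference B - A with the two errors summed (the patch's reading)."""
    return avg_b - avg_a, delta_a + delta_b


def ratio_of_deltas_percent(delta_a, delta_b):
    """Relative change of the error itself (the reviewer's reading)."""
    return (delta_b / delta_a - 1) * 100.0


# Numbers taken from the example in the email above.
diff, err = worst_case_diff(10, 2, 15, 2)
print(f'B - A = {diff}+-{err}')
print(f'change in delta = {ratio_of_deltas_percent(2, 2):+.1f}%')
```

The first function answers "how uncertain is the difference of the averages", the second "did the spread itself change", which is why the two formulas give 4 and 0 for the same input.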
On 25.09.20 19:13, Vladimir Sementsov-Ogievskiy wrote:
> 25.09.2020 13:24, Max Reitz wrote:
>> On 18.09.20 20:19, Vladimir Sementsov-Ogievskiy wrote:
>>> Performance improvements / degradations are usually discussed in
>>> percentage. Let's make the script calculate it for us.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>  scripts/simplebench/simplebench.py | 46 +++++++++++++++++++++++++++---
>>>  1 file changed, 42 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/scripts/simplebench/simplebench.py
>>> b/scripts/simplebench/simplebench.py
>>> index 56d3a91ea2..0ff05a38b8 100644
>>> --- a/scripts/simplebench/simplebench.py
>>> +++ b/scripts/simplebench/simplebench.py
>>
>> [...]
>>
>>> +            for j in range(0, i):
>>> +                env_j = results['envs'][j]
>>> +                res_j = case_results[env_j['id']]
>>> +
>>> +                if 'average' not in res_j:
>>> +                    # Failed result
>>> +                    cell += ' --'
>>> +                    continue
>>> +
>>> +                col_j = chr(ord('A') + j)
>>> +                avg_j = res_j['average']
>>> +                delta = (res['average'] - avg_j) / avg_j * 100
>>
>> I was wondering why you’d subtract, when percentage differences usually
>> mean a quotient. Then I realized that this would usually be written as:
>>
>> (res['average'] / avg_j - 1) * 100
>>
>>> +                delta_delta = (res['delta'] + res_j['delta']) /
>>> avg_j * 100
>>
>> Why not use the new format_percent for both cases?
>
> Because I want less precision here.
>
>>
>>> +                cell += f'
>>> {col_j}{round(delta):+}±{round(delta_delta)}%'
>>
>> I don’t know what I should think about ±delta_delta. If I saw “Compared
>> to run A, this is +42.1%±2.0%”, I would think that you calculated the
>> difference between each run result, and then based on that array
>> calculated average and standard deviation.
>>
>> Furthermore, I don’t even know what the delta_delta is supposed to tell
>> you. It isn’t even a delta_delta, it’s an average_delta.
>
> Not the average, but the sum of errors. And it shows the error for the delta.
>
>>
>> The delta_delta would be (res['delta'] / res_j['delta'] - 1) * 100.0.
>
> And this shows nothing.
>
> Assume we have A = 10+-2 and B = 15+-2.
>
> The difference is (15-10)+-(2+2) = 5+-4.
> And your formula will give (2/2 - 1) * 100 = 0, which is wrong.

Well, it’s the difference in delta (whatever “delta” means here). I
wouldn’t call it wrong. We want to compare two test runs, so if both
have the same delta, then the difference in delta is 0. That’s how I
understood it, hence my “Δ±” notation below. (This may be useful
information, because perhaps one may consider a big delta bad, and so if
one run has less delta than another one, that may be considered a better
outcome. Comparing deltas has a purpose.)

I see I understood your intentions wrong, though; you want to just give
an error estimate for the difference of the means of both runs. I have
to admit I don’t know how that works exactly, and it will probably
heavily depend on what “delta” is.

(Googling suggests that for the standard deviation, one would square
each SD to get the variance back, then divide by the respective sample
size, add, and take the square root. But that’s for when you have two
distributions that you want to combine, but we want to compare here...
http://homework.uoregon.edu/pub/class/es202/ztest.html seems to suggest
the same for such a comparison, though. I don’t know.)

(As for your current version, after more thinking it does seem right
when delta is the maximum deviation. Or perhaps the deltas shouldn’t be
added then but the maximum should be used? I’m just not sure.)

((Perhaps it doesn’t even matter. “Don’t believe any statistics you
haven’t forged yourself”, and so on.))

Max
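The standard-deviation recipe described in the message above (square each SD, divide by the sample size, add, take the square root) is the usual standard error of a difference of two independent sample means. A minimal sketch, assuming the raw per-run timings are available; the patch itself only uses 'average' and 'delta', so `diff_of_means` and the sample timings below are made up for illustration.

```python
import math
import statistics


def diff_of_means(sample_a, sample_b):
    """Return (mean_b - mean_a, standard error of that difference).

    Assumes the two samples are independent of each other.
    """
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance() is the sample variance (n - 1 denominator),
    # i.e. the square of the sample standard deviation.
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    std_err = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return mean_b - mean_a, std_err


# Hypothetical timings for two environments, three runs each.
run_a = [10.0, 12.0, 8.0]
run_b = [15.0, 17.0, 13.0]
diff, std_err = diff_of_means(run_a, run_b)
print(f'B - A = {diff:.1f} (standard error {std_err:.2f})')
```

Unlike summing maximum deviations, this estimate shrinks as the number of runs grows, so the two approaches are not interchangeable.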
diff --git a/scripts/simplebench/simplebench.py b/scripts/simplebench/simplebench.py
index 56d3a91ea2..0ff05a38b8 100644
--- a/scripts/simplebench/simplebench.py
+++ b/scripts/simplebench/simplebench.py
@@ -153,14 +153,22 @@ def bench(test_func, test_envs, test_cases, *args, **vargs):
 
 def ascii(results):
     """Return ASCII representation of bench() returned dict."""
-    from tabulate import tabulate
+    import tabulate
+
+    # We want leading whitespace for difference row cells (see below)
+    tabulate.PRESERVE_WHITESPACE = True
 
     dim = None
-    tab = [[""] + [c['id'] for c in results['envs']]]
+    tab = [
+        # Environment columns are named A, B, ...
+        [""] + [chr(ord('A') + i) for i in range(len(results['envs']))],
+        [""] + [c['id'] for c in results['envs']]
+    ]
     for case in results['cases']:
         row = [case['id']]
+        case_results = results['tab'][case['id']]
         for env in results['envs']:
-            res = results['tab'][case['id']][env['id']]
+            res = case_results[env['id']]
             if dim is None:
                 dim = res['dimension']
             else:
@@ -168,4 +176,34 @@ def ascii(results):
             row.append(ascii_one(res))
         tab.append(row)
 
-    return f'All results are in {dim}\n\n' + tabulate(tab)
+        # Add row of difference between column. For each column starting from
+        # B we calculate difference with all previous columns.
+        row = ['', '']  # case name and first column
+        for i in range(1, len(results['envs'])):
+            cell = ''
+            env = results['envs'][i]
+            res = case_results[env['id']]
+
+            if 'average' not in res:
+                # Failed result
+                row.append(cell)
+                continue
+
+            for j in range(0, i):
+                env_j = results['envs'][j]
+                res_j = case_results[env_j['id']]
+
+                if 'average' not in res_j:
+                    # Failed result
+                    cell += ' --'
+                    continue
+
+                col_j = chr(ord('A') + j)
+                avg_j = res_j['average']
+                delta = (res['average'] - avg_j) / avg_j * 100
+                delta_delta = (res['delta'] + res_j['delta']) / avg_j * 100
+                cell += f' {col_j}{round(delta):+}±{round(delta_delta)}%'
+            row.append(cell)
+        tab.append(row)
+
+    return f'All results are in {dim}\n\n' + tabulate.tabulate(tab)
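To see what a difference cell looks like without running a benchmark, the inner loop of the patch can be pulled out into a standalone function. This is only a sketch: `diff_cell` is a hypothetical name that does not exist in simplebench.py, and the input dicts are made up, keeping just the 'average' and 'delta' keys the patch reads.

```python
def diff_cell(res, previous):
    """Format the difference of one result against all earlier columns.

    `res` and each entry of `previous` are dicts with 'average' and
    'delta' keys; a dict without 'average' stands for a failed run.
    """
    cell = ''
    for j, res_j in enumerate(previous):
        if 'average' not in res_j:
            # Failed result in the column we would compare against
            cell += ' --'
            continue
        col_j = chr(ord('A') + j)  # columns are named A, B, ...
        avg_j = res_j['average']
        delta = (res['average'] - avg_j) / avg_j * 100
        delta_delta = (res['delta'] + res_j['delta']) / avg_j * 100
        cell += f' {col_j}{round(delta):+}±{round(delta_delta)}%'
    return cell


# Column B (15±2) compared against column A (10±2).
print(diff_cell({'average': 15, 'delta': 2}, [{'average': 10, 'delta': 2}]))
```

With these inputs the relative difference is 50% and the summed errors amount to 40% of column A's average, which shows how wide the reported ± range can get for noisy runs.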
Performance improvements / degradations are usually discussed in
percentage. Let's make the script calculate it for us.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 scripts/simplebench/simplebench.py | 46 +++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 4 deletions(-)