Message ID | 1450446761-5209-1-git-send-email-james.greenhalgh@arm.com |
---|---|
State | New |
Headers | show |
On Fri, Dec 18, 2015 at 05:09:14PM +0300, Yuri Rumyantsev wrote: > James, > > We implemented slightly different patch - we restrict number of SET > instructions for if-conversion through new parameter and add check in > bb_ok_for_noce_convert_multiple_sets: > > + unsigned limit = MIN (ii->branch_cost, > + (unsigned) PARAM_VALUE (PARAM_MAX_IF_CONV_SET_INSNS)); > .. > - if (count > ii->branch_cost) > - return FALSE; > + if (count > limit) > + return false; > > Now we are testing it for different suites/chips but I saw that real > benchmark for which we saw huge performance degradation gets speed-ip > on 65% for -march=slm -m32. Yes, that will work too. I put it where I did to give back-ends the choice to turn off all if-conversion by setting the param to zero. Your fix is more targeted to fixing just the one regression. I don't have a preference as to which direction we take this. I saw a similar performance boost for your testcase on my development box with my patch - so at least we now have two candidate patches to fix the performance regression. Thanks, James > > 2015-12-18 16:52 GMT+03:00 James Greenhalgh <james.greenhalgh@arm.com>: > > > > Hi, > > > > PR68920 talks about undesirable if-conversion in the x86 back-end. > > The if-conversion cost model simply uses BRANCH_COST (I want to revisit > > this for GCC 7), but BRANCH_COST is overloaded in the compiler and reducing > > it causes other optimisations to be disabled. > > > > Consequently, to fix the PR we want something new that the target can set > > to override BRANCH_COST and reduce the number of instructions that get > > if-converted. This patch adds this mechanism through a param. > > > > Bootstrapped on x86_64-none-linux-gnu and aarch64-none-linux-gnu. > > > > OK for trunk? > > > > Thanks, > > James > > > > --- > > gcc/ > > > > 2015-12-17 James Greenhalgh <james.greenhalgh@arm.com> > > > > PR rtl-optimization/68920 > > * params.def (PARAM_MAX_RTL_IF_CONVERSION_INSNS): New. > > * ifcvt.c (noce_find_if_block): Limit by new param. > > * doc/invoke.texi (max-rtl-if-conversion-insns): Document it. > > > > gcc/testsuite/ > > > > 2015-12-17 James Greenhalgh <james.greenhalgh@arm.com> > > > > PR rtl-optimization/68920 > > * gcc.deg/ifcvt-5.c: New. > > >
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 9b3e2fe..1f9b0fe 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -10341,6 +10341,14 @@ In each case, the @var{value} is an integer. The allowable choices for When branch is predicted to be taken with probability lower than this threshold (in percent), then it is considered well predictable. The default is 10. +@item max-rtl-if-conversion-insns +RTL if-conversion tries to remove conditional branches around a block and +replace them with conditionally executed instructions. This parameter +gives the maximum number of instructions in a block which should be +considered for if-conversion. The default is 10, though the compiler will +also use other heuristics to decide whether if-conversion is likely to be +profitable. + @item max-crossjump-edges The maximum number of incoming edges to consider for cross-jumping. The algorithm used by @option{-fcrossjumping} is @math{O(N^2)} in diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 6164232..2674baa 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -32,6 +32,7 @@ #include "regs.h" #include "emit-rtl.h" #include "recog.h" +#include "params.h" #include "cfgrtl.h" #include "cfganal.h" @@ -3944,6 +3945,9 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge, if_info.then_else_reversed = then_else_reversed; if_info.branch_cost = BRANCH_COST (optimize_bb_for_speed_p (test_bb), predictable_edge_p (then_edge)); + if_info.branch_cost + = MIN (if_info.branch_cost, + (unsigned int) PARAM_VALUE (PARAM_MAX_RTL_IF_CONVERSION_INSNS)); /* Do the real work. */ diff --git a/gcc/params.def b/gcc/params.def index 41fd8a8..a0d9787 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -49,6 +49,15 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCOME, "Maximal estimated outcome of branch considered predictable.", 2, 0, 50) +/* The maximum number of insns in a basic block to consider for + RTL if-conversion. A target might use BRANCH_COST to further limit the + number of insns considered. */ +DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_INSNS, + "max-rtl-if-conversion-insns", + "Maximum number of insns in a basic block to consider for RTL " + "if-conversion.", + 10, 0, 10) + DEFPARAM (PARAM_INLINE_MIN_SPEEDUP, "inline-min-speedup", "The minimal estimated speedup allowing inliner to ignore inline-insns-single and inline-isnsns-auto.", diff --git a/gcc/testsuite/gcc.dg/ifcvt-5.c b/gcc/testsuite/gcc.dg/ifcvt-5.c new file mode 100644 index 0000000..fc6a2ca --- /dev/null +++ b/gcc/testsuite/gcc.dg/ifcvt-5.c @@ -0,0 +1,19 @@ +/* Check that multi-insn if-conversion is not done if the override + parameter would not allow it. */ + +/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=1" } */ +int +foo (int x, int y, int a) +{ + int i = x; + int j = y; + /* Try to make taking the branch likely. */ + __builtin_expect (x > y, 1); + if (x > y) + { + i = a; + j = i; + } + return i * j; +} +/* { dg-final { scan-rtl-dump "0 true changes made" "ce1" } } */