Message ID | 5836F1D2.8090407@foss.arm.com |
---|---|
State | New |
Headers | show |
On Thu, Nov 24, 2016 at 2:57 PM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > Hi all, > > In this bug we have TER during out-of-ssa misbehaving. A minimal example is: > void foo(unsigned a, float b) > { > unsigned c = (unsigned) b; // 1 > register unsigned r0 __asm__("r0") = a; // 2 > register unsigned r1 __asm__("r1") = c; // 3 > __asm__ volatile( "str %[r1], [%[r0]]\n" > : > : [r0] "r" (r0), > [r1] "r" (r1)); > } > > Statement 1 can produce a libcall to convert float b into an int and TER > moves it > into statement 3. But the libcall clobbers r0, which we want set to 'a' in > statement 2. > So r0 gets clobbered by the argument to the conversion libcall. > > TER already has code to avoid substituting across function calls and ideally > we'd teach it > to not substitute expressions that can perform a libcall across register > variable definitions > where the register can be clobbered in a libcall, but that information is > not easy to get hold > off in a general way at this level. > > So this patch disallows replacement across any local register variable > definition. It does this > by keeping track of local register definitions encountered in a similar way > to which calls are > counted for TER purposes. > > I hope this is not too big a hammer as local register variables are not very > widely used and we > only disable replacement across them and it allows us to fix the wrong-code > bug on some > inline-asm usage scenarios where gcc currently miscompiles code following > its documented > advice [1] > > Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu, > x86_64. > > Is this approach acceptable? Ok. Thanks, Richard. > Thanks, > Kyrill > > [1] > https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables > > 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/48863 > PR inline-asm/70184 > * tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field. > (new_temp_expr_table): Initialise reg_vars_cnt. > (free_temp_expr_table): Release reg_vars_cnt. > (process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt > field of TAB. > (find_replaceable_in_bb): Use the above to record register variable > write occurrences and cancel replacement across them. > > 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/48863 > PR inline-asm/70184 > * gcc.target/arm/pr48863.c: New test.
On 24/11/16 15:12, Richard Biener wrote: > On Thu, Nov 24, 2016 at 2:57 PM, Kyrill Tkachov > <kyrylo.tkachov@foss.arm.com> wrote: >> Hi all, >> >> In this bug we have TER during out-of-ssa misbehaving. A minimal example is: >> void foo(unsigned a, float b) >> { >> unsigned c = (unsigned) b; // 1 >> register unsigned r0 __asm__("r0") = a; // 2 >> register unsigned r1 __asm__("r1") = c; // 3 >> __asm__ volatile( "str %[r1], [%[r0]]\n" >> : >> : [r0] "r" (r0), >> [r1] "r" (r1)); >> } >> >> Statement 1 can produce a libcall to convert float b into an int and TER >> moves it >> into statement 3. But the libcall clobbers r0, which we want set to 'a' in >> statement 2. >> So r0 gets clobbered by the argument to the conversion libcall. >> >> TER already has code to avoid substituting across function calls and ideally >> we'd teach it >> to not substitute expressions that can perform a libcall across register >> variable definitions >> where the register can be clobbered in a libcall, but that information is >> not easy to get hold >> off in a general way at this level. >> >> So this patch disallows replacement across any local register variable >> definition. It does this >> by keeping track of local register definitions encountered in a similar way >> to which calls are >> counted for TER purposes. >> >> I hope this is not too big a hammer as local register variables are not very >> widely used and we >> only disable replacement across them and it allows us to fix the wrong-code >> bug on some >> inline-asm usage scenarios where gcc currently miscompiles code following >> its documented >> advice [1] >> >> Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu, >> x86_64. >> >> Is this approach acceptable? > Ok. Thanks. This has been in trunk for a week without any problems. Is it ok to backport to the branches? I have bootstrapped and tested arm-none-linux-gnueabihf on them. Kyrill > Thanks, > Richard. > >> Thanks, >> Kyrill >> >> [1] >> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables >> >> 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/48863 >> PR inline-asm/70184 >> * tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field. >> (new_temp_expr_table): Initialise reg_vars_cnt. >> (free_temp_expr_table): Release reg_vars_cnt. >> (process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt >> field of TAB. >> (find_replaceable_in_bb): Use the above to record register variable >> write occurrences and cancel replacement across them. >> >> 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/48863 >> PR inline-asm/70184 >> * gcc.target/arm/pr48863.c: New test.
On Thu, Dec 1, 2016 at 12:14 PM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > On 24/11/16 15:12, Richard Biener wrote: >> >> On Thu, Nov 24, 2016 at 2:57 PM, Kyrill Tkachov >> <kyrylo.tkachov@foss.arm.com> wrote: >>> >>> Hi all, >>> >>> In this bug we have TER during out-of-ssa misbehaving. A minimal example >>> is: >>> void foo(unsigned a, float b) >>> { >>> unsigned c = (unsigned) b; // 1 >>> register unsigned r0 __asm__("r0") = a; // 2 >>> register unsigned r1 __asm__("r1") = c; // 3 >>> __asm__ volatile( "str %[r1], [%[r0]]\n" >>> : >>> : [r0] "r" (r0), >>> [r1] "r" (r1)); >>> } >>> >>> Statement 1 can produce a libcall to convert float b into an int and TER >>> moves it >>> into statement 3. But the libcall clobbers r0, which we want set to 'a' >>> in >>> statement 2. >>> So r0 gets clobbered by the argument to the conversion libcall. >>> >>> TER already has code to avoid substituting across function calls and >>> ideally >>> we'd teach it >>> to not substitute expressions that can perform a libcall across register >>> variable definitions >>> where the register can be clobbered in a libcall, but that information is >>> not easy to get hold >>> off in a general way at this level. >>> >>> So this patch disallows replacement across any local register variable >>> definition. It does this >>> by keeping track of local register definitions encountered in a similar >>> way >>> to which calls are >>> counted for TER purposes. >>> >>> I hope this is not too big a hammer as local register variables are not >>> very >>> widely used and we >>> only disable replacement across them and it allows us to fix the >>> wrong-code >>> bug on some >>> inline-asm usage scenarios where gcc currently miscompiles code following >>> its documented >>> advice [1] >>> >>> Bootstrapped and tested on arm-none-linux-gnueabihf, >>> aarch64-none-linux-gnu, >>> x86_64. >>> >>> Is this approach acceptable? >> >> Ok. > > > Thanks. > This has been in trunk for a week without any problems. > Is it ok to backport to the branches? > I have bootstrapped and tested arm-none-linux-gnueabihf on them. I think so. Richard. > Kyrill > > >> Thanks, >> Richard. >> >>> Thanks, >>> Kyrill >>> >>> [1] >>> >>> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables >>> >>> 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>> >>> PR target/48863 >>> PR inline-asm/70184 >>> * tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field. >>> (new_temp_expr_table): Initialise reg_vars_cnt. >>> (free_temp_expr_table): Release reg_vars_cnt. >>> (process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt >>> field of TAB. >>> (find_replaceable_in_bb): Use the above to record register variable >>> write occurrences and cancel replacement across them. >>> >>> 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>> >>> PR target/48863 >>> PR inline-asm/70184 >>> * gcc.target/arm/pr48863.c: New test. > >
commit c53373216f7954e89893c1c7067e053fe6b819e8 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Wed Nov 23 16:44:47 2016 +0000 [TER] PR 48863: Don't replace expressions across local register variable definitions diff --git a/gcc/testsuite/gcc.target/arm/pr48863.c b/gcc/testsuite/gcc.target/arm/pr48863.c new file mode 100644 index 0000000..33bc7a4 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr48863.c @@ -0,0 +1,35 @@ +/* PR target/48863. */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +/* Check that Temporary Expression Replacement does not move a + libcall-producing expression across a statement initialising a local + register variable. */ + +static inline int +dosvc (int fd, unsigned long high, unsigned low) +{ + register int r0 asm("r0") = fd; + register int r2 asm("r2") = high; + register int r3 asm("r3") = low; + + asm volatile("" : "=r"(r0) : "0"(r0), "r"(r2), "r"(r3)); + return r0; +} + +struct s +{ + int fd; + long long length; +} s = { 2, 0 }, *p = &s; + +int +main (void) +{ + unsigned low = p->length & 0xffffffff; + unsigned high = p->length / 23; + + if (dosvc (p->fd, high, low) != 2) + __builtin_abort (); + return 0; +} diff --git a/gcc/tree-ssa-ter.c b/gcc/tree-ssa-ter.c index c7d8b7e..af5d91c 100644 --- a/gcc/tree-ssa-ter.c +++ b/gcc/tree-ssa-ter.c @@ -169,6 +169,8 @@ struct temp_expr_table bitmap new_replaceable_dependencies; /* Holding place for pending dep's. */ int *num_in_part; /* # of ssa_names in a partition. */ int *call_cnt; /* Call count at definition. */ + int *reg_vars_cnt; /* Number of register variable + definitions encountered. */ }; /* Used to indicate a dependency on VDEFs. */ @@ -211,6 +213,7 @@ new_temp_expr_table (var_map map) t->num_in_part[p]++; } t->call_cnt = XCNEWVEC (int, num_ssa_names + 1); + t->reg_vars_cnt = XCNEWVEC (int, num_ssa_names + 1); return t; } @@ -243,6 +246,7 @@ free_temp_expr_table (temp_expr_table *t) free (t->partition_dependencies); free (t->num_in_part); free (t->call_cnt); + free (t->reg_vars_cnt); if (t->replaceable_expressions) ret = t->replaceable_expressions; @@ -435,7 +439,8 @@ ter_is_replaceable_p (gimple *stmt) /* Create an expression entry for a replaceable expression. */ static void -process_replaceable (temp_expr_table *tab, gimple *stmt, int call_cnt) +process_replaceable (temp_expr_table *tab, gimple *stmt, int call_cnt, + int reg_vars_cnt) { tree var, def, basevar; int version; @@ -477,6 +482,7 @@ process_replaceable (temp_expr_table *tab, gimple *stmt, int call_cnt) } tab->call_cnt[version] = call_cnt; + tab->reg_vars_cnt[version] = reg_vars_cnt; } @@ -573,6 +579,7 @@ find_replaceable_in_bb (temp_expr_table *tab, basic_block bb) ssa_op_iter iter; bool stmt_replaceable; int cur_call_cnt = 0; + int cur_reg_vars_cnt = 0; for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) { @@ -649,11 +656,14 @@ find_replaceable_in_bb (temp_expr_table *tab, basic_block bb) /* Mark expression as replaceable unless stmt is volatile, or the def variable has the same root variable as something in the substitution list, or the def and use span a call such that - we'll expand lifetimes across a call. */ + we'll expand lifetimes across a call. We also don't want to + replace across these expressions that may call libcalls that + clobber the register involved. See PR 70184. */ if (gimple_has_volatile_ops (stmt) || same_root_var || (tab->call_cnt[ver] != cur_call_cnt && SINGLE_SSA_USE_OPERAND (SSA_NAME_DEF_STMT (use), SSA_OP_USE) - == NULL_USE_OPERAND_P)) + == NULL_USE_OPERAND_P) + || tab->reg_vars_cnt[ver] != cur_reg_vars_cnt) finished_with_expr (tab, ver, true); else mark_replaceable (tab, use, stmt_replaceable); @@ -676,9 +686,16 @@ find_replaceable_in_bb (temp_expr_table *tab, basic_block bb) && DECL_BUILT_IN (fndecl))) cur_call_cnt++; + /* Increment counter if this statement sets a local + register variable. */ + if (gimple_assign_single_p (stmt) + && (TREE_CODE (gimple_assign_lhs (stmt)) == VAR_DECL + && DECL_HARD_REGISTER (gimple_assign_lhs (stmt)))) + cur_reg_vars_cnt++; + /* Now see if we are creating a new expression or not. */ if (stmt_replaceable) - process_replaceable (tab, stmt, cur_call_cnt); + process_replaceable (tab, stmt, cur_call_cnt, cur_reg_vars_cnt); /* Free any unused dependency lists. */ bitmap_clear (tab->new_replaceable_dependencies);