Message ID | 586c7e2d-5305-7bd2-8d57-85693feffb16@arm.com |
---|---|
State | Superseded |
Headers | show |
Series | [IRA] Revert 11b8091fb to fix Bug 93221 | expand |
On 1/21/20 12:20 PM, Joel Hutton wrote: > Hi all, > > A previous change to simplify LRA introduced in 11b809 (From-SVN: > r279550) disabled hard register splitting for -O0. This causes a problem > on aarch64 in cases where parameters are passed in multiple registers > (in the bug report an OI passed in 2 V4SI registers). This is mandated > by the AAPCS. I see. I suspected that it can cause the problem but I thought the probability is very small for this. I was wrong. We should revert it at least for now. Correct work of GCC is more important than saving cycles in -O0 mode. > Vlad, Eric, do you have a preferred alternate solution to reverting the > patch? I am in favour of reverting the patch now. But may be Eric can provide another version of the patch not causing the arm problem. I am ready to reconsider this too. So I guess the decision is upto Eric. > Previously discussed here: > https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01414.html > Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221 > > Bootstrapped and regression tested on aarch64 > > Changelog: > > 2020-01-21 Joel Hutton <Joel.Hutton@arm.com\> > > * ira.c (ira): Revert use of simplified LRA algorithm. > > gcc/testsuite/ChangeLog: > > 2020-01-21 Joel Hutton <Joel.Hutton@arm.com\> > > PR bug/93221 > * gcc.target/aarch64/pr93221.c: New test. >
On Tue, Jan 21, 2020 at 05:20:51PM +0000, Joel Hutton wrote: > 2020-01-21 Joel Hutton <Joel.Hutton@arm.com\> > > * ira.c (ira): Revert use of simplified LRA algorithm. > > gcc/testsuite/ChangeLog: > > 2020-01-21 Joel Hutton <Joel.Hutton@arm.com\> > > PR bug/93221 > * gcc.target/aarch64/pr93221.c: New test. Not a review, just nitpicking. Please avoid the backslash before >. The PR number should go to both gcc/ and gcc/testsuite/ ChangeLog entry and we don't have bug/ category; the bug is target category, so it should be PR target/93221, or could be reclassified first in bugzilla e.g. to middle-end and then written as PR middle-end/93221. > @@ -0,0 +1,10 @@ > +/* PR bug/93221 */ Likewise here. > +/* { dg-do compile } */ > +/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */ > + > +struct S { __Int32x4_t b[2]; }; > + > +void __attribute__((optimize (0))) Not sure if I understand why do you need optimize (0) attribute when the whole test is compiled with -O0. Doesn't it ICE without the attribute too? > +foo (struct S x) > +{ > +} Jakub
On 21/01/2020 19:26, Jakub Jelinek wrote: > Not a review, just nitpicking. Please avoid the backslash before >. > The PR number should go to both gcc/ and gcc/testsuite/ ChangeLog > entry and we don't have bug/ category; the bug is target category, > so it should be PR target/93221, or could be reclassified first in > bugzilla e.g. to middle-end and then written as PR middle-end/93221. > Done >> @@ -0,0 +1,10 @@ >> +/* PR bug/93221 */ > Likewise here. Done > Not sure if I understand why do you need optimize (0) attribute when the > whole test is compiled with -O0. Doesn't it ICE without the attribute > too? Done. It's not really necessary, belt and braces. Updated patch attached From 1a2980ef6eeb76dbf0556f806a85a4f49ad3ebdd Mon Sep 17 00:00:00 2001 From: Joel Hutton <Joel.Hutton@arm.com> Date: Tue, 21 Jan 2020 09:37:48 +0000 Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb 11b8091fb introduced a simplified LRA algorithm for -O0 that turned off hard register splitting, this causes a problem for parameters passed in multiple registers on aarch64. This fixes bug 93221. --- gcc/ira.c | 38 +++++++++------------- gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++++++ 2 files changed, 25 insertions(+), 23 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c diff --git a/gcc/ira.c b/gcc/ira.c index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -5205,35 +5205,27 @@ ira (FILE *f) /* Perform target specific PIC register initialization. */ targetm.init_pic_reg (); - if (optimize) - { - ira_conflicts_p = true; - - /* Determine the number of pseudos actually requiring coloring. */ - unsigned int num_used_regs = 0; - for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++) - if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i)) - num_used_regs++; - - /* If there are too many pseudos and/or basic blocks (e.g. 10K - pseudos and 10K blocks or 100K pseudos and 1K blocks), we will - use simplified and faster algorithms in LRA. */ - lra_simple_p - = ira_use_lra_p - && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun); - } - else - { - ira_conflicts_p = false; - lra_simple_p = ira_use_lra_p; - } + ira_conflicts_p = optimize > 0; + + /* Determine the number of pseudos actually requiring coloring. */ + unsigned int num_used_regs = 0; + for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++) + if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i)) + num_used_regs++; + + /* If there are too many pseudos and/or basic blocks (e.g. 10K + pseudos and 10K blocks or 100K pseudos and 1K blocks), we will + use simplified and faster algorithms in LRA. */ + lra_simple_p + = ira_use_lra_p + && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun); if (lra_simple_p) { /* It permits to skip live range splitting in LRA. */ flag_caller_saves = false; /* There is no sense to do regional allocation when we use - simplified LRA. */ + simplified LRA. */ flag_ira_region = IRA_REGION_ONE; ira_conflicts_p = false; } diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c new file mode 100644 index 0000000000000000000000000000000000000000..4dc2c3d0149423dd3d666f7428277ffa9eb765c4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c @@ -0,0 +1,10 @@ +/* PR target/93221 */ +/* { dg-do compile } */ +/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */ + +struct S { __Int32x4_t b[2]; }; + +void +foo (struct S x) +{ +} -- 2.17.1
Updated changelog: Changelog: 2020-01-21 Joel Hutton <Joel.Hutton@arm.com><mailto:Joel.Hutton@arm.com> PR target/93221 * ira.c (ira): Revert use of simplified LRA algorithm. gcc/testsuite/ChangeLog: 2020-01-21 Joel Hutton <Joel.Hutton@arm.com><mailto:Joel.Hutton@arm.com> PR target/93221 * gcc.target/aarch64/pr93221.c: New test.
Changelog was mangled by thunderbird, updated changelog: Changelog: 2020-01-21 Joel Hutton <Joel.Hutton@arm.com> PR target/93221 * ira.c (ira): Revert use of simplified LRA algorithm. gcc/testsuite/ChangeLog: 2020-01-21 Joel Hutton <Joel.Hutton@arm.com> PR target/93221 * gcc.target/aarch64/pr93221.c: New test.
From 0d9980d2327c61eb99d041a217d6ea5c5b34c754 Mon Sep 17 00:00:00 2001 From: Joel Hutton <Joel.Hutton@arm.com> Date: Tue, 21 Jan 2020 09:37:48 +0000 Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb 11b8091fb introduced a simplified LRA algorithm for -O0 that turned off hard register splitting, this causes a problem for parameters passed in multiple registers on aarch64. This fixes bug 93221. --- gcc/ira.c | 38 +++++++++------------- gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++++++ 2 files changed, 25 insertions(+), 23 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c diff --git a/gcc/ira.c b/gcc/ira.c index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -5205,35 +5205,27 @@ ira (FILE *f) /* Perform target specific PIC register initialization. */ targetm.init_pic_reg (); - if (optimize) - { - ira_conflicts_p = true; - - /* Determine the number of pseudos actually requiring coloring. */ - unsigned int num_used_regs = 0; - for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++) - if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i)) - num_used_regs++; - - /* If there are too many pseudos and/or basic blocks (e.g. 10K - pseudos and 10K blocks or 100K pseudos and 1K blocks), we will - use simplified and faster algorithms in LRA. */ - lra_simple_p - = ira_use_lra_p - && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun); - } - else - { - ira_conflicts_p = false; - lra_simple_p = ira_use_lra_p; - } + ira_conflicts_p = optimize > 0; + + /* Determine the number of pseudos actually requiring coloring. */ + unsigned int num_used_regs = 0; + for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++) + if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i)) + num_used_regs++; + + /* If there are too many pseudos and/or basic blocks (e.g. 10K + pseudos and 10K blocks or 100K pseudos and 1K blocks), we will + use simplified and faster algorithms in LRA. */ + lra_simple_p + = ira_use_lra_p + && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun); if (lra_simple_p) { /* It permits to skip live range splitting in LRA. */ flag_caller_saves = false; /* There is no sense to do regional allocation when we use - simplified LRA. */ + simplified LRA. */ flag_ira_region = IRA_REGION_ONE; ira_conflicts_p = false; } diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c new file mode 100644 index 0000000000000000000000000000000000000000..517135a889de8a7e379c79222f8a8b2efcc7b422 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c @@ -0,0 +1,10 @@ +/* PR bug/93221 */ +/* { dg-do compile } */ +/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */ + +struct S { __Int32x4_t b[2]; }; + +void __attribute__((optimize (0))) +foo (struct S x) +{ +} -- 2.17.1