From patchwork Wed Oct 28 11:52:56 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrylo Tkachov X-Patchwork-Id: 55666 Delivered-To: patch@linaro.org Received: by 10.112.61.134 with SMTP id p6csp69955lbr; Wed, 28 Oct 2015 04:53:18 -0700 (PDT) X-Received: by 10.50.32.33 with SMTP id f1mr2334220igi.33.1446033198155; Wed, 28 Oct 2015 04:53:18 -0700 (PDT) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id f5si13418185ioj.40.2015.10.28.04.53.17 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 28 Oct 2015 04:53:18 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-411801-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-return-411801-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-411801-patch=linaro.org@gcc.gnu.org; dkim=pass header.i=@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; q=dns; s=default; b=exTLQRlSE854sxqiitGpL+VF0m8NEdVjBW4RO/2O9Ii UAdDKxUtf9a4drdM7ibpKFzgntO3ICH+0hd9irTgkDwg10kVthoHpnkwX5/TwBnr bjVzyIWzUt78N//qulIUhkV8FBXsQGkftPpcaa8IPbi2g9vWwl9nzWZ/B3zxOMQQ = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; s=default; bh=eg25YycCos6/8c8Jvs801llY0ls=; b=Or2Dp9jBHBSVh91jn IitnV7GCCvw2UbfZ9Oh0hioQsbXZ519F9Hg/dsfGApwhOKOQdF0tsXu8rag7a2Kz Rd09mXxI5tRekj1YfS1Q3PAfJ3NZsP4oY/kKiJ7OLUIaLKNSNa0AlQjz/845D1jE z39KmbUnVkNyLawfjPDJB/ta5I= Received: (qmail 80587 invoked by alias); 28 Oct 2015 11:53:05 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 80546 invoked by uid 89); 28 Oct 2015 11:53:04 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.8 required=5.0 tests=AWL, BAYES_20, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 28 Oct 2015 11:53:02 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-38-101Dtlf0T7-dyZ_QXjlNXA-1; Wed, 28 Oct 2015 11:52:56 +0000 Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 28 Oct 2015 11:52:56 +0000 Message-ID: <5630B718.3060204@arm.com> Date: Wed, 28 Oct 2015 11:52:56 +0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: GCC Patches CC: Richard Biener , Jeff Law Subject: [PATCH][RFC][tree-ifcombine] Making tree-ifcombine more friendly for conditional comparisons generation X-MC-Unique: 101Dtlf0T7-dyZ_QXjlNXA-1 X-IsSubscribed: yes Hi all, This is a follow up to the thread at https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01646.html I'm trying to increase the utilisation of conditional compare instructions on aarch64 and I've identified tree ifcombine as a place that improves the opportunities for that. The transformations in the ifcombine_ifandif function combine basic blocks with an AND or an IOR when possible, which is the form that the expander code in ccmp.c looks for when generating conditional compares. To recap from the referenced thread, currently ifcombine will only merge such blocks if the inner block contains only a single statement i.e. the condition. I'd like to make it more aggressive. This patch allows merging of inner blocks with multiple statements, as long as those statements are comparisons or AND/IOR operations. We don't allow any number of such statements though. An additional gating is introduced via the PARAM_IFCOMBINE_THRESHOLD param which is used to gate the estimate_num_insns_seq cost of the inner basic block, as Richard suggested. Using this param, targets that don't utilise the ccmp.c machinery can make sure that the codegen effect from this patch is minimal. The default value for the param is such that it has minimal codegen difference on targets other than aarch64 (for which I set it to a higher value) I tried allowing any kind of statements in the inner block, not just the types mentioned above and using just the param to gate this, but this didn't show good results on aarch64. Pulling in the extra code into the unconditional path was not beneficial on the whole. With this approach we can avoid explicitly checking for target conditional compare support in tree-ssa-ifcombine.c. Also, with this patch I don't see the gcc.dg/pr66299-2.c regression that I complained about in the earlier thread. On aarch64 with this patch I see about 4.5% more ccmp instructions being generated on SPEC2006. SPECINT improves by about 0.4% on a Cortex-A57 platform and SPECFP shows no regressions. On x86_64 there was no codegen difference for SPEC2006. What do people think of this approach? A concern is that the new param is essentially a new magic number replacing an implicit magic number of '1'. It feels that how aggressively we do this for ccmp generation ultimately should depend on BRANCH_COST somehow, but BRANCH_COST is already overloaded and meaningless as discussed in other threads Thanks, Kyrill 2015-10-28 Kyrylo Tkachov * params.def (PARAM_IFCOMBINE_THRESHOLD): New param. * tree-ssa-ifcombine.c: Include tree-inline.h and params.h. (gimple_seq_contains_logical_ops_p): New function. (ifcombine_ifandif): Use it. * config/aarch64/aarch64.c: Include params.h (aarch64_override_options): Set PARAM_IFCOMBINE_THRESHOLD. 2015-10-28 Kyrylo Tkachov * gcc.target/aarch64/ifcombine_cond_cmp_1.c: New test. commit 0a6d01f7a3c0a59e5119b3ec85528b89c1129efb Author: Kyrylo Tkachov Date: Wed Sep 23 15:39:32 2015 +0100 Add tree ifcombine cost gating diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 205f12d..9d59427 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -72,6 +72,7 @@ #include "aarch64-cost-tables.h" #include "dumpfile.h" #include "builtins.h" +#include "params.h" #include "rtl-iter.h" #include "tm-constrs.h" #include "sched-int.h" @@ -7933,6 +7934,11 @@ aarch64_override_options (void) aarch64_override_options_internal (&global_options); + maybe_set_param_value (PARAM_IFCOMBINE_THRESHOLD, + 5, + global_options.x_param_values, + global_options_set.x_param_values); + /* Save these options as the default ones in case we push and pop them later while processing functions with potential target attributes. */ target_option_default_node = target_option_current_node diff --git a/gcc/params.def b/gcc/params.def index da2c6a3..7969e08 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1041,6 +1041,12 @@ DEFPARAM (PARAM_CASE_VALUES_THRESHOLD, "if 0, use the default for the machine", 0, 0, 0) +DEFPARAM (PARAM_IFCOMBINE_THRESHOLD, + "ifcombine-threshold", + "The approximate cost of statements in an inner block above which " + "it is not deemed beneficial to if-combine it with an outer block", + 2, 0, 0) + /* Data race flags for C++0x memory model compliance. */ DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES, "allow-store-data-races", diff --git a/gcc/testsuite/gcc.target/aarch64/ifcombine_cond_cmp_1.c b/gcc/testsuite/gcc.target/aarch64/ifcombine_cond_cmp_1.c new file mode 100644 index 0000000..36b78a6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ifcombine_cond_cmp_1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +int +foo1 (int a, int b, int c, int d) +{ + return a > b && b <= c && c > d; +} + +int +foo2 (int a, int b, int c, int d) +{ + return a > b && (b <= c && c > d); +} + +/* { dg-final { scan-assembler-times "ccmp" 4 } } */ diff --git a/gcc/tree-ssa-ifcombine.c b/gcc/tree-ssa-ifcombine.c index 66be430..4b6c98e 100644 --- a/gcc/tree-ssa-ifcombine.c +++ b/gcc/tree-ssa-ifcombine.c @@ -35,10 +35,12 @@ along with GCC; see the file COPYING3. If not see #include "stor-layout.h" #include "cfganal.h" #include "tree-pretty-print.h" +#include "tree-inline.h" #include "internal-fn.h" #include "gimple-fold.h" #include "gimple-iterator.h" #include "gimplify-me.h" +#include "params.h" #include "tree-cfg.h" #include "tree-pass.h" @@ -325,6 +327,41 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) return true; } +/* Return true if the sequence GS contains only comparisons + and logical combinations of variables. */ + +static bool +gimple_seq_contains_logical_ops_p (gimple_seq gs) +{ + gimple_stmt_iterator gsi; + bool ret = true; + + for (gsi = gsi_start (gs); !gsi_one_before_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + enum gimple_code code = gimple_code (stmt); + + if (code == GIMPLE_DEBUG) + continue; + + if (code != GIMPLE_ASSIGN) + return false; + + enum tree_code tcode = gimple_assign_rhs_code (stmt); + if (TREE_CODE_CLASS (tcode) != tcc_comparison + && tcode != TRUTH_AND_EXPR + && tcode != TRUTH_ANDIF_EXPR + && tcode != TRUTH_OR_EXPR + && tcode != TRUTH_ORIF_EXPR + && tcode != BIT_AND_EXPR + && tcode != BIT_IOR_EXPR + && tcode != BIT_NOT_EXPR) + return false; + } + + return ret; +} + /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. inner_inv, outer_inv and result_inv indicate whether the conditions @@ -511,9 +548,31 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, gimple_stmt_iterator gsi; if (!LOGICAL_OP_NON_SHORT_CIRCUIT) return false; - /* Only do this optimization if the inner bb contains only the conditional. */ - if (!gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb (inner_cond_bb))) - return false; + + /* We'll be pulling in the inner_cond_bb code to execute + unconditionally. If the inner bb is just the condition then + always combine. If there are more statements, only allow + statements that facilitate generation of conditional compare + operations and even then only up to a limit of + PARAM_IFCOMBINE_THRESHOLD. */ + if (!gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb + (inner_cond_bb))) + { + bool bb_logic_ops_and_comparisons_p + = gimple_seq_contains_logical_ops_p (bb_seq (inner_cond_bb)); + + if (!bb_logic_ops_and_comparisons_p) + return false; + + int inner_cost + = estimate_num_insns_seq (bb_seq (inner_cond_bb), + &eni_time_weights); + int threshold = PARAM_VALUE (PARAM_IFCOMBINE_THRESHOLD); + + if (inner_cost > threshold) + return false; + } + t1 = fold_build2_loc (gimple_location (inner_cond), inner_cond_code, boolean_type_node,