From patchwork Wed Oct 26 10:17:11 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Prathamesh Kulkarni X-Patchwork-Id: 79388 Delivered-To: patch@linaro.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org From: Prathamesh Kulkarni Date: Wed, 26 Oct 2016 15:47:11 
+0530 Message-ID: Subject: Re: RFC [1/3] divmod transform v2 To: Richard Biener Cc: Richard Biener , gcc Patches , Kugan , Jim Wilson X-IsSubscribed: yes On 25 October 2016 at 18:47, Richard Biener wrote: > On Tue, 25 Oct 2016, Prathamesh Kulkarni wrote: > >> On 25 October 2016 at 16:17, Richard Biener wrote: >> > On Tue, 25 Oct 2016, Prathamesh Kulkarni wrote: >> > >> >> On 25 October 2016 at 13:43, Richard Biener wrote: >> >> > On Sun, Oct 16, 2016 at 7:59 AM, Prathamesh Kulkarni >> >> > wrote: >> >> >> Hi, >> >> >> After approval from Bernd Schmidt, I committed the patch to remove >> >> >> optab functions for >> >> >> sdivmod_optab and udivmod_optab in optabs.def, which removes the block >> >> >> for divmod patch. >> >> >> >> >> >> This patch is mostly the same as previous one, except it drops >> >> >> targeting __udivmoddi4() because >> >> >> it gave undefined reference link error for calling __udivmoddi4() on >> >> >> aarch64-linux-gnu. >> >> >> It appears aarch64 has hardware insn for DImode div, so __udivmoddi4() >> >> >> isn't needed for the target >> >> >> (it was a bug in my patch that called __udivmoddi4() even though >> >> >> aarch64 supported hardware div). >> >> >> >> >> >> However this makes me wonder if it's guaranteed that __udivmoddi4() >> >> >> will be available for a target if it doesn't have hardware div and >> >> >> divmod insn and doesn't have target-specific libfunc for >> >> >> DImode divmod ? To be conservative, the attached patch doesn't >> >> >> generate call to __udivmoddi4. >> >> >> >> >> >> Passes bootstrap+test on x86_64-unknown-linux. >> >> >> Cross-tested on arm*-*-*, aarch64*-*-*. >> >> >> Verified that there are no regressions with SPEC2006 on >> >> >> x86_64-unknown-linux-gnu. >> >> >> OK to commit ? >> >> > >> >> > I think the searching is still somewhat wrong - it's been some time >> >> > since my last look at the >> >> > patch so maybe I've said this already. 
Please bail out early for >> >> > stmt_can_throw_internal (stmt), >> >> > otherwise the top stmt search might end up not working. So >> >> > >> >> > + >> >> > + if (top_stmt == stmt && stmt_can_throw_internal (top_stmt)) >> >> > + return false; >> >> > >> >> > can go. >> >> > >> >> > top_stmt may end up as a TRUNC_DIV_EXPR so it's pointless to only look >> >> > for another >> >> > TRUNC_DIV_EXPR later ... you may end up without a single TRUNC_MOD_EXPR. >> >> > Which means you want a div_seen and a mod_seen, or simply record the top_stmt >> >> > code and look for the opposite in the 2nd loop. >> >> Um sorry I don't quite understand how we could end up without a trunc_mod stmt ? >> >> The 2nd loop adds both trunc_div and trunc_mod to stmts vector, and >> >> checks if we have >> >> come across at least a single trunc_div stmt (and we bail out if no >> >> div is seen). >> >> >> >> At 2nd loop I suppose we don't need mod_seen, because stmt is >> >> guaranteed to be trunc_mod_expr. >> >> In the 2nd loop the following condition will never trigger for stmt: >> >> if (stmt_can_throw_internal (use_stmt)) >> >> continue; >> >> since we checked before hand if stmt could throw and chose to bail out >> >> in that case. >> >> >> >> and the following condition would also not trigger for stmt: >> >> if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb)) >> >> { >> >> end_imm_use_stmt_traverse (&use_iter); >> >> return false; >> >> } >> >> since gimple_bb (stmt) is always dominated by gimple_bb (top_stmt). >> >> >> >> The case where top_stmt == stmt, we wouldn't reach the above >> >> condition, since we have above it: >> >> if (top_stmt == stmt) >> >> continue; >> >> >> >> So IIUC, top_stmt and stmt would always get added to stmts vector. >> >> Am I missing something ? >> > >> > Ah, indeed. Maybe add a comment then, it wasn't really obvious ;) >> > >> > Please still move the stmt_can_throw_internal (stmt) check up. 
>> Sure, I will move that up and do the other suggested changes. >> >> I was wondering if this condition in 2nd loop is too restrictive ? >> if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb)) >> { >> end_imm_use_stmt_traverse (&use_iter); >> return false; >> } >> >> Should we rather "continue" in this case by not adding use_stmt to >> stmts vector rather than dropping >> the transform all-together if gimple_bb (use_stmt) is not dominated by >> gimple_bb (top_stmt) ? > > Ah, yes - didn't spot that. Hi, Is this version OK ? Thanks, Prathamesh > > Richard. > >> >> For instance if we have a test-case like: >> >> if (cond) >> { >> t1 = x / y; >> t2 = x % y; >> } >> else >> t3 = x % y; >> >> and suppose stmt is "t2 = x % y", we would set top_stmt to "t1 = x / y"; >> In this case we would still want to do divmod transform in THEN block >> even though "t3 = x % y" is not dominated by top_stmt ? >> >> if (cond) >> { >> divmod_tmp = DIVMOD (x, y); >> t1 = REALPART_EXPR (divmod_tmp); >> t2 = IMAGPART_EXPR (divmod_tmp); >> } >> else >> t3 = x % y; >> >> We will always ensure that all the trunc_div, trunc_mod statements in >> stmts vector will be dominated by top_stmt, >> but I suppose they need not constitute all the trunc_div, trunc_mod >> statements in the function. >> >> Thanks, >> Prathamesh >> > >> > Thanks, >> > Richard. >> > >> >> Thanks, >> >> Prathamesh >> >> > >> >> > + switch (gimple_assign_rhs_code (use_stmt)) >> >> > + { >> >> > + case TRUNC_DIV_EXPR: >> >> > + new_rhs = fold_build1 (REALPART_EXPR, TREE_TYPE (op1), res); >> >> > + break; >> >> > + >> >> > + case TRUNC_MOD_EXPR: >> >> > + new_rhs = fold_build1 (IMAGPART_EXPR, TREE_TYPE (op2), res); >> >> > + break; >> >> > + >> >> > >> >> > why type of op1 and type of op2 in the other case? Choose one for consistency. 
>> >> > >> >> > + if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt)) >> >> > + cfg_changed = true; >> >> > >> >> > as you are rejecting all internally throwing stmts this shouldn't be necessary. >> >> > >> >> > The patch is ok with those changes. >> >> > >> >> > Thanks, >> >> > Richard. >> >> > >> >> > >> >> >> Thanks, >> >> >> Prathamesh >> >> >> >> >> > >> > -- >> > Richard Biener >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) >> >> > > -- > Richard Biener > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) 2016-10-26 Prathamesh Kulkarni Kugan Vivekanandarajah Jim Wilson * target.def: New hook expand_divmod_libfunc. * doc/tm.texi.in: Add hook for TARGET_EXPAND_DIVMOD_LIBFUNC * doc/tm.texi: Regenerate. * internal-fn.def: Add new entry for DIVMOD ifn. * internal-fn.c (expand_DIVMOD): New. * tree-ssa-math-opts.c: Include optabs-libfuncs.h, tree-eh.h, targhooks.h. (widen_mul_stats): Add new field divmod_calls_inserted. (target_supports_divmod_p): New. (divmod_candidate_p): Likewise. (convert_to_divmod): Likewise. (pass_optimize_widening_mul::execute): Call calculate_dominance_info(), renumber_gimple_stmt_uids() at beginning of function. Call convert_to_divmod() and record stats for divmod. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index cffcfe9..d2bcdca 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -7096,6 +7096,11 @@ This is firstly introduced on ARM/AArch64 targets, please refer to the hook implementation for how different fusion types are supported. @end deftypefn +@deftypefn {Target Hook} void TARGET_EXPAND_DIVMOD_LIBFUNC (rtx @var{libfunc}, machine_mode @var{mode}, rtx @var{op0}, rtx @var{op1}, rtx *@var{quot}, rtx *@var{rem}) +Define this hook for enabling divmod transform if the port does not have +hardware divmod insn but defines target-specific divmod libfuncs. 
+@end deftypefn + @node Sections @section Dividing the Output into Sections (Texts, Data, @dots{}) @c the above section title is WAY too long. maybe cut the part between diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index d2dd45f..3399465 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4892,6 +4892,8 @@ them: try the first ones in this list first. @hook TARGET_SCHED_FUSION_PRIORITY +@hook TARGET_EXPAND_DIVMOD_LIBFUNC + @node Sections @section Dividing the Output into Sections (Texts, Data, @dots{}) @c the above section title is WAY too long. maybe cut the part between diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 4477697..022a97f 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -2220,6 +2220,53 @@ expand_LAUNDER (internal_fn, gcall *call) expand_assignment (lhs, gimple_call_arg (call, 0), false); } +/* Expand DIVMOD() using: + a) optab handler for udivmod/sdivmod if it is available. + b) If optab_handler doesn't exist, generate call to + target-specific divmod libfunc. */ + +static void +expand_DIVMOD (internal_fn, gcall *call_stmt) +{ + tree lhs = gimple_call_lhs (call_stmt); + tree arg0 = gimple_call_arg (call_stmt, 0); + tree arg1 = gimple_call_arg (call_stmt, 1); + + gcc_assert (TREE_CODE (TREE_TYPE (lhs)) == COMPLEX_TYPE); + tree type = TREE_TYPE (TREE_TYPE (lhs)); + machine_mode mode = TYPE_MODE (type); + bool unsignedp = TYPE_UNSIGNED (type); + optab tab = (unsignedp) ? udivmod_optab : sdivmod_optab; + + rtx op0 = expand_normal (arg0); + rtx op1 = expand_normal (arg1); + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + + rtx quotient, remainder, libfunc; + + /* Check if optab_handler exists for divmod_optab for given mode. */ + if (optab_handler (tab, mode) != CODE_FOR_nothing) + { + quotient = gen_reg_rtx (mode); + remainder = gen_reg_rtx (mode); + expand_twoval_binop (tab, op0, op1, quotient, remainder, unsignedp); + } + + /* Generate call to divmod libfunc if it exists. 
 */ + else if ((libfunc = optab_libfunc (tab, mode)) != NULL_RTX) + targetm.expand_divmod_libfunc (libfunc, mode, op0, op1, + &quotient, &remainder); + + else + gcc_unreachable (); + + /* Wrap the return value (quotient, remainder) within COMPLEX_EXPR. */ + expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs), + make_tree (TREE_TYPE (arg0), quotient), + make_tree (TREE_TYPE (arg1), remainder)), + target, VOIDmode, EXPAND_NORMAL); +} + /* Expand a call to FN using the operands in STMT. FN has a single output operand and NARGS input operands. */ diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 28863df..cf2c402 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -201,6 +201,9 @@ DEF_INTERNAL_FN (FALLTHROUGH, ECF_LEAF | ECF_NOTHROW, NULL) /* To implement __builtin_launder. */ DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL) +/* Divmod function. */ +DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL) + #undef DEF_INTERNAL_INT_FN #undef DEF_INTERNAL_FLT_FN #undef DEF_INTERNAL_OPTAB_FN diff --git a/gcc/target.def b/gcc/target.def index 20def24..ae0ea16 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -5055,6 +5055,15 @@ Normally, this is not needed.", bool, (const_tree field, machine_mode mode), default_member_type_forces_blk) +/* See tree-ssa-math-opts.c:divmod_candidate_p for conditions + that gate the divmod transform. */ +DEFHOOK +(expand_divmod_libfunc, + "Define this hook for enabling divmod transform if the port does not have\n\ +hardware divmod insn but defines target-specific divmod libfuncs.", + void, (rtx libfunc, machine_mode mode, rtx op0, rtx op1, rtx *quot, rtx *rem), + NULL) + /* Return the class for a secondary reload, and fill in extra information. */ DEFHOOK (secondary_reload, diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 0cea1a8..c315da8 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -112,6 +112,9 @@ along with GCC; see the file COPYING3. 
If not see #include "params.h" #include "internal-fn.h" #include "case-cfn-macros.h" +#include "optabs-libfuncs.h" +#include "tree-eh.h" +#include "targhooks.h" /* This structure represents one basic block that either computes a division, or is a common dominator for basic block that compute a @@ -184,6 +187,9 @@ static struct /* Number of fp fused multiply-add ops inserted. */ int fmas_inserted; + + /* Number of divmod calls inserted. */ + int divmod_calls_inserted; } widen_mul_stats; /* The instance of "struct occurrence" representing the highest @@ -3793,6 +3799,213 @@ match_uaddsub_overflow (gimple_stmt_iterator *gsi, gimple *stmt, return true; } +/* Return true if target has support for divmod. */ + +static bool +target_supports_divmod_p (optab divmod_optab, optab div_optab, machine_mode mode) +{ + /* If target supports hardware divmod insn, use it for divmod. */ + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing) + return true; + + /* Check if libfunc for divmod is available. */ + rtx libfunc = optab_libfunc (divmod_optab, mode); + if (libfunc != NULL_RTX) + { + /* If optab_handler exists for div_optab, perhaps in a wider mode, + we don't want to use the libfunc even if it exists for given mode. */ + for (machine_mode div_mode = mode; + div_mode != VOIDmode; + div_mode = GET_MODE_WIDER_MODE (div_mode)) + if (optab_handler (div_optab, div_mode) != CODE_FOR_nothing) + return false; + + return targetm.expand_divmod_libfunc != NULL; + } + + return false; +} + +/* Check if stmt is candidate for divmod transform. 
 */ + +static bool +divmod_candidate_p (gassign *stmt) +{ + tree type = TREE_TYPE (gimple_assign_lhs (stmt)); + enum machine_mode mode = TYPE_MODE (type); + optab divmod_optab, div_optab; + + if (TYPE_UNSIGNED (type)) + { + divmod_optab = udivmod_optab; + div_optab = udiv_optab; + } + else + { + divmod_optab = sdivmod_optab; + div_optab = sdiv_optab; + } + + tree op1 = gimple_assign_rhs1 (stmt); + tree op2 = gimple_assign_rhs2 (stmt); + + /* Disable the transform if either operand is a constant, since + division-by-constant may have specialized expansion. */ + if (CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2)) + return false; + + /* Exclude the case where TYPE_OVERFLOW_TRAPS (type) as that should + expand using the [su]divv optabs. */ + if (TYPE_OVERFLOW_TRAPS (type)) + return false; + + if (!target_supports_divmod_p (divmod_optab, div_optab, mode)) + return false; + + return true; +} + +/* This function looks for: + t1 = a TRUNC_DIV_EXPR b; + t2 = a TRUNC_MOD_EXPR b; + and transforms it to the following sequence: + complex_tmp = DIVMOD (a, b); + t1 = REALPART_EXPR (complex_tmp); + t2 = IMAGPART_EXPR (complex_tmp); + For conditions enabling the transform see divmod_candidate_p(). + + The pass has three parts: + 1) Find top_stmt which is trunc_div or trunc_mod stmt and dominates all + other trunc_div_expr and trunc_mod_expr stmts. + 2) Add top_stmt and all trunc_div and trunc_mod stmts dominated by top_stmt + to stmts vector. + 3) Insert DIVMOD call just before top_stmt and update entries in + stmts vector to use return value of DIVMOD (REALPART_EXPR for div, + IMAGPART_EXPR for mod). 
 */ + +static bool +convert_to_divmod (gassign *stmt) +{ + if (stmt_can_throw_internal (stmt) + || !divmod_candidate_p (stmt)) + return false; + + tree op1 = gimple_assign_rhs1 (stmt); + tree op2 = gimple_assign_rhs2 (stmt); + + imm_use_iterator use_iter; + gimple *use_stmt; + auto_vec<gimple *> stmts; + + gimple *top_stmt = stmt; + basic_block top_bb = gimple_bb (stmt); + + /* Part 1: Try to set top_stmt to "topmost" stmt that dominates + at least stmt and possibly other trunc_div/trunc_mod stmts + having same operands as stmt. */ + + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1) + { + if (is_gimple_assign (use_stmt) + && (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR + || gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR) + && operand_equal_p (op1, gimple_assign_rhs1 (use_stmt), 0) + && operand_equal_p (op2, gimple_assign_rhs2 (use_stmt), 0)) + { + if (stmt_can_throw_internal (use_stmt)) + continue; + + basic_block bb = gimple_bb (use_stmt); + + if (bb == top_bb) + { + if (gimple_uid (use_stmt) < gimple_uid (top_stmt)) + top_stmt = use_stmt; + } + else if (dominated_by_p (CDI_DOMINATORS, top_bb, bb)) + { + top_bb = bb; + top_stmt = use_stmt; + } + } + } + + tree top_op1 = gimple_assign_rhs1 (top_stmt); + tree top_op2 = gimple_assign_rhs2 (top_stmt); + + stmts.safe_push (top_stmt); + bool div_seen = (gimple_assign_rhs_code (top_stmt) == TRUNC_DIV_EXPR); + + /* Part 2: Add all trunc_div/trunc_mod statements dominated by top_bb + to stmts vector. The 2nd loop will always add stmt to stmts vector, since + gimple_bb (top_stmt) dominates gimple_bb (stmt), so the + 2nd loop ends up adding at least a single trunc_mod_expr stmt. 
*/ + + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, top_op1) + { + if (is_gimple_assign (use_stmt) + && (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR + || gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR) + && operand_equal_p (top_op1, gimple_assign_rhs1 (use_stmt), 0) + && operand_equal_p (top_op2, gimple_assign_rhs2 (use_stmt), 0)) + { + if (use_stmt == top_stmt + || stmt_can_throw_internal (use_stmt) + || !dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb)) + continue; + + stmts.safe_push (use_stmt); + if (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR) + div_seen = true; + } + } + + if (!div_seen) + return false; + + /* Part 3: Create libcall to internal fn DIVMOD: + divmod_tmp = DIVMOD (op1, op2). */ + + gcall *call_stmt = gimple_build_call_internal (IFN_DIVMOD, 2, op1, op2); + tree res = make_temp_ssa_name (build_complex_type (TREE_TYPE (op1)), + call_stmt, "divmod_tmp"); + gimple_call_set_lhs (call_stmt, res); + + /* Insert the call before top_stmt. */ + gimple_stmt_iterator top_stmt_gsi = gsi_for_stmt (top_stmt); + gsi_insert_before (&top_stmt_gsi, call_stmt, GSI_SAME_STMT); + + widen_mul_stats.divmod_calls_inserted++; + + /* Update all statements in stmts vector: + lhs = op1 TRUNC_DIV_EXPR op2 -> lhs = REALPART_EXPR + lhs = op1 TRUNC_MOD_EXPR op2 -> lhs = IMAGPART_EXPR. 
 */ + + for (unsigned i = 0; stmts.iterate (i, &use_stmt); ++i) + { + tree new_rhs; + + switch (gimple_assign_rhs_code (use_stmt)) + { + case TRUNC_DIV_EXPR: + new_rhs = fold_build1 (REALPART_EXPR, TREE_TYPE (op1), res); + break; + + case TRUNC_MOD_EXPR: + new_rhs = fold_build1 (IMAGPART_EXPR, TREE_TYPE (op1), res); + break; + + default: + gcc_unreachable (); + } + + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); + gimple_assign_set_rhs_from_tree (&gsi, new_rhs); + update_stmt (use_stmt); + } + + return true; +} /* Find integer multiplications where the operands are extended from smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR @@ -3837,6 +4050,8 @@ pass_optimize_widening_mul::execute (function *fun) bool cfg_changed = false; memset (&widen_mul_stats, 0, sizeof (widen_mul_stats)); + calculate_dominance_info (CDI_DOMINATORS); + renumber_gimple_stmt_uids (); FOR_EACH_BB_FN (bb, fun) { @@ -3870,6 +4085,10 @@ pass_optimize_widening_mul::execute (function *fun) match_uaddsub_overflow (&gsi, stmt, code); break; + case TRUNC_MOD_EXPR: + convert_to_divmod (as_a<gassign *> (stmt)); + break; + default:; } } @@ -3916,6 +4135,8 @@ pass_optimize_widening_mul::execute (function *fun) widen_mul_stats.maccs_inserted); statistics_counter_event (fun, "fused multiply-adds inserted", widen_mul_stats.fmas_inserted); + statistics_counter_event (fun, "divmod calls inserted", + widen_mul_stats.divmod_calls_inserted); return cfg_changed ? TODO_cleanup_cfg : 0; }
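
For reference, a minimal source-level sketch of what the transform amounts to for a div/mod pair on the same operands. It is an illustration only and not part of the patch. The names `struct qr`, `separate`, `combined` and `check_agree` are hypothetical, and the C library's `div()` is used purely as a stand-in for the internal DIVMOD function, which has no source-level spelling. Both forms truncate toward zero, matching TRUNC_DIV_EXPR and TRUNC_MOD_EXPR.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pair type; the pass itself carries (quotient, remainder)
   in a complex-typed temporary.  */
struct qr { int quot; int rem; };

/* Before the transform: two independent statements on the same operands,
   i.e. t1 = x / y; t2 = x % y;  */
static struct qr separate (int x, int y)
{
  struct qr r;
  r.quot = x / y;
  r.rem = x % y;
  return r;
}

/* After the transform (conceptually): one combined operation computes both
   results, which are then extracted, like REALPART_EXPR / IMAGPART_EXPR
   of the DIVMOD result.  div() is only a stand-in here.  */
static struct qr combined (int x, int y)
{
  div_t d = div (x, y);
  struct qr r;
  r.quot = d.quot;
  r.rem = d.rem;
  return r;
}

/* The two forms must agree for every non-trapping input.  */
static void check_agree (int x, int y)
{
  struct qr a = separate (x, y);
  struct qr b = combined (x, y);
  assert (a.quot == b.quot);
  assert (a.rem == b.rem);
}
```

Calling `check_agree (17, 5)` or `check_agree (-17, 5)` exercises both forms. The sketch says nothing about when the pass fires. That is decided by divmod_candidate_p() and the dominance checks in convert_to_divmod().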