From patchwork Thu Oct 27 15:26:42 2016
X-Patchwork-Submitter: Eric Botcazou
X-Patchwork-Id: 79718
Delivered-To: patch@linaro.org
From: Eric Botcazou
To: gcc-patches@gcc.gnu.org
Subject: [patch] Use straight-line sequence for signed overflow additive operation
Date: Thu, 27 Oct 2016 17:26:42 +0200
Message-ID: <20524360.1RVvHARErM@polaris>

Hi,

as suggested by Segher, this changes the generic signed-signed-signed case
of expand_addsub_overflow to use a straight-line code sequence instead of a
branchy one; the new sequence is also shorter.
On 32-bit PowerPC the code generated at -O2 for 32-bit addition and
subtraction is:

	add 10,3,4		subf 9,4,3
	eqv 3,3,4		xor 3,3,4
	xor 9,10,4		eqv 4,9,4
	and. 8,9,3		and. 10,3,4
	blt- 0,			blt- 0,

eqv is not(xor), so a typical RISC machine will need 1 more instruction
(if it doesn't have and(not), like e.g. SPARC).  What's even more
interesting is that the overflow overhead is constant, e.g. for 64-bit
addition and subtraction:

	addc 6,4,6		subfc 6,6,4
	adde 10,3,5		subfe 10,5,3
	eqv 3,3,5		xor 3,3,5
	xor 9,10,5		eqv 5,10,5
	and. 9,9,3		and. 9,3,5
	blt- 0,			blt- 0,

so this will also help 32-bit x86, which doesn't implement 64-bit overflow
operations in the MD file.

The patch also contains a fixlet for do_jump_by_parts_greater_rtx, which
forgot to invert the probability of the jump when swapping the arms of the
branch.

Tested on x86-64/Linux, PowerPC/Linux and PowerPC64/Linux.  OK for
mainline?

2016-10-27  Eric Botcazou
	    Segher Boessenkool

	* dojump.c (do_jump_by_parts_greater_rtx): Invert probability
	when swapping the arms of the branch.
	* internal-fn.c (expand_addsub_overflow): Use a straight-line
	code sequence for the generic signed-signed-signed case.

-- 
Eric Botcazou

Index: dojump.c
===================================================================
--- dojump.c	(revision 241611)
+++ dojump.c	(working copy)
@@ -703,6 +703,7 @@ do_jump_by_parts_greater_rtx (machine_mo
       if_false_label = drop_through_label;
       drop_through_if_true = false;
       drop_through_if_false = true;
+      prob = inv (prob);
     }
 
   /* Compare a word at a time, high order first.  */
Index: internal-fn.c
===================================================================
--- internal-fn.c	(revision 241611)
+++ internal-fn.c	(working copy)
@@ -847,56 +847,68 @@ expand_addsub_overflow (location_t loc,
 	  delete_insns_since (last);
 	}
 
-      rtx_code_label *sub_check = gen_label_rtx ();
-      int pos_neg = 3;
-
       /* Compute the operation.  On RTL level, the addition is always
 	 unsigned.  */
       res = expand_binop (mode, code == PLUS_EXPR ? add_optab : sub_optab,
			  op0, op1, NULL_RTX, false, OPTAB_LIB_WIDEN);
 
-      /* If we can prove one of the arguments (for MINUS_EXPR only
+      /* If we can prove that one of the arguments (for MINUS_EXPR only
 	 the second operand, as subtraction is not commutative) is always
 	 non-negative or always negative, we can do just one comparison
-	 and conditional jump instead of 2 at runtime, 3 present in the
-	 emitted code.  If one of the arguments is CONST_INT, all we
-	 need is to make sure it is op1, then the first
-	 do_compare_rtx_and_jump will be just folded.  Otherwise try
-	 to use range info if available.  */
-      if (code == PLUS_EXPR && CONST_INT_P (op0))
-	std::swap (op0, op1);
-      else if (CONST_INT_P (op1))
-	;
-      else if (code == PLUS_EXPR && TREE_CODE (arg0) == SSA_NAME)
+	 and conditional jump.  */
+      int pos_neg = get_range_pos_neg (arg1);
+      if (code == PLUS_EXPR)
 	{
-	  pos_neg = get_range_pos_neg (arg0);
-	  if (pos_neg != 3)
-	    std::swap (op0, op1);
+	  int pos_neg0 = get_range_pos_neg (arg0);
+	  if (pos_neg0 != 3 && pos_neg == 3)
+	    {
+	      std::swap (op0, op1);
+	      pos_neg = pos_neg0;
+	    }
 	}
-      if (pos_neg == 3 && !CONST_INT_P (op1) && TREE_CODE (arg1) == SSA_NAME)
-	pos_neg = get_range_pos_neg (arg1);
-
-      /* If the op1 is negative, we have to use a different check.  */
-      if (pos_neg == 3)
-	do_compare_rtx_and_jump (op1, const0_rtx, LT, false, mode, NULL_RTX,
-				 NULL, sub_check, PROB_EVEN);
-
-      /* Compare the result of the operation with one of the operands.  */
-      if (pos_neg & 1)
-	do_compare_rtx_and_jump (res, op0, code == PLUS_EXPR ? GE : LE,
-				 false, mode, NULL_RTX, NULL, done_label,
-				 PROB_VERY_LIKELY);
-
-      /* If we get here, we have to print the error.  */
+      /* Addition overflows if and only if the two operands have the same sign,
	 and the result has the opposite sign.  Subtraction overflows if and
	 only if the two operands have opposite sign, and the subtrahend has
	 the same sign as the result.  Here 0 is counted as positive.  */
       if (pos_neg == 3)
 	{
-	  emit_jump (do_error);
-	  emit_label (sub_check);
+	  /* Compute op0 ^ op1 (operands have opposite sign).  */
+	  rtx op_xor = expand_binop (mode, xor_optab, op0, op1, NULL_RTX, false,
+				     OPTAB_LIB_WIDEN);
+
+	  /* Compute res ^ op1 (result and 2nd operand have opposite sign).  */
+	  rtx res_xor = expand_binop (mode, xor_optab, res, op1, NULL_RTX, false,
+				      OPTAB_LIB_WIDEN);
+
+	  rtx tem;
+	  if (code == PLUS_EXPR)
+	    {
+	      /* Compute (res ^ op1) & ~(op0 ^ op1).  */
+	      tem = expand_unop (mode, one_cmpl_optab, op_xor, NULL_RTX, false);
+	      tem = expand_binop (mode, and_optab, res_xor, tem, NULL_RTX, false,
+				  OPTAB_LIB_WIDEN);
+	    }
+	  else
+	    {
+	      /* Compute (op0 ^ op1) & ~(res ^ op1).  */
+	      tem = expand_unop (mode, one_cmpl_optab, res_xor, NULL_RTX, false);
+	      tem = expand_binop (mode, and_optab, op_xor, tem, NULL_RTX, false,
+				  OPTAB_LIB_WIDEN);
+	    }
+
+	  /* No overflow if the result has bit sign cleared.  */
+	  do_compare_rtx_and_jump (tem, const0_rtx, GE, false, mode, NULL_RTX,
+				   NULL, done_label, PROB_VERY_LIKELY);
 	}
 
-      /* We have k = a + b for b < 0 here.  k <= a must hold.  */
-      if (pos_neg & 2)
-	do_compare_rtx_and_jump (res, op0, code == PLUS_EXPR ? LE : GE,
+      /* Compare the result of the operation with the first operand.
	 No overflow for addition if second operand is positive and result
	 is larger or second operand is negative and result is smaller.
	 Likewise for subtraction with sign of second operand flipped.  */
+      else
+	do_compare_rtx_and_jump (res, op0,
+				 (pos_neg == 1) ^ (code == MINUS_EXPR) ? GE : LE,
 				 false, mode, NULL_RTX, NULL, done_label,
 				 PROB_VERY_LIKELY);
     }