[v3,05/18] string: Improve generic strlen

Message ID	1515588482-15744-6-git-send-email-adhemerval.zanella@linaro.org
State	New
Headers
	Delivered-To: patch@linaro.org
	Received-SPF: pass (google.com: domain of libc-alpha-return-89002-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131;
	DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=ttirvOLX61Zx8UKIsqK0oejjehF3NV6 9HWbOiFhvCJt+A/vNG6aBz++2sKPT+YGUHXb1dWX2utEbTd+IkLuZgqRXY+cpbI8 AhkdQFb+7KfWQFhJMi3EESNumtqCTeSCbuuUmqi7pYMDpucFNT4XWRhC/uSlUg3R ZcX/NdpHptgw=
	Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
	Precedence: bulk
	Sender: libc-alpha-owner@sourceware.org
	From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
	To: libc-alpha@sourceware.org
	Cc: Richard Henderson <rth@twiddle.net>
	Subject: [PATCH v3 05/18] string: Improve generic strlen
	Date: Wed, 10 Jan 2018 10:47:49 -0200
	Message-Id: <1515588482-15744-6-git-send-email-adhemerval.zanella@linaro.org>
	In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
	References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
Series	Improve generic string routines
	[v3,00/18] Improve generic string routines
	[v3,01/18] Parameterize op_t from memcopy.h
	[v3,02/18] Parameterize OP_T_THRES from memcopy.h
	[v3,03/18] Add string-maskoff.h generic header
	[v3,04/18] Add string vectorized find and detection functions
	[v3,05/18] string: Improve generic strlen
	[v3,06/18] string: Improve generic memchr
	[v3,07/18] string: Improve generic memrchr
	[v3,08/18] string: Improve generic strnlen
	[v3,09/18] string: Improve generic strchr
	[v3,10/18] string: Improve generic strchrnul
	[v3,11/18] string: Improve generic strcmp
	[v3,12/18] string: Improve generic strcpy
	[v3,13/18] hppa: Add memcopy.h
	[v3,14/18] hppa: Add string-fzb.h and string-fzi.h
	[v3,15/18] alpha: Add string-fzb.h and string-fzi.h
	[v3,16/18] arm: Add string-fza.h
	[v3,17/18] powerpc: Add string-fza.h
	[v3,18/18] sh: Add string-fzb.h

Commit Message

Adhemerval Zanella Jan. 10, 2018, 12:47 p.m. UTC

From: Richard Henderson <rth@twiddle.net>


The new algorithm has the following key differences:

  - Reads the first word from the aligned address that contains the
    start of the string and uses the string-maskoff functions to
    remove unwanted data.  This strategy follows the assembly-optimized
    implementations for powerpc, sparc, and SH.

  - Uses the parameterized has_zero and index_first_zero functions.
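
For readers unfamiliar with the two helpers named above, here is a minimal
sketch of what such functions can look like. The names sketch_has_zero and
sketch_index_first_zero and the word_t typedef are hypothetical stand-ins,
not the definitions from this series; the real helpers live in the
string-fz*.h headers and may be implemented differently per architecture.
The index is found with a portable byte scan so the sketch does not depend
on endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;  /* hypothetical stand-in for glibc's op_t */

/* Return nonzero iff at least one byte of X is zero.  Subtracting 0x01
   from every byte borrows into bit 7 only for bytes that were zero (or
   that sit above a zero byte); masking with ~X drops bytes whose bit 7
   was already set.  */
static int sketch_has_zero(word_t x)
{
    const word_t lsb = (word_t)-1 / 0xff;  /* 0x01 in every byte */
    const word_t msb = lsb << 7;           /* 0x80 in every byte */
    return ((x - lsb) & ~x & msb) != 0;
}

/* Return the index, in memory order, of the first zero byte of X.
   X must contain a zero byte.  A byte scan is used here for
   portability; an optimized version would use a bit trick or a
   count-zeros instruction chosen by endianness.  */
static size_t sketch_index_first_zero(word_t x)
{
    unsigned char bytes[sizeof x];
    size_t i = 0;

    assert(sketch_has_zero(x));
    memcpy(bytes, &x, sizeof x);
    while (bytes[i] != 0)
        i++;
    return i;
}
```

A caller would test each loaded word with sketch_has_zero and call
sketch_index_first_zero only on the word that tests positive.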

Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

	[BZ #5806]
    	* string/strlen.c: Use them.
---
 string/strlen.c | 83 +++++++++++----------------------------------------------
 1 file changed, 15 insertions(+), 68 deletions(-)

-- 
2.7.4

Comments

Paul Eggert Jan. 11, 2018, 5:21 p.m. UTC | #1

On 01/10/2018 04:47 AM, Adhemerval Zanella wrote:
> +  /* Align pointer to sizeof op_t.  */
> +  const uintptr_t s_int = (uintptr_t) str;
> +  const op_t *word_ptr = (const op_t*) (s_int & -sizeof (op_t));

I see this sort of code used in multiple places (often with different 
implementations), and suggest packaging it up into an inline function. 
Also, you might consider implementing it this way, which is a bit 
simpler (it's the method used in your generic strcmp):

   op_t *
   word_containing (char const *p)
   {
      return (op_t *) (p - (uintptr_t) p % sizeof (op_t));
   }

This generates the same code with gcc -O2, and minimizes the usage of 
pointers as integers.
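
The equivalence of the two alignment forms can be checked with a small
standalone sketch. The names align_by_mask, align_by_offset, and
check_same_word and the word_t typedef are hypothetical, introduced only
for this illustration. It assumes sizeof (word_t) is a power of two and
that size_t is at least as wide as uintptr_t, which is what makes the
masking form valid.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t word_t;  /* hypothetical stand-in for op_t */

/* Form used in the patch: clear the low bits of the integer address.  */
static const word_t *align_by_mask(const char *p)
{
    return (const word_t *) ((uintptr_t) p & -sizeof (word_t));
}

/* Suggested form: stay in pointer arithmetic and subtract the
   misalignment, so only the remainder is computed on an integer.  */
static const word_t *align_by_offset(const char *p)
{
    return (const word_t *) (p - (uintptr_t) p % sizeof (word_t));
}

/* Both forms must name the same aligned word, and that word must
   start at or before P, less than one word away.  */
static void check_same_word(const char *p)
{
    const word_t *w = align_by_offset(p);

    assert(w == align_by_mask(p));
    assert((uintptr_t) w % sizeof (word_t) == 0);
    assert((const char *) w <= p);
    assert(p - (const char *) w < (long) sizeof (word_t));
}
```

Calling check_same_word on every byte of an aligned buffer exercises each
possible misalignment.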

Similarly, in other code I suggest using char * instead of uintptr_t, as 
much as possible. This works just as well with ordinary addition, and 
will simplify GCC's job if we ever build with -fcheck-pointer-bounds.

> +  while (1)
>       {
> -      /* 64-bit version of the magic.  */
> -      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
> -      himagic = ((himagic << 16) << 16) | himagic;
> -      lomagic = ((lomagic << 16) << 16) | lomagic;
> +      if (has_zero (word))
> +	break;
> +      word = *++word_ptr;
>       }

This would be a bit simpler:

   while (! has_zero (word))
     word = *++word_ptr;

Adhemerval Zanella Jan. 12, 2018, 5:59 p.m. UTC | #2

On 11/01/2018 15:21, Paul Eggert wrote:
> On 01/10/2018 04:47 AM, Adhemerval Zanella wrote:
>> +  /* Align pointer to sizeof op_t.  */
>> +  const uintptr_t s_int = (uintptr_t) str;
>> +  const op_t *word_ptr = (const op_t*) (s_int & -sizeof (op_t));
>
> I see this sort of code used in multiple places (often with different implementations), and suggest packaging it up into an inline function. Also, you might consider implementing it this way, which is a bit simpler (it's the method used in your generic strcmp):
>
>   op_t *
>   word_containing (char const *p)
>   {
>      return (op_t *) (p - (uintptr_t) p % sizeof (op_t));
>   }
>
> This generates the same code with gcc -O2, and minimizes the usage of pointers as integers.


Thanks, I have added the suggested function to string-maskoff.h and used it in
the generic implementations.

>
> Similarly, in other code I suggest using char * instead of uintptr_t, as much as possible. This works just as well with ordinary addition, and will simplify GCC's job if we ever build with -fcheck-pointer-bounds.
>
>> +  while (1)
>>       {
>> -      /* 64-bit version of the magic.  */
>> -      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
>> -      himagic = ((himagic << 16) << 16) | himagic;
>> -      lomagic = ((lomagic << 16) << 16) | lomagic;
>> +      if (has_zero (word))
>> +    break;
>> +      word = *++word_ptr;
>>       }
>
> This would be a bit simpler:
>
>   while (! has_zero (word))
>     word = *++word_ptr;
>


Thanks, I have applied it locally as well.

diff --git a/string/strlen.c b/string/strlen.c
index 8ce1318..6bd0ed9 100644
--- a/string/strlen.c
+++ b/string/strlen.c
@@ -20,6 +20,11 @@ 
 
 #include <string.h>
 #include <stdlib.h>
+#include <stdint.h>
+#include <string-fza.h>
+#include <string-fzb.h>
+#include <string-fzi.h>
+#include <string-maskoff.h>
 
 #undef strlen
 
@@ -32,78 +37,20 @@ 
 size_t
 STRLEN (const char *str)
 {
-  const char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, himagic, lomagic;
+  /* Align pointer to sizeof op_t.  */
+  const uintptr_t s_int = (uintptr_t) str;
+  const op_t *word_ptr = (const op_t*) (s_int & -sizeof (op_t));
 
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = str; ((unsigned long int) char_ptr
-			& (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == '\0')
-      return char_ptr - str;
+  /* Read and MASK the first word. */
+  op_t word = *word_ptr | create_mask (s_int);
 
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
-
-  longword_ptr = (unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."  Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  himagic = 0x80808080L;
-  lomagic = 0x01010101L;
-  if (sizeof (longword) > 4)
+  while (1)
     {
-      /* 64-bit version of the magic.  */
-      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
-      himagic = ((himagic << 16) << 16) | himagic;
-      lomagic = ((lomagic << 16) << 16) | lomagic;
+      if (has_zero (word))
+	break;
+      word = *++word_ptr;
     }
-  if (sizeof (longword) > 8)
-    abort ();
 
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  for (;;)
-    {
-      longword = *longword_ptr++;
-
-      if (((longword - lomagic) & ~longword & himagic) != 0)
-	{
-	  /* Which of the bytes was the zero?  If none of them were, it was
-	     a misfire; continue the search.  */
-
-	  const char *cp = (const char *) (longword_ptr - 1);
-
-	  if (cp[0] == 0)
-	    return cp - str;
-	  if (cp[1] == 0)
-	    return cp - str + 1;
-	  if (cp[2] == 0)
-	    return cp - str + 2;
-	  if (cp[3] == 0)
-	    return cp - str + 3;
-	  if (sizeof (longword) > 4)
-	    {
-	      if (cp[4] == 0)
-		return cp - str + 4;
-	      if (cp[5] == 0)
-		return cp - str + 5;
-	      if (cp[6] == 0)
-		return cp - str + 6;
-	      if (cp[7] == 0)
-		return cp - str + 7;
-	    }
-	}
-    }
+  return ((const char *) word_ptr) + index_first_zero (word) - str;
 }
 libc_hidden_builtin_def (strlen)
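
To make the control flow of the new STRLEN easier to follow outside the
glibc tree, here is a standalone sketch of the same idea: align down, OR a
mask over the bytes that precede the string so they cannot look like a
terminator, then scan word by word. Everything named sketch_* or load_word,
and the word_t typedef, is hypothetical; in particular sketch_create_mask
only approximates what create_mask from string-maskoff.h is used for here,
and memchr on the word replaces has_zero and index_first_zero for brevity.
Like the patch, the sketch reads whole aligned words, so it may touch bytes
outside the string; that is outside what ISO C guarantees and relies on
aligned word reads never crossing a page boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;  /* hypothetical stand-in for glibc's op_t */

/* Load one word from an aligned address without aliasing problems.  */
static word_t load_word(const char *aligned)
{
    word_t w;
    memcpy(&w, aligned, sizeof w);
    return w;
}

/* Build a word whose first (ADDR % sizeof (word_t)) bytes, in memory
   order, are 0xff and whose remaining bytes are zero.  Filling a byte
   array keeps this independent of endianness.  */
static word_t sketch_create_mask(uintptr_t addr)
{
    unsigned char bytes[sizeof (word_t)] = { 0 };
    word_t mask;

    memset(bytes, 0xff, addr % sizeof (word_t));
    memcpy(&mask, bytes, sizeof mask);
    return mask;
}

size_t sketch_strlen(const char *str)
{
    assert(str != NULL);

    /* Start of the aligned word that contains STR.  */
    const char *p = str - (uintptr_t) str % sizeof (word_t);

    /* Bytes before STR are forced to 0xff so they are never taken
       for the terminating NUL.  */
    word_t word = load_word(p) | sketch_create_mask((uintptr_t) str);

    const unsigned char *zero;
    while ((zero = memchr(&word, 0, sizeof word)) == NULL) {
        p += sizeof word;
        word = load_word(p);
    }

    /* Offset of the zero byte inside the word, relative to STR.  */
    return (size_t) (p + (zero - (const unsigned char *) &word) - str);
}
```

Placing a string at each offset of an aligned, zero-filled buffer and
comparing against the C library strlen covers every misalignment,
including the case where zero bytes sit immediately before the string.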
