From patchwork Mon Oct 21 00:23:00 2019
X-Patchwork-Submitter: Honnappa Nagarahalli
X-Patchwork-Id: 177012
From: Honnappa Nagarahalli
To: olivier.matz@6wind.com, sthemmin@microsoft.com, jerinj@marvell.com,
 bruce.richardson@intel.com, david.marchand@redhat.com,
 pbhagavatula@marvell.com, konstantin.ananyev@intel.com,
 drc@linux.vnet.ibm.com, hemant.agrawal@nxp.com,
 honnappa.nagarahalli@arm.com
Cc: dev@dpdk.org, dharmik.thakkar@arm.com, ruifeng.wang@arm.com,
 gavin.hu@arm.com
Date: Sun, 20 Oct 2019 19:23:00 -0500
Message-Id: <20191021002300.26497-7-honnappa.nagarahalli@arm.com>
In-Reply-To: <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
 <20191021002300.26497-1-honnappa.nagarahalli@arm.com>
Subject: [dpdk-dev] [RFC v6 6/6] lib/ring: improved copy function to copy
 ring elements

Improved copy function to copy to/from ring elements.

Signed-off-by: Honnappa Nagarahalli
Signed-off-by: Konstantin Ananyev
---
 lib/librte_ring/rte_ring_elem.h | 165 ++++++++++++++++----------------
 1 file changed, 84 insertions(+), 81 deletions(-)

-- 
2.17.1

diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 0ce5f2be7..80ec3c562 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -109,85 +109,88 @@ __rte_experimental
 struct rte_ring *rte_ring_create_elem(const char *name, unsigned int count,
 				unsigned int esize, int socket_id, unsigned int flags);
 
-#define ENQUEUE_PTRS_GEN(r, ring_start, prod_head, obj_table, esize, n) do { \
-	unsigned int i, j; \
-	const uint32_t size = (r)->size; \
-	uint32_t idx = prod_head & (r)->mask; \
-	uint32_t *ring = (uint32_t *)ring_start; \
-	uint32_t *obj = (uint32_t *)obj_table; \
-	uint32_t nr_n = n * (esize / sizeof(uint32_t)); \
-	uint32_t nr_idx = idx * (esize / sizeof(uint32_t)); \
-	uint32_t seg0 = size - idx; \
-	if (likely(n < seg0)) { \
-		for (i = 0; i < (nr_n & ((~(unsigned)0x7))); \
-			i += 8, nr_idx += 8) { \
-			memcpy(ring + nr_idx, obj + i, 8 * sizeof (uint32_t)); \
-		} \
-		switch (nr_n & 0x7) { \
-		case 7: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 6: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 5: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 4: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 3: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 2: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		case 1: \
-			ring[nr_idx++] = obj[i++]; /* fallthrough */ \
-		} \
-	} else { \
-		uint32_t nr_seg0 = seg0 * (esize / sizeof(uint32_t)); \
-		uint32_t nr_seg1 = nr_n - nr_seg0; \
-		for (i = 0; i < nr_seg0; i++, nr_idx++)\
-			ring[nr_idx] = obj[i]; \
-		for (j = 0; j < nr_seg1; i++, j++) \
-			ring[j] = obj[i]; \
-	} \
-} while (0)
-
-#define DEQUEUE_PTRS_GEN(r, ring_start, cons_head, obj_table, esize, n) do { \
-	unsigned int i, j; \
-	uint32_t idx = cons_head & (r)->mask; \
-	const uint32_t size = (r)->size; \
-	uint32_t *ring = (uint32_t *)ring_start; \
-	uint32_t *obj = (uint32_t *)obj_table; \
-	uint32_t nr_n = n * (esize / sizeof(uint32_t)); \
-	uint32_t nr_idx = idx * (esize / sizeof(uint32_t)); \
-	uint32_t seg0 = size - idx; \
-	if (likely(n < seg0)) { \
-		for (i = 0; i < (nr_n & ((~(unsigned)0x7))); \
-			i += 8, nr_idx += 8) { \
-			memcpy(obj + i, ring + nr_idx, 8 * sizeof (uint32_t)); \
-		} \
-		switch (nr_n & 0x7) { \
-		case 7: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 6: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 5: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 4: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 3: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 2: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		case 1: \
-			obj[i++] = ring[nr_idx++]; /* fallthrough */ \
-		} \
-	} else { \
-		uint32_t nr_seg0 = seg0 * (esize / sizeof(uint32_t)); \
-		uint32_t nr_seg1 = nr_n - nr_seg0; \
-		for (i = 0; i < nr_seg0; i++, nr_idx++)\
-			obj[i] = ring[nr_idx];\
-		for (j = 0; j < nr_seg1; i++, j++) \
-			obj[i] = ring[j]; \
-	} \
-} while (0)
+static __rte_always_inline void
+copy_elems(uint32_t du32[], const uint32_t su32[], uint32_t nr_num)
+{
+	uint32_t i;
+
+	for (i = 0; i < (nr_num & ~7); i += 8)
+		memcpy(du32 + i, su32 + i, 8 * sizeof(uint32_t));
+
+	switch (nr_num & 7) {
+	case 7: du32[nr_num - 7] = su32[nr_num - 7]; /* fallthrough */
+	case 6: du32[nr_num - 6] = su32[nr_num - 6]; /* fallthrough */
+	case 5: du32[nr_num - 5] = su32[nr_num - 5]; /* fallthrough */
+	case 4: du32[nr_num - 4] = su32[nr_num - 4]; /* fallthrough */
+	case 3: du32[nr_num - 3] = su32[nr_num - 3]; /* fallthrough */
+	case 2: du32[nr_num - 2] = su32[nr_num - 2]; /* fallthrough */
+	case 1: du32[nr_num - 1] = su32[nr_num - 1]; /* fallthrough */
+	}
+}
+
+static __rte_always_inline void
+enqueue_elems(struct rte_ring *r, void *ring_start, uint32_t prod_head,
+		void *obj_table, uint32_t num, uint32_t esize)
+{
+	uint32_t idx, nr_idx, nr_num;
+	uint32_t *du32;
+	const uint32_t *su32;
+
+	const uint32_t size = r->size;
+	uint32_t s0, nr_s0, nr_s1;
+
+	idx = prod_head & (r)->mask;
+	/* Normalize the idx to uint32_t */
+	nr_idx = (idx * esize) / sizeof(uint32_t);
+
+	du32 = (uint32_t *)ring_start + nr_idx;
+	su32 = obj_table;
+
+	/* Normalize the number of elements to uint32_t */
+	nr_num = (num * esize) / sizeof(uint32_t);
+
+	s0 = size - idx;
+	if (num < s0)
+		copy_elems(du32, su32, nr_num);
+	else {
+		nr_s0 = (s0 * esize) / sizeof(uint32_t);
+		nr_s1 = nr_num - nr_s0;
+		copy_elems(du32, su32, nr_s0);
+		copy_elems(ring_start, su32 + nr_s0, nr_s1);
+	}
+}
+
+static __rte_always_inline void
+dequeue_elems(struct rte_ring *r, void *ring_start, uint32_t cons_head,
+		void *obj_table, uint32_t num, uint32_t esize)
+{
+	uint32_t idx, nr_idx, nr_num;
+	uint32_t *du32;
+	const uint32_t *su32;
+
+	const uint32_t size = r->size;
+	uint32_t s0, nr_s0, nr_s1;
+
+	idx = cons_head & (r)->mask;
+	/* Normalize the idx to uint32_t */
+	nr_idx = (idx * esize) / sizeof(uint32_t);
+
+	su32 = (uint32_t *)ring_start + nr_idx;
+	du32 = obj_table;
+
+	/* Normalize the number of elements to uint32_t */
+	nr_num = (num * esize) / sizeof(uint32_t);
+
+	s0 = size - idx;
+	if (num < s0)
+		copy_elems(du32, su32, nr_num);
+	else {
+		nr_s0 = (s0 * esize) / sizeof(uint32_t);
+		nr_s1 = nr_num - nr_s0;
+		copy_elems(du32, su32, nr_s0);
+		copy_elems(du32 + nr_s0, ring_start, nr_s1);
+	}
+}
 
 /* Between load and load. there might be cpu reorder in weak model
  * (powerpc/arm).
@@ -242,7 +245,7 @@ __rte_ring_do_enqueue_elem(struct rte_ring *r, void * const obj_table,
 	if (n == 0)
 		goto end;
 
-	ENQUEUE_PTRS_GEN(r, &r[1], prod_head, obj_table, esize, n);
+	enqueue_elems(r, &r[1], prod_head, obj_table, n, esize);
 
 	update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
 end:
@@ -289,7 +292,7 @@ __rte_ring_do_dequeue_elem(struct rte_ring *r, void *obj_table,
 	if (n == 0)
 		goto end;
 
-	DEQUEUE_PTRS_GEN(r, &r[1], cons_head, obj_table, esize, n);
+	dequeue_elems(r, &r[1], cons_head, obj_table, n, esize);
 
 	update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
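
For readers without a DPDK tree at hand, the wrap-around handling in enqueue_elems()/dequeue_elems() can be tried out in isolation. What follows is a minimal sketch under stated assumptions, not the code from this patch: struct demo_ring and the demo_* functions are hypothetical names, the tail of the copy uses a plain loop instead of the unrolled switch, and esize is assumed to be a multiple of 4 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_ring: only the fields the copy needs. */
struct demo_ring {
	uint32_t size;   /* number of slots, a power of two */
	uint32_t mask;   /* size - 1 */
	uint32_t *slots; /* storage: size * (esize / 4) 32-bit words */
};

/* Copy nr_num 32-bit words: blocks of 8 via memcpy, then a 0..7 word tail. */
static inline void
demo_copy_elems(uint32_t *dst, const uint32_t *src, uint32_t nr_num)
{
	uint32_t i;

	for (i = 0; i < (nr_num & ~7u); i += 8)
		memcpy(dst + i, src + i, 8 * sizeof(uint32_t));
	/* Plain loop for the tail (the patch unrolls this with a switch). */
	for (; i < nr_num; i++)
		dst[i] = src[i];
}

/* Write num elements of esize bytes starting at slot (head & mask). */
static inline void
demo_enqueue_elems(struct demo_ring *r, uint32_t head, const void *objs,
		uint32_t num, uint32_t esize)
{
	const uint32_t *src = objs;
	uint32_t words = esize / sizeof(uint32_t); /* words per element */
	uint32_t idx = head & r->mask;
	uint32_t s0 = r->size - idx;               /* slots before the end */

	assert(esize % sizeof(uint32_t) == 0 && num <= r->size);
	if (num < s0) {
		/* No wrap: one contiguous copy. */
		demo_copy_elems(r->slots + idx * words, src, num * words);
	} else {
		/* Wrap: fill up to the end, then continue at slot 0. */
		demo_copy_elems(r->slots + idx * words, src, s0 * words);
		demo_copy_elems(r->slots, src + s0 * words, (num - s0) * words);
	}
}

/* Read num elements of esize bytes starting at slot (head & mask). */
static inline void
demo_dequeue_elems(const struct demo_ring *r, uint32_t head, void *objs,
		uint32_t num, uint32_t esize)
{
	uint32_t *dst = objs;
	uint32_t words = esize / sizeof(uint32_t);
	uint32_t idx = head & r->mask;
	uint32_t s0 = r->size - idx;

	assert(esize % sizeof(uint32_t) == 0 && num <= r->size);
	if (num < s0) {
		demo_copy_elems(dst, r->slots + idx * words, num * words);
	} else {
		demo_copy_elems(dst, r->slots + idx * words, s0 * words);
		demo_copy_elems(dst + s0 * words, r->slots, (num - s0) * words);
	}
}
```

With an 8-slot ring and 4-byte elements, enqueuing four elements at head 6 writes slots 6, 7, 0 and 1, and a dequeue from the same head returns them in the original order. The point of normalizing to 32-bit words, as in the patch, is that one copy routine serves any element size that is a multiple of 4 bytes.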