From patchwork Tue Jul 2 21:16:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Honnappa Nagarahalli X-Patchwork-Id: 168380 Delivered-To: patch@linaro.org Received: by 2002:a92:4782:0:0:0:0:0 with SMTP id e2csp4740949ilk; Tue, 2 Jul 2019 14:16:57 -0700 (PDT) X-Google-Smtp-Source: APXvYqx2AoKIe0OPPd0oN11/14oyayfAPo3rzFKc1xlZkNibEE1zlarx6a54lXt5diXAfNHoJ9qc X-Received: by 2002:adf:80e4:: with SMTP id 91mr17826813wrl.246.1562102217355; Tue, 02 Jul 2019 14:16:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562102217; cv=none; d=google.com; s=arc-20160816; b=1IcWYIB9hhTBT1Zv2CXpgYYlmvWTDYMBdu/teWX9TiMGaZeo5Rrb4zpa26x/XveRpE 4cdhky8Oe3TV1bApPZUGM6kLC/OxvhUkEZevZ3WYFp30Ys2u4J/4nYUtlJkCK6PM/Scn pryJXp7Mb9CbKWa23+29+M0zndM4KrK3pLmQirTslfsVcoVYRBd2k1xMuYEWFKKhWc8I f3JKtmT8SnCoVYlNUwaMKRe4R2OO/N9so4f12T79geuxROQ0gdGK266ja9S/I2c/mks7 EzzFtdnuxgpzCnaSrtbivANPegXiQP+BEt4utiHbweLXDKd5pnpq4/DkZkSTc90j0v3t bpFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:cc:to:from; bh=XpMx2ueP7X5+cBg0Bd8ZFKl8gSFMnlc5Um0PhDvxc64=; b=yC95R206zxTK4CkrhU3dX6vmqLL74qkq14zsVJ0s+Yf1qn/5qnCg3EmPAqF+h64ZLr URlYw3i0XVnaxyXFJK4iwZl9dFiUFEANnsaPLbCG9cVEH/JoNKUPwavT9LSGiQoEelYa d6eiyxEQg6nvpfr7kxal/SdxnVX+UbeBTGfg1PJ/M2Fn6cCL6tS4Ca7PpC064WRNr6/l jSkBnjKalmOeoT/lWUA8FQdbYDCPG4EgwJk8Z4AHVxEnzAmm+oaaxDp1zsqw7bemtaCd LVqMj1M+CtfRY7D9BFKqhF4I6hQhQl1vqO+9azyxwLOx7EoLBZ7fykCFLSlhA2wOP/mT g+Tg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) smtp.mailfrom=dev-bounces@dpdk.org Return-Path: Received: from dpdk.org (dpdk.org. [92.243.14.124]) by mx.google.com with ESMTP id v188si66721wmg.155.2019.07.02.14.16.57; Tue, 02 Jul 2019 14:16:57 -0700 (PDT) Received-SPF: pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) client-ip=92.243.14.124; Authentication-Results: mx.google.com; spf=pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) smtp.mailfrom=dev-bounces@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DDDA61BDF3; Tue, 2 Jul 2019 23:16:51 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by dpdk.org (Postfix) with ESMTP id 3FA781BDEE; Tue, 2 Jul 2019 23:16:50 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9406CCFC; Tue, 2 Jul 2019 14:16:49 -0700 (PDT) Received: from qc2400f-1.austin.arm.com (qc2400f-1.austin.arm.com [10.118.12.65]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8092C3F703; Tue, 2 Jul 2019 14:16:49 -0700 (PDT) From: Honnappa Nagarahalli To: yipeng1.wang@intel.com, sameh.gobriel@intel.com, bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com, honnappa.nagarahalli@arm.com Cc: gavin.hu@arm.com, ruifeng.wang@arm.com, dev@dpdk.org, nd@arm.com, stable@dpdk.org Date: Tue, 2 Jul 2019 16:16:33 -0500 Message-Id: <20190702211634.37940-2-honnappa.nagarahalli@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190702211634.37940-1-honnappa.nagarahalli@arm.com> References: <20190625211520.43181-1-honnappa.nagarahalli@arm.com> <20190702211634.37940-1-honnappa.nagarahalli@arm.com> Subject: [dpdk-dev] [PATCH v2 1/2] lib/hash: use ordered loads only if signature matches X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Relaxed signature comparison is done first. Further ordered loads are done only if the signature matches. Any false positives are caught by the full key comparison. This provides performance benefits as load-acquire is executed only when required. Fixes: e605a1d36 ("hash: add lock-free r/w concurrency") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli Reviewed-by: Gavin Hu Tested-by: Ruifeng Wang Acked-by: Yipeng Wang --- lib/librte_hash/rte_cuckoo_hash.c | 37 +++++++++++++++++++------------ 1 file changed, 23 insertions(+), 14 deletions(-) -- 2.17.1 diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c index 953928f27..0e042d924 100644 --- a/lib/librte_hash/rte_cuckoo_hash.c +++ b/lib/librte_hash/rte_cuckoo_hash.c @@ -1188,22 +1188,31 @@ search_one_bucket_lf(const struct rte_hash *h, const void *key, uint16_t sig, struct rte_hash_key *k, *keys = h->key_store; for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { - key_idx = __atomic_load_n(&bkt->key_idx[i], + /* Signature comparison is done before the acquire-load + * of the key index to achieve better performance. + * This can result in the reader loading old signature + * (which matches), while the key_idx is updated to a + * value that belongs to a new key. However, the full + * key comparison will ensure that the lookup fails. + */ + if (bkt->sig_current[i] == sig) { + key_idx = __atomic_load_n(&bkt->key_idx[i], __ATOMIC_ACQUIRE); - if (bkt->sig_current[i] == sig && key_idx != EMPTY_SLOT) { - k = (struct rte_hash_key *) ((char *)keys + - key_idx * h->key_entry_size); - pdata = __atomic_load_n(&k->pdata, - __ATOMIC_ACQUIRE); + if (key_idx != EMPTY_SLOT) { + k = (struct rte_hash_key *) ((char *)keys + + key_idx * h->key_entry_size); + pdata = __atomic_load_n(&k->pdata, + __ATOMIC_ACQUIRE); - if (rte_hash_cmp_eq(key, k->key, h) == 0) { - if (data != NULL) - *data = pdata; - /* - * Return index where key is stored, - * subtracting the first dummy index - */ - return key_idx - 1; + if (rte_hash_cmp_eq(key, k->key, h) == 0) { + if (data != NULL) + *data = pdata; + /* + * Return index where key is stored, + * subtracting the first dummy index + */ + return key_idx - 1; + } } } } From patchwork Tue Jul 2 21:16:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Honnappa Nagarahalli X-Patchwork-Id: 168381 Delivered-To: patch@linaro.org Received: by 2002:a92:4782:0:0:0:0:0 with SMTP id e2csp4741078ilk; Tue, 2 Jul 2019 14:17:05 -0700 (PDT) X-Google-Smtp-Source: APXvYqzAR5yvb1b8kv37rEjNeSgE15HchPAJYxqxNFe89mfBVPbs+7jNXcKJ/i+WVRkrDGxTJd1P X-Received: by 2002:adf:aad1:: with SMTP id i17mr13689082wrc.63.1562102225572; Tue, 02 Jul 2019 14:17:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562102225; cv=none; d=google.com; s=arc-20160816; b=sIn5DAkFxZKrNfbkzWs6ruI3ZJv4xr7c1qBlZsbEFjqjyKoWLxLY4PMZtW9lh1c5p/ WFovCejkGHmMDg9lcaHPrE+/X6ubO0/YY1JCwZP8fugCQD+DZYBe9D1mdb/DtYYaXQcd 602Jf7Cor/WBjdO/h+69YepvsHJxpgOIo2ygZomdrl9TgqCBR1RGhpK8CnlHPfuHPxiS VJohjew00nHgqgh5tNvcZFXAS4MoyghDKIkr/6FuBSgGzkinZXahOlBZXDn9jCWWTgdv 92P7Ct5AzFNLEkkIAqfhzj9s45DmQTvAPvo2QNUYTPtoNbfaNcvlXf5ywg+o1j8BHKxX 6tFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:cc:to:from; bh=TTfFOCUQ7Jgz4VFWEjHpzNGmxqZSP520ZVdLcoGpr7A=; b=jgbjykXnFgMwYULQ5Fix/BAtM/A6bOY76w0lH7O0bCWE+CmkxxVmh+RowlRpR1C0u7 0BH2TkSITwycQcyb1vNo+aLjPqloKi1dSd7JptPFgxnXK62KuSPwp/XCk3QyEjxQsLyS 8vA+3FM5q1dC5/+DFuG4AUvba5hZCG9UMjvu/KZnVFMaA8cOL0leJgh+c1l5xj/w6Mlg GCkrF5WU3iDjZcCQ7m+frx8TAmwxpKcpG0mzTQXlHtjLj++Lxh34ssE3fRdhymwl6iuh YZ0LURLEGt8pDCoJ0MNkpmiFzRZifE6oaeGZV127T3yQxX2K1K5AbPRA5/jTj9AbhOmp oU0Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) smtp.mailfrom=dev-bounces@dpdk.org Return-Path: Received: from dpdk.org (dpdk.org. [92.243.14.124]) by mx.google.com with ESMTP id c3si27891wrq.3.2019.07.02.14.17.05; Tue, 02 Jul 2019 14:17:05 -0700 (PDT) Received-SPF: pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) client-ip=92.243.14.124; Authentication-Results: mx.google.com; spf=pass (google.com: domain of dev-bounces@dpdk.org designates 92.243.14.124 as permitted sender) smtp.mailfrom=dev-bounces@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id BD1731BDFD; Tue, 2 Jul 2019 23:16:53 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by dpdk.org (Postfix) with ESMTP id 303211BDEE; Tue, 2 Jul 2019 23:16:51 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B22711424; Tue, 2 Jul 2019 14:16:50 -0700 (PDT) Received: from qc2400f-1.austin.arm.com (qc2400f-1.austin.arm.com [10.118.12.65]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9C1DD3F703; Tue, 2 Jul 2019 14:16:50 -0700 (PDT) From: Honnappa Nagarahalli To: yipeng1.wang@intel.com, sameh.gobriel@intel.com, bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com, honnappa.nagarahalli@arm.com Cc: gavin.hu@arm.com, ruifeng.wang@arm.com, dev@dpdk.org, nd@arm.com, stable@dpdk.org Date: Tue, 2 Jul 2019 16:16:34 -0500 Message-Id: <20190702211634.37940-3-honnappa.nagarahalli@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190702211634.37940-1-honnappa.nagarahalli@arm.com> References: <20190625211520.43181-1-honnappa.nagarahalli@arm.com> <20190702211634.37940-1-honnappa.nagarahalli@arm.com> Subject: [dpdk-dev] [PATCH v2 2/2] lib/hash: load pData after full key compare X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When a hash entry is added, there are 2 sets of stores. 1) The application writes its data to memory (whose address is provided in rte_hash_add_key_with_hash_data API (or NULL)) 2) The rte_hash library writes to its own internal data structures; key store entry and the hash table. The only ordering requirement between these 2 is that - store to the application data must complete before the store to key_index. There are no ordering requirements between the stores to key/signature and store to application data. The synchronization point for application data can be any point between the 'store to application data' and 'store to the key_index'. So, 'pdata' should not be a guard variable for the data in hash table. It should be a guard variable only for the application data written to the memory location pointed by 'pdata'. Hence, in the lookup functions, 'pdata' can be loaded after full key comparison succeeds. The synchronization point for the application data (store-release to 'pdata' in key store) is changed to be consistent with the order of loads in lookup function. However, this change is cosmetic and does not affect the functionality. Fixes: e605a1d36 ("hash: add lock-free r/w concurrency") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli Reviewed-by: Gavin Hu Tested-by: Ruifeng Wang --- lib/librte_hash/rte_cuckoo_hash.c | 67 +++++++++++++++---------------- 1 file changed, 32 insertions(+), 35 deletions(-) -- 2.17.1 diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c index 0e042d924..55c5c1b8a 100644 --- a/lib/librte_hash/rte_cuckoo_hash.c +++ b/lib/librte_hash/rte_cuckoo_hash.c @@ -649,9 +649,11 @@ search_and_update(const struct rte_hash *h, void *data, const void *key, k = (struct rte_hash_key *) ((char *)keys + bkt->key_idx[i] * h->key_entry_size); if (rte_hash_cmp_eq(key, k->key, h) == 0) { - /* 'pdata' acts as the synchronization point - * when an existing hash entry is updated. - * Key is not updated in this case. + /* The store to application data at *data + * should not leak after the store to pdata + * in the key store. i.e. pdata is the guard + * variable. Release the application data + * to the readers. */ __atomic_store_n(&k->pdata, data, @@ -711,11 +713,10 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, /* Check if slot is available */ if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) { prim_bkt->sig_current[i] = sig; - /* Key can be of arbitrary length, so it is - * not possible to store it atomically. - * Hence the new key element's memory stores - * (key as well as data) should be complete - * before it is referenced. + /* Store to signature and key should not + * leak after the store to key_idx. i.e. + * key_idx is the guard variable for signature + * and key. */ __atomic_store_n(&prim_bkt->key_idx[i], new_idx, @@ -990,17 +991,15 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, new_k = RTE_PTR_ADD(keys, (uintptr_t)slot_id * h->key_entry_size); new_idx = (uint32_t)((uintptr_t) slot_id); - /* Copy key */ - memcpy(new_k->key, key, h->key_len); - /* Key can be of arbitrary length, so it is not possible to store - * it atomically. Hence the new key element's memory stores - * (key as well as data) should be complete before it is referenced. - * 'pdata' acts as the synchronization point when an existing hash - * entry is updated. + /* The store to application data (by the application) at *data should + * not leak after the store of pdata in the key store. i.e. pdata is + * the guard variable. Release the application data to the readers. */ __atomic_store_n(&new_k->pdata, data, __ATOMIC_RELEASE); + /* Copy key */ + memcpy(new_k->key, key, h->key_len); /* Find an empty slot and insert */ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, @@ -1064,8 +1063,10 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, /* Check if slot is available */ if (likely(cur_bkt->key_idx[i] == EMPTY_SLOT)) { cur_bkt->sig_current[i] = short_sig; - /* Store to signature should not leak after - * the store to key_idx + /* Store to signature and key should not + * leak after the store to key_idx. i.e. + * key_idx is the guard variable for signature + * and key. */ __atomic_store_n(&cur_bkt->key_idx[i], new_idx, @@ -1087,8 +1088,9 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, bkt_id = (uint32_t)((uintptr_t)ext_bkt_id) - 1; /* Use the first location of the new bucket */ (h->buckets_ext[bkt_id]).sig_current[0] = short_sig; - /* Store to signature should not leak after - * the store to key_idx + /* Store to signature and key should not leak after + * the store to key_idx. i.e. key_idx is the guard variable + * for signature and key. */ __atomic_store_n(&(h->buckets_ext[bkt_id]).key_idx[0], new_idx, @@ -1184,7 +1186,6 @@ search_one_bucket_lf(const struct rte_hash *h, const void *key, uint16_t sig, { int i; uint32_t key_idx; - void *pdata; struct rte_hash_key *k, *keys = h->key_store; for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { @@ -1201,12 +1202,13 @@ search_one_bucket_lf(const struct rte_hash *h, const void *key, uint16_t sig, if (key_idx != EMPTY_SLOT) { k = (struct rte_hash_key *) ((char *)keys + key_idx * h->key_entry_size); - pdata = __atomic_load_n(&k->pdata, - __ATOMIC_ACQUIRE); if (rte_hash_cmp_eq(key, k->key, h) == 0) { - if (data != NULL) - *data = pdata; + if (data != NULL) { + *data = __atomic_load_n( + &k->pdata, + __ATOMIC_ACQUIRE); + } /* * Return index where key is stored, * subtracting the first dummy index @@ -1904,7 +1906,6 @@ __rte_hash_lookup_bulk_lf(const struct rte_hash *h, const void **keys, uint32_t prim_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0}; uint32_t sec_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0}; struct rte_hash_bucket *cur_bkt, *next_bkt; - void *pdata[RTE_HASH_LOOKUP_BULK_MAX]; uint32_t cnt_b, cnt_a; /* Prefetch first keys */ @@ -2006,10 +2007,6 @@ __rte_hash_lookup_bulk_lf(const struct rte_hash *h, const void **keys, (const char *)h->key_store + key_idx * h->key_entry_size); - if (key_idx != EMPTY_SLOT) - pdata[i] = __atomic_load_n( - &key_slot->pdata, - __ATOMIC_ACQUIRE); /* * If key index is 0, do not compare key, * as it is checking the dummy slot @@ -2018,7 +2015,9 @@ __rte_hash_lookup_bulk_lf(const struct rte_hash *h, const void **keys, !rte_hash_cmp_eq( key_slot->key, keys[i], h)) { if (data != NULL) - data[i] = pdata[i]; + data[i] = __atomic_load_n( + &key_slot->pdata, + __ATOMIC_ACQUIRE); hits |= 1ULL << i; positions[i] = key_idx - 1; @@ -2040,10 +2039,6 @@ __rte_hash_lookup_bulk_lf(const struct rte_hash *h, const void **keys, (const char *)h->key_store + key_idx * h->key_entry_size); - if (key_idx != EMPTY_SLOT) - pdata[i] = __atomic_load_n( - &key_slot->pdata, - __ATOMIC_ACQUIRE); /* * If key index is 0, do not compare key, * as it is checking the dummy slot @@ -2053,7 +2048,9 @@ __rte_hash_lookup_bulk_lf(const struct rte_hash *h, const void **keys, !rte_hash_cmp_eq( key_slot->key, keys[i], h)) { if (data != NULL) - data[i] = pdata[i]; + data[i] = __atomic_load_n( + &key_slot->pdata, + __ATOMIC_ACQUIRE); hits |= 1ULL << i; positions[i] = key_idx - 1;