From patchwork Tue Oct 23 04:21:59 2018
X-Patchwork-Submitter: Honnappa Nagarahalli
X-Patchwork-Id: 149408
From: Honnappa Nagarahalli
To: bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com
Cc: dev@dpdk.org, yipeng1.wang@intel.com, honnappa.nagarahalli@arm.com,
 gavin.hu@arm.com, dharmik.thakkar@arm.com, nd@arm.com
Date: Mon, 22 Oct 2018 23:21:59 -0500
Message-Id: <1540268524-126673-1-git-send-email-honnappa.nagarahalli@arm.com>
X-Mailer: git-send-email 2.7.4
Subject: [dpdk-dev] [PATCH v5 0/5] Address reader-writer concurrency in rte_hash

This patch series depends on the following patches, in this order:
http://patchwork.dpdk.org/cover/45611/
http://patchwork.dpdk.org/patch/47196/

Currently, reader-writer concurrency problems in rte_hash are
addressed using reader-writer locks. Use of reader-writer locks results in the
following issues:

1) In many of the use cases for the hash table, writer threads run on the
   control plane. If the writer is preempted while holding the lock, it
   blocks the readers for an extended period, resulting in packet drops.
   This problem seems to apply to platforms with transactional memory
   support as well, because of the algorithm used for
   rte_rwlock_write_lock_tm:

   static inline void
   rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
   {
           if (likely(rte_try_tm(&rwl->cnt)))
                   return;
           rte_rwlock_write_lock(rwl);
   }

   i.e. there is a possibility of falling back to rte_rwlock_write_lock
   in failure cases.

2) The reader-writer lock based solution does not address the following
   issue: rte_hash_lookup_xxx APIs return the index of the element in the
   key store. The application (reader) can use that index to reference
   other data structures in its scope. Because of this, the index should
   not be freed until the application has finished using it.

3) Since the writer blocks all the readers, the hash lookup rate drops
   significantly when there is activity on the writer. This happens even
   for unrelated entries. Performance numbers given below clearly indicate
   this.

A lock-free solution is required to solve these problems. This patch series
adds the lock-free capabilities in the following steps:

1) Add support to not free the key-store index upon calling
   rte_hash_del_key_xxx APIs. This solves the issue in 2).

2) Correct the alignment for the key store entry to prepare for memory
   ordering.

3) Add memory ordering to prevent race conditions when a new key is added
   to the table.

4) The reader-writer concurrency issue caused by moving keys to their
   alternate locations during key insert is solved by introducing an
   atomic global counter that indicates a change in the table.

5) This solution also has to solve the issue of readers using a key store
   element even after the key is deleted from the control plane.
   To solve this issue, the rte_hash_del_key_xxx APIs do not free the key
   store element when the lock-free algorithm is enabled. The key store
   element has to be freed using the newly introduced
   rte_hash_free_key_with_position API. It needs to be called once all the
   readers have stopped using the key store element. How this is determined
   is outside the scope of this patch (RCU is one such mechanism that the
   application can use).

6) Finally, a lock-free reader-writer concurrency flag is added to enable
   this feature at run time.

Performance numbers can be obtained from the additional test case added as
part of this patch.

v4->v5
1) Rebased with patch v8 of extendable hash bucket feature
   (http://patchwork.dpdk.org/patch/47196/)
2) Changed 'success' to 'hit' and 'fail' to 'miss' in read/write
   concurrent lock free test code

v3->v4
1) Merged 4/7, 5/7 and 6/7 into 4/5
2) Changed RTE_HASH_EXTRA_FLAGS_RECYCLE_ON_DEL to
   RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL (Yipeng)
3) Changed the commit log for the patch
   "hash: correct key store element alignment" (Yipeng)
4) Changed the comment for rte_hash_add_key_data API (Yipeng)
5) Added bulk lookup for lock-free performance test case (Yipeng)
6) Reduced the number of keys to 4M in the tests (Yipeng)

v2->v3
1) Rebased on top of:
   http://patchwork.dpdk.org/cover/45611/
   http://patchwork.dpdk.org/project/dpdk/list/?series=1822
2) Added comments to RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF to indicate
   multi writer support (Yipeng)
3) Updated the comments for rte_hash_add_key_data_xxx APIs to free the
   'data' only after the readers have completed using 'data' (Yipeng)
4) Extendable tables are not supported when lock free algorithm is
   requested.

v1->v2
1) Separate multi-writer capability from rw concurrency
2) Add do not recycle on delete feature (Yipeng)
3) Add Arm copyright
4) Add test case to test lock-free algorithm and multi-writer test case
   (Yipeng)
5) Additional API documentation to indicate RCU usage (Yipeng)
6) Additional documentation on rte_hash_reset API (Yipeng)
7) Allocate memory for the global counter and avoid API changes (Yipeng)

Dharmik Thakkar (1):
  test/hash: read-write lock-free concurrency test

Honnappa Nagarahalli (4):
  hash: separate multi-writer from rw-concurrency
  hash: support do not free on delete
  hash: fix key store element alignment
  hash: add lock-free read-write concurrency

 lib/librte_hash/rte_cuckoo_hash.c    |  520 +++++++++++----
 lib/librte_hash/rte_cuckoo_hash.h    |   21 +-
 lib/librte_hash/rte_hash.h           |   77 ++-
 lib/librte_hash/rte_hash_version.map |    7 +
 test/test/Makefile                   |    1 +
 test/test/meson.build                |    1 +
 test/test/test_hash.c                |  140 +++-
 test/test/test_hash_readwrite.c      |    6 +-
 test/test/test_hash_readwrite_lf.c   | 1220 ++++++++++++++++++++++++++++++++++
 9 files changed, 1848 insertions(+), 145 deletions(-)
 create mode 100644 test/test/test_hash_readwrite_lf.c

-- 
2.7.4
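
Illustration only: the idea in step 4) of the cover letter (an atomic global
counter that indicates a change in the table) can be sketched with plain C11
atomics. This is not the rte_hash code from this series. The struct
toy_table type, the toy_* functions, the primary/alternate slot rule, the
packing of key and value into one 64-bit word and the memory orderings are
all assumptions made for the sketch. The writer bumps the counter before it
moves a key to its alternate location; a reader that misses re-reads the
counter and retries if it changed during the search.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_SLOTS 8u	/* hypothetical table size */

/* Hypothetical toy table, not the rte_hash layout. */
struct toy_table {
	_Atomic uint32_t chng_cnt;		/* bumped before every key move */
	_Atomic uint64_t slot[TOY_SLOTS];	/* key << 32 | value, 0 = empty */
};

/* Every key has two candidate slots, as in cuckoo hashing. Key 0 is reserved. */
static uint32_t primary_slot(uint32_t key)   { return key % TOY_SLOTS; }
static uint32_t alternate_slot(uint32_t key) { return (key / TOY_SLOTS) % TOY_SLOTS; }

static void toy_init(struct toy_table *t)
{
	atomic_init(&t->chng_cnt, 0);
	for (uint32_t i = 0; i < TOY_SLOTS; i++)
		atomic_init(&t->slot[i], 0);
}

/* Writer: place a non-zero key in its primary slot. */
static void toy_insert(struct toy_table *t, uint32_t key, uint32_t val)
{
	uint64_t e = ((uint64_t)key << 32) | val;

	atomic_store_explicit(&t->slot[primary_slot(key)], e,
			      memory_order_release);
}

/* Writer: move a key from its primary to its alternate slot. */
static void toy_move_to_alternate(struct toy_table *t, uint32_t key)
{
	uint32_t from = primary_slot(key), to = alternate_slot(key);
	uint64_t e = atomic_load_explicit(&t->slot[from], memory_order_relaxed);

	if ((uint32_t)(e >> 32) != key)
		return;		/* key is not in its primary slot */

	/* Announce the move first, so a racing reader that misses retries. */
	atomic_fetch_add_explicit(&t->chng_cnt, 1, memory_order_release);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&t->slot[to], e, memory_order_release);
	atomic_store_explicit(&t->slot[from], 0, memory_order_release);
}

/* Reader: returns 0 and fills *val on a hit, -1 on a miss. Takes no lock. */
static int toy_lookup(struct toy_table *t, uint32_t key, uint32_t *val)
{
	uint32_t before, after;

	do {
		before = atomic_load_explicit(&t->chng_cnt, memory_order_acquire);

		uint32_t idx[2] = { primary_slot(key), alternate_slot(key) };
		for (int i = 0; i < 2; i++) {
			uint64_t e = atomic_load_explicit(&t->slot[idx[i]],
							  memory_order_acquire);
			if (e != 0 && (uint32_t)(e >> 32) == key) {
				*val = (uint32_t)e;
				return 0;
			}
		}

		/* Miss: only trust it if no key moved while we searched. */
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&t->chng_cnt, memory_order_acquire);
	} while (before != after);

	return -1;
}
```

A hit is returned immediately; only a miss pays for the second counter read,
so in this sketch readers never block on the writer. Deciding when a deleted
entry may be reused is a separate problem (steps 1 and 5 above) and is not
modelled here.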