From patchwork Thu Feb 27 22:38:09 2025
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 868993
Subject: [PATCH 2/8] cxl/memfeature: Add CXL memory device patrol scrub
 control feature
Date: Thu, 27 Feb 2025 22:38:09 +0000
Message-ID: <20250227223816.2036-3-shiju.jose@huawei.com>
In-Reply-To: <20250227223816.2036-1-shiju.jose@huawei.com>
References: <20250227223816.2036-1-shiju.jose@huawei.com>

From: Shiju Jose

CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub control
feature. The device patrol scrub proactively locates and corrects errors on a
regular cycle. Allow specifying the number of hours within which the patrol
scrub must be completed, subject to the minimum and maximum limits reported
by the device. Also allow disabling scrub, trading off error rates against
performance.

Add support for patrol scrub control on CXL memory devices. Register with the
EDAC device driver, which retrieves the scrub attribute descriptors from EDAC
scrub and exposes the sysfs scrub control attributes to userspace; a condensed
sketch of that registration flow follows.
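To make the flow concrete, here is a minimal sketch of the registration path
this patch implements (cxl_ps_scrub_ops and cxl_ps_ctx refer to code added
below in this patch; the device name and instance are illustrative only):

    /* Sketch only: wire one scrub instance into EDAC and register it. */
    struct edac_dev_feature feat = {
        .ft_type = RAS_FEAT_SCRUB,
        .instance = 0,
        .scrub_ops = &cxl_ps_scrub_ops, /* defined later in this patch */
        .ctx = cxl_ps_ctx,              /* per-device scrub context */
    };

    return edac_dev_register(&cxlmd->dev, "cxl_mem0", NULL, 1, &feat);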
For example, scrub control for the CXL memory device "cxl_mem0" is exposed in
/sys/bus/edac/devices/cxl_mem0/scrubX/.

Additionally, add support for region-based CXL memory patrol scrub control.
CXL memory regions may be interleaved across one or more CXL memory devices.
For example, region-based scrub control for "cxl_region1" is exposed in
/sys/bus/edac/devices/cxl_region1/scrubX/.

Reviewed-by: Dave Jiang
Co-developed-by: Jonathan Cameron
Signed-off-by: Jonathan Cameron
Signed-off-by: Shiju Jose
---
 Documentation/edac/scrub.rst   |  64 +++++
 drivers/cxl/Kconfig            |  18 ++
 drivers/cxl/core/Makefile      |   1 +
 drivers/cxl/core/memfeatures.c | 476 +++++++++++++++++++++++++++++++++
 drivers/cxl/core/region.c      |   5 +
 drivers/cxl/cxlmem.h           |  10 +
 drivers/cxl/mem.c              |   4 +
 7 files changed, 578 insertions(+)
 create mode 100644 drivers/cxl/core/memfeatures.c

diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
index daab929cdba1..788cf43188a4 100644
--- a/Documentation/edac/scrub.rst
+++ b/Documentation/edac/scrub.rst
@@ -264,3 +264,67 @@ Sysfs files are documented in
 `Documentation/ABI/testing/sysfs-edac-scrub`
 `Documentation/ABI/testing/sysfs-edac-ecs`
+
+Examples
+--------
+
+The usage takes the form shown in these examples:
+
+1. CXL memory device patrol scrubber
+
+1.1 Device based scrubbing
+
+1.1.1. Query the device's default/current scrub cycle setting.
+
+# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+43200
+
+1.1.2. Query the range of scrub cycles supported by the device.
+
+# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration
+
+918000
+
+1.1.3. Program scrubbing for the device to repeat every 21600 seconds (a
+quarter of a day).
+
+# echo 21600 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
+
+1.2. Region based scrubbing
+
+CXL memory is exposed to the memory management subsystem, and ultimately to
+userspace, via CXL regions. These can incorporate one or more parts of
+multiple CXL Type 3 devices with traffic interleaved across them. The user
+may want to control the scrub rate via this more abstract region instead of
+having to figure out the constituent devices and program them separately.
+The scrub rate for each device covers the whole device, so if multiple
+regions use parts of that device, then requests for scrubbing of other
+regions may result in a higher scrub rate than requested for this specific
+region. A write to a region's scrub controls therefore fans out to every
+device backing the region, as sketched below.
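A minimal sketch of that fan-out, condensed from the cxl_ps_set_attrs() code
added below (locking and error handling elided):

    /* Sketch: apply one scrub setting to every device backing a region. */
    for (i = p->interleave_ways - 1; i >= 0; i--) {
        struct cxl_endpoint_decoder *cxled = p->targets[i];
        struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);

        ret = cxl_mem_ps_set_attrs(dev, ctx, &cxlmd->cxlds->cxl_mbox,
                                   params, param_type);
        if (ret)
            break;
    }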
+
+1.2.1 Query the device's default/current scrub cycle setting for a CXL
+memory region.
+
+# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+86400
+
+1.2.2 Query the range of scrub cycles supported for a CXL memory region.
+
+# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration
+
+918000
+
+1.2.3 Program scrubbing for a region to repeat every 43200 seconds (half a
+day).
+
+# echo 43200 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 205547e5543a..b83bdb30b702 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -113,6 +113,24 @@ config CXL_FEATURES
 	  If unsure say 'n'
 
+config CXL_RAS_FEATURES
+	bool "CXL: Memory RAS features"
+	depends on CXL_MEM
+	depends on CXL_FEATURES
+	depends on EDAC=y || (CXL_BUS=m && EDAC=m)
+	depends on EDAC_SCRUB
+	help
+	  The CXL memory RAS feature control is optional and allows the
+	  host to control the RAS feature configurations of CXL Type 3
+	  devices.
+
+	  It registers with the EDAC device subsystem to expose control
+	  attributes of the CXL memory device's RAS features to the user.
+	  It provides interface functions to support configuring the CXL
+	  memory device's RAS features.
+
+	  Say 'y/m' if you have an expert need to change default settings
+	  of a memory RAS feature established by the platform/device (e.g.
+	  scrub rates for the patrol scrub feature). Otherwise say 'n'.
+
 config CXL_PORT
 	default CXL_BUS
 	tristate
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index e1d591e52d4b..2f48845b86d7 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -18,4 +18,5 @@ cxl_core-y += acpi.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
 cxl_core-$(CONFIG_CXL_FEATURES) += features.o
+cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeatures.o
 cxl_core-$(CONFIG_CXL_MCE) += mce.o
diff --git a/drivers/cxl/core/memfeatures.c b/drivers/cxl/core/memfeatures.c
new file mode 100644
index 000000000000..7a5c0645a07e
--- /dev/null
+++ b/drivers/cxl/core/memfeatures.c
@@ -0,0 +1,476 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * CXL memory RAS feature driver.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ *  - Supports functions to configure RAS features of the
+ *    CXL memory devices.
+ *  - Registers with the EDAC device subsystem driver to expose
+ *    the features sysfs attributes to the user for configuring
+ *    CXL memory RAS features.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "core.h"
+
+#define CXL_DEV_NUM_RAS_FEATURES	1
+#define CXL_DEV_HOUR_IN_SECS	3600
+
+#define CXL_DEV_NAME_LEN	128
+
+static int cxl_hold_region_and_dpa(void)
+{
+	int rc;
+
+	rc = down_read_interruptible(&cxl_region_rwsem);
+	if (rc)
+		return rc;
+
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc) {
+		up_read(&cxl_region_rwsem);
+		return rc;
+	}
+
+	return 0;
+}
+
+static void cxl_release_region_and_dpa(void)
+{
+	up_read(&cxl_dpa_rwsem);
+	up_read(&cxl_region_rwsem);
+}
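The helper pair above is intended to bracket any walk of a region's targets;
a sketch of the pattern (both locks must be dropped on every exit path, which
matters for the error paths in the functions below):

    /* Sketch: every return after a successful hold must release. */
    int rc = cxl_hold_region_and_dpa();

    if (rc)
        return rc;
    /* ... walk p->targets[] under both locks ... */
    cxl_release_region_and_dpa();
    return 0;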
+
+/*
+ * CXL memory patrol scrub control functions
+ */
+struct cxl_patrol_scrub_context {
+	u8 instance;
+	u16 get_feat_size;
+	u16 set_feat_size;
+	u8 get_version;
+	u8 set_version;
+	u16 effects;
+	struct cxl_memdev *cxlmd;
+	struct cxl_region *cxlr;
+};
+
+/**
+ * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure.
+ * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub.
+ * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable.
+ * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours.
+ *                   [OUT] Current patrol scrub cycle in hours.
+ * @min_scrub_cycle_hrs: [OUT] minimum patrol scrub cycle in hours supported.
+ */
+struct cxl_memdev_ps_params {
+	bool enable;
+	bool scrub_cycle_changeable;
+	u8 scrub_cycle_hrs;
+	u8 min_scrub_cycle_hrs;
+};
+
+enum cxl_scrub_param {
+	CXL_PS_PARAM_ENABLE,
+	CXL_PS_PARAM_SCRUB_CYCLE,
+};
+
+#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK	BIT(0)
+#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK	BIT(1)
+#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK	GENMASK(7, 0)
+#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK	GENMASK(15, 8)
+#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK	BIT(0)
+
+/*
+ * See CXL spec rev 3.2 @8.2.10.9.11.1 Table 8-222 Device Patrol Scrub Control
+ * Feature Readable Attributes.
+ */
+struct cxl_memdev_ps_rd_attrs {
+	u8 scrub_cycle_cap;
+	__le16 scrub_cycle_hrs;
+	u8 scrub_flags;
+} __packed;
+
+/*
+ * See CXL spec rev 3.2 @8.2.10.9.11.1 Table 8-223 Device Patrol Scrub Control
+ * Feature Writable Attributes.
+ */
+struct cxl_memdev_ps_wr_attrs {
+	u8 scrub_cycle_hrs;
+	u8 scrub_flags;
+} __packed;
+
+static int cxl_mem_ps_get_attrs(struct cxl_mailbox *cxl_mbox,
+				struct cxl_memdev_ps_params *params)
+{
+	size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
+	u16 scrub_cycle_hrs;
+	size_t data_size;
+	struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
+				kzalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	data_size = cxl_get_feature(cxl_mbox, &CXL_FEAT_PATROL_SCRUB_UUID,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, NULL);
+	if (!data_size)
+		return -EIO;
+
+	params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
+						   rd_attrs->scrub_cycle_cap);
+	params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+				   rd_attrs->scrub_flags);
+	scrub_cycle_hrs = le16_to_cpu(rd_attrs->scrub_cycle_hrs);
+	params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+					    scrub_cycle_hrs);
+	params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
+						scrub_cycle_hrs);
+
+	return 0;
+}
+
+static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx,
+			    struct cxl_memdev_ps_params *params)
+{
+	struct cxl_mailbox *cxl_mbox;
+	struct cxl_memdev *cxlmd;
+	u16 min_scrub_cycle = 0;
+	int i, ret;
+
+	if (cxl_ps_ctx->cxlr) {
+		struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+		struct cxl_region_params *p = &cxlr->params;
+
+		ret = cxl_hold_region_and_dpa();
+		if (ret)
+			return ret;
+		for (i = p->interleave_ways - 1; i >= 0; i--) {
+			struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+			cxlmd = cxled_to_memdev(cxled);
+			cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+			ret = cxl_mem_ps_get_attrs(cxl_mbox, params);
+			if (ret) {
+				/* Do not leak the locks on the error path */
+				cxl_release_region_and_dpa();
+				return ret;
+			}
+
+			if (params->min_scrub_cycle_hrs > min_scrub_cycle)
+				min_scrub_cycle = params->min_scrub_cycle_hrs;
+		}
+		cxl_release_region_and_dpa();
+
+		params->min_scrub_cycle_hrs = min_scrub_cycle;
+		return 0;
+	}
+	cxl_mbox = &cxl_ps_ctx->cxlmd->cxlds->cxl_mbox;
+
+	return cxl_mem_ps_get_attrs(cxl_mbox, params);
+}
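A worked decode of the 16-bit Scrub Cycle field read above, using the masks
just defined (the raw value is illustrative only):

    /* Raw little-endian Scrub Cycle field: 0x060C. */
    u16 raw = 0x060C;
    u8 cur_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, raw); /* 0x0C = 12 hours, bits 7:0 */
    u8 min_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK, raw); /* 0x06 =  6 hours, bits 15:8 */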
+static int cxl_mem_ps_set_attrs(struct device *dev,
+				struct cxl_patrol_scrub_context *cxl_ps_ctx,
+				struct cxl_mailbox *cxl_mbox,
+				struct cxl_memdev_ps_params *params,
+				enum cxl_scrub_param param_type)
+{
+	struct cxl_memdev_ps_wr_attrs wr_attrs;
+	struct cxl_memdev_ps_params rd_params;
+	int ret;
+
+	ret = cxl_mem_ps_get_attrs(cxl_mbox, &rd_params);
+	if (ret) {
+		dev_dbg(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n", ret);
+		return ret;
+	}
+
+	switch (param_type) {
+	case CXL_PS_PARAM_ENABLE:
+		wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+						  params->enable);
+		wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+						      rd_params.scrub_cycle_hrs);
+		break;
+	case CXL_PS_PARAM_SCRUB_CYCLE:
+		if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) {
+			dev_dbg(dev, "Invalid CXL patrol scrub cycle(%d) to set\n",
+				params->scrub_cycle_hrs);
+			dev_dbg(dev, "Minimum supported CXL patrol scrub cycle in hours %d\n",
+				rd_params.min_scrub_cycle_hrs);
+			return -EINVAL;
+		}
+		wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+						      params->scrub_cycle_hrs);
+		wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+						  rd_params.enable);
+		break;
+	}
+
+	ret = cxl_set_feature(cxl_mbox, &CXL_FEAT_PATROL_SCRUB_UUID,
+			      cxl_ps_ctx->set_version,
+			      &wr_attrs, sizeof(wr_attrs),
+			      CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET,
+			      0, NULL);
+	if (ret) {
+		dev_dbg(dev, "CXL patrol scrub set feature failed ret=%d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int cxl_ps_set_attrs(struct device *dev,
+			    struct cxl_patrol_scrub_context *cxl_ps_ctx,
+			    struct cxl_memdev_ps_params *params,
+			    enum cxl_scrub_param param_type)
+{
+	struct cxl_mailbox *cxl_mbox;
+	struct cxl_memdev *cxlmd;
+	int ret, i;
+
+	if (cxl_ps_ctx->cxlr) {
+		struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+		struct cxl_region_params *p = &cxlr->params;
+
+		ret = cxl_hold_region_and_dpa();
+		if (ret)
+			return ret;
+		for (i = p->interleave_ways - 1; i >= 0; i--) {
+			struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+			cxlmd = cxled_to_memdev(cxled);
+			cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+			ret = cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, cxl_mbox,
+						   params, param_type);
+			if (ret) {
+				/* Do not leak the locks on the error path */
+				cxl_release_region_and_dpa();
+				return ret;
+			}
+		}
+		cxl_release_region_and_dpa();
+
+		return 0;
+	}
+	cxl_mbox = &cxl_ps_ctx->cxlmd->cxlds->cxl_mbox;
+
+	return cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, cxl_mbox,
+				    params, param_type);
+}
+
+static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params;
+	int ret;
+
+	ret = cxl_ps_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	*enabled = params.enable;
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params = {
+		.enable = enable,
+	};
+
+	return cxl_ps_set_attrs(dev, ctx, &params, CXL_PS_PARAM_ENABLE);
+}
+
+static int cxl_patrol_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data,
+						 u32 *min)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params;
+	int ret;
+
+	ret = cxl_ps_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+	*min = params.min_scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data,
+						 u32 *max)
+{
+	*max = U8_MAX * CXL_DEV_HOUR_IN_SECS; /* Max set by register size */
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_read_scrub_cycle(struct device *dev, void *drv_data,
+					     u32 *scrub_cycle_secs)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params;
+	int ret;
+
+	ret = cxl_ps_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	*scrub_cycle_secs = params.scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_write_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 scrub_cycle_secs)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+ struct cxl_memdev_ps_params params = { + .scrub_cycle_hrs = scrub_cycle_secs / CXL_DEV_HOUR_IN_SECS, + }; + + return cxl_ps_set_attrs(dev, ctx, ¶ms, CXL_PS_PARAM_SCRUB_CYCLE); +} + +static const struct edac_scrub_ops cxl_ps_scrub_ops = { + .get_enabled_bg = cxl_patrol_scrub_get_enabled_bg, + .set_enabled_bg = cxl_patrol_scrub_set_enabled_bg, + .get_min_cycle = cxl_patrol_scrub_read_min_scrub_cycle, + .get_max_cycle = cxl_patrol_scrub_read_max_scrub_cycle, + .get_cycle_duration = cxl_patrol_scrub_read_scrub_cycle, + .set_cycle_duration = cxl_patrol_scrub_write_scrub_cycle, +}; + +static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd, + struct edac_dev_feature *ras_feature, u8 scrub_inst) +{ + struct cxl_patrol_scrub_context *cxl_ps_ctx; + struct cxl_feat_entry *feat_entry; + + feat_entry = cxl_get_feature_entry(cxlmd->cxlds, &CXL_FEAT_PATROL_SCRUB_UUID); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE)) + return -EOPNOTSUPP; + + cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL); + if (!cxl_ps_ctx) + return -ENOMEM; + + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) { + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size), + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size), + .get_version = feat_entry->get_feat_ver, + .set_version = feat_entry->set_feat_ver, + .effects = le16_to_cpu(feat_entry->effects), + .instance = scrub_inst, + .cxlmd = cxlmd, + }; + + ras_feature->ft_type = RAS_FEAT_SCRUB; + ras_feature->instance = cxl_ps_ctx->instance; + ras_feature->scrub_ops = &cxl_ps_scrub_ops; + ras_feature->ctx = cxl_ps_ctx; + + return 0; +} + +static int cxl_region_scrub_init(struct cxl_region *cxlr, + struct edac_dev_feature *ras_feature, u8 scrub_inst) +{ + struct cxl_patrol_scrub_context *cxl_ps_ctx; + struct cxl_region_params *p = &cxlr->params; + struct cxl_feat_entry *feat_entry = NULL; + struct cxl_memdev *cxlmd; + int i; + + /* + * The cxl_region_rwsem must be held if the code below is used in a context + * other than when the region is in the probe state, as shown here. 
+ */ + for (i = p->interleave_ways - 1; i >= 0; i--) { + struct cxl_endpoint_decoder *cxled = p->targets[i]; + + cxlmd = cxled_to_memdev(cxled); + feat_entry = cxl_get_feature_entry(cxlmd->cxlds, &CXL_FEAT_PATROL_SCRUB_UUID); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE)) + return -EOPNOTSUPP; + } + if (!feat_entry) + return -EOPNOTSUPP; + + cxl_ps_ctx = devm_kzalloc(&cxlr->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL); + if (!cxl_ps_ctx) + return -ENOMEM; + + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) { + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size), + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size), + .get_version = feat_entry->get_feat_ver, + .set_version = feat_entry->set_feat_ver, + .effects = le16_to_cpu(feat_entry->effects), + .instance = scrub_inst, + .cxlr = cxlr, + }; + + ras_feature->ft_type = RAS_FEAT_SCRUB; + ras_feature->instance = cxl_ps_ctx->instance; + ras_feature->scrub_ops = &cxl_ps_scrub_ops; + ras_feature->ctx = cxl_ps_ctx; + + return 0; +} + +int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd) +{ + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; + char cxl_dev_name[CXL_DEV_NAME_LEN]; + int num_ras_features = 0; + u8 scrub_inst = 0; + int rc; + + rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features], + scrub_inst); + if (rc < 0 && rc != -EOPNOTSUPP) + return rc; + + if (rc != -EOPNOTSUPP) + num_ras_features++; + + snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s", + "cxl", dev_name(&cxlmd->dev)); + + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, + num_ras_features, ras_features); +} +EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_edac_register, "CXL"); + +int devm_cxl_region_edac_register(struct cxl_region *cxlr) +{ + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; + char cxl_dev_name[CXL_DEV_NAME_LEN]; + int num_ras_features = 0; + u8 scrub_inst = 0; + int rc; + + rc = cxl_region_scrub_init(cxlr, &ras_features[num_ras_features], + scrub_inst); + if (rc < 0) + return rc; + + num_ras_features++; + + snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s", + "cxl", dev_name(&cxlr->dev)); + + return edac_dev_register(&cxlr->dev, cxl_dev_name, NULL, + num_ras_features, ras_features); +} +EXPORT_SYMBOL_NS_GPL(devm_cxl_region_edac_register, "CXL"); diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index a83301f24fa2..9e7b716296d7 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -3545,6 +3545,11 @@ static int cxl_region_probe(struct device *dev) case CXL_PARTMODE_PMEM: return devm_cxl_add_pmem_region(cxlr); case CXL_PARTMODE_RAM: + rc = devm_cxl_region_edac_register(cxlr); + if (rc) + dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n", + cxlr->id); + /* * The region can not be manged by CXL if any portion of * it is already online as 'System RAM' diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 3ec6b906371b..a08405f94a30 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -853,6 +853,16 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd); int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa); int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa); +#if IS_ENABLED(CONFIG_CXL_RAS_FEATURES) +int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd); +int devm_cxl_region_edac_register(struct cxl_region *cxlr); +#else +static inline int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd) +{ return 0; } +static inline int 
devm_cxl_region_edac_register(struct cxl_region *cxlr)
+{ return 0; }
+#endif
+
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 9675243bd05b..6e6777b7bafb 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -180,6 +180,10 @@ static int cxl_mem_probe(struct device *dev)
 			return rc;
 	}
 
+	rc = devm_cxl_memdev_edac_register(cxlmd);
+	if (rc)
+		dev_dbg(dev, "CXL memdev EDAC registration failed rc=%d\n", rc);
+
 	/*
 	 * The kernel may be operating out of CXL memory on this device,
 	 * there is no spec defined way to determine whether this device

From patchwork Thu Feb 27 22:38:11 2025
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 868992
Subject: [PATCH 4/8] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox
 command
Date: Thu, 27 Feb 2025 22:38:11 +0000
Message-ID: <20250227223816.2036-5-shiju.jose@huawei.com>
In-Reply-To: <20250227223816.2036-1-shiju.jose@huawei.com>
References: <20250227223816.2036-1-shiju.jose@huawei.com>

From: Shiju Jose
Add support for the PERFORM_MAINTENANCE mailbox command.

CXL spec 3.2 section 8.2.10.7.1 describes the Perform Maintenance command.
This command requests the device to execute the maintenance operation
specified by the maintenance operation class and the maintenance operation
subclass.

Reviewed-by: Jonathan Cameron
Reviewed-by: Dave Jiang
Signed-off-by: Shiju Jose
---
 drivers/cxl/core/mbox.c | 34 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h    | 17 +++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d72764056ce6..19d46a284650 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -824,6 +824,40 @@ static const uuid_t log_uuid[] = {
 	[VENDOR_DEBUG_UUID] = DEFINE_CXL_VENDOR_DEBUG_UUID,
 };
 
+int cxl_do_maintenance(struct cxl_mailbox *cxl_mbox,
+		       u8 class, u8 subclass,
+		       void *data_in, size_t data_in_size)
+{
+	struct cxl_memdev_maintenance_pi {
+		struct cxl_mbox_do_maintenance_hdr hdr;
+		u8 data[];
+	} __packed;
+	struct cxl_mbox_cmd mbox_cmd;
+	size_t hdr_size;
+
+	struct cxl_memdev_maintenance_pi *pi __free(kfree) =
+			kmalloc(cxl_mbox->payload_size, GFP_KERNEL);
+	if (!pi)
+		return -ENOMEM;
+
+	pi->hdr.op_class = class;
+	pi->hdr.op_subclass = subclass;
+	hdr_size = sizeof(pi->hdr);
+	/*
+	 * Check minimum mbox payload size is available for
+	 * the maintenance data transfer.
+	 */
+	if (hdr_size + data_in_size > cxl_mbox->payload_size)
+		return -ENOMEM;
+
+	memcpy(pi->data, data_in, data_in_size);
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_DO_MAINTENANCE,
+		.size_in = hdr_size + data_in_size,
+		.payload_in = pi,
+	};
+
+	return cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_do_maintenance, "CXL");
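A hedged usage sketch of the new helper; the opcode, signature and helper are
from this patch, while the class/subclass values and the one-byte payload are
placeholders, not values taken from the spec text above:

    /* Illustrative only: issue a maintenance request with a one-byte
     * operation-specific payload. 0x02/0x00 are example values. */
    u8 payload[1] = { 0 };
    int rc = cxl_do_maintenance(cxl_mbox, 0x02, 0x00,
                                payload, sizeof(payload));
    if (rc)
        dev_dbg(dev, "maintenance request failed rc=%d\n", rc);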
+
 /**
  * cxl_enumerate_cmds() - Enumerate commands for a device.
  * @mds: The driver data for the operation
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a08405f94a30..642ce976dcee 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -527,6 +527,7 @@ enum cxl_opcode {
 	CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
 	CXL_MBOX_OP_GET_FEATURE = 0x0501,
 	CXL_MBOX_OP_SET_FEATURE = 0x0502,
+	CXL_MBOX_OP_DO_MAINTENANCE = 0x0600,
 	CXL_MBOX_OP_IDENTIFY = 0x4000,
 	CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
 	CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
@@ -827,6 +828,19 @@ enum {
 	CXL_PMEM_SEC_PASS_USER,
 };
 
+/*
+ * Perform Maintenance CXL 3.2 Spec 8.2.10.7.1
+ */
+
+/*
+ * Perform Maintenance input payload
+ * CXL rev 3.2 section 8.2.10.7.1 Table 8-117
+ */
+struct cxl_mbox_do_maintenance_hdr {
+	u8 op_class;
+	u8 op_subclass;
+} __packed;
+
 int cxl_internal_send_cmd(struct cxl_mailbox *cxl_mbox,
 			  struct cxl_mbox_cmd *cmd);
 int cxl_dev_state_identify(struct cxl_memdev_state *mds);
@@ -898,4 +912,7 @@ struct cxl_hdm {
 struct seq_file;
 struct dentry *cxl_debugfs_create_dir(const char *dir);
 void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
+int cxl_do_maintenance(struct cxl_mailbox *cxl_mbox,
+		       u8 class, u8 subclass,
+		       void *data_in, size_t data_in_size);
 #endif /* __CXL_MEM_H__ */

From patchwork Thu Feb 27 22:38:13 2025
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 868991
Subject: [PATCH 6/8] cxl: Support for finding memory operation attributes
 from the current boot
Date: Thu, 27 Feb 2025 22:38:13 +0000
Message-ID: <20250227223816.2036-7-shiju.jose@huawei.com>
In-Reply-To: <20250227223816.2036-1-shiju.jose@huawei.com>
References: <20250227223816.2036-1-shiju.jose@huawei.com>

From: Shiju Jose

Certain operations on memory, such as memory repair, are permitted only when
the address and other attributes for the operation are from the current boot.
This is determined by checking whether the memory attributes for the
operation match those in the CXL gen_media or CXL DRAM memory event records
reported during the current boot.

The CXL event records must be backed up because they are cleared in the
hardware after being processed by the kernel.

Add support for storing CXL gen_media or CXL DRAM memory event records in
xarrays. Additionally, implement helper functions to find a matching record
in the xarray storage based on the memory attributes and repair type.

Add a validity check when matching attributes for sparing, using the validity
flags in the DRAM event record, to ensure that all attributes required for a
requested repair operation are valid and set.

Presently this is supported only when CONFIG_CXL_RAS_FEATURES is enabled, as
it is currently used only by the CXL RAS features.

Co-developed-by: Jonathan Cameron
Signed-off-by: Jonathan Cameron
Signed-off-by: Shiju Jose
---
 drivers/cxl/core/Makefile |   2 +-
 drivers/cxl/core/mbox.c   |  11 ++-
 drivers/cxl/core/memdev.c |   9 +++
 drivers/cxl/core/ras.c    | 151 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h      |  55 ++++++++++++++
 drivers/cxl/pci.c         |   3 +
 6 files changed, 228 insertions(+), 3 deletions(-)
 create mode 100644 drivers/cxl/core/ras.c

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 2f48845b86d7..9394d4b8c014 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -18,5 +18,5 @@ cxl_core-y += acpi.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
 cxl_core-$(CONFIG_CXL_FEATURES) += features.o
-cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeatures.o
+cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeatures.o ras.o
 cxl_core-$(CONFIG_CXL_MCE) += mce.o
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 19d46a284650..c9328f1b6464 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -956,12 +956,19 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 			hpa_alias = hpa - cache_size;
 	}
 
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
+		if (cxl_store_rec_gen_media((struct cxl_memdev *)cxlmd, evt))
+			dev_dbg(&cxlmd->dev, "CXL store rec_gen_media failed\n");
+
 		trace_cxl_general_media(cxlmd, type, cxlr, hpa,
 					hpa_alias, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+	} else if (event_type == CXL_CPER_EVENT_DRAM) {
+		if (cxl_store_rec_dram((struct cxl_memdev *)cxlmd, evt))
+			dev_dbg(&cxlmd->dev, "CXL store rec_dram failed\n");
+
 		trace_cxl_dram(cxlmd, type, cxlr, hpa, hpa_alias,
 			       &evt->dram);
+	}
 }
 EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, "CXL");
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index
a16a5886d40a..bd9ba50bc01e 100644 --- a/drivers/cxl/core/memdev.c +++ b/drivers/cxl/core/memdev.c @@ -25,8 +25,17 @@ static DEFINE_IDA(cxl_memdev_ida); static void cxl_memdev_release(struct device *dev) { struct cxl_memdev *cxlmd = to_cxl_memdev(dev); + struct cxl_event_gen_media *rec_gen_media; + struct cxl_event_dram *rec_dram; + unsigned long index; ida_free(&cxl_memdev_ida, cxlmd->id); + xa_for_each(&cxlmd->rec_dram, index, rec_dram) + kfree(rec_dram); + xa_destroy(&cxlmd->rec_dram); + xa_for_each(&cxlmd->rec_gen_media, index, rec_gen_media) + kfree(rec_gen_media); + xa_destroy(&cxlmd->rec_gen_media); kfree(cxlmd); } diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c new file mode 100644 index 000000000000..65994eec1037 --- /dev/null +++ b/drivers/cxl/core/ras.c @@ -0,0 +1,151 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * CXL RAS driver. + * + * Copyright (c) 2025 HiSilicon Limited. + * + */ + +#include +#include + +#include "trace.h" + +struct cxl_event_gen_media *cxl_find_rec_gen_media(struct cxl_memdev *cxlmd, + struct cxl_mem_repair_attrbs *attrbs) +{ + struct cxl_event_gen_media *rec; + + rec = xa_load(&cxlmd->rec_gen_media, attrbs->dpa); + if (!rec) + return NULL; + + if (attrbs->repair_type == CXL_PPR) + return rec; + + return NULL; +} +EXPORT_SYMBOL_NS_GPL(cxl_find_rec_gen_media, "CXL"); + +struct cxl_event_dram *cxl_find_rec_dram(struct cxl_memdev *cxlmd, + struct cxl_mem_repair_attrbs *attrbs) +{ + struct cxl_event_dram *rec; + u16 validity_flags; + + rec = xa_load(&cxlmd->rec_dram, attrbs->dpa); + if (!rec) + return NULL; + + validity_flags = get_unaligned_le16(rec->media_hdr.validity_flags); + if (!(validity_flags & CXL_DER_VALID_CHANNEL) || + !(validity_flags & CXL_DER_VALID_RANK)) + return NULL; + + switch (attrbs->repair_type) { + case CXL_PPR: + if (!(validity_flags & CXL_DER_VALID_NIBBLE) || + get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask) + return rec; + break; + case CXL_CACHELINE_SPARING: + if (!(validity_flags & CXL_DER_VALID_BANK_GROUP) || + !(validity_flags & CXL_DER_VALID_BANK) || + !(validity_flags & CXL_DER_VALID_ROW) || + !(validity_flags & CXL_DER_VALID_COLUMN)) + return NULL; + + if (rec->media_hdr.channel == attrbs->channel && + rec->media_hdr.rank == attrbs->rank && + rec->bank_group == attrbs->bank_group && + rec->bank == attrbs->bank && + get_unaligned_le24(rec->row) == attrbs->row && + get_unaligned_le16(rec->column) == attrbs->column && + (!(validity_flags & CXL_DER_VALID_NIBBLE) || + get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask) && + (!(validity_flags & CXL_DER_VALID_SUB_CHANNEL) || + rec->sub_channel == attrbs->sub_channel)) + return rec; + break; + case CXL_ROW_SPARING: + if (!(validity_flags & CXL_DER_VALID_BANK_GROUP) || + !(validity_flags & CXL_DER_VALID_BANK) || + !(validity_flags & CXL_DER_VALID_ROW)) + return NULL; + + if (rec->media_hdr.channel == attrbs->channel && + rec->media_hdr.rank == attrbs->rank && + rec->bank_group == attrbs->bank_group && + rec->bank == attrbs->bank && + get_unaligned_le24(rec->row) == attrbs->row && + (!(validity_flags & CXL_DER_VALID_NIBBLE) || + get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask)) + return rec; + break; + case CXL_BANK_SPARING: + if (!(validity_flags & CXL_DER_VALID_BANK_GROUP) || + !(validity_flags & CXL_DER_VALID_BANK)) + return NULL; + + if (rec->media_hdr.channel == attrbs->channel && + rec->media_hdr.rank == attrbs->rank && + rec->bank_group == attrbs->bank_group && + rec->bank == attrbs->bank && + (!(validity_flags & 
+		     CXL_DER_VALID_NIBBLE) ||
+		     get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask))
+			return rec;
+		break;
+	case CXL_RANK_SPARING:
+		if (rec->media_hdr.channel == attrbs->channel &&
+		    rec->media_hdr.rank == attrbs->rank &&
+		    (!(validity_flags & CXL_DER_VALID_NIBBLE) ||
+		     get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask))
+			return rec;
+		break;
+	default:
+		return NULL;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_find_rec_dram, "CXL");
+
+int cxl_store_rec_gen_media(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{
+	void *old_rec;
+	struct cxl_event_gen_media *rec = kmemdup(&evt->gen_media,
+						  sizeof(*rec), GFP_KERNEL);
+	if (!rec)
+		return -ENOMEM;
+
+	old_rec = xa_store(&cxlmd->rec_gen_media,
+			   le64_to_cpu(rec->media_hdr.phys_addr),
+			   rec, GFP_KERNEL);
+	if (xa_is_err(old_rec)) {
+		/* The new record was not stored, so free it */
+		kfree(rec);
+		return xa_err(old_rec);
+	}
+
+	kfree(old_rec);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_store_rec_gen_media, "CXL");
+
+int cxl_store_rec_dram(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{
+	void *old_rec;
+	struct cxl_event_dram *rec = kmemdup(&evt->dram, sizeof(*rec), GFP_KERNEL);
+
+	if (!rec)
+		return -ENOMEM;
+
+	old_rec = xa_store(&cxlmd->rec_dram,
+			   le64_to_cpu(rec->media_hdr.phys_addr),
+			   rec, GFP_KERNEL);
+	if (xa_is_err(old_rec)) {
+		/* The new record was not stored, so free it */
+		kfree(rec);
+		return xa_err(old_rec);
+	}
+
+	kfree(old_rec);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_store_rec_dram, "CXL");
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 642ce976dcee..441e8ca71dad 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -34,6 +34,41 @@
 	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) != \
 	 CXLMDEV_RESET_NEEDED_NOT)
 
+enum cxl_mem_repair_type {
+	CXL_PPR,
+	CXL_CACHELINE_SPARING,
+	CXL_ROW_SPARING,
+	CXL_BANK_SPARING,
+	CXL_RANK_SPARING,
+	CXL_REPAIR_MAX,
+};
+
+/**
+ * struct cxl_mem_repair_attrbs - CXL memory repair attributes
+ * @dpa: DPA of memory to repair
+ * @nibble_mask: nibble mask, identifies one or more nibbles on the memory bus
+ * @row: row of memory to repair
+ * @column: column of memory to repair
+ * @channel: channel of memory to repair
+ * @sub_channel: sub channel of memory to repair
+ * @rank: rank of memory to repair
+ * @bank_group: bank group of memory to repair
+ * @bank: bank of memory to repair
+ * @repair_type: repair type, e.g. PPR, memory sparing, etc.
+ */
+struct cxl_mem_repair_attrbs {
+	u64 dpa;
+	u32 nibble_mask;
+	u32 row;
+	u16 column;
+	u8 channel;
+	u8 sub_channel;
+	u8 rank;
+	u8 bank_group;
+	u8 bank;
+	enum cxl_mem_repair_type repair_type;
+};
+
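A sketch of how the lookup is meant to be used by a repair path (the helper
and the attribute fields are from this patch; the DPA and DDR coordinates are
illustrative only):

    /* Sketch: only allow a rank sparing repair whose target matches a
     * DRAM event record captured during the current boot. */
    struct cxl_mem_repair_attrbs attrbs = {
        .repair_type = CXL_RANK_SPARING,
        .dpa = 0x700000,    /* example DPA */
        .channel = 7,       /* example channel */
        .rank = 9,          /* example rank */
    };

    if (!cxl_find_rec_dram(cxlmd, &attrbs))
        return -EINVAL;     /* not reported this boot: refuse repair */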
 /**
  * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
  * @dev: driver core device object
@@ -45,6 +80,8 @@
  * @endpoint: connection to the CXL port topology for this memory device
  * @id: id number of this memdev instance.
  * @depth: endpoint port depth
+ * @rec_gen_media: xarray to store CXL general media records
+ * @rec_dram: xarray to store CXL DRAM records
  */
 struct cxl_memdev {
 	struct device dev;
@@ -56,6 +93,8 @@ struct cxl_memdev {
 	struct cxl_port *endpoint;
 	int id;
 	int depth;
+	struct xarray rec_gen_media;
+	struct xarray rec_dram;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
@@ -870,11 +909,27 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 #if IS_ENABLED(CONFIG_CXL_RAS_FEATURES)
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
 int devm_cxl_region_edac_register(struct cxl_region *cxlr);
+struct cxl_event_gen_media *
+cxl_find_rec_gen_media(struct cxl_memdev *cxlmd, struct cxl_mem_repair_attrbs *attrbs);
+struct cxl_event_dram *cxl_find_rec_dram(struct cxl_memdev *cxlmd,
+					 struct cxl_mem_repair_attrbs *attrbs);
+int cxl_store_rec_gen_media(struct cxl_memdev *cxlmd, union cxl_event *evt);
+int cxl_store_rec_dram(struct cxl_memdev *cxlmd, union cxl_event *evt);
 #else
 static inline int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 { return 0; }
 static inline int devm_cxl_region_edac_register(struct cxl_region *cxlr)
 { return 0; }
+static inline struct cxl_event_gen_media *
+cxl_find_rec_gen_media(struct cxl_memdev *cxlmd, struct cxl_mem_repair_attrbs *attrbs)
+{ return NULL; }
+static inline struct cxl_event_dram *cxl_find_rec_dram(struct cxl_memdev *cxlmd,
+						       struct cxl_mem_repair_attrbs *attrbs)
+{ return NULL; }
+static inline int cxl_store_rec_gen_media(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{ return 0; }
+static inline int cxl_store_rec_dram(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{ return 0; }
 #endif
 
 #ifdef CONFIG_CXL_SUSPEND
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4288f4814cc5..51f09e685dd9 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1053,6 +1053,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	pci_save_state(pdev);
 
+	xa_init(&cxlmd->rec_gen_media);
+	xa_init(&cxlmd->rec_dram);
+
 	return rc;
 }

From patchwork Thu Feb 27 22:38:15 2025
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 868990
Subject: [PATCH 8/8] cxl/memfeature: Add CXL memory device memory sparing
 control feature
Date: Thu, 27 Feb 2025 22:38:15 +0000
Message-ID: <20250227223816.2036-9-shiju.jose@huawei.com>
In-Reply-To: <20250227223816.2036-1-shiju.jose@huawei.com>
References: <20250227223816.2036-1-shiju.jose@huawei.com>

From: Shiju Jose

Memory sparing is defined as a repair function that replaces a portion of
memory with a portion of functional memory at that same DPA. The subclasses
for this operation vary in terms of the scope of the sparing being performed.
The cacheline sparing subclass refers to a sparing action that can replace a
full cacheline. Row sparing is provided as an alternative to PPR sparing
functions and its scope is that of a single DDR row. Per CXL r3.2 Table 8-125
footnote 1, memory sparing is preferred over PPR when possible. Bank sparing
allows an entire bank to be replaced. Rank sparing is defined as an operation
in which an entire DDR rank is replaced.

Memory sparing maintenance operations may be supported by CXL devices that
implement the CXL.mem protocol. A sparing maintenance operation requests the
CXL device to perform a repair operation on its media. For example, a CXL
device with DRAM components that support memory sparing features may
implement sparing maintenance operations.

The host may issue a query command by setting the query resources flag in the
input payload (CXL spec 3.2 Table 8-120) to determine the availability of
sparing resources for a given address. In response to a query request, the
device shall report the resource availability by producing the memory sparing
event record (CXL spec 3.2 Table 8-60) in which the Channel, Rank, Nibble
Mask, Bank Group, Bank, Row, Column and Sub-Channel fields are a copy of the
values specified in the request.

During the execution of a sparing maintenance operation, a CXL memory device:
- may not retain data
- may not be able to process CXL.mem requests correctly.
These CXL memory device capabilities are specified by restriction flags in
the memory sparing feature readable attributes.

When a CXL device identifies an error on a memory component, the device may
inform the host about the need for a memory sparing maintenance operation by
using a DRAM event record, where the 'maintenance needed' flag may be set.
The event record contains some of the DPA, Channel, Rank, Nibble Mask, Bank
Group, Bank, Row, Column and Sub-Channel fields that should be repaired. The
userspace tool requests a maintenance operation if the 'maintenance needed'
flag is set in the CXL DRAM error record.

CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing
maintenance operation feature. CXL spec 3.2 section 8.2.10.7.2.3 describes
the memory sparing feature discovery and configuration.

Add support for controlling the CXL memory device memory sparing feature.
Register with the EDAC driver, which gets the memory repair attribute
descriptors from the EDAC memory repair driver and exposes sysfs repair
control attributes for memory sparing to userspace. For example, CXL memory
sparing control for the CXL mem0 device is exposed in
/sys/bus/edac/devices/cxl_mem0/mem_repairX/

Use case
========
1. The CXL device identifies a failure in a memory component and reports it
to userspace in a CXL DRAM trace event with the DPA and other attributes of
the memory to repair, such as channel, rank, nibble mask, bank group, bank,
row, column and sub-channel.

2. Rasdaemon processes the trace event and may issue a query request in
sysfs to check the resources available for memory sparing if any of the
following conditions is met:
- the 'maintenance needed' flag is set in the event record.
- the 'threshold event' flag is set for the CVME threshold feature.
- if neither of the above applies, possibly when the number of corrected
  errors reported to userspace for a CXL.mem media exceeds an error
  threshold set in the userspace policy.

3. Rasdaemon processes the memory sparing trace event and issues a repair
request for memory sparing. The kernel CXL driver shall report the memory
sparing event record to userspace with the resource availability, so that
rasdaemon can process the event record and issue a repair request in sysfs
for the memory sparing operation in the CXL device.

Note: Based on feedback from the community, the 'query' sysfs attribute was
removed and reporting the memory sparing error record to userspace is not
supported. Instead, userspace issues the sparing operation and the kernel
forwards it to the CXL memory device when the 'maintenance needed' flag is
set in the DRAM event record.

Add checks to ensure the memory to be repaired is offline and, if online,
that it originates from a CXL DRAM error record reported in the current boot
before requesting a memory sparing operation on the device; a condensed
sketch of this policy follows.
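A condensed, illustrative form of that policy, mirroring the
cxl_mem_do_sparing_op() code added in this patch (error handling elided;
params/ctx naming follows the driver code):

    if (!params.cap_safe_when_in_use) {
        /* Repair disturbs traffic: the memory must be offline. */
        if (cxl_are_decoders_committed(cxlmd))
            return -EBUSY;
    } else if (cxl_are_decoders_committed(cxlmd)) {
        /* Online: only repair what a current-boot record describes. */
        if (!cxl_mem_get_rec_dram(cxlmd, ctx))
            return -EINVAL;
    }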
Tested the memory sparing control feature with "hw/cxl: Add memory sparing
control feature":
Repository: "https://gitlab.com/shiju.jose/qemu.git"
Branch: cxl-ras-features-2024-10-24

Signed-off-by: Shiju Jose
---
 Documentation/edac/memory_repair.rst |  57 +++
 drivers/cxl/core/memfeatures.c       | 557 ++++++++++++++++++++++++++-
 drivers/edac/mem_repair.c            |   4 +
 include/linux/edac.h                 |   4 +
 4 files changed, 620 insertions(+), 2 deletions(-)

diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst
index b698921628b5..22aec58bb459 100644
--- a/Documentation/edac/memory_repair.rst
+++ b/Documentation/edac/memory_repair.rst
@@ -165,3 +165,60 @@ Note: Repair command returns error if unsupported/resources are not
 available for the repair operation.
 
 # echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair
+
+2. CXL memory sparing
+
+2.1. Read the device supported capabilities for cacheline sparing.
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair_type
+
+cacheline-sparing
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/persist_mode
+
+0
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair_safe_when_in_use
+
+1
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/min_dpa
+
+0x0
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/max_dpa
+
+0xfffffff
+
+Here sparing is safe to use while the memory is being accessed, and applies
+to 256 MiB of DPA space (DPA range 0x0-0xfffffff).
+
+2.2. Set the attributes for a cacheline sparing operation for DPA=0x700000,
+where the device reported the attributes in a CXL DRAM error event record.
+
+# echo 0x700000 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/dpa
+
+# echo 2 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank_group
+
+# echo 4 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank
+
+# echo 7 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/channel
+
+# echo 5 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/sub_channel
+
+# echo 9 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/rank
+
+# echo 0x240a > /sys/bus/edac/devices/cxl_mem0/mem_repair1/row
+
+# echo 11 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/column
+
+# echo 0x0FF > /sys/bus/edac/devices/cxl_mem0/mem_repair1/nibble_mask
+
+2.3. Start the cacheline sparing operation
+
+Note: The repair command returns an error if the operation is unsupported,
+if resources are not available for the sparing operation, or if the memory
+to repair is online and the attributes were reported from a previous boot.
+
+# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair
diff --git a/drivers/cxl/core/memfeatures.c b/drivers/cxl/core/memfeatures.c
index 8d5a57a0c154..14d3960504a2 100644
--- a/drivers/cxl/core/memfeatures.c
+++ b/drivers/cxl/core/memfeatures.c
@@ -19,8 +19,9 @@
 #include
 #include
 #include "core.h"
+#include "trace.h"
 
-#define CXL_DEV_NUM_RAS_FEATURES	3
+#define CXL_DEV_NUM_RAS_FEATURES	7
 #define CXL_DEV_HOUR_IN_SECS	3600
 
 #define CXL_DEV_NAME_LEN	128
@@ -1109,6 +1110,546 @@ static int cxl_memdev_soft_ppr_init(struct cxl_memdev *cxlmd,
 	return 0;
 }
 
+/* CXL memory sparing control definitions */
+enum cxl_mem_sparing_granularity {
+	CXL_MEM_SPARING_CACHELINE,
+	CXL_MEM_SPARING_ROW,
+	CXL_MEM_SPARING_BANK,
+	CXL_MEM_SPARING_RANK,
+	CXL_MEM_SPARING_MAX
+};
+
+struct cxl_mem_sparing_context {
+	struct cxl_memdev *cxlmd;
+	uuid_t repair_uuid;
+	u16 get_feat_size;
+	u16 set_feat_size;
+	u16 effects;
+	u8 instance;
+	u8 get_version;
+	u8 set_version;
+	u8 channel;
+	u8 rank;
+	u8 bank_group;
+	u32 nibble_mask;
+	u64 dpa;
+	u32 row;
+	u16 column;
+	u8 bank;
+	u8 sub_channel;
+	enum edac_mem_repair_type repair_type;
+	bool persist_mode;
+	enum cxl_mem_sparing_granularity granularity;
+};
+
+struct cxl_memdev_sparing_params {
+	u8 op_class;
+	u8 op_subclass;
+	bool cap_safe_when_in_use;
+	bool cap_hard_sparing;
+	bool cap_soft_sparing;
+};
+
+#define CXL_MEMDEV_SPARING_RD_CAP_SAFE_IN_USE_MASK	BIT(0)
+#define CXL_MEMDEV_SPARING_RD_CAP_HARD_SPARING_MASK	BIT(1)
+#define CXL_MEMDEV_SPARING_RD_CAP_SOFT_SPARING_MASK	BIT(2)
+
+#define CXL_MEMDEV_SPARING_WR_DEVICE_INITIATED_MASK	BIT(0)
+
+#define CXL_MEMDEV_SPARING_QUERY_RESOURCE_FLAG	BIT(0)
+#define CXL_MEMDEV_SET_HARD_SPARING_FLAG	BIT(1)
+#define CXL_MEMDEV_SPARING_SUB_CHANNEL_VALID_FLAG	BIT(2)
+#define CXL_MEMDEV_SPARING_NIB_MASK_VALID_FLAG	BIT(3)
+
+/*
+ * See CXL spec rev 3.2 @8.2.10.7.2.3 Table 8-134 Memory Sparing Feature
+ * Readable Attributes.
+ */
+struct cxl_memdev_sparing_rd_attrs {
+	struct cxl_memdev_repair_rd_attrs_hdr hdr;
+	u8 rsvd;
+	__le16 restriction_flags;
+} __packed;
+
+/*
+ * See CXL spec rev 3.2 @8.2.10.7.1.4 Table 8-120 Memory Sparing Input Payload.
+ */
+struct cxl_memdev_sparing_in_payload {
+	u8 flags;
+	u8 channel;
+	u8 rank;
+	u8 nibble_mask[3];
+	u8 bank_group;
+	u8 bank;
+	u8 row[3];
+	__le16 column;
+	u8 sub_channel;
+} __packed;
+
+static int cxl_mem_sparing_get_attrs(struct cxl_mem_sparing_context *cxl_sparing_ctx,
+				     struct cxl_memdev_sparing_params *params)
+{
+	size_t rd_data_size = sizeof(struct cxl_memdev_sparing_rd_attrs);
+	struct cxl_memdev *cxlmd = cxl_sparing_ctx->cxlmd;
+	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+	u16 restriction_flags;
+	size_t data_size;
+	u16 return_code;
+	struct cxl_memdev_sparing_rd_attrs *rd_attrs __free(kfree) =
+				kzalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	data_size = cxl_get_feature(cxl_mbox, &cxl_sparing_ctx->repair_uuid,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, &return_code);
+	if (!data_size)
+		return -EIO;
+
+	params->op_class = rd_attrs->hdr.op_class;
+	params->op_subclass = rd_attrs->hdr.op_subclass;
+	restriction_flags = le16_to_cpu(rd_attrs->restriction_flags);
+	params->cap_safe_when_in_use = FIELD_GET(CXL_MEMDEV_SPARING_RD_CAP_SAFE_IN_USE_MASK,
+						 restriction_flags) ^ 1;
+	params->cap_hard_sparing = FIELD_GET(CXL_MEMDEV_SPARING_RD_CAP_HARD_SPARING_MASK,
+					     restriction_flags);
+	params->cap_soft_sparing = FIELD_GET(CXL_MEMDEV_SPARING_RD_CAP_SOFT_SPARING_MASK,
+					     restriction_flags);
+
+	return 0;
+}
+
+static struct cxl_event_dram *
+cxl_mem_get_rec_dram(struct cxl_memdev *cxlmd, struct cxl_mem_sparing_context *ctx)
+{
+	struct cxl_mem_repair_attrbs attrbs = { 0 };
+
+	attrbs.dpa = ctx->dpa;
+	attrbs.channel = ctx->channel;
+	attrbs.rank = ctx->rank;
+	attrbs.nibble_mask = ctx->nibble_mask;
+	switch (ctx->repair_type) {
+	case EDAC_CACHELINE_SPARING:
+		attrbs.repair_type = CXL_CACHELINE_SPARING;
+		attrbs.bank_group = ctx->bank_group;
+		attrbs.bank = ctx->bank;
+		attrbs.row = ctx->row;
+		attrbs.column = ctx->column;
+		attrbs.sub_channel = ctx->sub_channel;
+		break;
+	case EDAC_ROW_SPARING:
+		attrbs.repair_type = CXL_ROW_SPARING;
+		attrbs.bank_group = ctx->bank_group;
+		attrbs.bank = ctx->bank;
+		attrbs.row = ctx->row;
+		break;
+	case EDAC_BANK_SPARING:
+		attrbs.repair_type = CXL_BANK_SPARING;
+		attrbs.bank_group = ctx->bank_group;
+		attrbs.bank = ctx->bank;
+		break;
+	case EDAC_RANK_SPARING:
+		attrbs.repair_type = CXL_RANK_SPARING;
+		break;
+	default:
+		return NULL;
+	}
+
+	/*
+	 * Check that the memory to repair is from the current boot.
+	 */
+	return cxl_find_rec_dram(cxlmd, &attrbs);
+}
+
+static int cxl_mem_do_sparing_op(struct device *dev,
+				 struct cxl_mem_sparing_context *cxl_sparing_ctx,
+				 struct cxl_memdev_sparing_params *rd_params)
+{
+	struct cxl_memdev *cxlmd = cxl_sparing_ctx->cxlmd;
+	struct cxl_memdev_sparing_in_payload sparing_pi;
+	struct cxl_event_dram *rec = NULL;
+	u16 validity_flags = 0;
+
+	if (!rd_params->cap_safe_when_in_use) {
+		/*
+		 * Memory to repair must be offline.
+		 */
+		if (cxl_are_decoders_committed(cxlmd))
+			return -EBUSY;
+		/*
+		 * Offline, so good for repair.
+		 */
+	} else {
+		/*
+		 * If offline all good, otherwise check for a match with a record.
+		 */
+		if (cxl_are_decoders_committed(cxlmd)) {
+			rec = cxl_mem_get_rec_dram(cxlmd, cxl_sparing_ctx);
+			if (!rec)
+				return -EINVAL;
+			/*
+			 * Record matched, so even though online, good for repair.
+			 */
+			validity_flags =
get_unaligned_le16(rec->media_hdr.validity_flags); + if (!validity_flags) + return -EINVAL; + } + } + + memset(&sparing_pi, 0, sizeof(sparing_pi)); + sparing_pi.flags = FIELD_PREP(CXL_MEMDEV_SPARING_QUERY_RESOURCE_FLAG, 0); + if (cxl_sparing_ctx->persist_mode) + sparing_pi.flags |= + FIELD_PREP(CXL_MEMDEV_SET_HARD_SPARING_FLAG, 1); + + switch (cxl_sparing_ctx->repair_type) { + case EDAC_CACHELINE_SPARING: + sparing_pi.column = cpu_to_le16(cxl_sparing_ctx->column); + /* + * Sub-channel is an optional attribute. + */ + if (!rec || (validity_flags & CXL_DER_VALID_SUB_CHANNEL)) { + sparing_pi.flags |= + FIELD_PREP(CXL_MEMDEV_SPARING_SUB_CHANNEL_VALID_FLAG, 1); + sparing_pi.sub_channel = cxl_sparing_ctx->sub_channel; + } + fallthrough; + case EDAC_ROW_SPARING: + put_unaligned_le24(cxl_sparing_ctx->row, sparing_pi.row); + fallthrough; + case EDAC_BANK_SPARING: + sparing_pi.bank_group = cxl_sparing_ctx->bank_group; + sparing_pi.bank = cxl_sparing_ctx->bank; + fallthrough; + case EDAC_RANK_SPARING: + sparing_pi.rank = cxl_sparing_ctx->rank; + fallthrough; + default: + sparing_pi.channel = cxl_sparing_ctx->channel; + if ((rec && (validity_flags & CXL_DER_VALID_NIBBLE)) || + (!rec && (!cxl_sparing_ctx->nibble_mask || + (cxl_sparing_ctx->nibble_mask & 0xFFFFFF)))) { + sparing_pi.flags |= + FIELD_PREP(CXL_MEMDEV_SPARING_NIB_MASK_VALID_FLAG, 1); + put_unaligned_le24(cxl_sparing_ctx->nibble_mask, + sparing_pi.nibble_mask); + } + break; + } + + return cxl_do_maintenance(&cxlmd->cxlds->cxl_mbox, rd_params->op_class, + rd_params->op_subclass, &sparing_pi, sizeof(sparing_pi)); +} + +static int cxl_mem_sparing_set_attrs(struct device *dev, + struct cxl_mem_sparing_context *ctx) +{ + struct cxl_memdev_sparing_params rd_params; + int ret; + + ret = cxl_mem_sparing_get_attrs(ctx, &rd_params); + if (ret) + return ret; + + ret = cxl_hold_region_and_dpa(); + if (ret) + return ret; + ret = cxl_mem_do_sparing_op(dev, ctx, &rd_params); + cxl_release_region_and_dpa(); + + return ret; +} + +static int cxl_mem_sparing_get_repair_type(struct device *dev, void *drv_data, + const char **repair_type) +{ + struct cxl_mem_sparing_context *ctx = drv_data; + + switch (ctx->repair_type) { + case EDAC_CACHELINE_SPARING: + case EDAC_ROW_SPARING: + case EDAC_BANK_SPARING: + case EDAC_RANK_SPARING: + *repair_type = edac_repair_type[ctx->repair_type]; + break; + default: + return -EINVAL; + } + + return 0; +} + +#define CXL_SPARING_GET_ATTR(attrib, data_type) \ +static int cxl_mem_sparing_get_##attrib(struct device *dev, void *drv_data, \ + data_type *val) \ +{ \ + struct cxl_mem_sparing_context *ctx = drv_data; \ + \ + *val = ctx->attrib; \ + \ + return 0; \ +} +CXL_SPARING_GET_ATTR(persist_mode, bool) +CXL_SPARING_GET_ATTR(dpa, u64) +CXL_SPARING_GET_ATTR(nibble_mask, u32) +CXL_SPARING_GET_ATTR(bank_group, u32) +CXL_SPARING_GET_ATTR(bank, u32) +CXL_SPARING_GET_ATTR(rank, u32) +CXL_SPARING_GET_ATTR(row, u32) +CXL_SPARING_GET_ATTR(column, u32) +CXL_SPARING_GET_ATTR(channel, u32) +CXL_SPARING_GET_ATTR(sub_channel, u32) + +#define CXL_SPARING_SET_ATTR(attrib, data_type) \ +static int cxl_mem_sparing_set_##attrib(struct device *dev, void *drv_data, \ + data_type val) \ +{ \ + struct cxl_mem_sparing_context *ctx = drv_data; \ + \ + ctx->attrib = val; \ + \ + return 0; \ +} +CXL_SPARING_SET_ATTR(nibble_mask, u32) +CXL_SPARING_SET_ATTR(bank_group, u32) +CXL_SPARING_SET_ATTR(bank, u32) +CXL_SPARING_SET_ATTR(rank, u32) +CXL_SPARING_SET_ATTR(row, u32) +CXL_SPARING_SET_ATTR(column, u32) +CXL_SPARING_SET_ATTR(channel, u32) 
+CXL_SPARING_SET_ATTR(sub_channel, u32)
+
+static int cxl_mem_sparing_set_persist_mode(struct device *dev, void *drv_data,
+					    bool persist_mode)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev_sparing_params params;
+	int ret;
+
+	ret = cxl_mem_sparing_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	if ((persist_mode && params.cap_hard_sparing) ||
+	    (!persist_mode && params.cap_soft_sparing))
+		ctx->persist_mode = persist_mode;
+	else
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+static int cxl_get_mem_sparing_safe_when_in_use(struct device *dev, void *drv_data,
+						bool *safe)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev_sparing_params params;
+	int ret;
+
+	ret = cxl_mem_sparing_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	*safe = params.cap_safe_when_in_use;
+
+	return 0;
+}
+
+static int cxl_mem_sparing_get_min_dpa(struct device *dev, void *drv_data,
+				       u64 *min_dpa)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev *cxlmd = ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	*min_dpa = cxlds->dpa_res.start;
+
+	return 0;
+}
+
+static int cxl_mem_sparing_get_max_dpa(struct device *dev, void *drv_data,
+				       u64 *max_dpa)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev *cxlmd = ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	*max_dpa = cxlds->dpa_res.end;
+
+	return 0;
+}
+
+static int cxl_mem_sparing_set_dpa(struct device *dev, void *drv_data, u64 dpa)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev *cxlmd = ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	if (dpa < cxlds->dpa_res.start || dpa > cxlds->dpa_res.end)
+		return -EINVAL;
+
+	ctx->dpa = dpa;
+
+	return 0;
+}
+
+static int cxl_do_mem_sparing(struct device *dev, void *drv_data, u32 val)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+
+	if (val != EDAC_DO_MEM_REPAIR)
+		return -EINVAL;
+
+	return cxl_mem_sparing_set_attrs(dev, ctx);
+}
+
+#define RANK_OPS \
+	.get_repair_type = cxl_mem_sparing_get_repair_type, \
+	.get_persist_mode = cxl_mem_sparing_get_persist_mode, \
+	.set_persist_mode = cxl_mem_sparing_set_persist_mode, \
+	.get_repair_safe_when_in_use = cxl_get_mem_sparing_safe_when_in_use, \
+	.get_min_dpa = cxl_mem_sparing_get_min_dpa, \
+	.get_max_dpa = cxl_mem_sparing_get_max_dpa, \
+	.get_dpa = cxl_mem_sparing_get_dpa, \
+	.set_dpa = cxl_mem_sparing_set_dpa, \
+	.get_nibble_mask = cxl_mem_sparing_get_nibble_mask, \
+	.set_nibble_mask = cxl_mem_sparing_set_nibble_mask, \
+	.get_rank = cxl_mem_sparing_get_rank, \
+	.set_rank = cxl_mem_sparing_set_rank, \
+	.get_channel = cxl_mem_sparing_get_channel, \
+	.set_channel = cxl_mem_sparing_set_channel, \
+	.do_repair = cxl_do_mem_sparing
+
+#define BANK_OPS \
+	RANK_OPS, \
+	.get_bank_group = cxl_mem_sparing_get_bank_group, \
+	.set_bank_group = cxl_mem_sparing_set_bank_group, \
+	.get_bank = cxl_mem_sparing_get_bank, \
+	.set_bank = cxl_mem_sparing_set_bank
+
+#define ROW_OPS \
+	BANK_OPS, \
+	.get_row = cxl_mem_sparing_get_row, \
+	.set_row = cxl_mem_sparing_set_row
+
+#define CACHELINE_OPS \
+	ROW_OPS, \
+	.get_column = cxl_mem_sparing_get_column, \
+	.set_column = cxl_mem_sparing_set_column, \
+	.get_sub_channel = cxl_mem_sparing_get_sub_channel, \
+	.set_sub_channel = cxl_mem_sparing_set_sub_channel
+
+static const struct edac_mem_repair_ops cxl_rank_sparing_ops = {
+	RANK_OPS,
+};
+
+static const struct edac_mem_repair_ops cxl_bank_sparing_ops = {
+	BANK_OPS,
+};
+
+static const struct edac_mem_repair_ops
cxl_row_sparing_ops = { + ROW_OPS, +}; + +static const struct edac_mem_repair_ops cxl_cacheline_sparing_ops = { + CACHELINE_OPS, +}; + +struct cxl_mem_sparing_desc { + const uuid_t repair_uuid; + enum edac_mem_repair_type repair_type; + enum cxl_mem_sparing_granularity granularity; + const struct edac_mem_repair_ops *repair_ops; +}; + +static const struct cxl_mem_sparing_desc mem_sparing_desc[] = { + { + .repair_uuid = CXL_FEAT_CACHELINE_SPARING_UUID, + .repair_type = EDAC_CACHELINE_SPARING, + .granularity = CXL_MEM_SPARING_CACHELINE, + .repair_ops = &cxl_cacheline_sparing_ops, + }, + { + .repair_uuid = CXL_FEAT_ROW_SPARING_UUID, + .repair_type = EDAC_ROW_SPARING, + .granularity = CXL_MEM_SPARING_ROW, + .repair_ops = &cxl_row_sparing_ops, + }, + { + .repair_uuid = CXL_FEAT_BANK_SPARING_UUID, + .repair_type = EDAC_BANK_SPARING, + .granularity = CXL_MEM_SPARING_BANK, + .repair_ops = &cxl_bank_sparing_ops, + }, + { + .repair_uuid = CXL_FEAT_RANK_SPARING_UUID, + .repair_type = EDAC_RANK_SPARING, + .granularity = CXL_MEM_SPARING_RANK, + .repair_ops = &cxl_rank_sparing_ops, + }, +}; + +static int cxl_memdev_sparing_init(struct cxl_memdev *cxlmd, + struct edac_dev_feature *ras_feature, + const struct cxl_mem_sparing_desc *desc, + u8 repair_inst) +{ + struct cxl_mem_sparing_context *cxl_sparing_ctx; + struct cxl_memdev_sparing_params rd_params; + struct cxl_feat_entry *feat_entry; + int ret; + + feat_entry = cxl_get_feature_entry(cxlmd->cxlds, &desc->repair_uuid); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE)) + return -EOPNOTSUPP; + + cxl_sparing_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_sparing_ctx), + GFP_KERNEL); + if (!cxl_sparing_ctx) + return -ENOMEM; + + *cxl_sparing_ctx = (struct cxl_mem_sparing_context) { + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size), + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size), + .get_version = feat_entry->get_feat_ver, + .set_version = feat_entry->set_feat_ver, + .effects = le16_to_cpu(feat_entry->effects), + .cxlmd = cxlmd, + .repair_type = desc->repair_type, + .granularity = desc->granularity, + .instance = repair_inst++, + }; + uuid_copy(&cxl_sparing_ctx->repair_uuid, &desc->repair_uuid); + + /* + * Read CXL device's sparing capabilities. + */ + ret = cxl_mem_sparing_get_attrs(cxl_sparing_ctx, &rd_params); + if (ret) + return ret; + + /* + * Set default value for persist_mode. 
+	 */
+	if (rd_params.cap_soft_sparing)
+		cxl_sparing_ctx->persist_mode = 0;
+	else if (rd_params.cap_hard_sparing)
+		cxl_sparing_ctx->persist_mode = 1;
+	else
+		return -EOPNOTSUPP;
+
+	ras_feature->ft_type = RAS_FEAT_MEM_REPAIR;
+	ras_feature->instance = cxl_sparing_ctx->instance;
+	ras_feature->mem_repair_ops = desc->repair_ops;
+	ras_feature->ctx = cxl_sparing_ctx;
+
+	return 0;
+}
+
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 {
 	struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
@@ -1116,7 +1657,7 @@ int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 	int num_ras_features = 0;
 	u8 scrub_inst = 0;
 	u8 repair_inst = 0;
-	int rc;
+	int rc, i;
 
 	rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
 				   scrub_inst);
@@ -1143,6 +1684,18 @@ int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 		num_ras_features++;
 	}
 
+	for (i = 0; i < CXL_MEM_SPARING_MAX; i++) {
+		rc = cxl_memdev_sparing_init(cxlmd, &ras_features[num_ras_features],
+					     &mem_sparing_desc[i], repair_inst);
+		if (rc == -EOPNOTSUPP)
+			continue;
+		if (rc < 0)
+			return rc;
+
+		repair_inst++;
+		num_ras_features++;
+	}
+
 	snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
 		 "cxl", dev_name(&cxlmd->dev));
diff --git a/drivers/edac/mem_repair.c b/drivers/edac/mem_repair.c
index bf7e01a8b4dd..c4a2e99c9355 100755
--- a/drivers/edac/mem_repair.c
+++ b/drivers/edac/mem_repair.c
@@ -47,6 +47,10 @@ struct edac_mem_repair_context {
 
 const char * const edac_repair_type[] = {
 	[EDAC_PPR] = "ppr",
+	[EDAC_CACHELINE_SPARING] = "cacheline-sparing",
+	[EDAC_ROW_SPARING] = "row-sparing",
+	[EDAC_BANK_SPARING] = "bank-sparing",
+	[EDAC_RANK_SPARING] = "rank-sparing",
 };
 EXPORT_SYMBOL_GPL(edac_repair_type);
 
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5669d8d2509a..57c9856a6bd9 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -746,6 +746,10 @@ static inline int edac_ecs_get_desc(struct device *ecs_dev,
 
 enum edac_mem_repair_type {
 	EDAC_PPR,
+	EDAC_CACHELINE_SPARING,
+	EDAC_ROW_SPARING,
+	EDAC_BANK_SPARING,
+	EDAC_RANK_SPARING,
 	EDAC_REPAIR_MAX
 };
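
---

For illustration only, and not part of the patch: a minimal userspace
sketch of driving the new mem_repair sysfs interface, mirroring the
documentation example above. The device name "cxl_mem0", the
"mem_repair1" instance and all attribute values are assumptions carried
over from that example; in practice they come from the CXL DRAM error
event record for the memory being repaired.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a string value to one sysfs attribute under @dir. */
static int write_attr(const char *dir, const char *attr, const char *val)
{
	char path[256];
	int fd, ret;

	snprintf(path, sizeof(path), "%s/%s", dir, attr);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	ret = write(fd, val, strlen(val)) < 0 ? -1 : 0;
	close(fd);
	return ret;
}

int main(void)
{
	const char *dir = "/sys/bus/edac/devices/cxl_mem0/mem_repair1";
	/* Attribute values as reported in a (hypothetical) DRAM error record. */
	static const char * const attrs[][2] = {
		{ "dpa", "0x700000" }, { "bank_group", "2" },
		{ "bank", "4" }, { "channel", "7" }, { "sub_channel", "5" },
		{ "rank", "9" }, { "row", "0x240a" }, { "column", "11" },
		{ "nibble_mask", "0x0FF" },
	};
	unsigned int i;

	for (i = 0; i < sizeof(attrs) / sizeof(attrs[0]); i++)
		if (write_attr(dir, attrs[i][0], attrs[i][1]))
			return 1;

	/* Writing 1 (EDAC_DO_MEM_REPAIR) triggers the sparing operation. */
	return write_attr(dir, "repair", "1") ? 1 : 0;
}

As with the shell example, the final write fails if the operation is
unsupported, resources are unavailable, or the memory is online without
a matching error record from the current boot.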