diff mbox series

[v1] i2c: microchip-core: actually use repeated sends

Message ID 20240930-uneasy-dorsal-1acda9227b0d@spud
State New
Headers show
Series [v1] i2c: microchip-core: actually use repeated sends | expand

Commit Message

Conor Dooley Sept. 30, 2024, 1:38 p.m. UTC
From: Conor Dooley <conor.dooley@microchip.com>

At present, where repeated sends are intended to be used, the
i2c-microchip-core driver sends a stop followed by a start. Lots of i2c
devices must not malfunction in the face of this behaviour, because the
driver has operated like this for years! Try to keep track of whether or
not a repeated send is required, and suppress sending a stop in these
cases.

Fixes: 64a6f1c4987e ("i2c: add support for microchip fpga i2c controllers")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
CC: Conor Dooley <conor.dooley@microchip.com>
CC: Daire McNamara <daire.mcnamara@microchip.com>
CC: Andi Shyti <andi.shyti@kernel.org>
CC: Wolfram Sang <wsa@kernel.org>
CC: linux-riscv@lists.infradead.org
CC: linux-i2c@vger.kernel.org
CC: linux-kernel@vger.kernel.org
---
 drivers/i2c/busses/i2c-microchip-corei2c.c | 124 ++++++++++++++++-----
 1 file changed, 96 insertions(+), 28 deletions(-)

Comments

Wolfram Sang Oct. 1, 2024, 8:50 a.m. UTC | #1
Hi Conor,

> At present, where repeated sends are intended to be used, the
> i2c-microchip-core driver sends a stop followed by a start. Lots of i2c

Oh, this is wrong. Was this just overlooked or was maybe older hardware
not able to generated correct repeated-starts?

> devices must not malfunction in the face of this behaviour, because the
> driver has operated like this for years! Try to keep track of whether or
> not a repeated send is required, and suppress sending a stop in these
> cases.

? I don't get that argument. If the driver is expected to do a repeated
start, it should do a repeated start. If it didn't, it was a bug and you
were lucky that the targets could handle this. Because most controllers
can do repeated starts correctly, we can also argue that this works for
most targets for years. In the unlikely event that a target fails after
converting this driver to proper repeated starts, the target is buggy
and needs fixing. It would not work with the majority of other
controllers this way.

I didn't look at the code but reading "keeping track whether rep start
is required" looks wrong from a high level perspective. The driver
should do repeated start when it should do repeated start.

All the best,

   Wolfram
Conor Dooley Oct. 1, 2024, 10:16 a.m. UTC | #2
On Tue, Oct 01, 2024 at 10:50:56AM +0200, Wolfram Sang wrote:
> > At present, where repeated sends are intended to be used, the
> > i2c-microchip-core driver sends a stop followed by a start. Lots of i2c
> 
> Oh, this is wrong. Was this just overlooked or was maybe older hardware
> not able to generated correct repeated-starts?

Overlooked, because the devices that had been used until recently didn't
care about whether they got a repeated start or stop + start. The bare
metal driver upon which the Linux one was originally based had a trivial
time of supporting repeated starts because it only allows specific sorts
of transfers. I kinda doubt you care, but the bare metal implementation
is here:
https://github.com/polarfire-soc/polarfire-soc-bare-metal-library/blob/614a67abb3023ba47ea6d1b8d7b9a9997353e007/src/platform/drivers/mss/mss_i2c/mss_i2c.c#L737

It just must have been missed that the bare metal method was not replaced.

> > devices must not malfunction in the face of this behaviour, because the
> > driver has operated like this for years! Try to keep track of whether or
> > not a repeated send is required, and suppress sending a stop in these
> > cases.
> 
> ? I don't get that argument. If the driver is expected to do a repeated
> start, it should do a repeated start. If it didn't, it was a bug and you
> were lucky that the targets could handle this. Because most controllers
> can do repeated starts correctly, we can also argue that this works for
> most targets for years. In the unlikely event that a target fails after
> converting this driver to proper repeated starts, the target is buggy
> and needs fixing. It would not work with the majority of other
> controllers this way.
> 
> I didn't look at the code but reading "keeping track whether rep start
> is required" looks wrong from a high level perspective.

I think if you had looked at the code, you'd (hopefully) understand what
I meant w.r.t. tracking that.
The design of this IP is pretty old, and intended for use with other
logic implemented in FPGA fabric where each interrupt generated by
the core would be the stimulus for the state machine controlling it to
transition state. Cos of that, when controlling it from software, the
interrupt handler assumes the role of that state machine. When I talk
about tracking whether or not a repeated send is required, that's
whether or not a particular message in a transfer requires it, not
whether or not the target device requires them or not.

Currently the driver operates by iterating over a list of messages in a
transfer, and calling send() for each one, and then effectively "looping"
in the interrupt handler until the message has been sent. By looking at
the current code, you can see that the completion's "lifecycle" matches
that. Currently, at the end of each message being sent
	static irqreturn_t mchp_corei2c_handle_isr(struct mchp_corei2c_dev *idev)
	{
	
		<snip>
	
		/* On the last byte to be transmitted, send STOP */
		if (last_byte)
			mchp_corei2c_stop(idev);
	
		if (last_byte || finished)
			complete(&idev->msg_complete);
	
		return IRQ_HANDLED;
	}
a stop is put on the bus, unless !last_byte, which is only true in error
cases. Clearly I don't need to explain why that is a problem to you...
You'd think that we could do something like moving the stop out of the
interrupt handler, and to the loop in mchp_corei2c_xfer(), where we have
access to the transfer's message list and can check if a stop should be
sent or not - that's not really possible with the hardware we have.

When the interrupt handler completes, it clears the interrupt bit in the
IP, as you might expect. The controller IP uses that as the trigger to
transition state in its state machine, which is detailed in
https://ww1.microchip.com/downloads/aemDocuments/documents/FPGA/ProductDocuments/UserGuides/ip_cores/directcores/CoreI2C_HB.pdf
On page 23, row 0x28, you can see the case that (IIRC) is the
problematic one. It is impossible to leave this state without triggering
some sort of action.
The only way that I could see to make this work correctly was to get the
driver track whether or not the next message required a repeated start or
not, so as to transition out of state 0x28 correctly.

Unfortunately, then the clearing of the interrupt bit causing state
transitions kicked in again - after sending a repeated start, it will
immediately attempt to act (see state 0x10 on page 23). Without
reworking the driver to send entire transfers "in one go" (where the
completion is that of the transfer rather than the message as it
currently is) the controller will re-send the last target address +
read/write command it sent, instead of the next one. That's why there's
so many changes outside of the interrupt handler and so many additional
members in the controller's private data structure.

I hope that that at least makes some sense..

> The driver
> should do repeated start when it should do repeated start.

Yup, that's what I'm trying to do here :)
Andi Shyti Oct. 1, 2024, 12:45 p.m. UTC | #3
Hi Conor,

On Mon, Sep 30, 2024 at 02:38:27PM GMT, Conor Dooley wrote:
> From: Conor Dooley <conor.dooley@microchip.com>
> 
> At present, where repeated sends are intended to be used, the
> i2c-microchip-core driver sends a stop followed by a start. Lots of i2c
> devices must not malfunction in the face of this behaviour, because the
> driver has operated like this for years! Try to keep track of whether or
> not a repeated send is required, and suppress sending a stop in these
> cases.
> 
> Fixes: 64a6f1c4987e ("i2c: add support for microchip fpga i2c controllers")

I don't think the Fixes tag is needed here if everything worked
until now, unless you got some other device that requires this
change and you need to explain it.

If this is more an improvement (because it has worked), then we
shouldn't add the Fixes tag.

In any case, when patches are going to stable, we need to Cc
stable too.

Cc: <stable@vger.kernel.org> # v6.0+

(This is specified in the
Documentation/process/stable-kernel-rules.rst and I'm starting to
enforce it here).

> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

...

> +	/*
> +	 * If there's been an error, the isr needs to return control
> +	 * to the "main" part of the driver, so as not to keep sending
> +	 * messages once it completes and clears the SI bit.
> +	 */
> +	if (idev->msg_err) {
> +		complete(&idev->msg_complete);
> +		return;
> +	}
> +
> +	this_msg = (idev->msg_queue)++;

do we need parenthesis here?

...

> +	/*
> +	 * The isr controls the flow of a transfer, this info needs to be saved
> +	 * to a location that it can access the queue information from.
> +	 */
> +	idev->restart_needed = false;
> +	idev->msg_queue = msgs;
> +	idev->total_num = num;
> +	idev->current_num = 0;
> +
> +	/*
> +	 * But the first entry to the isr is triggered by the start in this
> +	 * function, so the first message needs to be "dequeued".
> +	 */
> +	idev->addr = i2c_8bit_addr_from_msg(this_msg);
> +	idev->msg_len = this_msg->len;
> +	idev->buf = this_msg->buf;
> +	idev->msg_err = 0;
> +
> +	if (idev->total_num > 1) {
> +		struct i2c_msg *next_msg = msgs + 1;
> +
> +		idev->restart_needed = next_msg->flags & I2C_M_RD;
> +	}
> +
> +	idev->current_num++;
> +	idev->msg_queue++;

Can we initialize only once? This part is just adding extra code.

The rest looks good. I just need to know if Wolfram has some more
observations here.

Thanks,
Andi
Conor Dooley Oct. 1, 2024, 1:02 p.m. UTC | #4
On Tue, Oct 01, 2024 at 02:45:20PM +0200, Andi Shyti wrote:
> Hi Conor,
> 
> On Mon, Sep 30, 2024 at 02:38:27PM GMT, Conor Dooley wrote:
> > From: Conor Dooley <conor.dooley@microchip.com>
> > 
> > At present, where repeated sends are intended to be used, the
> > i2c-microchip-core driver sends a stop followed by a start. Lots of i2c
> > devices must not malfunction in the face of this behaviour, because the
> > driver has operated like this for years! Try to keep track of whether or
> > not a repeated send is required, and suppress sending a stop in these
> > cases.
> > 
> > Fixes: 64a6f1c4987e ("i2c: add support for microchip fpga i2c controllers")
> 
> I don't think the Fixes tag is needed here if everything worked
> until now, unless you got some other device that requires this
> change and you need to explain it.

I think the fixes tag is accurate, because it only happened to work on
the limited set of devices I and others tried. This patch came about cos
I got reports of it being broken in 6.6

> If this is more an improvement (because it has worked), then we
> shouldn't add the Fixes tag.
> 
> In any case, when patches are going to stable, we need to Cc
> stable too.
> 
> Cc: <stable@vger.kernel.org> # v6.0+
> 
> (This is specified in the
> Documentation/process/stable-kernel-rules.rst and I'm starting to
> enforce it here).

Yah, some maintainers want to add the tags themselves, so got into a
(bad?) habit of leaving them out. I can add it if there's a v2.

> 
> > Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> 
> ...
> 
> > +	/*
> > +	 * If there's been an error, the isr needs to return control
> > +	 * to the "main" part of the driver, so as not to keep sending
> > +	 * messages once it completes and clears the SI bit.
> > +	 */
> > +	if (idev->msg_err) {
> > +		complete(&idev->msg_complete);
> > +		return;
> > +	}
> > +
> > +	this_msg = (idev->msg_queue)++;
> 
> do we need parenthesis here?

I suppose not, do you want a v2 if that's the only change?

> 
> ...
> 
> > +	/*
> > +	 * The isr controls the flow of a transfer, this info needs to be saved
> > +	 * to a location that it can access the queue information from.
> > +	 */
> > +	idev->restart_needed = false;
> > +	idev->msg_queue = msgs;
> > +	idev->total_num = num;
> > +	idev->current_num = 0;
> > +
> > +	/*
> > +	 * But the first entry to the isr is triggered by the start in this
> > +	 * function, so the first message needs to be "dequeued".
> > +	 */
> > +	idev->addr = i2c_8bit_addr_from_msg(this_msg);
> > +	idev->msg_len = this_msg->len;
> > +	idev->buf = this_msg->buf;
> > +	idev->msg_err = 0;
> > +
> > +	if (idev->total_num > 1) {
> > +		struct i2c_msg *next_msg = msgs + 1;
> > +
> > +		idev->restart_needed = next_msg->flags & I2C_M_RD;
> > +	}
> > +
> > +	idev->current_num++;
> > +	idev->msg_queue++;
> 
> Can we initialize only once? This part is just adding extra code.

I don't agree that it is extra code, I think it is clearer like this as
I intentionally wrote it this way.

> The rest looks good. I just need to know if Wolfram has some more
> observations here.
> 
> Thanks,
> Andi
Conor Dooley Oct. 24, 2024, 9:36 a.m. UTC | #5
Yo,

On Tue, Oct 01, 2024 at 11:16:24AM +0100, Conor Dooley wrote:
> On Tue, Oct 01, 2024 at 10:50:56AM +0200, Wolfram Sang wrote:
> > > At present, where repeated sends are intended to be used, the
> > > i2c-microchip-core driver sends a stop followed by a start. Lots of i2c
> > 
> > Oh, this is wrong. Was this just overlooked or was maybe older hardware
> > not able to generated correct repeated-starts?
> 
> Overlooked, because the devices that had been used until recently didn't
> care about whether they got a repeated start or stop + start. The bare
> metal driver upon which the Linux one was originally based had a trivial
> time of supporting repeated starts because it only allows specific sorts
> of transfers. I kinda doubt you care, but the bare metal implementation
> is here:
> https://github.com/polarfire-soc/polarfire-soc-bare-metal-library/blob/614a67abb3023ba47ea6d1b8d7b9a9997353e007/src/platform/drivers/mss/mss_i2c/mss_i2c.c#L737
> 
> It just must have been missed that the bare metal method was not replaced.
> 
> > > devices must not malfunction in the face of this behaviour, because the
> > > driver has operated like this for years! Try to keep track of whether or
> > > not a repeated send is required, and suppress sending a stop in these
> > > cases.
> > 
> > ? I don't get that argument. If the driver is expected to do a repeated
> > start, it should do a repeated start. If it didn't, it was a bug and you
> > were lucky that the targets could handle this. Because most controllers
> > can do repeated starts correctly, we can also argue that this works for
> > most targets for years. In the unlikely event that a target fails after
> > converting this driver to proper repeated starts, the target is buggy
> > and needs fixing. It would not work with the majority of other
> > controllers this way.
> > 
> > I didn't look at the code but reading "keeping track whether rep start
> > is required" looks wrong from a high level perspective.
> 
> I think if you had looked at the code, you'd (hopefully) understand what
> I meant w.r.t. tracking that.
> The design of this IP is pretty old, and intended for use with other
> logic implemented in FPGA fabric where each interrupt generated by
> the core would be the stimulus for the state machine controlling it to
> transition state. Cos of that, when controlling it from software, the
> interrupt handler assumes the role of that state machine. When I talk
> about tracking whether or not a repeated send is required, that's
> whether or not a particular message in a transfer requires it, not
> whether or not the target device requires them or not.
> 
> Currently the driver operates by iterating over a list of messages in a
> transfer, and calling send() for each one, and then effectively "looping"
> in the interrupt handler until the message has been sent. By looking at
> the current code, you can see that the completion's "lifecycle" matches
> that. Currently, at the end of each message being sent
> 	static irqreturn_t mchp_corei2c_handle_isr(struct mchp_corei2c_dev *idev)
> 	{
> 	
> 		<snip>
> 	
> 		/* On the last byte to be transmitted, send STOP */
> 		if (last_byte)
> 			mchp_corei2c_stop(idev);
> 	
> 		if (last_byte || finished)
> 			complete(&idev->msg_complete);
> 	
> 		return IRQ_HANDLED;
> 	}
> a stop is put on the bus, unless !last_byte, which is only true in error
> cases. Clearly I don't need to explain why that is a problem to you...
> You'd think that we could do something like moving the stop out of the
> interrupt handler, and to the loop in mchp_corei2c_xfer(), where we have
> access to the transfer's message list and can check if a stop should be
> sent or not - that's not really possible with the hardware we have.
> 
> When the interrupt handler completes, it clears the interrupt bit in the
> IP, as you might expect. The controller IP uses that as the trigger to
> transition state in its state machine, which is detailed in
> https://ww1.microchip.com/downloads/aemDocuments/documents/FPGA/ProductDocuments/UserGuides/ip_cores/directcores/CoreI2C_HB.pdf
> On page 23, row 0x28, you can see the case that (IIRC) is the
> problematic one. It is impossible to leave this state without triggering
> some sort of action.
> The only way that I could see to make this work correctly was to get the
> driver track whether or not the next message required a repeated start or
> not, so as to transition out of state 0x28 correctly.
> 
> Unfortunately, then the clearing of the interrupt bit causing state
> transitions kicked in again - after sending a repeated start, it will
> immediately attempt to act (see state 0x10 on page 23). Without
> reworking the driver to send entire transfers "in one go" (where the
> completion is that of the transfer rather than the message as it
> currently is) the controller will re-send the last target address +
> read/write command it sent, instead of the next one. That's why there's
> so many changes outside of the interrupt handler and so many additional
> members in the controller's private data structure.
> 
> I hope that that at least makes some sense..
> 
> > The driver
> > should do repeated start when it should do repeated start.
> 
> Yup, that's what I'm trying to do here :)

I'd like to get this fix in, and Andi only had some minor comments that
didn't require a respin. I don't want to respin or resend while this
conversation remains unresolved.

Thanks,
Conor.
diff mbox series

Patch

diff --git a/drivers/i2c/busses/i2c-microchip-corei2c.c b/drivers/i2c/busses/i2c-microchip-corei2c.c
index 0b0a1c4d17cae..332dd14483c05 100644
--- a/drivers/i2c/busses/i2c-microchip-corei2c.c
+++ b/drivers/i2c/busses/i2c-microchip-corei2c.c
@@ -93,27 +93,35 @@ 
  * @base:		pointer to register struct
  * @dev:		device reference
  * @i2c_clk:		clock reference for i2c input clock
+ * @msg_queue:		pointer to the messages requiring sending
  * @buf:		pointer to msg buffer for easier use
  * @msg_complete:	xfer completion object
  * @adapter:		core i2c abstraction
  * @msg_err:		error code for completed message
  * @bus_clk_rate:	current i2c bus clock rate
  * @isr_status:		cached copy of local ISR status
+ * @total_num:		total number of messages to be sent/received
+ * @current_num:	index of the current message being sent/received
  * @msg_len:		number of bytes transferred in msg
  * @addr:		address of the current slave
+ * @restart_needed:	whether or not a repeated start is required after current message
  */
 struct mchp_corei2c_dev {
 	void __iomem *base;
 	struct device *dev;
 	struct clk *i2c_clk;
+	struct i2c_msg *msg_queue;
 	u8 *buf;
 	struct completion msg_complete;
 	struct i2c_adapter adapter;
 	int msg_err;
+	int total_num;
+	int current_num;
 	u32 bus_clk_rate;
 	u32 isr_status;
 	u16 msg_len;
 	u8 addr;
+	bool restart_needed;
 };
 
 static void mchp_corei2c_core_disable(struct mchp_corei2c_dev *idev)
@@ -222,6 +230,47 @@  static int mchp_corei2c_fill_tx(struct mchp_corei2c_dev *idev)
 	return 0;
 }
 
+static void mchp_corei2c_next_msg(struct mchp_corei2c_dev *idev)
+{
+	struct i2c_msg *this_msg;
+	u8 ctrl;
+
+	if (idev->current_num >= idev->total_num) {
+		complete(&idev->msg_complete);
+		return;
+	}
+
+	/*
+	 * If there's been an error, the isr needs to return control
+	 * to the "main" part of the driver, so as not to keep sending
+	 * messages once it completes and clears the SI bit.
+	 */
+	if (idev->msg_err) {
+		complete(&idev->msg_complete);
+		return;
+	}
+
+	this_msg = (idev->msg_queue)++;
+
+	if (idev->current_num < (idev->total_num - 1)) {
+		struct i2c_msg *next_msg = idev->msg_queue;
+
+		idev->restart_needed = next_msg->flags & I2C_M_RD;
+	} else {
+		idev->restart_needed = false;
+	}
+
+	idev->addr = i2c_8bit_addr_from_msg(this_msg);
+	idev->msg_len = this_msg->len;
+	idev->buf = this_msg->buf;
+
+	ctrl = readb(idev->base + CORE_I2C_CTRL);
+	ctrl |= CTRL_STA;
+	writeb(ctrl, idev->base + CORE_I2C_CTRL);
+
+	idev->current_num++;
+}
+
 static irqreturn_t mchp_corei2c_handle_isr(struct mchp_corei2c_dev *idev)
 {
 	u32 status = idev->isr_status;
@@ -247,10 +296,14 @@  static irqreturn_t mchp_corei2c_handle_isr(struct mchp_corei2c_dev *idev)
 		break;
 	case STATUS_M_SLAW_ACK:
 	case STATUS_M_TX_DATA_ACK:
-		if (idev->msg_len > 0)
+		if (idev->msg_len > 0) {
 			mchp_corei2c_fill_tx(idev);
-		else
-			last_byte = true;
+		} else {
+			if (idev->restart_needed)
+				finished = true;
+			else
+				last_byte = true;
+		}
 		break;
 	case STATUS_M_TX_DATA_NACK:
 	case STATUS_M_SLAR_NACK:
@@ -287,7 +340,7 @@  static irqreturn_t mchp_corei2c_handle_isr(struct mchp_corei2c_dev *idev)
 		mchp_corei2c_stop(idev);
 
 	if (last_byte || finished)
-		complete(&idev->msg_complete);
+		mchp_corei2c_next_msg(idev);
 
 	return IRQ_HANDLED;
 }
@@ -311,21 +364,48 @@  static irqreturn_t mchp_corei2c_isr(int irq, void *_dev)
 	return ret;
 }
 
-static int mchp_corei2c_xfer_msg(struct mchp_corei2c_dev *idev,
-				 struct i2c_msg *msg)
+static int mchp_corei2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
+			     int num)
 {
-	u8 ctrl;
+	struct mchp_corei2c_dev *idev = i2c_get_adapdata(adap);
+	struct i2c_msg *this_msg = msgs;
 	unsigned long time_left;
-
-	idev->addr = i2c_8bit_addr_from_msg(msg);
-	idev->msg_len = msg->len;
-	idev->buf = msg->buf;
-	idev->msg_err = 0;
-
-	reinit_completion(&idev->msg_complete);
+	u8 ctrl;
 
 	mchp_corei2c_core_enable(idev);
 
+	/*
+	 * The isr controls the flow of a transfer, this info needs to be saved
+	 * to a location that it can access the queue information from.
+	 */
+	idev->restart_needed = false;
+	idev->msg_queue = msgs;
+	idev->total_num = num;
+	idev->current_num = 0;
+
+	/*
+	 * But the first entry to the isr is triggered by the start in this
+	 * function, so the first message needs to be "dequeued".
+	 */
+	idev->addr = i2c_8bit_addr_from_msg(this_msg);
+	idev->msg_len = this_msg->len;
+	idev->buf = this_msg->buf;
+	idev->msg_err = 0;
+
+	if (idev->total_num > 1) {
+		struct i2c_msg *next_msg = msgs + 1;
+
+		idev->restart_needed = next_msg->flags & I2C_M_RD;
+	}
+
+	idev->current_num++;
+	idev->msg_queue++;
+
+	reinit_completion(&idev->msg_complete);
+
+	/*
+	 * Send the first start to pass control to the isr
+	 */
 	ctrl = readb(idev->base + CORE_I2C_CTRL);
 	ctrl |= CTRL_STA;
 	writeb(ctrl, idev->base + CORE_I2C_CTRL);
@@ -335,20 +415,8 @@  static int mchp_corei2c_xfer_msg(struct mchp_corei2c_dev *idev,
 	if (!time_left)
 		return -ETIMEDOUT;
 
-	return idev->msg_err;
-}
-
-static int mchp_corei2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
-			     int num)
-{
-	struct mchp_corei2c_dev *idev = i2c_get_adapdata(adap);
-	int i, ret;
-
-	for (i = 0; i < num; i++) {
-		ret = mchp_corei2c_xfer_msg(idev, msgs++);
-		if (ret)
-			return ret;
-	}
+	if (idev->msg_err)
+		return idev->msg_err;
 
 	return num;
 }