Message ID | 20210708120111.519444-4-nuno.sa@analog.com |
---|---|
State | New |
Headers | show |
Series | AXI FAN new features and improvements | expand |
On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote: > Add the bindings for the tacho signal evaluation parameters which depend > on the FAN being used. > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > --- > .../bindings/hwmon/adi,axi-fan-control.yaml | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml > index 6747b870f297..0481eb34d9f1 100644 > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml > @@ -37,6 +37,18 @@ properties: > $ref: /schemas/types.yaml#/definitions/uint32 > enum: [1, 2, 4] > > + adi,tacho-25-us: > + description: Expected tacho signal when the PWM is at 25%. > + > + adi,tacho-50-us: > + description: Expected tacho signal when the PWM is at 50%. > + > + adi,tacho-75-us: > + description: Expected tacho signal when the PWM is at 75%. > + > + adi,tacho-100-us: > + description: Expected tacho signal when the PWM is at 100%. This looks like it should be common. But having PWM percents in the property names doesn't scale. This is also a property of the fan, not the fan controller. There's only so many ways a fan can be controlled and I'm going to keep telling folks to make a common fan binding. There's some start to it, but it needs some work. Rob
> From: Rob Herring <robh@kernel.org> > Sent: Monday, July 12, 2021 7:27 PM > To: Sa, Nuno <Nuno.Sa@analog.com> > Cc: linux-hwmon@vger.kernel.org; devicetree@vger.kernel.org; > Guenter Roeck <linux@roeck-us.net>; Jean Delvare > <jdelvare@suse.com> > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho > properties > > [External] > > On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote: > > Add the bindings for the tacho signal evaluation parameters which > depend > > on the FAN being used. > > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > --- > > .../bindings/hwmon/adi,axi-fan-control.yaml | 12 > ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi- > fan-control.yaml > > index 6747b870f297..0481eb34d9f1 100644 > > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > control.yaml > > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > control.yaml > > @@ -37,6 +37,18 @@ properties: > > $ref: /schemas/types.yaml#/definitions/uint32 > > enum: [1, 2, 4] > > > > + adi,tacho-25-us: > > + description: Expected tacho signal when the PWM is at 25%. > > + > > + adi,tacho-50-us: > > + description: Expected tacho signal when the PWM is at 50%. > > + > > + adi,tacho-75-us: > > + description: Expected tacho signal when the PWM is at 75%. > > + > > + adi,tacho-100-us: > > + description: Expected tacho signal when the PWM is at 100%. > > This looks like it should be common. But having PWM percents in the > property names doesn't scale. This is also a property of the fan, not > the fan controller. Yes, I see that these parameters are definitely a property of the attached fan but the evaluation of these timings are very specific to this controller (I think). The way it works is that the HW can fully operate without any runtime SW configuration. In this case, it will use the values in these registers to evaluate the tacho signal coming from the FAN. And the HW really uses the evaluation points like this (0, 25%, 50% and 100%). It has some predefined values that work for the FAN that was used to develop the IP but naturally the evaluation will fail as soon as other FAN is attached (resulting in fan fault interrupts). And yes, writing these parameters is already SW configuration but what I mean with "runtime" is after probe :). So, I honestly do not know how we could name this better... Maybe a 'tacho-eval-points-us' array? The question would be the min/max elements? Do you have any suggestion for a more generic property? > There's only so many ways a fan can be controlled and I'm going to > keep > telling folks to make a common fan binding. There's some start to it, > but it needs some work. You mean the pwm-fan.txt? I gave a look to the driver and I don't think it fully fits this controller. I'm ok in sending an fan.yaml binding if you prefer it but I'm just not sure what we would need there... Maybe pulses-per-revolution and the above property (whatever the decided name) would be a starting point? - Nuno Sá
On Thu, Jul 15, 2021 at 10:26:05AM +0000, Sa, Nuno wrote: > > From: Rob Herring <robh@kernel.org> > > Sent: Monday, July 12, 2021 7:27 PM > > To: Sa, Nuno <Nuno.Sa@analog.com> > > Cc: linux-hwmon@vger.kernel.org; devicetree@vger.kernel.org; > > Guenter Roeck <linux@roeck-us.net>; Jean Delvare > > <jdelvare@suse.com> > > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho > > properties > > > > [External] > > > > On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote: > > > Add the bindings for the tacho signal evaluation parameters which > > depend > > > on the FAN being used. > > > > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > > --- > > > .../bindings/hwmon/adi,axi-fan-control.yaml | 12 > > ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > > control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi- > > fan-control.yaml > > > index 6747b870f297..0481eb34d9f1 100644 > > > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > > control.yaml > > > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > > control.yaml > > > @@ -37,6 +37,18 @@ properties: > > > $ref: /schemas/types.yaml#/definitions/uint32 > > > enum: [1, 2, 4] > > > > > > + adi,tacho-25-us: > > > + description: Expected tacho signal when the PWM is at 25%. > > > + > > > + adi,tacho-50-us: > > > + description: Expected tacho signal when the PWM is at 50%. > > > + > > > + adi,tacho-75-us: > > > + description: Expected tacho signal when the PWM is at 75%. > > > + > > > + adi,tacho-100-us: > > > + description: Expected tacho signal when the PWM is at 100%. > > > > This looks like it should be common. But having PWM percents in the > > property names doesn't scale. This is also a property of the fan, not > > the fan controller. > > Yes, I see that these parameters are definitely a property of the attached > fan but the evaluation of these timings are very specific to this controller > (I think). The way it works is that the HW can fully operate without any > runtime SW configuration. In this case, it will use the values in these > registers to evaluate the tacho signal coming from the FAN. And the HW > really uses the evaluation points like this (0, 25%, 50% and 100%). It has > some predefined values that work for the FAN that was used to develop > the IP but naturally the evaluation will fail as soon as other FAN is attached > (resulting in fan fault interrupts). And yes, writing these parameters is > already SW configuration but what I mean with "runtime" is after probe :). > Are you sure you can ever get this stable ? Each fan has its own properties and tolerances. If you replace a fan in a given system, you might get different RPM numbers. The RPM will differ widely from system to system and from fan to fan. Anything that assumes a specific RPM in devicetree data seems to be quite vulnerable to failures. I have experienced that recently with a different chip which also tries to correlate RPM and PWM and fails quite miserably. In my experience, anything other than minimum fan speed is really a recipe for instability and sporadic false failures. Even setting a minimum fan speed is tricky because it depends a lot on the fan. > So, I honestly do not know how we could name this better... Maybe a > 'tacho-eval-points-us' array? The question would be the min/max > elements? Do you have any suggestion for a more generic property? > I am guessing that the "us" refers to the time between pulses from the fan. I think this is a bad value to start with - anything fan speed related should really be expressed in RPM, not in time between pulses. Overall I don't think this should be handled as generic set of properties. Whatever we come up with as standard set of pwm or fan related properties should not be an expected correlation between pwm and rpm. Assuming such a property is needed here (after all, the controller is what it is), maybe a set of tuples makes sense, such as adi,pwm-rpm-map = < 25, 250, 50, 500, 75, 750, 100, 1000 >; though I think that each of those should also include the tolerance instead of just assuming that a 25% tolerance (as implemented in patch 2/6) would work for all fans. Guenter > > There's only so many ways a fan can be controlled and I'm going to > > keep > > telling folks to make a common fan binding. There's some start to it, > > but it needs some work. > > You mean the pwm-fan.txt? I gave a look to the driver and I don't think > it fully fits this controller. I'm ok in sending an fan.yaml binding if you > prefer it but I'm just not sure what we would need there... Maybe > pulses-per-revolution and the above property > (whatever the decided name) would be a starting point? > > - Nuno Sá
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter > Roeck > Sent: Thursday, July 15, 2021 10:40 PM > To: Sa, Nuno <Nuno.Sa@analog.com> > Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org; > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com> > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho > properties > > On Thu, Jul 15, 2021 at 10:26:05AM +0000, Sa, Nuno wrote: > > > From: Rob Herring <robh@kernel.org> > > > Sent: Monday, July 12, 2021 7:27 PM > > > To: Sa, Nuno <Nuno.Sa@analog.com> > > > Cc: linux-hwmon@vger.kernel.org; devicetree@vger.kernel.org; > > > Guenter Roeck <linux@roeck-us.net>; Jean Delvare > > > <jdelvare@suse.com> > > > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add > tacho > > > properties > > > > > > [External] > > > > > > On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote: > > > > Add the bindings for the tacho signal evaluation parameters > which > > > depend > > > > on the FAN being used. > > > > > > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > > > --- > > > > .../bindings/hwmon/adi,axi-fan-control.yaml | 12 > > > ++++++++++++ > > > > 1 file changed, 12 insertions(+) > > > > > > > > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi- > fan- > > > control.yaml > b/Documentation/devicetree/bindings/hwmon/adi,axi- > > > fan-control.yaml > > > > index 6747b870f297..0481eb34d9f1 100644 > > > > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > > > control.yaml > > > > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan- > > > control.yaml > > > > @@ -37,6 +37,18 @@ properties: > > > > $ref: /schemas/types.yaml#/definitions/uint32 > > > > enum: [1, 2, 4] > > > > > > > > + adi,tacho-25-us: > > > > + description: Expected tacho signal when the PWM is at 25%. > > > > + > > > > + adi,tacho-50-us: > > > > + description: Expected tacho signal when the PWM is at 50%. > > > > + > > > > + adi,tacho-75-us: > > > > + description: Expected tacho signal when the PWM is at 75%. > > > > + > > > > + adi,tacho-100-us: > > > > + description: Expected tacho signal when the PWM is at 100%. > > > > > > This looks like it should be common. But having PWM percents in > the > > > property names doesn't scale. This is also a property of the fan, not > > > the fan controller. > > > > Yes, I see that these parameters are definitely a property of the > attached > > fan but the evaluation of these timings are very specific to this > controller > > (I think). The way it works is that the HW can fully operate without > any > > runtime SW configuration. In this case, it will use the values in these > > registers to evaluate the tacho signal coming from the FAN. And the > HW > > really uses the evaluation points like this (0, 25%, 50% and 100%). It > has > > some predefined values that work for the FAN that was used to > develop > > the IP but naturally the evaluation will fail as soon as other FAN is > attached > > (resulting in fan fault interrupts). And yes, writing these parameters > is > > already SW configuration but what I mean with "runtime" is after > probe :). > > > > Are you sure you can ever get this stable ? Each fan has its own > properties > and tolerances. If you replace a fan in a given system, you might get > different RPM numbers. The RPM will differ widely from system to > system > and from fan to fan. Anything that assumes a specific RPM in > devicetree > data seems to be quite vulnerable to failures. I have experienced that > recently with a different chip which also tries to correlate RPM and > PWM > and fails quite miserably. > > In my experience, anything other than minimum fan speed is really a > recipe > for instability and sporadic false failures. Even setting a minimum fan > speed > is tricky because it depends a lot on the fan. I see what you mean. So, I had to go through this process when testing this changes because the fan I'm using is different from the default one used to develop and stablish the default values in the IP core. The core provides you with a register which contains the tacho measurements in clock cycles. You can read that for all the PWM points of interest (with devmem2 for example) and make your own "calibration". I assume that people have to go through this process before putting some values in the devicetree. I'm aware this is not the neatest process but I guess it's acceptable... > > So, I honestly do not know how we could name this better... Maybe > a > > 'tacho-eval-points-us' array? The question would be the min/max > > elements? Do you have any suggestion for a more generic property? > > > I am guessing that the "us" refers to the time between pulses from the > fan. I think this is a bad value to start with - anything fan speed > related should really be expressed in RPM, not in time between > pulses. > > Overall I don't think this should be handled as generic set of > properties. > Whatever we come up with as standard set of pwm or fan related > properties > should not be an expected correlation between pwm and rpm. > Assuming such > a property is needed here (after all, the controller is what it is), > maybe a set of tuples makes sense, such as > > adi,pwm-rpm-map = < > 25, 250, > 50, 500, > 75, 750, > 100, 1000 > >; > > though I think that each of those should also include the tolerance > instead of just assuming that a 25% tolerance (as implemented in patch > 2/6) would work for all fans. Yes, this makes sense thanks. As the HW default tolerance is 25% I was somehow attached to that. Since all these values are correlated it makes complete sense to also give the tolerance here as it makes things more flexible. The map is also a good tip, I just have to see if there is a nice way to specify that the pwm column is constant... Thanks! - Nuno Sá
On 7/16/21 12:44 AM, Sa, Nuno wrote: [ ... ] >> >> Are you sure you can ever get this stable ? Each fan has its own >> properties >> and tolerances. If you replace a fan in a given system, you might get >> different RPM numbers. The RPM will differ widely from system to >> system >> and from fan to fan. Anything that assumes a specific RPM in >> devicetree >> data seems to be quite vulnerable to failures. I have experienced that >> recently with a different chip which also tries to correlate RPM and >> PWM >> and fails quite miserably. >> >> In my experience, anything other than minimum fan speed is really a >> recipe >> for instability and sporadic false failures. Even setting a minimum fan >> speed >> is tricky because it depends a lot on the fan. > > I see what you mean. So, I had to go through this process when testing > this changes because the fan I'm using is different from the default one > used to develop and stablish the default values in the IP core. The core Exactly my point. > provides you with a register which contains the tacho measurements in > clock cycles. You can read that for all the PWM points of interest > (with devmem2 for example) and make your own "calibration". I assume > that people have to go through this process before putting some values > in the devicetree. I'm aware this is not the neatest process but I guess it's > acceptable... > Do you really expect everyone using a system with this chip to go through this process and update its devicetree configuration, and then repeat it whenever a fan is changed ? Given how dynamic this is, I really wonder if that information should be in devicetree in the first place. Guenter
> -----Original Message----- > From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter > Roeck > Sent: Friday, July 16, 2021 5:04 PM > To: Sa, Nuno <Nuno.Sa@analog.com> > Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org; > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com> > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho > properties > > [External] > > On 7/16/21 12:44 AM, Sa, Nuno wrote: > [ ... ] > >> > >> Are you sure you can ever get this stable ? Each fan has its own > >> properties > >> and tolerances. If you replace a fan in a given system, you might get > >> different RPM numbers. The RPM will differ widely from system to > >> system > >> and from fan to fan. Anything that assumes a specific RPM in > >> devicetree > >> data seems to be quite vulnerable to failures. I have experienced > that > >> recently with a different chip which also tries to correlate RPM and > >> PWM > >> and fails quite miserably. > >> > >> In my experience, anything other than minimum fan speed is really > a > >> recipe > >> for instability and sporadic false failures. Even setting a minimum > fan > >> speed > >> is tricky because it depends a lot on the fan. > > > > I see what you mean. So, I had to go through this process when > testing > > this changes because the fan I'm using is different from the default > one > > used to develop and stablish the default values in the IP core. The > core > > Exactly my point. > > > provides you with a register which contains the tacho measurements > in > > clock cycles. You can read that for all the PWM points of interest > > (with devmem2 for example) and make your own "calibration". I > assume > > that people have to go through this process before putting some > values > > in the devicetree. I'm aware this is not the neatest process but I > guess it's > > acceptable... > > > > Do you really expect everyone using a system with this chip to go > through > this process and update its devicetree configuration, and then repeat it > whenever a fan is changed ? Given how dynamic this is, I really wonder > if that information should be in devicetree in the first place. > My naive assumption was that we would only do this work at evaluation time. After that and after we settled with a fan for some system, I expected that changing to a different fan is not that likely. From your inputs, I guess this is not really the case which makes this process more cumbersome (as it also implies recompiling the devicetree for your system). However, even if we export these as runtime parameters, services/daemons will also have an hard time doing this "calibration" in a dynamic way. The reason is because the way the controller works is that it only accepts a new PWM request if it is an higher value than whatever the HW has at that moment. Thus, going through the calibration points might be very cumbersome. I can see some ways of handling this though but not very neat... Since this is a FPGA core, we might have some flexibility here. Something that came to my mind would be to have a calibration mode in the HW that would allow us to freely control the PWM values. In that way we could go freely over the calibration points. I guess, for safety reasons, this calibration mode would expire after some reasonable time (that give us enough time for doing the whole thing). The best place for doing the calibration, I guess it would be directly in the driver since we do receive the interrupts about new tacho measurements making things easier to sync and handle. However, given the time that takes for a new PWM to settle + new tacho measurements, it would not be very acceptable to do this during probe which is definitely also not ideal (we could defer this to a worker/timer). I'm not sure if the above makes much sense to you and it also depends on the HW guys being on board with this mechanism. - Nuno Sá
On Mon, Jul 19, 2021 at 07:46:41AM +0000, Sa, Nuno wrote: > > > > -----Original Message----- > > From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter > > Roeck > > Sent: Friday, July 16, 2021 5:04 PM > > To: Sa, Nuno <Nuno.Sa@analog.com> > > Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org; > > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com> > > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho > > properties > > > > [External] > > > > On 7/16/21 12:44 AM, Sa, Nuno wrote: > > [ ... ] > > >> > > >> Are you sure you can ever get this stable ? Each fan has its own > > >> properties > > >> and tolerances. If you replace a fan in a given system, you might get > > >> different RPM numbers. The RPM will differ widely from system to > > >> system > > >> and from fan to fan. Anything that assumes a specific RPM in > > >> devicetree > > >> data seems to be quite vulnerable to failures. I have experienced > > that > > >> recently with a different chip which also tries to correlate RPM and > > >> PWM > > >> and fails quite miserably. > > >> > > >> In my experience, anything other than minimum fan speed is really > > a > > >> recipe > > >> for instability and sporadic false failures. Even setting a minimum > > fan > > >> speed > > >> is tricky because it depends a lot on the fan. > > > > > > I see what you mean. So, I had to go through this process when > > testing > > > this changes because the fan I'm using is different from the default > > one > > > used to develop and stablish the default values in the IP core. The > > core > > > > Exactly my point. > > > > > provides you with a register which contains the tacho measurements > > in > > > clock cycles. You can read that for all the PWM points of interest > > > (with devmem2 for example) and make your own "calibration". I > > assume > > > that people have to go through this process before putting some > > values > > > in the devicetree. I'm aware this is not the neatest process but I > > guess it's > > > acceptable... > > > > > > > Do you really expect everyone using a system with this chip to go > > through > > this process and update its devicetree configuration, and then repeat it > > whenever a fan is changed ? Given how dynamic this is, I really wonder > > if that information should be in devicetree in the first place. > > > > My naive assumption was that we would only do this work at evaluation > time. After that and after we settled with a fan for some system, I expected > that changing to a different fan is not that likely. From your inputs, I guess > this is not really the case which makes this process more cumbersome (as it > also implies recompiling the devicetree for your system). > > However, even if we export these as runtime parameters, services/daemons > will also have an hard time doing this "calibration" in a dynamic way. The reason > is because the way the controller works is that it only accepts a new PWM > request if it is an higher value than whatever the HW has at that moment. Thus, > going through the calibration points might be very cumbersome. I can see some > ways of handling this though but not very neat... > > Since this is a FPGA core, we might have some flexibility here. Something that > came to my mind would be to have a calibration mode in the HW that would > allow us to freely control the PWM values. In that way we could go freely over > the calibration points. I guess, for safety reasons, this calibration mode would > expire after some reasonable time (that give us enough time for doing the whole > thing). The best place for doing the calibration, I guess it would be directly in the > driver since we do receive the interrupts about new tacho measurements making > things easier to sync and handle. However, given the time that takes for a new > PWM to settle + new tacho measurements, it would not be very acceptable to do this > during probe which is definitely also not ideal (we could defer this to a worker/timer). > > I'm not sure if the above makes much sense to you and it also depends on the HW > guys being on board with this mechanism. > I don't really know what to say or recommend here. Personally I think any attempt to tie PWM values to RPM are doomed to fail. Here are a couple of examples: Take your test system and move the fan to a restricted place (eg close to a wall). You'll see the fan RPM change, potentially significantly. Put it into some place with airflow towards or away from the system (eg blow air into the system from another place, which may happen if the system is installed in a lab), and again you'll see fan speed changes. Open the chassis, and the fan speed will change. I have seen fan speeds vary by up to 50% when changing airflow. That doesn't even take into account that replacing a fan even with a similar model (eg after a fan failed) will likely result in potentially significant rpm changes. Ultimately, anything that does more than determine if a fan is still running is potentially unstable. Having said all that, it is really your call to decide how you want to detect fan failures. Thanks, Guenter
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter > Roeck > Sent: Wednesday, July 21, 2021 5:00 PM > To: Sa, Nuno <Nuno.Sa@analog.com> > Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org; > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com> > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho > properties > > > On Mon, Jul 19, 2021 at 07:46:41AM +0000, Sa, Nuno wrote: > > > > > > > -----Original Message----- > > > From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter > > > Roeck > > > Sent: Friday, July 16, 2021 5:04 PM > > > To: Sa, Nuno <Nuno.Sa@analog.com> > > > Cc: Rob Herring <robh@kernel.org>; linux- > hwmon@vger.kernel.org; > > > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com> > > > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add > tacho > > > properties > > > > > > [External] > > > > > > On 7/16/21 12:44 AM, Sa, Nuno wrote: > > > [ ... ] > > > >> > > > >> Are you sure you can ever get this stable ? Each fan has its own > > > >> properties > > > >> and tolerances. If you replace a fan in a given system, you might > get > > > >> different RPM numbers. The RPM will differ widely from system > to > > > >> system > > > >> and from fan to fan. Anything that assumes a specific RPM in > > > >> devicetree > > > >> data seems to be quite vulnerable to failures. I have > experienced > > > that > > > >> recently with a different chip which also tries to correlate RPM > and > > > >> PWM > > > >> and fails quite miserably. > > > >> > > > >> In my experience, anything other than minimum fan speed is > really > > > a > > > >> recipe > > > >> for instability and sporadic false failures. Even setting a > minimum > > > fan > > > >> speed > > > >> is tricky because it depends a lot on the fan. > > > > > > > > I see what you mean. So, I had to go through this process when > > > testing > > > > this changes because the fan I'm using is different from the > default > > > one > > > > used to develop and stablish the default values in the IP core. > The > > > core > > > > > > Exactly my point. > > > > > > > provides you with a register which contains the tacho > measurements > > > in > > > > clock cycles. You can read that for all the PWM points of interest > > > > (with devmem2 for example) and make your own "calibration". I > > > assume > > > > that people have to go through this process before putting some > > > values > > > > in the devicetree. I'm aware this is not the neatest process but I > > > guess it's > > > > acceptable... > > > > > > > > > > Do you really expect everyone using a system with this chip to go > > > through > > > this process and update its devicetree configuration, and then > repeat it > > > whenever a fan is changed ? Given how dynamic this is, I really > wonder > > > if that information should be in devicetree in the first place. > > > > > > > My naive assumption was that we would only do this work at > evaluation > > time. After that and after we settled with a fan for some system, I > expected > > that changing to a different fan is not that likely. From your inputs, I > guess > > this is not really the case which makes this process more > cumbersome (as it > > also implies recompiling the devicetree for your system). > > > > However, even if we export these as runtime parameters, > services/daemons > > will also have an hard time doing this "calibration" in a dynamic way. > The reason > > is because the way the controller works is that it only accepts a new > PWM > > request if it is an higher value than whatever the HW has at that > moment. Thus, > > going through the calibration points might be very cumbersome. I > can see some > > ways of handling this though but not very neat... > > > > Since this is a FPGA core, we might have some flexibility here. > Something that > > came to my mind would be to have a calibration mode in the HW that > would > > allow us to freely control the PWM values. In that way we could go > freely over > > the calibration points. I guess, for safety reasons, this calibration > mode would > > expire after some reasonable time (that give us enough time for > doing the whole > > thing). The best place for doing the calibration, I guess it would be > directly in the > > driver since we do receive the interrupts about new tacho > measurements making > > things easier to sync and handle. However, given the time that takes > for a new > > PWM to settle + new tacho measurements, it would not be very > acceptable to do this > > during probe which is definitely also not ideal (we could defer this to > a worker/timer). > > > > I'm not sure if the above makes much sense to you and it also > depends on the HW > > guys being on board with this mechanism. > > > > I don't really know what to say or recommend here. Personally I think > any > attempt to tie PWM values to RPM are doomed to fail. Here are a > couple of > examples: > > Take your test system and move the fan to a restricted place (eg close > to a > wall). You'll see the fan RPM change, potentially significantly. Put it into > some place with airflow towards or away from the system (eg blow air > into > the system from another place, which may happen if the system is > installed > in a lab), and again you'll see fan speed changes. Open the chassis, and > the fan speed will change. I have seen fan speeds vary by up to 50% > when > changing airflow. Here we can at least control the tolerance for each PWM vs RPM point but I can image this as a very painful process to get these values right and no one will think in setting tolerances of 50%... > That doesn't even take into account that replacing a fan even with a > similar > model (eg after a fan failed) will likely result in potentially significant > rpm changes. > > Ultimately, anything that does more than determine if a fan is still > running > is potentially unstable. Yeah, I understand your points. The HW does the evaluation and of course it also looks for the presence of a signal... So, in your opinion, not even setting a minimum fan speed is likely to be stable? > Having said all that, it is really your call to decide how you want to > detect > fan failures. > Well, my hands are also tied here. The core is supposed to work without any SW interaction in which case the tacho evaluation is always done. The only thing I could do is to completely ignore fan faults which is also bad... I can try to persuade the HW guy to completely remove the evaluation and just give fan fauts in case there's no signal but I'm not really sure he will go for it. In that case, I'm tempted to just leave this as-is (with the extra bindings for the tolerance and turn these bindings into a map) if you're willing to take it... The reason is that, as you said, this is likely to be unstable any ways so that the added complexity in the SW does not really pay off (better keep at least the SW simple)... - Nuno Sá
On 7/22/21 6:00 AM, Sa, Nuno wrote: [ ... ] >> I don't really know what to say or recommend here. Personally I think >> any >> attempt to tie PWM values to RPM are doomed to fail. Here are a >> couple of >> examples: >> >> Take your test system and move the fan to a restricted place (eg close >> to a >> wall). You'll see the fan RPM change, potentially significantly. Put it into >> some place with airflow towards or away from the system (eg blow air >> into >> the system from another place, which may happen if the system is >> installed >> in a lab), and again you'll see fan speed changes. Open the chassis, and >> the fan speed will change. I have seen fan speeds vary by up to 50% >> when >> changing airflow. > > Here we can at least control the tolerance for each PWM vs RPM point but > I can image this as a very painful process to get these values right and no > one will think in setting tolerances of 50%... > >> That doesn't even take into account that replacing a fan even with a >> similar >> model (eg after a fan failed) will likely result in potentially significant >> rpm changes. >> >> Ultimately, anything that does more than determine if a fan is still >> running >> is potentially unstable. > > Yeah, I understand your points. The HW does the evaluation and of > course it also looks for the presence of a signal... So, in your opinion, > not even setting a minimum fan speed is likely to be stable? > Using the minimum fan speed as detection mechanism for fan failures is ok and widely used. My concern is the desire to associate it with pwm values. >> Having said all that, it is really your call to decide how you want to >> detect >> fan failures. >> > > Well, my hands are also tied here. The core is supposed to work without > any SW interaction in which case the tacho evaluation is always done. The > only thing I could do is to completely ignore fan faults which is also bad... > As I said above, it would be perfectly fine to have a parameter that reflects minimum fan speed (or, translated into chip speak, minimum number of pulses per minute). > I can try to persuade the HW guy to completely remove the evaluation and > just give fan fauts in case there's no signal but I'm not really sure he will go > for it. In that case, I'm tempted to just leave this as-is (with the extra bindings > for the tolerance and turn these bindings into a map) if you're willing to take it... > The reason is that, as you said, this is likely to be unstable any ways so that the > added complexity in the SW does not really pay off (better keep at least the SW > simple)... > Sure, I'll take it, as long as you find a binding that is acceptable for Rob. It is your funeral, after all :-). Thanks, Guenter
diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml index 6747b870f297..0481eb34d9f1 100644 --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml @@ -37,6 +37,18 @@ properties: $ref: /schemas/types.yaml#/definitions/uint32 enum: [1, 2, 4] + adi,tacho-25-us: + description: Expected tacho signal when the PWM is at 25%. + + adi,tacho-50-us: + description: Expected tacho signal when the PWM is at 50%. + + adi,tacho-75-us: + description: Expected tacho signal when the PWM is at 75%. + + adi,tacho-100-us: + description: Expected tacho signal when the PWM is at 100%. + required: - compatible - reg
Add the bindings for the tacho signal evaluation parameters which depend on the FAN being used. Signed-off-by: Nuno Sá <nuno.sa@analog.com> --- .../bindings/hwmon/adi,axi-fan-control.yaml | 12 ++++++++++++ 1 file changed, 12 insertions(+)