diff mbox series

[3/4] cputlb: introduce tlb_flush_other_cpu for reset use

Message ID 20250225184628.3590671-4-alex.bennee@linaro.org
State New
Headers show
Series cputlb: add tlb_flush_other_cpu | expand

Commit Message

Alex Bennée Feb. 25, 2025, 6:46 p.m. UTC
The commit 30933c4fb4 (tcg/cputlb: remove other-cpu capability from
TLB flushing) introduced a regression that only shows up when
--enable-debug-tcg is used. The main use case of tlb_flush outside of
the current_cpu context is for handling reset and CPU creation. Rather
than revert the commit introduce a new helper and tweak the
documentation to make it clear where it should be used.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - apparently reset can come from both cpu context and outside
  - add cpu_common_post_load fixes
---
 include/exec/exec-all.h   | 20 ++++++++++++++++----
 accel/tcg/cputlb.c        | 11 +++++++++++
 accel/tcg/tcg-accel-ops.c |  2 +-
 cpu-target.c              |  2 +-
 target/i386/machine.c     |  2 +-
 5 files changed, 30 insertions(+), 7 deletions(-)

Comments

Richard Henderson Feb. 25, 2025, 7:49 p.m. UTC | #1
On 2/25/25 10:46, Alex Bennée wrote:
> The commit 30933c4fb4 (tcg/cputlb: remove other-cpu capability from
> TLB flushing) introduced a regression that only shows up when
> --enable-debug-tcg is used. The main use case of tlb_flush outside of
> the current_cpu context is for handling reset and CPU creation. Rather
> than revert the commit introduce a new helper and tweak the
> documentation to make it clear where it should be used.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> ---
> v2
>    - apparently reset can come from both cpu context and outside
>    - add cpu_common_post_load fixes
> ---
>   include/exec/exec-all.h   | 20 ++++++++++++++++----
>   accel/tcg/cputlb.c        | 11 +++++++++++
>   accel/tcg/tcg-accel-ops.c |  2 +-
>   cpu-target.c              |  2 +-
>   target/i386/machine.c     |  2 +-
>   5 files changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index d9045c9ac4..cf030001ca 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -64,12 +64,24 @@ void tlb_flush_page_all_cpus_synced(CPUState *src, vaddr addr);
>    * tlb_flush:
>    * @cpu: CPU whose TLB should be flushed
>    *
> - * Flush the entire TLB for the specified CPU. Most CPU architectures
> - * allow the implementation to drop entries from the TLB at any time
> - * so this is generally safe. If more selective flushing is required
> - * use one of the other functions for efficiency.
> + * Flush the entire TLB for the specified current CPU.
> + *
> + * Most CPU architectures allow the implementation to drop entries
> + * from the TLB at any time so this is generally safe. If more
> + * selective flushing is required use one of the other functions for
> + * efficiency.
>    */
>   void tlb_flush(CPUState *cpu);
> +/**
> + * tlb_flush_other_cpu:
> + * @cpu: CPU whose TLB should be flushed
> + *
> + * Flush the entire TLB for a specified CPU. For cross vCPU flushes
> + * you should be using a more selective function. This is really only
> + * used for flushing CPUs being reset from outside their current
> + * context.
> + */
> +void tlb_flush_other_cpu(CPUState *cpu);
>   /**
>    * tlb_flush_all_cpus_synced:
>    * @cpu: src CPU of the flush
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index ad158050a1..fc16a576f0 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -417,6 +417,17 @@ void tlb_flush(CPUState *cpu)
>       tlb_flush_by_mmuidx(cpu, ALL_MMUIDX_BITS);
>   }
>   
> +void tlb_flush_other_cpu(CPUState *cpu)
> +{
> +    if (qemu_cpu_is_self(cpu)) {
> +        tlb_flush(cpu);
> +    } else {
> +        async_run_on_cpu(cpu,
> +                         tlb_flush_by_mmuidx_async_work,
> +                         RUN_ON_CPU_HOST_INT(ALL_MMUIDX_BITS));
> +    }
> +}

I'm not convinced this is necessary.

> diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
> index 6e3f1fa92b..e85d317d34 100644
> --- a/accel/tcg/tcg-accel-ops.c
> +++ b/accel/tcg/tcg-accel-ops.c
> @@ -85,7 +85,7 @@ static void tcg_cpu_reset_hold(CPUState *cpu)
>   {
>       tcg_flush_jmp_cache(cpu);
>   
> -    tlb_flush(cpu);
> +    tlb_flush_other_cpu(cpu);
>   }

I would really like to believe that at this point, hold phase, the cpu is *not* running. 
Therefore it is safe to zero out the softmmu tlb data structures.

>   
>   /* mask must never be zero, except for A20 change call */
> diff --git a/cpu-target.c b/cpu-target.c
> index 667688332c..8eb1633c02 100644
> --- a/cpu-target.c
> +++ b/cpu-target.c
> @@ -56,7 +56,7 @@ static int cpu_common_post_load(void *opaque, int version_id)
>       /* 0x01 was CPU_INTERRUPT_EXIT. This line can be removed when the
>          version_id is increased. */
>       cpu->interrupt_request &= ~0x01;
> -    tlb_flush(cpu);
> +    tlb_flush_other_cpu(cpu);

Likewise, in post_load, the cpu is *not* running.

> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index d9d4f25d1a..e66f46758a 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -401,7 +401,7 @@ static int cpu_post_load(void *opaque, int version_id)
>           env->dr[7] = dr7 & ~(DR7_GLOBAL_BP_MASK | DR7_LOCAL_BP_MASK);
>           cpu_x86_update_dr7(env, dr7);
>       }
> -    tlb_flush(cs);
> +    tlb_flush_other_cpu(cs);
>       return 0;

Likewise.


r~
diff mbox series

Patch

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index d9045c9ac4..cf030001ca 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -64,12 +64,24 @@  void tlb_flush_page_all_cpus_synced(CPUState *src, vaddr addr);
  * tlb_flush:
  * @cpu: CPU whose TLB should be flushed
  *
- * Flush the entire TLB for the specified CPU. Most CPU architectures
- * allow the implementation to drop entries from the TLB at any time
- * so this is generally safe. If more selective flushing is required
- * use one of the other functions for efficiency.
+ * Flush the entire TLB for the specified current CPU.
+ *
+ * Most CPU architectures allow the implementation to drop entries
+ * from the TLB at any time so this is generally safe. If more
+ * selective flushing is required use one of the other functions for
+ * efficiency.
  */
 void tlb_flush(CPUState *cpu);
+/**
+ * tlb_flush_other_cpu:
+ * @cpu: CPU whose TLB should be flushed
+ *
+ * Flush the entire TLB for a specified CPU. For cross vCPU flushes
+ * you should be using a more selective function. This is really only
+ * used for flushing CPUs being reset from outside their current
+ * context.
+ */
+void tlb_flush_other_cpu(CPUState *cpu);
 /**
  * tlb_flush_all_cpus_synced:
  * @cpu: src CPU of the flush
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ad158050a1..fc16a576f0 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -417,6 +417,17 @@  void tlb_flush(CPUState *cpu)
     tlb_flush_by_mmuidx(cpu, ALL_MMUIDX_BITS);
 }
 
+void tlb_flush_other_cpu(CPUState *cpu)
+{
+    if (qemu_cpu_is_self(cpu)) {
+        tlb_flush(cpu);
+    } else {
+        async_run_on_cpu(cpu,
+                         tlb_flush_by_mmuidx_async_work,
+                         RUN_ON_CPU_HOST_INT(ALL_MMUIDX_BITS));
+    }
+}
+
 void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *src_cpu, uint16_t idxmap)
 {
     const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index 6e3f1fa92b..e85d317d34 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -85,7 +85,7 @@  static void tcg_cpu_reset_hold(CPUState *cpu)
 {
     tcg_flush_jmp_cache(cpu);
 
-    tlb_flush(cpu);
+    tlb_flush_other_cpu(cpu);
 }
 
 /* mask must never be zero, except for A20 change call */
diff --git a/cpu-target.c b/cpu-target.c
index 667688332c..8eb1633c02 100644
--- a/cpu-target.c
+++ b/cpu-target.c
@@ -56,7 +56,7 @@  static int cpu_common_post_load(void *opaque, int version_id)
     /* 0x01 was CPU_INTERRUPT_EXIT. This line can be removed when the
        version_id is increased. */
     cpu->interrupt_request &= ~0x01;
-    tlb_flush(cpu);
+    tlb_flush_other_cpu(cpu);
 
     /* loadvm has just updated the content of RAM, bypassing the
      * usual mechanisms that ensure we flush TBs for writes to
diff --git a/target/i386/machine.c b/target/i386/machine.c
index d9d4f25d1a..e66f46758a 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -401,7 +401,7 @@  static int cpu_post_load(void *opaque, int version_id)
         env->dr[7] = dr7 & ~(DR7_GLOBAL_BP_MASK | DR7_LOCAL_BP_MASK);
         cpu_x86_update_dr7(env, dr7);
     }
-    tlb_flush(cs);
+    tlb_flush_other_cpu(cs);
     return 0;
 }