Message ID | 20191114145918.235339-1-suzuki.poulose@arm.com |
---|---|
Series | arm64: Add workaround for Cortex-A77 erratum 1542418 |
Hi Suzuki,

On Thu, Nov 14, 2019 at 02:59:13PM +0000, Suzuki K Poulose wrote:
> This series adds workaround for Arm erratum 1542418 which affects

Searching for that erratum number doesn't find me a description :(

> Cortex-A77 cores (r0p0 - r1p0). Affected cores may execute stale
> instructions from the L0 macro-op cache violating the
> prefetch-speculation-protection guaranteed by the architecture.
> This happens when the branch predictor bases its predictions
> on a branch at this address on the stale history due to ASID or VMID
> reuse.

Two immediate questions:

  1. Can we disable the L0 MOP cache?

  2. Can we invalidate the branch predictor? If Spectre-v2 taught us
     anything it's that removing those instructions was a mistake!

Moving on...

Have you reproduced this at top-level? If I recall the
prefetch-speculation-protection, it's designed to protect against the
case where you have a direct branch:

    addr:   B   foo

and another CPU writes out a new function:

    bar:
        insn0
        ...
        insnN

before doing any necessary maintenance and then patches the original
branch to:

    addr:   B   bar

The idea is that a concurrently executing CPU could mispredict the original
branch to point at 'bar', fetch the instructions before they've been written
out and then confirm the prediction by looking at the newly written branch
instruction. Even without the prefetch-speculation-protection, that's
fairly difficult to achieve in practice: you'd need to be doing something
like reusing memory to hold the instructions so that the initial
misprediction occurs.

How does A77 stop this from occurring when the ASID is not reallocated (e.g.
the example above)? Is the MOP cache flushed somehow?

With this erratum, it sounds like you have to end up reusing an ASID from
a task that had a branch at 'addr' in its address space that branched to
the address of 'bar' (again, in its address space). Is that right? That
sounds super rare to me, particularly with ASLR: not only does the aliasing
branch need to exist, but it needs to be held in the branch predictor while
we cycle through 64k ASIDs *and* the race with the writer needs to happen
so that we get stale instructions from the MOP cache.

Is there something I'm missing that makes this remotely plausible?

Will
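
For readers following the ASID argument above, here is a toy Python model of a generation-based ASID allocator, included only to make the "cycle through 64k ASIDs" point concrete. It is a sketch under stated assumptions, not the arm64 allocator: `ToyAsidAllocator`, `asid_for` and the task names are hypothetical, 16-bit ASIDs are assumed, and unlike a real kernel it does not preserve the ASIDs of currently running tasks across a rollover.

```python
# Toy model of generation-based ASID allocation (illustration only, not kernel code).
ASID_BITS = 16                  # assumption: 16-bit ASIDs, i.e. 64k values
NUM_ASIDS = 1 << ASID_BITS


class ToyAsidAllocator:
    """Hands out ASIDs sequentially and starts a new generation on exhaustion."""

    def __init__(self, num_asids=NUM_ASIDS):
        self.num_asids = num_asids
        self.generation = 1
        self.next_asid = 1       # ASID 0 is kept reserved in this model
        self.last_owner = {}     # asid -> task that most recently held it
        self.task_ctx = {}       # task -> (generation, asid)

    def asid_for(self, task):
        """Return (asid, previous_owner); previous_owner is None if not reused."""
        gen, asid = self.task_ctx.get(task, (0, 0))
        if gen == self.generation:
            return asid, None    # fast path: the task's ASID is still valid

        if self.next_asid == self.num_asids:
            # Rollover: the ASID space is exhausted, so start a new generation.
            # A real system must also invalidate TLBs at this point.
            self.generation += 1
            self.next_asid = 1

        asid = self.next_asid
        self.next_asid += 1
        previous = self.last_owner.get(asid)
        if previous == task:
            previous = None
        self.last_owner[asid] = task
        self.task_ctx[task] = (self.generation, asid)
        return asid, previous


# Tiny ASID space so that the rollover happens after three allocations.
alloc = ToyAsidAllocator(num_asids=4)
for name in ["a", "b", "c", "d"]:
    print(name, alloc.asid_for(name))
```

In this model task "d" triggers the rollover and inherits ASID 1 from task "a". With the default 16-bit space, an ASID value only gets a new owner after every other value has been handed out, which is the window during which the stale branch history would have to survive in the predictor.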
Hi Will

On 11/14/2019 04:39 PM, Will Deacon wrote:
> Hi Suzuki,
>
> On Thu, Nov 14, 2019 at 02:59:13PM +0000, Suzuki K Poulose wrote:
>> This series adds workaround for Arm erratum 1542418 which affects
>
> Searching for that erratum number doesn't find me a description :(

I believe this was published in the Cortex-A77 SDEN v9.0. I will chase it
internally.

>
>> Cortex-A77 cores (r0p0 - r1p0). Affected cores may execute stale
>> instructions from the L0 macro-op cache violating the
>> prefetch-speculation-protection guaranteed by the architecture.
>> This happens when the branch predictor bases its predictions
>> on a branch at this address on the stale history due to ASID or VMID
>> reuse.
>
> Two immediate questions:
>
>   1. Can we disable the L0 MOP cache?

Yes, but it hurts performance.

>   2. Can we invalidate the branch predictor? If Spectre-v2 taught us
>      anything it's that removing those instructions was a mistake!

The workaround suggested is actually invalidating the branch history
but in a costly way. I am unaware of any.

> Moving on...
>
> Have you reproduced this at top-level? If I recall the
> prefetch-speculation-protection, it's designed to protect against the
> case where you have a direct branch:

No, see below.

>
>     addr:   B   foo
>
> and another CPU writes out a new function:
>
>     bar:
>         insn0
>         ...
>         insnN
>
> before doing any necessary maintenance and then patches the original
> branch to:
>
>     addr:   B   bar
>
> The idea is that a concurrently executing CPU could mispredict the original
> branch to point at 'bar', fetch the instructions before they've been written
> out and then confirm the prediction by looking at the newly written branch
> instruction. Even without the prefetch-speculation-protection, that's
> fairly difficult to achieve in practice: you'd need to be doing something
> like reusing memory to hold the instructions so that the initial
> misprediction occurs.
>
> How does A77 stop this from occurring when the ASID is not reallocated (e.g.
> the example above)? Is the MOP cache flushed somehow?

IIUC, the MOP cache is flushed on I-cache invalidate, thus it is fine.

>
> With this erratum, it sounds like you have to end up reusing an ASID from
> a task that had a branch at 'addr' in its address space that branched to
> the address of 'bar' (again, in its address space). Is that right? That
> sounds super rare to me, particularly with ASLR: not only does the aliasing

AFAICS, yes, and on top of that, it should also miss "addr" in the MOP cache
and hit "bar" before the I-cache invalidate is received. This may cause
"bar" to be fetched from the MOP cache (and the fetch is not canceled even
though there was a MOP flush triggered by the I-cache invalidate after the
hit), and "addr" should miss in the I-cache, causing it to fetch the updated
instruction. This also means that the new context must not have executed
"addr" (which would give a hit in the MOP cache) while "bar" was fetched.
So, this adds more constraints to actually hit it.

> branch need to exist, but it needs to be held in the branch predictor while
> we cycle through 64k ASIDs *and* the race with the writer needs to happen
> so that we get stale instructions from the MOP cache.
>
> Is there something I'm missing that makes this remotely plausible?

No :-)

Cheers
Suzuki
On Fri, Nov 15, 2019 at 01:14:07AM +0000, Suzuki K Poulose wrote:
> On 11/14/2019 04:39 PM, Will Deacon wrote:
> > On Thu, Nov 14, 2019 at 02:59:13PM +0000, Suzuki K Poulose wrote:
> > > This series adds workaround for Arm erratum 1542418 which affects
> > > Cortex-A77 cores (r0p0 - r1p0). Affected cores may execute stale
> > > instructions from the L0 macro-op cache violating the
> > > prefetch-speculation-protection guaranteed by the architecture.
> > > This happens when the branch predictor bases its predictions
> > > on a branch at this address on the stale history due to ASID or VMID
> > > reuse.
> >
> > Two immediate questions:
> >
> >   1. Can we disable the L0 MOP cache?
>
> Yes, but it hurts performance.
>
> >   2. Can we invalidate the branch predictor? If Spectre-v2 taught us
> >      anything it's that removing those instructions was a mistake!
>
> The workaround suggested is actually invalidating the branch history
> but in a costly way. I am unaware of any.
>
> > Moving on...
> >
> > Have you reproduced this at top-level? If I recall the
> > prefetch-speculation-protection, it's designed to protect against the
> > case where you have a direct branch:
>
> No, see below.
>
> >
> >     addr:   B   foo
> >
> > and another CPU writes out a new function:
> >
> >     bar:
> >         insn0
> >         ...
> >         insnN
> >
> > before doing any necessary maintenance and then patches the original
> > branch to:
> >
> >     addr:   B   bar
> >
> > The idea is that a concurrently executing CPU could mispredict the original
> > branch to point at 'bar', fetch the instructions before they've been written
> > out and then confirm the prediction by looking at the newly written branch
> > instruction. Even without the prefetch-speculation-protection, that's
> > fairly difficult to achieve in practice: you'd need to be doing something
> > like reusing memory to hold the instructions so that the initial
> > misprediction occurs.
> >
> > How does A77 stop this from occurring when the ASID is not reallocated (e.g.
> > the example above)? Is the MOP cache flushed somehow?
>
> IIUC, the MOP cache is flushed on I-cache invalidate, thus it is fine.

Hmm, so this is interesting. Does that mean we could do a local I-cache
invalidation in check_and_switch_context() at the same time as doing the
local TLBI after a rollover?

I still don't grok the failure case, though, because assuming A77 has IDC=0,
then won't you see the I-cache maintenance from userspace anyway?

Will
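
The ordering Will asks about can be sketched as a toy Python model: a rollover marks every CPU as needing a flush, and each CPU then does its local TLB invalidation, plus the proposed local I-cache invalidation, the next time it switches context. Everything here is hypothetical (`ToyCpu`, `rollover`, `switch_context`, and the set-based caches are illustrative names, not kernel code). The model takes on trust the statement from the thread that an I-cache invalidate also flushes the MOP cache, and it says nothing about whether this would be sufficient to avoid the erratum.

```python
# Toy model of "local TLBI + local I-cache invalidate after rollover".
# Illustration only: all names are hypothetical and no hardware is modelled.


class ToyCpu:
    def __init__(self):
        self.flush_pending = False   # set when an ASID rollover has happened
        self.tlb = set()             # stand-in for cached translations
        self.mop_cache = set()       # stand-in for cached macro-ops
        self.log = []                # maintenance operations, in order

    def local_tlbi(self):
        self.tlb.clear()
        self.log.append("tlbi")

    def local_icache_invalidate(self):
        # Assumption taken from the thread: the MOP cache is flushed
        # whenever the I-cache is invalidated.
        self.mop_cache.clear()
        self.log.append("ic-inval")


def rollover(cpus):
    """Mark every CPU so that it flushes lazily on its next context switch."""
    for cpu in cpus:
        cpu.flush_pending = True


def switch_context(cpu, invalidate_icache=True):
    """Run the deferred maintenance once, if a rollover made it necessary."""
    if cpu.flush_pending:
        cpu.flush_pending = False
        cpu.local_tlbi()
        if invalidate_icache:        # the extra step proposed in the thread
            cpu.local_icache_invalidate()


cpus = [ToyCpu(), ToyCpu()]
cpus[0].mop_cache.add("bar")         # pretend stale macro-ops are cached
rollover(cpus)
switch_context(cpus[0])
print(cpus[0].log, cpus[0].mop_cache, cpus[1].flush_pending)
```

The point of deferring the work to the context switch is that each CPU pays the cost at most once per rollover, and only when it actually picks up a task with a freshly allocated ASID.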