Message ID | 1444247430-14808-1-git-send-email-christoffer.dall@linaro.org |
---|---|
State | New |
Headers | show |
Hello! > + The mpidr encoding is based on the affinity information in the > + architecture defined MPIDR, and the field is encoded as follows: > + | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 | > + | Aff3 | Aff2 | Aff1 | Aff0 | One concern about this... Does it already have "We are Bosses, we Decided it, It's not subject to change, We do not care" status? Actually, current approach with using index instead of Aff bits works pretty well. It works fine with qemu too, because internally qemu also maintains CPU index, which it uses for GICv2 accesses. Keeping index allows to reuse more code, and provides better backwards compatibility. So could we do this? Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
On Thu, Oct 08, 2015 at 10:17:09AM +0300, Pavel Fedin wrote: > Hello! > > > + The mpidr encoding is based on the affinity information in the > > + architecture defined MPIDR, and the field is encoded as follows: > > + | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 | > > + | Aff3 | Aff2 | Aff1 | Aff0 | > > One concern about this... Does it already have "We are Bosses, we Decided it, It's not subject to > change, We do not care" status? That's a rather negative question. We spent a fair amount of time discussing this during SFO15 and arrived at the conclusion that this was the way to go. If there are good arguments for why this is not sufficient or causes problems, then of course we'll make changes as required. > Actually, current approach with using index instead of Aff bits > works pretty well. It works fine with qemu too, because internally qemu also maintains CPU index, > which it uses for GICv2 accesses. > Keeping index allows to reuse more code, and provides better backwards compatibility. So could we > do this? > The architecture defines how to address a specific CPU, and that's using the MPIDR, not inventing our own scheme, so that's what we should do. For example, I don't think you had the full 32-bits available to address a CPU in your proposal, so if nothing else, that proposal had less expressive power than what the architecture defines. I also don't buy the argument that this saves a lot of code. If you have something that deals with a cpu index, surely two simple functions can allow the same amount of code reuse: unsigned long mpidr_to_vcpu_idx(unsigned long mpidr); unsigned long vcpu_idx_to_mpidr(unsigned long vcpu_idx); -Christoffer
Hello! > > One concern about this... Does it already have "We are Bosses, we Decided it, It's not > subject to > > change, We do not care" status? > > That's a rather negative question. Sorry, didn't want to offend anyone. I just wanted to tell that i know that you, as maintainers, have much more power than i do, and you can always say "it's political decision, we just want it and that's final", and if you choose to do this, i'm perfectly OK with that, just say it. > The architecture defines how to address a specific CPU, and that's using > the MPIDR, not inventing our own scheme, so that's what we should do. But doesn't the same apply to GICv2 then? It just happened so that we had Aff0 == idx, therefore looks like nobody cared. My argument is to have GICv3 API as similar to GICv2 as possible, IMHO that would make it easier to maintain the code, and perhaps give some way to reusing it. > For example, I don't think you had the full 32-bits available to address > a CPU in your proposal, so if nothing else, that proposal had less > expressive power than what the architecture defines. My proposal was: --- cut --- KVM_DEV_ARM_VGIC_GRP_DIST_REGS Attributes: The attr field of kvm_device_attr encodes two values: bits: | 63 | 62 .. 52 | 51 .. 32 | 31 .... 0 | values: | size | reserved | cpu idx | offset | All distributor regs can be accessed as (rw, 32-bit) For GICv3 some regsisters are actually (rw, 64-bit) according to the specification. In order to perform full 64-bit access 'size' bit should be set to 1. KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose. --- cut --- Bit 63 was meant to specify access size (u64 or u32), and bits 62 - 52 were reserved just in order to match ARM64_SYS_REG() macro, which uses these bits for own purpose. But, since your proposal suggests that all GICv3 accesses are 64-bit wide, and we use own system register encoding (i don't see any serious problems with this), it is possible just to use bits 63...32 for vCPU index. So, maximum number of CPUs would be the same 0xFFFFFFFF as in your proposal, and the API would be fully compatible with GICv2 one (well, except access size), and we would use the same definitions and the same approach to handle both. This would bring more consistency and less confusion to userspace developers who use the API. By the way, the rest of KVM API (KVM_IRQ_LINE, for example), also uses vCPU index. That's all my arguments for vCPU index instead of affinity value, and if you say "that doesn't count, we don't have to be compatible with GICv2", i'll accept this and proceed with the rewrite. Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
On Thu, Oct 08, 2015 at 12:10:35PM +0300, Pavel Fedin wrote: > Hello! > [...] > > > The architecture defines how to address a specific CPU, and that's using > > the MPIDR, not inventing our own scheme, so that's what we should do. > > But doesn't the same apply to GICv2 then? It just happened so that we had Aff0 == idx, therefore > looks like nobody cared. I don't think it's about caring, but we (I) didn't consider it when designing that API. > My argument is to have GICv3 API as similar to GICv2 as possible, IMHO that would make it easier to > maintain the code, and perhaps give some way to reusing it. Plenty of things are broken about the VGICv2 implementation and API, so my goal is not to have GICv3 be similar to GICv2, but to design a good API. > > > For example, I don't think you had the full 32-bits available to address > > a CPU in your proposal, so if nothing else, that proposal had less > > expressive power than what the architecture defines. > > My proposal was: > > --- cut --- > KVM_DEV_ARM_VGIC_GRP_DIST_REGS > Attributes: > The attr field of kvm_device_attr encodes two values: > bits: | 63 | 62 .. 52 | 51 .. 32 | 31 .... 0 | > values: | size | reserved | cpu idx | offset | > > All distributor regs can be accessed as (rw, 32-bit) > For GICv3 some regsisters are actually (rw, 64-bit) according to the > specification. In order to perform full 64-bit access 'size' bit should be > set to 1. KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose. > --- cut --- > > Bit 63 was meant to specify access size (u64 or u32), and bits 62 - 52 were reserved just in order > to match ARM64_SYS_REG() macro, which uses these bits for own purpose. > > But, since your proposal suggests that all GICv3 accesses are 64-bit wide, and we use own system > register encoding (i don't see any serious problems with this), it is possible just to use bits > 63...32 for vCPU index. So, maximum number of CPUs would be the same 0xFFFFFFFF as in your proposal, > and the API would be fully compatible with GICv2 one (well, except access size), and we would use > the same definitions and the same approach to handle both. This would bring more consistency and > less confusion to userspace developers who use the API. I don't agree; using the same API with slightly different semantics depending on which device you created is much more error prone than having a clear separation between the two different APIs, IMHO. > > By the way, the rest of KVM API (KVM_IRQ_LINE, for example), also uses vCPU index. > > That's all my arguments for vCPU index instead of affinity value I'm not convinced that we should be compatible with GICv2 at all. (The deeper argument here is that GICv2 had an architectural limitation to 8 CPUs so all the CPU addressing is simple in that case. This is a fundamental evolution from GICv2 to GICv3 so it is only natural that there are substantial changes in this area.) I'll let Marc or Peter chime in if they disagree. >, and if you say "that doesn't > count, we don't have to be compatible with GICv2", i'll accept this and proceed with the rewrite. > Cool! Thanks, -Christoffer
On 08/10/15 10:28, Christoffer Dall wrote: > On Thu, Oct 08, 2015 at 12:10:35PM +0300, Pavel Fedin wrote: >> Hello! >> > [...] >> >>> The architecture defines how to address a specific CPU, and that's using >>> the MPIDR, not inventing our own scheme, so that's what we should do. >> >> But doesn't the same apply to GICv2 then? It just happened so that we had Aff0 == idx, therefore >> looks like nobody cared. > > I don't think it's about caring, but we (I) didn't consider it when > designing that API. > >> My argument is to have GICv3 API as similar to GICv2 as possible, IMHO that would make it easier to >> maintain the code, and perhaps give some way to reusing it. > > Plenty of things are broken about the VGICv2 implementation and API, so > my goal is not to have GICv3 be similar to GICv2, but to design a good > API. > >> >>> For example, I don't think you had the full 32-bits available to address >>> a CPU in your proposal, so if nothing else, that proposal had less >>> expressive power than what the architecture defines. >> >> My proposal was: >> >> --- cut --- >> KVM_DEV_ARM_VGIC_GRP_DIST_REGS >> Attributes: >> The attr field of kvm_device_attr encodes two values: >> bits: | 63 | 62 .. 52 | 51 .. 32 | 31 .... 0 | >> values: | size | reserved | cpu idx | offset | >> >> All distributor regs can be accessed as (rw, 32-bit) >> For GICv3 some regsisters are actually (rw, 64-bit) according to the >> specification. In order to perform full 64-bit access 'size' bit should be >> set to 1. KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose. >> --- cut --- >> >> Bit 63 was meant to specify access size (u64 or u32), and bits 62 - 52 were reserved just in order >> to match ARM64_SYS_REG() macro, which uses these bits for own purpose. >> >> But, since your proposal suggests that all GICv3 accesses are 64-bit wide, and we use own system >> register encoding (i don't see any serious problems with this), it is possible just to use bits >> 63...32 for vCPU index. So, maximum number of CPUs would be the same 0xFFFFFFFF as in your proposal, >> and the API would be fully compatible with GICv2 one (well, except access size), and we would use >> the same definitions and the same approach to handle both. This would bring more consistency and >> less confusion to userspace developers who use the API. > > I don't agree; using the same API with slightly different semantics > depending on which device you created is much more error prone than > having a clear separation between the two different APIs, IMHO. > >> >> By the way, the rest of KVM API (KVM_IRQ_LINE, for example), also uses vCPU index. >> >> That's all my arguments for vCPU index instead of affinity value > > I'm not convinced that we should be compatible with GICv2 at all. (The > deeper argument here is that GICv2 had an architectural limitation to 8 > CPUs so all the CPU addressing is simple in that case. This is a > fundamental evolution from GICv2 to GICv3 so it is only natural that > there are substantial changes in this area.) > > I'll let Marc or Peter chime in if they disagree. Well, compatibility with GICv2 is the biggest mistake we made when designing the GICv3 architecture. And that's why our emulation doesn't give a damn about v2 compatibility. Designing the kernel API in terms of GICv2 is nothing short of revolting, if you ask me. We've already shoe-horned GICv3 in GICv2 data structures in KVM, and that's an absolute mess. I can't wait to simple nuke the thing. Once v2 is out of the picture, everything is much simpler. Thanks, M.
On 8 October 2015 at 10:10, Pavel Fedin <p.fedin@samsung.com> wrote: > Sorry, didn't want to offend anyone. I just wanted to tell that i know > that you, as maintainers, have much more power than i do, and you can > always say "it's political decision, we just want it and that's final", > and if you choose to do this, i'm perfectly OK with that, just say it. This isn't intended to be a political decision; it's our joint technical opinion on the best design for this API. Since we all happened to be in one physical location the other week it was a good opportunity for us to work through how we thought the API should look from a technical perspective. At that point it seemed to us to be clearer to write up the results of that discussion as a single patch to the API documentation, rather than doing it by (for instance) making lots of different review comments on your patches. Christoffer's API documentation patch is a patch like any other and it has to go through review. If there are parts where you don't understand the rationale or you think we got something wrong you should let us know. thanks -- PMM
Hello! > Well, compatibility with GICv2 is the biggest mistake we made when > designing the GICv3 architecture. And that's why our emulation doesn't > give a damn about v2 compatibility. Ok, i see your arguments, and after that it becomes a matter of personal taste. Three beat one, i go your way. :) Don't know if i'll be able to roll out v5 tomorrow, but i think on monday i'll definitely do. Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
On Thu, Oct 08, 2015 at 03:28:40PM +0300, Pavel Fedin wrote: > Hello! > > > Well, compatibility with GICv2 is the biggest mistake we made when > > designing the GICv3 architecture. And that's why our emulation doesn't > > give a damn about v2 compatibility. > > Ok, i see your arguments, and after that it becomes a matter of personal taste. Three beat one, i > go your way. :) Don't know if i'll be able to roll out v5 tomorrow, but i think on monday i'll > definitely do. > There's no rush, I don't think this will make it into v4.4 anyhow, as we're already on -rc4 and there's a bunch of other stuff in flight, and I haven't configured a way to test these patches yet. Speaking of which, does the QEMU side of this patch set require first adding the GICv3 emulation for the data structures or is there a stand-alone migration patch set somewhere? -Christoffer
Hello! > There's no rush, I don't think this will make it into v4.4 anyhow Did you mean 4.3 here? > Speaking of which, does the QEMU side of this patch set require first > adding the GICv3 emulation for the data structures or is there a > stand-alone migration patch set somewhere? I rolled it out a week ago: https://www.mail-archive.com/qemu-devel@nongnu.org/msg325331.html. I tried to get reviewed/accepted/whatever at least data structure part, but Peter replied that he isn't interested before we have the kernel. Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
On 08/10/15 13:45, Pavel Fedin wrote: > Hello! > >> There's no rush, I don't think this will make it into v4.4 anyhow > > Did you mean 4.3 here? No, that'd be really 4.4. 4.3 has closed 4 weeks ago, and 4.4 is about to open. This work is unlikely to make it before 4.5 TBH. Thanks, M.
On 8 October 2015 at 13:45, Pavel Fedin <p.fedin@samsung.com> wrote: >> Speaking of which, does the QEMU side of this patch set require first >> adding the GICv3 emulation for the data structures or is there a >> stand-alone migration patch set somewhere? > > I rolled it out a week ago: https://www.mail-archive.com/qemu-devel@nongnu.org/msg325331.html. I > tried to get reviewed/accepted/whatever at least data structure part, but Peter replied that he > isn't interested before we have the kernel. More specifically, I wanted us to agree on the kernel API for migration, because that significantly affects how the QEMU code looks. A quick look at your patch suggests you still have data structures like +struct gicv3_irq_state { + /* The enable bits are only banked for per-cpu interrupts. */ + unsigned long *enabled; + unsigned long *pending; + unsigned long *active; + unsigned long *level; + unsigned long *group; + bool edge_trigger; /* true: edge-triggered, false: level-triggered */ + uint32_t mask_size; /* Size of bitfields in long's, for migration */ +}; I think it's probably going to be better to have an array of redistributor structures (one per CPU), and then keep the state that in hardware is in each redistributor inside those sub-structures. Similarly for state that in hardware is inside the distributor, and inside each cpu interface. thanks -- PMM
Hello! > + KVM_DEV_ARM_VGIC_CPU_SYSREGS > + Attributes: > + The attr field of kvm_device_attr encodes two values: > + bits: | 63 .... 32 | 31 .... 16 | 15 .... 0 | > + values: | mpidr | RES | instr | One small clarification: do you really want it to be different from KVM_DEV_ARM_VGIC_GRP_CPU_REGS? Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
On 09/10/15 08:30, Pavel Fedin wrote: > Hello! > >> + KVM_DEV_ARM_VGIC_CPU_SYSREGS >> + Attributes: >> + The attr field of kvm_device_attr encodes two values: >> + bits: | 63 .... 32 | 31 .... 16 | 15 .... 0 | >> + values: | mpidr | RES | instr | > > One small clarification: do you really want it to be different from KVM_DEV_ARM_VGIC_GRP_CPU_REGS? One is memory mapped, the other one is a system register. One is addressed by a linear index, the other one by affinity. How do you reconcile the two? What's the point of trying to shoehorn a different concept into the existing API? M.
Hello! > How do you reconcile the two? What's the point of trying to shoehorn a > different concept into the existing API? Keeping amount of #define's as small as possible, and try to reuse everything that can be reused. It's just my personal taste, nothing serious. :) The question was just a clarification. Ok, this is not a problem. By the way, i've just finished v5 patchset, now need to rewrite qemu and test before posting it. Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
On 09/10/15 09:10, Pavel Fedin wrote: > Hello! > >> How do you reconcile the two? What's the point of trying to shoehorn a >> different concept into the existing API? > > Keeping amount of #define's as small as possible, and try to reuse everything that can be reused. Reusing existing code is a noble goal, but I think that having clear abstractions and following the architecture trumps it completely. Experience has shown that trying to be clever with userspace interfaces comes and bites us in the rear sooner or later. Not enough bits, being unable to represent valid architecture features, pointlessly complicated code that is hard to maintain. Those are the things I care about today. So a #define is completely free, additional code is still very cheap. My brain cells are few, and I like to look at the code and get this warm fuzzy feeling that it is doing the right thing. Having separate interfaces for different devices is a very good way to ensure the above. Thanks, M.
diff --git a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt new file mode 100644 index 0000000..383f9be --- /dev/null +++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt @@ -0,0 +1,114 @@ +ARM Virtual Generic Interrupt Controller v3 and later (VGICv3) +============================================================== + + +Device types supported: + KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0 + +Only one VGIC instance may be instantiated through this API. The created VGIC +will act as the VM interrupt controller, requiring emulated user-space devices +to inject interrupts to the VGIC instead of directly to CPUs. It is not +possible to create both a GICv3 and GICv2 on the same VM. + +Creating a guest GICv3 device requires a host GICv3 as well. + +Groups: + KVM_DEV_ARM_VGIC_GRP_ADDR + Attributes: + KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit) + Base address in the guest physical address space of the GICv3 distributor + register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3. + This address needs to be 64K aligned and the region covers 64 KByte. + + KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit) + Base address in the guest physical address space of the GICv3 + redistributor register mappings. There are two 64K pages for each + VCPU and all of the redistributor pages are contiguous. + Only valid for KVM_DEV_TYPE_ARM_VGIC_V3. + This address needs to be 64K aligned. + + + KVM_DEV_ARM_VGIC_GRP_DIST_REGS + KVM_DEV_ARM_VGIC_GRP_REDIST_REGS + Attributes: + The attr field of kvm_device_attr encodes two values: + bits: | 63 .... 32 | 31 .... 0 | + values: | mpidr | offset | + + All distributor regs are (rw, 64-bit). + + KVM_DEV_ARM_VGIC_GRP_DIST_REGS accesses the main distributor registers. + KVM_DEV_ARM_VGIC_GRP_REDIST_REGS accesses the redistributor of the CPU + specified by the mpidr. + + The offset is relative to the "[Re]Distributor base address" as defined + in the GICv3/4 specs. Getting or setting such a register has the same + effect as reading or writing the register on real hardware, and the mpidr + field is used to specify which redistributor is accessed. The mpidr is + ignored for the distributor. + + The mpidr encoding is based on the affinity information in the + architecture defined MPIDR, and the field is encoded as follows: + | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 | + | Aff3 | Aff2 | Aff1 | Aff0 | + + Note that distributor fields are not banked, but return the same value + regardless of the mpidr used to access the register. + Limitations: + - Priorities are not implemented, and registers are RAZ/WI + Errors: + -ENXIO: Getting or setting this register is not yet supported + -EBUSY: One or more VCPUs are running + + + KVM_DEV_ARM_VGIC_CPU_SYSREGS + Attributes: + The attr field of kvm_device_attr encodes two values: + bits: | 63 .... 32 | 31 .... 16 | 15 .... 0 | + values: | mpidr | RES | instr | + + The mpidr field encodes the CPU ID based on the affinity information in the + architecture defined MPIDR, and the field is encoded as follows: + | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 | + | Aff3 | Aff2 | Aff1 | Aff0 | + + The instr field encodes the system register to access based on the fields + defined in the A64 instruction set encoding for system register access + (RES means the bits are reserved for future use and should be zero): + + | 15 ... 14 | 13 ... 11 | 10 ... 7 | 6 ... 3 | 2 ... 0 | + | Op 0 | Op1 | CRn | CRm | Op2 | + + All system regs accessed through this API are (rw, 64-bit). + + KVM_DEV_ARM_VGIC_CPU_SYSREGS accesses the CPU interface registers for the + CPU specified by the mpidr field. + + + Limitations: + - Priorities are not implemented, and registers are RAZ/WI + Errors: + -ENXIO: Getting or setting this register is not yet supported + -EBUSY: One or more VCPUs are running + + + KVM_DEV_ARM_VGIC_GRP_NR_IRQS + Attributes: + A value describing the number of interrupts (SGI, PPI and SPI) for + this GIC instance, ranging from 64 to 1024, in increments of 32. + + Errors: + -EINVAL: Value set is out of the expected range + -EBUSY: Value has already be set. + + + KVM_DEV_ARM_VGIC_GRP_CTRL + Attributes: + KVM_DEV_ARM_VGIC_CTRL_INIT + request the initialization of the VGIC, no additional parameter in + kvm_device_attr.addr. + Errors: + -ENXIO: VGIC not properly configured as required prior to calling + this attribute + -ENODEV: no online VCPU + -ENOMEM: memory shortage when allocating vgic internal data diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt b/Documentation/virtual/kvm/devices/arm-vgic.txt index 3fb9054..e61d7a0 100644 --- a/Documentation/virtual/kvm/devices/arm-vgic.txt +++ b/Documentation/virtual/kvm/devices/arm-vgic.txt @@ -3,16 +3,16 @@ ARM Virtual Generic Interrupt Controller (VGIC) Device types supported: KVM_DEV_TYPE_ARM_VGIC_V2 ARM Generic Interrupt Controller v2.0 - KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0 Only one VGIC instance may be instantiated through either this API or the legacy KVM_CREATE_IRQCHIP api. The created VGIC will act as the VM interrupt controller, requiring emulated user-space devices to inject interrupts to the VGIC instead of directly to CPUs. -Creating a guest GICv3 device requires a host GICv3 as well. -GICv3 implementations with hardware compatibility support allow a guest GICv2 -as well. +GICv3 implementations with hardware compatibility support allow creating a +guest GICv2 through this interface. For information on creating a guest GICv3 +device, see arm-vgic-v3.txt. It is not possible to create both a GICv3 and +GICv2 device on the same VM. Groups: KVM_DEV_ARM_VGIC_GRP_ADDR @@ -27,18 +27,6 @@ Groups: interface register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V2. This address needs to be 4K aligned and the region covers 4 KByte. - KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit) - Base address in the guest physical address space of the GICv3 distributor - register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3. - This address needs to be 64K aligned and the region covers 64 KByte. - - KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit) - Base address in the guest physical address space of the GICv3 - redistributor register mappings. There are two 64K pages for each - VCPU and all of the redistributor pages are contiguous. - Only valid for KVM_DEV_TYPE_ARM_VGIC_V3. - This address needs to be 64K aligned. - KVM_DEV_ARM_VGIC_GRP_DIST_REGS Attributes: