Message ID | 1398424967-9306-6-git-send-email-ian.campbell@citrix.com |
---|---|
State | New |
Headers | show |
Hi Ian, On 25/04/14 12:22, Ian Campbell wrote: > +static int populate_guest_memory(struct xc_dom_image *dom, > + xen_pfn_t base_pfn, xen_pfn_t nr_pfns) > +{ > + int rc; > + xen_pfn_t allocsz, pfn; > + > + if (!nr_pfns) > + return 0; > + > + DOMPRINTF("%s: populating RAM @ %016"PRIx64"-%016"PRIx64" (%"PRId64"MB)", > + __FUNCTION__, > + (uint64_t)base_pfn << XC_PAGE_SHIFT, > + (uint64_t)(base_pfn + nr_pfns) << XC_PAGE_SHIFT, > + (uint64_t)nr_pfns >> (20-XC_PAGE_SHIFT)); > + > + for ( pfn = 0; pfn < nr_pfns; pfn++ ) > + dom->p2m_host[pfn] = base_pfn + pfn; > + > + for ( pfn = rc = allocsz = 0; (pfn < nr_pfns) && !rc; pfn += allocsz ) May I ask for a bit of clean up here? - allocsz doesn't need to be initialized. It will be unconditionally set at the beginning of the loop - rc can be set 0 at the beginning of the function. > + { > + allocsz = nr_pfns - pfn; > + if ( allocsz > 1024*1024 ) > + allocsz = 1024*1024; > + > + rc = xc_domain_populate_physmap_exact( > + dom->xch, dom->guest_domid, allocsz, > + 0, 0, &dom->p2m_host[pfn]); > + } > + > + return rc; > +} > + > int arch_setup_meminit(struct xc_dom_image *dom) > { > int rc; > - xen_pfn_t pfn, allocsz, i; > + xen_pfn_t pfn; > uint64_t modbase; > > /* Convenient */ > @@ -259,6 +291,8 @@ int arch_setup_meminit(struct xc_dom_image *dom) > const uint64_t ram0size = ramsize; > const uint64_t ram0end = GUEST_RAM0_BASE + ram0size; > > + const xen_pfn_t p2m_size = (ram0end - GUEST_RAM0_BASE) >> XC_PAGE_SHIFT; > + > const uint64_t kernbase = dom->kernel_seg.vstart; > const uint64_t kernend = ROUNDUP(dom->kernel_seg.vend, 21/*2MB*/); > const uint64_t kernsize = kernend - kernbase; > @@ -292,27 +326,17 @@ int arch_setup_meminit(struct xc_dom_image *dom) > > dom->shadow_enabled = 1; > > - dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages); > + dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size); > if ( dom->p2m_host == NULL ) > return -EINVAL; > + for ( pfn = 0; pfn < p2m_size; pfn++ ) > + dom->p2m_host[pfn] = INVALID_MFN; With this solution, you will loop 262244 times for nothing (the hole between the 2 banks). Also when the guess will have lots of RAM, it will be slow because we loop nearly twice the array (one here, the other in populate_guest_memory). It think we can avoid looping twice by making the two banks contiguous in the memory (i.e starting the second bank at 4GB instead of 8GB). Regards,
On Fri, 2014-04-25 at 13:51 +0100, Julien Grall wrote: > Hi Ian, > > On 25/04/14 12:22, Ian Campbell wrote: > > +static int populate_guest_memory(struct xc_dom_image *dom, > > + xen_pfn_t base_pfn, xen_pfn_t nr_pfns) > > +{ > > + int rc; > > + xen_pfn_t allocsz, pfn; > > + > > + if (!nr_pfns) > > + return 0; > > + > > + DOMPRINTF("%s: populating RAM @ %016"PRIx64"-%016"PRIx64" (%"PRId64"MB)", > > + __FUNCTION__, > > + (uint64_t)base_pfn << XC_PAGE_SHIFT, > > + (uint64_t)(base_pfn + nr_pfns) << XC_PAGE_SHIFT, > > + (uint64_t)nr_pfns >> (20-XC_PAGE_SHIFT)); > > + > > + for ( pfn = 0; pfn < nr_pfns; pfn++ ) > > + dom->p2m_host[pfn] = base_pfn + pfn; > > + > > + for ( pfn = rc = allocsz = 0; (pfn < nr_pfns) && !rc; pfn += allocsz ) > > May I ask for a bit of clean up here? This is code motion. I deliberately don't want to change it for that reason. > > @@ -292,27 +326,17 @@ int arch_setup_meminit(struct xc_dom_image *dom) > > > > dom->shadow_enabled = 1; > > > > - dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages); > > + dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size); > > if ( dom->p2m_host == NULL ) > > return -EINVAL; > > + for ( pfn = 0; pfn < p2m_size; pfn++ ) > > + dom->p2m_host[pfn] = INVALID_MFN; > > With this solution, you will loop 262244 times for nothing (the hole > between the 2 banks). Yes, this is the simplest way to ensure that p2m_host is definitely completely initialised, irrespective of the presence of any holes in the address space. > Also when the guess will have lots of RAM, it will be slow because we > loop nearly twice the array (one here, the other in populate_guest_memory). This is dwarfed by all the other overheads of course, like actually filling in the RAM on the second pass. > It think we can avoid looping twice by making the two banks contiguous > in the memory (i.e starting the second bank at 4GB instead of 8GB). As explained in the commit message I have deliberately left a hole so that we can see that such configurations actually work. Ian.
On 25/04/14 13:59, Ian Campbell wrote: >> It think we can avoid looping twice by making the two banks contiguous >> in the memory (i.e starting the second bank at 4GB instead of 8GB). > > As explained in the commit message I have deliberately left a hole so > that we can see that such configurations actually work. IHMO, the code path doesn't seem very complicate. Adding this overhead (the two loop + the 1GB hole) just for it seems pointless. BTW, is there any issue to create one big bank rather than 2?
On Fri, 2014-04-25 at 14:12 +0100, Julien Grall wrote: > > On 25/04/14 13:59, Ian Campbell wrote: > >> It think we can avoid looping twice by making the two banks contiguous > >> in the memory (i.e starting the second bank at 4GB instead of 8GB). > > > > As explained in the commit message I have deliberately left a hole so > > that we can see that such configurations actually work. > > IHMO, the code path doesn't seem very complicate. Adding this overhead > (the two loop + the 1GB hole) just for it seems pointless. I disagree. It is always worth exercising these things, and this will make sure we don't bake in assumptions about using a single contiguous bank anywhere by mistake. I think you are overestimating the overhead and underestimating the benefit.0 > BTW, is there any issue to create one big bank rather than 2? I expect it would work just fine, but I am not going to do that either for the reasons already discussed. Ian.
On 25/04/14 14:22, Ian Campbell wrote: > On Fri, 2014-04-25 at 14:12 +0100, Julien Grall wrote: >> >> On 25/04/14 13:59, Ian Campbell wrote: >> >> It think we can avoid looping twice by making the two banks contiguous >>>> in the memory (i.e starting the second bank at 4GB instead of 8GB). >>> >>> As explained in the commit message I have deliberately left a hole so >>> that we can see that such configurations actually work. >> >> IHMO, the code path doesn't seem very complicate. Adding this overhead >> (the two loop + the 1GB hole) just for it seems pointless. > > I disagree. It is always worth exercising these things, and this will > make sure we don't bake in assumptions about using a single contiguous > bank anywhere by mistake. > > I think you are overestimating the overhead and underestimating the > benefit.0 > >> BTW, is there any issue to create one big bank rather than 2? > > I expect it would work just fine, but I am not going to do that either > for the reasons already discussed. I'm not entirely convince of this hole... but it might help to also avoid assumption during guest migration: Acked-by: Julien Grall <julien.grall@linaro.org>
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c index 8775ca4..61f9ba6 100644 --- a/tools/libxc/xc_dom_arm.c +++ b/tools/libxc/xc_dom_arm.c @@ -247,10 +247,42 @@ static int set_mode(xc_interface *xch, domid_t domid, char *guest_type) return rc; } +static int populate_guest_memory(struct xc_dom_image *dom, + xen_pfn_t base_pfn, xen_pfn_t nr_pfns) +{ + int rc; + xen_pfn_t allocsz, pfn; + + if (!nr_pfns) + return 0; + + DOMPRINTF("%s: populating RAM @ %016"PRIx64"-%016"PRIx64" (%"PRId64"MB)", + __FUNCTION__, + (uint64_t)base_pfn << XC_PAGE_SHIFT, + (uint64_t)(base_pfn + nr_pfns) << XC_PAGE_SHIFT, + (uint64_t)nr_pfns >> (20-XC_PAGE_SHIFT)); + + for ( pfn = 0; pfn < nr_pfns; pfn++ ) + dom->p2m_host[pfn] = base_pfn + pfn; + + for ( pfn = rc = allocsz = 0; (pfn < nr_pfns) && !rc; pfn += allocsz ) + { + allocsz = nr_pfns - pfn; + if ( allocsz > 1024*1024 ) + allocsz = 1024*1024; + + rc = xc_domain_populate_physmap_exact( + dom->xch, dom->guest_domid, allocsz, + 0, 0, &dom->p2m_host[pfn]); + } + + return rc; +} + int arch_setup_meminit(struct xc_dom_image *dom) { int rc; - xen_pfn_t pfn, allocsz, i; + xen_pfn_t pfn; uint64_t modbase; /* Convenient */ @@ -259,6 +291,8 @@ int arch_setup_meminit(struct xc_dom_image *dom) const uint64_t ram0size = ramsize; const uint64_t ram0end = GUEST_RAM0_BASE + ram0size; + const xen_pfn_t p2m_size = (ram0end - GUEST_RAM0_BASE) >> XC_PAGE_SHIFT; + const uint64_t kernbase = dom->kernel_seg.vstart; const uint64_t kernend = ROUNDUP(dom->kernel_seg.vend, 21/*2MB*/); const uint64_t kernsize = kernend - kernbase; @@ -292,27 +326,17 @@ int arch_setup_meminit(struct xc_dom_image *dom) dom->shadow_enabled = 1; - dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages); + dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size); if ( dom->p2m_host == NULL ) return -EINVAL; + for ( pfn = 0; pfn < p2m_size; pfn++ ) + dom->p2m_host[pfn] = INVALID_MFN; - /* setup initial p2m */ - for ( pfn = 0; pfn < dom->total_pages; pfn++ ) - dom->p2m_host[pfn] = pfn + dom->rambase_pfn; - - /* allocate guest memory */ - for ( i = rc = allocsz = 0; - (i < dom->total_pages) && !rc; - i += allocsz ) - { - allocsz = dom->total_pages - i; - if ( allocsz > 1024*1024 ) - allocsz = 1024*1024; - - rc = xc_domain_populate_physmap_exact( - dom->xch, dom->guest_domid, allocsz, - 0, 0, &dom->p2m_host[i]); - } + /* setup initial p2m and allocate guest memory */ + if ((rc = populate_guest_memory(dom, + GUEST_RAM0_BASE >> XC_PAGE_SHIFT, + ram0size >> XC_PAGE_SHIFT))) + return rc; /* * We try to place dtb+initrd at 128MB or if we have less RAM
This will help when we have more guest RAM banks. Mostly code motion of the p2m_host initialisation and allocation loop into the new function populate_guest_memory, but in addition in the caller we now initialise the p2m all the INVALID_MFN to handle any holes, although in this patch we still fill in the entire allocated region. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- v2: New patch --- tools/libxc/xc_dom_arm.c | 62 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 43 insertions(+), 19 deletions(-)