Message ID | 1400464443-34816-1-git-send-email-wangnan0@huawei.com |
---|---|
State | New |
On Mon, May 19, 2014 at 02:54:03AM +0100, Wang Nan wrote:
> When SPARSEMEM and CRASH_DUMP are both selected, the simple pfn_valid prevents
> the second kernel from ioremapping the first kernel's memory if the address
> falls into the second kernel's section. This limitation requires the second
> kernel to occupy a full section, and elfcorehdr must reside in another section.
>
> This patch makes the crash dump kernel use the strict pfn_valid, removing this
> limitation.
>
> For example:
>
> For a platform with SECTION_SIZE_BITS == 28 (256MiB) and
> crashkernel=128M@0x28000000 in kernel cmdline, the second
> kernel is loaded at 0x28000000. Kexec puts elfcorehdr at
> 0x2ff00000, and passes 'elfcorehdr=0x2ff00000 mem=130048K' to
> second kernel. When the second kernel starts, it tries to use
> ioremap to retrieve its elfcorehdr. In this case, elfcorehdr is in the
> same section as the second kernel, so pfn_valid will recognize
> the page as valid, and ioremap will refuse to map it.

So isn't the issue here that you're passing an incorrect mem= parameter to the crash kernel?

Will
On 2014/5/20 0:09, Will Deacon wrote:
> On Mon, May 19, 2014 at 02:54:03AM +0100, Wang Nan wrote:
>> [...]
>
> So isn't the issue here that you're passing an incorrect mem= parameter
> to the crash kernel?

The mem= parameter is generated by kexec-tools according to /proc/iomem; it is the size of the reserved memory minus 1MiB. So I think what you mean is that I am passing an incorrect crashkernel= parameter?

I'll explain the limitations on crash kernel reserved memory when SPARSEMEM is enabled, and show how *impractical* the 'correct' crashkernel setting would be.

Take the RealView board as an example.

Limitation 1: the crash kernel reservation must be aligned to 0x08000000 (128MiB).

This is because zImage determines the final kernel address by (pc & 0xf8000000). If, for example, crashkernel=64M@0x29000000 is set, then the second kernel itself overwrites the first kernel's memory, and we'll lose some memory in /proc/vmcore.

Limitation 2: the crash kernel must reside in a different section from the first kernel.

This is because the second kernel uses ioremap to access the first kernel's memory, and ARM prevents a valid pfn from being ioremapped. This means a whole section must be reserved for the second kernel. On RealView, that is 256MiB.

Limitation 3: the last 1MiB of reserved memory must be ioremappable.

This is because the second kernel depends on kexec-tools passing an ELF header as 'elfcorehdr' to instruct it how to generate /proc/vmcore. See fs/proc/vmcore.c. Kexec-tools simply uses the last 1MiB for it. The second kernel uses ioremap to access it, which forces the header to be placed in another section.

On the RealView board, the only possible correct setting would be 'crashkernel=257M@0x20000000'. However, RealView has only 1GiB of memory, so the crash kernel consumes a quarter of it plus 1MiB. In addition, even with this parameter set, the crash kernel is still unusable because:

    crashkernel reservation failed - memory is in use (0x20000000)
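The arithmetic behind these limitations can be checked with a small userspace sketch. It is only an illustration of the numbers quoted in this thread, not kernel or kexec-tools code: the helper names (`elfcorehdr_addr`, `mem_param_kib`, `zimage_base`, `section_of`) are hypothetical, and the 256MiB section size is the RealView value assumed above.

```c
#include <stdint.h>

/* Assumption: 256MiB SPARSEMEM sections, as in the RealView example. */
#define SECTION_SIZE_BITS 28
#define MIB               (1u << 20)

/* Hypothetical helper: kexec-tools places elfcorehdr in the last 1MiB
 * of the crashkernel reservation (as described in the thread). */
uint32_t elfcorehdr_addr(uint32_t base, uint32_t size)
{
    return base + size - MIB;
}

/* Hypothetical helper: the mem= value in KiB, i.e. the reserved size
 * minus the 1MiB kept for elfcorehdr. */
uint32_t mem_param_kib(uint32_t size)
{
    return (size - MIB) / 1024u;
}

/* Base address derived from the program counter by the
 * (pc & 0xf8000000) rule mentioned under Limitation 1. */
uint32_t zimage_base(uint32_t pc)
{
    return pc & 0xf8000000u;
}

/* Index of the SPARSEMEM section that contains a physical address. */
uint32_t section_of(uint32_t phys)
{
    return phys >> SECTION_SIZE_BITS;
}
```

With crashkernel=128M@0x28000000, `elfcorehdr_addr()` gives 0x2ff00000 and `mem_param_kib()` gives 130048, matching the parameters quoted in the commit message, and both addresses fall in section 2. With crashkernel=64M@0x29000000, `zimage_base()` yields 0x28000000, which lies below the reservation. Only a layout such as 257M@0x20000000 pushes elfcorehdr (0x30000000) into a different section from the kernel.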
Hi Will,

What's your opinion about my explanation?

Thanks!

On 2014/5/20 11:22, Wang Nan wrote:
> [...]
Wang, Will

I'm now working on kdump support for arm64 on top of Geoff's kexec patch.

On 05/20/2014 12:22 PM, Wang Nan wrote:
> On 2014/5/20 0:09, Will Deacon wrote:
>> [...]
>> So isn't the issue here that you're passing an incorrect mem= parameter
>> to the crash kernel?
>
> The mem= parameter is generated by kexec-tools according to /proc/iomem; it is the size
> of the reserved memory minus 1MiB. So I think what you mean is that I am passing an incorrect
> crashkernel= parameter?

Just FYI, kexec-tools doesn't seem to be implemented in a proper way to support device-tree. Once device-tree is handled correctly, we won't need to pass the "mem=" parameter. (Of course, only on machines that support device-tree.)

> [...]
> Limitation 3: the last 1MiB of reserved memory must be ioremappable.
>
> This is because the second kernel depends on kexec-tools passing an ELF header as
> 'elfcorehdr' to instruct it how to generate /proc/vmcore. See fs/proc/vmcore.c. Kexec-tools
> simply uses the last 1MiB for it. The second kernel uses ioremap to access it, which forces
> the header to be placed in another section.

We can avoid "Limitation 3" just by implementing arm's own elfcorehdr_read() with memcpy(). I can submit a patch, but can't test it for now.

-Takahiro AKASHI
On 2014/5/29 12:39, AKASHI Takahiro wrote:
> [...]
> We can avoid "Limitation 3" just by implementing arm's own elfcorehdr_read() with memcpy().
> I can submit a patch, but can't test it for now.
>
> -Takahiro AKASHI

However, you still need pfn_valid to check whether elfcorehdr resides in a valid area. Furthermore, simply replacing ioremap with memcpy seems to break things: configurations that worked before the replacement will fail. In the end, you will find you still need the strict pfn_valid to decide whether to use ioremap or memcpy.
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index db3c541..795b1d4 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1800,7 +1800,7 @@ config ARCH_SELECT_MEMORY_MODEL
 	def_bool ARCH_SPARSEMEM_ENABLE
 
 config HAVE_ARCH_PFN_VALID
-	def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
+	def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM || CRASH_DUMP
 
 config HIGHMEM
 	bool "High Memory Support"
When SPARSEMEM and CRASH_DUMP are both selected, the simple pfn_valid prevents the second kernel from ioremapping the first kernel's memory if the address falls into the second kernel's section. This limitation requires the second kernel to occupy a full section, and elfcorehdr must reside in another section.

This patch makes the crash dump kernel use the strict pfn_valid, removing this limitation.

For example:

For a platform with SECTION_SIZE_BITS == 28 (256MiB) and crashkernel=128M@0x28000000 in kernel cmdline, the second kernel is loaded at 0x28000000. Kexec puts elfcorehdr at 0x2ff00000, and passes 'elfcorehdr=0x2ff00000 mem=130048K' to second kernel. When the second kernel starts, it tries to use ioremap to retrieve its elfcorehdr. In this case, elfcorehdr is in the same section as the second kernel, so pfn_valid will recognize the page as valid, and ioremap will refuse to map it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Geng Hui <hui.geng@huawei.com>
---
I have sent this patch once, but got no response. Resending with an updated commit message.
---
 arch/arm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
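To make the difference between the two checks concrete, here is a minimal userspace model of the example in the commit message. It is a sketch under stated assumptions, not the kernel implementation: `struct mem_region`, `pfn_valid_simple()` and `pfn_valid_strict()` are hypothetical names, the "simple" check is modelled as "some memory exists in the same section", and the "strict" check as "a registered memory region actually covers the page".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 28   /* 256MiB sections, as in the commit message */

/* Hypothetical stand-in for one registered memory region. */
struct mem_region {
    uint32_t base;
    uint32_t size;
};

/* Memory visible to the second kernel: mem=130048K starting at 0x28000000. */
static const struct mem_region second_kernel_mem[] = {
    { 0x28000000u, 130048u * 1024u },
};

#define NR_REGIONS (sizeof(second_kernel_mem) / sizeof(second_kernel_mem[0]))

/* Model of the simple check: a pfn counts as valid when any registered
 * memory lies in the same section, even if this page is not covered. */
bool pfn_valid_simple(uint32_t pfn)
{
    uint32_t section = pfn >> (SECTION_SIZE_BITS - PAGE_SHIFT);

    for (size_t i = 0; i < NR_REGIONS; i++) {
        const struct mem_region *r = &second_kernel_mem[i];
        uint32_t first = r->base >> SECTION_SIZE_BITS;
        uint32_t last = (r->base + r->size - 1u) >> SECTION_SIZE_BITS;

        if (section >= first && section <= last)
            return true;
    }
    return false;
}

/* Model of the strict check: a pfn is valid only when a registered
 * region really contains the page. */
bool pfn_valid_strict(uint32_t pfn)
{
    uint64_t phys = (uint64_t)pfn << PAGE_SHIFT;

    for (size_t i = 0; i < NR_REGIONS; i++) {
        const struct mem_region *r = &second_kernel_mem[i];
        uint64_t start = r->base;
        uint64_t end = start + r->size;

        if (phys >= start && phys < end)
            return true;
    }
    return false;
}

/* ioremap refuses pages that pfn_valid reports as valid RAM. */
bool ioremap_allowed(bool pfn_is_valid)
{
    return !pfn_is_valid;
}
```

For the elfcorehdr page at 0x2ff00000 (pfn 0x2ff00), the simple model reports the page as valid because it shares section 2 with the second kernel, so ioremap is refused. The strict model reports it as invalid because the 127MiB region ends exactly at 0x2ff00000, so ioremap is allowed.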