diff mbox series

x86: efi: avoid BUILD_BUG_ON() for non-constant p4d_index

Message ID 20210107223424.4135538-1-arnd@kernel.org
State New
Headers show
Series x86: efi: avoid BUILD_BUG_ON() for non-constant p4d_index | expand

Commit Message

Arnd Bergmann Jan. 7, 2021, 10:34 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>


When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':
efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT
and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),
so it only triggers for constant input.

Link: https://github.com/ClangBuiltLinux/linux/issues/256
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

---
 arch/x86/platform/efi/efi_64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.29.2

Comments

Nathan Chancellor Jan. 7, 2021, 10:42 p.m. UTC | #1
On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>

> 

> When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> 

> x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> 

> Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> so it only triggers for constant input.

> 

> Link: https://github.com/ClangBuiltLinux/linux/issues/256

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>


Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>

Tested-by: Nathan Chancellor <natechancellor@gmail.com>


> ---

>  arch/x86/platform/efi/efi_64.c | 4 ++--

>  1 file changed, 2 insertions(+), 2 deletions(-)

> 

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..62bb1616b4a5 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

>  	 * As with PGDs, we share all P4D entries apart from the one entry

>  	 * that covers the EFI runtime mapping space.

>  	 */

> -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> +	MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> +	MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

>  

>  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>  	pgd_k = pgd_offset_k(EFI_VA_END);

> -- 

> 2.29.2

>
Ard Biesheuvel Jan. 13, 2021, 5:51 p.m. UTC | #2
On Thu, 7 Jan 2021 at 23:34, Arnd Bergmann <arnd@kernel.org> wrote:
>

> From: Arnd Bergmann <arnd@arndb.de>

>

> When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

>

> x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

>

> Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> so it only triggers for constant input.

>

> Link: https://github.com/ClangBuiltLinux/linux/issues/256

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>


Acked-by: Ard Biesheuvel <ardb@kernel.org>


This can go via the x86 tree directly, IMO

> ---

>  arch/x86/platform/efi/efi_64.c | 4 ++--

>  1 file changed, 2 insertions(+), 2 deletions(-)

>

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..62bb1616b4a5 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

>          * As with PGDs, we share all P4D entries apart from the one entry

>          * that covers the EFI runtime mapping space.

>          */

> -       BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> +       MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> +       MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

>

>         pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>         pgd_k = pgd_offset_k(EFI_VA_END);

> --

> 2.29.2

>
Borislav Petkov Jan. 15, 2021, 6:23 p.m. UTC | #3
On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>

> 

> When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():


I have CONFIG_X86_5LEVEL=y, CONFIG_EFI=y and am using Debian clang
version 10.0.1-8+b1 but my .config builds just fine.

How do you trigger this?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Nathan Chancellor Jan. 15, 2021, 6:32 p.m. UTC | #4
On Fri, Jan 15, 2021 at 07:23:00PM +0100, Borislav Petkov wrote:
> On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:

> > From: Arnd Bergmann <arnd@arndb.de>

> > 

> > When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> 

> I have CONFIG_X86_5LEVEL=y, CONFIG_EFI=y and am using Debian clang

> version 10.0.1-8+b1 but my .config builds just fine.

> 

> How do you trigger this?


I triggered it with CONFIG_UBSAN=y + CONFIG_UBSAN_UNSIGNED_OVERFLOW=y
(it can be exposed with an allyesconfig/allmodconfig on mainline
currently).

Cheers,
Nathan
Borislav Petkov Jan. 15, 2021, 7:07 p.m. UTC | #5
On Fri, Jan 15, 2021 at 11:32:03AM -0700, Nathan Chancellor wrote:
> I triggered it with CONFIG_UBSAN=y + CONFIG_UBSAN_UNSIGNED_OVERFLOW=y

> (it can be exposed with an allyesconfig/allmodconfig on mainline

> currently).


Yah, I can trigger with that, thanks.

But I'll be damned, check this out:

clang preprocesses to this:

 do { extern void __compiletime_assert_332(void) ; if (!(!(p4d_index((-68 * ((1UL) << 30))) != p4d_index((0xffffffffff000000UL))))) __compiletime_assert_332(); } while (0);

The resulting asm is:

.LBB1_32:
        movabsq $-73014444032, %r13     # imm = 0xFFFFFFEF00000000
        testb   $1, %al
        jne     .LBB1_33
.LBB1_34:
        xorl    %r14d, %ebx
        testl   $33554431, %ebx         # imm = 0x1FFFFFF
        je      .LBB1_36
# %bb.35:
        callq   __compiletime_assert_332

so the undefined symbol is there, leading to:

ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':
/home/boris/kernel/linux/arch/x86/platform/efi/efi_64.c:140: undefined reference to `__compiletime_assert_332'

Now look at gcc:

It preprocesses to:

 do { extern void __compiletime_assert_332(void) __attribute__((__error__("BUILD_BUG_ON failed: " "p4d_index(EFI_VA_END) != p4d_index(MODULES_END)"))); if (!(!(p4d_index((-68 * ((1UL) << 30))) != p4d_index((0xffffffffff000000UL))))) __compiletime_assert_332(); } while (0);


Resulting asm:

$ grep __compiletime_assert_332  arch/x86/platform/efi/efi_64.s
$

That thing has been optimized away!

Which means, those build assertions are gone on gcc and they don't catch
diddly squat. I sure hope I'm missing something here...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Arvind Sankar Jan. 15, 2021, 7:07 p.m. UTC | #6
On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>

> 

> When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> 

> x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> 

> Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> so it only triggers for constant input.

> 

> Link: https://github.com/ClangBuiltLinux/linux/issues/256

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> ---

>  arch/x86/platform/efi/efi_64.c | 4 ++--

>  1 file changed, 2 insertions(+), 2 deletions(-)

> 

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..62bb1616b4a5 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

>  	 * As with PGDs, we share all P4D entries apart from the one entry

>  	 * that covers the EFI runtime mapping space.

>  	 */

> -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> +	MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> +	MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

>  

>  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>  	pgd_k = pgd_offset_k(EFI_VA_END);

> -- 

> 2.29.2

> 


I think this needs more explanation as to why clang is triggering this.
The issue mentions clang not inline p4d_index(), and I guess not
performing inter-procedural analysis either?

For the second assertion there, everything is always constant AFAICT:
EFI_VA_START, EFI_VA_END and P4D_MASK are all constants regardless of
CONFIG_5LEVEL.

For the first assertion, it isn't technically constant, but if
p4d_index() gets inlined, the compiler should be able to see that the
two are always equal, even though ptrs_per_p4d is not constant:
	EFI_VA_END >> 39 == MODULES_END >> 39
so the masking with ptrs_per_p4d-1 doesn't matter for the comparison.

As a matter of fact, it seems like the four assertions could be combined
into:
	BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));
	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));
instead of separately asserting they're the same PGD entry and the same
P4D entry.

Thanks.
Arvind Sankar Jan. 15, 2021, 7:11 p.m. UTC | #7
On Fri, Jan 15, 2021 at 08:07:29PM +0100, Borislav Petkov wrote:
> On Fri, Jan 15, 2021 at 11:32:03AM -0700, Nathan Chancellor wrote:

> > I triggered it with CONFIG_UBSAN=y + CONFIG_UBSAN_UNSIGNED_OVERFLOW=y

> > (it can be exposed with an allyesconfig/allmodconfig on mainline

> > currently).

> 

> Yah, I can trigger with that, thanks.

> 

> But I'll be damned, check this out:

> 

> clang preprocesses to this:

> 

>  do { extern void __compiletime_assert_332(void) ; if (!(!(p4d_index((-68 * ((1UL) << 30))) != p4d_index((0xffffffffff000000UL))))) __compiletime_assert_332(); } while (0);

> 

> The resulting asm is:

> 

> .LBB1_32:

>         movabsq $-73014444032, %r13     # imm = 0xFFFFFFEF00000000

>         testb   $1, %al

>         jne     .LBB1_33

> .LBB1_34:

>         xorl    %r14d, %ebx

>         testl   $33554431, %ebx         # imm = 0x1FFFFFF

>         je      .LBB1_36

> # %bb.35:

>         callq   __compiletime_assert_332

> 

> so the undefined symbol is there, leading to:

> 

> ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> /home/boris/kernel/linux/arch/x86/platform/efi/efi_64.c:140: undefined reference to `__compiletime_assert_332'

> 

> Now look at gcc:

> 

> It preprocesses to:

> 

>  do { extern void __compiletime_assert_332(void) __attribute__((__error__("BUILD_BUG_ON failed: " "p4d_index(EFI_VA_END) != p4d_index(MODULES_END)"))); if (!(!(p4d_index((-68 * ((1UL) << 30))) != p4d_index((0xffffffffff000000UL))))) __compiletime_assert_332(); } while (0);

> 

> 

> Resulting asm:

> 

> $ grep __compiletime_assert_332  arch/x86/platform/efi/efi_64.s

> $

> 

> That thing has been optimized away!

> 

> Which means, those build assertions are gone on gcc and they don't catch

> diddly squat. I sure hope I'm missing something here...


That's how build-time assertions work: they are _supposed_ to be
optimized away completely when the assertion is true. If they're
_not_ optimized away, the build will fail.
Borislav Petkov Jan. 15, 2021, 7:18 p.m. UTC | #8
On Fri, Jan 15, 2021 at 02:11:25PM -0500, Arvind Sankar wrote:
> That's how build-time assertions work: they are _supposed_ to be

> optimized away completely when the assertion is true. If they're

> _not_ optimized away, the build will fail.


Yah, that I know, thanks.

If gcc really inlines p4d_index() and does a lot more aggressive
optimization to determine that the condition is false and thus optimize
everything away (and clang doesn't), then that would explain the
observation.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Arnd Bergmann Jan. 15, 2021, 7:54 p.m. UTC | #9
On Fri, Jan 15, 2021 at 8:18 PM Borislav Petkov <bp@alien8.de> wrote:
>

> On Fri, Jan 15, 2021 at 02:11:25PM -0500, Arvind Sankar wrote:

> > That's how build-time assertions work: they are _supposed_ to be

> > optimized away completely when the assertion is true. If they're

> > _not_ optimized away, the build will fail.

>

> Yah, that I know, thanks.

>

> If gcc really inlines p4d_index() and does a lot more aggressive

> optimization to determine that the condition is false and thus optimize

> everything away (and clang doesn't), then that would explain the

> observation.


One difference is that gcc does not have
-fsanitize=unsigned-integer-overflow at all, and I don't see the
assertion without that on clang either, so it's possible that clang
behaves as designed here.

The description is:
    -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where
     the result of an unsigned integer computation cannot be represented in
     its type. Unlike signed integer overflow, this is not undefined behavior,
     but it is often unintentional. This sanitizer does not check for
lossy implicit
     conversions performed before such a computation (see
    -fsanitize=implicit-conversion).

The "-68 * ((1UL) << 30" computation does overflow an unsigned long
as intended, right? Maybe this is enough for the ubsan code in clang to
just disable some of the optimization steps that the assertion relies on.

        Arnd
Arvind Sankar Jan. 15, 2021, 8:12 p.m. UTC | #10
On Fri, Jan 15, 2021 at 08:54:18PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 15, 2021 at 8:18 PM Borislav Petkov <bp@alien8.de> wrote:

> >

> > On Fri, Jan 15, 2021 at 02:11:25PM -0500, Arvind Sankar wrote:

> > > That's how build-time assertions work: they are _supposed_ to be

> > > optimized away completely when the assertion is true. If they're

> > > _not_ optimized away, the build will fail.

> >

> > Yah, that I know, thanks.

> >

> > If gcc really inlines p4d_index() and does a lot more aggressive

> > optimization to determine that the condition is false and thus optimize

> > everything away (and clang doesn't), then that would explain the

> > observation.

> 

> One difference is that gcc does not have

> -fsanitize=unsigned-integer-overflow at all, and I don't see the

> assertion without that on clang either, so it's possible that clang

> behaves as designed here.

> 

> The description is:

>     -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where

>      the result of an unsigned integer computation cannot be represented in

>      its type. Unlike signed integer overflow, this is not undefined behavior,

>      but it is often unintentional. This sanitizer does not check for

> lossy implicit

>      conversions performed before such a computation (see

>     -fsanitize=implicit-conversion).

> 

> The "-68 * ((1UL) << 30" computation does overflow an unsigned long

> as intended, right? Maybe this is enough for the ubsan code in clang to

> just disable some of the optimization steps that the assertion relies on.

> 

>         Arnd


That does seem to be an overflow, but that happens at compile-time.
Maybe
	AC(-68, UL) << 30
would be a better definition to avoid overflow.

The real issue might be (ptrs_per_p4d - 1) which can overflow at
run-time, and maybe the added ubsan code makes p4d_index() not worth
inlining according to clang?
Arvind Sankar Jan. 15, 2021, 8:27 p.m. UTC | #11
On Fri, Jan 15, 2021 at 02:07:51PM -0500, Arvind Sankar wrote:
> On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:

> > From: Arnd Bergmann <arnd@arndb.de>

> > 

> > When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> > 

> > x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> > efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> > 

> > Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> > and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> > so it only triggers for constant input.

> > 

> > Link: https://github.com/ClangBuiltLinux/linux/issues/256

> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> > ---

> >  arch/x86/platform/efi/efi_64.c | 4 ++--

> >  1 file changed, 2 insertions(+), 2 deletions(-)

> > 

> > diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> > index e1e8d4e3a213..62bb1616b4a5 100644

> > --- a/arch/x86/platform/efi/efi_64.c

> > +++ b/arch/x86/platform/efi/efi_64.c

> > @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

> >  	 * As with PGDs, we share all P4D entries apart from the one entry

> >  	 * that covers the EFI runtime mapping space.

> >  	 */

> > -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > +	MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > +	MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> >  

> >  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

> >  	pgd_k = pgd_offset_k(EFI_VA_END);

> > -- 

> > 2.29.2

> > 

> 

> I think this needs more explanation as to why clang is triggering this.

> The issue mentions clang not inline p4d_index(), and I guess not

> performing inter-procedural analysis either?

> 

> For the second assertion there, everything is always constant AFAICT:

> EFI_VA_START, EFI_VA_END and P4D_MASK are all constants regardless of

> CONFIG_5LEVEL.

> 

> For the first assertion, it isn't technically constant, but if

> p4d_index() gets inlined, the compiler should be able to see that the

> two are always equal, even though ptrs_per_p4d is not constant:

> 	EFI_VA_END >> 39 == MODULES_END >> 39

> so the masking with ptrs_per_p4d-1 doesn't matter for the comparison.

> 

> As a matter of fact, it seems like the four assertions could be combined

> into:

> 	BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> 	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> instead of separately asserting they're the same PGD entry and the same

> P4D entry.

> 

> Thanks.


I actually don't quite get the MODULES_END check -- Ard, do you know
what that's for?

What we really should be checking is that EFI_VA_START is in the top-most
PGD entry and the top-most P4D entry, since we only copy PGD/P4D entries
before EFI_VA_END, but not after EFI_VA_START. So the checks should
really be
	BUILD_BUG_ON(((EFI_VA_START - 1) & P4D_MASK) != (-1ul & P4D_MASK));
	BUILD_BUG_ON(((EFI_VA_START - 1) & P4D_MASK) != (EFI_VA_END & P4D_MASK));
imo. I guess that's what using MODULES_END is effectively checking, but
it would be clearer to check it directly.
Arvind Sankar Jan. 15, 2021, 8:32 p.m. UTC | #12
On Fri, Jan 15, 2021 at 03:12:25PM -0500, Arvind Sankar wrote:
> On Fri, Jan 15, 2021 at 08:54:18PM +0100, Arnd Bergmann wrote:

> > On Fri, Jan 15, 2021 at 8:18 PM Borislav Petkov <bp@alien8.de> wrote:

> > >

> > > On Fri, Jan 15, 2021 at 02:11:25PM -0500, Arvind Sankar wrote:

> > > > That's how build-time assertions work: they are _supposed_ to be

> > > > optimized away completely when the assertion is true. If they're

> > > > _not_ optimized away, the build will fail.

> > >

> > > Yah, that I know, thanks.

> > >

> > > If gcc really inlines p4d_index() and does a lot more aggressive

> > > optimization to determine that the condition is false and thus optimize

> > > everything away (and clang doesn't), then that would explain the

> > > observation.

> > 

> > One difference is that gcc does not have

> > -fsanitize=unsigned-integer-overflow at all, and I don't see the

> > assertion without that on clang either, so it's possible that clang

> > behaves as designed here.

> > 

> > The description is:

> >     -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where

> >      the result of an unsigned integer computation cannot be represented in

> >      its type. Unlike signed integer overflow, this is not undefined behavior,

> >      but it is often unintentional. This sanitizer does not check for

> > lossy implicit

> >      conversions performed before such a computation (see

> >     -fsanitize=implicit-conversion).

> > 

> > The "-68 * ((1UL) << 30" computation does overflow an unsigned long

> > as intended, right? Maybe this is enough for the ubsan code in clang to

> > just disable some of the optimization steps that the assertion relies on.

> > 

> >         Arnd

> 

> That does seem to be an overflow, but that happens at compile-time.

> Maybe

> 	AC(-68, UL) << 30

> would be a better definition to avoid overflow.


Eh, that's an overflow too, isn't it :( Is this option really useful
with kernel code -- this sort of thing is probably done all over the
place?

> 

> The real issue might be (ptrs_per_p4d - 1) which can overflow at

> run-time, and maybe the added ubsan code makes p4d_index() not worth

> inlining according to clang?
Ard Biesheuvel Jan. 16, 2021, 4:34 p.m. UTC | #13
On Fri, 15 Jan 2021 at 21:27, Arvind Sankar <nivedita@alum.mit.edu> wrote:
>

> On Fri, Jan 15, 2021 at 02:07:51PM -0500, Arvind Sankar wrote:

> > On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:

> > > From: Arnd Bergmann <arnd@arndb.de>

> > >

> > > When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> > >

> > > x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> > > efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> > >

> > > Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> > > and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> > > so it only triggers for constant input.

> > >

> > > Link: https://github.com/ClangBuiltLinux/linux/issues/256

> > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> > > ---

> > >  arch/x86/platform/efi/efi_64.c | 4 ++--

> > >  1 file changed, 2 insertions(+), 2 deletions(-)

> > >

> > > diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> > > index e1e8d4e3a213..62bb1616b4a5 100644

> > > --- a/arch/x86/platform/efi/efi_64.c

> > > +++ b/arch/x86/platform/efi/efi_64.c

> > > @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

> > >      * As with PGDs, we share all P4D entries apart from the one entry

> > >      * that covers the EFI runtime mapping space.

> > >      */

> > > -   BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > > -   BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > +   MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > > +   MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > >

> > >     pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

> > >     pgd_k = pgd_offset_k(EFI_VA_END);

> > > --

> > > 2.29.2

> > >

> >

> > I think this needs more explanation as to why clang is triggering this.

> > The issue mentions clang not inline p4d_index(), and I guess not

> > performing inter-procedural analysis either?

> >

> > For the second assertion there, everything is always constant AFAICT:

> > EFI_VA_START, EFI_VA_END and P4D_MASK are all constants regardless of

> > CONFIG_5LEVEL.

> >

> > For the first assertion, it isn't technically constant, but if

> > p4d_index() gets inlined, the compiler should be able to see that the

> > two are always equal, even though ptrs_per_p4d is not constant:

> >       EFI_VA_END >> 39 == MODULES_END >> 39

> > so the masking with ptrs_per_p4d-1 doesn't matter for the comparison.

> >

> > As a matter of fact, it seems like the four assertions could be combined

> > into:

> >       BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> >       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > instead of separately asserting they're the same PGD entry and the same

> > P4D entry.

> >

> > Thanks.

>

> I actually don't quite get the MODULES_END check -- Ard, do you know

> what that's for?

>


Maybe Boris remembers? He wrote the original code for the 'new' EFI
page table layout.


> What we really should be checking is that EFI_VA_START is in the top-most

> PGD entry and the top-most P4D entry, since we only copy PGD/P4D entries

> before EFI_VA_END, but not after EFI_VA_START. So the checks should

> really be

>         BUILD_BUG_ON(((EFI_VA_START - 1) & P4D_MASK) != (-1ul & P4D_MASK));

>         BUILD_BUG_ON(((EFI_VA_START - 1) & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> imo. I guess that's what using MODULES_END is effectively checking, but

> it would be clearer to check it directly.


This obviously needs a comment, but checking that everything lives in
the top 512 GB of the kernel VA space seems sufficient to me,
Borislav Petkov Jan. 18, 2021, 8:24 p.m. UTC | #14
On Sat, Jan 16, 2021 at 05:34:27PM +0100, Ard Biesheuvel wrote:
> On Fri, 15 Jan 2021 at 21:27, Arvind Sankar <nivedita@alum.mit.edu> wrote:

> >

> > On Fri, Jan 15, 2021 at 02:07:51PM -0500, Arvind Sankar wrote:

> > > On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:

> > > > From: Arnd Bergmann <arnd@arndb.de>

> > > >

> > > > When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> > > >

> > > > x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> > > > efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> > > >

> > > > Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> > > > and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> > > > so it only triggers for constant input.

> > > >

> > > > Link: https://github.com/ClangBuiltLinux/linux/issues/256

> > > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> > > > ---

> > > >  arch/x86/platform/efi/efi_64.c | 4 ++--

> > > >  1 file changed, 2 insertions(+), 2 deletions(-)

> > > >

> > > > diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> > > > index e1e8d4e3a213..62bb1616b4a5 100644

> > > > --- a/arch/x86/platform/efi/efi_64.c

> > > > +++ b/arch/x86/platform/efi/efi_64.c

> > > > @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

> > > >      * As with PGDs, we share all P4D entries apart from the one entry

> > > >      * that covers the EFI runtime mapping space.

> > > >      */

> > > > -   BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > > > -   BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > > +   MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > > > +   MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > >

> > > >     pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

> > > >     pgd_k = pgd_offset_k(EFI_VA_END);

> > > > --

> > > > 2.29.2

> > > >

> > >

> > > I think this needs more explanation as to why clang is triggering this.

> > > The issue mentions clang not inline p4d_index(), and I guess not

> > > performing inter-procedural analysis either?

> > >

> > > For the second assertion there, everything is always constant AFAICT:

> > > EFI_VA_START, EFI_VA_END and P4D_MASK are all constants regardless of

> > > CONFIG_5LEVEL.

> > >

> > > For the first assertion, it isn't technically constant, but if

> > > p4d_index() gets inlined, the compiler should be able to see that the

> > > two are always equal, even though ptrs_per_p4d is not constant:

> > >       EFI_VA_END >> 39 == MODULES_END >> 39

> > > so the masking with ptrs_per_p4d-1 doesn't matter for the comparison.

> > >

> > > As a matter of fact, it seems like the four assertions could be combined

> > > into:

> > >       BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> > >       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > instead of separately asserting they're the same PGD entry and the same

> > > P4D entry.

> > >

> > > Thanks.

> >

> > I actually don't quite get the MODULES_END check -- Ard, do you know

> > what that's for?

> >

> 

> Maybe Boris remembers? He wrote the original code for the 'new' EFI

> page table layout.


That was added by Kirill for 5-level pgtables:

  e981316f5604 ("x86/efi: Add 5-level paging support")

 Documentation/x86/x86_64/mm.rst should explain the pagetable layout:

   ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
   ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
   ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
   ffffffff80000000 |-2048    MB |                  |         |
   ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
   ffffffffff000000 |  -16    MB |                  |         |
      FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset

That thing which starts at -512 GB above is the last PGD on the
pagetable. In it, between -4G and -68G there are 64G which are the EFI
region mapping space for runtime services.

Frankly I'm not sure what this thing is testing because the EFI VA range
is hardcoded and I can't imagine it being somewhere else *except* in the
last PGD.

Lemme add Kirill for clarification.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Arvind Sankar Jan. 18, 2021, 9:42 p.m. UTC | #15
On Mon, Jan 18, 2021 at 09:24:09PM +0100, Borislav Petkov wrote:
> > > > As a matter of fact, it seems like the four assertions could be combined

> > > > into:

> > > >       BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> > > >       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > > instead of separately asserting they're the same PGD entry and the same

> > > > P4D entry.

> > > >

> > > > Thanks.

> > >

> > > I actually don't quite get the MODULES_END check -- Ard, do you know

> > > what that's for?

> > >

> > 

> > Maybe Boris remembers? He wrote the original code for the 'new' EFI

> > page table layout.

> 

> That was added by Kirill for 5-level pgtables:

> 

>   e981316f5604 ("x86/efi: Add 5-level paging support")


That just duplicates the existing pgd_index() check for the p4d_index()
as well. It looks like the original commit adding
efi_sync_low_kernel_mappings() used to copy upto the PGD entry including
MODULES_END:
  d2f7cbe7b26a7 ("x86/efi: Runtime services virtual mapping")
and then Matt changed that when creating efi_mm:
  67a9108ed4313 ("x86/efi: Build our own page table structures")
to use EFI_VA_END instead but have a check that EFI_VA_END is in the
same entry as MODULES_END.

AFAICT, MODULES_END is only relevant as being something that happens to
be in the top 512GiB, and -1ul would be clearer.

> 

>  Documentation/x86/x86_64/mm.rst should explain the pagetable layout:

> 

>    ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole

>    ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space

>    ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole

>    ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0

>    ffffffff80000000 |-2048    MB |                  |         |

>    ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space

>    ffffffffff000000 |  -16    MB |                  |         |

>       FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset

> 

> That thing which starts at -512 GB above is the last PGD on the

> pagetable. In it, between -4G and -68G there are 64G which are the EFI

> region mapping space for runtime services.

> 

> Frankly I'm not sure what this thing is testing because the EFI VA range

> is hardcoded and I can't imagine it being somewhere else *except* in the

> last PGD.


It's just so that someone doesn't just change the #define's for
EFI_VA_END/START and think that it will work, I guess.

Another reasonable option, for example, would be to reserve an entire
PGD entry, allowing everything but the PGD level to be shared, and
adding the EFI PGD to the pgd_list and getting rid of
efi_sync_low_kernel_mappings() altogether. There aren't that many PGD
entries still unused though, so this is probably not worth it.

> 

> Lemme add Kirill for clarification.

> 

> -- 

> Regards/Gruss,

>     Boris.

> 

> https://people.kernel.org/tglx/notes-about-netiquette
Ard Biesheuvel Jan. 20, 2021, 9:33 a.m. UTC | #16
On Mon, 18 Jan 2021 at 22:42, Arvind Sankar <nivedita@alum.mit.edu> wrote:
>

> On Mon, Jan 18, 2021 at 09:24:09PM +0100, Borislav Petkov wrote:

> > > > > As a matter of fact, it seems like the four assertions could be combined

> > > > > into:

> > > > >       BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> > > > >       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > > > instead of separately asserting they're the same PGD entry and the same

> > > > > P4D entry.

> > > > >

> > > > > Thanks.

> > > >

> > > > I actually don't quite get the MODULES_END check -- Ard, do you know

> > > > what that's for?

> > > >

> > >

> > > Maybe Boris remembers? He wrote the original code for the 'new' EFI

> > > page table layout.

> >

> > That was added by Kirill for 5-level pgtables:

> >

> >   e981316f5604 ("x86/efi: Add 5-level paging support")

>

> That just duplicates the existing pgd_index() check for the p4d_index()

> as well. It looks like the original commit adding

> efi_sync_low_kernel_mappings() used to copy upto the PGD entry including

> MODULES_END:

>   d2f7cbe7b26a7 ("x86/efi: Runtime services virtual mapping")

> and then Matt changed that when creating efi_mm:

>   67a9108ed4313 ("x86/efi: Build our own page table structures")

> to use EFI_VA_END instead but have a check that EFI_VA_END is in the

> same entry as MODULES_END.

>

> AFAICT, MODULES_END is only relevant as being something that happens to

> be in the top 512GiB, and -1ul would be clearer.

>

> >

> >  Documentation/x86/x86_64/mm.rst should explain the pagetable layout:

> >

> >    ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole

> >    ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space

> >    ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole

> >    ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0

> >    ffffffff80000000 |-2048    MB |                  |         |

> >    ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space

> >    ffffffffff000000 |  -16    MB |                  |         |

> >       FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset

> >

> > That thing which starts at -512 GB above is the last PGD on the

> > pagetable. In it, between -4G and -68G there are 64G which are the EFI

> > region mapping space for runtime services.

> >

> > Frankly I'm not sure what this thing is testing because the EFI VA range

> > is hardcoded and I can't imagine it being somewhere else *except* in the

> > last PGD.

>

> It's just so that someone doesn't just change the #define's for

> EFI_VA_END/START and think that it will work, I guess.

>

> Another reasonable option, for example, would be to reserve an entire

> PGD entry, allowing everything but the PGD level to be shared, and

> adding the EFI PGD to the pgd_list and getting rid of

> efi_sync_low_kernel_mappings() altogether. There aren't that many PGD

> entries still unused though, so this is probably not worth it.

>


The churn doesn't seem to be worth it, tbh.

So could we get rid of the complexity here, and only build_bug() on
the start address of the EFI region being outside the topmost p4d?
That should make the PGD test redundant as well.
Kirill A. Shutemov Jan. 20, 2021, 11:06 a.m. UTC | #17
On Fri, Jan 15, 2021 at 02:07:51PM -0500, Arvind Sankar wrote:
> On Thu, Jan 07, 2021 at 11:34:15PM +0100, Arnd Bergmann wrote:

> > From: Arnd Bergmann <arnd@arndb.de>

> > 

> > When 5-level page tables are enabled, clang triggers a BUILD_BUG_ON():

> > 

> > x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

> > efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> > 

> > Use the same method as in commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT

> > and PTRS_PER_P4D variable") and change it to MAYBE_BUILD_BUG_ON(),

> > so it only triggers for constant input.

> > 

> > Link: https://github.com/ClangBuiltLinux/linux/issues/256

> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> > ---

> >  arch/x86/platform/efi/efi_64.c | 4 ++--

> >  1 file changed, 2 insertions(+), 2 deletions(-)

> > 

> > diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> > index e1e8d4e3a213..62bb1616b4a5 100644

> > --- a/arch/x86/platform/efi/efi_64.c

> > +++ b/arch/x86/platform/efi/efi_64.c

> > @@ -137,8 +137,8 @@ void efi_sync_low_kernel_mappings(void)

> >  	 * As with PGDs, we share all P4D entries apart from the one entry

> >  	 * that covers the EFI runtime mapping space.

> >  	 */

> > -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > +	MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> > +	MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> >  

> >  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

> >  	pgd_k = pgd_offset_k(EFI_VA_END);

> > -- 

> > 2.29.2

> > 

> 

> I think this needs more explanation as to why clang is triggering this.

> The issue mentions clang not inline p4d_index(), and I guess not

> performing inter-procedural analysis either?

> 

> For the second assertion there, everything is always constant AFAICT:

> EFI_VA_START, EFI_VA_END and P4D_MASK are all constants regardless of

> CONFIG_5LEVEL.


Back in the days BUILD_BUG_ON() false-triggered on GCC-4.8 as well. 

-- 
 Kirill A. Shutemov
Kirill A. Shutemov Jan. 20, 2021, 11:26 a.m. UTC | #18
On Mon, Jan 18, 2021 at 04:42:20PM -0500, Arvind Sankar wrote:
> AFAICT, MODULES_END is only relevant as being something that happens to

> be in the top 512GiB, and -1ul would be clearer.


I think you are right. But -1UL is not very self-descriptive. :/

-- 
 Kirill A. Shutemov
Borislav Petkov Jan. 20, 2021, 11:44 a.m. UTC | #19
On Wed, Jan 20, 2021 at 10:33:43AM +0100, Ard Biesheuvel wrote:
> The churn doesn't seem to be worth it, tbh.

> 

> So could we get rid of the complexity here, and only build_bug() on

> the start address of the EFI region being outside the topmost p4d?

> That should make the PGD test redundant as well.


Yah, makes sense to me.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Nathan Chancellor Feb. 3, 2021, 6:51 p.m. UTC | #20
On Wed, Jan 20, 2021 at 10:33:43AM +0100, Ard Biesheuvel wrote:
> On Mon, 18 Jan 2021 at 22:42, Arvind Sankar <nivedita@alum.mit.edu> wrote:

> >

> > On Mon, Jan 18, 2021 at 09:24:09PM +0100, Borislav Petkov wrote:

> > > > > > As a matter of fact, it seems like the four assertions could be combined

> > > > > > into:

> > > > > >       BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> > > > > >       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > > > > instead of separately asserting they're the same PGD entry and the same

> > > > > > P4D entry.

> > > > > >

> > > > > > Thanks.

> > > > >

> > > > > I actually don't quite get the MODULES_END check -- Ard, do you know

> > > > > what that's for?

> > > > >

> > > >

> > > > Maybe Boris remembers? He wrote the original code for the 'new' EFI

> > > > page table layout.

> > >

> > > That was added by Kirill for 5-level pgtables:

> > >

> > >   e981316f5604 ("x86/efi: Add 5-level paging support")

> >

> > That just duplicates the existing pgd_index() check for the p4d_index()

> > as well. It looks like the original commit adding

> > efi_sync_low_kernel_mappings() used to copy upto the PGD entry including

> > MODULES_END:

> >   d2f7cbe7b26a7 ("x86/efi: Runtime services virtual mapping")

> > and then Matt changed that when creating efi_mm:

> >   67a9108ed4313 ("x86/efi: Build our own page table structures")

> > to use EFI_VA_END instead but have a check that EFI_VA_END is in the

> > same entry as MODULES_END.

> >

> > AFAICT, MODULES_END is only relevant as being something that happens to

> > be in the top 512GiB, and -1ul would be clearer.

> >

> > >

> > >  Documentation/x86/x86_64/mm.rst should explain the pagetable layout:

> > >

> > >    ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole

> > >    ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space

> > >    ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole

> > >    ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0

> > >    ffffffff80000000 |-2048    MB |                  |         |

> > >    ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space

> > >    ffffffffff000000 |  -16    MB |                  |         |

> > >       FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset

> > >

> > > That thing which starts at -512 GB above is the last PGD on the

> > > pagetable. In it, between -4G and -68G there are 64G which are the EFI

> > > region mapping space for runtime services.

> > >

> > > Frankly I'm not sure what this thing is testing because the EFI VA range

> > > is hardcoded and I can't imagine it being somewhere else *except* in the

> > > last PGD.

> >

> > It's just so that someone doesn't just change the #define's for

> > EFI_VA_END/START and think that it will work, I guess.

> >

> > Another reasonable option, for example, would be to reserve an entire

> > PGD entry, allowing everything but the PGD level to be shared, and

> > adding the EFI PGD to the pgd_list and getting rid of

> > efi_sync_low_kernel_mappings() altogether. There aren't that many PGD

> > entries still unused though, so this is probably not worth it.

> >

> 

> The churn doesn't seem to be worth it, tbh.

> 

> So could we get rid of the complexity here, and only build_bug() on

> the start address of the EFI region being outside the topmost p4d?

> That should make the PGD test redundant as well.


Was there ever a resolution to this conversation or a patch sent? I am
still seeing the build failure that Arnd initially sent the patch for.
x86_64 all{mod,yes}config with clang are going to ship broken in 5.11.

Cheers,
Nathan
Ard Biesheuvel Feb. 3, 2021, 8:29 p.m. UTC | #21
On Wed, 3 Feb 2021 at 19:51, Nathan Chancellor <nathan@kernel.org> wrote:
>

> On Wed, Jan 20, 2021 at 10:33:43AM +0100, Ard Biesheuvel wrote:

> > On Mon, 18 Jan 2021 at 22:42, Arvind Sankar <nivedita@alum.mit.edu> wrote:

> > >

> > > On Mon, Jan 18, 2021 at 09:24:09PM +0100, Borislav Petkov wrote:

> > > > > > > As a matter of fact, it seems like the four assertions could be combined

> > > > > > > into:

> > > > > > >       BUILD_BUG_ON((EFI_VA_END & P4D_MASK) != (MODULES_END & P4D_MASK));

> > > > > > >       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> > > > > > > instead of separately asserting they're the same PGD entry and the same

> > > > > > > P4D entry.

> > > > > > >

> > > > > > > Thanks.

> > > > > >

> > > > > > I actually don't quite get the MODULES_END check -- Ard, do you know

> > > > > > what that's for?

> > > > > >

> > > > >

> > > > > Maybe Boris remembers? He wrote the original code for the 'new' EFI

> > > > > page table layout.

> > > >

> > > > That was added by Kirill for 5-level pgtables:

> > > >

> > > >   e981316f5604 ("x86/efi: Add 5-level paging support")

> > >

> > > That just duplicates the existing pgd_index() check for the p4d_index()

> > > as well. It looks like the original commit adding

> > > efi_sync_low_kernel_mappings() used to copy upto the PGD entry including

> > > MODULES_END:

> > >   d2f7cbe7b26a7 ("x86/efi: Runtime services virtual mapping")

> > > and then Matt changed that when creating efi_mm:

> > >   67a9108ed4313 ("x86/efi: Build our own page table structures")

> > > to use EFI_VA_END instead but have a check that EFI_VA_END is in the

> > > same entry as MODULES_END.

> > >

> > > AFAICT, MODULES_END is only relevant as being something that happens to

> > > be in the top 512GiB, and -1ul would be clearer.

> > >

> > > >

> > > >  Documentation/x86/x86_64/mm.rst should explain the pagetable layout:

> > > >

> > > >    ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole

> > > >    ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space

> > > >    ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole

> > > >    ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0

> > > >    ffffffff80000000 |-2048    MB |                  |         |

> > > >    ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space

> > > >    ffffffffff000000 |  -16    MB |                  |         |

> > > >       FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset

> > > >

> > > > That thing which starts at -512 GB above is the last PGD on the

> > > > pagetable. In it, between -4G and -68G there are 64G which are the EFI

> > > > region mapping space for runtime services.

> > > >

> > > > Frankly I'm not sure what this thing is testing because the EFI VA range

> > > > is hardcoded and I can't imagine it being somewhere else *except* in the

> > > > last PGD.

> > >

> > > It's just so that someone doesn't just change the #define's for

> > > EFI_VA_END/START and think that it will work, I guess.

> > >

> > > Another reasonable option, for example, would be to reserve an entire

> > > PGD entry, allowing everything but the PGD level to be shared, and

> > > adding the EFI PGD to the pgd_list and getting rid of

> > > efi_sync_low_kernel_mappings() altogether. There aren't that many PGD

> > > entries still unused though, so this is probably not worth it.

> > >

> >

> > The churn doesn't seem to be worth it, tbh.

> >

> > So could we get rid of the complexity here, and only build_bug() on

> > the start address of the EFI region being outside the topmost p4d?

> > That should make the PGD test redundant as well.

>

> Was there ever a resolution to this conversation or a patch sent? I am

> still seeing the build failure that Arnd initially sent the patch for.

> x86_64 all{mod,yes}config with clang are going to ship broken in 5.11.

>


I think we have agreement on the approach but it is unclear who is
going to write the patch.
Borislav Petkov Feb. 4, 2021, 10:51 a.m. UTC | #22
On Wed, Feb 03, 2021 at 09:29:18PM +0100, Ard Biesheuvel wrote:
> I think we have agreement on the approach but it is unclear who is

> going to write the patch.


How's that below?

And frankly, I'd even vote for removing those assertions altogether. If
somehow the EFI pgd lands somewhere else, the kernel will crash'n'burn
spectacularly and quickly so it's not like we won't catch it...

---
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 91ac10654570..b6be19c09841 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -156,8 +156,8 @@ extern unsigned int ptrs_per_p4d;
 #define CPU_ENTRY_AREA_PGD	_AC(-4, UL)
 #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)
 
-#define EFI_VA_START		( -4 * (_AC(1, UL) << 30))
-#define EFI_VA_END		(-68 * (_AC(1, UL) << 30))
+#define EFI_VA_START		( -4UL * (_AC(1, UL) << 30))
+#define EFI_VA_END		(-68UL * (_AC(1, UL) << 30))
 
 #define EARLY_DYNAMIC_PAGE_TABLES	64
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index e1e8d4e3a213..56fdc0bbb554 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -123,9 +123,7 @@ void efi_sync_low_kernel_mappings(void)
 	 * only span a single PGD entry and that the entry also maps
 	 * other important kernel regions.
 	 */
-	MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));
-	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=
-			(EFI_VA_END & PGDIR_MASK));
+	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) != PGDIR_MASK);
 
 	pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);
 	pgd_k = pgd_offset_k(PAGE_OFFSET);
@@ -137,8 +135,7 @@ void efi_sync_low_kernel_mappings(void)
 	 * As with PGDs, we share all P4D entries apart from the one entry
 	 * that covers the EFI runtime mapping space.
 	 */
-	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));
-	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));
+	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != P4D_MASK);
 
 	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);
 	pgd_k = pgd_offset_k(EFI_VA_END);


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Ard Biesheuvel Feb. 4, 2021, 10:59 a.m. UTC | #23
On Thu, 4 Feb 2021 at 11:52, Borislav Petkov <bp@alien8.de> wrote:
>

> On Wed, Feb 03, 2021 at 09:29:18PM +0100, Ard Biesheuvel wrote:

> > I think we have agreement on the approach but it is unclear who is

> > going to write the patch.

>

> How's that below?

>

> And frankly, I'd even vote for removing those assertions altogether. If

> somehow the EFI pgd lands somewhere else, the kernel will crash'n'burn

> spectacularly and quickly so it's not like we won't catch it...

>


+1 for just removing them

> ---

> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h

> index 91ac10654570..b6be19c09841 100644

> --- a/arch/x86/include/asm/pgtable_64_types.h

> +++ b/arch/x86/include/asm/pgtable_64_types.h

> @@ -156,8 +156,8 @@ extern unsigned int ptrs_per_p4d;

>  #define CPU_ENTRY_AREA_PGD     _AC(-4, UL)

>  #define CPU_ENTRY_AREA_BASE    (CPU_ENTRY_AREA_PGD << P4D_SHIFT)

>

> -#define EFI_VA_START           ( -4 * (_AC(1, UL) << 30))

> -#define EFI_VA_END             (-68 * (_AC(1, UL) << 30))

> +#define EFI_VA_START           ( -4UL * (_AC(1, UL) << 30))

> +#define EFI_VA_END             (-68UL * (_AC(1, UL) << 30))

>

>  #define EARLY_DYNAMIC_PAGE_TABLES      64

>

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..56fdc0bbb554 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -123,9 +123,7 @@ void efi_sync_low_kernel_mappings(void)

>          * only span a single PGD entry and that the entry also maps

>          * other important kernel regions.

>          */

> -       MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));

> -       MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=

> -                       (EFI_VA_END & PGDIR_MASK));

> +       MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) != PGDIR_MASK);

>

>         pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);

>         pgd_k = pgd_offset_k(PAGE_OFFSET);

> @@ -137,8 +135,7 @@ void efi_sync_low_kernel_mappings(void)

>          * As with PGDs, we share all P4D entries apart from the one entry

>          * that covers the EFI runtime mapping space.

>          */

> -       BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> +       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != P4D_MASK);

>

>         pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>         pgd_k = pgd_offset_k(EFI_VA_END);

>

>

> --

> Regards/Gruss,

>     Boris.

>

> https://people.kernel.org/tglx/notes-about-netiquette
Nathan Chancellor Feb. 4, 2021, 7:16 p.m. UTC | #24
On Thu, Feb 04, 2021 at 11:51:55AM +0100, Borislav Petkov wrote:
> On Wed, Feb 03, 2021 at 09:29:18PM +0100, Ard Biesheuvel wrote:

> > I think we have agreement on the approach but it is unclear who is

> > going to write the patch.

> 

> How's that below?

> 

> And frankly, I'd even vote for removing those assertions altogether. If

> somehow the EFI pgd lands somewhere else, the kernel will crash'n'burn

> spectacularly and quickly so it's not like we won't catch it...

> 

> ---


This resolves the issue initially reported in this thread. Obviously
removing the assertions will do that as well. Feel free to carry
forward

Tested-by: Nathan Chancellor <nathan@kernel.org>


on a patch if you send it out.

> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h

> index 91ac10654570..b6be19c09841 100644

> --- a/arch/x86/include/asm/pgtable_64_types.h

> +++ b/arch/x86/include/asm/pgtable_64_types.h

> @@ -156,8 +156,8 @@ extern unsigned int ptrs_per_p4d;

>  #define CPU_ENTRY_AREA_PGD	_AC(-4, UL)

>  #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)

>  

> -#define EFI_VA_START		( -4 * (_AC(1, UL) << 30))

> -#define EFI_VA_END		(-68 * (_AC(1, UL) << 30))

> +#define EFI_VA_START		( -4UL * (_AC(1, UL) << 30))

> +#define EFI_VA_END		(-68UL * (_AC(1, UL) << 30))

>  

>  #define EARLY_DYNAMIC_PAGE_TABLES	64

>  

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..56fdc0bbb554 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -123,9 +123,7 @@ void efi_sync_low_kernel_mappings(void)

>  	 * only span a single PGD entry and that the entry also maps

>  	 * other important kernel regions.

>  	 */

> -	MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));

> -	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=

> -			(EFI_VA_END & PGDIR_MASK));

> +	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) != PGDIR_MASK);

>  

>  	pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);

>  	pgd_k = pgd_offset_k(PAGE_OFFSET);

> @@ -137,8 +135,7 @@ void efi_sync_low_kernel_mappings(void)

>  	 * As with PGDs, we share all P4D entries apart from the one entry

>  	 * that covers the EFI runtime mapping space.

>  	 */

> -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> +	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != P4D_MASK);

>  

>  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>  	pgd_k = pgd_offset_k(EFI_VA_END);

> 

> 

> -- 

> Regards/Gruss,

>     Boris.

> 

> https://people.kernel.org/tglx/notes-about-netiquette
Arvind Sankar Feb. 4, 2021, 9:43 p.m. UTC | #25
On Thu, Feb 04, 2021 at 11:51:55AM +0100, Borislav Petkov wrote:
> On Wed, Feb 03, 2021 at 09:29:18PM +0100, Ard Biesheuvel wrote:

> > I think we have agreement on the approach but it is unclear who is

> > going to write the patch.

> 

> How's that below?

> 

> And frankly, I'd even vote for removing those assertions altogether. If

> somehow the EFI pgd lands somewhere else, the kernel will crash'n'burn

> spectacularly and quickly so it's not like we won't catch it...


Removing altogether should be fine, but see below if we don't.

> 

> ---

> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h

> index 91ac10654570..b6be19c09841 100644

> --- a/arch/x86/include/asm/pgtable_64_types.h

> +++ b/arch/x86/include/asm/pgtable_64_types.h

> @@ -156,8 +156,8 @@ extern unsigned int ptrs_per_p4d;

>  #define CPU_ENTRY_AREA_PGD	_AC(-4, UL)

>  #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)

>  

> -#define EFI_VA_START		( -4 * (_AC(1, UL) << 30))

> -#define EFI_VA_END		(-68 * (_AC(1, UL) << 30))

> +#define EFI_VA_START		( -4UL * (_AC(1, UL) << 30))

> +#define EFI_VA_END		(-68UL * (_AC(1, UL) << 30))


This doesn't have any effect right? And the reason for the _AC() stuff
in there is to allow the #define to be used in assembler -- this
particular one isn't, but it makes no sense to use the UL suffix as well
as _AC() in the same macro.

>  

>  #define EARLY_DYNAMIC_PAGE_TABLES	64

>  

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..56fdc0bbb554 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -123,9 +123,7 @@ void efi_sync_low_kernel_mappings(void)

>  	 * only span a single PGD entry and that the entry also maps

>  	 * other important kernel regions.

>  	 */

> -	MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));

> -	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=

> -			(EFI_VA_END & PGDIR_MASK));

> +	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) != PGDIR_MASK);


This check is superfluous. Just do the P4D one.

>  

>  	pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);

>  	pgd_k = pgd_offset_k(PAGE_OFFSET);

> @@ -137,8 +135,7 @@ void efi_sync_low_kernel_mappings(void)

>  	 * As with PGDs, we share all P4D entries apart from the one entry

>  	 * that covers the EFI runtime mapping space.

>  	 */

> -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> +	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != P4D_MASK);


This should check EFI_VA_END instead of EFI_VA_START, and maybe throw in
a BUG_ON if EFI_VA_END >= EFI_VA_START.

>  

>  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>  	pgd_k = pgd_offset_k(EFI_VA_END);

> 

> 

> -- 

> Regards/Gruss,

>     Boris.

> 

> https://people.kernel.org/tglx/notes-about-netiquette
Borislav Petkov Feb. 4, 2021, 10:13 p.m. UTC | #26
On Thu, Feb 04, 2021 at 04:43:58PM -0500, Arvind Sankar wrote:
> This should check EFI_VA_END instead of EFI_VA_START, and maybe throw in

> a BUG_ON if EFI_VA_END >= EFI_VA_START.


No need:

        if (efi_va < EFI_VA_END) {
                pr_warn(FW_WARN "VA address range overflow!\n");
                return;
        }

We already check we're not going over at map time. And our runtime
services range is hardcoded. And we're switching to that PGD on each
runtime services call.

So I don't see the point for keeping any of the assertions.

Unless you have other valid arguments for keeping them...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Arvind Sankar Feb. 5, 2021, 12:08 a.m. UTC | #27
On Thu, Feb 04, 2021 at 11:13:18PM +0100, Borislav Petkov wrote:
> On Thu, Feb 04, 2021 at 04:43:58PM -0500, Arvind Sankar wrote:

> > This should check EFI_VA_END instead of EFI_VA_START, and maybe throw in

> > a BUG_ON if EFI_VA_END >= EFI_VA_START.

> 

> No need:

> 

>         if (efi_va < EFI_VA_END) {

>                 pr_warn(FW_WARN "VA address range overflow!\n");

>                 return;

>         }

> 

> We already check we're not going over at map time. And our runtime

> services range is hardcoded. And we're switching to that PGD on each

> runtime services call.

> 

> So I don't see the point for keeping any of the assertions.

> 

> Unless you have other valid arguments for keeping them...

> 


No, I don't have any objections to removing them altogether. All the
comments other than the one about changing the #define's only apply if
it's decided to keep them.
Borislav Petkov Feb. 5, 2021, 10:34 a.m. UTC | #28
On Wed, Feb 03, 2021 at 11:51:48AM -0700, Nathan Chancellor wrote:
> x86_64 all{mod,yes}config with clang are going to ship broken in 5.11.


Dunno, it is still broken here even with those build assertions removed. And it
ain't even an all{mod,yes}config - just my machine's config with

CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
CONFIG_UBSAN=y
# CONFIG_UBSAN_TRAP is not set
CONFIG_CC_HAS_UBSAN_BOUNDS=y
CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y
CONFIG_UBSAN_BOUNDS=y
CONFIG_UBSAN_ARRAY_BOUNDS=y
CONFIG_UBSAN_SHIFT=y
CONFIG_UBSAN_DIV_ZERO=y
CONFIG_UBSAN_SIGNED_OVERFLOW=y
CONFIG_UBSAN_UNSIGNED_OVERFLOW=y
CONFIG_UBSAN_OBJECT_SIZE=y
CONFIG_UBSAN_BOOL=y
CONFIG_UBSAN_ENUM=y
CONFIG_UBSAN_ALIGNMENT=y
CONFIG_UBSAN_SANITIZE_ALL=y
# CONFIG_TEST_UBSAN is not set

and clang-10:

lib/strncpy_from_user.o: warning: objtool: strncpy_from_user()+0x253: call to __ubsan_handle_add_overflow() with UACCESS enabled
lib/strnlen_user.o: warning: objtool: strnlen_user()+0x244: call to __ubsan_handle_add_overflow() with UACCESS enabled
ld: init/main.o: in function `kmalloc':
/home/boris/kernel/linux/./include/linux/slab.h:557: undefined reference to `__ubsan_handle_alignment_assumption'
ld: init/initramfs.o: in function `kmalloc':
/home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'
ld: init/initramfs.o: in function `kmalloc_large':
/home/boris/kernel/linux/./include/linux/slab.h:481: undefined reference to `__ubsan_handle_alignment_assumption'
ld: init/initramfs.o: in function `kmalloc':
/home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'
ld: /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'
ld: init/initramfs.o:/home/boris/kernel/linux/./include/linux/slab.h:552: more undefined references to `__ubsan_handle_alignment_assumption' follow
ld: mm/mremap.o: in function `get_extent':
/home/boris/kernel/linux/mm/mremap.c:355: undefined reference to `__compiletime_assert_327'
ld: mm/rmap.o: in function `anon_vma_chain_alloc':
/home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'
ld: mm/rmap.o: in function `anon_vma_alloc':
/home/boris/kernel/linux/mm/rmap.c:89: undefined reference to `__ubsan_handle_alignment_assumption'
ld: mm/rmap.o: in function `anon_vma_chain_alloc':
/home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'
ld: /home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'
ld: /home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'
ld: mm/vmalloc.o:/home/boris/kernel/linux/mm/vmalloc.c:1213: more undefined references to `__ubsan_handle_alignment_assumption' follow
make: *** [Makefile:1164: vmlinux] Error 1

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Ard Biesheuvel Feb. 5, 2021, 11:57 a.m. UTC | #29
On Fri, 5 Feb 2021 at 12:39, Borislav Petkov <bp@alien8.de> wrote:
>

> From: Borislav Petkov <bp@suse.de>

>

> With CONFIG_X86_5LEVEL, CONFIG_UBSAN and CONFIG_UBSAN_UNSIGNED_OVERFLOW

> enabled, clang fails the build with

>

>   x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

>   efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

>

> which happens due to -fsanitize=unsigned-integer-overflow being enabled:

>

>   -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where

>   the result of an unsigned integer computation cannot be represented

>   in its type. Unlike signed integer overflow, this is not undefined

>   behavior, but it is often unintentional. This sanitizer does not check

>   for lossy implicit conversions performed before such a computation

>   (see -fsanitize=implicit-conversion).

>

> and that fires when the (intentional) EFI_VA_START/END defines overflow

> an unsigned long, leading to the assertion expressions not getting

> optimized away (on GCC they do)...

>

> However, those checks are superfluous: the runtime services mapping

> code already makes sure the ranges don't overshoot EFI_VA_END as the

> EFI mapping range is hardcoded. On each runtime services call, it is

> switched to the EFI-specific PGD and even if mappings manage to escape

> that last PGD, this won't remain unnoticed for long.

>

> So rip them out.

>

> See https://github.com/ClangBuiltLinux/linux/issues/256 for more info.

>

> Reported-by: Arnd Bergmann <arnd@arndb.de>

> Link: http://lkml.kernel.org/r/20210107223424.4135538-1-arnd@kernel.org

> Signed-off-by: Borislav Petkov <bp@suse.de>


Acked-by: Ard Biesheuvel <ardb@kernel.org>


> ---

>  arch/x86/platform/efi/efi_64.c | 19 -------------------

>  1 file changed, 19 deletions(-)

>

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..8efd003540ca 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -115,31 +115,12 @@ void efi_sync_low_kernel_mappings(void)

>         pud_t *pud_k, *pud_efi;

>         pgd_t *efi_pgd = efi_mm.pgd;

>

> -       /*

> -        * We can share all PGD entries apart from the one entry that

> -        * covers the EFI runtime mapping space.

> -        *

> -        * Make sure the EFI runtime region mappings are guaranteed to

> -        * only span a single PGD entry and that the entry also maps

> -        * other important kernel regions.

> -        */

> -       MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));

> -       MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=

> -                       (EFI_VA_END & PGDIR_MASK));

> -

>         pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);

>         pgd_k = pgd_offset_k(PAGE_OFFSET);

>

>         num_entries = pgd_index(EFI_VA_END) - pgd_index(PAGE_OFFSET);

>         memcpy(pgd_efi, pgd_k, sizeof(pgd_t) * num_entries);

>

> -       /*

> -        * As with PGDs, we share all P4D entries apart from the one entry

> -        * that covers the EFI runtime mapping space.

> -        */

> -       BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> -

>         pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>         pgd_k = pgd_offset_k(EFI_VA_END);

>         p4d_efi = p4d_offset(pgd_efi, 0);

> --

> 2.29.2

>

> --

> Regards/Gruss,

>     Boris.

>

> https://people.kernel.org/tglx/notes-about-netiquette
Nick Desaulniers Feb. 5, 2021, 6:14 p.m. UTC | #30
On Fri, Feb 5, 2021 at 3:39 AM Borislav Petkov <bp@alien8.de> wrote:
>

> From: Borislav Petkov <bp@suse.de>

>

> With CONFIG_X86_5LEVEL, CONFIG_UBSAN and CONFIG_UBSAN_UNSIGNED_OVERFLOW

> enabled, clang fails the build with

>

>   x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

>   efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

>

> which happens due to -fsanitize=unsigned-integer-overflow being enabled:

>

>   -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where

>   the result of an unsigned integer computation cannot be represented

>   in its type. Unlike signed integer overflow, this is not undefined

>   behavior, but it is often unintentional. This sanitizer does not check

>   for lossy implicit conversions performed before such a computation

>   (see -fsanitize=implicit-conversion).

>

> and that fires when the (intentional) EFI_VA_START/END defines overflow

> an unsigned long, leading to the assertion expressions not getting

> optimized away (on GCC they do)...

>

> However, those checks are superfluous: the runtime services mapping

> code already makes sure the ranges don't overshoot EFI_VA_END as the

> EFI mapping range is hardcoded. On each runtime services call, it is

> switched to the EFI-specific PGD and even if mappings manage to escape

> that last PGD, this won't remain unnoticed for long.

>

> So rip them out.

>

> See https://github.com/ClangBuiltLinux/linux/issues/256 for more info.

>

> Reported-by: Arnd Bergmann <arnd@arndb.de>

> Link: http://lkml.kernel.org/r/20210107223424.4135538-1-arnd@kernel.org

> Signed-off-by: Borislav Petkov <bp@suse.de>


Thanks, this fixes the failed assertion for me.

Tested-by: Nick Desaulniers <ndesaulniers@google.com>


(https://lore.kernel.org/lkml/20201230154104.522605-1-arnd@kernel.org/
is needed to finish a build of that configuration; going to chase that
next)

(consider applying Arvind's+Ard's suggested by tag)

> ---

>  arch/x86/platform/efi/efi_64.c | 19 -------------------

>  1 file changed, 19 deletions(-)

>

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..8efd003540ca 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -115,31 +115,12 @@ void efi_sync_low_kernel_mappings(void)

>         pud_t *pud_k, *pud_efi;

>         pgd_t *efi_pgd = efi_mm.pgd;

>

> -       /*

> -        * We can share all PGD entries apart from the one entry that

> -        * covers the EFI runtime mapping space.

> -        *

> -        * Make sure the EFI runtime region mappings are guaranteed to

> -        * only span a single PGD entry and that the entry also maps

> -        * other important kernel regions.

> -        */

> -       MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));

> -       MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=

> -                       (EFI_VA_END & PGDIR_MASK));

> -

>         pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);

>         pgd_k = pgd_offset_k(PAGE_OFFSET);

>

>         num_entries = pgd_index(EFI_VA_END) - pgd_index(PAGE_OFFSET);

>         memcpy(pgd_efi, pgd_k, sizeof(pgd_t) * num_entries);

>

> -       /*

> -        * As with PGDs, we share all P4D entries apart from the one entry

> -        * that covers the EFI runtime mapping space.

> -        */

> -       BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -       BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> -

>         pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>         pgd_k = pgd_offset_k(EFI_VA_END);

>         p4d_efi = p4d_offset(pgd_efi, 0);

> --

> 2.29.2

>

> --

> Regards/Gruss,

>     Boris.

>

> https://people.kernel.org/tglx/notes-about-netiquette




-- 
Thanks,
~Nick Desaulniers
Nick Desaulniers Feb. 5, 2021, 6:27 p.m. UTC | #31
On Fri, Feb 5, 2021 at 2:35 AM Borislav Petkov <bp@alien8.de> wrote:
>

> On Wed, Feb 03, 2021 at 11:51:48AM -0700, Nathan Chancellor wrote:

> > x86_64 all{mod,yes}config with clang are going to ship broken in 5.11.

>

> Dunno, it is still broken here even with those build assertions removed. And it

> ain't even an all{mod,yes}config - just my machine's config with

>

> CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y

> CONFIG_UBSAN=y

> # CONFIG_UBSAN_TRAP is not set

> CONFIG_CC_HAS_UBSAN_BOUNDS=y

> CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y

> CONFIG_UBSAN_BOUNDS=y

> CONFIG_UBSAN_ARRAY_BOUNDS=y

> CONFIG_UBSAN_SHIFT=y

> CONFIG_UBSAN_DIV_ZERO=y

> CONFIG_UBSAN_SIGNED_OVERFLOW=y

> CONFIG_UBSAN_UNSIGNED_OVERFLOW=y

> CONFIG_UBSAN_OBJECT_SIZE=y

> CONFIG_UBSAN_BOOL=y

> CONFIG_UBSAN_ENUM=y

> CONFIG_UBSAN_ALIGNMENT=y

> CONFIG_UBSAN_SANITIZE_ALL=y

> # CONFIG_TEST_UBSAN is not set

>

> and clang-10:

>

> lib/strncpy_from_user.o: warning: objtool: strncpy_from_user()+0x253: call to __ubsan_handle_add_overflow() with UACCESS enabled

> lib/strnlen_user.o: warning: objtool: strnlen_user()+0x244: call to __ubsan_handle_add_overflow() with UACCESS enabled

> ld: init/main.o: in function `kmalloc':

> /home/boris/kernel/linux/./include/linux/slab.h:557: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: init/initramfs.o: in function `kmalloc':

> /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: init/initramfs.o: in function `kmalloc_large':

> /home/boris/kernel/linux/./include/linux/slab.h:481: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: init/initramfs.o: in function `kmalloc':

> /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: init/initramfs.o:/home/boris/kernel/linux/./include/linux/slab.h:552: more undefined references to `__ubsan_handle_alignment_assumption' follow

> ld: mm/mremap.o: in function `get_extent':

> /home/boris/kernel/linux/mm/mremap.c:355: undefined reference to `__compiletime_assert_327'


^ this one is https://lore.kernel.org/lkml/20201230154104.522605-1-arnd@kernel.org/.
Trying to get the last of these tracked down.  I think there were some
changes to UBSAN configs that weren't tested with clang before merged.

> ld: mm/rmap.o: in function `anon_vma_chain_alloc':

> /home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: mm/rmap.o: in function `anon_vma_alloc':

> /home/boris/kernel/linux/mm/rmap.c:89: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: mm/rmap.o: in function `anon_vma_chain_alloc':

> /home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: /home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: /home/boris/kernel/linux/mm/rmap.c:136: undefined reference to `__ubsan_handle_alignment_assumption'

> ld: mm/vmalloc.o:/home/boris/kernel/linux/mm/vmalloc.c:1213: more undefined references to `__ubsan_handle_alignment_assumption' follow

> make: *** [Makefile:1164: vmlinux] Error 1

>

> --

> Regards/Gruss,

>     Boris.

>

> https://people.kernel.org/tglx/notes-about-netiquette




-- 
Thanks,
~Nick Desaulniers
Nathan Chancellor Feb. 5, 2021, 6:31 p.m. UTC | #32
On Fri, Feb 05, 2021 at 10:27:54AM -0800, Nick Desaulniers wrote:
> On Fri, Feb 5, 2021 at 2:35 AM Borislav Petkov <bp@alien8.de> wrote:

> >

> > On Wed, Feb 03, 2021 at 11:51:48AM -0700, Nathan Chancellor wrote:

> > > x86_64 all{mod,yes}config with clang are going to ship broken in 5.11.

> >

> > Dunno, it is still broken here even with those build assertions removed. And it

> > ain't even an all{mod,yes}config - just my machine's config with

> >

> > CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y

> > CONFIG_UBSAN=y

> > # CONFIG_UBSAN_TRAP is not set

> > CONFIG_CC_HAS_UBSAN_BOUNDS=y

> > CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y

> > CONFIG_UBSAN_BOUNDS=y

> > CONFIG_UBSAN_ARRAY_BOUNDS=y

> > CONFIG_UBSAN_SHIFT=y

> > CONFIG_UBSAN_DIV_ZERO=y

> > CONFIG_UBSAN_SIGNED_OVERFLOW=y

> > CONFIG_UBSAN_UNSIGNED_OVERFLOW=y

> > CONFIG_UBSAN_OBJECT_SIZE=y

> > CONFIG_UBSAN_BOOL=y

> > CONFIG_UBSAN_ENUM=y

> > CONFIG_UBSAN_ALIGNMENT=y

> > CONFIG_UBSAN_SANITIZE_ALL=y

> > # CONFIG_TEST_UBSAN is not set

> >

> > and clang-10:

> >

> > lib/strncpy_from_user.o: warning: objtool: strncpy_from_user()+0x253: call to __ubsan_handle_add_overflow() with UACCESS enabled

> > lib/strnlen_user.o: warning: objtool: strnlen_user()+0x244: call to __ubsan_handle_add_overflow() with UACCESS enabled

> > ld: init/main.o: in function `kmalloc':

> > /home/boris/kernel/linux/./include/linux/slab.h:557: undefined reference to `__ubsan_handle_alignment_assumption'

> > ld: init/initramfs.o: in function `kmalloc':

> > /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'

> > ld: init/initramfs.o: in function `kmalloc_large':

> > /home/boris/kernel/linux/./include/linux/slab.h:481: undefined reference to `__ubsan_handle_alignment_assumption'

> > ld: init/initramfs.o: in function `kmalloc':

> > /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'

> > ld: /home/boris/kernel/linux/./include/linux/slab.h:552: undefined reference to `__ubsan_handle_alignment_assumption'

> > ld: init/initramfs.o:/home/boris/kernel/linux/./include/linux/slab.h:552: more undefined references to `__ubsan_handle_alignment_assumption' follow

> > ld: mm/mremap.o: in function `get_extent':

> > /home/boris/kernel/linux/mm/mremap.c:355: undefined reference to `__compiletime_assert_327'

> 

> ^ this one is https://lore.kernel.org/lkml/20201230154104.522605-1-arnd@kernel.org/.

> Trying to get the last of these tracked down.  I think there were some

> changes to UBSAN configs that weren't tested with clang before merged.


The rest of these should be resolved by
https://lore.kernel.org/r/20210205023257.NJnJdyyZk%25akpm@linux-foundation.org/,
which is currently on its way to Linus.

Cheers,
Nathan
Nathan Chancellor Feb. 5, 2021, 6:56 p.m. UTC | #33
On Fri, Feb 05, 2021 at 12:39:30PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>

> 

> With CONFIG_X86_5LEVEL, CONFIG_UBSAN and CONFIG_UBSAN_UNSIGNED_OVERFLOW

> enabled, clang fails the build with

> 

>   x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':

>   efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

> 

> which happens due to -fsanitize=unsigned-integer-overflow being enabled:

> 

>   -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where

>   the result of an unsigned integer computation cannot be represented

>   in its type. Unlike signed integer overflow, this is not undefined

>   behavior, but it is often unintentional. This sanitizer does not check

>   for lossy implicit conversions performed before such a computation

>   (see -fsanitize=implicit-conversion).

> 

> and that fires when the (intentional) EFI_VA_START/END defines overflow

> an unsigned long, leading to the assertion expressions not getting

> optimized away (on GCC they do)...

> 

> However, those checks are superfluous: the runtime services mapping

> code already makes sure the ranges don't overshoot EFI_VA_END as the

> EFI mapping range is hardcoded. On each runtime services call, it is

> switched to the EFI-specific PGD and even if mappings manage to escape

> that last PGD, this won't remain unnoticed for long.

> 

> So rip them out.

> 

> See https://github.com/ClangBuiltLinux/linux/issues/256 for more info.

> 

> Reported-by: Arnd Bergmann <arnd@arndb.de>

> Link: http://lkml.kernel.org/r/20210107223424.4135538-1-arnd@kernel.org

> Signed-off-by: Borislav Petkov <bp@suse.de>


Reviewed-by: Nathan Chancellor <nathan@kernel.org>

Tested-by: Nathan Chancellor <nathan@kernel.org>


> ---

>  arch/x86/platform/efi/efi_64.c | 19 -------------------

>  1 file changed, 19 deletions(-)

> 

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c

> index e1e8d4e3a213..8efd003540ca 100644

> --- a/arch/x86/platform/efi/efi_64.c

> +++ b/arch/x86/platform/efi/efi_64.c

> @@ -115,31 +115,12 @@ void efi_sync_low_kernel_mappings(void)

>  	pud_t *pud_k, *pud_efi;

>  	pgd_t *efi_pgd = efi_mm.pgd;

>  

> -	/*

> -	 * We can share all PGD entries apart from the one entry that

> -	 * covers the EFI runtime mapping space.

> -	 *

> -	 * Make sure the EFI runtime region mappings are guaranteed to

> -	 * only span a single PGD entry and that the entry also maps

> -	 * other important kernel regions.

> -	 */

> -	MAYBE_BUILD_BUG_ON(pgd_index(EFI_VA_END) != pgd_index(MODULES_END));

> -	MAYBE_BUILD_BUG_ON((EFI_VA_START & PGDIR_MASK) !=

> -			(EFI_VA_END & PGDIR_MASK));

> -

>  	pgd_efi = efi_pgd + pgd_index(PAGE_OFFSET);

>  	pgd_k = pgd_offset_k(PAGE_OFFSET);

>  

>  	num_entries = pgd_index(EFI_VA_END) - pgd_index(PAGE_OFFSET);

>  	memcpy(pgd_efi, pgd_k, sizeof(pgd_t) * num_entries);

>  

> -	/*

> -	 * As with PGDs, we share all P4D entries apart from the one entry

> -	 * that covers the EFI runtime mapping space.

> -	 */

> -	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));

> -	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));

> -

>  	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);

>  	pgd_k = pgd_offset_k(EFI_VA_END);

>  	p4d_efi = p4d_offset(pgd_efi, 0);

> -- 

> 2.29.2

> 

> -- 

> Regards/Gruss,

>     Boris.

> 

> https://people.kernel.org/tglx/notes-about-netiquette
diff mbox series

Patch

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index e1e8d4e3a213..62bb1616b4a5 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -137,8 +137,8 @@  void efi_sync_low_kernel_mappings(void)
 	 * As with PGDs, we share all P4D entries apart from the one entry
 	 * that covers the EFI runtime mapping space.
 	 */
-	BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));
-	BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));
+	MAYBE_BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));
+	MAYBE_BUILD_BUG_ON((EFI_VA_START & P4D_MASK) != (EFI_VA_END & P4D_MASK));
 
 	pgd_efi = efi_pgd + pgd_index(EFI_VA_END);
 	pgd_k = pgd_offset_k(EFI_VA_END);