
kbuild, x86: revert macros in extended asm workarounds

Message ID 1544692661-9455-1-git-send-email-yamada.masahiro@socionext.com
State New
Series kbuild, x86: revert macros in extended asm workarounds

Commit Message

Masahiro Yamada Dec. 13, 2018, 9:17 a.m. UTC
Revert the following commits:

- 5bdcd510c2ac9efaf55c4cbd8d46421d8e2320cd
  ("x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs")

- d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2
  ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

- 0474d5d9d2f7f3b11262f7bf87d0e7314ead9200.
  ("x86/extable: Macrofy inline assembly code to work around GCC inlining bugs")

- 494b5168f2de009eb80f198f668da374295098dd.
  ("x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops")

- f81f8ad56fd1c7b99b2ed1c314527f7d9ac447c6.
  ("x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs")

- 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099.
  ("x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs")

- 9e1725b410594911cc5981b6c7b4cea4ec054ca8.
  ("x86/refcount: Work around GCC inlining bug")
  (Conflicts: arch/x86/include/asm/refcount.h)

- c06c4d8090513f2974dfdbed2ac98634357ac475.
  ("x86/objtool: Use asm macros to work around GCC inlining bugs")

- 77b0bf55bc675233d22cd5df97605d516d64525e.
  ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

A few days after those commits were applied, a discussion started about
solving the issue more elegantly on the compiler side:

  https://lkml.org/lkml/2018/10/7/92

The "asm inline" was implemented by Segher Boessenkool, and now queued
up for GCC 9. (People were positive even for back-porting it to older
compilers).
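
As a rough sketch (not part of this revert), kernel code could then wrap the
new keyword roughly as below; the asm_inline name and the feature test are
illustrative assumptions only, not what is queued for GCC 9:

#ifdef CONFIG_CC_HAS_ASM_INLINE		/* hypothetical feature test */
# define asm_inline asm __inline	/* asm statement costed as one insn */
#else
# define asm_inline asm			/* older compilers: plain asm */
#endif

static inline void example_atomic_inc(int *v)
{
	/*
	 * A long asm string (exception tables, .pushsection blocks, ...)
	 * would no longer inflate the inliner's size estimate when it is
	 * emitted through asm_inline, so no .s macro indirection is needed.
	 */
	asm_inline volatile("lock; incl %0" : "+m" (*v));
}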

Since the in-kernel workarounds were merged, some issues have been reported:
broken builds with distcc/icecc, and broken distro packaging for module
builds. (More fundamentally, we cannot build external modules after
'make clean'.)

Patching around the build system would make the code even uglier.

Given that this issue will be solved in a cleaner way sooner or later,
let's revert the in-kernel workarounds, and wait for GCC 9.

Reported-by: Logan Gunthorpe <logang@deltatee.com> # distcc
Reported-by: Sedat Dilek <sedat.dilek@gmail.com> # debian/rpm package
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Cc: Nadav Amit <namit@vmware.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
---

Please consider this for the v4.20 release.
Currently, distro package builds are broken.


 Makefile                               |  9 +---
 arch/x86/Makefile                      |  7 ---
 arch/x86/entry/calling.h               |  2 +-
 arch/x86/include/asm/alternative-asm.h | 20 +++----
 arch/x86/include/asm/alternative.h     | 11 +++-
 arch/x86/include/asm/asm.h             | 53 +++++++++++-------
 arch/x86/include/asm/bug.h             | 98 +++++++++++++++-------------------
 arch/x86/include/asm/cpufeature.h      | 82 ++++++++++++----------------
 arch/x86/include/asm/jump_label.h      | 72 ++++++++++++++++++-------
 arch/x86/include/asm/paravirt_types.h  | 56 +++++++++----------
 arch/x86/include/asm/refcount.h        | 81 ++++++++++++----------------
 arch/x86/kernel/macros.S               | 16 ------
 include/asm-generic/bug.h              |  8 +--
 include/linux/compiler.h               | 56 +++++--------------
 scripts/Kbuild.include                 |  4 +-
 scripts/mod/Makefile                   |  2 -
 16 files changed, 262 insertions(+), 315 deletions(-)
 delete mode 100644 arch/x86/kernel/macros.S

-- 
2.7.4

Comments

Peter Zijlstra Dec. 13, 2018, 10:51 a.m. UTC | #1
On Thu, Dec 13, 2018 at 06:17:41PM +0900, Masahiro Yamada wrote:
> Revert the following commits:

> 

> - 5bdcd510c2ac9efaf55c4cbd8d46421d8e2320cd

>   ("x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs")

> 

> - d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2

>   ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

> 

> - 0474d5d9d2f7f3b11262f7bf87d0e7314ead9200.

>   ("x86/extable: Macrofy inline assembly code to work around GCC inlining bugs")

> 

> - 494b5168f2de009eb80f198f668da374295098dd.

>   ("x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops")

> 

> - f81f8ad56fd1c7b99b2ed1c314527f7d9ac447c6.

>   ("x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs")

> 

> - 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099.

>   ("x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs")

> 

> - 9e1725b410594911cc5981b6c7b4cea4ec054ca8.

>   ("x86/refcount: Work around GCC inlining bug")

>   (Conflicts: arch/x86/include/asm/refcount.h)

> 

> - c06c4d8090513f2974dfdbed2ac98634357ac475.

>   ("x86/objtool: Use asm macros to work around GCC inlining bugs")

> 

> - 77b0bf55bc675233d22cd5df97605d516d64525e.

>   ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

> 


I don't think we want to blindly revert all that. Some of them actually
made sense and did clean up things irrespective of the asm-inline issue.

In particular I like the jump-label one. The cpufeature one OTOH, yeah,
I'd love to get that reverted.

And as a note; the normal commit quoting style is:

  d5a581d84ae6 ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")
Masahiro Yamada Dec. 15, 2018, 12:51 a.m. UTC | #2
Hi Peter,

On Thu, Dec 13, 2018 at 7:53 PM Peter Zijlstra <peterz@infradead.org> wrote:
>

> On Thu, Dec 13, 2018 at 06:17:41PM +0900, Masahiro Yamada wrote:

> > Revert the following commits:

> >

> > - 5bdcd510c2ac9efaf55c4cbd8d46421d8e2320cd

> >   ("x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs")

> >

> > - d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2

> >   ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

> >

> > - 0474d5d9d2f7f3b11262f7bf87d0e7314ead9200.

> >   ("x86/extable: Macrofy inline assembly code to work around GCC inlining bugs")

> >

> > - 494b5168f2de009eb80f198f668da374295098dd.

> >   ("x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops")

> >

> > - f81f8ad56fd1c7b99b2ed1c314527f7d9ac447c6.

> >   ("x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs")

> >

> > - 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099.

> >   ("x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs")

> >

> > - 9e1725b410594911cc5981b6c7b4cea4ec054ca8.

> >   ("x86/refcount: Work around GCC inlining bug")

> >   (Conflicts: arch/x86/include/asm/refcount.h)

> >

> > - c06c4d8090513f2974dfdbed2ac98634357ac475.

> >   ("x86/objtool: Use asm macros to work around GCC inlining bugs")

> >

> > - 77b0bf55bc675233d22cd5df97605d516d64525e.

> >   ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

> >

>

> I don't think we want to blindly revert all that. Some of them actually

> made sense and did clean up things irrespective of the asm-inline issue.

>

> In particular I like the jump-label one.


[1] The #error message is unnecessary.

[2] keep STATIC_BRANCH_NOP/JMP instead of STATIC_JUMP_IF_TRUE/FALSE



In v2, I will make sure to not re-add [1].
I am not sure about [2].


Do you mean only [1],
or both of them?



> The cpufeature one OTOh, yeah,

> I'd love to get that reverted.

>

> And as a note; the normal commit quoting style is:

>

>   d5a581d84ae6 ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")



OK. I will do so in v2.


--
Best Regards
Masahiro Yamada
Nadav Amit Dec. 16, 2018, 2:33 a.m. UTC | #3
> On Dec 14, 2018, at 4:51 PM, Masahiro Yamada <yamada.masahiro@socionext.com> wrote:

> 

> Hi Peter,

> 

> On Thu, Dec 13, 2018 at 7:53 PM Peter Zijlstra <peterz@infradead.org> wrote:

>> On Thu, Dec 13, 2018 at 06:17:41PM +0900, Masahiro Yamada wrote:

>>> Revert the following commits:

>>> 

>>> - 5bdcd510c2ac9efaf55c4cbd8d46421d8e2320cd

>>>  ("x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs")

>>> 

>>> - d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2

>>>  ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

>>> 

>>> - 0474d5d9d2f7f3b11262f7bf87d0e7314ead9200.

>>>  ("x86/extable: Macrofy inline assembly code to work around GCC inlining bugs")

>>> 

>>> - 494b5168f2de009eb80f198f668da374295098dd.

>>>  ("x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops")

>>> 

>>> - f81f8ad56fd1c7b99b2ed1c314527f7d9ac447c6.

>>>  ("x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs")

>>> 

>>> - 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099.

>>>  ("x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs")

>>> 

>>> - 9e1725b410594911cc5981b6c7b4cea4ec054ca8.

>>>  ("x86/refcount: Work around GCC inlining bug")

>>>  (Conflicts: arch/x86/include/asm/refcount.h)

>>> 

>>> - c06c4d8090513f2974dfdbed2ac98634357ac475.

>>>  ("x86/objtool: Use asm macros to work around GCC inlining bugs")

>>> 

>>> - 77b0bf55bc675233d22cd5df97605d516d64525e.

>>>  ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

>> 

>> I don't think we want to blindly revert all that. Some of them actually

>> made sense and did clean up things irrespective of the asm-inline issue.

>> 

>> In particular I like the jump-label one.

> 

> [1] The #error message is unnecessary.

> 

> [2] keep STATC_BRANCH_NOP/JMP instead of STATIC_JUMP_IF_TRUE/FALSE

> 

> 

> 

> In v2, I will make sure to not re-add [1].

> I am not sure about [2].

> 

> 

> Do you mean only [1],

> or both of them?

> 

> 

> 

>> The cpufeature one OTOh, yeah,

>> I'd love to get that reverted.

>> 

>> And as a note; the normal commit quoting style is:

>> 

>>  d5a581d84ae6 ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

> 

> 

> OK. I will do so in v2.


I recommend doing the following for v2:

1. Run some static measurements (e.g., function sizes, number of function
symbols) to ensure that GCC works as it should. If possible, run small
performance evaluations; a rough sketch of one is included after this list.
IIRC, I saw a small but consistent performance difference when I ran a loop
with mprotect() that kept changing permissions. This was due to PV MMU
functions that caused an inlining mess.

2. Break the patch into separate patches, based on the original patch-set
order (reversed). This is the common practice, which allows people to review
patches, perform bisections, and revert when needed.

3. Cc the relevant people who ack'd the original patches, e.g., Kees Cook,
who’s on top of the reference counters, and Linus, who proposed this
approach.
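
A minimal sketch of such an mprotect() microbenchmark, assuming a single
anonymous page whose permissions are toggled in a tight loop (the iteration
count and timing method are illustrative, not what was originally run):

#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *buf = mmap(NULL, page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	struct timespec t0, t1;
	int i;

	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < 1000000; i++) {
		/* keep changing permissions to exercise the PV MMU paths */
		mprotect(buf, page, PROT_READ);
		mprotect(buf, page, PROT_READ | PROT_WRITE);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.3f s\n", (t1.tv_sec - t0.tv_sec) +
	       (t1.tv_nsec - t0.tv_nsec) / 1e9);
	munmap(buf, page);
	return 0;
}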

In general, I think that from the start it was clear that the motivation for
the patch-set is not just performance but also better code. For example, I
see no reason to revert the PV changes or the lock-prefix changes that
improved code readability.

Regards,
Nadav
Borislav Petkov Dec. 16, 2018, 10 a.m. UTC | #4
On Sun, Dec 16, 2018 at 02:33:39AM +0000, Nadav Amit wrote:
> In general, I think that from the start it was clear that the motivation for

> the patch-set is not just performance and also better code. For example, I

> see no reason to revert the PV-changes or the lock-prefix changes that

> improved the code readability.


One thing that has caught my eye with the asm macros, which actually
decreases readability, is that I can't see the macro properly expanded
when I do

make <filename>.s

For example, I get

#APP
# 164 "./arch/x86/include/asm/cpufeature.h" 1
        STATIC_CPU_HAS bitnum=$8 cap_byte="boot_cpu_data+35(%rip)" feature=123 t_yes=.L75 t_no=.L78 always=117  #, MEM[(const char *)&boot_cpu_data + 35B],,,,
# 0 "" 2
        .loc 11 164 2 view .LVU480
#NO_APP

but I'd like to see the actual asm as it is really helpful when hacking
on inline asm stuff. And I haven't found a way to make gcc expand asm
macros in .s output.

Now, assuming the gcc inline patch will be backported to gcc8, I think
we should be covered on all modern distros going forward. So I think we
should revert at least the more complex macros.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
Nadav Amit Dec. 17, 2018, 1 a.m. UTC | #5
> On Dec 16, 2018, at 2:00 AM, Borislav Petkov <bp@alien8.de> wrote:

> 

> On Sun, Dec 16, 2018 at 02:33:39AM +0000, Nadav Amit wrote:

>> In general, I think that from the start it was clear that the motivation for

>> the patch-set is not just performance and also better code. For example, I

>> see no reason to revert the PV-changes or the lock-prefix changes that

>> improved the code readability.

> 

> One thing that has caught my eye with the asm macros, which actually

> decreases readability, is that I can't see the macro properly expanded

> when I do

> 

> make <filename>.s

> 

> For example, I get

> 

> #APP

> # 164 "./arch/x86/include/asm/cpufeature.h" 1

>        STATIC_CPU_HAS bitnum=$8 cap_byte="boot_cpu_data+35(%rip)" feature=123 t_yes=.L75 t_no=.L78 always=117  #, MEM[(const char *)&boot_cpu_data + 35B],,,,

> # 0 "" 2

>        .loc 11 164 2 view .LVU480

> #NO_APP

> 

> but I'd like to see the actual asm as it is really helpful when hacking

> on inline asm stuff. And I haven't found a way to make gcc expand asm

> macros in .s output.


You’re right, although there were already 72 assembly macros defined in
x86 .h files, and some may find the unexpanded macro in the ‘.s’ file more
friendly (well, a small comment in the inline assembly could have resolved
this issue).

Anyhow, using GNU as listings should be a relatively reasonable workaround
for this limitation (unless you want to hack the code before assembly).

For example, using `as -alm arch/x86/kernel/macros.s arch/x86/kvm/vmx.s`

would give you:

 421                    # ./arch/x86/include/asm/cpufeature.h:164:      asm_volatile_goto("STATIC_CPU_HAS bitnum=%[bitnum] "
 422                            .file 8 "./arch/x86/include/asm/cpufeature.h"
 423                            .loc 8 164 0
 424                    #APP
 425                    # 164 "./arch/x86/include/asm/cpufeature.h" 1
 426                            STATIC_CPU_HAS bitnum=$2 cap_byte="boot_cpu_data+38(%rip)" feature=145 t_yes=.L17 t_no=.L18 always
 426                    > 1:
 426 00d8 E9000000      >  jmp 6f
 426      00
 426                    > 2:
 426                    >  .skip -(((5f-4f)- (2b-1b))>0)* ((5f-4f)- (2b-1b)),0x90
 426                    > 3:
 426                    >  .section .altinstructions,"a"
 426 000d 00000000      >  .long 1b - .
 426 0011 00000000      >  .long 4f - .
 426 0015 7500          >  .word 117

…

This can be incorporated into a makefile option, I suppose.


> Now, assuming the gcc inline patch will be backported to gcc8, I think

> we should be covered on all modern distros going forward. So I think we

> should revert at least the more complex macros.


I understand, and perhaps STATIC_CPU_HAS is not a good use-case (once
inlining is resolved in a different manner). I think that the main question
should be whether the whole infrastructure should be removed or whether we
should be selective.

In the case of exception tables, for instance, the result is much cleaner,
as it allows consolidating the C and assembly implementations. There is an
alternative solution of turning the assembly macros into C macros, which
would make the Make system hacks go away, but the code would not be as
nice.
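
As a rough illustration of that trade-off (a sketch, not a proposal): the
same exception-table entry expressed as a C string macro, so no separate
macros.s pass is needed. The macro name below is made up; the body mirrors
what the revert restores for _ASM_EXTABLE_HANDLE, with the handler
stringification simplified:

#define EXTABLE_ENTRY_AS_C(from, to, handler)			\
	" .pushsection \"__ex_table\",\"a\"\n"			\
	" .balign 4\n"						\
	" .long (" #from ") - .\n"				\
	" .long (" #to ") - .\n"				\
	" .long (" #handler ") - .\n"				\
	" .popsection\n"

/*
 * Used directly from C, e.g.:
 *	asm volatile("1: movl %1, %0\n2:\n"
 *		     EXTABLE_ENTRY_AS_C(1b, 2b, ex_handler_default)
 *		     : "=r" (val) : "m" (*addr));
 */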
Sedat Dilek Dec. 17, 2018, 9:16 a.m. UTC | #6
On Thu, Dec 13, 2018 at 10:19 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>

> Revert the following commits:

>

> - 5bdcd510c2ac9efaf55c4cbd8d46421d8e2320cd

>   ("x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs")

>

> - d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2

>   ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

>

> - 0474d5d9d2f7f3b11262f7bf87d0e7314ead9200.

>   ("x86/extable: Macrofy inline assembly code to work around GCC inlining bugs")

>

> - 494b5168f2de009eb80f198f668da374295098dd.

>   ("x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops")

>

> - f81f8ad56fd1c7b99b2ed1c314527f7d9ac447c6.

>   ("x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs")

>

> - 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099.

>   ("x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs")

>

> - 9e1725b410594911cc5981b6c7b4cea4ec054ca8.

>   ("x86/refcount: Work around GCC inlining bug")

>   (Conflicts: arch/x86/include/asm/refcount.h)

>

> - c06c4d8090513f2974dfdbed2ac98634357ac475.

>   ("x86/objtool: Use asm macros to work around GCC inlining bugs")

>

> - 77b0bf55bc675233d22cd5df97605d516d64525e.

>   ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

>

> A few days after those commits applied, discussion started to solve

> the issue more elegantly on the compiler side:

>

>   https://lkml.org/lkml/2018/10/7/92

>

> The "asm inline" was implemented by Segher Boessenkool, and now queued

> up for GCC 9. (People were positive even for back-porting it to older

> compilers).

>

> Since the in-kernel workarounds merged, some issues have been reported:

> breakage of building with distcc/icecc, breakage of distro packages for

> module building. (More fundamentally, we cannot build external modules

> after 'make clean')

>

> Patching around the build system would make the code even uglier.

>

> Given that this issue will be solved in a cleaner way sooner or later,

> let's revert the in-kernel workarounds, and wait for GCC 9.

>

> Reported-by: Logan Gunthorpe <logang@deltatee.com> # distcc

> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> # debian/rpm package


Hi,

I reported the issue with debian package breakage in [1].

I am not subscribed to any of the involved mailing lists and am not
following all the discussions.
I see the situation is not easy, as it involves linux-kbuild and x86 in
particular, and maybe other interests as well.
But I am interested in having a fix in v4.20 final and hope this all
still works with LLVM/Clang.

I can offer my help in testing - against Linux v4.20-rc7.
Not sure if all discussed material is in upstream or elsewhere.
What is your suggestion for me as a tester?

Will we have a solution in Linux v4.20 final?

Thanks.

With my best wishes,
- Sedat -

[1] https://marc.info/?t=154212770600037&r=1&w=2

> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

> Cc: Nadav Amit <namit@vmware.com>

> Cc: Segher Boessenkool <segher@kernel.crashing.org>

> ---

>

> Please consider this for v4.20 release.

> Currently, distro package build is broken.

>

>

>  Makefile                               |  9 +---

>  arch/x86/Makefile                      |  7 ---

>  arch/x86/entry/calling.h               |  2 +-

>  arch/x86/include/asm/alternative-asm.h | 20 +++----

>  arch/x86/include/asm/alternative.h     | 11 +++-

>  arch/x86/include/asm/asm.h             | 53 +++++++++++-------

>  arch/x86/include/asm/bug.h             | 98 +++++++++++++++-------------------

>  arch/x86/include/asm/cpufeature.h      | 82 ++++++++++++----------------

>  arch/x86/include/asm/jump_label.h      | 72 ++++++++++++++++++-------

>  arch/x86/include/asm/paravirt_types.h  | 56 +++++++++----------

>  arch/x86/include/asm/refcount.h        | 81 ++++++++++++----------------

>  arch/x86/kernel/macros.S               | 16 ------

>  include/asm-generic/bug.h              |  8 +--

>  include/linux/compiler.h               | 56 +++++--------------

>  scripts/Kbuild.include                 |  4 +-

>  scripts/mod/Makefile                   |  2 -

>  16 files changed, 262 insertions(+), 315 deletions(-)

>  delete mode 100644 arch/x86/kernel/macros.S

>

> diff --git a/Makefile b/Makefile

> index f2c3423..4cf4c5b 100644

> --- a/Makefile

> +++ b/Makefile

> @@ -1081,7 +1081,7 @@ scripts: scripts_basic scripts_dtc asm-generic gcc-plugins $(autoksyms_h)

>  # version.h and scripts_basic is processed / created.

>

>  # Listed in dependency order

> -PHONY += prepare archprepare macroprepare prepare0 prepare1 prepare2 prepare3

> +PHONY += prepare archprepare prepare0 prepare1 prepare2 prepare3

>

>  # prepare3 is used to check if we are building in a separate output directory,

>  # and if so do:

> @@ -1104,9 +1104,7 @@ prepare2: prepare3 outputmakefile asm-generic

>  prepare1: prepare2 $(version_h) $(autoksyms_h) include/generated/utsrelease.h

>         $(cmd_crmodverdir)

>

> -macroprepare: prepare1 archmacros

> -

> -archprepare: archheaders archscripts macroprepare scripts_basic

> +archprepare: archheaders archscripts prepare1 scripts_basic

>

>  prepare0: archprepare gcc-plugins

>         $(Q)$(MAKE) $(build)=.

> @@ -1174,9 +1172,6 @@ archheaders:

>  PHONY += archscripts

>  archscripts:

>

> -PHONY += archmacros

> -archmacros:

> -

>  PHONY += __headers

>  __headers: $(version_h) scripts_basic uapi-asm-generic archheaders archscripts

>         $(Q)$(MAKE) $(build)=scripts build_unifdef

> diff --git a/arch/x86/Makefile b/arch/x86/Makefile

> index 75ef499..85a66c4 100644

> --- a/arch/x86/Makefile

> +++ b/arch/x86/Makefile

> @@ -232,13 +232,6 @@ archscripts: scripts_basic

>  archheaders:

>         $(Q)$(MAKE) $(build)=arch/x86/entry/syscalls all

>

> -archmacros:

> -       $(Q)$(MAKE) $(build)=arch/x86/kernel arch/x86/kernel/macros.s

> -

> -ASM_MACRO_FLAGS = -Wa,arch/x86/kernel/macros.s

> -export ASM_MACRO_FLAGS

> -KBUILD_CFLAGS += $(ASM_MACRO_FLAGS)

> -

>  ###

>  # Kernel objects

>

> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h

> index 25e5a6b..20d0885 100644

> --- a/arch/x86/entry/calling.h

> +++ b/arch/x86/entry/calling.h

> @@ -352,7 +352,7 @@ For 32-bit we have the following conventions - kernel is built with

>  .macro CALL_enter_from_user_mode

>  #ifdef CONFIG_CONTEXT_TRACKING

>  #ifdef HAVE_JUMP_LABEL

> -       STATIC_BRANCH_JMP l_yes=.Lafter_call_\@, key=context_tracking_enabled, branch=1

> +       STATIC_JUMP_IF_FALSE .Lafter_call_\@, context_tracking_enabled, def=0

>  #endif

>         call enter_from_user_mode

>  .Lafter_call_\@:

> diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h

> index 8e4ea39..31b627b 100644

> --- a/arch/x86/include/asm/alternative-asm.h

> +++ b/arch/x86/include/asm/alternative-asm.h

> @@ -7,24 +7,16 @@

>  #include <asm/asm.h>

>

>  #ifdef CONFIG_SMP

> -.macro LOCK_PREFIX_HERE

> +       .macro LOCK_PREFIX

> +672:   lock

>         .pushsection .smp_locks,"a"

>         .balign 4

> -       .long 671f - .          # offset

> +       .long 672b - .

>         .popsection

> -671:

> -.endm

> -

> -.macro LOCK_PREFIX insn:vararg

> -       LOCK_PREFIX_HERE

> -       lock \insn

> -.endm

> +       .endm

>  #else

> -.macro LOCK_PREFIX_HERE

> -.endm

> -

> -.macro LOCK_PREFIX insn:vararg

> -.endm

> +       .macro LOCK_PREFIX

> +       .endm

>  #endif

>

>  /*

> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h

> index d7faa16..4cd6a3b 100644

> --- a/arch/x86/include/asm/alternative.h

> +++ b/arch/x86/include/asm/alternative.h

> @@ -31,8 +31,15 @@

>   */

>

>  #ifdef CONFIG_SMP

> -#define LOCK_PREFIX_HERE "LOCK_PREFIX_HERE\n\t"

> -#define LOCK_PREFIX "LOCK_PREFIX "

> +#define LOCK_PREFIX_HERE \

> +               ".pushsection .smp_locks,\"a\"\n"       \

> +               ".balign 4\n"                           \

> +               ".long 671f - .\n" /* offset */         \

> +               ".popsection\n"                         \

> +               "671:"

> +

> +#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "

> +

>  #else /* ! CONFIG_SMP */

>  #define LOCK_PREFIX_HERE ""

>  #define LOCK_PREFIX ""

> diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h

> index 21b0867..6467757b 100644

> --- a/arch/x86/include/asm/asm.h

> +++ b/arch/x86/include/asm/asm.h

> @@ -120,25 +120,12 @@

>  /* Exception table entry */

>  #ifdef __ASSEMBLY__

>  # define _ASM_EXTABLE_HANDLE(from, to, handler)                        \

> -       ASM_EXTABLE_HANDLE from to handler

> -

> -.macro ASM_EXTABLE_HANDLE from:req to:req handler:req

> -       .pushsection "__ex_table","a"

> -       .balign 4

> -       .long (\from) - .

> -       .long (\to) - .

> -       .long (\handler) - .

> +       .pushsection "__ex_table","a" ;                         \

> +       .balign 4 ;                                             \

> +       .long (from) - . ;                                      \

> +       .long (to) - . ;                                        \

> +       .long (handler) - . ;                                   \

>         .popsection

> -.endm

> -#else /* __ASSEMBLY__ */

> -

> -# define _ASM_EXTABLE_HANDLE(from, to, handler)                        \

> -       "ASM_EXTABLE_HANDLE from=" #from " to=" #to             \

> -       " handler=\"" #handler "\"\n\t"

> -

> -/* For C file, we already have NOKPROBE_SYMBOL macro */

> -

> -#endif /* __ASSEMBLY__ */

>

>  # define _ASM_EXTABLE(from, to)                                        \

>         _ASM_EXTABLE_HANDLE(from, to, ex_handler_default)

> @@ -161,7 +148,6 @@

>         _ASM_PTR (entry);                                       \

>         .popsection

>

> -#ifdef __ASSEMBLY__

>  .macro ALIGN_DESTINATION

>         /* check for bad alignment of destination */

>         movl %edi,%ecx

> @@ -185,7 +171,34 @@

>         _ASM_EXTABLE_UA(100b, 103b)

>         _ASM_EXTABLE_UA(101b, 103b)

>         .endm

> -#endif /* __ASSEMBLY__ */

> +

> +#else

> +# define _EXPAND_EXTABLE_HANDLE(x) #x

> +# define _ASM_EXTABLE_HANDLE(from, to, handler)                        \

> +       " .pushsection \"__ex_table\",\"a\"\n"                  \

> +       " .balign 4\n"                                          \

> +       " .long (" #from ") - .\n"                              \

> +       " .long (" #to ") - .\n"                                \

> +       " .long (" _EXPAND_EXTABLE_HANDLE(handler) ") - .\n"    \

> +       " .popsection\n"

> +

> +# define _ASM_EXTABLE(from, to)                                        \

> +       _ASM_EXTABLE_HANDLE(from, to, ex_handler_default)

> +

> +# define _ASM_EXTABLE_UA(from, to)                             \

> +       _ASM_EXTABLE_HANDLE(from, to, ex_handler_uaccess)

> +

> +# define _ASM_EXTABLE_FAULT(from, to)                          \

> +       _ASM_EXTABLE_HANDLE(from, to, ex_handler_fault)

> +

> +# define _ASM_EXTABLE_EX(from, to)                             \

> +       _ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)

> +

> +# define _ASM_EXTABLE_REFCOUNT(from, to)                       \

> +       _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)

> +

> +/* For C file, we already have NOKPROBE_SYMBOL macro */

> +#endif

>

>  #ifndef __ASSEMBLY__

>  /*

> diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h

> index 5090035..6804d66 100644

> --- a/arch/x86/include/asm/bug.h

> +++ b/arch/x86/include/asm/bug.h

> @@ -4,8 +4,6 @@

>

>  #include <linux/stringify.h>

>

> -#ifndef __ASSEMBLY__

> -

>  /*

>   * Despite that some emulators terminate on UD2, we use it for WARN().

>   *

> @@ -22,15 +20,53 @@

>

>  #define LEN_UD2                2

>

> +#ifdef CONFIG_GENERIC_BUG

> +

> +#ifdef CONFIG_X86_32

> +# define __BUG_REL(val)        ".long " __stringify(val)

> +#else

> +# define __BUG_REL(val)        ".long " __stringify(val) " - 2b"

> +#endif

> +

> +#ifdef CONFIG_DEBUG_BUGVERBOSE

> +

> +#define _BUG_FLAGS(ins, flags)                                         \

> +do {                                                                   \

> +       asm volatile("1:\t" ins "\n"                                    \

> +                    ".pushsection __bug_table,\"aw\"\n"                \

> +                    "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n"   \

> +                    "\t"  __BUG_REL(%c0) "\t# bug_entry::file\n"       \

> +                    "\t.word %c1"        "\t# bug_entry::line\n"       \

> +                    "\t.word %c2"        "\t# bug_entry::flags\n"      \

> +                    "\t.org 2b+%c3\n"                                  \

> +                    ".popsection"                                      \

> +                    : : "i" (__FILE__), "i" (__LINE__),                \

> +                        "i" (flags),                                   \

> +                        "i" (sizeof(struct bug_entry)));               \

> +} while (0)

> +

> +#else /* !CONFIG_DEBUG_BUGVERBOSE */

> +

>  #define _BUG_FLAGS(ins, flags)                                         \

>  do {                                                                   \

> -       asm volatile("ASM_BUG ins=\"" ins "\" file=%c0 line=%c1 "       \

> -                    "flags=%c2 size=%c3"                               \

> -                    : : "i" (__FILE__), "i" (__LINE__),                \

> -                        "i" (flags),                                   \

> +       asm volatile("1:\t" ins "\n"                                    \

> +                    ".pushsection __bug_table,\"aw\"\n"                \

> +                    "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n"   \

> +                    "\t.word %c0"        "\t# bug_entry::flags\n"      \

> +                    "\t.org 2b+%c1\n"                                  \

> +                    ".popsection"                                      \

> +                    : : "i" (flags),                                   \

>                          "i" (sizeof(struct bug_entry)));               \

>  } while (0)

>

> +#endif /* CONFIG_DEBUG_BUGVERBOSE */

> +

> +#else

> +

> +#define _BUG_FLAGS(ins, flags)  asm volatile(ins)

> +

> +#endif /* CONFIG_GENERIC_BUG */

> +

>  #define HAVE_ARCH_BUG

>  #define BUG()                                                  \

>  do {                                                           \

> @@ -46,54 +82,4 @@ do {                                                         \

>

>  #include <asm-generic/bug.h>

>

> -#else /* __ASSEMBLY__ */

> -

> -#ifdef CONFIG_GENERIC_BUG

> -

> -#ifdef CONFIG_X86_32

> -.macro __BUG_REL val:req

> -       .long \val

> -.endm

> -#else

> -.macro __BUG_REL val:req

> -       .long \val - 2b

> -.endm

> -#endif

> -

> -#ifdef CONFIG_DEBUG_BUGVERBOSE

> -

> -.macro ASM_BUG ins:req file:req line:req flags:req size:req

> -1:     \ins

> -       .pushsection __bug_table,"aw"

> -2:     __BUG_REL val=1b        # bug_entry::bug_addr

> -       __BUG_REL val=\file     # bug_entry::file

> -       .word \line             # bug_entry::line

> -       .word \flags            # bug_entry::flags

> -       .org 2b+\size

> -       .popsection

> -.endm

> -

> -#else /* !CONFIG_DEBUG_BUGVERBOSE */

> -

> -.macro ASM_BUG ins:req file:req line:req flags:req size:req

> -1:     \ins

> -       .pushsection __bug_table,"aw"

> -2:     __BUG_REL val=1b        # bug_entry::bug_addr

> -       .word \flags            # bug_entry::flags

> -       .org 2b+\size

> -       .popsection

> -.endm

> -

> -#endif /* CONFIG_DEBUG_BUGVERBOSE */

> -

> -#else /* CONFIG_GENERIC_BUG */

> -

> -.macro ASM_BUG ins:req file:req line:req flags:req size:req

> -       \ins

> -.endm

> -

> -#endif /* CONFIG_GENERIC_BUG */

> -

> -#endif /* __ASSEMBLY__ */

> -

>  #endif /* _ASM_X86_BUG_H */

> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h

> index 7d44272..aced6c9 100644

> --- a/arch/x86/include/asm/cpufeature.h

> +++ b/arch/x86/include/asm/cpufeature.h

> @@ -2,10 +2,10 @@

>  #ifndef _ASM_X86_CPUFEATURE_H

>  #define _ASM_X86_CPUFEATURE_H

>

> -#ifdef __KERNEL__

> -#ifndef __ASSEMBLY__

> -

>  #include <asm/processor.h>

> +

> +#if defined(__KERNEL__) && !defined(__ASSEMBLY__)

> +

>  #include <asm/asm.h>

>  #include <linux/bitops.h>

>

> @@ -161,10 +161,37 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);

>   */

>  static __always_inline __pure bool _static_cpu_has(u16 bit)

>  {

> -       asm_volatile_goto("STATIC_CPU_HAS bitnum=%[bitnum] "

> -                         "cap_byte=\"%[cap_byte]\" "

> -                         "feature=%P[feature] t_yes=%l[t_yes] "

> -                         "t_no=%l[t_no] always=%P[always]"

> +       asm_volatile_goto("1: jmp 6f\n"

> +                "2:\n"

> +                ".skip -(((5f-4f) - (2b-1b)) > 0) * "

> +                        "((5f-4f) - (2b-1b)),0x90\n"

> +                "3:\n"

> +                ".section .altinstructions,\"a\"\n"

> +                " .long 1b - .\n"              /* src offset */

> +                " .long 4f - .\n"              /* repl offset */

> +                " .word %P[always]\n"          /* always replace */

> +                " .byte 3b - 1b\n"             /* src len */

> +                " .byte 5f - 4f\n"             /* repl len */

> +                " .byte 3b - 2b\n"             /* pad len */

> +                ".previous\n"

> +                ".section .altinstr_replacement,\"ax\"\n"

> +                "4: jmp %l[t_no]\n"

> +                "5:\n"

> +                ".previous\n"

> +                ".section .altinstructions,\"a\"\n"

> +                " .long 1b - .\n"              /* src offset */

> +                " .long 0\n"                   /* no replacement */

> +                " .word %P[feature]\n"         /* feature bit */

> +                " .byte 3b - 1b\n"             /* src len */

> +                " .byte 0\n"                   /* repl len */

> +                " .byte 0\n"                   /* pad len */

> +                ".previous\n"

> +                ".section .altinstr_aux,\"ax\"\n"

> +                "6:\n"

> +                " testb %[bitnum],%[cap_byte]\n"

> +                " jnz %l[t_yes]\n"

> +                " jmp %l[t_no]\n"

> +                ".previous\n"

>                  : : [feature]  "i" (bit),

>                      [always]   "i" (X86_FEATURE_ALWAYS),

>                      [bitnum]   "i" (1 << (bit & 7)),

> @@ -199,44 +226,5 @@ static __always_inline __pure bool _static_cpu_has(u16 bit)

>  #define CPU_FEATURE_TYPEVAL            boot_cpu_data.x86_vendor, boot_cpu_data.x86, \

>                                         boot_cpu_data.x86_model

>

> -#else /* __ASSEMBLY__ */

> -

> -.macro STATIC_CPU_HAS bitnum:req cap_byte:req feature:req t_yes:req t_no:req always:req

> -1:

> -       jmp 6f

> -2:

> -       .skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90

> -3:

> -       .section .altinstructions,"a"

> -       .long 1b - .            /* src offset */

> -       .long 4f - .            /* repl offset */

> -       .word \always           /* always replace */

> -       .byte 3b - 1b           /* src len */

> -       .byte 5f - 4f           /* repl len */

> -       .byte 3b - 2b           /* pad len */

> -       .previous

> -       .section .altinstr_replacement,"ax"

> -4:

> -       jmp \t_no

> -5:

> -       .previous

> -       .section .altinstructions,"a"

> -       .long 1b - .            /* src offset */

> -       .long 0                 /* no replacement */

> -       .word \feature          /* feature bit */

> -       .byte 3b - 1b           /* src len */

> -       .byte 0                 /* repl len */

> -       .byte 0                 /* pad len */

> -       .previous

> -       .section .altinstr_aux,"ax"

> -6:

> -       testb \bitnum,\cap_byte

> -       jnz \t_yes

> -       jmp \t_no

> -       .previous

> -.endm

> -

> -#endif /* __ASSEMBLY__ */

> -

> -#endif /* __KERNEL__ */

> +#endif /* defined(__KERNEL__) && !defined(__ASSEMBLY__) */

>  #endif /* _ASM_X86_CPUFEATURE_H */

> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h

> index a5fb34f..21efc9d 100644

> --- a/arch/x86/include/asm/jump_label.h

> +++ b/arch/x86/include/asm/jump_label.h

> @@ -2,6 +2,19 @@

>  #ifndef _ASM_X86_JUMP_LABEL_H

>  #define _ASM_X86_JUMP_LABEL_H

>

> +#ifndef HAVE_JUMP_LABEL

> +/*

> + * For better or for worse, if jump labels (the gcc extension) are missing,

> + * then the entire static branch patching infrastructure is compiled out.

> + * If that happens, the code in here will malfunction.  Raise a compiler

> + * error instead.

> + *

> + * In theory, jump labels and the static branch patching infrastructure

> + * could be decoupled to fix this.

> + */

> +#error asm/jump_label.h included on a non-jump-label kernel

> +#endif

> +

>  #define JUMP_LABEL_NOP_SIZE 5

>

>  #ifdef CONFIG_X86_64

> @@ -20,9 +33,15 @@

>

>  static __always_inline bool arch_static_branch(struct static_key *key, bool branch)

>  {

> -       asm_volatile_goto("STATIC_BRANCH_NOP l_yes=\"%l[l_yes]\" key=\"%c0\" "

> -                         "branch=\"%c1\""

> -                       : :  "i" (key), "i" (branch) : : l_yes);

> +       asm_volatile_goto("1:"

> +               ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"

> +               ".pushsection __jump_table,  \"aw\" \n\t"

> +               _ASM_ALIGN "\n\t"

> +               ".long 1b - ., %l[l_yes] - . \n\t"

> +               _ASM_PTR "%c0 + %c1 - .\n\t"

> +               ".popsection \n\t"

> +               : :  "i" (key), "i" (branch) : : l_yes);

> +

>         return false;

>  l_yes:

>         return true;

> @@ -30,8 +49,14 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran

>

>  static __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch)

>  {

> -       asm_volatile_goto("STATIC_BRANCH_JMP l_yes=\"%l[l_yes]\" key=\"%c0\" "

> -                         "branch=\"%c1\""

> +       asm_volatile_goto("1:"

> +               ".byte 0xe9\n\t .long %l[l_yes] - 2f\n\t"

> +               "2:\n\t"

> +               ".pushsection __jump_table,  \"aw\" \n\t"

> +               _ASM_ALIGN "\n\t"

> +               ".long 1b - ., %l[l_yes] - . \n\t"

> +               _ASM_PTR "%c0 + %c1 - .\n\t"

> +               ".popsection \n\t"

>                 : :  "i" (key), "i" (branch) : : l_yes);

>

>         return false;

> @@ -41,26 +66,37 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool

>

>  #else  /* __ASSEMBLY__ */

>

> -.macro STATIC_BRANCH_NOP l_yes:req key:req branch:req

> -.Lstatic_branch_nop_\@:

> -       .byte STATIC_KEY_INIT_NOP

> -.Lstatic_branch_no_after_\@:

> +.macro STATIC_JUMP_IF_TRUE target, key, def

> +.Lstatic_jump_\@:

> +       .if \def

> +       /* Equivalent to "jmp.d32 \target" */

> +       .byte           0xe9

> +       .long           \target - .Lstatic_jump_after_\@

> +.Lstatic_jump_after_\@:

> +       .else

> +       .byte           STATIC_KEY_INIT_NOP

> +       .endif

>         .pushsection __jump_table, "aw"

>         _ASM_ALIGN

> -       .long           .Lstatic_branch_nop_\@ - ., \l_yes - .

> -       _ASM_PTR        \key + \branch - .

> +       .long           .Lstatic_jump_\@ - ., \target - .

> +       _ASM_PTR        \key - .

>         .popsection

>  .endm

>

> -.macro STATIC_BRANCH_JMP l_yes:req key:req branch:req

> -.Lstatic_branch_jmp_\@:

> -       .byte 0xe9

> -       .long \l_yes - .Lstatic_branch_jmp_after_\@

> -.Lstatic_branch_jmp_after_\@:

> +.macro STATIC_JUMP_IF_FALSE target, key, def

> +.Lstatic_jump_\@:

> +       .if \def

> +       .byte           STATIC_KEY_INIT_NOP

> +       .else

> +       /* Equivalent to "jmp.d32 \target" */

> +       .byte           0xe9

> +       .long           \target - .Lstatic_jump_after_\@

> +.Lstatic_jump_after_\@:

> +       .endif

>         .pushsection __jump_table, "aw"

>         _ASM_ALIGN

> -       .long           .Lstatic_branch_jmp_\@ - ., \l_yes - .

> -       _ASM_PTR        \key + \branch - .

> +       .long           .Lstatic_jump_\@ - ., \target - .

> +       _ASM_PTR        \key + 1 - .

>         .popsection

>  .endm

>

> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h

> index 26942ad..488c596 100644

> --- a/arch/x86/include/asm/paravirt_types.h

> +++ b/arch/x86/include/asm/paravirt_types.h

> @@ -348,11 +348,23 @@ extern struct paravirt_patch_template pv_ops;

>  #define paravirt_clobber(clobber)              \

>         [paravirt_clobber] "i" (clobber)

>

> +/*

> + * Generate some code, and mark it as patchable by the

> + * apply_paravirt() alternate instruction patcher.

> + */

> +#define _paravirt_alt(insn_string, type, clobber)      \

> +       "771:\n\t" insn_string "\n" "772:\n"            \

> +       ".pushsection .parainstructions,\"a\"\n"        \

> +       _ASM_ALIGN "\n"                                 \

> +       _ASM_PTR " 771b\n"                              \

> +       "  .byte " type "\n"                            \

> +       "  .byte 772b-771b\n"                           \

> +       "  .short " clobber "\n"                        \

> +       ".popsection\n"

> +

>  /* Generate patchable code, with the default asm parameters. */

> -#define paravirt_call                                                  \

> -       "PARAVIRT_CALL type=\"%c[paravirt_typenum]\""                   \

> -       " clobber=\"%c[paravirt_clobber]\""                             \

> -       " pv_opptr=\"%c[paravirt_opptr]\";"

> +#define paravirt_alt(insn_string)                                      \

> +       _paravirt_alt(insn_string, "%c[paravirt_typenum]", "%c[paravirt_clobber]")

>

>  /* Simple instruction patching code. */

>  #define NATIVE_LABEL(a,x,b) "\n\t.globl " a #x "_" #b "\n" a #x "_" #b ":\n\t"

> @@ -373,6 +385,16 @@ unsigned native_patch(u8 type, void *ibuf, unsigned long addr, unsigned len);

>  int paravirt_disable_iospace(void);

>

>  /*

> + * This generates an indirect call based on the operation type number.

> + * The type number, computed in PARAVIRT_PATCH, is derived from the

> + * offset into the paravirt_patch_template structure, and can therefore be

> + * freely converted back into a structure offset.

> + */

> +#define PARAVIRT_CALL                                  \

> +       ANNOTATE_RETPOLINE_SAFE                         \

> +       "call *%c[paravirt_opptr];"

> +

> +/*

>   * These macros are intended to wrap calls through one of the paravirt

>   * ops structs, so that they can be later identified and patched at

>   * runtime.

> @@ -509,7 +531,7 @@ int paravirt_disable_iospace(void);

>                 /* since this condition will never hold */              \

>                 if (sizeof(rettype) > sizeof(unsigned long)) {          \

>                         asm volatile(pre                                \

> -                                    paravirt_call                      \

> +                                    paravirt_alt(PARAVIRT_CALL)        \

>                                      post                               \

>                                      : call_clbr, ASM_CALL_CONSTRAINT   \

>                                      : paravirt_type(op),               \

> @@ -519,7 +541,7 @@ int paravirt_disable_iospace(void);

>                         __ret = (rettype)((((u64)__edx) << 32) | __eax); \

>                 } else {                                                \

>                         asm volatile(pre                                \

> -                                    paravirt_call                      \

> +                                    paravirt_alt(PARAVIRT_CALL)        \

>                                      post                               \

>                                      : call_clbr, ASM_CALL_CONSTRAINT   \

>                                      : paravirt_type(op),               \

> @@ -546,7 +568,7 @@ int paravirt_disable_iospace(void);

>                 PVOP_VCALL_ARGS;                                        \

>                 PVOP_TEST_NULL(op);                                     \

>                 asm volatile(pre                                        \

> -                            paravirt_call                              \

> +                            paravirt_alt(PARAVIRT_CALL)                \

>                              post                                       \

>                              : call_clbr, ASM_CALL_CONSTRAINT           \

>                              : paravirt_type(op),                       \

> @@ -664,26 +686,6 @@ struct paravirt_patch_site {

>  extern struct paravirt_patch_site __parainstructions[],

>         __parainstructions_end[];

>

> -#else  /* __ASSEMBLY__ */

> -

> -/*

> - * This generates an indirect call based on the operation type number.

> - * The type number, computed in PARAVIRT_PATCH, is derived from the

> - * offset into the paravirt_patch_template structure, and can therefore be

> - * freely converted back into a structure offset.

> - */

> -.macro PARAVIRT_CALL type:req clobber:req pv_opptr:req

> -771:   ANNOTATE_RETPOLINE_SAFE

> -       call *\pv_opptr

> -772:   .pushsection .parainstructions,"a"

> -       _ASM_ALIGN

> -       _ASM_PTR 771b

> -       .byte \type

> -       .byte 772b-771b

> -       .short \clobber

> -       .popsection

> -.endm

> -

>  #endif /* __ASSEMBLY__ */

>

>  #endif /* _ASM_X86_PARAVIRT_TYPES_H */

> diff --git a/arch/x86/include/asm/refcount.h b/arch/x86/include/asm/refcount.h

> index a8b5e1e..dbaed55 100644

> --- a/arch/x86/include/asm/refcount.h

> +++ b/arch/x86/include/asm/refcount.h

> @@ -4,41 +4,6 @@

>   * x86-specific implementation of refcount_t. Based on PAX_REFCOUNT from

>   * PaX/grsecurity.

>   */

> -

> -#ifdef __ASSEMBLY__

> -

> -#include <asm/asm.h>

> -#include <asm/bug.h>

> -

> -.macro REFCOUNT_EXCEPTION counter:req

> -       .pushsection .text..refcount

> -111:   lea \counter, %_ASM_CX

> -112:   ud2

> -       ASM_UNREACHABLE

> -       .popsection

> -113:   _ASM_EXTABLE_REFCOUNT(112b, 113b)

> -.endm

> -

> -/* Trigger refcount exception if refcount result is negative. */

> -.macro REFCOUNT_CHECK_LT_ZERO counter:req

> -       js 111f

> -       REFCOUNT_EXCEPTION counter="\counter"

> -.endm

> -

> -/* Trigger refcount exception if refcount result is zero or negative. */

> -.macro REFCOUNT_CHECK_LE_ZERO counter:req

> -       jz 111f

> -       REFCOUNT_CHECK_LT_ZERO counter="\counter"

> -.endm

> -

> -/* Trigger refcount exception unconditionally. */

> -.macro REFCOUNT_ERROR counter:req

> -       jmp 111f

> -       REFCOUNT_EXCEPTION counter="\counter"

> -.endm

> -

> -#else /* __ASSEMBLY__ */

> -

>  #include <linux/refcount.h>

>  #include <asm/bug.h>

>

> @@ -50,12 +15,35 @@

>   * central refcount exception. The fixup address for the exception points

>   * back to the regular execution flow in .text.

>   */

> +#define _REFCOUNT_EXCEPTION                            \

> +       ".pushsection .text..refcount\n"                \

> +       "111:\tlea %[var], %%" _ASM_CX "\n"             \

> +       "112:\t" ASM_UD2 "\n"                           \

> +       ASM_UNREACHABLE                                 \

> +       ".popsection\n"                                 \

> +       "113:\n"                                        \

> +       _ASM_EXTABLE_REFCOUNT(112b, 113b)

> +

> +/* Trigger refcount exception if refcount result is negative. */

> +#define REFCOUNT_CHECK_LT_ZERO                         \

> +       "js 111f\n\t"                                   \

> +       _REFCOUNT_EXCEPTION

> +

> +/* Trigger refcount exception if refcount result is zero or negative. */

> +#define REFCOUNT_CHECK_LE_ZERO                         \

> +       "jz 111f\n\t"                                   \

> +       REFCOUNT_CHECK_LT_ZERO

> +

> +/* Trigger refcount exception unconditionally. */

> +#define REFCOUNT_ERROR                                 \

> +       "jmp 111f\n\t"                                  \

> +       _REFCOUNT_EXCEPTION

>

>  static __always_inline void refcount_add(unsigned int i, refcount_t *r)

>  {

>         asm volatile(LOCK_PREFIX "addl %1,%0\n\t"

> -               "REFCOUNT_CHECK_LT_ZERO counter=\"%[counter]\""

> -               : [counter] "+m" (r->refs.counter)

> +               REFCOUNT_CHECK_LT_ZERO

> +               : [var] "+m" (r->refs.counter)

>                 : "ir" (i)

>                 : "cc", "cx");

>  }

> @@ -63,32 +51,31 @@ static __always_inline void refcount_add(unsigned int i, refcount_t *r)

>  static __always_inline void refcount_inc(refcount_t *r)

>  {

>         asm volatile(LOCK_PREFIX "incl %0\n\t"

> -               "REFCOUNT_CHECK_LT_ZERO counter=\"%[counter]\""

> -               : [counter] "+m" (r->refs.counter)

> +               REFCOUNT_CHECK_LT_ZERO

> +               : [var] "+m" (r->refs.counter)

>                 : : "cc", "cx");

>  }

>

>  static __always_inline void refcount_dec(refcount_t *r)

>  {

>         asm volatile(LOCK_PREFIX "decl %0\n\t"

> -               "REFCOUNT_CHECK_LE_ZERO counter=\"%[counter]\""

> -               : [counter] "+m" (r->refs.counter)

> +               REFCOUNT_CHECK_LE_ZERO

> +               : [var] "+m" (r->refs.counter)

>                 : : "cc", "cx");

>  }

>

>  static __always_inline __must_check

>  bool refcount_sub_and_test(unsigned int i, refcount_t *r)

>  {

> -

>         return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",

> -                                        "REFCOUNT_CHECK_LT_ZERO counter=\"%[var]\"",

> +                                        REFCOUNT_CHECK_LT_ZERO,

>                                          r->refs.counter, e, "er", i, "cx");

>  }

>

>  static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)

>  {

>         return GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl",

> -                                       "REFCOUNT_CHECK_LT_ZERO counter=\"%[var]\"",

> +                                       REFCOUNT_CHECK_LT_ZERO,

>                                         r->refs.counter, e, "cx");

>  }

>

> @@ -106,8 +93,8 @@ bool refcount_add_not_zero(unsigned int i, refcount_t *r)

>

>                 /* Did we try to increment from/to an undesirable state? */

>                 if (unlikely(c < 0 || c == INT_MAX || result < c)) {

> -                       asm volatile("REFCOUNT_ERROR counter=\"%[counter]\""

> -                                    : : [counter] "m" (r->refs.counter)

> +                       asm volatile(REFCOUNT_ERROR

> +                                    : : [var] "m" (r->refs.counter)

>                                      : "cc", "cx");

>                         break;

>                 }

> @@ -122,6 +109,4 @@ static __always_inline __must_check bool refcount_inc_not_zero(refcount_t *r)

>         return refcount_add_not_zero(1, r);

>  }

>

> -#endif /* __ASSEMBLY__ */

> -

>  #endif

> diff --git a/arch/x86/kernel/macros.S b/arch/x86/kernel/macros.S

> deleted file mode 100644

> index 161c950..0000000

> --- a/arch/x86/kernel/macros.S

> +++ /dev/null

> @@ -1,16 +0,0 @@

> -/* SPDX-License-Identifier: GPL-2.0 */

> -

> -/*

> - * This file includes headers whose assembly part includes macros which are

> - * commonly used. The macros are precompiled into assmebly file which is later

> - * assembled together with each compiled file.

> - */

> -

> -#include <linux/compiler.h>

> -#include <asm/refcount.h>

> -#include <asm/alternative-asm.h>

> -#include <asm/bug.h>

> -#include <asm/paravirt.h>

> -#include <asm/asm.h>

> -#include <asm/cpufeature.h>

> -#include <asm/jump_label.h>

> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h

> index cdafa5e..20561a6 100644

> --- a/include/asm-generic/bug.h

> +++ b/include/asm-generic/bug.h

> @@ -17,8 +17,10 @@

>  #ifndef __ASSEMBLY__

>  #include <linux/kernel.h>

>

> -struct bug_entry {

> +#ifdef CONFIG_BUG

> +

>  #ifdef CONFIG_GENERIC_BUG

> +struct bug_entry {

>  #ifndef CONFIG_GENERIC_BUG_RELATIVE_POINTERS

>         unsigned long   bug_addr;

>  #else

> @@ -33,10 +35,8 @@ struct bug_entry {

>         unsigned short  line;

>  #endif

>         unsigned short  flags;

> -#endif /* CONFIG_GENERIC_BUG */

>  };

> -

> -#ifdef CONFIG_BUG

> +#endif /* CONFIG_GENERIC_BUG */

>

>  /*

>   * Don't use BUG() or BUG_ON() unless there's really no way out; one

> diff --git a/include/linux/compiler.h b/include/linux/compiler.h

> index 06396c1..fc5004a 100644

> --- a/include/linux/compiler.h

> +++ b/include/linux/compiler.h

> @@ -99,13 +99,22 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,

>   * unique, to convince GCC not to merge duplicate inline asm statements.

>   */

>  #define annotate_reachable() ({                                                \

> -       asm volatile("ANNOTATE_REACHABLE counter=%c0"                   \

> -                    : : "i" (__COUNTER__));                            \

> +       asm volatile("%c0:\n\t"                                         \

> +                    ".pushsection .discard.reachable\n\t"              \

> +                    ".long %c0b - .\n\t"                               \

> +                    ".popsection\n\t" : : "i" (__COUNTER__));          \

>  })

>  #define annotate_unreachable() ({                                      \

> -       asm volatile("ANNOTATE_UNREACHABLE counter=%c0"                 \

> -                    : : "i" (__COUNTER__));                            \

> +       asm volatile("%c0:\n\t"                                         \

> +                    ".pushsection .discard.unreachable\n\t"            \

> +                    ".long %c0b - .\n\t"                               \

> +                    ".popsection\n\t" : : "i" (__COUNTER__));          \

>  })

> +#define ASM_UNREACHABLE                                                        \

> +       "999:\n\t"                                                      \

> +       ".pushsection .discard.unreachable\n\t"                         \

> +       ".long 999b - .\n\t"                                            \

> +       ".popsection\n\t"

>  #else

>  #define annotate_reachable()

>  #define annotate_unreachable()

> @@ -293,45 +302,6 @@ static inline void *offset_to_ptr(const int *off)

>         return (void *)((unsigned long)off + *off);

>  }

>

> -#else /* __ASSEMBLY__ */

> -

> -#ifdef __KERNEL__

> -#ifndef LINKER_SCRIPT

> -

> -#ifdef CONFIG_STACK_VALIDATION

> -.macro ANNOTATE_UNREACHABLE counter:req

> -\counter:

> -       .pushsection .discard.unreachable

> -       .long \counter\()b -.

> -       .popsection

> -.endm

> -

> -.macro ANNOTATE_REACHABLE counter:req

> -\counter:

> -       .pushsection .discard.reachable

> -       .long \counter\()b -.

> -       .popsection

> -.endm

> -

> -.macro ASM_UNREACHABLE

> -999:

> -       .pushsection .discard.unreachable

> -       .long 999b - .

> -       .popsection

> -.endm

> -#else /* CONFIG_STACK_VALIDATION */

> -.macro ANNOTATE_UNREACHABLE counter:req

> -.endm

> -

> -.macro ANNOTATE_REACHABLE counter:req

> -.endm

> -

> -.macro ASM_UNREACHABLE

> -.endm

> -#endif /* CONFIG_STACK_VALIDATION */

> -

> -#endif /* LINKER_SCRIPT */

> -#endif /* __KERNEL__ */

>  #endif /* __ASSEMBLY__ */

>

>  /* Compile time object size, -1 for unknown */

> diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include

> index bb01555..3d09844 100644

> --- a/scripts/Kbuild.include

> +++ b/scripts/Kbuild.include

> @@ -115,9 +115,7 @@ __cc-option = $(call try-run,\

>

>  # Do not attempt to build with gcc plugins during cc-option tests.

>  # (And this uses delayed resolution so the flags will be up to date.)

> -# In addition, do not include the asm macros which are built later.

> -CC_OPTION_FILTERED = $(GCC_PLUGINS_CFLAGS) $(ASM_MACRO_FLAGS)

> -CC_OPTION_CFLAGS = $(filter-out $(CC_OPTION_FILTERED),$(KBUILD_CFLAGS))

> +CC_OPTION_CFLAGS = $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS))

>

>  # cc-option

>  # Usage: cflags-y += $(call cc-option,-march=winchip-c6,-march=i586)

> diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile

> index a5b4af4..42c5d50 100644

> --- a/scripts/mod/Makefile

> +++ b/scripts/mod/Makefile

> @@ -4,8 +4,6 @@ OBJECT_FILES_NON_STANDARD := y

>  hostprogs-y    := modpost mk_elfconfig

>  always         := $(hostprogs-y) empty.o

>

> -CFLAGS_REMOVE_empty.o := $(ASM_MACRO_FLAGS)

> -

>  modpost-objs   := modpost.o file2alias.o sumversion.o

>

>  devicetable-offsets-file := devicetable-offsets.h

> --

> 2.7.4

>
Nadav Amit Dec. 18, 2018, 7:33 p.m. UTC | #7
> On Dec 17, 2018, at 1:16 AM, Sedat Dilek <sedat.dilek@gmail.com> wrote:

> 

> On Thu, Dec 13, 2018 at 10:19 AM Masahiro Yamada

> <yamada.masahiro@socionext.com> wrote:

>> Revert the following commits:

>> 

>> - 5bdcd510c2ac9efaf55c4cbd8d46421d8e2320cd

>>  ("x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs")

>> 

>> - d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2

>>  ("x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs")

>> 

>> - 0474d5d9d2f7f3b11262f7bf87d0e7314ead9200.

>>  ("x86/extable: Macrofy inline assembly code to work around GCC inlining bugs")

>> 

>> - 494b5168f2de009eb80f198f668da374295098dd.

>>  ("x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops")

>> 

>> - f81f8ad56fd1c7b99b2ed1c314527f7d9ac447c6.

>>  ("x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs")

>> 

>> - 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099.

>>  ("x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs")

>> 

>> - 9e1725b410594911cc5981b6c7b4cea4ec054ca8.

>>  ("x86/refcount: Work around GCC inlining bug")

>>  (Conflicts: arch/x86/include/asm/refcount.h)

>> 

>> - c06c4d8090513f2974dfdbed2ac98634357ac475.

>>  ("x86/objtool: Use asm macros to work around GCC inlining bugs")

>> 

>> - 77b0bf55bc675233d22cd5df97605d516d64525e.

>>  ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

>> 

>> A few days after those commits applied, discussion started to solve

>> the issue more elegantly on the compiler side:

>> 

>>  https://lkml.org/lkml/2018/10/7/92

>> 

>> The "asm inline" was implemented by Segher Boessenkool, and now queued

>> up for GCC 9. (People were positive even for back-porting it to older

>> compilers).

>> 

>> Since the in-kernel workarounds merged, some issues have been reported:

>> breakage of building with distcc/icecc, breakage of distro packages for

>> module building. (More fundamentally, we cannot build external modules

>> after 'make clean')

>> 

>> Patching around the build system would make the code even uglier.

>> 

>> Given that this issue will be solved in a cleaner way sooner or later,

>> let's revert the in-kernel workarounds, and wait for GCC 9.

>> 

>> Reported-by: Logan Gunthorpe <logang@deltatee.com> # distcc

>> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> # debian/rpm package

> 

> Hi,

> 

> I reported the issue with debian package breakage in [1].

> 

> I am not subscribed to any involved mailing-list and not following all

> the discussions.

> I see the situation is not easy as there is especially linux-kbuild

> and linux/x86 involved and maybe other interests.

> But I am interested in having a fix in v4.20 final and hope this all

> still works with LLVM/Clang.

> 

> I can offer my help in testing - against Linux v4.20-rc7.

> Not sure if all discussed material is in upstream or elsewhere.

> What is your suggestion for me as a tester?

> 

> Will we have a solution in Linux v4.20 final?


Thanks for the reference. I see solutions have already been developed. So
it’s Masahiro’s call.
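
The compiler-side fix referenced in the thread ("asm inline", queued for GCC 9) makes the inliner cost an asm statement as a single instruction no matter how much text it expands to, which is what removes the need for the assembler-macro indirection being reverted below. A minimal sketch of how code might opt in where the qualifier is available; the asm_maybe_inline name, the version check and the .discard.demo section are illustrative assumptions, not part of this patch, and the "nop" assumes an x86-like target:

#if defined(__GNUC__) && __GNUC__ >= 9
# define asm_maybe_inline(body...)	asm inline(body)
#else
# define asm_maybe_inline(body...)	asm(body)
#endif

int main(void)
{
	/* With GCC 9+ this whole statement is costed as one instruction,
	 * even though the string expands to several lines of section data. */
	asm_maybe_inline("nop\n\t"
			 ".pushsection .discard.demo,\"a\"\n\t"
			 ".long 0\n\t"
			 ".popsection");
	return 0;
}
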

Patch

diff --git a/Makefile b/Makefile
index f2c3423..4cf4c5b 100644
--- a/Makefile
+++ b/Makefile
@@ -1081,7 +1081,7 @@  scripts: scripts_basic scripts_dtc asm-generic gcc-plugins $(autoksyms_h)
 # version.h and scripts_basic is processed / created.
 
 # Listed in dependency order
-PHONY += prepare archprepare macroprepare prepare0 prepare1 prepare2 prepare3
+PHONY += prepare archprepare prepare0 prepare1 prepare2 prepare3
 
 # prepare3 is used to check if we are building in a separate output directory,
 # and if so do:
@@ -1104,9 +1104,7 @@  prepare2: prepare3 outputmakefile asm-generic
 prepare1: prepare2 $(version_h) $(autoksyms_h) include/generated/utsrelease.h
 	$(cmd_crmodverdir)
 
-macroprepare: prepare1 archmacros
-
-archprepare: archheaders archscripts macroprepare scripts_basic
+archprepare: archheaders archscripts prepare1 scripts_basic
 
 prepare0: archprepare gcc-plugins
 	$(Q)$(MAKE) $(build)=.
@@ -1174,9 +1172,6 @@  archheaders:
 PHONY += archscripts
 archscripts:
 
-PHONY += archmacros
-archmacros:
-
 PHONY += __headers
 __headers: $(version_h) scripts_basic uapi-asm-generic archheaders archscripts
 	$(Q)$(MAKE) $(build)=scripts build_unifdef
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 75ef499..85a66c4 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -232,13 +232,6 @@  archscripts: scripts_basic
 archheaders:
 	$(Q)$(MAKE) $(build)=arch/x86/entry/syscalls all
 
-archmacros:
-	$(Q)$(MAKE) $(build)=arch/x86/kernel arch/x86/kernel/macros.s
-
-ASM_MACRO_FLAGS = -Wa,arch/x86/kernel/macros.s
-export ASM_MACRO_FLAGS
-KBUILD_CFLAGS += $(ASM_MACRO_FLAGS)
-
 ###
 # Kernel objects
 
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 25e5a6b..20d0885 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -352,7 +352,7 @@  For 32-bit we have the following conventions - kernel is built with
 .macro CALL_enter_from_user_mode
 #ifdef CONFIG_CONTEXT_TRACKING
 #ifdef HAVE_JUMP_LABEL
-	STATIC_BRANCH_JMP l_yes=.Lafter_call_\@, key=context_tracking_enabled, branch=1
+	STATIC_JUMP_IF_FALSE .Lafter_call_\@, context_tracking_enabled, def=0
 #endif
 	call enter_from_user_mode
 .Lafter_call_\@:
diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index 8e4ea39..31b627b 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -7,24 +7,16 @@ 
 #include <asm/asm.h>
 
 #ifdef CONFIG_SMP
-.macro LOCK_PREFIX_HERE
+	.macro LOCK_PREFIX
+672:	lock
 	.pushsection .smp_locks,"a"
 	.balign 4
-	.long 671f - .		# offset
+	.long 672b - .
 	.popsection
-671:
-.endm
-
-.macro LOCK_PREFIX insn:vararg
-	LOCK_PREFIX_HERE
-	lock \insn
-.endm
+	.endm
 #else
-.macro LOCK_PREFIX_HERE
-.endm
-
-.macro LOCK_PREFIX insn:vararg
-.endm
+	.macro LOCK_PREFIX
+	.endm
 #endif
 
 /*
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index d7faa16..4cd6a3b 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -31,8 +31,15 @@ 
  */
 
 #ifdef CONFIG_SMP
-#define LOCK_PREFIX_HERE "LOCK_PREFIX_HERE\n\t"
-#define LOCK_PREFIX "LOCK_PREFIX "
+#define LOCK_PREFIX_HERE \
+		".pushsection .smp_locks,\"a\"\n"	\
+		".balign 4\n"				\
+		".long 671f - .\n" /* offset */		\
+		".popsection\n"				\
+		"671:"
+
+#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "
+
 #else /* ! CONFIG_SMP */
 #define LOCK_PREFIX_HERE ""
 #define LOCK_PREFIX ""
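
The C-side LOCK_PREFIX restored above is a plain string again, so callers simply paste it in front of the instruction inside extended asm. A small user-space illustration of that pasting pattern, assuming an x86 target and GCC-style inline asm; MY_LOCK_PREFIX and atomic_add_demo are illustrative stand-ins and the .smp_locks bookkeeping is not modeled:

#include <stdio.h>

#define MY_LOCK_PREFIX "lock; "		/* stands in for LOCK_PREFIX */

static inline void atomic_add_demo(int i, int *v)
{
	/* the prefix concatenates with the mnemonic at preprocessing time */
	asm volatile(MY_LOCK_PREFIX "addl %1,%0"
		     : "+m" (*v)
		     : "ir" (i)
		     : "cc");
}

int main(void)
{
	int v = 40;
	atomic_add_demo(2, &v);
	printf("%d\n", v);		/* prints 42 */
	return 0;
}
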
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 21b0867..6467757b 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -120,25 +120,12 @@ 
 /* Exception table entry */
 #ifdef __ASSEMBLY__
 # define _ASM_EXTABLE_HANDLE(from, to, handler)			\
-	ASM_EXTABLE_HANDLE from to handler
-
-.macro ASM_EXTABLE_HANDLE from:req to:req handler:req
-	.pushsection "__ex_table","a"
-	.balign 4
-	.long (\from) - .
-	.long (\to) - .
-	.long (\handler) - .
+	.pushsection "__ex_table","a" ;				\
+	.balign 4 ;						\
+	.long (from) - . ;					\
+	.long (to) - . ;					\
+	.long (handler) - . ;					\
 	.popsection
-.endm
-#else /* __ASSEMBLY__ */
-
-# define _ASM_EXTABLE_HANDLE(from, to, handler)			\
-	"ASM_EXTABLE_HANDLE from=" #from " to=" #to		\
-	" handler=\"" #handler "\"\n\t"
-
-/* For C file, we already have NOKPROBE_SYMBOL macro */
-
-#endif /* __ASSEMBLY__ */
 
 # define _ASM_EXTABLE(from, to)					\
 	_ASM_EXTABLE_HANDLE(from, to, ex_handler_default)
@@ -161,7 +148,6 @@ 
 	_ASM_PTR (entry);					\
 	.popsection
 
-#ifdef __ASSEMBLY__
 .macro ALIGN_DESTINATION
 	/* check for bad alignment of destination */
 	movl %edi,%ecx
@@ -185,7 +171,34 @@ 
 	_ASM_EXTABLE_UA(100b, 103b)
 	_ASM_EXTABLE_UA(101b, 103b)
 	.endm
-#endif /* __ASSEMBLY__ */
+
+#else
+# define _EXPAND_EXTABLE_HANDLE(x) #x
+# define _ASM_EXTABLE_HANDLE(from, to, handler)			\
+	" .pushsection \"__ex_table\",\"a\"\n"			\
+	" .balign 4\n"						\
+	" .long (" #from ") - .\n"				\
+	" .long (" #to ") - .\n"				\
+	" .long (" _EXPAND_EXTABLE_HANDLE(handler) ") - .\n"	\
+	" .popsection\n"
+
+# define _ASM_EXTABLE(from, to)					\
+	_ASM_EXTABLE_HANDLE(from, to, ex_handler_default)
+
+# define _ASM_EXTABLE_UA(from, to)				\
+	_ASM_EXTABLE_HANDLE(from, to, ex_handler_uaccess)
+
+# define _ASM_EXTABLE_FAULT(from, to)				\
+	_ASM_EXTABLE_HANDLE(from, to, ex_handler_fault)
+
+# define _ASM_EXTABLE_EX(from, to)				\
+	_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)
+
+# define _ASM_EXTABLE_REFCOUNT(from, to)			\
+	_ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
+
+/* For C file, we already have NOKPROBE_SYMBOL macro */
+#endif
 
 #ifndef __ASSEMBLY__
 /*
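
Every _ASM_EXTABLE_* variant above follows the same shape: emit a position-relative record into a dedicated ELF section right next to the instruction it describes. A hedged user-space sketch of that section-record pattern, assuming a Linux/ELF toolchain; the my_table section, the RECORD_SITE macro and the __start_/__stop_ symbols (emitted automatically by the linker for C-identifier section names) stand in for the kernel's __ex_table handling:

#include <stdio.h>

extern const int __start_my_table[];	/* provided by the linker */
extern const int __stop_my_table[];

#define RECORD_SITE()					\
	asm volatile("1:\n\t"				\
		     ".pushsection my_table,\"a\"\n\t"	\
		     ".balign 4\n\t"			\
		     ".long 1b - .\n\t"			\
		     ".popsection")

int main(void)
{
	RECORD_SITE();
	RECORD_SITE();
	printf("%td entries recorded\n",
	       __stop_my_table - __start_my_table);	/* prints 2 */
	return 0;
}
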
diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 5090035..6804d66 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -4,8 +4,6 @@ 
 
 #include <linux/stringify.h>
 
-#ifndef __ASSEMBLY__
-
 /*
  * Despite that some emulators terminate on UD2, we use it for WARN().
  *
@@ -22,15 +20,53 @@ 
 
 #define LEN_UD2		2
 
+#ifdef CONFIG_GENERIC_BUG
+
+#ifdef CONFIG_X86_32
+# define __BUG_REL(val)	".long " __stringify(val)
+#else
+# define __BUG_REL(val)	".long " __stringify(val) " - 2b"
+#endif
+
+#ifdef CONFIG_DEBUG_BUGVERBOSE
+
+#define _BUG_FLAGS(ins, flags)						\
+do {									\
+	asm volatile("1:\t" ins "\n"					\
+		     ".pushsection __bug_table,\"aw\"\n"		\
+		     "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n"	\
+		     "\t"  __BUG_REL(%c0) "\t# bug_entry::file\n"	\
+		     "\t.word %c1"        "\t# bug_entry::line\n"	\
+		     "\t.word %c2"        "\t# bug_entry::flags\n"	\
+		     "\t.org 2b+%c3\n"					\
+		     ".popsection"					\
+		     : : "i" (__FILE__), "i" (__LINE__),		\
+			 "i" (flags),					\
+			 "i" (sizeof(struct bug_entry)));		\
+} while (0)
+
+#else /* !CONFIG_DEBUG_BUGVERBOSE */
+
 #define _BUG_FLAGS(ins, flags)						\
 do {									\
-	asm volatile("ASM_BUG ins=\"" ins "\" file=%c0 line=%c1 "	\
-		     "flags=%c2 size=%c3"				\
-		     : : "i" (__FILE__), "i" (__LINE__),                \
-			 "i" (flags),                                   \
+	asm volatile("1:\t" ins "\n"					\
+		     ".pushsection __bug_table,\"aw\"\n"		\
+		     "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n"	\
+		     "\t.word %c0"        "\t# bug_entry::flags\n"	\
+		     "\t.org 2b+%c1\n"					\
+		     ".popsection"					\
+		     : : "i" (flags),					\
 			 "i" (sizeof(struct bug_entry)));		\
 } while (0)
 
+#endif /* CONFIG_DEBUG_BUGVERBOSE */
+
+#else
+
+#define _BUG_FLAGS(ins, flags)  asm volatile(ins)
+
+#endif /* CONFIG_GENERIC_BUG */
+
 #define HAVE_ARCH_BUG
 #define BUG()							\
 do {								\
@@ -46,54 +82,4 @@  do {								\
 
 #include <asm-generic/bug.h>
 
-#else /* __ASSEMBLY__ */
-
-#ifdef CONFIG_GENERIC_BUG
-
-#ifdef CONFIG_X86_32
-.macro __BUG_REL val:req
-	.long \val
-.endm
-#else
-.macro __BUG_REL val:req
-	.long \val - 2b
-.endm
-#endif
-
-#ifdef CONFIG_DEBUG_BUGVERBOSE
-
-.macro ASM_BUG ins:req file:req line:req flags:req size:req
-1:	\ins
-	.pushsection __bug_table,"aw"
-2:	__BUG_REL val=1b	# bug_entry::bug_addr
-	__BUG_REL val=\file	# bug_entry::file
-	.word \line		# bug_entry::line
-	.word \flags		# bug_entry::flags
-	.org 2b+\size
-	.popsection
-.endm
-
-#else /* !CONFIG_DEBUG_BUGVERBOSE */
-
-.macro ASM_BUG ins:req file:req line:req flags:req size:req
-1:	\ins
-	.pushsection __bug_table,"aw"
-2:	__BUG_REL val=1b	# bug_entry::bug_addr
-	.word \flags		# bug_entry::flags
-	.org 2b+\size
-	.popsection
-.endm
-
-#endif /* CONFIG_DEBUG_BUGVERBOSE */
-
-#else /* CONFIG_GENERIC_BUG */
-
-.macro ASM_BUG ins:req file:req line:req flags:req size:req
-	\ins
-.endm
-
-#endif /* CONFIG_GENERIC_BUG */
-
-#endif /* __ASSEMBLY__ */
-
 #endif /* _ASM_X86_BUG_H */
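
The rebuilt _BUG_FLAGS() feeds compile-time constants such as __LINE__ into data directives through the "%c" operand modifier, which prints an "i"-constrained operand as a bare constant (no '$' prefix). A hedged user-space sketch of just that trick, assuming a Linux/ELF toolchain; demo_line and record_line are illustrative names and nothing here models the __bug_table lookup:

#include <stdio.h>

extern const unsigned short demo_line;	/* defined by the asm below */

static void record_line(void)
{
	asm volatile(".pushsection .rodata\n\t"
		     ".balign 2\n\t"
		     ".globl demo_line\n"
		     "demo_line: .word %c0\n\t"
		     ".popsection"
		     : : "i" (__LINE__));
}

int main(void)
{
	record_line();
	printf("recorded line %d\n", demo_line);
	return 0;
}
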
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 7d44272..aced6c9 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -2,10 +2,10 @@ 
 #ifndef _ASM_X86_CPUFEATURE_H
 #define _ASM_X86_CPUFEATURE_H
 
-#ifdef __KERNEL__
-#ifndef __ASSEMBLY__
-
 #include <asm/processor.h>
+
+#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
+
 #include <asm/asm.h>
 #include <linux/bitops.h>
 
@@ -161,10 +161,37 @@  extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
  */
 static __always_inline __pure bool _static_cpu_has(u16 bit)
 {
-	asm_volatile_goto("STATIC_CPU_HAS bitnum=%[bitnum] "
-			  "cap_byte=\"%[cap_byte]\" "
-			  "feature=%P[feature] t_yes=%l[t_yes] "
-			  "t_no=%l[t_no] always=%P[always]"
+	asm_volatile_goto("1: jmp 6f\n"
+		 "2:\n"
+		 ".skip -(((5f-4f) - (2b-1b)) > 0) * "
+			 "((5f-4f) - (2b-1b)),0x90\n"
+		 "3:\n"
+		 ".section .altinstructions,\"a\"\n"
+		 " .long 1b - .\n"		/* src offset */
+		 " .long 4f - .\n"		/* repl offset */
+		 " .word %P[always]\n"		/* always replace */
+		 " .byte 3b - 1b\n"		/* src len */
+		 " .byte 5f - 4f\n"		/* repl len */
+		 " .byte 3b - 2b\n"		/* pad len */
+		 ".previous\n"
+		 ".section .altinstr_replacement,\"ax\"\n"
+		 "4: jmp %l[t_no]\n"
+		 "5:\n"
+		 ".previous\n"
+		 ".section .altinstructions,\"a\"\n"
+		 " .long 1b - .\n"		/* src offset */
+		 " .long 0\n"			/* no replacement */
+		 " .word %P[feature]\n"		/* feature bit */
+		 " .byte 3b - 1b\n"		/* src len */
+		 " .byte 0\n"			/* repl len */
+		 " .byte 0\n"			/* pad len */
+		 ".previous\n"
+		 ".section .altinstr_aux,\"ax\"\n"
+		 "6:\n"
+		 " testb %[bitnum],%[cap_byte]\n"
+		 " jnz %l[t_yes]\n"
+		 " jmp %l[t_no]\n"
+		 ".previous\n"
 		 : : [feature]  "i" (bit),
 		     [always]   "i" (X86_FEATURE_ALWAYS),
 		     [bitnum]   "i" (1 << (bit & 7)),
@@ -199,44 +226,5 @@  static __always_inline __pure bool _static_cpu_has(u16 bit)
 #define CPU_FEATURE_TYPEVAL		boot_cpu_data.x86_vendor, boot_cpu_data.x86, \
 					boot_cpu_data.x86_model
 
-#else /* __ASSEMBLY__ */
-
-.macro STATIC_CPU_HAS bitnum:req cap_byte:req feature:req t_yes:req t_no:req always:req
-1:
-	jmp 6f
-2:
-	.skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90
-3:
-	.section .altinstructions,"a"
-	.long 1b - .		/* src offset */
-	.long 4f - .		/* repl offset */
-	.word \always		/* always replace */
-	.byte 3b - 1b		/* src len */
-	.byte 5f - 4f		/* repl len */
-	.byte 3b - 2b		/* pad len */
-	.previous
-	.section .altinstr_replacement,"ax"
-4:
-	jmp \t_no
-5:
-	.previous
-	.section .altinstructions,"a"
-	.long 1b - .		/* src offset */
-	.long 0			/* no replacement */
-	.word \feature		/* feature bit */
-	.byte 3b - 1b		/* src len */
-	.byte 0			/* repl len */
-	.byte 0			/* pad len */
-	.previous
-	.section .altinstr_aux,"ax"
-6:
-	testb \bitnum,\cap_byte
-	jnz \t_yes
-	jmp \t_no
-	.previous
-.endm
-
-#endif /* __ASSEMBLY__ */
-
-#endif /* __KERNEL__ */
+#endif /* defined(__KERNEL__) && !defined(__ASSEMBLY__) */
 #endif /* _ASM_X86_CPUFEATURE_H */
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index a5fb34f..21efc9d 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -2,6 +2,19 @@ 
 #ifndef _ASM_X86_JUMP_LABEL_H
 #define _ASM_X86_JUMP_LABEL_H
 
+#ifndef HAVE_JUMP_LABEL
+/*
+ * For better or for worse, if jump labels (the gcc extension) are missing,
+ * then the entire static branch patching infrastructure is compiled out.
+ * If that happens, the code in here will malfunction.  Raise a compiler
+ * error instead.
+ *
+ * In theory, jump labels and the static branch patching infrastructure
+ * could be decoupled to fix this.
+ */
+#error asm/jump_label.h included on a non-jump-label kernel
+#endif
+
 #define JUMP_LABEL_NOP_SIZE 5
 
 #ifdef CONFIG_X86_64
@@ -20,9 +33,15 @@ 
 
 static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
 {
-	asm_volatile_goto("STATIC_BRANCH_NOP l_yes=\"%l[l_yes]\" key=\"%c0\" "
-			  "branch=\"%c1\""
-			: :  "i" (key), "i" (branch) : : l_yes);
+	asm_volatile_goto("1:"
+		".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
+		".pushsection __jump_table,  \"aw\" \n\t"
+		_ASM_ALIGN "\n\t"
+		".long 1b - ., %l[l_yes] - . \n\t"
+		_ASM_PTR "%c0 + %c1 - .\n\t"
+		".popsection \n\t"
+		: :  "i" (key), "i" (branch) : : l_yes);
+
 	return false;
 l_yes:
 	return true;
@@ -30,8 +49,14 @@  static __always_inline bool arch_static_branch(struct static_key *key, bool bran
 
 static __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch)
 {
-	asm_volatile_goto("STATIC_BRANCH_JMP l_yes=\"%l[l_yes]\" key=\"%c0\" "
-			  "branch=\"%c1\""
+	asm_volatile_goto("1:"
+		".byte 0xe9\n\t .long %l[l_yes] - 2f\n\t"
+		"2:\n\t"
+		".pushsection __jump_table,  \"aw\" \n\t"
+		_ASM_ALIGN "\n\t"
+		".long 1b - ., %l[l_yes] - . \n\t"
+		_ASM_PTR "%c0 + %c1 - .\n\t"
+		".popsection \n\t"
 		: :  "i" (key), "i" (branch) : : l_yes);
 
 	return false;
@@ -41,26 +66,37 @@  static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
 
 #else	/* __ASSEMBLY__ */
 
-.macro STATIC_BRANCH_NOP l_yes:req key:req branch:req
-.Lstatic_branch_nop_\@:
-	.byte STATIC_KEY_INIT_NOP
-.Lstatic_branch_no_after_\@:
+.macro STATIC_JUMP_IF_TRUE target, key, def
+.Lstatic_jump_\@:
+	.if \def
+	/* Equivalent to "jmp.d32 \target" */
+	.byte		0xe9
+	.long		\target - .Lstatic_jump_after_\@
+.Lstatic_jump_after_\@:
+	.else
+	.byte		STATIC_KEY_INIT_NOP
+	.endif
 	.pushsection __jump_table, "aw"
 	_ASM_ALIGN
-	.long		.Lstatic_branch_nop_\@ - ., \l_yes - .
-	_ASM_PTR        \key + \branch - .
+	.long		.Lstatic_jump_\@ - ., \target - .
+	_ASM_PTR	\key - .
 	.popsection
 .endm
 
-.macro STATIC_BRANCH_JMP l_yes:req key:req branch:req
-.Lstatic_branch_jmp_\@:
-	.byte 0xe9
-	.long \l_yes - .Lstatic_branch_jmp_after_\@
-.Lstatic_branch_jmp_after_\@:
+.macro STATIC_JUMP_IF_FALSE target, key, def
+.Lstatic_jump_\@:
+	.if \def
+	.byte		STATIC_KEY_INIT_NOP
+	.else
+	/* Equivalent to "jmp.d32 \target" */
+	.byte		0xe9
+	.long		\target - .Lstatic_jump_after_\@
+.Lstatic_jump_after_\@:
+	.endif
 	.pushsection __jump_table, "aw"
 	_ASM_ALIGN
-	.long		.Lstatic_branch_jmp_\@ - ., \l_yes - .
-	_ASM_PTR	\key + \branch - .
+	.long		.Lstatic_jump_\@ - ., \target - .
+	_ASM_PTR	\key + 1 - .
 	.popsection
 .endm
 
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 26942ad..488c596 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -348,11 +348,23 @@  extern struct paravirt_patch_template pv_ops;
 #define paravirt_clobber(clobber)		\
 	[paravirt_clobber] "i" (clobber)
 
+/*
+ * Generate some code, and mark it as patchable by the
+ * apply_paravirt() alternate instruction patcher.
+ */
+#define _paravirt_alt(insn_string, type, clobber)	\
+	"771:\n\t" insn_string "\n" "772:\n"		\
+	".pushsection .parainstructions,\"a\"\n"	\
+	_ASM_ALIGN "\n"					\
+	_ASM_PTR " 771b\n"				\
+	"  .byte " type "\n"				\
+	"  .byte 772b-771b\n"				\
+	"  .short " clobber "\n"			\
+	".popsection\n"
+
 /* Generate patchable code, with the default asm parameters. */
-#define paravirt_call							\
-	"PARAVIRT_CALL type=\"%c[paravirt_typenum]\""			\
-	" clobber=\"%c[paravirt_clobber]\""				\
-	" pv_opptr=\"%c[paravirt_opptr]\";"
+#define paravirt_alt(insn_string)					\
+	_paravirt_alt(insn_string, "%c[paravirt_typenum]", "%c[paravirt_clobber]")
 
 /* Simple instruction patching code. */
 #define NATIVE_LABEL(a,x,b) "\n\t.globl " a #x "_" #b "\n" a #x "_" #b ":\n\t"
@@ -373,6 +385,16 @@  unsigned native_patch(u8 type, void *ibuf, unsigned long addr, unsigned len);
 int paravirt_disable_iospace(void);
 
 /*
+ * This generates an indirect call based on the operation type number.
+ * The type number, computed in PARAVIRT_PATCH, is derived from the
+ * offset into the paravirt_patch_template structure, and can therefore be
+ * freely converted back into a structure offset.
+ */
+#define PARAVIRT_CALL					\
+	ANNOTATE_RETPOLINE_SAFE				\
+	"call *%c[paravirt_opptr];"
+
+/*
  * These macros are intended to wrap calls through one of the paravirt
  * ops structs, so that they can be later identified and patched at
  * runtime.
@@ -509,7 +531,7 @@  int paravirt_disable_iospace(void);
 		/* since this condition will never hold */		\
 		if (sizeof(rettype) > sizeof(unsigned long)) {		\
 			asm volatile(pre				\
-				     paravirt_call			\
+				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
 				     : call_clbr, ASM_CALL_CONSTRAINT	\
 				     : paravirt_type(op),		\
@@ -519,7 +541,7 @@  int paravirt_disable_iospace(void);
 			__ret = (rettype)((((u64)__edx) << 32) | __eax); \
 		} else {						\
 			asm volatile(pre				\
-				     paravirt_call			\
+				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
 				     : call_clbr, ASM_CALL_CONSTRAINT	\
 				     : paravirt_type(op),		\
@@ -546,7 +568,7 @@  int paravirt_disable_iospace(void);
 		PVOP_VCALL_ARGS;					\
 		PVOP_TEST_NULL(op);					\
 		asm volatile(pre					\
-			     paravirt_call				\
+			     paravirt_alt(PARAVIRT_CALL)		\
 			     post					\
 			     : call_clbr, ASM_CALL_CONSTRAINT		\
 			     : paravirt_type(op),			\
@@ -664,26 +686,6 @@  struct paravirt_patch_site {
 extern struct paravirt_patch_site __parainstructions[],
 	__parainstructions_end[];
 
-#else	/* __ASSEMBLY__ */
-
-/*
- * This generates an indirect call based on the operation type number.
- * The type number, computed in PARAVIRT_PATCH, is derived from the
- * offset into the paravirt_patch_template structure, and can therefore be
- * freely converted back into a structure offset.
- */
-.macro PARAVIRT_CALL type:req clobber:req pv_opptr:req
-771:	ANNOTATE_RETPOLINE_SAFE
-	call *\pv_opptr
-772:	.pushsection .parainstructions,"a"
-	_ASM_ALIGN
-	_ASM_PTR 771b
-	.byte \type
-	.byte 772b-771b
-	.short \clobber
-	.popsection
-.endm
-
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* _ASM_X86_PARAVIRT_TYPES_H */
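
The PVOP_* wrappers above encode "which operation" as a byte offset into the ops structure, precisely so the call site can later be identified and repatched. A plain-C sketch of that offset-as-type-number idea, with no asm and no patching; struct pv_ops_demo and its members are illustrative, not the kernel's paravirt_patch_template layout:

#include <stdio.h>
#include <stddef.h>

struct pv_ops_demo {
	unsigned long (*read_cr2)(void);
	void (*write_cr2)(unsigned long);
};

static unsigned long native_read_cr2_demo(void)
{
	return 0x1234;
}

static struct pv_ops_demo pv_demo = { .read_cr2 = native_read_cr2_demo };

int main(void)
{
	/* offset / pointer size gives a small "type" number per slot */
	printf("type(write_cr2) = %zu\n",
	       offsetof(struct pv_ops_demo, write_cr2) / sizeof(void *));
	printf("cr2 = %#lx\n", pv_demo.read_cr2());	/* indirect call */
	return 0;
}
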
diff --git a/arch/x86/include/asm/refcount.h b/arch/x86/include/asm/refcount.h
index a8b5e1e..dbaed55 100644
--- a/arch/x86/include/asm/refcount.h
+++ b/arch/x86/include/asm/refcount.h
@@ -4,41 +4,6 @@ 
  * x86-specific implementation of refcount_t. Based on PAX_REFCOUNT from
  * PaX/grsecurity.
  */
-
-#ifdef __ASSEMBLY__
-
-#include <asm/asm.h>
-#include <asm/bug.h>
-
-.macro REFCOUNT_EXCEPTION counter:req
-	.pushsection .text..refcount
-111:	lea \counter, %_ASM_CX
-112:	ud2
-	ASM_UNREACHABLE
-	.popsection
-113:	_ASM_EXTABLE_REFCOUNT(112b, 113b)
-.endm
-
-/* Trigger refcount exception if refcount result is negative. */
-.macro REFCOUNT_CHECK_LT_ZERO counter:req
-	js 111f
-	REFCOUNT_EXCEPTION counter="\counter"
-.endm
-
-/* Trigger refcount exception if refcount result is zero or negative. */
-.macro REFCOUNT_CHECK_LE_ZERO counter:req
-	jz 111f
-	REFCOUNT_CHECK_LT_ZERO counter="\counter"
-.endm
-
-/* Trigger refcount exception unconditionally. */
-.macro REFCOUNT_ERROR counter:req
-	jmp 111f
-	REFCOUNT_EXCEPTION counter="\counter"
-.endm
-
-#else /* __ASSEMBLY__ */
-
 #include <linux/refcount.h>
 #include <asm/bug.h>
 
@@ -50,12 +15,35 @@ 
  * central refcount exception. The fixup address for the exception points
  * back to the regular execution flow in .text.
  */
+#define _REFCOUNT_EXCEPTION				\
+	".pushsection .text..refcount\n"		\
+	"111:\tlea %[var], %%" _ASM_CX "\n"		\
+	"112:\t" ASM_UD2 "\n"				\
+	ASM_UNREACHABLE					\
+	".popsection\n"					\
+	"113:\n"					\
+	_ASM_EXTABLE_REFCOUNT(112b, 113b)
+
+/* Trigger refcount exception if refcount result is negative. */
+#define REFCOUNT_CHECK_LT_ZERO				\
+	"js 111f\n\t"					\
+	_REFCOUNT_EXCEPTION
+
+/* Trigger refcount exception if refcount result is zero or negative. */
+#define REFCOUNT_CHECK_LE_ZERO				\
+	"jz 111f\n\t"					\
+	REFCOUNT_CHECK_LT_ZERO
+
+/* Trigger refcount exception unconditionally. */
+#define REFCOUNT_ERROR					\
+	"jmp 111f\n\t"					\
+	_REFCOUNT_EXCEPTION
 
 static __always_inline void refcount_add(unsigned int i, refcount_t *r)
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0\n\t"
-		"REFCOUNT_CHECK_LT_ZERO counter=\"%[counter]\""
-		: [counter] "+m" (r->refs.counter)
+		REFCOUNT_CHECK_LT_ZERO
+		: [var] "+m" (r->refs.counter)
 		: "ir" (i)
 		: "cc", "cx");
 }
@@ -63,32 +51,31 @@  static __always_inline void refcount_add(unsigned int i, refcount_t *r)
 static __always_inline void refcount_inc(refcount_t *r)
 {
 	asm volatile(LOCK_PREFIX "incl %0\n\t"
-		"REFCOUNT_CHECK_LT_ZERO counter=\"%[counter]\""
-		: [counter] "+m" (r->refs.counter)
+		REFCOUNT_CHECK_LT_ZERO
+		: [var] "+m" (r->refs.counter)
 		: : "cc", "cx");
 }
 
 static __always_inline void refcount_dec(refcount_t *r)
 {
 	asm volatile(LOCK_PREFIX "decl %0\n\t"
-		"REFCOUNT_CHECK_LE_ZERO counter=\"%[counter]\""
-		: [counter] "+m" (r->refs.counter)
+		REFCOUNT_CHECK_LE_ZERO
+		: [var] "+m" (r->refs.counter)
 		: : "cc", "cx");
 }
 
 static __always_inline __must_check
 bool refcount_sub_and_test(unsigned int i, refcount_t *r)
 {
-
 	return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
-					 "REFCOUNT_CHECK_LT_ZERO counter=\"%[var]\"",
+					 REFCOUNT_CHECK_LT_ZERO,
 					 r->refs.counter, e, "er", i, "cx");
 }
 
 static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)
 {
 	return GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl",
-					"REFCOUNT_CHECK_LT_ZERO counter=\"%[var]\"",
+					REFCOUNT_CHECK_LT_ZERO,
 					r->refs.counter, e, "cx");
 }
 
@@ -106,8 +93,8 @@  bool refcount_add_not_zero(unsigned int i, refcount_t *r)
 
 		/* Did we try to increment from/to an undesirable state? */
 		if (unlikely(c < 0 || c == INT_MAX || result < c)) {
-			asm volatile("REFCOUNT_ERROR counter=\"%[counter]\""
-				     : : [counter] "m" (r->refs.counter)
+			asm volatile(REFCOUNT_ERROR
+				     : : [var] "m" (r->refs.counter)
 				     : "cc", "cx");
 			break;
 		}
@@ -122,6 +109,4 @@  static __always_inline __must_check bool refcount_inc_not_zero(refcount_t *r)
 	return refcount_add_not_zero(1, r);
 }
 
-#endif /* __ASSEMBLY__ */
-
 #endif
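
For orientation, the macros above implement the checked reference-drop semantics of refcount_t; the exception path fires only when a counter reaches a state it should never reach. A rough user-space analogue of refcount_dec_and_test() using plain GCC atomics, with none of the x86 exception machinery modeled; struct obj and obj_put are illustrative names:

#include <stdio.h>
#include <stdbool.h>

struct obj {
	int refs;
};

static bool obj_put(struct obj *o)
{
	/* true when the caller dropped the last reference */
	return __atomic_sub_fetch(&o->refs, 1, __ATOMIC_ACQ_REL) == 0;
}

int main(void)
{
	struct obj o = { .refs = 2 };
	printf("last put? %d\n", obj_put(&o));	/* 0: a reference remains */
	printf("last put? %d\n", obj_put(&o));	/* 1: object can be freed  */
	return 0;
}
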
diff --git a/arch/x86/kernel/macros.S b/arch/x86/kernel/macros.S
deleted file mode 100644
index 161c950..0000000
--- a/arch/x86/kernel/macros.S
+++ /dev/null
@@ -1,16 +0,0 @@ 
-/* SPDX-License-Identifier: GPL-2.0 */
-
-/*
- * This file includes headers whose assembly part includes macros which are
- * commonly used. The macros are precompiled into assmebly file which is later
- * assembled together with each compiled file.
- */
-
-#include <linux/compiler.h>
-#include <asm/refcount.h>
-#include <asm/alternative-asm.h>
-#include <asm/bug.h>
-#include <asm/paravirt.h>
-#include <asm/asm.h>
-#include <asm/cpufeature.h>
-#include <asm/jump_label.h>
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index cdafa5e..20561a6 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -17,8 +17,10 @@ 
 #ifndef __ASSEMBLY__
 #include <linux/kernel.h>
 
-struct bug_entry {
+#ifdef CONFIG_BUG
+
 #ifdef CONFIG_GENERIC_BUG
+struct bug_entry {
 #ifndef CONFIG_GENERIC_BUG_RELATIVE_POINTERS
 	unsigned long	bug_addr;
 #else
@@ -33,10 +35,8 @@  struct bug_entry {
 	unsigned short	line;
 #endif
 	unsigned short	flags;
-#endif	/* CONFIG_GENERIC_BUG */
 };
-
-#ifdef CONFIG_BUG
+#endif	/* CONFIG_GENERIC_BUG */
 
 /*
  * Don't use BUG() or BUG_ON() unless there's really no way out; one
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 06396c1..fc5004a 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -99,13 +99,22 @@  void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  * unique, to convince GCC not to merge duplicate inline asm statements.
  */
 #define annotate_reachable() ({						\
-	asm volatile("ANNOTATE_REACHABLE counter=%c0"			\
-		     : : "i" (__COUNTER__));				\
+	asm volatile("%c0:\n\t"						\
+		     ".pushsection .discard.reachable\n\t"		\
+		     ".long %c0b - .\n\t"				\
+		     ".popsection\n\t" : : "i" (__COUNTER__));		\
 })
 #define annotate_unreachable() ({					\
-	asm volatile("ANNOTATE_UNREACHABLE counter=%c0"			\
-		     : : "i" (__COUNTER__));				\
+	asm volatile("%c0:\n\t"						\
+		     ".pushsection .discard.unreachable\n\t"		\
+		     ".long %c0b - .\n\t"				\
+		     ".popsection\n\t" : : "i" (__COUNTER__));		\
 })
+#define ASM_UNREACHABLE							\
+	"999:\n\t"							\
+	".pushsection .discard.unreachable\n\t"				\
+	".long 999b - .\n\t"						\
+	".popsection\n\t"
 #else
 #define annotate_reachable()
 #define annotate_unreachable()
@@ -293,45 +302,6 @@  static inline void *offset_to_ptr(const int *off)
 	return (void *)((unsigned long)off + *off);
 }
 
-#else /* __ASSEMBLY__ */
-
-#ifdef __KERNEL__
-#ifndef LINKER_SCRIPT
-
-#ifdef CONFIG_STACK_VALIDATION
-.macro ANNOTATE_UNREACHABLE counter:req
-\counter:
-	.pushsection .discard.unreachable
-	.long \counter\()b -.
-	.popsection
-.endm
-
-.macro ANNOTATE_REACHABLE counter:req
-\counter:
-	.pushsection .discard.reachable
-	.long \counter\()b -.
-	.popsection
-.endm
-
-.macro ASM_UNREACHABLE
-999:
-	.pushsection .discard.unreachable
-	.long 999b - .
-	.popsection
-.endm
-#else /* CONFIG_STACK_VALIDATION */
-.macro ANNOTATE_UNREACHABLE counter:req
-.endm
-
-.macro ANNOTATE_REACHABLE counter:req
-.endm
-
-.macro ASM_UNREACHABLE
-.endm
-#endif /* CONFIG_STACK_VALIDATION */
-
-#endif /* LINKER_SCRIPT */
-#endif /* __KERNEL__ */
 #endif /* __ASSEMBLY__ */
 
 /* Compile time object size, -1 for unknown */
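
The annotate_reachable()/annotate_unreachable() bodies restored above take "i" (__COUNTER__) only to make each expansion textually unique, so GCC cannot merge two annotations into one asm statement. A tiny demonstration of the __COUNTER__ extension they rely on; the starting value can differ if other code in the translation unit also uses it:

#include <stdio.h>

int main(void)
{
	/* each expansion yields the next integer, typically 0 1 2 here */
	printf("%d %d %d\n", __COUNTER__, __COUNTER__, __COUNTER__);
	return 0;
}
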
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index bb01555..3d09844 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -115,9 +115,7 @@  __cc-option = $(call try-run,\
 
 # Do not attempt to build with gcc plugins during cc-option tests.
 # (And this uses delayed resolution so the flags will be up to date.)
-# In addition, do not include the asm macros which are built later.
-CC_OPTION_FILTERED = $(GCC_PLUGINS_CFLAGS) $(ASM_MACRO_FLAGS)
-CC_OPTION_CFLAGS = $(filter-out $(CC_OPTION_FILTERED),$(KBUILD_CFLAGS))
+CC_OPTION_CFLAGS = $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS))
 
 # cc-option
 # Usage: cflags-y += $(call cc-option,-march=winchip-c6,-march=i586)
diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index a5b4af4..42c5d50 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -4,8 +4,6 @@  OBJECT_FILES_NON_STANDARD := y
 hostprogs-y	:= modpost mk_elfconfig
 always		:= $(hostprogs-y) empty.o
 
-CFLAGS_REMOVE_empty.o := $(ASM_MACRO_FLAGS)
-
 modpost-objs	:= modpost.o file2alias.o sumversion.o
 
 devicetable-offsets-file := devicetable-offsets.h