Message ID | CAKdteOaZmkDJVvwGeb=qjh5YayK6cT7oHBQ9w+GhOtU2gdqwTg@mail.gmail.com |
---|---|
State | New |
Headers | show |
Series | [RFC] Add support for non-contiguous memory regions | expand |
Hi, On Fri, Nov 29, 2019 at 11:12:03AM +0100, Christophe Lyon wrote: > The attached patches implement support for non-contiguous memory > regions, which was discussed in more detail in > https://sourceware.org/ml/binutils/2019-07/msg00020.html How does this interact with relaxations? Simon
On Fri, 29 Nov 2019 at 11:28, Simon Richter <Simon.Richter@hogyros.de> wrote: > > Hi, > > On Fri, Nov 29, 2019 at 11:12:03AM +0100, Christophe Lyon wrote: > > > The attached patches implement support for non-contiguous memory > > regions, which was discussed in more detail in > > https://sourceware.org/ml/binutils/2019-07/msg00020.html > > How does this interact with relaxations? > Hi, I haven't tested that specifically yet. Do you have a target / testcase in mind that I could check? Christophe > Simon
Hi, On Mon, Dec 02, 2019 at 10:56:22AM +0100, Christophe Lyon wrote: > > > The attached patches implement support for non-contiguous memory > > > regions > > How does this interact with relaxations? > I haven't tested that specifically yet. Do you have a target / > testcase in mind that I could check? A quick testcase would be on PowerPC: Assembler file "test.S" .machine "ppc" .section .text.one b 2f .section .text.two 2: nop Linker script "test.ld" MEMORY { one (RXAI) : ORIGIN = 0x00000000, LENGTH = 0x10000000 two (RXAI) : ORIGIN = 0x20000000, LENGTH = 0x10000000 } SECTIONS { one : { *(.text.one) } > one two : { *(.text.two) } > two } Assemble: $ as -o test.o test.S Disassemble: $ objdump -dr test.o test.o: file format elf64-powerpcle Disassembly of section .text.one: 0000000000000000 <.text.one>: 0: 00 00 00 48 b 0x0 0: R_PPC64_REL24 .text.two Disassembly of section .text.two: 0000000000000000 <.text.two>: 0: 00 00 00 60 nop Link: $ ld -o test -T test.ld --relax test.o Disassemble: $ objdump -dr test test: file format elf64-powerpcle Disassembly of section one: 0000000000000000 <0000001b.plt_branch.1c:5>: 0: 08 80 82 e9 ld r12,-32760(r2) 4: a6 03 89 7d mtctr r12 8: 20 04 80 4e bctr ... 20: e0 ff ff 4b b 0 <0000001b.plt_branch.1c:5> Disassembly of section two: 0000000020000000 <two>: 20000000: 00 00 00 60 nop The linker generates the trampoline as the branch displacement doesn't fit the instruction, extending the section size by 12 bytes. This happens after addresses are assigned, because that is when the linker learns that the displacement is too large. There is also the opposite case, where branch instructions are replaced by shorter variants on some architectures during relaxation, but I don't have an example immediately ready. The desired behaviour for the combination of non-contiguous regions and relaxations should probably be defined by someone from the core team -- probably locking out the combination (like "-r --relax" on PowerPC) or just ensuring that it doesn't crash is the sanest thing to do here. Simon
On Mon, 2 Dec 2019 at 12:05, Simon Richter <Simon.Richter@hogyros.de> wrote: > > Hi, > > On Mon, Dec 02, 2019 at 10:56:22AM +0100, Christophe Lyon wrote: > > > > > The attached patches implement support for non-contiguous memory > > > > regions > > > > How does this interact with relaxations? > > > I haven't tested that specifically yet. Do you have a target / > > testcase in mind that I could check? > > A quick testcase would be on PowerPC: > > Assembler file "test.S" > > .machine "ppc" > > .section .text.one > b 2f > > .section .text.two > 2: > nop > > Linker script "test.ld" > > MEMORY { > one (RXAI) : ORIGIN = 0x00000000, LENGTH = 0x10000000 > two (RXAI) : ORIGIN = 0x20000000, LENGTH = 0x10000000 > } > > SECTIONS { > one : { > *(.text.one) > } > one > two : { > *(.text.two) > } > two > } > > Assemble: > > $ as -o test.o test.S > > Disassemble: > > $ objdump -dr test.o > > test.o: file format elf64-powerpcle > > > Disassembly of section .text.one: > > 0000000000000000 <.text.one>: > 0: 00 00 00 48 b 0x0 > 0: R_PPC64_REL24 .text.two > > Disassembly of section .text.two: > > 0000000000000000 <.text.two>: > 0: 00 00 00 60 nop > > Link: > > $ ld -o test -T test.ld --relax test.o > > Disassemble: > > $ objdump -dr test > > test: file format elf64-powerpcle > > Disassembly of section one: > > 0000000000000000 <0000001b.plt_branch.1c:5>: > 0: 08 80 82 e9 ld r12,-32760(r2) > 4: a6 03 89 7d mtctr r12 > 8: 20 04 80 4e bctr > ... > 20: e0 ff ff 4b b 0 <0000001b.plt_branch.1c:5> > > Disassembly of section two: > > 0000000020000000 <two>: > 20000000: 00 00 00 60 nop > > The linker generates the trampoline as the branch displacement doesn't fit > the instruction, extending the section size by 12 bytes. This happens after > addresses are assigned, because that is when the linker learns that the > displacement is too large. > > There is also the opposite case, where branch instructions are replaced by > shorter variants on some architectures during relaxation, but I don't have > an example immediately ready. > > The desired behaviour for the combination of non-contiguous regions and > relaxations should probably be defined by someone from the core team -- > probably locking out the combination (like "-r --relax" on PowerPC) or just > ensuring that it doesn't crash is the sanest thing to do here. > Whether the new option is enabled or not, my patch has no impact on your testcase, because the memory regions are large enough to contain all the code. If I modify your linker script to reduce the length of memory region "one" to 0x10, the testcase fails to link (as expected) with: a.out section `one' will not fit in region `one' If I use --enable-non-contiguous-regions, link succeeds, the "one" section is silently discarded from the output (and no trampoline either). This is not a user-friendly behaviour :-) I'm checking how to keep the proper error message instead. > Simon
Hi, On Tue, Dec 03, 2019 at 02:26:59PM +0100, Christophe Lyon wrote: > If I use --enable-non-contiguous-regions, link succeeds, the "one" > section is silently discarded from the output (and no trampoline > either). > This is not a user-friendly behaviour :-) I'm checking how to keep the > proper error message instead. That's why I asked. :) The annoying thing is that there are lots of interesting cases in that combination, e.g. if you add test2.S .machine "ppc" .section .text.one nop nop nop nop and try to fit that together with the other object into an output region "one" that has two ranges of 0x20 bytes each. This would fit, by placing the trampoline and the branch it belongs to in one range, and the four nops in the other, but arriving at that conclusion will be difficult for the linker. With reversed order, it could be even worse: four nops fills half of the first range, and either the branch or the trampoline alone would still fit, but these need to go to the same output range as the entire point of generating the trampoline in the first place was to have it near the original branch. Hypothetical adversarial memory map (not sure about syntax): MEMORY { one (RXAI) : ORIGIN = 0x00000000, LENGTH = 0x10 one (RXAI) : ORIGIN = 0x10000000, LENGTH = 0x10 two (RXAI) : ORIGIN = 0x20000000, LENGTH = 0x10 } Getting all of these interesting cases right sounds like a lot of work to me, and although I would expect that there might be some overlap between "people with weird memory maps" and "people who need trampolines to get over the holes in weird memory maps", I'm not sure that this is necessary as long as no incorrect output is generated. The same goes for relaxations that shrink the code, I believe Nios II has a bunch of these as program size directly translates to resource usage. It will be annoying if the linker refuses to link a program that would fit after shrinking, but still acceptable to me as a user. Simon
From 5a6274a0db009b7e19835a0726c88360d9b9c6e8 Mon Sep 17 00:00:00 2001 From: Christophe Lyon <christophe.lyon@linaro.org> Date: Mon, 25 Nov 2019 08:55:37 +0000 Subject: [PATCH 1/2] Add support for non-contiguous memory 2019-11-27 Christophe Lyon <christophe.lyon@linaro.org> bfd/ * bfd-in2.h: Regenerate. * section.c (asection): Add already_assigned field. (BFD_FAKE_SECTION): Add default initializer for it. include/ * bfdlink.h (bfd_link_info): Add non_contiguous_regions field. ld/ * ldlang.c (lang_add_section): Add support for non_contiguous_regions. (size_input_section): Likewise. (lang_size_sections_1): Likewise. * ldlex.h (option_values): Add OPTION_NON_CONTIGUOUS_REGIONS. * lexsup.c (ld_options): Add entry for --enable-non-contiguous-regions. (parse_args): Handle it. --- bfd/bfd-in2.h | 9 +++++++- bfd/section.c | 9 +++++++- include/bfdlink.h | 3 +++ ld/ldlang.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++--- ld/ldlex.h | 1 + ld/lexsup.c | 5 +++++ 6 files changed, 89 insertions(+), 5 deletions(-) diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h index a00dfa35..0e1126c 100644 --- a/bfd/bfd-in2.h +++ b/bfd/bfd-in2.h @@ -1177,6 +1177,10 @@ typedef struct bfd_section struct bfd_link_order *link_order; struct bfd_section *s; } map_head, map_tail; + /* Points to the output section this section is already assigned to, if any. + This is used when support for non-contiguous memory regions is enabled. */ + struct bfd_section *already_assigned; + } asection; /* Relax table contains information about instructions which can @@ -1358,7 +1362,10 @@ discarded_section (const asection *sec) (struct bfd_symbol *) SYM, &SEC.symbol, \ \ /* map_head, map_tail */ \ - { NULL }, { NULL } \ + { NULL }, { NULL }, \ + \ + /* already_assigned */ \ + NULL \ } /* We use a macro to initialize the static asymbol structures because diff --git a/bfd/section.c b/bfd/section.c index 34e08ae..b1f0b71 100644 --- a/bfd/section.c +++ b/bfd/section.c @@ -536,6 +536,10 @@ CODE_FRAGMENT . struct bfd_link_order *link_order; . struct bfd_section *s; . } map_head, map_tail; +. {* Points to the output section this section is already assigned to, if any. +. This is used when support for non-contiguous memory regions is enabled. *} +. struct bfd_section *already_assigned; +. .} asection; . .{* Relax table contains information about instructions which can @@ -717,7 +721,10 @@ CODE_FRAGMENT . (struct bfd_symbol *) SYM, &SEC.symbol, \ . \ . {* map_head, map_tail *} \ -. { NULL }, { NULL } \ +. { NULL }, { NULL }, \ +. \ +. {* already_assigned *} \ +. NULL \ . } . .{* We use a macro to initialize the static asymbol structures because diff --git a/include/bfdlink.h b/include/bfdlink.h index 32d1512..e1e7c1e 100644 --- a/include/bfdlink.h +++ b/include/bfdlink.h @@ -501,6 +501,9 @@ struct bfd_link_info /* TRUE if "-Map map" is passed to linker. */ unsigned int has_map_file : 1; + /* TRUE if "--enable-non-contiguous-regions" is passed to the linker. */ + unsigned int non_contiguous_regions : 1; + /* Char that may appear as the first char of a symbol, but should be skipped (like symbol_leading_char) when looking up symbols in wrap_hash. Used by PowerPC Linux for 'dot' symbols. */ diff --git a/ld/ldlang.c b/ld/ldlang.c index eedcb7f..f2acdbb 100644 --- a/ld/ldlang.c +++ b/ld/ldlang.c @@ -2550,7 +2550,14 @@ lang_add_section (lang_statement_list_type *ptr, } if (section->output_section != NULL) - return; + { + if (!link_info.non_contiguous_regions) + return; + /* SECTION has already been affected to an output section, but + the user allows it to be mapped to another one in case it + overflows. We'll later update the actual output section in + size_input_section as appropriate. */ + } /* We don't copy the SEC_NEVER_LOAD flag from an input section to an output section, because we want to be able to include a @@ -5109,11 +5116,27 @@ size_input_section (lang_statement_union_type **this_ptr, lang_output_section_statement_type *output_section_statement, fill_type *fill, + bfd_boolean *removed, bfd_vma dot) { lang_input_section_type *is = &((*this_ptr)->input_section); asection *i = is->section; asection *o = output_section_statement->bfd_section; + *removed = 0; + + if (link_info.non_contiguous_regions) + { + /* If the input section I has already been successfully assigned + to an output section other than O, don't bother with it and + let the caller remove it from the list. Keep processing in + case we have already handled O, because the repeated passes + have reinitialized its size. */ + if (i->already_assigned && i->already_assigned != o) + { + *removed = 1; + return dot; + } + } if (i->sec_info_type == SEC_INFO_TYPE_JUST_SYMS) i->output_offset = i->vma - o->vma; @@ -5145,6 +5168,24 @@ size_input_section dot += alignment_needed; } + if (link_info.non_contiguous_regions) + { + /* If I would overflow O, let the caller remove I from the + list. */ + if (output_section_statement->region) + { + bfd_vma end = output_section_statement->region->origin + + output_section_statement->region->length; + + if (dot + TO_ADDR (i->size) > end) + { + *removed = 1; + dot = end; + return dot; + } + } + } + /* Remember where in the output section this input section goes. */ i->output_offset = dot - o->vma; @@ -5152,6 +5193,14 @@ size_input_section dot += TO_ADDR (i->size); if (!(o->flags & SEC_FIXED_SIZE)) o->size = TO_SIZE (dot - o->vma); + + if (link_info.non_contiguous_regions) + { + /* Record that I was successfully assigned to O, and update + its actual output section too. */ + i->already_assigned = o; + i->output_section = o; + } } return dot; @@ -5436,9 +5485,10 @@ lang_size_sections_1 bfd_boolean check_regions) { lang_statement_union_type *s; + lang_statement_union_type *prev_s = NULL; /* Size up the sections from their constituent parts. */ - for (s = *prev; s != NULL; s = s->header.next) + for (s = *prev; s != NULL; prev_s = s, s = s->header.next) { switch (s->header.type) { @@ -5852,6 +5902,7 @@ lang_size_sections_1 case lang_input_section_enum: { asection *i; + bfd_boolean removed; i = s->input_section.section; if (relax) @@ -5864,7 +5915,17 @@ lang_size_sections_1 *relax = TRUE; } dot = size_input_section (prev, output_section_statement, - fill, dot); + fill, &removed, dot); + + if (link_info.non_contiguous_regions && removed) + { + /* If I doesn't fit in the current output section, + remove it from the list. */ + if (prev_s) + prev_s->header.next=s->header.next; + else + *prev = s->header.next; + } } break; diff --git a/ld/ldlex.h b/ld/ldlex.h index 32a7a64..4e7a419 100644 --- a/ld/ldlex.h +++ b/ld/ldlex.h @@ -150,6 +150,7 @@ enum option_values OPTION_FORCE_GROUP_ALLOCATION, OPTION_PRINT_MAP_DISCARDED, OPTION_NO_PRINT_MAP_DISCARDED, + OPTION_NON_CONTIGUOUS_REGIONS, }; /* The initial parser states. */ diff --git a/ld/lexsup.c b/ld/lexsup.c index d7766c3..13cab6d 100644 --- a/ld/lexsup.c +++ b/ld/lexsup.c @@ -122,6 +122,8 @@ static const struct ld_option ld_options[] = 'E', NULL, N_("Export all dynamic symbols"), TWO_DASHES }, { {"no-export-dynamic", no_argument, NULL, OPTION_NO_EXPORT_DYNAMIC}, '\0', NULL, N_("Undo the effect of --export-dynamic"), TWO_DASHES }, + { {"enable-non-contiguous-regions", no_argument, NULL, OPTION_NON_CONTIGUOUS_REGIONS}, + '\0', NULL, N_("Enable support of non-contiguous memory regions"), TWO_DASHES }, { {"EB", no_argument, NULL, OPTION_EB}, '\0', NULL, N_("Link big-endian objects"), ONE_DASH }, { {"EL", no_argument, NULL, OPTION_EL}, @@ -845,6 +847,9 @@ parse_args (unsigned argc, char **argv) case OPTION_NO_EXPORT_DYNAMIC: link_info.export_dynamic = FALSE; break; + case OPTION_NON_CONTIGUOUS_REGIONS: + link_info.non_contiguous_regions = TRUE; + break; case 'e': lang_add_entry (optarg, TRUE); break; -- 2.7.4