We can convert mov to lea only if there are R_386_GOT32/R_X86_64_GOTPCREL
relocations against non IFUNC symbols.
* elf32-i386.c (need_convert_mov_to_lea): New.
(elf_i386_check_relocs): Set need_convert_mov_to_lea if needed.
(elf_i386_convert_mov_to_lea): Return TRUE if
need_convert_mov_to_lea is unset.
* elf64-x86-64.c (need_convert_mov_to_lea): New.
(elf_x86_64_check_relocs): Set need_convert_mov_to_lea if needed.
(elf_x86_64_convert_mov_to_lea): Return TRUE if
need_convert_mov_to_lea is unset.
When building executable, undefined symbol is a fatal error. We don't
complain about -fPIC if the symbol is undefined.
bfd/
PR ld/17847
* elf64-x86-64.c (elf_x86_64_relocate_section): Don't complain
about -fPIC if the symbol is undefined when building executable.
ld/testsuite/
PR ld/17847
* ld-x86-64/pie1.d: New file.
* ld-x86-64/pie1.s: Likwise.
* ld-x86-64/x86-64.exp: Run pie1.
When building PIE, we should only discard space for pc-relative relocs
symbols which turn out to need copy relocs.
bfd/
PR ld/17827
* elf64-x86-64.c (elf_x86_64_allocate_dynrelocs): For PIE,
only discard space for pc-relative relocs symbols which turn
out to need copy relocs.
ld/testsuite/
PR ld/17827
* ld-x86-64/pr17689.out: Updated.
* ld-x86-64/pr17689b.S: Likewise.
* ld-x86-64/pr17827.rd: New file.
* ld-x86-64/x86-64.exp: Run PR ld/17827 test.
When there is a weak symbol with a real definition, the processor
independent code will have arranged for us to see the real definition
first. We need to copy the needs_copy bit from the real definition and
check it when allowing copy reloc in PIE.
bfd/
PR ld/17689
* elf64-x86-64.c (elf_x86_64_link_hash_entry): Add needs_copy.
Change has_bnd_reloc to bit field.
(elf_x86_64_link_hash_newfunc): Initialize needs_copy and
has_bnd_reloc to 0.
(elf_x86_64_check_relocs): Set has_bnd_reloc to 1 instead
of TRUE.
(elf_x86_64_adjust_dynamic_symbol): Copy needs_copy from the
real definition to a weak symbol.
(elf_x86_64_allocate_dynrelocs): Also check needs_copy of a
weak symbol for PIE when discarding space for relocs against
symbols which turn out to need copy relocs.
(elf_x86_64_relocate_section): Also check needs_copy of a
weak symbol for PIE with copy reloc.
ld/testsuite/
PR ld/17689
* ld-x86-64/pr17689.out: New file.
* ld-x86-64/pr17689.rd: Likewise.
* ld-x86-64/pr17689a.c: Likewise.
* ld-x86-64/pr17689b.S: Likewise.
* ld-x86-64/x86-64.exp: Run PR ld/17689 tests.
Copy relocs are used in a scheme to avoid dynamic text relocations in
non-PIC executables that refer to variables defined in shared
libraries. The idea is to have the linker define any such variable in
the executable, with a copy reloc copying the initial value, then have
both the executable and shared library refer to the executable copy.
If the shared library defines the variable as protected then we have
two copies of the variable being used.
PR 15228
* elflink.c (_bfd_elf_adjust_dynamic_copy): Add "info" param.
Error on copy relocs against protected symbols.
(elf_merge_st_other): Set h->protected_def.
* elf-bfd.h (struct elf_link_hash_entry): Add "protected_def".
(_bfd_elf_adjust_dynamic_copy): Update prototype.
* elf-m10300.c (_bfd_mn10300_elf_adjust_dynamic_symbol): Update
_bfd_elf_adjust_dynamic_copy call.
* elf32-arm.c (elf32_arm_adjust_dynamic_symbol): Likewise.
* elf32-cr16.c (_bfd_cr16_elf_adjust_dynamic_symbol): Likewise.
* elf32-cris.c (elf_cris_adjust_dynamic_symbol): Likewise.
* elf32-hppa.c (elf32_hppa_adjust_dynamic_symbol): Likewise.
* elf32-i370.c (i370_elf_adjust_dynamic_symbol): Likewise.
* elf32-i386.c (elf_i386_adjust_dynamic_symbol): Likewise.
* elf32-lm32.c (lm32_elf_adjust_dynamic_symbol): Likewise.
* elf32-m32r.c (m32r_elf_adjust_dynamic_symbol): Likewise.
* elf32-m68k.c (elf_m68k_adjust_dynamic_symbol): Likewise.
* elf32-metag.c (elf_metag_adjust_dynamic_symbol): Likewise.
* elf32-or1k.c (or1k_elf_adjust_dynamic_symbol): Likewise.
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Likewise.
* elf32-s390.c (elf_s390_adjust_dynamic_symbol): Likewise.
* elf32-sh.c (sh_elf_adjust_dynamic_symbol): Likewise.
* elf32-tic6x.c (elf32_tic6x_adjust_dynamic_symbol): Likewise.
* elf32-tilepro.c (tilepro_elf_adjust_dynamic_symbol): Likewise.
* elf32-vax.c (elf_vax_adjust_dynamic_symbol): Likewise.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol): Likewise.
* elf64-s390.c (elf_s390_adjust_dynamic_symbol): Likewise.
* elf64-sh64.c (sh64_elf64_adjust_dynamic_symbol): Likewise.
* elf64-x86-64.c (elf_x86_64_adjust_dynamic_symbol): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_adjust_dynamic_symbol): Likewise.
* elfxx-mips.c (_bfd_mips_elf_adjust_dynamic_symbol): Likewise.
* elfxx-sparc.c (_bfd_sparc_elf_adjust_dynamic_symbol): Likewise.
* elfxx-tilegx.c (tilegx_elf_adjust_dynamic_symbol): Likewise.
In i386 and x86-64 binaries with ifunc, relocations against .got.plt
section may not be in the same order as entries in PLT section. This
patch adds _bfd_elf_ifunc_get_synthetic_symtab. It takes a function
pointer which returns an array of PLT entry symbol values. It calls
the function pointer to get the PLT entry symbol value array indexed
by relocation index, instead of calling plt_sym_val on each relocation
index.
PR binutils/17677
* elf-bfd.h (_bfd_elf_ifunc_get_synthetic_symtab): New prototype.
* elf-ifunc.c (_bfd_elf_ifunc_get_synthetic_symtab): New
function.
* elf32-i386.c (elf_i386_plt_sym_val): Removed.
(elf_backend_plt_sym_val): Likewise.
(elf_i386_get_plt_sym_val): New.
(elf_i386_get_synthetic_symtab): Likewise.
(bfd_elf32_get_synthetic_symtab): Likewise.
* elf64-x86-64.c (elf_x86_64_plt_sym_val): Removed.
(elf_x86_64_plt_sym_val_offset_plt_bnd): Likewise.
(elf_backend_plt_sym_val): Likewise.
(elf_x86_64_get_plt_sym_val): New.
(elf_x86_64_get_synthetic_symtab): Use
_bfd_elf_ifunc_get_synthetic_symtab.
(bfd_elf64_get_synthetic_symtab): Don't undefine for NaCl.
This patch reverts the change in elf_x86_64_check_relocs and the partial
change in elf_x86_64_adjust_dynamic_symbol. Instead, we discard space
in PIE for relocs against symbols which turn out to need copy relocs.
* elf64-x86-64.c (elf_x86_64_check_relocs): Revert the last
change.
(elf_x86_64_adjust_dynamic_symbol): Don't check !info->shared
with ELIMINATE_COPY_RELOCS.
(elf_x86_64_allocate_dynrelocs): For PIE, discard space for
relocs against symbols which turn out to need copy relocs.
This patch allows copy relocs for non-GOT pc-relative relocation in PIE.
bfd/
* elf64-x86-64.c (elf_x86_64_create_dynamic_sections): Always
allow copy relocs for building executables.
(elf_x86_64_check_relocs): Allow copy relocs for non-GOT
pc-relative relocation in shared object.
(elf_x86_64_adjust_dynamic_symbol): Allocate copy relocs for
PIE.
(elf_x86_64_relocate_section): Don't copy a pc-relative
relocation into the output file if the symbol needs copy reloc.
ld/testsuite/
* ld-x86-64/copyreloc-lib.c: New file.
* ld-x86-64/copyreloc-main.c: Likewise.
* ld-x86-64/copyreloc-main.out: Likewise.
* ld-x86-64/copyreloc-main1.rd: Likewise.
* ld-x86-64/copyreloc-main2.rd: Likewise.
* ld-x86-64/x86-64.exp: Run copyreloc tests.
When there are both PLT and GOT references to the same function symbol,
linker will create a GOTPLT slot for PLT entry and a GOT slot for GOT
reference. A run-time JUMP_SLOT relocation is created to update the
GOTPLT slot and a run-time GLOB_DAT relocation is created to update the
GOT slot. Both JUMP_SLOT and GLOB_DAT relocations will apply the same
symbol value to GOTPLT and GOT slots, respectively, at run-time.
This optimization combines GOTPLT and GOT slots into a single GOT slot
and removes the run-time JUMP_SLOT relocation. It replaces the regular
PLT entry:
indirect jump [GOTPLT slot]
push relocation index
jump PLT0
with an GOT PLT entry with an indirect jump via the GOT slot:
indirect jump [GOT slot]
nop
and resolves PLT reference to the GOT PLT entry.
We must avoid this optimization if pointer equality is needed since
we don't clear symbol value in this case and the dynamic linker won't
update the GOT slot. Otherwise, the resulting binary will get into an
infinite loop at run-time.
bfd/
* elf32-i386.c (elf_i386_got_plt_entry): New.
(elf_i386_pic_got_plt_entry): Likewise.
(elf_i386_link_hash_entry): Add plt_got.
(elf_i386_link_hash_table): Likewise.
(elf_i386_link_hash_newfunc): Initialize plt_got.offset to -1.
(elf_i386_get_local_sym_hash): Likewise.
(elf_i386_check_relocs): Create the GOT PLT if there are both
PLT and GOT references when the regular PLT is used.
(elf_i386_allocate_dynrelocs): Use the GOT PLT if there are
both PLT and GOT references unless pointer equality is needed.
(elf_i386_relocate_section): Also check the GOT PLT when
resolving R_386_PLT32.
(elf_i386_finish_dynamic_symbol): Use the GOT PLT if it is
available.
* elf64-x86-64.c (elf_x86_64_link_hash_entry): Add plt_got.
(elf_x86_64_link_hash_table): Likewise.
(elf_x86_64_link_hash_newfunc): Initialize plt_got.offset to -1.
(elf_x86_64_get_local_sym_hash): Likewise.
(elf_x86_64_check_relocs): Create the GOT PLT if there are both
PLT and GOT references when the regular PLT is used.
(elf_x86_64_allocate_dynrelocs): Use the GOT PLT if there are
both PLT and GOT references unless pointer equality is needed.
(elf_x86_64_relocate_section): Also check the GOT PLT when
resolving R_X86_64_PLT32.
(elf_x86_64_finish_dynamic_symbol): Use the GOT PLT if it is
available.
ld/
* emulparams/elf_i386.sh (TINY_READONLY_SECTION): New.
* emulparams/elf_x86_64.sh (TINY_READONLY_SECTION): Add .plt.got.
ld/testsuite/
* ld-i386/i386.exp: Add run-time relocation tests for plt-main.
* ld-i386/plt-main.rd: New file.
* ld-x86-64/plt-main-bnd.dd: Likewise.
* ld-x86-64/plt-main.rd: Likewise.
* ld-x86-64/x86-64.exp: Add run-time relocation tests for
plt-main.
Assert size of elf_x86_64_bnd_plt2_entry and elf_x86_64_legacy_plt2_entry
only in elf_x86_64_check_relocs.
* elf64-x86-64.c (elf_x86_64_check_relocs): Assert size of
elf_x86_64_bnd_plt2_entry and elf_x86_64_legacy_plt2_entry.
(elf_x86_64_allocate_dynrelocs): Don't assert size of
elf_x86_64_bnd_plt2_entry and elf_x86_64_legacy_plt2_entry.
Displacement of branch to PLT0 in x86-64 PLT entry is signed 32-bit.
This patch adds a sanity check. We will only see the failure when PLT
size is > 2GB.
* elf64-x86-64.c (elf_x86_64_finish_dynamic_symbol): Check
branch displacement overflow in PLT entry.
Structions with R_X86_64_GOTTPOFF relocation must be encoded with REX
prefix even if it isn't required by destination register. Otherwise
linker can't safely perform IE -> LE optimization.
bfd/
PR ld/17482
* elf64-x86-64.c (elf_x86_64_relocate_section): Update comments
for IE->LE transition.
gas/
PR ld/17482
* config/tc-i386.c (output_insn): Add a dummy REX_OPCODE prefix
for structions with R_X86_64_GOTTPOFF relocation for x32 if needed.
gas/testsuite/
PR ld/17482
* gas/i386/ilp32/x32-tls.d: New file.
* gas/i386/ilp32/x32-tls.s: Likewise.
ld/testsuite/
PR ld/17482
* ld-x86-64/tlsie4.dd: Updated.
Relocations against .got.plt section may not be in the same order as
entries in PLT section. It is incorrect to assume that the Ith reloction
index against .got.plt section always maps to the (I + 1)th entry in PLT
section. This patch matches the .got.plt relocation offset/index in PLT
entry against the index in .got.plt relocation table. It only checks
R_*_JUMP_SLOT and R_*_IRELATIVE relocations. It ignores R_*_TLS_DESC
and R_*_TLSDESC relocations since they have different PLT entries.
bfd/
PR binutils/17154
* elf32-i386.c (elf_i386_plt_sym_val): Only match R_*_JUMP_SLOT
and R_*_IRELATIVE relocation offset with PLT entry.
* elf64-x86-64.c (elf_x86_64_plt_sym_val): Likewise.
(elf_x86_64_plt_sym_val_offset_plt_bnd): New.
(elf_x86_64_get_synthetic_symtab): Use it.
ld/testsuite/
PR binutils/17154
* ld-ifunc/pr17154-i386.d: New file.
* ld-ifunc/pr17154-x86-64.d: Likewise.
* ld-ifunc/pr17154-x86.s: Likewise.
* ld-x86-64/bnd-ifunc-2.d: Likewise.
* ld-x86-64/bnd-ifunc-2.s: Likewise.
* ld-x86-64/mpx.exp: Run bnd-ifunc-2.
* ld-x86-64/tlsdesc-nacl.pd: Updated.
* ld-x86-64/tlsdesc.pd: Likewise.
Intel MPX introduces 4 bound registers, which will be used for parameter
passing in x86-64. Bound registers are cleared by branch instructions.
Branch instructions with BND prefix will keep bound register contents.
This leads to 2 requirements to 64-bit MPX run-time:
1. Dynamic linker (ld.so) should save and restore bound registers during
symbol lookup.
2. Change the current 16-byte PLT0:
ff 35 08 00 00 00 pushq GOT+8(%rip)
ff 25 00 10 00 jmpq *GOT+16(%rip)
0f 1f 40 00 nopl 0x0(%rax)
and 16-byte PLT1:
ff 25 00 00 00 00 jmpq *name@GOTPCREL(%rip)
68 00 00 00 00 pushq $index
e9 00 00 00 00 jmpq PLT0
which clear bound registers, to preserve bound registers.
We use 2 new relocations:
to mark branch instructions with BND prefix.
When linker sees any R_X86_64_PC32_BND or R_X86_64_PLT32_BND relocations,
it switches to a different PLT0:
ff 35 08 00 00 00 pushq GOT+8(%rip)
f2 ff 25 00 10 00 bnd jmpq *GOT+16(%rip)
0f 1f 00 nopl (%rax)
to preserve bound registers for symbol lookup and it also creates an
external PLT section, .pl.bnd. Linker will create a BND PLT1 entry
in .plt:
68 00 00 00 00 pushq $index
f2 e9 00 00 00 00 bnd jmpq PLT0
0f 1f 44 00 00 nopl 0(%rax,%rax,1)
and a 8-byte BND PLT entry in .plt.bnd:
f2 ff 25 00 00 00 00 bnd jmpq *name@GOTPCREL(%rip)
90 nop
Otherwise, linker will create a legacy PLT1 entry in .plt:
68 00 00 00 00 pushq $index
e9 00 00 00 00 jmpq PLT0
66 0f 1f 44 00 00 nopw 0(%rax,%rax,1)
and a 8-byte legacy PLT in .plt.bnd:
ff 25 00 00 00 00 jmpq *name@GOTPCREL(%rip)
66 90 xchg %ax,%ax
The initial value of the GOT entry for "name" will be set to the the
"pushq" instruction in the corresponding entry in .plt. Linker will
resolve reference of symbol "name" to the entry in the second PLT,
.plt.bnd.
Prelink stores the offset of pushq of PLT1 (plt_base + 0x10) in GOT[1]
and GOT[1] is stored in GOT[3]. We can undo prelink in GOT by computing
the corresponding the pushq offset with
GOT[1] + (GOT offset - &GOT[3]) * 2
Since for each entry in .plt except for PLT0 we create a 8-byte entry in
.plt.bnd, there is extra 8-byte per PLT symbol.
We also investigated the 16-byte entry for .plt.bnd. We compared the
8-byte entry vs the the 16-byte entry for .plt.bnd on Sandy Bridge.
There are no performance differences in SPEC CPU 2000/2006 as well as
micro benchmarks.
Pros:
No change to undo prelink in dynamic linker.
Only 8-byte memory overhead for each PLT symbol.
Cons:
Extra .plt.bnd section is needed.
Extra 8 byte for legacy branches to PLT.
GDB is unware of the new layout of .plt and .plt.bnd.
bfd/
* elf64-x86-64.c (elf_x86_64_bnd_plt0_entry): New.
(elf_x86_64_legacy_plt_entry): Likewise.
(elf_x86_64_bnd_plt_entry): Likewise.
(elf_x86_64_legacy_plt2_entry): Likewise.
(elf_x86_64_bnd_plt2_entry): Likewise.
(elf_x86_64_bnd_arch_bed): Likewise.
(elf_x86_64_link_hash_entry): Add has_bnd_reloc and plt_bnd.
(elf_x86_64_link_hash_table): Add plt_bnd.
(elf_x86_64_link_hash_newfunc): Initialize has_bnd_reloc and
plt_bnd.
(elf_x86_64_copy_indirect_symbol): Also copy has_bnd_reloc.
(elf_x86_64_check_relocs): Create the second PLT for Intel MPX
in 64-bit mode.
(elf_x86_64_allocate_dynrelocs): Handle the second PLT for IFUNC
symbols. Resolve call to the second PLT if it is created.
(elf_x86_64_size_dynamic_sections): Keep the second PLT section.
(elf_x86_64_relocate_section): Resolve PLT references to the
second PLT if it is created.
(elf_x86_64_finish_dynamic_symbol): Use BND PLT0 and fill the
second PLT entry for BND relocation.
(elf_x86_64_finish_dynamic_sections): Use MPX backend data if
the second PLT is created.
(elf_x86_64_get_synthetic_symtab): New.
(bfd_elf64_get_synthetic_symtab): Likewise. Undefine for NaCl.
ld/
* emulparams/elf_x86_64.sh (TINY_READONLY_SECTION): New.
ld/testsuite/
* ld-x86-64/mpx.exp: Run bnd-ifunc-1 and bnd-plt-1.
* ld-x86-64/bnd-ifunc-1.d: New file.
* ld-x86-64/bnd-ifunc-1.s: Likewise.
* ld-x86-64/bnd-plt-1.d: Likewise.
It has been fixed by
commit 4199e3b866
Author: Alan Modra <amodra@gmail.com>
Date: Wed Jan 15 21:50:55 2014 +1030
non-PIC references to __ehdr_start in pie and shared
Rather than hacking every backend to not discard dynamic relocations
against an undefined hidden __ehdr_start, make it appear to be defined
early. We want __ehdr_start hidden before size_dynamic_sections so
that it isn't put in .dynsym, but we do need the dynamic relocations
for a PIE or shared library with a non-PIC reference. Defining it
early is wrong if we don't actually define the symbol later to its
proper value. (In some cases we want to leave the symbol undefined,
for example, when the ELF header isn't loaded, and we don't have this
infomation available in before_allocation.)
* elf32-i386.c (elf_i386_allocate_dynrelocs): Revert the last
change.
* elf64-x86-64.c (elf_x86_64_allocate_dynrelocs): Likewise.
__ehdr_start will be defined by assign_file_positions_for_non_load_sections
later.
PR ld/16428
* elf32-i386.c (elf_i386_allocate_dynrelocs): Don't discard relocs
against __ehdr_start.
* elf64-x86-64.c (elf_x86_64_allocate_dynrelocs): Likewise.
PR ld/16428
* elf32-i386.c (elf_i386_allocate_dynrelocs): Don't update reloc
count if there are any non pc-relative relocs.
* elf64-x86-64.c (elf_x86_64_allocate_dynrelocs): Likewise.
bfd/
* archures.c (bfd_mach_i386_nacl): Fix definition so it doesn't
collide with bfd_mach_l1om.
* bfd-in2.h: Regenerate.
* elf32-i386.c (elf32_i386_nacl_elf_object_p): New function.
(elf_backend_object_p): Use that in elf32-i386-nacl definition.
* elf64-x86-64.c (elf64_x86_64_nacl_elf_object_p): New function.
(elf_backend_object_p): Use that in elf64-x86-64-nacl definition.
(elf32_x86_64_nacl_elf_object_p): New function.
(elf_backend_object_p): Use that in elf32-x86-64-nacl definition.
binutils/
* objdump.c (dump_dwarf): Grok bfd_mach_x86_64_nacl and
bfd_mach_x64_32_nacl as equivalent to bfd_mach_x86_64.
ld/testsuite/
* ld-x86-64/x86-64.exp (mixed1, mixed2): Loosen error string match
so it accepts "i386:nacl" in place of "i386".
* ld-x86-64/ilp32-2.d: Likewise.
* ld-x86-64/ilp32-3.d: Likewise.
* ld-x86-64/lp64-2.d: Likewise.
* ld-x86-64/lp64-3.d: Likewise.