There are legitimate reasons to allow a signed value in a cmpli insn
field, for example to test for a "stw r1,lock@sdarel(r13)" instruction
in user code, a kernel might use
subis r3,r3,STW_R1_0R13@ha # subtract off high part
cmplwi r3,lock@sdarel # is low part accessing lock?
Since the lock@sdarel may take a range of -32768 to 32767,
the allowed range of cmpli immediate must be at least [-32768,65535].
bfd/
* elf32-ppc.c (ppc_elf_relocate_section): Treat field of cmpli
insn as a bitfield; Use complain_overflow_bitfield.
* elf64-ppc.c (ppc64_elf_relocate_section): Likewise.
opcodes/
* ppc-opc.c (UISIGNOPT): Define and use with cmpli.
gas/
* config/tc-ppc.c (ppc_insert_operand): Handle PPC_OPERAND_SIGNOPT
on unsigned fields. Comment on PPC_OPERAND_SIGNOPT signed fields
in 64-bit mode.
gold/
* powerpc.cc (relocate): Treat field of cmpli insn as a bitfield.
Power8 fuses addis,addi and addis,ld sequences when the target of the
addis is the same as the addi/ld. Thus
addis r12,r2,xxx@ha
addi r12,r12,xxx@l / ld r12,xxx@l(r12)
is faster than
addis r11,r2,xxx@ha
addi r12,r11,xxx@l / ld r12,xxx@l(r11)
So use the form that allows fusion in plt call and branch stubs.
bfd/
* elf64-ppc.c (ADDIS_R12_R2): Define.
(build_plt_stub): Support fusion on ELFv2 stub.
(ppc_build_one_stub): Likewise for plt branch stubs.
gold/
* powerpc.cc (addis_12_2): Define.
(Stub_table::do_write): Support fusion on ELFv2 stubs.
ld/testsuite/
* ld-powerpc/elfv2exe.d: Update for changed plt call stubs.
gdb/
* ppc64-tdep.c (ppc64_standard_linkage8): New.
(ppc64_skip_trampoline_code): Recognise ELFv2 stub supporting fusion.
ELFv2 doesn't use .opd, so folding function code can't be allowed
in safe mode if a function's address might be taken.
* powerpc.cc (Target_powerpc::local_reloc_may_be_function_pointer):
Only ignore relocs on ELFv1.
(Target_powerpc::global_reloc_may_be_function_pointer): Likewise.
When linking statically, it's possible to hit this warning with IFUNC
or very large executables, due to .glink being unused.
* powerpc.cc (do_plt_fde_location): Handle zero length .glink.
Compare FDE contents with DW_CFA_nop rather than 0.
to access a global as it expects a GOTPCREL relocation. This is really not
necessary as the linker could use a copy relocation to get around it. This
patch enables copy relocations with pie.
Context:
This is useful because currently the GCC compiler with option -fpie makes
every extern global access go through the GOT. That is because the compiler
cannot tell if a global will end up being defined in the executable or not
and is conservative. This ends up hurting performance when the binary is linked
as mostly static where most of the globals do end up being defined in the
executable. By allowing copy relocs with fPIE, the compiler need not generate
a GOTPCREL(GOT access) for any global access. It can safely assume that all
globals will be defined in the executable and generate a PC-relative access
instead. Gold can then create a copy reloc for only the undefined globals.
gold/
* symtab.h (may_need_copy_reloc): Remove check for position independent
code.
* x86_64.cc (Target_x86_64<size>::Scan::global): Add check for no
position independence before pc absolute may_need_copy_reloc call.
Add check for executable output befor pc relative may_need_copy_reloc
call.
* i386.cc: Ditto.
* arm.cc: Ditto.
* sparc.cc: Ditto.
* tilegx.cc: Ditto.
* powerpc.cc: Add check for no position independence before
may_need_copy_reloc calls.
* testsuite/pie_copyrelocs_test.cc: New file.
* testsuite/pie_copyrelocs_shared_test.cc: New file.
* Makefile.am (pie_copyrelocs_test): New test.
* Makefile.in: Regenerate.
R_PPC64_ADDR16 is used in three contexts:
- .short data relocation
- 16-bit signed insn fields, eg. addi
- 16-bit unsigned insn fields, eg. ori
In the first case we want to allow both signed and unsigned 16-bit
values, the latter two ought to error if the field exceeds the range
of values allowed for 16-bit signed and unsigned integers
respectively. These conflicting requirements meant that ld had to
choose the least restrictive overflow checks, and thus it is possible
to construct testcases where an addi field overflows but is not
reported by ld. Many relocations dealing with 16-bit insn fields have
this problem. What's more, some relocations that are only ever used
for signed fields of instructions woodenly copied the lax overflow
checking of R_PPC64_ADDR16.
bfd/
* elf64-ppc.c (ppc64_elf_howto_raw): Use complain_overflow_signed
for R_PPC64_ADDR14, R_PPC64_ADDR14_BRTAKEN, R_PPC64_ADDR14_BRNTAKEN,
R_PPC64_SECTOFF, R_PPC64_ADDR16_DS, R_PPC64_SECTOFF_DS,
R_PPC64_REL16 entries. Use complain_overflow_dont for R_PPC64_TOC.
(ppc64_elf_relocate_section): Modify overflow test for 16-bit
fields in instructions to signed/unsigned according to whether
the field takes a signed or unsigned value.
gold/
* powerpc.cc (Powerpc_relocate_functions::Overflow_check): Add
CHECK_UNSIGNED, CHECK_LOW_INSN, CHECK_HIGH_INSN.
(Powerpc_relocate_functions::has_overflow_unsigned): New function.
(Powerpc_relocate_functions::has_overflow_bitfield,
overflowed): Use the above.
(Target_powerpc::Relocate::relocate): Correct overflow checking
for a number of relocations. Modify overflow test for 16-bit
fields in instructions to signed/unsigned according to whether
the field takes a signed or unsigned value.
This adds support for "func@localentry", an expression that returns the
ELFv2 local entry point address of function "func". I've excluded
dynamic relocation support because that obviously would require glibc
changes.
include/elf/
* ppc64.h (R_PPC64_REL24_NOTOC, R_PPC64_ADDR64_LOCAL): Define.
bfd/
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_ADDR64_LOCAL entry.
(ppc64_elf_reloc_type_lookup): Support R_PPC64_ADDR64_LOCAL.
(ppc64_elf_check_relocs): Likewise.
(ppc64_elf_relocate_section): Likewise.
* Add BFD_RELOC_PPC64_ADDR64_LOCAL.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Support @localentry.
(md_apply_fix): Support R_PPC64_ADDR64_LOCAL.
ld/testsuite/
* ld-powerpc/elfv2-2a.s, ld-powerpc/elfv2-2b.s: New files.
* ld-powerpc/elfv2-2exe.d, ld-powerpc/elfv2-2so.d: New files.
* ld-powerpc/powerpc.exp: Run new test.
elfcpp/
* powerpc.h (R_PPC64_REL24_NOTOC, R_PPC64_ADDR64_LOCAL): Define.
gold/
* powerpc.cc (Target_powerpc::Scan::local, global): Support
R_PPC64_ADDR64_LOCAL.
(Target_powerpc::Relocate::relocate): Likewise.
* powerpc.cc (Target_powerpc::glink_section): Provide non-const
accessor.
(Target_powerpc::Branch_info::make_stub): Make global entry stubs.
Only call ppc64_local_entry_offset for 64-bit. Restrict
symval_for_branch lookup to ELFv1.
(Stub_table::add_plt_call_entry): Use unsigned int off.
(Output_data_glink::Address, invalid_address): New.
(Output_data_glink::add_eh_frame): Move out of line. Add
support for ELFv2.
(Output_data_glink::add_global_entry, find_global_entry,
global_entry_address): New functions.
(Output_data_glink::global_entry_stubs_, end_branch_table_,
ge_size): New variables.
(Output_data_glink::set_final_data_size): Add global entry
stub sizing.
(Output_data_glink::do_write): Write global entry stubs.
(Target_powerpc::Scan::reloc_needs_plt_for_ifunc): Add target
parameter. Return true for ELFv2. Adjust callers.
(Target_powerpc::Scan::local, global): Restrict opd lookup to
ELFv1. Similarly for ifunc and dynamic relocation processing
specific to ELFv1. Recognize that symbols are defined on
their plt entries for ELFv2.
(Target_powerpc::symval_for_branch): Assert if called for
ELFv2 or ppc32.
(Target_powerpc::Relocate::relocate): Use global entry plt
stub for symbol value if such exists on ELFv2.
(Target_powerpc::Relocate::relocate): Don't call
symval_for_branch when ELFv2. Do adjust for local entry
offset when ELFv2.
(Target_powerpc::do_dynsym_value): Set symbols to global entry
plt stub for ELFv2.
(Target_powerpc::do_plt_address_for_global): Similarly.
This adds an extra flag for needs_dynamic_reloc() in order to remove
the copy of this function and use_plt_offset() in powerpc.cc, and
tweaks the powerpc get_reference_flags() to return the flag as
appropriate. ELFv2 does not want ELFv1 behaviour here.
* symtab.h (Symbol::Reference_flags): Add FUNC_DESC_ABI.
(Symbol::needs_dynamic_reloc): Test new flag.
* powerpc.cc (needs_dynamic_reloc, use_plt_offset): Delete.
(Target_powerpc::Scan::get_reference_flags): Add target param.
Return FUNC_DESC_ABI for 64-bit ELFv1.
(Target_powerpc::Branch_info::make_stub): Adjust get_reference_flags
call.
(Target_powerpc::Scan::global): Use Symbol::needs_dynamic_reloc.
(Target_powerpc::Relocate::relocate): Use Symbol::use_plt_offset.
elfcpp/
* powerpc.h (EF_PPC64_ABI): New enum constant.
(STO_PPC64_LOCAL_BIT, STO_PPC64_LOCAL_MASK): Likewise.
(ppc64_decode_local_entry): New function.
(ppc64_encode_local_entry): Likewise.
gold/
* powerpc.cc (Powerpc_relobj::abiversion, set_abiversion,
ppc64_local_entry_offset, ppc64_local_entry_offset,
do_read_symbols): New functions.
(Powerpc_relobj::e_flags_, st_other_): New vars.
(Powerpc_relobj::Powerpc_relobj): Call set_abiversion.
(Powerpc_dynobj::abiversion, set_abiversion): New functions.
(Powerpc_relobj::e_flags_): New var.
(Target_powerpc::first_plt_entry_offset, plt_entry_size): Inline
and adjust for ELFv2.
(Target_powerpc::abiversion, set_abiversion, stk_toc): New functions.
(Powerpc_relobj::do_find_special_sections): Check no .opd in ELFv2.
(Powerpc_dynobj::do_find_special_sections): Likewise.
(Target_powerpc::do_define_standard_symbols): Define ".TOC.".
(Target_powerpc::Branch_info::make_stub): Adjust stub destination
to ELFv2 local entry.
(Target_powerpc::do_relax): No thread safe barriers needed for
ELFv2.
(Output_data_plt_powerpc::initial_plt_entry_size_,
plt_entry_size): Delete. Replace all uses with
first_plt_entry_offset() and plt_entry_size().
(Output_data_plt_powerpc::Output_data_plt_powerpc): Remove
reserved_size parm. Update callers.
(Output_data_plt_powerpc::entry_count): Update.
(Output_data_plt_powerpc::first_plt_entry_offset): Make private
and use Target_powerpc::first_plt_entry_offset().
(Output_data_plt_powerpc::get_plt_entry_size): Similarly and
rename to plt_entry_size.
(Output_data_plt_powerpc::add_ifunc_entry,
add_local_ifunc_entry): Adjust reloc for ELFv2.
(glink_eh_frame_fde_64): Rename to glink_eh_frame_fde_64v1.
(glink_eh_frame_fde_64v2): New.
(Stub_table::plt_call_size): Support ELFv2 sizing.
(Output_data_glink::add_eh_frame): Use the new FDE.
(Output_data_glink::set_final_data_size): Adjust for ELFv2 glink.
(Stub_table::do_write): Write ELFv2 stubs and glink.
(Target_powerpc::Relocate::relocate): Replaces nop after call
with ld 2,24(1) and adjust local offset destination for ELFv2.
This changes the behaviour of @h and @ha on PowerPC64 to report errors
on 32-bit overflow. The motivation for this change is that on
PowerPC64, most uses of @h and @ha modifiers and their corresponding
relocations are to build up 32-bit offsets. We'd like to know when
such offsets overflow. Only rarely do people use @h or @ha with the
high 32-bit modifiers to build a 64-bit constant. Those uses will now
need to use two new modifiers, @high and @higha, if the constant isn't
known at assembly time. For now, we won't report overflow at assembly
time..
This also fixes an error when applying some of the HIGHER and HIGHEST
relocations.
include/elf/
* ppc64.h (R_PPC64_ADDR16_HIGH, R_PPC64_ADDR16_HIGHA,
R_PPC64_TPREL16_HIGH, R_PPC64_TPREL16_HIGHA,
R_PPC64_DTPREL16_HIGH, R_PPC64_DTPREL16_HIGHA): New.
(IS_PPC64_TLS_RELOC): Match new tls relocs.
bfd/
* reloc.c (BFD_RELOC_PPC64_ADDR16_HIGH, BFD_RELOC_PPC64_ADDR16_HIGHA,
BFD_RELOC_PPC64_TPREL16_HIGH, BFD_RELOC_PPC64_TPREL16_HIGHA,
BFD_RELOC_PPC64_DTPREL16_HIGH, BFD_RELOC_PPC64_DTPREL16_HIGHA): New.
* elf64-ppc.c (ppc64_elf_howto_raw): Add entries for new relocs.
Make all _HA and _HI relocs report signed overflow.
(ppc64_elf_reloc_type_lookup): Handle new relocs.
(must_be_dyn_reloc, ppc64_elf_check_relocs): Likewise.
(dec_dynrel_count, ppc64_elf_relocate_section): Likewise.
(ppc64_elf_relocate_section): Don't apply 0x8000 adjust to
R_PPC64_TPREL16_HIGHER, R_PPC64_TPREL16_HIGHEST,
R_PPC64_DTPREL16_HIGHER, and R_PPC64_DTPREL16_HIGHEST.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
gas/
* config/tc-ppc.c (SEX16): Don't mask.
(REPORT_OVERFLOW_HI): Define as zero.
(ppc_elf_suffix): Support @high, @higha, @dtprel@high, @dtprel@higha,
@tprel@high, and @tprel@higha modifiers.
(md_assemble): Ignore X_unsigned when applying 16-bit insn fields.
Add (disabled) code to check @h and @ha reloc overflow for powerpc64.
Handle new relocs.
(md_apply_fix): Similarly.
elfcpp/
* powerpc.h (R_PPC64_ADDR16_HIGH, R_PPC64_ADDR16_HIGHA,
R_PPC64_TPREL16_HIGH, R_PPC64_TPREL16_HIGHA,
R_PPC64_DTPREL16_HIGH, R_PPC64_DTPREL16_HIGHA): Define.
gold/
* powerpc.cc (Target_powerpc::Scan::check_non_pic): Handle new relocs.
(Target_powerpc::Scan::global, local): Likewise.
(Target_powerpc::Relocate::relocate): Likewise. Check for overflow
on all ppc64 @h and @ha relocs.
(Output_data_got::add_constant_pair): New function.
* powerpc.cc (Output_data_got_powerpc): Override all Output_data_got
methods used so as to first call reserve_ent().
* object.cc (Sized_relobj::do_output_section_address): New function.
(Sized_relobj): Instantiate explicitly.
* object.h (Object::output_section_address): New function.
(Object::do_output_section_address): New function.
(Sized_relobj::do_output_section_address): New function.
* powerpc.cc (Target_powerpc::symval_for_branch): Use it.
* elf64-ppc.c (struct ppc_stub_hash_entry): Delete "addend".
(ppc64_elf_size_stubs): Don't set "addend".
(ppc64_elf_relocate_section): Don't allow calls via
toc-adjusting stubs without a following nop even in an
executable, except for self-calls and both libc_start_main
and .libc_start_main.
gold/
* powerpc.cc (Target_powerpc::Relocate::relocate): Update self-call
comment.
* powerpc.cc (Output_data_brlt_powerpc::reset_brlt_sizes): New
function.
(Output_data_brlt_powerpc::finalize_brlt_sizes): New function.
(Target_powerpc::do_relax): Call the above.
on garbage collected .opd section.
* powerpc.cc (Target_powerpc::do_gc_add_reference): Test dst_shndx
is non-zero.
(Target_powerpc::do_gc_mark_symbols): Likewise for sym->shndx().
(Target_powerpc::do_function_location): Likewise for loc->shndx.
owner when sections are not adjacent and exceed group size.
(Target_powerpc::group_sections): Handle corner case.
(Target_powerpc::Branch_info::make_stub): Handle case where
stub table doesn't exist due to branches in non-exec sections.
(Target_powerpc::Relocate::relocate): Likewise.
static and public. Add report_err param. Return false for data refs.
(Target_powerpc::rela_dyn_section): New overloaded function.
(Target_powerpc::plt_, iplt_): Elucidate.
(Output_data_plt_powerpc::entry_count): Handle current_data_size()==0.
(Output_data_plt_powerpc::do_write): Don't write .iplt.
(Output_data_plt_powerpc::plt_entry_count): Don't add .iplt entries.
(Target_powerpc::Scan::local, global): Adjust reloc_needs_plt_for_ifunc
calls. Put ifunc dynamic relocs in irela_dyn_section. Only
push_branch and make_plt_entry for ifunc syms when
reloc_needs_plt_for_ifunc.
(Target_powerpc::Relocate::relocate): Don't use plt entry value
for ifunc unless reloc_needs_plt_for_ifunc.
* target.cc (Target::do_plt_fde_location): New function.
* ehframe.h (class FDE): Add post_map field to u_.from_linker,
accessor function, and constructor param.
(struct Post_fde, Post_fdes): Declare.
(Cie::write): Add post_fdes param.
* ehframe.cc (Fde::write): Use plt_fde_location.
(struct Post_fde): Define.
(Cie::write): Stash FDEs added post merge mapping.
(Eh_frame::add_ehframe_for_plt): Assert no new CIEs after mapping.
Adjust Fde constructor call. Bump final_data_size_ for post map FDEs.
(Eh_frame::do_sized_write): Arrange to write post map FDES after
other FDEs.
* powerpc.cc (Target_powerpc::do_plt_fde_location): New function.
(Target_powerpc::has_glink): New function.
(Target_powerpc::do_relax): Add eh_frame info for stubs.
(struct Eh_cie, eh_frame_cie, glink_eh_frame_fde_64,
glink_eh_frame_fde_32, default_fde): New data.
(Stub_table::eh_frame_added_): New var.
(Stub_table::find_long_branch_entry, stub_address, stub_offset):
Make const.
(Stub_table::add_eh_frame): New function.
(Output_data_glink::add_eh_frame): New function.
(Target_powerpc::make_glink_section): Call add_eh_frame.
(class Relocate, class Scan): Inherit Track_tls.
(Target_powerpc::Scan::local, global): Track tls optimization
and avoid creating plt entry for __tls_get_addr if all uses
are optimized away.
(Target_powerpc::Relocate::relocate): Update to use Track_tls.
plt_thread_safe. Update stub_group_size help text.
* powerpc.cc (Target_powerpc::plt_thread_safe): New access function
for new plt_thread_safe_ var.
(use_plt_offset): Correct comments.
(Target_powerpc::do_relax): Look for thread creation symbols to
determine default plt_thread_safe value. Clear plt call stubs
as well as branch stubs each iteration.
(add_2_2_11, add_12_12_11, bnectr_p4, cmpldi_2_0, xor_11_11_11): New
insn constants.
(l, hi, ha, write_insn): Move earlier.
(Stub_table): Delete prev_size, add last_plt_size and last_branch_size.
(Stub_table::clear_stubs): Rename from clear_long_branch_stubs, clear
plt stubs too.
(Stub_table::update_size): Adjust.
(Stub_table::prev_size, set_prev_size): Delete.
(Stub_table::stub_align): Let --plt-align affect result.
(Stub_table::plt_call_size): Calculate sizes for various stubs.
(Stub_table::branch_stub_size): Use last_plt_size in address calc.
(Stub_table::add_plt_call_stub): Pass iterator to plt_call_size.
(Stub_table::do_write): Support more stub variants.
* layout.cc (Layout::get_executable_sections): New function.
* arm.cc (Target_arm::group_sections): Use it.
(Arm_output_section::group_sections): Delete now redundant test.
* output.cc (Output_reloc::Output_reloc): Add is_relative.
param to handle relative relocs.
* output.h (Output_reloc::Output_reloc <absolute reloc>): Likewise.
(Output_data_reloc::add_absolute): Adjust.
(Output_data_reloc::add_relative): New function.
(Output_data::reset_data_size): New function.
(Output_relaxed_input_section::set_relobj, set_shndx): New functions.
(Output_section::set_addralign): New function.
(Output_section::checkpoint_set_addralign): New function.
(Output_section::clear_section_offsets_need_adjustment): New function.
(Output_section::input_sections): Make public.
* powerpc.cc (class Output_data_brlt_powerpc): New.
(class Stub_table, class Stub_control): New.
(Powerpc_relobj::has14_, set_has_14bit_branch, has_14bit_branch,
stub_table_, set_stub_table, stub_table): New vectors and accessor
functions.
(Target_powerpc::do_may_relax, do_relax, push_branch,
new_stub_table, stub_tables, brlt_section, group_sections,
add_branch_lookup_table, find_branch_lookup_table,
write_branch_lookup_table, make_brlt_section): New functions.
(Target_powerpc::struct Sort_sections, class Branch_info): New.
(Target_powerpc::brlt_section_, stub_tables_, branch_lookup_table_,
branch_info_): New vars.
(Target_powerpc::make_plt_entry, make_local_ifunc_plt_entry): Don't
make call stubs here.
(Output_data_glink): Remove all call stub handling from this class.
(Target_powerpc::Scan::local, global): Save interesting branch
relocs and relocs for ifunc. Adjust calls to plt entry functions.
(Target_powerpc::do_finalize_sections): Only make reg save/restore
functions on final link.
(Target_powerpc::Relocate::relocate): Adjust lookup of call stubs.
Handle long branch destinations too.
(Target_powerpc::do_dynsym_value, do_plt_address_for_global,
do_plt_address_for_local): Adjust lookup of plt call stubs.
(struct Opd_ent): Use "Address" rather than "Offset".
(Output_data_got_powerpc::got_base_offset): Return Valtype.
(Target_powerpc::got_section): Make public.
(Target_powerpc::scan_relocs): Move code setting symbols..
(Powerpc_relobj::do_scan_relocs): ..to here, new function.
Create _SDA_BASE_ only when referenced.
dynamic relocs for GOT_TPREL got entries, without symbol if
resolving locally.
(Target_powerpc::do_gc_add_reference): Don't add for dynamic objects.
(Target_powerpc::scan_relocs): Define _GLOBAL_OFFSET_TABLE_ early.
(Target_powerpc::Relocate:relocate): REL32 reloc may be unaligned.
(Target_powerpc::do_finalize_sections): Call it.
(Output_data_save_res): New class and supporting functions.
(Target_powerpc::symval_for_branch): Only look up .opd entry for
normal symbols defined in object files.