On x86-solaris 10, we noticed that starting a program would sometimes
cause the debugger to crash. For instance:
% gdb a
(gdb) break adainit
Breakpoint 1 at 0x8051f03
(gdb) run
Starting program: /[...]/a
[Thread debugging using libthread_db enabled]
zsh: 24398 segmentation fault (core dumped) /[...]/gdb a
The exception occurs in dtrace_process_dof_probe, while trying
to process each probe referenced by a DTRACE_DOF_SECT_TYPE_PROVIDER
DOF section from /lib/libc.so.1. For reference, the ELF section
in that shared library providing the DOF data has the following
characteristics:
Idx Name Size VMA LMA File off Algn
14 .SUNW_dof 0000109d 000b4398 000b4398 000b4398 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
The function dtrace_process_dof gets passed the contents of that
ELF section, which allows it to determine the location of the table
where all DOF sections are described. I dumped the contents of
each DOF section as seen by GDB, and it seemed to be plausible,
because the offset of each DOF section was pretty much equal to
the sum of the offset and size of the previous DOF section. Also,
the offset + sum of the last section corresponds to the size of
the .SUNW_dof section.
Things start to break down when processing one of the DOF sections
that has a type of DTRACE_DOF_SECT_TYPE_PROVIDER. It gets the contents
of this DOF section via:
struct dtrace_dof_provider *provider = (struct dtrace_dof_provider *)
DTRACE_DOF_PTR (dof, DOF_UINT (dof, section->dofs_offset));
Said more simply, the struct dtrace_dof_provider data is at
section->dofs_offset of the entire DOF contents. Given that
the contents of SECTION seemed to make sense, so far so good.
However, what SECTION tells us is that our DOF provider section
is 40 bytes long:
(gdb) print *section
$36 = {dofs_type = 15, dofs_align = 4, dofs_flags = 1,
dofs_entsize = 0, dofs_offset = 3264, dofs_size = 40}
^^^^^^^^^^^^^^
But on the other hand:
(gdb) p sizeof (struct dtrace_dof_provider)
$54 = 44
In other words GDB expected a bigger DOF section and when we try to
fetch the value of the last field of that DOF section (dofpv_prenoffs)...
eoffsets_s = DTRACE_DOF_SECT (dof,
DOF_UINT (dof, provider->dofpv_prenoffs));
... we end up reading data that actually belongs to another DOF
section, and therefore irrelevant. This in turn means that the value
of eofftab gets incorrectly set, since it depends on eoffsets_s:
eofftab = DTRACE_DOF_PTR (dof, DOF_UINT (dof, eoffsets_s->dofs_offset));
This invalid address quickly catches up to us when we pass it to
dtrace_process_dof_probe shortly after, where we crash because
we try to subscript it:
Program received signal SIGSEGV, Segmentation fault.
0x08155bba in dtrace_process_dof_probe ([...]) at [...]/dtrace-probe.c:378
378 = ((uint32_t *) eofftab)[...];
This patch fixes the issue by detecting provider DOF sections
that are smaller than expected, and discarding the DOF data.
gdb/ChangeLog:
* dtrace-probe.c (dtrace_process_dof): Ignore the objfile's DOF
data if a DTRACE_DOF_SECT_TYPE_PROVIDER section is found to be
smaller than expected.
The hidden versioned symbol can only be merged with the versioned
symbol with the same symbol version. _bfd_elf_merge_symbol should
check the symbol version before merging the new hidden versioned
symbol with the existing symbol. _bfd_elf_link_hash_copy_indirect can't
copy any references to the hidden versioned symbol. We need to
bind a symbol locally when linking executable if it is locally defined,
hidden versioned, not referenced by shared library and not exported.
bfd/
PR ld/18720
* elflink.c (_bfd_elf_merge_symbol): Add a parameter to indicate
if the new symbol matches the existing one. The new hidden
versioned symbol matches the existing symbol if they have the
same symbol version. Update the existing symbol only if they
match.
(_bfd_elf_add_default_symbol): Update call to
_bfd_elf_merge_symbol.
(_bfd_elf_link_assign_sym_version): Don't set the hidden field
here.
(elf_link_add_object_symbols): Override a definition only if the
new symbol matches the existing one.
(_bfd_elf_link_hash_copy_indirect): Don't copy any references to
the hidden versioned symbol.
(elf_link_output_extsym): Bind a symbol locally when linking
executable if it is locally defined, hidden versioned, not
referenced by shared library and not exported. Turn on
VERSYM_HIDDEN only if the hidden vesioned symbol is defined
locally.
ld/testsuite/
PR ld/18720
* ld-elf/indirect.exp: Run tests for PR ld/18720.
* ld-elf/pr18720.out: New file.
* ld-elf/pr18720a.c: Likewise.
* ld-elf/pr18720b.c: Likewise.
* ld-elf/pr18720c.c: Likewise.
The get_frame_language feels like it would be more at home in frame.c
rather than in stack.c, while the declaration, that is currently in
language.h can be moved into frame.h to match.
A couple of new includes are added, but otherwise no substantial change
here.
gdb/ChangeLog:
* stack.c (get_frame_language): Moved ...
* frame.c (get_frame_language): ... to here.
* language.h (get_frame_language): Declaration moved to frame.h.
* frame.h: Add language.h include, for language enum.
(get_frame_language): Declaration moved from language.h.
* language.c: Add frame.h include.
* top.c: Add frame.h include.
* symtab.h (struct obj_section): Declare.
(struct cmd_list_element): Declare.
As part of a drive to remove deprecated_safe_get_selected_frame, make
the get_frame_language function take a frame parameter. Given the name
of the function this actually seems to make a lot of sense.
The task of fetching a suitable frame is then passed to the calling
functions. For get_frame_language there are not many callers, these are
updated to get the selected frame in a suitable way.
gdb/ChangeLog:
* language.c (show_language_command): Find selected frame before
asking for the language of that frame.
(set_language_command): Likewise.
* language.h (get_frame_language): Add frame parameter.
* stack.c (get_frame_language): Add frame parameter, assert
parameter is not NULL, update comment and reindent.
* top.c (check_frame_language_change): Pass the selected frame
into get_frame_language.
When using options such as --localize-symbol, --globalize-symbol, etc,
along with the --wildcard option, prefixing a symbol name with '!'
should provide non-matching behaviour, as example the following example
is given in the manual:
--wildcard --weaken-symbol !foo --weaken-symbol fo*
which should weaken all symbols matching the pattern 'fo*', but not the
symbol 'foo'.
However, this currently does not work, the current logic will waken all
symbols matching the pattern 'fo*' AND all symbols that are not 'foo'.
The symbol 'foo' is covered by the first condition, and so is weakened,
while, other symbols, for example 'bar' will match the second condition,
and so be weakened.
This patch adjusts the logic so that a pattern prefixed with '!'
specifically DOES NOT apply the relevant change to any matching symbols,
instead of applying the change to all non-matching symbols. So this:
--weaken-symbol !foo
will ensure that the symbol 'foo' is not weakened, but says nothing
about symbols that are not 'foo'. As a result, a pattern prefixed with
'!' now only makes sense when used alongside a more wide ranging
wildcard pattern.
This change should make the wildcard matching feature more useful, with
no overall loss of functionality. The example given in the manual,
weaken all symbols matching 'fo*' except 'foo' can now be achieved, but
so too can more complex examples, such as weaken all symbols matching
'fo*' except 'foo', 'foa', and 'fob', like this:
--wildcard --weaken-symbol !foo \
--weaken-symbol !foa \
--weaken-symbol !fob \
--weaken-symbol fo*
Under the previous scheme, something as symbols as, weaken all symbols
except 'foo' could have been achieved with this:
--weaken-symbol !foo
however, this will no longer work. To achieve the same result under the
new scheme this is now required:
--weaken-symbol !foo --weaken-symbol *
binutils/ChangeLog:
* objcopy.c (is_specified_symbol_predicate): Don't stop at first
match. Non-match rules set found to FALSE.
binutils/testsuite/ChangeLog:
* binutils-all/objcopy.exp: Run new symbol tests.
(objcopy_test_symbol_manipulation): New function.
* binutils-all/symbols-1.d: New file.
* binutils-all/symbols-2.d: New file.
* binutils-all/symbols-3.d: New file.
* binutils-all/symbols-4.d: New file.
* binutils-all/symbols.s: New file.
Indicate speculatively executed instructions with a leading '?'. We use the
space that is normally used for the PC prefix. In the case where the
instruction at the current PC had been executed speculatively before, the PC
prefix will be partially overwritten resulting in "?> ".
As a side-effect, the /p modifier to omit the PC prefix in the "record
instruction-history" command now uses a 3-space PC prefix " " in order to
have enough space for the speculative execution indication.
gdb/
* btrace.c (btrace_compute_ftrace_bts): Clear insn flags.
(pt_btrace_insn_flags): New.
(ftrace_add_pt): Call pt_btrace_insn_flags.
* btrace.h (btrace_insn_flag): New.
(btrace_insn) <flags>: New.
* record-btrace.c (btrace_insn_history): Print insn prefix.
* NEWS: Announce it.
doc/
* gdb.texinfo (Process Record and Replay): Document prefixing of
speculatively executed instructions in the "record instruction-history"
command.
testsuite/
* gdb.btrace/instruction_history.exp: Update.
* gdb.btrace/tsx.exp: New.
* gdb.btrace/tsx.c: New.
* lib/gdb.exp (skip_tsx_tests, skip_btrace_pt_tests): New.
Intel(R) Processor Trace support requires a recent linux/perf_event.h header.
When GDB is built on an older system, Intel(R) Processor Trace will not be
available and there is no indication in the configure and build log as to
what went wrong.
Check for a compatible linux/perf_event.h at configure-time.
gdb/
* configure.ac: Check for PERF_ATTR_SIZE_VER5 in linux/perf_event.h
* configure: Regenerate.
The buildbot shows that PPC64 and x86_64 builders, both native and
extended-remote gdbserver frequently timeout these tests.
until-precsave.exp times out on my x86_64 occasionally as well.
Inspecting the logs, we see that if we waited some more, the tests
would pass.
Simply bump until-precsave.exp timeouts further, and apply the same
treatment to step-precsave.exp.
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.reverse/step-precsave.exp: Use with_timeout_factor to
increase timeout.
* gdb.reverse/until-precsave.exp: Bump timeouts.
This test fails with --target_board=native-extended-gdbserver because
it misses the usual "disconnect":
(gdb) target remote | /usr/lib64/valgrind/../../bin/vgdb --pid=30454
Already connected to a remote target. Disconnect? (y or n) n
Still connected.
(gdb) FAIL: gdb.base/valgrind-infcall.exp: target remote for vgdb (got interactive prompt)
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.base/valgrind-infcall.exp: Issue a "disconnect".
This patch is mostly extracted from Pedro's C++ branch. It adds explicit
casts from integer to enum types, where it is really the intention to do
so. This could be because we are ...
* iterating on enum values (we need to iterate on an equivalent integer)
* converting from a value read from bytes (dwarf attribute, agent
expression opcode) to the equivalent enum
* reading the equivalent integer value from another language (Python/Guile)
An exception to that is the casts in regcache.c. It seems to me like
struct regcache's register_status field could be a pointer to an array of
enum register_status. Doing so would waste a bit of memory (4 bytes
used by the enum vs 1 byte used by the current signed char, for each
register). If we switch to C++11 one day, we can define the underlying
type of an enum type, so we could have the best of both worlds.
gdb/ChangeLog:
* arm-tdep.c (set_fp_model_sfunc): Add cast from integer to enum.
(arm_set_abi): Likewise.
* ax-general.c (ax_print): Likewise.
* c-exp.y (exp : string_exp): Likewise.
* compile/compile-loc2c.c (compute_stack_depth_worker): Likewise.
(do_compile_dwarf_expr_to_c): Likewise.
* cp-name-parser.y (demangler_special : DEMANGLER_SPECIAL start):
Likewise.
* dwarf2expr.c (execute_stack_op): Likewise.
* dwarf2loc.c (dwarf2_compile_expr_to_ax): Likewise.
(disassemble_dwarf_expression): Likewise.
* dwarf2read.c (dwarf2_add_member_fn): Likewise.
(read_array_order): Likewise.
(abbrev_table_read_table): Likewise.
(read_attribute_value): Likewise.
(skip_unknown_opcode): Likewise.
(dwarf_decode_macro_bytes): Likewise.
(dwarf_decode_macros): Likewise.
* eval.c (value_f90_subarray): Likewise.
* guile/scm-param.c (gdbscm_make_parameter): Likewise.
* i386-linux-tdep.c (i386_canonicalize_syscall): Likewise.
* infrun.c (handle_command): Likewise.
* memory-map.c (memory_map_start_memory): Likewise.
* osabi.c (set_osabi): Likewise.
* parse.c (operator_length_standard): Likewise.
* ppc-linux-tdep.c (ppc_canonicalize_syscall): Likewise, and use
single return point.
* python/py-frame.c (gdbpy_frame_stop_reason_string): Likewise.
* python/py-symbol.c (gdbpy_lookup_symbol): Likewise.
(gdbpy_lookup_global_symbol): Likewise.
* record-full.c (record_full_restore): Likewise.
* regcache.c (regcache_register_status): Likewise.
(regcache_raw_read): Likewise.
(regcache_cooked_read): Likewise.
* rs6000-tdep.c (powerpc_set_vector_abi): Likewise.
* symtab.c (initialize_ordinary_address_classes): Likewise.
* target-debug.h (target_debug_print_signals): Likewise.
* utils.c (do_restore_current_language): Likewise.
Fixes another C++ -fpermissive error:
src/gdb/gdbserver/tracepoint.c:4535:21: error: invalid conversion from ‘int’ to ‘eval_result_type’ [-fpermissive]
expr_eval_result = ipa_expr_eval_result;
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* tracepoint.c (expr_eval_result): Now an int.
The regcache used to be hidden inside inferiors.c, but since the
tracepoints support that it's a first class object. This also fixes a
few implicit pointer conversion errors in C++ mode, caused by a few
places missing the explicit cast.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdbthread.h (struct regcache): Forward declare.
(struct thread_info) <regcache_data>: Now a struct regcache
pointer.
* inferiors.c (inferior_regcache_data)
(set_inferior_regcache_data): Now work with struct regcache
pointers.
* inferiors.h (struct regcache): Forward declare.
(inferior_regcache_data, set_inferior_regcache_data): Now work
with struct regcache pointers.
* regcache.c (get_thread_regcache, regcache_invalidate_thread)
(free_register_cache_thread): Remove struct regcache pointer
casts.
Running gdb.threads/process-dies-while-handling-bp.exp against
gdbserver sometimes FAILs because GDBserver drops the connection, but
the logs leave no clue on what the reason could be. Running manually
a few times, I saw the same:
$ ./gdbserver/gdbserver --multi :9999 testsuite/gdb.threads/process-dies-while-handling-bp
Process testsuite/gdb.threads/process-dies-while-handling-bp created; pid = 12766
Listening on port 9999
Remote debugging from host 127.0.0.1
Listening on port 9999
Child exited with status 0
Child exited with status 0
What happened is that an exception escaped and gdbserver reopened the
connection, which led to that second "Listening on port 9999" output.
The error was a failure to access registers from a now-dead thread.
The exception probably shouldn't have escaped here, but meanwhile,
this at least makes the issue less mysterious.
Tested on x86_64 Fedora 20.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* server.c (captured_main): On error, print the exception message
to stderr, and if run_once is set, throw a quit.
Found while processing the C++ enum changes. It seems like series
should be of type enum complaint_series, instead of adding a cast.
Redundant and out of date comments are also removed.
gdb/ChangeLog:
* complaints.c (enum complaint_series): Add newlines and remove
out of date comment.
(struct complaints) <series>: Change type to enum
complaint_series and remove out of date comment.
(symfile_complaint_hook): Use equivalent enum value
ISOLATED_MESSAGE instead of 0.
While hacking on the fix for PR threads/18600 (Threads left stopped
after fork+thread spawn), I once saw its test (fork-plus-threads.exp)
FAIL against gdbserver because move_out_of_jump_pad_callback has a
gdb_breakpoint_here call, and the caller isn't making sure the current
thread points to the right thread. In the case I saw, the current
thread pointed to the wrong process, so gdb_breakpoint_here returned
the wrong answer. Unfortunately I didn't save logs. Still, seems
obvious enough and it should fix a potential occasional racy FAIL.
Tested on x86_64 Fedora 20.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (move_out_of_jump_pad_callback): Temporarily switch
the current thread.
Running gdbserver --debug under Valgrind shows:
==4803== Invalid read of size 4
==4803== at 0x432B62: linux_write_memory (linux-low.c:5320)
==4803== by 0x4143F7: write_inferior_memory (target.c:83)
==4803== by 0x415895: remove_memory_breakpoint (mem-break.c:362)
==4803== by 0x432EF5: linux_remove_point (linux-low.c:5460)
==4803== by 0x416319: delete_raw_breakpoint (mem-break.c:802)
==4803== by 0x4163F3: release_breakpoint (mem-break.c:842)
==4803== by 0x416477: delete_breakpoint_1 (mem-break.c:869)
==4803== by 0x4164EF: delete_breakpoint (mem-break.c:891)
==4803== by 0x416843: delete_gdb_breakpoint_1 (mem-break.c:1069)
==4803== by 0x4168D8: delete_gdb_breakpoint (mem-break.c:1098)
==4803== by 0x4134E3: process_serial_event (server.c:4051)
==4803== by 0x4138E4: handle_serial_event (server.c:4196)
==4803== Address 0x4c6b930 is 0 bytes inside a block of size 1 alloc'd
==4803== at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4803== by 0x4240C6: xmalloc (common-utils.c:43)
==4803== by 0x41439C: write_inferior_memory (target.c:80)
==4803== by 0x415895: remove_memory_breakpoint (mem-break.c:362)
==4803== by 0x432EF5: linux_remove_point (linux-low.c:5460)
==4803== by 0x416319: delete_raw_breakpoint (mem-break.c:802)
==4803== by 0x4163F3: release_breakpoint (mem-break.c:842)
==4803== by 0x416477: delete_breakpoint_1 (mem-break.c:869)
==4803== by 0x4164EF: delete_breakpoint (mem-break.c:891)
==4803== by 0x416843: delete_gdb_breakpoint_1 (mem-break.c:1069)
==4803== by 0x4168D8: delete_gdb_breakpoint (mem-break.c:1098)
==4803== by 0x4134E3: process_serial_event (server.c:4051)
==4803==
And:
==7272== Conditional jump or move depends on uninitialised value(s)
==7272== at 0x3615E48361: vfprintf (vfprintf.c:1634)
==7272== by 0x414E89: debug_vprintf (debug.c:60)
==7272== by 0x42800A: debug_printf (common-debug.c:35)
==7272== by 0x43937B: my_waitpid (linux-waitpid.c:149)
==7272== by 0x42D740: linux_wait_for_event_filtered (linux-low.c:2441)
==7272== by 0x42DADA: linux_wait_for_event (linux-low.c:2552)
==7272== by 0x42E165: linux_wait_1 (linux-low.c:2860)
==7272== by 0x42F5D8: linux_wait (linux-low.c:3453)
==7272== by 0x4144A4: mywait (target.c:107)
==7272== by 0x413969: handle_target_event (server.c:4214)
==7272== by 0x41A1A6: handle_file_event (event-loop.c:429)
==7272== by 0x41996D: process_event (event-loop.c:184)
gdb/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* nat/linux-waitpid.c (my_waitpid): Only print *status if waitpid
returned > 0.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (linux_write_memory): Rewrite debug output to avoid
reading beyond the passed in buffer length.
This adds a kfailed test that has the whole process exit just while
several threads continuously step over a breakpoint. Usually, the
process exits just while GDB or GDBserver is handling the breakpoint
hit. In other words, the process disappears while the event thread is
(ptrace-) stopped. This exposes several issues in GDB and GDBserver.
Errors, crashes, etc.
I fixed some of these issues recently, but there's a lot more to do.
It's a bit like playing whack-a-mole at the moment. You fix an issue,
which then exposes several others.
E.g., with the native target, you get (among other errors):
(...)
[New Thread 0x7ffff47b9700 (LWP 18077)]
[New Thread 0x7ffff3fb8700 (LWP 18078)]
[New Thread 0x7ffff37b7700 (LWP 18079)]
Cannot find user-level thread for LWP 18076: generic error
(gdb) KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on: cond_bp_target=1: inferior 1 exited (prompt) (PRMS: gdb/18749)
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
PR gdb/18749
* gdb.threads/process-dies-while-handling-bp.c: New file.
* gdb.threads/process-dies-while-handling-bp.exp: New file.
This field was never set nor used. This patch removes it.
gdb/ChangeLog:
* common/agent.c (symbol_list) <required>: Remove.
gdb/gdbserver/ChangeLog:
* tracepoint.c (symbol_list) <required>: Remove.
Ref: https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html
This adds a test that has a multithreaded program have several threads
continuously fork, while another thread continuously steps over a
breakpoint.
This exposes several intertwined issues, which this patch addresses:
- When we're stopping and suspending threads, some thread may fork,
and we missed setting its suspend count to 1, like we do when a new
clone/thread is detected. When we next unsuspend threads, the fork
child's suspend count goes below 0, which is bogus and fails an
assertion.
- If a step-over is cancelled because a signal arrives, but then gdb
is not interested in the signal, we pass the signal straight back
to the inferior. However, we miss that we need to re-increment the
suspend counts of all other threads that had been paused for the
step-over. As a result, other threads indefinitely end up stuck
stopped.
- If a detach request comes in just while gdbserver is handling a
step-over (in the test at hand, this is GDB detaching the fork
child), gdbserver internal errors in stabilize_thread's helpers,
which assert that all thread's suspend counts are 0 (otherwise we
wouldn't be able to move threads out of the jump pads). The
suspend counts aren't 0 while a step-over is in progress, because
all threads but the one stepping past the breakpoint must remain
paused until the step-over finishes and the breakpoint can be
reinserted.
- Occasionally, we see "BAD - reinserting but not stepping." being
output (from within linux_resume_one_lwp_throw). That was because
GDB pokes memory while gdbserver is busy with a step-over, and that
suspends threads, and then re-resumes them with proceed_one_lwp,
which missed another reason to tell linux_resume_one_lwp that the
thread should be set back to stepping.
- In a couple places, we were resuming threads that are meant to be
suspended. E.g., when a vCont;c/s request for thread B comes in
just while gdbserver is stepping thread A past a breakpoint. The
resume for thread B must be deferred until the step-over finishes.
- The test runs with both "set detach-on-fork" on and off. When off,
it exercises the case of GDB detaching the fork child explicitly.
When on, it exercises the case of gdb resuming the child
explicitly. In the "off" case, gdb seems to exponentially become
slower as new inferiors are created. This is _very_ noticeable as
with only 100 inferiors gdb is crawling already, which makes the
test take quite a bit to run. For that reason, I've disabled the
"off" variant for now.
gdb/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* target/waitstatus.h (enum target_stop_reason)
<TARGET_STOPPED_BY_SINGLE_STEP>: New value.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (handle_extended_wait): Set the fork child's suspend
count if stopping and suspending threads.
(check_stopped_by_breakpoint): If stopped by trace, set the LWP's
stop reason to TARGET_STOPPED_BY_SINGLE_STEP.
(linux_detach): Complete an ongoing step-over.
(lwp_suspended_inc, lwp_suspended_decr): New functions. Use
throughout.
(resume_stopped_resumed_lwps): Don't resume a suspended thread.
(linux_wait_1): If passing a signal to the inferior after
finishing a step-over, unsuspend and re-resume all lwps. If we
see a single-step event but the thread should be continuing, don't
pass the trap to gdb.
(stuck_in_jump_pad_callback, move_out_of_jump_pad_callback): Use
internal_error instead of gdb_assert.
(enqueue_pending_signal): New function.
(check_ptrace_stopped_lwp_gone): Add debug output.
(start_step_over): Use internal_error instead of gdb_assert.
(complete_ongoing_step_over): New function.
(linux_resume_one_thread): Don't resume a suspended thread.
(proceed_one_lwp): If the LWP is stepping over a breakpoint, reset
it stepping.
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.threads/forking-threads-plus-breakpoint.exp: New file.
* gdb.threads/forking-threads-plus-breakpoint.c: New file.
The tail end of linux_wait_1 isn't expecting that the select_event_lwp
machinery can pick a whole-process exit event to report to GDB. When
that happens, both gdb and gdbserver end up quite confused:
...
(gdb)
[Thread 24971.24971] #1 stopped.
0x0000003615a011f0 in ?? ()
c&
Continuing.
(gdb) [New Thread 24971.24981]
[New Thread 24983.24983]
[New Thread 24971.24982]
[Thread 24983.24983] #3 stopped.
0x0000003615ebc7cc in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
130 pid = ARCH_FORK ();
[New Thread 24984.24984]
Error in re-setting breakpoint -16: PC register is not available
Error in re-setting breakpoint -17: PC register is not available
Error in re-setting breakpoint -18: PC register is not available
Error in re-setting breakpoint -19: PC register is not available
Error in re-setting breakpoint -24: PC register is not available
Error in re-setting breakpoint -25: PC register is not available
Error in re-setting breakpoint -26: PC register is not available
Error in re-setting breakpoint -27: PC register is not available
Error in re-setting breakpoint -28: PC register is not available
Error in re-setting breakpoint -29: PC register is not available
Error in re-setting breakpoint -30: PC register is not available
PC register is not available
(gdb)
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (add_lwp): Set waitstatus to TARGET_WAITKIND_IGNORE.
(linux_thread_alive): Use lwp_is_marked_dead.
(extended_event_reported): Delete.
(linux_wait_1): Check if waitstatus is TARGET_WAITKIND_IGNORE
instead of extended_event_reported.
(mark_lwp_dead): Don't set the 'dead' flag. Store the waitstatus
as well.
(lwp_is_marked_dead): New function.
(lwp_running): Use lwp_is_marked_dead.
* linux-low.h: Delete 'dead' field, and update 'waitstatus's
comment.
The "extended event with waitstatus" debug output is unreachable, as
it is guarded by "if (!report_to_gdb)". If extended_event_reported is
true, then so is report_to_gdb. Move it to where we print why we're
reporting an event to GDB.
Also, the debug output currently tries to print the wrong struct
target_waitstatus.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (linux_wait_1): Move fork event output out of the
!report_to_gdb check. Pass event_child->waitstatus to
target_waitstatus_to_string instead of ourstatus.
Reverts a2c59f28 and e474ab13. Since the unary form of ALIGN only
references "dot" implicitly, there isn't really a strong argument for
making ALIGN use a relative value when inside an output section.
* ldexp.c (align_dot_val): Delete.
(fold_unary <ALIGN_K, NEXT>): Revert 2015-07-10 change.
(is_align_conditional): Revert 2015-07-20 change.
(exp_fold_tree_1): Likewise, but keep expanded comment.
* scripttempl/elf.sc (.ldata, .bss): Revert 2015-07-20 change.
* ld.texinfo (<ALIGN>): Correct description.
At https://sourceware.org/ml/gdb-patches/2015-08/msg00097.html, Joel
observed that trying to next/step a program on GNU/Linux sometimes
results in the following failed assertion:
% gdb -q .obj/gprof/main
(gdb) start
(gdb) n
(gdb) step
[...]/infrun.c:2391: internal-error:
resume: Assertion `sig != GDB_SIGNAL_0' failed.
What happened is that, during the "next" operation, GDB hit a
longjmp/exception/step-resume breakpoint but failed to see that this
breakpoint was set for a different thread than the one being stepped.
Joel's detailed analysis follows:
More precisely, at the end of the "start" command, we are stopped at
the start of function Main in main.adb; there are 4 threads in total,
and we are in the main thread (which is thread 1):
(gdb) info thread
Id Target Id Frame
4 Thread 0xb7a56ba0 (LWP 28379) 0xffffe410 in __kernel_vsyscall ()
3 Thread 0xb7c5aba0 (LWP 28378) 0xffffe410 in __kernel_vsyscall ()
2 Thread 0xb7e5eba0 (LWP 28377) 0xffffe410 in __kernel_vsyscall ()
* 1 Thread 0xb7ea18c0 (LWP 28370) main () at /[...]/main.adb:57
All the logs below reference Thread ID/LWP, but it'll be easier to
talk about the threads by GDB thread number. For instance, thread 1
is LWP 28370 while thread 3 is LWP 28378. So, the explanations below
translate the LWPs into thread numbers.
Back to what happens while we are trying to "next' our program:
(gdb) n
infrun: clear_proceed_status_thread (Thread 0xb7a56ba0 (LWP 28379))
infrun: clear_proceed_status_thread (Thread 0xb7c5aba0 (LWP 28378))
infrun: clear_proceed_status_thread (Thread 0xb7e5eba0 (LWP 28377))
infrun: clear_proceed_status_thread (Thread 0xb7ea18c0 (LWP 28370))
infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x805451e
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28370.0 [Thread 0xb7ea18c0 (LWP 28370)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x8054523
We've resumed thread 1 (LWP 28370), and received in return a signal
that the same thread stopped slightly further. It's still in the
range of instructions for the line of source we started the "next"
from, as evidenced by the following trace...
infrun: stepping inside range [0x805451e-0x8054531]
... and thus, we decide to continue stepping the same thread:
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
infrun: prepare_to_wait
That's when we get an event from a different thread (thread 3)...
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80782d0
infrun: context switch
infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
... which we find to be at the address where we set a breakpoint on
"the unwinder debug hook" (namely "_Unwind_DebugHook"). But GDB fails
to notice that the breakpoint was inserted for thread 1 only, and so
decides to handle it as...
infrun: BPSTAT_WHAT_SET_LONGJMP_RESUME
... and inserts a breakpoint at the corresponding resume address, as
evidenced by this the next log:
infrun: exception resume at 80542a2
That breakpoint seems innocent right now, but will play a role fairly
quickly. But for now, GDB has inserted the exception-resume
breakpoint, and needs to single-step thread 3 past the breakpoint it
just hit. Thus, it temporarily disables the exception breakpoint, and
requests a step of that thread:
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 0xb7c5aba0 (LWP 28378)] at 0x80782d0
infrun: prepare_to_wait
We then get a notification, still from thread 3, that it's now past
that breakpoint...
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x8078424
... so we can resume what we were doing before, which is single-stepping
thread 1 until we get to a new line of code:
infrun: switching back to stepped thread
infrun: Switching context from Thread 0xb7c5aba0 (LWP 28378) to Thread 0xb7ea18c0 (LWP 28370)
infrun: expected thread still hasn't advanced
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
The "resume" log above shows that we're resuming thread 1 from where
we left off (0x8054523). We get one more stop at 0x8054529, which is
still inside our stepping range so we go again. That's when we get
the following event, from thread 3:
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80542a2
Now the stop_pc address is interesting, because it's the address of
"exception resume" breakpoint...
infrun: context switch
infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
infrun: BPSTAT_WHAT_CLEAR_LONGJMP_RESUME
... and since that location is at a different line of code, this is
where it decides the "next" operation should stop:
infrun: stop_waiting
[Switching to Thread 0xb7c5aba0 (LWP 28378)]
0x080542a2 in inte_tache_rt.ttache_rt (
<_task>=0x80968ec <inte_tache_rt_inst.tache2>)
at /[...]/inte_tache_rt.adb:54
54 end loop;
However, what GDB should have noticed earlier that the exception
breakpoint we hit was for a different thread, thus should have
single-stepped that thread out of the breakpoint _without_ inserting
the exception-return breakpoint, and then resumed the single-stepping
of the initial thread (thread 1) until that thread stepped out of its
stepping range.
This is what this patch does, and after applying it, GDB now correctly
stops on the next line of code.
The patch adds a C++ test that exercises this, both for setjmp/longjmp
and exception breakpoints. With an unpatched GDB it shows:
(gdb) next
[Switching to Thread 22445.22455]
thread_try_catch (arg=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/next-other-thr-longjmp.c:59
59 catch (...)
(gdb) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 1
next
/home/pedro/gdb/mygit/build/../src/gdb/infrun.c:4865: internal-error: process_event_stop_test: Assertion `ecs->event_thread->control.exception_resume_breakpoint != NULL' fa
iled.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 2 (GDB internal error)
Resyncing due to internal error.
n
Tested on x86_64-linux, no regressions.
gdb/ChangeLog:
2015-08-05 Pedro Alves <palves@redhat.com>
Joel Brobecker <brobecker@adacore.com>
* breakpoint.c (bpstat_what) <bp_longjmp, bp_longjmp_call_dummy>
<bp_exception, bp_longjmp_resume, bp_exception_resume>: Handle the
case where BS->STOP is not set.
gdb/testsuite/ChangeLog:
2015-08-05 Pedro Alves <palves@redhat.com>
* gdb.threads/next-while-other-thread-longjmps.c: New file.
* gdb.threads/next-while-other-thread-longjmps.exp: New file.
bfd * elf.c (_bfd_elf_copy_private_bfd_data): Copy the sh_link and
sh_info fields of sections whose type has been changed to
SHT_NOBITS.
bin * doc/binutils.texi: Document that the --only-keep-debug option
to strip and objcopy preserves the section headers of stripped
sections.
tests * binutils-all/objcopy.exp (keep_debug_symbols_and_check_links):
New proc. Checks that debug-info-only binaries retain the
sh_link field in stripped sections.
Fixes a build error due to typedef redefinition with some compilers.
Also added missing copyright header.
gdb/
* nat/gdb_thread_db.h: Add copyright header.
Protect against multiple inclusion.
This patch removes get_thread_id from aarch64-linux-nat.c,
arm-linux-nat.c and xtensa-linux-nat.c.
get_thread_id was added in this commit below in 2000,
41c49b06c4https://sourceware.org/ml/gdb-patches/2000-04/msg00398.html
which predates the ptid_t stuff added into GDB. Nowadays, lwpid of
inferior_ptid is only zero when the inferior is created (in
fork-child.c:fork_inferior) and its lwpid will be set after
linux_nat_wait_1 gets the first event. After that, lwpid of
inferior_ptid is not zero for linux-nat target, then we can use
ptid_get_lwp, so this function isn't needed anymore.
Even when GDB attaches to a process, the lwp of inferior_ptid
isn't zero, see linux-nat.c:linux_nat_attach,
/* The ptrace base target adds the main thread with (pid,0,0)
format. Decorate it with lwp info. */
ptid = ptid_build (ptid_get_pid (inferior_ptid),
ptid_get_pid (inferior_ptid),
0);
Note that linux_nat_xfer_partial shifts lwpid to pid for inferior_ptid
temperately for calling linux_ops->to_xfer_partial, but all the
affected functions in this patch are not called in
linux_ops->to_xfer_partial.
I think we can safely remove get_thread_id for all linux native targets.
Regression tested on arm-linux and aarch64-linux. Unable to build
native GDB and test it on xtensa-linux.
gdb:
2015-08-05 Yao Qi <yao.qi@linaro.org>
* aarch64-linux-nat.c (get_thread_id): Remove.
(debug_reg_change_callback): Call ptid_get_lwp instead of
get_thread_id.
(fetch_gregs_from_thread): Likewise.
(store_gregs_to_thread): Likewise.
(fetch_fpregs_from_thread): Likewise.
(store_fpregs_to_thread): Likewise.
(aarch64_linux_get_debug_reg_capacity): Likewise.
* arm-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Update macro to use ptid_get_lwp.
* xtensa-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Update macro to use ptid_get_lwp.
* arm-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Remove.
(fetch_fpregs): Call ptid_get_lwp instead of GET_THREAD_ID.
(store_fpregs, fetch_regs, store_regs): Likewise.
(fetch_wmmx_regs, store_wmmx_regs): Likewise.
(fetch_vfp_regs, store_vfp_regs): Likewise.
(arm_linux_read_description): Likewise.
(arm_linux_get_hwbp_cap): Likewise.
* xtensa-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Remove.
(fetch_gregs, store_gregs): Call ptid_get_lwp instead of
GET_THREAD_ID.
The class is called LineTable, not Linetable, as specified by
py-linetable.c/gdbpy_initialize_linetable:
if (gdb_pymodule_addobject (gdb_module, "LineTable",
gdb/ChangeLog:
* python/py-linetable.c: Fix case of Linetable to LineTable
in docstrings and code comments.
* python/py-symtab.c: Same.
PR binutils/18750
* ihex.c (ihex_scan): Fixes incorrect escape sequence in error message
and stack overflow when char is signed and \200-\376 was in place of hex
digit; also fixes \377 was handled as EOF instead of "incorrect character".
(ihex_read_section): Changed for consistency.
(ihex_bad_byte): Prevent (now impossible to trigger) stack
overflow and incorrect escape sequence handling.
* srec.c (srec_bad_byte): Likewise.
* readelf.c (process_mips_specific): Fix incorrect escape
sequence handling.
We only support tracepoint for aarch64. Although arm program can run
on aarch64, GDBserver doesn't support tracepoint for it.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-aarch64-low.c (aarch64_supports_tracepoints): Return 0
if current_thread is 32 bit.
In multi-arch debugging, if GDB sends Z0 packet, GDBserver should be
able to do several things below:
- choose the right breakpoint instruction to insert according to the
information available, such as 'kind' in Z0 packet and address,
- choose the right breakpoint instruction to check memory writes and
validate inserted memory breakpoint
- be aware of different breakpoint instructions in $ARCH_breakpoint_at.
unfortunately GDBserver can't do them now. Although x86 GDBserver
supports multi-arch, it doesn't need to support them above because
breakpoint instruction on i686 and x86_64 is the same. However,
breakpoint instructions on aarch64 and arm (arm mode, thumb1, and thumb2)
are different.
I tried to teach aarch64 GDBserver backend to be really
multi-arch-capable in the following ways,
- linux_low_target return the right breakpoint instruction according to
the 'kind' in Z0 packet, and insert_memory_breakpoint can do the right
thing.
- once breakpoint is inserted, the breakpoint data and length is recorded
in each breakpoint object, so that validate_breakpoint and
check_mem_write can get the right breakpoint instruction from each
breakpoint object, rather than from global variable breakpoint_data.
- linux_low_target needs another hook function for pc increment after
hitting a breakpoint.
- let set_breakpoint_at, which is widely used for tracepoint, use the
'default' breakpoint instruction. We can always use aarch64 breakpoint
instruction since arm doesn't support tracepoint yet.
looks it is not a small piece of work, so I decide to disable Z0 packet
on multi-arch, which means aarch64 GDBserver only supports Z0 packet
if it is started to debug only one process (extended protocol is not
used) and process target description is 64-bit.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-aarch64-low.c (aarch64_supports_z_point_type): Return
0 for Z_PACKET_SW_BP if it may be used in multi-arch debugging.
* server.c (extended_protocol): Remove "static".
* server.h (extended_protocol): Declare it.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-aarch64-low.c (aarch64_get_pc): Get PC register on
both aarch64 and aarch32.
(aarch64_set_pc): Likewise.
This patch teaches aarch64-linux GDBserver use 32-bit arm target
description and regs_info if the elf file is 32-bit.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* configure.srv (case aarch64*-*-linux*): Append arm-with-neon.o
to srv_regobj and append arm-core.xml arm-vfpv3.xml and
arm-with-neon.xml to srv_xmlfiles.
* linux-aarch64-low.c: Include linux-aarch32-low.h.
(is_64bit_tdesc): New function.
(aarch64_linux_read_description): New function.
(aarch64_arch_setup): Call aarch64_linux_read_description.
(regs_info): Rename to regs_info_aarch64.
(aarch64_regs_info): Return right regs_info.
(initialize_low_arch): Call initialize_low_arch_aarch32.
This patch adds a new regs_info regs_info_aarch32 for aarch32, which
can be used by both aarch64 and arm backend.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* configure.srv (srv_tgtobj): Add linux-aarch32-low.o.
* linux-aarch32-low.c: New file.
* linux-aarch32-low.h: New file.
* linux-arm-low.c (arm_fill_gregset): Move it to
linux-aarch32-low.c.
(arm_store_gregset): Likewise.
(arm_fill_vfpregset): Call arm_fill_vfpregset_num
(arm_store_vfpregset): Caa arm_store_vfpregset_num.
(arm_arch_setup): Check if PTRACE_GETREGSET works.
(regs_info): Rename to regs_info_arm.
(arm_regs_info): Return regs_info_aarch32 if
have_ptrace_getregset is 1 and target description is
arm_with_neon or arm_with_vfpv3.
(initialize_low_arch): Don't call init_registers_arm_with_neon.
Call initialize_low_arch_aarch32 instead.
This patch moves variable have_ptrace_getregset from linux-x86-low.c
to linux-low.c, so that arm can use it too.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-x86-low.c (have_ptrace_getregset): Move it to ...
* linux-low.c: ... here.
* linux-low.h (have_ptrace_getregset): Declare it.
-fsanitize=address
gdb.base/attach-pie-noexec.exp
==32586==ERROR: AddressSanitizer: heap-use-after-free on address 0x60200004ed90 at pc 0x48ad50 bp 0x7ffceb3aef50 sp 0x7ffceb3aef20
READ of size 2 at 0x60200004ed90 thread T0
#0 0x48ad4f in __interceptor_strlen (/home/jkratoch/redhat/gdb-test-asan/gdb/gdb+0x48ad4f)
#1 0xeafe5c in xstrdup xstrdup.c:33
#2 0x85e024 in attach_command /home/jkratoch/redhat/gdb-test-asan/gdb/infcmd.c:2680
regressed by:
commit 6c4486e63f
Author: Pedro Alves <palves@redhat.com>
Date: Fri Oct 17 13:31:26 2014 +0100
PR gdb/17471: Repeating a background command makes it foreground
gdb/ChangeLog
2015-08-04 Jan Kratochvil <jan.kratochvil@redhat.com>
PR gdb/18767
* infcmd.c (attach_command): Move ARGS_CHAIN cleanup after last ARGS
use.
When using run_dump_test with the map option to compare the linker map
file produced, no additional dump program should be required. A dump
program can still be given if needed, but leaving it off will no longer
produce an error.
ld/testsuite/ChangeLog:
* ld/ld-lib.exp (run_dump_test): When using the map option, no
program is required.