The buildbot shows that PPC64 and x86_64 builders, both native and
extended-remote gdbserver frequently timeout these tests.
until-precsave.exp times out on my x86_64 occasionally as well.
Inspecting the logs, we see that if we waited some more, the tests
would pass.
Simply bump until-precsave.exp timeouts further, and apply the same
treatment to step-precsave.exp.
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.reverse/step-precsave.exp: Use with_timeout_factor to
increase timeout.
* gdb.reverse/until-precsave.exp: Bump timeouts.
This test fails with --target_board=native-extended-gdbserver because
it misses the usual "disconnect":
(gdb) target remote | /usr/lib64/valgrind/../../bin/vgdb --pid=30454
Already connected to a remote target. Disconnect? (y or n) n
Still connected.
(gdb) FAIL: gdb.base/valgrind-infcall.exp: target remote for vgdb (got interactive prompt)
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.base/valgrind-infcall.exp: Issue a "disconnect".
This patch is mostly extracted from Pedro's C++ branch. It adds explicit
casts from integer to enum types, where it is really the intention to do
so. This could be because we are ...
* iterating on enum values (we need to iterate on an equivalent integer)
* converting from a value read from bytes (dwarf attribute, agent
expression opcode) to the equivalent enum
* reading the equivalent integer value from another language (Python/Guile)
An exception to that is the casts in regcache.c. It seems to me like
struct regcache's register_status field could be a pointer to an array of
enum register_status. Doing so would waste a bit of memory (4 bytes
used by the enum vs 1 byte used by the current signed char, for each
register). If we switch to C++11 one day, we can define the underlying
type of an enum type, so we could have the best of both worlds.
gdb/ChangeLog:
* arm-tdep.c (set_fp_model_sfunc): Add cast from integer to enum.
(arm_set_abi): Likewise.
* ax-general.c (ax_print): Likewise.
* c-exp.y (exp : string_exp): Likewise.
* compile/compile-loc2c.c (compute_stack_depth_worker): Likewise.
(do_compile_dwarf_expr_to_c): Likewise.
* cp-name-parser.y (demangler_special : DEMANGLER_SPECIAL start):
Likewise.
* dwarf2expr.c (execute_stack_op): Likewise.
* dwarf2loc.c (dwarf2_compile_expr_to_ax): Likewise.
(disassemble_dwarf_expression): Likewise.
* dwarf2read.c (dwarf2_add_member_fn): Likewise.
(read_array_order): Likewise.
(abbrev_table_read_table): Likewise.
(read_attribute_value): Likewise.
(skip_unknown_opcode): Likewise.
(dwarf_decode_macro_bytes): Likewise.
(dwarf_decode_macros): Likewise.
* eval.c (value_f90_subarray): Likewise.
* guile/scm-param.c (gdbscm_make_parameter): Likewise.
* i386-linux-tdep.c (i386_canonicalize_syscall): Likewise.
* infrun.c (handle_command): Likewise.
* memory-map.c (memory_map_start_memory): Likewise.
* osabi.c (set_osabi): Likewise.
* parse.c (operator_length_standard): Likewise.
* ppc-linux-tdep.c (ppc_canonicalize_syscall): Likewise, and use
single return point.
* python/py-frame.c (gdbpy_frame_stop_reason_string): Likewise.
* python/py-symbol.c (gdbpy_lookup_symbol): Likewise.
(gdbpy_lookup_global_symbol): Likewise.
* record-full.c (record_full_restore): Likewise.
* regcache.c (regcache_register_status): Likewise.
(regcache_raw_read): Likewise.
(regcache_cooked_read): Likewise.
* rs6000-tdep.c (powerpc_set_vector_abi): Likewise.
* symtab.c (initialize_ordinary_address_classes): Likewise.
* target-debug.h (target_debug_print_signals): Likewise.
* utils.c (do_restore_current_language): Likewise.
Fixes another C++ -fpermissive error:
src/gdb/gdbserver/tracepoint.c:4535:21: error: invalid conversion from ‘int’ to ‘eval_result_type’ [-fpermissive]
expr_eval_result = ipa_expr_eval_result;
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* tracepoint.c (expr_eval_result): Now an int.
The regcache used to be hidden inside inferiors.c, but since the
tracepoints support that it's a first class object. This also fixes a
few implicit pointer conversion errors in C++ mode, caused by a few
places missing the explicit cast.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdbthread.h (struct regcache): Forward declare.
(struct thread_info) <regcache_data>: Now a struct regcache
pointer.
* inferiors.c (inferior_regcache_data)
(set_inferior_regcache_data): Now work with struct regcache
pointers.
* inferiors.h (struct regcache): Forward declare.
(inferior_regcache_data, set_inferior_regcache_data): Now work
with struct regcache pointers.
* regcache.c (get_thread_regcache, regcache_invalidate_thread)
(free_register_cache_thread): Remove struct regcache pointer
casts.
Running gdb.threads/process-dies-while-handling-bp.exp against
gdbserver sometimes FAILs because GDBserver drops the connection, but
the logs leave no clue on what the reason could be. Running manually
a few times, I saw the same:
$ ./gdbserver/gdbserver --multi :9999 testsuite/gdb.threads/process-dies-while-handling-bp
Process testsuite/gdb.threads/process-dies-while-handling-bp created; pid = 12766
Listening on port 9999
Remote debugging from host 127.0.0.1
Listening on port 9999
Child exited with status 0
Child exited with status 0
What happened is that an exception escaped and gdbserver reopened the
connection, which led to that second "Listening on port 9999" output.
The error was a failure to access registers from a now-dead thread.
The exception probably shouldn't have escaped here, but meanwhile,
this at least makes the issue less mysterious.
Tested on x86_64 Fedora 20.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* server.c (captured_main): On error, print the exception message
to stderr, and if run_once is set, throw a quit.
Found while processing the C++ enum changes. It seems like series
should be of type enum complaint_series, instead of adding a cast.
Redundant and out of date comments are also removed.
gdb/ChangeLog:
* complaints.c (enum complaint_series): Add newlines and remove
out of date comment.
(struct complaints) <series>: Change type to enum
complaint_series and remove out of date comment.
(symfile_complaint_hook): Use equivalent enum value
ISOLATED_MESSAGE instead of 0.
While hacking on the fix for PR threads/18600 (Threads left stopped
after fork+thread spawn), I once saw its test (fork-plus-threads.exp)
FAIL against gdbserver because move_out_of_jump_pad_callback has a
gdb_breakpoint_here call, and the caller isn't making sure the current
thread points to the right thread. In the case I saw, the current
thread pointed to the wrong process, so gdb_breakpoint_here returned
the wrong answer. Unfortunately I didn't save logs. Still, seems
obvious enough and it should fix a potential occasional racy FAIL.
Tested on x86_64 Fedora 20.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (move_out_of_jump_pad_callback): Temporarily switch
the current thread.
Running gdbserver --debug under Valgrind shows:
==4803== Invalid read of size 4
==4803== at 0x432B62: linux_write_memory (linux-low.c:5320)
==4803== by 0x4143F7: write_inferior_memory (target.c:83)
==4803== by 0x415895: remove_memory_breakpoint (mem-break.c:362)
==4803== by 0x432EF5: linux_remove_point (linux-low.c:5460)
==4803== by 0x416319: delete_raw_breakpoint (mem-break.c:802)
==4803== by 0x4163F3: release_breakpoint (mem-break.c:842)
==4803== by 0x416477: delete_breakpoint_1 (mem-break.c:869)
==4803== by 0x4164EF: delete_breakpoint (mem-break.c:891)
==4803== by 0x416843: delete_gdb_breakpoint_1 (mem-break.c:1069)
==4803== by 0x4168D8: delete_gdb_breakpoint (mem-break.c:1098)
==4803== by 0x4134E3: process_serial_event (server.c:4051)
==4803== by 0x4138E4: handle_serial_event (server.c:4196)
==4803== Address 0x4c6b930 is 0 bytes inside a block of size 1 alloc'd
==4803== at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4803== by 0x4240C6: xmalloc (common-utils.c:43)
==4803== by 0x41439C: write_inferior_memory (target.c:80)
==4803== by 0x415895: remove_memory_breakpoint (mem-break.c:362)
==4803== by 0x432EF5: linux_remove_point (linux-low.c:5460)
==4803== by 0x416319: delete_raw_breakpoint (mem-break.c:802)
==4803== by 0x4163F3: release_breakpoint (mem-break.c:842)
==4803== by 0x416477: delete_breakpoint_1 (mem-break.c:869)
==4803== by 0x4164EF: delete_breakpoint (mem-break.c:891)
==4803== by 0x416843: delete_gdb_breakpoint_1 (mem-break.c:1069)
==4803== by 0x4168D8: delete_gdb_breakpoint (mem-break.c:1098)
==4803== by 0x4134E3: process_serial_event (server.c:4051)
==4803==
And:
==7272== Conditional jump or move depends on uninitialised value(s)
==7272== at 0x3615E48361: vfprintf (vfprintf.c:1634)
==7272== by 0x414E89: debug_vprintf (debug.c:60)
==7272== by 0x42800A: debug_printf (common-debug.c:35)
==7272== by 0x43937B: my_waitpid (linux-waitpid.c:149)
==7272== by 0x42D740: linux_wait_for_event_filtered (linux-low.c:2441)
==7272== by 0x42DADA: linux_wait_for_event (linux-low.c:2552)
==7272== by 0x42E165: linux_wait_1 (linux-low.c:2860)
==7272== by 0x42F5D8: linux_wait (linux-low.c:3453)
==7272== by 0x4144A4: mywait (target.c:107)
==7272== by 0x413969: handle_target_event (server.c:4214)
==7272== by 0x41A1A6: handle_file_event (event-loop.c:429)
==7272== by 0x41996D: process_event (event-loop.c:184)
gdb/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* nat/linux-waitpid.c (my_waitpid): Only print *status if waitpid
returned > 0.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (linux_write_memory): Rewrite debug output to avoid
reading beyond the passed in buffer length.
This adds a kfailed test that has the whole process exit just while
several threads continuously step over a breakpoint. Usually, the
process exits just while GDB or GDBserver is handling the breakpoint
hit. In other words, the process disappears while the event thread is
(ptrace-) stopped. This exposes several issues in GDB and GDBserver.
Errors, crashes, etc.
I fixed some of these issues recently, but there's a lot more to do.
It's a bit like playing whack-a-mole at the moment. You fix an issue,
which then exposes several others.
E.g., with the native target, you get (among other errors):
(...)
[New Thread 0x7ffff47b9700 (LWP 18077)]
[New Thread 0x7ffff3fb8700 (LWP 18078)]
[New Thread 0x7ffff37b7700 (LWP 18079)]
Cannot find user-level thread for LWP 18076: generic error
(gdb) KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on: cond_bp_target=1: inferior 1 exited (prompt) (PRMS: gdb/18749)
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
PR gdb/18749
* gdb.threads/process-dies-while-handling-bp.c: New file.
* gdb.threads/process-dies-while-handling-bp.exp: New file.
This field was never set nor used. This patch removes it.
gdb/ChangeLog:
* common/agent.c (symbol_list) <required>: Remove.
gdb/gdbserver/ChangeLog:
* tracepoint.c (symbol_list) <required>: Remove.
Ref: https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html
This adds a test that has a multithreaded program have several threads
continuously fork, while another thread continuously steps over a
breakpoint.
This exposes several intertwined issues, which this patch addresses:
- When we're stopping and suspending threads, some thread may fork,
and we missed setting its suspend count to 1, like we do when a new
clone/thread is detected. When we next unsuspend threads, the fork
child's suspend count goes below 0, which is bogus and fails an
assertion.
- If a step-over is cancelled because a signal arrives, but then gdb
is not interested in the signal, we pass the signal straight back
to the inferior. However, we miss that we need to re-increment the
suspend counts of all other threads that had been paused for the
step-over. As a result, other threads indefinitely end up stuck
stopped.
- If a detach request comes in just while gdbserver is handling a
step-over (in the test at hand, this is GDB detaching the fork
child), gdbserver internal errors in stabilize_thread's helpers,
which assert that all thread's suspend counts are 0 (otherwise we
wouldn't be able to move threads out of the jump pads). The
suspend counts aren't 0 while a step-over is in progress, because
all threads but the one stepping past the breakpoint must remain
paused until the step-over finishes and the breakpoint can be
reinserted.
- Occasionally, we see "BAD - reinserting but not stepping." being
output (from within linux_resume_one_lwp_throw). That was because
GDB pokes memory while gdbserver is busy with a step-over, and that
suspends threads, and then re-resumes them with proceed_one_lwp,
which missed another reason to tell linux_resume_one_lwp that the
thread should be set back to stepping.
- In a couple places, we were resuming threads that are meant to be
suspended. E.g., when a vCont;c/s request for thread B comes in
just while gdbserver is stepping thread A past a breakpoint. The
resume for thread B must be deferred until the step-over finishes.
- The test runs with both "set detach-on-fork" on and off. When off,
it exercises the case of GDB detaching the fork child explicitly.
When on, it exercises the case of gdb resuming the child
explicitly. In the "off" case, gdb seems to exponentially become
slower as new inferiors are created. This is _very_ noticeable as
with only 100 inferiors gdb is crawling already, which makes the
test take quite a bit to run. For that reason, I've disabled the
"off" variant for now.
gdb/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* target/waitstatus.h (enum target_stop_reason)
<TARGET_STOPPED_BY_SINGLE_STEP>: New value.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (handle_extended_wait): Set the fork child's suspend
count if stopping and suspending threads.
(check_stopped_by_breakpoint): If stopped by trace, set the LWP's
stop reason to TARGET_STOPPED_BY_SINGLE_STEP.
(linux_detach): Complete an ongoing step-over.
(lwp_suspended_inc, lwp_suspended_decr): New functions. Use
throughout.
(resume_stopped_resumed_lwps): Don't resume a suspended thread.
(linux_wait_1): If passing a signal to the inferior after
finishing a step-over, unsuspend and re-resume all lwps. If we
see a single-step event but the thread should be continuing, don't
pass the trap to gdb.
(stuck_in_jump_pad_callback, move_out_of_jump_pad_callback): Use
internal_error instead of gdb_assert.
(enqueue_pending_signal): New function.
(check_ptrace_stopped_lwp_gone): Add debug output.
(start_step_over): Use internal_error instead of gdb_assert.
(complete_ongoing_step_over): New function.
(linux_resume_one_thread): Don't resume a suspended thread.
(proceed_one_lwp): If the LWP is stepping over a breakpoint, reset
it stepping.
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.threads/forking-threads-plus-breakpoint.exp: New file.
* gdb.threads/forking-threads-plus-breakpoint.c: New file.
The tail end of linux_wait_1 isn't expecting that the select_event_lwp
machinery can pick a whole-process exit event to report to GDB. When
that happens, both gdb and gdbserver end up quite confused:
...
(gdb)
[Thread 24971.24971] #1 stopped.
0x0000003615a011f0 in ?? ()
c&
Continuing.
(gdb) [New Thread 24971.24981]
[New Thread 24983.24983]
[New Thread 24971.24982]
[Thread 24983.24983] #3 stopped.
0x0000003615ebc7cc in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
130 pid = ARCH_FORK ();
[New Thread 24984.24984]
Error in re-setting breakpoint -16: PC register is not available
Error in re-setting breakpoint -17: PC register is not available
Error in re-setting breakpoint -18: PC register is not available
Error in re-setting breakpoint -19: PC register is not available
Error in re-setting breakpoint -24: PC register is not available
Error in re-setting breakpoint -25: PC register is not available
Error in re-setting breakpoint -26: PC register is not available
Error in re-setting breakpoint -27: PC register is not available
Error in re-setting breakpoint -28: PC register is not available
Error in re-setting breakpoint -29: PC register is not available
Error in re-setting breakpoint -30: PC register is not available
PC register is not available
(gdb)
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (add_lwp): Set waitstatus to TARGET_WAITKIND_IGNORE.
(linux_thread_alive): Use lwp_is_marked_dead.
(extended_event_reported): Delete.
(linux_wait_1): Check if waitstatus is TARGET_WAITKIND_IGNORE
instead of extended_event_reported.
(mark_lwp_dead): Don't set the 'dead' flag. Store the waitstatus
as well.
(lwp_is_marked_dead): New function.
(lwp_running): Use lwp_is_marked_dead.
* linux-low.h: Delete 'dead' field, and update 'waitstatus's
comment.
The "extended event with waitstatus" debug output is unreachable, as
it is guarded by "if (!report_to_gdb)". If extended_event_reported is
true, then so is report_to_gdb. Move it to where we print why we're
reporting an event to GDB.
Also, the debug output currently tries to print the wrong struct
target_waitstatus.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (linux_wait_1): Move fork event output out of the
!report_to_gdb check. Pass event_child->waitstatus to
target_waitstatus_to_string instead of ourstatus.
At https://sourceware.org/ml/gdb-patches/2015-08/msg00097.html, Joel
observed that trying to next/step a program on GNU/Linux sometimes
results in the following failed assertion:
% gdb -q .obj/gprof/main
(gdb) start
(gdb) n
(gdb) step
[...]/infrun.c:2391: internal-error:
resume: Assertion `sig != GDB_SIGNAL_0' failed.
What happened is that, during the "next" operation, GDB hit a
longjmp/exception/step-resume breakpoint but failed to see that this
breakpoint was set for a different thread than the one being stepped.
Joel's detailed analysis follows:
More precisely, at the end of the "start" command, we are stopped at
the start of function Main in main.adb; there are 4 threads in total,
and we are in the main thread (which is thread 1):
(gdb) info thread
Id Target Id Frame
4 Thread 0xb7a56ba0 (LWP 28379) 0xffffe410 in __kernel_vsyscall ()
3 Thread 0xb7c5aba0 (LWP 28378) 0xffffe410 in __kernel_vsyscall ()
2 Thread 0xb7e5eba0 (LWP 28377) 0xffffe410 in __kernel_vsyscall ()
* 1 Thread 0xb7ea18c0 (LWP 28370) main () at /[...]/main.adb:57
All the logs below reference Thread ID/LWP, but it'll be easier to
talk about the threads by GDB thread number. For instance, thread 1
is LWP 28370 while thread 3 is LWP 28378. So, the explanations below
translate the LWPs into thread numbers.
Back to what happens while we are trying to "next' our program:
(gdb) n
infrun: clear_proceed_status_thread (Thread 0xb7a56ba0 (LWP 28379))
infrun: clear_proceed_status_thread (Thread 0xb7c5aba0 (LWP 28378))
infrun: clear_proceed_status_thread (Thread 0xb7e5eba0 (LWP 28377))
infrun: clear_proceed_status_thread (Thread 0xb7ea18c0 (LWP 28370))
infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x805451e
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28370.0 [Thread 0xb7ea18c0 (LWP 28370)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x8054523
We've resumed thread 1 (LWP 28370), and received in return a signal
that the same thread stopped slightly further. It's still in the
range of instructions for the line of source we started the "next"
from, as evidenced by the following trace...
infrun: stepping inside range [0x805451e-0x8054531]
... and thus, we decide to continue stepping the same thread:
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
infrun: prepare_to_wait
That's when we get an event from a different thread (thread 3)...
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80782d0
infrun: context switch
infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
... which we find to be at the address where we set a breakpoint on
"the unwinder debug hook" (namely "_Unwind_DebugHook"). But GDB fails
to notice that the breakpoint was inserted for thread 1 only, and so
decides to handle it as...
infrun: BPSTAT_WHAT_SET_LONGJMP_RESUME
... and inserts a breakpoint at the corresponding resume address, as
evidenced by this the next log:
infrun: exception resume at 80542a2
That breakpoint seems innocent right now, but will play a role fairly
quickly. But for now, GDB has inserted the exception-resume
breakpoint, and needs to single-step thread 3 past the breakpoint it
just hit. Thus, it temporarily disables the exception breakpoint, and
requests a step of that thread:
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 0xb7c5aba0 (LWP 28378)] at 0x80782d0
infrun: prepare_to_wait
We then get a notification, still from thread 3, that it's now past
that breakpoint...
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x8078424
... so we can resume what we were doing before, which is single-stepping
thread 1 until we get to a new line of code:
infrun: switching back to stepped thread
infrun: Switching context from Thread 0xb7c5aba0 (LWP 28378) to Thread 0xb7ea18c0 (LWP 28370)
infrun: expected thread still hasn't advanced
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
The "resume" log above shows that we're resuming thread 1 from where
we left off (0x8054523). We get one more stop at 0x8054529, which is
still inside our stepping range so we go again. That's when we get
the following event, from thread 3:
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80542a2
Now the stop_pc address is interesting, because it's the address of
"exception resume" breakpoint...
infrun: context switch
infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
infrun: BPSTAT_WHAT_CLEAR_LONGJMP_RESUME
... and since that location is at a different line of code, this is
where it decides the "next" operation should stop:
infrun: stop_waiting
[Switching to Thread 0xb7c5aba0 (LWP 28378)]
0x080542a2 in inte_tache_rt.ttache_rt (
<_task>=0x80968ec <inte_tache_rt_inst.tache2>)
at /[...]/inte_tache_rt.adb:54
54 end loop;
However, what GDB should have noticed earlier that the exception
breakpoint we hit was for a different thread, thus should have
single-stepped that thread out of the breakpoint _without_ inserting
the exception-return breakpoint, and then resumed the single-stepping
of the initial thread (thread 1) until that thread stepped out of its
stepping range.
This is what this patch does, and after applying it, GDB now correctly
stops on the next line of code.
The patch adds a C++ test that exercises this, both for setjmp/longjmp
and exception breakpoints. With an unpatched GDB it shows:
(gdb) next
[Switching to Thread 22445.22455]
thread_try_catch (arg=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/next-other-thr-longjmp.c:59
59 catch (...)
(gdb) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 1
next
/home/pedro/gdb/mygit/build/../src/gdb/infrun.c:4865: internal-error: process_event_stop_test: Assertion `ecs->event_thread->control.exception_resume_breakpoint != NULL' fa
iled.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 2 (GDB internal error)
Resyncing due to internal error.
n
Tested on x86_64-linux, no regressions.
gdb/ChangeLog:
2015-08-05 Pedro Alves <palves@redhat.com>
Joel Brobecker <brobecker@adacore.com>
* breakpoint.c (bpstat_what) <bp_longjmp, bp_longjmp_call_dummy>
<bp_exception, bp_longjmp_resume, bp_exception_resume>: Handle the
case where BS->STOP is not set.
gdb/testsuite/ChangeLog:
2015-08-05 Pedro Alves <palves@redhat.com>
* gdb.threads/next-while-other-thread-longjmps.c: New file.
* gdb.threads/next-while-other-thread-longjmps.exp: New file.
Fixes a build error due to typedef redefinition with some compilers.
Also added missing copyright header.
gdb/
* nat/gdb_thread_db.h: Add copyright header.
Protect against multiple inclusion.
This patch removes get_thread_id from aarch64-linux-nat.c,
arm-linux-nat.c and xtensa-linux-nat.c.
get_thread_id was added in this commit below in 2000,
41c49b06c4https://sourceware.org/ml/gdb-patches/2000-04/msg00398.html
which predates the ptid_t stuff added into GDB. Nowadays, lwpid of
inferior_ptid is only zero when the inferior is created (in
fork-child.c:fork_inferior) and its lwpid will be set after
linux_nat_wait_1 gets the first event. After that, lwpid of
inferior_ptid is not zero for linux-nat target, then we can use
ptid_get_lwp, so this function isn't needed anymore.
Even when GDB attaches to a process, the lwp of inferior_ptid
isn't zero, see linux-nat.c:linux_nat_attach,
/* The ptrace base target adds the main thread with (pid,0,0)
format. Decorate it with lwp info. */
ptid = ptid_build (ptid_get_pid (inferior_ptid),
ptid_get_pid (inferior_ptid),
0);
Note that linux_nat_xfer_partial shifts lwpid to pid for inferior_ptid
temperately for calling linux_ops->to_xfer_partial, but all the
affected functions in this patch are not called in
linux_ops->to_xfer_partial.
I think we can safely remove get_thread_id for all linux native targets.
Regression tested on arm-linux and aarch64-linux. Unable to build
native GDB and test it on xtensa-linux.
gdb:
2015-08-05 Yao Qi <yao.qi@linaro.org>
* aarch64-linux-nat.c (get_thread_id): Remove.
(debug_reg_change_callback): Call ptid_get_lwp instead of
get_thread_id.
(fetch_gregs_from_thread): Likewise.
(store_gregs_to_thread): Likewise.
(fetch_fpregs_from_thread): Likewise.
(store_fpregs_to_thread): Likewise.
(aarch64_linux_get_debug_reg_capacity): Likewise.
* arm-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Update macro to use ptid_get_lwp.
* xtensa-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Update macro to use ptid_get_lwp.
* arm-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Remove.
(fetch_fpregs): Call ptid_get_lwp instead of GET_THREAD_ID.
(store_fpregs, fetch_regs, store_regs): Likewise.
(fetch_wmmx_regs, store_wmmx_regs): Likewise.
(fetch_vfp_regs, store_vfp_regs): Likewise.
(arm_linux_read_description): Likewise.
(arm_linux_get_hwbp_cap): Likewise.
* xtensa-linux-nat.c (get_thread_id): Remove.
(GET_THREAD_ID): Remove.
(fetch_gregs, store_gregs): Call ptid_get_lwp instead of
GET_THREAD_ID.
The class is called LineTable, not Linetable, as specified by
py-linetable.c/gdbpy_initialize_linetable:
if (gdb_pymodule_addobject (gdb_module, "LineTable",
gdb/ChangeLog:
* python/py-linetable.c: Fix case of Linetable to LineTable
in docstrings and code comments.
* python/py-symtab.c: Same.
We only support tracepoint for aarch64. Although arm program can run
on aarch64, GDBserver doesn't support tracepoint for it.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-aarch64-low.c (aarch64_supports_tracepoints): Return 0
if current_thread is 32 bit.
In multi-arch debugging, if GDB sends Z0 packet, GDBserver should be
able to do several things below:
- choose the right breakpoint instruction to insert according to the
information available, such as 'kind' in Z0 packet and address,
- choose the right breakpoint instruction to check memory writes and
validate inserted memory breakpoint
- be aware of different breakpoint instructions in $ARCH_breakpoint_at.
unfortunately GDBserver can't do them now. Although x86 GDBserver
supports multi-arch, it doesn't need to support them above because
breakpoint instruction on i686 and x86_64 is the same. However,
breakpoint instructions on aarch64 and arm (arm mode, thumb1, and thumb2)
are different.
I tried to teach aarch64 GDBserver backend to be really
multi-arch-capable in the following ways,
- linux_low_target return the right breakpoint instruction according to
the 'kind' in Z0 packet, and insert_memory_breakpoint can do the right
thing.
- once breakpoint is inserted, the breakpoint data and length is recorded
in each breakpoint object, so that validate_breakpoint and
check_mem_write can get the right breakpoint instruction from each
breakpoint object, rather than from global variable breakpoint_data.
- linux_low_target needs another hook function for pc increment after
hitting a breakpoint.
- let set_breakpoint_at, which is widely used for tracepoint, use the
'default' breakpoint instruction. We can always use aarch64 breakpoint
instruction since arm doesn't support tracepoint yet.
looks it is not a small piece of work, so I decide to disable Z0 packet
on multi-arch, which means aarch64 GDBserver only supports Z0 packet
if it is started to debug only one process (extended protocol is not
used) and process target description is 64-bit.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-aarch64-low.c (aarch64_supports_z_point_type): Return
0 for Z_PACKET_SW_BP if it may be used in multi-arch debugging.
* server.c (extended_protocol): Remove "static".
* server.h (extended_protocol): Declare it.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-aarch64-low.c (aarch64_get_pc): Get PC register on
both aarch64 and aarch32.
(aarch64_set_pc): Likewise.
This patch teaches aarch64-linux GDBserver use 32-bit arm target
description and regs_info if the elf file is 32-bit.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* configure.srv (case aarch64*-*-linux*): Append arm-with-neon.o
to srv_regobj and append arm-core.xml arm-vfpv3.xml and
arm-with-neon.xml to srv_xmlfiles.
* linux-aarch64-low.c: Include linux-aarch32-low.h.
(is_64bit_tdesc): New function.
(aarch64_linux_read_description): New function.
(aarch64_arch_setup): Call aarch64_linux_read_description.
(regs_info): Rename to regs_info_aarch64.
(aarch64_regs_info): Return right regs_info.
(initialize_low_arch): Call initialize_low_arch_aarch32.
This patch adds a new regs_info regs_info_aarch32 for aarch32, which
can be used by both aarch64 and arm backend.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* configure.srv (srv_tgtobj): Add linux-aarch32-low.o.
* linux-aarch32-low.c: New file.
* linux-aarch32-low.h: New file.
* linux-arm-low.c (arm_fill_gregset): Move it to
linux-aarch32-low.c.
(arm_store_gregset): Likewise.
(arm_fill_vfpregset): Call arm_fill_vfpregset_num
(arm_store_vfpregset): Caa arm_store_vfpregset_num.
(arm_arch_setup): Check if PTRACE_GETREGSET works.
(regs_info): Rename to regs_info_arm.
(arm_regs_info): Return regs_info_aarch32 if
have_ptrace_getregset is 1 and target description is
arm_with_neon or arm_with_vfpv3.
(initialize_low_arch): Don't call init_registers_arm_with_neon.
Call initialize_low_arch_aarch32 instead.
This patch moves variable have_ptrace_getregset from linux-x86-low.c
to linux-low.c, so that arm can use it too.
gdb/gdbserver:
2015-08-04 Yao Qi <yao.qi@linaro.org>
* linux-x86-low.c (have_ptrace_getregset): Move it to ...
* linux-low.c: ... here.
* linux-low.h (have_ptrace_getregset): Declare it.
-fsanitize=address
gdb.base/attach-pie-noexec.exp
==32586==ERROR: AddressSanitizer: heap-use-after-free on address 0x60200004ed90 at pc 0x48ad50 bp 0x7ffceb3aef50 sp 0x7ffceb3aef20
READ of size 2 at 0x60200004ed90 thread T0
#0 0x48ad4f in __interceptor_strlen (/home/jkratoch/redhat/gdb-test-asan/gdb/gdb+0x48ad4f)
#1 0xeafe5c in xstrdup xstrdup.c:33
#2 0x85e024 in attach_command /home/jkratoch/redhat/gdb-test-asan/gdb/infcmd.c:2680
regressed by:
commit 6c4486e63f
Author: Pedro Alves <palves@redhat.com>
Date: Fri Oct 17 13:31:26 2014 +0100
PR gdb/17471: Repeating a background command makes it foreground
gdb/ChangeLog
2015-08-04 Jan Kratochvil <jan.kratochvil@redhat.com>
PR gdb/18767
* infcmd.c (attach_command): Move ARGS_CHAIN cleanup after last ARGS
use.
Implicit void * -> function pointer conversion doesn't work in C++, so
in C++, we need to cast the result of dlsym. This adds a few typedefs
and macros that make this easy. GDBserver's version already had the
CHK macro, so I added it to GDB too.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/gdbserver/ChangeLog:
2015-08-04 Pedro Alves <palves@redhat.com>
* thread-db.c (struct thread_db): Use new typedefs.
(try_thread_db_load_1): Define local TDB_DLSYM macro and use it in
CHK calls.
(disable_thread_event_reporting): Cast result of dlsym to
destination function pointer type.
(thread_db_mourn): Use td_ta_delete_ftype.
gdb/ChangeLog:
2015-08-04 Pedro Alves <palves@redhat.com>
* nat/gdb_thread_db.h (td_init_ftype, td_ta_new_ftype)
(td_ta_map_lwp2thr_ftype, td_ta_thr_iter_ftype)
(td_ta_event_addr_ftype, td_ta_set_event_ftype)
(td_ta_clear_event_ftype, td_ta_event_getmsg_ftype)
(td_thr_validate_ftype, td_thr_get_info_ftype)
(td_thr_event_enable_ftype, td_thr_tls_get_addr_ftype)
(td_thr_tlsbase_ftype, td_symbol_list_ftype, td_ta_delete_ftype):
New typedefs.
* linux-thread-db.c (struct thread_db_info): Use new typedefs.
(try_thread_db_load_1): Define TDB_VERBOSE_DLSYM, TDB_DLSYM , CHK
local macros and use them instead of verbose_dlsym and dlsym
calls.
2015-08-03 Sandra Loosemore <sandra@codesourcery.com>
gdb/testsuite/
* gdb.base/bp-permanent.exp: Report test as unsupported if
the target cannot stop at the permanent breakpoint.
These testcases are mocks of real programs.
GDB doesn't care what the programs do, they just have to look
and/or behave like the real program.
These testcases exercise gdb when debugging really large programs.
E.g., gmonster-1 has 10,000 CUs, and gmonster-2 has 1000 shared libs
(which is actually a little small, 5000 would be more accurate).
gdb/testsuite/ChangeLog:
* gdb.perf/lib/perftest/utils.py: New file.
* gdb.perf/gm-hello.cc: New file.
* gdb.perf/gm-pervasive-typedef.cc: New file.
* gdb.perf/gm-pervasive-typedef.h: New file.
* gdb.perf/gm-std.cc: New file.
* gdb.perf/gm-std.h: New file.
* gdb.perf/gm-use-cerr.cc: New file.
* gdb.perf/gm-utils.h: New file.
* gdb.perf/gmonster-null-lookup.py: New file.
* gdb.perf/gmonster-pervasive-typedef.py: New file.
* gdb.perf/gmonster-print-cerr.py: New file.
* gdb.perf/gmonster-ptype-string.py: New file.
* gdb.perf/gmonster-runto-main.py: New file.
* gdb.perf/gmonster-select-file.py: New file.
* gdb.perf/gmonster1-null-lookup.exp: New file.
* gdb.perf/gmonster1-pervasive-typedef.exp: New file.
* gdb.perf/gmonster1-print-cerr.exp: New file.
* gdb.perf/gmonster1-ptype-string.exp: New file.
* gdb.perf/gmonster1-runto-main.exp: New file.
* gdb.perf/gmonster1-select-file.exp: New file.
* gdb.perf/gmonster1.cc: New file.
* gdb.perf/gmonster1.exp: New file.
* gdb.perf/gmonster2-null-lookup.exp: New file.
* gdb.perf/gmonster2-pervasive-typedef.exp: New file.
* gdb.perf/gmonster2-print-cerr.exp: New file.
* gdb.perf/gmonster2-ptype-string.exp: New file.
* gdb.perf/gmonster2-runto-main.exp: New file.
* gdb.perf/gmonster2-select-file.exp: New file.
* gdb.perf/gmonster2.cc: New file.
* gdb.perf/gmonster2.exp: New file.
single-step.exp takes a while to run, and while that's not necessarily
bad, here it's because the default value of SINGLE_STEP_COUNT is 10,000.
We're not going to gain any more insight into perf issues
single-stepping (stepi) 10,000 times over 1,000 times,
so this patch changes the default to 1,000.
gdb/testsuite/ChangeLog:
* gdb.perf/single-step.exp (SINGLE_STEP_COUNT): Change to 1000 from
10000.
gdb/testsuite/ChangeLog:
* Makefile.in (workers/%.worker, build-perf): New rule.
(GDB_PERFTEST_MODE): New variable.
(check-perf): Use it.
(clean): Clean up gdb.perf parallel build subdirs.
* lib/build-piece.exp: New file.
* lib/gdb.exp (make_gdb_parallel_path): New function
(standard_output_file, standard_temp_file): Call it.
(GDB_PARALLEL handling): Make outputs,temp,cache directories as subdirs
of $GDB_PARALLEL.
* lib/cache.exp (gdb_do_cache): Call make_gdb_parallel_path.
This patch does two things.
1) Add support for multiple data points.
2) Move the "report" output from perftest.log to perftest.sum.
I want to record the raw data somewhere, and a bit of statistical analysis
(standard deviation left for another day), but I also don't want
it to clutter up the basic report.
This patch takes a cue from gdb.{sum,log} and does the same thing
with perftest.{sum,log}.
Ultimately, we'll probably want to emit raw data to csv files or some
such and then do post-processing passes on that.
gdb/testsuite/ChangeLog:
* lib/perftest/reporter.py (SUM_FILE_NAME): New global.
(LOG_FILE_NAME): New global.
(TextReporter.__init__): Initialize self.txt_sum.
(TextReporter.report): Add support for multiple data-points.
Move report to perftest.sum, put raw data in perftest.log.
(TextReporter.start): Open sum and log files.
(TextReporter.end): Close sum and log files.
* lib/perftest/testresult.py (SingleStatisticTestResult.record): Handle
multiple data-points.
As of commit a5fdf78a44, building GDB with
a GCC 4.1 host compiler fails with:
gdb/cp-namespace.c: In function 'cp_lookup_symbol_via_imports':
gdb/cp-namespace.c:482: warning: 'sym.block' may be used uninitialized in this function
Apparently, more recent compilers are able to deduce that no actual
uninitialized use of sym.block takes place, but GCC 4.1 isn't yet
able to do that.
Fixed by adding an explicit initalization.
gdb/
* cp-namespace.c (cp_lookup_symbol_via_imports): Fix uninitialized
variable warning with some compilers.
This patch fixes GDB build breakage on arm-linux.
gdb:
2015-08-03 Yao Qi <yao.qi@linaro.org>
* arm-linux-nat.c (arm_linux_get_hwbp_type): Capitalize "type"
in comment. Replace "rw" with "type".
(arm_linux_remove_watchpoint): Change type of "rw" to
"enum target_hw_bp_type".
Commit f486487f55 (Mostly trivial enum fixes) missed updating
ppc-linux-nat.c, resulting in:
../../src/gdb/ppc-linux-nat.c: In function ‘_initialize_ppc_linux_nat’:
../../src/gdb/ppc-linux-nat.c:2503:27: error: assignment from incompatible pointer type [-Werror]
../../src/gdb/ppc-linux-nat.c:2504:27: error: assignment from incompatible pointer type [-Werror]
gdb/ChangeLog
2015-08-02 Pedro Alves <palves@redhat.com>
* ppc-linux-nat.c (get_trigger_type, create_watchpoint_request)
(ppc_linux_insert_watchpoint, ppc_linux_remove_watchpoint): Change
parameter 'rw's type to enum target_hw_bp_type and rename to
'type'.
The previous commit (Replace the block_found global with explicit
data-flow) lacks updates in a couple of files because it was not
tested building GDB with --enable-targets=all... but buildbots did.
This adds the appropriate simple updates to fix the build.
gdb/ChangeLog:
* alpha-mdebug-tdep.c (find_proc_desc): Update call to
lookup_symbol.
* ft32-tdep.c (ft32_skip_prologue): Likewise.
* moxie-tdep.c (moxie_skip_prologue): Likewise.
* mt-tdep.c (mt_skip_prologue): Likewise.
* xstormy16-tdep.c (xstormy16_skip_prologue): Likewise.
As Pedro suggested on gdb-patches@ (see
https://sourceware.org/ml/gdb-patches/2015-05/msg00714.html), this
change makes symbol lookup functions return a structure that includes
both the symbol found and the block in which it was found. This makes
it possible to get rid of the block_found global variable and thus makes
block hunting explicit.
gdb/
* ada-exp.y (write_object_renaming): Replace struct
ada_symbol_info with struct block_symbol. Update field
references accordingly.
(block_lookup, select_possible_type_sym): Likewise.
(find_primitive_type): Likewise. Also update call to
ada_lookup_symbol to extract the symbol itself.
(write_var_or_type, write_name_assoc): Likewise.
* ada-lang.h (struct ada_symbol_info): Remove.
(ada_lookup_symbol_list): Replace struct ada_symbol_info with
struct block_symbol.
(ada_lookup_encoded_symbol, user_select_syms): Likewise.
(ada_lookup_symbol): Return struct block_symbol instead of a
mere symbol.
* ada-lang.c (defns_collected): Replace struct ada_symbol_info
with struct block_symbol.
(resolve_subexp, ada_resolve_function, sort_choices,
user_select_syms, is_nonfunction, add_defn_to_vec,
num_defns_collected, defns_collected,
symbols_are_identical_enums, remove_extra_symbols,
remove_irrelevant_renamings, add_lookup_symbol_list_worker,
ada_lookup_symbol_list, ada_iterate_over_symbols,
ada_lookup_encoded_symbol, get_var_value): Likewise.
(ada_lookup_symbol): Return a block_symbol instead of a mere
symbol. Replace struct ada_symbol_info with struct
block_symbol.
(ada_lookup_symbol_nonlocal): Likewise.
(standard_lookup): Make block passing explicit through
lookup_symbol_in_language.
* ada-tasks.c (get_tcb_types_info): Update the calls to
lookup_symbol_in_language to extract the mere symbol out of the
returned value.
(ada_tasks_inferior_data_sniffer): Likewise.
* ax-gdb.c (gen_static_field): Likewise for the call to
lookup_symbol.
(gen_maybe_namespace_elt): Deal with struct symbol_in_block from
lookup functions.
(gen_expr): Likewise.
* c-exp.y: Likewise. Remove uses of block_found.
(lex_one_token, classify_inner_name, c_print_token): Likewise.
(classify_name): Likewise. Rename the "sym" local variable to
"bsym".
* c-valprint.c (print_unpacked_pointer): Likewise.
* compile/compile-c-symbols.c (convert_symbol_sym): Promote the
"sym" parameter from struct symbol * to struct block_symbol.
Use it to remove uses of block_found. Deal with struct
symbol_in_block from lookup functions.
(gcc_convert_symbol): Likewise. Update the call to
convert_symbol_sym.
* compile/compile-object-load.c (compile_object_load): Deal with
struct symbol_in_block from lookup functions.
* cp-namespace.c (cp_lookup_nested_symbol_1,
cp_lookup_nested_symbol, cp_lookup_bare_symbol,
cp_search_static_and_baseclasses,
cp_lookup_symbol_in_namespace, cp_lookup_symbol_via_imports,
cp_lookup_symbol_imports_or_template,
cp_lookup_symbol_via_all_imports, cp_lookup_symbol_namespace,
lookup_namespace_scope, cp_lookup_nonlocal,
find_symbol_in_baseclass): Return struct symbol_in_block instead
of mere symbols and deal with struct symbol_in_block from lookup
functions.
* cp-support.c (inspect_type, replace_typedefs,
cp_lookup_rtti_type): Deal with struct symbol_in_block from
lookup functions.
* cp-support.h (cp_lookup_symbol_nonlocal,
cp_lookup_symbol_from_namespace,
cp_lookup_symbol_imports_or_template, cp_lookup_nested_symbol):
Return struct symbol_in_block instead of mere symbols.
* d-exp.y (d_type_from_name, d_module_from_name, push_variable,
push_module_name):
Deal with struct symbol_in_block from lookup functions. Remove
uses of block_found.
* eval.c (evaluate_subexp_standard): Update call to
cp_lookup_symbol_namespace.
* f-exp.y: Deal with struct symbol_in_block from lookup
functions. Remove uses of block_found.
(yylex): Likewise.
* gdbtypes.c (lookup_typename, lookup_struct, lookup_union,
lookup_enum, lookup_template_type, check_typedef): Deal with
struct symbol_in_block from lookup functions.
* guile/scm-frame.c (gdbscm_frame_read_var): Likewise.
* guile/scm-symbol.c (gdbscm_lookup_symbol): Likewise.
(gdbscm_lookup_global_symbol): Likewise.
* gnu-v3-abi.c (gnuv3_get_typeid_type): Likewise.
* go-exp.y: Likewise. Remove uses of block_found.
(package_name_p, classify_packaged_name, classify_name):
Likewise.
* infrun.c (insert_exception_resume_breakpoint): Likewise.
* jv-exp.y (push_variable): Likewise.
* jv-lang.c (java_lookup_class, get_java_object_type): Likewise.
* language.c (language_bool_type): Likewise.
* language.h (struct language_defn): Update
la_lookup_symbol_nonlocal to return a struct symbol_in_block
rather than a mere symbol.
* linespec.c (find_label_symbols): Deal with struct
symbol_in_block from lookup functions.
* m2-exp.y: Likewise. Remove uses of block_found.
(yylex): Likewise.
* mi/mi-cmd-stack.c (list_args_or_locals): Likewise.
* objc-lang.c (lookup_struct_typedef, find_imps): Likewise.
* p-exp.y: Likewise. Remove uses of block_found.
(yylex): Likewise.
* p-valprint.c (pascal_val_print): Likewise.
* parse.c (write_dollar_variable): Likewise. Remove uses of
block_found.
* parser-defs.h (struct symtoken): Turn the SYM field into a
struct symbol_in_block.
* printcmd.c (address_info): Deal with struct symbol_in_block
from lookup functions.
* python/py-frame.c (frapy_read_var): Likewise.
* python/py-symbol.c (gdbpy_lookup_symbol,
gdbpy_lookup_global_symbol): Likewise.
* skip.c (skip_function_command): Likewise.
* solib-darwin.c (darwin_lookup_lib_symbol): Return a struct
symbol_in_block instead of a mere symbol.
* solib-spu.c (spu_lookup_lib_symbol): Likewise.
* solib-svr4.c (elf_lookup_lib_symbol): Likewise.
* solib.c (solib_global_lookup): Likewise.
* solist.h (solib_global_lookup): Likewise.
(struct target_so_ops): Update lookup_lib_global_symbol to
return a struct symbol_in_block rather than a mere symbol.
* source.c (select_source_symtab): Deal with struct
symbol_in_block from lookup functions.
* stack.c (print_frame_args, iterate_over_block_arg_vars):
Likewise.
* symfile.c (set_initial_language): Likewise.
* symtab.c (SYMBOL_LOOKUP_FAILED): Turn into a struct
symbol_in_block.
(SYMBOL_LOOKUP_FAILED_P): New predicate as a macro.
(struct symbol_cache_slot): Turn the FOUND field into a struct
symbol_in_block.
(block_found): Remove.
(eq_symbol_entry): Update to deal with struct symbol_in_block in
cache slots.
(symbol_cache_lookup): Return a struct symbol_in_block rather
than a mere symbol.
(symbol_cache_mark_found): Add a BLOCK parameter to fill
appropriately the cache slots. Update callers.
(symbol_cache_dump): Update cache slots handling to the type
change.
(lookup_symbol_in_language, lookup_symbol, lookup_language_this,
lookup_symbol_aux, lookup_local_symbol,
lookup_symbol_in_objfile, lookup_global_symbol_from_objfile,
lookup_symbol_in_objfile_symtabs,
lookup_symbol_in_objfile_from_linkage_name,
lookup_symbol_via_quick_fns, basic_lookup_symbol_nonlocal,
lookup_symbol_in_static_block, lookup_static_symbol,
lookup_global_symbol):
Return a struct symbol_in_block rather than a mere symbol. Deal
with struct symbol_in_block from other lookup functions. Remove
uses of block_found.
(lookup_symbol_in_block): Remove uses of block_found.
(struct global_sym_lookup_data): Turn the RESULT field into a
struct symbol_in_block.
(lookup_symbol_global_iterator_cb): Update references to the
RESULT field.
(search_symbols): Deal with struct symbol_in_block from lookup
functions.
* symtab.h (struct symbol_in_block): New structure.
(block_found): Remove.
(lookup_symbol_in_language, lookup_symbol,
basic_lookup_symbol_nonlocal, lookup_symbol_in_static_block,
looku_static_symbol, lookup_global_symbol,
lookup_symbol_in_block, lookup_language_this,
lookup_global_symbol_from_objfile): Return a struct
symbol_in_block rather than just a mere symbol. Update comments
to remove mentions of block_found.
* valops.c (find_function_in_inferior,
value_struct_elt_for_reference, value_maybe_namespace_elt,
value_of_this): Deal with struct symbol_in_block from lookup
functions.
* value.c (value_static_field, value_fn_field): Likewise.
The buildbots show that attach-many-short-lived-thread.exp is racy.
But after staring at debug logs and playing with SystemTap scripts for
a (long) while, I figured out that neither GDB, nor the kernel nor the
test's program itself are at fault.
The problem is simply that the testsuite machinery is currently
subject to PID-reuse races. The attach-many-short-lived-threads.c
test program just happens to be much more susceptible to trigger this
race because threads and processes share the same number space on
Linux, and the test spawns many many short lived threads in
succession, thus enlarging the race window a lot.
Part of the problem is that several tests spawn processes with "exec&"
(in order to test the "attach" command) , and then at the end of the
test, to make sure things are cleaned up, issue a 'remote_spawn "kill
-p $testpid"'. Since with tcl's "exec&", tcl itself is responsible
for reaping the process's exit status, when we go kill the process,
testpid may have already exited _and_ its status may have (and often
has) been reaped already. Thus it can happen that another process
meanwhile reuses $testpid, and that "kill" command kills the wrong
process... Frequently, that happens to be
attach-many-short-lived-thread, but this explains other test's races
as well.
In the attach-many-short-lived-threads test, it sometimes manifests
like this:
(gdb) file /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads
Reading symbols from /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads...done.
(gdb) Loaded /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads into /home/pedro/gdb/mygit/build/gdb/testsuite/../../gdb/gdb
attach 5940
Attaching to program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads, process 5940
warning: process 5940 is a zombie - the process has already terminated
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ptrace: Operation not permitted.
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: attach
info threads
No threads.
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: no new threads
set breakpoint always-inserted on
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: set breakpoint always-inserted on
Other times the process dies while the test is ongoing (the process is
ptrace-stopped):
(gdb) print again = 1
Cannot access memory at address 0x6020cc
(gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 2: reset timer in the inferior
(Recall that on Linux, SIGKILL is not interceptable)
And other times it dies just while we're detaching:
$4 = 319
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 2: print seconds_left
detach
Can't detach Thread 0x7fb13b7de700 (LWP 1842): No such process
(gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 2: detach
GDB mishandles the latter (it should ignore ESRCH while detaching just
like when continuing), but that's another story.
The fix here is to change spawn_wait_for_attach to use Expect's
'spawn' command instead of Tcl's 'exec&' to spawn programs, because
with spawn we control when to wait for/reap the process. That allows
killing the process by PID without being subject to pid-reuse races,
because even if the process is already dead, the kernel won't reuse
the process's PID until the zombie is reaped.
The other part of the problem lies in DejaGnu itself, unfortunately.
I have occasionally seen tests (attach-many-short-lived-threads
included, but not only that one) die with a random inexplicable
SIGTERM too, and that too is caused by the same reason, except that in
that case, the rogue SIGTERM is sent from this bit in DejaGnu's remote.exp:
exec sh -c "exec > /dev/null 2>&1 && (kill -2 $pgid || kill -2 $pid) && sleep 5 && (kill $pgid || kill $pid) && sleep 5 && (kill -9 $pgid || kill -9 $pid) &"
...
catch "wait -i $shell_id"
Even if the program exits promptly, that whole cascade of kills
carries on in the background, thus potentially killing the poor
process that manages to reuse $pid...
I sent a fix for that to the DejaGnu list:
http://lists.gnu.org/archive/html/dejagnu/2015-07/msg00000.html
With both patches in place, I haven't seen
attach-many-short-lived-threads.exp fail again.
Tested on x86_64 Fedora 20, native, gdbserver and extended-gdbserver.
gdb/testsuite/ChangeLog:
2015-07-31 Pedro Alves <palves@redhat.com>
* gdb.base/attach-pie-misread.exp: Rename $res to $test_spawn_id.
Use spawn_id_get_pid. Wait for spawn id after eof. Use
kill_wait_spawned_process instead of explicit "kill -9".
* gdb.base/attach-pie-noexec.exp: Adjust to spawn_wait_for_attach
returning a spawn id instead of a pid. Use spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/attach-twice.exp: Likewise.
* gdb.base/attach.exp: Likewise.
(do_command_attach_tests): Use gdb_spawn_with_cmdline_opts and
gdb_test_multiple.
* gdb.base/solib-overlap.exp: Adjust to spawn_wait_for_attach
returning a spawn id instead of a pid. Use spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/valgrind-infcall.exp: Likewise.
* gdb.multi/multi-attach.exp: Likewise.
* gdb.python/py-prompt.exp: Likewise.
* gdb.python/py-sync-interp.exp: Likewise.
* gdb.server/ext-attach.exp: Likewise.
* gdb.threads/attach-into-signal.exp (corefunc): Use
spawn_wait_for_attach, spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.threads/attach-many-short-lived-threads.exp: Adjust to
spawn_wait_for_attach returning a spawn id instead of a pid. Use
spawn_id_get_pid and kill_wait_spawned_process.
* gdb.threads/attach-stopped.exp (corefunc): Use
spawn_wait_for_attach, spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/break-interp.exp: Rename $res to $test_spawn_id.
Use spawn_id_get_pid. Wait for spawn id after eof. Use
kill_wait_spawned_process instead of explicit "kill -9".
* lib/gdb.exp (can_spawn_for_attach): Adjust comment.
(kill_wait_spawned_process, spawn_id_get_pid): New procedures.
(spawn_wait_for_attach): Use spawn instead of exec to spawn
processes. Don't map cygwin/windows pids here. Now returns a
spawn id list.
This change should have been in the previous patch (Mostly trivial enum
fixes).
gdb/ChangeLog:
* remote-m32r-sdi.c (m32r_remove_watchpoint): Use enum type
instead of integer.