313 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Pedro Alves
|
998d452ac8 |
remote follow fork and spurious child stops in non-stop mode
Running gdb.threads/fork-plus-threads.exp against gdbserver in extended-remote mode, even though the test passes, we still see broken behavior: (gdb) PASS: gdb.threads/fork-plus-threads.exp: set detach-on-fork off continue & Continuing. (gdb) PASS: gdb.threads/fork-plus-threads.exp: continue & [New Thread 28092.28092] [Thread 28092.28092] #2 stopped. [New Thread 28094.28094] [Inferior 2 (process 28092) exited normally] [New Thread 28094.28105] [New Thread 28094.28109] ... [Thread 28174.28174] #18 stopped. [New Thread 28185.28185] [Inferior 10 (process 28174) exited normally] [New Thread 28185.28196] [Thread 28185.28185] #20 stopped. Cannot remove breakpoints because program is no longer writable. Further execution is probably impossible. [Inferior 11 (process 28185) exited normally] [Inferior 1 (process 28091) exited normally] PASS: gdb.threads/fork-plus-threads.exp: reached breakpoint info threads No threads. (gdb) PASS: gdb.threads/fork-plus-threads.exp: no threads left info inferiors Num Description Executable * 1 <null> /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/fork-plus-threads (gdb) PASS: gdb.threads/fork-plus-threads.exp: only inferior 1 left All the "[Thread FOO] #NN stopped." above are bogus, as well as the "Cannot remove breakpoints because program is no longer writable.", which is a consequence. The problem is that when we intercept a fork event, we should report the event for the parent, only, and leave the child stopped, but not report its stop event. GDB later decides whether to follow the parent or the child. But because handle_extended_wait does not set the child's last_status.kind to TARGET_WAITKIND_STOPPED, a stop_all_threads/unstop_all_lwps sequence (e.g., from trying to access memory) by mistake ends up queueing a SIGSTOP on the child, resuming it, and then when that SIGSTOP is intercepted, because the LWP has last_resume_kind set to resume_stop, gdbserver reports the stop to GDB, as GDB_SIGNAL_0: ... >>>> entering unstop_all_lwps unstopping all lwps proceed_one_lwp: lwp 1600 client wants LWP to remain 1600 stopped proceed_one_lwp: lwp 1828 Client wants LWP 1828 to stop. Making sure it has a SIGSTOP pending ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Sending sigstop to lwp 1828 pc is 0x3615ebc7cc Resuming lwp 1828 (continue, signal 0, stop expected) continue from pc 0x3615ebc7cc unstop_all_lwps done sigchld_handler <<<< exiting unstop_all_lwps handling possible target event >>>> entering linux_wait_1 linux_wait_1: [<all threads>] my_waitpid (-1, 0x40000001) my_waitpid (-1, 0x1): status(137f), 1828 LWFE: waitpid(-1, ...) returned 1828, ERRNO-OK LLW: waitpid 1828 received Stopped (signal) (stopped) pc is 0x3615ebc7cc Expected stop. LLW: resume_stop SIGSTOP caught for LWP 1828.1828. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ... linux_wait_1 ret = LWP 1828.1828, 1, 0 <<<< exiting linux_wait_1 Writing resume reply for LWP 1828.1828:1 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Tested on x86_64 Fedora 20, extended-remote. gdb/gdbserver/ChangeLog: 2015-07-30 Pedro Alves <palves@redhat.com> * linux-low.c (handle_extended_wait): Set the child's last reported status to TARGET_WAITKIND_STOPPED. |
||
Pedro Alves
|
69dde7dcb8 |
PR threads/18600: Inferiors left around after fork+thread spawn
The new gdb.threads/fork-plus-threads.exp test exposes one more problem. When one types "info inferiors" after running the program, one see's a couple inferior left still, while there should only be inferior #1 left. E.g.: (gdb) info inferiors Num Description Executable 4 process 8393 /home/pedro/bugs/src/test 2 process 8388 /home/pedro/bugs/src/test * 1 <null> /home/pedro/bugs/src/test (gdb) info threads Calling prune_inferiors() manually at this point (from a top gdb) does not remove them, because they still have inf->pid != 0 (while they shouldn't). This suggests that we never mourned those inferiors. Enabling logs (master + previous patch) we see: ... WL: waitpid Thread 0x7ffff7fc2740 (LWP 9513) received Trace/breakpoint trap (stopped) WL: Handling extended status 0x03057f LHEW: Got clone event from LWP 9513, new child is LWP 9579 [New Thread 0x7ffff37b8700 (LWP 9579)] WL: waitpid Thread 0x7ffff7fc2740 (LWP 9508) received 0 (exited) WL: Thread 0x7ffff7fc2740 (LWP 9508) exited. ^^^^^^^^ [Thread 0x7ffff7fc2740 (LWP 9508) exited] WL: waitpid Thread 0x7ffff7fc2740 (LWP 9499) received 0 (exited) WL: Thread 0x7ffff7fc2740 (LWP 9499) exited. [Thread 0x7ffff7fc2740 (LWP 9499) exited] RSRL: resuming stopped-resumed LWP Thread 0x7ffff37b8700 (LWP 9579) at 0x3615ef4ce1: step=0 ... (gdb) info inferiors Num Description Executable 5 process 9508 /home/pedro/bugs/src/test ^^^^ 4 process 9503 /home/pedro/bugs/src/test 3 process 9500 /home/pedro/bugs/src/test 2 process 9499 /home/pedro/bugs/src/test * 1 <null> /home/pedro/bugs/src/test (gdb) ... Note the "Thread 0x7ffff7fc2740 (LWP 9508) exited." line. That's this in wait_lwp: /* Check if the thread has exited. */ if (WIFEXITED (status) || WIFSIGNALED (status)) { thread_dead = 1; if (debug_linux_nat) fprintf_unfiltered (gdb_stdlog, "WL: %s exited.\n", target_pid_to_str (lp->ptid)); } } That was the leader thread reporting an exit, meaning the whole process is gone. So the problem is that this code doesn't understand that an WIFEXITED status of the leader LWP should be reported to infrun as process exit. gdb/ChangeLog: 2015-07-30 Pedro Alves <palves@redhat.com> PR threads/18600 * linux-nat.c (wait_lwp): Report to the core when thread group leader exits. gdb/testsuite/ChangeLog: 2015-07-30 Pedro Alves <palves@redhat.com> PR threads/18600 * gdb.threads/fork-plus-threads.exp: Test that "info inferiors" only shows inferior 1. |
||
Pedro Alves
|
4dd63d488a |
PR threads/18600: Threads left stopped after fork+thread spawn
When a program forks and another process start threads while gdb is handling the fork event, newly created threads are left stuck stopped by gdb, even though gdb presents them as "running", to the user. This can be seen with the test added by this patch. The test has the inferior fork a certain number of times and waits for all children to exit. Each fork child spawns a number of threads that do nothing and joins them immediately. Normally, the program should run unimpeded (from the point of view of the user) and exit very quickly. Without this fix, it doesn't because of some threads left stopped by gdb, so inferior 1 never exits. The program triggers when a new clone thread is found while inside the linux_stop_and_wait_all_lwps call in linux-thread-db.c: linux_stop_and_wait_all_lwps (); ALL_LWPS (lp) if (ptid_get_pid (lp->ptid) == pid) thread_from_lwp (lp->ptid); linux_unstop_all_lwps (); Within linux_stop_and_wait_all_lwps, we reach linux_handle_extended_wait with the "stopping" parameter set to 1, and because of that we don't mark the new lwp as resumed. As consequence, the subsequent resume_stopped_resumed_lwps, called from linux_unstop_all_lwps, never resumes the new LWP. There's lots of cruft in linux_handle_extended_wait that no longer makes sense. On systems with CLONE events support, we don't rely on libthread_db for thread listing anymore, so the code that preserves stop_requested and the handling of last_resume_kind is all dead. So the fix is to remove all that, and simply always mark the new LWP as resumed, so that resume_stopped_resumed_lwps re-resumes it. gdb/ChangeLog: 2015-07-30 Pedro Alves <palves@redhat.com> Simon Marchi <simon.marchi@ericsson.com> PR threads/18600 * linux-nat.c (linux_handle_extended_wait): On CLONE event, always mark the new thread as resumed. Remove STOPPING parameter. (wait_lwp): Adjust call to linux_handle_extended_wait. (linux_nat_filter_event): Adjust call to linux_handle_extended_wait. (resume_stopped_resumed_lwps): Add debug output. gdb/testsuite/ChangeLog: 2015-07-30 Simon Marchi <simon.marchi@ericsson.com> Pedro Alves <palves@redhat.com> PR threads/18600 * gdb.threads/fork-plus-threads.c: New file. * gdb.threads/fork-plus-threads.exp: New file. |
||
Sergio Durigan Junior
|
7da5b897c9 |
Uniquefy gdb.threads/attach-into-signal.exp
Hi, While examining BuildBot's logs, I noticed: <https://sourceware.org/ml/gdb-testers/2015-q3/msg03767.html> gdb.threads/attach-into-signal.exp has two nested loops and don't use unique messages. This commit fixes that. Pushed under the obvious rule. gdb/testsuite/ChangeLog: 2015-07-29 Sergio Durigan Junior <sergiodj@redhat.com> * gdb.threads/attach-into-signal.exp (corefunc): Use with_test_prefix on nested loops, uniquefying the test messages. |
||
Pedro Alves
|
7759842763 |
PR gdb/18717: internal error if non-leader thread exits process
If a non-leader thread exits the process while all other threads are ptrace-stopped, native gdb fails an assertion. The test added by this commit catches it: /home/pedro/gdb/mygit/build/../src/gdb/linux-nat.c:3198: internal-error: linux_nat_filter_event: Assertion `lp->resumed' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) FAIL: gdb.threads/non-leader-exit-process.exp: program exits normally (GDB internal error) The fix is just to remove the assertion. With that out of the way, neither GDB not GDBserver handle this perfectly though, so I'm adding a KFAIL: (gdb) continue Continuing. [Thread 0x7ffff7fc0700 (LWP 15350) exited] No unwaited-for children left. Couldn't get registers: No such process. (gdb) KFAIL: gdb.threads/non-ldr-exit.exp: program exits normally (PRMS: gdb/18717) gdb/ChangeLog: 2015-07-24 Pedro Alves <palves@redhat.com> PR gdb/18717 * linux-nat.c (linux_nat_filter_event): Don't assert that the lwp is resumed, and extend the debug log. gdb/testsuite/ChangeLog: 2015-07-24 Pedro Alves <palves@redhat.com> PR gdb/18717 * gdb.threads/non-ldr-exit.c: New file. * gdb.threads/non-ldr-exit.exp: New file. |
||
Pedro Alves
|
28bf096c62 |
PR threads/18127 - threads spawned by infcall end up stuck in "running" state
Refs: https://sourceware.org/ml/gdb/2015-03/msg00024.html https://sourceware.org/ml/gdb/2015-06/msg00005.html On GNU/Linux, if an infcall spawns a thread, that thread ends up with stuck running state. This happens because: - when linux-nat.c detects a new thread, it marks them as running, and does not report anything to the core. - we skip finish_thread_state when the thread that is running the infcall stops. As result, that new thread ends up with stuck "running" state, even though it really is stopped. On Windows, _all_ threads end up stuck in running state, not just the one that was spawned. That happens because when a new thread is detected, unlike linux-nat.c, windows-nat.c reports TARGET_WAITKIND_SPURIOUS to infrun. It's the fact that that event does not cause a user-visible stop that triggers the problem. When the target is re-resumed, we call set_running with a wildcard ptid, which marks all thread as running. That set_running is not suppressed because the (leader) thread being resumed does not have in_infcall set. Later, when the infcall finally finishes successfully, nothing marks all threads back to stopped. We can trigger the same problem on all targets by having a thread other than the one that is running the infcall report a breakpoint hit to infrun, and then have that breakpoint not cause a stop. That's what the included test does. The fix is to stop GDB from suppressing the set_running calls while doing an infcall, and then set the threads back to stopped when the call finishes, iff they were originally stopped before the infcall started. (Note the MI *running/*stopped event suppression isn't affected.) Tested on x86_64 GNU/Linux. gdb/ChangeLog: 2015-06-29 Pedro Alves <palves@redhat.com> PR threads/18127 * infcall.c (run_inferior_call): On infcall success, if the thread was marked stopped before, reset it back to stopped. * infrun.c (resume): Don't suppress the set_running calls when doing an infcall. (normal_stop): Only discard the finish_thread_state cleanup if the infcall succeeded. gdb/testsuite/ChangeLog: 2015-06-29 Pedro Alves <palves@redhat.com> PR threads/18127 * gdb.threads/hand-call-new-thread.c: New file. * gdb.threads/hand-call-new-thread.c: New file. |
||
Pedro Alves
|
9ee417720b |
Cleanup signal-while-stepping-over-bp-other-thread.exp
gdb/testsuite/ChangeLog: 2015-04-10 Pedro Alves <palves@redhat.com> * gdb.threads/signal-while-stepping-over-bp-other-thread.exp: Use gdb_test_sequence and gdb_assert. |
||
Pedro Alves
|
07473109e1 |
step-over-trips-on-watchpoint.exp: Don't put addresses in test messages
Diffing test results, I noticed: -PASS: gdb.threads/step-over-trips-on-watchpoint.exp: displaced=on: with thread-specific bp: next: b *0x0000000000400811 thread 1 +PASS: gdb.threads/step-over-trips-on-watchpoint.exp: displaced=on: with thread-specific bp: next: b *0x00000000004007d1 thread 1 gdb/testsuite/ChangeLog: 2015-04-10 Pedro Alves <palves@redhat.com> * gdb.threads/step-over-trips-on-watchpoint.exp (do_test): Use test messages that don't include the breakpoint address. |
||
Pedro Alves
|
c79d856c88 |
Test step-over-{lands-on-breakpoint|trips-on-watchpoint}.exp with displaced stepping
These tests exercise the infrun.c:proceed code that needs to know to start new step overs (along with switch_back_to_stepped_thread, etc.). That code is tricky to get right in the multitude of possible combinations (at least): (native | remote) X (all-stop | all-stop-but-target-always-in-non-stop) X (displaced-stepping | in-line step-over). The first two above are properties of the target, but the different step-over-breakpoint methods should work with any target that supports them. This patch makes sure we always test both methods on all targets. Tested on x86-64 Fedora 20. gdb/testsuite/ChangeLog: 2015-04-10 Pedro Alves <palves@redhat.com> * gdb.threads/step-over-lands-on-breakpoint.exp (do_test): New procedure, factored out from ... (top level): ... here. Add "set displaced-stepping" testing axis. * gdb.threads/step-over-trips-on-watchpoint.exp (do_test): New parameter "displaced". Use it. (top level): Use foreach and add "set displaced-stepping" testing axis. |
||
Pedro Alves
|
ebc90b50ce |
Make gdb.threads/step-over-trips-on-watchpoint.exp effective on !x86
This test is currently failing like this on (at least) PPC64 and s390x: FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: step FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: next: next FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: with thread-specific bp: step: step FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: with thread-specific bp: next: next gdb.log: (gdb) PASS: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: set scheduler-locking off step wait_threads () at ../../../src/gdb/testsuite/gdb.threads/step-over-trips-on-watchpoint.c:49 49 return 1; /* in wait_threads */ (gdb) FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: step The problem is that the test assumes that both the "watch_me = 1;" and the "other = 1;" lines compile to a single instruction each, which happens to be true on x86, but no necessarily true everywhere else. The result is that the test doesn't really test what it wants to test. Fix it by looking for the instruction that triggers the watchpoint. gdb/ChangeLog: 2015-04-10 Pedro Alves <palves@redhat.com> * gdb.threads/step-over-trips-on-watchpoint.c (child_function): Remove comment. * gdb.threads/step-over-trips-on-watchpoint.exp (do_test): Find both the address of the instruction that triggers the watchpoint and the address of the instruction immediately after, and use those addresses for the test. Fix comment. |
||
Pedro Alves
|
8d707a12ef |
gdb/18216: displaced step+deliver signal, a thread needs step-over, crash
The problem is that with hardware step targets and displaced stepping, "signal FOO" when stopped at a breakpoint steps the breakpoint instruction at the same time it delivers a signal. This results in tp->stepped_breakpoint set, but no step-resume breakpoint set. When the next stop event arrives, GDB crashes. Irrespective of whether we should do something more/different to step past the breakpoint in this scenario (e.g., PR 18225), it's just wrong to assume there'll be a step-resume breakpoint set (and was not the original intention). gdb/ChangeLog: 2015-04-10 Pedro Alves <palves@redhat.com> PR gdb/18216 * infrun.c (process_event_stop_test): Don't assume a step-resume is set if tp->stepped_breakpoint is true. gdb/testsuite/ChangeLog: 2015-04-10 Pedro Alves <palves@redhat.com> PR gdb/18216 * gdb.threads/multiple-step-overs.exp: Remove expected eof. |
||
Pedro Alves
|
f3770638ca |
Add test for PR18214 and PR18216 - multiple step-overs with queued signals
Both PRs are triggered by the same use case. PR18214 is about software single-step targets. On those, the 'resume' code that detects that we're stepping over a breakpoint and delivering a signal at the same time: /* Currently, our software single-step implementation leads to different results than hardware single-stepping in one situation: when stepping into delivering a signal which has an associated signal handler, hardware single-step will stop at the first instruction of the handler, while software single-step will simply skip execution of the handler. ... Fortunately, we can at least fix this particular issue. We detect here the case where we are about to deliver a signal while software single-stepping with breakpoints removed. In this situation, we revert the decisions to remove all breakpoints and insert single- step breakpoints, and instead we install a step-resume breakpoint at the current address, deliver the signal without stepping, and once we arrive back at the step-resume breakpoint, actually step over the breakpoint we originally wanted to step over. */ doesn't handle the case of _another_ thread also needing to step over a breakpoint. Because the other thread is just resumed at the PC where it had stopped and a breakpoint is still inserted there, the thread immediately re-traps the same breakpoint. This test exercises that. On software single-step targets, it fails like this: KFAIL: gdb.threads/multiple-step-overs.exp: displaced=off: signal thr3: continue to sigusr1_handler KFAIL: gdb.threads/multiple-step-overs.exp: displaced=off: signal thr2: continue to sigusr1_handler gdb.log (simplified): (gdb) continue Continuing. Breakpoint 4, child_function_2 (arg=0x0) at src/gdb/testsuite/gdb.threads/multiple-step-overs.c:66 66 callme (); /* set breakpoint thread 2 here */ (gdb) thread 3 (gdb) queue-signal SIGUSR1 (gdb) thread 1 [Switching to thread 1 (Thread 0x7ffff7fc1740 (LWP 24824))] #0 main () at src/gdb/testsuite/gdb.threads/multiple-step-overs.c:106 106 wait_threads (); /* set wait-threads breakpoint here */ (gdb) break sigusr1_handler Breakpoint 5 at 0x400837: file src/gdb/testsuite/gdb.threads/multiple-step-overs.c, line 31. (gdb) continue Continuing. [Switching to Thread 0x7ffff7fc0700 (LWP 24828)] Breakpoint 4, child_function_2 (arg=0x0) at src/gdb/testsuite/gdb.threads/multiple-step-overs.c:66 66 callme (); /* set breakpoint thread 2 here */ (gdb) KFAIL: gdb.threads/multiple-step-overs.exp: displaced=off: signal thr3: continue to sigusr1_handler For good measure, I made the test try displaced stepping too. And then I found it crashes GDB on x86-64 (a hardware step target), but only when displaced stepping... : KFAIL: gdb.threads/multiple-step-overs.exp: displaced=on: signal thr1: continue to sigusr1_handler (PRMS: gdb/18216) KFAIL: gdb.threads/multiple-step-overs.exp: displaced=on: signal thr2: continue to sigusr1_handler (PRMS: gdb/18216) KFAIL: gdb.threads/multiple-step-overs.exp: displaced=on: signal thr3: continue to sigusr1_handler (PRMS: gdb/18216) Program terminated with signal SIGSEGV, Segmentation fault. #0 0x000000000062a83a in process_event_stop_test (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4964 4964 if (sr_bp->loc->permanent Setting up the environment for debugging gdb. Breakpoint 1 at 0x79fcfc: file src/gdb/common/errors.c, line 54. Breakpoint 2 at 0x50a26c: file src/gdb/cli/cli-cmds.c, line 217. (top-gdb) p sr_bp $1 = (struct breakpoint *) 0x0 (top-gdb) bt #0 0x000000000062a83a in process_event_stop_test (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4964 #1 0x000000000062a1af in handle_signal_stop (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4715 #2 0x0000000000629097 in handle_inferior_event (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4165 #3 0x0000000000627482 in fetch_inferior_event (client_data=0x0) at src/gdb/infrun.c:3298 #4 0x000000000064ad7b in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at src/gdb/inf-loop.c:56 #5 0x00000000004c375f in handle_target_event (error=0, client_data=0x0) at src/gdb/linux-nat.c:4658 #6 0x0000000000648c47 in handle_file_event (file_ptr=0x2e0eaa0, ready_mask=1) at src/gdb/event-loop.c:658 The all-stop-non-stop series fixes this, but meanwhile, this augments the multiple-step-overs.exp test to cover this, KFAILed. gdb/testsuite/ChangeLog: 2015-04-08 Pedro Alves <palves@redhat.com> PR gdb/18214 PR gdb/18216 * gdb.threads/multiple-step-overs.c (sigusr1_handler): New function. (main): Install it as SIGUSR1 handler. * gdb.threads/multiple-step-overs.exp (setup): Remove 'prefix' parameter. Always use "setup" as prefix. Toggle "set displaced-stepping" off/on depending on global. Don't switch to thread 1 here. (top level): Add displaced stepping "off/on" test axis. Update "setup" calls. Wrap each subtest with with_test_prefix. Test continuing with a queued signal in each thread. |
||
Yao Qi
|
337532fab1 |
Properly set alarm value in gdb.threads/non-stop-fair-events.exp
Nowadays, the alarm value is 60, and alarm is generated on some slow boards. This patch is to pass DejaGNU timeout value to the program, and move the alarm call before going to infinite loop. If any thread has activities, the alarm is reset. gdb/testsuite: 2015-04-07 Yao Qi <yao.qi@linaro.org> * gdb.threads/non-stop-fair-events.c (SECONDS): New macro. (child_function): Call alarm. (main): Move call to alarm into the loop. * gdb.threads/non-stop-fair-events.exp: Build program with -DTIMEOUT=$timeout. |
||
Yao Qi
|
cafda5977a |
kfail two tests in no-unwaited-for-left.exp for remote target
I see these two fails in no-unwaited-for-left.exp in remote testing for aarch64-linux target. ... continue Continuing. warning: Remote failure reply: E.No unwaited-for children left. [Thread 1084] #2 stopped. (gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits .... continue Continuing. warning: Remote failure reply: E.No unwaited-for children left. [Thread 1081] #1 stopped. (gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits I checked the gdb.log on buildbot, and find that these two fails also appear on Debian-i686-native-extended-gdbserver and Fedora-ppc64be-native-gdbserver-m64. I recall that they are about local/remote parity, and related RSP is missing. There has been already a PR 14618 about it. This patch is to kfail them on remote target. gdb/testsuite: 2015-04-02 Yao Qi <yao.qi@linaro.org> * gdb.threads/no-unwaited-for-left.exp: Set up kfail if target is remote. |
||
Pedro Alves
|
a14711808e |
gdb.threads/manythreads.exp: can't read "test": no such variable
If interrupt_and_wait manages to trigger the FAIL path, we get: ERROR OCCURED: can't read "test": no such variable gdb/testsuite/ChangeLog: 2015-04-01 Pedro Alves <palves@redhat.com> * gdb.threads/manythreads.exp (interrupt_and_wait): Pass $message to fail instead of non-existent $test. |
||
Pedro Alves
|
4eec2deb06 |
Crash on thread id wrap around
On GNU/Linux, if the target reuses the TID of a thread that GDB still has in its list marked as THREAD_EXITED, GDB crashes, like: (gdb) continue Continuing. src/gdb/thread.c:789: internal-error: set_running: Assertion `tp->state != THREAD_EXITED' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) FAIL: gdb.threads/tid-reuse.exp: continue to breakpoint: after_reuse_time (GDB internal error) Here: (top-gdb) bt #0 internal_error (file=0x953dd8 "src/gdb/thread.c", line=789, fmt=0x953da0 "%s: Assertion `%s' failed.") at src/gdb/common/errors.c:54 #1 0x0000000000638514 in set_running (ptid=..., running=1) at src/gdb/thread.c:789 #2 0x00000000004bda42 in linux_handle_extended_wait (lp=0x16f5760, status=0, stopping=0) at src/gdb/linux-nat.c:2114 #3 0x00000000004bfa24 in linux_nat_filter_event (lwpid=20570, status=198015) at src/gdb/linux-nat.c:3127 #4 0x00000000004c070e in linux_nat_wait_1 (ops=0xe193d0, ptid=..., ourstatus=0x7fffffffd2c0, target_options=1) at src/gdb/linux-nat.c:3478 #5 0x00000000004c1015 in linux_nat_wait (ops=0xe193d0, ptid=..., ourstatus=0x7fffffffd2c0, target_options=1) at src/gdb/linux-nat.c:3722 #6 0x00000000004c92d2 in thread_db_wait (ops=0xd80b60 <thread_db_ops>, ptid=..., ourstatus=0x7fffffffd2c0, options=1) at src/gdb/linux-thread-db.c:1525 #7 0x000000000066db43 in delegate_wait (self=0xd80b60 <thread_db_ops>, arg1=..., arg2=0x7fffffffd2c0, arg3=1) at src/gdb/target-delegates.c:116 #8 0x000000000067e54b in target_wait (ptid=..., status=0x7fffffffd2c0, options=1) at src/gdb/target.c:2206 #9 0x0000000000625111 in fetch_inferior_event (client_data=0x0) at src/gdb/infrun.c:3275 #10 0x0000000000648a3b in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at src/gdb/inf-loop.c:56 #11 0x00000000004c2ecb in handle_target_event (error=0, client_data=0x0) at src/gdb/linux-nat.c:4655 I managed to come up with a test that reliably reproduces this. It spawns enough threads for the pid number space to wrap around, so could potentially take a while. On my box that's 4 seconds; on gcc110, a PPC box which has max_pid set to 65536, it's over 10 seconds. So I made the test compute how long that would take, and cap the time waited if it would be unreasonably long. Tested on x86_64 Fedora 20. gdb/ChangeLog: 2015-04-01 Pedro Alves <palves@redhat.com> * linux-thread-db.c (record_thread): Readd the thread to gdb's list if it was marked exited. gdb/testsuite/ChangeLog: 2015-04-01 Pedro Alves <palves@redhat.com> * gdb.threads/tid-reuse.c: New file. * gdb.threads/tid-reuse.exp: New file. |
||
Pedro Alves
|
a25d8bf9c5 |
Fix "thread apply all" with exited threads
I noticed that "thread apply all" sometimes crashes. The problem is that thread_apply_all_command doesn take exited threads into account, and we qsort and then walk more elements than there really ever were put in the array. Valgrind shows: The current thread <Thread ID 3> has terminated. See `help thread'. (gdb) thread apply all p 1 Thread 1 (Thread 0x7ffff7fc2740 (LWP 29579)): $1 = 1 ==29576== Use of uninitialised value of size 8 ==29576== at 0x639CA8: set_thread_refcount (thread.c:1337) ==29576== by 0x5C2C7B: do_my_cleanups (cleanups.c:155) ==29576== by 0x5C2CE8: do_cleanups (cleanups.c:177) ==29576== by 0x63A191: thread_apply_all_command (thread.c:1477) ==29576== by 0x50374D: do_cfunc (cli-decode.c:105) ==29576== by 0x506865: cmd_func (cli-decode.c:1893) ==29576== by 0x7562CB: execute_command (top.c:476) ==29576== by 0x647DA4: command_handler (event-top.c:494) ==29576== by 0x648367: command_line_handler (event-top.c:692) ==29576== by 0x7BF7C9: rl_callback_read_char (callback.c:220) ==29576== by 0x64784C: rl_callback_read_char_wrapper (event-top.c:171) ==29576== by 0x647CB5: stdin_event_handler (event-top.c:432) ==29576== ... This can happen easily today as linux-nat.c/linux-thread-db.c are forgetting to purge non-current exited threads. But even with that fixed, we can always do "thread apply all" with an exited thread selected, which won't be deleted until the user switches to another thread. That's what the test added by this commit exercises. Tested on x86_64 Fedora 20. gdb/ChangeLog: 2015-03-24 Pedro Alves <palves@redhat.com> * thread.c (thread_apply_all_command): Take exited threads into account. gdb/testsuite/ChangeLog: 2015-03-24 Pedro Alves <palves@redhat.com> * gdb.threads/no-unwaited-for-left.exp: Test "thread apply all". |
||
Pedro Alves
|
856e7dd698 |
Make "set scheduler-locking step" depend on user intention, only
Currently, "set scheduler-locking step" is a bit odd. The manual documents it as being optimized for stepping, so that focus of debugging does not change unexpectedly, but then it says that sometimes other threads may run, and thus focus may indeed change unexpectedly... A user can then be excused to get confused and wonder why does GDB behave like this. I don't think a user should have to know about details of how "next" or whatever other run control command is implemented internally to understand when does the "scheduler-locking step" setting take effect. This patch completes a transition that the code has been moving towards for a while. It makes "set scheduler-locking step" hold threads depending on whether the _command_ the user entered was a stepping command [step/stepi/next/nexti], or not. Before, GDB could end up locking threads even on "continue" if for some reason run control decides a thread needs to be single stepped (e.g., for a software watchpoint). After, if a "continue" happens to need to single-step for some reason, we won't lock threads (unless when stepping over a breakpoint, naturally). And if a stepping command wants to continue a thread for bit, like when skipping a function to a step-resume breakpoint, we'll still lock threads, so focus of debugging doesn't change. In order to make this work, we need to record in the thread structure whether what set it running was a stepping command. (A follow up patch will remove the "step" parameters of 'proceed' and 'resume') FWIW, Fedora GDB, which defaults to "scheduler-locking step" (mainline defaults to "off") carries a different patch that goes in this direction as well. Tested on x86_64 Fedora 20, native and gdbserver. gdb/ChangeLog: 2015-03-24 Pedro Alves <palves@redhat.com> * gdbthread.h (struct thread_control_state) <stepping_command>: New field. * infcmd.c (step_once): Pass step=1 to clear_proceed_status. Set the thread's stepping_command field. * infrun.c (resume): Check the thread's stepping_command flag to determine which threads should be resumed. Rename 'entry_step' local to user_step. (clear_proceed_status_thread): Clear 'stepping_command'. (schedlock_applies): Change parameter type to struct thread_info pointer. Adjust. (find_thread_needs_step_over): Remove 'step' parameter. Adjust. (switch_back_to_stepped_thread): Adjust calls to 'schedlock_applies'. (_initialize_infrun): Adjust "set scheduler-locking step" help. gdb/testsuite/ChangeLog: 2015-03-24 Pedro Alves <palves@redhat.com> * gdb.threads/schedlock.exp (test_step): No longer expect that "set scheduler-locking step" with "next" over a function call runs threads unlocked. gdb/doc/ChangeLog: 2015-03-24 Pedro Alves <palves@redhat.com> * gdb.texinfo (test_step) <set scheduler-locking step>: No longer mention that threads may sometimes run unlocked. |
||
Pedro Alves
|
8bf3b159e5 |
gdbserver/Linux: unbreak thread event randomization
Wanting to make sure the new continue-pending-status.exp test tests
both cases of threads 2 and 3 reporting an event, I added counters to
the test, to make it FAIL if events for both threads aren't seen.
Assuming a well behaved backend, and given a reasonable number of
iterations, it should PASS.
However, running that against GNU/Linux gdbserver, I found that
surprisingly, that FAILed. GDBserver always reported the breakpoint
hit for the same thread.
Turns out that I broke gdbserver's thread event randomization
recently, with git commit
|
||
Pedro Alves
|
eb54c8bf08 |
native/Linux: internal error if resume is short-circuited
If the linux_nat_resume's short-circuits the resume because the current thread has a pending status, and, a thread with a higher number was previously stopped for a breakpoint, GDB internal errors, like: /home/pedro/gdb/mygit/src/gdb/linux-nat.c:2590: internal-error: status_callback: Assertion `lp->status != 0' failed. Fix this by make status_callback bail out earlier. GDBserver is already doing the same. New test added that exercises this. gdb/ChangeLog: 2015-03-19 Pedro Alves <palves@redhat.com> * linux-nat.c (status_callback): Return early if the LWP has no status pending. gdb/testsuite/ChangeLog: 2015-03-19 Pedro Alves <palves@redhat.com> * gdb.threads/continue-pending-status.c: New file. * gdb.threads/continue-pending-status.exp: New file. |
||
Pedro Alves
|
be9957b82f |
Fix gdb.threads/thread-specific-bp.exp race
Gary stumbled on this: (gdb) PASS: gdb.threads/thread-specific-bp.exp: all-stop: continue to end info threads Id Target Id Frame * 1 Thread 0x7ffff7fdb700 (LWP 13717) "thread-specific" end () at /home/gary/work/archer/startswith/src/gdb/testsuite/gdb.threads/thread-specific-bp.c:29 (gdb) FAIL: gdb.threads/thread-specific-bp.exp: all-stop: thread start is gone info breakpoint The problem is that "...archer/startswith/src..." has a "start" in it, which matches the too-lax regex in the test. Rather than tweaking the regex, we can just remove the whole "info threads", like we removed similar ones in other files -- GDB nowadays does this implicitly already, so things should work without it. Thus removing this even improves testing here a bit. gdb/testsuite/ChangeLog: 2015-03-04 Pedro Alves <palves@redhat.com> * gdb.threads/thread-specific-bp.exp: Delete "info threads" test. |
||
Pedro Alves
|
511aee7c39 |
gdb.threads/clone-thread_db.c: Add missing includes and fix pthread_join call
This fixes: > gdb compile failed, /gdb/testsuite/gdb.threads/clone-thread_db.c: In function 'main': > /gdb/testsuite/gdb.threads/clone-thread_db.c:67:3: warning: implicit declaration of function 'alarm' [-Wimplicit-function-declaration] > alarm (300); > ^ > /gdb/testsuite/gdb.threads/clone-thread_db.c:69:3: warning: implicit declaration of function 'pthread_create' [-Wimplicit-function-declaration] > pthread_create (&child, NULL, thread_fn, NULL); > ^ > /gdb/testsuite/gdb.threads/clone-thread_db.c:70:3: warning: implicit declaration of function 'pthread_join' [-Wimplicit-function-declaration] > pthread_join (child); > ^ And then adding the missing headers revealed the pthread_join call was incorrect. This probably fixes the crash we see on ppc64be, e.g., at https://sourceware.org/ml/gdb-testers/2015-q1/msg04415.html the logs there show: ... Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x3fffb7ff54a0 (LWP 9275)] 0x00003fffb7f3ce74 in .pthread_join () from /lib64/libpthread.so.0 (gdb) FAIL: gdb.threads/clone-thread_db.exp: continue to end ... Tested on x86_64 Fedora 20. gdb/testsuite/ 2015-03-04 Pedro Alves <palves@redhat.com> * gdb.threads/clone-thread_db.c: Include unistd.h and pthread.h. (main): Pass missing retval argument to pthread_join call. |
||
Pedro Alves
|
95e50b2723 |
follow-exec: delete all non-execing threads
This fixes invalid reads Valgrind first caught when debugging against a GDBserver patched with a series that adds exec events to the remote protocol. Like these, using the gdb.threads/thread-execl.exp test: $ valgrind ./gdb -data-directory=data-directory ./testsuite/gdb.threads/thread-execl -ex "tar extended-remote :9999" -ex "b thread_execler" -ex "c" -ex "set scheduler-locking on" ... Breakpoint 1, thread_execler (arg=0x0) at src/gdb/testsuite/gdb.threads/thread-execl.c:29 29 if (execl (image, image, NULL) == -1) (gdb) n Thread 32509.32509 is executing new program: build/gdb/testsuite/gdb.threads/thread-execl [New Thread 32509.32532] ==32510== Invalid read of size 4 ==32510== at 0x5AA7D8: delete_breakpoint (breakpoint.c:13989) ==32510== by 0x6285D3: delete_thread_breakpoint (thread.c:100) ==32510== by 0x628603: delete_step_resume_breakpoint (thread.c:109) ==32510== by 0x61622B: delete_thread_infrun_breakpoints (infrun.c:2928) ==32510== by 0x6162EF: for_each_just_stopped_thread (infrun.c:2958) ==32510== by 0x616311: delete_just_stopped_threads_infrun_breakpoints (infrun.c:2969) ==32510== by 0x616C96: fetch_inferior_event (infrun.c:3267) ==32510== by 0x63A2DE: inferior_event_handler (inf-loop.c:57) ==32510== by 0x4E0E56: remote_async_serial_handler (remote.c:11877) ==32510== by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137) ==32510== by 0x4AF6F0: fd_event (ser-base.c:182) ==32510== by 0x63806D: handle_file_event (event-loop.c:762) ==32510== Address 0xcf333e0 is 16 bytes inside a block of size 200 free'd ==32510== at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==32510== by 0x77CB74: xfree (common-utils.c:98) ==32510== by 0x5AA954: delete_breakpoint (breakpoint.c:14056) ==32510== by 0x5988BD: update_breakpoints_after_exec (breakpoint.c:3765) ==32510== by 0x61360F: follow_exec (infrun.c:1091) ==32510== by 0x6186FA: handle_inferior_event (infrun.c:4061) ==32510== by 0x616C55: fetch_inferior_event (infrun.c:3261) ==32510== by 0x63A2DE: inferior_event_handler (inf-loop.c:57) ==32510== by 0x4E0E56: remote_async_serial_handler (remote.c:11877) ==32510== by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137) ==32510== by 0x4AF6F0: fd_event (ser-base.c:182) ==32510== by 0x63806D: handle_file_event (event-loop.c:762) ==32510== [Switching to Thread 32509.32532] Breakpoint 1, thread_execler (arg=0x0) at src/gdb/testsuite/gdb.threads/thread-execl.c:29 29 if (execl (image, image, NULL) == -1) (gdb) The breakpoint in question is the step-resume breakpoint of the non-main thread, the one that was "next"ed. The exact same issue can be seen on mainline with native debugging, by running the thread-execl.exp test in non-stop mode, because the kernel doesn't report a thread exit event for the execing thread. Tested on x86_64 Fedora 20. gdb/ChangeLog: 2015-03-02 Pedro Alves <palves@redhat.com> * infrun.c (follow_exec): Delete all threads of the process except the event thread. Extended comments. gdb/testsuite/ChangeLog: 2015-03-02 Pedro Alves <palves@redhat.com> * gdb.threads/thread-execl.exp (do_test): Handle non-stop. (top level): Call do_test with non-stop as well. |
||
Pedro Alves
|
a47cd6e95a |
gdb.threads/multi-create-ns-info-thr.exp and native-extended-remote board
The buildbot shows that the new gdb.threads/multi-create-ns-info-thr.exp test is timing out when tested with --target=native-extended-remote. The reason is: No breakpoints or watchpoints. (gdb) break main Breakpoint 1 at 0x10000b00: file ../../../binutils-gdb/gdb/testsuite/gdb.threads/multi-create.c, line 72. (gdb) run Starting program: /home/gdb-buildbot/fedora-21-ppc64be-1/fedora-ppc64be-native-extended-gdbserver/build/gdb/testsuite/outputs/gdb.threads/multi-create-ns-info-thr/multi-cre ate-ns-info-thr Process /home/gdb-buildbot/fedora-21-ppc64be-1/fedora-ppc64be-native-extended-gdbserver/build/gdb/testsuite/outputs/gdb.threads/multi-create-ns-info-thr/multi-create-ns-inf o-thr created; pid = 16266 Unexpected vCont reply in non-stop mode: T0501:00003fffffffd190;40:00000080560fe290;thread:p3f8a.3f8a;core:0; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (gdb) break multi-create.c:45 Breakpoint 2 at 0x10000994: file ../../../binutils-gdb/gdb/testsuite/gdb.threads/multi-create.c, line 45. (gdb) commands Type commands for breakpoint(s) 2, one per line. Non-stop tests don't really work with the --target_board=native-extended-remote board, because tests toggle non-stop on after GDB is already connected to gdbserver, while Currently, non-stop must be enabled before connecting. This adjusts the test to bail if running to main fails, like all other non-stop tests. Note non-stop tests do work with --target_board=native-gdbserver. gdb/testsuite/ChangeLog: 2015-02-21 Pedro Alves <palves@redhat.com> * gdb.threads/multi-create-ns-info-thr.exp: Return early if runto_main fails. |
||
Pedro Alves
|
2db9a4275c |
GNU/Linux: Stop using libthread_db/td_ta_thr_iter
TL;DR - GDB can hang if something refreshes the thread list out of the target while the target is running. GDB hangs inside td_ta_thr_iter. The fix is to not use that libthread_db function anymore. Long version: Running the testsuite against my all-stop-on-top-of-non-stop series is still exposing latent non-stop bugs. I was originally seeing this with the multi-create.exp test, back when we were still using libthread_db thread event breakpoints. The all-stop-on-top-of-non-stop series forces a thread list refresh each time GDB needs to start stepping over a breakpoint (to pause all threads). That test hits the thread event breakpoint often, resulting in a bunch of step-over operations, thus a bunch of thread list refreshes while some threads in the target are running. The commit adds a real non-stop mode test that triggers the issue, based on multi-create.exp, that does an explicit "info threads" when a breakpoint is hit. IOW, it does the same things the as-ns series was doing when testing multi-create.exp. The bug is a race, so it unfortunately takes several runs for the test to trigger it. In fact, even when setting the test running in a loop, it sometimes takes several minutes for it to trigger for me. The race is related to libthread_db's td_ta_thr_iter. This is libthread_db's entry point for walking the thread list of the inferior. Sometimes, when GDB refreshes the thread list from the target, libthread_db's td_ta_thr_iter can somehow see glibc's thread list as a cycle, and get stuck in an infinite loop. The issue is that when a thread exits, its thread control structure in glibc is moved from a "used" list to a "cache" list. These lists are simply circular linked lists where the "next/prev" pointers are embedded in the thread control structure itself. The "next" pointer of the last element of the list points back to the list's sentinel "head". There's only one set of "next/prev" pointers for both lists; thus a thread can only be in one of the lists at a time, not in both simultaneously. So when thread C exits, simplifying, the following happens. A-C are threads. stack_used and stack_cache are the list's heads. Before: stack_used -> A -> B -> C -> (&stack_used) stack_cache -> (&stack_cache) After: stack_used -> A -> B -> (&stack_used) stack_cache -> C -> (&stack_cache) td_ta_thr_iter starts by iterating at the list's head's next, and iterates until it sees a thread whose next pointer points to the list's head again. Thus in the before case above, C's next points to stack_used, indicating end of list. In the same case, the stack_cache list is empty. For each thread being iterated, td_ta_thr_iter reads the whole thread object out of the inferior. This includes the thread's "next" pointer. In the scenario above, it may happen that td_ta_thr_iter is iterating thread B and has already read B's thread structure just before thread C exits and its control structure moves to the cached list. Now, recall that td_ta_thr_iter is running in the context of GDB, and there's no locking between GDB and the inferior. From it's local copy of B, td_ta_thr_iter believes that the next thread after B is thread C, so it happilly continues iterating to C, a thread that has already exited, and is now in the stack cache list. After iterating C, td_ta_thr_iter finds the stack_cache head, which because it is not stack_used, td_ta_thr_iter assumes it's just another thread. After this, unless the reverse race triggers, GDB gets stuck in td_ta_thr_iter forever walking the stack_cache list, as no thread in thatlist has a next pointer that points back to stack_used (the terminating condition). Before fully understanding the issue, I tried adding cycle detection to GDB's td_ta_thr_iter callback. However, td_ta_thr_iter skips calling the callback in some cases, which means that it's possible that the callback isn't called at all, making it impossible for GDB to break the loop. I did manage to get GDB stuck in that state more than once. Fortunately, we can avoid the issue altogether. We don't really need td_ta_thr_iter for live debugging nowadays, given PTRACE_EVENT_CLONE. We already know how to map and lwp id to a thread id without iterating (thread_from_lwp), so use that more. gdb/ChangeLog: 2015-02-20 Pedro Alves <palves@redhat.com> * linux-nat.c (linux_handle_extended_wait): Call thread_db_notice_clone whenever a new clone LWP is detected. (linux_stop_and_wait_all_lwps, linux_unstop_all_lwps): New functions. * linux-nat.h (thread_db_attach_lwp): Delete declaration. (thread_db_notice_clone, linux_stop_and_wait_all_lwps) (linux_unstop_all_lwps): Declare. * linux-thread-db.c (struct thread_get_info_inout): Delete. (thread_get_info_callback): Delete. (thread_from_lwp): Use td_thr_get_info and record_thread. (thread_db_attach_lwp): Delete. (thread_db_notice_clone): New function. (try_thread_db_load_1): If /proc is mounted and shows the process'es task list, walk over all LWPs and call thread_from_lwp instead of relying on td_ta_thr_iter. (attach_thread): Don't call check_thread_signals here. Split the tail part of the function (which adds the thread to the core GDB thread list) to ... (record_thread): ... this function. Call check_thread_signals here. (thread_db_wait): Don't call thread_db_find_new_threads_1. Always call thread_from_lwp. (thread_db_update_thread_list): Rename to ... (thread_db_update_thread_list_org): ... this. (thread_db_update_thread_list): New function. (thread_db_find_thread_from_tid): Delete. (thread_db_get_ada_task_ptid): Simplify. * nat/linux-procfs.c: Include <sys/stat.h>. (linux_proc_task_list_dir_exists): New function. * nat/linux-procfs.h (linux_proc_task_list_dir_exists): Declare. gdb/gdbserver/ChangeLog: 2015-02-20 Pedro Alves <palves@redhat.com> * thread-db.c: Include "nat/linux-procfs.h". (thread_db_init): Skip listing new threads if the kernel supports PTRACE_EVENT_CLONE and /proc/PID/task/ is accessible. gdb/testsuite/ChangeLog: 2015-02-20 Pedro Alves <palves@redhat.com> * gdb.threads/multi-create-ns-info-thr.exp: New file. |
||
Pedro Alves
|
5c5019c27c |
PR18006: internal error if threaded program calls clone(CLONE_VM)
On GNU/Linux, if a pthreaded program has a thread call clone(CLONE_VM) directly, and then that clone LWP hits a debug event (breakpoint, etc.) GDB internal errors. Threaded programs shouldn't really be calling clone directly, but GDB shouldn't crash either. The crash looks like this: (gdb) break clone_fn Breakpoint 2 at 0x4007d8: file clone-thread_db.c, line 35. (gdb) r ... [Thread debugging using libthread_db enabled] ... src/gdb/linux-nat.c:1030: internal-error: lin_lwp_attach_lwp: Assertion `lwpid > 0' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. The problem is that 'clone' ends up clearing the parent thread's tid field in glibc's thread data structure. For x86_64, the glibc code in question is here: sysdeps/unix/sysv/linux/x86_64/clone.S: ... testq $CLONE_THREAD, %rdi jne 1f testq $CLONE_VM, %rdi movl $-1, %eax <---- jne 2f movl $SYS_ify(getpid), %eax syscall 2: movl %eax, %fs:PID movl %eax, %fs:TID <---- 1: When GDB refreshes the thread list out of libthread_db, it finds a thread with LWP with pid -1 (the clone's parent), which naturally isn't yet on the thread list. GDB then tries to attach to that bogus LWP id, which is caught by that assertion. The fix is to detect the bad PID early. Tested on x86-64 Fedora 20. GDBserver doesn't need any fix. gdb/ChangeLog: 2015-02-20 Pedro Alves <palves@redhat.com> PR threads/18006 * linux-thread-db.c (thread_get_info_callback): Return early if the thread's lwp id is -1. gdb/testsuite/ChangeLog: 2015-02-20 Pedro Alves <palves@redhat.com> PR threads/18006 * gdb.threads/clone-thread_db.c: New file. * gdb.threads/clone-thread_db.exp: New file. |
||
Pedro Alves
|
0703599a49 |
Fix adjust_pc_after_break, remove still current thread check
On decr_pc_after_break targets, GDB adjusts the PC incorrectly if a background single-step stops somewhere where PC-$decr_pc has a breakpoint, and the thread that finishes the step is not the current thread, like: ADDR1 nop <-- breakpoint here ADDR2 jmp PC IOW, say thread A is stepping ADDR2's line in the background (an infinite loop), and the user switches focus to thread B. GDB's adjust_pc_after_break logic confuses the single-step stop of thread A for a hit of the breakpoint at ADDR1, and thus adjusts thread A's PC to point at ADDR1 when it should not, and reports a breakpoint hit, when thread A did not execute the instruction at ADDR1 at all. The test added by this patch exercises exactly that. I can't find any reason we'd need the "thread to be examined is still the current thread" condition in adjust_pc_after_break, at least nowadays; it might have made sense in the past. Best just remove it, and rely on currently_stepping(). Here's the test's log of a run with an unpatched GDB: 35 while (1); (gdb) PASS: gdb.threads/step-bg-decr-pc-switch-thread.exp: next over nop next& (gdb) PASS: gdb.threads/step-bg-decr-pc-switch-thread.exp: next& over inf loop thread 1 [Switching to thread 1 (Thread 0x7ffff7fc2740 (LWP 29027))](running) (gdb) PASS: gdb.threads/step-bg-decr-pc-switch-thread.exp: switch to main thread Breakpoint 2, thread_function (arg=0x0) at ...src/gdb/testsuite/gdb.threads/step-bg-decr-pc-switch-thread.c:34 34 NOP; /* set breakpoint here */ FAIL: gdb.threads/step-bg-decr-pc-switch-thread.exp: no output while stepping gdb/ChangeLog: 2015-02-11 Pedro Alves <pedro@codesourcery.com> * infrun.c (adjust_pc_after_break): Don't adjust the PC just because the event thread is not the current thread. gdb/testsuite/ChangeLog: 2015-02-11 Pedro Alves <pedro@codesourcery.com> * gdb.threads/step-bg-decr-pc-switch-thread.c: New file. * gdb.threads/step-bg-decr-pc-switch-thread.exp: New file. |
||
Pedro Alves
|
01b088bc51 |
Add "signal SIGTRAP" test
Some local changes I was working on related to SIGTRAP handling resulted in "signal SIGTRAP" no longer passing the SIGTRAP to the inferior. Surprisingly, only annota1.exp catches this. This commit adds a test that doesn't rely on annotations, so that at the point annotations are finaly dropped, we still have this use case covered ... This is a multi-threaded test to also exercise the case of first needing to do a step-over before delivering the signal. Tested on x86_64 Fedora 20, native, remote/extended-remote gdbserver. gdb/testsuite/ 2015-02-10 Pedro Alves <palves@redhat.com> * gdb.threads/signal-sigtrap.c: New file. * gdb.threads/signal-sigtrap.exp: New file. |
||
Pedro Alves
|
e584fdbc6a |
Improve gdb.threads/attach-many-short-lived-threads.exp timeout handling
The buildbot shows that this test is still racy, and occasionally fails with time outs on some machines. I'd like to get major issues with load out of the way. The test currently exits after 180s, which is just a random number, that has no relation to what the .exp file considers a time out. This commit makes the program wait a bit longer than what the .exp file considers a time out, and, resets the timer for each iteration. Tested on x86_64 Fedora 20, native and extended-remote gdbserver. gdb/testsuite/ 2015-02-06 Pedro Alves <palves@redhat.com> * gdb.threads/attach-many-short-lived-threads.c (SECONDS): New macro. (seconds_left, again): New globals. (main): Wait seconds_left in a 1-second sleep loop instead of sleeping 180 seconds. If 'again' is set, reset the seconds counter. * gdb.threads/attach-many-short-lived-threads.exp (test): Set 'again' in the inferior before detaching. Print the seconds left. (options): New global. (top level): Build program with -DTIMEOUT=$timeout. |
||
Mark Wielaard
|
37bc665e4e |
Remove testsuite compile errors with GCC5.
GCC5 defaults to the GNU11 standard for C and warns by default for implicit function declarations and implicit return types. https://gcc.gnu.org/gcc-5/porting_to.html Fixing these issues in the testsuite turns 9 untested and 17 unsupported testcases into 417 new passes when compiling with GCC5. gdb/testsuite/ChangeLog: * gdb.arch/i386-bp_permanent.c (standard): New declaration. * gdb.base/disp-step-fork.c: Include unistd.h. * gdb.base/siginfo-obj.c: Include stdio.h. * gdb.base/siginfo-thread.c: Likewise. * gdb.mi/non-stop.c: Include unistd.h. * gdb.mi/nsthrexec.c: Include stdio.h. * gdb.mi/pthreads.c: Include unistd.h. * gdb.modula2/unbounded1.c (main): Declare returns int. * gdb.reverse/consecutive-reverse.c: Likewise. * gdb.threads/create-fail.c: Include unistd.h. * gdb.threads/killed.c: Likewise. * gdb.threads/linux-dp.c: Likewise. * gdb.threads/non-ldr-exc-1.c: Include stdio.h and string.h. * gdb.threads/non-ldr-exc-2.c: Likewise. * gdb.threads/non-ldr-exc-3.c: Likewise. * gdb.threads/non-ldr-exc-4.c: Likewise. * gdb.threads/pthreads.c: Include unistd.h. (main): Declare returns int. * gdb.threads/tls-main.c (foo): New declaration. * gdb.threads/watchpoint-fork-mt.c: Define _GNU_SOURCE. |
||
Pedro Alves
|
198297aafb |
Linux: make target_is_async_p return false when async is off
linux_nat_is_async_p currently always returns true, even when the target is _not_ async. That confuses gdb_readline_wrapper/gdb_readline_wrapper_cleanup, which force-disables target-async while the secondary prompt is active. As a result, when gdb_readline_wrapper returns, the target is left async, even through it was sync to begin with. That can result in weird bugs, like the one the test added by this commit exposes. Ref: https://sourceware.org/ml/gdb-patches/2015-01/msg00592.html gdb/ChangeLog: 2015-01-23 Pedro Alves <palves@redhat.com> * linux-nat.c (linux_is_async_p): New macro. (linux_nat_is_async_p): (linux_nat_terminal_inferior): Check whether the target can async instead of whether it is already async. (linux_nat_terminal_ours): Don't check whether the target is async. (linux_async_pipe): Use linux_is_async_p. gdb/testsuite/ChangeLog: 2015-01-23 Pedro Alves <palves@redhat.com> * gdb.threads/continue-pending-after-query.c: New file. * gdb.threads/continue-pending-after-query.exp: New file. |
||
Pedro Alves
|
ede9f622af |
add non-stop test that stresses thread starvation issues
This commit adds a non-stop mode test originally inspired by signal-while-stepping-over-bp-other-thread.exp, that exposes the thread starvation issues fixed by the previous patches. It sets a set of threads stepping in parallel, and has one of them get a signal. Without the previous fixes, this would fail with timeouts. gdb/testsuite/ 2015-01-09 Pedro Alves <palves@redhat.com> * gdb.threads/non-stop-fair-events.c: New file. * gdb.threads/non-stop-fair-events.exp: New file. |
||
Pedro Alves
|
9665ffdd59 |
gdb.threads/{siginfo-thread.c,watchthreads-reorder.c,ia64-sigill.c} races with GDB
These three test all spawn a few threads and then send a SIGSTOP to their parent GDB in order to pause it while the new threads set things up for the test. With a GDB patch that changes the inferior thread's scheduling a bit, I sometimes see: FAIL: gdb.threads/siginfo-threads.exp: catch signal 0 (timeout) ... FAIL: gdb.threads/watchthreads-reorder.exp: reorder1: continue a (timeout) ... FAIL: gdb.threads/ia64-sigill.exp: continue (timeout) ... The issue is that the test program stops GDB before it had a chance of processing the new thread's clone event: (gdb) PASS: gdb.threads/siginfo-threads.exp: get pid continue Continuing. Stopping GDB PID 21541. Waiting till the threads initialize their TIDs. FAIL: gdb.threads/siginfo-threads.exp: catch signal 0 (timeout) On Linux (at least), new threads start stopped, and the debugger must resume them. The fix is to make the test program wait for the new threads to be running before stopping GDB. gdb/testsuite/ 2015-01-09 Pedro Alves <palves@redhat.com> * gdb.threads/ia64-sigill.c (threads_started_barrier): New global. (thread_func): Wait on barrier. (main): Wait for all threads to start before stopping GDB. * gdb.threads/siginfo-threads.c (threads_started_barrier): New global. (thread1_func, thread2_func): Wait on barrier. (main): Wait for all threads to start before stopping GDB. * gdb.threads/watchthreads-reorder.c (threads_started_barrier): New global. (thread1_func, thread2_func): Wait on barrier. (main): Wait for all threads to start before stopping GDB. |
||
Pedro Alves
|
c945a99f01 |
Test attaching to a program that constantly spawns short-lived threads
Before the previous fixes, on Linux, this would trigger several different problems, like: [New LWP 27106] [New LWP 27047] warning: unable to open /proc file '/proc/-1/status' [New LWP 27813] [New LWP 27869] warning: Can't attach LWP 11962: No child processes Warning: couldn't activate thread debugging using libthread_db: Cannot find new threads: debugger service failed warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available. gdb/testsuite/ 2015-01-09 Pedro Alves <palves@redhat.com> * gdb.threads/attach-many-short-lived-threads.c: New file. * gdb.threads/attach-many-short-lived-threads.exp: New file. |
||
Pedro Alves
|
c1a747c109 |
Linux: Skip thread_db thread event reporting if PTRACE_EVENT_CLONE is supported
[A test I wrote stumbled on a libthread_db issue related to thread event breakpoints. See glibc PR17705: [nptl_db: stale thread create/death events if debugger detaches] https://sourceware.org/bugzilla/show_bug.cgi?id=17705 This patch avoids that whole issue by making GDB stop using thread event breakpoints in the first place, which is good for other reasons as well, anyway.] Before PTRACE_EVENT_CLONE (Linux 2.6), the only way to learn about new threads in the inferior (to attach to them) or to learn about thread exit was to coordinate with the inferior's glibc/runtime, using libthread_db. That works by putting a breakpoint at a magic address which is called when a new thread is spawned, or when a thread is about to exit. When that breakpoint is hit, all threads are stopped, and then GDB coordinates with libthread_db to read data structures out of the inferior to learn about what happened. Then the breakpoint is single-stepped, and then all threads are re-resumed. This isn't very efficient (stops all threads) and is more fragile (inferior's thread list in memory may be corrupt; libthread_db bugs, etc.) than ideal. When the kernel supports PTRACE_EVENT_CLONE (which we already make use of), there's really no need to use libthread_db's event reporting mechanism to learn about new LWPs. And if the kernel supports that, then we learn about LWP exits through regular WIFEXITED wait statuses, so no need for the death event breakpoint either. GDBserver has been likewise skipping the thread_db events for a long while: https://sourceware.org/ml/gdb-patches/2007-10/msg00547.html There's one user-visible difference: we'll no longer print about threads being created and exiting while the program is running, like: [Thread 0x7ffff7dbb700 (LWP 30670) exited] [New Thread 0x7ffff7db3700 (LWP 30671)] [Thread 0x7ffff7dd3700 (LWP 30667) exited] [New Thread 0x7ffff7dab700 (LWP 30672)] [Thread 0x7ffff7db3700 (LWP 30671) exited] [Thread 0x7ffff7dcb700 (LWP 30668) exited] This is exactly the same behavior as when debugging against remote targets / gdbserver. I actually think that's a good thing (and as such have listed this in the local/remote parity wiki page a while ago), as the printing slows down the inferior. It's also a distraction to keep bothering the user about short-lived threads that she won't be able to interact with anyway. Instead, the user (and frontend) will be informed about new threads that currently exist in the program when the program next stops: (gdb) c ... * ctrl-c * [New Thread 0x7ffff7963700 (LWP 7797)] [New Thread 0x7ffff796b700 (LWP 7796)] Program received signal SIGINT, Interrupt. [Switching to Thread 0x7ffff796b700 (LWP 7796)] clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:81 81 testq %rax,%rax (gdb) info threads A couple of tests had assumptions on GDB thread numbers that no longer hold. Tested on x86_64 Fedora 20. gdb/ 2014-01-09 Pedro Alves <palves@redhat.com> Skip enabling event reporting if the kernel supports PTRACE_EVENT_CLONE. * linux-thread-db.c: Include "nat/linux-ptrace.h". (thread_db_use_events): New function. (try_thread_db_load_1): Check thread_db_use_events before enabling event reporting. (update_thread_state): New function. (attach_thread): Use it. Check thread_db_use_events before enabling event reporting. (thread_db_detach): Check thread_db_use_events before disabling event reporting. (find_new_threads_callback): Check thread_db_use_events before enabling event reporting. Update the thread's state if not using libthread_db events. gdb/testsuite/ 2014-01-09 Pedro Alves <palves@redhat.com> * gdb.threads/fork-thread-pending.exp: Switch to the main thread instead of to thread 2. * gdb.threads/signal-command-multiple-signals-pending.c (main): Add barrier around each pthread_create call instead of around all calls. * gdb.threads/signal-command-multiple-signals-pending.exp (test): Set a break on thread_function and have the child threads hit it one at at a time. |
||
Joel Brobecker
|
32d0add0a6 |
Update year range in copyright notice of all files owned by the GDB project.
gdb/ChangeLog: Update year range in copyright notice of all files. |
||
Pedro Alves
|
78708b7c8c |
GDBserver: ctrl-c after leader has exited
The target->request_interrupt callback implements the handling for ctrl-c. User types ctrl-c in GDB, GDB sends a \003 to the remote target, and the remote targets stops the program with a SIGINT, just like if the user typed ctrl-c in GDBserver's terminal. The trouble is that using kill_lwp(signal_pid, SIGINT) sends the SIGINT directly to the program's main thread. If that thread has exited already, then that kill won't do anything. Instead, send the SIGINT to the process group, just like GDB does (see inf-ptrace.c:inf_ptrace_stop). gdb.threads/leader-exit.exp is extended to cover the scenario. It fails against GDBserver before the patch. Tested on x86_64 Fedora 20, native and GDBserver. gdb/gdbserver/ 2014-11-12 Pedro Alves <palves@redhat.com> * linux-low.c (linux_request_interrupt): Always send a SIGINT to the process group instead of to a specific LWP. gdb/testsuite/ 2014-11-12 Pedro Alves <palves@redhat.com> * gdb.threads/leader-exit.exp: Test sending ctrl-c works after the leader has exited. |
||
Pedro Alves
|
354204061c |
PR 17408 - assertion failure in switch_back_to_stepped_thread
This PR shows that GDB can easily trigger an assertion here, in infrun.c: 5392 /* Did we find the stepping thread? */ 5393 if (tp->control.step_range_end) 5394 { 5395 /* Yep. There should only one though. */ 5396 gdb_assert (stepping_thread == NULL); 5397 5398 /* The event thread is handled at the top, before we 5399 enter this loop. */ 5400 gdb_assert (tp != ecs->event_thread); 5401 5402 /* If some thread other than the event thread is 5403 stepping, then scheduler locking can't be in effect, 5404 otherwise we wouldn't have resumed the current event 5405 thread in the first place. */ 5406 gdb_assert (!schedlock_applies (currently_stepping (tp))); 5407 5408 stepping_thread = tp; 5409 } Like: gdb/infrun.c:5406: internal-error: switch_back_to_stepped_thread: Assertion `!schedlock_applies (1)' failed. The way the assertion is written is assuming that with schedlock=step we'll always leave threads other than the one with the stepping range locked, while that's not true with the "next" command. With schedlock "step", other threads still run unlocked when "next" detects a function call and steps over it. Whether that makes sense or not, still, it's documented that way in the manual. If another thread hits an event that doesn't cause a stop while the nexting thread steps over a function call, we'll get here and fail the assertion. The fix is just to adjust the assertion. Even though we found the stepping thread, we'll still step-over the breakpoint that just triggered correctly. Surprisingly, gdb.threads/schedlock.exp doesn't have any test that steps over a function call. This commits fixes that. This ensures that "next" doesn't switch focus to another thread, and checks whether other threads run locked or not, depending on scheduler locking mode and command. There's a lot of duplication in that file that this ends cleaning up. There's more that could be cleaned up, but that would end up an unrelated change, best done separately. This new coverage in schedlock.exp happens to trigger the internal error in question, like so: FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (1) (GDB internal error) FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (3) (GDB internal error) FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (5) (GDB internal error) FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (7) (GDB internal error) FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (9) (GDB internal error) FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next does not change thread (switched to thread 0) FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: current thread advanced - unlocked (wrong amount) That's because we have more than one thread running the same loop, and while one thread is stepping over a function call, the other thread hits the step-resume breakpoint of the first, which needs to be stepped over, and we end up in switch_back_to_stepped_thread exactly in the problem case. I think a simpler and more directed test is also useful, to not rely on internal breakpoint magics. So this commit also adds a test that has a thread trip on a conditional breakpoint that doesn't cause a user-visible stop while another thread is stepping over a call. That currently fails like this: FAIL: gdb.threads/next-bp-other-thread.exp: schedlock=step: next over function call (GDB internal error) Tested on x86_64 Fedora 20. gdb/ 2014-10-29 Pedro Alves <palves@redhat.com> PR gdb/17408 * infrun.c (switch_back_to_stepped_thread): Use currently_stepping instead of assuming a thread with a stepping range is always stepping. gdb/testsuite/ 2014-10-29 Pedro Alves <palves@redhat.com> PR gdb/17408 * gdb.threads/schedlock.c (some_function): New function. (call_function): New global. (MAYBE_CALL_SOME_FUNCTION): New macro. (thread_function): Call it. * gdb.threads/schedlock.exp (get_args): Add description parameter, and use it instead of a global counter. Adjust all callers. (get_current_thread): Use "find current thread" for test message here rather than having all callers pass down the same string. (goto_loop): New procedure, factored out from ... (my_continue): ... this. (step_ten_loops): Change parameter from test message to command to use. Adjust. (list_count): Delete global. (check_result): New procedure, factored out from duplicate top level code. (continue tests): Wrap in with_test_prefix. (test_step): New procedure, factored out from duplicate top level code. (top level): Test "step" in combination with all scheduler-locking modes. Test "next" in combination with all scheduler-locking modes, and in combination with stepping over a function call or not. * gdb.threads/next-bp-other-thread.c: New file. * gdb.threads/next-bp-other-thread.exp: New file. |
||
Pedro Alves
|
09dd9a6907 |
Remove Vax Ultrix and VAX BSD support
Built and tested on x86_64 Fedora 20, with --enable-targets=all. gdb/ 2014-10-24 Pedro Alves <palves@redhat.com> * Makefile.in (ALLDEPFILES): Remove vax-nat.c. * NEWS (Removed targets): Add VAX BSD and VAX Ultrix. * config/vax/vax.mh: Delete. * configure.host: Move vax-*-bsd* and vax-*-ultrix* to the obsolete configurations section. * configure.tgt (vax-*-*): Don't mention 4.2BSD nor Ultrix. * vax-nat.c: Delete file. gdb/testsuite/ 2014-10-24 Pedro Alves <palves@redhat.com> * gdb.base/corefile.exp: Remove references to ultrix. * gdb.base/interrupt.exp: Likewise. * gdb.base/whatis.exp: Likewise. * gdb.gdb/selftest.exp: Likewise. * gdb.threads/manythreads.exp: Likewise. * gdb.threads/print-threads.exp: Likewise. * gdb.threads/pthreads.exp:: Likewise. * gdb.threads/schedlock.exp: Likewise. |
||
Pedro Alves
|
32a8097ba5 |
Delete Tru64 support
This commit does most of the mechanical removal. IOW, the easy part. procfs.c isn't touched beyond removing a couple obvious bits that are guarded by a couple macros defined in config/alpha/nm-osf3.h. Going beyond that for procfs.c & co would be a harder excision that potentially affects Solaris. Some comments in the generic alpha code ABIs that may still be relevant and I wouldn't know what to do with them. That can always be done on a separate pass, preferably by someone who can test on alpha. A couple other spots have references to OSF/Tru64 and related files being removed, but it felt like removing them would make things worse, not better. We can revisit those when we next need to touch that code. I didn't remove a reference to osf in testsuite/lib/future.exp, as I believe that code is imported from DejaGNU. Built and tested on x86_64 Fedora 20, with --enable-targets=all. Tested that building for --target=alpha-osf3 on x86_64 Fedora 20 fails with: checking for default auto-load directory... $debugdir:$datadir/auto-load checking for default auto-load safe-path... $debugdir:$datadir/auto-load *** Configuration alpha-unknown-osf3 is obsolete. *** Support has been REMOVED. make[1]: *** [configure-gdb] Error 1 make[1]: Leaving directory `build-osf' make: *** [all] Error 2 gdb/ 2014-10-17 Pedro Alves <palves@redhat.com> * Makefile.in (ALL_64_TARGET_OBS): Remove alpha-osf1-tdep.o. (HFILES_NO_SRCDIR): Remove config/alpha/nm-osf3.h. (ALLDEPFILES): Remove alpha-nat.c, alpha-osf1-tdep.c and solib-osf.c. * NEWS: Mention that support for alpha*-*-osf* has been removed. * ada-lang.h [__alpha__ && __osf__] (ADA_KNOWN_RUNTIME_FILE_NAME_PATTERNS): Delete. * alpha-nat.c, alpha-osf1-tdep.c: Delete files. * alpha-tdep.c (alpha_gdbarch_init): Remove reference to GDB_OSABI_OSF1. * config/alpha/alpha-osf3.mh, config/alpha/nm-osf3.h: Delete files. * config/djgpp/fnchange.lst (config/alpha/alpha-osf1.mh) (config/alpha/alpha-osf2.mh, config/alpha/alpha-osf3.mh): Delete. * configure: Regenerate. * configure.ac: Remove references to osf. * configure.host: Handle alpha*-*-osf* in the obsolete hosts section. Remove all other references to osf. * configure.tgt: Add alpha*-*-osf* to the obsolete targets section. Remove all other references to osf. * dec-thread.c: Delete file. * defs.h (GDB_OSABI_OSF1): Delete. * inferior.h (START_INFERIOR_TRAPS_EXPECTED): New unconditionally defined. * osabi.c (gdb_osabi_names): Delete "OSF/1". * procfs.c (procfs_debug_inferior) [PROCFS_DONT_TRACE_FAULTS]: Delete code. (unconditionally_kill_inferior) [PROCFS_NEED_CLEAR_CURSIG_FOR_KILL]: Delete code. * solib-osf.c: Delete file. gdb/testsuite/ 2014-10-17 Pedro Alves <palves@redhat.com> * gdb.base/callfuncs.exp: emove references to osf. * gdb.base/sigall.exp: Likewise. * gdb.gdb/selftest.exp: Likewise. * gdb.hp/gdb.base-hp/callfwmall.exp: Likewise. * gdb.mi/non-stop.c: Likewise. * gdb.mi/pthreads.c: Likewise. * gdb.reverse/sigall-precsave.exp: Likewise. * gdb.reverse/sigall-reverse.exp: Likewise. * gdb.threads/pthreads.c: Likewise. * gdb.threads/pthreads.exp: Likewise. gdb/doc/ 2014-10-17 Pedro Alves <palves@redhat.com> * gdb.texinfo (Ada Tasks and Core Files): Delete mention of Tru64. (SVR4 Process Information): Delete mention of OSF/1. |
||
Yao Qi
|
052ca37073 |
No longer pull thread list explicitly
As the result of the patch below, GDB updates thread list when a stop is presented to user. The tests don't have to fetch thread list explicitly. [PATCH 3/3] Fix non-stop regressions caused by "breakpoints always-inserted off" changes https://sourceware.org/ml/gdb-patches/2014-09/msg00734.html This patch is to remove the test code updating thread list. Run these three tests many times on arm-linux-gnueabi and x86-linux. No regressions. gdb/testsuite: 2014-10-11 Yao Qi <yao@codesourcery.com> * gdb.threads/thread-find.exp: Don't execute command "info threads". * gdb.threads/attach-into-signal.exp (corefunc): Likewise. * gdb.threads/linux-dp.exp: Don't check the condition $threads_created equals to zero. |
||
Pedro Alves
|
2278c276a8 |
gdb.threads/manythreads.exp: clean up and add comment
In git
|
||
Pedro Alves
|
b57bacecd5 |
Fix non-stop regressions caused by "breakpoints always-inserted off" changes
Commit
|
||
Yao Qi
|
345bcc73f2 |
Skip dlopen-libpthread.exp in cross testing
I see the following fails on arm-linux-gnueabi, result of ldd build-git/arm/gdb/testsuite/gdb.threads/dlopen-libpthread.so is 1 output of ldd build-git/arm/gdb/testsuite/gdb.threads/dlopen-libpthread.so is not a dynamic executable child process exited abnormally FAIL: gdb.threads/dlopen-libpthread.exp: ldd dlopen-libpthread.so FAIL: gdb.threads/dlopen-libpthread.exp: ldd dlopen-libpthread.so output contains libs the test script invokes ldd (on host) for the target libraries, which is wrong. ldd can't be cross because it invokes dynamic linker with LD_TRACE_LOADED_OBJECTS and gets the dependent libraries. My first reaction to this problem is to execute ld.so on the target (like remote_exec target). When I start to hack proc build_executable_own_libs, I find it has assumptions here and there that the native testing is performed. Then I check the callers of build_executable_own_libs, and they are all skipped if isnative is false. It is reasonable to do the same in dlopen-libpthread.exp too. gdb/testsuite: 2014-09-30 Yao Qi <yao@codesourcery.com> * gdb.threads/dlopen-libpthread.exp: Skip it if isnative is false. |
||
Pedro Alves
|
a25a5a45ef |
Fix "breakpoint always-inserted off"; remove "breakpoint always-inserted auto"
By default, GDB removes all breakpoints from the target when the target stops and the prompt is given back to the user. This is useful in case GDB crashes while the user is interacting, as otherwise, there's a higher chance breakpoints would be left planted on the target. But, as long as any thread is running free, we need to make sure to keep breakpoints inserted, lest a thread misses a breakpoint. With that in mind, in preparation for non-stop mode, we added a "breakpoint always-inserted on" mode. This traded off the extra crash protection for never having threads miss breakpoints, and in addition is more efficient if there's a ton of breakpoints to remove/insert at each user command (e.g., at each "step"). When we added non-stop mode, and for a period, we required users to manually set "always-inserted on" when they enabled non-stop mode, as otherwise GDB removes all breakpoints from the target as soon as any thread stops, which means the other threads still running will miss breakpoints. The test added by this patch exercises this. That soon revealed a nuisance, and so later we added an extra "breakpoint always-inserted auto" mode, that made GDB behave like "always-inserted on" when non-stop was enabled, and "always-inserted off" when non-stop was disabled. "auto" was made the default at the same time. In hindsight, this "auto" setting was unnecessary, and not the ideal solution. Non-stop mode does depends on breakpoints always-inserted mode, but only as long as any thread is running. If no thread is running, no breakpoint can be missed. The same is true for all-stop too. E.g., if, in all-stop mode, and the user does: (gdb) c& (gdb) b foo That breakpoint at "foo" should be inserted immediately, but it currently isn't -- currently it'll end up inserted only if the target happens to trip on some event, and is re-resumed, e.g., an internal breakpoint triggers that doesn't cause a user-visible stop, and so we end up in keep_going calling insert_breakpoints. The test added by this patch also covers this. IOW, no matter whether in non-stop or all-stop, if the target fully stops, we can remove breakpoints. And no matter whether in all-stop or non-stop, if any thread is running in the target, then we need breakpoints to be immediately inserted. And then, if the target has global breakpoints, we need to keep breakpoints even when the target is stopped. So with that in mind, and aiming at reducing all-stop vs non-stop differences for all-stop-on-stop-of-non-stop, this patch fixes "breakpoint always-inserted off" to not remove breakpoints from the target until it fully stops, and then removes the "auto" setting as unnecessary. I propose removing it straight away rather than keeping it as an alias, unless someone complains they have scripts that need it and that can't adjust. Tested on x86_64 Fedora 20. gdb/ 2014-09-22 Pedro Alves <palves@redhat.com> * NEWS: Mention merge of "breakpoint always-inserted" modes "off" and "auto" merged. * breakpoint.c (enum ugll_insert_mode): New enum. (always_inserted_mode): Now a plain boolean. (show_always_inserted_mode): No longer handle AUTO_BOOLEAN_AUTO. (breakpoints_always_inserted_mode): Delete. (breakpoints_should_be_inserted_now): New function. (insert_breakpoints): Pass UGLL_INSERT to update_global_location_list instead of calling insert_breakpoint_locations manually. (create_solib_event_breakpoint_1): New, factored out from ... (create_solib_event_breakpoint): ... this. (create_and_insert_solib_event_breakpoint): Use create_solib_event_breakpoint_1 instead of calling insert_breakpoint_locations manually. (update_global_location_list): Change parameter type from boolean to enum ugll_insert_mode. All callers adjusted. Adjust to use breakpoints_should_be_inserted_now and handle UGLL_INSERT. (update_global_location_list_nothrow): Change parameter type from boolean to enum ugll_insert_mode. (_initialize_breakpoint): "breakpoint always-inserted" option is now a boolean command. Update help text. * breakpoint.h (breakpoints_always_inserted_mode): Delete declaration. (breakpoints_should_be_inserted_now): New declaration. * infrun.c (handle_inferior_event) <TARGET_WAITKIND_LOADED>: Remove breakpoints_always_inserted_mode check. (normal_stop): Adjust to use breakpoints_should_be_inserted_now. * remote.c (remote_start_remote): Likewise. gdb/doc/ 2014-09-22 Pedro Alves <palves@redhat.com> * gdb.texinfo (Set Breaks): Document that "set breakpoint always-inserted off" is the default mode now. Delete documentation of "set breakpoint always-inserted auto". gdb/testsuite/ 2014-09-22 Pedro Alves <palves@redhat.com> * gdb.threads/break-while-running.exp: New file. * gdb.threads/break-while-running.c: New file. |
||
Doug Evans
|
57cbd724c3 |
Fix set up of queue-signal.exp test.
The test does a backtrace to see which thread (#2 or #3) is assigned to which SIGUSR (1 or 2). If the main thread gets to all_threads_running before the sigusr threads get to their entry point, then the function name isn't in the backtrace and the test fails. Alas this version of the code is within epsilon of what I started with, and then over-simplified things. |
||
Doug Evans
|
81219e5358 |
New command queue-signal.
If I want to change the signalled state of multiple threads it's a bit cumbersome to do with the "signal" command. What you really want is a way to set the signal state of the desired threads and then just do "continue". This patch adds a new command, queue-signal, to accomplish this. Basically "signal N" == "queue-signal N" + "continue". That's not precisely true in that "signal" can be used to inject any signal, including signals set to "nopass"; whereas "queue-signal" just queues the signal as if the thread stopped because of it. "nopass" handling is done when the thread is resumed which "queue-signal" doesn't do. One could add extra complexity to allow queue-signal to be used to deliver "nopass" signals like the "signal" command. I have no current need for it so in the interests of incremental complexity, I have left such support out and just have the code flag an error if one tries to queue a nopass signal. gdb/ChangeLog: * NEWS: Mention new "queue-signal" command. * infcmd.c (queue_signal_command): New function. (_initialize_infcmd): Add new queue-signal command. gdb/doc/ChangeLog: * gdb.texinfo (Signaling): Document new queue-signal command. gdb/testsuite/ChangeLog: * gdb.threads/queue-signal.c: New file. * gdb.threads/queue-signal.exp: New file. |
||
Pedro Alves
|
fa43b1d7ca |
after gdb_run_cmd, gdb_expect -> gdb_test_multiple/gdb_test
See: https://sourceware.org/ml/gdb-patches/2014-09/msg00404.html We have a number of places that do gdb_run_cmd followed by gdb_expect, when it would be better to use gdb_test_multiple or gdb_test. This converts all that "grep gdb_run_cmd -A 2 | grep gdb_expect" found. Tested on x86_64 Fedora 20, native and gdbserver. gdb/testsuite/ 2014-09-12 Pedro Alves <palves@redhat.com> * gdb.arch/gdb1558.exp: Replace uses of gdb_expect after gdb_run_cmd with gdb_test_multiple or gdb_test throughout. * gdb.arch/i386-size-overlap.exp: Likewise. * gdb.arch/i386-size.exp: Likewise. * gdb.arch/i386-unwind.exp: Likewise. * gdb.base/a2-run.exp: Likewise. * gdb.base/break.exp: Likewise. * gdb.base/charset.exp: Likewise. * gdb.base/chng-syms.exp: Likewise. * gdb.base/commands.exp: Likewise. * gdb.base/dbx.exp: Likewise. * gdb.base/find.exp: Likewise. * gdb.base/funcargs.exp: Likewise. * gdb.base/jit-simple.exp: Likewise. * gdb.base/reread.exp: Likewise. * gdb.base/sepdebug.exp: Likewise. * gdb.base/step-bt.exp: Likewise. * gdb.cp/mb-inline.exp: Likewise. * gdb.cp/mb-templates.exp: Likewise. * gdb.objc/basicclass.exp: Likewise. * gdb.threads/killed.exp: Likewise. |
||
Doug Evans
|
564b7600f2 |
gdb.threads/thread-execl.exp: #include <stdio.h>.
gdb/testsuite/ChangeLog: * gdb.threads/thread-execl.exp: #include <stdio.h>. |
||
Jan Kratochvil
|
22fd09ae99 |
Fix 'gcore' with exited threads
Program received signal SIGABRT, Aborted. [...] (gdb) gcore foobar Couldn't get registers: No such process. (gdb) info threads [...] (gdb) gcore foobar Saved corefile foobar (gdb) gcore tries to access the exited thread: [Thread 0x7ffff7fce700 (LWP 6895) exited] ptrace(PTRACE_GETREGS, 6895, 0, 0x7fff18167dd0) = -1 ESRCH (No such process) Without the TRY_CATCH protection testsuite FAILs for: gcore .../gdb/testsuite/gdb.threads/gcore-thread0.test Cannot find new threads: debugger service failed (gdb) FAIL: gdb.threads/gcore-thread.exp: save a zeroed-threads corefile + core .../gdb/testsuite/gdb.threads/gcore-thread0.test ".../gdb/testsuite/gdb.threads/gcore-thread0.test" is not a core dump: File format not recognized (gdb) FAIL: gdb.threads/gcore-thread.exp: core0file: re-load generated corefile (bad file format) Maybe the TRY_CATCH could be more inside update_thread_list(). Similar update_thread_list() call is IMO missing in procfs_make_note_section() but I do not have where to verify that change. gdb/ChangeLog 2014-08-21 Jan Kratochvil <jan.kratochvil@redhat.com> * linux-tdep.c (linux_corefile_thread_callback): Ignore THREAD_EXITED. (linux_make_corefile_notes): call update_thread_list, protected against exceptions. gdb/testsuite/ChangeLog 2014-08-21 Jan Kratochvil <jan.kratochvil@redhat.com> * gdb.threads/gcore-stale-thread.c: New file. * gdb.threads/gcore-stale-thread.exp: New file. |