old-cross-binutils/gdb/testsuite/gdb.threads/fork-plus-threads.exp


PR threads/18600: Threads left stopped after fork+thread spawn

When a program forks and another process starts threads while gdb is
handling the fork event, newly created threads are left stuck stopped
by gdb, even though gdb presents them as "running" to the user.

This can be seen with the test added by this patch.  The test has the
inferior fork a certain number of times and wait for all children to
exit.  Each fork child spawns a number of threads that do nothing and
joins them immediately.  Normally, the program should run unimpeded
(from the point of view of the user) and exit very quickly.  Without
this fix, it doesn't, because some threads are left stopped by gdb, so
inferior 1 never exits.

The problem triggers when a new clone thread is found while inside the
linux_stop_and_wait_all_lwps call in linux-thread-db.c:

      linux_stop_and_wait_all_lwps ();

      ALL_LWPS (lp)
        if (ptid_get_pid (lp->ptid) == pid)
          thread_from_lwp (lp->ptid);

      linux_unstop_all_lwps ();

Within linux_stop_and_wait_all_lwps, we reach
linux_handle_extended_wait with the "stopping" parameter set to 1, and
because of that we don't mark the new lwp as resumed.  As a
consequence, the subsequent resume_stopped_resumed_lwps, called from
linux_unstop_all_lwps, never resumes the new LWP.

There's lots of cruft in linux_handle_extended_wait that no longer
makes sense.  On systems with CLONE events support, we don't rely on
libthread_db for thread listing anymore, so the code that preserves
stop_requested and the handling of last_resume_kind is all dead.

So the fix is to remove all that, and simply always mark the new LWP
as resumed, so that resume_stopped_resumed_lwps re-resumes it.

gdb/ChangeLog:
2015-07-30  Pedro Alves  <palves@redhat.com>
            Simon Marchi  <simon.marchi@ericsson.com>

        PR threads/18600
        * linux-nat.c (linux_handle_extended_wait): On CLONE event,
        always mark the new thread as resumed.  Remove STOPPING
        parameter.
        (wait_lwp): Adjust call to linux_handle_extended_wait.
        (linux_nat_filter_event): Adjust call to
        linux_handle_extended_wait.
        (resume_stopped_resumed_lwps): Add debug output.

gdb/testsuite/ChangeLog:
2015-07-30  Simon Marchi  <simon.marchi@ericsson.com>
            Pedro Alves  <palves@redhat.com>

        PR threads/18600
        * gdb.threads/fork-plus-threads.c: New file.
        * gdb.threads/fork-plus-threads.exp: New file.

2015-07-30 17:50:29 +00:00
# Copyright (C) 2015 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# This test verifies that threads created by the child fork are
# properly handled. Specifically, GDB used to have a bug where it
# would leave child fork threads stuck stopped, even though "info
# threads" would show them running.
#
# See https://sourceware.org/bugzilla/show_bug.cgi?id=18600
Target remote mode fork and exec test updates

This patch updates tests for fork and exec events in target remote
mode.  In the majority of cases this was a simple matter of removing
some code that disabled the test for target remote.  In a few cases the
test needed to be disabled; in those cases the gdb_protocol was checked
instead of using the [is_remote target] etc.  In a couple of cases we
needed to use clean_restart, since target remote doesn't support the
run command, and in one case we had to modify an expect expression to
allow for a "multiprocess-style" ptid.

Tested with the patch that implemented target remote mode fork and exec
event support.

gdb/testsuite/ChangeLog:

        * gdb.base/execl-update-breakpoints.exp (main): Enable for
        target remote.
        * gdb.base/foll-exec-mode.exp (main): Disable for target
        remote.
        * gdb.base/foll-exec.exp (main): Enable for target remote.
        * gdb.base/foll-fork.exp (main): Likewise.
        * gdb.base/foll-vfork.exp (main): Likewise.
        * gdb.base/multi-forks.exp (main): Likewise, and use
        clean_restart.
        (proc continue_to_exit_bp_loc): Use clean_restart.
        * gdb.base/pie-execl.exp (main): Disable for target remote.
        * gdb.base/watch-vfork.exp (main): Enable for target remote.
        * gdb.mi/mi-nsthrexec.exp (main): Likewise.
        * gdb.threads/execl.exp (main): Likewise.
        * gdb.threads/fork-child-threads.exp (main): Likewise.
        * gdb.threads/fork-plus-threads.exp (main): Disable for target
        remote.
        * gdb.threads/fork-thread-pending.exp (main): Enable for target
        remote.
        * gdb.threads/linux-dp.exp (check_philosopher_stack): Allow
        pid.tid style ptids, instead of just tid.
        * gdb.threads/thread-execl.exp (main): Enable for target remote.
        * gdb.threads/watchpoint-fork.exp (main): Likewise.
        * gdb.trace/report.exp (use_collected_data): Allow pid.tid style
        ptids, instead of just tid.

2015-12-14 19:18:05 +00:00
# In remote mode, we cannot continue debugging after all
# inferiors have terminated, and this test requires that.
if { [target_info exists gdb_protocol]
     && [target_info gdb_protocol] == "remote" } {
    continue
}
standard_testfile
proc do_test { detach_on_fork } {
    global GDBFLAGS
    global srcfile testfile
    global gdb_prompt

    set saved_gdbflags $GDBFLAGS
    set GDBFLAGS [concat $GDBFLAGS " -ex \"set non-stop on\""]

    if {[prepare_for_testing "failed to prepare" \
             $testfile $srcfile {debug pthreads}] == -1} {
        set GDBFLAGS $saved_gdbflags
        return -1
    }

    set GDBFLAGS $saved_gdbflags

    if ![runto_main] then {
        fail "Can't run to main"
        return 0
    }

    gdb_test_no_output "set detach-on-fork $detach_on_fork"

    set test "continue &"
    gdb_test_multiple $test $test {
        -re "$gdb_prompt " {
            pass $test
        }
    }
remote follow fork and spurious child stops in non-stop mode

Running gdb.threads/fork-plus-threads.exp against gdbserver in
extended-remote mode, even though the test passes, we still see broken
behavior:

 (gdb) PASS: gdb.threads/fork-plus-threads.exp: set detach-on-fork off
 continue &
 Continuing.
 (gdb) PASS: gdb.threads/fork-plus-threads.exp: continue &
 [New Thread 28092.28092]
 [Thread 28092.28092] #2 stopped.
 [New Thread 28094.28094]
 [Inferior 2 (process 28092) exited normally]
 [New Thread 28094.28105]
 [New Thread 28094.28109]
 ...
 [Thread 28174.28174] #18 stopped.
 [New Thread 28185.28185]
 [Inferior 10 (process 28174) exited normally]
 [New Thread 28185.28196]
 [Thread 28185.28185] #20 stopped.
 Cannot remove breakpoints because program is no longer writable.
 Further execution is probably impossible.
 [Inferior 11 (process 28185) exited normally]
 [Inferior 1 (process 28091) exited normally]
 PASS: gdb.threads/fork-plus-threads.exp: reached breakpoint
 info threads
 No threads.
 (gdb) PASS: gdb.threads/fork-plus-threads.exp: no threads left
 info inferiors
   Num  Description       Executable
 * 1    <null>            /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/fork-plus-threads
 (gdb) PASS: gdb.threads/fork-plus-threads.exp: only inferior 1 left

All the "[Thread FOO] #NN stopped." above are bogus, as well as the
"Cannot remove breakpoints because program is no longer writable.",
which is a consequence.

The problem is that when we intercept a fork event, we should report
the event for the parent, only, and leave the child stopped, but not
report its stop event.  GDB later decides whether to follow the parent
or the child.  But because handle_extended_wait does not set the
child's last_status.kind to TARGET_WAITKIND_STOPPED, a
stop_all_threads/unstop_all_lwps sequence (e.g., from trying to access
memory) by mistake ends up queueing a SIGSTOP on the child, resuming
it, and then when that SIGSTOP is intercepted, because the LWP has
last_resume_kind set to resume_stop, gdbserver reports the stop to GDB,
as GDB_SIGNAL_0:

 ...
 >>>> entering unstop_all_lwps
 unstopping all lwps
 proceed_one_lwp: lwp 1600
    client wants LWP to remain 1600 stopped
 proceed_one_lwp: lwp 1828
 Client wants LWP 1828 to stop. Making sure it has a SIGSTOP pending
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Sending sigstop to lwp 1828
 pc is 0x3615ebc7cc
 Resuming lwp 1828 (continue, signal 0, stop expected)
   continue from pc 0x3615ebc7cc
 unstop_all_lwps done
 sigchld_handler
 <<<< exiting unstop_all_lwps
 handling possible target event
 >>>> entering linux_wait_1
 linux_wait_1: [<all threads>]
 my_waitpid (-1, 0x40000001)
 my_waitpid (-1, 0x1): status(137f), 1828
 LWFE: waitpid(-1, ...) returned 1828, ERRNO-OK
 LLW: waitpid 1828 received Stopped (signal) (stopped)
 pc is 0x3615ebc7cc
 Expected stop.
 LLW: resume_stop SIGSTOP caught for LWP 1828.1828.
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 ...
 linux_wait_1 ret = LWP 1828.1828, 1, 0
 <<<< exiting linux_wait_1
 Writing resume reply for LWP 1828.1828:1
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Tested on x86_64 Fedora 20, extended-remote.

gdb/gdbserver/ChangeLog:
2015-07-30  Pedro Alves  <palves@redhat.com>

        * linux-low.c (handle_extended_wait): Set the child's last
        reported status to TARGET_WAITKIND_STOPPED.

2015-07-30 17:41:44 +00:00
    # gdbserver had a bug that resulted in reporting the fork child's
    # initial stop to gdb, which gdb does not expect, in turn
    # resulting in a broken session, like:
    #
    #  [Thread 31536.31536] #16 stopped.                                <== BAD
    #  [New Thread 31547.31547]
    #  [Inferior 10 (process 31536) exited normally]
    #  [New Thread 31547.31560]
    #
    #  [Thread 31547.31547] #18 stopped.                                <== BAD
    #  Cannot remove breakpoints because program is no longer writable. <== BAD
    #  Further execution is probably impossible.                        <== BAD
    #  [Inferior 11 (process 31547) exited normally]
    #  [Inferior 1 (process 31454) exited normally]
    #
    # These variables track whether we see such broken behavior.
    set saw_cannot_remove_breakpoints 0
    set saw_thread_stopped 0
    set test "inferior 1 exited"
    gdb_test_multiple "" $test {
        -re "Cannot remove breakpoints" {
            set saw_cannot_remove_breakpoints 1
            exp_continue
        }
        -re "Thread \[^\r\n\]+ stopped\\." {
            set saw_thread_stopped 1
            exp_continue
        }
        -re "Inferior 1 \(\[^\r\n\]+\) exited normally" {
            pass $test
        }
    }
    gdb_assert !$saw_cannot_remove_breakpoints \
        "no failure to remove breakpoints"
    gdb_assert !$saw_thread_stopped \
        "no spurious thread stop"
    gdb_test "info threads" "No threads\\." \
        "no threads left"
PR threads/18600: Inferiors left around after fork+thread spawn

The new gdb.threads/fork-plus-threads.exp test exposes one more
problem.  When one types "info inferiors" after running the program,
one sees a couple of inferiors still left, while there should only be
inferior #1 left.  E.g.:

 (gdb) info inferiors
   Num  Description       Executable
   4    process 8393     /home/pedro/bugs/src/test
   2    process 8388     /home/pedro/bugs/src/test
 * 1    <null>           /home/pedro/bugs/src/test
 (gdb) info threads

Calling prune_inferiors() manually at this point (from a top gdb) does
not remove them, because they still have inf->pid != 0 (while they
shouldn't).  This suggests that we never mourned those inferiors.

Enabling logs (master + previous patch) we see:

 ...
 WL: waitpid Thread 0x7ffff7fc2740 (LWP 9513) received Trace/breakpoint trap (stopped)
 WL: Handling extended status 0x03057f
 LHEW: Got clone event from LWP 9513, new child is LWP 9579
 [New Thread 0x7ffff37b8700 (LWP 9579)]
 WL: waitpid Thread 0x7ffff7fc2740 (LWP 9508) received 0 (exited)
 WL: Thread 0x7ffff7fc2740 (LWP 9508) exited.
                            ^^^^^^^^
 [Thread 0x7ffff7fc2740 (LWP 9508) exited]
 WL: waitpid Thread 0x7ffff7fc2740 (LWP 9499) received 0 (exited)
 WL: Thread 0x7ffff7fc2740 (LWP 9499) exited.
 [Thread 0x7ffff7fc2740 (LWP 9499) exited]
 RSRL: resuming stopped-resumed LWP Thread 0x7ffff37b8700 (LWP 9579) at 0x3615ef4ce1: step=0
 ...
 (gdb) info inferiors
   Num  Description       Executable
   5    process 9508     /home/pedro/bugs/src/test
                ^^^^
   4    process 9503     /home/pedro/bugs/src/test
   3    process 9500     /home/pedro/bugs/src/test
   2    process 9499     /home/pedro/bugs/src/test
 * 1    <null>           /home/pedro/bugs/src/test
 (gdb)
 ...

Note the "Thread 0x7ffff7fc2740 (LWP 9508) exited." line.
That's this in wait_lwp:

      /* Check if the thread has exited.  */
      if (WIFEXITED (status) || WIFSIGNALED (status))
        {
          thread_dead = 1;
          if (debug_linux_nat)
            fprintf_unfiltered (gdb_stdlog, "WL: %s exited.\n",
                                target_pid_to_str (lp->ptid));
        }
    }

That was the leader thread reporting an exit, meaning the whole process
is gone.  So the problem is that this code doesn't understand that a
WIFEXITED status of the leader LWP should be reported to infrun as
process exit.

gdb/ChangeLog:
2015-07-30  Pedro Alves  <palves@redhat.com>

        PR threads/18600
        * linux-nat.c (wait_lwp): Report to the core when thread group
        leader exits.

gdb/testsuite/ChangeLog:
2015-07-30  Pedro Alves  <palves@redhat.com>

        PR threads/18600
        * gdb.threads/fork-plus-threads.exp: Test that "info inferiors"
        only shows inferior 1.

2015-07-22 17:01:46 +00:00
    gdb_test "info inferiors" \
        "Num\[ \t\]+Description\[ \t\]+Executable\[ \t\]+\r\n\\* 1 \[^\r\n\]+" \
        "only inferior 1 left"
}
foreach detach_on_fork {"on" "off"} {
    with_test_prefix "detach-on-fork=$detach_on_fork" {
        do_test $detach_on_fork
    }
}