No description
Find a file
Pedro Alves 0a39bb3218 stepping is disturbed by setjmp/longjmp | try/catch in other threads
At https://sourceware.org/ml/gdb-patches/2015-08/msg00097.html, Joel
observed that trying to next/step a program on GNU/Linux sometimes
results in the following failed assertion:

	% gdb -q .obj/gprof/main
    (gdb) start
    (gdb) n
    (gdb) step
    [...]/infrun.c:2391: internal-error:
    resume: Assertion `sig != GDB_SIGNAL_0' failed.

What happened is that, during the "next" operation, GDB hit a
longjmp/exception/step-resume breakpoint but failed to see that this
breakpoint was set for a different thread than the one being stepped.

Joel's detailed analysis follows:

More precisely, at the end of the "start" command, we are stopped at
the start of function Main in main.adb; there are 4 threads in total,
and we are in the main thread (which is thread 1):

    (gdb) info thread
      Id   Target Id         Frame
      4    Thread 0xb7a56ba0 (LWP 28379) 0xffffe410 in __kernel_vsyscall ()
      3    Thread 0xb7c5aba0 (LWP 28378) 0xffffe410 in __kernel_vsyscall ()
      2    Thread 0xb7e5eba0 (LWP 28377) 0xffffe410 in __kernel_vsyscall ()
    * 1    Thread 0xb7ea18c0 (LWP 28370) main () at /[...]/main.adb:57

All the logs below reference Thread ID/LWP, but it'll be easier to
talk about the threads by GDB thread number.  For instance, thread 1
is LWP 28370 while thread 3 is LWP 28378.  So, the explanations below
translate the LWPs into thread numbers.

Back to what happens while we are trying to "next' our program:
    (gdb) n
    infrun: clear_proceed_status_thread (Thread 0xb7a56ba0 (LWP 28379))
    infrun: clear_proceed_status_thread (Thread 0xb7c5aba0 (LWP 28378))
    infrun: clear_proceed_status_thread (Thread 0xb7e5eba0 (LWP 28377))
    infrun: clear_proceed_status_thread (Thread 0xb7ea18c0 (LWP 28370))
    infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT)
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x805451e
    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28370.0 [Thread 0xb7ea18c0 (LWP 28370)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x8054523

We've resumed thread 1 (LWP 28370), and received in return a signal
that the same thread stopped slightly further.  It's still in the
range of instructions for the line of source we started the "next"
from, as evidenced by the following trace...

    infrun: stepping inside range [0x805451e-0x8054531]

... and thus, we decide to continue stepping the same thread:

    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
    infrun: prepare_to_wait

That's when we get an event from a different thread (thread 3)...

    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x80782d0
    infrun: context switch
    infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)

... which we find to be at the address where we set a breakpoint on
"the unwinder debug hook" (namely "_Unwind_DebugHook").  But GDB fails
to notice that the breakpoint was inserted for thread 1 only, and so
decides to handle it as...

    infrun: BPSTAT_WHAT_SET_LONGJMP_RESUME

... and inserts a breakpoint at the corresponding resume address, as
evidenced by this the next log:

    infrun: exception resume at 80542a2

That breakpoint seems innocent right now, but will play a role fairly
quickly.  But for now, GDB has inserted the exception-resume
breakpoint, and needs to single-step thread 3 past the breakpoint it
just hit.  Thus, it temporarily disables the exception breakpoint, and
requests a step of that thread:

    infrun: skipping breakpoint: stepping past insn at: 0x80782d0
    infrun: skipping breakpoint: stepping past insn at: 0x80782d0
    infrun: skipping breakpoint: stepping past insn at: 0x80782d0
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 0xb7c5aba0 (LWP 28378)] at 0x80782d0
    infrun: prepare_to_wait

We then get a notification, still from thread 3, that it's now past
that breakpoint...

    infrun: prepare_to_wait
    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x8078424

... so we can resume what we were doing before, which is single-stepping
thread 1 until we get to a new line of code:

    infrun: switching back to stepped thread
    infrun: Switching context from Thread 0xb7c5aba0 (LWP 28378) to Thread 0xb7ea18c0 (LWP 28370)
    infrun: expected thread still hasn't advanced
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523

The "resume" log above shows that we're resuming thread 1 from where
we left off (0x8054523).  We get one more stop at 0x8054529, which is
still inside our stepping range so we go again.  That's when we get
the following event, from thread 3:

    infrun: prepare_to_wait
    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x80542a2

Now the stop_pc address is interesting, because it's the address of
"exception resume" breakpoint...

    infrun: context switch
    infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
    infrun: BPSTAT_WHAT_CLEAR_LONGJMP_RESUME

... and since that location is at a different line of code, this is
where it decides the "next" operation should stop:

    infrun: stop_waiting
    [Switching to Thread 0xb7c5aba0 (LWP 28378)]
    0x080542a2 in inte_tache_rt.ttache_rt (
        <_task>=0x80968ec <inte_tache_rt_inst.tache2>)
        at /[...]/inte_tache_rt.adb:54
    54            end loop;

However, what GDB should have noticed earlier that the exception
breakpoint we hit was for a different thread, thus should have
single-stepped that thread out of the breakpoint _without_ inserting
the exception-return breakpoint, and then resumed the single-stepping
of the initial thread (thread 1) until that thread stepped out of its
stepping range.

This is what this patch does, and after applying it, GDB now correctly
stops on the next line of code.

The patch adds a C++ test that exercises this, both for setjmp/longjmp
and exception breakpoints.  With an unpatched GDB it shows:

 (gdb) next
 [Switching to Thread 22445.22455]
 thread_try_catch (arg=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/next-other-thr-longjmp.c:59
 59            catch (...)
 (gdb) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 1
 next
 /home/pedro/gdb/mygit/build/../src/gdb/infrun.c:4865: internal-error: process_event_stop_test: Assertion `ecs->event_thread->control.exception_resume_breakpoint != NULL' fa
 iled.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 Quit this debugging session? (y or n) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 2 (GDB internal error)
 Resyncing due to internal error.
 n

Tested on x86_64-linux, no regressions.

gdb/ChangeLog:
2015-08-05  Pedro Alves  <palves@redhat.com>
	    Joel Brobecker  <brobecker@adacore.com>

        * breakpoint.c (bpstat_what) <bp_longjmp, bp_longjmp_call_dummy>
	<bp_exception, bp_longjmp_resume, bp_exception_resume>: Handle the
	case where BS->STOP is not set.

gdb/testsuite/ChangeLog:
2015-08-05  Pedro Alves  <palves@redhat.com>

	* gdb.threads/next-while-other-thread-longjmps.c: New file.
	* gdb.threads/next-while-other-thread-longjmps.exp: New file.
2015-08-05 20:01:42 +01:00
bfd Change the behaviour of the --only-keep-debug option to objcopy and strip so that the sh_link and sh_info fields in stripped section headers are preserved. 2015-08-05 16:16:39 +01:00
binutils Change the behaviour of the --only-keep-debug option to objcopy and strip so that the sh_link and sh_info fields in stripped section headers are preserved. 2015-08-05 16:16:39 +01:00
config Sync config with GCC 2015-07-27 07:43:26 -07:00
cpu Remove leading/trailing white spaces in ChangeLog 2015-07-24 04:16:47 -07:00
elfcpp Add chdr_size, Chdr, Chdr_write and Chdr_data 2015-04-08 10:29:40 -07:00
etc PR external/{16327,16328}: Remove etc/configure.texi and etc/standards.texi. 2014-06-27 11:33:25 +02:00
gas 2015-08-04 Thomas Preud'homme <thomas.preudhomme@arm.com> 2015-08-04 09:39:42 +08:00
gdb stepping is disturbed by setjmp/longjmp | try/catch in other threads 2015-08-05 20:01:42 +01:00
gold Regenerate configure files 2015-07-27 07:56:32 -07:00
gprof Regenerate configure files 2015-07-27 07:56:32 -07:00
include Remove leading/trailing white spaces in ChangeLog 2015-07-24 04:16:47 -07:00
intl Remove leading/trailing white spaces in ChangeLog 2015-07-24 04:16:47 -07:00
ld ld: map option for run_dump_test requires no program. 2015-08-04 11:25:37 +01:00
libdecnumber Remove leading/trailing white spaces in ChangeLog 2015-07-24 04:16:47 -07:00
libiberty Remove leading/trailing white spaces in ChangeLog 2015-07-24 04:16:47 -07:00
opcodes Properly disassemble movnti in Intel mode 2015-07-30 04:17:02 -07:00
readline Revert "Sync readline/ to version 7.0 alpha" 2015-07-25 15:57:00 -04:00
sim Fix building GDB for the M32C by providing a stub sim_info function. 2015-08-05 14:58:21 +01:00
texinfo
zlib Remove leading/trailing white spaces in ChangeLog 2015-07-24 04:16:47 -07:00
.cvsignore
.gitattributes Add a .gitattributes file for use with git-merge-changelog 2014-07-25 18:07:23 -04:00
.gitignore Sync the root .gitignore file with GCC's. 2013-01-11 15:17:35 +00:00
ChangeLog Sync toplevel files with GCC 2015-07-27 07:49:05 -07:00
compile Update from upstream Automake 2014-11-16 13:43:48 +01:00
config-ml.in Sync toplevel files with GCC 2015-07-27 07:49:05 -07:00
config.guess Update config.guess and config.sub to the latest upstream version 2015-03-30 16:28:14 -04:00
config.rpath
config.sub Update config.guess and config.sub to the latest upstream version 2015-03-30 16:28:14 -04:00
configure Sync toplevel files with GCC 2015-07-27 07:49:05 -07:00
configure.ac Sync toplevel files with GCC 2015-07-27 07:49:05 -07:00
COPYING
COPYING.LIB
COPYING.LIBGLOSS 2013-01-07 Jeff Johnston <jjohnstn@redhat.com> 2013-01-07 21:39:26 +00:00
COPYING.NEWLIB 2013-10-01 Jeff Johnston <jjohnstn@redhat.com> 2013-10-01 18:14:04 +00:00
COPYING3
COPYING3.LIB
depcomp Update from upstream Automake 2014-11-16 13:43:48 +01:00
djunpack.bat
install-sh Update from upstream Automake 2014-11-16 13:43:48 +01:00
libtool.m4 Update libtool.m4 from GCC trunk 2014-11-24 09:14:09 -08:00
ltgcc.m4
ltmain.sh PR target/59788 2014-02-06 11:01:57 +01:00
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS Update description of ownership of files in include/ 2014-11-04 16:14:14 -08:00
Makefile.def Configure zlib with --enable-host-shared for shared bfd 2015-05-01 08:34:08 -07:00
Makefile.in Sync Makefile.tpl with GCC 2015-07-14 09:52:36 -07:00
Makefile.tpl Sync Makefile.tpl with GCC 2015-07-14 09:52:36 -07:00
makefile.vms
missing Update from upstream Automake 2014-11-16 13:43:48 +01:00
mkdep
mkinstalldirs Update from upstream Automake 2014-11-16 13:43:48 +01:00
move-if-change Update `move-if-change' from gnulib 2014-11-16 17:04:02 +01:00
README
README-maintainer-mode
setup.com
src-release.sh Adjust src-release.sh for sim using the gdb create-version.sh. 2015-04-15 04:08:51 +02:00
symlink-tree
ylwrap Update from upstream Automake 2014-11-16 13:43:48 +01:00

		   README for GNU development tools

This directory contains various GNU compilers, assemblers, linkers, 
debuggers, etc., plus their support routines, definitions, and documentation.

If you are receiving this as part of a GDB release, see the file gdb/README.
If with a binutils release, see binutils/README;  if with a libg++ release,
see libg++/README, etc.  That'll give you info about this
package -- supported targets, how to use it, how to report bugs, etc.

It is now possible to automatically configure and build a variety of
tools with one command.  To build all of the tools contained herein,
run the ``configure'' script here, e.g.:

	./configure 
	make

To install them (by default in /usr/local/bin, /usr/local/lib, etc),
then do:
	make install

(If the configure script can't determine your type of computer, give it
the name as an argument, for instance ``./configure sun4''.  You can
use the script ``config.sub'' to test whether a name is recognized; if
it is, config.sub translates it to a triplet specifying CPU, vendor,
and OS.)

If you have more than one compiler on your system, it is often best to
explicitly set CC in the environment before running configure, and to
also set CC when running make.  For example (assuming sh/bash/ksh):

	CC=gcc ./configure
	make

A similar example using csh:

	setenv CC gcc
	./configure
	make

Much of the code and documentation enclosed is copyright by
the Free Software Foundation, Inc.  See the file COPYING or
COPYING.LIB in the various directories, for a description of the
GNU General Public License terms under which you can copy the files.

REPORTING BUGS: Again, see gdb/README, binutils/README, etc., for info
on where and how to report problems.