old-cross-binutils/gdb/testsuite/gdb.base/gdb-sigterm.c
Pedro Alves 77f4176143 gdb.base/gdb-sigterm.exp: Fix spurious FAILs
The buildbot shows that some machines FAIL this test frequently.
E.g.: https://sourceware.org/ml/gdb-testers/2015-q1/msg00997.html

If I stress my machine, I can sometimes see it fail too.

Bumping the 200 limit and tweaking the test to show the step count, I
get:

     ...
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 12 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 8 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 13 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 7 times
-->  FAIL: gdb.base/gdb-sigterm.exp: SIGTERM stepped 228 times <--
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 11 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 13 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 12 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 8 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 9 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 7 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 11 times
     PASS: gdb.base/gdb-sigterm.exp: SIGTERM stepped 8 times
     ...

Thinking that this might be a problem of SIGTERM reaching GDB, but
then the event loop taking too long to handle it, I hacked GDB to
print a debug log whenever the SIGTERM handler was called, and,
whenever the event loop finally calls the async SIGTERM handler.
Here's what I see:

     infrun:   30011 [Thread 30011],
     infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
     infrun: TARGET_WAITKIND_STOPPED
     infrun: stop_pc = 0x4005de
-->  infrun: got SIGTERM                                       <--
     infrun: stepping inside range [0x4005de-0x4005e0]
     infrun: resume (step=1, signal=GDB_SIGNAL_0), ...
     infrun: prepare_to_wait
-->  infrun: handling async SIGTERM                            <--
     Cannot execute this command while the target is running.
     Use the "interrupt" command to stop the target
     and then try again.
     gdb.base/gdb-sigterm.exp: expect eof #27
     FAIL: gdb.base/gdb-sigterm.exp: SIGTERM stepped 228 times

So, no delay on the GDB side.  It just happens that occasionally it
takes more than 200 single-steps before SIGTERM even reaches GDB.
This just looks like a kernel/scheduling issue --- some extra usage
spike in the system (e.g., an I/O spike) might cause it for me.  For
the build slaves, I'm guessing they're frequently busy enough to trip
on this often.  Particularly more so now that we're having them run
tests in parallel mode.

The fix is to detect failure by timeout instead of counting single
steps.  This should be more reliable.  Indeed for me, after this
commit, I couldn't trigger a FAIL anymore, even after letting the test
run for an hour.

By timeout is also nicer in that a board file for a slow host/target
can increase it (like, e.g., an embedded GNU/Linux board).

Tested on x86_64 Fedora 20, native, gdbserver, and extended-remote
gdbserver.

gdb/testsuite/
2015-02-06  Pedro Alves  <palves@redhat.com>

	* gdb.base/gdb-sigterm.c (main): Use the TIMEOUT define to
	determine how many seconds to pass to 'alarm'.
	* gdb.base/gdb-sigterm.exp (top level): Build program with
	-DTIMEOUT=$timeout.
	(do_test): Return success/failure indication.  Add more verbose
	logging.  Don't fail if 200 single steps are seen.  Instead, fail
	when the test times out.
	(passes): New global.
	(top level): Break the testing loop if testing fails on any
	iteration.  Use gdb_assert.
2015-02-06 11:09:42 +01:00

30 lines
962 B
C

/* This testcase is part of GDB, the GNU debugger.
Copyright 2013-2015 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
#include <unistd.h>
int
main (void)
{
/* Allow for as much timeout as DejaGnu wants, plus a bit of
slack. */
unsigned int seconds = TIMEOUT + 20;
alarm (seconds);
for (;;); /* loop-line */
}