The main motivation for this is making non-stop / all-stop behave
similarly on initial connection, in order to move in the direction of
reimplementing all-stop mode with the remote target always running in
non-stop mode.
When we connect to a remote target in non-stop mode, we may find
threads either running or already stopped. The act of connecting
itself does not force threads to stop. To handle that, the remote
non-stop connection is currently roughly like this:
#1 - Fetch list of remote threads (qXfer:threads:read, qfThreadInfo,
etc). All threads are assumed to be running until the target
reports an asynchronous stop reply for them.
#2 - Fetch the initial set of threads that were already stopped, with
the '?' packet. (In non-stop, this is coupled with the vStopped
mechanism to be able to retrieve the status of more than one
thread.)
The stop replies fetched in #2 are placed in the pending stop reply
queue, and left for the regular event loop to process. That is,
"target remote" finishes and returns _before_ those stops are
processed.
That means that it's possible to have GDB process further commands
before the initial set of stopped threads is reported to the user.
E.g., before the patch, note how the prompt is printed before the
frame:
Remote debugging using :9999
(gdb)
[Thread 15296] #1 stopped.
0x0000003615a011f0 in ?? ()
Even though thread #1 was not running, for a moment, the user can see
it as such:
$ gdb a.out -ex "set non-stop 1" -ex "tar rem :9999" -ex "info threads" -ex "info registers"
Remote debugging using :9999
Id Target Id Frame
* 1 Thread 4772 (running)
Target is executing. <<<<<<< info registers
(gdb)
[Thread 4772] #1 stopped.
0x0000003615a011f0 in ?? ()
To fix that, this commit makes gdb process all threads found already
stopped at connection time, before giving the prompt to the user.
The fix takes a cue from fork-child.c:startup_inferior [1], and
processes the events locally in remote.c, avoiding the whole
wait_for_inferior/handle_inferior_event path. I decided to try this
approach after noticing that:
- several cases in handle_inferior_event miss checking stop_soon.
- we don't want to fetch the thread list in normal_stop.
and trying to fix them was resulting in sprinkling stop_soon checks in
many places, and uglifying normal_stop even more.
While with this patch, I'm avoiding changing GDB's output other than
when the prompt is printed, I think this approach is more flexible if
we do want to change it. And also, it's likely easier to get rid of
the MI *running event that is still sent for threads that are
initially found stopped, if we want to.
This happens to fix the testsuite too. All non-stop tests are racy
against "target remote" / gdbserver testing currently. That is,
sometimes the tests run, but other times they're just skipped without
any indication of PASS/FAIL. When that happens, the logs show:
target remote localhost:2346
Remote debugging using localhost:2346
(gdb)
[Thread 25418] #1 stopped.
0x0000003615a011f0 in ?? ()
^CQuit
(gdb) Remote debugging from host 127.0.0.1
Killing process(es): 25418
monitor exit
(gdb) Remote connection closed
(gdb) testcase /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/multi-create-ns-info-thr.exp completed in 61 seconds
The trouble here is that there's output after the prompt, and the
regex in question doesn't expect that:
-re "Remote debugging using .*$serialport_re.*$gdb_prompt $" {
verbose "Set target to $targetname"
return 0
}
[1] - before startup_inferior was added, we'd go through
wait_for_inferior/handle_inferior_event while going through the shell,
and that turned out problematic.
Tested on x86_64 Fedora 20, gdbserver.
gdb/ChangeLog:
2015-08-20 Pedro Alves <palves@redhat.com>
* infrun.c (print_target_wait_results): Make extern.
* infrun.h (print_target_wait_results): Declare.
* remote.c (set_stop_requested_callback): Delete.
(process_initial_stop_replies): New function.
(remote_start_remote): Use it.
(stop_reply_queue_length): New function.
gdb/testsuite/ChangeLog:
2015-08-20 Pedro Alves <palves@redhat.com>
* gdb.server/connect-stopped-target.c: New file.
* gdb.server/connect-stopped-target.exp: New file.
Here, in dwarfread.c:process_full_comp_unit:
/* Set symtab language to language from DW_AT_language. If the
compilation is from a C file generated by language preprocessors, do
not set the language if it was already deduced by start_subfile. */
if (!(cu->language == language_c
&& COMPUNIT_FILETABS (cust)->language != language_c))
COMPUNIT_FILETABS (cust)->language = cu->language;
in case start_subfile doesn't manage to deduce a language
COMPUNIT_FILETABS(cust)->language ends up as language_unknown, not
language_c. So the condition above evals false and we never set the
language from the cu's language.
gdb/ChangeLog:
2015-08-20 Pedro Alves <palves@redhat.com>
* dwarf2read.c (process_full_comp_unit): To tell whether
start_subfile managed to deduce a language, test for
language_unknown instead of language_c.
gdb/testsuite/ChangeLog:
2015-08-20 Pedro Alves <palves@redhat.com>
* gdb.dwarf2/comp-unit-lang.exp: New file.
* gdb.dwarf2/comp-unit-lang.c: New file.
Before this change, trying to evaluate the following Ada expression
yielded a syntax error, even though it's completely legal:
(gdb) p s'first = 'a'
Error in expression, near `'.
The problem lies in the lexer (gdb/ada-lex.l): at the point we reach "'a'",
we're still in the BEFORE_QUAL_QUOTE start condition (the mechanism to
distinguish character literals from other "tick" usages: qualified
expressions and attributes), so we consider that this quote is actually a
separate "tick".
This changes resets the start condition to INITIAL in the
{TICK}[a-zA-Z][a-zA-Z]+ rule (for attributes): attributes activate this
BEFORE_QUAL_QUOTE condition and in this case the above rule is always
executed rather than the <BEFORE_QUAL_QUOTE>"'" one (in flex, it's
always the longest match that is chosen). We now have instead:
(gdb) p s'first = 'a'
$1 = true
gdb/ChangeLog:
* ada-lex.l: Reset the start condition to INITIAL in the rule
that matches attributes.
gdb/testsuite/ChangeLog:
* gdb.ada/attr_ref_and_charlit.exp: New testcase.
* gdb.ada/attr_ref_and_charlit/foo.adb: New file.
Tested on x86_64-linux, no regression.
This change introduces a new function, dwarf2_string_attr(), which is
a wrapper for dwarf2_attr(). dwarf2read.c has been updated to
call dwarf2_string_attr in most instances where a string-valued
attribute is decoded to produce a string value. In most cases, it
simplifies the code; in some instances, the complexity of the code
remains unchanged.
I performed this change by looking for instances where the
result of DW_STRING was used in an assignment. Many of these
had a pattern which (roughly) looks something like this:
struct attribute *attr = NULL;
attr = dwarf2_attr (die, name, cu);
if (attr != NULL && DW_STRING (attr))
{
const char *str;
...
str = DW_STRING (attr);
... /* Use str in some fashion. */
}
Code of this form is transformed to look like this instead:
const char *str;
str = dwarf2_string_attr (die, name, cu)
if (str != NULL)
{
...
/* Use str in some fashion. */
...
}
In addition to invoking dwarf2_attr() and DW_STRING(),
dwarf2_string_attr() checks to make sure that the attribute's
`form' field matches one of DW_FORM_strp, DW_FORM_string, or
DW_FORM_GNU_strp_alt. If it does not match one of these forms,
it will return a NULL value in addition to calling complaint().
An earlier version of this patch did this type checking for one
particular instance where a string attribute was being decoded.
The situation that I was attempting to handle in that earlier patch is
this:
The Texas Instruments compiler uses the encoding for
DW_AT_MIPS_linkage_name for other purposes. TI uses the encoding,
0x2007, for TI_AT_TI_end_line which, unlike DW_AT_MIPS_linkage_name,
does not have a string-typed value. In this instance, GDB was attempting
to use an integer value as a string pointer, with predictable results.
(GDB would die with a segmentation fault.)
I've added a test which reproduces the problem that I was orignally
wanting to fix. It uses DW_AT_MIPS_linkage name with an associate
value which is a string, and again, where the value is a small
integer.
My test case causes GDB to segfault in an unpatched GDB. There
will be two PASSes in a patched GDB.
Unpatched GDB:
(gdb) ptype f
ERROR: Process no longer exists
UNRESOLVED: gdb.dwarf2/dw2-bad-mips-linkage-name.exp: ptype f
ERROR: Couldn't send ptype g to GDB.
UNRESOLVED: gdb.dwarf2/dw2-bad-mips-linkage-name.exp: ptype g
Patched GDB:
(gdb) ptype f
type = bool ()
(gdb) PASS: gdb.dwarf2/dw2-bad-mips-linkage-name.exp: ptype f
ptype g
type = bool ()
(gdb) PASS: gdb.dwarf2/dw2-bad-mips-linkage-name.exp: ptype g
I see no regressions on an x86_64 native target.
gdb/ChangeLog:
* dwarf2read.c (dwarf2_string_attr): New function.
(lookup_dwo_unit, process_psymtab_comp_unit_reader)
(dwarf2_compute_name, dwarf2_physname, find_file_and_directory)
(read_call_site_scope, namespace_name, guess_full_die_structure_name)
(anonymous_struct_prefix, prepare_one_comp_unit): Use
dwarf2_string_attr in place of dwarf2_attr and DW_STRING.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dw2-bad-mips-linkage-name.c: New file.
* gdb.dwarf2/dw2-bad-mips-linkage-name.exp: New file.
One of the build slaves shows this error running explicit.exp:
(gdb) strace -m gdbfoobarbaz
Remote failure reply: E.In-process agent library not loaded in process.
Fast and static tracepoints unavailable.
(gdb) FAIL: gdb.linespec/explicit.exp: strace -m gdbfoobarbaz
There are two big problems with this test:
1) The expected output is actually not what the test is meant to test for.
2) This test should really only run where it is supported.
This is most easily fixed by moving the test to gdb.trace/strace.exp.
gdb/testsuite/ChangeLog
* gdb.linespec/explicit.exp: Move strace test from here ...
* gdb.trace/strace.exp: ... to here.
Invoking either of the above commands on an inferior that's not running
triggers the following assert failure:
.../binutils-gdb/gdb/thread.c:514: internal-error: any_thread_of_process: Assertion `pid != 0' failed.
The fix is straightforward. This patch also adds a test to check the
basic functionality of these commands, along with testing this fix in
particular. Tested on x86_64 Linux.
gdb/ChangeLog:
* inferior.c (detach_inferior_command): Don't call
any_thread_of_process when pid is 0.
(kill_inferior_command): Likewise.
gdb/testsuite/ChangeLog:
* gdb.base/kill-detach-inferiors-cmd.exp: New test file.
* gdb.base/kill-detach-inferiors-cmd.c: New test file.
The "source centric" /m option to the disassemble command is often
unhelpful, e.g., in the presence of optimized code.
This patch adds a /s modifier that is better.
For one, /m only prints instructions from the originating source file,
leaving out instructions from e.g., inlined functions from other files.
gdb/ChangeLog:
PR gdb/11833
* NEWS: Document new /s modifier for the disassemble command.
* cli/cli-cmds.c (disassemble_command): Add support for /s.
(_initialize_cli_cmds): Update online docs of disassemble command.
* disasm.c: #include "source.h".
(struct deprecated_dis_line_entry): Renamed from dis_line_entry.
All uses updated.
(dis_line_entry): New struct.
(hash_dis_line_entry, eq_dis_line_entry): New functions.
(allocate_dis_line_table): New functions.
(maybe_add_dis_line_entry, line_has_code_p): New functions.
(dump_insns): New arg end_pc. All callers updated.
(do_mixed_source_and_assembly_deprecated): Renamed from
do_mixed_source_and_assembly. All callers updated.
(do_mixed_source_and_assembly): New function.
(gdb_disassembly): Handle /s (DISASSEMBLY_SOURCE).
* disasm.h (DISASSEMBLY_SOURCE_DEPRECATED): Renamed from
DISASSEMBLY_SOURCE. All uses updated.
(DISASSEMBLY_SOURCE): New macro.
* mi/mi-cmd-disas.c (mi_cmd_disassemble): New modes 4,5.
gdb/doc/ChangeLog:
* gdb.texinfo (Machine Code): Update docs for mixed source/assembly
disassembly.
(GDB/MI Data Manipulation): Update docs for new disassembly modes.
gdb/testsuite/ChangeLog:
* gdb.mi/mi-disassemble.exp: Update.
* gdb.base/disasm-optim.S: New file.
* gdb.base/disasm-optim.c: New file.
* gdb.base/disasm-optim.h: New file.
* gdb.base/disasm-optim.exp: New file.
Keith reported that gdb.base/dso2dso.exp is broken, with the following
error:
| $ make check RUNTESTFLAGS=dso2dso.exp
| [snip]
| Running ../../../src/gdb/testsuite/gdb.base/dso2dso.exp ...
| ERROR: tcl error sourcing ../../../src/gdb/testsuite/gdb.base/dso2dso.exp.
| ERROR: couldn't open
| "../../../src/gdb/testsuite/gdb.base/../../../src/gdb/testsuite/gdb.base/dso2dso-dso1.c":
| no such file or directory
| while executing
| "error "$message""
| (procedure "gdb_get_line_number" line 14)
| invoked from within
| "gdb_get_line_number "STOP HERE" $srcfile_libdso1"
| (file "../../../src/gdb/testsuite/gdb.base/dso2dso.exp" line 60)
| invoked from within
| "source ../../../src/gdb/testsuite/gdb.base/dso2dso.exp"
| ("uplevel" body line 1)
| invoked from within
| "uplevel #0 source ../../../src/gdb/testsuite/gdb.base/dso2dso.exp"
| invoked from within
| "catch "uplevel #0 source $test_file_name""
This happens because gdb_get_line_number will prepend $srcdir/$subdir
if the given filename does not start with "/", and this happens when
GDB was configured using a relative path to the configure script.
When using an absolute path like I do, we avoid the pre-pending that
Keith is seeing.
gdb/testsuite/ChangeLog:
Keith Seitz <keiths@redhat.com>:
* gdb.base/dso2dso.exp: Pass basename of source file in call
to gdb_get_line_number.
Tested on x86_64-linux with both scenarios.
Making all-stop run on top of non-stop caused a small regression
in behavior. This was observed on x86_64-linux. The attached testcase
is in C whereas the investigation was done with an Ada program,
but it's the same scenario, and using a C testcase allows wider testing.
Basically: I am debugging a single-threaded program, and currently
stopped inside a function provided by a shared-library, at a line
calling a subprogram provided by a second shared library, and trying
to "next" over that function call.
Before we changed the default all-stop behavior, we had:
7 Impl_Initialize; -- Stop here and try "next" over this line
(gdb) n
8 return 5; <<-- OK
But now, "next" just stops much earlier:
(gdb) n
0x00007ffff7bd8560 in impl.initialize@plt () from /[...]/lib/libpck.so
What happens is that next stops at a call instruction, which calls
the function's PLT, and GDB fails to notice that the inferior stepped
into a subroutine, and so decides that we're done. We can see another
symptom of the same issue by looking at the backtrace at the point
GDB stopped:
(gdb) bt
#0 0x00007ffff7bd8560 in impl.initialize@plt ()
from /[...]/lib/libpck.so
#1 0x00000000f7bd86f9 in ?? ()
#2 0x00007fffffffdf50 in ?? ()
#3 0x0000000000401893 in a () at /[...]/a.adb:7
Backtrace stopped: frame did not save the PC
With a functioning GDB, the backtrace looks like the following instead:
#0 0x00007ffff7bd8560 in impl.initialize@plt ()
from /[...]/lib/libpck.so
#1 0x00007ffff7bd86f9 in sub () at /[...]/pck.adb:7
#2 0x0000000000401893 in a () at /[...]/a.adb:7
Note how, for frame #1, the address looks quite similar, except
for the high-order bits not being set:
#1 0x00007ffff7bd86f9 in sub () at /[...]/pck.adb:7 <<<-- OK
#1 0x00000000f7bd86f9 in ?? () <<<-- WRONG
^^^^
||||
Wrong
Investigating this further led me to displaced stepping.
As we are "next"-ing from a location where a breakpoint is inserted,
we need to step out of it, and since we're on non-stop mode, we need
to do it using displaced stepping. And looking at
amd64-tdep.c:amd64_displaced_step_fixup, I found the code that handles
the return address:
regcache_cooked_read_unsigned (regs, AMD64_RSP_REGNUM, &rsp);
retaddr = read_memory_unsigned_integer (rsp, retaddr_len, byte_order);
retaddr = (retaddr - insn_offset) & 0xffffffffUL;
The mask used to compute retaddr looks wrong to me, keeping only
4 bytes instead of 8, and explains why the high order bits of
the backtrace are unset. What happens is that, after the displaced
stepping has completed, GDB restores that return address at the location
where the program expects it. But because the top half bits of
the address have been masked out, the return address is now invalid.
The incorrect behavior of the "next" command and the backtrace at
that location are the first symptoms of that. Another symptom is
that this actually alters the behavior of the program, where a "cont"
from there soon leads to a SEGV when the inferior tries to jump back
to that incorrect return address:
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00000000f7bd86f9 in ?? ()
^^^^^^^^^^^^^^^^^^
This patch fixes the issue by using a mask that seems more appropriate
for this architecture.
gdb/ChangeLog:
* amd64-tdep.c (amd64_displaced_step_fixup): Fix the mask used to
compute RETADDR.
gdb/testsuite/ChangeLog:
* gdb.base/dso2dso-dso2.c, gdb.base/dso2dso-dso2.h,
gdb.base/dso2dso-dso1.c, gdb.base/dso2dso-dso1.h, gdb.base/dso2dso.c,
gdb.base/dso2dso.exp: New files.
Tested on x86_64-linux, no regression.
Keith found out that several tests were failing when testing the
native-gdbserver board on Fedora (x86_64). Strangely, these failures
had not been reported by our BuildBot. Later, he found that the reason
for this was because the failures only happened when running the
testsuite without FORCE_PARALLEL (i.e., on serial mode; maybe it would
be worth having a builder testing things on serial...). Then, he
decided to start bisecting the changes to see which one introduced the
failure (it was not trivial to know this only by looking at gdb.log).
After a lot of time, he found that Pedro's commit
e1316e60d4 was the culprit. There was
nothing wrong in the code, but the new gdb.base/checkpoint-ns.exp
testcase did something that left the GDBFLAGS variable in an
inconsistent state. This test works by modifying this variable to set
non-stop on, sourcing gdb.base/checkpoint.exp (which does the hard
work), and then restoring the old value on GDBFLAGS. However, this was
not working because gdb.base/checkpoint.exp bails out if it is being
tested on gdbserver, and when it calls "continue" the control goes back
to the function calling the tests, and not to
gdb.base/checkpoint-ns.exp.
The fix is simple: just wrap the "source" call, and make
gdb.base/checkpoint-ns.exp aware of the "continue"/"return" calls made
by gdb.base/checkpoint.exp.
gdb/testsuite/ChangeLog:
2015-08-12 Sergio Durigan Junior <sergiodj@redhat.com>
Pedro Alves <palves@redhat.com>
Keith Seitz <keiths@redhat.com>
* gdb.base/checkpoint-ns.exp: Use save_vars to save and restore
GDBFLAGS.
gdb/testsuite/ChangeLog:
* gdb.base/gdbhistsize-history.exp
(test_histsize_history_setting): Use save_vars.
* gdb.base/gdbinit-history.exp (test_gdbinit_history_setting):
Use save_vars.
(test_no_truncation_of_unlimited_history_file): Use save_vars.
* gdb.base/readline.exp: Use save_vars.
This patch adds documentation for explicit locations to both the
User Manual and gdb's online help system.
gdb/ChangeLog:
* NEWS: Mention explicit locations.
* breakpoint.c [LOCATION_HELP_STRING]: New macro.
[BREAK_ARGS_HELP]: Use LOCATION_HELP_STRING.
(_initialize_breakpoint): Update documentation for
"clear", "break", "trace", "strace", "ftrace", and "dprintf".
gdb/doc/ChangeLog:
* gdb.texinfo (Thread-Specific Breakpoints, Printing Source Lines):
Use "location(s)"instead of "linespec(s)".
(Specifying a Location): Rewrite.
Add subsections describing linespec, address, and explicit locations.
Add node/menu for each subsection.
(Source and Machine Code, C Preprocessor Macros)
(Create and Delete Trace points)
(Extensions for Ada Tasks): Use "location(s)" instead of "linespec(s)".
(Continuing at a Different Address): Remove "linespec" examples.
Add reference to "Specify a Location"
(The -break-insert Command): Rewrite. Add anchor.
Add reference to appropriate manual section discussing locations.
(The -dprintf-insert Command): Refer to -break-insert for
specification of 'location'.
gdb/testsuite/ChangeLog:
* gdb.base/help.exp: Update help_breakpoint_text.
This patch adds support for explicit locations to MI's -break-insert
command. The new options, documented in the User Manual, are
--source, --line, --function, and --label.
gdb/ChangeLog:
* mi/mi-cmd-break.c (mi_cmd_break_insert_1): Add support for
explicit locations, options "--source", "--function",
"--label", and "--line".
gdb/testsuite/ChangeLog:
* gdb.mi/mi-break.exp (test_explicit_breakpoints): New proc.
(at toplevel): Call test_explicit_breakpoints.
* gdb.mi/mi-dprintf.exp: Add tests for explicit dprintf
breakpoints.
* lib/mi-support.exp (mi_make_breakpoint): Add support for
breakpoint conditions, "-cond".
This patch exposes explicit locations to the CLI user. This enables
users to "explicitly" specify attributes of the breakpoint location
to avoid any ambiguity that might otherwise exist with linespecs.
The general syntax of explicit locations is:
-source SOURCE_FILENAME -line {+-}LINE -function FUNCTION_NAME
-label LABEL_NAME
Option names may be abbreviated, e.g., "-s SOURCE_FILENAME -li 3" and users
may use the completer with either options or values.
gdb/ChangeLog:
* completer.c: Include location.h.
(enum match_type): New enum.
(location_completer): Rename to ...
(linespec_completer): ... this.
(collect_explicit_location_matches, backup_text_ptr)
(explicit_location_completer): New functions.
(location_completer): "New" function; handle linespec
and explicit location completions.
(complete_line_internal): Remove all location completer-specific
handling.
* linespec.c (linespec_lexer_lex_keyword, is_ada_operator)
(find_toplevel_char): Export.
(linespec_parse_line_offset): Export.
Issue error if STRING is not numerical.
(gdb_get_linespec_parser_quote_characters): New function.
* linespec.h (linespec_parse_line_offset): Declare.
(get_gdb_linespec_parser_quote_characters): Declare.
(is_ada_operator): Declare.
(find_toplevel_char): Declare.
(linespec_lexer_lex_keyword): Declare.
* location.c (explicit_to_event_location): New function.
(explicit_location_lex_one): New function.
(string_to_explicit_location): New function.
(string_to_event_location): Handle explicit locations.
* location.h (explicit_to_event_location): Declare.
(string_to_explicit_location): Declare.
gdb/testsuite/ChangeLog:
* gdb.linespec/3explicit.c: New file.
* gdb.linespec/cpexplicit.cc: New file.
* gdb.linespec/cpexplicit.exp: New file.
* gdb.linespec/explicit.c: New file.
* gdb.linespec/explicit.exp: New file.
* gdb.linespec/explicit2.c: New file.
* gdb.linespec/ls-errs.exp: Add explicit location tests.
* lib/gdb.exp (capture_command_output): Regexp-escape `command'
before using in the matching pattern.
Clarify that `prefix' is a regular expression.
This patch converts the code base to use the new struct event_location
API being introduced. This patch preserves the current functionality and
adds no new features.
The "big picture" API usage introduced by this patch may be illustrated
with a simple exmaple. Where previously developers would write:
void
my_command (char *arg, int from_tty)
{
create_breakpoint (..., arg, ...);
...
}
one now uses:
void
my_command (char *arg, int from_tty)
{
struct event_locaiton *location;
struct cleanup *back_to;
location = string_to_event_locaiton (&arg, ...);
back_to = make_cleanup_delete_event_location (location);
create_breakpoint (..., location, ...);
do_cleanups (back_to);
}
Linespec-decoding functions (now called location-decoding) such as
decode_line_full no longer skip argument pointers over processed input.
That functionality has been moved into string_to_event_location as
demonstrated above.
gdb/ChangeLog
* ax-gdb.c: Include location.h.
(agent_command_1) Use linespec location instead of address
string.
* break-catch-throw.c: Include location.h.
(re_set_exception_catchpoint): Use linespec locations instead
of address strings.
* breakpoint.c: Include location.h.
(create_overlay_event_breakpoint, create_longjmp_master_breakpoint)
(create_std_terminate_master_breakpoint)
(create_exception_master_breakpoint, update_breakpoints_after_exec):
Use linespec location instead of address string.
(print_breakpoint_location): Use locations and
event_location_to_string.
Print extra_string for pending locations for non-MI streams.
(print_one_breakpoint_location): Use locations and
event_location_to_string.
(init_raw_breakpoint_without_location): Initialize b->location.
(create_thread_event_breakpoint): Use linespec location instead of
address string.
(init_breakpoint_sal): Likewise.
Only save extra_string if it is non-NULL and not the empty string.
Use event_location_to_string instead of `addr_string'.
Constify `p' and `endp'.
Use skip_spaces_const/skip_space_const instead of non-const versions.
Copy the location into the breakpoint.
If LOCATION is NULL, save the breakpoint address as a linespec location
instead of an address string.
(create_breakpoint_sal): Change `addr_string' parameter to a struct
event_location. All uses updated.
(create_breakpoints_sal): Likewise for local variable `addr_string'.
(parse_breakpoint_sals): Use locations instead of address strings.
Remove check for empty linespec with conditional.
Refactor.
(decode_static_tracepoint_spec): Make argument const and update
function.
(create_breakpoint): Change `arg' to a struct event_location and
rename.
Remove `copy_arg' and `addr_start'.
If EXTRA_STRING is empty, set it to NULL.
Don't populate `canonical' for pending breakpoints.
Pass `extra_string' to find_condition_and_thread.
Clear `extra_string' if `rest' was NULL.
Do not error with "garbage after location" if setting a dprintf
breakpoint.
Copy the location into the breakpoint instead of an address string.
(break_command_1): Use string_to_event_location and pass this to
create_breakpoint instead of an address string.
Check against `arg_cp' for a probe linespec.
(dprintf_command): Use string_to_event_location and pass this to
create_breakpoint instead of an address string.
Throw an exception if no format string was specified.
(print_recreate_ranged_breakpoint): Use event_location_to_string
instead of address strings.
(break_range_command, until_break_command)
(init_ada_exception_breakpoint): Use locations instead
of address strings.
(say_where): Print out extra_string for pending locations.
(base_breakpoint_dtor): Delete `location' and `location_range_end' of
the breakpoint.
(base_breakpoint_create_sals_from_location): Use struct event_location
instead of address string.
Remove `addr_start' and `copy_arg' parameters.
(base_breakpoint_decode_location): Use struct event_location instead of
address string.
(bkpt_re_set): Use locations instead of address strings.
Use event_location_empty_p to check for unset location.
(bkpt_print_recreate): Use event_location_to_string instead of
an address string.
Print out extra_string for pending locations.
(bkpt_create_sals_from_location, bkpt_decode_location)
(bkpt_probe_create_sals_from_location): Use struct event_location
instead of address string.
(bkpt_probe_decode_location): Use struct event_location instead of
address string.
(tracepoint_print_recreate): Use event_location_to_string to
recreate the tracepoint.
(tracepoint_create_sals_from_location, tracepoint_decode_location)
(tracepoint_probe_create_sals_from_location)
(tracepoint_probe_decode_location): Use struct event_location
instead of address string.
(dprintf_print_recreate): Use event_location_to_string to recreate
the dprintf.
(dprintf_re_set): Remove check for valid/missing format string.
(strace_marker_create_sals_from_location)
(strace_marker_create_breakpoints_sal, strace_marker_decode_location)
(update_static_tracepoint): Use struct event_location instead of
address string.
(location_to_sals): Likewise.
Pass `extra_string' to find_condition_and_thread.
For newly resolved pending breakpoint locations, clear the location's
string representation.
Assert that the breakpoint's condition string is NULL when
condition_not_parsed.
(breakpoint_re_set_default, create_sals_from_location_default)
(decode_location_default, trace_command, ftrace_command)
(strace_command, create_tracepoint_from_upload): Use locations
instead of address strings.
* breakpoint.h (struct breakpoint_ops) <create_sals_from_location>:
Use struct event_location instead of address string.
Update all uses.
<decode_location>: Likewise.
(struct breakpoint) <addr_string>: Change to struct event_location
and rename `location'.
<addr_string_range_end>: Change to struct event_location and rename
`location_range_end'.
(create_breakpoint): Use struct event_location instead of address
string.
* cli/cli-cmds.c: Include location.h.
(edit_command, list_command): Use locations instead of address strings.
* elfread.c: Include location.h.
(elf_gnu_ifunc_resolver_return_stop): Use event_location_to_string.
* guile/scm-breakpoint.c: Include location.h.
(bpscm_print_breakpoint_smob): Use event_location_to_string.
(gdbscm_register_breakpoint): Use locations instead of address
strings.
* linespec.c: Include location.h.
(struct ls_parser) <stream>: Change to const char *.
(PARSER_STREAM): Update.
(lionespec_lexer_lex_keyword): According to find_condition_and_thread,
keywords must be followed by whitespace.
(canonicalize_linespec): Save a linespec location into `canonical'.
Save a canonical linespec into `canonical'.
(parse_linespec): Change `argptr' to const char * and rename `arg'.
All uses updated.
Update function description.
(linespec_parser_new): Initialize `parser'.
Update initialization of parsing stream.
(event_location_to_sals): New function.
(decode_line_full): Change `argptr' to a struct event_location and
rename it `location'.
Use locations instead of address strings.
Call event_location_to_sals instead of parse_linespec.
(decode_line_1): Likewise.
(decode_line_with_current_source, decode_line_with_last_displayed)
Use locations instead of address strings.
(decode_objc): Likewise.
Change `argptr' to const char * and rename `arg'.
(destroy_linespec_result): Delete the linespec result's location
instead of freeing the address string.
* linespec.h (struct linespec_result) <addr_string>: Change to
struct event_location and rename to ...
<location>: ... this.
(decode_line_1, decode_line_full): Change `argptr' to struct
event_location. All callers updated.
* mi/mi-cmd-break.c: Include language.h, location.h, and linespec.h.
(mi_cmd_break_insert_1): Use locations instead of address strings.
Throw an error if there was "garbage" at the end of the specified
linespec.
* probe.c: Include location.h.
(parse_probes): Change `argptr' to struct event_location.
Use event locations instead of address strings.
* probe.h (parse_probes): Change `argptr' to struct event_location.
* python/py-breakpoint.c: Include location.h.
(bppy_get_location): Constify local variable `str'.
Use event_location_to_string.
(bppy_init): Use locations instead of address strings.
* python/py-finishbreakpoint.c: Include location.h.
(bpfinishpy_init): Remove local variable `addr_str'.
Use locations instead of address strings.
* python/python.c: Include location.h.
(gdbpy_decode_line): Use locations instead of address strings.
* remote.c: Include location.h.
(remote_download_tracepoint): Use locations instead of address
strings.
* spu-tdep.c: Include location.h.
(spu_catch_start): Remove local variable `buf'.
Use locations instead of address strings.
* tracepoint.c: Include location.h.
(scope_info): Use locations instead of address strings.
(encode_source_string): Constify parameter `src'.
* tracepoint.h (encode_source_string): Likewise.
gdb/testsuite/ChangeLog
* gdb.base/dprintf-pending.exp: Update dprintf "without format"
test.
Add tests for missing ",FMT" and ",".
gdb/ChangeLog:
* symtab.c (make_file_symbol_completion_list_1): Renamed from
make_file_symbol_completion_list and made static.
(make_file_symbol_completion_list): New function.
gdb/testsuite/ChangeLog:
* gdb.base/completion.exp: Add location completer tests.
The ppc64 displaced step code can't handle atomic sequences. Fallback
to stepping over the breakpoint in-line if we detect one.
gdb/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* infrun.c (displaced_step_prepare_throw): Return -1 if
gdbarch_displaced_step_copy_insn returns NULL. Update intro
comment.
* rs6000-tdep.c (LWARX_MASK, LWARX_INSTRUCTION, LDARX_INSTRUCTION)
(STWCX_MASK, STWCX_INSTRUCTION, STDCX_INSTRUCTION): Move higher up
in file.
(ppc_displaced_step_copy_insn): New function.
(ppc_displaced_step_fixup): Update comment.
(rs6000_gdbarch_init): Install ppc_displaced_step_copy_insn as
gdbarch_displaced_step_copy_insn hook.
* gdbarch.sh (displaced_step_copy_insn): Document what happens on
NULL return.
* gdbarch.h: Regenerate.
gdb/testsuite/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* gdb.arch/ppc64-atomic-inst.exp (do_test): New procedure, move
tests here.
(top level): Run do_test with and without displaced stepping.
Running the testsuite with "maint set target-non-stop on" shows:
(gdb) PASS: gdb.base/valgrind-infcall.exp: continue #98 (false warning)
continue
Continuing.
dl_main (phdr=<optimized out>..., auxv=<optimized out>) at rtld.c:2302
2302 LIBC_PROBE (init_complete, 2, LM_ID_BASE, r);
Cannot access memory at address 0x400532
(gdb) PASS: gdb.base/valgrind-infcall.exp: continue #99 (false warning)
p gdb_test_infcall ()
$1 = 1
(gdb) FAIL: gdb.base/valgrind-infcall.exp: p gdb_test_infcall ()
Even though that was a native GNU/Linux test run, this test spawns
Valgrind and connects to it with "target remote". The error above is
actually orthogonal to target-non-stop. The real issue is that that
enables displaced stepping, and displaced stepping doesn't work with
Valgrind, because we can't write to the inferior memory (thus can't
copy the instruction to the scratch pad area).
I'm sure there will be other targets with the same issue, so trying to
identify Valgrind wouldn't be sufficient. The fix is to try setting
up the displaced step anyway. If we get a MEMORY_ERROR, we disable
displaced stepping for that inferior, and fall back to doing an
in-line step-over. If "set displaced-stepping" is "on" (as opposed to
"auto), GDB warns displaced stepping failed ("on" is mainly useful for
the testsuite, not for users).
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* inferior.h (struct inferior) <displaced_stepping_failed>: New
field.
* infrun.c (use_displaced_stepping_now_p): New parameter 'inf'.
Return false if dispaced stepping failed before.
(resume): Pass the current inferior to
use_displaced_stepping_now_p. Wrap displaced_step_prepare in
TRY/CATCH. If we get a MEMORY_ERROR, set the inferior's
displaced_stepping_failed flag, and fall back to an in-line
step-over.
gdb/testsuite/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* gdb.base/valgrind-disp-step.c: New file.
* gdb.base/valgrind-disp-step.exp: New file.
On a target that is both always in non-stop mode and can do displaced
stepping (such as native x86_64 GNU/Linux, with "maint set
target-non-stop on"), the step-over-trips-on-watchpoint.exp test
sometimes fails like this:
(gdb) PASS: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: thread 1
set scheduler-locking off
(gdb) PASS: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: set scheduler-locking off
step
-[Switching to Thread 0x7ffff7fc0700 (LWP 11782)]
-Hardware watchpoint 4: watch_me
-
-Old value = 0
-New value = 1
-child_function (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/step-over-trips-on-watchpoint.c:39
-39 other = 1; /* set thread-specific breakpoint here */
-(gdb) PASS: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: step
+wait_threads () at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/step-over-trips-on-watchpoint.c:49
+49 return 1; /* in wait_threads */
+(gdb) FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: no thread-specific bp: step: step
Note "scheduler-locking" was set off. The problem is that on such
targets, the step-over of thread 2 and the "step" of thread 1 can be
set to run simultaneously (since with displaced stepping the
breakpoint isn't ever removed from the target), and sometimes, the
"step" of thread 1 finishes first, so it'd take another resume to see
the watchpoint trigger. Fix this by replacing the wait_threads
function with a one-line infinite loop that doesn't call any function,
so that the "step" of thread 1 never finishes.
gdb/testsuite/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* gdb.threads/step-over-lands-on-breakpoint.c (wait_threads):
Delete function.
(main): Add alarm. Run an infinite loop instead of calling
wait_threads.
* gdb.threads/step-over-lands-on-breakpoint.exp (do_test): Change
comment.
* gdb.threads/step-over-trips-on-watchpoint.c (wait_threads):
Delete function.
(main): Add alarm. Run an infinite loop instead of calling
wait_threads.
* gdb.threads/step-over-trips-on-watchpoint.exp (do_test): Change
comment.
Letting a "checkpoint" run to exit with "set non-stop on" behaves
differently compared to the default all-stop mode ("set non-stop
off").
Currently, in non-stop mode:
(gdb) start
Temporary breakpoint 1 at 0x40086b: file src/gdb/testsuite/gdb.base/checkpoint.c, line 28.
Starting program: build/gdb/testsuite/gdb.base/checkpoint
Temporary breakpoint 1, main () at src/gdb/testsuite/gdb.base/checkpoint.c:28
28 char *tmp = &linebuf[0];
(gdb) checkpoint
checkpoint 1: fork returned pid 24948.
(gdb) c
Continuing.
Copy complete.
Deleting copy.
[Inferior 1 (process 24944) exited normally]
[Switching to process 24948]
(gdb) info threads
Id Target Id Frame
1 process 24948 "checkpoint" (running)
No selected thread. See `help thread'.
(gdb) c
The program is not being run.
(gdb)
Two issues above:
1. Thread 1 got stuck in "(running)" state (it isn't really running)
2. While checkpoints try to preserve the illusion that the thread is
still the same when the process exits, GDB switched to "No thread
selected." instead of staying with thread 1 selected.
Problem #1 is caused by handle_inferior_event and normal_stop not
considering that when a
TARGET_WAITKIND_SIGNALLED/TARGET_WAITKIND_EXITED event is reported,
and the inferior is mourned, the target may still have execution.
Problem #2 is caused by the make_cleanup_restore_current_thread
cleanup installed by fetch_inferior_event not being able to find the
original thread 1's ptid in the thread list, thus not being able to
restore thread 1 as selected thread. The fix is to make the cleanup
installed by make_cleanup_restore_current_thread aware of thread ptid
changes, by installing a thread_ptid_changed observer that adjusts the
cleanup's data.
After the patch, we get the same in all-stop and non-stop modes:
(gdb) c
Continuing.
Copy complete.
Deleting copy.
[Inferior 1 (process 25109) exited normally]
[Switching to process 25113]
(gdb) info threads
Id Target Id Frame
* 1 process 25113 "checkpoint" main () at src/gdb/testsuite/gdb.base/checkpoint.c:28
(gdb)
Turns out the whole checkpoints.exp file can run in non-stop mode
unmodified. I thought of moving most of the test file's contents to a
procedure that can be called twice, once in non-stop mode and another
in all-stop mode. But then, the test already takes close to 30
seconds to run on my machine, so I thought it'd be nicer to run
all-stop and non-stop mode in parallel. Thus I added a new
checkpoint-ns.exp file that just appends "set non-stop on" to GDBFLAGS
and sources checkpoint.exp.
gdb/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* infrun.c (handle_inferior_event): If we get
TARGET_WAITKIND_SIGNALLED or TARGET_WAITKIND_EXITED in non-stop
mode, mark all threads of the exiting process as not-executing.
(normal_stop): If we get TARGET_WAITKIND_SIGNALLED or
TARGET_WAITKIND_EXITED in non-stop mode, finish all threads of the
exiting process, if inferior_ptid still points at a process.
* thread.c (struct current_thread_cleanup) <next>: New field.
(current_thread_cleanup_chain): New global.
(restore_current_thread_ptid_changed): New function.
(restore_current_thread_cleanup_dtor): Remove the cleanup from the
current_thread_cleanup_chain list.
(make_cleanup_restore_current_thread): Add the cleanup data to the
current_thread_cleanup_chain list.
(_initialize_thread): Install restore_current_thread_ptid_changed
as thread_ptid_changed observer.
gdb/testsuite/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* gdb.base/checkpoint-ns.exp: New file.
* gdb.base/checkpoint.exp: Pass explicit "checkpoint.c" to
standard_testfile.
Indicate speculatively executed instructions with a leading '?'. We use the
space that is normally used for the PC prefix. In the case where the
instruction at the current PC had been executed speculatively before, the PC
prefix will be partially overwritten resulting in "?> ".
As a side-effect, the /p modifier to omit the PC prefix in the "record
instruction-history" command now uses a 3-space PC prefix " " in order to
have enough space for the speculative execution indication.
gdb/
* btrace.c (btrace_compute_ftrace_bts): Clear insn flags.
(pt_btrace_insn_flags): New.
(ftrace_add_pt): Call pt_btrace_insn_flags.
* btrace.h (btrace_insn_flag): New.
(btrace_insn) <flags>: New.
* record-btrace.c (btrace_insn_history): Print insn prefix.
* NEWS: Announce it.
doc/
* gdb.texinfo (Process Record and Replay): Document prefixing of
speculatively executed instructions in the "record instruction-history"
command.
testsuite/
* gdb.btrace/instruction_history.exp: Update.
* gdb.btrace/tsx.exp: New.
* gdb.btrace/tsx.c: New.
* lib/gdb.exp (skip_tsx_tests, skip_btrace_pt_tests): New.
The buildbot shows that PPC64 and x86_64 builders, both native and
extended-remote gdbserver frequently timeout these tests.
until-precsave.exp times out on my x86_64 occasionally as well.
Inspecting the logs, we see that if we waited some more, the tests
would pass.
Simply bump until-precsave.exp timeouts further, and apply the same
treatment to step-precsave.exp.
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.reverse/step-precsave.exp: Use with_timeout_factor to
increase timeout.
* gdb.reverse/until-precsave.exp: Bump timeouts.
This test fails with --target_board=native-extended-gdbserver because
it misses the usual "disconnect":
(gdb) target remote | /usr/lib64/valgrind/../../bin/vgdb --pid=30454
Already connected to a remote target. Disconnect? (y or n) n
Still connected.
(gdb) FAIL: gdb.base/valgrind-infcall.exp: target remote for vgdb (got interactive prompt)
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.base/valgrind-infcall.exp: Issue a "disconnect".
This adds a kfailed test that has the whole process exit just while
several threads continuously step over a breakpoint. Usually, the
process exits just while GDB or GDBserver is handling the breakpoint
hit. In other words, the process disappears while the event thread is
(ptrace-) stopped. This exposes several issues in GDB and GDBserver.
Errors, crashes, etc.
I fixed some of these issues recently, but there's a lot more to do.
It's a bit like playing whack-a-mole at the moment. You fix an issue,
which then exposes several others.
E.g., with the native target, you get (among other errors):
(...)
[New Thread 0x7ffff47b9700 (LWP 18077)]
[New Thread 0x7ffff3fb8700 (LWP 18078)]
[New Thread 0x7ffff37b7700 (LWP 18079)]
Cannot find user-level thread for LWP 18076: generic error
(gdb) KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on: cond_bp_target=1: inferior 1 exited (prompt) (PRMS: gdb/18749)
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
PR gdb/18749
* gdb.threads/process-dies-while-handling-bp.c: New file.
* gdb.threads/process-dies-while-handling-bp.exp: New file.
Ref: https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html
This adds a test that has a multithreaded program have several threads
continuously fork, while another thread continuously steps over a
breakpoint.
This exposes several intertwined issues, which this patch addresses:
- When we're stopping and suspending threads, some thread may fork,
and we missed setting its suspend count to 1, like we do when a new
clone/thread is detected. When we next unsuspend threads, the fork
child's suspend count goes below 0, which is bogus and fails an
assertion.
- If a step-over is cancelled because a signal arrives, but then gdb
is not interested in the signal, we pass the signal straight back
to the inferior. However, we miss that we need to re-increment the
suspend counts of all other threads that had been paused for the
step-over. As a result, other threads indefinitely end up stuck
stopped.
- If a detach request comes in just while gdbserver is handling a
step-over (in the test at hand, this is GDB detaching the fork
child), gdbserver internal errors in stabilize_thread's helpers,
which assert that all thread's suspend counts are 0 (otherwise we
wouldn't be able to move threads out of the jump pads). The
suspend counts aren't 0 while a step-over is in progress, because
all threads but the one stepping past the breakpoint must remain
paused until the step-over finishes and the breakpoint can be
reinserted.
- Occasionally, we see "BAD - reinserting but not stepping." being
output (from within linux_resume_one_lwp_throw). That was because
GDB pokes memory while gdbserver is busy with a step-over, and that
suspends threads, and then re-resumes them with proceed_one_lwp,
which missed another reason to tell linux_resume_one_lwp that the
thread should be set back to stepping.
- In a couple places, we were resuming threads that are meant to be
suspended. E.g., when a vCont;c/s request for thread B comes in
just while gdbserver is stepping thread A past a breakpoint. The
resume for thread B must be deferred until the step-over finishes.
- The test runs with both "set detach-on-fork" on and off. When off,
it exercises the case of GDB detaching the fork child explicitly.
When on, it exercises the case of gdb resuming the child
explicitly. In the "off" case, gdb seems to exponentially become
slower as new inferiors are created. This is _very_ noticeable as
with only 100 inferiors gdb is crawling already, which makes the
test take quite a bit to run. For that reason, I've disabled the
"off" variant for now.
gdb/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* target/waitstatus.h (enum target_stop_reason)
<TARGET_STOPPED_BY_SINGLE_STEP>: New value.
gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* linux-low.c (handle_extended_wait): Set the fork child's suspend
count if stopping and suspending threads.
(check_stopped_by_breakpoint): If stopped by trace, set the LWP's
stop reason to TARGET_STOPPED_BY_SINGLE_STEP.
(linux_detach): Complete an ongoing step-over.
(lwp_suspended_inc, lwp_suspended_decr): New functions. Use
throughout.
(resume_stopped_resumed_lwps): Don't resume a suspended thread.
(linux_wait_1): If passing a signal to the inferior after
finishing a step-over, unsuspend and re-resume all lwps. If we
see a single-step event but the thread should be continuing, don't
pass the trap to gdb.
(stuck_in_jump_pad_callback, move_out_of_jump_pad_callback): Use
internal_error instead of gdb_assert.
(enqueue_pending_signal): New function.
(check_ptrace_stopped_lwp_gone): Add debug output.
(start_step_over): Use internal_error instead of gdb_assert.
(complete_ongoing_step_over): New function.
(linux_resume_one_thread): Don't resume a suspended thread.
(proceed_one_lwp): If the LWP is stepping over a breakpoint, reset
it stepping.
gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>
* gdb.threads/forking-threads-plus-breakpoint.exp: New file.
* gdb.threads/forking-threads-plus-breakpoint.c: New file.
At https://sourceware.org/ml/gdb-patches/2015-08/msg00097.html, Joel
observed that trying to next/step a program on GNU/Linux sometimes
results in the following failed assertion:
% gdb -q .obj/gprof/main
(gdb) start
(gdb) n
(gdb) step
[...]/infrun.c:2391: internal-error:
resume: Assertion `sig != GDB_SIGNAL_0' failed.
What happened is that, during the "next" operation, GDB hit a
longjmp/exception/step-resume breakpoint but failed to see that this
breakpoint was set for a different thread than the one being stepped.
Joel's detailed analysis follows:
More precisely, at the end of the "start" command, we are stopped at
the start of function Main in main.adb; there are 4 threads in total,
and we are in the main thread (which is thread 1):
(gdb) info thread
Id Target Id Frame
4 Thread 0xb7a56ba0 (LWP 28379) 0xffffe410 in __kernel_vsyscall ()
3 Thread 0xb7c5aba0 (LWP 28378) 0xffffe410 in __kernel_vsyscall ()
2 Thread 0xb7e5eba0 (LWP 28377) 0xffffe410 in __kernel_vsyscall ()
* 1 Thread 0xb7ea18c0 (LWP 28370) main () at /[...]/main.adb:57
All the logs below reference Thread ID/LWP, but it'll be easier to
talk about the threads by GDB thread number. For instance, thread 1
is LWP 28370 while thread 3 is LWP 28378. So, the explanations below
translate the LWPs into thread numbers.
Back to what happens while we are trying to "next' our program:
(gdb) n
infrun: clear_proceed_status_thread (Thread 0xb7a56ba0 (LWP 28379))
infrun: clear_proceed_status_thread (Thread 0xb7c5aba0 (LWP 28378))
infrun: clear_proceed_status_thread (Thread 0xb7e5eba0 (LWP 28377))
infrun: clear_proceed_status_thread (Thread 0xb7ea18c0 (LWP 28370))
infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x805451e
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28370.0 [Thread 0xb7ea18c0 (LWP 28370)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x8054523
We've resumed thread 1 (LWP 28370), and received in return a signal
that the same thread stopped slightly further. It's still in the
range of instructions for the line of source we started the "next"
from, as evidenced by the following trace...
infrun: stepping inside range [0x805451e-0x8054531]
... and thus, we decide to continue stepping the same thread:
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
infrun: prepare_to_wait
That's when we get an event from a different thread (thread 3)...
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80782d0
infrun: context switch
infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
... which we find to be at the address where we set a breakpoint on
"the unwinder debug hook" (namely "_Unwind_DebugHook"). But GDB fails
to notice that the breakpoint was inserted for thread 1 only, and so
decides to handle it as...
infrun: BPSTAT_WHAT_SET_LONGJMP_RESUME
... and inserts a breakpoint at the corresponding resume address, as
evidenced by this the next log:
infrun: exception resume at 80542a2
That breakpoint seems innocent right now, but will play a role fairly
quickly. But for now, GDB has inserted the exception-resume
breakpoint, and needs to single-step thread 3 past the breakpoint it
just hit. Thus, it temporarily disables the exception breakpoint, and
requests a step of that thread:
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: skipping breakpoint: stepping past insn at: 0x80782d0
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 0xb7c5aba0 (LWP 28378)] at 0x80782d0
infrun: prepare_to_wait
We then get a notification, still from thread 3, that it's now past
that breakpoint...
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x8078424
... so we can resume what we were doing before, which is single-stepping
thread 1 until we get to a new line of code:
infrun: switching back to stepped thread
infrun: Switching context from Thread 0xb7c5aba0 (LWP 28378) to Thread 0xb7ea18c0 (LWP 28370)
infrun: expected thread still hasn't advanced
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
The "resume" log above shows that we're resuming thread 1 from where
we left off (0x8054523). We get one more stop at 0x8054529, which is
still inside our stepping range so we go again. That's when we get
the following event, from thread 3:
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun: 28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80542a2
Now the stop_pc address is interesting, because it's the address of
"exception resume" breakpoint...
infrun: context switch
infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
infrun: BPSTAT_WHAT_CLEAR_LONGJMP_RESUME
... and since that location is at a different line of code, this is
where it decides the "next" operation should stop:
infrun: stop_waiting
[Switching to Thread 0xb7c5aba0 (LWP 28378)]
0x080542a2 in inte_tache_rt.ttache_rt (
<_task>=0x80968ec <inte_tache_rt_inst.tache2>)
at /[...]/inte_tache_rt.adb:54
54 end loop;
However, what GDB should have noticed earlier that the exception
breakpoint we hit was for a different thread, thus should have
single-stepped that thread out of the breakpoint _without_ inserting
the exception-return breakpoint, and then resumed the single-stepping
of the initial thread (thread 1) until that thread stepped out of its
stepping range.
This is what this patch does, and after applying it, GDB now correctly
stops on the next line of code.
The patch adds a C++ test that exercises this, both for setjmp/longjmp
and exception breakpoints. With an unpatched GDB it shows:
(gdb) next
[Switching to Thread 22445.22455]
thread_try_catch (arg=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/next-other-thr-longjmp.c:59
59 catch (...)
(gdb) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 1
next
/home/pedro/gdb/mygit/build/../src/gdb/infrun.c:4865: internal-error: process_event_stop_test: Assertion `ecs->event_thread->control.exception_resume_breakpoint != NULL' fa
iled.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 2 (GDB internal error)
Resyncing due to internal error.
n
Tested on x86_64-linux, no regressions.
gdb/ChangeLog:
2015-08-05 Pedro Alves <palves@redhat.com>
Joel Brobecker <brobecker@adacore.com>
* breakpoint.c (bpstat_what) <bp_longjmp, bp_longjmp_call_dummy>
<bp_exception, bp_longjmp_resume, bp_exception_resume>: Handle the
case where BS->STOP is not set.
gdb/testsuite/ChangeLog:
2015-08-05 Pedro Alves <palves@redhat.com>
* gdb.threads/next-while-other-thread-longjmps.c: New file.
* gdb.threads/next-while-other-thread-longjmps.exp: New file.
2015-08-03 Sandra Loosemore <sandra@codesourcery.com>
gdb/testsuite/
* gdb.base/bp-permanent.exp: Report test as unsupported if
the target cannot stop at the permanent breakpoint.
These testcases are mocks of real programs.
GDB doesn't care what the programs do, they just have to look
and/or behave like the real program.
These testcases exercise gdb when debugging really large programs.
E.g., gmonster-1 has 10,000 CUs, and gmonster-2 has 1000 shared libs
(which is actually a little small, 5000 would be more accurate).
gdb/testsuite/ChangeLog:
* gdb.perf/lib/perftest/utils.py: New file.
* gdb.perf/gm-hello.cc: New file.
* gdb.perf/gm-pervasive-typedef.cc: New file.
* gdb.perf/gm-pervasive-typedef.h: New file.
* gdb.perf/gm-std.cc: New file.
* gdb.perf/gm-std.h: New file.
* gdb.perf/gm-use-cerr.cc: New file.
* gdb.perf/gm-utils.h: New file.
* gdb.perf/gmonster-null-lookup.py: New file.
* gdb.perf/gmonster-pervasive-typedef.py: New file.
* gdb.perf/gmonster-print-cerr.py: New file.
* gdb.perf/gmonster-ptype-string.py: New file.
* gdb.perf/gmonster-runto-main.py: New file.
* gdb.perf/gmonster-select-file.py: New file.
* gdb.perf/gmonster1-null-lookup.exp: New file.
* gdb.perf/gmonster1-pervasive-typedef.exp: New file.
* gdb.perf/gmonster1-print-cerr.exp: New file.
* gdb.perf/gmonster1-ptype-string.exp: New file.
* gdb.perf/gmonster1-runto-main.exp: New file.
* gdb.perf/gmonster1-select-file.exp: New file.
* gdb.perf/gmonster1.cc: New file.
* gdb.perf/gmonster1.exp: New file.
* gdb.perf/gmonster2-null-lookup.exp: New file.
* gdb.perf/gmonster2-pervasive-typedef.exp: New file.
* gdb.perf/gmonster2-print-cerr.exp: New file.
* gdb.perf/gmonster2-ptype-string.exp: New file.
* gdb.perf/gmonster2-runto-main.exp: New file.
* gdb.perf/gmonster2-select-file.exp: New file.
* gdb.perf/gmonster2.cc: New file.
* gdb.perf/gmonster2.exp: New file.
single-step.exp takes a while to run, and while that's not necessarily
bad, here it's because the default value of SINGLE_STEP_COUNT is 10,000.
We're not going to gain any more insight into perf issues
single-stepping (stepi) 10,000 times over 1,000 times,
so this patch changes the default to 1,000.
gdb/testsuite/ChangeLog:
* gdb.perf/single-step.exp (SINGLE_STEP_COUNT): Change to 1000 from
10000.
gdb/testsuite/ChangeLog:
* Makefile.in (workers/%.worker, build-perf): New rule.
(GDB_PERFTEST_MODE): New variable.
(check-perf): Use it.
(clean): Clean up gdb.perf parallel build subdirs.
* lib/build-piece.exp: New file.
* lib/gdb.exp (make_gdb_parallel_path): New function
(standard_output_file, standard_temp_file): Call it.
(GDB_PARALLEL handling): Make outputs,temp,cache directories as subdirs
of $GDB_PARALLEL.
* lib/cache.exp (gdb_do_cache): Call make_gdb_parallel_path.
This patch does two things.
1) Add support for multiple data points.
2) Move the "report" output from perftest.log to perftest.sum.
I want to record the raw data somewhere, and a bit of statistical analysis
(standard deviation left for another day), but I also don't want
it to clutter up the basic report.
This patch takes a cue from gdb.{sum,log} and does the same thing
with perftest.{sum,log}.
Ultimately, we'll probably want to emit raw data to csv files or some
such and then do post-processing passes on that.
gdb/testsuite/ChangeLog:
* lib/perftest/reporter.py (SUM_FILE_NAME): New global.
(LOG_FILE_NAME): New global.
(TextReporter.__init__): Initialize self.txt_sum.
(TextReporter.report): Add support for multiple data-points.
Move report to perftest.sum, put raw data in perftest.log.
(TextReporter.start): Open sum and log files.
(TextReporter.end): Close sum and log files.
* lib/perftest/testresult.py (SingleStatisticTestResult.record): Handle
multiple data-points.
The buildbots show that attach-many-short-lived-thread.exp is racy.
But after staring at debug logs and playing with SystemTap scripts for
a (long) while, I figured out that neither GDB, nor the kernel nor the
test's program itself are at fault.
The problem is simply that the testsuite machinery is currently
subject to PID-reuse races. The attach-many-short-lived-threads.c
test program just happens to be much more susceptible to trigger this
race because threads and processes share the same number space on
Linux, and the test spawns many many short lived threads in
succession, thus enlarging the race window a lot.
Part of the problem is that several tests spawn processes with "exec&"
(in order to test the "attach" command) , and then at the end of the
test, to make sure things are cleaned up, issue a 'remote_spawn "kill
-p $testpid"'. Since with tcl's "exec&", tcl itself is responsible
for reaping the process's exit status, when we go kill the process,
testpid may have already exited _and_ its status may have (and often
has) been reaped already. Thus it can happen that another process
meanwhile reuses $testpid, and that "kill" command kills the wrong
process... Frequently, that happens to be
attach-many-short-lived-thread, but this explains other test's races
as well.
In the attach-many-short-lived-threads test, it sometimes manifests
like this:
(gdb) file /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads
Reading symbols from /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads...done.
(gdb) Loaded /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads into /home/pedro/gdb/mygit/build/gdb/testsuite/../../gdb/gdb
attach 5940
Attaching to program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads, process 5940
warning: process 5940 is a zombie - the process has already terminated
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ptrace: Operation not permitted.
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: attach
info threads
No threads.
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: no new threads
set breakpoint always-inserted on
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: set breakpoint always-inserted on
Other times the process dies while the test is ongoing (the process is
ptrace-stopped):
(gdb) print again = 1
Cannot access memory at address 0x6020cc
(gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 2: reset timer in the inferior
(Recall that on Linux, SIGKILL is not interceptable)
And other times it dies just while we're detaching:
$4 = 319
(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 2: print seconds_left
detach
Can't detach Thread 0x7fb13b7de700 (LWP 1842): No such process
(gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 2: detach
GDB mishandles the latter (it should ignore ESRCH while detaching just
like when continuing), but that's another story.
The fix here is to change spawn_wait_for_attach to use Expect's
'spawn' command instead of Tcl's 'exec&' to spawn programs, because
with spawn we control when to wait for/reap the process. That allows
killing the process by PID without being subject to pid-reuse races,
because even if the process is already dead, the kernel won't reuse
the process's PID until the zombie is reaped.
The other part of the problem lies in DejaGnu itself, unfortunately.
I have occasionally seen tests (attach-many-short-lived-threads
included, but not only that one) die with a random inexplicable
SIGTERM too, and that too is caused by the same reason, except that in
that case, the rogue SIGTERM is sent from this bit in DejaGnu's remote.exp:
exec sh -c "exec > /dev/null 2>&1 && (kill -2 $pgid || kill -2 $pid) && sleep 5 && (kill $pgid || kill $pid) && sleep 5 && (kill -9 $pgid || kill -9 $pid) &"
...
catch "wait -i $shell_id"
Even if the program exits promptly, that whole cascade of kills
carries on in the background, thus potentially killing the poor
process that manages to reuse $pid...
I sent a fix for that to the DejaGnu list:
http://lists.gnu.org/archive/html/dejagnu/2015-07/msg00000.html
With both patches in place, I haven't seen
attach-many-short-lived-threads.exp fail again.
Tested on x86_64 Fedora 20, native, gdbserver and extended-gdbserver.
gdb/testsuite/ChangeLog:
2015-07-31 Pedro Alves <palves@redhat.com>
* gdb.base/attach-pie-misread.exp: Rename $res to $test_spawn_id.
Use spawn_id_get_pid. Wait for spawn id after eof. Use
kill_wait_spawned_process instead of explicit "kill -9".
* gdb.base/attach-pie-noexec.exp: Adjust to spawn_wait_for_attach
returning a spawn id instead of a pid. Use spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/attach-twice.exp: Likewise.
* gdb.base/attach.exp: Likewise.
(do_command_attach_tests): Use gdb_spawn_with_cmdline_opts and
gdb_test_multiple.
* gdb.base/solib-overlap.exp: Adjust to spawn_wait_for_attach
returning a spawn id instead of a pid. Use spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/valgrind-infcall.exp: Likewise.
* gdb.multi/multi-attach.exp: Likewise.
* gdb.python/py-prompt.exp: Likewise.
* gdb.python/py-sync-interp.exp: Likewise.
* gdb.server/ext-attach.exp: Likewise.
* gdb.threads/attach-into-signal.exp (corefunc): Use
spawn_wait_for_attach, spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.threads/attach-many-short-lived-threads.exp: Adjust to
spawn_wait_for_attach returning a spawn id instead of a pid. Use
spawn_id_get_pid and kill_wait_spawned_process.
* gdb.threads/attach-stopped.exp (corefunc): Use
spawn_wait_for_attach, spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/break-interp.exp: Rename $res to $test_spawn_id.
Use spawn_id_get_pid. Wait for spawn id after eof. Use
kill_wait_spawned_process instead of explicit "kill -9".
* lib/gdb.exp (can_spawn_for_attach): Adjust comment.
(kill_wait_spawned_process, spawn_id_get_pid): New procedures.
(spawn_wait_for_attach): Use spawn instead of exec to spawn
processes. Don't map cygwin/windows pids here. Now returns a
spawn id list.
Running gdb.threads/fork-plus-threads.exp against gdbserver in
extended-remote mode, even though the test passes, we still see broken
behavior:
(gdb) PASS: gdb.threads/fork-plus-threads.exp: set detach-on-fork off
continue &
Continuing.
(gdb) PASS: gdb.threads/fork-plus-threads.exp: continue &
[New Thread 28092.28092]
[Thread 28092.28092] #2 stopped.
[New Thread 28094.28094]
[Inferior 2 (process 28092) exited normally]
[New Thread 28094.28105]
[New Thread 28094.28109]
...
[Thread 28174.28174] #18 stopped.
[New Thread 28185.28185]
[Inferior 10 (process 28174) exited normally]
[New Thread 28185.28196]
[Thread 28185.28185] #20 stopped.
Cannot remove breakpoints because program is no longer writable.
Further execution is probably impossible.
[Inferior 11 (process 28185) exited normally]
[Inferior 1 (process 28091) exited normally]
PASS: gdb.threads/fork-plus-threads.exp: reached breakpoint
info threads
No threads.
(gdb) PASS: gdb.threads/fork-plus-threads.exp: no threads left
info inferiors
Num Description Executable
* 1 <null> /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/fork-plus-threads
(gdb) PASS: gdb.threads/fork-plus-threads.exp: only inferior 1 left
All the "[Thread FOO] #NN stopped." above are bogus, as well as the
"Cannot remove breakpoints because program is no longer writable.",
which is a consequence.
The problem is that when we intercept a fork event, we should report
the event for the parent, only, and leave the child stopped, but not
report its stop event. GDB later decides whether to follow the parent
or the child. But because handle_extended_wait does not set the
child's last_status.kind to TARGET_WAITKIND_STOPPED, a
stop_all_threads/unstop_all_lwps sequence (e.g., from trying to access
memory) by mistake ends up queueing a SIGSTOP on the child, resuming
it, and then when that SIGSTOP is intercepted, because the LWP has
last_resume_kind set to resume_stop, gdbserver reports the stop to
GDB, as GDB_SIGNAL_0:
...
>>>> entering unstop_all_lwps
unstopping all lwps
proceed_one_lwp: lwp 1600
client wants LWP to remain 1600 stopped
proceed_one_lwp: lwp 1828
Client wants LWP 1828 to stop. Making sure it has a SIGSTOP pending
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sending sigstop to lwp 1828
pc is 0x3615ebc7cc
Resuming lwp 1828 (continue, signal 0, stop expected)
continue from pc 0x3615ebc7cc
unstop_all_lwps done
sigchld_handler
<<<< exiting unstop_all_lwps
handling possible target event
>>>> entering linux_wait_1
linux_wait_1: [<all threads>]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x1): status(137f), 1828
LWFE: waitpid(-1, ...) returned 1828, ERRNO-OK
LLW: waitpid 1828 received Stopped (signal) (stopped)
pc is 0x3615ebc7cc
Expected stop.
LLW: resume_stop SIGSTOP caught for LWP 1828.1828.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
linux_wait_1 ret = LWP 1828.1828, 1, 0
<<<< exiting linux_wait_1
Writing resume reply for LWP 1828.1828:1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Tested on x86_64 Fedora 20, extended-remote.
gdb/gdbserver/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
* linux-low.c (handle_extended_wait): Set the child's last
reported status to TARGET_WAITKIND_STOPPED.
The new gdb.threads/fork-plus-threads.exp test exposes one more
problem. When one types "info inferiors" after running the program,
one see's a couple inferior left still, while there should only be
inferior #1 left. E.g.:
(gdb) info inferiors
Num Description Executable
4 process 8393 /home/pedro/bugs/src/test
2 process 8388 /home/pedro/bugs/src/test
* 1 <null> /home/pedro/bugs/src/test
(gdb) info threads
Calling prune_inferiors() manually at this point (from a top gdb) does
not remove them, because they still have inf->pid != 0 (while they
shouldn't). This suggests that we never mourned those inferiors.
Enabling logs (master + previous patch) we see:
...
WL: waitpid Thread 0x7ffff7fc2740 (LWP 9513) received Trace/breakpoint trap (stopped)
WL: Handling extended status 0x03057f
LHEW: Got clone event from LWP 9513, new child is LWP 9579
[New Thread 0x7ffff37b8700 (LWP 9579)]
WL: waitpid Thread 0x7ffff7fc2740 (LWP 9508) received 0 (exited)
WL: Thread 0x7ffff7fc2740 (LWP 9508) exited.
^^^^^^^^
[Thread 0x7ffff7fc2740 (LWP 9508) exited]
WL: waitpid Thread 0x7ffff7fc2740 (LWP 9499) received 0 (exited)
WL: Thread 0x7ffff7fc2740 (LWP 9499) exited.
[Thread 0x7ffff7fc2740 (LWP 9499) exited]
RSRL: resuming stopped-resumed LWP Thread 0x7ffff37b8700 (LWP 9579) at 0x3615ef4ce1: step=0
...
(gdb) info inferiors
Num Description Executable
5 process 9508 /home/pedro/bugs/src/test
^^^^
4 process 9503 /home/pedro/bugs/src/test
3 process 9500 /home/pedro/bugs/src/test
2 process 9499 /home/pedro/bugs/src/test
* 1 <null> /home/pedro/bugs/src/test
(gdb)
...
Note the "Thread 0x7ffff7fc2740 (LWP 9508) exited." line.
That's this in wait_lwp:
/* Check if the thread has exited. */
if (WIFEXITED (status) || WIFSIGNALED (status))
{
thread_dead = 1;
if (debug_linux_nat)
fprintf_unfiltered (gdb_stdlog, "WL: %s exited.\n",
target_pid_to_str (lp->ptid));
}
}
That was the leader thread reporting an exit, meaning the whole
process is gone. So the problem is that this code doesn't understand
that an WIFEXITED status of the leader LWP should be reported to
infrun as process exit.
gdb/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
PR threads/18600
* linux-nat.c (wait_lwp): Report to the core when thread group
leader exits.
gdb/testsuite/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
PR threads/18600
* gdb.threads/fork-plus-threads.exp: Test that "info inferiors"
only shows inferior 1.
When a program forks and another process start threads while gdb is
handling the fork event, newly created threads are left stuck stopped
by gdb, even though gdb presents them as "running", to the user.
This can be seen with the test added by this patch. The test has the
inferior fork a certain number of times and waits for all children to
exit. Each fork child spawns a number of threads that do nothing and
joins them immediately. Normally, the program should run unimpeded
(from the point of view of the user) and exit very quickly. Without
this fix, it doesn't because of some threads left stopped by gdb, so
inferior 1 never exits.
The program triggers when a new clone thread is found while inside the
linux_stop_and_wait_all_lwps call in linux-thread-db.c:
linux_stop_and_wait_all_lwps ();
ALL_LWPS (lp)
if (ptid_get_pid (lp->ptid) == pid)
thread_from_lwp (lp->ptid);
linux_unstop_all_lwps ();
Within linux_stop_and_wait_all_lwps, we reach
linux_handle_extended_wait with the "stopping" parameter set to 1, and
because of that we don't mark the new lwp as resumed. As consequence,
the subsequent resume_stopped_resumed_lwps, called from
linux_unstop_all_lwps, never resumes the new LWP.
There's lots of cruft in linux_handle_extended_wait that no longer
makes sense. On systems with CLONE events support, we don't rely on
libthread_db for thread listing anymore, so the code that preserves
stop_requested and the handling of last_resume_kind is all dead.
So the fix is to remove all that, and simply always mark the new LWP
as resumed, so that resume_stopped_resumed_lwps re-resumes it.
gdb/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
Simon Marchi <simon.marchi@ericsson.com>
PR threads/18600
* linux-nat.c (linux_handle_extended_wait): On CLONE event, always
mark the new thread as resumed. Remove STOPPING parameter.
(wait_lwp): Adjust call to linux_handle_extended_wait.
(linux_nat_filter_event): Adjust call to
linux_handle_extended_wait.
(resume_stopped_resumed_lwps): Add debug output.
gdb/testsuite/ChangeLog:
2015-07-30 Simon Marchi <simon.marchi@ericsson.com>
Pedro Alves <palves@redhat.com>
PR threads/18600
* gdb.threads/fork-plus-threads.c: New file.
* gdb.threads/fork-plus-threads.exp: New file.
Just a slight cleanup. Committed as obvious.
gdb/testsuite/ChangeLog:
* gdb.base/batch-preserve-term-settings.exp
(test_terminal_settings_preserved_after_cli_exit): Use
send_quit_command.
Tested on x86_64 Debian Stretch, native, gdbserver and
extended-gdbserver. Also tested that the various error paths, like if
$PPID is empty or if SIGTERM did not not kill GDB, function correctly.
gdb/testsuite/ChangeLog:
* gdb.base/batch-preserve-term-settings.exp (send_quit_command):
New proc.
(test_terminal_settings_preserved_after_sigterm): New test.
Now that we can expect inferior output with the gdbserver boards, this
is all it takes to have the test pass against extended-remote
gdbserver.
Don Breazeal originally wrong something like this:
https://sourceware.org/ml/gdb-patches/2015-03/msg00506.html
which was what originally inspired the introduction of
$inferior_spawn_id.
gdb/testsuite/ChangeLog:
2015-07-29 Pedro Alves <palves@redhat.com>
Don Breazeal <donb@codesourcery.com>
* gdb.base/multi-forks.exp (continue_to_exit_bp_loc): Expect
output from both inferior_spawn_id and gdb_spawn_id.
Hi,
While examining BuildBot's logs, I noticed:
<https://sourceware.org/ml/gdb-testers/2015-q3/msg03767.html>
gdb.threads/attach-into-signal.exp has two nested loops and don't use
unique messages. This commit fixes that. Pushed under the obvious
rule.
gdb/testsuite/ChangeLog:
2015-07-29 Sergio Durigan Junior <sergiodj@redhat.com>
* gdb.threads/attach-into-signal.exp (corefunc): Use
with_test_prefix on nested loops, uniquefying the test messages.
My last commit d60a92216e introduced a
regression caused by a typo. This fixes it. Checked in as obvious.
Thanks to Pedro for reporting.
gdb/testsuite/ChangeLog:
2015-07-29 Sergio Durigan Junior <sergiodj@redhat.com>
* gdb.python/py-objfile.exp: Fix typo that snuck in from my last
commit.