If an MI client creates a varobj and attempts to update the root
/before/ the inferior is started, gdb will throw an internal error:
(gdb)
-var-create * - batch_flag
^done,name="var1",numchild="0",value="0",type="int",has_more="0"
(gdb)
-var-update var1
^done,changelist=[]
(gdb)
-var-update *
~"../../src/gdb/thread.c:628: internal-error: is_thread_state: Assertion `tp' failed.\nA problem internal to GDB has been detected,\nfurther debugging may prove unreliable.\nQuit this debugging session? "
~"(y or n) "
The function that handles the varobj update in the failing case,
mi_cmd_var_udpate_iter, checks if the thread/inferior is stopped before
attempting to update the varobj. It calls is_stopped (inferior_ptid)
which calls is_thread_state:
tp = find_thread_ptid (ptid);
gdb_assert (tp);
When there is no inferior, ptid is null_ptid, and find_thread_ptid (null_ptid)
returns NULL and the assertion is triggered.
This patch changes mi_cmd_var_update_iter to behave the same way
"-var-update var1" does: by calling the thread "stopped" if
there is no inferior (and thereby calling varobj_update_one).
ChangeLog
2014-06-16 Keith Seitz <keiths@redhat.com>
PR mi/15863
* mi/mi-cmd-var.c (mi_cmd_var_update_iter): Do not attempt
to update the varobj if inferior_ptid is null_ptid.
testsuite/ChangeLog
2014-06-16 Keith Seitz <keiths@redhat.com>
PR mi/15863
* gdb.mi/mi-var-cmd.exp: Add test for -var-update before
the inferior is started.
This makes a parameter of to_info_proc const and then fixes up some
fallout, including parameters in a couple of gdbarch methods.
I could not test the procfs.c change. I verified it by inspection.
If this causes an error here, it will be trivial to fix.
2014-06-16 Tom Tromey <tromey@redhat.com>
* target.h (struct target_ops) <to_info_proc>: Make parameter
const.
(target_info_proc): Update.
* target.c (target_info_proc): Make "args" const.
* procfs.c (procfs_info_proc): Update.
* linux-tdep.c (linux_info_proc): Update.
(linux_core_info_proc_mappings): Make "args" const.
(linux_core_info_proc): Update.
* gdbarch.sh (info_proc, core_info_proc): Make "args" const.
* gdbarch.c: Rebuild.
* gdbarch.h: Rebuild.
* corelow.c (core_info_proc): Update.
a syntax error is detected in an optional operand.
* config/tc-aarch64.c (END_OF_INSN): New macro.
(parse_operands): Handle operand given and be in wrong
format when operand is optional.
* gas/aarch64/diagnostic.s: New test patterns.
* gas/aarch64/diagnostic.l: Likewise.
Combining TLS descriptors and GNU indirect functions in the same
object could lead to assertions or multiple dynamic relocations
for the same GOT slot. Fix the bookkeeping so this doesn't happen.
This allows building and make checking glibc with -mtls-dialect=gnu2.
bfd/ChangeLog:
2014-06-16 Will Newton <will.newton@linaro.org>
* elf32-arm.c (elf32_arm_allocate_plt_entry): Increment
htab->next_tls_desc_index in the non-IPLT case.
Calculate GOT offset correctly for the non-IPLT case.
(allocate_dynrelocs_for_symbol): Don't increment
htab->next_tls_desc_index here.
ld/testsuite/ChangeLog:
2014-06-16 Will Newton <will.newton@linaro.org>
* ld-arm/arm-elf.exp: Add ifunc-gdesc test.
* ld-arm/ifunc-gdesc.r: New file.
* ld-arm/ifunc-gdesc.s: Likewise.
* ld-arm/ifunc-gdesc.ver: Likewise.
Turns out there's a difference between loading the program with "gdb
PROGRAM", vs loading it with "(gdb) file PROGRAM". The latter results
in the objfile ending up with OBJF_USERLOADED set, while not with the
former. (That difference seems bogus, but still that's not the point
of this patch. We can revisit that afterwards.)
The new code that suppresses breakpoint removal errors for
add-symbol-file objects ends up being too greedy:
/* In some cases, we might not be able to remove a breakpoint in
a shared library that has already been removed, but we have
not yet processed the shlib unload event. Similarly for an
unloaded add-symbol-file object - the user might not yet have
had the chance to remove-symbol-file it. shlib_disabled will
be set if the library/object has already been removed, but
the breakpoint hasn't been uninserted yet, e.g., after
"nosharedlibrary" or "remove-symbol-file" with breakpoints
always-inserted mode. */
if (val
&& (bl->loc_type == bp_loc_software_breakpoint
&& (bl->shlib_disabled
|| solib_name_from_address (bl->pspace, bl->address)
|| userloaded_objfile_contains_address_p (bl->pspace,
bl->address))))
val = 0;
as it turns out that OBJF_USERLOADED can be set for objfiles loaded by
some other means not add-symbol-file. In this case, symbol-file (or
"file", which is really just "exec-file"+"symbol-file").
Recall that add-symbol-file is documented as:
(gdb) help add-symbol-file
Load symbols from FILE, assuming FILE has been dynamically loaded.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
And it's the "dynamically loaded" aspect that the breakpoint.c code
cares about. So make add-symbol-file set OBJF_SHARED on its objfiles
too, and tweak the breakpoint.c code to look for OBJF_SHARED instead
of OBJF_USERLOADED.
This restores back the missing breakpoint removal warning when we let
sss-bp-on-user-bp-2.exp run on native GNU/Linux
(https://sourceware.org/ml/gdb-patches/2014-06/msg00335.html):
(gdb) PASS: gdb.base/sss-bp-on-user-bp-2.exp: define stepi_del_break
stepi_del_break
warning: Error removing breakpoint 3
(gdb) FAIL: gdb.base/sss-bp-on-user-bp-2.exp: stepi_del_break
I say "restores" because this was GDB's behavior in 7.7 and earlier.
And, likewise, "file" with no arguments only started turning
breakpoints set in the main executable to "<pending>" with the
remote-symbol-file patch (63644780). The old behavior is now
restored, and we break-unload-file.exp test now exercizes both "gdb;
file PROGRAM" and "gdb PROGRAM".
gdb/
2014-06-16 Pedro Alves <palves@redhat.com>
* breakpoint.c (insert_bp_location, remove_breakpoint_1): Adjust.
(disable_breakpoints_in_freed_objfile): Skip objfiles that don't
have OBJF_SHARED set.
* objfiles.c (userloaded_objfile_contains_address_p): Rename to...
(shared_objfile_contains_address_p): ... this. Check OBJF_SHARED
instead of OBJF_USERLOADED.
* objfiles.h (OBJF_SHARED): Update comment.
(userloaded_objfile_contains_address_p): Rename to ...
(shared_objfile_contains_address_p): ... this, and update
comments.
* symfile.c (add_symbol_file_command): Also set OBJF_SHARED in the
new objfile.
(remove_symbol_file_command): Skip objfiles that don't have
OBJF_SHARED set.
gdb/testsuite/
2014-06-16 Pedro Alves <palves@redhat.com>
* gdb.base/break-main-file-remove-fail.c: New file.
* gdb.base/break-main-file-remove-fail.exp: New file.
* gdb.base/break-unload-file.exp: Use build_executable instead of
prepare_for_testing.
(test_break): New parameter "initial_load". Handle it.
(top level): Add initial_load cmdline/file axis.
minsyms.h incorrectly claims that a couple of functions call
prim_record_minimal_symbol_full with COPY_NAME=0 -- but actually they
pass 1. Passing 1 is the correct behavior, so this patch fixes the
documentation.
I'm checking this in as obvious.
2014-06-16 Tom Tromey <tromey@redhat.com>
* minsyms.h (prim_record_minimal_symbol)
(prim_record_minimal_symbol_and_info): Update comments.
and fix more nds32 dependencies.
* scripttempl/elf.sc: Edit out __rela_iplt symbol assignments from
.rel sections, and __rel_iplt from .rela sections.
* scripttempl/nds32elf.sc: Likewise.
* Makefile.am (ends32*.c): Depend on nds32elf.sc.
* Makefile.in: Regenerate.
This is to fix unitialised memory access when printing listings.
Many targets don't initialise parts of insn frags or data frags that
have fixups, relying on md_apply_fix to finalise the frag. Which is
fine normally, but means we need to run write_object_file after
errors, for listings. Otherwise MALLOC_PERTURB_=1 causes errors like:
x86_64-linux +FAIL: i386 mpx-inval-1
x86_64-linux +FAIL: i386 inval-equ-1
x86_64-linux +FAIL: i386 x86-64-mpx-inval-1
Running write_object_file after errors requires some tweaking to the
testsuite, since we then get extra errors reported from md_apply_fix.
gas/
* write.h (subsegs_finish): Delete declaration.
* write.c (subsegs_finish): Make static.
(write_object_file): Call subsegs_finish from here. Don't print
warning and error count here..
* as.c (main): ..do so here instead. Remove dead code for "no
object file generated". Split out count strings to better support
internationalisation. Don't call subsegs_finish. Tidy setting of
"keep_it". Run write_object_file even after errors.
(keep_it): Make static.
* config/obj-elf.c (elf_frob_symbol): Remove assert.
(elf_frob_file_before_adjust): Likewise.
gas/testsuite/
* gas/elf/bad-group.s: Use %function.
* gas/elf/bad-group.err: Expect correct line number. Allow
other errors.
* gas/elf/bad-size.err: Allow other errors. Match expected
error somewhat more rigorously.
* gas/i386/reloc32.l: Allow other errors.
* gas/i386/mpx-inval-1.l: Match applied relocs.
* gas/i386/x86-64-mpx-inval-1.l: Likewise, and nop padding.
* gas/i386/x86-64-mpx-inval-2.l: Match nop padding, and allow
other errors.
* gas/macros/dot.s: Use .balign.
* gas/macros/dot.l: Update alignment output.
* gas/symver/symver6.l: Allow other errors.
In particular the_insn.reloc must be initialised, otherwise the early
exit cases for bad opcodes will result in cascading errors if
write_object_file is called after an error.
* config/tc-dlx.c (machine_ip): Move initialisation of the_insn
earlier.
MALLOC_PERTURB_=1 results in "FAIL: c54x macros".
* config/tc-tic54x.c (tic54x_mlib): Don't write garbage past
end of archive to temp file.
(tic54x_start_line_hook): Start scan for parallel on next line,
not one char into next line (which may overrun the buffer).
MALLOC_PERTURB_=1 results in "FAIL: VAX ELF relocations", due to object
file being emitted with uninitialised fields. Since these fields had
RELA relocs the field value won't be used at final link time, so the
problem is only seen in relocatable object files.
This rewrite of md_apply_fix clears all fields with relocs, whereas
before some fields had non-zero values.
gas/
* config/tc-vax.c (md_apply_fix): Rewrite.
(tc_gen_reloc, vax_cons, vax_cons_fix_new): Style: Use NO_RELOC
define rather than the equivalent BFD_RELOC_NONE.
gas/testsuite/
* gas/vax/elf-rel.d: Update.
Cures these failures with MALLOC_PERTURB_=1
FAIL: GOT test (executable)
FAIL: GOT test (shared library)
FAIL: VAX export class call relocation test
FAIL: VAX export class data relocation test
* elf32-vax.c (elf_vax_size_dynamic_sections): Clear linker
created sections.
MALLOC_PERTURB_=1 results in "FAIL: PIC" on arm-vxworks, due to garbage
in words with got relocs.
* config/tc-arm.c (s_arm_elf_cons): Initialise after frag_more.
(md_apply_fix): Delete now unnecessary zeroing for BFD_RELOC_ARM_GOT*
and BFD_RELOC_ARM_TLS* relocs. Simplify BFD_RELOC_8 case.
gas/
* config/tc-cris.c (md_create_long_jump): Follow "short" jump
with a nop rather than leaving uninitialised.
gas/testsuite/
* gas/cris/rd-bkw4v32.d: Update.
Currently there are many calls to help_list that pass the constant -1
as the "class" value. However, the parameter is declared as being of
type enum command_class, and uses of the constant violate this
abstraction.
This patch fixes the error everywhere it occurs in the gdb sources.
Tested by rebuilding.
2014-06-13 Tom Tromey <tromey@redhat.com>
* cp-support.c (maint_cplus_command): Pass all_commands, not -1,
to help_list.
* guile/guile.c (info_guile_command): Pass all_commands, not -1,
to help_list.
* tui/tui-win.c (tui_command): Pass all_commands, not -1, to
help_list.
* tui/tui-regs.c (tui_reg_command): Pass all_commands, not -1, to
help_list.Pass all_commands, not -1, to help_list.
* cli/cli-dump.c (dump_command, append_command)
(srec_dump_command, ihex_dump_command, tekhex_dump_command)
(binary_dump_command, binary_append_command): Pass all_commands,
not -1, to help_list.
* cli/cli-cmds.c (info_command, set_debug): Pass all_commands, not
-1, to help_list.
* valprint.c (set_print, set_print_raw): Pass all_commands, not
-1, to help_list.
* typeprint.c (set_print_type): Pass all_commands, not -1, to
help_list.
* top.c (set_history): Pass all_commands, not -1, to help_list.
* target-descriptions.c (set_tdesc_cmd, unset_tdesc_cmd): Pass
all_commands, not -1, to help_list.
* symfile.c (overlay_command): Pass all_commands, not -1, to
help_list.
* spu-tdep.c (info_spu_command): Pass all_commands, not -1, to
help_list.
* serial.c (serial_set_cmd): Pass all_commands, not -1, to
help_list.
* ser-tcp.c (set_tcp_cmd, show_tcp_cmd): Pass all_commands, not
-1, to help_list.
* remote.c (remote_command, set_remote_cmd): Pass all_commands,
not -1, to help_list.
* ravenscar-thread.c (set_ravenscar_command): Pass all_commands,
not -1, to help_list.
* maint.c (maintenance_command, maintenance_info_command)
(maintenance_print_command, maintenance_set_cmd): Pass
all_commands, not -1, to help_list.
* macrocmd.c (macro_command): Pass all_commands, not -1, to
help_list.
* language.c (set_check): Pass all_commands, not -1, to help_list.
* infcmd.c (unset_command): Pass all_commands, not -1, to
help_list.
* frame.c (set_backtrace_cmd): Pass all_commands, not -1, to
help_list.
* dwarf2read.c (set_dwarf2_cmd): Pass all_commands, not -1, to
help_list.
* dcache.c (set_dcache_command): Pass all_commands, not -1, to
help_list.
* breakpoint.c (save_command): Pass all_commands, not -1, to
help_list.
* ada-lang.c (maint_set_ada_cmd, set_ada_command): Pass
all_commands, not -1, to help_list.
As shown by the bug report, GDB crashes when the remote target was unable to
write to a register (the program counter) with the 'P' packet. This was reported
for AVR but can be reproduced on any architecture with a gdbserver that fails to
handle a 'P' packet.
Issue
=====
This GDB session was done with a custom gdbserver patched to send an error
packet when trying to set the program counter with a 'P' packet:
~~~
(gdb) file Debug/ATMega2560-simple-program.elf
Reading symbols from Debug/ATMega2560-simple-program.elf...done.
(gdb) target remote :51000
Remote debugging using :51000
0x00000000 in __vectors ()
(gdb) load
Loading section .text, size 0x1fc lma 0x0
Start address 0x0, load size 508
Transfer rate: 248 KB/sec, 169 bytes/write.
(gdb) b main
Breakpoint 1 at 0x164: file .././ATMega2560-simple-program.c, line 39.
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
main () at .././ATMega2560-simple-program.c:42
42 DDRD |= LED0_MASK;// | LED1_MASK;
(gdb) info line 43
Line 43 of ".././ATMega2560-simple-program.c" is at address 0x178 <main+40> but contains no code.
(gdb) set $pc=0x178
Could not write register "PC2"; remote failure reply 'E00'
(gdb) info registers pc
pc 0x178 0x178 <main+40>
(gdb) s
../../unisrc-mainline/gdb/infrun.c:1978: internal-error: resume: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
../../unisrc-mainline/gdb/infrun.c:1978: internal-error: resume: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n)
~~~
We can see that even though GDB reports that writing to the register failed, the
register cache was updated:
~~~
(gdb) set $pc=0x178
Could not write register "PC2"; remote failure reply 'E00'
(gdb) info registers pc
pc 0x178 0x178 <main+40>
~~~
The root of the problem is of course in the gdbserver but I thought GDB should
keep a register cache consistent with the hardware even in case of a failure.
Changes
=======
This patch adds routines to add a regcache_invalidate cleanup to the current
chain.
We can then register one before calling target_store_registers. This way if the
target throws an error, the register we wanted to write to will be invalidated
in cache. If target_store_registers succeeds, we can discard the new cleanup.
2014-06-12 Pierre Langlois <pierre.langlois@embecosm.com>
* regcache.c (struct register_to_invalidate): New structure.
(do_register_invalidate, make_cleanup_regcache_invalidate): New
functions.
(regcache_raw_write): Call make_cleanup_regcache_invalidate.
gdbserver defines freeargv, but it is now trivial to just use the one
in libiberty.
2014-06-12 Tom Tromey <tromey@redhat.com>
* utils.c (freeargv): Remove.
This builds a libiberty just for gdbserver and arranges for gdbserver
to use it. I've tripped across the lack of libiberty in gdbserver at
least once, and I have seen other threads where it would have been
useful.
2014-06-12 Tom Tromey <tromey@redhat.com>
* debug.c (debug_printf): Remove HAVE_GETTIMEOFDAY checks.
* server.c (monitor_show_help): Remove HAVE_GETTIMEOFDAY check.
(parse_debug_format_options): Likewise.
(gdbserver_usage): Likewise.
* Makefile.in (LIBIBERTY_BUILDDIR, LIBIBERTY): New variables.
(SUBDIRS, REQUIRED_SUBDIRS): Add libiberty.
(gdbserver$(EXEEXT), gdbreplay$(EXEEXT)): Depend on and link
against libiberty.
($(LIBGNU)): Depend on libiberty.
(all-lib): Recurse into all subdirs.
(install-only): Invoke "install" target in subdirs.
(vasprintf.o, vsnprintf.o, safe-ctype.o, lbasename.o): Remove
targets.
* configure: Rebuild.
* configure.ac: Add ACX_CONFIGURE_DIR for libiberty. Don't check
for vasprintf, vsnprintf, or gettimeofday.
* configure.srv: Don't add safe-ctype.o or lbasename.o to
srv_tgtobj.
I noticed that a few tests in completion.exp put the directory name
into the name of the resulting test. While the directory name is
relative, this still makes for spurious differences depending on
whether the test was run in serial or parallel mode.
This patch fixes the problem. I'm checking it in.
2014-06-12 Tom Tromey <tromey@redhat.com>
* gdb.base/completion.exp: Don't use directory name in test.
gdb/
2014-06-09 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_child_follow_fork): Initialize status with
W_STOPCODE (0) instead of 0. Remove shodowing 'status' local from
inner block. Only pass the signal to PTRACE_DETACH if in pass
state.
Use varobj_is_dynamic_p more widely so that the callers of
varobj_is_dynamic_p are unchanged when we add available-children-only
stuff in varobj_is_dynamic_p.
gdb:
2014-06-12 Yao Qi <yao@codesourcery.com>
* varobj.c (varobj_get_num_children): Call
varobj_is_dynamic_p.
(varobj_list_children): Likewise.
(varobj_update): Likewise. Update comments.
We think varobj with --available-children-only behaves like a dynamic
varobj, so dyanmic varobj is not pretty-printer specific. We rename
varobj_pretty_printed_p to varobj_is_dynamic_p, so that we can handle
available-children-only checking in varobj_is_dynamic_p in the next
patch.
gdb:
2014-06-12 Yao Qi <yao@codesourcery.com>
* varobj.c (varobj_pretty_printed_p): Rename to ...
(varobj_is_dynamic_p): ... this. New function.
* varobj.h (varobj_pretty_printed_p): Remove declaration.
(varobj_is_dynamic_p): Declare.
* mi/mi-cmd-var.c (print_varobj): All callers updated.
(mi_print_value_p, varobj_update_one): Likewise.
This patch removes some unnecessary "#if HAVE_PYTHON" so that more
code is generalized.
gdb:
2014-06-12 Pedro Alves <pedro@codesourcery.com>
Yao Qi <yao@codesourcery.com>
* varobj.c: Remove "#if HAVE_PYTHON" and "#endif".
(varobj_get_iterator): Wrap up code for pretty-printer by
"#if HAVE_PYTHON" and "#endif".
(update_dynamic_varobj_children): Likewise.
In previous patch, "saved_item" is still a PyOjbect and iteration is
still performed over PyObject. This patch continues to decouple
iteration from python code, so it changes its type to "struct
varobj_item *", so that the iterator itself is independent of python.
V2:
- Call varobj_delete_iter in free_variable.
- Fix changelog entries.
- Use XNEW.
V3:
- Return NULL early in py_varobj_iter_next if gdb_python_initialized
is false.
gdb:
2014-06-12 Pedro Alves <pedro@codesourcery.com>
Yao Qi <yao@codesourcery.com>
* python/py-varobj.c (py_varobj_iter_next): Return NULL if
gdb_python_initialized is false. Move some code from varobj.c.
* varobj-iter.h (struct varobj_item): Moved from varobj.c.
* varobj.c: Move "varobj-iter.h" inclusion earlier.
(struct varobj_item): Moved to varobj-iter.h".
(varobj_clear_saved_item): New function.
(update_dynamic_varobj_children): Move python-related code to
py-varobj.c.
(free_variable): Call varobj_clear_saved_item and
varobj_iter_delete.
This patch generalizes varobj iterator, in a python-independent way.
Note varobj_item is still a typedef of PyObject, we can only focus on
API changes, and leave the data type changes to the next patch. As a
result, we include "varobj-iter.h" after the typedef of PyObject in
varobj.c, but it is an intermediate state. Finally, varobj-iter.h is
independent of PyObject.
This change is helpful to move some python-related code out of
varobj.c.
V2:
- Fix a missing cleanup.
- Fix typos.
- Use XNEW.
- Check against NULL explicitly.
- Update copyright year for new added files.
V3:
- Call PyGILState_Ensure before Py_XDECREF.
- Use CPYCHECKER_STEALS_REFERENCE_TO_ARG.
- Code indentation.
V4:
- use varobj_ensure_python_env instead of PyGILState_Ensure.
gdb:
2014-06-12 Pedro Alves <pedro@codesourcery.com>
Yao Qi <yao@codesourcery.com>
* Makefile.in (SUBDIR_PYTHON_OBS): Add "py-varobj.o".
(SUBDIR_PYTHON_SRCS): Add "python/py-varobj.c".
(HFILES_NO_SRCDIR): Add "varobj-iter.h".
(py-varobj.o): New rule.
* python/py-varobj.c: New file.
* python/python-internal.h (py_varobj_get_iterator): Declare.
* varobj-iter.h: New file.
* varobj.c: Include "varobj-iter.h"
(struct varobj) <child_iter>: Change its type from "PyObject *"
to "struct varobj_iter *".
<saved_item>: Likewise.
[HAVE_PYTHON] (varobj_ensure_python_env): Make it extern.
[HAVE_PYTHON] (varobj_get_iterator): New function.
(update_dynamic_varobj_children) [HAVE_PYTHON]: Move
python-specific code to python/py-varobj.c.
(install_visualizer): Call varobj_iter_delete instead of
Py_XDECREF.
* varobj.h (varobj_ensure_python_env): Declare.
Hi,
name and value pair is widely used in varobj.c. This patch is to add
a new struct varobj_item to represent them, so that the number of
function arguments can be reduced. Finally, the iteration is done on
'struct varobj_item' instead of PyObject after this patch series.
V2:
- Fix changelog entry.
- Fix one grammatical mistake.
gdb:
2014-06-12 Yao Qi <yao@codesourcery.com>
* varobj.c (struct varobj_item): New structure.
(create_child_with_value): Update declaration.
(varobj_add_child): Replace arguments 'name' and 'value' with
'item'. All callers updated.
(install_dynamic_child): Likewise.
(update_dynamic_varobj_children): Likewise.
(varobj_add_child): Likewise.
(create_child_with_value): Likewise.
Now that the GDB 7.8 branch has been created, we can
bump the version number.
gdb/ChangeLog:
GDB 7.8 branch created (173373c6f6):
* version.in: Bump version to 7.8.50.DATE-cvs.
A call to demangle_template might allocate storage within a temporary
string even if the call to demangle_template eventually returns
failure.
This will never cause the demangler to crash, but does leak memory, as
a result I've not added any tests for this.
Calling string_delete is safe, even if nothing is allocated into the
string, the string is initialised with string_init, so we know the
internal pointers are NULL.
libiberty/ChangeLog
* cplus-dem.c (do_type): Call string_delete even if the call to
demangle_template fails.
Since target-async was turned on by default, debugging on Windows
using GDB+GDBserver sometimes hangs while waiting for a RSP reply.
The problem is a race in the gdb_select machinery.
This is what we see for a faulty next on the GDB side:
(gdb) n
infrun: clear_proceed_status_thread (Thread 4424)
infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT, step=1)
(...)
infrun: resume (step=1, signal=GDB_SIGNAL_0), ...
Sending packet: $vCont;s:1148;c#5e...
*hang*
At this point, attaching a debugger to the hanging GDB confirms that
it is blocked, waiting for a socket event:
#6 0x757841d8 in WaitForMultipleObjects ()
from C:\Windows\syswow64\kernel32.dll
#7 0x004708e7 in gdb_select (n=469, readfds=0x88ca50 <gdb_notifier+784>,
writefds=0x88cb54 <gdb_notifier+1044>,
exceptfds=0x88cc58 <gdb_notifier+1304>, timeout=0x0)
at /[...]/gdb/mingw-hdep.c:172
#8 0x00527926 in gdb_wait_for_event (block=1)
at /[...]/gdb/event-loop.c:831
#9 0x00526ff1 in gdb_do_one_event ()
at /[...]/gdb/event-loop.c:403
However, on the GDBserver side, we see that GDBserver already sent a
T05 packet reply:
gdbserver: kernel event EXCEPTION_DEBUG_EVENT for pid=4968 tid=1148
EXCEPTION_SINGLE_STEP
Child Stopped with signal = 5
Writing resume reply for LWP 4968.4424:1
DEBUG: write_prim ($T0505:c8fe2800;04:a0fe2800;08:38164000;thread:1148;#f0)
-> 55
To recap, on Windows, 'select' only works with sockets, so we have a
wrapper, gdb_select, that uses the GDB serial abstraction to handle
sockets, consoles, pipes, and serial ports. Each serial descriptor
has a thread associated (we call those the select threads), and those
threads communicate with the main thread by means of standard Windows
events.
It basically goes like this: gdb_select first loops through all fds of
interest, calling their wait_handle hooks, which returns an event that
WaitForMultipleObjects can wait on. gdb_select then blocks in
WaitForMultipleObjects with all those event handles. The wait_handle
hook is responsible for arranging for the returned event to become set
once data is available. This is done by setting the descriptor's
helper thread running, which itself knows how to wait for data from
the type of handle it manages (sockets, pipes, consoles, files, etc.).
Once data arrives, the select thread sets the corresponding event
which unblocks WaitForMultipleObjects within gdb_select. However, the
wait_handle hook can also apply an optimization: if data is already
pending, then there's no need to set the thread running, and the
descriptors event can be set immediately. It's around this latter
aspect that lies the bug/race.
Adding some ad hoc debug logs to ser-mingw.c and mingw-hdep.c, we see
the following sequence of events, right after sending
"$vCont;s:1148;c#5e". Thread 1 is the main thread, and thread 2 is
the socket's helper/select thread. gdb_select was only passed one
descriptor to wait on, the remote target's socket.
net_windows_select_thread is the entry point of the select threads for
sockets.
#1 - thread 1: gdb_select: enter
#2 - thread 2: net_windows_select_thread: WaitForMultipleObjects blocking
gdb_select walked over the wait_handle hooks, and woke up the socket's
helper thread. The helper thread is now blocked waiting for socket
events.
#3 - thread 1: gdb_select: WaitForMultipleObjects polling (timeout=0ms)
#4 - thread 1: gdb_select: WaitForMultipleObjects returned 102 (WAIT_TIMEOUT)
There was no pending data available yet, and gdb_select was passed
timeout==0ms, and so WaitForMultipleObjects times out immediately.
#5 - thread 2: net_windows_select_thread: WaitForMultipleObjects returned 1
Just afterwards, socket data arrives, and thread 2 wakes up. Thread 2
calls WSAEnumNetworkEvents, which clears state->sock_event, and marks
the serial's read_event event, telling the main thread that data is
available.
#6 - thread 1: gdb_select: call serial_done_wait_handle on each serial
gdb_select stops all the helper/select threads.
#7 - thread 1: gdb_select: return 0 (WAIT_TIMEOUT)
gdb_select in the main thread returns to the caller.
Note that at this point, data is pending on the socket, the serial's
read_event is set, but the socket's sock_event event is not set, until
_further_ data arrives.
Now GDB does its thing and goes back to the event loop. That calls
gdb_select, but with timeout==INFINITE.
Again, gdb_select calls the socket serial's wait_handle hook. It
first clears its events, starting from a clean slate:
ResetEvent (state->base.read_event);
ResetEvent (state->base.except_event);
ResetEvent (state->base.stop_select);
That cleared read_event, which was previously set in #5 above. And
then it checks for pending events, in the sock_event event:
/* Check any pending events. This both avoids starting the thread
unnecessarily, and handles stray FD_READ events (see below). */
if (WaitForSingleObject (state->sock_event, 0) == WAIT_OBJECT_0)
{
That also fails because state->sock_event was cleared in #5 too...
So the wait_handle hook erroneously decides that it needs to start the
helper thread to wait for input:
#8 - thread 2: net_windows_select_thread: WaitForMultipleObjects blocking
#9 - thread 1: gdb_select: WaitForMultipleObjects blocking (INFINITE)
But, GDBserver already sent all it had to send, so both threads waits
forever...
At first I thought that net_windows_wait_handle shouldn't be resetting
state->base.read_event or state->base.except_event, but looking
deeper, the pipe and console wait_handle hooks reset all events too.
It actually makes sense that way -- consuming an event from different
threads is bad practice, and, we should always be able to query
pending state without looking at the state->sock_event from within
net_windows_wait_handle. The end result is much simpler, and makes
net_windows_select_thread look a lot like console_select_thread,
actually.
gdb/
2014-06-11 Pedro Alves <palves@redhat.com>
PR remote/17028
* ser-mingw.c (net_windows_socket_check_pending): New function.
(net_windows_select_thread): Ignore spurious wakeups. Use
net_windows_socket_check_pending.
(net_windows_wait_handle): Check for pending events with
ioctlsocket, through net_windows_socket_check_pending, instead of
checking the socket's event.