This commit adds a new global flag show_debug_regs to common-debug.h
to replace the flag debug_hw_points used by gdbserver and by the
Linux x86 and AArch64 ports, and to replace the flag maint_show_dr
used by the Linux MIPS port.
Note that some debug printing in the AArch64 port was enabled only if
debug_hw_points > 1 but no way to set debug_hw_points to values other
than 0 and 1 was provided; that code was effectively dead. This
commit enables all debug printing if show_debug_regs is nonzero, so
the AArch64 output will be more verbose than previously.
gdb/ChangeLog:
* common/common-debug.h (show_debug_regs): Declare.
* common/common-debug.c (show_debug_regs): Define.
* aarch64-linux-nat.c (debug_hw_points): Don't define. Replace
all uses with show_debug_regs. Replace all uses that considered
debug_hw_points as a multi-value integer with straight boolean
uses.
* x86-nat.c (debug_hw_points): Don't define. Replace all uses
with show_debug_regs.
* nat/x86-dregs.c (debug_hw_points): Don't declare. Replace
all uses with show_debug_regs.
* mips-linux-nat.c (maint_show_dr): Don't define. Replace all
uses with show_debug_regs.
gdb/gdbserver/ChangeLog:
* server.h (debug_hw_points): Don't declare.
* server.c (debug_hw_points): Don't define. Replace all uses
with show_debug_regs.
* linux-aarch64-low.c (debug_hw_points): Don't define. Replace
all uses with show_debug_regs.
This fixes two FAIL results on this testcase which were caused by a
misplaced "continue" command. This testcase used to end the inferior's
execution too soon, causing the following tests to fail. Now we break
right after the inferior's loop and perform the rest of the tests there.
gdb/testsuite/ChangeLog:
* gdb.fortran/array-element.exp: Remove unexpected "continue"
command in testcase. Simplify testcase.
Since the last change to address_from_register, it no longer supports
targets that require a special conversion (gdbarch_convert_register_p)
for plain pointer type; I had assumed no target does so.
This turned out to be incorrect: MIPS64 n32 big-endian needs such a
conversion in order to properly sign-extend pointer values.
This patch fixes this regression by handling targets that need a
special conversion in address_from_register as well.
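As an illustration only (this is a model, not the findvar.c code), the
sign extension that such a conversion has to perform on a 32-bit pointer
held in a 64-bit register can be sketched as follows. The helper name is
hypothetical.

```python
def sign_extend_32_to_64(value: int) -> int:
    """Return the low 32 bits of VALUE sign-extended to 64 bits.

    Hypothetical helper: models what a register conversion routine must
    do for a 32-bit pointer stored in a 64-bit register.
    """
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        # Bit 31 is set, so replicate it into bits 32..63.
        value |= 0xFFFFFFFF00000000
    return value
```

A pointer with bit 31 set comes out with its upper 32 bits set as well,
whereas reading the raw register contents as a plain pointer would skip
this step.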
gdb/ChangeLog:
* findvar.c (address_from_register): Handle targets requiring
a special conversion routine even for plain pointer types.
Old AIX versions required GDB to update the stack pointer register and
execute at least one instruction before accessing the space newly allocated
on the user stack. This was done using the exec_one_dummy_insn routine
in rs6000-nat.c.
However, in currently supported AIX versions (tested on AIX 6.1), this hack
is no longer necessary. In fact, removing the hack fixes several test case
failures and removes a call to deprecated_insert_raw_breakpoint.
gdb/ChangeLog:
* rs6000-nat.c (exec_one_dummy_insn): Remove.
(store_register): Do not call exec_one_dummy_insn.
Trying to print the bounds or the length of a pointer to an array
whose bounds are dynamic results in the following error:
(gdb) p foo.three_ptr.all'first
Location address is not set.
(gdb) p foo.three_ptr.all'length
Location address is not set.
This is because, after having dereferenced our array pointer, we
use the type of the resulting array value, instead of the enclosing
type. The former is the original type where the bounds are unresolved,
whereas we need to get the actual array bounds.
Similarly, trying to apply those attributes to the array pointer
directly (without explicitly dereferencing it with the '.all'
operator) yields the same kind of error:
(gdb) p foo.three_ptr'first
Location address is not set.
(gdb) p foo.three_ptr'length
Location address is not set.
This happens because the dereference is done implicitly in this case,
and performed at the type level only, which is not sufficient to
resolve the array type.
This patch fixes both issues, thus allowing us to get the expected output:
(gdb) p foo.three_ptr.all'first
$1 = 1
(gdb) p foo.three_ptr.all'length
$2 = 3
(gdb) p foo.three_ptr'first
$3 = 1
(gdb) p foo.three_ptr'length
$4 = 3
gdb/ChangeLog:
* ada-lang.c (ada_array_bound): If ARR is a TYPE_CODE_PTR,
dereference it first. Use value_enclosing_type instead of
value_type.
(ada_array_length): Likewise.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dynarr-ptr.exp: Add 'first, 'last and 'length tests.
Consider a pointer to an array with dynamic bounds, described in
DWARF as follows:
<1><25>: Abbrev Number: 4 (DW_TAG_array_type)
<26> DW_AT_name : foo__array_type
[...]
<2><3b>: Abbrev Number: 5 (DW_TAG_subrange_type)
[...]
<40> DW_AT_lower_bound : 5 byte block: 97 38 1c 94 4
(DW_OP_push_object_address; DW_OP_lit8; DW_OP_minus;
DW_OP_deref_size: 4)
<46> DW_AT_upper_bound : 5 byte block: 97 34 1c 94 4
(DW_OP_push_object_address; DW_OP_lit4; DW_OP_minus;
DW_OP_deref_size: 4)
GDB is now able to correctly print the entire array, but not one
element of the array. E.g.:
(gdb) p foo.three_ptr.all
$1 = (1, 2, 3)
(gdb) p foo.three_ptr.all(1)
Cannot access memory at address 0xfffffffff4123a0c
The problem occurs because we are missing a dynamic resolution of
the variable's array type when subscripting the array. What the current
code does is "fix"-ing the array type using the GNAT encodings, but
that operation ignores any of the array's dynamic properties.
This patch fixes the issue by using ada_value_ind to dereference
the array pointer, which takes care of the array type resolution.
It also continues to "fix" arrays described using GNAT encodings,
so backwards compatibility is preserved.
gdb/ChangeLog:
* ada-lang.c (ada_value_ptr_subscript): Remove parameter "type".
Adjust function implementation and documentation accordingly.
(ada_evaluate_subexp) <OP_FUNCALL>: Only assign "type" if
NOSIDE is EVAL_AVOID_SIDE_EFFECTS.
Update call to ada_value_ptr_subscript.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dynarr-ptr.exp: Add subscripting tests.
Consider the following declaration:
type Array_Type is array (Natural range <>) of Integer;
type Array_Ptr is access all Array_Type;
for Array_Ptr'Size use 64;
Three_Ptr : Array_Ptr := new Array_Type'(1 => 1, 2 => 2, 3 => 3);
This creates a pointer to an array where the bounds are stored
in a memory region just before the array itself (aka a "thin pointer").
In DWARF, this is described as the usual pointer type to an array
whose subrange has dynamic values for its bounds:
<1><25>: Abbrev Number: 4 (DW_TAG_array_type)
<26> DW_AT_name : foo__array_type
[...]
<2><3b>: Abbrev Number: 5 (DW_TAG_subrange_type)
[...]
<40> DW_AT_lower_bound : 5 byte block: 97 38 1c 94 4
(DW_OP_push_object_address; DW_OP_lit8; DW_OP_minus;
DW_OP_deref_size: 4)
<46> DW_AT_upper_bound : 5 byte block: 97 34 1c 94 4
(DW_OP_push_object_address; DW_OP_lit4; DW_OP_minus;
DW_OP_deref_size: 4)
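To make the two location expressions above concrete, here is a minimal
sketch of what they compute: the lower bound is the 4-byte value stored
8 bytes before the array's address, and the upper bound is the 4-byte
value stored 4 bytes before it. The memory image, the little-endian byte
order and the function name are assumptions made for the illustration.

```python
import struct

def read_thin_pointer_bounds(memory: bytes, array_addr: int):
    """Evaluate the two bound expressions against a fake memory image.

    DW_OP_push_object_address; DW_OP_lit8; DW_OP_minus; DW_OP_deref_size 4
      -> unsigned 4-byte read at array_addr - 8 (lower bound)
    DW_OP_push_object_address; DW_OP_lit4; DW_OP_minus; DW_OP_deref_size 4
      -> unsigned 4-byte read at array_addr - 4 (upper bound)
    Little-endian layout is an assumption of this sketch.
    """
    (lower,) = struct.unpack_from("<I", memory, array_addr - 8)
    (upper,) = struct.unpack_from("<I", memory, array_addr - 4)
    return lower, upper

# Fake image: bounds 1 and 3, immediately followed by elements 1, 2, 3.
memory = struct.pack("<5I", 1, 3, 1, 2, 3)
array_addr = 8  # the array data starts right after the two bounds
bounds = read_thin_pointer_bounds(memory, array_addr)
```

Because both expressions start from the object's address, the bounds can
only be computed once the address of the actual array object is known.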
GDB is currently printing the value of the array incorrectly:
(gdb) p foo.three_ptr.all
$1 = (26629472 => 1, 2,
value.c:819: internal-error: value_contents_bits_eq: [...]
The dereferencing (".all" operator) is done by calling ada_value_ind,
which itself calls value_ind. It first produces a new value where
the bounds of the array were correctly resolved to their actual value,
but then calls readjust_indirect_value_type which replaces the resolved
type by the original type.
The problem starts when ada_value_print does not take this situation
into account, and starts using the type of the resulting value, which
has unresolved array bounds, instead of using the value's enclosing
type.
After fixing this issue, the debugger now correctly prints:
(gdb) p foo.three_ptr.all
$1 = (1, 2, 3)
gdb/ChangeLog:
* ada-valprint.c (ada_value_print): Use VAL's enclosing type
instead of VAL's type.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dynarr-ptr.c: New file.
* gdb.dwarf2/dynarr-ptr.exp: New file.
gdb/ChangeLog:
* acinclude.m4 (GDB_GUILE_PROGRAM_NAMES): Pass guile version as
last parameter to pkg-config, not first.
* configure.ac: Pass --with-guile provided pkg-config path to
GDB_GUILE_PROGRAM_NAMES.
* configure: Regenerate.
There are `.MIPS.abiflags', `.MIPS.options' and `.MIPS.stubs' sections
also present in Linux executables, so we can't infer IRIX OS ABI solely
from the existence of these sections. This is not going to be a problem
as there are bound to be other sections whose names start with `.MIPS.'
in IRIX executables and this selection only matters for a non-default OS
ABI in a multiple-target GDB executable. As a last resort the automatic
selection can be overridden with `set osabi'.
* mips-irix-tdep.c (mips_irix_elf_osabi_sniff_abi_tag_sections):
Exclude `.MIPS.abiflags', `.MIPS.options' and `.MIPS.stubs' from
the list of sections determining GDB_OSABI_IRIX.
Similarly to the previous changes to gdb.reverse/sigall-reverse.exp and
gdb.reverse/until-precsave.exp this corrects the timeout tweak in
gdb.base/watchpoint-solib.exp.
This test case executes a large amount of code with a software watchpoint
enabled. This means single-stepping all the way through and takes a lot
of time, e.g. for an ARMv7 Panda board and a `-march=armv5te' multilib:
PASS: gdb.base/watchpoint-solib.exp: continue to foo again
elapsed: 714
for the same board and a `-mthumb -march=armv5te' multilib:
PASS: gdb.base/watchpoint-solib.exp: continue to foo again
elapsed: 1275
and for QEMU in the system emulation mode and a `-march=armv4t'
multilib:
PASS: gdb.base/watchpoint-solib.exp: continue to foo again
elapsed: 115
(values in seconds), all of them with the default timeout of 60s,
set based on the requirement of the remaining test cases (other than
gdb.reverse ones).
Here again, for the timeout extension to be meaningful it should be
calculated by scaling rather than by using an arbitrary constant, and a
larger factor of 30 will do, leaving some margin. Hopefully that works
for everyone; otherwise we'll probably have to come up with a smarter
solution.
OTOH the other test cases in this script do not require the extension so
they can be moved outside its umbrella so as to avoid unnecessary delays
if something goes wrong and a genuine timeout triggers.
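The difference between scaling and adding a constant can be sketched as
below. This is a model in Python rather than the Tcl used by the test
script, and it assumes that a board-provided `gdb,timeout' value is used
as the base when it is higher than the global default; the function
names are hypothetical.

```python
def additive_timeout(base: int, extra: int = 120) -> int:
    """Old style: a fixed number of seconds on top of the base timeout."""
    return base + extra

def scaled_timeout(global_timeout: int, board_timeout=None, factor: int = 30) -> int:
    """New style: multiply the effective base timeout by a factor.

    Assumption of this sketch: the board setting, when present and
    higher, replaces the global default as the base.
    """
    base = global_timeout
    if board_timeout is not None and board_timeout > base:
        base = board_timeout
    return base * factor
```

With a base of 60s the additive form yields 180s, well short of the
1275s observed above, while the scaled form yields 1800s.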
* gdb.base/watchpoint-solib.exp: Increase the timeout by a factor
of 30 rather than hardcoding 120 for a slow test case. Take the
`gdb,timeout' target setting into account for this calculation.
Don't extend the timeout for the test cases that don't need it.
There are three cases in two scripts in the gdb.reverse subset that
take a particularly long time. Two of them already attempt to address
this by extending the timeout from the default. The remaining
one has no precautions taken. The timeout extension is ineffective
though: it is done by adding a constant rather than by scaling, and as
a result, while it may work for target boards that are satisfied with
the default test timeout of 10s, it does not serve its purpose for
slower ones.
Here are indicative samples of execution times (in seconds) observed
for these cases respectively, for an ARMv7 Panda board running Linux
and a `-march=armv5te' multilib:
PASS: gdb.reverse/sigall-reverse.exp: continue to signal exit
elapsed: 385
PASS: gdb.reverse/until-precsave.exp: run to end of main
elapsed: 4440
PASS: gdb.reverse/until-precsave.exp: save process recfile
elapsed: 965
for the same board and a `-mthumb -march=armv5te' multilib:
PASS: gdb.reverse/sigall-reverse.exp: continue to signal exit
elapsed: 465
PASS: gdb.reverse/until-precsave.exp: run to end of main
elapsed: 4191
PASS: gdb.reverse/until-precsave.exp: save process recfile
elapsed: 669
and for QEMU in the system emulation mode and a `-march=armv4t'
multilib:
PASS: gdb.reverse/sigall-reverse.exp: continue to signal exit
elapsed: 45
PASS: gdb.reverse/until-precsave.exp: run to end of main
elapsed: 433
PASS: gdb.reverse/until-precsave.exp: save process recfile
elapsed: 104
Based on the performance of other tests these two test configurations
have their default timeout set to 450s and 60s respectively.
The remaining two multilibs (`-mthumb -march=armv4t' and `-mthumb
-march=armv7-a') do not produce test results usable enough to have data
available for these cases.
Based on these results I have tweaked timeouts for these cases as
follows. This, together with a suitable board timeout setting, removes
timeouts for these cases. Note that for the default timeout of 10s the
new setting for the first case in gdb.reverse/until-precsave.exp is
compatible with the old one, just a bit higher to keep the convention
of longer timeouts to remain multiples of 30s. The second case there
does not need such a high setting so I have lowered it a bit to avoid
an unnecessary delay where this test case genuinely times out.
* gdb.reverse/sigall-reverse.exp: Increase the timeout by
a factor of 2 for a slow test case. Take the `gdb,timeout'
target setting into account for this calculation.
* gdb.reverse/until-precsave.exp: Increase the timeout by
a factor of 15 and 3 respectively rather than adding 120
for a pair of slow test cases. Take the `gdb,timeout'
target setting into account for this calculation.
The recent change to introduce `gdb_reverse_timeout' turned out
ineffective for board setups that set the `gdb,timeout' target variable.
A lower `gdb,timeout' setting takes precedence and defeats the effect of
`gdb_reverse_timeout'. This is because the global timeout is overridden
in gdb_test_multiple and then again in gdb_expect.
Three timeout variables are taken into account in these two places, in
this precedence:
1. The `gdb,timeout' target variable.
2. The caller's local `timeout' variable (upvar timeout)
3. The global `timeout' variable.
This precedence is obeyed by gdb_test_multiple strictly. OTOH
gdb_expect will select the higher of the first two and will only take
the third into account if neither of the first two is present. However
the two timeout selections are conceptually the same and
gdb_test_multiple does its selection only for the purpose of passing
the result down to gdb_expect.
Therefore I decided there is no point in carrying this duplication any
further and removed the sequence from gdb_test_multiple, while
retaining the `upvar timeout' variable definition. This way gdb_expect
will still access gdb_test_multiple's caller `timeout' variable (if any)
via its own `upvar timeout' reference.
Now as to the sequence in gdb_expect. In addition to the three
variables described above it also takes a timeout argument into account,
as the fourth value to choose from. It is currently used if it is
higher than the timeout selected from the variables as described above.
With the timeout selection code from gdb_test_multiple gone, gone is
also the most prominent use of this timeout argument; it's now used in
a couple of places only, mostly within this test framework library code
itself for preparatory commands or suchlike. With this being the case
this timeout selection code can be simplified as follows:
1. Among the three timeout variables, the highest is always chosen.
This is so that a test case doesn't inadvertently lower a high value
timeout needed by slow target boards. This is what all test cases
use.
2. Any timeout argument takes precedence. This is for special cases
such as within the framework library code, e.g. it doesn't make sense
to send `set height 0' with a timeout of 7200 seconds. This is a
local command that does not interact with the target and setting a
high timeout here only risks a test suite run taking ages if it goes
astray for some reason.
3. The fallback timeout of 60s remains.
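The three rules above can be sketched as a small selection function.
This is a Python model of the logic as described, not the Tcl code in
lib/gdb.exp; the function and parameter names are hypothetical, and
`None' stands for "variable not set".

```python
FALLBACK_TIMEOUT = 60  # rule 3: used when nothing else is available

def select_timeout(board=None, caller=None, global_=None, argument=None):
    """Pick the timeout following the simplified rules.

    board    -- the `gdb,timeout' target variable
    caller   -- the caller's local `timeout' variable
    global_  -- the global `timeout' variable
    argument -- an explicit timeout argument
    """
    # Rule 2: an explicit argument always takes precedence.
    if argument is not None:
        return argument
    # Rule 1: otherwise the highest of the three variables is chosen.
    candidates = [t for t in (board, caller, global_) if t is not None]
    if candidates:
        return max(candidates)
    # Rule 3: fall back to 60s.
    return FALLBACK_TIMEOUT
```

For example, a board setting of 7200 wins over a local timeout of 10,
but an explicit argument of 10 (as for `set height 0') still overrides
the 7200.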
* lib/gdb.exp (gdb_test_multiple): Remove code to select the
timeout, don't pass one down to gdb_expect.
(gdb_expect): Rework timeout selection.
The trad_frame_set_reg_unknown declaration was added in commit
0db9b4b709 (March 2004), but apparently never defined or referenced.
gdb/ChangeLog:
* trad-frame.h (trad_frame_set_reg_unknown): Remove declaration.
As it happens we have a board that fails a gdb.base/gcore-relro.exp
test case reproducibly and moreover the case appears to trigger a
kernel bug making the board less than usable. Specifically the board
remains responsive to some extent, however processes do not appear
to be able to successfully complete termination anymore and perhaps
more importantly further gdbserver processes can be started, but they
never reach the stage of listening on the RSP socket.
This change handles timeouts in gdbserver start properly, by throwing
a TCL error exception when gdbserver does not report listening on the
RSP socket in time. This is then caught at the outer level and
reported, and 2 rather than 1 is returned so that the caller may tell
the failure to start gdbserver and other issues apart and act
accordingly (or do nothing).
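The control flow described above can be sketched as follows. This is a
Python model rather than the Tcl in lib/gdbserver-support.exp; the
exception class, the function names and the `wait_for_listening'
callback are all hypothetical stand-ins.

```python
class GdbserverStartTimeout(Exception):
    """Raised when gdbserver never reports listening on its socket."""

def start_gdbserver(wait_for_listening):
    # Inner level: raise instead of silently carrying on after a timeout.
    if not wait_for_listening():
        raise GdbserverStartTimeout("timeout waiting for gdbserver to listen")

def load_with_gdbserver(wait_for_listening) -> int:
    """Outer level: 0 on success, 2 if gdbserver failed to start.

    The value 1 is left for other kinds of failure, so a caller can
    tell the two apart.
    """
    try:
        start_gdbserver(wait_for_listening)
    except GdbserverStartTimeout as exc:
        print(f"ERROR: {exc}")  # the failure is still reported
        return 2
    return 0
```

A board configuration could check for the value 2 and, for instance,
reboot the target before the next test case.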
I thought letting the exception unwind further on might be a good idea
for any test harnesses out there to break outright where a gdbserver
start error is silently ignored right now, however I figured out the
calls to gdbserver-support.exp are buried down too deep in the GDB test
suite for such a change to be made easily. I think returning a distinct
return value is good enough (the API says "non-zero", so 2 is as good as
1) and we can always make the error harder in a later step if required.
With config/gdbserver.exp being used this change remains transparent
to the target board, the return value is passed up by gdb_reload and
the error exception unwinds through gdbserver_gdb_load and is caught
and handled by mi_gdb_target_load. A call to perror is still made,
reporting the timeout, and in the case of mi_gdb_target_load the
procedure returns a value denoting unsuccessful completion. An
unsuccessful completion of gdb_reload is already handled elsewhere.
An alternative gdbserver board configuration can interpret the return
value in its gdb_reload implementation and catch the error in
gdbserver_gdb_load in an attempt to recover a target board that has
gone astray, for example by rebooting the board somehow. This has
proved effective with our failing board, that now completes the
remaining test cases with no further hiccups.
* lib/gdbserver-support.exp (gdbserver_start): Throw an error
exception on timeout.
(gdbserver_run): Catch any `gdbserver_spawn' error exceptions.
(gdbserver_start_extended): Catch any `gdbserver_start' error
exceptions.
(gdbserver_start_multi, mi_gdbserver_start_multi): Likewise.
* lib/mi-support.exp (mi_gdb_target_load): Catch any
`gdbserver_gdb_load' error exceptions.
Gdbserver support code uses the global timeout value to determine when
to stop waiting for a gdbserver process being started to respond before
continuing anyway. This timeout is usually as low as 10s and may not
be enough in this context, for example on the first run where the
filesystem cache is cold, even if it is elsewhere.
E.g. I observe this reliably with gdbserver started the first time in
QEMU running in the system emulation mode:
(gdb) file .../gdb.base/advance
Reading symbols from .../gdb.base/advance...done.
(gdb) delete breakpoints
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) break main
Breakpoint 1 at 0x87f8: file .../gdb.base/advance.c,
line 41.
(gdb) set remotetimeout 15
(gdb) kill
The program is not being run.
(gdb)
[...]
.../bin/gdbserver --once :6014 advance
target remote localhost:6014
Remote debugging using localhost:6014
Remote communication error. Target disconnected.: Connection reset by peer.
(gdb) continue
The program is not being run.
(gdb) Process advance created; pid = 999
Listening on port 6014
FAIL: gdb.base/advance.exp: Can't run to main
-- notice how the test harness proceeded with the `target remote ...'
command even though gdbserver hasn't completed its startup yet. A
while later when it's finally ready it's too late already. I checked
the timing here and it takes gdbserver roughly 25 seconds to start in
this scenario. Subsequent gdbserver starts in the same test run take
less time and usually complete within 10 seconds although occasionally
`target remote ...' precedes the corresponding `Listening on port...'
message again.
Therefore I have fixed this problem by setting an explicit timeout to
120s on the expect call in question. If this turns out too arbitrary
sometime, then perhaps a separate `gdbserver_timeout' setting might be
due.
* lib/gdbserver-support.exp (gdbserver_start): Set timeout to
120 on waiting for the TCP socket to open.
The following patch...
commit 3116063bd6
Date: Fri Jun 27 09:52:29 2014 +0100
Subject: Tidy #include lists
... introduced a build failure on certain x86 GNU/Linux distributions
(reproduced on SuSE 10 and RHES4) due to "struct iovec" not being
defined. This struct is defined in <sys/uio.h>, which used to be
explicitly included, but no longer is after the commit above was
applied.
[...]/i386-linux-nat.c: In function 'fetch_xstateregs':
[...]/i386-linux-nat.c:325:16: error: storage size of 'iov' isn't known
[...]/i386-linux-nat.c: In function 'store_xstateregs':
[...]/i386-linux-nat.c:348:16: error: storage size of 'iov' isn't known
make[2]: *** [i386-linux-nat.o] Error 1
It seems to be working on newer GNU/Linux distros thanks to indirect
inclusion of <sys/uio.h>, but it does not work on some other versions
of the same distros. This is why indirect includes of public APIs
should be avoided if at all possible.
This patch fixes the issue by adding the explicit include back.
gdb/ChangeLog:
* i386-linux-nat.c, x86-linux-nat.c: Add <sys/uio.h> #include.
The problem here is that if a thread other than gdb's main thread
gets a SIGCHLD (it's an asynchronous signal so the kernel will
essentially pick a random thread) then gdb will hang if it is
in sigsuspend when the SIGCHLD is delivered. The other thread
will see the signal and the sigsuspend won't "wake up".
Guile and libgc should be blocking SIGCHLD in their threads,
but we need to work with Guile 2.0 and libgc 7.4.
The problem first shows up in libgc 7.4 because it is the first
release that enables multiple marker threads by default.
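The principle behind the fix is that a new thread inherits the signal
mask of the thread that creates it, so blocking SIGCHLD around thread
creation keeps the signal away from the helper threads. Below is a
minimal POSIX-only sketch of that principle in Python; it is not the
guile.c change, and the function name is hypothetical.

```python
import signal
import threading

def start_thread_with_sigchld_blocked(target):
    """Start a thread that has SIGCHLD blocked, then restore our mask."""
    # Block SIGCHLD in the calling thread; remember the previous mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        # The new thread inherits the current mask, SIGCHLD included.
        worker = threading.Thread(target=target)
        worker.start()
    finally:
        # Restore the caller's original mask so it can still see SIGCHLD.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return worker
```

After the call, the worker keeps SIGCHLD blocked while the calling
thread's mask is back to what it was, so the signal is delivered to a
thread that is prepared to handle it.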
gdb/ChangeLog:
PR 17247
* guile.c: #include <signal.h>.
(_initialize_guile): Block SIGCHLD while initializing Guile.
Replaces the following, which is reverted.
2014-07-26 Doug Evans <xdje42@gmail.com>
PR 17185
* configure.ac: Add check for header gc/gc.h.
Add check for function setenv.
* configure: Regenerate.
* config.in: Regenerate.
* guile/guile.c (_initialize_guile): Add workaround for libgc 7.4.0.
gdb/ChangeLog:
* guile/scm-cmd.c (gdbscm_parse_command_name): Replace magic number
with named constant. Fix style of pointer comparison.
* python/py-cmd.c (gdbpy_parse_command_name): Ditto.
Hi,
I see the following fail on arm-none-eabi target,
-var-evaluate-expression -f nat foo^M
^done,value="0x3 <_ftext+2>"^M
(gdb) ^M
FAIL: gdb.mi/mi-var-display.exp: eval variable -f nat foo
the "<_ftext+2>" isn't expected in the test, so "set print symbol off"
can prevent printing it. It is obvious and I'll commit it in three
days if no comments.
gdb/testsuite:
2014-09-09 Yao Qi <yao@codesourcery.com>
* gdb.mi/mi-var-display.exp: Set print symbol off.
The section name used to store the build-id on pe/coff is arbitrary, as its
contents should be located using the pe/coff header's DataDirectory debug data
entry, not by using the section name.
But '.build-id' is not a good choice for that section name, as it is 9
characters long, and hence truncated to 8 characters when
--disable-long-section-names is used (which is the default, when producing an
executable with no dwarf debug sections, e.g. using ld --strip-all --build-id).
This truncation then breaks 'objcopy --only-keep-debug', which does use the
section name, due to concerns that keeping an arbitrary section which contains
the debug directory is not sensible.
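The truncation is easy to see with a small model of the 8-character
short-name limit. The constant and function names below are
hypothetical; only the two section names come from this change.

```python
PE_SHORT_NAME_LIMIT = 8  # characters available without long section names

def short_section_name(name: str) -> str:
    """Model how a section name is cut when long names are disabled."""
    return name[:PE_SHORT_NAME_LIMIT]

old_name = short_section_name(".build-id")  # 9 characters: loses the 'd'
new_name = short_section_name(".buildid")   # 8 characters: kept intact
```

A tool that compares against the full old name no longer matches the
truncated one, whereas the new name survives unchanged.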
binutils/ChangeLog
2014-09-01 Jon TURNEY <jon.turney@dronecode.org.uk>
* objcopy.c (is_nondebug_keep_contents_section): Change pe/coff
build-id section name from '.build-id' to '.buildid'.
ld/ChangeLog
2014-09-01 Jon TURNEY <jon.turney@dronecode.org.uk>
* emultempl/pe.em (write_build_id, setup_build_id): Change pe/coff
build-id section name from '.build-id' to '.buildid'.
* emultempl/pep.em (write_build_id, setup_build_id): Ditto.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
This patch fixes the routines to collect and supply ptrace registers on ppc64le
gdbserver. Originally written for big endian arch, they were causing several
issues on little endian. With this fix, the number of unexpected failures in
the testsuite dropped from 263 to 72 on ppc64le.
gdb/gdbserver/ChangeLog
* linux-ppc-low.c (ppc_collect_ptrace_register): Adjust routine to take
endianness into account.
(ppc_supply_ptrace_register): Likewise.
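The message above does not spell out the mechanism. One typical way
endianness matters in routines of this kind is where a register narrower
than the ptrace slot sits inside that slot. The sketch below models that
general idea only, under that assumption; it is not the linux-ppc-low.c
code and the function name is hypothetical.

```python
def place_in_slot(reg_bytes: bytes, slot_size: int, byteorder: str) -> bytes:
    """Place a narrow register image into a wider, zero-filled slot.

    Assumption of this sketch: on little-endian the value occupies the
    first bytes of the slot, on big-endian the last bytes.
    """
    padding = bytes(slot_size - len(reg_bytes))
    if byteorder == "little":
        return reg_bytes + padding
    return padding + reg_bytes
```

Reading the whole slot back as an integer in the matching byte order
gives the original value in both cases, which would not hold if the
big-endian placement were applied to a little-endian target.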
have empty bodies.
User-defined commands that have empty bodies weren't being shown because
the print function returned too soon. Now, it prints the command's name
before checking if it has any body at all. This also fixes the same
problem on "show user <myemptycommand>", which wasn't being printed due
to a similar reason.
gdb/ChangeLog:
* cli/cli-cmds.c (show_user): Use cli_user_command_p to
decide whether we display the command on "show user".
* cli/cli-script.c (show_user_1): Only verify cmdlines after
printing command name.
* cli/cli-decode.h (cli_user_command_p): Declare new function.
* cli/cli-decode.c (cli_user_command_p): Create helper function
to verify whether cmd_list_element is a user-defined command.
gdb/testsuite/ChangeLog:
* gdb.base/commands.exp: Add tests to verify user-defined
commands with empty bodies.
* gdb.python/py-cmd.exp: Test that we don't show user-defined
python commands in `show user command`.
* gdb.python/scm-cmd.exp: Test that we don't show user-defined
scheme commands in `show user command`.
https://bugzilla.redhat.com/show_bug.cgi?id=1126177
ERROR: AddressSanitizer: SEGV on unknown address 0x000000000050 (pc 0x000000992bef sp 0x7ffff9039530 bp 0x7ffff9039540
T0)
#0 0x992bee in value_type .../gdb/value.c:925
#1 0x87c951 in py_print_single_arg python/py-framefilter.c:445
#2 0x87cfae in enumerate_args python/py-framefilter.c:596
#3 0x87e0b0 in py_print_args python/py-framefilter.c:968
It crashes because frame_arg::val is documented as possibly being NULL
(frame_arg::error is then non-NULL) but the code does not handle that case.
Another bug is that py_print_single_arg() jumps out of its TRY_CATCH with a
goto, which corrupts GDB's cleanup chain and crashes GDB later.
It is probably a 7.7 regression (I have not verified it) due to the
introduction of Python frame filters.
gdb/ChangeLog
PR python/17355
* python/py-framefilter.c (py_print_single_arg): Handle NULL FA->VAL.
Fix goto out of TRY_CATCH.
gdb/testsuite/ChangeLog
PR python/17355
* gdb.python/amd64-py-framefilter-invalidarg.S: New file.
* gdb.python/py-framefilter-invalidarg-gdb.py.in: New file.
* gdb.python/py-framefilter-invalidarg.exp: New file.
* gdb.python/py-framefilter-invalidarg.py: New file.
This patch is a fix to PR gdb/17235. The bug is about an unused
variable that got declared and set during one of the parsing phases of
an SDT probe's argument. I took the opportunity to rewrite some of the
code to improve the parsing. The bug was actually a thinko, because
what I wanted to do in the code was to discard the number on the string
being parsed.
During this portion, the code identifies that it is dealing with an
expression that begins with a sign ('+', '-' or '~'). This means that
the expression could be:
- a numeric literal (e.g., '+5')
- a register displacement (e.g., '-4(%rsp)')
- a subexpression (e.g., '-(2*3)')
So, after saving the sign and moving forward 1 char, now the code needs
to know if there is a digit followed by a register displacement prefix
operand (e.g., '(' on x86_64). If yes, then it is a register
operation. If not, then it will be handled recursively, and the code
will later apply the requested operation on the result (either a '+', a
'-' or a '~').
With the bug, the code was correctly discarding the digit (though using
strtol unnecessarily), but it wasn't properly dealing with
subexpressions when the register indirection prefix was '(', like on
x86_64. This patch also fixes this bug, and includes a testcase. It
passes on x86_64 Fedora 20.
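The decision described above, taken right after the sign has been saved,
can be sketched like this. It is a Python model of the described logic,
not the parser's code; the function name and the returned labels are
hypothetical, and '(' is used as the default register indirection
prefix, as on x86_64.

```python
def classify_after_sign(text: str, indirection_prefix: str = "(") -> str:
    """Classify an operand that starts with '+', '-' or '~'."""
    if not text or text[0] not in "+-~":
        raise ValueError("operand does not start with a sign")
    rest = text[1:]
    # Skip over any digits that follow the sign.
    i = 0
    while i < len(rest) and rest[i] in "0123456789":
        i += 1
    # Digits followed by the prefix: a register displacement.
    if i > 0 and rest.startswith(indirection_prefix, i):
        return "register-displacement"
    # Otherwise parse recursively and apply the sign to the result.
    return "recursive"
```

Note that '-(2*3)' has no digit before the '(' and is therefore handled
recursively, which is the subexpression case that needs care when the
indirection prefix is also '('.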
valgrind caught that parse_number reads uninitialized memory when we
parse literal "0":
$ valgrind ./gdb -q -nx -ex "set height 0"
(...)
==10378== Conditional jump or move depends on uninitialised value(s)
==10378== at 0x548A10: parse_number (c-exp.y:1828)
==10378== by 0x54A340: lex_one_token (c-exp.y:2638)
==10378== by 0x54B4BB: c_lex (c-exp.y:3089)
==10378== by 0x544951: c_parse_internal (c-exp.c:2208)
==10378== by 0x54BF8C: c_parse (c-exp.y:3260)
==10378== by 0x6502E7: parse_exp_in_context_1 (parse.c:1221)
==10378== by 0x650064: parse_exp_in_context (parse.c:1122)
==10378== by 0x65001F: parse_exp_1 (parse.c:1114)
==10378== by 0x650421: parse_expression (parse.c:1266)
==10378== by 0x5A74B7: parse_and_eval_long (eval.c:92)
==10378== by 0x501ABD: do_set_command (cli-setshow.c:302)
==10378== by 0x721059: execute_command (top.c:452)
==10378==
(gdb)
I've pushed the obvious fix.
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
* c-exp.y (parse_number): Skip handling base-switching prefixes if
the input is only one character long.
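The shape of the fix can be modelled as below: the second character is
only inspected when the token is longer than one character. This is a
Python sketch, not the c-exp.y code; the function name is hypothetical
and only the hexadecimal prefix is modelled.

```python
def split_base_prefix(token: str):
    """Return (base, digits) for TOKEN, handling a leading "0x"/"0X".

    The length check comes first, so a lone "0" never causes the
    (nonexistent) second character to be examined.
    """
    if len(token) > 1 and token[0] == "0":
        if token[1] in "xX" and len(token) >= 3:
            return 16, token[2:]
    return 10, token
```

In C the unchecked read goes unnoticed unless a tool such as valgrind
flags it; in this model the equivalent mistake would be indexing
token[1] on a one-character string.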