In git b57bacec, I said:
> With that in place, the need to delay "Program received signal FOO"
> was actually caught by the manythreads.exp test. Without that bit, I
> was getting:
>
> [Thread 0x7ffff7f13700 (LWP 4499) exited]
> [New Thread 0x7ffff7f0b700 (LWP 4500)]
> ^C
> Program received signal SIGINT, Interrupt.
> [New Thread 0x7ffff7f03700 (LWP 4501)] <<< new output
> [Switching to Thread 0x7ffff7f0b700 (LWP 4500)]
> __GI___nptl_death_event () at events.c:31
> 31 {
> (gdb) FAIL: gdb.threads/manythreads.exp: stop threads 1
>
> That is, I was now getting "New Thread" lines after the "Program
> received signal" line, and the test doesn't expect them. As the
> number of new threads discovered before and after the "Program
> received signal" output is unbounded, it's much nicer to defer
> "Program received signal" until after synching the thread list, thus
> close to the "switching to thread" output and "current frame/source"
> info:
>
> [Thread 0x7ffff7863700 (LWP 7647) exited]
> ^C[New Thread 0x7ffff786b700 (LWP 7648)]
>
> Program received signal SIGINT, Interrupt.
> [Switching to Thread 0x7ffff7fc4740 (LWP 6243)]
> __GI___nptl_create_event () at events.c:25
> 25 {
> (gdb) PASS: gdb.threads/manythreads.exp: stop threads 1
This commit factors out the two places in the test that are affected
by this, and adds there a distilled version of the comment above.
gdb/testsuite/
2014-10-02 Pedro Alves <palves@redhat.com>
* gdb.threads/manythreads.exp (interrupt_and_wait): New procedure.
(top level) <stop threads 1, stop threads 2>: Use it.
Commit a25a5a45 (Fix "breakpoint always-inserted off"; remove
"breakpoint always-inserted auto") regressed non-stop remote
debugging.
This was exposed by mi-nsintrall.exp intermittently failing with a
spurious SIGTRAP.
The problem is that when debugging with "target remote", new threads
the target has spawned but have never reported a stop aren't visible
to GDB until it explicitly resyncs its thread list with the target's.
For example, in a program like this:
int
main (void)
{
pthread_t child_thread;
pthread_create (&child_thread, NULL, child_function, NULL);
return 0; <<<< set breakpoint here
}
If the user sets a breakpoint at the "return" statement, and runs the
program, when that breakpoint hit is reported, GDB is only aware of
the main thread. So if we base the decision to remove or insert
breakpoints from the target based on whether all the threads we know
about are stopped, we'll miss that child_thread is running, and thus
we'll remove breakpoints from the target, even though they should
still remain inserted, otherwise child_thread will miss them.
The break-while-running.exp test actually should also be exposing this
thread-list-out-of-synch problem. That test sets a breakpoint while
the main thread is stopped, but other threads are running. Because
other threads are running, the breakpoint is supposed to be inserted
immediately. But, unless something forces a refetch of the thread
list, like, e.g., "info threads", GDB won't be aware of the other
threads that had been spawned by the main thread, and so won't insert
new or old breakpoints in the target. And it turns out that the test
is exactly doing an explicit "info threads", masking out the
problem... This commit adjusts the test to exercise the case of not
issuing "info threads". The test then fails without the GDB fix.
In the mi-nsintrall.exp case, what happens is that several threads hit
the same breakpoint, and when the first thread reports the stop,
because GDB wasn't aware other threads exist, all threads known to GDB
are found stopped, so GDB removes the breakpoints from the target.
The other threads follow up with SIGTRAPs too for that same
breakpoint, which has already been removed. For the first few
threads, the moribund breakpoints machinery suppresses the SIGTRAPs,
but after a few events (precisely '3 * thread_count () + 1' at the
time the breakpoint was removed, see update_global_location_list), the
moribund breakpoint machinery is no longer aware of the removed
breakpoint, and the SIGTRAP is reported as a spurious stop.
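To make the budget above concrete, here is a minimal sketch of the bookkeeping. It is a toy model, not GDB's code: the struct, the function names and the exact decrement order are assumptions made for illustration; only the '3 * thread_count () + 1' formula comes from the text above.

```c
#include <stdbool.h>

/* Toy model of a removed ("moribund") breakpoint location.  The type
   and field names are illustrative, not GDB's data structures.  */
struct moribund_location
{
  int events_till_retirement;
};

/* Record a removed location.  The budget follows the formula quoted
   above, computed from the number of threads known at removal time.  */
struct moribund_location
retire_location (int known_thread_count)
{
  struct moribund_location loc;

  loc.events_till_retirement = 3 * known_thread_count + 1;
  return loc;
}

/* Account for one reported event.  Returns true while the location is
   still remembered, i.e. while a SIGTRAP at its address can still be
   explained by the removed breakpoint.  */
bool
location_still_remembered (struct moribund_location *loc)
{
  if (loc->events_till_retirement <= 0)
    return false;
  loc->events_till_retirement--;
  return true;
}
```

If only the main thread is known when the breakpoint is removed, the budget is 4 events, no matter how many unknown threads are about to report a SIGTRAP for that breakpoint.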
The fix is naturally then to stop assuming that if no thread in the
list is executing, then the target is fully stopped. We can't know
that until we fully sync the thread list. Because updating the thread
list on every stop would be too much RSP traffic, I chose instead to
update it whenever we're about to present a stop to the user.
Actually updating the thread list at that point happens to be an item
I had added to the local/remote parity wiki page a while ago:
Native GNU/Linux debugging adds new threads to the thread list as
the program creates them "The [New Thread foo] messages". Remote
debugging can't do that, and it's arguable whether we shouldn't even
stop native debugging from doing that, as it hinders inferior
performance. However, a related issue is that with remote targets
(and gdbserver), even after the program stops, the user still needs
to do "info threads" to pull an updated thread list. This should
most likely be addressed, so that GDB pulls the list itself, perhaps
just before presenting a stop to the user.
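The stale-list problem can be sketched with a small model. This is only an illustration under assumed names (thread_state, thread_list and both functions are hypothetical); it shows why "no known thread is executing" is only meaningful once the list has been synced with the target.

```c
#include <stdbool.h>
#include <stddef.h>

/* One thread as the debugger sees it.  */
struct thread_state
{
  int id;
  bool executing;
};

/* Hypothetical debugger-side view: the threads currently known.  */
struct thread_list
{
  const struct thread_state *threads;
  size_t count;
};

/* Return true if any thread in LIST is marked as executing.  */
bool
any_thread_executing (const struct thread_list *list)
{
  for (size_t i = 0; i < list->count; i++)
    if (list->threads[i].executing)
      return true;
  return false;
}

/* Breakpoints may be removed only if the list passed in has been
   synced with the target and nothing in it is executing.  */
bool
safe_to_remove_breakpoints (const struct thread_list *synced_list)
{
  return !any_thread_executing (synced_list);
}
```

With a stale list holding only the stopped main thread, any_thread_executing returns false; after a resync that adds the running child thread, it returns true and the breakpoints stay inserted.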
With that in place, the need to delay "Program received signal FOO"
was actually caught by the manythreads.exp test. Without that bit, I
was getting:
[Thread 0x7ffff7f13700 (LWP 4499) exited]
[New Thread 0x7ffff7f0b700 (LWP 4500)]
^C
Program received signal SIGINT, Interrupt.
[New Thread 0x7ffff7f03700 (LWP 4501)] <<< new output
[Switching to Thread 0x7ffff7f0b700 (LWP 4500)]
__GI___nptl_death_event () at events.c:31
31 {
(gdb) FAIL: gdb.threads/manythreads.exp: stop threads 1
That is, I was now getting "New Thread" lines after the "Program
received signal" line, and the test doesn't expect them. As the
number of new threads discovered before and after the "Program
received signal" output is unbounded, it's much nicer to defer
"Program received signal" until after synching the thread list, thus
close to the "switching to thread" output and "current frame/source"
info:
[Thread 0x7ffff7863700 (LWP 7647) exited]
^C[New Thread 0x7ffff786b700 (LWP 7648)]
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7ffff7fc4740 (LWP 6243)]
__GI___nptl_create_event () at events.c:25
25 {
(gdb) PASS: gdb.threads/manythreads.exp: stop threads 1
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/
2014-10-02 Pedro Alves <palves@redhat.com>
* breakpoint.c (breakpoints_should_be_inserted_now): Use
threads_are_executing.
* breakpoint.h (breakpoints_should_be_inserted_now): Add
describing comment.
* gdbthread.h (threads_are_executing): Declare.
* infrun.c (handle_signal_stop) <random signals>: Don't print about the
signal here if stopping.
(end_stepping_range): Don't notify observers here.
(normal_stop): Update the thread list. If stopped by a random
signal or a stepping range ended, notify observers.
* thread.c (threads_executing): New global.
(init_thread_list): Clear 'threads_executing'.
(set_executing): Set or clear 'threads_executing'.
(threads_are_executing): New function.
(update_threads_executing): New function.
(update_thread_list): Use it.
gdb/testsuite/
2014-10-02 Pedro Alves <palves@redhat.com>
* gdb.threads/break-while-running.exp (test): Add new
'update_thread_list' argument. Skip "info threads" if false.
(top level): Add new 'update_thread_list' axis.
Following an exec with "breakpoint always-inserted on" tries to insert
breakpoints in the new image at the addresses the symbols had in the
old image.
With "always-inserted off", we see:
gdb gdb.multi/multi-arch-exec -ex "set breakpoint always-inserted off"
GNU gdb (GDB) 7.8.50.20140924-cvs
...
(gdb) b main
Breakpoint 1 at 0x400664: file gdb.multi/multi-arch-exec.c, line 24.
^^^^^^^^
(gdb) c
The program is not being run.
(gdb) r
Starting program: testsuite/gdb.multi/multi-arch-exec
Breakpoint 1, main () at gdb/testsuite/gdb.multi/multi-arch-exec.c:24
24 execl (BASEDIR "/multi-arch-exec-hello",
(gdb) c
Continuing.
process 9212 is executing new program: gdb/testsuite/gdb.multi/multi-arch-exec-hello
Breakpoint 1, main () at gdb/testsuite/gdb.multi/hello.c:40
40 bar();
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x080484e4 in main at gdb/testsuite/gdb.multi/hello.c:40
^^^^^^^^^^
breakpoint already hit 2 times
(gdb)
Note how main was 0x400664 in multi-arch-exec, and 0x080484e4 in
gdb.multi/hello.
With "always-inserted on", we get:
Breakpoint 1, main () at gdb/testsuite/gdb.multi/multi-arch-exec.c:24
24 execl (BASEDIR "/multi-arch-exec-hello",
(gdb) c
Continuing.
infrun: target_wait (-1, status) =
infrun: 9444 [process 9444],
infrun: status->kind = execd
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_EXECD
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x400664
(gdb)
That is, GDB is trying to insert a breakpoint at 0x400664, after the
exec, and then that address happens to not be mapped at all in the new
image.
The problem is that update_breakpoints_after_exec is creating
breakpoints, which ends up in update_global_location_list immediately
inserting breakpoints if "breakpoint always-inserted" is "on".
update_breakpoints_after_exec is called very early when we see an exec
event. At that point, we haven't loaded the symbols of the new
post-exec image yet, and thus haven't reset breakpoints' addresses to
whatever they may be in the new image. All we should be doing in
update_breakpoints_after_exec is deleting breakpoints that no longer
make sense after an exec. So the fix removes those breakpoint
creations.
The question is then, if not here, where are those breakpoints
re-created? Turns out we don't need to do anything else, because at
the end of follow_exec, we call breakpoint_re_set, whose tail is also
creating exactly the same breakpoints update_breakpoints_after_exec is
currently creating:
breakpoint_re_set (void)
{
...
create_overlay_event_breakpoint ();
create_longjmp_master_breakpoint ();
create_std_terminate_master_breakpoint ();
create_exception_master_breakpoint ();
}
A new test is added to exercise this.
Tested on x86_64 Fedora 20.
gdb/
2014-10-02 Pedro Alves <palves@redhat.com>
PR breakpoints/17431
* breakpoint.c (update_breakpoints_after_exec): Don't create
overlay, longjmp, std terminate nor exception breakpoints here.
gdb/testsuite/
2014-10-02 Pedro Alves <palves@redhat.com>
PR breakpoints/17431
* gdb.base/execl-update-breakpoints.c: New file.
* gdb.base/execl-update-breakpoints.exp: New file.
Currently, with "set breakpoint auto-hw off", we'll still try to
insert a software breakpoint at addresses covered by supposedly
read-only or inaccessible regions:
(top-gdb) mem 0x443000 0x450000 ro
(top-gdb) set mem inaccessible-by-default off
(top-gdb) disassemble
Dump of assembler code for function main:
0x0000000000443956 <+34>: movq $0x0,0x10(%rax)
=> 0x000000000044395e <+42>: movq $0x0,0x18(%rax)
0x0000000000443966 <+50>: mov -0x24(%rbp),%eax
0x0000000000443969 <+53>: mov %eax,-0x20(%rbp)
End of assembler dump.
(top-gdb) b *0x0000000000443969
Breakpoint 5 at 0x443969: file ../../src/gdb/gdb.c, line 29.
(top-gdb) c
Continuing.
warning: cannot set software breakpoint at readonly address 0x443969
Breakpoint 5, 0x0000000000443969 in main (argc=1, argv=0x7fffffffd918) at ../../src/gdb/gdb.c:29
29 args.argc = argc;
(top-gdb)
We warn, saying that the insertion can't be done, but then proceed
attempting the insertion anyway, and in case of manually added
regions, the insert actually succeeds.
This is a regression; GDB used to fail inserting the breakpoint. More
below.
I stumbled on this as I wrote a test that manually sets up a read-only
memory region with the "mem" command, in order to test GDB's behavior
with breakpoints set on read-only regions, even when the real memory
the breakpoints are set at isn't really read-only. I wanted that in
order to add a test that exercises software single-stepping through
read-only regions.
Note that the memory regions that target_memory_map returns aren't
like e.g., what you would expect to see in /proc/PID/maps on Linux.
Instead, they're the physical memory map from the _debugger's_
perspective. E.g., a read-only region would be real ROM or flash
memory, while a read-only+execute mapping in /proc/PID/maps is still
read-write to the debugger (otherwise the debugger wouldn't be able to
set software breakpoints in the code segment).
If one tries to manually write to memory that falls within a memory
region that is known to be read-only, with e.g., "p foo = 1", then we
hit a check in memory_xfer_partial_1 before the write manages to make
it to the target side.
But writing a software/memory breakpoint nowadays goes through
target_write_raw_memory, and unlike when writing memory with
TARGET_OBJECT_MEMORY, nothing on the TARGET_OBJECT_RAW_MEMORY path
checks whether we're trying to write to a read-only region.
At the time "breakpoint auto-hw" was added, we didn't have the
TARGET_OBJECT_MEMORY vs TARGET_OBJECT_RAW_MEMORY target object
distinction yet, and the code path in memory_xfer_partial that blocks
writes to read-only memory was hit for memory breakpoints too. With
GDB 6.8 we had:
warning: cannot set software breakpoint at readonly address 0000000000443943
Warning:
Cannot insert breakpoint 1.
Error accessing memory address 0x443943: Input/output error.
So I started out by fixing this by adding the memory region validation
to TARGET_OBJECT_RAW_MEMORY too.
But later, when testing against GDBserver, I realized that that would
only block software/memory breakpoints GDB itself inserts with
gdb/mem-break.c. If a target has a to_insert_breakpoint method, the
insertion request will still pass through to the target. So I ended
up converting the "cannot set breakpoint" warning in breakpoint.c to a
real error return, thus blocking the insertion sooner.
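The idea of failing the insertion early can be sketched as follows. This is a toy model with hypothetical names (mem_region, lookup_region, can_insert_software_breakpoint), not the code in breakpoint.c or target.c. It assumes regions are half-open ranges and that addresses outside any region are writable, as with "set mem inaccessible-by-default off".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mem_access_mode { MEM_RW, MEM_RO };

/* An address range [lo, hi) with an access mode.  */
struct mem_region
{
  uint64_t lo;
  uint64_t hi;
  enum mem_access_mode mode;
};

/* Return the region containing ADDR, or NULL if none does.  */
const struct mem_region *
lookup_region (const struct mem_region *regions, size_t count, uint64_t addr)
{
  for (size_t i = 0; i < count; i++)
    if (addr >= regions[i].lo && addr < regions[i].hi)
      return &regions[i];
  return NULL;
}

/* Decide whether a software breakpoint may be written at ADDR.
   Returns false inside a read-only region, so the caller can fail the
   insertion instead of warning and then writing anyway.  */
bool
can_insert_software_breakpoint (const struct mem_region *regions,
                                size_t count, uint64_t addr)
{
  const struct mem_region *r = lookup_region (regions, count, addr);

  return r == NULL || r->mode != MEM_RO;
}
```

With the region from the session above (0x443000 to 0x450000, read-only), the check rejects 0x443969 before any write reaches the target, whether the write would go through gdb/mem-break.c or through a target's own insertion method.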
With that, we'll end up no longer needing the TARGET_OBJECT_RAW_MEMORY
changes once software single-step breakpoints are converted to real
breakpoints. We need them today as software single-step breakpoints
bypass insert_bp_location. But, it'll be best to leave that in as
safeguard anyway, for other direct uses of TARGET_OBJECT_RAW_MEMORY.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/
2014-10-01 Pedro Alves <palves@redhat.com>
* breakpoint.c (insert_bp_location): Error out if inserting a
software breakpoint at a read-only address.
* target.c (memory_xfer_check_region): New function, factored out
from ...
(memory_xfer_partial_1): ... this. Make the 'reg_len' local a
ULONGEST.
(target_xfer_partial) <TARGET_OBJECT_RAW_MEMORY>: Check the access
against the memory region attributes.
gdb/testsuite/
2014-10-01 Pedro Alves <palves@redhat.com>
* gdb.base/breakpoint-in-ro-region.c: New file.
* gdb.base/breakpoint-in-ro-region.exp: New file.
Don't reset the exit code at inferior exit and print it in
-list-thread-groups.
gdb/ChangeLog:
* NEWS: Announce new exit-code field in -list-thread-groups
output.
* inferior.c (exit_inferior_1): Don't clear exit code.
(inferior_appeared): Clear exit code.
* mi/mi-main.c (print_one_inferior): Add printing of the exit
code.
gdb/testsuite/ChangeLog:
* gdb.mi/mi-exit-code.exp: New file.
* gdb.mi/mi-exit-code.c: New file.
gdb/doc/ChangeLog:
* gdb.texinfo (Miscellaneous gdb/mi Commands): Document new
exit-code field in -list-thread-groups output.
I see the following failures on arm-linux-gnueabi:
result of ldd build-git/arm/gdb/testsuite/gdb.threads/dlopen-libpthread.so is 1
output of ldd build-git/arm/gdb/testsuite/gdb.threads/dlopen-libpthread.so is not a dynamic executable
child process exited abnormally
FAIL: gdb.threads/dlopen-libpthread.exp: ldd dlopen-libpthread.so
FAIL: gdb.threads/dlopen-libpthread.exp: ldd dlopen-libpthread.so output contains libs
The test script invokes ldd (on the host) on the target libraries, which
is wrong. ldd can't work across host and target because it invokes the
dynamic linker with LD_TRACE_LOADED_OBJECTS to get the dependent
libraries. My first reaction to this problem was to execute ld.so on
the target (with something like remote_exec target). When I started to
hack proc build_executable_own_libs, I found it assumes here and there
that native testing is being performed. Then I checked the callers of
build_executable_own_libs, and they are all skipped if isnative is
false. It is reasonable to do the same in dlopen-libpthread.exp too.
gdb/testsuite:
2014-09-30 Yao Qi <yao@codesourcery.com>
* gdb.threads/dlopen-libpthread.exp: Skip it if isnative is
false.
By default, GDB removes all breakpoints from the target when the
target stops and the prompt is given back to the user. This is useful
in case GDB crashes while the user is interacting, as otherwise,
there's a higher chance breakpoints would be left planted on the
target.
But, as long as any thread is running free, we need to make sure to
keep breakpoints inserted, lest a thread misses a breakpoint. With
that in mind, in preparation for non-stop mode, we added a "breakpoint
always-inserted on" mode. This traded off the extra crash protection
for never having threads miss breakpoints, and in addition is more
efficient if there's a ton of breakpoints to remove/insert at each
user command (e.g., at each "step").
When we added non-stop mode, and for a period, we required users to
manually set "always-inserted on" when they enabled non-stop mode, as
otherwise GDB removes all breakpoints from the target as soon as any
thread stops, which means the other threads still running will miss
breakpoints. The test added by this patch exercises this.
That soon revealed a nuisance, and so later we added an extra
"breakpoint always-inserted auto" mode, that made GDB behave like
"always-inserted on" when non-stop was enabled, and "always-inserted
off" when non-stop was disabled. "auto" was made the default at the
same time.
In hindsight, this "auto" setting was unnecessary, and not the ideal
solution. Non-stop mode does depend on breakpoints always-inserted
mode, but only as long as any thread is running. If no thread is
running, no breakpoint can be missed. The same is true for all-stop
too. E.g., if, in all-stop mode, the user does:
(gdb) c&
(gdb) b foo
That breakpoint at "foo" should be inserted immediately, but it
currently isn't -- currently it'll end up inserted only if the target
happens to trip on some event, and is re-resumed, e.g., an internal
breakpoint triggers that doesn't cause a user-visible stop, and so we
end up in keep_going calling insert_breakpoints. The test added by
this patch also covers this.
IOW, no matter whether in non-stop or all-stop, if the target fully
stops, we can remove breakpoints. And no matter whether in all-stop
or non-stop, if any thread is running in the target, then we need
breakpoints to be immediately inserted. And then, if the target has
global breakpoints, we need to keep breakpoints even when the target
is stopped.
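The rule in the paragraph above boils down to a small predicate. The sketch below is illustrative only; the struct and function names are made up and this is not the body of GDB's breakpoints_should_be_inserted_now.

```c
#include <stdbool.h>

/* Inputs to the decision, as described above.  All names here are
   illustrative.  */
struct target_snapshot
{
  bool always_inserted;     /* "set breakpoint always-inserted on" */
  bool any_thread_running;  /* at least one thread is running */
  bool global_breakpoints;  /* the target has global breakpoints */
};

/* Return true if breakpoints must be present in the target right now.
   Note that all-stop vs non-stop is deliberately not an input.  */
bool
should_keep_breakpoints_inserted (const struct target_snapshot *s)
{
  return (s->always_inserted
          || s->any_thread_running
          || s->global_breakpoints);
}
```

In the "c&" followed by "b foo" example, any_thread_running is true, so the new breakpoint has to be inserted immediately even with "always-inserted off".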
So with that in mind, and aiming at reducing all-stop vs non-stop
differences for all-stop-on-stop-of-non-stop, this patch fixes
"breakpoint always-inserted off" to not remove breakpoints from the
target until it fully stops, and then removes the "auto" setting as
unnecessary. I propose removing it straight away rather than keeping
it as an alias, unless someone complains they have scripts that need
it and that can't adjust.
Tested on x86_64 Fedora 20.
gdb/
2014-09-22 Pedro Alves <palves@redhat.com>
* NEWS: Mention merge of "breakpoint always-inserted" modes "off"
and "auto".
* breakpoint.c (enum ugll_insert_mode): New enum.
(always_inserted_mode): Now a plain boolean.
(show_always_inserted_mode): No longer handle AUTO_BOOLEAN_AUTO.
(breakpoints_always_inserted_mode): Delete.
(breakpoints_should_be_inserted_now): New function.
(insert_breakpoints): Pass UGLL_INSERT to
update_global_location_list instead of calling
insert_breakpoint_locations manually.
(create_solib_event_breakpoint_1): New, factored out from ...
(create_solib_event_breakpoint): ... this.
(create_and_insert_solib_event_breakpoint): Use
create_solib_event_breakpoint_1 instead of calling
insert_breakpoint_locations manually.
(update_global_location_list): Change parameter type from boolean
to enum ugll_insert_mode. All callers adjusted. Adjust to use
breakpoints_should_be_inserted_now and handle UGLL_INSERT.
(update_global_location_list_nothrow): Change parameter type from
boolean to enum ugll_insert_mode.
(_initialize_breakpoint): "breakpoint always-inserted" option is
now a boolean command. Update help text.
* breakpoint.h (breakpoints_always_inserted_mode): Delete declaration.
(breakpoints_should_be_inserted_now): New declaration.
* infrun.c (handle_inferior_event) <TARGET_WAITKIND_LOADED>:
Remove breakpoints_always_inserted_mode check.
(normal_stop): Adjust to use breakpoints_should_be_inserted_now.
* remote.c (remote_start_remote): Likewise.
gdb/doc/
2014-09-22 Pedro Alves <palves@redhat.com>
* gdb.texinfo (Set Breaks): Document that "set breakpoint
always-inserted off" is the default mode now. Delete
documentation of "set breakpoint always-inserted auto".
gdb/testsuite/
2014-09-22 Pedro Alves <palves@redhat.com>
* gdb.threads/break-while-running.exp: New file.
* gdb.threads/break-while-running.c: New file.
This patch extends dw2-var-zero-addr.exp to cover the case where the
partial symtab is not used while the full symtab is used, in order to
cover the changes in patch 2/3. This patch restarts GDB with
--readnow and does the same test again.
gdb/testsuite:
2014-09-19 Yao Qi <yao@codesourcery.com>
* gdb.dwarf2/dw2-var-zero-addr.exp: Move test into new proc test.
Invoke test. Restart GDB with --readnow and invoke test again.
I see the following fail on arm-none-eabi target,
(gdb) b 24^M
Breakpoint 1 at 0x4: file
../../../../git/gdb/testsuite/gdb.base/break-on-linker-gcd-function.cc,
line 24.^M
(gdb) FAIL: gdb.base/break-on-linker-gcd-function.exp: b 24
Currently, we are using flag has_section_at_zero to determine whether
address zero in debug info means the corresponding code has been
GC'ed, like this:
case DW_LNE_set_address:
address = read_address (abfd, line_ptr, cu, &bytes_read);
if (address == 0 && !dwarf2_per_objfile->has_section_at_zero)
{
/* This line table is for a function which has been
GCd by the linker. Ignore it. PR gdb/12528 */
However, this is incorrect on some bare metal targets, where the .text
section is located at 0x0, so dwarf2_per_objfile->has_section_at_zero
is true. If a function is GC'ed by the linker, its address is zero, and
GDB treats address zero as the function's real address rather than as
a sign that the function has been GC'ed.
In this patch, we use the 'lowpc' obtained in read_file_scope and check
whether it is greater than zero. If it is, address zero really means
the function has been GC'ed. To do that, we pass 'lowpc' from
read_file_scope through handle_DW_AT_stmt_list and dwarf_decode_lines,
and finally to dwarf_decode_lines_1.
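The difference between the two heuristics can be sketched like this. Both function names are hypothetical and the bodies are simplified; the new check follows the ChangeLog wording for this patch ("Skip the line table if 'lowpc' is greater than 'address'").

```c
#include <stdbool.h>
#include <stdint.h>

/* Old heuristic: address zero means "GC'ed by the linker" unless some
   section really is located at zero.  */
bool
line_entry_gcd_old (uint64_t address, bool has_section_at_zero)
{
  return address == 0 && !has_section_at_zero;
}

/* New heuristic: an address below the compilation unit's lowest PC
   cannot belong to live code in that unit.  */
bool
line_entry_gcd_new (uint64_t address, uint64_t lowpc)
{
  return address < lowpc;
}
```

On a bare metal target with .text at 0x0, the old check never fires because has_section_at_zero is true. With a unit whose lowpc is, say, 0x100 (a made-up value), the new check still classifies address 0 as GC'ed while leaving 0x120 alone.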
This patch fixes the fail above. This patch also covers the path where
partial symbols aren't used, which is tested by starting gdb with
the --readnow option.
It is regression tested on x86-linux with
target_board=dwarf4-gdb-index, and arm-none-eabi. OK to apply?
gdb:
2014-09-19 Yao Qi <yao@codesourcery.com>
* dwarf2read.c (dwarf_decode_lines): Update declaration.
(handle_DW_AT_stmt_list): Add argument 'lowpc'. Update
comments. Callers update.
(dwarf_decode_lines): Likewise.
(dwarf_decode_lines_1): Add argument 'lowpc'. Update
comments. Skip the line table if 'lowpc' is greater than
'address'. Don't check
dwarf2_per_objfile->has_section_at_zero.
gdb/testsuite:
2014-09-19 Yao Qi <yao@codesourcery.com>
* gdb.base/break-on-linker-gcd-function.exp: Move test into new
proc set_breakpoint_on_gcd_function. Invoke
set_breakpoint_on_gcd_function. Restart GDB with --readnow and
invoke set_breakpoint_on_gcd_function again.
This is just a testcase addition that I am proposing for upstream GDB.
We have this in our internal tree, and the related RH bug is:
<https://bugzilla.redhat.com/show_bug.cgi?id=809179>
(You might not be able to see all the comments without privileges.)
This bug is about a global variable that got incorrectly displayed by
GDB. This bug has already been fixed a long time ago by Joel's
commit:
commit 19630284f5
Author: Joel Brobecker <brobecker@gnat.com>
Date: Tue Jun 5 13:50:50 2012 +0000
But I think a testcase for it wouldn't hurt.
So, consider the following scenario:
$ cat solib1.c
int test;
void c_main (void)
{
test = 42;
}
$ cat solib2.c
int test;
void b_main (void)
{
test = 42;
}
$ cat main.c
int main (int argc, char *argv[])
{
c_main ();
b_main ();
return 0;
}
$ gcc -g -fPIC -shared -o libSO1.so -c solib1.c
$ gcc -g -fPIC -shared -o libSO2.so -c solib2.c
$ gcc -g -o main -L$PWD -lSO1 -lSO2 main.c
$ LD_LIBRARY_PATH=. gdb -q -batch -ex 'b c_main' -ex r -ex n -ex 'p test' ./main
...
$1 = 0
This happened with GDB before Joel's commit above. Now, things work
and GDB is able to correctly display the nested global variable:
$ LD_LIBRARY_PATH=. gdb -q -batch -ex 'b c_main' -ex r -ex n -ex 'p test' ./main
...
$1 = 42
The testcase attached tests this behavior.
gdb/testsuite/ChangeLog:
2014-09-16 Sergio Durigan Junior <sergiodj@redhat.com>
* gdb.base/global-var-nested-by-dso-solib1.c: New file.
* gdb.base/global-var-nested-by-dso-solib2.c: Likewise.
* gdb.base/global-var-nested-by-dso.c: Likewise.
* gdb.base/global-var-nested-by-dso.exp: Likewise.
Make test messages unique and a couple other tweaks.
gdb/testsuite/
2014-09-16 Sergio Durigan Junior <sergiodj@redhat.com>
Pedro Alves <palves@redhat.com>
* gdb.base/watch-bitfields.exp: Pass string other than test file
name to prepare_for_testing.
(watch): New procedure.
(expect_watchpoint): Use with_test_prefix.
(top level): Factor out tests to ...
(test_watch_location, test_regular_watch): ... these new
procedures, and use with_test_prefix and gdb_continue_to_end.
PR 12526 reports that -location watchpoints against bitfield arguments
trigger false positives when bits around the bitfield, but not the
bitfield itself, are modified.
This happens because -location watchpoints naturally operate at the
byte level, not at the bit level. When the address of a bitfield
lvalue is taken, information about the bitfield (i.e. its offset and
size) is lost in the process.
This information must first be retained throughout the lifetime of the
-location watchpoint. This patch achieves this by adding two new
fields to the watchpoint struct: val_bitpos and val_bitsize. These
fields are set when a watchpoint is first defined in watch_command_1.
They are both equal to zero if the watchpoint is not a -location
watchpoint or if the argument is not a bitfield.
Then these bitfield parameters are used inside update_watchpoint and
watchpoint_check to extract the actual value of the bitfield from the
watchpoint address, with the help of a local helper function
extract_bitfield_from_watchpoint_value.
Finally when creating a HW breakpoint pointing to a bitfield, we
optimize the address and length of the breakpoint. By skipping over
the bytes that don't cover the bitfield, this step reduces the
frequency at which a read watchpoint for the bitfield is triggered.
It also reduces the number of times a false-positive call to
check_watchpoint is triggered for a write watchpoint.
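A rough sketch of the two steps, value extraction and range narrowing, is shown below. It is not the patch's code: the names are hypothetical, and it assumes bits are numbered from the least significant bit of a value already loaded into a 64-bit integer, which glosses over the target's endianness.

```c
#include <stdint.h>

/* Extract BITSIZE bits starting at bit BITPOS of VALUE, counting from
   the least significant bit.  Requires 0 < BITSIZE < 64 and
   BITPOS + BITSIZE <= 64.  */
uint64_t
extract_bitfield (uint64_t value, unsigned bitpos, unsigned bitsize)
{
  uint64_t mask = (UINT64_C (1) << bitsize) - 1;

  return (value >> bitpos) & mask;
}

/* Byte range that a hardware watchpoint has to cover.  */
struct watch_range
{
  uint64_t addr;
  unsigned len;
};

/* Narrow a watch on the storage unit at BASE to just the bytes that
   contain the bitfield.  */
struct watch_range
narrow_to_bitfield (uint64_t base, unsigned bitpos, unsigned bitsize)
{
  struct watch_range r;

  r.addr = base + bitpos / 8;
  r.len = (bitpos % 8 + bitsize + 7) / 8;
  return r;
}
```

For a 4-bit field at bit 4, a change of the containing value from 0x0f0 to 0xff0 leaves the extracted field at 0xf, so comparing extracted values instead of whole bytes avoids the false positive.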
gdb/
PR breakpoints/12526
* breakpoint.h (struct watchpoint): New fields val_bitpos and
val_bitsize.
* breakpoint.c (watch_command_1): Use these fields to retain
bitfield information.
(extract_bitfield_from_watchpoint_value): New function.
(watchpoint_check): Use it.
(update_watchpoint): Use it. Optimize the address and length of a
HW watchpoint pointing to a bitfield.
* value.h (unpack_value_bitfield): New prototype.
* value.c (unpack_value_bitfield): Make extern.
gdb/testsuite/
PR breakpoints/12526
* gdb.base/watch-bitfields.exp: New file.
* gdb.base/watch-bitfields.c: New file.
Silly typo...
gdb/testsuite/
2014-09-16 Pedro Alves <palves@redhat.com>
* gdb.base/watchpoint-stops-at-right-insn.exp (test): Compare
software and hardware addresses, not software address against
itself.
This adds a test that makes sure GDB knows whether the target has
continuable, or non-continuable watchpoints.
That is, the test confirms that GDB presents a watchpoint value change
at the first instruction right after the instruction that changes
memory.
gdb/testsuite/ChangeLog:
2014-09-16 Pedro Alves <palves@redhat.com>
* gdb.base/watchpoint-stops-at-right-insn.c: New file.
* gdb.base/watchpoint-stops-at-right-insn.exp: New file.
In the recent review of my patch about copying files to the remote
host, we found that we need a board file that more closely mirrors
real remote host testing, to improve coverage. With the board file
local-remote-host-native.exp, DejaGNU copies files to
$build/gdb/testsuite/remote-host to emulate the effect of a remote host.
Is it OK?
gdb/testsuite:
2014-09-16 Yao Qi <yao@codesourcery.com>
* boards/local-remote-host-native.exp: New file.
The test does a backtrace to see which thread (#2 or #3) is assigned
to which SIGUSR (1 or 2). If the main thread gets to all_threads_running
before the sigusr threads get to their entry point, then the function
name isn't in the backtrace and the test fails.
Alas this version of the code is within epsilon of what I started with,
and then over-simplified things.
If I want to change the signalled state of multiple threads
it's a bit cumbersome to do with the "signal" command.
What you really want is a way to set the signal state of the
desired threads and then just do "continue".
This patch adds a new command, queue-signal, to accomplish this.
Basically "signal N" == "queue-signal N" + "continue".
That's not precisely true in that "signal" can be used to inject
any signal, including signals set to "nopass"; whereas "queue-signal"
just queues the signal as if the thread stopped because of it.
"nopass" handling is done when the thread is resumed which
"queue-signal" doesn't do.
One could add extra complexity to allow queue-signal to be used to
deliver "nopass" signals like the "signal" command. I have no current
need for it so in the interests of incremental complexity, I have
left such support out and just have the code flag an error if one
tries to queue a nopass signal.
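The intended semantics can be modelled in a few lines. This is a toy model with made-up names (thread_model, queue_signal, resume_thread), not the code in infcmd.c: queueing only records the signal on the thread, refuses "nopass" signals, and delivery happens on the next resume.

```c
#include <stdbool.h>

#define NO_SIGNAL 0

/* Toy per-thread state.  */
struct thread_model
{
  int pending_signal;  /* signal to deliver on resume, or NO_SIGNAL */
};

/* Queue SIG on thread T.  SIG_IS_PASS says whether the signal is set
   to "pass".  Returns false, standing in for an error, if it is not.  */
bool
queue_signal (struct thread_model *t, int sig, bool sig_is_pass)
{
  if (!sig_is_pass)
    return false;
  t->pending_signal = sig;
  return true;
}

/* Resume T: hand over the queued signal, if any, and clear it.
   Returns the signal that was delivered.  */
int
resume_thread (struct thread_model *t)
{
  int sig = t->pending_signal;

  t->pending_signal = NO_SIGNAL;
  return sig;
}
```

Queueing a different signal on each of several threads and then resuming them all once is the multi-thread case that is cumbersome with "signal" alone.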
gdb/ChangeLog:
* NEWS: Mention new "queue-signal" command.
* infcmd.c (queue_signal_command): New function.
(_initialize_infcmd): Add new queue-signal command.
gdb/doc/ChangeLog:
* gdb.texinfo (Signaling): Document new queue-signal command.
gdb/testsuite/ChangeLog:
* gdb.threads/queue-signal.c: New file.
* gdb.threads/queue-signal.exp: New file.
I had occasion to use with_gdb_prompt in a test for the patch for PR 17314
and was passing the plain text prompt as the value, "(top-gdb)",
instead of a regexp, "\(top-gdb\)" (expressed as "\\(top-gdb\\)" in TCL).
I then discovered that in order to restore the prompt gdb passes the
original value of $gdb_prompt to "set prompt", which works because
"set prompt \(gdb\) " is equivalent to "set prompt (gdb) ".
Perhaps I'm being overly cautious but this feels a bit subtle,
but at any rate as an API choice I'd much rather pass the plain text
form to with_gdb_prompt.
I also discovered that the initial value of gdb_prompt is set in
two places to two different values.
At the global level gdb.exp sets it to "\[(\]gdb\[)\]"
and default_gdb_init sets it to "\\(gdb\\)".
The former form is undesirable as an argument to "set prompt",
but it's not clear to me that just deleting this code won't break
anything. Thus I just changed the value to be consistent and added
a comment.
gdb/testsuite/ChangeLog:
* lib/gdb.exp (gdb_prompt): Add comment and change initial value to
be consistent with what default_gdb_init uses.
(with_gdb_prompt): Change form of PROMPT argument from a regexp to
the plain text of the prompt. Add some logging printfs.
* gdb.perf/disassemble.exp: Update call to with_gdb_prompt.
See:
https://sourceware.org/ml/gdb-patches/2014-09/msg00404.html
We have a number of places that do gdb_run_cmd followed by gdb_expect,
when it would be better to use gdb_test_multiple or gdb_test.
This converts all that "grep gdb_run_cmd -A 2 | grep gdb_expect"
found.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/testsuite/
2014-09-12 Pedro Alves <palves@redhat.com>
* gdb.arch/gdb1558.exp: Replace uses of gdb_expect after
gdb_run_cmd with gdb_test_multiple or gdb_test throughout.
* gdb.arch/i386-size-overlap.exp: Likewise.
* gdb.arch/i386-size.exp: Likewise.
* gdb.arch/i386-unwind.exp: Likewise.
* gdb.base/a2-run.exp: Likewise.
* gdb.base/break.exp: Likewise.
* gdb.base/charset.exp: Likewise.
* gdb.base/chng-syms.exp: Likewise.
* gdb.base/commands.exp: Likewise.
* gdb.base/dbx.exp: Likewise.
* gdb.base/find.exp: Likewise.
* gdb.base/funcargs.exp: Likewise.
* gdb.base/jit-simple.exp: Likewise.
* gdb.base/reread.exp: Likewise.
* gdb.base/sepdebug.exp: Likewise.
* gdb.base/step-bt.exp: Likewise.
* gdb.cp/mb-inline.exp: Likewise.
* gdb.cp/mb-templates.exp: Likewise.
* gdb.objc/basicclass.exp: Likewise.
* gdb.threads/killed.exp: Likewise.
The problem is that rs6000_frame_cache attempts to read the stack backchain via
read_memory_unsigned_integer, which throws an exception if the stack pointer is
invalid. With this patch, it calls safe_read_memory_integer instead, which
doesn't throw an exception and allows for safe handling of that situation.
gdb/ChangeLog
2014-09-12 Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Ulrich Weigand <uweigand@de.ibm.com>
PR tdep/17379
* rs6000-tdep.c (rs6000_frame_cache): Use safe_read_memory_integer
instead of read_memory_unsigned_integer.
gdb/testsuite/ChangeLog
2014-09-12 Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
PR tdep/17379
* gdb.arch/powerpc-stackless.S: New file.
* gdb.arch/powerpc-stackless.exp: New file.
I have started seeing occasional runaway 'attach' processes these days.
I cannot be certain it is really caused by this patch, for example
grep 'FAIL.*cmdline attach run' does not show anything in my logs.
But since, as I remember, this runaway 'attach' process has always happened in GDB
(although I do not remember seeing it in the past months), I think it would be safest
to just solve it for good by [attached].
gdb/testsuite/ChangeLog
2014-09-12 Jan Kratochvil <jan.kratochvil@redhat.com>
* gdb.base/attach.c: Include unistd.h.
(main): Call alarm. Add label postloop.
* gdb.base/attach.exp (do_attach_tests): Use gdb_get_line_number,
gdb_breakpoint, gdb_continue_to_breakpoint.
(test_command_line_attach_run): Kill ${testpid} in one exit path.
Doing:
gdb --pid=PID -ex run
Results in GDB getting a SIGTTIN, and thus ending stopped. That's
usually indicative of a missing target_terminal_ours call.
E.g., from the PR:
$ sleep 1h & p=$!; sleep 0.1; gdb -batch sleep $p -ex run
[1] 28263
[1] Killed sleep 1h
[2]+ Stopped gdb -batch sleep $p -ex run
The workaround is doing:
gdb -ex "attach $PID" -ex "run"
instead of
gdb [-p] $PID -ex "run"
With the former, gdb waits for the attach command to complete before
moving on to the "run" command, because the interpreter is in sync
mode at this point, within execute_command. But for the latter,
attach_command is called directly from captured_main, and thus misses
that waiting. IOW, "run" is running before the attach continuation
has run, before the program stops and attach completes. The broken
terminal settings are just one symptom of that. Any command that
queries or requires input results in the same.
The fix is to wait in catch_command_errors (which is specific to
main.c nowadays), just like we wait in execute_command.
gdb/ChangeLog:
2014-09-11 Pedro Alves <palves@redhat.com>
PR gdb/17347
* main.c: Include "infrun.h".
(catch_command_errors, catch_command_errors_const): Wait for the
foreground command to complete.
* top.c (maybe_wait_sync_command_done): New function, factored out
from ...
(execute_command): ... here.
* top.h (maybe_wait_sync_command_done): New declaration.
gdb/testsuite/ChangeLog:
2014-09-11 Pedro Alves <palves@redhat.com>
PR gdb/17347
* lib/gdb.exp (gdb_spawn_with_cmdline_opts): New procedure.
* gdb.base/attach.exp (test_command_line_attach_run): New
procedure.
(top level): Call it.
Several places in the testsuite have a copy of a snippet of code that
spawns a test program, waits a bit, and then does some PID munging for
Cygwin. This is in order to have GDB attach to the spawned program.
This refactors all that to a common procedure.
(multi-attach.exp wants to spawn multiple processes, so this makes the
new procedure's interface work with lists.)
Tested on x86_64 Fedora 20.
gdb/testsuite/ChangeLog:
2014-09-11 Pedro Alves <palves@redhat.com>
* lib/gdb.exp (spawn_wait_for_attach): New procedure.
* gdb.base/attach.exp (do_attach_tests, do_call_attach_tests)
(do_command_attach_tests): Use spawn_wait_for_attach.
* gdb.base/solib-overlap.exp: Likewise.
* gdb.multi/multi-attach.exp: Likewise.
* gdb.python/py-prompt.exp: Likewise.
* gdb.python/py-sync-interp.exp: Likewise.
* gdb.server/ext-attach.exp: Likewise.
This fixes two FAIL results on this testcase which were caused by a
misplaced "continue" command. This testcase used to end inferior's
execution too soon, causing the following tests to fail. Now we break
right after inferior's loop and perform the rest of the tests there.
gdb/testsuite/ChangeLog:
* gdb.fortran/array-element.exp: Remove unexpected "continue"
command in testcase. Simplify testcase.
Trying to print the bounds or the length of a pointer to an array
whose bounds are dynamic results in the following error:
(gdb) p foo.three_ptr.all'first
Location address is not set.
(gdb) p foo.three_ptr.all'length
Location address is not set.
This is because, after having dereferenced our array pointer, we
use the type of the resulting array value, instead of the enclosing
type. The former is the original type where the bounds are unresolved,
whereas we need to get the actual array bounds.
Similarly, trying to apply those attributes to the array pointer
directly (without explicitly dereferencing it with the '.all'
operator) yields the same kind of error:
(gdb) p foo.three_ptr'first
Location address is not set.
(gdb) p foo.three_ptr'length
Location address is not set.
This is caused by the fact that the dereference was done implicitly
in this case, and performed at the type level only, which is not
sufficient to resolve the array type.
This patch fixes both issues, thus allowing us to get the expected output:
(gdb) p foo.three_ptr.all'first
$1 = 1
(gdb) p foo.three_ptr.all'length
$2 = 3
(gdb) p foo.three_ptr'first
$3 = 1
(gdb) p foo.three_ptr'length
$4 = 3
gdb/ChangeLog:
* ada-lang.c (ada_array_bound): If ARR is a TYPE_CODE_PTR,
dereference it first. Use value_enclosing_type instead of
value_type.
(ada_array_length): Likewise.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dynarr-ptr.exp: Add 'first, 'last and 'length tests.
Consider a pointer to an array with dynamic bounds, described in
DWARF as follows:
<1><25>: Abbrev Number: 4 (DW_TAG_array_type)
<26> DW_AT_name : foo__array_type
[...]
<2><3b>: Abbrev Number: 5 (DW_TAG_subrange_type)
[...]
<40> DW_AT_lower_bound : 5 byte block: 97 38 1c 94 4
(DW_OP_push_object_address; DW_OP_lit8; DW_OP_minus;
DW_OP_deref_size: 4)
<46> DW_AT_upper_bound : 5 byte block: 97 34 1c 94 4
(DW_OP_push_object_address; DW_OP_lit4; DW_OP_minus;
DW_OP_deref_size: 4)
GDB is now able to correctly print the entire array, but not one
element of the array. Eg:
(gdb) p foo.three_ptr.all
$1 = (1, 2, 3)
(gdb) p foo.three_ptr.all(1)
Cannot access memory at address 0xfffffffff4123a0c
The problem occurs because we are missing a dynamic resolution of
the variable's array type when subscripting the array. What the current
code does is "fix"-ing the array type using the GNAT encodings, but
that operation ignores any of the array's dynamic properties.
This patch fixes the issue by using ada_value_ind to dereference
the array pointer, which takes care of the array type resolution.
It also continues to "fix" arrays described using GNAT encodings,
so backwards compatibility is preserved.
gdb/ChangeLog:
* ada-lang.c (ada_value_ptr_subscript): Remove parameter "type".
Adjust function implementation and documentation accordingly.
(ada_evaluate_subexp) <OP_FUNCALL>: Only assign "type" if
NOSIDE is EVAL_AVOID_SIDE_EFFECTS.
Update call to ada_value_ptr_subscript.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dynarr-ptr.exp: Add subscripting tests.
Consider the following declaration:
type Array_Type is array (Natural range <>) of Integer;
type Array_Ptr is access all Array_Type;
for Array_Ptr'Size use 64;
Three_Ptr : Array_Ptr := new Array_Type'(1 => 1, 2 => 2, 3 => 3);
This creates a pointer to an array where the bounds are stored
in a memory region just before the array itself (aka a "thin pointer").
In DWARF, this is described as the usual pointer type to an array
whose subrange has dynamic values for its bounds:
<1><25>: Abbrev Number: 4 (DW_TAG_array_type)
<26> DW_AT_name : foo__array_type
[...]
<2><3b>: Abbrev Number: 5 (DW_TAG_subrange_type)
[...]
<40> DW_AT_lower_bound : 5 byte block: 97 38 1c 94 4
(DW_OP_push_object_address; DW_OP_lit8; DW_OP_minus;
DW_OP_deref_size: 4)
<46> DW_AT_upper_bound : 5 byte block: 97 34 1c 94 4
(DW_OP_push_object_address; DW_OP_lit4; DW_OP_minus;
DW_OP_deref_size: 4)
GDB is currently printing the value of the array incorrectly:
(gdb) p foo.three_ptr.all
$1 = (26629472 => 1, 2,
value.c:819: internal-error: value_contents_bits_eq: [...]
The dereferencing (".all" operator) is done by calling ada_value_ind,
which itself calls value_ind. It first produces a new value where
the bounds of the array were correctly resolved to their actual value,
but then calls readjust_indirect_value_type which replaces the resolved
type by the original type.
The problem starts when ada_value_print does not take this situation
into account, and starts using the type of the resulting value, which
has unresolved array bounds, instead of using the value's enclosing
type.
After fixing this issue, the debugger now correctly prints:
(gdb) p foo.three_ptr.all
$1 = (1, 2, 3)
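As a rough illustration of what the two bound expressions in the DWARF above compute, here is a small stack-machine sketch. It handles only the four opcodes that appear in those blocks. The memory image, the array address and the little-endian zero-extended read are assumptions made for this example and are not taken from GDB's DWARF evaluator.

```python
import struct

# DWARF opcode values used by the bound expressions quoted above.
DW_OP_minus = 0x1C
DW_OP_lit0 = 0x30
DW_OP_lit31 = 0x4F
DW_OP_deref_size = 0x94
DW_OP_push_object_address = 0x97


def eval_bound(expr, memory, object_address):
    """Evaluate a tiny subset of DWARF expression opcodes."""
    stack = []
    i = 0
    while i < len(expr):
        op = expr[i]
        i += 1
        if op == DW_OP_push_object_address:
            stack.append(object_address)
        elif DW_OP_lit0 <= op <= DW_OP_lit31:
            stack.append(op - DW_OP_lit0)
        elif op == DW_OP_minus:
            top = stack.pop()
            second = stack.pop()
            stack.append(second - top)
        elif op == DW_OP_deref_size:
            size = expr[i]  # one-byte operand: number of bytes to read
            i += 1
            addr = stack.pop()
            # Zero-extended little-endian read; an assumption of this sketch.
            stack.append(int.from_bytes(memory[addr:addr + size], "little"))
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack[-1]


# Hypothetical memory image: bounds 1 and 3 sit just before the array data.
ARRAY_ADDR = 16
memory = bytearray(32)
struct.pack_into("<ii", memory, ARRAY_ADDR - 8, 1, 3)   # lower, upper
struct.pack_into("<iii", memory, ARRAY_ADDR, 1, 2, 3)   # elements

lower = eval_bound(bytes.fromhex("97381c9404"), memory, ARRAY_ADDR)
upper = eval_bound(bytes.fromhex("97341c9404"), memory, ARRAY_ADDR)
print(lower, upper, upper - lower + 1)
```

The point of the sketch is that both bounds depend on the object address, so they can only be resolved against an actual value, not at the type level alone.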
gdb/ChangeLog:
* ada-valprint.c (ada_value_print): Use VAL's enclosing type
instead of VAL's type.
gdb/testsuite/ChangeLog:
* gdb.dwarf2/dynarr-ptr.c: New file.
* gdb.dwarf2/dynarr-ptr.exp: New file.
Similarly to the previous changes to gdb.reverse/sigall-reverse.exp and
gdb.reverse/until-precsave.exp this corrects the timeout tweak in
gdb.base/watchpoint-solib.exp.
This test case executes a large amount of code with a software watchpoint
enabled. This means single-stepping all the way through and takes a lot
of time, e.g. for an ARMv7 Panda board and a `-march=armv5te' multilib:
PASS: gdb.base/watchpoint-solib.exp: continue to foo again
elapsed: 714
for the same board and a `-mthumb -march=armv5te' multilib:
PASS: gdb.base/watchpoint-solib.exp: continue to foo again
elapsed: 1275
and for QEMU in the system emulation mode and a `-march=armv4t'
multilib:
PASS: gdb.base/watchpoint-solib.exp: continue to foo again
elapsed: 115
(values in seconds) -- all of which having the default timeout of 60s,
set based on the requirement of the remaining test cases (other than
gdb.reverse ones).
Here again, to be meaningful the timeout extension should be calculated
by scaling rather than by using an arbitrary constant, and a larger factor
of 30 will do, leaving some margin. Hopefully that works for everyone;
otherwise we'll probably have to come up with a smarter solution.
OTOH the other test cases in this script do not require the extension so
they can be moved outside its umbrella so as to avoid unnecessary delays
if something goes wrong and a genuine timeout triggers.
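The arithmetic behind the factor of 30 can be checked with a short sketch. The elapsed times and the 60s default are the ones quoted above; the helper name and the way the `gdb,timeout' setting is combined with the global timeout (taking the higher of the two before scaling) are assumptions of this example, not a transcription of the Tcl code.

```python
SCALE = 30          # factor chosen by this change
OLD_FIXED = 120     # previously hardcoded value, in seconds


def scaled_timeout(global_timeout, target_timeout=None, factor=SCALE):
    """Scale the larger of the two base timeouts (assumed combination rule)."""
    base = global_timeout
    if target_timeout is not None and target_timeout > base:
        base = target_timeout
    return base * factor


# Elapsed times (seconds) for "continue to foo again", as quoted above.
elapsed = {
    "Panda -march=armv5te": 714,
    "Panda -mthumb -march=armv5te": 1275,
    "QEMU -march=armv4t": 115,
}

default_timeout = 60
new_limit = scaled_timeout(default_timeout)

for name, seconds in elapsed.items():
    # Columns: fits within the old fixed limit, fits within the scaled limit.
    print(name, seconds <= OLD_FIXED, seconds <= new_limit)
```

With a 60s base the scaled limit is 1800s, which covers the slowest sample (1275s) with some margin, whereas the fixed 120s only covered the QEMU run.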
* gdb.base/watchpoint-solib.exp: Increase the timeout by a factor
of 30 rather than hardcoding 120 for a slow test case. Take the
`gdb,timeout' target setting into account for this calculation.
Don't extend the timeout for the test cases that don't need it.
There are three cases in two scripts in the gdb.reverse subset that
take a particularly long time. Two of them already try to account
for this by extending the timeout from the default. The remaining
one takes no precautions. The timeout extension is ineffective
though: it is done by adding a constant rather than by scaling, and as
a result, while it may work for target boards that are satisfied with
the default test timeout of 10s, it does not serve its purpose for
slower ones.
Here are indicative samples of execution times (in seconds) observed
for these cases respectively, for an ARMv7 Panda board running Linux
and a `-march=armv5te' multilib:
PASS: gdb.reverse/sigall-reverse.exp: continue to signal exit
elapsed: 385
PASS: gdb.reverse/until-precsave.exp: run to end of main
elapsed: 4440
PASS: gdb.reverse/until-precsave.exp: save process recfile
elapsed: 965
for the same board and a `-mthumb -march=armv5te' multilib:
PASS: gdb.reverse/sigall-reverse.exp: continue to signal exit
elapsed: 465
PASS: gdb.reverse/until-precsave.exp: run to end of main
elapsed: 4191
PASS: gdb.reverse/until-precsave.exp: save process recfile
elapsed: 669
and for QEMU in the system emulation mode and a `-march=armv4t'
multilib:
PASS: gdb.reverse/sigall-reverse.exp: continue to signal exit
elapsed: 45
PASS: gdb.reverse/until-precsave.exp: run to end of main
elapsed: 433
PASS: gdb.reverse/until-precsave.exp: save process recfile
elapsed: 104
Based on the performance of other tests these two test configurations
have their default timeout set to 450s and 60s respectively.
The remaining two multilibs (`-mthumb -march=armv4t' and `-mthumb
-march=armv7-a') do not produce test results usable enough to have data
available for these cases.
Based on these results I have tweaked timeouts for these cases as
follows. This, together with a suitable board timeout setting, removes
timeouts for these cases. Note that for the default timeout of 10s the
new setting for the first case in gdb.reverse/until-precsave.exp is
compatible with the old one, just a bit higher to keep the convention
of longer timeouts to remain multiples of 30s. The second case there
does not need such a high setting so I have lowered it a bit to avoid
an unnecessary delay where this test case genuinely times out.
* gdb.reverse/sigall-reverse.exp: Increase the timeout by
a factor of 2 for a slow test case. Take the `gdb,timeout'
target setting into account for this calculation.
* gdb.reverse/until-precsave.exp: Increase the timeout by
a factor of 15 and 3 respectively rather than adding 120
for a pair of slow test cases. Take the `gdb,timeout'
target setting into account for this calculation.
The recent change to introduce `gdb_reverse_timeout' turned out
ineffective for board setups that set the `gdb,timeout' target variable.
A lower `gdb,timeout' setting takes precedence and defeats the effect of
`gdb_reverse_timeout'. This is because the global timeout is overridden
in gdb_test_multiple and then again in gdb_expect.
Three timeout variables are taken into account in these two places, in
this precedence:
1. The `gdb,timeout' target variable.
2. The caller's local `timeout' variable (upvar timeout)
3. The global `timeout' variable.
This precedence is obeyed by gdb_test_multiple strictly. OTOH
gdb_expect will select the higher of the first two and will only take
the third into account if neither of the first two is present. However the
two timeout selections are conceptually the same and gdb_test_multiple
does its selection only for the purpose of passing it down to gdb_expect.
Therefore I decided there is no point to keep carrying on this
duplication and removed the sequence from gdb_test_multiple, however
retaining the `upvar timeout' variable definition. This way gdb_expect
will still access gdb_test_multiple's caller `timeout' variable (if any)
via its own `upvar timeout' reference.
Now as to the sequence in gdb_expect. In addition to the three
variables described above it also takes a timeout argument into account,
as the fourth value to choose from. It is currently used if it is
higher than the timeout selected from the variables as described above.
With the timeout selection code from gdb_test_multiple gone, gone is
also the most prominent use of this timeout argument; it's now used in
a couple of places only, mostly within this test framework library code
itself for preparatory commands or suchlike. With this being the case
this timeout selection code can be simplified as follows:
1. Among the three timeout variables, the highest is always chosen.
This is so that a test case doesn't inadvertently lower a high value
timeout needed by slow target boards. This is what all test cases
use.
2. Any timeout argument takes precedence. This is for special cases
such as within the framework library code, e.g. it doesn't make sense
to send `set height 0' with a timeout of 7200 seconds. This is a
local command that does not interact with the target and setting a
high timeout here only risks a test suite run taking ages if it goes
astray for some reason.
3. The fallback timeout of 60s remains.
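The three rules above can be sketched as follows. This is only an illustration of the selection logic in Python; the function and parameter names are hypothetical and the real code is Tcl in lib/gdb.exp.

```python
FALLBACK_TIMEOUT = 60  # rule 3: used when nothing else is set


def select_timeout(target=None, caller=None, global_=None, argument=None):
    """Pick a timeout following the three rules described above.

    target   -- the `gdb,timeout' target variable, if set
    caller   -- the caller's local `timeout' variable, if any
    global_  -- the global `timeout' variable, if set
    argument -- an explicit timeout argument, if given
    """
    # Rule 2: an explicit argument always wins.
    if argument is not None:
        return argument
    # Rule 1: otherwise the highest of the variables that are set.
    candidates = [t for t in (target, caller, global_) if t is not None]
    if candidates:
        return max(candidates)
    # Rule 3: nothing set at all.
    return FALLBACK_TIMEOUT


print(select_timeout(target=7200, caller=30, global_=10))
print(select_timeout(target=7200, caller=30, global_=10, argument=10))
print(select_timeout())
```

The second call corresponds to the `set height 0' case: a slow board's 7200s setting is ignored because the library passed its own short timeout.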
* lib/gdb.exp (gdb_test_multiple): Remove code to select the
timeout, don't pass one down to gdb_expect.
(gdb_expect): Rework timeout selection.
As it happens we have a board that fails a gdb.base/gcore-relro.exp
test case reproducibly and moreover the case appears to trigger a
kernel bug making the board less than usable. Specifically, it
remains responsive to some extent, however processes do not appear
to be able to successfully complete termination anymore and perhaps
more importantly further gdbserver processes can be started, but they
never reach the stage of listening on the RSP socket.
This change handles timeouts in gdbserver start properly, by throwing
a TCL error exception when gdbserver does not report listening on the
RSP socket in time. This is then caught at the outer level and
reported, and 2 rather than 1 is returned so that the caller may tell
the failure to start gdbserver and other issues apart and act
accordingly (or do nothing).
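The control flow described above might look roughly like this. It is a Python sketch with made-up names (ServerStartTimeout, start_server, load_with_server); the actual change throws and catches Tcl errors in lib/gdbserver-support.exp.

```python
class ServerStartTimeout(Exception):
    """Hypothetical stand-in for the Tcl error thrown on timeout."""


def start_server(listening_reported):
    """Pretend to start the server; fail if it never reports listening."""
    if not listening_reported:
        raise ServerStartTimeout("timeout waiting for server to listen")
    return 0


def load_with_server(listening_reported, other_failure=False):
    """Return 0 on success, 1 for other failures, 2 for a start timeout."""
    try:
        start_server(listening_reported)
    except ServerStartTimeout as exc:
        # Report the problem, then hand back a distinct non-zero value.
        print(f"ERROR: {exc}")
        return 2
    if other_failure:
        return 1
    return 0


print(load_with_server(True))
print(load_with_server(True, other_failure=True))
print(load_with_server(False))
```

A caller that does not care about the distinction can keep testing for non-zero; one that does can treat 2 as "try to recover the board".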
I thought letting the exception unwind further on might be a good idea
for any test harnesses out there to break outright where a gdbserver
start error is silently ignored right now, however I figured out the
calls to gdbserver-support.exp are buried down too deep in the GDB test
suite for such a change to be made easily. I think returning a distinct
return value is good enough (the API says "non-zero", so 2 is as good as
1) and we can always make the error harder in a later step if required.
With config/gdbserver.exp being used this change remains transparent
to the target board, the return value is passed up by gdb_reload and
the error exception unwinds through gdbserver_gdb_load and is caught
and handled by mi_gdb_target_load. A call to perror is still made,
reporting the timeout, and in the case of mi_gdb_target_load the
procedure returns a value denoting unsuccessful completion. An
unsuccessful completion of gdb_reload is already handled elsewhere.
An alternative gdbserver board configuration can interpret the return
value in its gdb_reload implementation and catch the error in
gdbserver_gdb_load in an attempt to recover a target board that has
gone astray, for example by rebooting the board somehow. This has
proved effective with our failing board, that now completes the
remaining test cases with no further hiccups.
* lib/gdbserver-support.exp (gdbserver_start): Throw an error
exception on timeout.
(gdbserver_run): Catch any `gdbserver_spawn' error exceptions.
(gdbserver_start_extended): Catch any `gdbserver_start' error
exceptions.
(gdbserver_start_multi, mi_gdbserver_start_multi): Likewise.
* lib/mi-support.exp (mi_gdb_target_load): Catch any
`gdbserver_gdb_load' error exceptions.
Gdbserver support code uses the global timeout value to determine when
to stop waiting for a gdbserver process being started to respond before
continuing anyway. This timeout is usually as low as 10s and may not
be enough in this context, for example on the first run where the
filesystem cache is cold, even if it is elsewhere.
E.g. I observe this reliably with gdbserver started the first time in
QEMU running in the system emulation mode:
(gdb) file .../gdb.base/advance
Reading symbols from .../gdb.base/advance...done.
(gdb) delete breakpoints
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) break main
Breakpoint 1 at 0x87f8: file .../gdb.base/advance.c,
line 41.
(gdb) set remotetimeout 15
(gdb) kill
The program is not being run.
(gdb)
[...]
.../bin/gdbserver --once :6014 advance
target remote localhost:6014
Remote debugging using localhost:6014
Remote communication error. Target disconnected.: Connection reset by peer.
(gdb) continue
The program is not being run.
(gdb) Process advance created; pid = 999
Listening on port 6014
FAIL: gdb.base/advance.exp: Can't run to main
-- notice how the test harness proceeded with the `target remote ...'
command even though gdbserver hasn't completed its startup yet. A
while later when it's finally ready it's too late already. I checked
the timing here and it takes gdbserver roughly 25 seconds to start in
this scenario. Subsequent gdbserver starts in the same test run take
less time and usually complete within 10 seconds although occasionally
`target remote ...' precedes the corresponding `Listening on port...'
message again.
Therefore I have fixed this problem by setting an explicit timeout to
120s on the expect call in question. If this turns out too arbitrary
sometime, then perhaps a separate `gdbserver_timeout' setting might be
due.
* lib/gdbserver-support.exp (gdbserver_start): Set timeout to
120 on waiting for the TCP socket to open.
Hi,
I see the following fail on arm-none-eabi target,
-var-evaluate-expression -f nat foo^M
^done,value="0x3 <_ftext+2>"^M
(gdb) ^M
FAIL: gdb.mi/mi-var-display.exp: eval variable -f nat foo
the "<_ftext+2>" isn't expected in the test, so "set print symbol off"
can prevent printing it. It is obvious and I'll commit it in three
days if no comments.
gdb/testsuite:
2014-09-09 Yao Qi <yao@codesourcery.com>
* gdb.mi/mi-var-display.exp: Set print symbol off.
have empty bodies.
User-defined commands that have empty bodies weren't being shown because
the print function returned too soon. Now, it prints the command's name
before checking if it has any body at all. This also fixes the same
problem on "show user <myemptycommand>", which wasn't being printed due
to a similar reason.
gdb/ChangeLog:
* cli/cli-cmds.c (show_user): Use cli_user_command_p to
decide whether we display the command on "show user".
* cli/cli-script.c (show_user_1): Only verify cmdlines after
printing command name.
* cli/cli-decode.h (cli_user_command_p): Declare new function.
* cli/cli-decode.c (cli_user_command_p): Create helper function
to verify whether cmd_list_element is a user-defined command.
gdb/testsuite/ChangeLog:
* gdb.base/commands.exp: Add tests to verify user-defined
commands with empty bodies.
* gdb.python/py-cmd.exp: Test that we don't show user-defined
python commands in `show user command`.
* gdb.python/scm-cmd.exp: Test that we don't show user-defined
scheme commands in `show user command`.
https://bugzilla.redhat.com/show_bug.cgi?id=1126177
ERROR: AddressSanitizer: SEGV on unknown address 0x000000000050 (pc 0x000000992bef sp 0x7ffff9039530 bp 0x7ffff9039540 T0)
#0 0x992bee in value_type .../gdb/value.c:925
#1 0x87c951 in py_print_single_arg python/py-framefilter.c:445
#2 0x87cfae in enumerate_args python/py-framefilter.c:596
#3 0x87e0b0 in py_print_args python/py-framefilter.c:968
It crashes because frame_arg::val is documented as possibly being NULL
(frame_arg::error is then non-NULL) but the code does not handle that case.
Another bug is that py_print_single_arg() uses goto to jump out of its TRY_CATCH,
which corrupts GDB's cleanup chain and crashes GDB later.
It is probably a 7.7 regression (I have not verified it) due to the introduction
of Python frame filters.
gdb/ChangeLog
PR python/17355
* python/py-framefilter.c (py_print_single_arg): Handle NULL FA->VAL.
Fix goto out of TRY_CATCH.
gdb/testsuite/ChangeLog
PR python/17355
* gdb.python/amd64-py-framefilter-invalidarg.S: New file.
* gdb.python/py-framefilter-invalidarg-gdb.py.in: New file.
* gdb.python/py-framefilter-invalidarg.exp: New file.
* gdb.python/py-framefilter-invalidarg.py: New file.
This patch is a fix to PR gdb/17235. The bug is about an unused
variable that got declared and set during one of the parsing phases of
an SDT probe's argument. I took the opportunity to rewrite some of the
code to improve the parsing. The bug was actually a thinko, because
what I wanted to do in the code was to discard the number on the string
being parsed.
During this portion, the code identifies that it is dealing with an
expression that begins with a sign ('+', '-' or '~'). This means that
the expression could be:
- a numeric literal (e.g., '+5')
- a register displacement (e.g., '-4(%rsp)')
- a subexpression (e.g., '-(2*3)')
So, after saving the sign and moving forward 1 char, now the code needs
to know if there is a digit followed by a register displacement prefix
operand (e.g., '(' on x86_64). If yes, then it is a register
operation. If not, then it will be handled recursively, and the code
will later apply the requested operation on the result (either a '+', a
'-' or a '~').
With the bug, the code was correctly discarding the digit (though using
strtol unnecessarily), but it wasn't properly dealing with
subexpressions when the register indirection prefix was '(', like on
x86_64. This patch also fixes this bug, and includes a testcase. It
passes on x86_64 Fedora 20.
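The decision described above (after the sign, is there a digit run followed by the register indirection prefix?) can be sketched like this. It is a simplified Python illustration with hypothetical names, not the C code from the patch, and it only classifies the operand rather than parsing it.

```python
def classify_signed_operand(text, indirection_prefix="("):
    """Classify an SDT-style operand that starts with '+', '-' or '~'.

    Returns "register displacement" when the sign is followed by digits
    and then the indirection prefix, otherwise "recursive" (the rest is
    parsed as a subexpression and the sign applied to the result).
    """
    if not text or text[0] not in "+-~":
        raise ValueError("operand must start with '+', '-' or '~'")
    rest = text[1:]
    # Count the leading digits after the sign.
    ndigits = len(rest) - len(rest.lstrip("0123456789"))
    if ndigits > 0 and rest[ndigits:].startswith(indirection_prefix):
        return "register displacement"
    return "recursive"


for operand in ("+5", "-4(%rsp)", "-(2*3)"):
    print(operand, "->", classify_signed_operand(operand))
```

The third case is the one that matters here: '(' directly after the sign, with no digits in between, has to be treated as a subexpression even though '(' is also the indirection prefix.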
This commit fixes the PR mentioned in $subject. It is about a set but
unused variable that refers to the output format of integer values
printed in Fortran.
This was probably a thinko (like most set-but-unused-vars), but it
could cause an internal error depending on the scenario. I am sending
a testcase which triggers this error as well.
gdb/ChangeLog:
2014-09-04 Sergio Durigan Junior <sergiodj@redhat.com>
PR fortran/17237
* f-valprint.c (f_val_print): Specify the correct print option to
use when printing integer values.
gdb/testsuite/ChangeLog:
2014-09-04 Sergio Durigan Junior <sergiodj@redhat.com>
PR fortran/17237
* gdb.fortran/print-formatted.exp: New file.
* gdb.fortran/print-formatted.f90: Likewise.
The ability to read registers is needed to use Frame Filter API to
display the frames created by JIT compilers.
gdb/ChangeLog:
2014-08-29 Sasha Smundak <asmundak@google.com>
* python/py-frame.c (frapy_read_register): New function.
gdb/doc/ChangeLog:
2014-08-26 Sasha Smundak <asmundak@google.com>
* python.texi (Frames in Python): Add read_register description.
gdb/testsuite/ChangeLog:
2014-08-26 Sasha Smundak <asmundak@google.com>
* gdb.python/py-frame.exp: Test Frame.read_register.
This PR came from a Red Hat bug that was filed recently. I checked and
it still exists on HEAD, so here's a proposed fix. Although this is
marked as a Python backend bug, this is really about the completion
mechanism used by GDB. Since this code reminds me of my first attempt
to make a good noodle, it took me quite some time to fix it in a
non-intrusive way.
The problem is triggered when one registers a completion method inside a
class in a Python script, rather than registering the command using a
completer class directly. For example, consider the following script:
class MyFirstCommand(gdb.Command):
    def __init__(self):
        gdb.Command.__init__(self,'myfirstcommand',gdb.COMMAND_USER,gdb.COMPLETE_FILENAME)
    def invoke(self,argument,from_tty):
        raise gdb.GdbError('not implemented')
class MySecondCommand(gdb.Command):
    def __init__(self):
        gdb.Command.__init__(self,'mysecondcommand',gdb.COMMAND_USER)
    def invoke(self,argument,from_tty):
        raise gdb.GdbError('not implemented')
    def complete(self,text,word):
        return gdb.COMPLETE_FILENAME
MyFirstCommand ()
MySecondCommand ()
When one loads this into GDB and tries to complete filenames for both
myfirstcommand and mysecondcommand, she gets:
(gdb) myfirstcommand /hom<TAB>
(gdb) myfirstcommand /home/
^
...
(gdb) mysecondcommand /hom<TAB>
(gdb) mysecondcommand /home
^
(The "^" marks the final position of the cursor after the TAB).
So we see that myfirstcommand honors the COMPLETE_FILENAME class (as
specified in the command creation), but mysecondcommand does not. After
some investigation, I found that the problem lies with the set of word
break characters that is used for each case. The set should be the same
for both commands, but it is not.
During the process of deciding which type of completion should be used,
the code in gdb/completer.c:complete_line_internal analyses the command
that requested the completion and tries to determine the type of
completion wanted by checking which completion function will be called
(e.g., filename_completer for filenames, location_completer for
locations, etc.).
This all works fine for myfirstcommand, because immediately after the
command registration the Python backend already sets its completion
function to filename_completer (which then causes the
complete_line_internal function to choose the right set of word break
chars). However, for mysecondcommand, this decision is postponed to
when the completer function is evaluated, and the Python backend uses an
internal completer (called cmdpy_completer). complete_line_internal
doesn't know about this internal completer, and can't choose the right
set of word break chars in time, which then leads to a bad decision when
completing the "/hom" word.
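To make the role of the word break characters concrete, here is a small standalone sketch. The two character sets are illustrative only and are not copied from GDB, and word_start is a made-up helper. It shows nothing more than how the choice of set changes the word that is handed to the completer.

```python
def word_start(line, break_chars):
    """Return the index where the word under completion begins."""
    i = len(line)
    while i > 0 and line[i - 1] not in break_chars:
        i -= 1
    return i


# Illustrative sets only: the general set treats '/' as a break
# character, the filename set does not.
COMMAND_BREAK_CHARS = " \t\n/"
FILENAME_BREAK_CHARS = " \t\n"

line = "mysecondcommand /hom"
word_with_command_set = line[word_start(line, COMMAND_BREAK_CHARS):]
word_with_filename_set = line[word_start(line, FILENAME_BREAK_CHARS):]

print(repr(word_with_command_set), repr(word_with_filename_set))
```

With the general set the completer sees only "hom", while with the filename set it sees the whole "/hom" path prefix, which is why the set has to be chosen before the word is extracted.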
So, after a few attempts, I decided to create another callback in
"struct cmd_list_element" that will be responsible for handling the case
when there is an unknown completer function for complete_line_internal
to work with. So far, only the Python backend uses this callback, and
only when the user provides a completer method instead of registering
the command directly with a completer class. I think this is the best
option because it is not very intrusive (all the other commands will still
work normally), but especially because the whole completion code is so
messy that it would be hard to fix this without having to redesign
things.
I have regtested this on Fedora 18 x86_64, without regressions. I also
included a testcase.
gdb/ChangeLog:
2014-09-03 Sergio Durigan Junior <sergiodj@redhat.com>
PR python/16699
* cli/cli-decode.c (set_cmd_completer_handle_brkchars): New
function.
(add_cmd): Set "completer_handle_brkchars" to NULL.
* cli/cli-decode.h (struct cmd_list_element)
<completer_handle_brkchars>: New field.
* command.h (completer_ftype_void): New typedef.
(set_cmd_completer_handle_brkchars): New prototype.
* completer.c (set_gdb_completion_word_break_characters): New
function.
(complete_line_internal): Call "completer_handle_brkchars"
callback from command.
* completer.h: Include "command.h".
(set_gdb_completion_word_break_characters): New prototype.
* python/py-cmd.c (cmdpy_completer_helper): New function.
(cmdpy_completer_handle_brkchars): New function.
(cmdpy_completer): Adjust to use cmdpy_completer_helper.
(cmdpy_init): Set completer_handle_brkchars to
cmdpy_completer_handle_brkchars.
gdb/testsuite/ChangeLog:
2014-09-03 Sergio Durigan Junior <sergiodj@redhat.com>
PR python/16699
* gdb.python/py-completion.exp: New file.
* gdb.python/py-completion.py: Likewise.
clang was using eax to construct %0 here:
asm ("mov %%eax, 0(%0)\n\t"
"mov %%ebx, 4(%0)\n\t"
"mov %%ecx, 8(%0)\n\t"
"mov %%edx, 12(%0)\n\t"
"mov %%esi, 16(%0)\n\t"
"mov %%edi, 20(%0)\n\t"
: /* no output operands */
: "r" (data)
: "eax", "ebx", "ecx", "edx", "esi", "edi");
which caused amd64-word.exp (and others similarly) to fail.
It's a perfectly legit thing for clang to do given the available data.
The patch fixes this by marking the registers as live from the
time of the preceding breakpoint.
gdb/testsuite/ChangeLog:
* gdb.arch/amd64-pseudo.c (main): Rewrite to better specify when
eax,etc. are live with values set by gdb and thus the compiler can't
use them.
* gdb.arch/i386-pseudo.c (main): Ditto.