On GNU/Linux, this test sometimes FAILs like this:
(gdb) run
Starting program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/killed
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
ptrace: No such process.
(gdb)
Program terminated with signal SIGKILL, Killed.
The program no longer exists.
FAIL: gdb.threads/killed.exp: run program to completion (timeout)
Note the suspicious "No such process" line (that's errno==ESRCH).
Adding debug output we see:
linux_nat_wait: [process -1], [TARGET_WNOHANG]
LLW: enter
LNW: waitpid(-1, ...) returned 18465, ERRNO-OK
LLW: waitpid 18465 received Stopped (signal) (stopped)
LNW: waitpid(-1, ...) returned 18461, ERRNO-OK
LLW: waitpid 18461 received Trace/breakpoint trap (stopped)
LLW: Handling extended status 0x03057f
LHEW: Got clone event from LWP 18461, new child is LWP 18465
LNW: waitpid(-1, ...) returned 0, ERRNO-OK
RSRL: resuming stopped-resumed LWP LWP 18465 at 0x3b36af4b51: step=0
RSRL: resuming stopped-resumed LWP LWP 18461 at 0x3b36af4b51: step=0
sigchld
ptrace: No such process.
(gdb) linux_nat_wait: [process -1], [TARGET_WNOHANG]
LLW: enter
LNW: waitpid(-1, ...) returned 18465, ERRNO-OK
LLW: waitpid 18465 received Killed (terminated)
LLW: LWP 18465 exited.
LNW: waitpid(-1, ...) returned 18461, No child processes
LLW: waitpid 18461 received Killed (terminated)
Process 18461 exited
LNW: waitpid(-1, ...) returned -1, No child processes
LLW: exit
sigchld
infrun: target_wait (-1, status) =
infrun: 18461 [process 18461],
infrun: status->kind = signalled, signal = GDB_SIGNAL_KILL
infrun: TARGET_WAITKIND_SIGNALLED
Program terminated with signal SIGKILL, Killed.
The program no longer exists.
infrun: stop_waiting
FAIL: gdb.threads/killed.exp: run program to completion (timeout)
The issue is that here:
RSRL: resuming stopped-resumed LWP LWP 18465 at 0x3b36af4b51: step=0
RSRL: resuming stopped-resumed LWP LWP 18461 at 0x3b36af4b51: step=0
The first line shows we had just resumed LWP 18465, which does:
void *
child_func (void *dummy)
{
kill (pid, SIGKILL);
exit (1);
}
So if the kernel manages to schedule that thread fast enough, the
process may be killed before GDB has a chance to resume LWP 18461.
GDBserver has code at the tail end of linux_resume_one_lwp to cope
with this:
~~~
ptrace (step ? PTRACE_SINGLESTEP : PTRACE_CONT, lwpid_of (thread),
(PTRACE_TYPE_ARG3) 0,
/* Coerce to a uintptr_t first to avoid potential gcc warning
of coercing an 8 byte integer to a 4 byte pointer. */
(PTRACE_TYPE_ARG4) (uintptr_t) signal);
current_thread = saved_thread;
if (errno)
{
/* ESRCH from ptrace either means that the thread was already
running (an error) or that it is gone (a race condition). If
it's gone, we will get a notification the next time we wait,
so we can ignore the error. We could differentiate these
two, but it's tricky without waiting; the thread still exists
as a zombie, so sending it signal 0 would succeed. So just
ignore ESRCH. */
if (errno == ESRCH)
return;
perror_with_name ("ptrace");
}
~~~
However, that's not a complete fix, because between starting to handle
the resume request and getting that PTRACE_CONTINUE, we run other
ptrace calls that can also fail with ESRCH, and that end up throwing
an error (with perror_with_name).
In the case above, I indeed sometimes see resume_stopped_resumed_lwps
fail in the registers read:
resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
{
...
CORE_ADDR pc = regcache_read_pc (regcache);
Or e.g., in 32-bit mode, i386_linux_resume has several calls that can
throw too.
Whether to ignore ptrace errors or not depends on context that is only
available somewhere up the call chain. So the fix is to let ptrace
errors throw as they do today, and wrap the resume request in a
TRY/CATCH that swallows it iff the lwp that we were trying to resume
is no longer ptrace-stopped.
gdb/gdbserver/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-low.c (linux_resume_one_lwp): Rename to ...
(linux_resume_one_lwp_throw): ... this. Don't handle ESRCH here,
instead call perror_with_name.
(check_ptrace_stopped_lwp_gone): New function.
(linux_resume_one_lwp): Reimplement as wrapper around
linux_resume_one_lwp_throw that swallows errors if the LWP is
gone.
gdb/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_resume_one_lwp): Rename to ...
(linux_resume_one_lwp_throw): ... this. Don't handle ESRCH here,
instead call perror_with_name.
(check_ptrace_stopped_lwp_gone): New function.
(linux_resume_one_lwp): Reimplement as wrapper around
linux_resume_one_lwp_throw that swallows errors if the LWP is
gone.
(resume_stopped_resumed_lwps): Try register reads in TRY/CATCH and
swallows errors if the LWP is gone. Use
linux_resume_one_lwp_throw instead of linux_resume_one_lwp.
The previous change added an assertion that is catching yet another
bug in count_events_callback/select_event_lwp_callback:
(gdb)
PASS: gdb.mi/mi-nonstop.exp: interrupted
mi_expect_interrupt: expecting: \*stopped,(reason="signal-received",signal-name="0",signal-meaning="Signal 0"|reason="signal-received",signal-name="SIGINT",signal-meaning="Interrupt")[^
]*
/home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:2329: A problem internal to GDBserver has been detected.
select_event_lwp: Assertion `num_events > 0' failed.
=thread-group-exited,id="i1"
Certainly select_event_lwp_callback should always at least find one
event, as it's only called because an event triggered (though we may
have more than one: the point of the function is randomly picking
one).
An LWP that GDB previously asked to continue/step (thus is resumed)
and gets a vCont;t request ends up with last_resume_kind ==
resume_stop. These functions in gdbserver used to filter out events
that weren't going to be reported to GDB; I think the last_resume_kind
kind check used to make sense at that point, but it no longer does.
gdb/gdbserver/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-low.c (count_events_callback, select_event_lwp_callback):
No longer check whether the thread has resume_stop as last resume
kind.
inc * rl78.h (E_FLAG_RL78_G10): Redefine.
(E_FLAG_RL78_CPU_MASK, E_FLAG_RL78_ANY_CPU, E_FLAG_RL78_G13
E_FLAG_RL78_G14): New flags.
bin * readelf.c (get_machine_flags): Decode RL78's G13 and G14 flags.
gas * config/tc-rl78.c (enum options): Add G13 and G14.
(md_longopts): Add -mg13 and -mg14.
(md_parse_option): Handle -mg13 and -mg14.
(md_show_usage): List -mg13 and -mg14.
* doc/c-rl78.texi: Add description of -mg13 and -mg14 options.
bfd * elf32-rl78.c (rl78_cpu_name): New function. Prints the name of
the RL78 core based upon the flags.
(rl78_elf_merge_private_bfd_data): Handle merging of G13 and G14
flags.
(rl78_elf_print_private_bfd_data): Use rl78_cpu_name.
(elf32_rl78_machine): Always return bfd_mach_rl78.
Wanting to make sure the new continue-pending-status.exp test tests
both cases of threads 2 and 3 reporting an event, I added counters to
the test, to make it FAIL if events for both threads aren't seen.
Assuming a well behaved backend, and given a reasonable number of
iterations, it should PASS.
However, running that against GNU/Linux gdbserver, I found that
surprisingly, that FAILed. GDBserver always reported the breakpoint
hit for the same thread.
Turns out that I broke gdbserver's thread event randomization
recently, with git commit 582511be ([gdbserver] linux-low.c: better
starvation avoidance, handle non-stop mode too). In that commit I
missed that the thread structure also has a status_pending_p field...
The end result was that count_events_callback always returns 0, and
then if no thread is stepping, select_event_lwp always returns the
event thread. IOW, no randomization is happening at all. Quite
curious how all the other changes in that patch were sufficient to fix
non-stop-fair-events.exp anyway even with that broken.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/gdbserver/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-low.c (count_events_callback, select_event_lwp_callback):
Use the lwp's status_pending_p field, not the thread's.
gdb/testsuite/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* gdb.threads/continue-pending-status.exp (saw_thread_2)
(saw_thread_3): New globals.
(top level): Increment them when an event for the corresponding
thread is seen.
(no thread starvation): New test.
If the linux_nat_resume's short-circuits the resume because the
current thread has a pending status, and, a thread with a higher
number was previously stopped for a breakpoint, GDB internal errors,
like:
/home/pedro/gdb/mygit/src/gdb/linux-nat.c:2590: internal-error: status_callback: Assertion `lp->status != 0' failed.
Fix this by make status_callback bail out earlier. GDBserver is
already doing the same.
New test added that exercises this.
gdb/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-nat.c (status_callback): Return early if the LWP has no
status pending.
gdb/testsuite/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* gdb.threads/continue-pending-status.c: New file.
* gdb.threads/continue-pending-status.exp: New file.
This function (in both GDB and GDBserver) used to consider only
SIGTRAP/breakpoint events, but that's no longer the case nowadays.
gdb/gdbserver/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-low.c (select_event_lwp_callback): Update comments to
no longer mention SIGTRAP.
gdb/ChangeLog:
2015-03-19 Pedro Alves <palves@redhat.com>
* linux-nat.c (select_event_lwp_callback): Update comment to no
longer mention SIGTRAP.
PR gas/18087
gas/test * gas/i386/dw2-compress-1.d: Allow the test to pass regardless of
whether the .debug_info section was compressed on not.
bfd * compress.c (bfd_compress_section_contents): Do not define this
function if it is not used.
This fixes several problems with this test.
E.g,. with --target_board=native-extended-gdbserver on x86_64 Fedora
20, I get:
Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc
FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn
FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running)
And with --target=native-gdbserver, I get:
Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
KPASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS server/13796)
FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc
FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn
FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running)
First, the lack of fork support on remote targets is supposed to be
kfailed, so the KPASS is obviously bogus. The extended-remote board
should have KFAILed too.
The problem is that the test is using "is_remote" instead of
gdb_is_target_remote.
And then, I get:
(gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on
stepi
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork
Obviously, that should be a FAIL. The problem is that the test only
expects SIGILL, not SIGSEGV. It also doesn't bail correctly if an
internal error or some other pattern caught by gdb_test_multiple
matches. The test doesn't really need to match specific exits/crashes
patterns, if the PASS regex is improved, like in ...
... this and the other "stepi" tests are a bit too lax, passing on
".*". This tightens those up to expect "x/i" and the "=>" current PC
indicator, like in:
1: x/i $pc
=> 0x3b36abc9e2 <vfork+34>: syscall
On x86_64 Fedora 20, I now get a quick KFAIL instead of timeouts with
both the native-extended-gdbserver and native-gdbserver boards:
PASS: gdb.base/disp-step-syscall.exp: vfork: delete break vfork
PASS: gdb.base/disp-step-syscall.exp: vfork: continue to syscall insn vfork
PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on
KFAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS: server/13796)
and a full pass with native testing.
gdb/testsuite/
2015-03-18 Pedro Alves <palves@redhat.com>
* gdb.base/disp-step-syscall.exp (disp_step_cross_syscall):
Use gdb_is_target_remote instead of is_remote. Use
gdb_test_multiple instead of gdb_expect. Exit early if
gdb_test_multiple hits its internal matches. Tighten stepi tests
expected output. Fail on exit with any signal, instead of just
SIGILL.
PR binutils/18087
gas * doc/as.texinfo: Note that when gas compresses debug sections the
compression is only performed if it makes the section smaller.
* write.c (compress_debug): Do not compress a debug section if
doing so would make it larger.
tests * gas/i386/dw2-compress-1.d: Do not expect the .debug_abbrev or
.debug_info sections to be compressed.
binu * doc/binutils.texi: Note that when objcopy compresses debug
sections the compression is only performed if it makes the section
smaller.
bfd * coffgen.c (make_a_section_from_file): Only prepend a z to a
debug section's name if the section was actually compressed.
* elf.c (_bfd_elf_make_section_from_shdr): Likewise.
* compress.c (bfd_init_section_compress_status): Do not compress
the section if doing so would make it bigger. In such cases leave
the section alone and return COMPRESS_SECTION_NONE.
Unwind info in system dlls uses almost all possible codes, contrary to unwind
info generated by gcc. A few issues have been discovered: incorrect handling
of SAVE_NONVOL opcodes and incorrect in prologue range checks. Furthermore I
added comments not to forget what has been investigated.
gdb/ChangeLog:
* amd64-windows-tdep.c (amd64_windows_find_unwind_info): Move
redirection code to ...
(amd64_windows_frame_decode_insns): ... Here. Fix in prologue
checks. Fix SAVE_NONVOL operations. Add debug code and comments.
This commit makes support for the "vFile:fstat" packet be detected
by probing rather than using qSupported, for consistency with the
other vFile: packets.
gdb/ChangeLog:
(remote_protocol_features): Remove the "vFile:fstat" feature.
(remote_hostio_fstat): Probe for "vFile:fstat" support.
gdb/doc/ChangeLog:
* gdb.texinfo (General Query Packets): Remove documentation
for now-removed vFile:fstat qSupported features.
gdb/gdbserver/ChangeLog:
* server.c (handle_query): Do not report vFile:fstat as supported.
Hi,
This patch is to support catch syscall on aarch64 linux. We
implement gdbarch method get_syscall_number for aarch64-linux,
and add aarch64-linux.xml file, which looks straightforward, however
the changes to test case doesn't.
First of all, we enable catch-syscall.exp on aarch64-linux target,
but skip the multi_arch testing on current stage. I plan to touch
multi arch debugging on aarch64-linux later.
Then, when I run catch-syscall.exp on aarch64-linux, gcc errors that
SYS_pipe isn't defined. We find that aarch64 kernel only has pipe2
syscall and libc already convert pipe to pipe2. As a result, I change
catch-syscall.c to use SYS_pipe if it is defined, otherwise use
SYS_pipe2 instead. The vector all_syscalls in catch-syscall.exp can't
be pre-determined, so I add a new proc setup_all_syscalls to fill it,
according to the availability of SYS_pipe.
Regression tested on {x86_64, aarch64}-linux x {native, gdbserver}.
gdb:
2015-03-18 Yao Qi <yao.qi@linaro.org>
PR tdep/18107
* aarch64-linux-tdep.c: Include xml-syscall.h
(aarch64_linux_get_syscall_number): New function.
(aarch64_linux_init_abi): Call
set_gdbarch_get_syscall_number.
* syscalls/aarch64-linux.xml: New file.
gdb/testsuite:
2015-03-18 Yao Qi <yao.qi@linaro.org>
PR tdep/18107
* gdb.base/catch-syscall.c [!SYS_pipe] (pipe2_syscall): New
variable.
* gdb.base/catch-syscall.exp: Don't skip it on
aarch64*-*-linux* target. Remove elements in all_syscalls.
(test_catch_syscall_multi_arch): Skip it on aarch64*-linux*
target.
(setup_all_syscalls): New proc.
When src or dst is NULL, the next fread or fwrite will cause a
segmentation fault, so we need to treat it as fatal.
* ldmain.c (main): Use %F instead of %X for einfo.
Forward declarations of struct stat break the Windows build.
This commit removes a forward declaration of struct stat and
includes sys/stat.h directly instead.
gdb/ChangeLog:
PR gdb/18131
* common/common-remote-fileio.h (sys/stat.h): New include.
(stuct stat): Remove forward declaration.
We see some fails in watchpoint-reuse-slot.exp on aarch64-linux, because
it sets some HW breakpoint on some address doesn't meet the alignment
requirements by kernel, kernel will reject the
ptrace (PTRACE_SETHBPREGS) call, and some fails are caused, for example:
(gdb) PASS: gdb.base/watchpoint-reuse-slot.exp: always-inserted off: watch x hbreak: : width 1, iter 0: base + 0: delete $bpnum
hbreak *(buf.byte + 0 + 1)^M
Hardware assisted breakpoint 80 at 0x410a61^M
(gdb) PASS: gdb.base/watchpoint-reuse-slot.exp: always-inserted off: watch x hbreak: : width 1, iter 0: base + 1: hbreak *(buf.byte + 0 + 1)
stepi^M
Warning:^M
Cannot insert hardware breakpoint 80.^M
Could not insert hardware breakpoints:^M
You may have requested too many hardware breakpoints/watchpoints.^M
^M
(gdb) FAIL: gdb.base/watchpoint-reuse-slot.exp: always-inserted off: watch x hbreak: : width 1, iter 0: base + 1: stepi advanced
hbreak *(buf.byte + 0 + 1)^M
Hardware assisted breakpoint 440 at 0x410a61^M
Warning:^M
Cannot insert hardware breakpoint 440.^M
Could not insert hardware breakpoints:^M
You may have requested too many hardware breakpoints/watchpoints.^M
^M
(gdb) FAIL: gdb.base/watchpoint-reuse-slot.exp: always-inserted on: watch x hbreak: : width 1, iter 0: base + 1: hbreak *(buf.byte + 0 + 1)
This patch is to skip some tests by checking proc valid_addr_p.
We can handle other targets in valid_addr_p too.
gdb/testsuite:
2015-03-16 Yao Qi <yao.qi@linaro.org>
* gdb.base/watchpoint-reuse-slot.exp (valid_addr_p): New proc.
(top level): Skip tests if valid_addr_p returns false for
$cmd1 or $cmd2.
Sync with GCC
2014-11-17 Bob Dunlop <bob.dunlop@xyzzy.org.uk>
* mt-ospace (CFLAGS_FOR_TARGET): Append -g -Os rather than
overwriting.
(CXXFLAGS_FOR_TARGET): Similarly.
Without this, not all registers were present in the core generated by
gcore. For example, running 'gcore' on a program without examining
the vector registers (SSE or AVX) would store all the vector registers
as zeros because they were not pulled into the regcache. Running
'info vector' before 'gcore' would store the correct values in the
core since it populated the regcache. For Linux processes, a similar
operation is achieved by having the thread iterator callback invoke
target_fetch_registers on each thread before its corresponding
register notes are dumped.
gdb/ChangeLog:
* fbsd-tdep.c (fbsd_make_corefile_notes): Fetch all target registers
before writing core register notes.
Fixes linking an --enable-build-with-cxx build on mingw:
../readline/terminal.c:278: undefined reference to `tgetnum'
../readline/terminal.c:297: undefined reference to `tgetnum'
../readline/libreadline.a(terminal.o): In function `get_term_capabilities':
../readline/terminal.c:427: undefined reference to `tgetstr'
../readline/libreadline.a(terminal.o): In function `_rl_init_terminal_io':
[etc.]
gdb/ChangeLog:
2015-03-16 Yuanhui Zhang <asmwarrior@gmail.com>
Pedro Alves <palves@redhat.com>
* gdb_curses.h (tgetnum): Mark with EXTERN_C.
* stub-termcap.c (tgetent, tgetnum, tgetflag, tgetstr, tputs)
(tgoto): Wrap with extern "C".
src/gdb/stub-termcap.c: In function 'int tputs(char*, int, int (*)())':
src/gdb/stub-termcap.c:67:22: error: too many arguments to function
outfun (*string++);
^
gdb/ChangeLog:
2015-03-16 Pedro Alves <palves@redhat.com>
Yuanhui Zhang <asmwarrior@gmail.com>
* stub-termcap.c (tputs): Change prototype.
Building mingw GDB with --enable-build-with-cxx shows:
../../binutils-gdb/gdb/windows-nat.c: At global scope:
../../binutils-gdb/gdb/windows-nat.c:192:1: error: conflicting declaration 'typedef struct thread_info_struct thread_info'
thread_info;
^
In file included from ../../binutils-gdb/gdb/windows-nat.c:52:0:
../../binutils-gdb/gdb/gdbthread.h:160:8: error: 'struct thread_info' has a previous declaration as 'struct thread_info'
struct thread_info
^
Simply rename the structure to avoid the conflict.
gdb/ChangeLog:
2015-03-16 Yuanhui Zhang <asmwarrior@gmail.com>
Pedro Alves <palves@redhat.com>
* windows-nat.c (struct thread_info_struct): Rename to ...
(struct windows_thread_info_struct): ... this.
(thread_info): Rename to ...
(windows_thread_info): ... this.
All users updated.
Rather than manually include tconfig.h when we think we'll need it (which
is error prone as it can define symbols we expect from config.h), have it
be included directly by config.h. Since we know we have to include that
header everywhere already, this will make sure tconfig.h isn't missed.
It should also be fine as tconfig.h is supposed to be simple and only set
up a few core defines for the target.
This allows us to stop symlinking it in place all the time and just use
it straight out of the respective source directory.