switch to fully parallel mode

This switches "make check" to fully parallel mode.

One primary issue facing full parallelization is the overhead of "runtest". On my machine, if I "touch gdb.base/empty.exp", making a new file, and then "time runtest.exp", it takes 0.08 seconds. Multiply this by the 1008 (in my configuration) tests and you get ~80 seconds. This is the overhead that would theoretically be present if all tests were run in parallel.

However, the problem isn't nearly as bad as this, for two reasons.

First, you must divide by the number of jobs, assuming perfect parallelization -- reasonably true for small -j numbers, based on the results I see.

Second, the current test suite parallelization approach bundles the tests, largely by directory, but also splitting up gdb.base into two halves.

I was curious to see how the current bundling played out in practice, so I ran "make -j1 check RUNTEST='/bin/time runtest'". This invokes the parallel mode (thus the bundling) and then shows the time taken by each invocation of runtest.

Then, I ran "/bin/time make -j3 check". (See below about -j2.)

The time for the entire -j3 test run was the same as the time for "gdb.base1". What this means is that gdb.base1 is currently the time-limiting run, preventing further parallelization gains. So, I reason, whatever overhead we see from full parallelization will only be seen by "-j1" and "-j2".

I then tried a -j2 test run. This does take longer than a -j3 run, meaning that the gdb.base1 job finishes and make then proceeds to other runtest invocations.

Finally I tried a -j2 test run with the appended patch. This was 9% slower than the -j2 run without the patch. I think that is a reasonable slowdown for what is probably a rare case.

I believe this patch will yield faster test results for all -j values greater than 2. For -j3 on my machine, the test suite is a few seconds faster; I didn't try any larger -j values.
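The overhead estimate near the start of this message can be sketched as a small calculation. This is only a rough model: the 0.08 second and 1008 test figures are the ones quoted above, while the function name and the assumption of perfect parallelization are illustrative.

```python
# Rough model of the per-test "runtest" startup overhead discussed above.
# Assumption: overhead divides evenly across jobs (perfect parallelization),
# which the message says is only reasonably true for small -j values.

def runtest_overhead(per_test_seconds: float, num_tests: int, jobs: int) -> float:
    """Estimated wall-clock startup overhead when each test gets its own runtest."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return per_test_seconds * num_tests / jobs


if __name__ == "__main__":
    # Figures quoted in the message: 0.08 s per empty test, 1008 tests.
    for jobs in (1, 2, 3):
        overhead = runtest_overhead(0.08, 1008, jobs)
        print(f"-j{jobs}: about {overhead:.1f} s of runtest overhead")
```

At -j1 this reproduces the ~80 second figure. The real cost at higher -j values depends on how evenly the tests are spread across jobs.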

For -j1, I went ahead and changed the Makefile so that, if no -j option is given, then the "check-single" mode is used. You can still use "make -j1 check" to get single-job parallel-mode, though of course there's no good reason to do so. This change is likely to speed up the plain "make check" scenario a little as we will now bypass dg-extract-results.sh.

One drawback of this change is that "make -jN check" is now much more verbose. I generally only look at the .sum and .log files, but perhaps this will bother some.

Another interesting question is scalability of the result. The slowest test, which limits the scalability, took 80.78 seconds. The mean of the remaining tests is 1.08 seconds. (Note that this is just a rough estimate, since there are still outliers.) This means we can run 80.78 / 1.08 =~ 74 tests in the time available. And, in this data set (slightly older than the above, but materially the same) there were 948 tests. Dividing, 948 / 74 =~ 12.8 jobs can be kept busy while the slowest test runs. So, I think the current test suite should scale ok up to about -j12. We could improve this number if need be by breaking up the biggest tests.

2013-11-04  Tom Tromey  <tromey@redhat.com>

	* Makefile.in (TEST_DIRS): Remove.
	(TEST_TARGETS, check-parallel): Rewrite.
	(check-gdb.%, BASE1_FILES, BASE2_FILES, check-gdb.base%)
	(subdir_do, subdirs): Remove.
	(do-check-parallel, check/%): New targets.
	(clean): Remove outputs, temp, and cache directories.
	(saw_dash_j): New variable.
	(CHECK_TARGET): Use it.
	(check): Depend on all, site.exp.  Rewrite.
	(check-single): Remove dependencies.
	(slow_tests, all_tests, reordered_tests): New variables.
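The scalability estimate in the message above can be reproduced with a short calculation. This is a sketch under the message's own simplification (every test except the slowest takes the mean time). The function name is hypothetical, and the input figures are the ones quoted in the message.

```python
# Back-of-the-envelope scalability bound, following the reasoning above:
# while the slowest test runs, each other job can finish some number of
# average-length tests; the useful job count is total tests over that number.

def max_useful_jobs(slowest_seconds: float, mean_other_seconds: float,
                    num_tests: int) -> float:
    """Approximate -j value beyond which the slowest test dominates the run."""
    # Whole tests that fit in the time taken by the slowest test.
    tests_per_job = int(slowest_seconds // mean_other_seconds)
    if tests_per_job < 1:
        raise ValueError("slowest test must take at least the mean time")
    return num_tests / tests_per_job


if __name__ == "__main__":
    # Figures quoted in the message: 80.78 s slowest, 1.08 s mean, 948 tests.
    bound = max_useful_jobs(80.78, 1.08, 948)
    print(f"useful parallelism tops out at roughly -j{int(bound)}")
```

Splitting the slowest test lowers the first argument, which raises the bound. That matches the remark about breaking up the biggest tests.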
2013-08-27 11:52:25 -06:00
commit 8120838889
parent c63ffa1f25
2 changed files with 53 additions and 55 deletions
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,3 +1,17 @@
+2013-11-04  Tom Tromey  <tromey@redhat.com>
+
+	* Makefile.in (TEST_DIRS): Remove.
+	(TEST_TARGETS, check-parallel): Rewrite.
+	(check-gdb.%, BASE1_FILES, BASE2_FILES, check-gdb.base%)
+	(subdir_do, subdirs): Remove.
+	(do-check-parallel, check/%): New targets.
+	(clean): Remove outputs, temp, and cache directories.
+	(saw_dash_j): New variable.
+	(CHECK_TARGET): Use it.
+	(check): Depend on all, site.exp.  Rewrite.
+	(check-single): Remove dependencies.
+	(slow_tests, all_tests, reordered_tests): New variables.
+
 2013-11-04  Tom Tromey  <tromey@redhat.com>

 	* gdb.dwarf2/fission-base.S: Remove "gdb.dwarf/".
--- a/gdb/testsuite/Makefile.in
+++ b/gdb/testsuite/Makefile.in
@@ -128,14 +128,23 @@ $(abs_builddir)/site.exp site.exp: ./config.status Makefile

 installcheck:

-# For GNU make, try to run the tests in parallel.  If RUNTESTFLAGS is
-# not empty, then by default the tests will be serialized.  This can
-# be overridden by setting FORCE_PARALLEL to any non-empty value.
-# For a non-GNU make, do not parallelize.
-@GMAKE_TRUE@CHECK_TARGET = $(if $(FORCE_PARALLEL),check-parallel,$(if $(RUNTESTFLAGS),check-single,check-parallel))
+# See whether -j was given to make.  Either it was given with no
+# arguments, and appears as "j" in the first word, or it was given an
+# argument and appears as "-j" in a separate word.
+@GMAKE_TRUE@saw_dash_j = $(or $(findstring j,$(firstword $(MAKEFLAGS))),$(filter -j,$(MAKEFLAGS)))
+
+# For GNU make, try to run the tests in parallel if any -j option is
+# given.  If RUNTESTFLAGS is not empty, then by default the tests will
+# be serialized.  This can be overridden by setting FORCE_PARALLEL to
+# any non-empty value.  For a non-GNU make, do not parallelize.
+@GMAKE_TRUE@CHECK_TARGET = $(if $(FORCE_PARALLEL),check-parallel,$(if $(RUNTESTFLAGS),check-single,$(if $(saw_dash_j),check-parallel,check-single)))
 @GMAKE_FALSE@CHECK_TARGET = check-single

-check: $(CHECK_TARGET)
+# Note that we must resort to a recursive make invocation here,
+# because GNU make 3.82 has a bug preventing MAKEFLAGS from being used
+# in conditions.
+check: all $(abs_builddir)/site.exp
+	$(MAKE) $(CHECK_TARGET)

 # All the hair to invoke dejagnu.  A given invocation can just append
 # $(RUNTESTFLAGS)
@@ -151,70 +160,45 @@ DO_RUNTEST = \
 	  export TCL_LIBRARY ; fi ; \
 	$(RUNTEST)

-check-single: all $(abs_builddir)/site.exp
+check-single:
 	$(DO_RUNTEST) $(RUNTESTFLAGS)

-# A list of all directories named "gdb.*" which also hold a .exp file.
-# We filter out gdb.base and add fake entries, because that directory
-# takes the longest to process, and so we split it in half.
-TEST_DIRS = gdb.base1 gdb.base2 $(filter-out gdb.base,$(sort $(notdir $(patsubst %/,%,$(dir $(wildcard $(srcdir)/gdb.*/*.exp))))))
-
-TEST_TARGETS = $(addprefix check-,$(TEST_DIRS))
-
-# We explicitly re-invoke make here for two reasons.  First, it lets
-# us add a -k option, which makes the parallel check mimic the
-# behavior of the serial check; and second, it means that we can still
-# regenerate the sum and log files even if a sub-make fails -- which
-# it usually does because dejagnu exits with an error if any test
-# fails.
 check-parallel:
-	$(MAKE) -k $(TEST_TARGETS); \
+	-rm -rf cache
+	$(MAKE) -k do-check-parallel; \
 	$(SHELL) $(srcdir)/dg-extract-results.sh \
-	  $(addsuffix /gdb.sum,$(TEST_DIRS)) > gdb.sum; \
+	  `find outputs -name gdb.sum -print` > gdb.sum; \
 	$(SHELL) $(srcdir)/dg-extract-results.sh -L \
-	  $(addsuffix /gdb.log,$(TEST_DIRS)) > gdb.log
+	  `find outputs -name gdb.log -print` > gdb.log

-@GMAKE_TRUE@$(filter-out check-gdb.base%,$(TEST_TARGETS)): check-gdb.%: all $(abs_builddir)/site.exp
-@GMAKE_TRUE@	@if test ! -d gdb.$*; then mkdir gdb.$*; fi
-@GMAKE_TRUE@	$(DO_RUNTEST) --directory=gdb.$* --outdir=gdb.$* $(RUNTESTFLAGS)
+# Turn a list of .exp files into "check/" targets.  Only examine .exp
+# files appearing in a gdb.* directory -- we don't want to pick up
+# lib/ by mistake.  For example, gdb.linespec/linespec.exp becomes
+# check/gdb.linespec/linespec.exp.  The list is generally sorted
+# alphabetically, but we take a few tests known to be slow and push
+# them to the front of the list to try to lessen the overall time
+# taken by the test suite -- if one of these tests happens to be run
+# late, it will cause the overall time to increase.
+slow_tests = gdb.base/break-interp.exp gdb.base/interp.exp \
+	gdb.base/multi-forks.exp
+@GMAKE_TRUE@all_tests := $(shell cd $(srcdir) && find gdb.* -name '*.exp' -print)
+@GMAKE_TRUE@reordered_tests := $(slow_tests) $(filter-out $(slow_tests),$(all_tests))
+@GMAKE_TRUE@TEST_TARGETS := $(addprefix check/,$(reordered_tests))

-# Each half (roughly) of the .exp files from gdb.base.
-BASE1_FILES = $(patsubst $(srcdir)/%,%,$(wildcard $(srcdir)/gdb.base/[a-m]*.exp))
-BASE2_FILES = $(patsubst $(srcdir)/%,%,$(wildcard $(srcdir)/gdb.base/[n-z]*.exp))
+do-check-parallel: $(TEST_TARGETS)
+	@:

-# Handle each half of gdb.base.
-check-gdb.base%: all $(abs_builddir)/site.exp
-	@if test ! -d gdb.base$*; then mkdir gdb.base$*; fi
-	$(DO_RUNTEST) $(BASE$*_FILES) --outdir gdb.base$* $(RUNTESTFLAGS)
-
-subdir_do: force
-	@for i in $(DODIRS); do \
-		if [ -d ./$$i ] ; then \
-			if (rootme=`pwd`/ ; export rootme ; \
-			    rootsrc=`cd $(srcdir); pwd`/ ; export rootsrc ; \
-				cd ./$$i; \
-				$(MAKE) $(TARGET_FLAGS_TO_PASS) $(DO)) ; then true ; \
-			else exit 1 ; fi ; \
-		else true ; fi ; \
-	done
+@GMAKE_TRUE@check/%.exp:
+@GMAKE_TRUE@	-mkdir -p outputs/$*
+@GMAKE_TRUE@	@$(DO_RUNTEST) GDB_PARALLEL=yes --outdir=outputs/$* $*.exp $(RUNTESTFLAGS)

 force:;

-subdirs:
-	for dir in ${ALL_SUBDIRS} ; \
-	do \
-		echo "$$dir:" ; \
-		if [ -d $$dir ] ; then \
-			(rootme=`pwd`/ ; export rootme ; \
-			 rootsrc=`cd $(srcdir); pwd`/ ; export rootsrc ; \
-			 cd $$dir; $(MAKE) $(TARGET_FLAGS_TO_PASS)); \
-		fi; \
-	done
-
 clean mostlyclean:
 	-rm -f *~ core *.o a.out xgdb *.x *.grt bigcore.corefile .gdb_history
 	-rm -f core.* *.tf *.cl *.py tracecommandsscript copy1.txt zzz-gdbscript
 	-rm -f *.dwo *.dwp
+	-rm -rf outputs temp cache
 	if [ x"${ALL_SUBDIRS}" != x ] ; then \
 	    for dir in ${ALL_SUBDIRS}; \
 	    do \