[Scons-dev] SCons performance investigations
Daniel Holth
dholth at gmail.com
Fri Jul 21 14:51:54 EDT 2017
At least in pypy a json sconsign would be quite fast compared to pickle.
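
A rough way to check this on a given interpreter is to time both serializers on a sconsign-shaped dictionary. The sketch below is only an illustration: the entry layout (csig, timestamp, size) and the helper names are hypothetical and are not the real .sconsign schema, so the numbers it prints say nothing definitive about SCons itself.

```python
import json
import pickle
import timeit


def make_entries(n):
    """Build a hypothetical signature table (NOT the real .sconsign layout)."""
    return {
        "src/file_%d.o" % i: {
            "csig": "%032x" % i,
            "timestamp": 1500000000 + i,
            "size": 1024 + i,
        }
        for i in range(n)
    }


def roundtrip_json(entries):
    """Serialize to a JSON string and parse it back."""
    return json.loads(json.dumps(entries))


def roundtrip_pickle(entries):
    """Serialize with the highest pickle protocol and load it back."""
    return pickle.loads(pickle.dumps(entries, protocol=pickle.HIGHEST_PROTOCOL))


if __name__ == "__main__":
    entries = make_entries(10000)
    for name, fn in (("json", roundtrip_json), ("pickle", roundtrip_pickle)):
        seconds = timeit.timeit(lambda: fn(entries), number=20)
        print("%-6s %.3fs for 20 round trips" % (name, seconds))
```

Running the same script under CPython and PyPy would show whether the relative ordering of the two serializers differs between interpreters.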
On Fri, Jul 21, 2017, 11:39 Andrew C. Morrow <andrew.c.morrow at gmail.com>
wrote:
>
> Hi scons-dev -
>
> The following is a revised draft of an email that I had originally
> intended to send as a follow up to
> https://pairlist4.pair.net/pipermail/scons-users/2017-June/006018.html.
> Instead, Bill Deegan and I took some time to expand on my first draft and
> add some ideas about how to address some of the issues. We hope to migrate
> this to the wiki, but wanted to share it here first for feedback.
>
> ----
>
> Performance is one of the major challenges facing SCons. Compared with
> other current options, particularly Ninja, SCons can lag significantly in
> many cases. That said, those options by and large lack the extensibility
> and many of the features of SCons.
>
> Bill Deegan (SCons project co-manager) and I have been working together to
> understand some of the issues that lead to poor SCons performance in a real
> world (and fairly modestly sized) C++ codebase. Here is a summary of some
> of our findings:
>
>
> -
>
> Python code usage: There are many places in the codebase where, while
> the code is correct, its performance on CPython can be improved by minor
> changes.
> -
>
> Examples
> -
>
> Using for loops and hashes to uniquify a list. A simple change in the
> Node class yielded an approximately 15% speedup for a null build.
> -
>
> Using "if x.find('some character') >= 0" instead of "if 'some
> character' in x" (a timeit benchmark shows a 10x speed difference)
> -
>
> Method to address
> -
>
> Profile the code looking for hotspots with cProfile and
> line_profiler, then look for the best implementation of each hotspot. Use
> timeit where useful to compare implementations; there are examples of this
> in the bench dir, see:
> https://bitbucket.org/scons/scons/src/68a8afebafbefcf88217e9e778c1845db4f81823/bench/?at=default
> -
>
> Serial DAG traversal: SCons walks the DAG to find out of date targets
> in a serial fashion. Once it finds them, it farms the work out to other
> threads, but the DAG walk remains serial. Given the proliferation of
> multicore machines since SCons’ initial implementation, a parallel walk of
> the DAG would yield significant speedup. Likely this would require
> implementation using the multiprocessing python library (instead of
> threads), since the GIL would block real parallelism otherwise. Packages
> like Boost, which have many header files, can cause large increases in
> the size of the DAG, exacerbating this issue. There are two serious
> consequences of the slow DAG walk:
> -
>
> Incremental rebuilds in large projects. Typical developer workflow
> is to edit a file, rebuild, test. In our modestly sized codebase, we see
> the incremental time to do an ‘all’ rebuild for a one file change can reach
> well over a minute. This time is completely dominated by the serial
> dependency walk.
> -
>
> Inability to saturate distributed build clusters. In a
> distcc/icecream build, the serial DAG walk is slow enough that not enough
> jobs can be farmed out in parallel to saturate even a modest (400 cpu)
> build cluster. In our example, using ninja to drive a distributed full
> build results in an approximately 15x speedup, but SCons can only achieve a
> 2x speedup.
> -
>
> Method to address:
> -
>
> Investigate changing tree walk to generator
> -
>
> Investigate implementing tree walk using multiprocessing library
> -
>
> The dependency graph is the python object graph: The target dependency
> DAG is modeled via python Node Object to Node Object linkages (e.g. a list
> of child nodes held in a node). As a result, the only way to determine
> up-to-date-ness is by deeply nested method calls that repeatedly traverse
> the Python object graph. An attempt is made to mitigate this by memoizing
> state at the leaves (e.g. to cache the result of stat calls), but this
> still results in a large number of python function invocations for even the
> simplest state checks, where a result is already known. Similarly, the lack
> of global visibility precludes using externally provided change information
> to bypass scans.
> -
>
> See above re generator
> -
>
> Investigate modeling state separately from the python Node graph
> via some sort of centralized scoreboarding mechanism. It seems likely that
> this would both eliminate the function call overhead and allow local
> knowledge to be propagated globally more effectively.
> -
>
> CacheDir: There are some issues listed below. End-to-end caching
> functionality of SCons, including generated files, object files, shared
> libraries, whole executables, etc., is one of its great strengths, but its
> performance has much room for improvement.
> -
>
> Existing bug(s) in combining CacheDir with MD5-Timestamp devalue
> CacheDir.
> -
>
> Bug: http://scons.tigris.org/issues/show_bug.cgi?id=2980
> -
>
> Performance issues:
> -
>
> CacheDir re-creates signature data when extracting nodes from
> the Cache, even though it could have recorded the signature when entering
> the objects into the cache.
> -
>
> Method to address
> -
>
> Store signatures for items in cachedir and then use them
> directly when copying items from Cache.
> -
>
> Fix the CacheDir / MD5-Timestamp integration bug
> -
>
> SConsign generation: The generation of the SConsign file is
> monolithic, not incremental. This means that if only one object file
> changed, the entire database needs to be re-written. It also appears that
> the mechanism used to serialize it is itself slow. Moving to a faster
> serialization model would be good, but even better would be to move to a
> faster serialization model that also admitted incremental updates to single
> items.
> -
>
> Method to address:
> -
>
> Replace sconsign with something faster than the current
> implementation, which is based on Pickle.
> -
>
> And/or improve sconsign so that it incrementally writes only what
> has changed.
> -
>
> Configure check performance: Even cached Configure checks seem slow,
> and for a complexly configured build this can add significant startup cost.
> Improvements here would be useful.
> -
>
> Method to address:
> -
>
> Code inspection, look for improvements
> -
>
> Profile
> -
>
> Variable Substitution: Currently variable substitution, which is
> largely used to create the command lines run by SCons, uses an appreciable
> percentage (approximately 18% for a null incremental build) of SCons’ CPU
> runtime. By and large, much of this evaluation is duplicate (and thus
> avoidable) work. For the moderately sized build discussed above there are
> approximately 100k calls to evaluation substitutions. There are only 413
> unique strings to be evaluated. Consider that the CXXCOM variable is
> expanded 2412 times for this build. The only variables which are guaranteed
> unique are the SOURCES and TARGETS, all others could be evaluated once and
> cached.
> -
>
> Prior work on this item:
> -
>
>
> https://bitbucket.org/scons/scons/wiki/SubstQuoteEscapeCache/Discussion
> -
>
> Working doc on the current state and areas for improvement:
> -
>
>
> https://bitbucket.org/scons/scons/wiki/SubstQuoteEscapeCache/SubstImprovement2017
> -
>
> Method to address:
> -
>
> Consider pre-evaluating Environment() variables where
> reasonable. This could use some sort of copy-on-write between cloned
> Environments. This pre-evaluation would skip known target specific
> variables (TARGET, SOURCES, CHANGED_SOURCES, and a few others), so minimally
> the per command line substitution should be faster.
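
The two "Python code usage" examples in the list above can be reproduced in isolation. The sketch below is not the SCons Node code; the function names and test data are made up for illustration, and the hashing variant relies on dicts preserving insertion order (Python 3.7+). The 10x figure quoted above came from the authors' own benchmark, so expect different ratios on other interpreters and inputs.

```python
import timeit


def uniquify_loop(items):
    """Quadratic de-duplication: each membership test scans a growing list."""
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def uniquify_dict(items):
    """Order-preserving de-duplication via hashing (needs Python 3.7+ dict order)."""
    return list(dict.fromkeys(items))


def has_char_find(s, ch):
    """Membership test through a method call plus an integer comparison."""
    return s.find(ch) >= 0


def has_char_in(s, ch):
    """Membership test with the 'in' operator."""
    return ch in s


if __name__ == "__main__":
    data = [i % 500 for i in range(5000)]
    for fn in (uniquify_loop, uniquify_dict):
        print(fn.__name__, timeit.timeit(lambda: fn(data), number=20))
    path = "/some/long/path/with$variable"
    for fn in (has_char_find, has_char_in):
        print(fn.__name__, timeit.timeit(lambda: fn(path, "$"), number=200000))
```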
>
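
The pre-evaluation idea from the "Variable Substitution" item above can be sketched as a two-pass expansion: expand everything that depends only on the Environment once and cache it, then fill in the target-specific variables per command line. This is a toy model, not SCons' subst machinery: the class name, the simple $NAME syntax, and the exact contents of TARGET_SPECIFIC are assumptions, and it ignores quoting, escaping, callables, and self-referential variables.

```python
import re

_VAR = re.compile(r"\$(\w+)")

# Assumed set of variables that differ per command line and are never cached.
TARGET_SPECIFIC = {"TARGET", "TARGETS", "SOURCE", "SOURCES", "CHANGED_SOURCES"}


class CachingSubst:
    """Toy substitution with a cache of environment-only expansions."""

    def __init__(self, env):
        self.env = dict(env)
        self._cache = {}   # template -> template with env-only variables expanded
        self.misses = 0    # number of times the static pass actually ran

    def _expand_static(self, template):
        """Recursively expand everything except target-specific variables."""
        def replace(match):
            name = match.group(1)
            if name in TARGET_SPECIFIC:
                return match.group(0)  # leave as-is for the per-target pass
            return self._expand_static(self.env.get(name, ""))
        return _VAR.sub(replace, template)

    def subst(self, template, per_target):
        """Expand a template, reusing the cached static part when available."""
        static = self._cache.get(template)
        if static is None:
            self.misses += 1
            static = self._cache[template] = self._expand_static(template)
        # Only the cheap per-target pass runs on every call.
        return _VAR.sub(lambda m: per_target.get(m.group(1), ""), static)


if __name__ == "__main__":
    env = {
        "CXX": "g++",
        "CXXFLAGS": "-O2",
        "CXXCOM": "$CXX $CXXFLAGS -c -o $TARGET $SOURCES",
    }
    s = CachingSubst(env)
    print(s.subst("$CXXCOM", {"TARGET": "a.o", "SOURCES": "a.cc"}))
    print(s.subst("$CXXCOM", {"TARGET": "b.o", "SOURCES": "b.cc"}))
    print("static expansions performed:", s.misses)
```

A real implementation would also need to invalidate the cache when an Environment variable is modified, which is where the copy-on-write between cloned Environments mentioned above would come in.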
>
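
For the "SConsign generation" item above, one possible shape for incremental updates is a keyed store that tracks dirty entries and writes only those on flush. SQLite is used here purely for illustration; the proposal does not name a storage backend, and the class, table, and column names below are hypothetical.

```python
import sqlite3


class SignatureStore:
    """Hypothetical signature database that rewrites only changed entries."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sigs "
            "(node TEXT PRIMARY KEY, csig TEXT NOT NULL)"
        )
        self.db.commit()
        self._dirty = {}  # node -> csig, pending writes

    def get(self, node):
        """Return the current signature, preferring unflushed changes."""
        if node in self._dirty:
            return self._dirty[node]
        row = self.db.execute(
            "SELECT csig FROM sigs WHERE node = ?", (node,)
        ).fetchone()
        return row[0] if row else None

    def set(self, node, csig):
        """Record a signature; unchanged values are not marked dirty."""
        if self.get(node) != csig:
            self._dirty[node] = csig

    def flush(self):
        """Write only the dirty entries in one transaction; return the count."""
        written = len(self._dirty)
        with self.db:  # commits on success, rolls back on error
            self.db.executemany(
                "INSERT OR REPLACE INTO sigs (node, csig) VALUES (?, ?)",
                list(self._dirty.items()),
            )
        self._dirty.clear()
        return written


if __name__ == "__main__":
    store = SignatureStore()
    store.set("a.o", "sig-a1")
    store.set("b.o", "sig-b1")
    print("first flush wrote", store.flush(), "entries")
    store.set("a.o", "sig-a1")   # unchanged, not rewritten
    store.set("b.o", "sig-b2")   # changed, rewritten
    print("second flush wrote", store.flush(), "entries")
```

With this layout, a one-file change costs one row write rather than a rewrite of the whole database, which is the property the item above asks for.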
> Bill and I would appreciate any feedback or thoughts on the above items,
> or suggestions for other areas to investigate. We are hoping that by
> addressing some or all of these items, the runtime overhead of SCons could
> be brought down enough to make it competitive again with other build
> systems. We hope to begin work on the above items once SCons 3.0 has
> shipped.
>
> Thanks,
> Andrew
>
>
> _______________________________________________
> Scons-dev mailing list
> Scons-dev at scons.org
> https://pairlist2.pair.net/mailman/listinfo/scons-dev
>