[Scons-dev] debug explain part 2

Mats Wichmann mats at wichmann.us
Mon Dec 10 10:27:00 EST 2018


this is part 2 of the debug=explain discussion.

Internally, when the problem occurs explain is fooled.  It fetches file
objects and matching signatures describing the dependencies from a
previous build (i.e. "from the database", although since that's already
been done elsewhere and stored it does not actually read it from the
db), and for the current build. It then loops through the objects this way:

(1) for each object in old, see if it is in new. If not, report it as
removed.

(2) for each object in new, see if it is in old. If not, report it as
new. If it is, compare the signatures to see if it changed.

However, this membership test depends on the objects being the same, and
in the problem scenario (variant dir, no duplication) they are not for
the source file which is to be rebuilt (why?)

>From earlier debug on our big project, all the other nodes pair up so we
can see they refer to the same thing, like this:

OLD type <class 'SCons.Node.FS.File'>, id 94472305963336, path
bridging/include/messageHandler.h
NEW type <class 'SCons.Node.FS.File'>, id 94472305963336, path
bridging/include/messageHandler.h

that is, they have the same identity, and a test "obj in list-of-objs"
works.  However for the file which needs rebuilding, the nodes are
actually different objects:

OLD type <class 'SCons.Node.FS.File'>, id 94472305685480, path
bridging/mini_plugin_manager/miniPluginManager.cpp
NEW type <class 'SCons.Node.FS.File'>, id 94472305684264, path
bridging/mini_plugin_manager/miniPluginManager.cpp

and the membership tests fail. This is where the paired removed/added
messages come from, and also the failure to report the file as changed
even if it was.

How to fix?

The problem could be considered to lie at any of three different levels,
in my opinion.

(a) explain behaves wrong on the data it is given
(b) the membership test (i.e. equality) should not fail if the objects
refer to the same file
(c) the variant dir code should not end up producing different file
objects in this case

So the question is where to fix... (or should more than one place be
fixed on the principle of making it more robust?)

I was tempted that (b) would be the low hanging fruit - namely, if the
two objects don't have the same identity, but share the most common
attribute - the pathname of the file - they should still compare equal.
That is, define an __eq__ for the class to answer as we want (and maybe
an __ne__ - I'm thinking I remember that Py2 might need that, Py3 I know
will supply the inverse by default).  This code is nested in so many
layers of classes it's kind of painful. But...  the FS Base class (nit:
really wish there weren't so many classes named Base) already defines an
__lt__ method along the exact same lines:

    def __lt__(self, other):
        """ less than operator used by sorting on py3"""
        return str(self) < str(other)

That sounds promising! So I tried adding an __eq__ like this, but that
leads to a recursion exception, as the __str__ method ends up calling
something which does an equality test which lands us back at __eq__,
etc..  So while we might want to pursue that route, it apparently won't
be simple and likely will introduce new side effects. (__str__
implementation here involves some kind of caching and other optimization
efforts).

As far as (c), I don't know anything about that part of the code yet,
but maybe it's the right place - why this anomaly only in the
variant/dup=0 case?

Meanwhile, I've fiddled with (a) and there's a work-in-progress PR
(3252) on it. That's surprisingly convoluted too - part (1) from above
is simple enough, just recheck the potentially removed item using the
path to the file against a list of the new paths and skip if found.  but
part (2) involves checking the maybe-but-not-actually added object's
signature, and since its identity is not in the old list, and the object
itself is basically used to index the signature, we can't find the old
sig to check against the new sig - IndexError. Without doing even more
work there are two choices: don't report it as new and don't check
signatures; or don't report it as new and say we don't know for sure.
Doing the former has an additional little surprise for us.  Using the
example from part 1, here is the result of adding a "maybe" statement
along with the other changes noted:

+ src/hello.c updated:
scons: rebuilding `build/hello.o' because `src/hello.c' might be changed
+ src/hello.h updated:
scons: rebuilding `build/hello.o' because:
           `src/hello.c' might be changed
           `src/hello.h' changed

That doesn't look quite right, though it's arguably improved (this is
what PR #3252 at the moment would produce).

So let's say nothing, instead of saying "might be changed":

+ src/hello.c updated:
scons: rebuilding `build/hello.o' because the dependency order changed:
               old: ['src/hello.c', 'src/hello.h', '/bin/gcc']
               new: ['src/hello.c', 'src/hello.h', '/bin/gcc']

This is because a later check uses whether or not any explanation lines
have been added yet as a flag in front of checking something else:

        if len(lines) == 0 and old_bkids != new_bkids:
            lines.append("the dependency order changed:\n" +

and because we already know one element of each list is the object for
src/hello.c, which differ, then the two lists cannot compare equal. So
now we get the enlightening (and incorrect) report that the dependency
order changed, and then prove it has not.

Moral of the story: checking equality of class instance objects (or
lists of same) is a slippery slope.

But maybe this isn't the place this should be getting fixed?



More information about the Scons-dev mailing list