[Scons-dev] SCons-like build system
Kenny, Jason L
jason.l.kenny at intel.com
Tue Jan 27 11:26:37 EST 2015
I think it is that Parts tries to use the DB data more intelligently than SCons does. For me, understanding the logic needed to validate with MD5 whether something is good or bad is the item I had to learn from SCons. When we look at SCons, it tries to keep things simple. This is good in general. The main logic SCons has is to:
1) read all build scripts
a. while reading files, create nodes
b. execute other logic (e.g. scanning the disk for Glob(), etc.)
2) once everything is read in, turn targets into “node” objects
3) for each target node, grab the target node object and start building its children (uses cache info to determine whether a node is out of date)
a. if all children are up-to-date, report that everything is up-to-date
This works great in general. But there are a few areas that take a long time.
1) Reading all the scripts takes time. The question is why. Python is very fast at reading source files and turning them into byte code. What takes time is executing the Python file as it is read in. The structure for an SConstruct is to have Python read it and then execute it. While it executes, time-consuming things happen, such as scanning the disk with a SCons Glob (or a Parts Pattern) call. Another time-consuming item is creating the nodes. SCons does what it can to make this faster, but it takes time. It gets worse as the system has to allocate more memory.
2) Given everything is loaded, SCons processes the DAG to build stuff. It does not know whether the targets are up to date. At this point SCons has to process all the target nodes and their children. This takes time because SCons scans, runs subst, creates build environments, checks whether they are different, and so on, exactly as it would for a real build. In fact, the only difference between a do-nothing build and a do-something build is whether SCons executes the build action. This is why it takes SCons so long to say everything is up-to-date on a do-nothing build.
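To make that last point concrete, here is a toy model. It is not SCons code: the `Node` class, the `stale` flag and the `build` function are made up for illustration. The walk checks every node whether or not anything is stale; only the number of executed actions differs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a build node; not the SCons Node class."""
    name: str
    stale: bool = False  # assumption: True means the node's signature changed
    children: list = field(default_factory=list)


def build(node, stats, done=None):
    """Walk the DAG depth-first; return True if `node` had to be rebuilt."""
    done = {} if done is None else done
    if node.name in done:
        return done[node.name]
    # Every child is visited and checked, even in a do-nothing build.
    child_rebuilt = [build(child, stats, done) for child in node.children]
    stats["checked"] += 1
    rebuilt = node.stale or any(child_rebuilt)
    if rebuilt and node.children:
        # Only targets (nodes with children) have an action to run.
        stats["executed"] += 1
    done[node.name] = rebuilt
    return rebuilt
```

For a chain app -> main.o -> main.c, `stats["checked"]` ends up at 3 in both the do-nothing and the do-something case; only `stats["executed"]` changes.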
For me the interesting thing about SCons was that it had a DB/cache that can be used to make sure the build is good or bad. What it missed is that this information can be used to speed up the build as well. This is what Parts does. There is still a lot that can be done to be smarter in Parts (in hindsight I should have done a few items differently). Given what we have at the moment, Parts does this:
We read the SConstruct file. Each Parts call is noted, but nothing is read yet. Once the SConstruct is read, Parts uses a monkey patch to get called before SCons tries to build anything.
· At this point Parts looks to see if a .parts-cache exists.
o If it does not, we load each Part normally, make our “Part” based targets SCons compatible and follow normal SCons logic, plus our hooks to do some extra stuff, such as storing extra information in the Parts cache.
· Given that it exists, we load information from the cache (this is the general logic).
o Parts looks to see what defining files have been modified. By this I mean: is the set of Parts different in some way, are any Parts files different, have any Part configuration files or files that define builders changed? With a SCons-based timestamp-MD5 check this is very fast. This tells me whether anything that defined the main build tree from the last build might have changed. If anything changed, we know we need to load these files and anything that might depend on them. There is more logic here to check dependency state changes, etc. Items that could have changed are the build environment (SCons will check this given we load the files) or what we are building (i.e. source files changed, build files changed in what or how to build, etc.). It is important to note that all Parts is doing here is a test to see whether something changed, so we cannot say with 100% certainty that anything really needs rebuilding. It still might be OK, but something changed, so we will load it to allow SCons to make the correct call.
o The next check looks at the nodes that are needed to build a given target. Since we know at this point whether a large section of the known DAG might have changed, we only need to test the components/Parts that still look up to date and would be part of the targets the user wants to build. So we use the stored Ninfo data to check from the leaf nodes up to the known point where things are possibly out of date, to see whether we need to load a Part. This is the case that happens most of the time, when a developer does a build because a source file changed. In Parts this is hard-coded to use the SCons timestamp-MD5 signature logic, as it is the fastest of the group and does not lose correctness: if a timestamp is out of date, we then do an MD5 check to make sure the file is really out of date. This takes the longest when nothing changed, as it has to test all known nodes for the targets in question to be able to say everything is up-to-date. The logic that many miss is that it can generally stop once we find something out of date, as we don’t have to check all the other nodes for a given component. We don’t care if more of them are out of date; we know something is out of date, so we mark this Part to be loaded.
o At this point we know what Parts we have to load. Before we load them we do a quick pass to make sure any Parts that we would load have their dependencies loaded from cache. This loads a set of state that we must have to make sure SCons can build everything correctly.
o So then we load each Part and let SCons do its thing. If nothing is to be done, we force SCons to exit early and say everything is up-to-date.
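The timestamp-then-MD5 test with an early stop can be sketched roughly as follows. This is only an illustration, not the Parts or SCons implementation: the function names and the layout of the cached record (`{"mtime": ..., "md5": ...}`) are assumptions made for the example.

```python
import hashlib
import os


def file_md5(path):
    """Return the hex MD5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def node_changed(path, cached):
    """Check the timestamp first; compute the MD5 only if it differs.

    `cached` is a hypothetical record {"mtime": float, "md5": str} saved by
    the previous build; the real cache layout is different.
    """
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing file counts as changed
    if mtime == cached["mtime"]:
        return False  # cheap path: timestamp matches, trust the cache
    return file_md5(path) != cached["md5"]


def part_needs_load(cached_nodes):
    """Return True as soon as one node of the component has changed."""
    # any() short-circuits, so the remaining nodes are never examined.
    return any(node_changed(path, info) for path, info in cached_nodes.items())
```

The worst case is the do-nothing build: `any()` never finds a changed node, so every node of every requested component is tested before the answer is known.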
This is the main logic for the default “loader” as it is defined in Parts. Parts has four loaders at the moment, which can be selected via the --load-logic or --ll switch. They are as follows:
1) All – load everything. Used when we have no Parts cache. Basically this is the default SCons logic.
2) Targets – this loads all Parts files for a given target. It does not check whether anything is out of date. Like All, but loads a smaller set depending on the targets.
3) Unsafe (no depends) – this loads the target Parts files and their direct dependencies from cache. This is very fast, but very unsafe, as it requires that the user know that nothing changed.
4) Min (default logic) – given we have a Parts cache, this does the logic I described above. Its goal is to be correct first, fast second.
So there are still areas of improvement.
For example, there are some cases that are not checked for the inputs to a Parts call. Some logic could be added to reduce the pain of Glob and Pattern calls scanning the disk, which is a big time sink when loading. (It should be noted that scanning the disk is a lot easier than defining a huge static array or string of nodes to be built.) There is a case in which Parts will not correctly see that a file was added that a scan on disk would pick up, which should have caused that Part file to be loaded. I should note that caching what is on disk in memory is a big time saver as well (still to do), as in large builds it is common for an area on disk to be scanned many times for different files (or the same files in the case of a cross build). This is a big deal for a build system, as calling the system over and over again is “slow”.
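As a rough idea of what caching the disk contents in memory could look like (a sketch only; `list_dir` and `cached_glob` are made-up helpers, not Parts or SCons functions), each directory is listed once and every later pattern is matched against the remembered listing:

```python
import fnmatch
import functools
import os


@functools.lru_cache(maxsize=None)
def list_dir(directory):
    """Hit the file system once per directory and remember the result."""
    return tuple(sorted(os.listdir(directory)))


def cached_glob(directory, pattern):
    """Match `pattern` against the remembered listing instead of rescanning."""
    return [name for name in list_dir(directory) if fnmatch.fnmatch(name, pattern)]
```

The obvious catch is staleness: if the build itself creates files in a directory that was already listed, the cache has to be dropped (`list_dir.cache_clear()`), otherwise later scans miss the new files.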
So minus those opportunities for improvement, what could be done differently?
I found that I used the subst() system to pass data between components. In general this works well, but the subst system is inherently slow and has bugs. I did this to help make sure that the order in which a Part is defined in an SConstruct does not matter. The main problem, besides the extra time this can add to any build, is that functionally the system does not pass objects well, only strings (which is expected, I would think). This is a problem because I found there was an unexpected but fairly common use case in builds: people want to define dynamic builders to be shared between components. As it turns out, many projects want to define a Part/component that defines a custom builder, export it, and have it imported as a dependency by certain components. If this worked well it would be great, as Parts allows for semantic versioning, which gives a controllable way to make sure the correct version is being used (or more than one version if hackery has to be done to get something working quickly). While there is a workaround to get this logic to work in some way, it is not very usable unless I remove the subst() logic for passing dependencies. This can only be done if I can control the load order better. However, this is complex, as we have to know what to load, but we cannot know that before we load stuff to see what we need to know. This chicken-and-egg problem can be solved in two ways:
1) A continuous loader (in the works), which loads a component up to a Depends statement and stops to load something else until we have what we need to continue loading the given Parts file. This is complex, but it is backwards compatible with existing code, which I have to have (I cannot tell people to redo their build files…)
2) A new format (i.e. migrate to this). I had prototyped this before, but could not implement it until I do 1). The main reason is that internally at work I had to provide some speed gains without a new format. This is the harder path, but it is needed when you have a number of projects depending on you. Given I get to the new format, what would it look like and who would it help? The time it takes SCons to start up before it starts building anything is the main issue with it on large builds, and the new format addresses a few issues in this direction. I should note, however, that my testing showed that even if SCons took an extra minute or five to start building, most of the time a -j based build would match or still beat most other build systems for a large build (mainly because it could do a correct build, whereas with a system like make, for example, it seems that people only trust a full rebuild because something was not stated correctly and no one knew what). It was seen that Python could easily read and parse 5000-8000+ files per second. What was found was that executing the files took much longer. So the new format would use decorators to delay processing as much as possible, so we could get to the point of building something. A simple mockup of this would look like:
/////
PartName(...)
PartVersion(...)
Depends(...)

@build
def config(env):
    ...  # stuff to set up the environment

@build
def emit(env):
    ...  # scan for files
    ...  # builder calls
////
This overly simple example would allow a declarative way to structure build files. Python would be able to read and process this very fast, as the “expensive” stuff can be delayed until later. But anything I needed to know about the component relationships would be defined up front. This would allow Parts to take the simple “load all files” approach that SCons has, without the delay we suffer today. Functionally this also allows the build system to define sections (which could be custom). For example, a simple case might look like:
/////
PartName(...)
PartVersion(...)
Depends(...)

@build
def config(env):
    ...  # stuff to set up the environment

@build
def emit(env):
    ...  # scan for files
    ...  # builder calls

@utest
def test1(env):
    ...

@utest
def test2(env):
    ...
////
This case defines what I called a utest (unit test) section. This would be processed if and only if the target was to build and/or run a unit test. This allows a design in which a component can define what it needs to have built and tested (unit, gold, etc.) as different concepts. For a large build it means an item can be built without the build system taking time to process everything under the sun, while keeping the definition of what to build in one place, which can increase maintainability in really large systems.
Whether I do this or not is not known at this point. It might be something you can take advantage of.
At any rate, the main lesson for me is that a build system, given a smart cache, can do a lot to speed up a build with a few checks when it knows what has changed. The trick is the balance of what to check. For Parts at the moment it is about not loading stuff to be processed by SCons that we know did not change. Anything that might have changed is loaded so we can have SCons do a correct job of building anything that might need to be built.
Hopefully this is useful. Lots of little details missing to keep this “short”. Let me know if you have any questions.
Jason
From: Constantine [mailto:voidmb at gmail.com]
Sent: Monday, January 26, 2015 5:06 PM
To: SCons developer list; Kenny, Jason L
Subject: Re: [Scons-dev] SCons-like build system
Parts checks a number of things to see if a Part might be out of date, and if it could be, we load it; otherwise we don’t read the file, and instead load cached information if and only if there is a component that depends on it.
I.e. Parts checks all sources, implicit dependencies, targets etc. by itself.
I mean - not using SCons standard logic. And it works faster than if SCons checked all these things.
Right?
Then what is the secret of Parts? How does it check all signatures much faster than SCons?
For me it sounds like if I split a big project to many totally independent small projects (SConstruct`s) and write a shell script:
scons -C subprj1 && \
scons -C subprj2 && \
scons -C subprj3 && \
...
scons -C subprj200
Then this shell script would work faster than one big project, but it would still be significantly slower than Parts.
Right?
Thank you.
Best regards,
Constantine.
In general Parts uses the notion of a component (or a file that defines something) to make a “component” DAG. It checks to see if anything modified that state (Parts file changed, SConstruct changed, configuration/toolchain was modified, source or output files for a component changed, etc.), which is mostly a SCons timestamp-MD5 check. If any component is out of date or might be out of sync, we mark it as such and load any dependent component (no need to check them, as we know there is “some” modification). This loads the items into SCons to be processed. The resulting time to load and process is *much* smaller on average.
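A small sketch of that marking step, under the assumption that “dependent component” means a component that depends on the modified one. The function name and the dictionary layout are made up for illustration and are not how Parts stores its component DAG.

```python
def components_to_load(depends_on, changed):
    """Return every component that changed or depends on a changed one.

    `depends_on` maps a component name to the names it depends on;
    `changed` is the set of components that failed the timestamp-MD5 check.
    """
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    to_load = set()
    stack = list(changed)
    while stack:
        name = stack.pop()
        if name in to_load:
            continue
        to_load.add(name)  # dependents are loaded without being checked
        stack.extend(dependents.get(name, ()))
    return to_load
```

Everything not in the returned set is never read from its build file, which is where the time is saved.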
Currently Parts is looking at removing our “mapper” logic, which is an abuse of the subst() system in SCons to share data between components, and moving to what is called a “continuous loader”, which will load the components in dependency order (making data sharing simpler). This should also make the current logic for checking state simpler and faster.
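Loading in dependency order can be pictured with a recursive sketch like the one below. It is a simplification, not the planned loader: `read_depends` is a hypothetical callback that reads a component file far enough to return the names in its Depends statement.

```python
def load(name, read_depends, loaded=None, in_progress=None):
    """Load `name`, pausing to load each dependency first.

    Returns the list of component names in the order they finished loading.
    """
    loaded = [] if loaded is None else loaded
    in_progress = set() if in_progress is None else in_progress
    if name in loaded:
        return loaded
    if name in in_progress:
        raise ValueError(f"dependency cycle at {name!r}")
    in_progress.add(name)
    for dep in read_depends(name):
        # Pause this component here until the dependency is fully loaded.
        load(dep, read_depends, loaded, in_progress)
    in_progress.discard(name)
    loaded.append(name)  # all dependencies are ready, finish this component
    return loaded
```

Because a dependency is always complete before the component that needs it continues, real objects (a custom builder, for example) could be handed over directly instead of going through strings.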
From there the idea to look at is to store the actions defined in the Component and replay them as part of the rebuild, as this can be done once we know we have a subtree that has nothing new added to it. While those items build we continue to load more data (as the build would be waiting for build actions to complete). Given the common case of an incremental build this should work out, as any relationships defined explicitly or implicitly will not change. The most common change will be in the source files, which will be seen either as a change that affects loading a component build file, or as the component build file itself being changed. At any rate, the dependent information gets loaded and processed as needed to see what is safe or not safe to say. This idea needs more fleshing out, as there are details that could mess up a correct build, or it might turn out not to speed anything up enough.
Jason
From: Scons-dev [mailto:scons-dev-bounces at scons.org] On Behalf Of Constantine
Sent: Sunday, January 25, 2015 3:24 AM
To: SCons developer list; Tom Tanner
Subject: Re: [Scons-dev] SCons-like build system
Do I understand correctly?
That the main idea is that using the Parts a big project is split into many small parts.
And then SCons builds these parts more-or-less independently.
I.e. SCons processes many small DAGs instead of one big DAG.
Thank you.
Best regards,
Constantine.
On 01/23/15 18:03, Kenny, Jason L wrote:
I understand this problem. This is one main thing Parts tries to address. It uses information about the different components to figure out what not to load. This requires the build to be laid out as some sort of component-based build, vs a recursive make, which may not be a big deal, depending on how your project is laid out. The result, for example, for a massive build I have here is that a “do nothing” build takes ~7-20 seconds vs the ~15-20 minutes it would normally take. Incremental builds are also reduced to minutes (depending on what changed, of course).
I honestly believe this can be better yet, but it requires more work. The main spawning issue that we pushed a fix for was found when a build for this large product moved from rhel4 to rhel5. The time increased from 2 hours to 4+. It was the spawning issue. Something changed that really made this worse. Once we had this fixed, the rhel5 build went to 2 hours and the old rhel4 based build went to ~1.5 hours.
Jason
From: Scons-dev [mailto:scons-dev-bounces at scons.org] On Behalf Of Tom Tanner (BLOOMBERG/ LONDON)
Sent: Thursday, January 22, 2015 2:49 AM
To: scons-dev at scons.org<mailto:scons-dev at scons.org>
Subject: Re: [Scons-dev] SCons-like build system
Having been poking around in this, I see that 'posix_spawn' claims to be available on both aix and solaris (at least they have man pages). Which is a relief for me. However, my current grief is that a 'do nothing' build is still taking 20 minutes to traverse the tree. *Part* of this is because I check the md5sum on each file to make sure someone hasn't gone and edited/patched/hacked a generated target (which happens more often than is desirable), but that isn't the bulk of the time taken.
I suspect I could leverage the fact that we use git: get the sha1 of certain repositories that are known not to be changed by us and, if we find a file therein, use the combined sha1 of the repos. That might improve things, at the cost of potentially spurious rebuilds if one of the repos changed but the others didn't, as I wouldn't be reading all the files.
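A minimal sketch of that idea, with made-up helper names: it asks git for each repository's checked-out commit and folds the results into one signature that stands in for hashing every file. Note the assumption that the working trees really are unmodified, since the commit SHA-1 says nothing about uncommitted edits.

```python
import hashlib
import subprocess


def repo_head_sha(repo_path):
    """Ask git for the commit SHA-1 currently checked out in `repo_path`."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def combined_signature(head_shas):
    """Fold several repository SHA-1s into one signature, independent of order."""
    digest = hashlib.sha1()
    for sha in sorted(head_shas):
        digest.update(sha.encode("ascii"))
    return digest.hexdigest()
```

A file found inside one of those repositories would then be given `combined_signature(...)` as its signature, so a change in any one repository invalidates all such files at once, which is the spurious-rebuild cost mentioned above.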