[Scons-dev] proposed improvement to temp file names used by scons cache
Mats Wichmann
mats at wichmann.us
Mon Aug 31 16:17:53 EDT 2020
On 8/31/20 1:06 PM, Raven Kopelman wrote:
> Hi there,
>
> We have a CI build framework configured such that many machines are
> concurrently building and sharing a scons cache. This cache lives on an
> Amazon EFS filesystem, mounted as NFS.
>
> In general this has been spectacularly successful, but every once in a
> while corrupted files start coming out of the cache. Our theory is that
> the EFS + NFS locking guarantees aren't good enough for the SCons temp
> name collision detection algorithm - attached is a patch we are going to
> try running with to see if it improves things.
>
> In addition to hoping a formalized version of this will be considered
> for SCons, I'm curious if anyone sees a more likely explanation for the
> symptoms described above.
>
> --- CacheDir.py 2020-08-19 12:59:25.790302000 -0700
> +++ CacheDir.py.uuid 2020-08-19 14:00:29.693749695 -0700
> @@ -32,6 +32,7 @@
> import os
> import stat
> import sys
> +import uuid
>
> import SCons.Action
> import SCons.Warnings
> @@ -100,7 +101,11 @@
>
> cd.CacheDebug('CachePush(%s): pushing to %s\n', t, cachefile)
>
> - tempfile = cachefile+'.tmp'+str(os.getpid())
> + # UUID in case filesystem doesn't support file operations well enough to deal with multiple
> + # machines sharing a cache and attempting to write the same file at the same time (NFS mount of
> + # AWS EFS?).
> + # TODO: Long filename concern on Windows?
> + tempfile = cachefile+'.tmp'+str(os.getpid()) + '_' + str(uuid.uuid1())
> errfmt = "Unable to copy %s to cache. Cache file is %s"
Probably not much reason to keep the getpid(). A PID is only unique
within one machine, so it's a pretty weak way to generate a "unique"
filename if there are multiple machines in play.
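
As a rough sketch of that suggestion (this is not the actual SCons code;
temp_cache_name is a hypothetical helper introduced only for
illustration), the temp name could be built from the UUID alone, keeping
the uuid.uuid1() call from the patch:

```python
import uuid


def temp_cache_name(cachefile):
    """Return a temporary file name next to cachefile.

    Hypothetical helper, not part of SCons. It drops os.getpid() and
    relies on uuid.uuid1(), as used in the patch above, so that writers
    on different machines should not pick the same temp name.
    """
    return cachefile + '.tmp' + str(uuid.uuid1())


if __name__ == '__main__':
    # Two calls for the same cache entry yield different temp names.
    print(temp_cache_name('/cache/AB/abcdef'))
    print(temp_cache_name('/cache/AB/abcdef'))
```

The TODO in the patch about long file names on Windows still applies,
since a UUID string adds 36 characters to the name.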