Summary: Mozilla's nightly build system was not originally set up to build a branch from the same code revision on every OS. This complicated any attempt to use nightly builds to track down a platform-specific bug. This has now been fixed. Send beer and chocolate to catlee.
If you are curious for details, read on!
Because of how Mozilla originally set up the nightly build system, there were three little-known quirks from the very beginning:
1) The nightly build was of the tip of the code at the time build started.
This sounds good, but it meant that anyone who checked in late at night ran the risk of breaking the build and not being able to back out the change quickly enough, causing the nightly builds to fail.
2) The machines for each OS started builds at different times
A machine only started a new build once it had finished the previous one, and the nightly was the first build it started after 3am. That build used the tip of the code and was published as the nightly build. However, with only one build machine per OS, “first job after 3am” meant the nightly started at different times on different OSes; the window of possible changes was ~3 hours (longest build time minus 1 min). Anyone who checked in during that window would get their change included in the nightly build for any OS that had not yet started, but not in the nightly build for any OS already started.
3) Nightly builds on different OS went into different directories
Because the nightly builds started at different times, the generated builds got different BuildIDs and were posted into different directories on ftp.m.o. This complicated regression hunting.
All confusing.
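To make quirk 2 concrete, here is a rough sketch of how the old “first build started after 3am” rule spreads nightly start times across OSes. The platform names, last start times and build durations are made-up numbers for illustration, and the back-to-back build loop is a simplification of what the real machines did.

```python
from datetime import datetime, timedelta


def first_start_after(cutoff, last_start, duration):
    """Return the first build start at or after `cutoff` for one machine.

    Simplified model: the machine builds back to back, so each new build
    starts exactly when the previous one finishes.
    """
    start = last_start
    while start < cutoff:
        start += duration
    return start


# Hypothetical data: (start of the build in progress, build duration) per OS.
cutoff = datetime(2010, 7, 1, 3, 0)
machines = {
    "linux": (datetime(2010, 7, 1, 2, 30), timedelta(minutes=45)),
    "mac": (datetime(2010, 7, 1, 2, 0), timedelta(minutes=90)),
    "win32": (datetime(2010, 7, 1, 2, 59), timedelta(hours=3)),
}

nightly_starts = {
    name: first_start_after(cutoff, last_start, duration)
    for name, (last_start, duration) in machines.items()
}

# Any checkin landing inside this window ends up in some nightlies but not others.
skew = max(nightly_starts.values()) - min(nightly_starts.values())
```

With these made-up numbers, the win32 machine had just started a 3-hour build at 2:59am, so its nightly does not start until 5:59am, long after linux and mac have already picked up tip.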
Catlee landed some changes recently which mean that now:
1) The nightly build is of the most recent “good” code.
The nightly build no longer automatically builds “tip”. This sounds counterintuitive at first, but actually makes sense; read on. The nightly build now starts by attempting to find a changeset that is newer than the previous nightly and is also known to be a good changeset. Right now the definition of “good changeset” is “compiles+links”. Eventually, as there are fewer intermittent tests, the plan is to change that definition to “compiles+links+passes-tests”. Worst case, if *no* changeset has successfully built since the previous nightly, then we’ll fall back to the old behavior and attempt to build tip even though we expect it to fail.
2) Each nightly build is told its BuildID and changeset
The buildbot master tells the build slave which BuildID and changeset to use for the nightly builds. This means the nightly builds are created with the same BuildID for each OS – which means that the nightly builds for each OS show up in the *same* directory on ftp.m.o. No more finding last night’s mozilla-central nightly for linux, mac and win32 in different directories!
All obvious goodness!
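For the curious, the two changes above can be sketched roughly as follows. This is not catlee's actual buildbot code: the function names, the data shapes and the timestamp-style BuildID format are all assumptions made for illustration.

```python
from datetime import datetime


def pick_nightly_changeset(history, previous_nightly, good):
    """Pick the revision to build for tonight's nightly.

    history: revisions ordered oldest to newest; the last entry is tip.
    previous_nightly: revision used by the previous nightly.
    good: set of revisions known to be good (today: "compiles+links").
    """
    if previous_nightly in history:
        candidates = history[history.index(previous_nightly) + 1:]
    else:
        candidates = history
    # Prefer the newest changeset that is known to be good.
    for rev in reversed(candidates):
        if rev in good:
            return rev
    # Worst case: nothing has built successfully, so fall back to tip.
    return history[-1]


def plan_nightly(history, previous_nightly, good, now, platforms):
    """Decide BuildID and revision once, then hand the same pair to every OS."""
    revision = pick_nightly_changeset(history, previous_nightly, good)
    build_id = now.strftime("%Y%m%d%H%M%S")  # assumed BuildID format
    return {p: {"buildid": build_id, "revision": revision} for p in platforms}


# Hypothetical revisions and platforms.
history = ["a1", "b2", "c3", "d4"]
plan = plan_nightly(
    history,
    previous_nightly="a1",
    good={"b2", "c3"},
    now=datetime(2010, 7, 1, 3, 0, 0),
    platforms=["linux", "mac", "win32"],
)
```

The point of the sketch is that the decision is made once, in one place, rather than by each build machine whenever it happens to get around to starting.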
“Why wasn’t this fixed years ago???” I hear you ask. It has only become possible after all the other recent changes and scaling done in RelEng, as well as untangling what “build start time” and “BuildID” mean in Tinderbox, Makefile and MozillaBuildSystem. Fixing this very long-standing annoyance should help developers and QA triage problems with nightly builds, and it also makes me happy. For the curious, further details are in bug#570814.
Wouldn’t it have been easier to just pull from a build repository to create the working repos for each nightly build?
ie:
1. once a day (say midnight or 2am) pull mozilla-central to mozilla-nightly-build-master
2. nightly builds pull mozilla-nightly-build-master to produce a build
This would guarantee that nightly builds were all done with the same revision, wouldn’t it?
ps. This should still be done so that, in the case you mentioned where no build is “good”, all the nightly builds are still done at the same revision.
I support this plan.