Here’s a proposal to change the directory structure on ftp.m.o for new Firefox, Fennec and XULrunner builds going forward. To reduce disruption, existing builds would remain where they currently are, until they are aged off as usual.
This fixes an intermittent problem we hit with respins-of-nightly-builds, brings us one step closer to building cool regression-hunting tools, and streamlines RelEng automation as we consolidate Firefox+Fennec automation.
BIKESHED ALERT: There are lots of potential opinions here. To avoid infinite loops, please read this entire doc, and the discussions in the two bugs, before commenting. Also, I’ve cross-posted to a few groups, to make sure this is widely seen. However, please respond here in dev.planning, or if appropriate, in the related bugs:
https://bugzilla.mozilla.org/show_bug.cgi?id=449607
https://bugzilla.mozilla.org/show_bug.cgi?id=487036
Details:
On ftp.m.o, this proposal would only change files under http://ftp.mozilla.org/pub/mozilla.org/firefox, http://ftp.mozilla.org/pub/mozilla.org/xulrunner and http://ftp.mozilla.org/pub/mozilla.org/mobile. Some concrete examples would be helpful:
before: firefox/tinderbox-builds/{branchname}-{OS}/{seconds-since-epoch}/
after: firefox/tinderbox-builds/{branchname}/{YYYYMMDDHHMMSS}/{OS}
before: firefox/nightly/YYYY-MM-DD-HH-{branchname}
after: firefox/nightly/{branchname}/YYYYMMDDHHMMSS/{OS}
before: mobile/tinderbox-builds/{branchname}-{OS}/{seconds-since-epoch}/
after: mobile/tinderbox-builds/{branchname}/{YYYYMMDDHHMMSS}/{OS}
before: mobile/nightly/YYYY-MM-DD-HH-{branchname}
after: mobile/nightly/{branchname}/YYYYMMDDHHMMSS/{OS}
As an example, this would change from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1283011618/ …to: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central/20100828160658/linux
…and change from: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011-01-03-03-mozilla-central/firefox-4.0b9pre.en-US.win32.zip …to: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/mozilla-central/20110103035959/win32/firefox-4.0b9pre.en-US.win32.zip
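To make the tinderbox-builds mapping above concrete, here is a minimal Python sketch that rewrites an old-style location into the proposed one. The function names are hypothetical, and interpreting the epoch value in UTC is an assumption: the post does not say which timezone the timestamps use, although UTC does reproduce the mozilla-central linux example above.

```python
from datetime import datetime, timezone


def epoch_to_build_id(epoch_seconds: int) -> str:
    """Format seconds-since-epoch as a 14-digit YYYYMMDDHHMMSS string.

    Assumption: the timestamp is interpreted in UTC.
    """
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S")


def new_tinderbox_path(product: str, branch: str, os_name: str,
                       epoch_seconds: int) -> str:
    """Build the proposed {branchname}/{YYYYMMDDHHMMSS}/{OS} layout."""
    build_id = epoch_to_build_id(epoch_seconds)
    return f"{product}/tinderbox-builds/{branch}/{build_id}/{os_name}"


if __name__ == "__main__":
    print(new_tinderbox_path("firefox", "mozilla-central", "linux", 1283011618))
```

Run against the example above, this prints firefox/tinderbox-builds/mozilla-central/20100828160658/linux.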
Why change?
1) A common use case is when someone reports a problem with a BuildID, and we want to find that specific build on ftp.m.o. The current process is inefficient: we manually work out approximately when the build was created and convert that to epoch time, or we manually eyeball the timestamps on files on ftp. With this change, we would immediately be able to find that build. We could later build tools that directly link to the build on ftp.m.o.
2) Builds created with the same BuildID, for every OS, will be in the same directory. We already do this for nightly builds.
3) This full BuildID corresponds to the full BuildID in the txt file we already create alongside each build we post on ftp.m.o. For developers, this txt file also includes the changeset info. For example:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2010-08-29-04-mozilla-central/firefox-4.0b5pre.en-US.win32.txt contains:
20100829040614 414ff9016349
4) This avoids using changesets as the unique directory identifier.
Changesets are unique, which is good. However, there are significant drawbacks:
4a) changesets do not sort sequentially, which makes it harder to do a binary search through the directories on the filesystem to find a regression.
4b) using changesets raises a different problem about how to handle respin-of-same-changeset. Using BuildID handles respins. However, using changesets would require an additional solution, like creating subdirs numbered build1, build2, or subdirs numbered by BuildIDs/timestamp. That seems even more complicated, and anyway still uses BuildIDs/timestamp info. Even for cases where we do not respin, we’d need to create this subdir anyway, to avoid having respin-logic need to move files (and break links that point to the old location).
4c) using changesets is usually advocated by people trying to figure out what changed between two specific builds. That is better resolved by bug#487036 (see below).
5) This helps fix a set of interconnected bugs
bug#431905 Change build process to generate consistent BuildIDs
bug#449607 change dated dirs on ftp.m.o to use new longer BuildID
bug#496549 relbranch names should have a finer resolution than 1 day
bug#487036 write tool to read buildbot db for BuildID+changesets of nightlies, and then construct URL to feed to hg pushlog
bug#538540 stop putting hour number in nightly directories
bug#584178 list hourly tinderbox builds by changeset on ftp.mozilla.org
6) Semi-related, bug#570814 “Nightly builds should all use the same revision” was fixed recently, so now all the builds for the same night on the same branch get the same BuildID. This should further help tidy up the build directories on ftp.m.o.
7) If RelEng is asked to respin a nightly, and we do so within the same hour as the first nightly (rare, but it has happened), the new nightly overwrites the old. That is not great, and it causes problems for people getting updates, which then need manual RelEng repair work.
8) Using {OS} as a directory makes it easy to delete the directory and recreate it as part of posting the files of the build. This fixes the recurring unhappiness whenever filenames change (like between betas), which causes problems for nightly.m.o.
9) This makes the structure for Firefox, Fennec and XULrunner builds consistent. This makes the structure for incremental builds and nightly builds consistent. This consistency allows RelEng to further streamline automation.
Open question:
While we are doing this change, it seems like a good time to also rename the directory “tinderbox-builds”. We are no longer using any tinderbox clients to build/test, and we are almost complete with the switchover from tinderbox-waterfall to TBPL, so this term no longer seems valid. I’m suggesting “continuous” or maybe “continuous-builds” as a better name to store all the incremental build-on-checkin work we do throughout the day.
(Alternatives already suggested that I’d prefer to avoid: “buildbot-builds” (in case we ever switch from buildbot), “builds” (too vague/overloaded), “depend_build” (what happens if we do a clobber in the day?) or “per_checkin_build” (what happens if we collapse build queues to have multiple checkins per build?). What alternatives can you come up with?)
Hope all that makes sense – there’s a lot of background and detail, so if I’m missing something, do let me know. Also, if you have comments or concerns, please chime in on the dev.planning newsgroup, in either of the bugs at the top, or even here as a comment on this post.
Thanks for reading this far!
John.