Firefox 3.6.12 and Firefox 3.5.15 by the (wall-clock) numbers

Firefox3.6.12 was released on Wednesday 27-oct-2010, at 16:48PST. This was yet another release that shipped within 24 hours, and yet again, this set a new speed record for us.

From “Dev says go” to “release is now available to public” was 21h 28m wall-clock time. The Release Engineering portion of that was 10h 25m. This was faster than our previous fastest-ever release, FF3.6.6, and well inside 24 hours from start to finish. For FF3.6.12, the wall-clock times were:

19:20 26oct: Dev says “go” for FF3.6.12
19:48 26oct: FF3.6.12 builds started
21:55 26oct: FF3.6.12 linux, mac, unsigned-win32 builds handed to QA
00:05 27oct: FF3.6.12 signed-win32 builds handed to QA
03:35 27oct: FF3.6.12 update snippets available on test update channel
09:05 27oct: Dev & QA say “go” for Release; ok to start mirror absorption
10:50 27oct: mirror absorption started
10:55 27oct: mirror absorption good enough for testing
16:00 27oct: website changes finalized and visible. Build given “go” to make update snippets live.
16:21 27oct: update snippets available on live update channel
16:48 27oct: release announced
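
The end-to-end figure can be recomputed from the first and last entries in the list above. Here is a minimal shell sketch of that arithmetic; the helper names `elapsed_minutes` and `format_hm` are my own for illustration and are not part of any release tooling, and the day offset has to be passed in by hand because the list only gives dates like “26oct”.

```shell
#!/bin/sh
# Hypothetical helper: minutes from a start "HH:MM" to an end "HH:MM",
# where $3 is the number of calendar days between the two dates (default 0).
elapsed_minutes() {
    # Prefixing "1" and subtracting 100 stops "08"/"09" being read as octal.
    # This assumes every hour and minute field is exactly two digits.
    start_h=$(( 1${1%%:*} - 100 )); start_m=$(( 1${1##*:} - 100 ))
    end_h=$(( 1${2%%:*} - 100 ));   end_m=$(( 1${2##*:} - 100 ))
    echo $(( ${3:-0} * 1440 + (end_h * 60 + end_m) - (start_h * 60 + start_m) ))
}

# Hypothetical helper: render a minute count as "Nh Nm".
format_hm() {
    echo "$(( $1 / 60 ))h $(( $1 % 60 ))m"
}

# "Dev says go" (19:20 26oct) to "release announced" (16:48 27oct)
format_hm "$(elapsed_minutes 19:20 16:48 1)"   # prints "21h 28m"
```

The same two helpers work for any pair of entries, for example `elapsed_minutes 09:05 10:50` for the gap between the “go” for Release and the start of mirror absorption.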

I note that we also simultaneously shipped FF3.5.15 in the same super-fast way.

19:20 26oct: Dev says “go” for FF3.5.15
19:46 26oct: FF3.5.15 builds started
21:55 26oct: FF3.5.15 linux, mac, unsigned-win32 builds handed to QA
11:05 26oct: FF3.5.15 signed-win32 builds handed to QA
04:35 27oct: FF3.5.15 update snippets available on test update channel
09:05 27oct: Dev & QA say “go” for Release; ok to start mirror absorption
10:10 27oct: mirror absorption started
10:50 27oct: mirror absorption good enough for testing
16:00 27oct: website changes finalized and visible. Build given “go” to make update snippets live.
16:21 27oct: update snippets available on live update channel
16:48 27oct: release announced

Obviously, we don’t want to ship this quickly all the time, but it’s nice to know that we can if we have to. Really really nice. And stay tuned – we’ve got more work in progress to make this even faster! 🙂

Notes:

1) While there’s been some discussion about how this release was done in “2 f*ing days”, I prefer to measure the time between “dev says go” to “release is available”. Explicitly I do *not* measure from “fix is reported” to “release is available”, because I don’t want to put any further time pressure on a developer trying to fix a problem under pressure. It feels much better to me to work a little longer to get the fix right instead of adding even more time pressure looking for a quick fix, and then having to do another emergency release a few days later to fix the “quick fix”.

2) The super fast mirror uptake was due to some great work by mrz and justdave. Not that we would do that for every release, but it was great to have the assist when time really mattered!

3) As usual, our blow-by-blow scribbles are public, so you can read all the details here or in tracking bug#607228. For FF3.5.15, the build notes are here or in tracking bug#607240.

Thank you
John.

Breakthrough on Android automation

If you are watching closely, you might have noticed a small change recently to TBPL and tinderbox server and graphserver. The circled little green “T”s mean that the Talos suites tdhtml, ts, tsvg are now automatically reporting valid results on Android systems for every checkin on the mobile tree.

Of course, we’ve had automated Android builds for a while now – those little green “B”s on TBPL are Android builds triggered per checkin, and available like all our other builds on ftp.m.o. The big news here is that we now have 3 Talos test suites correctly reporting green, each reporting its results to graphserver, tinderbox and TBPL, just like we do for any other fully supported OS on our infrastructure.

This is still only the beginning. There’s still the rest of the Talos suites and all the unittest suites to get reporting green. But at least, we now know that the basic infrastructure works!

From RelEng this took aki and bear leading a ton of work, both within RelEng and also with ctalbert, jmaher, bmoss, mwu, blassey, and others across Mozilla… the list goes on and on. Thank you – each and all.

If you are interested in helping get more test suites reporting green and showing on TBPL, please follow along in bug#538524, or ping aki or bear. At this point, the tricky part for us is that we can’t enable broken/failing tests in production – that would close the tree. 🙁 Instead, we post the failing test suites on http://tinderbox.mozilla.org/showbuilds.cgi?tree=MobileTest, look through the logs and then ask developers/QA to help fix the broken code/tests. Only after suites are green can we have them report on the production mobile tree, and TBPL.

ps: So far, we have only 3 (soon 4) developer boards in production to run these tests. This means we are struggling with wait times for the checkin load on mobile branch. This also means we cannot realistically enable Android testing on high-load branches like mozilla-central, or TryServer – they simply couldn’t keep up with the number of jobs to test. I’ve been unsuccessfully trying for a month now to unjam this, so if you can help us get more developer boards, please drop me an email – I’d love to hear from you!

Please welcome Dustin Mitchell to Release Engineering

We’re excited to have Dustin join Coop’s group here within RelEng. If you’ve been using Buildbot over the last couple of years, you’d know that Dustin has been maintainer of the Buildbot project through some large new features, while also helping grow the community.

He’ll bring additional buildbot expertise to our group and help make sure our non-Mozilla-specific work continues to be upstream-able to the general buildbot community. Also, part of his time will be spent providing further outreach to the buildbot community, helping others make buildbot even more cool.

He’ll be another remote RelEng person – working from Chicago – but you can find him in irc.mozilla.org in the #build channel as “dustin”. You can also follow his blog.

[Updated to include URL for Dustin’s blog. joduinn 24oct2010]

How to fix “Things freezes at start of Sync”

A few days ago, my Things-on-Mac stopped synchronizing with my Things-on-iPhone. I tried everything on the CulturedCode forums and from the CulturedCode support emails without success. It took a while to debug this, so here are details, in case it helps others (or I have to do this again!)

What am I running:

  • MacBookPro running OSX 10.6.4
  • Things-for-Mac v1.4.2 (1420)
  • iPhone 3G running v4.1 (8B117)
  • Things-for-iPhone v1.6.1

Symptoms:

Individually, I could use Things on Mac, and on iPhone, just fine. However, any attempt to synchronize between the two would cause a progress dialog box on Mac saying “Preparing…” which would just hang, until I force-quit it. In case I was being too impatient, I left it running overnight once but it was still just as hung in the morning.

This hang happened 100% of the time. This hang happened regardless of whether I started synchronization on Mac with File->”Sync with … now”, or on iPhone, by starting Things-on-iPhone while on same wifi network as mac. This hang happened on my home wifi network and also on the office wifi network, and even when I had no other applications running on my mac.

This setup had been working without problems for months, and I hadn’t installed any new software or updated any existing software, so I’m still baffled about what caused this problem.

Here are some things I tried first, unsuccessfully:

  • rebooting mac, rebooting phone, clicking sync. Hang.
  • rebooting mac, rebooting phone, removing phone from list of devices, adding phone into list of devices, re-pairing iPhone to mac, clicking sync. Hang.
  • removing Things from iPhone and reinstalling through iTunes, rebooting mac, rebooting phone, removing phone from list of devices, adding phone into list of devices, re-pairing iPhone to mac, clicking sync. Hang.
  • Repeat all of the above on home wifi, and then again on work wifi.
  • At home I also tried all of this after rebooting my home wifi access point.
  • All to no success.

At that point, I remembered the idea of taking backups, so I backed up the entire Things data directory, which in my case was in /Users/john/Things:
$ cd /Users/john
$ rsync -av Things Things-2010-10-01

Note: using “rsync” preserved the timestamps, in case that was part of the synchronization logic.
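
Before experimenting on the live data, it can be worth confirming that the copy really matches the original. This is a minimal sketch using `diff -rq`; the helper name `verify_backup` is hypothetical, and it only compares file contents, not timestamps.

```shell
#!/bin/sh
# Hypothetical helper: report whether two directory trees have identical
# file contents. Returns non-zero if they differ or either path is missing.
verify_backup() {
    if diff -rq "$1" "$2" >/dev/null 2>&1; then
        echo "backup matches"
    else
        echo "backup differs"
        return 1
    fi
}

# Because the rsync source above has no trailing slash, the copy lands one
# level down, in Things-2010-10-01/Things.
if [ -d /Users/john/Things ] && [ -d /Users/john/Things-2010-10-01/Things ]; then
    verify_backup /Users/john/Things /Users/john/Things-2010-10-01/Things
fi
```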

Here are the steps that fixed it:

  • remove Things from iPhone
  • exit Things on Mac
  • inside the Things directory on Mac, there is a “Backups” directory. This contains daily backups of your Things data. I copied the oldest backup over the current “latest” Things data file, as follows:
    $ cd /Users/john/Things/Backups
    $ cp DatabaseBackup\ 2010-09-29\ \(653\).xml ../Database.xml

  • reinstall Things on iPhone
  • start Things on iPhone
  • start Things on Mac
  • remove phone from list of devices, add phone into list of devices, re-pair iPhone to Mac
  • exit Things on Mac, Things on iPhone
  • start Things on Mac, Things on iPhone, and click sync.
  • It took several minutes of “Pending…”, but this time the progress bar was moving, which gave me hope. After a few minutes of this, success! I could now see all the items on my ToDo on both devices!! OK, it was all from an almost week-old backup – but still, encouraging progress.

At this point, my theory was that something happened during the week that corrupted the Things data xml file. The files were all still valid xml files, so something more subtle was wrong. To find when the corruption happened, I repeated these steps for each different backup, each time copying up the next newest backup. In theory, once I reached the corrupted xml file, sync should hang again. However, following the process above, each restore attempt worked, all the way to the latest backup! I ended up with the latest contents of my Things-on-Mac finally visible again on my Things-on-iPhone.
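
The copy step in that oldest-to-newest walk can be sketched as two small shell helpers. Both names (`list_backups`, `restore_backup`) are hypothetical, and the sketch assumes the “DatabaseBackup YYYY-MM-DD (N).xml” naming shown earlier, so that the shell’s sorted glob order is also date order. It only handles the file copy; exiting Things, re-pairing the phone and clicking sync still have to be done by hand between restores.

```shell
#!/bin/sh
# Hypothetical helper: print the backup files in directory $1, oldest first.
# Relies on the date in the file name sorting lexically; two backups from the
# same day may not come out in the order they were written.
list_backups() {
    for f in "$1"/DatabaseBackup*.xml; do
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
}

# Hypothetical helper: copy backup file $1 over the live data file in the
# Things directory $2, keeping the replaced file as Database.xml.before-restore.
# Things on Mac should not be running while this is done.
restore_backup() {
    backup=$1; things_dir=$2
    [ -f "$backup" ] || { echo "no such backup: $backup" >&2; return 1; }
    if [ -f "$things_dir/Database.xml" ]; then
        cp -p "$things_dir/Database.xml" "$things_dir/Database.xml.before-restore"
    fi
    cp -p "$backup" "$things_dir/Database.xml"
}
```

For example, `list_backups /Users/john/Things/Backups` shows the candidates in order, and `restore_backup "<one of those files>" /Users/john/Things` puts the chosen one in place before the next sync attempt.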

Final step was to do a quick test update on Mac, along with another test update on iPhone, then syncing to verify that both changes were handled correctly. This worked fine too – so everything is good!

take care
John.