When to use the Beta Update channel vs the Release Update channel?

Here’s something I posted in bug#405584 today which others might find interesting.
“Can you let us know a few days before shipping, when a new FF is coming, so we can test it and confirm it doesn’t break with our site – *before* you ship FF”?

Well, actually, we already do this. Let me explain with some background…

We have 3 different channels for sending out updates to users. These channels are currently called: nightly channel, beta channel and release channel. The nightly channel keeps you updated with new nightly builds as they are produced – the “bleeding edge” of browser development, so to speak, and typically of most interest to FF developers and testers. It’s also the most unstable. However, I’d like to talk more about the “beta” and “release” channels.

Firefox 2.0.0.11 by the (wall-clock) numbers

Mozilla released Firefox 2.0.0.11 on Friday 30-nov-2007, at 1:30pm PST.

From “do we need a release” to “release is now available to public” was almost 3 days (71.5 hours) wall-clock time, of which Build&Release took 36 hours.

13:50 27nov: decide bug#405584 regression in FF2.0.0.10 justifies producing a quick FF2.0.0.11 to address
16:00 27nov: Dev says “go”
17:55 27nov: 2.0.0.11rc1 builds started
19:45 27nov: linux builds handed to QA
21:45 27nov: mac builds handed to QA
12:50 28nov: win32 signed builds handed to QA
16:00 28nov: update snippets on betatest update channel
10:45 29nov: QA says “go” for Beta
14:30 29nov: update snippets on beta update channel
17:00 29nov: Dev & QA says “go” for Release; Build starts final signing, bouncer entries
19:00 29nov: final signing, bouncer entries done
07:00 30nov: mirror replication started
13:30 30nov: update snippets on live update channel; website changes finalized and visible; release announced
Notes:
1) This was a really fast release!! Despite the fast turnaround, it felt like things were still running in a controlled calm manner, we still covered everything we usually do, and even improved on the process a little. All great things to see. The Build Automation used in FF2.0.0.11 was identical to what we used for FF2.0.0.10, so this was still not yet a “100% human free” release.
2) bug#405643 was reported as another regression in FF2.0.0.10. However, we confirmed it was actually a feature of fixing security bug#369814, and proposed a workaround for LotusDomino servers.
3) For this one fix, we decided not to wait for a beta period, as it was a one line fix already landed on trunk back on 11th Oct. However, we still wanted to move people who were using FF2.0.0.10beta forward to FF2.0.0.11beta. This meant we still needed to push update snippets out on the beta channel and test appropriately.
4) During 2.0.0.10, we had to hold the release a few hours, waiting for some website changes to be finished. In a process change for FF2.0.0.11, we started the website and release note work much earlier, starting when QA says “go” for beta. This change helped, and we plan to do this for future releases.
5) We waited until morning to start pushing to mirrors. This was done so mirror absorption completed as QA were arriving in the office to start testing update channels. We did this because we wanted to reduce the time files were on the mirrors untested; in the past, overly excited people have posted the locations of the files as “released” on public forums, even though the files had not yet been through the last of the sanity checks. Coordinating the mirror push like this reduced that likelihood just a bit.
6) Mirror absorption took 3 hours to reach all values >= 60%.
take care
John.

Firefox 2.0.0.10 by the (wall-clock) numbers

Mozilla released Firefox 2.0.0.10 on Monday 26-nov-2007, at 6.30pm PST.

From “do we need a release” to “release is now available to public” was 14 days 2 hours wall-clock time, of which the Beta period took 6.75 days, and Build&Release took 34 hours.

16:25 12nov: decide regressions introduced in FF2.0.0.9 justify producing a quick FF2.0.0.10 to address
20:20 14nov: Dev says “go”
03:40 15nov: 2.0.0.10rc1 builds started
07:05 15nov: linux builds handed to QA
11:05 15nov: linux, mac and win32 signed builds handed to QA
07:00 16nov: update snippets on betatest update channel
14:00 19nov: QA says “go” for Beta
15:00 19nov: update snippets on beta update channel
09:15 26nov: Dev & QA says “go” for Release; Build starts final signing, bouncer entries
11:00 26nov: final signing, bouncer entries done; mirror replication started
18:30 26nov: update snippets on live update channel; website changes finalized and visible; release announced
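
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the wall-clock numbers from the timeline above. The helper names (`parse_stamp`, `elapsed_hours`) are made up for illustration, the year is assumed to be 2007, and the “Beta period” is assumed to run from snippets landing on the beta channel until the “go” for Release. Daylight savings is ignored, which is fine here because the whole timeline falls after the clock change.

```python
from datetime import datetime

YEAR = 2007  # assumption: the timeline stamps carry no year
MONTHS = {"oct": 10, "nov": 11}


def parse_stamp(stamp: str) -> datetime:
    """Parse a 'HH:MM DDmon' stamp such as '16:25 12nov'."""
    clock, day_month = stamp.split()
    hour, minute = (int(part) for part in clock.split(":"))
    day, month = int(day_month[:-3]), MONTHS[day_month[-3:]]
    return datetime(YEAR, month, day, hour, minute)


def elapsed_hours(start: str, end: str) -> float:
    """Wall-clock hours between two timeline stamps."""
    return (parse_stamp(end) - parse_stamp(start)).total_seconds() / 3600


# "do we need a release" to "release announced"
total = elapsed_hours("16:25 12nov", "18:30 26nov")
# beta channel snippets to "go" for Release (assumed definition of Beta period)
beta = elapsed_hours("15:00 19nov", "09:15 26nov")

print(f"total: {int(total // 24)} days {total % 24:.1f} hours")
print(f"beta period: {beta / 24:.2f} days")
```

Run as-is, this gives 14 days and about 2.1 hours overall, and a beta period of about 6.76 days, which lines up with the figures quoted above.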

While Build Automation in FF2.0.0.10 was much smoother than FF2.0.0.9, this was still not yet a “human free” release:
1) We still manually do signing, adding bouncer entries, starting mirror replication and monitoring mirror replication, pushing snippets to beta channel, pushing snippets to release channel. Combined, these took 5.5 hours of the Build time, and are not yet automated.
2) We had to hold the release, waiting for some website changes to be made and then published. This delay was caused by an internal human communication snafu within Mozilla – some folks had not been notified we were releasing that day, so some website changes were not ready. We eventually raised them on cellphones after they landed off a plane, and made the website changes, but this delay cost us approx 3 hours. We’re tweaking the human processes to try to avoid this in future.
3) Mirror absorption took 3 hours to reach all values >= 60%. We’ve been experimenting with the last few releases, to see what absorption value is “good enough” without hammering individual mirrors. So far, lower limits of 70%, 65% and 60% have been tried. Without any real evidence, I just feel nervous about trying any lower percentage, as fewer mirrors would be sharing the overall load, maybe burning through those mirrors’ bandwidth. Open to persuasion though, if people have suggestions?!!
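
The “all values >= 60%” check in note 3 boils down to something like the Python sketch below. The function name and the per-file percentages are hypothetical, invented purely to illustrate the rule; they are not real output from the mirror measuring tools.

```python
def ready_to_release(absorption: dict[str, float], threshold: float = 60.0) -> bool:
    """True only when every file has reached the absorption threshold."""
    return all(pct >= threshold for pct in absorption.values())


# Hypothetical absorption percentages per file type, for illustration only.
sample = {"win32 installer": 74.0, "linux tarball": 68.5, "mac dmg": 61.0}

print(ready_to_release(sample))                  # checked against 60%
print(ready_to_release(sample, threshold=70.0))  # checked against 70%
```

With these made-up numbers the 60% limit passes while the 70% limit does not, because the slowest file decides the outcome. That is why the choice of lower limit matters so much.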

take care
John.

Firefox 3.0beta1 by the (wall-clock) numbers

Mozilla released Firefox 3.0 beta1 on Monday 19-nov-2007, at 11.10pm PST.

From “code freeze” to “release is now available to public” was 19 days 23 hours wall-clock time, of which Build&Release took 9 days and 1 hour.

23:59 31oct: code freeze for 3.0beta1
15:00 06nov: Dev says “go” to Build
18:25 06nov: rc1 builds started
20:30 06nov: win32 builds failed out (bug#346214)
22:30 06nov: win32 builds restarted after bug#346214 fixed on release branch
14:30 07nov: linux, mac and unsigned win32 builds handed to QA
17:25 07nov: rc1 abandoned (see details below)
17:25 07nov: rc2 builds started
17:20 08nov: rc2 builds abandoned (bug#402999)
19:05 08nov: rc3 builds started after bug#402999 fixed on release branch
17:30 09nov: linux & mac builds handed to QA
14:55 12nov: win32 signed builds handed to QA
18:15 16nov: Dev & QA says “go” for Release; Build starts final signing, bouncer entries
14:45 19nov: final signing, bouncer entries done; mirror replication started
23:10 19nov: announced

1) There is no Build automation running on trunk, so this release was done manually.
2) The rc1 builds were abandoned because of a manual error in how cvs was tagged. Two Build engineers were working in parallel to speed things up: one engineer typed PDT timezone into one machine, while the other engineer typed PST into another computer, so the two machines had an hour difference in what source timestamp to use for the builds. That one hour difference meant the generated builds missed one important last minute showstopper bugfix. This was totally a manual snafu within the Build team, and would have been avoided if automation was in place on trunk. (Ironically, daylight savings time only changed this same week; a week earlier this same snafu would have passed blissfully unnoticed!)
3) During rc1, there was a 4h20m delay while the Build team investigated a regxpcom test error at the end of the win32 build. Turns out the build was actually fine. The error was caused by a collision between the hourly build and production build processes running on the same machine at the same instant. Killing the hourly build and rerunning production worked fine.
4) By prior agreement, we did not create update snippets for this beta. Any users on earlier Alpha releases would not get updates refreshing them forward to beta1; instead Alpha users would have to manually download and install beta1. We do plan on creating update snippets for beta2 and beta3.
5) Because this was a Beta release, we did not do any “beta period” before releasing the beta! 🙂
6) Mirror absorption took 8 hours to reach the 70% threshold, not the usual 3 hours. In a random coincidence, there was a problem with one of the central rsync hubs in the mirror farm, and also one of the major mirrors, further compounded by problems when switching to backup servers. Dave Miller has all the drama details of late night pagers, and various mirror owners jumping to help (shout out to Shane!).
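
To make the timezone mix-up in note 2 concrete, here is a small Python sketch. The cutoff time and the bugfix check-in time are hypothetical (the real values are not given above); the only facts relied on are that PDT is UTC-7 and PST is UTC-8.

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")

# Hypothetical source cutoff: the same wall-clock string typed on both machines.
cutoff = datetime(2007, 11, 6, 15, 0)
cutoff_as_pdt = cutoff.replace(tzinfo=PDT)  # machine A: 22:00 UTC
cutoff_as_pst = cutoff.replace(tzinfo=PST)  # machine B: 23:00 UTC

# Hypothetical last-minute bugfix landing between the two interpretations.
bugfix_landed = datetime(2007, 11, 6, 22, 30, tzinfo=timezone.utc)


def included(checkin: datetime, source_cutoff: datetime) -> bool:
    """A check-in is part of the build only if it landed by the cutoff."""
    return checkin <= source_cutoff


print(cutoff_as_pst - cutoff_as_pdt)           # one hour apart
print(included(bugfix_landed, cutoff_as_pdt))  # machine A misses the fix
print(included(bugfix_landed, cutoff_as_pst))  # machine B picks it up
```

Two machines given the “same” timestamp end up building from source trees an hour apart, so a fix that lands inside that hour is in one set of builds and missing from the other.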

take care
John.

Interesting commuter driving on Golden Gate Bridge

This morning (06:50am 28-nov-2007), a commuter went unconscious while driving her sport utility vehicle on the Golden Gate Bridge. With the sole-occupant driver unconscious behind the wheel, the car swerved out of its lane, and towards the oncoming traffic on the other side of the bridge. Another commuter reacted quickly, used his pickup truck to force the still moving sport utility vehicle away from oncoming traffic and over to the side of the road.

[Link to full story on sfgate.com]

Quick thinking and very nice driving.

Thunderbird 2.0.0.9 by the (wall-clock) numbers

Mozilla released Thunderbird 2.0.0.9 on Wednesday 14-nov-2007, at 5.10pm PST.

From “Dev says code ready to release” to “release is now available to public” was 15 days 22.5 hours wall-clock time, of which the Beta period took 6 days 8 hours, and Build&Release took just over 2.5 days (62.5 hours).

17:30 30oct: Dev say go
09:40 31oct: mac builds handed to QA
10:00 31oct: linux builds handed to QA
17:55 31oct: win32 signed builds handed to QA
06:50 02nov: update snippets available on betatest update channel
14:30 06nov: QA says “go” for Beta
16:10 06nov: update snippets available on beta update channel
00:30 13nov: Dev & QA says “go” for Release; Build starts final signing, bouncer entries
08:25 13nov: final signing, bouncer entries done; mirror replication started
09:40 13nov: Build announced enough mirror coverage for QA to use releasetest channel
12:40 13nov: win32 installer bug#403670 discovered
14:00 13nov: declare bug#403670 as showstopper, put TB2.0.0.9 on hold.
18:20 13nov: root cause and fix of bug#403670 found.
05:05 14nov: one rebuilt win32 installer handed to QA to verify bugfix
05:40 14nov: QA confirmed new win32 installer is ok.
08:30 14nov: all rebuilt win32 installers handed to QA
10:10 14nov: QA signoff on rebuilt win32 installers, mirror replication started
15:00 14nov: mirror replication confirmed complete on new win32 installers
16:00 14nov: update snippets available on release update channel (for end users)
17:10 14nov: release announced

1) This was not a “human free” release. The automation work done for FF2.0.0.9 has not been tested for TB2.0.0.9. In theory it should work just fine, but we just haven’t had time to test it, so we chose to play safe and do this release manually. Hence this took more time for Build to produce. All of that time was manually intensive Build work.
2) bug#403670 was caused by a combination of factors. One factor was human error: I incorrectly set up a workarea on a signing machine (the same incorrect setup works fine for Firefox releases); the signing doc has now been updated. The other factor was a long-standing-but-previously-unknown error handling problem in one of our signing scripts; how to fix this is being debated within the Build team. Note: this problem was with the windows installer only, not with any Thunderbird code, and not the linux/mac installers. Overall, this delayed the release by approx 22 hours.
3) Mirror absorption times were messed up by the stop-and-restart caused by bug#403670.
4) The daylight savings PST change happened during this release, giving us an extra hour. That is counted in the overall times above.

take care
John

Keeping perspective: 34hours vs 37hours

It took 34 hours to produce Firefox3.0beta1 rc1.

Those 34 hours were frantic. Two people, tag teaming day & night, working with the nervous tension of knowing that a single one-character typo could invalidate the entire build, and force us to start all over again. Those 34 hours only got us as far as producing unsigned builds on each platform – roughly 1/3 of the overall Build work needed to do a release – before we hit a problem. A typo. At the beginning of it all, one person typed PDT into one computer, while the other person typed PST into another computer. That typo meant rc1 did not include a last minute important bugfix. So, we scrapped rc1 and started all over again, building rc2. (I note that the D and S are even next to each other on the keyboard [sigh!]. And if it wasn’t for the timezone change last week, it would not have mattered either [sigh! sigh!])

To put that 34 hours in perspective, Build took 37 hours to do everything needed for the complete FF2.0.0.9 release… and most of that was actually just watching the automation chugging along. Active human work was down to a handful of hours for signing, bouncer/mirror updates, and a little nervous manual rechecking of the automated checks, just to be sure, to be sure.

Why the night and day difference?

We’ve been focusing on automation for the FF2.0.0.x branch over the last few months, shipping FF2.0.0.7, FF2.0.0.8 and FF2.0.0.9 each time with automation improved from the previous release. Sadly, none of this automation work is live on trunk yet. All the trunk releases, like the alphas, and now this FF3.0beta1, are done the old fashioned way. By hand. One command at a time.

This week was a stark reminder of what things used to be like, and gave perspective on how much we’ve accomplished so far this year.

Free Software 2.0.0.9 builds now also available…

… at ftp://ftp.mozilla.org/pub/firefox/releases/2.0.0.9/contrib/free-software/.

This special build of Firefox2.0.0.9 uses the exact same code cutoff time and cvs branch as the regular Firefox2.0.0.9 release, but was compiled with branding, logos and talkback removed.

As an aside, I didn’t know much about this special build until recently, hence there was no plan to include this in our build automation work. However, looking back on ftp.mozilla.org, I see quite a few of them, and asking around, it was done manually once the dust settled on a given Firefox release. We are now tracking automating these FreeSoftware builds in bug#385783, with some related cleanup in bug#402582.

Firefox 2.0.0.9 by the (wall-clock) numbers

Mozilla released Firefox 2.0.0.9 on Thursday 01-nov-2007, at 5.40pm PST.

From “do we need a release” to “release is now available to public” was 11 days 2 hours wall-clock time, of which the Beta period took 2.75 days, and Build&Release took 37 hours.

15:35 22oct: decide regressions introduced in FF2.0.0.8 justify producing a quick FF2.0.0.9 to address
12:30 25oct: Dev says “go”
14:40 25oct: 2.0.0.9rc1 builds started
20:00 25oct: linux builds handed to QA
22:00 25oct: mac builds handed to QA
01:00 26oct: win32 signed builds handed to QA
19:40 26oct: update snippets on betatest update channel
16:30 29oct: QA says “go” for Beta
16:50 29oct: update snippets on beta update channel
10:40 01nov: Dev & QA says “go” for Release; Build starts final signing, bouncer entries
14:15 01nov: final signing, bouncer entries done; mirror replication started
17:15 01nov: update snippets on live update channel; announced

While Build Automation in FF2.0.0.9 was much smoother than FF2.0.0.8, this was not yet a “human free” release:
1) The talkback server had been renamed after the FF2.0.0.8 release shipped and before FF2.0.0.9 started, so our first automation run timed out at the end of the build, waiting for humans to answer the RSA “are you sure you want to connect to this machine” login question?! 🙁 We didn’t detect this until the build overran the estimated completion time; after a quick fix, we were forced to rerun the entire build. This would have been caught if our nightlies were part of the same build automation (see bug#401936).
2) We still manually do signing, adding bouncer entries, starting mirror replication and monitoring mirror replication, pushing snippets to beta channel, pushing snippets to release channel. Combined, these took 6.5 hours of the Build time, and are worthy of automation attention. Pushing update snippets to betatest channel has been automated since the FF2.0.0.8 release.
3) Mirror absorption took 3 hours to reach 72-80%. The mac DMG files always straggle much lower than everything else for mirror absorption, apparently a known problem with how webservers handle that file type, but new details are emerging in bug#402141. Experiments continue, but every time we do a release, we always give thanks to morgamic for giving us the tools to measure with!
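
One way to stop an unattended run from silently waiting on a prompt like the one in note 1 is to make ssh fail fast instead of asking questions. The Python sketch below is an illustration only, not what our automation actually does: the helper name, host name and remote command are hypothetical. `BatchMode` and `ConnectTimeout` are standard OpenSSH client options; with `BatchMode=yes`, ssh disables user interaction such as host key confirmation requests and errors out instead.

```python
def batch_ssh_command(host: str, remote_command: str, timeout_s: int = 30) -> list[str]:
    """Build an ssh command line that never waits for a human."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # no prompts: fail instead of asking
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly if the host is unreachable
        host,
        remote_command,
    ]


# Hypothetical host and command, for illustration only.
cmd = batch_ssh_command("talkback.example.org", "true")
print(" ".join(cmd))
# An automation step could pass this list to subprocess.run(cmd, check=True)
# so an unknown host key surfaces as an immediate error rather than a hang.
```

The point of the design is that a renamed server shows up as a failure within seconds at the start of the step, not as a build that quietly overruns its estimate.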

take care
John

The Baby Owners Manual

Bought this book again recently, and thought it was finally time to post a review of it.

I first found this in a bookshop years ago, just when some engineer friends of mine had their first baby, so I bought it as an impulse joke gift for them. It was easy to read, informative, and entertaining. I’m an engineer, with no prior baby experience, as were my two newly-parented friends; obviously the author’s target audience.

The book itself was written by a father-and-son combination (a doctor and a parent) in the style of a computer manual – you know… the manual you never read… the manual which comes with your new PC… full of simplified diagrams, with bubbles and arrows, showing you how to plug in the printer? and troubleshooting techniques if the mouse doesn’t work?… well, this book is exactly that, except it’s all about how to pick up a baby, burp a baby, change a baby’s diaper (different instructions for boy and girl!), wrap a baby, simple medical issues, while sending you to your nearest Baby Service Provider for more complex problems.

They smiled politely when I gave them the book, but you could tell they thought I was a little nuts.

Weeks later, they each pulled me aside and confided that they learnt lots from the book, loved it and were busy recommending it to other parents. It had become their first book to reach for, exactly because of its quick-troubleshooting design, and they learnt lots of practical tips just browsing through. Wow, funny and really useful. That settled it. Over the years, it’s become a kinda tradition now for me to buy it for any engineer friends who are having their first baby. So, Monday night, I delivered a copy of this book, along with some other gifts, to a proud new parent at Mozilla. At this point, I’ve bought maybe a dozen copies, mostly through amazon, so who knows what that is doing to my own Amazon.com account profile! 🙂

The publishers must think it’s successful, because they have recently started a series of books in a similar vein: The Dog Owner’s Manual, The Cat Owner’s Manual, The Toddler Owner’s Manual, The Home Owner’s Manual, etc…