The Writing on the Wall

Another round of construction started here recently in the Mountain View office. They’re trying really hard to keep the dust and disruption to a minimum, so they hung plastic sheeting over doorways and taped plastic over the carpets in the corridors – it’s even inside the elevators.

It’s funny how quickly you can get used to working in what feels like the set of a bad SciFi movie! However, while swiping my ID card on the way back to my desk, the following made me stop, double-take and then start carefully looking around me.

“Demo: Not now”

To be clear, in this context:

demo != demonstration
demo == demolition

Turns out, the entire corridor I was standing in was going to disappear… just not now.

linux64: now with extra builds and talos!

Some of you may have noticed this new item on this menu on GraphServer.

There’s been a lot of work with linux64 over the last few weeks behind the scenes.

1) There are now nightly and per-checkin builds available for mozilla-central, mozilla-192, mozilla-191, tracemonkey and electrolysis. Because we only have 10 linux64 build slaves, we don’t have builders on Places, TryServer or the cvs-based mozilla-190/Firefox3.0.

2) We’ve got a pool of linux64 talos slaves running all the usual Talos suites, per build, on those same branches. You can now see those results on graphs.mozilla.org, listed just like any other OS. Just like it should be. 🙂

3) Caveats:

  • For the sake of speed, we’ve cloned the *one* preexisting linux64 machine (which dbaron? set up), without generating a clean, new refimage with a fully identified toolchain. If you see any toolchain problems, please let us know, but as it’s identical to what’s been in place before, hopefully it will continue to be good enough for now.
  • Unittests are not yet being run on linux64. This is being worked on as part of a bigger problem: unittests used to require doing a build first. This in turn meant we could only run unittests on platforms that we supported for builds, so we don’t have unittests on 10.4, xp, vista, etc. More on this as it develops, but it’s not complete yet.
  • We’re still working out some TBPL display updates to get linux64 showing up on TBPL. For now, you must use Tinderbox waterfall to see the linux64 builds. (The curious can follow bug#532560.)

Spinning up this new OS took work from most people in the group, and is the first new desktop platform we’ve supported in years. Very very cool work and a great way to end the week. Enjoy!

Another Major Update from FF2.0.0.20->FF3.0.15

Last week, we offered Firefox 2 (yes 2!) users a Major Update offer to Firefox 3.0.15. This was despite our official End Of Life for Firefox 2 way back in December 2008.

While most attention is naturally focused on new releases, and on new security releases, 5.3% of our users were still using Firefox 2. Those users were not getting new fixes and features; even worse, they were all using versions of the browser that had known, published exploits – exploits that were already fixed in later supported releases of Firefox.

The previous major update offer was intentionally left available, so any FF2 user who did manual CheckForUpdates would get upgraded to FF3.0.6. However, few did. As most of these Firefox2 users were on FF2.0.0.20, they were obviously willing and able to upgrade when security releases prompted them to. It seemed worth the effort to prompt them again, with a new Major Update offer, and see how many would upgrade.

In the first 7 days after publishing those new major update snippets, 16% of FF2 users have upgraded. It’s a slower rate of upgrading than we get for normal security releases. However, it’s still a significant amount, and it’s great to see those users get back onto supported, more secure releases. I’ll continue to monitor uptake, and keep you posted.

(ps: It was really cool that nthomas and abillings were able to find the time to squeeze yet another release into the schedule in the midst of all the releases for FF3.0, FF3.5, FF3.6 beta/RCs and Fennec beta/RCs. To keep this work quick and safe, we did a FF2->FF3.0 MU offer, rather than attempting FF2.0->FF3.5, which would require a bigger testing cycle, details in . On behalf of those users who are only now discovering the Awesome Bar, our faster performance and all the new JIT work, I thank you both!!)

Firefox 3.5.5 by the (wall-clock) numbers

Firefox3.5.5 was released on Thursday 05-nov-2009, at 16:00PST.

This was our fastest turnaround on a release. By far.

From “Dev says go” to “release is now available to public” was approx 3 days (3d 4h 45m) wall-clock time. Release Engineering took 13-16 hours. By comparison, the next fastest release turnarounds were FF3.5.3 (~37 hours) and FF2.0.0.9 (~37 hours).

11:13 02nov: Dev says “go” for FF3.5.5
13:06 02nov: FF3.5.5 builds started
17:05 02nov: FF3.5.5 linux, mac builds handed to QA
20:03 02nov: FF3.5.5 signed-win32 builds handed to QA
00:28 03nov: FF3.5.5 update snippets available on test update channel
22:00 04nov: Dev & QA say “go” for Beta, and for Release; Build already completed final signing, bouncer entries
07:30 05nov: mirror replication started
10:55 05nov: mirror absorption good enough for testing
14:40 05nov: website changes finalized and visible. Build given “go” to make updates snippets live.
14:51 05nov: update snippets available on live update channel
16:00 05nov: release announced
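
For anyone who wants to check the gaps between timeline entries, here is a minimal sketch. It assumes every stamp falls in November 2009 (true for the timeline above); the helper names `at` and `hours_between` are made up for illustration and are not part of any release tooling.

```python
from datetime import datetime


def at(stamp: str) -> datetime:
    """Parse a timeline stamp such as '22:00 04nov'.

    Assumption: every stamp is in November 2009, as in the timeline above.
    """
    clock, day = stamp.split()
    hour, minute = map(int, clock.split(":"))
    return datetime(2009, 11, int(day[:2]), hour, minute)


def hours_between(start: str, end: str) -> float:
    """Elapsed wall-clock hours between two timeline stamps."""
    return (at(end) - at(start)).total_seconds() / 3600


# "go" for Release until mirror replication started
wait_for_mirrors = hours_between("22:00 04nov", "07:30 05nov")
# mirror replication started until absorption was good enough for testing
mirror_fill = hours_between("07:30 05nov", "10:55 05nov")

print(f"go to mirror push started: {wait_for_mirrors:.1f} hours")
print(f"mirror push to testable:   {mirror_fill:.1f} hours")
```

The same two helpers work for any other pair of entries in the timeline.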

Notes:

1) As we continue streamlining this process, the long pole is now communication between the groups, and also how the website release notes are assembled and published. For this release, there were 9.5 hours of waiting between “go to mirrors” and “mirror push started”. Most of Thursday was spent updating release notes on websites. Meanwhile, we populated the mirrors, which takes ~3.5 hours of watching mirrors, but only took two brief commands on our part.

3) Our blow-by-blow scribbles are public, so the curious can read all about it here. Those Build Notes also link to our tracking bug#525814.

This super-super fast release turnaround was handled calmly and efficiently. It was a real credit to the team to see how well everyone worked together on this, including smooth handoffs back-and-forth across timezones so everyone still had a life! 🙂

take care
John.

Infrastructure load for October 2009

Summary:

  • The numbers for this month are:
    • 1,692 code changes to our mercurial-based repos, which triggered:
    • 20,887 build jobs, or ~28 jobs per hour.
    • 46,219 unittest jobs, or ~62 jobs per hour.
    • 42,873 talos jobs, or ~57 talos jobs per hour.
  • This was our 2nd busiest recorded month, only slightly below last month’s record high.
  • The busiest day was 6th October, with 102 checkins. For comparison, this high level was only exceeded on 22nd Sept (116 checkins) and 20th May (108 checkins).
  • The number of unittest and performance jobs run has increased significantly. This is because a) we added new suites, b) we enabled existing suites on more branches and c) we split some suites into smaller, quicker, self-contained suites.
  • We are still not tracking any l10n repacks, nightly builds, release builds or any “idle-timer” builds.

Here’s how it looks compared to the year so far:

Detailed breakdown:

Here’s how the math works out:

The types of build, unittest and performance jobs triggered by each individual push are best described here.

Where does all the (compute) time go?

Every time someone does a checkin, we do a whole bunch of builds, unittests and performance runs. Sure. But did you know we run about 40 hours’ worth on desktops, with an additional 25 hours on mobile?

Chris AtLee put together a complete list here, listing out what jobs are run. It’s much easier to read than anything I’ve tried in the past, and well worth a quick read.

It’s hard to grasp the scale of all this by looking at Tinderbox waterfall, but mstange’s TinderBoxPushLog does a *great* job of showing what happens with every checkin. After you read Chris’s blogpost, all those cryptic codes on the right hand side of TinderBoxPushLog will make much more sense!

IsTheBayBridgeOpen.com ?

isthetreegreen.com was a quick-and-easy way for Mozilla developers to tell the state of the tree with a simple “Yes/No/Maybe”. There are lots of more detailed dashboards, but this site distilled all that info down to a single word.

isthebaybridgeopen.com is looking to do the same for local commuters. Even the font looks the same! I love how people in this area cope with big setbacks like this. 🙂

(For those of you not familiar with local Bay Area news, a bridge here is closed for emergency repairs. It’s used by 280,000+ cars per day, so this is messing with local commute traffic patterns in a big way. More details here.)

Why was September 2009 so busy?

September was our highest load in the entire year so far, ~37% above our previous record high point this year. It’s interesting that this seems to be across almost all branches and even exceeds the load during the last few months of the FF3.5 development cycle?!?

This wildly exceeds our expected load. Personally, I’m impressed our systems stayed up and working correctly, even if they sometimes got backlogged. Last month’s data has us trying to figure out answers to two questions:

  • What was so special about September to cause this?
  • What will October look like?

World of Goo (on wii)

My first impressions of World of Goo still hold true. It’s a great game. The soundtrack is so good, the company made it available as a separate download. All just wonderful.

Now, to make it even better, there is:

  • collaborative multiplayer
  • on wii
  • installed in the Mozilla office in MountainView

I’d intentionally stayed with the demos before now, but I finally cracked after playing the collaborative multiplayer version while in Ireland! So, I bought a copy for the office, and downloaded it to the wii. Try it – you’ll like it!

(Aside: please don’t use the first profile tagged within the game; that’s being shared by Aki, JHFord and myself.)

Infrastructure load for September 2009

Summary:

  • The numbers for this month are:
    • 1,774 code changes to our mercurial-based repos, which triggered:
    • 19,525 build/unittest jobs, or ~27 jobs per hour.
    • 9,375 talos jobs, or ~13 talos jobs per hour.
  • This was by far the most load since we started recording these numbers in Dec2008. This is a 37% jump above our previous record, and is the 3rd month in a row with record checkins.
  • We hit 116 checkins on 22nd Sept; new record for number of checkins on a single day. This beats our previous record of 108 checkins on 20th May, in the leadup to FF3.5 release.
  • We are still not tracking any l10n repacks, nightly builds, release builds or any “idle-timer” builds.
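
The per-hour figures in the summary appear to be plain averages over every hour of the month. A minimal sketch of that arithmetic follows; the averaging method is an assumption on my part (it reproduces the numbers above), and `jobs_per_hour` is a made-up helper name.

```python
# Assumption: rates are averaged over all 30 days of September, 24 hours a day.
HOURS_IN_SEPTEMBER = 30 * 24  # 720 hours


def jobs_per_hour(total_jobs: int, hours: int = HOURS_IN_SEPTEMBER) -> float:
    """Average number of jobs per wall-clock hour over the month."""
    return total_jobs / hours


build_unittest_rate = jobs_per_hour(19_525)  # build/unittest jobs from the summary
talos_rate = jobs_per_hour(9_375)            # talos jobs from the summary

print(f"build/unittest: ~{build_unittest_rate:.0f} jobs per hour")
print(f"talos:          ~{talos_rate:.0f} jobs per hour")
```

For a 31-day month, pass `hours=31 * 24` instead.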

Details:

Here’s how the math works out:

The builds/unittest/talos jobs triggered by each individual push are:

  • mozilla-central: 13 jobs per push (L/M/W opt, L/M/W leaktest, L/M/W unittest, linux64 opt, linux-arm, WinCE, WinMo) and 6 talos jobs
  • mozilla-1.9.1: 12 jobs per push (L/M/W opt, L/M/W leaktest, L/M/W unittest, linux64 opt, linux-arm, WinMo) and 5 talos jobs
  • mozilla-1.9.2: 13 jobs per push (L/M/W opt, L/M/W leaktest, L/M/W unittest, linux64 opt, linux-arm, WinCE, WinMo) and 5 talos jobs
  • electrolysis: 12 jobs per push (L/M/W opt, L/M/W leaktest, L/M/W unittest, linux64 opt, linux-arm, WinMo) and no talos.
  • mobile-browser: 5 jobs per push (WinMo m-c, linux-arm m-c, Fennec linux desktop, linux-arm tracemonkey, WinMo electrolysis) and 2 talos jobs.
  • places: 12 jobs per push (L/M/W opt, L/M/W leaktest, L/M/W unittest, linux64 opt, linux-arm) and 6 talos jobs.
  • tracemonkey: 10 jobs per push (L/M/W opt, L/M/W leaktest, L/M/W unittest, linux-arm) and 6 talos jobs.
  • try: 9 jobs per push (L/M/W opt, L/M/W unittest, linux-arm, WinCE, WinMo) and 6 talos jobs.
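
To get a feel for how these per-push numbers add up, here is a rough sketch that multiplies them out. The per-push counts are copied from the list above; the push counts in `example_pushes` are invented purely for illustration, and `estimate_jobs` is a hypothetical helper, not something our systems actually run.

```python
# (build/unittest jobs, talos jobs) triggered by one push, per the list above.
JOBS_PER_PUSH = {
    "mozilla-central": (13, 6),
    "mozilla-1.9.1": (12, 5),
    "mozilla-1.9.2": (13, 5),
    "electrolysis": (12, 0),
    "mobile-browser": (5, 2),
    "places": (12, 6),
    "tracemonkey": (10, 6),
    "try": (9, 6),
}


def estimate_jobs(pushes_by_branch: dict[str, int]) -> tuple[int, int]:
    """Return (build/unittest jobs, talos jobs) for the given push counts."""
    builds = sum(JOBS_PER_PUSH[branch][0] * n for branch, n in pushes_by_branch.items())
    talos = sum(JOBS_PER_PUSH[branch][1] * n for branch, n in pushes_by_branch.items())
    return builds, talos


# Hypothetical day: 10 pushes to mozilla-central and 4 to try.
example_pushes = {"mozilla-central": 10, "try": 4}
builds, talos = estimate_jobs(example_pushes)
print(f"{builds} build/unittest jobs, {talos} talos jobs")
```

Even a modest day like that made-up one triggers well over a hundred jobs, which is why a record checkin month shows up so clearly in the load numbers.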