HOWTO: Renewing a GPG key

I’ve just renewed the GPG keys that RelEng uses to sign builds in our release automation. The details are in bug#673281, but I thought cross-posting them here might help others. If you don’t care about GPG keys and signatures, skip this post.

0) Log in to the signing machine.

1) Verify you are in a clean working directory and have a good gpg install.
$ cd
$ mv ~/.gnupg ~/.gnupg.backup
$ mkdir ~/.gnupg
$ cd ~/.gnupg
$ gpg --version
gpg (GnuPG) 1.4.7
$
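
One caveat with the commands above: if ~/.gnupg.backup already exists, mv will move the directory inside it instead of replacing it, and gpg warns about unsafe permissions when the new home directory is accessible to group or others. Here is a minimal sketch of a more defensive variant. The function name fresh_gnupg_home and the timestamped backup suffix are my own illustration, not part of the procedure in the bug.

```shell
# Hypothetical helper: move an existing GnuPG home aside and create a fresh one.
# Usage: fresh_gnupg_home [dir]    (defaults to ~/.gnupg)
fresh_gnupg_home() {
    home_dir="${1:-$HOME/.gnupg}"
    if [ -e "$home_dir" ]; then
        # Timestamped name so an earlier backup is never touched.
        mv "$home_dir" "$home_dir.backup.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # Owner-only permissions, which is what gpg expects for its home directory.
    mkdir -p "$home_dir" && chmod 700 "$home_dir"
}
```

Running fresh_gnupg_home with no arguments should leave you in the same state as step 1, ready for gpg --version.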

2) Create new key, and two sub keys.
$ gpg --gen-key
gpg (GnuPG) 1.4.7; Copyright (C) 2006 Free Software Foundation, Inc.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. See the file COPYING for details.

gpg: keyring `/Users/john/.gnupg/secring.gpg' created
Please select what kind of key you want:
(1) DSA and Elgamal (default)
(2) DSA (sign only)
(5) RSA (sign only)
Your selection? 2
DSA keypair will have 1024 bits.
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 2y
Key expires at Sat Jul 20 20:06:32 2013 PDT
Is this correct? (y/N) y
You need a user ID to identify your key; the software constructs the user ID from the Real Name, Comment and Email Address in this form:
"Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"
Real name: Mozilla Software Releases
Email address: releases@mozilla.org
Comment:
You selected this USER-ID:
"Mozilla Software Releases <releases@mozilla.org>"
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
...
gpg: key 1797CA3D marked as ultimately trusted
public and secret key created and signed.
gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2013-07-21
pub 1024D/1797CA3D 2011-07-22 [expires: 2013-07-21]
Key fingerprint = C60B CDD2 9B91 A82F B837 A467 C0F5 550C 1797 CA3D
uid Mozilla Software Releases
Note that this key cannot be used for encryption. You may want to use
the command "--edit-key" to generate a subkey for this purpose.
Command>
Command> quit
$
$ gpg --list-keys
/Users/john/.gnupg/pubring.gpg
------------------------------
pub 1024D/1797CA3D 2011-07-22 [expires: 2013-07-21]
uid Mozilla Software Releases
$
$ echo "so far so good"
$
$ gpg --edit-key releases@mozilla.org
gpg (GnuPG) 1.4.7; Copyright (C) 2006 Free Software Foundation, Inc.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. See the file COPYING for details.
Secret key is available.
pub 1024D/1797CA3D created: 2011-07-22 expires: 2013-07-21 usage: SC
trust: ultimate validity: ultimate
[ultimate] (1). Mozilla Software Releases
Command>
Command>
Command> addkey
Key is protected.
You need a passphrase to unlock the secret key for
user: "Mozilla Software Releases <releases@mozilla.org>"
1024-bit DSA key, ID 1797CA3D, created 2011-07-22
Please select what kind of key you want:
(2) DSA (sign only)
(4) Elgamal (encrypt only)
(5) RSA (sign only)
(6) RSA (encrypt only)
Your selection? 2
DSA keypair will have 1024 bits.
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 2y
Key expires at Sat Jul 20 20:14:05 2013 PDT
Is this correct? (y/N) y
Really create? (y/N) y
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
.....
pub 1024D/1797CA3D created: 2011-07-22 expires: 2013-07-21 usage: SC
trust: ultimate validity: ultimate
sub 1024D/B7D648C4 created: 2011-07-22 expires: 2013-07-21 usage: S
[ultimate] (1). Mozilla Software Releases
Command>
Command>
Command> addkey
Key is protected.
You need a passphrase to unlock the secret key for
user: "Mozilla Software Releases <releases@mozilla.org>"
1024-bit DSA key, ID 1797CA3D, created 2011-07-22
Please select what kind of key you want:
(2) DSA (sign only)
(4) Elgamal (encrypt only)
(5) RSA (sign only)
(6) RSA (encrypt only)
Your selection? 4
ELG-E keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)
Requested keysize is 2048 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 2y
Key expires at Sat Jul 20 20:14:53 2013 PDT
Is this correct? (y/N) y
Really create? (y/N) y
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
...
pub 1024D/1797CA3D created: 2011-07-22 expires: 2013-07-21 usage: SC
trust: ultimate validity: ultimate
sub 1024D/B7D648C4 created: 2011-07-22 expires: 2013-07-21 usage: S
sub 2048g/46784661 created: 2011-07-22 expires: 2013-07-21 usage: E
[ultimate] (1). Mozilla Software Releases
Command>
Command> list
pub 1024D/1797CA3D created: 2011-07-22 expires: 2013-07-21 usage: SC
trust: ultimate validity: ultimate
sub 1024D/B7D648C4 created: 2011-07-22 expires: 2013-07-21 usage: S
sub 2048g/46784661 created: 2011-07-22 expires: 2013-07-21 usage: E
[ultimate] (1). Mozilla Software Releases
Command>
Command> quit
Save changes? (y/N) y
$
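
If you would rather not answer the prompts by hand, GnuPG also has an unattended mode (gpg --batch --gen-key) that reads the same choices from a parameter file. The sketch below only prints such a file. The function name keygen_params and its argument order are my own, it is not what I ran for this renewal, and it only covers the primary DSA key plus the Elgamal encryption subkey; the extra DSA signing subkey would still need the interactive addkey step shown above. It also has no Passphrase line, and how gpg treats that differs between versions, so check the unattended key generation notes for your gpg before relying on it.

```shell
# Sketch only: print a parameter file for GnuPG's unattended key generation.
# Usage: keygen_params "Real Name" email@example.org [expiry]
keygen_params() {
    real_name="$1"
    email="$2"
    expire="${3:-2y}"    # same 2-year lifetime as in the transcript
    cat <<EOF
Key-Type: DSA
Key-Length: 1024
Subkey-Type: ELG-E
Subkey-Length: 2048
Name-Real: $real_name
Name-Email: $email
Expire-Date: $expire
%commit
EOF
}

# Example (not run here):
#   keygen_params "Mozilla Software Releases" releases@mozilla.org > params.txt
#   gpg --batch --gen-key params.txt
```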

3) Create the public key file.
[snip]
Create a new text file “KEY” containing the following boilerplate text:

This file contains the PGP keys of various developers that work on
Mozilla and its subprojects (such as Firefox and Thunderbird).

Please don’t use these keys for email unless you have asked the owner
because some keys are only used for code signing.

Please realize that this file itself or the public key servers may be
compromised. You are encouraged to validate the authenticity of these keys in an out-of-band manner.

[snip]
3a) Append the following to “KEY” text file:
$ gpg --fingerprint --list-sigs releases@mozilla.org >> KEY
$ gpg --armor --export releases@mozilla.org >> KEY
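
Steps 3 and 3a can be rolled into one small script so the KEY file is rebuilt the same way every time. This is a sketch under my own naming: write_key_boilerplate, build_key_file and the GPG override variable are illustrative, while the gpg options are the ones from step 3a.

```shell
# Sketch: rebuild the KEY file from scratch.
# GPG can be overridden (for example GPG=echo for a dry run).
GPG="${GPG:-gpg}"

# Write the boilerplate header from step 3, replacing any existing file.
write_key_boilerplate() {
    cat > "$1" <<'EOF'
This file contains the PGP keys of various developers that work on
Mozilla and its subprojects (such as Firefox and Thunderbird).

Please don't use these keys for email unless you have asked the owner
because some keys are only used for code signing.

Please realize that this file itself or the public key servers may be
compromised. You are encouraged to validate the authenticity of these keys in an out-of-band manner.

EOF
}

# Usage: build_key_file KEY releases@mozilla.org
build_key_file() {
    key_file="$1"
    key_id="$2"
    write_key_boilerplate "$key_file" || return 1
    # Same two commands as step 3a, appended after the header.
    "$GPG" --fingerprint --list-sigs "$key_id" >> "$key_file" || return 1
    "$GPG" --armor --export "$key_id" >> "$key_file"
}
```

Because the header is written with > and the gpg output is appended with >>, running the script twice does not duplicate the key blocks the way repeating step 3a by hand would.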

4) Verify that the private key / public key pair works
4a) on signing machine:
*) create a small readme.txt file
*) $ gpg --armor --detach-sig readme.txt
*) transfer KEY, readme.txt, readme.txt.asc to another machine

4b) on another machine
$ gpg --import KEY
$ gpg --verify readme.txt.asc readme.txt
gpg: Signature made Thu Jul 21 22:08:21 2011 PDT using DSA key ID C52175E2
gpg: Good signature from "Mozilla Software Releases <releases@mozilla.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 9D03 193D 6BDC 541B D796 C4E4 7F4D 6645 1EBC AB3A
Subkey fingerprint: 247C A658 AA95 F617 1EB0 F13E A7D7 5CC7 C521 75E2
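
For scripting the check in step 4, it helps that gpg --verify reports the result through its exit status as well as in the text: a good signature exits 0, even when the "not certified with a trusted signature" warning is printed. A minimal sketch, where the function names and the GPG override variable are my own illustration:

```shell
# Sketch: wrap the sign and verify commands from step 4.
# GPG can be overridden, which also makes the wrappers easy to dry-run.
GPG="${GPG:-gpg}"

# Create an ASCII-armored detached signature (FILE.asc) next to FILE.
sign_file() {
    "$GPG" --armor --detach-sig "$1"
}

# Usage: check_signature readme.txt.asc readme.txt
# Returns 0 only if gpg accepts the signature for the file.
check_signature() {
    sig="$1"
    file="$2"
    if "$GPG" --verify "$sig" "$file"; then
        echo "OK: $file"
    else
        echo "FAILED: $file" >&2
        return 1
    fi
}
```

The exit status only says the signature matches a key in the local keyring. Confirming that the key really is ours still means comparing the fingerprint out-of-band, as the KEY boilerplate says.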


5) Post the template public keyfile “KEY” as a patch for review, and check it in.
This checked-in file will later be posted by the automation alongside the signed builds.


6) Post the template public keyfile to http://pgp.mit.edu, http://wwwkeys.pgp.net and other keyservers.

7) all done – declare victory!

Nightly builds on ftp.mozilla.org now use BuildID

On ftp.m.o, we now use the full BuildID in the directory names for nightly builds. For example:

Before: ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011-06-20-12-mozilla-central

After: ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011-07-13-03-07-41-mozilla-central/

This is actually more important than it might first appear, because it means our automation will avoid user-visible problems we’ve hit in the past. Occasionally, we get two nightly/clobber builds within an hour, and both builds were placed into the same dated directory. This caused problems for people who downloaded one build and then got updates expecting the other build, leaving them with a broken install.
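
To make the collision concrete, here is a small sketch of both naming schemes. It assumes a BuildID is the usual 14-digit YYYYMMDDHHMMSS timestamp and that the old scheme kept only year, month, day and hour, as the "Before" example suggests. The function names are mine and this is not the code coop landed.

```shell
# Sketch: derive nightly directory names from a BuildID (assumed YYYYMMDDHHMMSS).

# Succeeds only for exactly 14 digits.
is_buildid() {
    case "$1" in
        *[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -eq 14 ]
}

# Old scheme: year-month-day-hour, then the branch name.
old_nightly_dir() {
    is_buildid "$1" || { echo "invalid BuildID: $1" >&2; return 1; }
    stamp=$(printf '%s' "$1" | sed 's/^\(....\)\(..\)\(..\)\(..\).*$/\1-\2-\3-\4/')
    printf '%s-%s\n' "$stamp" "$2"
}

# New scheme: the full BuildID, down to the second, then the branch name.
new_nightly_dir() {
    is_buildid "$1" || { echo "invalid BuildID: $1" >&2; return 1; }
    stamp=$(printf '%s' "$1" | sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)$/\1-\2-\3-\4-\5-\6/')
    printf '%s-%s\n' "$stamp" "$2"
}

# Two builds started in the same hour:
#   old_nightly_dir 20110713030741 mozilla-central  -> 2011-07-13-03-mozilla-central
#   old_nightly_dir 20110713035900 mozilla-central  -> 2011-07-13-03-mozilla-central
#   new_nightly_dir 20110713030741 mozilla-central  -> 2011-07-13-03-07-41-mozilla-central
#   new_nightly_dir 20110713035900 mozilla-central  -> 2011-07-13-03-59-00-mozilla-central
```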

Further, we have lots of munging code that deals with different directory-name formats on ftp.m.o. Each time we fix a bug like this, it means we can then trim and refactor our automation code even further, making the remaining code cleaner, easier to maintain and more reliable.

There were a million-and-one little details to keep straight, in order to make sure nothing broke during this change. Most of these were surprise, undocumented dependencies – all fun to figure out and debug. Apart from a brief problem with updates, nothing broke when coop rolled this out. Nice work coop, thank you!

(If curious, there are lots more details in bug#449607, and in coop’s blogpost#1, blogpost#2.)

(Closing this also closes our 5th oldest bug – filed on 07aug2008 – which makes seeing coop complete this even sweeter! Thanks to coop for grabbing this bug from me, and driving it down.)

How many tegras can you fit into one Audi?

If that question doesn’t make sense, I should explain: behind the scenes, there are two big projects underway for Android:

1) coordinate getting the remaining orange tests fixed for Android.
These are tests that pass green on desktop, and for Maemo, but are failing on Android. Some are intermittently orange; some are perma-orange. This involves some RelEng fixes, but mostly involve coordinating work by ATeam, Developers and QA. The current list of bugs is here. We’d always love any help we can get fixing those tests!

2) increase the number of tegras we have in production.
Currently we own 96 tegras, of which 86 are online; the rest are physically broken or waiting for reimaging. Our reimage/reboot/status-tracking for these boards is going fairly well now, and we’ve been able to keep this many machines up for weeks.
Now that the imaging/production process is stable, we can focus on the next part.

There is a limit on how many jobs these slow boards can process in a day. That means developers wait a relatively long time for Android test results. To keep the wait times from being even worse, RelEng restricts which branches have Android tests enabled on them. Now that we have racks to put them in, we’re getting 200 more tegra boards made for us. They don’t have these in stock, so we’re receiving them in batches as they are made. Once we get these into production, and we can enable Android tests across the board, we’ll have a better idea of what our real load profile is, and can order more if needed.

To speed up delivery, I physically drove with jhford over to nvidia to collect this first batch of boards.

(Aside: After all the formal emails, and purchasing paperwork, it was great to chat with the guys in shipping, who didn’t know what to make of me, but were really really helpful. I love directions that end with “… ok, so then drive past the rolldown doors, park between the two dumpsters, and knock on the unmarked door”. 30mins later, there’s a group of us outside playing real-life Tetris to get all the boxes to fit in the car. It worked! Thanks Genaro!!)

This first batch proved that you can fit 40 tegras, as well as two Release Engineers, into an Audi, and still have room to see out the windows for the drive back to Mozilla!! 🙂

Behind-the-scenes mechanics of Firefox5.0 / Fennec5.0

By now everyone knows that Firefox5.0 shipped, for desktop and for mobile, on 21June2011. That has already been covered elsewhere in great detail. However, now that the dust has settled, there are some behind-the-scenes details that I felt were important to draw attention to.

1) The Firefox5.0 and Fennec5.0 releases were both based on the *same* identical changeset.
That wasn’t a coincidence – since 5.0beta2, every beta leading up to the Firefox5.0 release and Fennec 5.0 release was built from the same *identical* changeset. This was a major new milestone, made possible by streamlined infrastructures, and has important consequences for Mozilla’s ability to quickly find/fix/deliver security releases to protect our users.

2) We shipped Firefox5.0, Fennec5.0 and Firefox 3.6.18 all on the same day.
Shipping a major “new feature” release is tricky business – there are a lot of fiddly details, and it’s typically “all hands on deck”. Because of this, we used to make sure *nothing* else was scheduled anywhere near a major release day. However, shipping a major release and announcing the security fixes in it, without also shipping a fix for older branches, can be seen as a mixed message to our users on the older branches. Simultaneously shipping the same security fixes in security releases for the older supported branches is what’s best for our entire set of users. However, this is tricky to do, and requires a lot of extra planning.

The first time we felt organized enough to safely ship a major “new feature” release together with security releases for the older branches was when we shipped Firefox4.0 and Firefox3.6.x and Firefox3.5.x on the one day. It was a very long, hectic 14 hour day (from 7am – ~9pm PDT), but it meant we could protect all Firefox users at the same time from a late breaking security exploit. Fennec4.0 had to wait and ship a week later. By contrast, when we shipped Firefox5.0, Fennec5.0 and Firefox3.6.18, we were all done in a calm orderly 4 hours (from 7am – ~11am PDT).

3) In the days leading up to the Firefox5.0/Fennec5.0/3.6.18 release, we did nine last-minute betas/releases in quick succession.
This fast turnaround was only possible because RelEng’s ongoing automation improvements have reduced our build times from 45 hours down to 8 hours. It was only because of this fast turnaround that Mozilla could accommodate some last minute fixes without impacting the release schedule. I feel it’s important to point out that, while there were tight deadlines, and a lot of very precise fast moving footwork, this was all done without burning out the humans in RelEng working those releases.

As Gary pointed out in a brief celebration speech Tuesday, it’s a big deal to change Mozilla’s development culture from “ship one big bang product when it’s ready” to “ship lots of smaller incremental products much more frequently”. Doing that transition in such a short timeframe is super impressive, and I believe it was made possible by the infrastructure work from RelEng: switching from the one-track model (trunk in cvs) to the multiple-concurrent-tracks model (currently ~35 active project branches in mercurial); stabilizing the new mobile product infrastructure, and integrating it with the stable desktop product automation; relentlessly improving our automation. “baby steps… relentless baby steps”. The list goes on and on…

I’m immensely proud of the quiet behind-the-scenes work that RelEng has done over the last 4 years to make this faster-release-cadence environment possible here at Mozilla. Thank you aki, armen, bear, bhearsum, catlee, coop, dustin, jhford, lsblakk, rail, nthomas.