Talos recalibration (status 18jan2010)

If you don’t care about Talos performance results, or Talos hardware, stop reading now!

It’s been a hectic couple of weeks on the Talos front since my last post. Here’s a quick summary of what’s been going on:

In RelEng, we’re using this recalibration as a chance to clean up a few long-standing details of how Talos slaves are configured. These changes to the Talos ref images include:

  • setting up these new machines in the Build network, with the rest of the build machines, not in the QA network, where Talos has been running since its inception. This will allow us to clean up some VPN and firewall configurations within the colo.
  • changing accounts on Talos machines to be consistent with all the other build machines
  • making sure configuration management software is installed.
  • bumping up OS versions, which have intentionally been left unchanged since 2007!:
    • OSX10.4: we’re leaving this on the old machines for now. At first glance, 10.4 doesn’t work on the new minis, and it is already de-supported on mozilla-central, so we might end up just leaving it there. More on this as it emerges.

    • OSX10.5: upgraded from 10.5.2 to 10.5.8.
    • WinXP: no OS change
    • WinVista: replaced with Win7
    • ubuntu7: replaced with fedora12 (32bit)
    • ubuntu 64bit: replaced with fedora12 (64bit)

In IT:

  • The first batch of 100 minis arrived just before New Year’s. Last week, Matthew and Phong (with a little help from aki, jhford and myself) spent a day unboxing, removing wrappers, sorting power cables, putting on asset tags, scanning serial numbers, etc, etc, etc. Scroll down for photos!

  • the racks were delivered, installed and cabled. Power was also upgraded and air conditioning prepped by the middle of last week.
  • as ref images were completed and tested in staging, they were used to image a set of minis.

As of Friday evening, 60 of the new slaves are imaged using the new OSX10.5.8, WinXP and Win7 reference images, racked and powered.

In the coming week, we’ll:

  • have the remaining 40 imaged with linux32 and linux64.

  • schedule a downtime to have all these new slaves enabled in production, alongside the existing production slaves.

    If all goes well, after about 2 weeks, we’ll take the old systems out of production, and declare that first phase done. Stay tuned for more details.

    (If you’ve read this far, and have questions about anything that I’ve missed, please let me know.)

6 thoughts on “Talos recalibration (status 18jan2010)”

  1. I’d love to know more about the motivations for the switch to Fedora, rather than current Ubuntu. My deep-seated hatred for RPM-based distros aside, it seems that this complicates the recalibration. Will you be running Fedora on the old hardware to isolate the effects of the OS switch from the hardware upgrade?

  2. I am also curious how the racking/imaging process works right now.

    Seems like you could have them delivered directly to the colo and rack them up, and create dhcp/bootp entries for each machine (MAC) address (assuming they are netbootable, looks like it from a quick search). The boot, imaging and initial boot could be automatic on first power-on.

    I guess the recording of the MAC address (and associating that with a serial # etc.) would be manual, unless Apple sent this (ideally in a machine-readable format ;)).

    I could see the advantage to having a decent workspace like an office if you need to do a lot of work for cabling, making them easily rackable, etc. Doing large server deliveries directly to the colo and doing the rest on boot and from remote can be a nice division of labor, if it’s possible.
