Infrastructure load for June 2010

Summary:

June 2010 logged 1,892 pushes, close to our previous record of 1,971 in January. Note that the June number *under*-reports TryServer usage, as we accidentally lost the TryServer usage logs for 01-10 June. We assert, without proof, that we would have easily set a new record if we had the missing 10 days of data for TryServer, our busiest branch. Even with 10 of 30 days missing, TryServer was still the busiest branch of the entire infrastructure, compared with full-month data for the other branches.

Overall load since Jan 2009

The numbers for this month are:

  • 1,892 code changes to our Mercurial-based repos, which triggered 234,387 jobs:
    • 35,308 build jobs, or ~49 jobs per hour.
    • 111,513 unittest jobs, or ~154 jobs per hour.
    • 87,566 Talos jobs, or ~121 jobs per hour.
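
The hourly rates above appear to be the monthly totals divided by the 720 hours in a 30-day June, rounded down. A minimal sketch to check that arithmetic, assuming that is how the rates were derived (the variable names are illustrative and not taken from any Mozilla tooling):

```python
# Job totals for June 2010, copied from the list above.
jobs = {"build": 35_308, "unittest": 111_513, "talos": 87_566}

# Assumption: rates are averaged over all 30 days x 24 hours of June.
HOURS_IN_JUNE = 30 * 24

# The three job types should add up to the reported overall total.
total_jobs = sum(jobs.values())

# Integer division rounds down, which matches the "~N per hour" figures.
per_hour = {kind: count // HOURS_IN_JUNE for kind, count in jobs.items()}

print(total_jobs)  # 234387
print(per_hour)    # {'build': 49, 'unittest': 154, 'talos': 121}
```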

Infrastructure load by branch

Details:

  • Losing logs for a third of the month for our busiest branch means we are under-reporting for June. Hopefully the work catlee/nthomas/anamarias are doing to automate these reports will be live soon, to prevent this from happening again.
  • Our unittest and Talos load remains high, like last month, and we expect it to jump further as more OSes are added to Talos.
  • We’re still double-running unittests for some OSes, running both unittest-on-builder and unittest-on-tester while developers and QA work through the issues. Whenever unittest-on-test-machine is live and green, we disable unittest-on-builders to reduce wait times for builds.
  • The trend of “what time of day is busiest” changed again this month. Not sure what this means, but worth pointing out that each month seems to be different. This makes finding a “good” time for a downtime almost impossible.
  • The entire series of these infrastructure load blogposts can be found here.
  • We are still not tracking any l10n repacks, nightly builds, release builds or any “idle-timer” builds.
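
The time-of-day comparison mentioned above depends on how pushes are bucketed by hour. As the comments below explain, all times are converted to PDT and each bar counts the pushes that landed within that clock hour. A rough sketch of that bucketing, assuming push times are available as timezone-aware UTC datetimes and treating PDT as a fixed UTC-7 offset (the function name and the sample timestamps are hypothetical, not real push data):

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: PDT is modeled as a fixed UTC-7 offset (no DST handling).
PDT = timezone(timedelta(hours=-7), "PDT")


def pushes_per_hour(push_times_utc):
    """Return 24 counts (index 0-23), one per PDT clock hour."""
    counts = Counter(t.astimezone(PDT).hour for t in push_times_utc)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical sample timestamps, for illustration only.
sample = [
    datetime(2010, 6, 15, 6, 30, tzinfo=timezone.utc),  # 23:30 PDT, 14 June
    datetime(2010, 6, 15, 6, 59, tzinfo=timezone.utc),  # 23:59 PDT, 14 June
    datetime(2010, 6, 15, 7, 0, tzinfo=timezone.utc),   # 00:00 PDT, 15 June
]
hist = pushes_per_hour(sample)
```

With this sample, the two pushes between 23:00 and 23:59 PDT land in the “23” bucket and the push at midnight lands in the “00” bucket.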

Detailed breakdown:
#Pushes this month

#Pushes per hour

Here’s how the math works out (descriptions of the build, unittest and performance jobs triggered by each individual push are here):
the math behind the graphs

[UPDATE: thanks to jhford for catching some copy-paste typos! joduinn 15-jul-2010]

4 thoughts on “Infrastructure load for June 2010”

  1. The pushes per hour graph doesn’t state which timezone it is in, and also has 00 after 23 instead of before 01 (unless the bars are for hours which /end/ at the given time?)

    1. hi Dan;

      1) Regardless of what timezone the contributor is in, I’ve converted all times into PDT (Mountain View time). I’ll explicitly add that to next month’s graph.

      2) Each bar shows the pushes within that hour. For example, changes landed between 23:00 and 23:59 (inclusive) are counted in the “23” bar.

      Let me know if I missed anything, and thanks for the questions.

      tc
      John.

      1. Thanks for the clarification John, I thought it would be PDT (which is UTC-7, right?)

        As for point 2, that makes sense, but then why does the graph start at 01 (meaning that it goes from 01:00 to 00:59 the next day) rather than 00 (00:00 to 23:59 of the same day)?
