Better display for “compute hours per checkin”

After my last post about our compute-load-per-checkin, I received an email that made me sit up and smile. Andershol had “a quick script” that quickly and easily displayed the same information in a grid format. Not just a suggestion – the actual code that ran, with real output. I found this format super helpful. We’ve refined this a few times now, and I think others would also find this useful, hence this post.

  • Each vertical column is the operating system used.
  • Each horizontal row is the job type (which build-type, which test-suite,…).
  • Each white cell is the elapsed time taken by that specific job on that specific operating system, so for example running “mochitest browser chrome” on linux 32bit opt build took 1h:53m:13s.

It is now easy to quickly see the total time spent on a given OS, by looking at the total in the gray column header (for example, Firefox desktop linux 32bit builds and tests took 21h:44m).

Similarly, it’s easy to see the total time spent on a given job (build/test), across all OS, by looking at the total in the gray row header (for example, running “mochitest browser chrome” took 4h:54m on opt, 13h:13m on debug, for a total of 18h:07m).
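
To make the row and column totals concrete, here is a minimal sketch of how such totals could be computed. This is not Andershol’s actual script; the function names (`parse_hm`, `format_hm`, `grid_totals`) and the cell layout are assumptions for illustration only, and for simplicity the two cells are the opt and debug subtotals quoted above rather than true per-OS cells.

```python
from collections import defaultdict


def parse_hm(text):
    """Parse a duration such as '4h:54m' into whole minutes."""
    hours, minutes = text.split(":")
    return int(hours.rstrip("h")) * 60 + int(minutes.rstrip("m"))


def format_hm(total_minutes):
    """Format whole minutes back into the 'Hh:MMm' style used in the grid."""
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h:{minutes:02d}m"


def grid_totals(cells):
    """Sum a grid of durations.

    cells maps (row_label, column_label) -> duration string.
    Returns (row_totals, column_totals), both in minutes.
    """
    row_totals = defaultdict(int)
    column_totals = defaultdict(int)
    for (row, column), duration in cells.items():
        minutes = parse_hm(duration)
        row_totals[row] += minutes
        column_totals[column] += minutes
    return dict(row_totals), dict(column_totals)


# Hypothetical two-cell grid built from the figures quoted in the post.
cells = {
    ("mochitest browser chrome", "opt"): "4h:54m",
    ("mochitest browser chrome", "debug"): "13h:13m",
}
rows, columns = grid_totals(cells)
print(format_hm(rows["mochitest browser chrome"]))  # 18h:07m
```

The row total matches the 18h:07m shown in the gray row header; a full grid would simply have more cells, keyed by job and operating system.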

The three major products (Firefox-for-desktop, Firefox-for-Android, FirefoxOS) are each shown in their own grid, but it’s worth noting that the jobs in *each* of the *three* grids are being processed per checkin. The combined total of all three grids is the overall compute load that RelEng is running per checkin.

This display format was super helpful to me, so big thanks to Andershol for making this a reality!

Also, it’s great to see no-longer-needed builds and testsuites being turned off… reducing load from 254 to 207 hours per checkin. Biggest highlights were turning off “talos dirtypaint” and “talos rafx” across all desktop OS, turning off all Android no-ionmonkey builds and tests, and turning off a range of Android armv6, armv7 builds and tests. At Mozilla’s volume-of-checkins, those savings quickly add up.
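
As a rough back-of-the-envelope illustration of how those savings add up, the sketch below works through the arithmetic. The function name `load_reduction` and the figure of 1,000 checkins are hypothetical, chosen only to show the scale; the 254 and 207 hours are the numbers quoted above.

```python
def load_reduction(before_hours, after_hours, checkins):
    """Return (hours saved per checkin, percent reduction, total hours saved)."""
    saved_per_checkin = before_hours - after_hours
    percent = 100.0 * saved_per_checkin / before_hours
    return saved_per_checkin, percent, saved_per_checkin * checkins


# 1000 checkins is an assumed volume for illustration, not a measured value.
saved, percent, total = load_reduction(254, 207, checkins=1000)
print(saved, round(percent, 1), total)  # 47 18.5 47000
```

In other words, each checkin now needs 47 fewer compute hours, a reduction of roughly 18.5%.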

Of course, if you notice anything else being run which you think is no longer needed, please file a bug and we’ll take care of it.

John.

ps: Andershol has posted the code to https://github.com/andershol/buildtasks; if you have ideas, or would like to suggest enhancements, he’s happily accepting patches!

“We are all remoties” in Haas, UCBerkeley

[UPDATE: The newest version of this presentation is here. joduinn 12feb2014, 09nov2014]

Last week, I had the distinct privilege of being invited back to present “We are all remoties” in UCBerkeley’s “New Manager Bootcamp” series at Haas.

The auditorium was packed with ~90 people, from a range of different companies and different industries. After my experiences at Mozilla Summit, I started by asking two specific questions:

1) How many of you are remote? (only ~5% of hands went up).
2) How many of you routinely work with people who are not in the same geographical location as yourself (100% of the hands went up!).

I found it interesting that few thought of themselves as “remotie”, yet all were working in geo-distributed teams.


This was similar to what came up during the “We are all remoties” sessions at MozillaSummit just a few days before, as well as at other previous “We are all remoties” sessions I’ve done elsewhere. Somehow, physically working in an office tricks some people into believing they don’t need to think of themselves as “remote”, and hence don’t think “We are all remoties” is relevant to them!?

People were fully engaged, asking tons of great questions right from the start, and were clearly excited by practical tips to working more effectively in distributed groups. The organizers planned ahead, and specifically put this session immediately before lunch, so that the Q+A could continue overtime… and a separate crowded room of 15-20 people continued the great back/forth over food.

After lunch, I was part of a 4-person panel, where the class got to set direction and ask all the questions – no holds barred. As the class, and the panelists, all came from different backgrounds, different cultures, different careers, it was no surprise that the Q+A uncovered different perspectives and attitudes. The class were agreeing/disagreeing with each other and with the panelists. We even had panelists asking each other questions?!?! As individual panelists, we didn’t always agree on the mechanics of what we did, but we all agreed on the motivations of *why* we did what we did: doing a good job, while also taking care of the lives and careers of the individuals, the group, and the overall organization.

The trust and honesty in the room was great, and it was quickly evident that everyone was down-to-earth, asking brutally honest questions simply because they wanted to do right with their new roles and responsibilities. Even while being on the spot with some awkward questions, I admired their sincere desire to do well in their new role, and to treat people well. It gave me hope, and I thank them all for that.

Big thanks to Homa and Kim for putting it all together. I found it a great experience, and the lively discussions during+after lead me to believe others did too.

John.
PS: For a PDF copy of the presentation, click on the smiley faces! For the sake of my poor blogsite, the much, much, larger keynote file is available on request.

“We are ALL remoties” at Mozilla Summit

[UPDATE: The newest version of this presentation is here. joduinn 12feb2014, 09nov2014]

Last weekend, during Mozilla Summit, “We are all Remoties” was held *4* times: Brussels (catlee), Toronto (Armen and Kadir) and Santa Clara (myself, twice!). Big props to Kadir for joining in with his data – it’s always great to meet others who are also thinking about how to best work together in a growing and geographically-distributed Mozilla.

I was happy to see that these different speakers, in different locations, all covered the session well, in their own personal style, and all had great responses and interactions. From all accounts, people really found this topic helpful, which is very nice to hear.

The one piece of feedback that did surprise me, from all these sessions, was that most of the people attending were already working remotely, yet very few people based in offices attended, even if their entire group was geo-distributed. The topics covered addressed people in offices too, and several times people who were remoties said to me that they wished their office-based co-workers had attended.

It’s possible that the title makes people think the session only applies to non-office-based people. One earlier title I had was “working effectively in geo-distributed teams”, but that sounded very PHB. Another title (“If you are a remotie, or if you are in an office, working with a remotie…”) was too long, but it brought me to the current title. If everyone who is on a geo-distributed team considered themselves all to be on the same level playing field, then “we are ALL remoties!”.

Spreading the word, including to more people in physical offices, is important to make everyone’s work life more effective. If you’ve any ideas/suggestions, please let me know. And thanks again for the great support in all four summit sessions!

John.

[For a PDF copy of the entire presentation, click here or on the smiley faces! For the sake of my poor blogsite, the much, much, larger keynote files are available on request.]

“Journey to the Heart of Aikido” by Linda Holiday Sensei and Motomichi Anno Sensei

(This post is unusual, in that I am “reviewing” a book before I have read the final print. Maybe “previewing” is more accurate?)

I’ve had the great fortune of repeatedly training on the mat with many world-class Aikido practitioners. Two of these, Linda Holiday Sensei (6th dan, runs Aikido of Santa Cruz dojo) and Motomichi Anno Sensei (8th dan, direct student of OSensei the founder of Aikido, recipient of Japan’s Distinguished Service Award, and ran the Kumano Juku Dojo in Shingu, Japan for ~40 years.) have just published a book they have been working on for literally *years*.

This is exciting.

Training with both of these authors has been pivotal for me, on and off the mat. Over the years, I’ve heard readings of various passages, and even been present for some interviews gathering source material. All random snippets, in various drafts, and out of sequence, which makes it hard to predict how the final form will pull together. What I’ve heard so far has been very meaningful to me, so I’m eager to get my hands on a signed 1st edition of this book on Saturday.

More info in the San Francisco Chronicle’s recent interview with Linda Holiday or the book’s official website. If you are interested, there’s a (free!) open-to-the-public book reading by Linda Holiday with live Aikido demonstrations in San Francisco this Saturday.

Oni gashi mas!

Infrastructure load for September 2013

  • September was special. Our previous record was to run 52,000 test jobs in a 24 hour day on 27aug… impressive by any standards. But in September, we blew past that record twice: we handled 66,456 test jobs on 11sep, and then we handled 73,453 test jobs in a 24 hour day on 17sep. Stunning, simply stunning.
  • #checkins-per-month: We had 7,580 checkins in September 2013. This is ~2% below last month’s record 7,771 checkins.

    Overall load since Jan 2009

  • #checkins-per-day: We hit 416 checkins on 03sep; impressive, yet still below our previous single-day record of 443 checkins on 26aug. During September, yet again all working days were over 200 checkins per day… In fact, if you exclude Friday 08sep and Monday 15sep when people were traveling for the b2g workweek, our weekday load throughout the month was 285 checkins per day, or higher. 19-of-30 days had over 250 checkins-per-day, 13-of-30 days had over 300 checkins-per-day. 2-of-30 days had over 400 checkins-per-day.
  • #checkins-per-hour: Checkins are still mostly mid-day PT/afternoon ET. For 8 of every 24 hours, we sustained over 12 checkins per hour. Our heaviest load time this month was 10am-11am PT, at 15.73 checkins-per-hour (a checkin every 3.8 min – a new record).
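
For anyone wanting to check the cadence figures above, the conversion from checkins-per-hour to minutes-between-checkins is a one-liner. The helper name `minutes_between_checkins` is my own, used only for this sketch.

```python
def minutes_between_checkins(checkins_per_hour):
    """Average gap, in minutes, between checkins at a given hourly rate."""
    return 60.0 / checkins_per_hour


peak_gap = minutes_between_checkins(15.73)       # busiest hour, 10am-11am PT
sustained_gap = minutes_between_checkins(12)     # the "over 12 per hour" level
print(round(peak_gap, 1), round(sustained_gap, 1))  # 3.8 5.0
```

So the sustained rate of 12 checkins per hour means one checkin every 5 minutes, and the peak hour tightens that to about one every 3.8 minutes.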

mozilla-inbound, b2g-inbound, fx-team:

  • mozilla-inbound continues to be heavily used as an integration branch. As developers start to use other *-inbound branches, use of mozilla-inbound, at 17.4% of all checkins, is still much reduced from typical, and only slightly higher than last month, which was the lowest ever usage of mozilla-inbound. The use of multiple *-inbounds is clearly helping relieve bottlenecks (see pie chart below) and the congestion on mozilla-inbound is being reduced significantly as people switch to using other *-inbound branches instead. This also reduces stress and backlog headaches on sheriffs, which is good. All very cool to see and a definite part of the reason we continue to hit new records this month.
  • b2g-inbound continues to be a great success, with 8.8% of this month’s checkins landing here, a slight increase over last month’s 8.2% and further evidence that use of this branch is stabilizing.
  • With sheriff coverage, fx-team is clearly a very active third place for developers, with 5.5% of checkins this month. This is a slight drop from last month, but use also appears to be stabilizing. Having sheriff coverage clearly made a difference.
  • The combined total of these 3 integration branches is 31.7%, which is fairly consistent. Put another way, sheriff moderated branches consistently handle approx 1/3 of all checkins.
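
The combined figure is just the sum of the three per-branch shares quoted in the bullets above; a quick sketch of that check (the dictionary and variable names are my own):

```python
# Percent of September checkins per sheriff-moderated branch, as quoted above.
shares = {
    "mozilla-inbound": 17.4,
    "b2g-inbound": 8.8,
    "fx-team": 5.5,
}

# Round to one decimal place to avoid floating-point noise in the sum.
combined = round(sum(shares.values()), 1)
print(combined)  # 31.7
```

At 31.7%, the three branches together sit just under one third (33.3%) of all checkins.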

    Infrastructure load by branch

mozilla-aurora, mozilla-beta, mozilla-b2g18, gaia-central:
Of our total monthly checkins:

  • 2.3% landed into mozilla-central, slightly higher than last month. As usual, very few people land directly on mozilla-central these days, when there are sheriff-assisted branches available instead.
  • 1.7% landed into mozilla-aurora, about the same as last month.
  • 0.7% landed into mozilla-beta, slightly lower than last month.
  • 0.3% landed into mozilla-b2g18, slightly lower than last month. This should quickly drop to zero as we move to gecko26.
  • Note: gaia-central, and all other gaia-* branches, are not counted here anymore. For details, see here.

misc other details:
As usual, our build pool handled the load well, with >95% of all builds consistently being started within 15mins. Our test pool is getting up to par and we’re seeing more test jobs being handled with better response times. The peak per-day test load for September was insane: our previous record was 52,000 test jobs on 27aug… which we blew right past when we handled 66,456 test jobs on 11sep, and then again when we handled 73,453 test jobs a week later on 17sep. Still more work to be done here, but very encouraging progress.

As always, if you know of any test suites that no longer need to be run per-checkin, please let us know so we can immediately reduce the load a little. Also, if you know of any test suites which are perma-orange, and hidden on tbpl.m.o, please let us know – that’s the worst of both worlds – using up scarce CPU time and not being displayed for people to make use of. We’ll make sure to file bugs to get tests fixed – or disabled – every little bit helps put scarce test CPU to better use.

Respect

Summit is coming.

Summit is exciting. With so many people scattered around the world, this gathering of Mozillians… this summit… is a rare chance for people to get together face-to-face.

Summit is scary and stressful. It is a total change in location and routine, which can be stressful. It forces everyone into a high-volume-of-contact… not anonymous contact like a crowded street in New York… high-volume-and-intense-contact with lots of people you work with, closely or intermittently, on a shared project that we all care about passionately. It’s exciting. It’s invigorating. It’s overwhelming. In the coming days, even extrovert people will need a quiet time or two… more introverted people doubly so. Add some small factors like: jet-lag, sleep deprivation, language barriers, change-of-routine, and it’s easy for people to get frayed at the edges.

With that context, I’d like to offer the following thoughts:

  • Respect of self (1): Despite all the great things going on, keep a mental track of how *you* are doing. If you are feeling stressed/overwhelmed with everything, take a few minutes to walk outside in the sunshine, read a book in your room, go for a jog in the sunshine, call family back home, go for a swim… everyone is different, so do whatever works for you. I’ve done this at every conference I attend over the years, and it really helps me recenter. It also lets me mentally process all the inputs so far, and gives me time to remind myself what is important that I still need to do when I go back in the crowd. After all, we’re all here to connect.
  • Respect of self (2): Don’t quietly put up with unacceptable behavior. If a conversation or a situation is making you uncomfortable, make a mental note of it, regardless of whether it’s directed at you, or something you observe/hear being directed at someone else. Politely say “I’m starting to feel uncomfortable“.
    It may not be intended, so this is a great way to give others a chance to quickly learn, self-correct and grow (without risking offense to either party). If that doesn’t fix things, politely excuse yourself with “That’s an interesting opinion, but I have to leave now” and disengage. Some people, at Mozilla and elsewhere, enjoy trolling… but keep in mind that you don’t have to feed the trolls if you don’t want to. Nicole’s presentation is just great, I re-watch it often. If you think the situation merits it, please do let any of the Mozilla Conductors or Site Hosts know.
  • Respect of others: Lively, honest, debate is a great way for smart people to quickly solve complex problems. When it works, it’s magic. True magic. And to be encouraged. Sometimes, however, these can spiral out-of-control. The difference, as far as I can tell, is respect. Don’t impose your thoughts/intentions where they are not welcome. To be clear, I’m not saying that people should stop having honest conversations, and suddenly be all super-politically-correct. Just be respectful. If you find yourself in a heated discussion with someone, and you’re not getting anywhere, try the following:
    • “Wait, wait, wait. We’re repeating ourselves here, and clearly not agreeing, so let’s take a pause and reset.”
    • Then wait a few seconds, and take a few deep breaths!
    • “OK, to reset context, can we assume that we both are professionals in our areas? Can we assume that we both want the best outcome for Mozilla? Agree?” (It is important to have these be asked, and answered, honestly and with “yes” from both! If you cannot even agree to this, you’ve got a different situation to resolve.)
    • Once you get a “yes”, then speaking calmly, ask “ok, so using different words, can you tell me why you care about xxxxxx? And I promise to not say *anything* until you tell me you’re finished. Then afterwards, we’ll switch, so I’ll speak without interruption, and you listen. But you first…“.
    • Listen. Take notes if it helps. Allow the other person time to pause and collect their thoughts without interruption. Literally no interrupting.
    • When they finally say they’re all done, then say “ok, here’s what I heard you say – is this correct?” and paraphrase it all back to them. Adjust for corrections and repeat if needed, but make sure to state the full end-to-end one last time after last corrections, so they clearly hear you say their entire opinion/concerns *once* perfectly, in one uncorrected pass.
    • Now, reverse roles. “ok, now it’s my turn to speak without interruption, while you listen“.
    • Make sure they can paraphrase back to you, accurately like you did for them.
    • Almost every time I do this, we instantly find that we were actually solving unrelated *different* problems… problems which just happened to overlap in one small area. No wonder we couldn’t agree! We were two smart professional people who were each actually solving very different problems. This tactic helped debug *which* problem we were each solving, and typically cleared things up right away.
  • Respect of Mozilla: I didn’t create Mozilla, but I’m super glad that Mitchell, Brendan and others did years ago. Imagine for a second… if this was an organization that you had created, and nurtured over the years, how would you want yourself, and everyone else, to treat each other? With that thought in mind, go out into the great crowd and engage.

Hopefully people find these thoughts helpful. Disclaimer, this is an area I’m still working on myself, so any feedback/suggestions/improvements are very very welcome… either here or in email or (yes!) in person!

Travel safe, see (some of) you soon, and let’s have a great Summit!

Respectfully
John.

ps: Some additional links I found helpful are: Bob Sutton’s No Asshole Rule and Laura Forrest’s “5 Hacks to make the most of Summit”, bsmedberg’s “Mozilla Summit: Listen Hard”… and yes, of course, I would be remiss to not include this great song: