Automated test harnesses for Firefox, zooming out a bit

In December I was going through some of the steps to be able to change, fix, and interpret the results of a small subset of the gazillion automated tests that Mozilla runs on Firefox builds. My last post arrived at the point of being able to run mochitests locally on my laptop on the latest Firefox code. Over the holidays I dug further into the underlying situation and broaded my perspective. I wanted to make sure that I didn’t end up grubbing away at something that no one cared about or wasn’t missing the big picture.

My question was, how can I, or QA in general, understand where Firefox is with e10s (Electrolysis, the code name for the multiprocess Firefox project). Can I answer the question, are we ready to have e10s turned on by default in Firefox — for the Developer Edition (formerly known as Aurora), for Beta, and for a new release? What criteria are we judging by? Complicated. And given those things, how can we help move the project along and ensure good quality; Firefox that works as well as or, we hope, better than, Firefox with e10s not enabled? My coworker Juan and I boiled it down to basically, stability (lowering the crash rate) and automated test coverage. To answer my bigger questions about how to improve automated test coverage I had to kind of zoom out, and look from another angle. For Firefox developers a lot of what I am about to describe is basic knowledge. It seems worth explaining, since it took me significant time to figure out.

First of all let’s look at treeherder. Treeherder is kind of the new TBPL, which is the old new Tinderbox. It lets us monitor the current state of the code repositories and the tests that run against them. It is a window into Mozilla’s continuous integration setup. A battery of tests are poised to run against Firefox builds on many different platforms. Have a look!

Current view of mozilla-central on treeherder

Treeherder2

Edward Tufte would have a cow. Luckily, this is not for Tufte to enjoy. And I love it. You can just keep digging around in there, and it will keep telling you things. What a weird, complicated gold mine.

Digression! When I first started working for Mozilla I went to the Automation and Tools team work week where they all came up with Treeherder. We wanted to name it something about Ents, because it is about the Tree(s) of the code repos. I explained the whole thing to my son, I think from the work week, which awesomely was in London. He was 11 or 12 at the time and he suggested the name “Yggdrazilla” keeping the -zilla theme and in reference to Yggdrasil, the World-Tree from Norse mythology. My son is pretty awesome. We had to reject that name because we can’t have more -zilla names and also no one would be able to spell Yggdrasil. Alas! So, anyway, treeherder.

The left side of the screen describes the latest batch of commits that were merged into mozilla-central. (The “tree” of code that is used to build Nightly.) On the right, there are a lot of operating systems/platforms listed. Linux opt (optimized version of Firefox for release) is at the top, along with a string of letters and numbers which we hope are green. Those letter and numbers represent batches of tests. You can hover over them to see a description. The tests marked M (1 2 3 etc) are mochitests, bc1 is mochitest-browser-chrome, dt are developer tools tests, and so on. For linux-opt you can see that there are some batches of tests with e10s in the name. We need the mochitest-plain tests to run on Firefox if it has e10s not enabled, or enabled. So the tests are duplicated, possibly changed to work under e10s, and renamed. We have M(1 2 3 . . .) tests, and also M-e10s(1 2 3 . . . ). Tests that are green are all passing. Orange means they aren’t passing. I am not quite sure what red (busted) means (bustage in the tree! red alert!) but let’s just worry about orange. (If you want to read more about the war on orange and what all this means, read Let’s have more green trees from Vaibhav’s blog. )

I kept asking, in order to figure out what needed doing that I could usefully do within the scope of As Soon As Possible, “So, what controls what tests are in which buckets? How do I know how many there are and what they are? Where are they in the codebase? How can I turn them on and off in a way that doesn’t break everything, or breaks it productively?” Good questions. Therefore the answers are long.

There are many other branches of the code other than mozilla-central. Holly is a branch where the builds for all the platforms have e10s enabled. (Many of these repos or branches or twigs or whatever, are named after different kinds of tree.) The tests are the standard set of tests, not particularly tweaked to allow for e10s. We can see what is succeeded and failing on treeherder’s view of holly. A lot of tests are orange on holly! Have a look at holly by clicking through on the link above. Here is a picture of the current state of holly.

Treeherder holly1

If you click a batch of tests where there are some failures — an orange one — then a new panel will open up in treeherder! I will pick a juicy looking one. Right now, for MacOS 10.6 opt, M(2) is orange. Clicking it gives me a ton of info. Scrolling down a bit in the bottom left panel tells me this:

mochitest-plain-chunked 164546/12/14136

The first number is how many tests ran. Scary. Really? 164546 tests ran? Kind of. This is counting assertions, in other words, “is” statements from SimpleTest. The first number lists how many assertions passed. The second number is for assertion failures and the third is for “todo” statements.

The batches of tests running on holly are all running against an e10s build of Firefox. Anything that’s consistently green on holly, we can move over to mozilla-central by making some changes in mozharness. I asked a few people how to do this, and Jim Matthies helpfully pointed me at a past example in Bug 1061014. I figured I could make a stab at adding some of the newly passing tests in Bug 1122901.

As I looked at how to do this, I realized I needed commit access level 2 so I filed a bug to ask for that. And, I also tried merging mozilla-central to holly. That was ridiculously exciting though I hadn’t fixed anything yet. I was just bringing the branch up to date. On my first try doing this, I immediately got a ping on IRC from one of the sheriffs (who do merges and watch the state of the “tree” or code repository) asking me why I had done something irritating and wrong. It took us a bit to figure out what had happened. When I set up mercurial, the setup process and docs told me to install a bunch of Mozilla specific hg extensions. So, I had an extension set up to post to bugzilla every time I updated something. Since I was merging several weeks of one branch to another this touched hundreds of bugs, sending bugmail to untold numbers of people. Mercifully, Bugzilla cut this off after some limit was reached. It was so embarrassing I could feel myself turning beet red as I thought of how many people just saw my mistake and wondered what the heck I was doing. And yet just had to forge onwards, fix my config file, and try it again. Super nicely, Clint Talbert told me that the first time he tried pushing some change he broke all branches of every product and had no idea what had happened. Little did he know I would blog about his sad story to make myself feel less silly….. That was years ago and I think Mozilla was still using cvs at that point! I merged mozilla-central to holly again, did not break anything this time, and watched the tests run and gradually appear across the screen. Very cool.

I also ended up realizing that the changes to mozharness to turn these batches of tests on again were not super obvious and to figure it out I needed to read a 2000-line configuration file which has somewhat byzantine logic. I’m not judging it, it is clearly something that has grown organically over time and someone else probably in release engineering is an expert on it and can tweak it casually to do whatever is needed.

Back to our story. For a batch of tests on holly that have failures and thus are showing up on treeherder as orange, it should be possible to go through the logs for the failing tests, figure out how to turn them off with some skip-if statements, filing bugs for each skipped failing test. Then, keep doing that till a batch of tests is green and it is ready to be moved over.

That seems like a reasonable plan for improving the automated test landscape, which should help developers to know that their code works in Firefox whether e10s is enabled or not. In effect, having the tests should mean that many problems are prevented from ever becoming bugs. The effect of this test coverage is hard to measure. How do you prove something didn’t happen? Perhaps by looking at which e10s tests fail on pushes to the try server. Another issue here is that there are quite a lot of tests that have been around for many years. It is hard too know how many of them are useful, whether there are a lot of redundant tests, in short whether there is a lot of cruft and there probably is. With a bit more experience in the code and fixing and writing tests it would be easier to judge the usefulness of these tests.

I can also see that, despite this taking a while to figure out (and to even begin to explain) it is a good entry point to contributing to Firefox. It has more or less finite boundaries. If you can follow what I just described in this post and my last post, and you can read and follow a little python and javascript, then you can do this. And, if you were to go through many of the tests, over time you would end up understanding more about how the codebase is structured.

As usual when I dive into anything technical at Mozilla, I think it’s pretty cool that most of this work happens in the open. It is a great body of data for academics to study, it’s an example of how this work actually happens for anyone interested in the field, and it’s something that anyone can contribute to if they have the time and interest to put in some effort.

This post seems very plain with only some screenshots of Treeherder for illustration. Here, have a photo of me making friends with a chicken.

Liz with chicken

How to test new features in Firefox 34 Aurora

If you’re a fan of free and open source software and would like to contribute to Firefox, join me for some Firefox feature testing!

There are some nifty features under development right now for Firefox 34 including translation in the browser, making voice or video calls (a feature called “Hello” or “Loop”), debugging information for web developers in the Dev Tools Inspector, and recent improvements to HTML5 gaming.

I’ve written step by step instructions on these
ways to test Firefox 34. If you would like to see what it’s like to improve a popular open source project, trying out these tasks is a good introduction.

Aurora

First, Install the Aurora version of Firefox. It is best to set it up to use multiple profiles. That ensures you don’t use your everyday version of Firefox for testing, so you won’t risk losing your usual profile information. It also makes it easy to restart Firefox with a new, clean profile with all the default settings, very useful for testing. Sometimes I realize I’m running 5 different versions of Firefox at once!

To test “Hello”, try making some voice or video calls from Firefox Aurora. You will need a friend to test with. Or, use two computers that you control. This is a good task to try while joining our chat channels, #qa or #testday on irc.mozilla.org; ask if anyone there wants to test Hello with you. The goal here is mostly to find and report new bugs.

If you test the translation infobar in Aurora you may find some new bugs. This is a fun feature to test. I like trying it on Wikipedia in many different languages, and also looking at newspapers!

If you’re a web developer, you may use Developer Tools in Firefox. I’m asking Aurora users to go through some unconfirmed bug reports, to help improve the Developer Tools Inspector.

If you like games you can test HTML5 web-based games in Firefox Aurora. This helps us improve Firefox and also helps the independent game developers. We have a list of demo games so you can play them, report glitches, and feel like a virtuous open source citizen all at once. Along the way you have opportunities to learn some interesting stuff about how graphics on the web can work (or not work).

Monster madness

These testing tasks are all set up in One and Done, Mozilla QA’s site to start people along the path to joining our open source community. This site was developed with a lot of community contribution including the design and concept by long-time community member Parul and a lot of code by two interns this summer, Pankaj and Maja.

Testing gives a great view into the development process for people who may not (yet) be programmers. I especially love how transparent Mozilla’s process can be. Anyone can report a bug, visible to the entire world in bugzilla.mozilla.org. There are many people watching that incoming stream of bug reports, confirming them and routing them to developer teams, sometimes tagging them as good first bugs for new contributors. Developers who may or may not be Mozilla employees show up in the bugs, like magic . . . if you think of bugmail notifications as magic . . .

It is amazing to see this very public and somewhat anarchic collaboration process at work. Of course, it can also be extremely satisfying to see a bug you discovered and reported, your pet bug, finally get fixed.

Mozilla's bug reporting, QA, and release processes

AdaCamp Portland was an amazing conference for feminist women in open source tech and culture. Not all, but many of the conference attendees are developers, system administrators, or do other technical work in open source software. I gave an informal talk meant to be an overview of some things I currently do at Mozilla. Lots of people came to the session! We all introduced ourselves going around the room.

To start off with, I showed a sample bug to talk about the process of reporting a bug, using Bugzilla, and practicing the skill of reading and understanding a bug report.

We looked first at Bug 926292.

Bugzilla couple

Let’s look at the life of this bug!

This bug was reported in October 2013 for Firefox 24 by someone new to bugzilla.mozilla.org. New users have basic permissions to file and comment on bugs. For around their first 25 bugs filed or commented on, they are marked “New to Bugzilla” to anyone with more permissions on the system. This helps more experienced users to know when they’re in conversation with people who are relatively new to the system. And, bugs reported by new users are automatically entered into Bugzilla with a status of “UNCONFIRMED”.

Our bug reporter was answered the same day by a community bug triager who used the “needinfo” checkbox to ask the bug reporter more questions. A bit later, in Comment 2, I was able to confirm the bug; I marked it NEW. Community members often jump in to do this from Bug Triage bug days, from our One and Done community taskboard, or because they watch the “Firefox::Untriaged” component. (Yes . . . you too can sign up to get email from Bugzilla every time a new bug is filed!)

Francesca Ciceri is currently working on bug triage and verification with our team as part of the GNOME-OPW internship program, doing similiar work to Tiziana Selitto who was an OPW intern last year! Both their blogs have good insights into what it’s like to approach QA in a huge and somewhat chaotic system like Mozilla’s.

In our example bug, I took a guess as to which product and component to add to the bug. This is like putting the bug into the right place where developers who work in a particular area will be likely to see it, and pay attention to it. I moved it from “Firefox” to “Core” and thought it may be something to do with the CSS Object Model. Picking the right product and component is tricky. Sometimes I look for similar bugs, to see what component they’re in. Sometimes I use Bugzilla’s Browse pages to skim or search through the descriptions of components for Firefox, Core, and Toolkit. Even after doing this for a year and a half, I get it wrong. Here, a developer moved the bug to what he thought was a better component for it, Core::Layout. (Developers also sometimes guess wrong, and keep passing a bug around to each others’ components like a hot potato.)

At this point a few developers explored the bug, and went back and forth with each other and the bug reporter about whether it had been fixed or not, exactly what the bug was, whether it is a Mac issue or a Firefox issue, and how to fix it. It was resolved as a duplicate of another bug in October, but the bug reporter came back to reopen it in February 2014. The bug reporter was polite but persistent in explaining their view, giving more details of the browser behavior, trying to find the bug in the very latest developer build (Nightly), giving a test case and comparing the behavior in different browsers. A developer submitted a patch, asked for code review. Related bugs were mentioned and linked. At least two new bugs were filed.

One important thing to note is that people working on QA and development tend to move very fluidly between using various Firefox versions. One of the best things you can do to get involved with helping out is to set up all four “channels” of Firefox with the capability to run them all at once with different profiles, and to start with new, clean profiles. In fact, we need better and more up to date documentation of how to do that on different operating systems, with screenshots! Here are some links that may help you set that up:
* http://www.callum-macdonald.com/about/faq/multiple-firefox-instances/
* https://developer.mozilla.org/en-US/docs/Mozilla/Multiple_Firefox_Profiles
* https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles

OK, back to bug 926292!

Since I had worked on the bug and added myself to the cc field, I got bugmail about all these changes, and more or less followed a long. I often think that the collaboration that happens in bug fixing is very beautiful, and even fairly efficient!

In comment 29 you can see that code got committed to a mercurial repository, to “inbound”. From there, it goes through automated tests and is merged by one of the “sheriffs” into another hg repository, mozilla-central, where it will go into the next build of Nightly, which at that point in April, was Firefox 31.

Comment 30 suggests uplifting the patch to versions that will soon be released, to Aurora and Beta. Release managers started to get involved, commenting and asking the developers to formally nominate the bug for uplift.

At this point in my talk I explained a little bit about the “trains”.

Trains

The versions of Firefox under development advance on a 6 week cycle, from Nightly to Aurora to Beta to the main release of Firefox. In this rapid release schedule, Firefox 31 was Nightly, so Aurora was 30, 29 was Beta, and the release version most folks use was 28. The uplift request was refused so the patch “rode the train”. That means, if you were using Firefox 31 any time after the patch was merged into mozilla-central, you will see its effect. (It would also be fixed for Firefox 32 and 33 which are currently in use as Aurora and Nightly, since 31 is currently Beta.)

Our bug was marked “FIXED” when the patch was merged into mozilla-central. You can see near the end of its comments that I tagged the bug “verifyme” to put it into the queue of bugs that need verifying for Firefox 31. Many people see that list and work on verifying bugs including community members in our Bug Verification test days. I hope the story of this particular bug is over. I don’t have the number immediately to hand but I believe that over 1000 bugs are fixed for each version of Firefox over its release cycle. We can’t verify them all, but it’s amazing what we do get done as a team!

Other tools we looked at in my talk and the ensuing discussion:

Datazilla, which tests and measures Firefox performace: https://datazilla.mozilla.org/

Mozmill, a UI automation framework for Mozilla apps including Firefox and Thunderbird: https://github.com/mozilla/mozmill

Socorro, or crash-stats, where QA and other teams keep track of crashes in Firefox and other Mozilla products: https://crash-stats.mozilla.com

The ftp directories where Firefox builds and build candidates are stored: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/candidates/

The mercurial repositories or “the tree”: http://hg.mozilla.org/

DXR, a nifty tool to search Mozilla’s code: http://dxr.mozilla.org/mozilla-central/source/

TBPL which shows the test results for every commit that’s merged into different branches https://tbpl.mozilla.org/

And a quick view into Mozilla’s Jenkins continuous integration dashboard which you can only see from our VPN, just to give an idea of the work we do when Firefox is in Beta. As a particular version of Firefox advances through rapid release, QA pays more attention to particular areas and uses different tools. We have to know a little bit about everything, be able to reproduce a user’s bug on many different possible platforms, figure out which developers may be able to fix a bug (or whose commit may have caused a regression or crash).

It was a lot to cover in an hour long talk! I wanted to pilot this informally as a test for doing a more formal talk with slides.

It represented fairly well that QA covers quite a lot of territory; it’s complicated and interesting work.