Using Swarming to run Chromium test suites

Published on 2024-12-03.

Tagged as #chromium, #testing, #swarming.


tldr: if you’re just here for ready-to-use commands, skip to the section “Triggering Swarming tasks”. There’s also a short overview of when to use which way of running tests in the conclusion.

Motivation

Last year, as part of the work of Igalia’s Chromium team, I spent a lot of time working on the low-level parts of Chromium that are used for running interactive_ui_tests (a suite of end-to-end tests) under Wayland. That work required me to frequently run the suite to see which tests pass and which don’t with my changes applied.

Running single tests or small batches locally is fine, but running the whole suite takes quite a while (somewhere around 40 minutes). During this time, my CPU usage is quite high, impacting my laptop’s battery life and responsiveness.

I could use the CI that’s part of the regular code review process, the CQ trybots, to run the tests without impacting my local system, but that requires uploading the patch to Gerrit. And even when I just select one trybot (e.g. linux-wayland-rel) that includes the test suite I’m interested in, the bot also runs lots of other tests, meaning I waste electricity and take away bot time from others to get results I’ll never look at. And there are additional restrictions of the trybots that can make this approach unattractive; for example, you can’t run tests when your CL has a merge conflict, and it’s non-trivial to customize certain bot parameters like maximum run time.

I once uploaded a patch that caused lots of test timeouts, which led to the whole suite taking longer than an hour to run. But the specific trybot wasn’t allowed to run that long, so not all tests ran and I had incomplete results.

It is possible to change these timeouts (and other parameters, like what test suites are run by a bot) by modifying testing/buildbot/waterfalls.pyl and then executing testing/buildbot/generate_buildbot_json.py (thanks to Ben Pastene for telling me about this). You can also just directly edit the JSON for the trybot where you want to change something, but then you can only do git cl upload if you pass --bypass-hooks, or it will complain and refuse to upload. Unfortunately, these config files are updated fairly frequently, so the last time I tried this I always had merge conflicts soon after uploading a new patch set, preventing the trybots from testing my patch until I uploaded a rebased patch set.
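As a rough sketch, the first variant (editing waterfalls.pyl and regenerating the JSON) could look like this. The file paths come from the paragraph above; the helper name regen_bot_config, the commit message and the DRY_RUN switch are placeholders made up for illustration.

```shell
# Sketch of the config workflow. Run from the root of the Chromium checkout
# after editing testing/buildbot/waterfalls.pyl by hand.
# regen_bot_config and DRY_RUN are hypothetical names, not part of Chromium.
regen_bot_config() {
  # ${DRY_RUN:+echo} expands to "echo" when DRY_RUN is set, so the commands
  # are only printed; otherwise it expands to nothing and they are executed.
  ${DRY_RUN:+echo} testing/buildbot/generate_buildbot_json.py &&
  ${DRY_RUN:+echo} git add testing/buildbot &&
  ${DRY_RUN:+echo} git commit -m 'DO NOT SUBMIT: tweak trybot config' &&
  ${DRY_RUN:+echo} git cl upload
}

# Print the commands without running anything.
DRY_RUN=1 regen_bot_config
```

Since the JSON is regenerated from waterfalls.pyl here rather than edited by hand, --bypass-hooks should not be needed in this variant.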

There is one time of the year when changing the bot configuration works very well, though: in the two weeks around Christmas, all edits to the configs are prohibited, including automated ones. However, you can still upload patches with changes to these files using --bypass-hooks – you’re just not allowed to merge them. But as no one else is allowed to merge such changes either, you won’t get merge conflicts.

Thus, I was unsatisfied with both options I had for running the full test suite. I vaguely remembered coming across a document mentioning that you can use Google’s bots to run tests for you without going through Gerrit to trigger the CQ. I’d tried to get it to work, but was unable to do so in the time I had available then. Still, it seemed to be exactly what I was looking for, and so I decided to invest some more time to see if I could get it to do what I wanted. As it would turn out, this was a very good decision.

In this post I’ll share the simple, ready-to-use parts of my findings with you. In a second post, we’ll look behind the curtain of these, and in a third and final post, I’ll explain how you can get a nice graphical overview of the test results, just like with the results from CQ runs. You’ll even have a link that you can share with others so they can see the results too! Please check the #swarming tag for the other posts.

Requirements and terms

Of course, Google doesn’t give free computing time on their bots to everyone. You need to have tryjob access for the Chromium project to use Swarming for running tests. Also, you need to have the Chromium repository cloned, as well as depot_tools (but if you didn’t have that, you probably wouldn’t want to run Chromium test suites anyway).

I’ll cover more details in the next post, but for now let me at least explain what exactly Swarming is: “Swarming is a system operated by the [Chromium] infra team that schedules and runs tasks under a specific set of constraints, like ‘this must run on a macOS 10.13 host’ or ‘this must run on a host with an intel GPU’. It is somewhat similar to […] Kubernetes.” (source) One unit of execution in the Swarming system is known as a task or run.

Triggering Swarming tasks

You only need one command to run tests with Swarming:

tools/run-swarmed.py $outdir $target -- $test_args

Note that when you run tools/run-swarmed.py for the first time, you’ll see something like this:

...
If you get authentication errors, follow:
  https://chromium.googlesource.com/chromium/src/+/HEAD/docs/workflow/debugging-with-swarming.md#authenticating
Uploading to isolate server, this can take a while...
[I2024-11-20T16:19:37.102022+01:00 803945 0 client.go:245] context metadata: contextmd.Metadata{ActionID:"e21a1c95-f8b4-4992-aa33-45585c4d3fea", InvocationID:"d94c5024-d469-4f26-a068-720ebb783281", CorrelatedInvocationsID:"", ToolName:"isolate", ToolVersion:""}
isolate: original error: interactive login is required
...

According to the instructions behind that link, you need to run tools/luci-go/isolate login. After that, tools/run-swarmed.py should work.

To give a concrete example: if you want to check the flakiness of BookmarkBarViewTest5.DND from interactive_ui_tests by running it 100 times in a row (make sure you built the target beforehand):

tools/run-swarmed.py out/Release interactive_ui_tests -- \
    --gtest_filter='BookmarkBarViewTest5.DND' --gtest_repeat=100

This will first upload the necessary files to Google’s servers and then trigger a Swarming task to run the tests. You’ll get a link to https://chromium-swarm.appspot.com/task?id=... that you can use to monitor the task’s progress and output. run-swarmed.py blocks until the triggered task finishes, and will then store the task’s output in a separate text file and print its path. If you don’t need that, you can safely Ctrl+C and spawn more tasks or do other things in your shell.

Figure: Screenshot of the page for a Swarming task triggered by me, showing metadata and the log. We’ll take a closer look at the information on these pages in the next post.

Note that the above command uses out/Release. I’d recommend always using release builds, or the tests will take much longer to run on the bots. As a bonus, release builds are smaller and thus take less time to upload to Google’s servers. You can also look up which GN args the CQ bots use and just use those. To view the args used for a given https://ci.chromium.org build, expand the “compilator steps (with patch)” step (if it exists), then the “lookup GN args” step, and if needed click on the “gn_args” link.
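A minimal sketch of preparing such a release build and sending it to Swarming could look like this. is_debug=false is the basic GN arg for a release build (the CQ bots use more args than this one line); autoninja comes with depot_tools. The helper name swarm_release and the DRY_RUN switch are made up for this example.

```shell
# Hypothetical helper (not part of Chromium): configure a release build if
# the output directory doesn't exist yet, build the test target, then hand
# it to tools/run-swarmed.py. Run from the root of the Chromium checkout.
# Set DRY_RUN=1 to only print the commands instead of running them.
swarm_release() {
  outdir=$1
  target=$2
  shift 2
  # Only generate the build directory when it is missing, because gn gen
  # with --args would overwrite the args.gn of an existing one.
  if [ ! -d "$outdir" ]; then
    ${DRY_RUN:+echo} gn gen "$outdir" --args='is_debug=false' || return 1
  fi
  ${DRY_RUN:+echo} autoninja -C "$outdir" "$target" &&
  ${DRY_RUN:+echo} tools/run-swarmed.py "$outdir" "$target" -- "$@"
}

# Example: print what would be run for the flakiness check from above.
DRY_RUN=1 swarm_release out/Release interactive_ui_tests \
    --gtest_filter='BookmarkBarViewTest5.DND' --gtest_repeat=100
```

Building right before triggering makes sure the uploaded binaries actually contain your latest changes, since the task runs whatever is in the output directory at upload time.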

Make sure to check out the optional command-line arguments via tools/run-swarmed.py --help. For example, if you get weird errors like UnboundLocalError: local variable 'dbus_pid' referenced before assignment, you might need to pass --swarming-os Ubuntu-22.04.
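For instance, the flakiness check from above with the bot OS pinned might look like this. The --swarming-os flag and its value are the ones just mentioned; the SWARMING_OS variable, the run_pinned helper and the DRY_RUN switch only exist for illustration.

```shell
# Sketch: pin the bot OS for a run. SWARMING_OS and run_pinned are
# hypothetical names for this example; check tools/run-swarmed.py --help
# for the values the flag accepts.
SWARMING_OS=${SWARMING_OS:-Ubuntu-22.04}

run_pinned() {
  # With DRY_RUN set, only print the command instead of running it.
  ${DRY_RUN:+echo} tools/run-swarmed.py --swarming-os "$SWARMING_OS" "$@"
}

# Print the command for a single test from interactive_ui_tests.
DRY_RUN=1 run_pinned out/Release interactive_ui_tests -- \
    --gtest_filter='BookmarkBarViewTest5.DND'
```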

For completeness’ sake, I want to mention that there’s also tools/mb/mb.py run, but I don’t know when you’d want to use it. It gives you more control in exchange for a messier command-line interface, but tools/run-swarmed.py has its own flags for everything that seems relevant in practice to me.

Conclusion

I’d like to close with a summary of the different options you have when running Chromium test suites and a recommendation for when to use which one:

locally: running single tests or small batches, or when you don’t have internet access.
tools/run-swarmed.py: running medium to large batches of tests or a whole test suite on a single platform.
CQ trybots (single bot): running multiple test suites on a single platform.
CQ trybots (full dry run): running multiple test suites across multiple platforms.

If you need to set custom environment variables, specify the exact command to run, or need some other kind of fine-grained control over the task you want to trigger, check out the next post on Swarming for how to use tools/luci-go/swarming trigger. The final post will showcase a scenario where we’ll need this fine-tuning.