Back in March last year, as part of my maintenance work on Chrome/Chromium accessibility at Igalia, I inadvertently started a long, deep dive into the accessible name calculation code, that had me intermittently busy since then.
As a matter of fact, I started drafting this post over a year ago, and I already presented part of my work in a lightning talk at BlinkOn 15, in November 2021:
This series of posts is the promised closure from that talk, that extends and updates what was presented there.
Introduction
Accessible name and description computation is an important matter in web accessibility. It conveys how user agents (web browsers) are expected to expose the information provided by web contents in text from, in a way it’s meaningful for users of accessibility technologies (ATs). This text is meant to be spoken, or routed to a braille printer.
We know a lot of the Web is text, but it’s not always easy. Style rules can affect what is visible and hence, what is part of a name or description and what is not. There are specific properties to make content hidden for accessibility purposes (aria-hidden
), and also to establish relations between elements for naming and description purposes (aria-labelledby
, aria-describedby
) beyond the markup hierarchy, potentially introducing loops. A variety of properties are used to provide naming, specially for non-text content, but authors may apply them anywhere (aria-label
, alt
, title
) and we need to decide their priorities. There are a lot of factors in play.
There is a document maintained by the ARIA Working Group at W3C, specifying how to do this: https://w3c.github.io/accname/. The spec text is complex, as it has to lay out the different ways to name content, settle their priorities when they interact and define how other properties affect them.
Accessible name/description in Chromium
You may have heard the browser keeps multiple tree structures in memory to be able to display your web sites. There is the DOM tree, which corresponds with the nodes as specified in the HTML document after it’s been parsed, any wrong markup corrected, etc. It’s a live tree, meaning that it’s not static; JavaScript code and user interaction can alter it.
There is a layout tree, which I won’t dare explaining because it’s out of my experience field, but we can say it’s related with how the DOM is displayed: what is hidden and what is visible, and where it’s placed.
And then we have the accessibility tree: it’s a structure that keeps the hierarchy and information needed for ATs. It’s based on the DOM tree but they don’t have the same information: nodes in those trees have different attributes but, more importantly, there is not a 1:1 relation between DOM and accessibility nodes. Not every DOM node needs to be exposed in the accessibility tree and, because of aria-owns
relations, tree hierarchies can be different too. It’s also a live tree, updated by JS code and user interaction.
In Chromium implementation, we must also talk about several accessibility trees. There is one tree per site, managed by Blink code in the renderer process, and represented in an internal and platform-agnostic way for the most part. But the browser process needs to compose the tree for the active tab contents with the browser UI and expose it through the means provided by the operating system. More importantly, these accessibility objects are different from the ones used in Blink, because they are defined by platform-specific abstractions: NSAccessibility, ATK/AT-SPI, MSAA/UIA/IAccessible2…
When working on accessible tree computation, most of the code I deal with is related with the Blink accessibility tree, save for bugs in the translation to platform abstractions. It is, still, a complex piece of code, more than I initially imagined. The logic is spread across twelve functions belonging to different classes, and they call each other creating cycles. It traversed the accessibility tree, but also the DOM tree in certain occasions.
The problems
Last year, my colleague Joanmarie Diggs landed a set of tests in Chromium, adapted from the test cases used by the Working Group as exit criteria for Candidate Recommendations. There were almost 300 tests, most of them passing just fine, but a few exposed bugs that I thought could be a good introduction to this part of the accessibility code.
The first issue I tried to fix was related to whitespace in names, as it looked simple enough to be a good first contact with the code. The issue consisted on missing whitespace in the accessible names: while empty or whitespace nodes translated to spaces in rendered nodes, the same didn’t happen in the accessible names and descriptions. Consider this example:
<label for="test">
<span>1</span><span>2</span><span>3</span>
<span>4</span><span>5</span>
</label>
<input id="test"/>
The label is rendered as “123 45” but Chromium produced the accessible name “12345”. It must be noticed that the spec is somewhat indefinite with regard of the whitespace, which is already reported, but we considered Chromium’s behavior was wrong for two reasons: it didn’t match the rendered text, and it was different from the other implementations. I reported the issue at the Chromium bug tracker.
Another, bigger problem I set to fix was that the CSS pseudo-elements ::before
and ::after
weren’t added to accessible names. Consider the markup:
<style>
label:after { content:" fruit"; }
</style>
<label for="test">fancy</label>
<input id="test"/>
The rendered text is “fancy fruit”, but Chromium was producing the accessible name “fancy”. In addition to not matching the rendered text, the spec explicitly states those pseudo-elements must contribute to the accessible name. I reported this issue, too.
I located the source of both problems at AXNodeObject::TextFromDescendants
: this function triggers the calculation of the name for a node based on its descendants, and it was traversing the DOM tree to obtain them. The DOM tree does not contain the aforementioned pseudo-elements and, for that reason, they were not being added to the name. Additionally, we were explicitly excluding whitespace nodes in this traversal, with the intention of preventing big chunks of whitespace being added to names, like new lines and tabs to indent the markup.
A question hit me at this point: why were we using the DOM for this in the first place? That meant doing an extra traversal for every name calculated, and duplicating code that selected which nodes are important. It seems the reasons for this traversal were lost in time, according to code comments:
// TODO: Why isn't this just using cached children?
While I could land (and I did) temporary fixes for these individual bugs, it looked like we could fix “the traversal problem” instead, addressing the root cause of the problems while streamlining our code and, potentially, fixing other issues. I will address this problem, the reasons to traverse the DOM in first place and the interesting detours this work lead me, in a future post. Stay tuned, and happy hacking!
EDIT: next chapter available here!
Pingback: Accessible name computation in Chromium Chapter II: Name from hidden subtrees | Jacobo's home at Igalia
Pingback: Accessible name computation in Chromium chapter III: name from hidden subtrees | Jacobo's home at Igalia