Spatial navigation operations for the webview tag

The modern web is a space meant to be browsed with a mouse or a touchscreen, but think of a smart TV, set-top box, game console or budget phone. What happens when your target hardware doesn’t have those devices and provides only directional input instead?

TV, remote, phone

If you are under control of the contents, you certainly can design your UI around directional input, providing custom focus management based on key events, and properly highlighting the focused item. But that doesn’t work when you confront external content provided by the millions of websites out there, not even if you are targeting a small subset of them like the most popular social networks and search engines.

Virtual cursors are one solution. They certainly address the problem and provide full functionality, but they are somehow inconvenient: traversing the screen can take too long, and precision can be an issue.

Tab navigation is always available and it certainly works well in sites designed for that, like forms, but it’s quite limited; it can also show unexpected behavior when the DOM tree doesn’t match the visual placement of the contents.

Spatial navigation is a feature designed to address this problem. It will:

  • respond to directional input taking into account the position of the elements of the screen
  • deal with scroll when we reach the limits of the viewport or a scrollable element, in a way that content can be easily read (e.g. no abrupt jumps to the next focusable element)
  • provide special highlight for the focused element.

You can see it in action in this BlinkOn lightning talk by Junho Seo, one of the main contributors. You can also test it yourself with the --enable-spatial-navigation flag in your Chrome/Chromium browser.

A non-trivial algorithm makes sure that input matches user expectations. That video shows a couple of corner cases that demonstrate this isn’t easy.

When creating a web UI for a device that makes use of cursor navigation, it’s usual that you have to deal with the two kinds of content mentioned before: the one created specifically for the device, which already takes cursor input into account in its design, and external content from websites that don’t follow those principles. It would be interesting to be able to isolate external content and provide spatial navigation for it.

The webview extension tag is one way to provide isolation for an external site. To support this use case, we implemented new API operations for the tag to enable or disable spatial navigation inside the webview contents independently from the global settings, and to check its state. They have been available for a while, since Chromium version 71.

There are some interesting details in the implementation of this operation. One is that it is required some IPC between different browser processes, because webview contents run on a different renderer (guest), which must be notified of the status changes in spatial navigation made by its “parent” (host). There is no need to do IPC in the other direction because we cache the value in the browser process for the implementation of isSpatialNavigationEnabled to use it.

Another interesting detail was, precisely, related with that cache; it made our test flaky because sometimes the calls to isSpatialNavigationEnabled happened before the call to setSpatialNavigationEnabled had completed its IPC communication. My first attempt to fix this added an ACK reply via IPC to confirm the cache value, but it was considered an overkill for a situation that would never trigger in a real-world scenario. It even was very hard to reproduce from the test! It was finally fixed with some workarounds in the test code.

A more general solution is proposed for all kinds of iframes in the CSS WG draft of the spatial navigation feature. I hope my previous work makes the implementation for the iframe tag straightforward.

Leave a Reply

Your email address will not be published. Required fields are marked *