Graphics improvements in WebKitGTK and WPEWebKit 2.46

WebKitGTK and WPEWebKit recently released a new stable version 2.46. This version includes important changes in the graphics implementation.

Skia

The most important change in 2.46 is the introduction of Skia to replace Cairo as the 2D graphics renderer. Skia supports rendering using the GPU, which is now the default, but we also use it for CPU rendering using the same threaded rendering model we had with Cairo. The architecture hasn’t changed much for GPU rendering: we use the same tiled rendering approach, but buffers for dirty regions are rendered in the main thread as textures. The compositor waits for textures to be ready using fences and copies them directly to the compositor texture. This was the simplest approach that already resulted in much better performance, specially in the desktop with more powerful GPUs. In embedded systems, where GPUs are not so powerful, it’s still better to use the CPU with several rendering threads in most of the cases. It’s still too early to announce anything, but we are already experimenting with different models to improve the performance even more and make a better usage of the GPU in embedded devices.

Skia has received several GCC specific optimizations lately, but it’s always more optimized when built with clang. The optimizations are more noticeable in performance when using the CPU for rendering. For this reason, since version 2.46 we recommend to build WebKit with clang for the best performance. GCC is still supported, of course, and performance when built with GCC is quite good too.

HiDPI

Even though there aren’t specific changes about HiDPI in 2.46, users of high resolution screens using a device scale factor bigger than 1 will notice much better performance thanks to scaling being a lot faster on the GPU.

Accelerated canvas

The 2D canvas can be accelerated independently on whether the CPU or the GPU is used for painting layers. In 2.46 there’s a new setting WebKitSettings:enable-2d-canvas-acceleration to control the 2D canvas acceleration. In some embedded devices the combination of CPU rendering for layer tiles and GPU for the canvas gives the best performance. The 2D canvas is normally rendered into an image buffer that is then painted in the layer as an image. We changed that for the accelerated case, so that the canvas is now rendered into a texture that is copied to a compositor texture to be directly composited instead of painted into the layer as an image. In 2.46 the offscreen canvas is enabled by default.

There are more cases where accelerating the canvas is not desired, for example when the canvas size is not big enough it’s faster to use the GPU. Also when there’s going to be many operations to “download” pixels from GPU. Since this is not always easy to predict, in 2.46 we added support for the willReadFrequently canvas setting, so that when set by the application when creating the canvas it causes the canvas to be always unaccelerated.

Filters

All the CSS filters are now implemented using Skia APIs, and accelerated when possible. The most noticeable change here is that sites using blur filters are no longer slow.

Color spaces

Skia brings native support for color spaces, which allows us to greatly simplify the color space handling code in WebKit. WebKit uses color spaces in many scenarios – but especially in case of SVG and filters. In case of some filters, color spaces are necessary as some operations are simpler to perform in linear sRGB. The good example of that is feDiffuseLighting filter – it yielded wrong visual results for a very long time in case of Cairo-based implementation as Cairo doesn’t have a support for color spaces. At some point, however, Cairo-based WebKit implementation has been fixed by converting pixels to linear in-place before applying the filter and converting pixels in-place back to sRGB afterwards. Such a workarounds are not necessary anymore as with Skia, all the pixel-level operations are handled in a color-space-transparent way as long as proper color space information is provided. This not only impacts the results of some filters that are now correct, but improves performance and opens new possibilities for acceleration.

Font rendering

Font rendering is probably the most noticeable visual change after the Skia switch with mixed feedback. Some people reported that several sites look much better, while others reported problems with kerning in other sites. In other cases it’s not really better or worse, it’s just that we were used to the way fonts were rendered before.

Damage tracking

WebKit already tracks the area of the layers that has changed to paint only the dirty regions. This means that we only repaint the areas that changed but the compositor incorporates them and the whole frame is always composited and passed to the system compositor. In 2.46 there’s experimental code to track the damage regions and pass them to the system compositor in addition to the frame. Since this is experimental it’s disabled by default, but can be enabled with the runtime feature PropagateDamagingInformation. There’s also UnifyDamagedRegions feature that can be used in combination with PropagateDamagingInformation to unify the damage regions into one before passing it to the system compositor. We still need to analyze the impact of damage tracking in performance before enabling it by default. We have also started an experiment to use the damage information in WebKit compositor and avoid compositing the entire frame every time.

GPU info

Working on graphics can be really hard in Linux, there are too many variables that can result in different outputs for different users: the driver version, the kernel version, the system compositor, the EGL extensions available, etc. When something doesn’t work for some people and work for others, it’s key for us to gather as much information as possible about the graphics stack. In 2.46 we have added more useful information to webkit://gpu, like the DMA-BUF buffer format and modifier used (for GTK port and WPE when using the new API). Very often the symptom is the same, nothing is rendered in the web view, even when the causes could be very different. For those cases, it’s even more difficult to gather the info because webkit://gpu doesn’t render anything either. In 2.46 it’s possible to load webkit://gpu/stdout to get the information as a JSON directly in stdout.

Sysprof

Another common symptom for people having problems is that a particular website is slow to render, while for others it works fine. In these cases, in addition to the graphics stack information, we need to figure out where we are slower and why. This is very difficult to fix when you can’t reproduce the problem. We added initial support for profiling in 2.46 using sysprof. The code already has some marks so that when run under sysprof we get useful information about timings of several parts of the graphics pipeline.

Next

This is just the beginning, we are already working on changes that will allow us to make a better use of both the GPU and CPU for the best performance. We have also plans to do other changes in the graphics architecture to improve synchronization, latency and security. Now that we have adopted sysprof for profiling, we are also working on improvements and new tools.

Thread safety support in libsoup3

In libsoup2 there’s some thread safety support that allows to send messages from a thread different than the one where the session is created. There are other APIs that can be used concurrently too, like accessing some of the session properties, and others that aren’t thread safe at all. It’s not clear what’s thread safe and even sending a message is not fully thread safe either, depending on the session features involved. However, several applications relay on the thread safety support and have always worked surprisingly well.

In libsoup3 we decided to remove the (broken) thread safety support and only allowed to use the API from the same thread where the session was created. This simplified the code and made easier to add the HTTP/2 implementation. Note that HTTP/2 supports multiple request over the same TCP connection, which is a lot more efficient than starting multiple requests from several threads in parallel.

When apps started to be ported to libsoup3, those that relied on the thread safety support ended up being a pain to be ported. Major refactorings where required to either stop using the sync API from secondary threads, or moving all the soup usage to the same secondary thread. We managed to make it work in several modules like gstreamer and gvfs, but others like evolution required a lot more work. The extra work was definitely worth it and resulted in much better and more efficient code. But we also understand that porting an application to a new version of a dependency is not a top priority task for maintainers.

So, in order to help with the migration to libsoup3, we decided to add thread safety support to libsoup3 again, but this time trying to cover all the APIs involved in sending a message and documenting what’s expected to be thread safe. Also, since we didn’t remove the sync APIs, it’s expected that we support sending messages synchronously from secondary threads. We still encourage to use only the async APIS from a single thread, because that’s the most efficient way, especially for HTTP/2 requests, but apps currently using threads can be easily ported first and then refactored later.

The thread safety support in libsoup3 is expected to cover only one use case: sending messages. All other APIs, including accessing session properties, are not thread safe and can only be used from the thread where the session is created.

There are a few important things to consider when using multiple threads in libsoup3:

  • In the case of HTTP/2, two messages for the same host sent from different threads will not use the same connection, so the advantage of HTTP/2 multiplexing is lost.
  • Only the API to send messages can be called concurrently from multiple threads. So, in case of using multiple threads, you must configure the session (setting network properties, features, etc.) from the thread it was created and before any request is made.
  • All signals associated to a message (SoupSession::request-queued, SoupSession::request-unqueued, and all SoupMessage signals) are emitted from the thread that started the request, and all the IO will happen there too.
  • The session can be created in any thread, but all session APIs except the methods to send messages must be called from the thread where the session was created.
  • To use the async API from a thread different than the one where the session was created, the thread must have a thread default main context where the async callbacks are dispatched.
  • The sync API doesn’t need any main context at all.

Epiphany automation mode

Last week I finally found some time to add the automation mode to Epiphany, that allows to run automated tests using WebDriver. It’s important to note that the automation mode is not expected to be used by users or applications to control the browser remotely, but only by WebDriver automated tests. For that reason, the automation mode is incompatible with a primary user profile. There are a few other things affected by the auotmation mode:

  • There’s no persistency. A private profile is created in tmp and only ephemeral web contexts are used.
  • URL entry is not editable, since users are not expected to interact with the browser.
  • An info bar is shown to notify the user that the browser is being controlled by automation.
  • The window decoration is orange to make it even clearer that the browser is running in automation mode.

So, how can I write tests to be run in Epiphany? First, you need to install a recently enough selenium. For now, only the python API is supported. Selenium doesn’t have an Epiphany driver, but the WebKitGTK driver can be used with any WebKitGTK+ based browser, by providing the browser information as part of session capabilities.

from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
options.set_capability('browserName', 'Epiphany')
options.set_capability('version', '3.31.4')

ephy = webdriver.WebKitGTK(options=options, desired_capabilities={})
ephy.get('http://www.webkitgtk.org')
ephy.quit()

This is a very simple example that just opens Epiphany in automation mode, loads http://www.webkitgtk.org and closes Epiphany. A few comments about the example:

  • Version 3.31.4 will be the first one including the automation mode.
  • The parameter desired_capabilities shouldn’t be needed, but there’s a bug in selenium that has been fixed very recently.
  • WebKitGTKOptions.set_capability was added in selenium 3.14, if you have an older version you can use the following snippet instead
from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
capabilities = options.to_capabilities()
capabilities['browserName'] = 'Epiphany'
capabilities['version'] = '3.31.4'

ephy = webdriver.WebKitGTK(desired_capabilities=capabilities)
ephy.get('http://www.webkitgtk.org')
ephy.quit()

To simplify the driver instantation you can create your own Epiphany driver derived from the WebKitGTK one:

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        options.set_capability('browserName', 'Epiphany')
        options.set_capability('version', '3.31.4')

        webdriver.WebKitGTK.__init__(self, options=options, desired_capabilities={})

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

The same for selenium < 3.14

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        capabilities = options.to_capabilities()
        capabilities['browserName'] = 'Epiphany'
        capabilities['version'] = '3.31.4'

        webdriver.WebKitGTK.__init__(self, desired_capabilities=capabilities)

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

WebDriver support in WebKitGTK+ 2.18

WebDriver is an automation API to control a web browser. It allows to create automated tests for web applications independently of the browser and platform. WebKitGTK+ 2.18, that will be released next week, includes an initial implementation of the WebDriver specification.

WebDriver in WebKitGTK+

There’s a new process (WebKitWebDriver) that works as the server, processing the clients requests to spawn and control the web browser. The WebKitGTK+ driver is not tied to any specific browser, it can be used with any WebKitGTK+ based browser, but it uses MiniBrowser as the default. The driver uses the same remote controlling protocol used by the remote inspector to communicate and control the web browser instance. The implementation is not complete yet, but it’s enough for what many users need.

The clients

The web application tests are the clients of the WebDriver server. The Selenium project provides APIs for different languages (Java, Python, Ruby, etc.) to write the tests. Python is the only language supported by WebKitGTK+ for now. It’s not yet upstream, but we hope it will be integrated soon. In the meantime you can use our fork in github. Let’s see an example to understand how it works and what we can do.

from selenium import webdriver

# Create a WebKitGTK driver instance. It spawns WebKitWebDriver 
# process automatically that will launch MiniBrowser.
wkgtk = webdriver.WebKitGTK()

# Let's load the WebKitGTK+ website.
wkgtk.get("https://www.webkitgtk.org")

# Find the GNOME link.
gnome = wkgtk.find_element_by_partial_link_text("GNOME")

# Click on the link. 
gnome.click()

# Find the search form. 
search = wkgtk.find_element_by_id("searchform")

# Find the first input element in the search form.
text_field = search.find_element_by_tag_name("input")

# Type epiphany in the search field and submit.
text_field.send_keys("epiphany")
text_field.submit()

# Let's count the links in the contents div to check we got results.
contents = wkgtk.find_element_by_class_name("content")
links = contents.find_elements_by_tag_name("a")
assert len(links) > 0

# Quit the driver. The session is closed so MiniBrowser 
# will be closed and then WebKitWebDriver process finishes.
wkgtk.quit()

Note that this is just an example to show how to write a test and what kind of things you can do, there are better ways to achieve the same results, and it depends on the current source of public websites, so it might not work in the future.

Web browsers / applications

As I said before, WebKitWebDriver process supports any WebKitGTK+ based browser, but that doesn’t mean all browsers can automatically be controlled by automation (that would be scary). WebKitGTK+ 2.18 also provides new API for applications to support automation.

  • First of all the application has to explicitly enable automation using webkit_web_context_set_automation_allowed(). It’s important to know that the WebKitGTK+ API doesn’t allow to enable automation in several WebKitWebContexts at the same time. The driver will spawn the application when a new session is requested, so the application should enable automation at startup. It’s recommended that applications add a new command line option to enable automation, and only enable it when provided.
  • After launching the application the driver will request the browser to create a new automation session. The signal “automation-started” will be emitted in the context to notify the application that a new session has been created. If automation is not allowed in the context, the session won’t be created and the signal won’t be emitted either.
  • A WebKitAutomationSession object is passed as parameter to the “automation-started” signal. This can be used to provide information about the application (name and version) to the driver that will match them with what the client requires accepting or rejecting the session request.
  • The WebKitAutomationSession will emit the signal “create-web-view” every time the driver needs to create a new web view. The application can then create a new window or tab containing the new web view that should be returned by the signal. This signal will always be emitted even if the browser has already an initial web view open, in that case it’s recommened to return the existing empty web view.
  • Web views are also automation aware, similar to ephemeral web views, web views that allow automation should be created with the constructor property “is-controlled-by-automation” enabled.

This is the new API that applications need to implement to support WebDriver, it’s designed to be as safe as possible, but there are many things that can’t be controlled by WebKitGTK+, so we have several recommendations for applications that want to support automation:

  • Add a way to enable automation in your application at startup, like a command line option, that is disabled by default. Never allow automation in a normal application instance.
  • Enabling automation is not the only thing the application should do, so add an automation mode to your application.
  • Add visual feedback when in automation mode, like changing the theme, the window title or whatever that makes clear that a window or instance of the application is controllable by automation.
  • Add a message to explain that the window is being controlled by automation and the user is not expected to use it.
  • Use ephemeral web views in automation mode.
  • Use a temporal user profile in application mode, do not allow automation to change the history, bookmarks, etc. of an existing user.
  • Do not load any homepage in automation mode, just keep an empty web view (about:blank) that can be used when a new web view is requested by automation.

The WebKitGTK client driver

Applications need to implement the new automation API to support WebDriver, but the WebKitWebDriver process doesn’t know how to launch the browsers. That information should be provided by the client using the WebKitGTKOptions object. The driver constructor can receive an instance of a WebKitGTKOptions object, with the browser information and other options. Let’s see how it works with an example to launch epiphany:

from selenium import webdriver
from selenium.webdriver import WebKitGTKOptions

options = WebKitGTKOptions()
options.browser_executable_path = "/usr/bin/epiphany"
options.add_browser_argument("--automation-mode")
epiphany = webdriver.WebKitGTK(browser_options=options)

Again, this is just an example, Epiphany doesn’t even support WebDriver yet. Browsers or applications could create their own drivers on top of the WebKitGTK one to make it more convenient to use.

from selenium import webdriver
epiphany = webdriver.Epiphany()

Plans

During the next release cycle, we plan to do the following tasks:

  • Complete the implementation: add support for all commands in the spec and complete the ones that are partially supported now.
  • Add support for running the WPT WebDriver tests in the WebKit bots.
  • Add a WebKitGTK driver implementation for other languages in Selenium.
  • Add support for automation in Epiphany.
  • Add WebDriver support to WPE/dyz.

WebKitGTK+ remote debugging in 2.18

WebKitGTK+ has supported remote debugging for a long time. The current implementation uses WebSockets for the communication between the local browser (the debugger) and the remote browser (the debug target or debuggable). This implementation was very simple and, in theory, you could use any web browser as the debugger because all inspector code was served by the WebSockets. I said in theory because in the practice this was not always so easy, since the inspector code uses newer JavaScript features that are not implemented in other browsers yet. The other major issue of this approach was that the communication between debugger and target was not bi-directional, so the target browser couldn’t notify the debugger about changes (like a new tab open, navigation or that is going to be closed).

Apple abandoned the WebSockets approach a long time ago and implemented its own remote inspector, using XPC for the communication between debugger and target. They also moved the remote inspector handling to JavaScriptCore making it available to debug JavaScript applications without a WebView too. In addition, the remote inspector is also used by Apple to implement WebDriver. We think that this approach has a lot more advantages than disadvantages compared to the WebSockets solution, so we have been working on making it possible to use this new remote inspector in the GTK+ port too. After some refactorings to the code to separate the cross-platform implementation from the Apple one, we could add our implementation on top of that. This implementation is already available in WebKitGTK+ 2.17.1, the first unstable release of this cycle.

From the user point of view there aren’t many differences, with the WebSockets we launched the target browser this way:

$ WEBKIT_INSPECTOR_SERVER=127.0.0.1:1234 browser

This hasn’t changed with the new remote inspector. To start debugging we opened any browser and loaded

http://127.0.0.1:1234

With the new remote inspector we have to use any WebKitGTK+ based browser and load

inspector://127.0.0.1:1234

As you have already noticed, it’s no longer possible to use any web browser, you need to use a recent enough WebKitGTK+ based browser as the debugger. This is because of the way the new remote inspector works. It requires a frontend implementation that knows how to communicate with the targets. In the case of Apple that frontend implementation is Safari itself, which has a menu with the list of remote debuggable targets. In WebKitGTK+ we didn’t want to force using a particular web browser as debugger, so the frontend is implemented as a builtin custom protocol of WebKitGTK+. So, loading inspector:// URLs in any WebKitGTK+ WebView will show the remote inspector page with the list of debuggable targets.

It looks quite similar to what we had, just a list of debuggable targets, but there are a few differences:

  • A new debugger window is opened when inspector button is clicked instead of reusing the same web view. Clicking on inspect again just brings the window to the front.
  • The debugger window loads faster, because the inspector code is not served by HTTP, but locally loaded like the normal local inspector.
  • The target list page is updated automatically, without having to manually reload it when a target is added, removed or modified.
  • The debugger window is automatically closed when the target web view is closed or crashed.

How does the new remote inspector work?

The web browser checks the presence of WEBKIT_INSPECTOR_SERVER environment variable at start up, the same way it was done with the WebSockets. If present, the RemoteInspectorServer is started in the UI process running a DBus service listening in the IP and port provided. The environment variable is propagated to the child web processes, that create a RemoteInspector object and connect to the RemoteInspectorServer. There’s one RemoteInspector per web process, and one debuggable target per WebView. Every RemoteInspector maintains a list of debuggable targets that is sent to the RemoteInspector server when a new target is added, removed or modified, or when explicitly requested by the RemoteInspectorServer.
When the debugger browser loads an inspector:// URL, a RemoteInspectorClient is created. The RemoteInspectorClient connects to the RemoteInspectorServer using the IP and port of the inspector:// URL and asks for the list of targets that is used by the custom protocol handler to create the web page. The RemoteInspectorServer works as a router, forwarding messages between RemoteInspector and RemoteInspectorClient objects.

WebKitGTK+ 2.16

The Igalia WebKit team is happy to announce WebKitGTK+ 2.16. This new release drastically improves the memory consumption, adds new API as required by applications, includes new debugging tools, and of course fixes a lot of bugs.

Memory consumption

After WebKitGTK+ 2.14 was released, several Epiphany users started to complain about high memory usage of WebKitGTK+ when Epiphany had a lot of tabs open. As we already explained in a previous post, this was because of the switch to the threaded compositor, that made hardware acceleration always enabled. To fix this, we decided to make hardware acceleration optional again, enabled only when websites require it, but still using the threaded compositor. This is by far the major improvement in the memory consumption, but not the only one. Even when in accelerated compositing mode, we managed to reduce the memory required by GL contexts when using GLX, by using OpenGL version 3.2 (core profile) if available. In mesa based drivers that means that software rasterizer fallback is never required, so the context doesn’t need to create the software rasterization part. And finally, an important bug was fixed in the JavaScript garbage collector timers that prevented the garbage collection to happen in some cases.

CSS Grid Layout

Yes, the future here and now available by default in all WebKitGTK+ based browsers and web applications. This is the result of several years of great work by the Igalia web platform team in collaboration with bloomberg. If you are interested, you have all the details in Manuel’s blog.

New API

The WebKitGTK+ API is quite complete now, but there’s always new things required by our users.

Hardware acceleration policy

Hardware acceleration is now enabled on demand again, when a website requires to use accelerated compositing, the hardware acceleration is enabled automatically. WebKitGTK+ has environment variables to change this behavior, WEBKIT_DISABLE_COMPOSITING_MODE to never enable hardware acceleration and WEBKIT_FORCE_COMPOSITING_MODE to always enabled it. However, those variables were never meant to be used by applications, but only for developers to test the different code paths. The main problem of those variables is that they apply to all web views of the application. Not all of the WebKitGTK+ applications are web browsers, so it can happen that an application knows it will never need hardware acceleration for a particular web view, like for example the evolution composer, while other applications, especially in the embedded world, always want hardware acceleration enabled and don’t want to waste time and resources with the switch between modes. For those cases a new WebKitSetting hardware-acceleration-policy has been added. We encourage everybody to use this setting instead of the environment variables when upgrading to WebKitGTk+ 2.16.

Network proxy settings

Since the switch to WebKit2, where the SoupSession is no longer available from the API, it hasn’t been possible to change the network proxy settings from the API. WebKitGTK+ has always used the default proxy resolver when creating the soup context, and that just works for most of our users. But there are some corner cases in which applications that don’t run under a GNOME environment want to provide their own proxy settings instead of using the proxy environment variables. For those cases WebKitGTK+ 2.16 includes a new UI process API to configure all proxy settings available in GProxyResolver API.

Private browsing

WebKitGTK+ has always had a WebKitSetting to enable or disable the private browsing mode, but it has never worked really well. For that reason, applications like Epiphany has always implemented their own private browsing mode just by using a different profile directory in tmp to write all persistent data. This approach has several issues, for example if the UI process crashes, the profile directory is leaked in tmp with all the personal data there. WebKitGTK+ 2.16 adds a new API that allows to create ephemeral web views which never write any persistent data to disk. It’s possible to create ephemeral web views individually, or create ephemeral web contexts where all web views associated to it will be ephemeral automatically.

Website data

WebKitWebsiteDataManager was added in 2.10 to configure the default paths on which website data should be stored for a web context. In WebKitGTK+ 2.16 the API has been expanded to include methods to retrieve and remove the website data stored on the client side. Not only persistent data like HTTP disk cache, cookies or databases, but also non-persistent data like the memory cache and session cookies. This API is already used by Epiphany to implement the new personal data dialog.

Dynamically added forms

Web browsers normally implement the remember passwords functionality by searching in the DOM tree for authentication form fields when the document loaded signal is emitted. However, some websites add the authentication form fields dynamically after the document has been loaded. In those cases web browsers couldn’t find any form fields to autocomplete. In WebKitGTk+ 2.16 the web extensions API includes a new signal to notify when new forms are added to the DOM. Applications can connect to it, instead of document-loaded to start searching for authentication form fields.

Custom print settings

The GTK+ print dialog allows the user to add a new tab embedding a custom widget, so that applications can include their own print settings UI. Evolution used to do this, but the functionality was lost with the switch to WebKit2. In WebKitGTK+ 2.16 a similar API to the GTK+ one has been added to recover that functionality in evolution.

Notification improvements

Applications can now set the initial notification permissions on the web context to avoid having to ask the user everytime. It’s also possible to get the tag identifier of a WebKitNotification.

Debugging tools

Two new debugged tools are now available in WebKitGTk+ 2.16. The memory sampler and the resource usage overlay.

Memory sampler

This tool allows to monitor the memory consumption of the WebKit processes. It can be enabled by defining the environment variable WEBKIT_SMAPLE_MEMORY. When enabled, the UI process and all web process will automatically take samples of memory usage every second. For every sample a detailed report of the memory used by the process is generated and written to a file in the temp directory.

$ WEBKIT_SAMPLE_MEMORY=1 MiniBrowser 
Started memory sampler for process MiniBrowser 32499; Sampler log file stored at: /tmp/MiniBrowser7ff2246e-406e-4798-bc83-6e525987aace
Started memory sampler for process WebKitWebProces 32512; Sampler log file stored at: /tmp/WebKitWebProces93a10a0f-84bb-4e3c-b257-44528eb8f036

The files contain a list of sample reports like this one:

Timestamp                          1490004807
Total Program Bytes                1960214528
Resident Set Bytes                 84127744
Resident Shared Bytes              68661248
Text Bytes                         4096
Library Bytes                      0
Data + Stack Bytes                 87068672
Dirty Bytes                        0
Fast Malloc In Use                 86466560
Fast Malloc Committed Memory       86466560
JavaScript Heap In Use             0
JavaScript Heap Committed Memory   49152
JavaScript Stack Bytes             2472
JavaScript JIT Bytes               8192
Total Memory In Use                86477224
Total Committed Memory             86526376
System Total Bytes                 16729788416
Available Bytes                    5788946432
Shared Bytes                       1037447168
Buffer Bytes                       844214272
Total Swap Bytes                   1996484608
Available Swap Bytes               1991532544

Resource usage overlay

The resource usage overlay is only available in Linux systems when WebKitGTK+ is built with ENABLE_DEVELOPER_MODE. It allows to show an overlay with information about resources currently in use by the web process like CPU usage, total memory consumption, JavaScript memory and JavaScript garbage collector timers information. The overlay can be shown/hidden by pressing CTRL+Shit+G.

We plan to add more information to the overlay in the future like memory cache status.

Accelerated compositing in WebKitGTK+ 2.14.4

WebKitGTK+ 2.14 release was very exciting for us, it finally introduced the threaded compositor to drastically improve the accelerated compositing performance. However, the threaded compositor imposed the accelerated compositing to be always enabled, even for non-accelerated contents. Unfortunately, this caused different kind of problems to several people, and proved that we are not ready to render everything with OpenGL yet. The most relevant problems reported were:

  • Memory usage increase: OpenGL contexts use a lot of memory, and we have the compositor in the web process, so we have at least one OpenGL context in every web process. The threaded compositor uses the coordinated graphics model, that also requires more memory than the simple mode we previously use. People who use a lot of tabs in epiphany quickly noticed that the amount of memory required was a lot more.
  • Startup and resize slowness: The threaded compositor makes everything smooth and performs quite well, except at startup or when the view is resized. At startup we need to create the OpenGL context, which is also quite slow by itself, but also need to create the compositing thread, so things are expected to be slower. Resizing the viewport is the only threaded compositor task that needs to be done synchronously, to ensure that everything is in sync, the web view in the UI process, the OpenGL viewport and the backing store surface. This means we need to wait until the threaded compositor has updated to the new size.
  • Rendering issues: some people reported rendering artifacts or even nothing rendered at all. In most of the cases they were not issues in WebKit itself, but in the graphic driver or library. It’s quite diffilcult for a general purpose web engine to support and deal with all possible GPUs, drivers and libraries. Chromium has a huge list of hardware exceptions to disable some OpenGL extensions or even hardware acceleration entirely.

Because of these issues people started to use different workarounds. Some people, and even applications like evolution, started to use WEBKIT_DISABLE_COMPOSITING_MODE environment variable, that was never meant for users, but for developers. Other people just started to build their own WebKitGTK+ with the threaded compositor disabled. We didn’t remove the build option because we anticipated some people using old hardware might have problems. However, it’s a code path that is not tested at all and will be removed for sure for 2.18.

All these issues are not really specific to the threaded compositor, but to the fact that it forced the accelerated compositing mode to be always enabled, using OpenGL unconditionally. It looked like a good idea, entering/leaving accelerated compositing mode was a source of bugs in the past, and all other WebKit ports have accelerated compositing mode forced too. Other ports use UI side compositing though, or target a very specific hardware, so the memory problems and the driver issues are not a problem for them. The imposition to force the accelerated compositing mode came from the switch to using coordinated graphics, because as I said other ports using coordinated graphics have accelerated compositing mode always enabled, so they didn’t care about the case of it being disabled.

There are a lot of long-term things we can to to improve all the issues, like moving the compositor to the UI (or a dedicated GPU) process to have a single GL context, implement tab suspension, etc. but we really wanted to fix or at least improve the situation for 2.14 users. Switching back to use accelerated compositing mode on demand is something that we could do in the stable branch and it would improve the things, at least comparable to what we had before 2.14, but with the threaded compositor. Making it happen was a matter of fixing a lot bugs, and the result is this 2.14.4 release. Of course, this will be the default in 2.16 too, where we have also added API to set a hardware acceleration policy.

We recommend all 2.14 users to upgrade to 2.14.4 and stop using the WEBKIT_DISABLE_COMPOSITING_MODE environment variable or building with the threaded compositor disabled. The new API in 2.16 will allow to set a policy for every web view, so if you still need to disable or force hardware acceleration, please use the API instead of WEBKIT_DISABLE_COMPOSITING_MODE and WEBKIT_FORCE_COMPOSITING_MODE.

We really hope this new release and the upcoming 2.16 will work much better for everybody.

WebKitGTK+ 2.12

We did it again, the Igalia WebKit team is pleased to announce a new stable release of WebKitGTK+, with a bunch of bugs fixed, some new API bits and many other improvements. I’m going to talk here about some of the most important changes, but as usual you have more information in the NEWS file.

FTL

FTL JIT is a JavaScriptCore optimizing compiler that was developed using LLVM to do low-level optimizations. It’s been used by the Mac port since 2014 but we hadn’t been able to use it because it required some patches for LLVM to work on x86-64 that were not included in any official LLVM release, and there were also some crashes that only happened in Linux. At the beginning of this release cycle we already had LLVM 3.7 with all the required patches and the crashes had been fixed as well, so we finally enabled FTL for the GTK+ port. But in the middle of the release cycle Apple surprised us announcing that they had the new FTL B3 backend ready. B3 replaces LLVM and it’s entirely developed inside WebKit, so it doesn’t require any external dependency. JavaScriptCore developers quickly managed to make B3 work on Linux based ports and we decided to switch to B3 as soon as possible to avoid making a new release with LLVM to remove it in the next one. I’m not going to enter into the technical details of FTL and B3, because they are very well documented and it’s probably too boring for most of the people, the key point is that it improves the overall JavaScript performance in terms of speed.

Persistent GLib main loop sources

Another performance improvement introduced in WebKitGTK+ 2.12 has to do with main loop sources. WebKitGTK+ makes an extensive use the GLib main loop, it has its own RunLoop abstraction on top of GLib main loop that is used by all secondary processes and most of the secondary threads as well, scheduling main loop sources to send tasks between threads. JavaScript timers, animations, multimedia, the garbage collector, and many other features are based on scheduling main loop sources. In most of the cases we are actually scheduling the same callback all the time, but creating and destroying the GSource each time. We realized that creating and destroying main loop sources caused an overhead with an important impact in the performance. In WebKitGTK+ 2.12 all main loop sources were replaced by persistent sources, which are normal GSources that are never destroyed (unless they are not going to be scheduled anymore). We simply use the GSource ready time to make them active/inactive when we want to schedule/stop them.

Overlay scrollbars

GNOME designers have requested us to implement overlay scrollbars since they were introduced in GTK+, because WebKitGTK+ based applications didn’t look consistent with all other GTK+ applications. Since WebKit2, the web view is no longer a GtkScrollable, but it’s scrollable by itself using native scrollbars appearance or the one defined in the CSS. This means we have our own scrollbars implementation that we try to render as close as possible to the native ones, and that’s why it took us so long to find the time to implement overlay scrollbars. But WebKitGTK+ 2.12 finally implements them and are, of course, enabled by default. There’s no API to disable them, but we honor the GTK_OVERLAY_SCROLLING environment variable, so they can be disabled at runtime.

But the appearance was not the only thing that made our scrollbars inconsistent with the rest of the GTK+ applications, we also had a different behavior regarding the actions performed for mouse buttons, and some other bugs that are all fixed in 2.12.

The NetworkProcess is now mandatory

The network process was introduced in WebKitGTK+ since version 2.4 to be able to use multiple web processes. We had two different paths for loading resources depending on the process model being used. When using the shared secondary process model, resources were loaded by the web process directly, while when using the multiple web process model, the web processes sent the requests to the network process for being loaded. The maintenance of this two different paths was not easy, with some bugs happening only when using one model or the other, and also the network process gained features like the disk cache that were not available in the web process. In WebKitGTK+ 2.12 the non network process path has been removed, and the shared single process model has become the multiple web process model with a limit of 1. In practice it means that a single web process is still used, but the network happens in the network process.

NPAPI plugins in Wayland

I read it in many bug reports and mailing lists that NPAPI plugins will not be supported in wayland, so things like http://extensions.gnome.org will not work. That’s not entirely true. NPAPI plugins can be windowed or windowless. Windowed plugins are those that use their own native window for rendering and handling events, implemented in X11 based systems using XEmbed protocol. Since Wayland doesn’t support XEmbed and doesn’t provide an alternative either, it’s true that windowed plugins will not be supported in Wayland. Windowless plugins don’t require any native window, they use the browser window for rendering and events are handled by the browser as well, using X11 drawable and X events in X11 based systems. So, it’s also true that windowless plugins having a UI will not be supported by Wayland either. However, not all windowless plugins have a UI, and there’s nothing X11 specific in the rest of the NPAPI plugins API, so there’s no reason why those can’t work in Wayland. And that’s exactly the case of http://extensions.gnome.org, for example. In WebKitGTK+ 2.12 the X11 implementation of NPAPI plugins has been factored out, leaving the rest of the API implementation common and available to any window system used. That made it possible to support windowless NPAPI plugins with no UI in Wayland, and any other non X11 system, of course.

New API

And as usual we have completed our API with some new additions:

 

WebKitGTK+ 2.10

HTTP Disk Cache

WebKitGTK+ already had an HTTP disk cache implementation, simply using SoupCache, but Apple introduced a new cross-platform implementation to WebKit (just a few bits needed a platform specific implementation), so we decided to switch to it. This new cache has a lot of advantages over the SoupCache approach:

  • It’s fully integrated in the WebKit loading process, sharing some logic with the memory cache too.
  • It’s more efficient in terms of speed (the cache is in the NetworkProcess, but only the file descriptor is sent to the Web Process that mmaps the file) and disk usage (resource body and headers are stored in separate files in disk, using hard links for the body so that difference resources with the exactly same contents are only stored once).
  • It’s also more robust thanks to the lack of index. The synchronization between the index and the actual contents has always been a headache in SoupCache, with many resources leaked in disk, resources that are cache twice, etc.

The new disk cache is only used by the Network Process, so in case of using the shared secondary process model the SoupCache will still be used in the Web Process.

New inspector UI

The Web Inspector UI has been redesigned, you can see some of the differences in this screenshot:

web-inspector-after-before

For more details see this post in the Safari blog

IndexedDB

This was one the few regressions we still had compared to WebKit1. When we switched to WebKit2 we lost IndexedDB support, but It’s now back in 2.10. It uses its own new process, the DatabaseProcess, to perform all database operations.

Lock/Condition

WebKitGTK+ 2.8 improved the overall performance thanks to the use of the bmalloc memory allocator. In 2.10 the overall performance has also improved, this time thanks to a new implementation of the locking primitives. All uses of mutex/condition have been replaced by a new implementation. You can see more details in the email Filip sent to webkit-dev or in the so detailed commit messages.

Screen Saver inhibitor

It’s more and more common to use the web browser to watch large videos in fullscreen mode, and quite annoying when the screen saver decides to “save” your screen every x minutes during the whole video. WebKitGTK+ 2.10 uses the Freedesktop.org ScreenSaver DBus service to inhibit the screen saver while a video is playing in fullscreen mode.

Font matching for strong aliases

WebKit’s font matching algorithm has improved, and now allows replacing fonts with metric-compatible equivalents. For example, sites that specify Arial will now get Liberation Sans, rather than your system’s default sans font (usually DejaVu). This makes text appear better on many pages, since some fonts require more space than others. The new algorithm is based on code from Skia that we expect will be used by Chrome in the future.

Improve image quality when using newer versions of cairo/pixman

The poor downscaling quality of cairo/pixman is a well known issue that was finally fixed in Cairo 1.14, however we were not taking advantage of it in WebKit even when using a recent enough version of cairo. The reason is that we were using CAIRO_FILTER_BILINEAR filter that was not affected by the cairo changes. So, we just switched to use CAIRO_FILTER_GOOD, that will use the BILINEAR filter in previous versions of Cairo (keeping the backwards compatibility), and a box filter for downscaling in newer versions. This drastically improves the image quality of downscaled images with a minim impact in performance.

New API

Editor API

The lack of editor capabilities from the API point of view was blocking the migration to WebKit2 for some applications like Evolution. In 2.10 we have started to add the required API to ensure not only that the migration is possible for any application using a WebView in editable mode, but also that it will be more convenient to use.

So, for example, to monitor the state of the editor associated to a WebView, 2.10 provides a new class WebKitEditorState, that for now allows to monitor the typing attributestyping attributes. With WebKit1 you had to connect to the selection-changed signal and use the DOM bindings API to manually query the typing attributes. This is quite useful for updating the state of the editing buttons in the editor toolbar, for example. You just need to connect to WebKitEditorState::notify::typying-attributes and update the UI accordingly. For now typing attributes is the only thing you can monitor from the UI process API, but we will add more information when needed like the current cursor position, for example.

Having WebKitEditorState doesn’t mean we don’t need a selection-changed signal that we can monitor to query the DOM ourselves. But since in WebKit2 the DOM lives in the Web Process, the selection-changed signal has been added to the Web Extensions API. A new class WebKitWebEditor has been added, to represent the web editor associated to a WebKitWebPage, and can be obtained with webkit_web_page_get_editor(). And is this new class the one providing the selection-changed signal. So, you can connect to the signal and use the DOM API the same way it was done in WebKit1.

Some of the editor commands require an argument, like for example, the command to insert an image requires the image source URL. But both the WebKit1 and WebKit2 APIs only provided methods to run editor commands without any argument. This means that, once again, to implement something like insert-image or insert link, you had to use the DOM bindings to create and insert the new elements in the correct place. WebKitGTK+ 2.10 provides webkit_web_view_execute_editing_command_with_argument() to make this a lot more convenient.

You can test all this features using the new editor mode of MiniBrowser, simply run it with -e command line option and no arguments.

mini-browser-editor

Website data

When browsing the web, websites are allowed to store data at the client side. It could be a cache, like the HTTP disk cache, or data required by web features like offline applications, local storage, IndexedDB, WebSQL, etc. All that data is currently stored in different directories and not all of those could be configured by the user. The new WebKitWebsiteDataManager class in 2.10 allows you to configure all those directories, either using a common base cache/data directory or providing a specific directory for every kind of data stored. It’s not mandatory to use it though, the default values are compatible with the ones previously used.

This gives the user more control over the browsing data stored in the client side, but in the future versions we plan to add support for actually handling the data, so that you will be able to query and delete the data stored by a particular security domain.

Web Processes limit

WebKitGTK+ currently supports two process models, the single shared secondary process and the multiple secondary processes. When using the latter, a new web process is created for every new web view. When there are a lot of web views created at the same time, the resources required to create all those processes could be too much in some systems. To improve that a bit 2.10 adds webkit_web_context_set_web_process_count_limit(), to set the maximum number of web process that can be created a the same time.

This new API can also be used to implement a slightly different version of the shared single process model. By using the multiple secondary process model with a limit of 1 web process, you still have a single shared web process, but using the multi-process mechanism, which means the network will happen in the Network Process, among other things. So, if you use the shared secondary process model in your application, unless your application only loads local resources, we recommend you to switch to multiple process model and use the limit to benefit from all the Network Process feature like the new disk cache, for example. Epiphany already does this for the secondary process model and web apps.

Missing media plugins installation permission request

When you try to play media, and the media backend doesn’t find the plugins/codecs required to play it, the missing plugin installation mechanism starts the package installer to allow the user to find and install the required plugins/codecs. This used to happen in the Web Process and without any way for the user to avoid it. WebKitGTK+ 2.10 provides a new WebKitPermissionRequest implementation that allows the user to block the request and prevent the installer from being invoked.

WebKitGTK+ 2.8.0

We are excited and proud of announcing WebKitGTK+ 2.8.0, your favorite web rendering engine, now faster, even more stable and with a bunch of new features and improvements.

Gestures

Touch support is one the most important features missing since WebKitGTK+ 2.0.0. Thanks to the GTK+ gestures API, it’s now more pleasant to use a WebKitWebView in a touch screen. For now only the basic gestures are implemented: pan (for scrolling by dragging from any point of the WebView), tap (handling clicks with the finger) and zoom (for zooming in/out with two fingers). We plan to add more touch enhancements like kinetic scrolling, overshot feedback animation, text selections, long press, etc. in future versions.

HTML5 Notifications

notifications

Notifications are transparently supported by WebKitGTK+ now, using libnotify by default. The default implementation can be overridden by applications to use their own notifications system, or simply to disable notifications.

WebView background color

There’s new API now to set the base background color of a WebKitWebView. The given color is used to fill the web view before the actual contents are rendered. This will not have any visible effect if the web page contents set a background color, of course. If the web view parent window has a RGBA visual, we can even have transparent colors.

webkitgtk-2.8-bgcolor

A new WebKitSnapshotOptions flag has also been added to be able to take web view snapshots over a transparent surface, instead of filling the surface with the default background color (opaque white).

User script messages

The communication between the UI process and the Web Extensions is something that we have always left to the users, so that everybody can use their own IPC mechanism. Epiphany and most of the apps use D-Bus for this, and it works perfectly. However, D-Bus is often too much for simple cases where there are only a few  messages sent from the Web Extension to the UI process. User script messages make these cases a lot easier to implement and can be used from JavaScript code or using the GObject DOM bindings.

Let’s see how it works with a very simple example:

In the UI process, we register a script message handler using the WebKitUserContentManager and connect to the “script-message-received-signal” for the given handler:

webkit_user_content_manager_register_script_message_handler (user_content, 
                                                             "foo");
g_signal_connect (user_content, "script-message-received::foo",
                  G_CALLBACK (foo_message_received_cb), NULL);

Script messages are received in the UI process as a WebKitJavascriptResult:

static void
foo_message_received_cb (WebKitUserContentManager *manager,
                         WebKitJavascriptResult *message,
                         gpointer user_data)
{
        char *message_str;

        message_str = get_js_result_as_string (message);
        g_print ("Script message received for handler foo: %s\n", message_str);
        g_free (message_str);
}

Sending a message from the web process to the UI process using JavaScript is very easy:

window.webkit.messageHandlers.foo.postMessage("bar");

That will send the message “bar” to the registered foo script message handler. It’s not limited to strings, we can pass any JavaScript value to postMessage() that can be serialized. There’s also a convenient API to send script messages in the GObject DOM bindings API:

webkit_dom_dom_window_webkit_message_handlers_post_message (dom_window, 
                                                            "foo", "bar");

 

Who is playing audio?

WebKitWebView has now a boolean read-only property is-playing-adio that is set to TRUE when the web view is playing audio (even if it’s a video) and to FALSE when the audio is stopped. Browsers can use this to provide visual feedback about which tab is playing audio, Epiphany already does that 🙂

ephy-is-playing-audio

HTML5 color input

Color input element is now supported by default, so instead of rendering a text field to manually input the color  as hexadecimal color code, WebKit now renders a color button that when clicked shows a GTK color chooser dialog. As usual, the public API allows to override the default implementation, to use your own color chooser. MiniBrowser uses a popover, for example.

mb-color-input-popover

APNG

APNG (Animated PNG) is a PNG extension that allows to create animated PNGs, similar to GIF but much better, supporting 24 bit images and transparencies. Since 2.8 WebKitGTK+ can render APNG files. You can check how it works with the mozilla demos.

webkitgtk-2.8-apng

SSL

The POODLE vulnerability fix introduced compatibility problems with some websites when establishing the SSL connection. Those problems were actually server side issues, that were incorrectly banning SSL 3.0 record packet versions, but that could be worked around in WebKitGTK+.

WebKitGTK+ already provided a WebKitWebView signal to notify about TLS errors when loading, but only for the connection of the main resource in the main frame. However, it’s still possible that subresources fail due to TLS errors, when using a connection different to the main resource one. WebKitGTK+ 2.8 gained WebKitWebResource::failed-with-tls-errors signal to be notified when a subresource load failed because of invalid certificate.

Ciphersuites based on RC4 are now disallowed when performing TLS negotiation, because it is no longer considered secure.

Performance: bmalloc and concurrent JIT

bmalloc is a new memory allocator added to WebKit to replace TCMalloc. Apple had already used it in the Mac and iOS ports for some time with very good results, but it needed some tweaks to work on Linux. WebKitGTK+ 2.8 now also uses bmalloc which drastically improved the overall performance.

Concurrent JIT was not enabled in GTK (and EFL) port for no apparent reason. Enabling it had also an amazing impact in the performance.

Both performance improvements were very noticeable in the performance bot:

webkitgtk-2.8-perf

 

The first jump on 11th Feb corresponds to the bmalloc switch, while the other jump on 25th Feb is when concurrent JIT was enabled.

Plans for 2.10

WebKitGTK+ 2.8 is an awesome release, but the plans for 2.10 are quite promising.

  • More security: mixed content for most of the resources types will be blocked by default. New API will be provided for managing mixed content.
  • Sandboxing: seccomp filters will be used in the different secondary processes.
  • More performance: FTL will be enabled in JavaScriptCore by default.
  • Even more performance: this time in the graphics side, by using the threaded compositor.
  • Blocking plugins API: new API to provide full control over the plugins load process, allowing to block/unblock plugins individually.
  • Implementation of the Database process: to bring back IndexedDB support.
  • Editing API: full editing API to allow using a WebView in editable mode with all editing capabilities.