Epiphany automation mode

Last week I finally found some time to add the automation mode to Epiphany, which allows running automated tests using WebDriver. It’s important to note that the automation mode is not expected to be used by users or applications to control the browser remotely, but only by WebDriver automated tests. For that reason, the automation mode is incompatible with a primary user profile. There are a few other things affected by the automation mode:

  • There’s no persistence. A private profile is created in tmp and only ephemeral web contexts are used.
  • URL entry is not editable, since users are not expected to interact with the browser.
  • An info bar is shown to notify the user that the browser is being controlled by automation.
  • The window decoration is orange to make it even clearer that the browser is running in automation mode.

So, how can you write tests to be run in Epiphany? First, you need to install a recent enough version of Selenium. For now, only the Python API is supported. Selenium doesn’t have an Epiphany driver, but the WebKitGTK driver can be used with any WebKitGTK+ based browser by providing the browser information as part of the session capabilities.

from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
options.set_capability('browserName', 'Epiphany')
options.set_capability('version', '3.31.4')

ephy = webdriver.WebKitGTK(options=options, desired_capabilities={})
ephy.get('http://www.webkitgtk.org')
ephy.quit()

This is a very simple example that just opens Epiphany in automation mode, loads http://www.webkitgtk.org and closes Epiphany. A few comments about the example:

  • Version 3.31.4 will be the first one including the automation mode.
  • The parameter desired_capabilities shouldn’t be needed, but there’s a bug in selenium that has been fixed very recently.
  • WebKitGTKOptions.set_capability was added in selenium 3.14. If you have an older version, you can use the following snippet instead:

from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
capabilities = options.to_capabilities()
capabilities['browserName'] = 'Epiphany'
capabilities['version'] = '3.31.4'

ephy = webdriver.WebKitGTK(desired_capabilities=capabilities)
ephy.get('http://www.webkitgtk.org')
ephy.quit()

To simplify the driver instantiation you can create your own Epiphany driver derived from the WebKitGTK one:

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        options.set_capability('browserName', 'Epiphany')
        options.set_capability('version', '3.31.4')

        webdriver.WebKitGTK.__init__(self, options=options, desired_capabilities={})

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

The same for selenium < 3.14:

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        capabilities = options.to_capabilities()
        capabilities['browserName'] = 'Epiphany'
        capabilities['version'] = '3.31.4'

        webdriver.WebKitGTK.__init__(self, desired_capabilities=capabilities)

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()
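
The examples above call quit() as their last statement, so an exception raised by an earlier step would leave the browser running. A minimal sketch of one way to guard against that follows. The helper name automation_session() is hypothetical (it is not part of selenium) and it assumes only that the driver object has a quit() method, as the Epiphany class above does.

```python
from contextlib import contextmanager


@contextmanager
def automation_session(driver_factory):
    """Yield a driver and always quit it, even if the test body raises.

    driver_factory is any zero-argument callable that returns an object
    with a quit() method, e.g. the Epiphany class defined above.
    This helper is hypothetical; it is not provided by selenium.
    """
    driver = driver_factory()
    try:
        yield driver
    finally:
        # Closing the session also closes the browser window.
        driver.quit()


# Usage, with the Epiphany class from above:
#
#     with automation_session(Epiphany) as ephy:
#         ephy.get('http://www.webkitgtk.org')
```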

WebDriver support in WebKitGTK+ 2.18

WebDriver is an automation API to control a web browser. It allows creating automated tests for web applications independently of the browser and platform. WebKitGTK+ 2.18, which will be released next week, includes an initial implementation of the WebDriver specification.

WebDriver in WebKitGTK+

There’s a new process (WebKitWebDriver) that works as the server, processing the clients’ requests to spawn and control the web browser. The WebKitGTK+ driver is not tied to any specific browser: it can be used with any WebKitGTK+ based browser, but it uses MiniBrowser by default. The driver uses the same remote control protocol as the remote inspector to communicate with and control the web browser instance. The implementation is not complete yet, but it’s enough for what many users need.

The clients

The web application tests are the clients of the WebDriver server. The Selenium project provides APIs for different languages (Java, Python, Ruby, etc.) to write the tests. Python is the only language supported by WebKitGTK+ for now. The WebKitGTK+ driver is not yet upstream in Selenium, but we hope it will be integrated soon. In the meantime you can use our fork on GitHub. Let’s see an example to understand how it works and what we can do.

from selenium import webdriver

# Create a WebKitGTK driver instance. It spawns WebKitWebDriver 
# process automatically that will launch MiniBrowser.
wkgtk = webdriver.WebKitGTK()

# Let's load the WebKitGTK+ website.
wkgtk.get("https://www.webkitgtk.org")

# Find the GNOME link.
gnome = wkgtk.find_element_by_partial_link_text("GNOME")

# Click on the link. 
gnome.click()

# Find the search form. 
search = wkgtk.find_element_by_id("searchform")

# Find the first input element in the search form.
text_field = search.find_element_by_tag_name("input")

# Type epiphany in the search field and submit.
text_field.send_keys("epiphany")
text_field.submit()

# Let's count the links in the contents div to check we got results.
contents = wkgtk.find_element_by_class_name("content")
links = contents.find_elements_by_tag_name("a")
assert len(links) > 0

# Quit the driver. The session is closed so MiniBrowser 
# will be closed and then WebKitWebDriver process finishes.
wkgtk.quit()

Note that this is just an example to show how to write a test and what kind of things you can do. There are better ways to achieve the same results, and the example depends on the current source of public websites, so it might not work in the future.

Web browsers / applications

As I said before, the WebKitWebDriver process supports any WebKitGTK+ based browser, but that doesn’t mean all browsers can automatically be controlled by automation (that would be scary). WebKitGTK+ 2.18 also provides new API for applications to support automation.

  • First of all, the application has to explicitly enable automation using webkit_web_context_set_automation_allowed(). It’s important to know that the WebKitGTK+ API doesn’t allow enabling automation in several WebKitWebContexts at the same time. The driver will spawn the application when a new session is requested, so the application should enable automation at startup. It’s recommended that applications add a new command line option to enable automation, and only enable it when that option is provided.
  • After launching the application the driver will request the browser to create a new automation session. The signal “automation-started” will be emitted in the context to notify the application that a new session has been created. If automation is not allowed in the context, the session won’t be created and the signal won’t be emitted either.
  • A WebKitAutomationSession object is passed as a parameter to the “automation-started” signal. The application can use it to provide information about itself (name and version) to the driver, which matches that information against what the client requires and accepts or rejects the session request accordingly.
  • The WebKitAutomationSession will emit the signal “create-web-view” every time the driver needs to create a new web view. The application can then create a new window or tab containing the new web view, which should be returned by the signal handler. This signal is always emitted, even if the browser already has an initial web view open; in that case it’s recommended to return the existing empty web view.
  • Web views are also automation aware. As with ephemeral web views, web views that allow automation should be created with the constructor property “is-controlled-by-automation” enabled.
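
The steps above can be sketched roughly as follows. This is only an illustration: it uses the Python bindings (PyGObject) rather than the C functions named in the list, and the names my-browser, parse_args(), setup_automation() and create_view are hypothetical. The create_view callback is assumed to return a WebKit2.WebView constructed with is_controlled_by_automation=True.

```python
import argparse


def parse_args(argv):
    """Parse the command line; automation stays off unless requested."""
    parser = argparse.ArgumentParser(prog="my-browser")  # hypothetical name
    parser.add_argument("--automation-mode", action="store_true",
                        help="allow a WebDriver session to control this instance")
    return parser.parse_args(argv)


def setup_automation(context, create_view, app_name, version):
    """Allow automation in a WebKit2.WebContext and connect the signals.

    create_view is an application callback (an assumption of this sketch)
    that returns a WebKit2.WebView created with
    is_controlled_by_automation=True. version is (major, minor, micro).
    """
    # Imported here so parse_args() can be used without PyGObject installed.
    import gi
    gi.require_version("WebKit2", "4.0")
    from gi.repository import WebKit2

    def on_automation_started(_context, session):
        # Tell the driver who we are so it can match the client's request.
        info = WebKit2.ApplicationInfo.new()
        info.set_name(app_name)
        info.set_version(*version)
        session.set_application_info(info)
        # Emitted every time the driver needs a web view.
        session.connect("create-web-view", lambda _session: create_view())

    context.set_automation_allowed(True)
    context.connect("automation-started", on_automation_started)
```

An application would call setup_automation() at startup only when parse_args() reports that --automation-mode was given, and leave automation disabled otherwise.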

This is the new API that applications need to implement to support WebDriver. It’s designed to be as safe as possible, but there are many things that can’t be controlled by WebKitGTK+, so we have several recommendations for applications that want to support automation:

  • Add a way to enable automation in your application at startup, like a command line option, that is disabled by default. Never allow automation in a normal application instance.
  • Enabling automation is not the only thing the application should do, so add an automation mode to your application.
  • Add visual feedback when in automation mode, like changing the theme or the window title, or anything else that makes it clear that a window or instance of the application is controllable by automation.
  • Add a message to explain that the window is being controlled by automation and the user is not expected to use it.
  • Use ephemeral web views in automation mode.
  • Use a temporary user profile in automation mode. Do not allow automation to change the history, bookmarks, etc. of an existing user.
  • Do not load any homepage in automation mode, just keep an empty web view (about:blank) that can be used when a new web view is requested by automation.

The WebKitGTK client driver

Applications need to implement the new automation API to support WebDriver, but the WebKitWebDriver process doesn’t know how to launch the browsers. That information should be provided by the client using the WebKitGTKOptions object. The driver constructor can receive an instance of a WebKitGTKOptions object, with the browser information and other options. Let’s see how it works with an example to launch epiphany:

from selenium import webdriver
from selenium.webdriver import WebKitGTKOptions

options = WebKitGTKOptions()
options.browser_executable_path = "/usr/bin/epiphany"
options.add_browser_argument("--automation-mode")
epiphany = webdriver.WebKitGTK(browser_options=options)

Again, this is just an example; Epiphany doesn’t even support WebDriver yet. Browsers or applications could create their own drivers on top of the WebKitGTK one to make it more convenient to use.

from selenium import webdriver
epiphany = webdriver.Epiphany()

Plans

During the next release cycle, we plan to do the following tasks:

  • Complete the implementation: add support for all commands in the spec and complete the ones that are partially supported now.
  • Add support for running the WPT WebDriver tests in the WebKit bots.
  • Add a WebKitGTK driver implementation for other languages in Selenium.
  • Add support for automation in Epiphany.
  • Add WebDriver support to WPE/dyz.

WebKitGTK+ 2.10

HTTP Disk Cache

WebKitGTK+ already had an HTTP disk cache implementation, simply using SoupCache, but Apple introduced a new cross-platform implementation to WebKit (just a few bits needed a platform specific implementation), so we decided to switch to it. This new cache has a lot of advantages over the SoupCache approach:

  • It’s fully integrated in the WebKit loading process, sharing some logic with the memory cache too.
  • It’s more efficient in terms of speed (the cache is in the Network Process, but only the file descriptor is sent to the Web Process, which mmaps the file) and disk usage (resource body and headers are stored in separate files on disk, using hard links for the body so that different resources with exactly the same contents are only stored once).
  • It’s also more robust thanks to the lack of an index. The synchronization between the index and the actual contents has always been a headache in SoupCache, with many resources leaked on disk, resources that are cached twice, etc.

The new disk cache is only used by the Network Process, so when the shared secondary process model is used, SoupCache will still be used in the Web Process.
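
To make the hard-link and mmap ideas above more concrete, here is a toy sketch. It is not WebKit’s code: the on-disk layout, the function names and the use of SHA-256 as the content key are all assumptions made for illustration.

```python
import hashlib
import mmap
import os


def store_body(cache_dir, url, body):
    """Store a resource body; identical bodies share a single file on disk."""
    blobs = os.path.join(cache_dir, "blobs")
    os.makedirs(blobs, exist_ok=True)

    # One blob per distinct content, named after a hash of the bytes.
    blob_path = os.path.join(blobs, hashlib.sha256(body).hexdigest())
    if not os.path.exists(blob_path):
        with open(blob_path, "wb") as f:
            f.write(body)

    # One directory entry per resource, hard-linked to the shared blob.
    record_name = hashlib.sha256(url.encode("utf-8")).hexdigest() + "-body"
    record_path = os.path.join(cache_dir, record_name)
    if os.path.exists(record_path):
        os.remove(record_path)
    os.link(blob_path, record_path)
    return record_path


def map_body(path):
    """Map a stored body into memory read-only instead of copying it."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Storing two URLs with the same bytes yields two paths that refer to the same inode, so the contents occupy disk space only once.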

New inspector UI

The Web Inspector UI has been redesigned. You can see some of the differences in this screenshot:

web-inspector-after-before

For more details, see this post in the Safari blog.

IndexedDB

This was one of the few regressions we still had compared to WebKit1. When we switched to WebKit2 we lost IndexedDB support, but it’s now back in 2.10. It uses its own new process, the DatabaseProcess, to perform all database operations.

Lock/Condition

WebKitGTK+ 2.8 improved the overall performance thanks to the use of the bmalloc memory allocator. In 2.10 the overall performance has also improved, this time thanks to a new implementation of the locking primitives. All uses of mutex/condition have been replaced by the new implementation. You can see more details in the email Filip sent to webkit-dev or in the very detailed commit messages.

Screen Saver inhibitor

It’s more and more common to use the web browser to watch large videos in fullscreen mode, and quite annoying when the screen saver decides to “save” your screen every x minutes during the whole video. WebKitGTK+ 2.10 uses the Freedesktop.org ScreenSaver DBus service to inhibit the screen saver while a video is playing in fullscreen mode.

Font matching for strong aliases

WebKit’s font matching algorithm has improved, and now allows replacing fonts with metric-compatible equivalents. For example, sites that specify Arial will now get Liberation Sans, rather than your system’s default sans font (usually DejaVu). This makes text appear better on many pages, since some fonts require more space than others. The new algorithm is based on code from Skia that we expect will be used by Chrome in the future.

Improve image quality when using newer versions of cairo/pixman

The poor downscaling quality of cairo/pixman is a well known issue that was finally fixed in Cairo 1.14. However, we were not taking advantage of it in WebKit, even when using a recent enough version of cairo. The reason is that we were using the CAIRO_FILTER_BILINEAR filter, which was not affected by the cairo changes. So, we just switched to CAIRO_FILTER_GOOD, which will use the BILINEAR filter in previous versions of Cairo (keeping backwards compatibility), and a box filter for downscaling in newer versions. This drastically improves the quality of downscaled images with a minimal impact on performance.
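
A toy one-dimensional illustration of why a box filter helps when downscaling. This is not cairo’s or pixman’s code, and both function names are made up for the sketch. Bilinear sampling only looks at the two source pixels nearest to each destination pixel centre, so details that fall in between are lost, while a box filter averages every source pixel covered by the destination pixel.

```python
def downscale_box(row, factor):
    """Average each group of `factor` source pixels (box filter)."""
    return [sum(row[i:i + factor]) / factor
            for i in range(0, len(row) - factor + 1, factor)]


def downscale_bilinear(row, factor):
    """Sample at each destination pixel centre, blending two neighbours."""
    out = []
    for j in range(len(row) // factor):
        # Centre of destination pixel j, expressed in source coordinates.
        x = (j + 0.5) * factor - 0.5
        left = int(x)
        right = min(left + 1, len(row) - 1)
        t = x - left
        out.append(row[left] * (1 - t) + row[right] * t)
    return out


# One bright pixel in every group of four.
row = [0, 0, 0, 255, 0, 0, 0, 255]
print(downscale_box(row, 4))       # [63.75, 63.75]
print(downscale_bilinear(row, 4))  # [0.0, 0.0]
```

With a scale factor of 4 the bilinear version never touches the bright pixels, so they vanish from the result, whereas the box filter keeps their contribution.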

New API

Editor API

The lack of editor capabilities from the API point of view was blocking the migration to WebKit2 for some applications like Evolution. In 2.10 we have started to add the required API to ensure not only that the migration is possible for any application using a WebView in editable mode, but also that it will be more convenient to use.

So, for example, to monitor the state of the editor associated to a WebView, 2.10 provides a new class, WebKitEditorState, that for now allows monitoring the typing attributes. With WebKit1 you had to connect to the selection-changed signal and use the DOM bindings API to manually query the typing attributes. This is quite useful for updating the state of the editing buttons in the editor toolbar, for example. You just need to connect to WebKitEditorState::notify::typing-attributes and update the UI accordingly. For now typing attributes are the only thing you can monitor from the UI process API, but we will add more information when needed, such as the current cursor position.

Having WebKitEditorState doesn’t mean we don’t need a selection-changed signal that we can monitor to query the DOM ourselves. But since in WebKit2 the DOM lives in the Web Process, the selection-changed signal has been added to the Web Extensions API. A new class, WebKitWebEditor, has been added to represent the web editor associated to a WebKitWebPage; it can be obtained with webkit_web_page_get_editor(). This new class is the one that provides the selection-changed signal, so you can connect to the signal and use the DOM API the same way it was done in WebKit1.

Some of the editor commands require an argument, like for example, the command to insert an image requires the image source URL. But both the WebKit1 and WebKit2 APIs only provided methods to run editor commands without any argument. This means that, once again, to implement something like insert-image or insert link, you had to use the DOM bindings to create and insert the new elements in the correct place. WebKitGTK+ 2.10 provides webkit_web_view_execute_editing_command_with_argument() to make this a lot more convenient.
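
As a rough sketch of how this looks from the Python bindings, where the C function above becomes a method of the web view: the helper names below are hypothetical, and the command strings are what I understand WEBKIT_EDITING_COMMAND_INSERT_IMAGE and WEBKIT_EDITING_COMMAND_CREATE_LINK to expand to, so prefer the constants in real code.

```python
def insert_image(web_view, url):
    """Insert an image at the current position of an editable web view."""
    # Assumed value of WEBKIT_EDITING_COMMAND_INSERT_IMAGE.
    web_view.execute_editing_command_with_argument("InsertImage", url)


def create_link(web_view, url):
    """Turn the current selection into a link pointing to url."""
    # Assumed value of WEBKIT_EDITING_COMMAND_CREATE_LINK.
    web_view.execute_editing_command_with_argument("CreateLink", url)
```

Both helpers take the web view as a parameter, so they work with any object exposing the same method, which also makes them easy to exercise without a running browser.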

You can test all these features using the new editor mode of MiniBrowser: simply run it with the -e command line option and no arguments.

mini-browser-editor

Website data

When browsing the web, websites are allowed to store data at the client side. It could be a cache, like the HTTP disk cache, or data required by web features like offline applications, local storage, IndexedDB, WebSQL, etc. All that data is currently stored in different directories and not all of those could be configured by the user. The new WebKitWebsiteDataManager class in 2.10 allows you to configure all those directories, either using a common base cache/data directory or providing a specific directory for every kind of data stored. It’s not mandatory to use it though, the default values are compatible with the ones previously used.

This gives the user more control over the browsing data stored on the client side, and in future versions we plan to add support for actually handling the data, so that you will be able to query and delete the data stored by a particular security domain.
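
A minimal sketch of the common base directory case, using the Python bindings. The directory layout and both function names are assumptions made for the example; a real application would more likely take the base directories from GLib’s user data and cache directory helpers.

```python
import os


def website_data_directories(app_name, home=None):
    """Return the two base directories as WebsiteDataManager properties.

    The ~/.local/share and ~/.cache layout is an assumption of this sketch.
    """
    home = home or os.path.expanduser("~")
    return {
        "base_data_directory": os.path.join(home, ".local", "share", app_name),
        "base_cache_directory": os.path.join(home, ".cache", app_name),
    }


def new_web_context(app_name):
    """Create a WebContext whose website data lives under the app's own dirs."""
    # Imported here so the helper above can be used without PyGObject.
    import gi
    gi.require_version("WebKit2", "4.0")
    from gi.repository import WebKit2

    manager = WebKit2.WebsiteDataManager(**website_data_directories(app_name))
    return WebKit2.WebContext.new_with_website_data_manager(manager)
```

Directories for individual kinds of data can be set in the same way through the corresponding WebKitWebsiteDataManager properties.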

Web Processes limit

WebKitGTK+ currently supports two process models, the single shared secondary process and the multiple secondary processes. When using the latter, a new web process is created for every new web view. When there are a lot of web views created at the same time, the resources required to create all those processes could be too much in some systems. To improve that a bit, 2.10 adds webkit_web_context_set_web_process_count_limit(), to set the maximum number of web processes that can be created at the same time.

This new API can also be used to implement a slightly different version of the shared single process model. By using the multiple secondary process model with a limit of 1 web process, you still have a single shared web process, but using the multi-process mechanism, which means networking will happen in the Network Process, among other things. So, if you use the shared secondary process model in your application, unless your application only loads local resources, we recommend that you switch to the multiple process model and use the limit to benefit from all the Network Process features, like the new disk cache. Epiphany already does this for the secondary process model and web apps.
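
A small sketch of that recommendation with the Python bindings. The function name is hypothetical, and the process model value is passed in by the caller (WebKit2.ProcessModel.MULTIPLE_SECONDARY_PROCESSES in PyGObject) so that the sketch itself does not need to import the bindings.

```python
def use_single_shared_web_process(context, multiple_secondary_processes):
    """Keep one shared web process while still using the Network Process.

    context is a WebKit2.WebContext; multiple_secondary_processes is the
    multiple secondary processes value of the process model enumeration.
    """
    # Select the multi-process model first, then cap it at one web process.
    context.set_process_model(multiple_secondary_processes)
    context.set_web_process_count_limit(1)
```

This should be done right after creating the context, before any web view is created in it.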

Missing media plugins installation permission request

When you try to play media, and the media backend doesn’t find the plugins/codecs required to play it, the missing plugin installation mechanism starts the package installer to allow the user to find and install the required plugins/codecs. This used to happen in the Web Process and without any way for the user to avoid it. WebKitGTK+ 2.10 provides a new WebKitPermissionRequest implementation that allows the user to block the request and prevent the installer from being invoked.

WebKitGTK+ 2.8.0

We are excited and proud to announce WebKitGTK+ 2.8.0, your favorite web rendering engine, now faster, even more stable and with a bunch of new features and improvements.

Gestures

Touch support is one of the most important features missing since WebKitGTK+ 2.0.0. Thanks to the GTK+ gestures API, it’s now more pleasant to use a WebKitWebView on a touch screen. For now only the basic gestures are implemented: pan (for scrolling by dragging from any point of the WebView), tap (handling clicks with the finger) and zoom (for zooming in/out with two fingers). We plan to add more touch enhancements like kinetic scrolling, overshoot feedback animation, text selections, long press, etc. in future versions.

HTML5 Notifications

notifications

Notifications are transparently supported by WebKitGTK+ now, using libnotify by default. The default implementation can be overridden by applications to use their own notifications system, or simply to disable notifications.

WebView background color

There’s new API now to set the base background color of a WebKitWebView. The given color is used to fill the web view before the actual contents are rendered. This will not have any visible effect if the web page contents set a background color, of course. If the web view parent window has a RGBA visual, we can even have transparent colors.

webkitgtk-2.8-bgcolor

A new WebKitSnapshotOptions flag has also been added to be able to take web view snapshots over a transparent surface, instead of filling the surface with the default background color (opaque white).

User script messages

The communication between the UI process and the Web Extensions is something that we have always left to the users, so that everybody can use their own IPC mechanism. Epiphany and most of the apps use D-Bus for this, and it works perfectly. However, D-Bus is often too much for simple cases where there are only a few messages sent from the Web Extension to the UI process. User script messages make these cases a lot easier to implement and can be used from JavaScript code or using the GObject DOM bindings.

Let’s see how it works with a very simple example:

In the UI process, we register a script message handler using the WebKitUserContentManager and connect to the “script-message-received” signal for the given handler:

webkit_user_content_manager_register_script_message_handler (user_content, 
                                                             "foo");
g_signal_connect (user_content, "script-message-received::foo",
                  G_CALLBACK (foo_message_received_cb), NULL);

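Script messages are received in the UI process as a WebKitJavascriptResult (get_js_result_as_string() below stands for an application-provided helper that converts the result to a string; it is not shown here):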

static void
foo_message_received_cb (WebKitUserContentManager *manager,
                         WebKitJavascriptResult *message,
                         gpointer user_data)
{
        char *message_str;

        message_str = get_js_result_as_string (message);
        g_print ("Script message received for handler foo: %s\n", message_str);
        g_free (message_str);
}

Sending a message from the web process to the UI process using JavaScript is very easy:

window.webkit.messageHandlers.foo.postMessage("bar");

That will send the message “bar” to the registered foo script message handler. It’s not limited to strings, we can pass any JavaScript value to postMessage() that can be serialized. There’s also a convenient API to send script messages in the GObject DOM bindings API:

webkit_dom_dom_window_webkit_message_handlers_post_message (dom_window, 
                                                            "foo", "bar");

 

Who is playing audio?

WebKitWebView now has a boolean read-only property is-playing-audio that is set to TRUE when the web view is playing audio (even if it’s a video) and to FALSE when the audio is stopped. Browsers can use this to provide visual feedback about which tab is playing audio. Epiphany already does that 🙂

ephy-is-playing-audio
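
A minimal sketch of how a browser could follow that property from the Python bindings. The helper name watch_audio() and the on_change callback are hypothetical; the property notification and is_playing_audio() are the parts that come from WebKitGTK+.

```python
def watch_audio(web_view, on_change):
    """Call on_change(True/False) whenever the view starts or stops audio.

    on_change is an application callback, for example one that shows or
    hides a speaker icon in the tab that contains web_view.
    """
    def on_notify(view, _pspec):
        on_change(view.is_playing_audio())

    # GObject emits notify::<property> every time the property changes.
    return web_view.connect("notify::is-playing-audio", on_notify)
```

The return value is the signal handler id, which the application can keep in order to disconnect the handler when the tab is closed.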

HTML5 color input

The color input element is now supported by default, so instead of rendering a text field to manually input the color as a hexadecimal color code, WebKit now renders a color button that shows a GTK color chooser dialog when clicked. As usual, the public API allows overriding the default implementation to use your own color chooser. MiniBrowser uses a popover, for example.

mb-color-input-popover

APNG

APNG (Animated PNG) is a PNG extension that allows creating animated PNGs, similar to GIF but much better, supporting 24 bit images and transparencies. Since 2.8 WebKitGTK+ can render APNG files. You can check how it works with the Mozilla demos.

webkitgtk-2.8-apng

SSL

The POODLE vulnerability fix introduced compatibility problems with some websites when establishing the SSL connection. Those problems were actually server side issues, that were incorrectly banning SSL 3.0 record packet versions, but that could be worked around in WebKitGTK+.

WebKitGTK+ already provided a WebKitWebView signal to notify about TLS errors when loading, but only for the connection of the main resource in the main frame. However, it’s still possible that subresources fail due to TLS errors when they use a different connection from the main resource. WebKitGTK+ 2.8 gained the WebKitWebResource::failed-with-tls-errors signal, which notifies when a subresource load failed because of an invalid certificate.

Ciphersuites based on RC4 are now disallowed when performing TLS negotiation, because RC4 is no longer considered secure.

Performance: bmalloc and concurrent JIT

bmalloc is a new memory allocator added to WebKit to replace TCMalloc. Apple had already used it in the Mac and iOS ports for some time with very good results, but it needed some tweaks to work on Linux. WebKitGTK+ 2.8 now also uses bmalloc which drastically improved the overall performance.

Concurrent JIT was not enabled in the GTK (and EFL) ports for no apparent reason. Enabling it also had an amazing impact on performance.

Both performance improvements were very noticeable in the performance bot:

webkitgtk-2.8-perf

 

The first jump on 11th Feb corresponds to the bmalloc switch, while the other jump on 25th Feb is when concurrent JIT was enabled.

Plans for 2.10

WebKitGTK+ 2.8 is an awesome release, but the plans for 2.10 are quite promising.

  • More security: mixed content for most resource types will be blocked by default. New API will be provided for managing mixed content.
  • Sandboxing: seccomp filters will be used in the different secondary processes.
  • More performance: FTL will be enabled in JavaScriptCore by default.
  • Even more performance: this time in the graphics side, by using the threaded compositor.
  • Blocking plugins API: new API to provide full control over the plugins load process, allowing to block/unblock plugins individually.
  • Implementation of the Database process: to bring back IndexedDB support.
  • Editing API: full editing API to allow using a WebView in editable mode with all editing capabilities.

GTK+ 3 Plugins in WebKitGTK+ and Evince Browser Plugin

GTK+ 3 plugins in WebKitGTK+

The WebKit2 GTK+ API has always been GTK+ 3 only, but WebKitGTK+ still had a hard dependency on GTK+ 2 because of the plugin process. Some popular browser plugins like Flash or Java use GTK+ 2 unconditionally (and it seems they are not going to be ported to GTK+ 3, at least not in the short term). These plugins stopped working in Epiphany when it switched to GTK+ 3 and started to work again when Epiphany moved to WebKit2.

To support GTK+ 2 plugins we had to build the plugin process with GTK+ 2, but also some parts of WebCore and WebKit2 (the ones depending on GTK+ and used by the plugin process) were built twice. As a result we had a WebKitPluginProcess binary of ~40MB, that was always used for all the plugins. This kind of made sense, since there were no plugins using GTK+ 3, and the GTK+ 2 dependency was harmless for plugins not using GTK+ at all. However, we realized we were making a rule for the exception, since most of the plugins don’t even use GTK+, and there weren’t plugins using GTK+ 3 because they were not supported by any browser (kind of chicken-egg problem).

Since WebKitGTK+ 2.5.1 we have two binaries for the plugin process: WebKitPluginProcess2, which is exactly the same 40MB binary using GTK+ 2 that we have always had, but is now only used to load plugins using GTK+ 2; and WebKitPluginProcess, a 7.4K binary that is now used by default for everything except loading plugins that use GTK+ 2. And since it links to GTK+ 3, it might load plugins using GTK+ 3 as well. Another side effect is that we can now make GTK+ 2 optional: without it, WebKitPluginProcess2 isn’t built and only plugins using GTK+ 2 are unsupported.

Evince Browser Plugin

For a long time, we have maintained that PDF documents shouldn’t be opened inside the browser, but downloaded and then opened by the default document viewer. But then the GNOME design team came up with new mockups for Epiphany where everything was integrated in the browser, including PDF documents. It’s something all the major browsers do nowadays, using different approaches though (custom PDF plugin inside the web engine, JavaScript libraries, etc.).

At the WebKitGTK+ hackfest in 2012 we started to think about how to implement the integrated document reading in Epiphany based on the design mockups. We quickly discarded the idea of implementing it as an NPAPI plugin, because that would mean we had to use a very old evince version using GTK+ 2. We can’t implement it inside WebKit using libevince because it’s a GPL library, so the first approach was to implement it inside Epiphany using libevince. I wrote a first patch, mostly a proof of concept hack, that added a new view widget based on EvView to be used instead of a WebView when a document supported by evince was requested. This approach has a lot of limitations, since it only works when the main resource is a document, but not for documents embedded in an HTML page or an iframe, and a lot of integration problems that make it quite difficult to maintain inside Epiphany. All of these issues would be solved by implementing it as an NPAPI plugin, and it wouldn’t require any change in Epiphany. Now that WebKitGTK+ supports GTK+ 3 plugins, there’s no reason not to do so.

Epiphany Evince Plugin

Thanks to a project in Igalia I’ve been able to work on it, and today I’ve landed an initial implementation of the browser plugin in Evince git master. It’s only a first implementation (written in C++11) with the basic features (page navigation, view modes, zoom and printing), and a very simple UI that needs to be updated to match the mockups. It can be disabled at compile time like all other frontends inside Evince (thumbnailer, previewer, nautilus properties page).

Epiphany embedded PDF document Epiphany standalone PDF document

Another advantage of being an NPAPI plugin is that it’s scriptable, so you can control the viewer using JavaScript.

Epiphany scriptable PDF

And you can pass initial parameters (like current page, zoom level, view mode, etc.) from the HTML tag.

<object data="test.pdf" type="application/pdf" width="600" height="300" 
                currentPage="2" zoomMode="fit-page" continuous="false">
  The pdf could not be rendered.
</object>

You can even hide the default toolbar and build your own one using HTML and JavaScript.

WebKitGTK+ Hackfest 2013: The Network Process

As every year, many ideas came up during the WebKitGTK+ hackfest presentation, but this time there was one we all were very excited about: the multiple web processes support. Apple developers have already implemented support for multiple web processes in WebKit, which is mostly cross platform, but it requires the network process support to work properly (we need a common network process where cookies, HTTP cache, etc. are shared by all web processes in the same web context). Soup based WebKit ports don’t implement the network process yet, so the goal of the hackfest became to complete the network process implementation previously started by the EFL and Nix guys, as a first step to enable the multiple web processes support. Around 10 people were working on this goal during the whole hackfest, meeting from time to time to track the status of the tasks and assign new ones.

After all this awesome work we managed to have the basic support, with MiniBrowser perfectly rendering pages and allowing navigation using the network process. But as expected, there were some bugs and missing features, so I ran the WebKit2 unit tests and we took failing tests to investigate why they were failing and how to fix them.

So, we are actually still far from having complete and stable network process support, but it’s a huge step forward. The good news is that once we have the network process implemented, multiple web processes support will work automatically just by selecting the multiple web process model.

All this sounds like a lot of work done, but that’s only a small part of what has happened this week in Coruña:

  • Martin and Gustavo made more parts of WebKit actually build with the cmake build.
  • Jon made several improvements in Epiphany UI.
  • Gustavo fixed the default charset encoding used by Epiphany.
  • Iago and Edu made some progress in the wayland support for WebKit2.
  • Dan was working on HTTP2 implementation for libsoup.
  • Zan was finalizing his GSoC work under Martin’s mentorship to bring WebGL support under Wayland.
  • Gustavo added support for right-side docking of the web inspector in WebKitGTK+.
  • Javi, Rego and Minhea were focused on a new implementation for the selections in CSS regions.
  • Calvaris began to rewrite the GTK media controls once more, this time in JavaScript.
  • Zan finished the patches to add battery support in WebKitGTK+ using upower.
  • Martin, Gustavo and Zan worked on support for testing WebGL and accelerated compositing layout tests in WebKitGTK+.
  • Brian, Alex and Zan were working on the parsing of the valgrind xml output used when running the tests under valgrind to detect memory leaks.
  • Brendan added a setting to both WebKit1 and WebKit2 APIs to enable media source and fixed a crash in several video layout tests.
  • Claudio went back to his work on the notifications support for WebKitGTK+ and was reviewing patches like crazy.
  • Philippe worked on the WebRTC implementation for the GStreamer WebKit media backend.
  • Mario, Joanie and API were focused on accessibility, also making sure that the multiple web processes support doesn’t affect the accessibility support.
  • Brendan also worked on MediaSource, investigating how to handle video resolution changes.
  • I removed all the WebKit1 unused code from Epiphany, and moved the GNOME shell search provider to its own binary.

And I’m sure I’m missing more great stuff that I could not follow closely. It’s definitely been a very productive hackfest that wouldn’t have been possible without the sponsors, Igalia and the GNOME Foundation. Thanks!

Igalia S.L. GNOME Foundation

WebKit2GTK+ Web Process Extensions

The multiprocess architecture of WebKit2 brought us a lot of advantages, but it also introduced important challenges, like how to expose some features that now live in the Web Process (DOM, JavaScript, etc.). The UI process API is fully asynchronous to make sure the UI is never blocked, but some APIs like the DOM bindings are synchronous by design. To expose those features that live in the Web Process, WebKit2GTK+ provides a Web Extensions mechanism. A Web Extension is like a plugin for the Web Process, that is loaded at start up, similar to a GTK module or gio extension, but that runs in the Web Process. WebKit2GTK+ exposes a simple low level API that at the moment provides access to three main features:

  • GObject DOM bindings: Exactly the same API used in WebKit1 is available in WebKit2.
  • WebKitWebPage::send-request signal: It allows changing any request before it is sent to the server, or even simply preventing it from being sent.
  • Custom JavaScript injection: It provides a signal, equivalent to WebKitWebView::window-object-cleared in WebKit1, to inject custom JavaScript using the JavaScriptCore API. (Since 2.2)

This simple API doesn’t provide any way of communicating with the UI Process, so the user can pick any IPC mechanism without interfering with the internal WebKit IPC traffic. Epiphany currently installs a Web Extension to implement some of its features, such as pre-filled forms, the ad blocker or Do Not Track, using D-Bus for the communication between the Web Extension and the UI Process.

How to write a Web Extension?

Web Extensions are shared libraries loaded at run time by the Web Process, so they don’t have a main function, but they have an entry point called by the WebProcess right after the extension is loaded. The initialization function must be called webkit_web_extension_initialize() and it receives a WebKitWebExtension object as parameter. It should also be public, so make sure to use the G_MODULE_EXPORT macro. This is the function to initialize the Web Extension and can be used, for example, to be notified when a web page is created.

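#include <webkit2/webkit-web-extension.h>

/* Called every time the Web Process creates a new page. */
static void
web_page_created_callback (WebKitWebExtension *extension,
                           WebKitWebPage      *web_page,
                           gpointer            user_data)
{
    /* The page id is a guint64, so use the matching format macro. */
    g_print ("Page %" G_GUINT64_FORMAT " created for %s\n", 
             webkit_web_page_get_id (web_page),
             webkit_web_page_get_uri (web_page));
}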

G_MODULE_EXPORT void
webkit_web_extension_initialize (WebKitWebExtension *extension)
{
    g_signal_connect (extension, "page-created", 
                      G_CALLBACK (web_page_created_callback), 
                      NULL);
}

This would be a minimal Web Extension. It does nothing useful yet, but it can be compiled and loaded, so let’s see how to create a Makefile.am file to build the extension.

webextension_LTLIBRARIES = libmyappwebextension.la
webextensiondir = $(libdir)/MyApp/web-extension
libmyappwebextension_la_SOURCES = my-app-web-extension.c
libmyappwebextension_la_CFLAGS = $(WEB_EXTENSION_CFLAGS)
libmyappwebextension_la_LIBADD = $(WEB_EXTENSION_LIBS)
libmyappwebextension_la_LDFLAGS = -module -avoid-version -no-undefined

The extension will be installed in $(libdir)/MyApp/web-extension so we need to tell WebKit where to find web extensions before the Web Process is spawned. Call webkit_web_context_set_web_extensions_directory() as soon as possible in your application, before any other WebKit call to make sure it’s called before a Web Process is launched. You can create a preprocessor macro in the Makefile.am to pass the value of the Web Extensions directory.

myapp_CPPFLAGS = -DMYAPP_WEB_EXTENSIONS_DIR=\""$(libdir)/MyApp/web-extension"\"

And then in the code

webkit_web_context_set_web_extensions_directory (webkit_web_context_get_default (), 
                                                 MYAPP_WEB_EXTENSIONS_DIR);

The Web Extension only needs WebKit2GTK+ to build, so in the configure.ac you can define WEB_EXTENSION_CFLAGS and WEB_EXTENSION_LIBS using pkg-config macros.

PKG_CHECK_MODULES(WEB_EXTENSION, [webkit2gtk-3.0 >= 2.0.0])
AC_SUBST(WEB_EXTENSION_CFLAGS)
AC_SUBST(WEB_EXTENSION_LIBS)

This should be enough. You should be able to build and install the Web Extension with your program and see the g_print message every time a page is created. But that’s a useless example, so let’s see how to use the Web Extensions API to do something useful.

Accessing the DOM

The GObject DOM bindings API available in WebKit1 is also exposed in WebKit2 from the Web Extensions API. We only need to call webkit_web_page_get_dom_document() to get the WebKitDOMDocument of the given web page.

static void
web_page_created_callback (WebKitWebExtension *extension,
                           WebKitWebPage      *web_page,
                           gpointer            user_data)
{
    WebKitDOMDocument *document;
    gchar             *title;

    document = webkit_web_page_get_dom_document (web_page);
    title = webkit_dom_document_get_title (document);
    /* The page id is a guint64, so use the matching format macro. */
    g_print ("Page %" G_GUINT64_FORMAT " created for %s with title %s\n", 
             webkit_web_page_get_id (web_page),
             webkit_web_page_get_uri (web_page),
             title);
    g_free (title);
}

Using WebKitWebPage::send-request signal

Using the Web Extensions API it’s possible to modify the request of any resource before it’s sent to the server, adding HTTP headers or modifying the URI. You can also make WebKit ignore a request, for example to block resources depending on the URI, by simply connecting to the signal and returning TRUE.

static gboolean
web_page_send_request (WebKitWebPage     *web_page,
                       WebKitURIRequest  *request,
                       WebKitURIResponse *redirected_response,
                       gpointer           user_data)
{
    const char *request_uri;
    const char *page_uri;

    request_uri = webkit_uri_request_get_uri (request);
    page_uri = webkit_web_page_get_uri (web_page);

    return uri_is_an_advertisement (request_uri, page_uri);
}

static void
web_page_created_callback (WebKitWebExtension *extension,
                           WebKitWebPage      *web_page,
                           gpointer            user_data)
{
    /* No object is tied to the handler, so a plain connect is enough. */
    g_signal_connect (web_page, "send-request",
                      G_CALLBACK (web_page_send_request),
                      NULL);
}
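The uri_is_an_advertisement() helper is left undefined above. Just to give an idea, here is a deliberately naive sketch in plain C that blocks third-party requests to a fixed list of hosts. The blocklist entries and the uri_get_host() helper are made up for the example, and this is not how the Epiphany ad blocker actually works; a real one would load filter rules and use a proper URI parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical blocklist, for illustration only. */
static const char *blocked_hosts[] = {
    "ads.example.com",
    "tracker.example.net"
};

/* Copy the host of a "scheme://host[:port]/path" URI into buf.
 * Simplified: userinfo ("user@host") and IPv6 literals are not handled. */
static bool
uri_get_host (const char *uri, char *buf, size_t buf_size)
{
    const char *start = strstr (uri, "://");
    size_t      len;

    if (start == NULL)
        return false;

    start += 3;
    len = strcspn (start, ":/?#");
    if (len == 0 || len >= buf_size)
        return false;

    memcpy (buf, start, len);
    buf[len] = '\0';
    return true;
}

/* Return true when the request should be blocked. */
static bool
uri_is_an_advertisement (const char *request_uri, const char *page_uri)
{
    char request_host[256];
    char page_host[256];

    assert (request_uri != NULL);

    if (!uri_get_host (request_uri, request_host, sizeof request_host))
        return false;

    /* Never block resources served by the page's own host. */
    if (page_uri != NULL &&
        uri_get_host (page_uri, page_host, sizeof page_host) &&
        strcmp (request_host, page_host) == 0)
        return false;

    /* Exact, case-sensitive host match against the blocklist. */
    for (size_t i = 0; i < sizeof blocked_hosts / sizeof blocked_hosts[0]; i++) {
        if (strcmp (request_host, blocked_hosts[i]) == 0)
            return true;
    }

    return false;
}
```

Since the function returns a bool, its result can be returned directly from the send-request handler, where TRUE makes WebKit ignore the request.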

Extending JavaScript

Using the JavaScriptCore API it’s possible to inject custom JavaScript code by connecting to the window-object-cleared signal of the default WebKitScriptWorld. You can get the global JavaScript execution context by calling webkit_frame_get_javascript_context_for_script_world() for the WebKitFrame passed as parameter of the window-object-cleared signal.

static void 
window_object_cleared_callback (WebKitScriptWorld *world, 
                                WebKitWebPage     *web_page, 
                                WebKitFrame       *frame, 
                                gpointer           user_data)
{
    JSGlobalContextRef jsContext;
    JSObjectRef        globalObject;

    jsContext = webkit_frame_get_javascript_context_for_script_world (frame, world);
    globalObject = JSContextGetGlobalObject (jsContext);

    /* Use JSC API to add the JavaScript code you want */
}

G_MODULE_EXPORT void
webkit_web_extension_initialize (WebKitWebExtension *extension)
{
    g_signal_connect (webkit_script_world_get_default (), 
                      "window-object-cleared", 
                      G_CALLBACK (window_object_cleared_callback), 
                      NULL);
}

WebKitGTK+ 2.0.0

After more than two years of development the Igalia WebKit team is proud to announce WebKitGTK+ 2.0.0.

But what’s so special about WebKitGTK+ 2.0?

The WebKit2GTK+ API is now the default one. This means that it’s now considered stable from the API/ABI backwards compatibility point of view, and that the old WebKit1 API is in maintenance mode and kind of deprecated. We will maintain both APIs, but we don’t plan to work on WebKit1 other than fixing bugs.

We encourage everybody to port their existing WebKitGTK+ applications to WebKit2, although we know the WebKit2 GTK+ API is not ready for all applications yet. We will work on adding new API during next release cycle, so let us know if you are missing some API that prevents you from porting your project.

Epiphany, the GNOME Web browser, has been successfully ported to WebKit2 and uses it by default since GNOME 3.8.

What are the benefits of the WebKit2 GTK+ API?

We have talked several times about the advantages of the multi-process architecture of WebKit2: robustness, responsiveness, security, etc. The details of the multi-process separation are mostly transparent for the API users, bringing all those benefits for free to any application using WebKit2 GTK+. We have developed the API on top of this multi-process architecture, but also with the experience of several years developing and maintaining the WebKit1 GTK+ API, learning from the mistakes made in the past and keeping the good ideas. As a result, the WebKit2 API is very similar to the WebKit1 one in some parts and quite different in others. We started from scratch with the following goals:

  • Simple and easy to use. Instead of porting the WebKit1 API to WebKit2, we decided to add new API on demand. We set some milestones based on porting real applications, adding new API required to port them. This also allowed us to design the API, not only thinking about what we want or need to expose, but also how the applications expect to use the API.
  • Consistency. We have tried hard to be consistent with the names of the functions, signals and properties exposed by the API.
  • Flexibility. When possible, the API allows you to use your own implementation of some parts, which can be adapted to different platforms. So, you can use your own file chooser, JavaScript dialogs, context menu, print dialog, etc.
  • It works by default. For all those features where a custom implementation can be used, there’s a default implementation in WebKit that just works.
  • Unit tests. We have required all new patches adding API to WebKit2 GTK+ to also include unit tests, so the whole API is covered by unit tests.

Let’s see the major changes and advantages of this new WebKit2 API.

WebKitWebView is a scrolling widget

For API users this means that WebKitWebView should not be added to a GtkScrolledWindow, the widget is scrollable by itself. Actually this is also the case of the WebKitWebView in WebKit1, but some hacks were introduced to allow the widget to be used inside a GtkScrolledWindow. This caused a lot of headaches due to the synchronization between the internal scrolling and the GTK+ scroll adjustments. So now the main scrollbars are also handled by the WebKitWebView which, among other things, fixed the problem of the double scrollbars in some web sites.

Double scrollbar issue

Embedded HTTP authentication dialog

The default implementation of the HTTP authentication embeds a dialog in the WebView instead of using a real GtkDialog. It’s also integrated with the keyring by default using libsecret.

HTTP authentication dialog

GTK+ 2 plugins (flash)

Plugins also run in a different process, which is built with GTK+ 2 to support the most popular plugins, like Flash, that still use GTK+ 2.

MiniBrowser showing a youtube video using flash plugin

Web Inspector

The Web Inspector works automatically in both docked and undocked states without requiring any API call.

Inspector docked

It also has support for remote inspecting.

Remote inspecting

Accelerated compositing

Accelerated compositing is always enabled in WebKit2.

Poster circle

Future plans

During the next release cycle we’ll work on fixing bugs and completing the API (see our RoadMap for further details), but we’ll also explore some other areas not directly related to the API:

  • Multiple web processes support
  • Sandboxing
  • Network Process

WebKitGTK+ Hackfest 2012

This year again the WebKitGTK+ hackfest took place at the Igalia office in A Coruña, and this year again it’s been awesome.

My main goal for the hackfest was to implement an extension system for the web process in WebKit2 that would allow, among other things, access to the DOM, the lack of which is the major regression of the WebKit2 GTK+ API. The idea was to use exactly the same GObject DOM bindings API we are currently using in WebKit1, so I moved it to a convenience static library and installed the public headers in their own directory, making it shareable between WebKit1 and WebKit2. Once the GObject DOM bindings were accessible from WebKit2, I wrote a first patch to implement the web extension system, providing a new API for extensions to access the DOM.

I also took advantage of the hackfest time to resume a task I had pending for some time: adding an API to WebKit2 to handle SSL errors. I didn’t have time to finish the API, but I managed to write a first patch to set a policy for SSL errors. For now it only allows either ignoring SSL errors and continuing the load, or making the load fail when there are SSL errors. The idea is to add a new policy to ask the user what to do.

Even though it was not part of my initial plans for the hackfest I ended up working on the document reading integration in Epiphany. I wrote an initial patch for Epiphany to load documents supported by Evince embedded in the window like a web view. There are still a lot of features to integrate like zooming, searching, printing, etc.

Epiphany showing a PDF document

I set a milestone to switch Epiphany to WebKit2 by default at the end of the hackfest, but I didn’t have time to fix all the regressions. We are a lot closer, though.

This event is impossible without the sponsors, thanks!

 

SSL certificates information support in Epiphany

Since the Epiphany migration to WebKit, websites with an invalid SSL certificate were marked as untrusted with the insecure lock icon in the location bar. However, it wasn’t possible to know what was wrong with the certificate or to see the certificate details. Using the certificate viewer widget available in gcr, I’ve implemented a dialog to show information about the possible SSL errors and the certificate details in Epiphany. This also means Epiphany now depends on gcr.

Epiphany showing an invalid certificate
