Epiphany automation mode

Last week I finally found some time to add the automation mode to Epiphany, that allows to run automated tests using WebDriver. It’s important to note that the automation mode is not expected to be used by users or applications to control the browser remotely, but only by WebDriver automated tests. For that reason, the automation mode is incompatible with a primary user profile. There are a few other things affected by the auotmation mode:

  • There’s no persistency. A private profile is created in tmp and only ephemeral web contexts are used.
  • URL entry is not editable, since users are not expected to interact with the browser.
  • An info bar is shown to notify the user that the browser is being controlled by automation.
  • The window decoration is orange to make it even clearer that the browser is running in automation mode.

So, how can I write tests to be run in Epiphany? First, you need to install a recently enough selenium. For now, only the python API is supported. Selenium doesn’t have an Epiphany driver, but the WebKitGTK driver can be used with any WebKitGTK+ based browser, by providing the browser information as part of session capabilities.

from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
options.set_capability('browserName', 'Epiphany')
options.set_capability('version', '3.31.4')

ephy = webdriver.WebKitGTK(options=options, desired_capabilities={})
ephy.get('http://www.webkitgtk.org')
ephy.quit()

This is a very simple example that just opens Epiphany in automation mode, loads http://www.webkitgtk.org and closes Epiphany. A few comments about the example:

  • Version 3.31.4 will be the first one including the automation mode.
  • The parameter desired_capabilities shouldn’t be needed, but there’s a bug in selenium that has been fixed very recently.
  • WebKitGTKOptions.set_capability was added in selenium 3.14, if you have an older version you can use the following snippet instead
from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
capabilities = options.to_capabilities()
capabilities['browserName'] = 'Epiphany'
capabilities['version'] = '3.31.4'

ephy = webdriver.WebKitGTK(desired_capabilities=capabilities)
ephy.get('http://www.webkitgtk.org')
ephy.quit()

To simplify the driver instantation you can create your own Epiphany driver derived from the WebKitGTK one:

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        options.set_capability('browserName', 'Epiphany')
        options.set_capability('version', '3.31.4')

        webdriver.WebKitGTK.__init__(self, options=options, desired_capabilities={})

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

The same for selenium < 3.14

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        capabilities = options.to_capabilities()
        capabilities['browserName'] = 'Epiphany'
        capabilities['version'] = '3.31.4'

        webdriver.WebKitGTK.__init__(self, desired_capabilities=capabilities)

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

WebDriver support in WebKitGTK+ 2.18

WebDriver is an automation API to control a web browser. It allows to create automated tests for web applications independently of the browser and platform. WebKitGTK+ 2.18, that will be released next week, includes an initial implementation of the WebDriver specification.

WebDriver in WebKitGTK+

There’s a new process (WebKitWebDriver) that works as the server, processing the clients requests to spawn and control the web browser. The WebKitGTK+ driver is not tied to any specific browser, it can be used with any WebKitGTK+ based browser, but it uses MiniBrowser as the default. The driver uses the same remote controlling protocol used by the remote inspector to communicate and control the web browser instance. The implementation is not complete yet, but it’s enough for what many users need.

The clients

The web application tests are the clients of the WebDriver server. The Selenium project provides APIs for different languages (Java, Python, Ruby, etc.) to write the tests. Python is the only language supported by WebKitGTK+ for now. It’s not yet upstream, but we hope it will be integrated soon. In the meantime you can use our fork in github. Let’s see an example to understand how it works and what we can do.

from selenium import webdriver

# Create a WebKitGTK driver instance. It spawns WebKitWebDriver 
# process automatically that will launch MiniBrowser.
wkgtk = webdriver.WebKitGTK()

# Let's load the WebKitGTK+ website.
wkgtk.get("https://www.webkitgtk.org")

# Find the GNOME link.
gnome = wkgtk.find_element_by_partial_link_text("GNOME")

# Click on the link. 
gnome.click()

# Find the search form. 
search = wkgtk.find_element_by_id("searchform")

# Find the first input element in the search form.
text_field = search.find_element_by_tag_name("input")

# Type epiphany in the search field and submit.
text_field.send_keys("epiphany")
text_field.submit()

# Let's count the links in the contents div to check we got results.
contents = wkgtk.find_element_by_class_name("content")
links = contents.find_elements_by_tag_name("a")
assert len(links) > 0

# Quit the driver. The session is closed so MiniBrowser 
# will be closed and then WebKitWebDriver process finishes.
wkgtk.quit()

Note that this is just an example to show how to write a test and what kind of things you can do, there are better ways to achieve the same results, and it depends on the current source of public websites, so it might not work in the future.

Web browsers / applications

As I said before, WebKitWebDriver process supports any WebKitGTK+ based browser, but that doesn’t mean all browsers can automatically be controlled by automation (that would be scary). WebKitGTK+ 2.18 also provides new API for applications to support automation.

  • First of all the application has to explicitly enable automation using webkit_web_context_set_automation_allowed(). It’s important to know that the WebKitGTK+ API doesn’t allow to enable automation in several WebKitWebContexts at the same time. The driver will spawn the application when a new session is requested, so the application should enable automation at startup. It’s recommended that applications add a new command line option to enable automation, and only enable it when provided.
  • After launching the application the driver will request the browser to create a new automation session. The signal “automation-started” will be emitted in the context to notify the application that a new session has been created. If automation is not allowed in the context, the session won’t be created and the signal won’t be emitted either.
  • A WebKitAutomationSession object is passed as parameter to the “automation-started” signal. This can be used to provide information about the application (name and version) to the driver that will match them with what the client requires accepting or rejecting the session request.
  • The WebKitAutomationSession will emit the signal “create-web-view” every time the driver needs to create a new web view. The application can then create a new window or tab containing the new web view that should be returned by the signal. This signal will always be emitted even if the browser has already an initial web view open, in that case it’s recommened to return the existing empty web view.
  • Web views are also automation aware, similar to ephemeral web views, web views that allow automation should be created with the constructor property “is-controlled-by-automation” enabled.

This is the new API that applications need to implement to support WebDriver, it’s designed to be as safe as possible, but there are many things that can’t be controlled by WebKitGTK+, so we have several recommendations for applications that want to support automation:

  • Add a way to enable automation in your application at startup, like a command line option, that is disabled by default. Never allow automation in a normal application instance.
  • Enabling automation is not the only thing the application should do, so add an automation mode to your application.
  • Add visual feedback when in automation mode, like changing the theme, the window title or whatever that makes clear that a window or instance of the application is controllable by automation.
  • Add a message to explain that the window is being controlled by automation and the user is not expected to use it.
  • Use ephemeral web views in automation mode.
  • Use a temporal user profile in application mode, do not allow automation to change the history, bookmarks, etc. of an existing user.
  • Do not load any homepage in automation mode, just keep an empty web view (about:blank) that can be used when a new web view is requested by automation.

The WebKitGTK client driver

Applications need to implement the new automation API to support WebDriver, but the WebKitWebDriver process doesn’t know how to launch the browsers. That information should be provided by the client using the WebKitGTKOptions object. The driver constructor can receive an instance of a WebKitGTKOptions object, with the browser information and other options. Let’s see how it works with an example to launch epiphany:

from selenium import webdriver
from selenium.webdriver import WebKitGTKOptions

options = WebKitGTKOptions()
options.browser_executable_path = "/usr/bin/epiphany"
options.add_browser_argument("--automation-mode")
epiphany = webdriver.WebKitGTK(browser_options=options)

Again, this is just an example, Epiphany doesn’t even support WebDriver yet. Browsers or applications could create their own drivers on top of the WebKitGTK one to make it more convenient to use.

from selenium import webdriver
epiphany = webdriver.Epiphany()

Plans

During the next release cycle, we plan to do the following tasks:

  • Complete the implementation: add support for all commands in the spec and complete the ones that are partially supported now.
  • Add support for running the WPT WebDriver tests in the WebKit bots.
  • Add a WebKitGTK driver implementation for other languages in Selenium.
  • Add support for automation in Epiphany.
  • Add WebDriver support to WPE/dyz.