The process of creating a new WPE backend
The basics to understand and build a WPE WebKit backend from scratch.
## What is a WPE Backend?
WPE is the official port of WebKit for embedded platforms. Depending on the platform hardware, it may need to use different techniques and technologies to ensure correct graphical rendering.
In order to be independent from any user-interface toolkit and/or windowing system, WPE WebKit delegates the rendering to a third-party API defined in libwpe. The implementation of this API is called a WPE Backend. You can find more explanations on the WPE Architecture page.
WPE WebKit is a multiprocess application: the end user starts and controls the web widgets in the application process, while the web engine itself runs in separate subprocesses, such as the WPENetworkProcess, in charge of managing the underlying network connections, and the WPEWebProcess, in charge of HTML and JavaScript parsing, execution and rendering.
The WPEWebProcess is, in particular, in charge of drawing the final composition of the web page through the ThreadedCompositor component.
The WPE Backend is at a crossroads between the WPEWebProcess and the user application process.
graph LR; subgraph WPEWebProcess A(WPE ThreadedCompositor) end subgraph Application Process C(User Application) end A -->|draw| B([WPE Backend]) --> C
The WPE Backend is a shared library that is loaded at runtime by the WPEWebProcess and by the user application process. It is used to render the visual aspect of a web page and transfer the resulting video buffer from the WPE WebKit engine process to the application process.
N.B.: with the growing adoption of DMA buffers on all modern Linux platforms, the WPE WebKit architecture is evolving and, in the future, the need for a WPE Backend should disappear.
Future designs of the libwpe API should allow the application process to receive the video buffers directly, without needing to implement a different WPE Backend for each hardware platform. From the application developer's point of view, this will simplify the usage of WPE WebKit by hiding all multiprocess considerations.
## The WPE Backend interfaces
The WPE Backend shared library must export at least one symbol, called _wpe_loader_interface, of type wpe_loader_interface, as defined in the libwpe API.
This instance holds a single loader function that must return concrete implementations of the following libwpe interfaces:
- wpe_renderer_host_interface
- wpe_renderer_backend_egl_interface
- wpe_renderer_backend_egl_target_interface
- wpe_renderer_backend_egl_offscreen_target_interface
When the WPE Backend is loaded by the WPEWebProcess (or the application process), the process looks for the _wpe_loader_interface symbol and then calls _wpe_loader_interface.load_object("...") with predefined names whenever it needs access to a specific interface:
- “_wpe_renderer_host_interface” for the wpe_renderer_host_interface
- “_wpe_renderer_backend_egl_interface” for the wpe_renderer_backend_egl_interface
- “_wpe_renderer_backend_egl_target_interface” for the wpe_renderer_backend_egl_target_interface
- “_wpe_renderer_backend_egl_offscreen_target_interface” for the wpe_renderer_backend_egl_offscreen_target_interface
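The lookup performed by each process follows the usual dlopen/dlsym pattern. As a rough sketch (lookupLoaderInterface is a hypothetical helper name, not WPE WebKit's exact code):

```cpp
#include <cassert>
#include <dlfcn.h>

// Sketch of how a process typically resolves the backend's loader symbol:
// dlopen() the backend shared library, then dlsym() its exported
// _wpe_loader_interface symbol.
void* lookupLoaderInterface(const char* libraryPath)
{
    void* handle = dlopen(libraryPath, RTLD_NOW);
    if (!handle)
        return nullptr; // library not found or not loadable
    return dlsym(handle, "_wpe_loader_interface");
}
```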
The WPE Backend also needs to implement a fifth interface of type wpe_view_backend_interface that is used on the application process side when creating the specific backend instance.
All those interfaces follow the same structure:
- a create(...) function that acts like an object constructor
- a destroy(...) function that acts like an object destructor
- some functions that act like the methods of an object (each function receives the previously created object instance as first parameter)
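As a minimal sketch of this pattern (Target, targetCreate, targetDestroy and targetResize are made-up names for illustration, not libwpe symbols), the instance is passed around as an opaque pointer:

```cpp
#include <cassert>

// Hypothetical illustration of the structure shared by all libwpe interfaces.
struct Target {
    int width;
    int height;
};

// Acts like a constructor.
void* targetCreate(int width, int height)
{
    return new Target{width, height};
}

// Acts like a destructor.
void targetDestroy(void* data)
{
    delete static_cast<Target*>(data);
}

// Acts like a method: receives the previously created instance as first parameter.
void targetResize(void* data, int width, int height)
{
    auto* target = static_cast<Target*>(data);
    target->width = width;
    target->height = height;
}
```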
Taking the names used in the WPEBackend-direct project, on the application process side WPE will create:
- a rendererHost instance using the wpe_renderer_host_interface::create(...) function
- multiple rendererHostClient instances using the wpe_renderer_host_interface::create_client(...) function (those instances are mainly used for IPC communication; one instance is created for each WPEWebProcess launched by WPE WebKit)
- multiple viewBackend instances created by the wpe_view_backend_interface::create(...) function (one instance is created for each rendering target on the WPEWebProcess side)
On the WPEWebProcess side (there can be more than one WPEWebProcess, depending on the URIs loaded by the WebKitWebView instances on the application process side), the web process will create:
- a rendererBackendEGL instance (one per WPEWebProcess) using the wpe_renderer_backend_egl_interface::create(...) function
- multiple rendererBackendEGLTarget instances using the wpe_renderer_backend_egl_target_interface::create(...) function (one instance is created for each new rendering target requested by the application)
N.B.: the rendererBackendEGLTarget instances may be created by the wpe_renderer_backend_egl_target_interface or the wpe_renderer_backend_egl_offscreen_target_interface, depending on the interfaces implemented in the WPE Backend. Here we are only focusing on the wpe_renderer_backend_egl_target_interface, which relies on a classical EGL display (defined in the rendererBackendEGL instance). The wpe_renderer_backend_egl_offscreen_target_interface may be used in very specific use cases that are out of the scope of this post. You can check its usage in the WPE WebKit source code for more information.
These instances communicate with each other using classical IPC techniques (Unix sockets). The IPC layer must be implemented in the WPE Backend; the libwpe interfaces only share the endpoint file descriptors between the different processes when creating the different instances.
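The classical way of creating such endpoints is socketpair(2). The following sketch (not WPE code) keeps both endpoints in the same process just to show that they are connected; in a real backend, one file descriptor would stay local while the other would be handed over through the libwpe interface functions to the peer process:

```cpp
#include <cassert>
#include <sys/socket.h>
#include <unistd.h>

// Create a connected pair of Unix domain sockets and exchange one byte
// between the two endpoints, as a backend's IPC layer would do between
// the WPEWebProcess and the application process.
char exchangeOverSocketPair()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, fds) != 0)
        return 0;

    char sent = 'x';
    if (write(fds[0], &sent, 1) != 1)
        return 0;

    char received = 0;
    if (read(fds[1], &received, 1) != 1)
        return 0;

    close(fds[0]);
    close(fds[1]);
    return received;
}
```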
From a topological point of view, all those instances are organized as follows:
graph LR; subgraph S1 [WPEWebProcess] A1(rendererBackendEGL) subgraph S1TARGETS [ ] B1(rendererBackendEGLTarget) C1(rendererBackendEGLTarget) D1(rendererBackendEGLTarget) end end style S1 fill:#fb3,stroke:#333 style S1TARGETS fill:#fb3,stroke:#333 style A1 fill:#ff5 style B1 fill:#9f5 style C1 fill:#9f5 style D1 fill:#9f5 subgraph S2 [WPEWebProcess] A2(rendererBackendEGL) subgraph S2TARGETS [ ] B2(rendererBackendEGLTarget) C2(rendererBackendEGLTarget) end end style S2 fill:#fb3,stroke:#333 style S2TARGETS fill:#fb3,stroke:#333 style A2 fill:#ff5 style B2 fill:#9f5 style C2 fill:#9f5 subgraph S3 [Application Process] E(rendererHost) subgraph S4 [Clients] F1(rendererHostClient) F2(rendererHostClient) end F1 ~~~ F2 E -.- S4 subgraph S5 [ ] G1(viewBackend) G2(viewBackend) G3(viewBackend) G4(viewBackend) G5(viewBackend) end S4 ~~~ S5 G1 ~~~ G2 G2 ~~~ G3 G3 ~~~ G4 G4 ~~~ G5 end style S3 fill:#3cf,stroke:#333 style S4 fill:#9ef,stroke:#333 style S5 fill:#3cf,stroke:#333 style E fill:#ff5 style F1 fill:#ff5 style F2 fill:#ff5 style G1 fill:#9f5 style G2 fill:#9f5 style G3 fill:#9f5 style G4 fill:#9f5 style G5 fill:#9f5 A1 <--IPC--> F1 A2 <--IPC--> F2 B1 <--IPC--> G1 C1 <--IPC--> G2 D1 <--IPC--> G3 B2 <--IPC--> G4 C2 <--IPC--> G5
From a usage point of view:
- the rendererHost and rendererHostClient instances are only used to manage the IPC endpoints on the application process side that are connected to each running WPEWebProcess. They are not used by the graphical rendering system.
- the rendererBackendEGL instance (one per WPEWebProcess) is only used to connect to the native display for a specific platform. For example, on desktop Linux the platform may be X11, and the native display can be obtained by calling XOpenDisplay(...); or the platform may be Wayland, and the native display can be obtained by calling wl_display_connect(...); etc.
- the rendererBackendEGLTarget (on the WPEWebProcess side) and viewBackend (on the application process side) instances are the ones truly managing the web page graphical rendering.
## The graphical rendering mechanism
As seen above, the interfaces in charge of the rendering are wpe_renderer_backend_egl_target_interface and wpe_view_backend_interface. During instance creation, WPE WebKit exchanges the file descriptors used to establish a direct IPC connection between a rendererBackendEGLTarget on the WPEWebProcess side and a viewBackend on the application process side.
During the EGL initialization phase, run when a new WPEWebProcess is launched, WPE WebKit uses the native display and platform provided by the wpe_renderer_backend_egl_interface::get_native_display(...) and get_platform(...) functions to create a suitable GLES context.
When the WPE WebKit ThreadedCompositor is ready to render a new frame (on the WPEWebProcess side), it calls the wpe_renderer_backend_egl_target_interface::frame_will_render(...) function to notify the WPE Backend that rendering is about to start. At this moment, the previously created GLES context is current to the calling thread, and all GLES drawing commands will be issued right after this call.
Once the ThreadedCompositor has finished drawing, it swaps the EGL buffers and calls the wpe_renderer_backend_egl_target_interface::frame_rendered(...) function to notify the WPE Backend that the frame is ready. Then the ThreadedCompositor waits until the wpe_renderer_backend_egl_target_dispatch_frame_complete(...) function is called by the WPE Backend.
What happens during the frame_will_render(...) and frame_rendered(...) calls is up to the WPE Backend. It can, for example, bind a framebuffer object to draw into a texture and then pass the texture content to the application process, or use EGL extensions like EGLStream or DMA buffers to transfer the frame to the application process without going through a copy in the computer's main RAM.
In all cases, in the frame_rendered(...) function the WPE Backend generally sends the new frame to the corresponding viewBackend instance on the application side. The application process then uses or presents this frame and sends back an IPC message to the rendererBackendEGLTarget instance to notify the WPEWebProcess side that the frame is not used anymore and can be recycled. When the rendererBackendEGLTarget instance receives this IPC message, this is generally the moment it calls the wpe_renderer_backend_egl_target_dispatch_frame_complete(...) function to trigger the production of a new frame.
With this mechanism, the application has full control over the synchronization of the rendering on the WPEWebProcess side.
sequenceDiagram participant A as ThreadedCompositor participant B as WPE Backend participant C as Application loop Rendering A->>B: frame_will_render(...) activate A A->>A: GLES rendering A->>B: frame_rendered(...) deactivate A B->>C: frame transfer activate C C->>C: frame presentation C->>A: dispatch_frame_complete(...) deactivate C end
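This synchronization can be modeled in a few lines of plain C++ (FrameLoopModel is a hypothetical illustration, not actual WPE code): the compositor cannot start a new frame until the application has dispatched "frame complete".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the render loop shown in the sequence diagram above.
class FrameLoopModel {
public:
    // ThreadedCompositor side: returns false while blocked waiting for the
    // application to release the previous frame.
    bool tryRenderFrame()
    {
        if (m_waitingForFrameComplete)
            return false;
        m_events.push_back("frame_will_render");
        m_events.push_back("gles_rendering");
        m_events.push_back("frame_rendered"); // backend transfers the frame
        m_waitingForFrameComplete = true;
        return true;
    }

    // Application side: called once the transferred frame has been presented.
    void dispatchFrameComplete()
    {
        m_events.push_back("dispatch_frame_complete");
        m_waitingForFrameComplete = false;
    }

    const std::vector<std::string>& events() const { return m_events; }

private:
    bool m_waitingForFrameComplete { false };
    std::vector<std::string> m_events;
};
```

The model makes explicit that the application paces the compositor: a second render attempt fails until dispatchFrameComplete() is called.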
## A simple example: WPEBackend-direct
The WPEBackend-direct project aims to implement a very simple WPE Backend that does not transfer any frame to the application process. The frame presentation is done directly by the WPEWebProcess, using a native X11 or Wayland window. The objective is to provide an easy means of showing and debugging the output of the ThreadedCompositor without the inherent complexity of transferring frames between different processes.
WPEBackend-direct implements the minimal set of libwpe interfaces needed by a functional backend. The interfaces loader is implemented in wpebackend-direct.cpp. The wpe_renderer_backend_egl_offscreen_target_interface is disabled, as the backend only provides the default wpe_renderer_backend_egl_target_interface for the target rendererBackendEGLTarget instances.
```cpp
extern "C"
{
    __attribute__((visibility("default"))) wpe_loader_interface _wpe_loader_interface = {
        +[](const char* name) -> void* {
            if (std::strcmp(name, "_wpe_renderer_host_interface") == 0)
                return RendererHost::getWPEInterface();
            if (std::strcmp(name, "_wpe_renderer_backend_egl_interface") == 0)
                return RendererBackendEGL::getWPEInterface();
            if (std::strcmp(name, "_wpe_renderer_backend_egl_target_interface") == 0)
                return RendererBackendEGLTarget::getWPEInterface();
            if (std::strcmp(name, "_wpe_renderer_backend_egl_offscreen_target_interface") == 0)
            {
                static wpe_renderer_backend_egl_offscreen_target_interface s_interface = {
                    +[]() -> void* { return nullptr; },
                    +[](void*) {},
                    +[](void*, void*) {},
                    +[](void*) -> EGLNativeWindowType { return nullptr; },
                    nullptr,
                    nullptr,
                    nullptr,
                    nullptr};
                return &s_interface;
            }
            return nullptr;
        },
        nullptr, nullptr, nullptr, nullptr};
}
```
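For the processes to load this library at runtime, libwpe must be told which backend to use: either programmatically, by calling wpe_loader_init() before any other libwpe function, or through an environment variable read at load time. A minimal sketch, assuming the compiled backend is installed as libWPEBackend-direct.so in the loader path:

```shell
# Tell libwpe which backend library to dlopen at runtime.
export WPE_BACKEND_LIBRARY=libWPEBackend-direct.so
```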
On the application process side, the RendererHost and RendererHostClient classes do nothing in particular; they are just straightforward implementations of the libwpe interfaces used to maintain the IPC channel connection. Because rendering and frame presentation are both handled on the WPEWebProcess side, the ViewBackend class only manages the render loop synchronization. It receives a message from the IPC channel each time a frame has been rendered, and then calls a callback from the sample application.
When the sample application is ready, it calls a specific function from the WPEBackend-direct API that triggers a call to ViewBackend::frameComplete(), which, in turn, sends an IPC message to the RendererBackendEGLTarget instance on the WPEWebProcess side to call the wpe_renderer_backend_egl_target_dispatch_frame_complete(...) function.
```cpp
WebKitWebViewBackend* createWebViewBackend()
{
    auto* directBackend = wpe_direct_view_backend_create(
        +[](wpe_direct_view_backend* backend, void* /*userData*/) {
            // This callback function is called by the WPE Backend each time a
            // frame has been rendered and presented on the WPEWebProcess side.
            // Calling the next function will trigger an IPC message sent to the
            // corresponding RendererBackendEGLTarget instance on the WPEWebProcess
            // side that will trigger the rendering of the next frame.
            wpe_direct_view_backend_dispatch_frame_complete(backend);
        },
        nullptr, 800, 600);
    return webkit_web_view_backend_new(wpe_direct_view_backend_get_wpe_backend(directBackend), nullptr, nullptr);
}
```
On the WPEWebProcess side, the RendererBackendEGL class is a wrapper around an X11 or Wayland native display; it also maintains an IPC connection with a corresponding RendererHostClient instance on the application process side. The RendererBackendEGLTarget class is the one in charge of the presentation. It maintains an IPC connection with a corresponding ViewBackend instance on the application process side. During its initialization, the RendererBackendEGLTarget instance uses the native display held by the RendererBackendEGL instance to create a native X11 or Wayland window. This native window is communicated to the WPE WebKit engine through the wpe_renderer_backend_egl_target_interface implementation.
The WPE WebKit engine will use the provided native display and window to create a compatible EGL display and surface (see the source code here).
When the ThreadedCompositor calls frame_will_render(...), the compatible GLES context is already selected with the EGL surface associated with the native window previously created by the RendererBackendEGLTarget instance, so there is nothing more to do. When the ThreadedCompositor calls m_context->swapBuffers(), the rendered frame is automatically presented in the native window.
Then, when the ThreadedCompositor calls frame_rendered(...), the only thing left to do by the RendererBackendEGLTarget instance is to send an IPC message to the corresponding ViewBackend instance, on the application process side, to notify it that a frame has been rendered and that the ThreadedCompositor is waiting before rendering the next frame.
When dealing with WPE Backends, the WPEBackend-direct example is an interesting starting point, as it mainly focuses on the backend implementation. The extra code that is not directly needed by the libwpe interfaces is minimal and clearly separated from the backend implementation code.
The next step would be to transfer the frame content from the WPEWebProcess to the application process and do the presentation on the application process side. One obvious way of doing it would be to render to a texture on the WPEWebProcess side, transfer the texture content through a shared memory map, and then create a new texture on the application process side with the shared graphical data. That would work, but it would not be very efficient: for each frame, data would need to be copied from the GPU to the CPU in the WPEWebProcess and then back from the CPU to the GPU in the application process.
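The shared-memory half of that naive approach can be sketched as follows (Linux-specific, using memfd_create; this is an illustration, not WPE code). The producer mapping stands in for a WPEWebProcess reading the rendered pixels back from the GPU, and the consumer mapping stands in for the application process that would re-upload them into a new texture:

```cpp
#include <cassert>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Round-trip a tiny "frame" through an anonymous shared memory region.
bool roundTripThroughSharedMemory()
{
    const size_t size = 4 * 4 * 4; // a 4x4 RGBA frame

    int fd = memfd_create("wpe-frame", MFD_CLOEXEC);
    if (fd < 0 || ftruncate(fd, size) != 0)
        return false;

    // Producer mapping: stands in for the glReadPixels() copy (GPU -> CPU).
    void* producer = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    std::memset(producer, 0x7f, size);

    // Consumer mapping: in a real backend, fd would have been sent over IPC
    // and mapped in the application process before the CPU -> GPU upload.
    void* consumer = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    bool ok = std::memcmp(producer, consumer, size) == 0;

    munmap(producer, size);
    munmap(consumer, size);
    close(fd);
    return ok;
}
```

Both mappings see the same bytes without any extra copy between them, but the GPU readback and re-upload on either side are what make the approach costly.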
What would be really interesting is to keep the frame data on the GPU side and just transfer a handle to this memory between the WPEWebProcess and the application process. This will be the topic of a future post, so stay tuned!