An eagle eye view into the Mesa source tree


My last post introduced Mesa’s loader as the module that takes care of auto-selecting the right driver for our hardware. If the loader fails to find a suitable hardware driver it will fall back to a software driver, but we can also force this situation ourselves, which may come in handy in some scenarios. We also took a quick look at the glxinfo tool that we can use to query the capabilities and features exposed by the selected driver.

The topic of today focuses on providing a quick overview of the Mesa source code tree, which will help us identify the parts of the code that are relevant to our interests depending on the driver and/or the feature we intend to work on.

Browsing the source code

First off, there is already some documentation on this topic available on the Mesa 3D website that is a good place to start. Since that already gives some insight on what goes into each part of the repository I’ll focus on complementing that information with a little bit more of detail for some of the most important parts I have interacted with so far:

  • In src/egl/ we have the implementation of the EGL standard. If you are working on EGL-specific features, tracking down an EGL-specific problem or you are simply curious about how EGL links into the GL implementation, this is the place you want to visit. This includes the EGL implementations for the X11, DRM and Wayland platforms.
  • In src/glx/ we have the OpenGL bits relating specifically to X11 platforms, known as GLX. So if you are working on the GLX layer, this is the place to go. Here there is all the stuff that takes care of interacting with the XServer, the client-side DRI implementation, etc.
  • src/glsl/ contains a critical aspect of Mesa: the GLSL compiler used by all Mesa drivers. It includes a GLSL parser, the definition of the Mesa IR, also referred to as GLSL IR, used to represent shader programs internally, the shader linker and various optimization passes that operate on the Mesa IR. The resulting Mesa IR produced by the GLSL compiler is then consumed by the various drivers which transform it into native GPU code that can be loaded and run in the hardware.
  • src/mesa/main/ contains the core Mesa elements. This includes hardware-independent views of core objects like textures, buffers, vertex array objects, the OpenGL context, etc as well as basic infrastructure, like linked lists.
  • src/mesa/drivers/ contains the actual classic drivers (not Gallium). DRI drivers in particular go into src/mesa/drivers/dri. For example the Intel i965 driver goes into src/mesa/drivers/dri/i965. The code here is, for the most part, very specific to the underlying hardware platforms.
  • src/mesa/swrast*/ and src/mesa/tnl*/ provide software implementations for things like rasterization or vertex transforms. Used by some software drivers and also by some hardware drivers to implement certain features for which they don’t have hardware support or for which hardware support is not yet available in the driver. For example, the i965 driver implements operations on the accumulation and selection buffers in software via these modules.
  • src/mesa/vbo/ is another important module. Across its various versions, OpenGL has specified many ways in which a program can tell OpenGL about its vertex data, from using functions of the glVertex*() family inside glBegin()/glEnd() blocks, to things like vertex arrays, vertex array objects, display lists, etc… The drivers, however, do not need to deal with all this, Mesa makes it so that they always receive their vertex data as collection of vertex arrays, significantly reducing complexity on the side of the driver implementator. This is the module that takes care of managing all this, so no matter what type of drawing you GL program is doing or how it specifies its vertex data, it will always go through this module before it reaches the driver.
  • src/loader/, as we have seen in my previous post, contains the Mesa driver loader, which provides the logic necessary to decide which Mesa driver is the right one to use for a specific hardware so that Mesa’s can auto-select the right driver when loaded.
  • src/gallium/ contains the Gallium3D framework implementation. If, like me, you only work on a classic driver, you don’t need to care about the contents of this at all. If you are working on Gallium drivers however, this is the place where you will find the various Gallium drivers in development (inside src/gallium/drivers/), like the various Gallium ATI/AMD drivers, Nouveau or the LLVM based software driver (llvmpipe) and the Gallium state trackers.

So with this in mind, one should have enough information to know where to start looking for something specific:

  • If are interested in how vertex data provided to OpenGL is manipulated and uploaded to the GPU, the vbo module is probably the right place to look.
  • If we are looking to work on a specific aspect of a concrete hardware driver, we should go to the corresponding directory in src/mesa/drivers/ if it is a classic driver, or src/gallium/drivers if it is a Gallium driver.
  • If we want to know about how Mesa, the framework, abstracts various OpenGL concepts like textures, vertex array objects, shader programs, etc. we should look into src/mesa/main/.
  • If we are interested in the platform specific support, be it EGL or GLX, we want to look into src/egl or src/glx.
  • If we are interested in the GLSL implementation, which involves anything from the compiler to the intermediary IR and the various optimization passes, we need to look into src/glsl/.

Coming up next

So now that we have an eagle view of the contents of the Mesa repository let’s see how we can prepare a development environment so we can start hacking on
some stuff. I’ll cover this in my next post.

Driver loading and querying in Mesa


In my previous post I explained that Mesa is a framework for OpenGL driver development. As such, it provides code that can be reused by multiple driver implementations. This code is, of course, hardware agnostic, but frees driver developers from doing a significant part of the work. The framework also provides hooks for developers to add the bits of code that deal with the actual hardware. This design allows multiple drivers to co-exist and share a significant amount of code.

I also explained that among the various drivers that Mesa provides, we can find both hardware drivers that take advantage of a specific GPU and software drivers, that are implemented entirely in software (so they work on the CPU and do not depend on a specific GPU). The latter are obviously slower, but as I discussed, they may come in handy in some scenarios.

Driver selection

So, Mesa provides multiple drivers, but how does it select the one that fits the requirements of a specific system?

You have probably noticed that Mesa is deployed in multiple packages. In my Ubuntu system, the one that deploys the DRI drivers is libgl1-mesa-dri:amd64. If you check its contents you will see that this package installs OpenGL drivers for various GPUs:

# dpkg -L libgl1-mesa-dri:amd64 

Since I have a relatively recent Intel GPU, the driver I need is the one provided in So how do we tell Mesa that this is the one we need? Well, the answer is that we don’t, Mesa is smart enough to know which driver is the right one for our GPU, and selects it automatically when you load The part of Mesa that takes care of this is called the ‘loader’.

You can, however, point Mesa to look for suitable drivers in a specific directory other than the default, or force it to use a software driver using various environment variables.

What driver is Mesa actually loading?

If you want to know exactly what driver Mesa is loading, you can instruct it to dump this (and other) information to stderr via the LIBGL_DEBUG environment variable:

# LIBGL_DEBUG=verbose glxgears 
libGL: screen 0 does not appear to be DRI3 capable
libGL: pci id for fd 4: 8086:0126, driver i965
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/tls/
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/

So we see that Mesa checks the existing hardware and realizes that the i965 driver is the one to use, so it first attempts to load the TLS version of that driver and, since I don’t have the TLS version, falls back to the normal version, which I do have.

The code in src/loader/loader.c (loader_get_driver_for_fd) is the one responsible for detecting the right driver to use (i965 in my case). This receives a device fd as input parameter that is acquired previously by calling DRI2Connect() as part of the DRI bring up process. Then the actual driver file is loaded in glx/dri_common.c (driOpenDriver).

We can also obtain a more descriptive indication of the driver we are loading by using the glxinfo program that comes with the mesa-utils package:

# glxinfo | grep -i "opengl renderer"
OpenGL renderer string: Mesa DRI Intel(R) Sandybridge Mobile 

This tells me that I am using the Intel hardware driver, and it also shares information related with the specific Intel GPU I have (SandyBridge).

Forcing a software driver

I have mentioned that having software drivers available comes in handy at times, but how do we tell the loader to use them? Mesa provides an environment variable that we can set for this purpose, so switching between a hardware driver and a software one is very easy to do:

libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/tls/
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/

As we can see, setting LIBGL_ALWAYS_SOFTWARE will make the loader select a software driver (swrast).

If I force a software driver and call glxinfo like I did before, this is what I get:

# LIBGL_ALWAYS_SOFTWARE=1 glxinfo | grep -i "opengl renderer"
OpenGL renderer string: Software Rasterizer

So it is clear that I am using a software driver in this case.

Querying the driver for OpenGL features

The glxinfo program also comes in handy to obtain information about the specific OpenGL features implemented by the driver. If you want to check if the Mesa driver for your hardware implements a specific OpenGL extension you can inspect the output of glxinfo and look for that extension:

# glxinfo | grep GL_ARB_texture_multisample

You can also ask glxinfo to include hardware limits for certain OpenGL features including the -l switch. For example:

# glxinfo -l | grep GL_MAX_TEXTURE_SIZE

Coming up next

In my next posts I will cover the directory structure of the Mesa repository, identifying its main modules, which should give Mesa newcomers some guidance as to where they should look for when they need to find the code that deals with something specific. We will then discuss how modern 3D hardware has changed the way GPU drivers are developed and explain how a modern 3D graphics pipeline works, which should pave the way to start looking into the real guts of Mesa: the implementation of shaders.

Diving into Mesa


In my last post I gave a quick introduction to the Linux graphics stack. There I explained how what we call a graphics driver in Linux is actually a combination of three different drivers:

  • the user space X server DDX driver, which handles 2D graphics.
  • the user space 3D OpenGL driver, that can be provided by Mesa.
  • the kernel space DRM driver.

Now that we know where Mesa fits let’s have a more detailed look into it.

DRI drivers and non-DRI drivers

As explained, Mesa handles 3D graphics by providing an implementation of the OpenGL API. Mesa OpenGL drivers are usually called DRI drivers too. Remember that, after all, the DRI architecture was brought to life precisely to enable efficient implementation of OpenGL drivers in Linux and, as I introduced in my previous post, DRI/DRM are the building blocks of the OpenGL drivers in Mesa.

There are other implementations of the OpenGL API available too. Hardware vendors that provide drivers for Linux will provide their own implementation of the OpenGL API, usually in the form of a binary blob. For example, if you have an NVIDIA GPU and install NVIDIA’s proprietary driver this will install its own

Notice that it is possible to create graphics drivers that do not follow the DRI architecture in Linux. For example, the NVIDIA proprietary driver installs a Kernel module that implements similar functionality to DRM but with a different API that has been designed by NVIDIA, and obviously, their corresponding user space drivers (DDX and OpenGL) will use this API instead of DRM to communicate with the NVIDIA kernel space driver.

Mesa, the framework

You have probably noticed that when I talk about Mesa I usually say ‘drivers’, in plural. That is because Mesa itself is not really a driver, but a project that hosts multiple drivers (that is, multiple implementations of the OpenGL API).

Indeed, Mesa is best seen as a framework for OpenGL implementators that provides abstractions and code that can be shared by multiple drivers. Obviously, there are many aspects of an OpenGL implementation that are independent of the underlying hardware, so these can be abstracted and reused.

For example, if you are familiar with OpenGL you know it provides a state based API. This means that many API calls do not have an immediate effect, they only modify the values of certain variables in the driver but do not require to push these new values to the hardware immediately. Indeed, usually that will happen later, when we actually render something by calling glDrawArrays() or a similar API: it is at that point that the driver will configure the 3D pipeline for rendering according to all the state that has been set by the previous API calls. Since these APIs do not interact with the hardware their implementation can be shared by multiple drivers, and then, each driver, in their implementation of glDrawArrays(), can fetch the values stored in this state and translate them into something meaningful for the hardware at hand.

As such, Mesa provides abstractions for many things and even complete implementations for multiple OpenGL APIs that do not require interaction with the hardware, at least not immediate interaction.

Mesa also defines hooks for the parts where drivers may need to do hardware specific stuff, for example in the implementation of glDrawArrays().

Looking into glDrawArrays()

Let’s see an example of these hooks into a hardware driver by inspecting the stacktrace produced from a call to glDrawArrays() inside Mesa. In this case, I am using the Mesa Intel DRI driver and I am calling glDrawArrays() from a function named render() in my program. This is the relevant part of the stacktrace:

brw_upload_state () at brw_state_upload.c:651
brw_try_draw_prims () at brw_draw.c:483
brw_draw_prims () at brw_draw.c:578
vbo_draw_arrays () at vbo/vbo_exec_array.c:667
vbo_exec_DrawArrays () at vbo/vbo_exec_array.c:819
render () at main.cpp:363

Notice that glDrawArrays() is actually vbo_exec_DrawArrays(). What is interesting about this stack is that vbo_exec_DrawArrays() and vbo_draw_arrays() are hardware independent and reused by many drivers inside Mesa. If you don’t have an Intel GPU like me, but also use a Mesa, your backtrace should be similar. These generic functions would usually do things like checks for API use errors, reformatting inputs in a way that is more appropriate for later processing or fetching additional information from the current state that will be needed to implement the actual operation in the hardware.

At some point, however, we need to do the actual rendering, which involves configuring the hardware pipeline according to the command we are issuing and the relevant state we have set in prior API calls. In the stacktrace above this starts with brw_draw_prims(). This function call is part of the Intel DRI driver, it is the hook where the Intel driver does the stuff required to configure the Intel GPU for drawing and, as you can see, it will later call something named brw_upload_state(), which will upload a bunch of state to the hardware to do exactly this, like configuring the various shader stages required by the current program, etc.

Registering driver hooks

In future posts we will discuss how the driver configures the pipeline in more detail, but for now let’s just see how the Intel driver registers its hook for the glDrawArrays() call. If we look at the stacktrace, and knowing that brw_draw_prims() is the hook into the Intel driver, we can just inspect how it is called from vbo_draw_arrays():

static void
vbo_draw_arrays(struct gl_context *ctx, GLenum mode, GLint start,
                GLsizei count, GLuint numInstances, GLuint baseInstance)
   struct vbo_context *vbo = vbo_context(ctx);
   vbo->draw_prims(ctx, prim, 1, NULL, GL_TRUE, start, start + count - 1,
                   NULL, NULL);

So the hook is draw_prims() inside vbo_context. Doing some trivial searches in the source code we can see that this hook is setup in brw_draw_init() like this:

void brw_draw_init( struct brw_context *brw )
   struct vbo_context *vbo = vbo_context(ctx);
   /* Register our drawing function:
   vbo->draw_prims = brw_draw_prims;

Let’s put a breakpoint there and see when Mesa calls into that:

brw_draw_init () at brw_draw.c:583
brwCreateContext () at brw_context.c:767
driCreateContextAttribs () at dri_util.c:435
dri2_create_context_attribs () at dri2_glx.c:318
glXCreateContextAttribsARB () at create_context.c:78
setupOpenGLContext () at main.cpp:411
init () at main.cpp:419
main () at main.cpp:477

So there it is, Mesa (unsurprisingly) calls into the Intel DRI driver when we setup the OpenGL context and it is there when the driver will register various hooks, including the one for drawing primitives.

We could do a similar thing to see how the driver registers its hook for the context creation. We will see that the Intel driver (as well as other drivers in Mesa) assign a global variable with the hooks they need like this:

static const struct __DriverAPIRec brw_driver_api = {
   .InitScreen           = intelInitScreen2,
   .DestroyScreen        = intelDestroyScreen,
   .CreateContext        = brwCreateContext,
   .DestroyContext       = intelDestroyContext,
   .CreateBuffer         = intelCreateBuffer,
   .DestroyBuffer        = intelDestroyBuffer,
   .MakeCurrent          = intelMakeCurrent,
   .UnbindContext        = intelUnbindContext,
   .AllocateBuffer       = intelAllocateBuffer,
   .ReleaseBuffer        = intelReleaseBuffer

PUBLIC const __DRIextension **__driDriverGetExtensions_i965(void)
   globalDriverAPI = &brw_driver_api;

   return brw_driver_extensions;

This global is then used throughout the DRI implementation in Mesa to call into the hardware driver as needed.

We can see that there are two types of hooks then, the ones that are needed to link the driver into the DRI implementation (which are the main entry points of the driver in Mesa) and then the hooks they add for tasks that are related to the hardware implementation of OpenGL bits, typically registered by the driver at context creation time.

In order to write a new DRI driver one would only have to write implementations for all these hooks, the rest is already implemented in Mesa and reused across multiple drivers.

Gallium3D, a framework inside a framework

Currently, we can split Mesa DRI drivers in two kinds: the classic drivers (not based on the Gallium3D framework) and the new Gallium drivers.

Gallium3D is part of Mesa and attempts to make 3D driver development easier and more practical than it was before. For example, classic Mesa drivers are tightly coupled with OpenGL, which means that implementing support for other APIs (like Direct3D) would pretty much require to write a completely new implementation/driver. This is addressed by the Gallium3D framework by providing an API that exposes hardware functions as present in modern GPUs rather than focusing on a specific API like OpenGL.

Other benefits of Gallium include, for example, support for various Operating Systems by separating the part of the driver that relies on specific aspects of the underlying OS.

In the last years we have seen a lot of drivers moving to the Gallium infrastructure, including nouveau (the open source driver for NVIDIA GPUs), various radeon drivers, some software drivers (swrast, llvmpipe) and more.

Gallium3D driver model (image via wikipedia)

Although there were some efforts to port the Intel driver to Gallium in the past, development of the Intel Gallium drivers (i915g and i965g) is stalled now as far as I know. Intel is focusing in the classic version of the drivers instead. This is probably because it would take a large amount of time and effort to bring the current classic driver to Gallium with the same features and stability that it has in its current classic form for many generations of Intel GPUs. Also, there is a lot of work going on to add support for new OpenGL features to the driver at the moment, which seems to be the priority right now.

Gallium and LLVM

As we will see in more detail in future posts, writing a modern GPU driver involves a lot of native code generation and optimization. Also, OpenGL includes the OpenGL Shading Language (GLSL) which directly requires to have a GLSL compiler available in the driver too.

It is no wonder then that Mesa developers thought that it would make sense to reuse existing compiler infrastructure rather than building and using their own: enter LLVM.

By introducing LLVM into the mix, Mesa developers expect to bring new and better optimizations to shaders and produce better native code, which is critical to performance.

This would also allow to eliminate a lot of code from Mesa and/or the drivers. Indeed, Mesa has its own complete implementation of a GLSL compiler, which includes a GLSL parser, compiler and linker as well as a number of optimizations, both for abstract representations of the code, in Mesa, and for the actual native code for a specific GPU, in the actual hardware driver.

The way that Gallium plugs LLVM is simple: Mesa parses GLSL and produces LLVM intermediary representation of the shader code that it can then pass to LLVM, which will take care of the optimization. The role of hardware drivers in this scenario is limited to providing LLVM backends that describe their respective GPUs (instruction set, registers, constraints, etc) so that LLVM knows how it can do its work for the target GPU.

Hardware and Software drivers

Even today I see people who believe that Mesa is just a software implementation of OpenGL. If you have read my posts so far it should be clear that this is not true: Mesa provides multiple implementations (drivers) of OpenGL, most of these are hardware accelerated drivers but Mesa also provides software drivers.

Software drivers are useful for various reasons:

  • For developing and testing purposes, when you want to take the hardware out of the equation. From this point of view, a software representation can provide a reference for expected behavior that is not tied or constrained by any particular hardware. For example, if you have an OpenGL program that does not work correctly we can run it with the software driver: if it works fine then we know the problem is in the hardware driver, otherwise we can suspect that the problem is in the application itself.
  • To allow execution of OpenGL in systems that lack 3D hardware drivers. It would obviously be slow, but in some scenarios it could be sufficient and it is definitely better than not having any 3D support at all.

I initially intended to cover more stuff in this post, but it is already getting long enough so let’s stop here for now. In the next post we will discuss how we can check and change the driver in use by Mesa, for example to switch between a software and hardware driver, and we will then start looking into Mesa’s source code and introduce its main modules.

A brief introduction to the Linux graphics stack

This post attempts to be a brief and simple introduction to the Linux graphics stack, and as such, it has an introductory nature. I will focus on giving enough context to understand the role that Mesa and 3D drivers in general play in the stack and leave it to follow up posts to dive deeper into the guts of Mesa in general and the Intel DRI driver specifically.

A bit of history

In order to understand some of the particularities of the current graphics stack it is important to understand how it had to adapt to new challenges throughout the years.

You see, nowadays things are significantly more complex than they used to be, but in the early times there was only a single piece of software that had direct access to the graphics hardware: the X server. This approach made the graphics stack simpler because it didn’t need to synchronize access to the graphics hardware between multiple clients.

In these early days applications would do all their drawing indirectly, through the X server. By using Xlib they would send rendering commands over the X11 protocol that the X server would receive, process and translate to actual hardware commands on the other side of a socket. Notice that this “translation” is the job of a driver: it takes a bunch of hardware agnostic rendering commands as its input and translates them into hardware commands as expected by the targeted GPU.

Since the X server was the only piece of software that could talk to the graphics hardware by design, these drivers were written specifically for it, became modules of the X server itself and an integral part of its architecture. These userspace drivers are called DDX drivers in X server argot and their role in the graphics stack is to support 2D operations as exported by Xlib and required by the X server implementation.

DDX drivers in the X server (image via wikipedia)

In my Ubuntu system, for example, the DDX driver for my Intel GPU comes via the xserver-xorg-video-intel package and there are similar packages for other GPU vendors.

3D graphics

The above covers 2D graphics as that is what the X server used to be all about. However, the arrival of 3D graphics hardware changed the scenario significantly, as we will see now.

In Linux, 3D graphics is implemented via OpenGL, so people expected an implementation of this standard that would take advantage of the fancy new 3D hardware, that is, a hardware accelerated However, in a system where only the X server was allowed to access the graphics hardware we could not have a that talked directly to the 3D hardware. Instead, the solution was to provide an implementation of OpenGL that would send OpenGL commands to the X server through an extension of the X11 protocol and let the X server translate these into actual hardware commands as it had been doing for 2D commands before.

We call this Indirect Rendering, since applications do not send rendering commands directly to the graphics hardware, and instead, render indirectly through the X server.

OpenGL with Indirect Rendering (image via wikipedia)

Unfortunately, developers would soon realize that this solution was not sufficient for intensive 3D applications, such as games, that required to render large amounts of 3D primitives while maintaining high frame rates. The problem was clear: wrapping OpenGL calls in the X11 protocol was not a valid solution.

In order to achieve good performance in 3D applications we needed these to access the hardware directly and that would require to rethink a large chunk of the graphics stack.

Enter Direct Rendering Infrastructure (DRI)

Direct Rendering Infrastructure is the new architecture that allows X clients to talk to the graphics hardware directly. Implementing DRI required changes to various parts of the graphics stack including the X server, the kernel and various client libraries.

Although the term DRI usually refers to the complete architecture, it is often also used to refer only to the specific part of it that involves the interaction of applications with the X server, so be aware of this dual meaning when you read about this stuff on the Internet.

Another important part of DRI is the Direct Rendering Manager (DRM). This is the kernel side of the DRI architecture. Here, the kernel handles sensitive aspects like hardware locking, access synchronization, video memory and more. DRM also provides userspace with an API that it can use to submit commands and data in a format that is adequate for modern GPUs, which effectively allows userspace to communicate with the graphics hardware.

Notice that many of these things have to be done specifically for the target hardware so there are different DRM drivers for each GPU. In my Ubuntu system the DRM module for my Intel GPU is provided via the libdrm-intel1:amd64 package.

OpenGL with Direct Rendering (image via wikipedia)

DRI/DRM provide the building blocks that enable userspace applications to access the graphics hardware directly in an efficient and safe manner, but in order to use OpenGL we need another piece of software that, using the infrastructure provided by DRI/DRM, implements the OpenGL API while respecting the X server requirements.

Enter Mesa

Mesa is a free software implementation of the OpenGL specification, and as such, it provides a, which OpenGL based programs can use to output 3D graphics in Linux. Mesa can provide accelerated 3D graphics by taking advantage of the DRI architecture to gain direct access to the underlying graphics hardware in its implementation of the OpenGL API.

When our 3D application runs in an X11 environment it will output its graphics to a surface (window) allocated by the X server. Notice, however, that with DRI this will happen without intervention of the X server, so naturally there is some synchronization to do between the two, since the X server still owns the window Mesa is rendering to and is the one in charge of displaying its contents on the screen. This synchronization between the OpenGL application and the X server is part of DRI. Mesa’s implementation of GLX (the extension of the OpenGL specification that addresses the X11 platform) uses DRI to talk to the X server and accomplish this.

Mesa also has to use DRM for many things. Communication with the graphics hardware happens by sending commands (for example “draw a triangle”) and data (for example the vertex coordinates of the triangle, their color attributes, normals, etc). This process usually involves allocating a bunch of buffers in the graphics hardware where all these commands and data are copied so that the GPU can access them and do its work. This is enabled by the DRM driver, which is the one piece that takes care of managing video memory and which offers APIs to userspace (Mesa in this case) to do this for the specific target hardware. DRM is also required whenever we need to allocate and manage video memory in Mesa, so things like creating textures, uploading data to textures, allocating color, depth or stencil buffers, etc all require to use the DRM APIs for the target hardware.

OpenGL/Mesa in the context of 3D Linux games (image via wikipedia)

What’s next?

Hopefully I have managed to explain what is the role of Mesa in the Linux graphics stack and how it works together with the Direct Rendering Infrastructure to enable efficient 3D graphics via OpenGL. In the next post we will cover Mesa in more detail, we will see that it is actually a framework where multiple OpenGL drivers live together, including both hardware and software variants, we will also have a look at its directory structure and identify its main modules, introduce the Gallium framework and more.

A tour around the world of Mesa and Linux graphics drivers

For some time now I have decided to focus my work at Igalia on the graphics stack. As a result of this I had the chance to participate in a couple of very interesting projects like implementing Wayland support in WebKitGtk+ (a topic I have visited in this blog a number of times) and, lately, work on graphics drivers for Linux in the Mesa framework.

The graphics stack in Linux is complex and it is not always easy to find information and technical documentation that can aid beginners in their firsts steps. This is usually a very demanding domain, the brave individuals who decide to put their energy into it usually have their hands full hacking on the code and they don’t have that much room for documenting what they do in a way that is particularly accessible to newcomers.

As I mentioned above, I have been hacking on Mesa lately (particularly on the Intel i965 driver) and so far it as been a lot of fun, probably the most exciting work I have done at Igalia in all these years, but it is also certainly challenging, requiring me to learn a lot of new things and some times fairly complex stuff.

Getting involved in this is no easy endeavor, the learning curve is steep because the kind of work you do here is probably unlike anything you have done before: for starters it requires a decent understanding of OpenGL and capacity to understand OpenGL specifications and what they mean in the context of the driver, you also need to have a general understanding of how modern 3D-capable GPUs work and finally, you have to dig deeper and understand how the specific GPU that your driver targets works and what is the role that the driver needs to play to make that hardware work as intended. And that’s not all of it, a driver may need to support multiple generations of GPUs which sometimes can be significantly different from each other, requiring driver developers to write and merge multiple code paths that handle these differences. You can imagine the maintenance burden and extra complexity that comes from this.

Finally, we should also consider the fact that graphics drivers are among the most critical pieces of code you can probably have in a system, they need to be performant and stable for all supported hardware generations, which adds to the overall complexity.

All this stuff can be a bit overwhelming in the beginning for those who attempt to give their first steps in this world but I believe that this initial steep learning curve can be smoothed out by introducing some of the most important concepts in a way that is oriented specifically to new developers. The rest will still not be an easy task, it requires hard work, some passion, be willing to learn and a lot of attention to detail, but I think anyone passionate enough should be able to get into it with enough dedication.

I had to go through all this process myself lately, so I figured I am in a very good situation to try and address this problem myself, so that’s why I decided to write a series of posts to introduce people to the world of Mesa and 3D graphics drivers, with a focus on OpenGL and Intel GPUs, which is the area were I am currently developing my work. Although I’ll focus on Intel hardware I believe that many of the concepts that I will be introducing here are general enough so that they are useful also to people interested in other GPUs. I’ll try to be clear about when I am introducing general concepts and when I am discussing Intel specific stuff.

My next post, which will be the first in this series, will serve as an introduction to the Linux graphics stack and Linux graphics drivers. We will discuss what Mesa brings to the table exactly and what we mean when we talk about graphics drivers in Linux exactly. I think that should put us on the right track to start looking into the internals of Mesa.

So that’s it, if you are interested in learning more about Linux graphics and specifically Mesa and 3D graphics drivers, stay tuned! I’ll try my best to post regularly and often.

Epiphany + WebKitGTK/WebKit2 + Wayland + Accelerated Compositing

In my previous post I shared that I had managed to get a basic implementation of WebKitGTK+WebKit2 to work under Wayland. I also discussed some of the pieces that were still missing, most important of which was supporting for multiple views, that is, having the possibility to run multiple browser windows/tabs that render accelerated content simultaneously.


In the last weeks I have continued making progress and I am happy to say that I have finally implemented support for this too, proof in the video below:


Support for multiple views required to implement an extension to the Wayland protocol so that we can effectively map widgets and their corresponding Wayland surfaces in our nested compositor. This is needed to know which surface provides the graphics for which widget. Thanks to Pekka Paalanen for introducing me into the world of Wayland extensions!


My work also uncovered a number of hidden bugs in WebKitGTK that were hidden by the fact that we always use a sharing context for all our GL contexts. In Wayland, however, my colleague Zan Dobersek is working on implementing support for the sharing context separately and our patches still need to be merged together, so I have been working all this time without a sharing context and that uncovered these bugs that show up when we deal with multiple views (and hence multiple GL contexts). I am still working on fixing these but in any case merging my work with Zan’s should be enough to prevent them from actually producing any harm, just like in X11. Actually, one of these bugs is the one behind the rendering issues I mentioned in my last post when clicking on the browser’s back button.


One more thing worth mentioning: I needed a full browser to test multiple browser windows and tabs, so that also led me to test all my work with Epiphany/Web, which I had not done yet (so far I had restricted myself to work only with WebKit’s MiniBrowser), that is of course the browser I use in the video above.


If you are interested in following progress more closely or want to look at the patches that enable Accelerated Compositing for WebKitGTK/WebKit2 under Wayland, here is the bug.


Finally, I would like to thank my sponsor, Igalia, for supporting this work since its inception.

WebKitGTK Wayland: Initial support for WebKit2 and Accelerated Compositing

Quick Recap

In my last post on the subject I explained how during the last WebKitGTK hackfest my colleague Eduardo Lima and I got a working GTK application that made use of the nested compositor design we need in WebKitGTK to get WebKit2 to work under Wayland and how the next steps would involve developing a similar solution inside WebKitGTK.


Current Status

During the last 2 weeks I have been working on this, and today I got Accelerated Compositing to work under Wayland and WebKit2. There are still a lot of rough edges of course since this milestone is mostly a prototype that only covers the basics. Its purpose was solely  to investigate how the nested compositor approach would work in WebKitGTK to support the Accelerated Compositing code path. Still, this is an important milestone: Accelerated Compositing and WebKit2 were the biggest missing pieces to bring Wayland support to WebKitGTK and even if this is  only a prototype it provides a solution for these two aspects, which is great news.


To Do List

There are probably a lot of things that need more work to convert this prototype into a proper solution. I have not tested it thoroughly but here is a quick list of things that I already know need more work:

  • Support for multiple windows and tabs (the prototype only supports one tab in one window)
  • For some pages the first composition can be very slow (as in taking >5 seconds). This problem only happens the first time the page is loaded, but it does not happen when  reloading the same page (the demo video below shows this)
  • Rendering of text selections does not seem to work
  • There are rendering artifacts when going back using the browser’s back button to a previously visited page that activates the Accelerated Compositing code path. If the page is reloaded things go back to normal though
  • There are some style rendering issues I have not looked into yet, might be on the side of GTK though
  • All this was tested in a Wayland environment inside an X session, so it can be that some things still need to be fixed for this to work in a pure Wayland environment (with no X server running).
  • Ideally we would like a solution that can make run-time decisions about the underlying platform (X or Wayland) so that we don’t have to build WebKitGTK specifically for one or the other. This is particularly important now that adoption of Wayland is still low. My prototype, however, only supports Wayland at the moment and would require more work to select between X and Wayland at run-time.

And there is probably a lot more to do that we will find out once we start testing this more seriously.



So here is a small video demoing the prototype. The demo uses WebKit’s MiniBrowser (WebKit2) to load 3 pages that activate the Accelerated Compositing code path in WebKitGTK. The browser is restarted for every page load only because it made it easier for me to record the video. You will see that for some pages, the first time the page  is composited it takes a long time which is one of the issues I mentioned above. The demo also shows how this is not the case when the page is reloaded:



Next Steps

Once I reached this milestone I think we should start moving things to get this  upstream as soon as possible: the current implementation provides the basics for Wayland support in WebKit2 and it would allow other interested developers to step in and help with testing, completing and fixing this initial implementation. I am sure there is still a lot of work to do for a fully operational Wayland port of WebKitGTK, so the more people who can contribute to this, the better.


I presume that upstreaming my code will still require a significant effort: my current implementation is a bit too hackish right now so there will be a lot of cleanups to do and a lot of back and forth with upstream maintainers to get the code in position to be merged, so the sooner we start the better. I also need to rebase my code against up-to-date versions of WebKitGTK and Wayland since I froze my development environment during the last WebKitGTK hackfest.


So that’s it. It is always good to reach milestones and I am happy to have reached this one in particular. If you are excited about WebKitGTK and Wayland I hope you enjoyed the news as much as I enjoyed working on it!


I would like to thank Igalia for sponsoring my work on this as well as all the other Igalians who helped me in the process, it would have not been possible without this support!

WebKitGTK+ 2013 hackfest: On the road to WebKit2 Wayland support in WebKitGTK+

So this was my first participation in the WebKitGTK+ hackfest. It was great to have some time to focus on WebKitGTK+ hacking for a few days as well as meeting other colleagues face to face to discuss various related topics, specifically the one I am most interested in: Wayland support in WebKit2.

A few months back I was reviewing the status of WebKitGTK+ in Wayland and mentioned that one of the main challenges was the multi-process architecture introduced with WebKit2, the one that Web/Epiphany is currently using.

The problem is simple enough to explain: In WebKit2, scene composition is done entirely in the Web Process and then painted on the screen in the UI Process, so we need to share a graphics surface between these two processes. In X11 we do this by having the UI Process create an offscreen XWindow and sharing the window ID with the Web Process, but in Wayland there is no direct means to share a graphics surface between two Wayland clients.

The solution for this is to use the same means that a Wayland compositor uses to share graphics with its clients. The meaning of this in the context of WebKitGTK+ is that we need to implement a small Wayland compositor that we can use to share the rendering surface. The way this would work in is like this: the UI Process would play the role of Wayland compositor and the WebProcess would ask it for  a surface using regular Wayland APIs. Since the UI Process implements a Wayland compositor it will have access to the graphics buffers rendered by the client (the Web Process) and we have our problem solved. This is quite a bit more of work than simply sharing a Xwindow ID though.

Some months ago I was prototyping a proof of concept for how this would work taking WebKit out of the equation to keep things easy as a first step. During the hackfest I had the opportunity to complete the work and bring it up to date with the current status of Wayland together with the help of my colleague Eduardo Lima This small prototype has two parts: a GTK+ application with a custom GtkContainer widget (which would play the role of the UIProcess in WebKitGTK+) and a separate Wayland client that renders a simple GL scene. The GTK program spawns another process to run the Wayland client when started and also implements the required bits of the Wayland compositor interface to serve Wayland requests as required by the client.

The point of the experiment was to get the GTK program to use the rendering results of the client to paint its own widget contents. This is basically what we need in WebKitGTK+, where the GtkWidget would be the WebView running in the UI Process and the client process would be the Web Process rendering the results of the scene composition to a GL texture.

The next step is to implement this solution in WebKitGTK+, which is a work in progress at the moment. It is still quite a bit of work since the WebKit code base is quite large and complex, but at this point I think it is only a matter of time to get a basic solution to work. Then of course we will have to deal with a lot of other details that this initial proof of concept did not care about, like  resizing, managing surfaces for multiple windows and probably a lot more stuff that will pop up along the way.

Finally, there is another interesting consideration to make. Even if the UI Process can share a graphics surface with Web Process, it still has to render it on the GTK widget’s surface. The problem here is that GTK on Wayland uses a cairo image surface as backing or the window surface, so this process involves a copy that results in bad performance. I guess this should be fixed at some point in GTK+ so we can have the same performance we currently have on X11. In the past I tried to go around this by creating accelerated Wayland subsurfaces for the widget and render to that instead. This worked well for performance but it had to be done completely outside GTK+ , and hence it breaks a number of things (for example you have to position the surface manually within the window surface, events are not managed properly, etc), so it was a no go. I suppose that if GTK+ can provide means to manage Wayland subsurfaces for a widget natively, this would also be another option to fix the performance  problem.

On WebKit, Layers and Accelerated Compositing

This post is a brief and quick introduction to Accelerated Compositing and the role this plays in WebKit.

The point of accelerated compositing in WebKit is to take advantage of the GPU to accelerate the rendering of web content. To understand how this works one needs to know how a web page is actually rendered by WebKit.

Rendering in WebKit

WebKit groups DOM elements in layers that are rendered separately and then composited together to create the final page. This enables proper handling of transparent objects and overlapping contents for example. These layers, just like the DOM, conform a tree.

WebKit defines some rules to decide when a new layer is needed and which DOM elements should be included in it. For example, if you have a WebGL canvas or a video element, these will go to separate layers, if you use explicit CSS position properties on an object you get another layer for it, etc.

When it is time to paint the web page on the browser, the work consists of traversing the layer tree and composite the layers together to create the final view of the web page. For example, if you have a transparent layer sitting on top of some other layer in the page, the composition will take the two individual layers that have been rendered independently and blend them together in the final page, in the right order, at the right position and with the right

Accelerated Compositing

Before we had accelerated compositing, the compositing process happend in software, that is, it was the CPU that would do all the layer compositing work, which is expensive and can hog the CPU, making for a worse user experience. Accelerated compositing however, involves offloading the compositing of the layers to the GPU hardware. It turns out that GPUs can do the compositing very fast and doing so would also free the CPU, delivering a better, more responsive, user experience too.


Accelerated compositing is all about making a better use of the available graphics hardware, offloading the work required to composite the final view of the webpage from the various layers it contains, which results in a better user experience and better overall rendering performance.

WebKitGTK+ / Wayland Demo and Future Work

So first things first, check out the video below to see the demo, it showcases Web (Epiphany), the default browser of the GNOME platform, running on WebKit1 under Wayland (Weston) and illustrates:

  • Browsing of regular text/image based sites
  • Embedded HTML5 video playback (Youtube and native)
  • 2D and 3D CSS transforms
  • WebGL

If you are intrigued to know more about WebKit, Wayland and Web, keep reading. If you already know about this and are more interested in knowing what is still missing, go here.

So WebKit you say, what is that and why should I care?

There is a significant chance that when you are reading this you are using WebKit. WebKit is an open source web browser engine that powers a variety of software that we run on our desktops, phones and a myriad of other gadgets every day and with the popularity that HTML5 has achieved this will only get bigger and bigger in the future.

Igalia has always considered the web the platform of the future and with the passing of the years we have seen this become more and more real. In this process we are certain that WebKit has played a very significant role, providing an open framework where many companies and individuals have worked together for years, building something that is at the core of most HTML5 based solutions out there. These companies and individuals have done a remarkable work in pushing the technology forward and make all this possible, and at Igalia we are very proud to be a part of this too.

Ok, so WebKit is important… but what does that have to do with Wayland?

WebKit and HTML5 are not the only big things happening out there. Wayland will progressively replace X in the coming years, something outstanding when you consider the fact that X has been there for 3 decades! This replacement is not trivial though, X has established the basis for graphical user interfaces since… well since before I even knew computers existed! Much of the software we use every day is based on X directly or indirectly, so with the change, a lot of software that we love will need to undergo some changes and adaptations to work with Wayland.

Platforms like GNOME and KDE have plans to support Wayland. GNOME in particular plans to have complete Wayland support some time in 2014, but for that to happen efforts have already started at many levels. GTK+, the graphical toolkit of the GNOME platform, is already compatible with Wayland, which is a big part of what needs to be done to make GNOME itself run on Wayland. Still, there are plenty of X bits in many other parts of the platform that will need to be addressed. WebKitGTK+ is no exception to this.

The work to get WebKitGTK+ ported to Wayland was initiated by my colleague José Dapena and I have recently joined this effort. I started from Jose’s patches, checking the current state of things and trying to identify the missing bits. I hope the video at the top gives a good idea of where we are now and later in this post I will explain what is still missing. For now I have focused mostly on WebKit1, but I believe most of my findings are just as valid for WebKit2. Hopefully this will help us define specific tasks and make steady progress in the months to come.

And what did you do to get that demo exactly?

Since WebKitGTK+ is based on GTK+ and GTK3 already provides a Wayland backend, we need to make sure that we use GTK3 in WebKitGTK+ and Web, which we already do (except for the plugin process which still needs GTK2, more on this later), so thankfully a lot of the heavy lifting had already been taken care of.

The second thing we need to check  is the backing store backend we use. We need to replace the X11 implementation with something more agnostic. Good that we already have a Cairo implementation of the backing store available!

A few other minor fixes aside, these two items represent most of what I needed to get the demo running.

How far are you from full Wayland support then?

1. Accelerated compositing

Current implementation of accelerated compositing depends on XComposite and XDamage. We need to identify how to re-implement this functionality in Wayland (at least the bits that depend on X directly)  while we make sure that we keep the same performance advantage.

2. Fullscreen video playback

Fullscreen video playback with GStreamer is implemented using the video overlay interface which is platform agnostic. However, even if the interface is platform agnostic, it requires GSteamer to provide a platform-specific implementation. There is a Wayland sink in GStreamer, however this sink does not implement the video overlay interface and I still don’t know if this is to be addressed in GStreamer or when. If the waylandsink in GStreamer is extended to implement the video overlay interface then the good news is that the work on the WebKit end should be fairly straight forward.

3. Plugins

Various plugins won’t work in Wayland. The Flash plugin in particular is known to only work with GTK2 (this is the reason the Plugin process in WebKit2 requires GTK2 actually). Since GTK2 does not support Wayland the consequence is that we won’t be seeing the Flash plugin in Wayland for now. We contacted Adobe some time ago on this subject to see if they would be interested in porting their Flash plugin to GTK3, but apparently this is not a priority for them. Even more, this restricts all plugins to use GTK2 and hence, it will pretty much disable all plugins under Wayland (well, at least the ones that need to render). Another example of a plugin needing work is Java, which seems to have direct X dependencies in its code leading to a nice crash when you load a Java applet for example. I did not do a comprehensive analysis of all the plugins, but this is definitely a problematic area in the short term. Of course, this will probably change with time, when Wayland becomes mainstream plugin developers will have to port the plugins or lose user base and be replaced by other implementations. Some plugins may even become completely obsolete before Wayland becomes mainstream too, that could be the case of the Flash plugin maybe. Bottom line: plugins suck and if you are a web developer you probably want to avoid them if you have an alternative.

4. WebKit2

WebKit2 brings a bigger challenge to the Wayland port due to the split process architecture that it introduces. In this context we need to make sure that all the rendering and painting that takes place in coordinated form between the UI process and the Web process use Wayland mechanisms for GPU acceleration efficiently. We still need to design a good solution for this before getting hands-on with fixing it and we will need to approach the Wayland community and discuss how WebKit2’s architecture fits in Wayland as it is today. Wayland has just integrated support for sub-surfaces and this is something that may come in handy in the context of this problem too.

5. Web (Epiphany)

Web is the default browser of the GNOME platform and a good test ground for WebKitGTK+ too. Making WebKitGTK+ Wayland compatible would not be good enough for GNOME if Web isn’t. Fortunately, WebKitGTK+ and GTK3 do most of the work for Web in this regard, although there seem to be some X interactions in Web itself they do not look difficult to replace or find a workaround for. I will use the video above to prove my point 😉


So that’s it, even if there are still some issues that need further work, WebKitGTK+ and Web can run in Wayland with few changes as they are today, including things like WebGL, CSS Transforms or embedded video playback. Next in the roadmap are accelerated compositing and WebKit2. We are planning to make progress in these areas in the coming months, so if you are as excited as we are for full Wayland support in WebKitGTK+ including WebKit2 and Accelerated Compositing, stay tuned!