Last weekend was the GStreamer Hackfest in Staines, UK, at Samsung’s premises. Samsung also sponsored the dinners and the lunches. Special thanks to Luis de Bethencourt, the almighty organizer!
My main purpose was to sip one or two pints with the GStreamer folks and, secondarily, to talk about gstreamer-vaapi, WebKitGTK+ and the new OpenGL/ES support in gst-plugins-bad.
About gstreamer-vaapi, there were a couple of questions about problems seen downstream (in the stable releases shipped by distributions), and I was happy to announce that they are mostly fixed upstream. On the other hand, Sebastian Dröge was worried about the lingering support for GStreamer 0.10, and I answered that its removal is already in the pipeline. He looked pleased.
In the WebKitGTK+ realm, I worked on a new feature: sharing the OpenGL context and the display of the browser with the GStreamer pipeline. With it, we could add GL filters to the pipeline. But honour to whom honour is due: this patch is a split of a previous patch by Philippe Normand. The ultimate goal is to ditch the custom video sink in WebKit and reuse glimagesink, with its new off-screen rendering feature.
Finally, on Sunday afternoon, I walked around Richmond, and it is beautiful.
Thanks to Igalia, Intel and all the sponsors that made the hackfest and my attendance possible.
Last Friday, the 25th of July, National Day of Galicia, started very early because I had to travel to Strasbourg, official seat of the European Parliament, not for any political duty, but for the GNOME Users and Developers European Conference, the GUADEC!
My last GUADEC was in The Hague, in 2010, though in 2012, when it was hosted in Coruña, I attended a couple of talks. Nonetheless, it had been a long time since I last met the community, and it was a pleasure to meet them again.
My biggest impression was the number of attendees. I remember the times in Turkey or in Gran Canaria where hundreds packed the auditoriums and halls. This time the audience was smaller, but that is a good thing, since now you can easily get in touch with the core developers who drive the project.
We, Igalia, as sponsors, had a banner in the main room and a table in a corridor. Here is a picture of Juan to prove it:
I also ran into Emmanuele Bassi, who was setting up a booth to show off the Endless Mobile OS, based on GNOME 3. The people at GUADEC welcomed with enthusiasm the user experience it provides and the purpose of the project. Personally, I love it. If you don’t know the project, you should visit their web site.
The first talk I attended was the classic GStreamer update by Sebastian Dröge and Tim Müller. They talked about the new features in GStreamer 1.4. Neat stuff in there. I like the new pace of GStreamer, rather than the old, stagnant evolution of the 0.10 version.
Afterwards, Jim Hall gave us a keynote about usability in GNOME. I really enjoyed that talk. He studied the usability of several GNOME applications such as Nautilus (aka Files), GEdit, Epiphany (aka Web), etc., as part of his Master’s research. It was a pleasure to hear that Epiphany is regarded as having good usability.
After lunch I was in the main room listening to Sylvain Le Bon talk about sustainable business models for free software. He covered crowdfunding, community management and related topics.
The next talk was by Christian Hergert, about his project GOM, an object mapper from GObjects to SQLite, which is used in Grilo to prevent SQL injection in some plugins that use SQLite.
The day closed with the GNOME Foundation’s team reports.
Sunday came and I arrived at the venue for the second keynote: Should We Teach The Robot To Kill, by Nathan Willis. In his particular style, Nathan presented a general survey of GNU/Linux in the automotive industry.
Next came one of the main talks from Igalia: Web 3.12: a browser to make us proud, presented by Edu. It was fairly good. Edu showed us the latest developments in WebKitGTK+ and Epiphany (aka Web). There were quite a few questions at the end of the talk. Epiphany nowadays is actively used by a lot of people in the community.
Afterwards, Zeeshan presented GNOME Boxes, a user interface for running virtual machines. Later on, Alberto Ruiz showed us Fleet Commander, a web application to handle large desktop deployments.
And we took our classic group photo:
That Sunday closed with the interns’ lightning talks. Cool stuff is being cooked by them.
On Monday I was in the venue when Emmanuele Bassi talked to us about GSK, the GTK+ Scene Graph Kit, his new project, which takes the lessons learned in Clutter as its starting point. Its objective is to have a scene graph library fully integrated in GTK+.
After lunch and the second part of the Foundation’s Annual General Meeting, Benjamin Otte gave an amusing talk about the CSS implementation in GTK+. Later, Jasper St. Pierre talked about the Wayland support in GNOME.
When the coffee break ended, the almighty Žan Doberšek gave the other talk from Igalia: Wayland support in WebKit2GTK+.
On the last day of the GUADEC, I attended Bastien Nocera’s talk: Hardware integration, the GNOME way, where he reviewed the history of his contributions to GNOME related to hardware integration, and the goal of nicely supporting most hardware in GNOME, like compasses, gyroscopes, et cetera.
Afterwards, Owen Taylor talked to us about GNOME’s continuous integration performance testing, which aims to find out exactly why one release of GNOME is faster or slower than the last.
And the third keynote came: Matthew Garrett talked to us about his experiences with the GNOME community and his vision of where it should go: enhancing the privacy and security of the users, something that many GNOMErs, such as Federico Mena, are excited about.
Later on, David King talked about his plans for Cheese, the webcam application, turning it into a DBus service, using the current development of kdbus to sandbox the interaction with the hardware.
Afterwards, Christian Hergert talked to us about his plans for Builder, a new IDE for GNOME. Promising stuff, but we will see how it goes. Christian said that he is going to spend a full year working on this project.
The GUADEC ended with the lightning talks, where I enjoyed one about the problems with current encryption and security tools.
Finally, the next GUADEC host was unveiled: the Sweden Conspiracy: Gothenburg!
I arrived in Munich on Tuesday evening, and when I reached the Marienplatz metro station, I ran into a crowd of Bayern Munich fans, chanting songs about the glory of their team, huddling and dancing. And a lot of police officers surrounding the tracks.
The workshop was organized by the W3C Web and TV Interest Group, and intended to spark discussions around how to integrate and standardize TV technologies and the Web.
On Wednesday morning, the workshop began. People from Espial and Samsung talked about HbbTV, and Japanese broadcasters talked about their Hybridcast. Both technologies seek to enhance the television experience using Internet protocols, the former for Europe and the latter for Japan. Also, broadcasters from China showed their approach using ad-hoc technologies. I have to say that Hybridcast awed me.
Afterwards, most of the workshop revolved around the problem of the companion device. People showed their solutions and proposals, in particular for device discovery, data sharing and control. All the solutions relied on WebSockets and WebRTC for the data sharing between devices.
During the panels, I greatly enjoyed the participation of Jon Piesing, in particular his slide summarizing the specifications used by HbbTV V2. It’s like juggling specs!
Finally, there were a couple of talks about miscellaneous technologies surrounding IPTV broadcasting.
The second stage of my visit to Bavaria’s capital was the GStreamer Hackfest. It was held in the Google offices, near the Marienplatz.
Christian Schaller has written a very good summary of what happened during the hackfest. From my side, I worked with Nicolas Dufresne on the v4l2 video converter for the Exynos4, which is a piece required for hardware-accelerated decoding on that platform using the v4l2 video decoder.
Some time ago I needed to jump into the fix-compile-test loop for WebKitGTK+, but in the armhf architecture, speaking in terms of Debian/Ubuntu.
For those who don’t know, WebKitGTK+ is huge, it is humongous, and it takes a lot of resources to compile. At first glance, it seemed impossible to even try to compile it natively on my hardware, which, by the way, is an Odroid-X2. So I set up a cross-compilation environment.
And I failed. I could not cross-compile the master branch of WebKitGTK+ using a bootstrapped Debian as the root file system. It is supposed to be the opposite, but the whole multiarch thing turned my good old cross-compilation setup (based on scratchbox2) into a bloody hell. Long story short, I gave up and took the idea of native builds more seriously. Besides, Ubuntu and Debian do full native builds of their distributions for armhf, not to mention that the Odroid-X2 has enough power to give it a try.
It is worth mentioning that I could not use Yocto/OE or buildroot, though I would love to, because the target was a distribution based on Debian Sid/Ubuntu, and I could not afford a chroot environment only for WebKitGTK+.
With a lot of patience I was able to compile, on the Odroid, a minimalist configuration of WebKitGTK+ without symbols. As expected, it took ages (less than 3 hours, if I remember correctly).
Quickly an idea popped up in the office: to use distcc. I grabbed as many ARMv7-based boards as I could find: another Odroid-X2, a couple of Pandaboards, an Arndale board, and an IFC6410, and installed a distcc compilation setup on them.
And yes, the compilation time went down, but not that much, though I don’t remember how much.
Many of my colleagues at the office had migrated from distcc to icecream. In particular, Juan A. Suárez told me about his experiments with icecc and his Raspberry Pi. I decided to give it a shot.
Icecream allows cross-compilation because the scheduler can deliver the tool-chain required by the requester to the compilation host.
First, you should have one or several cross tool-chains, one for each compilation tuple. In this case we will have only one: to compile on x86_64, generating code for armhf. Luckily, emdebian provides it out of the box. Nevertheless, you could use any other means to obtain it, such as crosstool.
Second, you need the icecc-create-env script to create the tarball that the scheduler will distribute to the compilation host.
The output of this script is an archive file containing all the files necessary to set up the compiler environment. By default, the file will have a random unique name like “ddaea39ca1a7c88522b185eca04da2d8.tar.bz2”. You will need to rename it to something more expressive.
Third, copy the generated archive file to the board where your code, in this case WebKitGTK+, will be compiled and linked.
For the purpose of this text, I assume that the icecc daemon is already installed and configured on the board. Besides, I use ccache too. Hence my environment variables are more or less like these:
CCACHE_DIR=/mnt/hd/.ccache # /mnt/hd is a mounted hard disk through USB.
PATH=/usr/lib/ccache:.. # where Debian sets the compiler's symbolic links
Finally, the last bit of magic is the environment variable ICECC_VERSION. This variable needs to have this pattern:
Where <native_archive_file> is the archive file with the native tool-chain. <platform> is the host hardware architecture. <cross_archive_file> is the archive file with the cross tool-chain. <target> is the target architecture of the cross tool-chain.
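As a rough illustration, the pattern described above can be assembled programmatically. The sketch below assumes the layout documented by Icecream, `<native_archive_file>(,<platform>:<cross_archive_file>(=<target>))*`; the helper name `build_icecc_version` and all the paths are hypothetical, not the ones used in my setup.

```python
def build_icecc_version(native_archive, cross_archives=()):
    """Compose an ICECC_VERSION value (hypothetical helper).

    cross_archives is a sequence of (platform, archive, target) tuples;
    target may be None when it is not needed.
    """
    parts = [native_archive]
    for platform, archive, target in cross_archives:
        entry = f"{platform}:{archive}"
        if target:
            entry += f"={target}"
        parts.append(entry)
    return ",".join(parts)


# Hypothetical paths, for illustration only.
value = build_icecc_version(
    "/path/to/native-armhf.tar.gz",
    [("x86_64", "/path/to/cross-x86_64-armhf.tar.gz", None)],
)
print(f"ICECC_VERSION={value}")
```

The first entry is always the native tool-chain; each following entry tells the scheduler which archive to ship to hosts of a given platform.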
In my case, the target is not needed because I’m doing native compilation in armhf. Hence, my ICECC_VERSION environment variable looks like this:
Basically we can perceive a browser as an application for retrieving, presenting and traversing information on the Web.
For the composited video support, we are interested in the presentation task of the browser. More particularly, in the graphical presentation.
In WebKit, each HTML element on a web page is stored as a tree of Node objects called the DOM tree.
Then, each Node that produces visual output has a corresponding RenderObject, and they are stored in another tree, called the Render Tree.
Finally, each RenderObject is associated with a RenderLayer. These RenderLayers exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc.
It is worth mentioning that there is not a one-to-one correspondence between RenderObjects and RenderLayers, and that there is a RenderLayer tree as well.
WebKit fundamentally renders a web page by traversing the RenderLayer tree.
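The relation between the three trees can be sketched with a toy model. This is only an illustration in Python, not WebKit’s C++ code: the class names mirror the concepts above, while the `visible` and `own_layer` flags and the `build_render_tree` and `paint_order` helpers are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RenderLayer:
    name: str
    children: List["RenderLayer"] = field(default_factory=list)


@dataclass
class RenderObject:
    node_name: str
    layer: RenderLayer  # several RenderObjects may share one RenderLayer


@dataclass
class Node:
    name: str
    visible: bool = True     # hypothetical: does the node produce visual output?
    own_layer: bool = False  # hypothetical: e.g. overlapping or transparent content
    children: List["Node"] = field(default_factory=list)


def build_render_tree(node, current_layer, render_objects):
    """Walk the DOM, creating RenderObjects only for nodes with visual output."""
    if not node.visible:
        return
    if node.own_layer:
        layer = RenderLayer(node.name)
        current_layer.children.append(layer)
        current_layer = layer
    render_objects.append(RenderObject(node.name, current_layer))
    for child in node.children:
        build_render_tree(child, current_layer, render_objects)


def paint_order(layer):
    """Rendering boils down to traversing the RenderLayer tree."""
    order = [layer.name]
    for child in layer.children:
        order.extend(paint_order(child))
    return order


dom = Node("html", children=[
    Node("head", visible=False),
    Node("body", children=[Node("p"), Node("video", own_layer=True)]),
])
root_layer = RenderLayer("root")
render_objects = []
build_render_tree(dom, root_layer, render_objects)
print([o.node_name for o in render_objects])
print(paint_order(root_layer))
```

In this model four RenderObjects end up spread over just two RenderLayers, which is the lack of one-to-one correspondence mentioned above.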
What is accelerated compositing?
WebKit has two paths for rendering the contents of a web page: the software path and hardware accelerated path.
The software path is the traditional model, where all the work is done in the main CPU. In this mode, RenderObjects paint themselves into the final bitmap, compositing a final layer which is presented to the user.
In the hardware accelerated path, some of the RenderLayers get their own backing surface into which they paint. Then, all the backing surfaces are composited onto the destination bitmap, and this task is the responsibility of the compositor.
With the introduction of compositing an additional conceptual tree is added: the GraphicsLayer tree, where each RenderLayer may have its own GraphicsLayer.
In the hardware accelerated path, the GPU is used for compositing some of the RenderLayer contents.
As Iago said, accelerated compositing involves offloading the compositing of the GraphicsLayers onto the GPU, since it does the compositing very fast, relieving the CPU of that burden and delivering a better and more responsive user experience.
Although there are other options, OpenGL is typically used to render computer graphics, interacting with the GPU to achieve hardware acceleration. And WebKit provides a cross-platform implementation to render with OpenGL.
How does WebKit paint using OpenGL?
Ideally, we could go from the GraphicsLayer tree directly to OpenGL, traversing it and drawing the texture-backed layers with a common WebKit implementation.
But an abstraction layer was needed because different GPUs may behave differently, they may offer different extensions, and we still want to use the software path if hardware acceleration is not available.
This abstraction layer is known as the Texture Mapper, a light-weight scene-graph implementation specially attuned for an efficient usage of the GPU.
It is a combination of a specialized accelerated drawing context (TextureMapper) and a scene-graph (TextureMapperLayer):
The TextureMapper is an abstract class that provides the necessary drawing primitives for the scene-graph. Its purpose is to abstract different implementations of the drawing primitives from the scene-graph.
One of the implementations is the TextureMapperGL, which provides a GPU-accelerated implementation of the drawing primitives, using shaders compatible with GL/ES 2.0.
There is a TextureMapperLayer which may represent a GraphicsLayer node in the GPU-renderable layer tree. The TextureMapperLayer tree is equivalent to the GraphicsLayer tree.
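The split between the drawing context and the scene-graph can be pictured with a small model. The following is a hedged sketch in Python, not WebKit’s actual C++ interfaces: `draw_texture`, `RecordingTextureMapper` and the layer fields are hypothetical names, chosen only to show how the scene-graph stays independent from the backend that draws.

```python
from abc import ABC, abstractmethod


class TextureMapper(ABC):
    """Drawing context: the primitives the scene-graph relies on."""

    @abstractmethod
    def draw_texture(self, texture_id, rect, opacity):
        ...


class RecordingTextureMapper(TextureMapper):
    """Hypothetical backend that records calls instead of issuing GL commands."""

    def __init__(self):
        self.calls = []

    def draw_texture(self, texture_id, rect, opacity):
        self.calls.append((texture_id, rect, opacity))


class TextureMapperLayer:
    """Scene-graph node, mirroring one GraphicsLayer."""

    def __init__(self, texture_id, rect, opacity=1.0):
        self.texture_id = texture_id
        self.rect = rect
        self.opacity = opacity
        self.children = []

    def paint(self, mapper, parent_opacity=1.0):
        # Opacity accumulates down the tree; the backend only sees primitives.
        opacity = parent_opacity * self.opacity
        mapper.draw_texture(self.texture_id, self.rect, opacity)
        for child in self.children:
            child.paint(mapper, opacity)


root = TextureMapperLayer(1, (0, 0, 800, 600))
video = TextureMapperLayer(2, (100, 100, 640, 360), opacity=0.5)
root.children.append(video)

mapper = RecordingTextureMapper()
root.paint(mapper)
print(mapper.calls)
```

Swapping `RecordingTextureMapper` for a GL-backed or a software-backed class would not touch the layer tree, which is the point of the abstraction.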
How does WebKitGTK+ play a video?
As we stated earlier, in WebKit each HTML element on a web page is stored as a Node in the DOM tree. And WebKit provides a Node class hierarchy for all the HTML elements. In the case of the video tag there is a parent class called HTMLMediaElement, which aggregates a common, cross-platform media player. The MediaPlayer is a decorator for a platform-specific media player known as MediaPlayerPrivate.
All of the above is shown in the next diagram.
In the GTK+ port the audio and video decoding is done with GStreamer. In the case of video, a special GStreamer video sink injects the decoded buffers into the WebKit process. You can think about it as a special kind of GstAppSink, and it is part of the WebKitGTK+ code-base.
And we come back to the two paths for content rendering in WebKit:
In the software path the decoded video buffers are copied into a Cairo surface.
But in the hardware accelerated path, the decoded video buffers shall be uploaded into an OpenGL texture. When a new video buffer is available to be shown, a message is sent to the GraphicsLayer asking for a redraw.
Uploading video buffers into GL textures
When we are dealing with big enough buffers, such as the high definition video buffers, copying buffers is a performance killer. That is why zero-copy techniques are mandatory.
Even more, when we are working in a multi-processor environment, such as those where we have a CPU and a GPU, switching buffers among the processors’ contexts is also very expensive.
It is for these reasons that the video decoding and the OpenGL texture handling should happen only in the GPU, without context switching and without copying memory chunks.
The simplest approach could be for the decoder to deliver an EGLImage, so we could bind the handle to the texture. As far as I know, the gst-omx video decoder on the Raspberry Pi works in this way.
GStreamer added a new API, which will be available in version 1.2, to upload video buffers into a texture efficiently: GstVideoGLTextureUploadMeta. This API is exposed through the buffer’s metadata, and ought to be implemented by any downstream element that deals with the decoded video frames, most commonly the video decoder.
For example, in gstreamer-vaapi there are a couple of patches (which are still a work in progress) in bugzilla, enabling this API. At the low level, calling gst_video_gl_texture_upload_meta_upload() will call vaCopySurfaceGLX(), which will do an efficient copy of the VA-API surface into a texture using a GLX extension.
This is an old demo, from when all the pieces started to fit, so it does not reflect the current performance. Still, it shows what has been achieved:
So far, all these bits are already integrated in WebKitGTK+ and GStreamer. Nevertheless there are some open issues.
gstreamer-vaapi et al.:
GStreamer 1.2 is not released yet, and its new API might change. Also, the port of gstreamer-vaapi to GStreamer 1.2 is still a work in progress, and the available patches may have rough edges. Also, there are many other projects that need to be updated to this new API, such as clutter-gst, and that could provide more feedback to the community.
Another important thing is to have more GStreamer elements implementing these new APIs, such as the texture upload and the caps features.
The composited video task unveiled a major problem in WebKitGTK+: it does not handle the vertical blank interval at all, causing tearing artifacts that are clearly observable in high-resolution videos with high motion. WebKitGTK+ composites the scene off-screen, using an X Composite redirected window, and then displays it in an X Damage callback, but currently GTK+ does not take care of the vertical blank interval, causing this tearing artifact in heavy compositions.
At Igalia, we are currently researching a way to fix this issue.
There is always room for performance improvement. And we are always aiming in that direction, improving the frame rate, the CPU, GPU and memory usage, et cetera.
So, stay tuned, or even better, come and help us.
Last week, from the 28th to the 31st of March, some of us gathered in Milan to hack on some bits of the GStreamer internals. For me it was a great experience to interact with great hackers such as Sebastian Dröge, Wim Taymans, Edward Hervey, Alessandro Decina and many more. We talked about GStreamer and, more particularly, we agreed on new features which I would like to discuss here.
For the sake of completeness, let me say that I have been interested in hardware accelerated multimedia for a while, and just lately I started to get my feet wet in VAAPI and VDPAU, and their support in our beloved GStreamer.
The first feature that reached upstream is GstContext. Historically, in 2011, Nicolas Dufresne added GstVideoContext as an interface to share a video context (such as a display name, an X11 display, a VA-API display, etc.) among the pipeline elements and the application. But now Sebastian has generalized the interface into a container that stores and shares any kind of context between multiple elements and the application.
The first approach, that is still living in gst-plugins-bad, was merely a wrapper to a custom query to set or request a video context. But now, the context sharing is part of the pipeline setup.
An element that needs a shared context must follow these actions:
Check if the element already has a context
Query downstream for the context
Post a message in the bus to see if the application has one to share.
Create the context if there is none, post a message and send an event letting the others know that the element has the context.
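The four steps above can be sketched as a small simulation. This is a toy model in Python, not the GStreamer C API: the `Bus` and `Element` classes, the `ensure_context` method and the way the application answers are all assumptions for illustration; only the message names are borrowed loosely from GStreamer’s need-context and have-context messages.

```python
class Bus:
    """Hypothetical stand-in for the pipeline bus and the application behind it."""

    def __init__(self, app_context=None):
        self.app_context = app_context
        self.messages = []

    def post(self, message, payload=None):
        self.messages.append((message, payload))
        # In this model the application answers a "need-context" message directly.
        if message == "need-context":
            return self.app_context
        return None


class Element:
    def __init__(self, name, bus, downstream=None, context=None):
        self.name = name
        self.bus = bus
        self.downstream = downstream
        self.context = context
        self.events_sent = []

    def query_context(self):
        """Answer a context query, forwarding it downstream if needed."""
        if self.context is not None:
            return self.context
        if self.downstream is not None:
            return self.downstream.query_context()
        return None

    def ensure_context(self, create):
        # 1. Check if the element already has a context.
        if self.context is not None:
            return self.context
        # 2. Query downstream for the context.
        if self.downstream is not None:
            self.context = self.downstream.query_context()
        # 3. Post a message in the bus to see if the application has one.
        if self.context is None:
            self.context = self.bus.post("need-context", self.name)
        # 4. Create the context, then announce it with a message and an event.
        if self.context is None:
            self.context = create()
            self.bus.post("have-context", self.context)
            self.events_sent.append(("context", self.context))
        return self.context


bus = Bus()
sink = Element("sink", bus, context="display-0")
decoder = Element("decoder", bus, downstream=sink)
print(decoder.ensure_context(lambda: "new-display"))

lone = Element("lone", bus)
print(lone.ensure_context(lambda: "new-display"))
print(bus.messages)
```

The decoder reuses the context its downstream sink already holds, while the lone element falls through every step and ends up creating and announcing its own.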
Also in 2011, Nicolas Dufresne added a helper class to upload a buffer into a surface (OpenGL texture, VA-API surface, Wayland surface, etc.). This is quite important since the new video players are scene based, using frameworks such as Clutter or OpenGL directly, where the video display is composed of various actors, such as the multimedia control widgets.
But still, this interface didn’t fit well for GStreamer 1.0, until now, where it was introduced in the figure of a buffer’s meta, though this meta is only specific for OpenGL textures. If the buffer provides this new GstVideoGLTextureUploadMeta meta, a new function gst_video_gl_texture_upload_meta_upload() is available to upload that buffer into an OpenGL texture specified by its numeric identifier.
Obviously, in order to use this meta, it should be proposed for allocation by the sink. Again, you can see the case of eglglesink as an example.
The caps features are a new data type to specify a particular extension or requirement for the handled media.
From the practical point of view, we can say that caps structures with the same name but with a non-equal set of caps features are not compatible, and, if a pad supports multiple sets of features it has to add multiple equal structures with different feature sets to the caps.
Empty GstCapsFeatures are equivalent to the GstCapsFeatures for plain system memory. Other examples would be a specific memory type or the requirement of having a specific meta on the buffer.
This is a feature which has been pulled by Edward Hervey. The idea is that the video codec parsers (H264, MPEG, VC1) attach a meta to the buffer, with a defined structure, carrying the new information provided by the encoded stream.
This is particularly useful for the decoders, which will not have to parse the buffer again in order to extract the information they need to decode the current buffer and the following ones.
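The compatibility rule can be illustrated with a toy model. This is a sketch in Python rather than the GStreamer API: caps are modelled as lists of (name, features) pairs, the fields of the structures are ignored, and the helper names are made up. The feature strings follow the naming used by GStreamer, but treat them as illustrative.

```python
SYSTEM_MEMORY = "memory:SystemMemory"


def normalize(features):
    """Treat an empty feature set as plain system memory."""
    return frozenset(features) or frozenset({SYSTEM_MEMORY})


def structures_compatible(a, b):
    """a and b are (name, features) pairs; fields are ignored in this sketch."""
    (name_a, feat_a), (name_b, feat_b) = a, b
    return name_a == name_b and normalize(feat_a) == normalize(feat_b)


def caps_can_intersect(caps_a, caps_b):
    """Two caps are compatible if any pair of their structures is."""
    return any(structures_compatible(a, b) for a in caps_a for b in caps_b)


# A pad supporting two feature sets lists the same structure twice.
sink_caps = [
    ("video/x-raw", {"meta:GstVideoGLTextureUploadMeta"}),
    ("video/x-raw", set()),
]
gl_only_caps = [("video/x-raw", {"meta:GstVideoGLTextureUploadMeta"})]
decoder_caps = [("video/x-raw", {SYSTEM_MEMORY})]

print(caps_can_intersect(sink_caps, decoder_caps))
print(caps_can_intersect(gl_only_caps, decoder_caps))
```

The first check succeeds only through the sink’s second structure, whose empty feature set counts as system memory; the GL-only caps share the structure name with the decoder but not the features, so they do not match.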
Next Thursday I’ll be flying to Milan to attend the 2013 edition of the GStreamer Hackfest. My main interests are hardware codecs and GL integration, particularly VA API integrated with GL-based sinks.
Three weeks have passed since I wrote the last WebKit report, and they went by so quickly that it scares me. Many great things have happened since then.
Let’s start with my favorite area: multimedia. Phil landed a patch that avoids muting the sound if the audio pitch is preserved. And Calvaris finally landed his great new media controls. Now watching videos in WebKitGTK+ is a pleasure.
Claudio, besides his work on the snapshots API that we already mentioned, resumed the implementation of the notifications API for WebKitGTK+. And, while implementing it, he fixed some crashers in WK2’s implementation. He has also given us an early screencast with the status of the notifications implementation: Check it out! (video).
Carlos García Campos, besides working hard on the stable and development releases of the WebKitGTK+ library, has also landed a couple of fixes. Meanwhile, Dape removed some dependencies, making the code base cleaner.