Reworking string management in WebKit’s GStreamer code

When you work on WebKit’s GStreamer code, you end up dealing with strings a lot. GStreamer and GLib APIs use C strings everywhere: element names, property values, caps descriptions, SDP fields… The list is long. The problem was that we were handling all those strings in a somewhat inconsistent way, and the interaction between C strings and WebKit’s own string types was not as clean as it should be.

This is why I started a meta-effort (Bug 289787) to rework how we manage strings in the GStreamer code of WebKit. The outcome is mainly two new classes: CStringView and GMallocString. In this post I want to explain what they are and why they were needed.

The problem

WebKit has its own string types, like WTF::String and WTF::CString, which are designed around WebKit’s internal needs. They handle reference counting, different encodings, and all kind of operations. But when you interface with GLib and GStreamer, you receive char* pointers from C APIs that are typically UTF-8 encoded and null-terminated. Converting back and forth between these and WebKit strings was often done in ad-hoc ways, sometimes creating unnecessary copies and sometimes not being very explicit about ownership.

Another issue was encoding. WebKit internally works with different encodings (Latin1, UTF-16), but all the strings coming from GLib and GStreamer are UTF-8. There was no compile-time enforcement of this distinction, so it was easy to mix things up.

At the beginning, using StringView to handle these C strings was kind of enough. It had a rawCharacters method that allowed us to just wrap the C string pointer and work with it, even though there was no way to enforce encoding correctness. It was not ideal, but it worked. Then WebKit began to move towards using spans and that is when the real problems popped up: spans carry a pointer and a size, but the size was not accounting for the null terminator. So when you wrapped a null-terminated C string into a span, the null terminator was left out, and if any code down the line relied on it being there, you had memory issues. This was the moment it became clear that we needed a proper type that understood null termination as a first-class concept.

CStringView: a non-owning view of null-terminated UTF-8 strings

My first version of CStringView was actually a bigger class. I had added many string operations as member functions: startsWith, endsWith, contains, find, case-insensitive comparisons… It felt natural to me as a developer — you go to the header of a class and you see everything you can do with it. But during the review of PR #51619, Darin Adler made a very good point: CStringView should only handle what makes it special, which is the null termination guarantee. All the other string operations should work on spans, because they do not depend on null termination and should not be duplicated for every string type. If you have a CStringView named view, you just write contains(view.span(), ...) instead of view.contains(...). This way, the same functions work for any span of characters, and CStringView stays small and focused. I was initially skeptical, but after writing the code I have to say he was right.

So the final CStringView is a lightweight, non-owning view over a null-terminated UTF-8 string. Think of it like std::string_view but with two key differences: it guarantees null termination and it works with char8_t instead of char.

class CStringView final {
    // ...
    static CStringView unsafeFromUTF8(const char* string);
    static CStringView fromUTF8(std::span<const char8_t> spanWithNullTerminator);

    const char* utf8() const;
    size_t lengthInBytes() const;
    std::span<const char8_t> span() const;
    std::span<const char8_t> spanIncludingNullTerminator() const;
    bool isEmpty() const;
    bool isNull() const;
};

Using char8_t is important because it prevents mixing encodings at compile time. If you try to pass a char* to something expecting char8_t, the compiler will complain. This is exactly what we wanted: making encoding mismatches a compilation error rather than a runtime bug. This requirement came from Geoffrey Garen during earlier reviews to ensure we were not mixing Latin1 or UTF-16 with this class. You might have noticed the unsafeFromUTF8 factory method: the unsafe prefix is a WebKit convention for APIs that deal with raw pointers without a known size. Since a bare const char* has no size information, we have to trust the caller and compute the length with strlen, which is inherently unsafe. The safe counterpart, fromUTF8, takes a std::span where the size is already known.

CStringView also has a utf8() method that returns a plain const char* for interfacing with C APIs that expect it. The class is designed to sit on the stack (heap allocation is forbidden) and it carries no overhead since it is essentially just a span.

The work on making the span-based operations available in StringCommon (Bug 299946) involved improving the templates to support char8_t spans, so that operations like startsWith, endsWith, contains, find and case-insensitive comparisons work seamlessly. Many of these methods were only templated for Latin1Character and needed some tweaking to also accept char8_t. On top of that, some of these functions accepted different character types for their different parameters, but we wanted to ensure that char8_t was checked at compile time so that it could not be mixed with any other encoding-incompatible type. For example, comparing a UTF-8 code unit with a Latin1 character byte by byte is simply incorrect for non-ASCII characters, so the compiler should prevent you from doing it in the first place. These utility functions can be found in StringCommon.h and StdLibExtras.h.

After the class was ready, I went through the GStreamer code and increased its use (Bug 299443). This touched a significant amount of files across the WebRTC, media player, and media stream code, replacing raw const char* juggling with proper CStringView usage.

If you ever need to convert a CStringView into a String, for example to supply it to a WebCore API, the proper way is straightforward: String newString = cStringView.span();. The String constructor that takes a std::span<const char8_t> handles the UTF-8 conversion correctly without needing any intermediate wrappers.

For the opposite direction, converting a String into a CStringView, the path goes through CString: you first call .utf8() on the String to get a CString (which performs the encoding conversion), and then wrap it with CStringView::unsafeFromUTF8(cString.data()). It is important to keep the CString alive as long as the CStringView is in use, since CStringView does not own the data. That said, this kind of conversion is not common in practice. If you already have a String, the logical step is to keep using it all the way to the end of the WebKit API onion and, when you finally need a const char* for a C API, just call myString.utf8().data().

GMallocString: an owning wrapper for GLib-allocated strings

The second piece was GMallocString (Bug 303909).

Many GLib and GStreamer APIs return newly allocated strings that you must free with g_free(). Before GMallocString, we had GUniquePtr<char> for this, but it was just a raw pointer wrapper with no string-specific functionality. You could not easily compare it, convert it to a WebKit string, or even get its length without calling strlen yourself.

GMallocString solves this by wrapping an owned, g_malloc-allocated, null-terminated UTF-8 string. It can adopt strings in several ways:

// From a raw char* (takes ownership):
auto str = GMallocString::unsafeAdoptFromUTF8(g_strdup("hello"));

// From a GUniquePtr<char>:
GUniquePtr<char> gstr(g_strdup("world"));
auto str2 = GMallocString::unsafeAdoptFromUTF8(WTFMove(gstr));

// Copy from a CStringView:
GMallocString str3(myCStringView);

Once you have a GMallocString, you get the same span(), utf8(), lengthInBytes() interface as CStringView. You can compare them with ==, convert to CStringView with toCStringView(), and use them with safePrintfType for safe logging. The class is move-only (no copies), which is the right semantics for an owning wrapper.

The nice thing is that GMallocString preserves the original null-terminated C string without performing any copy when adopting, and it frees it with g_free() when it goes out of scope. This makes it a perfect fit for the GLib/GStreamer interop pattern where you receive an allocated string and need to use it for a while before discarding it.

The bigger picture

These two classes together cover the two main use cases for C string handling in our code:

  • You do not own the string: use CStringView. No allocations, no copies, just a view. Ideal for parameters, string literals, and temporary references.
  • You own a GLib-allocated string: use GMallocString. It adopts the allocation, provides the same interface, and cleans up when done.

Both enforce UTF-8 encoding through char8_t, both provide span-based access for interacting with the rest of WTF’s string infrastructure, and both support equality comparisons with each other and with ASCIILiteral.

The end result is that the GStreamer code in WebKit is now more explicit about string ownership and encoding, and there are fewer raw char* pointers floating around without clear semantics. I also found and fixed a bug in CString::isEmpty() (Bug 303428) along the way, which was returning size_t instead of bool.

I think the codebase is in a much better shape now in this regard. There is still work to do, but the foundations are there.

Thanks go to my reviewers Darin Adler, Philippe Normand and Adrian Perez de Castro for their patience and thorough reviews, and to Igalia for sponsoring this work.

Serious Encrypted Media Extensions on GStreamer based WebKit ports

Encrypted Media Extensions (a.k.a. EME) is the W3C standard for encrypted media in the web. This way, media providers such as Hulu, Netflix, HBO, Disney+, Prime Video, etc. can provide their contents with a reasonable amount of confidence that it will make it very complicated for people to “save” their assets without their permission. Why do I use the word “serious” in the title? In WebKit there is already support for Clear Key, which is the W3C EME reference implementation but EME supports more encryption systems, even privative ones (I have my opinion about this, you can ask me privately). No service provider (that I know) supports Clear Key, they usually rely on Widevine, PlayReady or some other.

Three years ago, my colleague Žan Doberšek finished the implementation of what was going to be the shell of WebKit’s modern EME implementation, following latest W3C proposal. We implemented that downstream (at Web Platform for Embedded) as well using Thunder, which includes as a plugin a fork of what was Open Content Decryption Module (a.k.a. OpenCDM). The OpenCDM API changed quite a lot during this journey. It works well and there are millions of set-top-boxes using it currently.

The delta between downstream and the upstream GStreamer based WebKit ports was quite big, testing was difficult and syncing was not always easy, so we decided reverse the situation.

Our first step was done by my colleague Charlie Turner, who made Clear Key work upstream again while adapted some changes the Apple folks had done meanwhile. It was amazing to see Clear Key tests passing again and his work with the CDMProxy related classes was awesome. After having ClearKey working, I had to adapt them a bit to accomodate Thunder. To explain a bit about the WebKit EME architecture, I must say that there are two layers. The first is the crossplatform one, which implements the W3C API (MediaKeys, MediaKeySession, CDM…). These classes rely on the platform ones (CDMPrivate, CDMInstance, CDMInstanceSession) to handle the platform management, message exchange, etc. which would be the second layer. Apple playback system is fully integrated with their DRM system so they don’t need anything else. We do because we need to integrate our own decryptors to defer to Thunder for decryption so in the GStreamer based ports we also need the CDMProxy related classes, which would be CDMProxy, CDMInstanceProxy, CDMInstanceSessionProxy… The last two extend CDMInstance and CDMInstanceSession respectively to be able to deal with the key management, that is abstracted to the KeyHandle and KeyStore.

Once the abstraction is there (let’s remember that the abstranction works both for Clear Key and Thunder), the Thunder implementation is quite simple, just gluing the CDMProxy, CDMInstanceProxy and CDMInstanceSessionProxy classes to the Thunder system and writing a GStreamer decryptor element for it. I might have made a mistake when selecting the files but considering Thunder classes + the GStreamer common decryptor code, cloc says it is just 1198 lines of platform code. I think it is pretty low for what it does. Apart from that, obviously, there are 5760 lines of crossplatform code.

To build and run all this you need to do several things:

  1. Build the dependencies with WEBKIT_JHBUILD=1 JHBUILD_ENABLE_THUNDER="yes" to enable the old fashioned JHBuild build and force it to build the Thunder dependencies. All dependendies are on JHBuild, even Widevine is referenced but to download it you need the proper credentials as it is closed source.
  2. Pass --thunder when calling build-webkit.sh.
  3. Run MiniBrowser with WEBKIT_GST_EME_RANK_PRIORITY="Thunder" and pass parameters --enable-mediasource=TRUE --enable-encrypted-media=TRUE --autoplay-policy=allow. The autoplay policy is usually optional but in this case it is necessary for the YouTube TV tests. We need to give the Thunder decryptor a higher priority because of WebM, that does not specify a key system and without it the Clear Key one can be selected and fail. MP4 does not create trouble because the protection system is specified and the caps negotiation does its magic.

As you could have guessed if you have a closer look at the GStreamer JHBuild moduleset, you’ll see that only Widevine is supported. To support more, you only have to make them build in the Thunder ecosystem and add them to CDMFactoryThunder::supportedKeySystems.

When I coded this, all YouTube TV tests for Widevine were green in the desktop. At the moment of writing this post they aren’t because of some problem with the Widevine installation that will be sorted quickly, I hope.

VCR to WebM with GStreamer and hardware encoding

My family had bought many years ago a Panasonic VHS video camera and we had recorded quite a lot of things, holidays, some local shows, etc. I even got paid 5000 pesetas (30€ more than 20 years ago) a couple of times to record weddings in a amateur way. Since my father passed less than a year ago I have wanted to convert those VHS tapes into something that can survive better technologically speaking.

For the job I bought a USB 2.0 dongle and connected it to a VHS VCR through a SCART to RCA cable.

The dongle creates a V4L2 device for video and is detected by Pulseaudio for audio. As I want to see what I am converting live I need to tee both audio and video to the corresponding sinks and the other part would go to to the encoders, muxer and filesink. The command line for that would be:

gst-launch-1.0 matroskamux name=mux ! filesink location=/tmp/test.webm \
v4l2src device=/dev/video2 norm=255 io-mode=mmap ! queue ! vaapipostproc ! tee name=video_t ! \
queue ! vaapivp9enc rate-control=4 bitrate=1536 ! mux.video_0 \
video_t. ! queue ! xvimagesink \
pulsesrc device=alsa_input.usb-MACROSIL_AV_TO_USB2.0-02.analog-stereo ! 'audio/x-raw,rate=48000,channels=2' ! tee name=audio_t ! \
queue ! pulsesink \
audio_t. ! queue ! vorbisenc ! mux.audio_0

As you can see I convert to WebM with VP9 and Vorbis. Something interesting can be passing norm=255 to the v4l2src element so it’s capturing PAL and the rate-control=4 for VBR to the vaapivp9enc element, otherwise it will use cqp as default and file size would end up being huge.

You can see the pipeline, which is beatiful, here:

As you can see, we’re using vaapivp9enc here which is hardware enabled and having this pipeline running in my computer was consuming more or less 20% of CPU with the CPU absolutely relaxed, leaving me the necessary computing power for my daily work. This would not be possible without GStreamer and GStreamer VAAPI plugins, which is what happens with other solutions whose instructions you can find online.

If for some reason you can’t find vaapivp9enc in Debian, you should know there are a couple of packages for the intel drivers and that the one you should install is intel-media-va-driver. Thanks go to my colleague at Igalia Víctor Jáquez, who maintains gstreamer-vaapi and helped me solving this problem.

My workflow for this was converting all tapes into WebM and then cutting them in the different relevant pieces with PiTiVi running GStreamer Editing Services both co-maintained by my colleague at Igalia, Thibault Saunier.

Some rough numbers on WebKit code

My wife asked me for some rough LOC numbers on the WebKit project and I think I could share them with you here as well. They come from r221232. As I’ll take into account some generated code it is relevant to mention that I built WebKitGTK+ with the default CMake options.

First thing I did was running sloccount Source and got the following numbers:

cpp: 2526061 (70.57%)
ansic: 396906 (11.09%)
asm: 207284 (5.79%)
javascript: 175059 (4.89%)
java: 74458 (2.08%)
perl: 73331 (2.05%)
objc: 44422 (1.24%)
python: 38862 (1.09%)
cs: 13011 (0.36%)
ruby: 11605 (0.32%)
xml: 11396 (0.32%)
sh: 3747 (0.10%)
yacc: 2167 (0.06%)
lex: 1007 (0.03%)
lisp: 89 (0.00%)
php: 10 (0.00%)

This number do not include IDL code so I did some grepping to get the number myself that gave me 19632 IDL lines:

$ find Source/ -name ".idl" | xargs cat | grep -ve "^[[:space:]]\/*" -ve "^[[:space:]]*" -ve "^[[:space:]]$" -ve "^[[:space:]][$" -ve "^[[:space:]]};$" | wc -l
19632

The interesting part of the IDL files is that they are used to generate code so those 19632 IDL lines expand to:

ansic: 699140 (65.25%)
cpp: 368720 (34.41%)
python: 1492 (0.14%)
xml: 1040 (0.10%)
javascript: 883 (0.08%)
asm: 169 (0.02%)
perl: 11 (0.00%)

Let’s have a look now at the LayoutTests (they test the functionality of WebCore + the platform). Tests are composed mainly by HTML files so if you run sloccount LayoutTests you get:

javascript: 401159 (76.74%)
python: 87231 (16.69%)
xml: 22978 (4.40%)
php: 4784 (0.92%)
ansic: 3661 (0.70%)
perl: 2726 (0.52%)
sh: 199 (0.04%)

It’s quite interesting to see that sloccount does not consider HTML which is quite relevant when you’re testing a web engine so again, we have to count them manually (thanks to Carlos López who helped me to properly grep here as some binary lines were giving me a headache to get the numbers):

find LayoutTests/ -name ".html" -print0 | xargs -0 cat | strings | grep -Pv "^[[:space:]]$" | wc -l
2205690

You can see 2205690 of “meaningful lines” that combine HTML + other languages that you can see above. I can’t substract here to just get the HTML lines because the number above take into account files with a different extension than HTML, though many of them do include other languages, specially JavaScript.

But the LayoutTests do not include only pure WebKit tests. There are some imported ones so it might be interesting to run the same procedure under LayoutTests/imported to see which ones are imported and not written directly into the WebKit project. I emphasize that because they can be written by WebKit developers in other repositories and actually I can present myself and Youenn Fablet as an example as we wrote tests some tests that were finally moved into the specification and included back later when imported. So again, sloccount LayoutTests/imported:

python: 84803 (59.99%)
javascript: 51794 (36.64%)
ansic: 3661 (2.59%)
php: 575 (0.41%)
xml: 250 (0.18%)
sh: 199 (0.14%)
perl: 86 (0.06%)

The same procedure to count HTML + other stuff lines inside that directory gives a number of 295490:

$ find LayoutTests/imported/ -name ".html" -print0 | xargs -0 cat | strings | grep -Pv "^[[:space:]]$" | wc -l
295490

There are also some other tests that we can talk about, for example the JSTests. I’ll mention already the numbers summed up regarding languages and the manual HTML code (if you made it here, you know the drill already):

javascript: 1713200 (98.64%)
xml: 20665 (1.19%)
perl: 2449 (0.14%)
python: 421 (0.02%)
ruby: 56 (0.00%)
sh: 38 (0.00%)
HTML+stuff: 997

ManualTests:

javascript: 297 (41.02%)
ansic: 187 (25.83%)
java: 118 (16.30%)
xml: 103 (14.23%)
php: 10 (1.38%)
perl: 9 (1.24%)
HTML+stuff: 16026

PerformanceTests:

javascript: 950916 (83.12%)
cpp: 147194 (12.87%)
ansic: 38540 (3.37%)
asm: 5466 (0.48%)
sh: 872 (0.08%)
ruby: 419 (0.04%)
perl: 348 (0.03%)
python: 325 (0.03%)
xml: 5 (0.00%)
HTML+stuff: 238002

TestsWebKitAPI:

cpp: 44753 (99.45%)
ansic: 163 (0.36%)
objc: 76 (0.17%)
xml: 7 (0.02%)
javascript: 1 (0.00%)
HTML+stuff: 3887

And this is all. Remember that these are just some rough statistics, not a “scientific” paper.

Update:

In her expert opinion, in the WebKit project we are devoting around 50% of the total LOC to testing, which makes it a software engineering “textbook” project regarding testing and I think we can be proud of it!

Web Engines Hackfest 2016

From September 26th to 28th we celebrated at the Igalia HQ the 2016 edition of the Web Engines Hackfest. This year we broke all records and got participants from the three main companies behind the three biggest open source web engines, say Mozilla, Google and Apple. Or course, it was not only them, we had some other companies and ourselves. I was active part of the organization and I think we not only did not get any complain but people were comfortable and happy around.

We had several talks (I included the slides and YouTube links):

We had lots and lots of interesting hacking and we also had several breakout sessions:

  • WebKitGTK+ / Epiphany
  • Servo
  • WPE / WebKit for Wayland
  • Layout Models (Grid, Flexbox)
  • WebRTC
  • JavaScript Engines
  • MathML
  • Graphics in WebKit

What I did during the hackfest was working with Enrique and Žan to advance on reviewing our downstream implementation of our GStreamer based of Media Source Extensions (MSE) in order to land it as soon as possible and I can proudly say that we did already (we didn’t finish at the hackfest but managed to do it after it). We broke the bots and pissed off Michael and Carlos but we managed to deactivate it by default and continue working on it upstream.

So summing up, from my point of view and it is not only because I was part of the organization at Igalia, based also in other people’s opinions, I think the hackfest was a success and I think we will continue as we were or maybe growing a bit (no spoilers!).

Finally I would like to thank our gold sponsors Collabora and Igalia and our silver sponsor Mozilla.

WebRTC in WebKit/WPE

For some time I worked at Igalia to enable WebRTC on WebKitForWayland or WPE for the Raspberry Pi 2.

The goal was to have the WebKit WebRTC tests working for a demo. My fellow Igalian Alex was working on the platform itself in WebKit and assisting with some tuning for the Pi on WebKit but the main work needed to be done in OpenWebRTC.

My other fellow Igalian Phil had begun a branch to work on this that was half way with some workarounds. My first task was getting into combat/workaround mode and make OpenWebRTC work with compressed streams from gst-rpicamsrc. OpenWebRTC supported only raw video streams and that Raspberry Pi Cam module GStreamer element provides only H264 encoded ones. I moved some encoders and parsers around, some caps modifications, removed some elements that didn’t work on the Pi and made it work eventually. You can see the result at:

<

video style=”max-width: 100%;” src=”/xrcalvar/files/2016/10/201607-webrtc.mp4″ controls poster=”/xrcalvar/files/2016/10/webrtc-poster.png”>

To make this work by yourselves you needed a custom branch of Buildroot where you could build with the proper plugins enabled also selected the appropriate branches in WPE and OpenWebRTC.

Unfortunately the work was far from being finished so I continued the effort to try to make the arch changes in OpenWebRTC have production quality and that meant do some tasks step by step:

  • Rework the video orientation code: The workaround deactivated it as so far it was being done in GStreamer. In the case of rpicamsrc that can be done by the hardware itself so I cooked a GStreamer interface to enable rotation the same way it was done for the [gl]videoflip elements. The idea would be deprecate the original ones and use the new interface. These landed both in videoflip and glvideoflip. Of course I also implemented it on gst-rpicamsrc here and here and eventually in OpenWebRTC sources.
  • Rework video flip: Once OpenWebRTC sources got orientation support, I could rework the flip both for local and remote feeds.
  • Add gl{down|up}load elements back: There were some issues with the gl elements to upload and download textures, which we had removed. I readded them again.
  • Reworked bins linking: In OpenWebRTC there are some bins that are created to perform some tasks and depending on different circumstances you add or not some elements. I reworked the way those elements are linked so that we don’t have to take into account all the use cases to link them. Now this is easier as the elements are linked as they are the added to the bin.
  • Reworked the renderer_disabled: As in the case for orientation, some elements such as gst-rpicamsrc are able to change color and balance so I added support for that to avoid having that done by GStreamer elements if not necessary. In this case the proper interfaces were already there in GStreamer.
  • Moved the decoding/parsing from the source to the renderer: Before our changes the source was parsing/decoding the remote feeds, local sources were not decoded, just raw was supported. Our workarounds made the local sources decode too but this was not working for all cases. So why decoding at the sources when GStreamer has caps and you can just chain all that to the renderers? So I did eventually, I moved the parsing/decoding to the renderers. This took fixing all the caps negotiation from sources to renderers. Unfortunatelly I think I broke audio on the way, but surely nothing difficult to fix.

This is still a work in progress and now I am changing tasks and handing this over back to my fellow Igalians Phil, who I am sure will do an awesome job together with Alex.

And again, thanks to Igalia for letting me work on this and to Metrological that is sponsoring this work.

Über latest Media Source Extensions improvements in WebKit with GStreamer

In this post I am going to talk about the implementation of the Media Source Extensions (known as MSE) in the WebKit ports that use GStreamer. These ports are WebKitGTK+, WebKitEFL and WebKitForWayland, though only the latter has the latest work-in-progress implementation. Of course we hope to upstream WebKitForWayland soon and with it, this backend for MSE and the one for EME.

My colleague Enrique at Igalia wrote a post about this about a week ago. I recommend you read it before continuing with mine to understand the general picture and the some of the issues that I managed to fix on that implementation. Come on, go and read it, I’ll wait.

One of the challenges here is something a bit unnatural in the GStreamer world. We have to process the stream information and then make some metadata available to the JavaScript app before playing instead of just pushing everything to a playing pipeline and being happy. For this we created the AppendPipeline, which processes the data and extracts that information and keeps it under control for the playback later.

The idea of the our AppendPipeline is to put a data stream into it and get it processed at the other side. It has an appsrc, a demuxer (qtdemux currently) and an appsink to pick up the processed data. Something tricky of the spec is that when you append data into the SourceBuffer, that operation has to block it and prevent with errors any other append operation while the current is ongoing, and when it finishes, signal it. Our main issue with this is that the the appends can contain any amount of data from headers and buffers to only headers or just partial headers. Basically, the information can be partial.

First I’ll present again Enrique’s AppendPipeline internal state diagram:

First let me explain the easiest case, which is headers and buffers being appended. As soon as the process is triggered, we move from Not started to Ongoing, then as the headers are processed we get the pads at the demuxer and begin to receive buffers, which makes us move to Sampling. Then we have to detect that the operation has ended and move to Last sample and then again to Not started. If we have received only headers we will not move to Sampling cause we will not receive any buffers but we still have to detect this situation and be able to move to Data starve and then again to Not started.

Our first approach was using two different timeouts, one to detect that we should move from Ongoing to Data starve if we did not receive any buffer and another to move from Sampling to Last sample if we stopped receiving buffers. This solution worked but it was a bit racy and we tried to find a less error prone solution.

We tried then to use custom downstream events injected from the source and at the moment they were received at the sink we could move from Sampling to Last sample or if only headers were injected, the pads were created and we could move from Ongoing to Data starve. It took some time and several iterations to fine tune this but we managed to solve almost all cases but one, which was receiving only partial headers and no buffers.

If the demuxer received partial headers and no buffers it stalled and we were not receiving any pads or any event at the output so we could not tell when the append operation had ended. Tim-Philipp gave me the idea of using the need-data signal on the source that would be fired when the demuxer ran out of useful data. I realized then that the events were not needed anymore and that we could handle all with that signal.

The need-signal is fired sometimes when the pipeline is linked and also when the the demuxer finishes processing data, regardless the stream contains partial headers, complete headers or headers and buffers. It works perfectly once we are able to disregard that first signal we receive sometimes. To solve that we just ensure that at least one buffer left the appsrc with a pad probe so if we receive the signal before any buffer was detected at the probe, it shall be disregarded to consider that the append has finished. Otherwise, if we have seen already a buffer at the probe we can consider already than any need-data signal means that the processing has ended and we can tell the JavaScript app that the append process has ended.

Both need-data signal and probe information come in GStreamer internal threads so we could use mutexes to overcome any race conditions. We thought though that deferring the operations to the main thread through the pipeline bus was a better idea that would create less issues with race conditions or deadlocks.

To finish I prefer to give some good news about performance. We use mainly the YouTube conformance tests to ensure our implementation works and I can proudly say that these changes reduced the time of execution in half!

That’s all folks!

Web Engines Hackfest according to me

And once again, in December we celebrated the hackfest. This year happened between Dec 7-9 at the Igalia premises and the scope was much broader than WebKitGTK+, that’s why it was renamed as Web Engines Hackfest. We wanted to gather people working on all open source web engines and we succeeded as we had people working on WebKit, Chromium/Blink and Servo.

The edition before this I was working with Youenn Fablet (from Canon) on the Streams API implementation in WebKit and we spent our time on the same thing again. We have to say that things are much more mature now. During the hackfest we spent our time in fixing the JavaScriptCore built-ins inside WebCore and we advanced on the automatic importation of the specification web platform tests, which are based on our prior test implementation. Since now they are managed there, it does not make sense to maintain them inside WebKit too, we just import them. I must say that our implementation is fairly complete since we support the current version of the spec and have almost all tests passing, including ReadableStream, WritableStream and the built-in strategy classes. What is missing now is making Streams work together with other APIs, such as Media Source Extensions, Fetch or XMLHttpRequest.

There were some talks during the hackfest and we did not want to be less, so we had our own about Streams. You can enjoy it here:

You can see all hackfest talks in this YouTube playlist. The ones I liked most were the ones by Michael Catanzaro about HTTP security, which is always interesting given the current clumsy political movements against cryptography and the one by Dominik Röttsches about font rendering. It is really amazing what a browser has to do just to get some letters painted on the screen (and look good).

As usual, the environment was amazing and we had a great time, including the traditional Street Fighter‘s match, where Gustavo found a worthy challenger in Changseok 🙂

Of course, I would like to thank Collabora and Igalia for sponsoring the event!

And by the way, quite shortly after that, I became a WebKit reviewer!

WebKit Contributors Meeting 2015 (late, I know)

After writing my last post I realized that I needed to write a bit more about what I had been doing at the WebKit Contributors Meeting.

First thing to say is that it happened in March at Apple campus in Cupertino and I atteded as part of the Igalia gang.

My goal when I went there was to discuss with Youenn Fablet about Streams API and we are implementing and see how we could bootstrap the reviews and being to get the code reviewed and landed efficiently. Youenn and I also made a presentation (mainly him) about it. At that moment we got some comments and help from Benjamin Poulain and nowadays we are also working with Darin Adler and Geoffrey Garen so the work is ongoing.

WebRTC was also a hot topic and we talked a bit about how to deal with the promises as they seem to be involved in the WebRTC standard was well. My Igalian partner Philippe was missed in this regard as he is involved in the development of WebRTC in WebKit, but he unfortunately couldn’t make it because of personal reasons.

I also had an interesting talk with Jer Noble and Eric Carlson about Media Source and Encrypted Media Extensions. I told them about the several downstream implementations that we are or were working on, specially the GStreamer based implementation of WPE one and that we expect to begin to upstream soon (update: this is done, yay!). They commented that they still have doubts about the abstractions they made for them and of course I promised to get back to them when we begin with the job. Actually I already discussed some issues with Quique, another fellow Igalian.

Among the other interesting discussions, I found very necessary the migration of Mac port to CMake. Actually, I am experiencing now the painbenefits of using XCode to add files, specially the generated ones to the compilation. I hope that Alex succeeds with the task and soon we have a common build system for all main ports.