Serious Encrypted Media Extensions on GStreamer-based WebKit ports

Encrypted Media Extensions (a.k.a. EME) is the W3C standard for encrypted media on the web. With it, media providers such as Hulu, Netflix, HBO, Disney+, Prime Video, etc. can serve their content with reasonable confidence that it will be very complicated for people to “save” their assets without permission. Why do I use the word “serious” in the title? WebKit already supports Clear Key, which is the W3C EME reference implementation, but EME supports more encryption systems, even proprietary ones (I have my opinion about this, you can ask me privately). No service provider (that I know of) supports Clear Key; they usually rely on Widevine, PlayReady or some other system.

Three years ago, my colleague Žan Doberšek finished the implementation of what was going to be the shell of WebKit’s modern EME implementation, following the latest W3C proposal. We implemented it downstream (at Web Platform for Embedded) as well, using Thunder, which includes as a plugin a fork of what was the Open Content Decryption Module (a.k.a. OpenCDM). The OpenCDM API changed quite a lot during this journey. It works well and there are millions of set-top boxes currently using it.

The delta between downstream and the upstream GStreamer-based WebKit ports was quite big, testing was difficult and syncing was not always easy, so we decided to reverse the situation.

Our first step was taken by my colleague Charlie Turner, who made Clear Key work upstream again while adapting some changes the Apple folks had made in the meantime. It was amazing to see the Clear Key tests passing again, and his work on the CDMProxy-related classes was awesome. After having Clear Key working, I had to adapt those classes a bit to accommodate Thunder.

To explain a bit about the WebKit EME architecture, I must say that there are two layers. The first is the cross-platform one, which implements the W3C API (MediaKeys, MediaKeySession, CDM…). These classes rely on the platform ones (CDMPrivate, CDMInstance, CDMInstanceSession), which form the second layer and handle the platform management, message exchange, etc. Apple’s playback system is fully integrated with their DRM system, so they don’t need anything else. We do, because we need to integrate our own decryptors that defer to Thunder for decryption, so in the GStreamer-based ports we also need the CDMProxy-related classes: CDMProxy, CDMInstanceProxy, CDMInstanceSessionProxy… The last two extend CDMInstance and CDMInstanceSession respectively to be able to deal with the key management, which is abstracted into KeyHandle and KeyStore.
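
To make that layering easier to picture, here is a schematic sketch of the relationships just described. These are not the actual WebKit headers, just an illustration of how the pieces relate:

// Schematic only: the real classes live in WebCore and have many more members.

// Cross-platform layer interfaces (used by MediaKeys, MediaKeySession, CDM…).
class CDMInstance { };
class CDMInstanceSession { };

// GStreamer-port layer: proxies that let our decryptor elements defer to the
// actual DRM system (Clear Key or Thunder) for decryption.
class KeyHandle { }; // one key and its status
class KeyStore { };  // the collection of KeyHandles
class CDMProxy { /* shared key handling used by the decryptors */ };

class CDMInstanceProxy : public CDMInstance {
    KeyStore m_keyStore; // key management abstracted into KeyHandle/KeyStore
    // Owns the CDMProxy that is handed over to the decryptors.
};

class CDMInstanceSessionProxy : public CDMInstanceSession {
    // Session-level message exchange; resulting keys end up in the KeyStore.
};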

Once the abstraction is there (let’s remember that the abstraction works both for Clear Key and Thunder), the Thunder implementation is quite simple: just gluing the CDMProxy, CDMInstanceProxy and CDMInstanceSessionProxy classes to the Thunder system and writing a GStreamer decryptor element for it. I might have made a mistake when selecting the files, but considering the Thunder classes plus the common GStreamer decryptor code, cloc says it is just 1198 lines of platform code. I think that is pretty low for what it does. Apart from that, obviously, there are 5760 lines of cross-platform code.

To build and run all this you need to do several things:

  1. Build the dependencies with WEBKIT_JHBUILD=1 JHBUILD_ENABLE_THUNDER="yes" to enable the old-fashioned JHBuild build and force it to build the Thunder dependencies. All dependencies are in JHBuild; even Widevine is referenced, but you need the proper credentials to download it, as it is closed source.
  2. Pass --thunder when calling build-webkit.
  3. Run MiniBrowser with WEBKIT_GST_EME_RANK_PRIORITY="Thunder" and pass the parameters --enable-mediasource=TRUE --enable-encrypted-media=TRUE --autoplay-policy=allow. The autoplay policy is usually optional, but in this case it is necessary for the YouTube TV tests. We need to give the Thunder decryptor a higher rank because of WebM, which does not specify a key system; without the higher rank, the Clear Key decryptor can be selected and fail. MP4 does not create trouble because the protection system is specified and the caps negotiation does its magic.

If you have a closer look at the GStreamer JHBuild moduleset, you’ll see that only Widevine is supported. To support more key systems, you only have to make them build in the Thunder ecosystem and add them to CDMFactoryThunder::supportedKeySystems.
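
For instance, hypothetically adding PlayReady once it builds in Thunder would mean returning its key system identifier too. An illustrative sketch only, not the actual upstream code (which uses WebKit’s own string types and queries Thunder for what is really installed):

#include <string>
#include <vector>

class CDMFactoryThunder {
public:
    std::vector<std::string> supportedKeySystems() const;
};

std::vector<std::string> CDMFactoryThunder::supportedKeySystems() const
{
    return {
        "com.widevine.alpha",      // Widevine: the one the moduleset builds today
        "com.microsoft.playready", // hypothetical addition once it builds in Thunder
    };
}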

When I coded this, all the YouTube TV tests for Widevine were green on the desktop. At the time of writing this post they aren’t, because of some problem with the Widevine installation that I hope will be sorted out quickly.

VCR to WebM with GStreamer and hardware encoding

Many years ago my family bought a Panasonic VHS video camera and we recorded quite a lot of things: holidays, some local shows, etc. I even got paid 5000 pesetas (€30, more than 20 years ago) a couple of times to record weddings in an amateur way. Since my father passed away less than a year ago, I have wanted to convert those VHS tapes into something that can survive better, technologically speaking.

For the job I bought a USB 2.0 dongle and connected it to a VHS VCR through a SCART to RCA cable.

The dongle creates a V4L2 device for video and is detected by PulseAudio for audio. As I want to see what I am converting live, I need to tee both audio and video to the corresponding sinks, while the other branch goes to the encoders, the muxer and the filesink. The command line for that would be:

gst-launch-1.0 matroskamux name=mux ! filesink location=/tmp/test.webm \
v4l2src device=/dev/video2 norm=255 io-mode=mmap ! queue ! vaapipostproc ! tee name=video_t ! \
queue ! vaapivp9enc rate-control=4 bitrate=1536 ! mux.video_0 \
video_t. ! queue ! xvimagesink \
pulsesrc device=alsa_input.usb-MACROSIL_AV_TO_USB2.0-02.analog-stereo ! 'audio/x-raw,rate=48000,channels=2' ! tee name=audio_t ! \
queue ! pulsesink \
audio_t. ! queue ! vorbisenc ! mux.audio_0

As you can see, I convert to WebM with VP9 and Vorbis. Two interesting details: passing norm=255 to the v4l2src element so that it captures PAL, and rate-control=4 (VBR) to the vaapivp9enc element; otherwise the encoder defaults to CQP and the file size ends up being huge.

You can see the pipeline, which is beautiful, here:

As you can see, we’re using vaapivp9enc here, which is hardware accelerated. Having this pipeline running on my computer consumed more or less 20% of CPU, with the CPU otherwise relaxed, leaving me the necessary computing power for my daily work. This would not be possible without GStreamer and the GStreamer VAAPI plugins, which is not the case with other solutions whose instructions you can find online.

If for some reason you can’t find vaapivp9enc in Debian, you should know there are a couple of packages for the Intel drivers, and the one you should install is intel-media-va-driver. Thanks go to my colleague at Igalia Víctor Jáquez, who maintains gstreamer-vaapi and helped me solve this problem.

My workflow was converting all the tapes into WebM and then cutting them into the relevant pieces with PiTiVi, which runs GStreamer Editing Services underneath, both co-maintained by my colleague at Igalia Thibault Saunier.

Web Engines Hackfest 2016

From September 26th to 28th we celebrated the 2016 edition of the Web Engines Hackfest at the Igalia HQ. This year we broke all records and got participants from the three main companies behind the three biggest open source web engines, namely Mozilla, Google and Apple. Of course, it was not only them; we had some other companies and ourselves. I was an active part of the organization and I think that not only did we not get any complaints, but people were comfortable and happy.

We had several talks (I included the slides and YouTube links):

We had lots and lots of interesting hacking and we also had several breakout sessions:

  • WebKitGTK+ / Epiphany
  • Servo
  • WPE / WebKit for Wayland
  • Layout Models (Grid, Flexbox)
  • WebRTC
  • JavaScript Engines
  • MathML
  • Graphics in WebKit

What I did during the hackfest was work with Enrique and Žan on reviewing our downstream GStreamer-based implementation of Media Source Extensions (MSE) in order to land it as soon as possible, and I can proudly say that we already did (we didn’t finish at the hackfest but managed to do it afterwards). We broke the bots and pissed off Michael and Carlos, but we managed to deactivate it by default and continue working on it upstream.

So, summing up, from my point of view (and not only because I was part of the organization at Igalia, but also based on other people’s opinions), the hackfest was a success, and I think we will continue as we are, or maybe grow a bit (no spoilers!).

Finally I would like to thank our gold sponsors Collabora and Igalia and our silver sponsor Mozilla.

WebRTC in WebKit/WPE

For some time I worked at Igalia on enabling WebRTC on WebKitForWayland (a.k.a. WPE) for the Raspberry Pi 2.

The goal was to have the WebKit WebRTC tests working for a demo. My fellow Igalian Alex was working on the platform itself in WebKit and assisting with some tuning for the Pi, but the main work needed to be done in OpenWebRTC.

My other fellow Igalian Phil had begun a branch for this work that was halfway there, with some workarounds. My first task was getting into combat/workaround mode and making OpenWebRTC work with compressed streams from gst-rpicamsrc. OpenWebRTC supported only raw video streams, and the Raspberry Pi Cam module GStreamer element provides only H.264-encoded ones. I moved some encoders and parsers around, made some caps modifications, removed some elements that didn’t work on the Pi, and eventually made it work. You can see the result at:

[Video: /xrcalvar/files/2016/10/201607-webrtc.mp4]

To make this work by yourselves, you needed a custom branch of Buildroot where you could build with the proper plugins enabled, and to select the appropriate branches in WPE and OpenWebRTC.

Unfortunately the work was far from finished, so I continued the effort to bring the architecture changes in OpenWebRTC up to production quality, which meant doing some tasks step by step:

  • Rework the video orientation code: The workaround had deactivated it, as so far it was being done in GStreamer. In the case of rpicamsrc it can be done by the hardware itself, so I cooked up a GStreamer interface to enable rotation the same way it was done by the [gl]videoflip elements; the idea is to deprecate the original properties and use the new interface. It landed in both videoflip and glvideoflip. Of course I also implemented it in gst-rpicamsrc here and here, and eventually in the OpenWebRTC sources (see the sketch after this list).
  • Rework video flip: Once OpenWebRTC sources got orientation support, I could rework the flip both for local and remote feeds.
  • Add gl{down|up}load elements back: There were some issues with the GL elements that upload and download textures, which we had removed. I added them back.
  • Reworked bins linking: In OpenWebRTC some bins are created to perform certain tasks, and depending on the circumstances you add some elements or not. I reworked the way those elements are linked so that we don’t have to take into account all the use cases to link them. Now this is easier, as the elements are linked as they are added to the bin.
  • Reworked the renderer_disabled: As in the case of orientation, some elements such as gst-rpicamsrc are able to change color and balance themselves, so I added support for that to avoid having it done by GStreamer elements when not necessary. In this case the proper interfaces were already there in GStreamer.
  • Moved the decoding/parsing from the source to the renderer: Before our changes, the source was parsing/decoding the remote feeds; local sources were not decoded, as only raw was supported. Our workarounds made the local sources decode too, but this did not work for all cases. So why decode at the sources when GStreamer has caps and you can just chain all that to the renderers? That is what I eventually did: I moved the parsing/decoding to the renderers, which meant fixing all the caps negotiation from the sources to the renderers. Unfortunately I think I broke audio on the way, but surely nothing difficult to fix.
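
To illustrate the first item: any element implementing the new orientation interface exposes a common video-direction property, so an application can ask for rotation the same way whether it happens in software (videoflip) or in the camera hardware (rpicamsrc). A minimal sketch, assuming a GStreamer version that ships the interface:

#include <gst/gst.h>

int main(int argc, char** argv)
{
    gst_init(&argc, &argv);

    // videoflip implements the orientation interface; on a Pi you could swap
    // videotestsrc + videoflip for rpicamsrc and set the same property there.
    GstElement* pipeline = gst_parse_launch(
        "videotestsrc ! videoflip name=flip ! autovideosink", nullptr);
    GstElement* flip = gst_bin_get_by_name(GST_BIN(pipeline), "flip");

    // Rotate 90 degrees clockwise; "auto" would honor stream tags instead.
    gst_util_set_object_arg(G_OBJECT(flip), "video-direction", "90r");
    gst_object_unref(flip);

    gst_element_set_state(pipeline, GST_STATE_PLAYING);
    g_main_loop_run(g_main_loop_new(nullptr, FALSE));
    return 0;
}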

This is still a work in progress. I am now changing tasks and handing this back to my fellow Igalian Phil, who I am sure will do an awesome job together with Alex.

And again, thanks to Igalia for letting me work on this and to Metrological, which is sponsoring this work.

Über latest Media Source Extensions improvements in WebKit with GStreamer

In this post I am going to talk about the implementation of Media Source Extensions (known as MSE) in the WebKit ports that use GStreamer. These ports are WebKitGTK+, WebKitEFL and WebKitForWayland, though only the last one has the latest work-in-progress implementation. Of course, we hope to upstream WebKitForWayland soon and, with it, this backend for MSE and the one for EME.

My colleague Enrique at Igalia wrote a post about this about a week ago. I recommend you read it before continuing with mine, to understand the general picture and some of the issues that I managed to fix in that implementation. Come on, go and read it, I’ll wait.

One of the challenges here is something a bit unnatural in the GStreamer world. We have to process the stream information and then make some metadata available to the JavaScript app before playing, instead of just pushing everything into a playing pipeline and being happy. For this we created the AppendPipeline, which processes the data, extracts that information and keeps it under control for later playback.

The idea of our AppendPipeline is that you put a data stream into it and get it processed at the other side. It has an appsrc, a demuxer (qtdemux currently) and an appsink to pick up the processed data. A tricky part of the spec is that when you append data to the SourceBuffer, that operation has to block it, rejecting with errors any other append operation while the current one is ongoing, and signal when it finishes. Our main issue with this is that the appends can contain any amount of data: from headers and buffers, to only headers, to just partial headers. Basically, the information can be incomplete.
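
To make the shape of the AppendPipeline concrete, here is a heavily simplified sketch; the names are made up for illustration and the real implementation does much more (per-stream pad handling, state tracking, error paths…):

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

struct AppendPipeline {
    GstElement* pipeline;
    GstElement* appsrc;
    GstElement* demuxer;
    GstElement* appsink;
};

static void onDemuxerPadAdded(GstElement*, GstPad* pad, gpointer userData)
{
    // A pad appearing means the headers were parsed: link it to the appsink
    // so the processed samples can be picked up at the other side.
    auto* p = static_cast<AppendPipeline*>(userData);
    GstPad* sinkPad = gst_element_get_static_pad(p->appsink, "sink");
    gst_pad_link(pad, sinkPad);
    gst_object_unref(sinkPad);
}

static void appendPipelineInit(AppendPipeline* p)
{
    p->pipeline = gst_pipeline_new("append-pipeline");
    p->appsrc = gst_element_factory_make("appsrc", nullptr);
    p->demuxer = gst_element_factory_make("qtdemux", nullptr);
    p->appsink = gst_element_factory_make("appsink", nullptr);

    gst_bin_add_many(GST_BIN(p->pipeline), p->appsrc, p->demuxer, p->appsink, nullptr);
    gst_element_link(p->appsrc, p->demuxer);
    // qtdemux pads only exist after the headers are processed.
    g_signal_connect(p->demuxer, "pad-added", G_CALLBACK(onDemuxerPadAdded), p);
}

// Each SourceBuffer append ends up pushing its bytes into the appsrc.
static void appendPipelinePushData(AppendPipeline* p, const guint8* data, gsize size)
{
    GstBuffer* buffer = gst_buffer_new_allocate(nullptr, size, nullptr);
    gst_buffer_fill(buffer, 0, data, size);
    gst_app_src_push_buffer(GST_APP_SRC(p->appsrc), buffer);
}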

First I’ll present again Enrique’s AppendPipeline internal state diagram:

Let me explain the easiest case first, which is headers and buffers being appended. As soon as the process is triggered, we move from Not started to Ongoing; then, as the headers are processed, we get the pads at the demuxer and begin to receive buffers, which makes us move to Sampling. Then we have to detect that the operation has ended and move to Last sample, and then back to Not started. If we have received only headers we will not move to Sampling, because we will not receive any buffers, but we still have to detect this situation and be able to move to Data starve and then back to Not started. The two paths are sketched below.
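
Expressed as a sketch, with the state names taken from the diagram:

// Sketch only: in the real code the transitions are driven by GStreamer callbacks.
enum class AppendState {
    NotStarted, // idle, ready for a new append
    Ongoing,    // append triggered, headers being processed
    Sampling,   // demuxer pads exist and buffers are flowing
    LastSample, // end of the operation detected after receiving buffers
    DataStarve  // end detected with headers only, no buffers
};

// Headers and buffers: NotStarted -> Ongoing -> Sampling -> LastSample -> NotStarted
// Headers only:        NotStarted -> Ongoing -> DataStarve -> NotStarted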

Our first approach used two different timeouts: one to detect that we should move from Ongoing to Data starve if we did not receive any buffer, and another to move from Sampling to Last sample if we stopped receiving buffers. This solution worked, but it was a bit racy and we tried to find a less error-prone one.

We then tried using custom downstream events injected from the source: when they were received at the sink, we could move from Sampling to Last sample, or, if only headers had been injected, the pads were created and we could move from Ongoing to Data starve. It took some time and several iterations to fine-tune this, and we managed to solve almost all cases but one: receiving only partial headers and no buffers.

If the demuxer received partial headers and no buffers, it stalled and we were not receiving any pads or any event at the output, so we could not tell when the append operation had ended. Tim-Philipp gave me the idea of using the need-data signal on the source, which would be fired when the demuxer ran out of useful data. I then realized that the events were not needed anymore and that we could handle everything with that signal.

The need-data signal is sometimes fired when the pipeline is linked, and also when the demuxer finishes processing data, regardless of whether the stream contains partial headers, complete headers, or headers and buffers. It works perfectly once we are able to disregard that first signal we sometimes receive. To solve that, we use a pad probe to make sure at least one buffer has left the appsrc: if we receive the signal before any buffer has been detected at the probe, we disregard it; otherwise, once we have seen a buffer at the probe, any need-data signal means that the processing has ended and we can tell the JavaScript app that the append process has finished. A sketch of that logic follows.
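
In code, the detection looks roughly like this (a simplified sketch under the assumptions above, not the actual WebKit sources; notifyAppendComplete is a hypothetical hook):

#include <gst/gst.h>

static gint s_sawBuffer = FALSE; // set once the probe sees a buffer

// Hypothetical hook into the MSE machinery that resolves the append.
static void notifyAppendComplete()
{
    g_print("append finished\n");
}

static GstPadProbeReturn onAppsrcSrcBuffer(GstPad*, GstPadProbeInfo*, gpointer)
{
    g_atomic_int_set(&s_sawBuffer, TRUE);
    return GST_PAD_PROBE_OK;
}

static void onNeedData(GstElement*, guint, gpointer)
{
    // A need-data fired before any buffer passed the probe is the spurious
    // one that sometimes arrives around pipeline linking: ignore it.
    if (!g_atomic_int_get(&s_sawBuffer))
        return;

    // Otherwise the demuxer ran out of useful data: the append has ended,
    // whether the stream carried partial headers, headers or buffers.
    notifyAppendComplete();
}

static void installAppendEndDetection(GstElement* appsrc)
{
    GstPad* srcPad = gst_element_get_static_pad(appsrc, "src");
    gst_pad_add_probe(srcPad, GST_PAD_PROBE_TYPE_BUFFER, onAppsrcSrcBuffer,
        nullptr, nullptr);
    gst_object_unref(srcPad);
    g_signal_connect(appsrc, "need-data", G_CALLBACK(onNeedData), nullptr);
}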

Both the need-data signal and the probe information arrive on internal GStreamer threads, so we could have used mutexes to overcome any race conditions. We thought, though, that deferring the operations to the main thread through the pipeline bus was a better idea that would create fewer issues with race conditions or deadlocks.
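
One way to do that deferral, sketched with the same caveats: post an application message on the bus from the streaming thread and resolve the append from the bus watch, which runs in the main thread’s GLib main loop:

#include <gst/gst.h>

// Called from GStreamer streaming threads (need-data handler, pad probe...).
static void postAppendCompleteMessage(GstElement* pipeline)
{
    GstStructure* structure = gst_structure_new_empty("append-complete");
    gst_element_post_message(pipeline,
        gst_message_new_application(GST_OBJECT(pipeline), structure));
}

// Installed with gst_bus_add_watch(); runs in the main thread.
static gboolean onBusMessage(GstBus*, GstMessage* message, gpointer)
{
    if (GST_MESSAGE_TYPE(message) == GST_MESSAGE_APPLICATION
        && gst_message_has_name(message, "append-complete")) {
        // Main-thread context: safe to update state and tell the JavaScript
        // app that the append has finished, with no extra locking needed.
    }
    return TRUE; // keep the watch installed
}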

To finish, I prefer to give some good news about performance. We mainly use the YouTube conformance tests to ensure our implementation works, and I can proudly say that these changes cut the execution time in half!

That’s all folks!

WebKit Contributors Meeting 2015 (late, I know)

After writing my last post I realized that I needed to write a bit more about what I had been doing at the WebKit Contributors Meeting.

The first thing to say is that it happened in March at the Apple campus in Cupertino and I attended as part of the Igalia gang.

My goal when I went there was to discuss with Youenn Fablet the Streams API we are implementing, and to see how we could bootstrap the reviews and begin to get the code reviewed and landed efficiently. Youenn and I also made a presentation (mainly him) about it. At that moment we got some comments and help from Benjamin Poulain, and nowadays we are also working with Darin Adler and Geoffrey Garen, so the work is ongoing.

WebRTC was also a hot topic and we talked a bit about how to deal with promises, as they seem to be involved in the WebRTC standard as well. My fellow Igalian Philippe was missed in this regard, as he is involved in the development of WebRTC in WebKit, but unfortunately he couldn’t make it for personal reasons.

I also had an interesting talk with Jer Noble and Eric Carlson about Media Source and Encrypted Media Extensions. I told them about the several downstream implementations that we are or were working on, especially the GStreamer-based one for WPE, which we expect to begin to upstream soon (update: this is done, yay!). They commented that they still have doubts about the abstractions they made for them, and of course I promised to get back to them when we begin the job. Actually, I have already discussed some issues with Quique, another fellow Igalian.

Among the other interesting discussions, I found the migration of the Mac port to CMake very necessary. Actually, I am now experiencing the pain (ahem, benefits) of using Xcode to add files, especially the generated ones, to the compilation. I hope that Alex succeeds with the task and that soon we will have a common build system for all the main ports.

New media controls in WebKitGTK+ (reloaded)

In December we organized, as usual, the WebKitGTK+ hackfest at the Igalia premises in A Coruña, and also as usual it was an awesome opportunity to meet the rest of the team. For more information about the progress made at the hackfest, you can have a look at KaL’s post.

As part of the hackfest I decided to take on a task that would require some time so that I could focus, and I decided to go for rewriting the WebKitGTK+ multimedia controls once again. People who just read this post will wonder why I say “again”; the reason is that last year we completely redesigned the multimedia controls, which use GStreamer for playback underneath. This time I have not redesigned them (well, only a bit) but rewritten them in JavaScript, as the Apple folks had done before.

To get the job done, the first step was bundling the JavaScript code and activating the code path to use those controls. I used the Apple controls as a template, so you can imagine that the first result was a non-working monster that at some point resembled the Safari multimedia controls. At that point I could do two things: fork or inherit. I decided to go with inheritance because it keeps the spirit of WebKit (and almost all Free Software projects) of sharing as much code as possible, and because forking later is easier than merging. Then, step by step, I kept redefining JavaScript methods and tweaking some stuff in the C++ and CSS code to recreate the user experience we had so far.

Some of the non-aesthetic changes are the following:

  • Focus rings are now managed from CSS instead of C++.
  • Tests got new fixes, rebaselines and more love.
  • CMake support for the new controls.
  • Load captions icon from theme.
  • Loading and hiding elements is now handled with CSS (and JavaScript).

The captions icon problem was interesting, because I found out that the one we were using was “user-invisible-symbolic”, hardcoded directly in the CSS code. I changed it to be loaded from the theme, but that raised the issue of it using the incorrect metaphor, though the current icon looks nice for captions. I filed a GNOME bug (and another WebKit bug to follow this up) so that a new icon can be created for captions/subtitles with the correct metaphor.

And what are the aesthetic changes to the controls?

  • Show a very subtle gradient when the elements are focused or active to improve the accessibility support (which won’t be complete until bug 117857 is fixed).
  • Volume slider rolls up and down with a nice animation.
  • Some other elements are not shown when they are not needed.
  • Captions menu shows up with both click and mouse hover for coherence with the volume slider.
  • Captions menu is also animated the same way as the volume slider.
  • Captions menu was properly centered.
  • Captions menu style was changed to make it more similar to the rest of the controls (fonts, margins…).
  • Volume slider shows below the media element when the element is too close to the top of the page and the slider cannot be shown above it. This was a regression I introduced with the first rewrite; happy to have it fixed now.

As I already said, the aesthetic differences from the former C++ controls are not a big deal unless you compare them with the original controls:

Starting point

To appreciate the new controls I cannot just show a screenshot, because the nicest things are the animations; therefore a video is needed (and if you have WebKit compiled you can experience them yourself):

Of course, I thank our hackfest sponsors, as it was possible because of them:

Igalia and the GNOME Foundation

GUADEC 2013 is over

I departed from A Coruña, flying to Barcelona (where I had time to have a coffee with a friend), then to Vienna and then to Brno by bus. It was a bit tiring so after having a quick drink at our hotel with the rest of the Igalians, I just went to bed.

Web, the future is now, by my Igalian friend Claudio Saavedra (different from Garnacho), was my first talk (apart from the keynote). Even though I already knew the content, I found quite interesting the way Claudio spoke about the latest changes in the development of Web (a.k.a. Epiphany), especially the way WebKit 2 helps to provide a much better user experience.

I liked Alex Larsson’s talk on high-resolution display support in GNOME a lot. His approach of abstract pixels, and how it works all the way down the stack from GTK+, was very interesting.

Matt Dalio and his Endless Mobile project just rock, not only because of the tech and GNOME involved but also because of its social implications. Keep going!

Given the rest of my career and my work in WebKit, especially in the WebKitGTK+ port, I am always interested in GStreamer, as it is the framework we use for multimedia playback (as do other ports, like EFL and Qt on Linux). The What’s cooking in GStreamer talk by Sebastian Dröge and Tim-Philipp Müller was of course mandatory and worth attending.

Jan-Christoph Borchardt’s talk about GNOME and ownCloud: desktop plus web for a holistic experience showed everything that is in, and coming to, ownCloud. I introduced him to Claudio, as he mentioned that he would like to see at least Ephy’s bookmarks synchronized with ownCloud. It looks interesting.

I missed Philip Withnall’s talk about Testing online services because it was at the same time as the ownCloud one, but I was interested because of my wife’s research project on the same subject. Philip, if you read this, leave me a comment or get in touch with her directly, as she was very interested.

It is always refreshing to attend Marina Zhurakhinskaya’s talk about the Outreach Program for Women: a lesson in collaboration, because I think it is very important to keep reminding ourselves of the relevance of programs to integrate more women into our community. I think we, and especially Marina and the OPW program, always get awesome results, though we cannot fall asleep: we have to keep pushing until we reach, at least, 50%.

My Igalian friend Juan Suárez gave a talk called Writing multimedia applications with Grilo, where he showed the easy but powerful possibilities of the Grilo API and its plugins. There was also a lightning talk by an intern about adding support for building Lua plugins for Grilo.

Juan Pablo Ugarte was showing off his Glade skills in his talk about Rich custom user interfaces with Glade and CSS. His slides were made with Glade. Cool!

More secure with less “security” by Stef Walter had a lot of interesting ideas about how to improve security while making the user’s life easier at the same time.

Just after lunch it was the turn of my Igalian friend Martin Robinson, who gave the talk about WebKit2 and you and explained many things about the WebKit 2 model, for example all its layers and how the processes work.

There was another talk given by fellow Igalians Alejandro Piñeiro and Joanmarie Diggs, called Tag, your PDF is it, that I couldn’t attend. In that case I preferred to prioritize other talks, as I can always easily talk to them.

The rest of the talks I attended were:

Of course it would not be a GUADEC without hanging out with friends at the different parties, one of them sponsored by Igalia and Red Hat.

I really enjoyed the tour of the city, which is very beautiful and peaceful, though we had to finish it prematurely because of an impressive storm. On our way back from our unofficial GNOME Hispano dinner (which of course was great) to our hotel, we could see some broken tree branches and even what we think were some fallen young trees.

The venue was really a nice place (with funny chairs) and the volunteers did an awesome job (THANKS!). Of course, there are always issues, like the AC problem (these things can always happen) and the internet connection, which is a problem inherent to almost every GUADEC.

Wonderful work also by Ana, who took a lot of cool pictures, even of me, which you can see in her Flickr collections.

Of course, I cannot forget to thank Igalia for sponsoring my trip.

Strasbourg, there we go!