Improvements for GStreamer Intel-MSDK plugins

|

Last November I had a chance to dive into Intel Media SDK plugins in gst-plugins-bad. It was very good chance for me to learn how gstreamer elements are handling special memory with its own infrastructures like GstBufferPool and GstMemory. In this post I’m going to talk about the improvements of GStreamer Intel MSDK plugins in release 1.14, which is what I’ve been through since last November.

First of all, for those not familiar with Media SDK I’m going to explain what Media SDK is briefly. Media SDK(Aka. MSDK) is the cross-platform API to access Intel’s hardware accelerated video encoder and decoder functions on Windows and Linux. You can get more information about MSDK here and here.

But on Linux so far, it’s hard to set up environment to make MSDK driver working. If you want to set up development environment for MSDK and see what’s working on linux, you should follow the steps described in this page. But I would recommend you refer to the Victor’s post, which is very-well explained for this a little bit annoying stuff.

Additionally the driver in linux supports only Skylake as far as I know, which is very disappointing for users of other chipsets. I(and you probably) hope we can work on it without any specific patch/driver (since it is open source!) and any dependency of chipset. As far as I know, Intel has been considering this issue to be solved, so we’ll see what’s going to happen in the future.

Back to gstreamer, gstreamer plugins using MSDK landed in 2016. At that time they were working fine with basic features for playback and encoding but there were several things to be improved, especially for performance.

Eventually, in the middle of last March, GStreamer 1.14 has been released including improvements of MSDK plugins, which is what I want to talk in this post.

So let me begin now.

Suuports bufferpool and starts using video memory.

This is a key feature that improves the preformance.

In MSDK, there are two types of memory supported in the driver. One is “System Memory” and another is “Video Memory”. (There is one more type of memory, which is called “Opaque Memory” but won’t talk about it in this post)

System memory is a memory allocated on user space, which is normal. System memory is being used in the plugins already, which is quite simple to use but not recommended since the performance is not good enough.

Video memory is a memory used by hardware acceleration device, also known as GPU, to hold frame and other types of video data.

For applications to use video memory, something specific on its platform should be implemented. On linux for example, we can use VA-API to handle video memory through MSDK. And that’s included to the 1.14 release, which means we still need to implement something specific on Windows like this way.

To implement using video memory, I needed to implement GstMSDK(System/Video)Memory to generalize how to access and map the memory in the way of GStreamer. And GstMSDKBufferPool can allocate this type of memory and can be proposed to upstream and can be used in each MSDK element itself. There were lots of problems and argues during this work since the design of MSDK APIs and GStreamer infrastructure don’t match perfectly.

You can see discussion and patches here in this issue.

In addition, if using video memory on linux, we can use DMABuf by acquiring fd handle via VA-API at allocation time. Recently this has been done only for DMABuf export in this issue though it’s not included in 1.14 release.

Sharing context/session

For resource utilization, there needs to share MSDK session with each MSDK plugin in a pipeline. A MSDK session maintains context for the use of any of decode,encode and convert(VPP) functions. Yes it’s just like a handle of the driver. One session can run exactly one of decode, encode and convert(VPP).

So let’s think about an usual transcoding pipeline. It should be like this, src - decoder - converter - encoder - sink. In this case we should use same MSDK session to decode, convert and encode. Otherwise we should copy the data from upstream to work with different session because sessions cannot share data, which should get much worse.

Also there’s one more thing. MSDK supports joining session. If application wants(or has) to use multiple sessions, it can join sessions to share data and we need to support it in the level of GStreamer as well.

All of these can be achieved by GstContext which provides a way of sharing not only between elements. You can see the patch in this issue, same as MSDK Bufferpool issue.

Adds vp8 and mpeg2 decoder.

Sree has added mpeg2/vc1 decoder and I have added vp8 decoder.

Supports a number of algorithms and tuning options in encoder

Encoders are exposing a number of rate control algorithms now and more encoder tuning options like trellis-quantiztion (h264), slice size control (h264), B-pyramid prediction(h264), MB-level bitrate control, frame partitioning and adaptive I/B frame insertion were added. The encoder now also handles force-key-unit events and can insert frame-packing SEIs for side-by-side and top-bottom stereoscopic 3D video.

All of this has been done by Sree and you can see the details in this issue

Capability of encoder’s sinkpad is improved by using VPP.

MSDK encoders had accepted only NV12 raw data since MSDK encoder supports only NV12 format. But other formats can be handled too if we convert them to NV12 by using VPP, which is also supported in the MSDK driver. This has been done by slomo and I fixed a bug related to it. See this bug for more details.

You can find all of patches for MSDK plugins here.

As I said in the top of this post, all MSDK stuffs should be opened first and should support some of major Intel chipsets at least even if not all. But now, the one thing that I can say is GStreamer MSDK plugins are being improved continuously. We can see what’s happening in the near future.

Finally I want to say that Sree and Victor helped me a lot as a reviewer and an adviser with this work. I really appreciate it.

Thanks for reading!

Comments