In 2012 I started to work on a video renderer for GStreamer which uses directly the DRM/KMS kernel subsystem to display images. I even blogged about it, but I didn’t finished it.
Nonetheless, in December last year, a customer asked us to finish the element, but this time focusing on i.MX6, rather than OMAP4. Consequently, early this year, kmssink got merged into upstream.
The video sink has some nice features, such as DMA-buf importing through the PRIME kernel interface. This feature makes possible a zero-copy path when the video decoder delivers dmabuf based frames. For this particular project, the kernel we used supports the CODA media subsystem, and through the GStreamer element v4l2videodec we could link pipelines negotiating dmabuf sharing, and thus we played videos very efficiently.
During the last GStreamer Hackfest, Nicolas Dufresne got working the MFC media subsystem, for Exynos, with v4l2videodec and kmssink, but I don’t remember if he also got the dmabuf sharing working.
All in all, kmssink had only been tested in a couple of ARM devices, so I wonder, as a gstremer-vaapi co-maintainer, if it also works in x86 devices.
Some colleagues at Igalia, use a Minnowboard to test WebKit4Wayland, so I borrowed it, installed a minimal Fedora 23 in it, and, surprisingly, I found out that it has a nice VA-API support:
$ vainfo error: can't connect to X server! libva info: VA-API version 0.38.1 libva info: va_getDriverName() returns 0 libva info: Trying to open /usr/lib64/dri/i965_drv_video.so libva info: Found init function __vaDriverInit_0_38 libva info: va_openDriver() returns 0 vainfo: VA-API version: 0.38 (libva 1.6.2) vainfo: Driver version: Intel i965 driver for Intel(R) Bay Trail - 1.6.2 vainfo: Supported profile and entrypoints VAProfileMPEG2Simple : VAEntrypointVLD VAProfileMPEG2Simple : VAEntrypointEncSlice VAProfileMPEG2Main : VAEntrypointVLD VAProfileMPEG2Main : VAEntrypointEncSlice VAProfileH264ConstrainedBaseline: VAEntrypointVLD VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice VAProfileH264Main : VAEntrypointVLD VAProfileH264Main : VAEntrypointEncSlice VAProfileH264High : VAEntrypointVLD VAProfileH264High : VAEntrypointEncSlice VAProfileH264StereoHigh : VAEntrypointVLD VAProfileVC1Simple : VAEntrypointVLD VAProfileVC1Main : VAEntrypointVLD VAProfileVC1Advanced : VAEntrypointVLD VAProfileNone : VAEntrypointVideoProc VAProfileJPEGBaseline : VAEntrypointVLD
Afterwards, I compiled GStreamer using gst-uninstalled in order to have kmssink and the master branch of gstreamer-vaapi. And got it! I had hardware accelerated decoding with VA-API, and video rendering using KMS/DRM. But, sadly, no zero copy, since, in order to have dmabuf sharing, we have to finish and to merge bug 755072 in gstreamer-vaapi.
Running the typical Big Bunny video (1080p, H.264) the playback consumes less than the 50% of the CPU usage, compared with the 100% of CPU usage with software decoders (and a lot of frames dropped).
I recorded a small video with the experiment as a proof.
Update: Javer Martinez told me in twitter, that MFC can do dma-buf export since uses vb2 and Exynos DRM can import, but he couldn’t get zero copy to work due missing format support.