A Complement Story to the FOSDEM 2022 Talk


On February 6th, I fortunately had a chance to give a lightning talk in the graphics devroom at FOSDEM 2022. There I presented a summary of the status of Turnip driver development in Mesa, and I'd like to write down some more things that I couldn't say at the talk, as a complement to it.

History of the development.

To repeat what I said in the talk: Turnip is the code name of the open-source Vulkan driver for Qualcomm Adreno GPUs. I dived into this GPU in 2018 and participated in developing Freedreno, the OpenGL driver founded and led by Rob Clark. Developing a user-space graphics driver was new to me at that time, and it was a good chance to learn and experience how GPUs work, how to develop a graphics driver, and how to communicate with the Mesa community. Thanks to my contributions to Freedreno at that time, I became a committer in the Mesa community. And at the beginning of 2020, Igalia, including me, started working on the open-source Vulkan driver, that is, the Turnip driver. Actually, Turnip development had already been started in 2018 by people at Google, and it was definitely lucky for us to get a chance to participate in it.

As I said in the talk, the driver was immature at the beginning, though it worked fine, and lots of extensions and features have been implemented since 2020. Jonathan Marek and Connor Abbott played an important role at that time (and still do); they implemented the very basic and essential features from scratch (copying from Freedreno where necessary). So Igalia could get involved in implementing extensions and features on top of what they had done. In 2021 we could speed up the progress based on the accumulated experience and knowledge. In particular, Danylo joining was a great addition to the project, and he's been doing a great job ever since. I won't write details about this in this post, but you can refer to the blogs of Samuel and Danylo, who are my great colleagues at Igalia. It's a pity that there are no posts about my work yet, though. Hopefully I can write some more posts about it in the near future. (I have a list of topics already! XD)

One of the main goals: playing games!

This is what I'd especially like to highlight in this post; it's probably why I'm writing it.

I presented the following in the "What happened in 2021" part:

  • Making Windows games run with DXVK/vkd3d on Linux/ARM
    • with x86 emulators (FEX, Box86)
    • Some Windows games started running!

And what about this, in the "What'll happen in 2022" part?

  • Focusing on real-world use cases
    • There are still not enough games running on ARM.
    • Trying to run more Windows games via Wine (Proton)

Cool, isn't it?

Actually, I realized that many people out there are very interested in this, that is, making games run on Linux/ARM, including, of course, Windows/x86 games. And I know people want to test it themselves on their own devices if possible, but there are some obstacles:

  • First, Qualcomm devices are very expensive compared to the Raspberry Pi or other ARM devices.
  • Second, the setup is very tricky, and there's no good documentation for it.
  • Third, even once you've completed the setup, it's still too unstable to run every game.

Honestly, I'm not going to write about the details of how to do it in this post. Instead, I'd like to show some of our efforts as examples of what we've tried. As I said in the talk, we've been trying these complicated setups because we want to test Turnip with real use cases, but there are not enough native games running on Linux/ARM yet. Here are the cases we've been trying.

The first case is here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4738.

One day in 2021, an issue was raised in the community. As you can see from the system information, someone was trying to run Windows games with Turnip on a Qualcomm device.

There were two kinds of setup on Android. The first, and most complicated, uses virgl-vtest to get the OpenGL calls from Windows games, as you can see below. It seems that, at that time, ExaGear couldn't access the GPU directly, so they used virgl-vtest-server to access the GPU with Turnip, which caused performance degradation.

Window Games
ExaGear:    (virgl) | <----> |  virgl-vtest
                    |        | ------------------------
 Wine on            |        |  turnip/zink
  x86 Ubuntu 18.04  |        | ------------------------
                    |        |  Ubuntu 20.04 on chroot
                  Android 10

So there was another attempt that avoided virgl-vtest-server, and it seems it was successful for a few games with a setup like this:

Window Games
WINE(proton using dxvk)
X86 emulator(Box86)
Ubuntu 20.04 on Termux proot
Android 10

This looks simpler, but is still complicated. As you can see in the issue, Danylo managed to set up the system, found the root causes, and fixed all of them.

Now it looks like someone has succeeded in accessing the GPU directly in this emulator, so we wouldn't need to use virgl-vtest anymore; we got a new issue for the setup including ExaGear. :) See https://gitlab.freedesktop.org/mesa/mesa/-/issues/6024 for details.

The second case is here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5723

It is similar to the second setup of the first case, and looks like:

Window Games
WINE(proton using dxvk)
X86 emulator(Box86)
Linux

Yeah, it uses Linux directly instead of Termux on Android, so it got a bit simpler. You can see the instructions and some troubleshooting for this setup here: https://github.com/Heasterian/Box86-64-on-SD845-mobian

The third case is running Android games, just replacing the proprietary driver with Turnip.

This is the simplest. You can see instructions on how to build the Turnip driver for Android here. But note that there don't seem to be enough Vulkan games on Android yet, which means we still need the x86 emulators to test real games. You can also see some efforts on this setup on Danylo's blog:

https://blogs.igalia.com/dpiliaiev/turnips-in-the-wild-part-1/
https://blogs.igalia.com/dpiliaiev/turnips-in-the-wild-part-2
https://blogs.igalia.com/dpiliaiev/gfxreconstruct-test-mobile-gpus/

Additionally, there was also a talk about emulation on ARM for playing games: "FEX-Emu: Fast(-er) x86 emulation for AArch64". I think you can get more information there about running x86 games on Linux/ARM, and I believe you'll have fun with that talk too.

OK, that's the bit more that I missed at the talk. Hopefully it is useful to someone interested. I'll also try to write up far more details, for example how to set up and run games step by step, in the near future.

Thanks for reading!

Improvements for GStreamer Intel-MSDK plugins


Last November I had a chance to dive into the Intel Media SDK plugins in gst-plugins-bad. It was a very good chance for me to learn how GStreamer elements handle special memory with their own infrastructure, like GstBufferPool and GstMemory. In this post I'm going to talk about the improvements to the GStreamer Intel MSDK plugins in release 1.14, which is what I've been working on since last November.

First of all, for those not familiar with it, I'm going to briefly explain what Media SDK is. Media SDK (a.k.a. MSDK) is the cross-platform API for accessing Intel's hardware-accelerated video encode and decode functions on Windows and Linux. You can get more information about MSDK here and here.

On Linux, though, it's so far hard to set up an environment that makes the MSDK driver work. If you want to set up a development environment for MSDK and see it working on Linux, you should follow the steps described on this page. But I would recommend you refer to Victor's post, which explains this somewhat annoying process very well.

Additionally, the Linux driver supports only Skylake as far as I know, which is very disappointing for users of other chipsets. I (and probably you) hope we can work on it without any specific patch/driver (since it is open source!) and without any dependency on the chipset. As far as I know, Intel has been considering solving this issue, so we'll see what happens in the future.

Back to GStreamer: GStreamer plugins using MSDK landed in 2016. At that time they worked fine, with basic features for playback and encoding, but there were several things to improve, especially performance.

Eventually, in the middle of last March, GStreamer 1.14 was released, including the improvements to the MSDK plugins, which is what I want to talk about in this post.

So let me begin now.

Supports bufferpool and starts using video memory.

This is a key feature that improves the performance.

In MSDK, there are two types of memory supported by the driver. One is "System Memory" and the other is "Video Memory". (There is one more type, called "Opaque Memory", but I won't talk about it in this post.)

System memory is normal memory allocated in user space. It was already being used in the plugins; it is quite simple to use, but not recommended, since the performance is not good enough.

Video memory is memory used by the hardware acceleration device, also known as the GPU, to hold frames and other types of video data.

For applications to use video memory, something platform-specific has to be implemented. On Linux, for example, we can use VA-API to handle video memory through MSDK, and that's what's included in the 1.14 release; it means something similar still needs to be implemented for Windows.

To support video memory, I needed to implement GstMSDK(System/Video)Memory to generalize how the memory is accessed and mapped in the GStreamer way. GstMSDKBufferPool can then allocate this type of memory, and it can be proposed upstream as well as used in each MSDK element itself. There were lots of problems and arguments during this work, since the designs of the MSDK APIs and the GStreamer infrastructure don't match perfectly.

You can see discussion and patches here in this issue.

In addition, when using video memory on Linux, we can use DMABuf by acquiring an fd handle via VA-API at allocation time. Recently this has been done, for DMABuf export only, in this issue, though it's not included in the 1.14 release.

Sharing context/session

For better resource utilization, the MSDK session needs to be shared among the MSDK elements in a pipeline. An MSDK session maintains context for the use of any of the decode, encode and convert (VPP) functions; yes, it's just like a handle to the driver. One session can run exactly one of decode, encode and convert (VPP).

So let's think about a usual transcoding pipeline. It would be like this: src - decoder - converter - encoder - sink. In this case we should use the same MSDK session for decoding, converting and encoding. Otherwise we would have to copy the data from upstream to work with a different session, because sessions cannot share data, which would be much worse.

There's one more thing: MSDK supports joining sessions. If an application wants (or has) to use multiple sessions, it can join them to share data, and we need to support that at the GStreamer level as well.

All of this can be achieved with GstContext, which provides a way of sharing not only between elements but also with the application. You can see the patch in the same issue as the MSDK bufferpool one.

Adds vp8 and mpeg2 decoders.

Sree added the mpeg2/vc1 decoders and I added the vp8 decoder.

Supports a number of algorithms and tuning options in the encoders

The encoders now expose a number of rate-control algorithms, and more encoder tuning options were added, like trellis quantization (h264), slice size control (h264), B-pyramid prediction (h264), MB-level bitrate control, frame partitioning and adaptive I/B frame insertion. The encoders now also handle force-key-unit events and can insert frame-packing SEIs for side-by-side and top-bottom stereoscopic 3D video.

All of this was done by Sree, and you can see the details in this issue.

The capabilities of the encoders' sinkpads are improved by using VPP.

The MSDK encoders had accepted only NV12 raw data, since the MSDK encoder supports only the NV12 format. But other formats can be handled too if we convert them to NV12 using VPP, which is also supported by the MSDK driver. This was done by slomo, and I fixed a bug related to it. See this bug for more details.

You can find all of the patches for the MSDK plugins here.

As I said at the top of this post, all the MSDK stuff should be opened up first, and it should support at least some of the major Intel chipsets, even if not all. For now, the one thing I can say is that the GStreamer MSDK plugins are being improved continuously. We'll see what happens in the near future.

Finally, I want to say that Sree and Victor helped me a lot as reviewers and advisers in this work. I really appreciate it.

Thanks for reading!

Support GstContext for VA-API elements


Since I started working on gstreamer-vaapi, one of the things disappointing me is that vaapisink is not so popular even though it should be the best choice on a VA-API-enabled machine. There are some reasonable causes, and one of them is probably that it doesn't provide a convenient way for application developers to integrate it.

Until now, we provided a way to set the X11 window handle via gst_video_overlay_set_window_handle, which tells the overlay to display the video output in a specific window. But this is not enough, since the VA and X11 display handles are managed internally inside the gstreamer-vaapi elements, which means users can't handle them by themselves.

In short, there was no way to share a display handle created by the application. We also had some additional problems due to this issue, as follows:

  • If users want to handle multiple displays separately, it isn't possible. bug 754820
  • If users run multiple decoding pipelines with vaapisink, performance drops critically, since there are locks in each vaapisink sharing the same VADisplay. bug 747946

Recently we merged a series of patches providing a way to set an external VA display and X11 display from the application via GstContext. GstContext provides a way of sharing not only between elements but also with the application, using queries and messages. (For more details, see https://developer.gnome.org/gstreamer/stable/gstreamer-GstContext.html.)

With these patches, an application can set its own VA display and X11 display on the VA-API elements as follows:

  • Create a VADisplay instance with vaGetDisplay; it doesn't need to be initialized at startup and terminated at exit.
  • Call gst_element_set_context with a context on which each display instance is set.

Example: sharing a VADisplay and X11 Display in the bus callback; this is almost the same as other examples using GstContext.

static GstBusSyncReply
bus_sync_handler (GstBus * bus, GstMessage * msg, gpointer data)
{
  switch (GST_MESSAGE_TYPE (msg)) {
    case GST_MESSAGE_NEED_CONTEXT:{
      const gchar *context_type;
      GstContext *context;
      GstStructure *s;
      VADisplay va_display;
      Display *x11_display;

      gst_message_parse_context_type (msg, &context_type);
      gst_println ("Got need context %s from %s", context_type,
          GST_MESSAGE_SRC_NAME (msg));

      if (g_strcmp0 (context_type, "gst.vaapi.app.Display") != 0)
        break;

      x11_display = /* Get X11 Display somehow */;
      va_display = vaGetDisplay (x11_display);

      context = gst_context_new ("gst.vaapi.app.Display", TRUE);
      s = gst_context_writable_structure (context);
      gst_structure_set (s, "va-display", G_TYPE_POINTER, va_display, NULL);
      gst_structure_set (s, "x11-display", G_TYPE_POINTER, x11_display, NULL);

      gst_element_set_context (GST_ELEMENT (GST_MESSAGE_SRC (msg)), context);
      gst_context_unref (context);
      break;
    }
    default:
      break;
  }

  return GST_BUS_PASS;
}

You can also find the entire example code here.

Furthermore, we know we need to support Wayland for this feature; see bug 705821. There are already some pending patches, but they need to be rebased and adapted to the current approach. I'll be working on this in the near future.

We really want to test this feature more, especially in practical cases, before the next release. I'd appreciate it if someone reported any bugs or issues, and I promise I'll look into them closely.

Thanks for reading!

100 commits in GStreamer


It's been three years since I started working on GStreamer, and in the meantime I've fortunately contributed over 100 commits!

Let’s look at my commits in each project in GStreamer.

As I write this article, I have made 128 commits.

At Samsung Electronics, my previous company, I had a chance to work on GStreamer, which is the main multimedia framework on Tizen. There I realized that there are lots of opportunities in the open source world, and I started enjoying contributing to this project.

This is my first commit:

Yes, it's just a typo fix. It landed just five minutes after I proposed it, and I realized that the maintainers look at all the issues in Bugzilla. To be honest, I had doubted that a bit. :P

Let's look at some other commits that I was really happy with.

While I was working on gst-rtsp-server, I found that RTP retransmission wasn't working properly on the server. I reported the issue, the discussion went very positively, and my proposed patches finally landed, thanks to Sebastian.

This was an enhancement to the RTSP/RTP infrastructure in GStreamer, providing a way to report sender/receiver stats.

Then I contributed big patches creating new APIs for transforming between SDP and GstCaps, including removing duplicated code. Thanks again, Sebastian.

Until this point I had focused on the server side of RTSP/RTP streaming, since I was working on Miracast on Tizen, which uses gst-rtsp-server. Around this time I started looking for a company where I could work on open source more closely. Eventually I found Igalia, which does great work in the open source world, including WebKit, Chromium and GStreamer.

Since I joined Igalia, I have been focusing on gstreamer-vaapi with my great colleague Victor, who is one of the maintainers of the GStreamer project, and I've had many more chances to contribute than before. As I said, I worked on the RTSP server side before, which meant focusing on encoders, muxers and networking stuff. But since this move, I've started focusing on playback, including decoders and sinks, to make things playable on various platforms.

These are my best patches, I think.

With this set of patches, playback performance on GL/VAAPI has improved dramatically.

Besides that, I have contributed some patches improving the vaapi decoders and encoders, most of them for h264, which also makes me happy.

During the last three years of working on GStreamer, I've grown in my software development skills, my understanding of open source, and my insight into the world of software. I deeply appreciate Igalia for giving me this opportunity, and I also thank you, Victor, for giving me a lot of motivation.

Even at this moment, I'm still working on, enjoying, and sometimes struggling with GStreamer. I really want to keep this work going and find a chance to contribute something new to GStreamer.

Thanks for reading!

Libva-rust (libva bindings for Rust) in development


Since the Rust language appeared in the world, I've felt strongly that this is a language I should learn.

This is because:

  • Rust prevents common C/C++ bugs such as memory corruption and race conditions, which are very painful to fix whenever you encounter them in a large project.
  • Rust doesn't lose performance even while providing these guarantees!

I don't think Rust aims to replace C/C++, but it's worth learning for C/C++ developers like me, at least. So I've been searching for and thinking about what I could do with this new language. At the end of last year, I decided to implement libva bindings for Rust.

Here are the advantages of doing this project:

  • I'm working on the gstreamer-vaapi project, which means I'm familiar with VA-API and the middleware using it.
  • Binding an existing project to another language like this makes me understand the project much better than before.
  • At the same time, I can learn the new language at the level of practical development.
  • H/W acceleration is a critical feature, especially for laptops and other embedded systems, so this project could be a good option for those trying to use H/W acceleration for playback on Linux.

Finally, I opened this internal project on GitHub, named libva-rust.
There is one example, which creates a VASurface, puts raw data into the surface, and displays it in an X11 window (X11 only, so far).

Let’s see the example code briefly.

let va_disp = VADisplay::initialize(native_display as *mut VANativeDisplay).unwrap();

let va_surface = VASurface::new(&va_disp, WIDTH, HEIGHT, ffi::VA_RT_FORMAT_YUV420, 1).unwrap();

let va_config = VAConfig::new(&va_disp, ffi::VAProfileMPEG2Main, ffi::VAEntrypointVLD, 1).unwrap();

let va_context = VAContext::new(&va_disp,
                                WIDTH as i32,
                                HEIGHT as i32,

Initialization for VA-API.

test_draw::image_generate(&va_disp, &va_image, &va_image_buf);


Draw raw data into the VaapiImage in test_draw.rs and put it into the created surface.

va_surface.put_surface(&va_disp, win, 0, 0, WIDTH, HEIGHT, 0, 0, WIDTH, HEIGHT);

Finally, display it by putting the surface into the created X11 window.
It's simple, as you can see, but it's the important first step.

My first goal is to provide a general and easy set of "rusty" APIs, so that this can be integrated into other Rust multimedia projects like rust-media.

Another potential goal is an implementation of the vaapi plugins in GStreamer written in Rust. Recently Sebastian has been working on this (https://github.com/sdroege/rsplugin), and I would really like to get involved in that project.

There are tons of things to do at the moment.
Here's the to-do list for now:

  • Implement a vp8 decoder first: simply put, it looks easier than an h26x decoder. Is there any useful Rust h26x parser out there, by the way?
  • Manipulate raw data using Rust APIs like bit/byte readers and writers.
  • Implement general try-catch statement in Rust.
  • Make test cases.
  • Support Wayland.

Yes, it still has a long way to go, and I don't have enough time to focus on this project, but I'll manage to keep working on it.

So feel free to use it, and contributions are absolutely welcome, including issue reports, bug fixes, patches, etc.