To stream or not to stream

Multimedia blog and other fancy stuff

WebRTC, GStreamer and HTML5 - Part 1

An easy 360º solution for realtime multimedia communication.

Part 1 - The story so far… #

For a few years now, we have been able to communicate in realtime from one web browser to another using the WebRTC protocol. The same protocol also allows broadcasting or ingesting multimedia streams with very low latency, generally less than half a second. Web browsers started integrating the protocol in 2013/2014, and today, in 2023, we have pretty stable and efficient support everywhere.

The GStreamer multimedia framework also started integrating WebRTC in 2017, through the webrtcbin plugin. Using this plugin you can connect to a web browser and stream audio and video in realtime. Unfortunately, webrtcbin is a low-level component. It implements the peer-to-peer connection handshake (using ICE and external STUN servers), packet rerouting when a direct connection is not possible (using external TURN servers), and then maintains the underlying RTP session which transports the actual audio and video data.

To build a 100% working product you need to write a lot of code on top of webrtcbin. Not only do you have to design your own signalling protocol and implement a signalling server, you also have to take care of packet loss and retransmission, manage network congestion and adapt the encoding bitrates to maintain an acceptable user experience over networks of varying quality.

With all that in mind, in 2021 Mathieu Duponchelle started implementing a new GStreamer element called webrtcsink. This element, based on webrtcbin and written in Rust, makes it possible to produce a WebRTC stream and to maintain the underlying connections to multiple remote peers, with retransmission of lost packets, network congestion control and adaptive encoding bitrates. The webrtcsink project comes with a signalling server that helps normalize communication between peers.

N.B. as far as signalling normalization is concerned, it is worth pointing out the WHIP protocol, which is used exclusively for media ingestion.

Since then, at Igalia, we have continuously worked to improve webrtcsink. Last year in particular, Thibault Saunier implemented the complete Google Congestion Control algorithm from its IETF draft specification. And recently he ported the whole project to gst-plugins-rs in order to make the plugin available with the official GStreamer distribution and foster broader collaboration from the GStreamer community.

As it stands, webrtcsink is the most promising starting point for building a 360º solution including bi-directional, realtime communication between multiple peers, transparently mixing web browsers and/or native code.

Improvements #

The webrtcsink element is dedicated to broadcasting a WebRTC stream to multiple remote clients (it is a sink, as its name suggests), and the original signalling server was designed with this objective in mind.
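As a quick sketch of what broadcasting looks like in practice (assuming a GStreamer installation that includes the Rust plugins and a signalling server running at its default address, which may differ on your setup), a minimal test pipeline could be:

```shell
# Illustrative test pipeline: webrtcsink encodes the raw test streams and
# registers itself as a producer on the signalling server
gst-launch-1.0 webrtcsink name=ws videotestsrc ! ws. audiotestsrc ! ws.
```

Remote consumer peers can then discover this producer through the signalling server and connect to it.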

For one of our customers, we needed to also be able to consume a remote WebRTC stream and interoperate with web browsers and mobile WebViews on Android and iOS.

To address this, at Igalia we decided to extend the signalling protocol and server, and to complement webrtcsink with the new components described in the following sections.

Signalling #

The signalling protocol relies on the same concepts as the original one, with the improvements described in this section.

Each peer connects to the signalling server through a WebSocket (which can be secured with SSL/TLS) and receives a unique identifier used in further commands. This identifier remains valid until the WebSocket connection is closed.

[Signalling diagram]

By default, all peers inherit the consumer role and can connect to a remote WebRTC stream to consume it. To do so, a peer creates a session with a remote producer peer. This session has its own unique identifier and is in charge of exchanging SDP and ICE messages between both peers until the WebRTC link is established.
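For illustration only, a session setup could follow this kind of message flow over the WebSocket (the message and field names below are simplified placeholders, not necessarily the exact wire format used by the gst-plugins-rs signalling server):

```json
{ "type": "welcome", "peerId": "<unique id assigned by the server>" }
{ "type": "startSession", "peerId": "<remote producer id>" }
{ "type": "sessionStarted", "sessionId": "<unique session id>" }
{ "type": "peer", "sessionId": "<unique session id>", "sdp": "<offer or answer>" }
{ "type": "peer", "sessionId": "<unique session id>", "ice": "<candidate>" }
```

The first message is sent by the server when the peer connects; the following ones carry the session negotiation until the peer-to-peer WebRTC link takes over the media transport.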

A peer can explicitly request to become a producer, in which case it is announced as such to all other connected peers and becomes ready to receive WebRTC connection requests from remote consumer peers.

Finally, a peer can also explicitly request to become a listener, in which case it will receive a message each time a producer appears on, or disappears from, the signalling network. The list of currently available producers can also be requested independently of the peer's roles.

The webrtcsink element always registers as a producer, whereas the webrtcsrc element is a consumer. The gstwebrtc-api allows activating and deactivating the producer mode from the API.

The novelty here is that it is now possible to activate and deactivate the listener and producer roles without opening several WebSocket connections or disconnecting from the signalling server. You can also create more than one consumer session, but you are still limited to one producer session per peer.

To sum up: if you are building a web application, you can connect to as many remote streams as you want, but you can only produce one single stream (which can contain several video and audio tracks). If you are building a native application with GStreamer, you can use as many webrtcsrc elements as you want to connect to several remote streams, and one webrtcsink element to produce one WebRTC stream.

With these combinations you can easily create low-latency streaming applications, media ingestion tools or any kind of video conferencing software, among other examples.

Webrtcsrc #

The new webrtcsrc element, developed in Rust by Thibault Saunier, offers a full-featured WebRTC consumer. The element connects to the signalling server and automatically manages the WebRTC session with a remote peer identified by its unique identifier.

It also registers the gstwebrtc(s):// URI schemes, which can be used directly with the uridecodebin, playbin or playbin3 elements. Connecting to a remote WebRTC stream has never been so easy:

gst-launch-1.0 playbin uri=gstwebrtc://

or directly:

gst-play-1.0 gstwebrtc://

The structure of a webrtcsrc URI is as follows: gstwebrtc://<signalling server host>:<port>?peer-id=<remote producer id>, with the gstwebrtcs:// scheme used when the connection to the signalling server is secured with SSL/TLS.
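For example, with a hypothetical local signalling server and a placeholder producer identifier (substitute the identifier actually announced on your signalling network):

```shell
# Illustrative only: play the stream of a given remote producer
gst-play-1.0 "gstwebrtc://127.0.0.1:8443?peer-id=<remote-producer-id>"
```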

The webrtcsrc element also provides access to its internal signaller object, so you can communicate with the signalling server and, for example, listen for all producers appearing on the signalling network without having to implement the signalling protocol yourself. The following example shows how to obtain the signaller instance and receive information about new producers.

class Application;
extern Application* myApp;

GstElement* elem = gst_element_factory_make("webrtcsrc", nullptr);
if (elem) {
    GObject* signaller = nullptr;
    g_object_get(elem, "signaller", &signaller, nullptr);

    if (signaller) {
        // After usage the signaller must be released by calling g_object_unref(signaller).
        // This will automatically disconnect the signaller from the server, and the
        // attached signals will not be called anymore.

        gst_util_set_object_arg(signaller, "role", "listener");

        g_signal_connect_swapped(signaller, "error", G_CALLBACK(+[](Application* app, const char* error) {
            // Manage errors...
        }), myApp);

        g_signal_connect_swapped(signaller, "producer-added",
            G_CALLBACK(+[](Application* app, const char* producerId, const GstStructure* producerMeta, gboolean newConnection) {
                // Manage an available producer...
                // You can, for example, create a new pipeline/branch using a webrtcsrc
                // element and the producerId value to consume the remote stream
            }), myApp);

        g_signal_connect_swapped(signaller, "producer-removed",
            G_CALLBACK(+[](Application* app, const char* producerId, const GstStructure* producerMeta) {
                // Cleanup a disconnected producer...
                // Any local consumer pipeline/branch previously created
                // with a webrtcsrc element will receive an EOS event
            }), myApp);

        // Connect to the signalling server and start listening for remote events
        bool ret = false;
        g_signal_emit_by_name(signaller, "start", &ret);
    }
}
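Once a producer identifier is known (for example from a producer-added notification), a consumer can be pointed at that specific remote stream. As a hedged sketch (the signaller property names below match the element's documentation at the time of writing, but double-check them against your GStreamer version; the server address is a placeholder):

```cpp
// Illustrative sketch: configure a webrtcsrc element to consume one remote producer
GstElement* src = gst_element_factory_make("webrtcsrc", nullptr);
if (src) {
    GObject* srcSignaller = nullptr;
    g_object_get(src, "signaller", &srcSignaller, nullptr);
    if (srcSignaller) {
        // "uri" is the signalling server address and "producer-peer-id"
        // identifies the remote stream to consume
        g_object_set(srcSignaller,
                     "uri", "ws://127.0.0.1:8443",
                     "producer-peer-id", "<remote producer id>", nullptr);
        g_object_unref(srcSignaller);
    }
    // The element can then be linked into a pipeline like any other source
}
```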