<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:base="en">
	<title>Untangling the Web</title>
	<subtitle>Paweł Lampe&#39;s blog describing various topics related to the web, web engines, programming, and technology in general.</subtitle>
	<link href="https://blogs.igalia.com/plampe/feed/feed.xml" rel="self"/>
	<link href="https://blogs.igalia.com/plampe/"/>
	<updated>2025-12-19T00:00:00Z</updated>
	<id>https://blogs.igalia.com/</id>
	<author>
		<name>Paweł Lampe</name>
		<email>plampe@igalia.com</email>
	</author>
	
	<entry>
		<title>WPE performance considerations: pre-rendering</title>
		<link href="https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/"/>
		<updated>2025-12-19T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/</id>
		<content type="html">&lt;p&gt;This article is a continuation of the series on &lt;strong&gt;WPE performance considerations&lt;/strong&gt;. While the &lt;a href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;previous article&lt;/a&gt; touched upon fairly low-level aspects of the DOM tree overhead,
this one focuses on more high-level problems related to managing the application’s workload over time. Similarly to before, the considerations and conclusions made in this blog post are strongly related to web applications
in the context of embedded devices, and hence the techniques presented should be used with extra care (and benchmarking) if one would like to apply those on desktop-class devices.&lt;/p&gt;
&lt;h2 id=&quot;the-workload&quot; tabindex=&quot;-1&quot;&gt;The workload &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Typical web applications on embedded devices have their workloads distributed over time in various ways. In practice, however, the workload distributions can usually be fitted into one of the following categories:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Idle applications with occasional updates&lt;/strong&gt; - the applications that present static content and are updated very infrequently. As an example, one can think of a dashboard that presents static content and switches
the page every, say, 60 seconds, such as a static departures/arrivals dashboard at an airport.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Idle applications with frequent updates&lt;/strong&gt; - the applications that present static content yet are updated frequently (or occasionally present some dynamic content, such as animations). In that case, one can imagine a similar
airport departures/arrivals dashboard, yet with the animated page scrolling happening quite frequently.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Active applications with occasional updates&lt;/strong&gt; - the applications that present some dynamic content (animations, multimedia, etc.), yet with major updates happening very rarely. An example one can think of in this case is an application
playing video along with presenting some metadata about it, and switching between other videos every few minutes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Active applications with frequent updates&lt;/strong&gt; - the applications that present some dynamic content and change the surroundings quite often. In this case, one can think of a stock market dashboard continuously animating the charts
and updating the presented real-time statistics very frequently.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Such workloads can be well demonstrated on charts plotting the browser’s CPU usage over time:&lt;/p&gt;
&lt;center style=&quot;transform:scale(0.8,0.8);&quot;&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/obgL44nHKc-1385.avif 1385w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/obgL44nHKc-1385.webp 1385w&quot;&gt;&lt;img alt=&quot;Typical web application workloads.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/obgL44nHKc-1385.png&quot; width=&quot;1385&quot; height=&quot;360&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;As long as the peak workload (due to updates) is small, no negative effects are perceived by the end user. However, when the peak workload is significant, some negative effects may start getting noticeable.&lt;/p&gt;
&lt;p&gt;In case of applications from groups (1) and (2) mentioned above, a significant peak workload may not be a problem at all. As long as there are no continuous visual changes and no interaction is allowed during updates, the end-user
is unable to notice that the browser was not responsive or missed some frames for some period of time. In such cases, the application designer does not need to worry much about the workload.&lt;/p&gt;
&lt;p&gt;In other cases, especially the ones involving applications from groups (3) and (4) mentioned above, the significant peak workload may lead to visual stuttering, as any processing making the browser busy for longer than 16.6 milliseconds
will lead to lost frames. In such cases, the workload has to be managed in a way that the peaks are reduced either by optimizing them or distributing them over time.&lt;/p&gt;
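&lt;p&gt;To make the 16.6 ms figure concrete, below is a minimal sketch of the arithmetic. It assumes a fixed refresh rate and a simplified model in which a blocking task starts exactly at a frame boundary and nothing else competes for the frame. The helper names &lt;code&gt;frameBudgetMs&lt;/code&gt; and &lt;code&gt;lostFrames&lt;/code&gt; are made up for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Hypothetical helper: time available per frame at a given refresh rate.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// Hypothetical helper: frames missed by a task that blocks for taskMs,
// assuming the task starts exactly at a frame boundary.
function lostFrames(taskMs, refreshRateHz) {
  const budget = frameBudgetMs(refreshRateHz);
  return Math.max(0, Math.ceil(taskMs / budget) - 1);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under those assumptions, a task blocking the browser for 21 ms at 60 Hz costs one frame, while an 11 ms task fits within the budget.&lt;/p&gt;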
&lt;h4 id=&quot;first-step-optimization&quot; tabindex=&quot;-1&quot;&gt;First step: optimization &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The first step in addressing the peak workload is usually optimization. The modern web platform offers a wide variety of tools to optimize all the stages of web application processing done by the browser. The usual process of optimization is a
2-step cycle starting with measuring the bottlenecks and followed by fixing them. In the process, the usual improvements involve:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;using CSS containment,&lt;/li&gt;
&lt;li&gt;using shadow DOM,&lt;/li&gt;
&lt;li&gt;promoting certain parts of the DOM to layers and manipulating them with transforms,&lt;/li&gt;
&lt;li&gt;parallelizing the work with workers/worklets,&lt;/li&gt;
&lt;li&gt;using the &lt;code&gt;visibility&lt;/code&gt; CSS property to separate painting from layout,&lt;/li&gt;
&lt;li&gt;optimizing the application itself (JavaScript code, the structure of the DOM, the architecture of the application),&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&quot;second-step-pre-rendering&quot; tabindex=&quot;-1&quot;&gt;Second step: pre-rendering &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Unfortunately, in practice, it’s not uncommon that even very well optimized applications still have too much of a peak workload for the constrained embedded devices they’re used on. In such cases, the last resort solution is
&lt;strong&gt;pre-rendering&lt;/strong&gt;. As long as it’s possible from the application business-logic perspective, having at least some web page content pre-rendered is very helpful in situations when workload has to be managed, as &lt;strong&gt;pre-rendering&lt;/strong&gt;
allows the web application designer to choose the precise moment when the content should actually be rendered and how it should be done. With that, it’s possible to establish a proper trade-off between reduction in peak workload and
the amount of extra memory used for storing the pre-rendered contents.&lt;/p&gt;
&lt;h2 id=&quot;pre-rendering-techniques&quot; tabindex=&quot;-1&quot;&gt;Pre-rendering techniques &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Nowadays, the web platform provides at least a few widely adopted APIs that give the application the means to perform various kinds of pre-rendering. Also, due to the ways the browsers are implemented, some APIs can be purposely misused
to provide pre-rendering techniques not necessarily supported by the specification. However, in the pursuit of good trade-offs, all the possibilities should be taken into account.&lt;/p&gt;
&lt;p&gt;Before jumping into particular pre-rendering techniques, it’s necessary to emphasize that the &lt;strong&gt;pre-rendering&lt;/strong&gt; term used in this article refers to the actual rendering being done earlier than it’s visually presented. In that
sense, the resource is rasterized to some intermediate form when desired and then just composited by the browser engine’s compositor later.&lt;/p&gt;
&lt;h4 id=&quot;pre-rendering-offline&quot; tabindex=&quot;-1&quot;&gt;Pre-rendering offline &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The most basic (and limited at the same time) pre-rendering technique is one that involves rendering offline i.e. before the browser even starts. In that case, the first limitation is that the content to be rendered must be known
beforehand. If that’s the case, the rendering can be done in any way, and the result may be captured as e.g. raster or vector image (depending on the desired trade-off). However, the other problem is that such a rendering is usually out of
the given web application scope and thus requires extra effort. Moreover, depending on the situation, the amount of extra memory used, the longer web application startup (due to loading the pre-rendered resources), and the processing
power required to composite a given resource, it may not always be trivial to obtain the desired gains.&lt;/p&gt;
&lt;h4 id=&quot;pre-rendering-using-canvas&quot; tabindex=&quot;-1&quot;&gt;Pre-rendering using canvas &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The first group of actual pre-rendering techniques happening during web application runtime is related to &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API&quot;&gt;Canvas&lt;/a&gt; and
&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas&quot;&gt;OffscreenCanvas&lt;/a&gt;. Those APIs are really useful as they offer great flexibility in terms of usage and are usually very performant.
However, in this case, the natural downside is the lack of support for rendering the DOM inside the canvas. Moreover, canvas has very limited support for painting text, unlike the DOM, where
CSS has a significant amount of features related to it. Interestingly, there’s an ongoing proposal called &lt;a href=&quot;https://github.com/WICG/html-in-canvas&quot;&gt;HTML-in-Canvas&lt;/a&gt; that could resolve those limitations
to some degree. In fact, Blink has a functioning prototype of it already. However, it may take a while before the spec is mature and widely adopted by other browser engines.&lt;/p&gt;
&lt;p&gt;When it comes to actual usage of canvas APIs for pre-rendering, the possibilities are numerous, and there are even more of them when combined with processing using &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API&quot;&gt;workers&lt;/a&gt;.
The most popular ones are as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;rendering to an invisible canvas and showing it later,&lt;/li&gt;
&lt;li&gt;rendering to a canvas detached from the DOM and attaching it later,&lt;/li&gt;
&lt;li&gt;rendering to an invisible/detached canvas and producing an image out of it to be shown later,&lt;/li&gt;
&lt;li&gt;rendering to an offscreen canvas and producing an image out of it to be shown later.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When combined with workers, some of the above techniques may be used in the worker threads with the rendered artifacts transferred to the main thread for presentation purposes. In that case, one must be careful with
the transfer itself, as some objects may get serialized, which is very costly. To avoid that, it’s recommended to use &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects&quot;&gt;transferable objects&lt;/a&gt;
and always perform proper benchmarking to make sure the transfer does not involve serialization in the particular case.&lt;/p&gt;
&lt;p&gt;While the use of canvas APIs is usually very straightforward, one must be aware of two extra caveats.&lt;/p&gt;
&lt;p&gt;First of all, in the case of many techniques mentioned above, there is no guarantee that the browser will perform actual rasterization at the given point in time. To ensure the rasterization is triggered, it’s usually
necessary to enforce it using e.g. a dummy readback (&lt;code&gt;getImageData()&lt;/code&gt;).&lt;/p&gt;
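&lt;p&gt;A minimal sketch of such a forced readback is shown below. It assumes that &lt;code&gt;ctx&lt;/code&gt; is a 2D rendering context obtained from a canvas or an offscreen canvas, that &lt;code&gt;draw&lt;/code&gt; is an application-provided callback, and that reading back a single pixel is enough to flush the pending drawing in the target engine. The last assumption in particular is engine-specific and should be verified by benchmarking. The helper name &lt;code&gt;prerenderInto&lt;/code&gt; is hypothetical:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Hypothetical helper: draw into a 2D context and force the work to happen now.
function prerenderInto(ctx, draw) {
  // Issue the drawing commands for the content that will be shown later.
  draw(ctx);
  // Dummy readback of a single pixel; the returned data is intentionally unused.
  ctx.getImageData(0, 0, 1, 1);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the readback to a single pixel is a deliberate choice here, as reading back larger areas adds its own cost on top of the rasterization being forced.&lt;/p&gt;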
&lt;p&gt;Finally, one should be aware that the usage of canvas comes with some overhead. Therefore, creating many canvases or creating them often may lead to performance problems that could outweigh the gains from the
pre-rendering itself.&lt;/p&gt;
&lt;h4 id=&quot;pre-rendering-using-eventually-invisible-layers&quot; tabindex=&quot;-1&quot;&gt;Pre-rendering using eventually-invisible layers &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The second group of pre-rendering techniques happening during web application runtime is limited to DOM rendering and comes from a combination of purposeful spec misuse and tricking the browser engine into rasterizing
on demand. As one can imagine, this group of techniques is very much browser-engine-specific. Therefore, it should always be backed by proper benchmarking of all the use cases on the target browsers and target hardware.&lt;/p&gt;
&lt;p&gt;In principle, all the techniques of this kind consist of 3 parts:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Forcing the content to be pre-rendered onto a separate layer backed by an actual buffer inside the browser,&lt;/li&gt;
&lt;li&gt;Tricking the browser’s compositor into thinking that the layer needs to be rasterized right away,&lt;/li&gt;
&lt;li&gt;Ensuring the layer won’t be composited eventually.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;When all the elements are combined together, the browser engine will allocate an internal buffer (e.g. texture) to back the given DOM fragment, it will process that fragment (style recalc, layout), and rasterize it right away. It will do so
as it will not have enough information to allow delaying the rasterization of the layer (as e.g. in case of &lt;code&gt;display: none&lt;/code&gt;). Then, when the compositing time comes, the layer will turn out to be invisible in practice
due to e.g. being occluded, clipped, etc. This way, the rasterization will happen right away, but the results will remain invisible until a later time when the layer is made visible.&lt;/p&gt;
&lt;p&gt;In practice, the following approaches can be used to trigger the above behavior:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for &lt;strong&gt;(1)&lt;/strong&gt;, the CSS properties such as &lt;code&gt;will-change: transform&lt;/code&gt;, &lt;code&gt;z-index&lt;/code&gt;, &lt;code&gt;position: fixed&lt;/code&gt;, &lt;code&gt;overflow: hidden&lt;/code&gt; etc. can be used depending on the browser engine,&lt;/li&gt;
&lt;li&gt;for &lt;strong&gt;(2)&lt;/strong&gt; and &lt;strong&gt;(3)&lt;/strong&gt;, the CSS properties such as &lt;code&gt;opacity: 0&lt;/code&gt;, &lt;code&gt;overflow: hidden&lt;/code&gt;, &lt;code&gt;contain: strict&lt;/code&gt; etc. can be utilized, again, depending on the browser engine.&lt;/li&gt;
&lt;/ul&gt;
&lt;h5&gt;The scrolling trick&lt;/h5&gt;
&lt;p&gt;While the above CSS properties allow for various combinations, in case of WPE WebKit in the context of embedded devices (tested on &lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt;), the combination that has proven to yield the best performance benefits turns
out to be a simple approach involving &lt;code&gt;overflow: hidden&lt;/code&gt; and scrolling. The example of such an approach is explained below.&lt;/p&gt;
&lt;p&gt;Suppose the goal of the application is to update a big table with numbers once every &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; frames — like in the following demo:
&lt;a href=&quot;https://scony.github.io/web-examples/text/random-numbers-bursting-in-table.html?cs=20&amp;amp;rs=20&amp;amp;if=59&quot;&gt;random-numbers-bursting-in-table.html?cs=20&amp;amp;rs=20&amp;amp;if=59&lt;/a&gt;&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/0PwhZK5P7T-654.avif 654w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/0PwhZK5P7T-654.webp 654w&quot;&gt;&lt;img alt=&quot;Bursting numbers demo.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/0PwhZK5P7T-654.png&quot; width=&quot;654&quot; height=&quot;653&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;With the number of idle frames (&lt;code&gt;if&lt;/code&gt;) set to 59, the idea is that the application does nothing significant for the 59 frames, and then every 60th frame it updates all the numbers in the table.&lt;/p&gt;
&lt;p&gt;As one can imagine, on constrained embedded devices, such an approach leads to a very heavy workload during every 60th frame and hence to lost frames and an unstable application FPS.&lt;/p&gt;
&lt;p&gt;As long as the numbers are available earlier than every 60th frame, the above application is a perfect example where pre-rendering could be used to reduce the peak workload.&lt;/p&gt;
&lt;p&gt;To simulate that, 3 variants of the approach involving the &lt;strong&gt;scrolling trick&lt;/strong&gt; were prepared for comparison with the above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://scony.github.io/web-examples/text/random-numbers-bursting-in-table-prerendered-1.html?cs=20&amp;amp;rs=20&amp;amp;if=59&quot;&gt;random-numbers-bursting-in-table-prerendered-1.html?cs=20&amp;amp;rs=20&amp;amp;if=59&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://scony.github.io/web-examples/text/random-numbers-bursting-in-table-prerendered-2.html?cs=20&amp;amp;rs=20&amp;amp;if=59&quot;&gt;random-numbers-bursting-in-table-prerendered-2.html?cs=20&amp;amp;rs=20&amp;amp;if=59&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://scony.github.io/web-examples/text/random-numbers-bursting-in-table-prerendered-3.html?cs=20&amp;amp;rs=20&amp;amp;if=59&quot;&gt;random-numbers-bursting-in-table-prerendered-3.html?cs=20&amp;amp;rs=20&amp;amp;if=59&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the above demos, the idea is that each cell with a number becomes a scrollable container with 2 numbers actually — one above the other. In that case, because &lt;code&gt;overflow: hidden&lt;/code&gt; is set, only one of the numbers is visible while the
other is hidden — depending on the current scrolling:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qFqjTXuuSo-611.avif 611w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qFqjTXuuSo-611.webp 611w&quot;&gt;&lt;img alt=&quot;Scrolling trick explained.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/qFqjTXuuSo-611.png&quot; width=&quot;611&quot; height=&quot;348&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;With such a setup, it’s possible to update the invisible numbers during &lt;strong&gt;idle&lt;/strong&gt; frames without the user noticing. Due to how WPE WebKit accelerates the scrolling, changing the invisible
numbers, in practice, triggers the layout and rendering right away. Moreover, the actual rasterization to the buffer backing the scrollable container happens immediately (depending on the tiling settings), and hence the high cost of layout
and text rasterization can be distributed. When the time comes, and all the numbers need to be updated, the scrollable containers can be just scrolled, which in that case turns out to be ~2 times faster than updating all the numbers in place.&lt;/p&gt;
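&lt;p&gt;The way the invisible numbers can be refreshed during idle frames may be sketched as a simple batching plan. The following is an illustrative sketch rather than the code of the demos: the function name &lt;code&gt;planIdleUpdates&lt;/code&gt; and the even split are assumptions, and both arguments are assumed to be positive integers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Hypothetical helper: split cellCount hidden-cell updates into batches,
// one batch per idle frame, so that no single frame carries the whole update.
function planIdleUpdates(cellCount, idleFrames) {
  const perFrame = Math.ceil(cellCount / idleFrames);
  const batches = [];
  let start = 0;
  while (start !== cellCount) {
    // Each batch is a half-open range of cell indices: [start, end).
    const end = Math.min(start + perFrame, cellCount);
    batches.push([start, end]);
    start = end;
  }
  return batches;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Assuming the &lt;code&gt;cs=20&lt;/code&gt; and &lt;code&gt;rs=20&lt;/code&gt; parameters describe a table of 400 cells, a plan for 59 idle frames updates at most 7 hidden cells per frame, leaving only the scrolling for the 60th frame.&lt;/p&gt;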
&lt;p&gt;To better understand the above effect, it’s recommended to compare the mark views from sysprof traces of the
&lt;a href=&quot;https://scony.github.io/web-examples/text/random-numbers-bursting-in-table.html?cs=10&amp;amp;rs=10&amp;amp;if=11&quot;&gt;random-numbers-bursting-in-table.html?cs=10&amp;amp;rs=10&amp;amp;if=11&lt;/a&gt; and
&lt;a href=&quot;https://scony.github.io/web-examples/text/random-numbers-bursting-in-table-prerendered-1.html?cs=10&amp;amp;rs=10&amp;amp;if=11&quot;&gt;random-numbers-bursting-in-table-prerendered-1.html?cs=10&amp;amp;rs=10&amp;amp;if=11&lt;/a&gt; demos:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/NVtyG7e_K1-2363.avif 2363w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/NVtyG7e_K1-2363.webp 2363w&quot;&gt;&lt;img alt=&quot;Sysprof from basic demo.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/NVtyG7e_K1-2363.png&quot; width=&quot;2363&quot; height=&quot;1169&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;br&gt;&lt;br&gt;&lt;br&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/6du_zbm-hI-2363.avif 2363w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/6du_zbm-hI-2363.webp 2363w&quot;&gt;&lt;img alt=&quot;Sysprof from pre-rendering demo.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/6du_zbm-hI-2363.png&quot; width=&quot;2363&quot; height=&quot;1172&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;While the first sysprof trace shows very little processing during 11 idle frames and a big chunk of processing (21 ms) every 12th frame, the second sysprof trace shows how the distribution of load looks. In
that case, the amount of work during 11 idle frames is much bigger (yet manageable), but at the same time, the formerly big chunk of processing every 12th frame is reduced almost 2 times (to 11 ms). Therefore, the overall
frame rate in the application is much better.&lt;/p&gt;
&lt;h5&gt;Results&lt;/h5&gt;
&lt;p&gt;Despite the above improvement speaking for itself, it’s worth summarizing the improvement with the benchmarking results of the above demos obtained from the &lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt; and presenting the application’s average
frames per second (FPS):&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/YqNYgMaEpQ-1104.avif 1104w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/YqNYgMaEpQ-1104.webp 1104w&quot;&gt;&lt;img alt=&quot;Benchmarking results.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/YqNYgMaEpQ-1104.png&quot; width=&quot;1104&quot; height=&quot;204&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;Clearly, the positive impact of pre-rendering can be substantial depending on the conditions. In practice, when the rendered DOM fragment is more complex, the trick such as above can yield even better results.
However, due to how tiling works, the effect can be minimized if the content to be pre-rendered spans multiple tiles. In that case, the browser may defer rasterization until the tiles are actually needed. Therefore,
the above needs to be used with care and always with proper benchmarking.&lt;/p&gt;
&lt;h2 id=&quot;conclusions&quot; tabindex=&quot;-1&quot;&gt;Conclusions &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-pre-rendering/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;As demonstrated in the above sections, when it comes to pre-rendering the contents to distribute the web application workload over time, the web platform gives both the official APIs to do it, as well as unofficial
means through purposeful misuse of APIs and exploitation of browser engine implementations. While this article hasn’t covered all the possibilities available, the above should serve as a good initial read with some easy-to-try
solutions that may yield surprisingly good results. However, as some of the ideas mentioned above are very much browser-engine-specific, they should be used with extra care and with the limitations (lack of portability)
in mind.&lt;/p&gt;
&lt;p&gt;As the web platform constantly evolves, the pool of pre-rendering techniques and tricks should keep evolving as well. Also, as more and more web applications are used on embedded devices, more pressure should be
put on the specification, which should yield more APIs targeting the low-end devices in the future. With that in mind, it’s recommended for the readers to stay up-to-date with the latest specification and
perhaps even to get involved if some interesting use cases turn out to be worth introducing new APIs for.&lt;/p&gt;
</content>
	</entry>
	
	<entry>
		<title>Tracking WebKit&#39;s memory allocations with Malloc Heap Breakdown</title>
		<link href="https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/"/>
		<updated>2025-10-24T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/</id>
		<content type="html">&lt;p&gt;One of the main constraints that embedded platforms impose on the browsers is a very limited memory. Combined with the fact that embedded web applications tend to run actively for days, weeks, or even longer,
it’s not hard to imagine how important the proper memory management within the browser engine is in such use cases. In fact, WebKit and WPE in particular receive numerous memory-related fixes and improvements every year.
Before making any changes, however, the areas to fix/improve need to be narrowed down first. Like any C++ application, WebKit memory can be profiled using a variety of industry-standard tools. Although such well-known
tools are really useful in the majority of use cases, they have their limits that manifest themselves when applied on production-grade embedded systems in conjunction with long-running web applications.
In such cases, a very useful tool is a debug-only feature of WebKit itself called &lt;strong&gt;malloc heap breakdown&lt;/strong&gt;, which this article describes.&lt;/p&gt;
&lt;h2 id=&quot;industry-standard-memory-profilers&quot; tabindex=&quot;-1&quot;&gt;Industry-standard memory profilers &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;When it comes to profiling the memory of applications on Linux systems, the 2 tools that usually stand out are &lt;strong&gt;Massif (Valgrind)&lt;/strong&gt; and &lt;strong&gt;Heaptrack&lt;/strong&gt;.&lt;/p&gt;
&lt;h4 id=&quot;massif-valgrind&quot; tabindex=&quot;-1&quot;&gt;Massif (Valgrind) &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;&lt;a href=&quot;https://valgrind.org/docs/manual/ms-manual.html&quot;&gt;Massif&lt;/a&gt; is a heap profiler that comes as part of the &lt;a href=&quot;https://valgrind.org/&quot;&gt;Valgrind&lt;/a&gt; suite. As its documentation states:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It measures how much heap memory your program uses. This includes both the useful space, and the extra bytes allocated for book-keeping and alignment purposes. It can also measure the size of your program’s stack(s),
although it does not do so by default.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Using &lt;strong&gt;Massif&lt;/strong&gt; with WebKit is very straightforward and boils down to a single command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Malloc=1 valgrind --tool=massif --trace-children=yes WebKitBuild/GTK/Debug/bin/MiniBrowser &#39;&amp;lt;URL&amp;gt;&#39;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;Malloc=1&lt;/code&gt; environment variable set above is necessary to instruct WebKit to enable debug heaps that use the system malloc allocator.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Given some results are generated, the memory usage over time can be visualized using &lt;a href=&quot;https://github.com/KDE/massif-visualizer&quot;&gt;massif-visualizer&lt;/a&gt; utility. An example of such a visualization is presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/260Myr22BA-2054.avif 2054w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/260Myr22BA-2054.webp 2054w&quot;&gt;&lt;img alt=&quot;Memory usage over time in massif-visualizer.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/260Myr22BA-2054.png&quot; width=&quot;2054&quot; height=&quot;1170&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;While &lt;strong&gt;Massif&lt;/strong&gt; has been widely adopted and used for many years now, from the very beginning, it suffered from a few significant downsides.&lt;/p&gt;
&lt;p&gt;First of all, the way &lt;strong&gt;Massif&lt;/strong&gt; instruments the profiled application introduces significant overhead that may slow down the application by up to 2 orders of magnitude. In some cases, such overhead makes it simply unusable.&lt;/p&gt;
&lt;p&gt;The other important problem is that &lt;strong&gt;Massif&lt;/strong&gt; is snapshot-based, and hence, the level of detail is not ideal.&lt;/p&gt;
&lt;h4 id=&quot;heaptrack&quot; tabindex=&quot;-1&quot;&gt;Heaptrack &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/KDE/heaptrack&quot;&gt;Heaptrack&lt;/a&gt; is a modern heap profiler developed as part of &lt;a href=&quot;https://kde.org/&quot;&gt;KDE&lt;/a&gt;. The below is its description from the git repository:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Heaptrack traces all memory allocations and annotates these events with stack traces. Dedicated analysis tools then allow you to interpret the heap memory profile to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;find hotspots that need to be optimized to reduce the memory footprint of your application&lt;/li&gt;
&lt;li&gt;find memory leaks, i.e. locations that allocate memory which is never deallocated&lt;/li&gt;
&lt;li&gt;find allocation hotspots, i.e. code locations that trigger a lot of memory allocation calls&lt;/li&gt;
&lt;li&gt;find temporary allocations, which are allocations that are directly followed by their deallocation&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;At first glance, &lt;strong&gt;Heaptrack&lt;/strong&gt; resembles &lt;strong&gt;Massif&lt;/strong&gt;. However, a closer look at the architecture and features shows that it’s much more than the latter. While it’s fair to say it’s a bit similar, in fact, it is a
significant progression.&lt;/p&gt;
&lt;p&gt;Usage of &lt;strong&gt;Heaptrack&lt;/strong&gt; to profile WebKit is also very simple. At the moment of writing, the most suitable way to use it is to attach to a certain running WebKit process using the following command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;heaptrack -p &amp;lt;PID&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;while WebKit needs to be run with the system malloc, just like in the &lt;strong&gt;Massif&lt;/strong&gt; case:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;WEBKIT_DISABLE_SANDBOX_THIS_IS_DANGEROUS=1 Malloc=1 WebKitBuild/GTK/Debug/bin/MiniBrowser &#39;&amp;lt;URL&amp;gt;&#39;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;If profiling e.g. the web content process startup is essential, it’s also recommended to use &lt;code&gt;WEBKIT2_PAUSE_WEB_PROCESS_ON_LAUNCH=1&lt;/code&gt;, which adds a 30 s delay to the process startup.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once the profiling session is finished, the recording can be analyzed using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;heaptrack --analyze &amp;lt;RECORDING&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The utility opened by the above command shows various things, such as the memory consumption over time:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qbRY2709hH-2550.avif 2550w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qbRY2709hH-2550.webp 2550w&quot;&gt;&lt;img alt=&quot;Heaptrack view of memory consumption over time.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/qbRY2709hH-2550.png&quot; width=&quot;2550&quot; height=&quot;1287&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;flame graphs of memory allocations with respect to certain functions in the code:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/6x5DTZiYJU-2486.avif 2486w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/6x5DTZiYJU-2486.webp 2486w&quot;&gt;&lt;img alt=&quot;Heaptrack flame graph of memory allocations.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/6x5DTZiYJU-2486.png&quot; width=&quot;2486&quot; height=&quot;1274&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;etc.&lt;/p&gt;
&lt;p&gt;As &lt;strong&gt;Heaptrack&lt;/strong&gt; records every allocation and deallocation, the data it gathers is very precise and detailed, especially when accompanied by stack traces arranged into flame graphs. Also, as &lt;strong&gt;Heaptrack&lt;/strong&gt;
instruments the application differently than e.g. &lt;strong&gt;Massif&lt;/strong&gt;, it’s usually much faster: it slows the profiled application down by only up to one order of magnitude.&lt;/p&gt;
&lt;h4 id=&quot;shortcomings-on-embedded-systems&quot; tabindex=&quot;-1&quot;&gt;Shortcomings on embedded systems &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Although memory profilers such as the above are great for everyday use, they have the following limitations on embedded platforms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;they significantly slow down the profiled application — especially on low-end devices,&lt;/li&gt;
&lt;li&gt;they effectively cannot be run for a longer period of time such as days or weeks, due to memory consumption,&lt;/li&gt;
&lt;li&gt;they are not always provided in the images — and hence require additional setup,&lt;/li&gt;
&lt;li&gt;they may not be buildable out of the box on certain architectures — thus requiring extra patching.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While the above limitations are not always a problem, usually at least one of them is. What’s worse, that limitation often turns into a blocker. For example, if the target device is very short on memory,
it may be basically impossible to run anything extra beyond the browser. Another example is a situation where the slowdown caused by the profiler changes the application’s behavior, so that a problem
that originally reproduced 100% of the time no longer reproduces at all.&lt;/p&gt;
&lt;h2 id=&quot;malloc-heap-breakdown-in-webkit&quot; tabindex=&quot;-1&quot;&gt;Malloc heap breakdown in WebKit &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Profiling the memory of WebKit while addressing the above problems points towards a solution that does not involve any extra tools, i.e. instrumenting WebKit itself. Normally, adding such instrumentation to a C++ application
means a lot of work. Fortunately, in the case of WebKit, all that work is already done and can be easily enabled by using the &lt;strong&gt;Malloc heap breakdown&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In a nutshell, &lt;strong&gt;Malloc heap breakdown&lt;/strong&gt; is a debug-only feature that enables memory allocation tracking within WebKit itself. Since it’s built into WebKit, it’s very lightweight and very easy to build, as it’s just about setting
the &lt;code&gt;ENABLE_MALLOC_HEAP_BREAKDOWN&lt;/code&gt; build option. Internally, when the feature is enabled, WebKit switches to using debug heaps that use system malloc along with the &lt;a href=&quot;https://www.manpagez.com/man/3/malloc_zone_free/&quot;&gt;malloc zone API&lt;/a&gt;
to mark objects of certain classes as belonging to different heap zones and thus allowing one to track the allocation sizes of such zones.&lt;/p&gt;
&lt;p&gt;As the &lt;a href=&quot;https://www.manpagez.com/man/3/malloc_zone_free/&quot;&gt;malloc zone API&lt;/a&gt; is specific to BSD-like OSes, the actual implementations (and usages) in WebKit have to be considered separately for Apple and non-Apple ports.&lt;/p&gt;
&lt;h4 id=&quot;malloc-heap-breakdown-on-apple-ports&quot; tabindex=&quot;-1&quot;&gt;Malloc heap breakdown on Apple ports &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;Malloc heap breakdown&lt;/strong&gt; was originally designed only with Apple ports in mind, for two reasons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;a href=&quot;https://www.manpagez.com/man/3/malloc_zone_free/&quot;&gt;malloc zone API&lt;/a&gt; is provided by virtually all platforms that Apple ports integrate with.&lt;/li&gt;
&lt;li&gt;macOS platforms provide a great utility called &lt;code&gt;footprint&lt;/code&gt; that allows one to inspect per-zone memory statistics for a given process.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Given the above, usage of &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; with Apple ports is very smooth and as simple as building WebKit with the &lt;code&gt;ENABLE_MALLOC_HEAP_BREAKDOWN&lt;/code&gt; build option and running on macOS while using the &lt;code&gt;footprint&lt;/code&gt; utility:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Footprint is a macOS specific tool that allows the developer to check memory usage across regions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For more details, one should refer to the &lt;a href=&quot;https://docs.webkit.org/Infrastructure/MemoryInspection.html&quot;&gt;official documentation page&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id=&quot;malloc-heap-breakdown-on-non-apple-ports&quot; tabindex=&quot;-1&quot;&gt;Malloc heap breakdown on non-Apple ports &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Since non-Apple WebKit ports are mostly built and run on non-BSD-like systems, it’s safe to assume the &lt;a href=&quot;https://www.manpagez.com/man/3/malloc_zone_free/&quot;&gt;malloc zone API&lt;/a&gt; is not offered to such ports by the system itself.
Because of that, for many years, &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; was only available for Apple ports.&lt;/p&gt;
&lt;p&gt;Fortunately, with the changes introduced in 2025, such as: &lt;a href=&quot;https://commits.webkit.org/294667@main&quot;&gt;294667@main&lt;/a&gt; (+ fix &lt;a href=&quot;https://commits.webkit.org/294848@main&quot;&gt;294848@main&lt;/a&gt;), &lt;a href=&quot;https://commits.webkit.org/301702@main&quot;&gt;301702@main&lt;/a&gt;, and improvements
such as:
&lt;a href=&quot;https://commits.webkit.org/294848@main&quot;&gt;294848@main&lt;/a&gt;,
&lt;a href=&quot;https://commits.webkit.org/299555@main&quot;&gt;299555@main&lt;/a&gt;,
&lt;a href=&quot;https://commits.webkit.org/301695@main&quot;&gt;301695@main&lt;/a&gt;,
&lt;a href=&quot;https://commits.webkit.org/301709@main&quot;&gt;301709@main&lt;/a&gt;,
&lt;a href=&quot;https://commits.webkit.org/301712@main&quot;&gt;301712@main&lt;/a&gt;,
&lt;a href=&quot;https://commits.webkit.org/301839@main&quot;&gt;301839@main&lt;/a&gt;,
&lt;a href=&quot;https://commits.webkit.org/301861@main&quot;&gt;301861@main&lt;/a&gt;,
the &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; now also integrates with non-Apple ports and &lt;strong&gt;is stable as of &lt;code&gt;main@a235408c2b4eb12216d519e996f70828b9a45e19&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The idea behind the integration for non-Apple ports is a simple WebKit-internal library that ships a fake &lt;code&gt;&amp;lt;malloc/malloc.h&amp;gt;&lt;/code&gt; header together with &lt;code&gt;malloc_zone_*()&lt;/code&gt; function implementations
that act as proxy calls to &lt;code&gt;malloc()&lt;/code&gt;, &lt;code&gt;calloc()&lt;/code&gt;, &lt;code&gt;realloc()&lt;/code&gt; etc., along with a tracking mechanism that keeps references to memory chunks. Such an approach gathers all the information needed for later reporting.&lt;/p&gt;
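&lt;p&gt;To make the bookkeeping idea more tangible, below is a toy Python model of such a tracking mechanism. It is only a sketch and not WebKit’s actual code: the &lt;code&gt;ZoneTracker&lt;/code&gt; class, its method names, and the addresses are hypothetical, and the real library is implemented in C++ around the system allocator.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import threading
from collections import defaultdict


class ZoneTracker:
    """Toy model of per-zone bookkeeping (hypothetical, not WebKit code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chunks = defaultdict(dict)  # zone name: {address: size}

    def on_malloc(self, zone, address, size):
        # A real proxy would call malloc() first and record the returned pointer.
        with self._lock:
            self._chunks[zone][address] = size

    def on_free(self, zone, address):
        # A real proxy would forget the chunk and then call free().
        with self._lock:
            self._chunks[zone].pop(address, None)

    def report(self):
        """Return {zone: (chunk count, total bytes)} for periodic reporting."""
        with self._lock:
            return {zone: (len(chunks), sum(chunks.values()))
                    for zone, chunks in self._chunks.items()}


tracker = ZoneTracker()
tracker.on_malloc("StringImpl", 0x1000, 48)
tracker.on_malloc("StringImpl", 0x2000, 64)
tracker.on_malloc("Vector", 0x3000, 16)
tracker.on_free("StringImpl", 0x1000)
print(tracker.report())
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The model keeps one size entry per live chunk, so both the number of chunks and the number of bytes per zone can be derived at any time, at the cost of a lock and some extra memory per allocation.&lt;/p&gt;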
&lt;p&gt;At the time of writing, there are two methods of periodically reporting the memory usage statistics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;printing to standard output,&lt;/li&gt;
&lt;li&gt;reporting to &lt;a href=&quot;https://wiki.gnome.org/Apps/Sysprof&quot;&gt;sysprof&lt;/a&gt; as counters.&lt;/li&gt;
&lt;/ul&gt;
&lt;h5&gt;Periodic reporting to standard output&lt;/h5&gt;
&lt;p&gt;By default, when WebKit is built with &lt;code&gt;ENABLE_MALLOC_HEAP_BREAKDOWN&lt;/code&gt;, the heap breakdown is printed to the standard output every few seconds for each process. The interval can be tweaked by setting the &lt;code&gt;WEBKIT_MALLOC_HEAP_BREAKDOWN_LOG_INTERVAL=&amp;lt;SECONDS&amp;gt;&lt;/code&gt;
environment variable.&lt;/p&gt;
&lt;p&gt;The results have a structure similar to the one below:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; MHB: &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; PID &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;Zone name&quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;#chunks | #bytes | {&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;ExecutableMemoryHandle&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;32&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;AssemblerData&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;192&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;VectorBuffer&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;37&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;16184&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;StringImpl&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;103&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5146&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;WeakPtrImplBase&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;17&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;272&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token 
string&quot;&gt;&quot;HashTable&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;37&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;9408&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;Vector&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;16&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;EmbeddedFixedVector&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;32&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;BloomFilter&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;65536&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;CStringBuffer&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;86&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;Default Zone&quot;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token number&quot;&gt;402339&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; MHB: grand total bytes allocated: &lt;span class=&quot;token number&quot;&gt;9690&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Given the per-zone allocation statistics, it’s easy to narrow down unusual usage patterns manually. An example of a successful investigation is presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/jo-e1cbuvQ-1285.avif 1285w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/jo-e1cbuvQ-1285.webp 1285w&quot;&gt;&lt;img alt=&quot;Example of an investigation based on per-zone allocation statistics.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/jo-e1cbuvQ-1285.png&quot; width=&quot;1285&quot; height=&quot;545&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;Moreover, the data presented can be processed either manually or using scripts to create memory usage charts that span the whole application lifetime, e.g. hours (20+ in the chart below), days, or even longer:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/N8ftE_7zzb-1128.avif 1128w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/N8ftE_7zzb-1128.webp 1128w&quot;&gt;&lt;img alt=&quot;Memory usage chart spanning over 20 hours.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/N8ftE_7zzb-1128.png&quot; width=&quot;1128&quot; height=&quot;686&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
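&lt;p&gt;As an illustration of such script-based processing, the sketch below turns the periodic reports into per-zone time series that can be fed to any plotting tool. It assumes the log lines look exactly like the sample report above; the &lt;code&gt;parse_mhb_log&lt;/code&gt; helper is hypothetical, and the values in the second report of the sample are made up.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import re
from collections import defaultdict

# One data row of a report, e.g.: 402339 "VectorBuffer" 37 16184
ROW = re.compile(r'^(\d+) "([^"]+)" (\d+) (\d+)$')


def parse_mhb_log(lines):
    """Return {(pid, zone): [bytes, ...]} with one value per periodic report."""
    series = defaultdict(list)
    for line in lines:
        match = ROW.match(line.strip())
        if match:  # header and grand-total lines do not match and are skipped
            pid, zone, _chunks, size = match.groups()
            series[(int(pid), zone)].append(int(size))
    return dict(series)


# Two consecutive reports; the second one uses made-up values.
sample = [
    '402339 MHB: | PID | "Zone name" | #chunks | #bytes | {',
    '402339 "VectorBuffer" 37 16184',
    '402339 "StringImpl" 103 5146',
    '402339 } MHB: grand total bytes allocated: 9690',
    '402339 MHB: | PID | "Zone name" | #chunks | #bytes | {',
    '402339 "VectorBuffer" 40 17000',
    '402339 "StringImpl" 103 5146',
    '402339 } MHB: grand total bytes allocated: 9690',
]
series = parse_mhb_log(sample)
print(series[(402339, "VectorBuffer")])
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keying the series by both PID and zone name keeps the reports of different WebKit processes apart, as each process prints its own breakdown.&lt;/p&gt;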
&lt;h5&gt;Periodic reporting to sysprof&lt;/h5&gt;
&lt;p&gt;The other reporting mechanism currently supported is reporting periodically to &lt;a href=&quot;https://wiki.gnome.org/Apps/Sysprof&quot;&gt;sysprof&lt;/a&gt; as counters. In short, &lt;a href=&quot;https://wiki.gnome.org/Apps/Sysprof&quot;&gt;sysprof&lt;/a&gt; is a modern system-wide profiling tool
that already integrates very well with the non-Apple ports of WebKit.&lt;/p&gt;
&lt;p&gt;For &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; to report to &lt;strong&gt;sysprof&lt;/strong&gt;, the WebKit browser needs to be run under the profiler, e.g. using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sysprof-cli -f -- &amp;lt;BROWSER_COMMAND&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and &lt;strong&gt;sysprof&lt;/strong&gt; has to be as recent a version as possible.&lt;/p&gt;
&lt;p&gt;With the above, the memory usage statistics can then be inspected using the &lt;code&gt;sysprof&lt;/code&gt; utility and look like the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/zFYfgfacYN-2541.avif 2541w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/zFYfgfacYN-2541.webp 2541w&quot;&gt;&lt;img alt=&quot;Malloc heap breakdown memory statistics shown in sysprof.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/zFYfgfacYN-2541.png&quot; width=&quot;2541&quot; height=&quot;1283&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;In the case of &lt;strong&gt;sysprof&lt;/strong&gt;, the memory statistics are just a minor addition to other powerful features that were well described &lt;a href=&quot;https://feaneron.com/2024/07/12/profiling-a-web-engine/&quot;&gt;in this blog post&lt;/a&gt; from Georges.&lt;/p&gt;
&lt;h4 id=&quot;caveats&quot; tabindex=&quot;-1&quot;&gt;Caveats &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;While &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; is very useful in some use cases — especially on embedded systems — there are a few problems with it.&lt;/p&gt;
&lt;p&gt;First of all, compilation with &lt;code&gt;-DENABLE_MALLOC_HEAP_BREAKDOWN=ON&lt;/code&gt; is not guarded by any continuous integration bots; therefore, &lt;strong&gt;compilation issues are to be expected&lt;/strong&gt; on the latest WebKit &lt;code&gt;main&lt;/code&gt;. Fortunately, fixing the problems
is usually straightforward. For a reference on what typically causes compilation problems, one should refer to &lt;a href=&quot;https://commits.webkit.org/299555@main&quot;&gt;299555@main&lt;/a&gt;, which contains a wide variety of fixes.&lt;/p&gt;
&lt;p&gt;The second problem is that &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; uses WebKit’s debug heaps, and hence the memory usage patterns may be different just because system malloc is used.&lt;/p&gt;
&lt;p&gt;The third and final problem is that the &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; integration for non-Apple ports introduces some overhead, as allocations need to lock/unlock a mutex and the statistics themselves are stored in memory as well.&lt;/p&gt;
&lt;h4 id=&quot;opportunities&quot; tabindex=&quot;-1&quot;&gt;Opportunities &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Although &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; can be considered fairly constrained, in the case of non-Apple ports, it gives some additional possibilities that are worth mentioning.&lt;/p&gt;
&lt;p&gt;Because on non-Apple ports, the custom library is used to track allocations (as mentioned at the beginning of the &lt;a href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;Malloc heap breakdown on non-Apple ports&lt;/a&gt; section), it’s very easy
to add more sophisticated tracking/debugging/reporting capabilities. The only file that requires changes in such a case is:
&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WTF/wtf/malloc_heap_breakdown/main.cpp&quot;&gt;Source/WTF/wtf/malloc_heap_breakdown/main.cpp&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Some examples of custom modifications include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;adding different reporting mechanisms — e.g. writing to a file, or to some other tool,&lt;/li&gt;
&lt;li&gt;reporting memory usage with more details — e.g. reporting the per-memory-chunk statistics,&lt;/li&gt;
&lt;li&gt;dumping raw memory bytes — e.g. when some allocations are &lt;em&gt;suspicious&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;altering memory in-place — e.g. to simulate memory corruption.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;summary&quot; tabindex=&quot;-1&quot;&gt;Summary &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/tracking-webkit-s-memory-allocations-with-malloc-heap-breakdown/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;While the presented &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; mechanism is a rather poor approximation of what industry standard tools offer, the main benefit of it is that it’s built into WebKit, and that in some rare use-cases (especially on
embedded platforms), it’s the only way to perform any reasonable profiling.&lt;/p&gt;
&lt;p&gt;In general, as a rule of thumb, it’s not recommended to use &lt;strong&gt;malloc heap breakdown&lt;/strong&gt; unless all other methods have failed. In that sense, &lt;strong&gt;it should be considered a last resort approach&lt;/strong&gt;. With that in mind, &lt;strong&gt;malloc heap breakdown&lt;/strong&gt;
can be seen as a nice mechanism complementing other tools in the toolbox.&lt;/p&gt;
</content>
	</entry>
	
	<entry>
		<title>WPE performance considerations: DOM tree</title>
		<link href="https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/"/>
		<updated>2025-09-26T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/</id>
		<content type="html">&lt;p&gt;Designing performant web applications is not trivial in general. Nowadays, as many companies decide to use the web platform on embedded devices, the problem of designing performant web applications becomes even more complicated.
Typical embedded devices are orders of magnitude slower than desktop-class ones. Moreover, the proportion between CPU and GPU power is commonly different as well. This usually results in unexpected performance bottlenecks
when the web applications designed with desktop-class devices in mind are being executed on embedded environments.&lt;/p&gt;
&lt;p&gt;In order to help web developers approach the difficulties that the usage of the web platform on embedded devices may bring, this blog post initiates a series of articles covering various performance-related aspects
in the context of WPE WebKit usage on embedded devices. The coverage in general will include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;introducing the demo web applications dedicated to showcasing use cases of a given aspect,&lt;/li&gt;
&lt;li&gt;benchmarking and profiling the WPE WebKit performance using the above demos,&lt;/li&gt;
&lt;li&gt;discussing the causes for the performance measured,&lt;/li&gt;
&lt;li&gt;inferring some general pieces of advice and rules of thumb based on the results.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This article, in particular, discusses the overhead of nodes in the DOM tree when it comes to layout. It does that primarily by investigating the impact of &lt;strong&gt;idle nodes&lt;/strong&gt; that introduce the least overhead and hence
may serve as a lower bound for any general considerations. With the data presented in this article, it should be clear how the DOM tree size/depth scales in the case of embedded devices.&lt;/p&gt;
&lt;h2 id=&quot;dom-tree&quot; tabindex=&quot;-1&quot;&gt;DOM tree &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Historically, the &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model&quot;&gt;DOM&lt;/a&gt; trees emerging from the usual web page designs were rather limited in size and fairly shallow. This was the case as there were
no reasons for them to be excessively large unless the web page itself had a very complex UI. Nowadays, not only are the DOM trees much bigger and deeper, but they also tend to contain &lt;strong&gt;idle nodes&lt;/strong&gt; that artificially increase
the size/depth of the tree. The &lt;strong&gt;idle nodes&lt;/strong&gt; are the nodes in the DOM that are active yet do not contribute to any visual effects. Such nodes are usually a side effect of using various frameworks and approaches that
conceptualize &lt;strong&gt;components&lt;/strong&gt; or &lt;strong&gt;services&lt;/strong&gt; as nodes, which then participate in various kinds of processing utilizing JavaScript. Other than &lt;strong&gt;idle nodes&lt;/strong&gt;, the DOM trees are usually bigger and deeper nowadays, as there
are simply more possibilities that emerged with the introduction of modern APIs such as &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_shadow_DOM&quot;&gt;Shadow DOM&lt;/a&gt;,
&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_anchor_positioning&quot;&gt;Anchor positioning&lt;/a&gt;, &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Popover_API&quot;&gt;Popover&lt;/a&gt;, and the like.&lt;/p&gt;
&lt;p&gt;In the context of web platform usage on embedded devices, the natural consequence of the above is that web designers require more knowledge on how the particular browser performance scales with the DOM tree size and shape.
Before considering embedded devices, however, it’s worth taking a brief look at how various web engines scale on desktop with the DOM tree growing in depth.&lt;/p&gt;
&lt;h2 id=&quot;desktop-considerations&quot; tabindex=&quot;-1&quot;&gt;Desktop considerations &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;To measure the impact of the DOM tree depth on the performance, the &lt;a href=&quot;https://scony.github.io/web-examples/dom/random-number-changing-in-the-tree.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&quot;&gt;random-number-changing-in-the-tree.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&lt;/a&gt;
demo can be used to perform a series of experiments with different parameters.&lt;/p&gt;
&lt;p&gt;In short, the above demo measures the average duration of a benchmark function run, where the run does the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;changes the text of a single DOM element to a random number,&lt;/li&gt;
&lt;li&gt;forces a full tree layout.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Moreover, the demo allows one to set 0 or more parent &lt;strong&gt;idle nodes&lt;/strong&gt; for the node holding text, so that the layout must consider those &lt;strong&gt;idle nodes&lt;/strong&gt; as well.&lt;/p&gt;
&lt;p&gt;The parameters used in the URL above mean the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;vr=0&lt;/code&gt; — the results are reported to the console. Alternatively (&lt;code&gt;vr=1&lt;/code&gt;), at the end of benchmarking (~23 seconds), the result appears on the web page itself.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ms=1&lt;/code&gt; — the results are reported in “milliseconds per run”. Alternatively (&lt;code&gt;ms=0&lt;/code&gt;), “runs per second” are reported instead.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;dv=0&lt;/code&gt; — the &lt;strong&gt;idle nodes&lt;/strong&gt; use the &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt; tag. Alternatively (&lt;code&gt;dv=1&lt;/code&gt;), the &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; tag is used instead.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;ns=N&lt;/code&gt;&lt;/strong&gt; — &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; &lt;strong&gt;idle nodes&lt;/strong&gt; are added.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The idea behind the experiment is to check how much overhead is added as the number of extra &lt;strong&gt;idle nodes&lt;/strong&gt; (&lt;code&gt;ns=N&lt;/code&gt;) in the DOM tree increases. Since the browsers used in the experiments cannot be compared fairly for various reasons,
the results are presented in relative terms for each browser separately instead of as concrete numbers in milliseconds. It means that the benchmarking result for &lt;code&gt;ns=0&lt;/code&gt; serves as a baseline, and other results show the duration
relative to that baseline result, where, e.g. 300% means 3 times the baseline duration.&lt;/p&gt;
&lt;p&gt;The results for a few mainstream browsers/browser engines (&lt;strong&gt;WebKit GTK MiniBrowser [09.09.2025]&lt;/strong&gt;, &lt;strong&gt;Chromium 140.0.7339.127&lt;/strong&gt;, and &lt;strong&gt;Firefox 142.0&lt;/strong&gt;) and a few experimental ones (&lt;strong&gt;Servo [04.07.2024]&lt;/strong&gt; and &lt;strong&gt;Ladybird [30.06.2024]&lt;/strong&gt;)
are presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/aK07eCrhoz-1148.avif 1148w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/aK07eCrhoz-1148.webp 1148w&quot;&gt;&lt;img alt=&quot;Idle nodes overhead on mainstream browsers.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/aK07eCrhoz-1148.png&quot; width=&quot;1148&quot; height=&quot;771&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;As the results show, trends among all the browsers are very close to linear. It means that the overhead is very easy to assess, as usually &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; times more &lt;strong&gt;idle nodes&lt;/strong&gt; will result in &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt;
times the overhead.
Moreover, up until 100-200 extra &lt;strong&gt;idle nodes&lt;/strong&gt; in the tree, the overhead trends are very similar in all the browsers except for experimental &lt;strong&gt;Ladybird&lt;/strong&gt;. That in turn means that even for big web applications, it’s safe to
assume the overhead among the browsers will be very much the same. Finally, past the 200 extra &lt;strong&gt;idle nodes&lt;/strong&gt; threshold, the overhead across browsers diverges. It’s very likely due to the fact that the browsers are not
optimizing such cases as a result of a lack of real-world use cases.&lt;/p&gt;
&lt;p&gt;All in all, the conclusion is that on desktop, only very large / specific web applications should be cautious about the overhead of nodes, as modern web browsers/engines are very well optimized for handling substantial amounts
of nodes in the DOM.&lt;/p&gt;
&lt;h2 id=&quot;embedded-device-considerations&quot; tabindex=&quot;-1&quot;&gt;Embedded device considerations &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;When it comes to the embedded devices, the above conclusions are no longer applicable. To demonstrate that, a minimal browser utilizing &lt;strong&gt;WPE WebKit&lt;/strong&gt; is used to run the demo from the previous section both on &lt;strong&gt;desktop&lt;/strong&gt; and
&lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt; platforms. The latter is a popular choice for embedded applications as it has quite an interesting set of features while still having strong specifications, which may be compared to those of &lt;strong&gt;Raspberry Pi 5&lt;/strong&gt;.
The results are presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/tNFgwxADqJ-1148.avif 1148w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/tNFgwxADqJ-1148.webp 1148w&quot;&gt;&lt;img alt=&quot;Idle nodes overhead compared between desktop and embedded devices.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/tNFgwxADqJ-1148.png&quot; width=&quot;1148&quot; height=&quot;771&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;This time, the Y axis presents the duration (in milliseconds) of a single benchmark run, which makes it very easy to reason about overhead. As the results show, in the case of the desktop, 100 extra &lt;strong&gt;idle nodes&lt;/strong&gt; in the DOM
introduce barely noticeable overhead. On the embedded platform, on the other hand, even without any extra &lt;strong&gt;idle nodes&lt;/strong&gt;, changing and laying out the text already takes around 0.6 ms. With 10 extra idle nodes, this
duration increases to 0.75 ms, thus yielding 0.15 ms of overhead. With 100 extra &lt;strong&gt;idle nodes&lt;/strong&gt;, the overhead grows to 1.3 ms.&lt;/p&gt;
&lt;p&gt;One may question whether 1.3 ms is much, but for an application that e.g. renders at 60 FPS, the
time at the application’s disposal each frame is below 16.67 ms, and 1.3 ms is ~8% of that, which is very considerable. Similarly, for the application to be perceived as responsive, the input-to-output latency should usually
be under 20 ms. Again, 1.3 ms is a significant overhead in such a scenario.&lt;/p&gt;
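&lt;p&gt;The arithmetic behind the ~8% figure can be reproduced with a minimal sketch. The helper name &lt;code&gt;frame_budget_share&lt;/code&gt; is hypothetical and introduced only for illustration; the overhead values are the ones measured above on &lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt;:&lt;/p&gt;

```python
# Hypothetical helper: share of a per-frame time budget consumed by a fixed overhead.
def frame_budget_share(overhead_ms, fps):
    frame_budget_ms = 1000.0 / fps  # roughly 16.67 ms at 60 FPS
    return overhead_ms / frame_budget_ms

# Overheads reported above for 10 and 100 extra idle nodes.
for nodes, overhead_ms in ((10, 0.15), (100, 1.3)):
    share = frame_budget_share(overhead_ms, 60)
    print(f"{nodes} extra idle nodes: {share:.1%} of the 60 FPS frame budget")
```

&lt;p&gt;The same helper can be reused with a different target frame rate, or with numbers obtained from benchmarking a different device.&lt;/p&gt;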
&lt;p&gt;Given the above, it’s safe to state that &lt;strong&gt;20 extra idle nodes&lt;/strong&gt; should be considered the safe maximum for embedded devices in general. In the case of low-end embedded devices, i.e. ones comparable to &lt;strong&gt;Raspberry Pi 1 and 2&lt;/strong&gt;,
the maximum should be even lower, but proper benchmarking is required to come up with concrete numbers.&lt;/p&gt;
&lt;h4 id=&quot;inline-vs-block&quot; tabindex=&quot;-1&quot;&gt;Inline vs block &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;While the previous subsection demonstrated that on embedded devices, adding extra &lt;strong&gt;idle nodes&lt;/strong&gt; as parents must usually be done in a responsible way, it’s worth examining if there are nuances that need to be considered as
well.&lt;/p&gt;
&lt;p&gt;The first matter that one may wonder about is whether there’s any difference between the overhead of &lt;strong&gt;idle nodes&lt;/strong&gt; being &lt;strong&gt;inlines&lt;/strong&gt; (&lt;code&gt;display: inline&lt;/code&gt;) or &lt;strong&gt;blocks&lt;/strong&gt; (&lt;code&gt;display: block&lt;/code&gt;). The intuition here may be that, as &lt;strong&gt;idle nodes&lt;/strong&gt;
have no visual impact on anything, the overhead should be similar.&lt;/p&gt;
&lt;p&gt;To verify the above, the demo from the &lt;a href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;Desktop considerations&lt;/a&gt; section can be used, with the &lt;code&gt;dv&lt;/code&gt; parameter controlling whether the extra &lt;strong&gt;idle nodes&lt;/strong&gt; should be &lt;strong&gt;blocks&lt;/strong&gt; (1, &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;) or &lt;strong&gt;inlines&lt;/strong&gt; (0, &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt;).
The results from such experiments — again, executed on &lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt; — are presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/KQAm5n5lI3-1148.avif 1148w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/KQAm5n5lI3-1148.webp 1148w&quot;&gt;&lt;img alt=&quot;Comparison of overhead of idle nodes being inline or block elements.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/KQAm5n5lI3-1148.png&quot; width=&quot;1148&quot; height=&quot;771&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;While in the safe range of 0-20 extra &lt;strong&gt;idle nodes&lt;/strong&gt; the results are very similar, it’s evident that in general, &lt;strong&gt;idle nodes&lt;/strong&gt; of the &lt;strong&gt;block&lt;/strong&gt; type introduce more overhead.&lt;/p&gt;
&lt;p&gt;The reason for the above is that, for layout purposes, the handling of &lt;strong&gt;inline&lt;/strong&gt; and &lt;strong&gt;block&lt;/strong&gt; elements is very different. The &lt;strong&gt;inline&lt;/strong&gt; elements sharing the same line can be thought of as being flattened within a so-called
line box tree. The &lt;strong&gt;block&lt;/strong&gt; elements, on the other hand, have to be represented in a tree.&lt;/p&gt;
&lt;p&gt;To show the above visually, it’s interesting to compare sysprof flamegraphs of WPE WebProcess from the scenarios comprising 20 &lt;strong&gt;idle nodes&lt;/strong&gt; and using either &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt; or &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; for &lt;strong&gt;idle nodes&lt;/strong&gt;:&lt;/p&gt;
&lt;h5&gt;idle &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt; nodes:&lt;/h5&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/STVhIkrY6f-2358.avif 2358w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/STVhIkrY6f-2358.webp 2358w&quot;&gt;&lt;img alt=&quot;Sysprof flamegraph of WPE WebProcess layouting inline elements.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/STVhIkrY6f-2358.png&quot; width=&quot;2358&quot; height=&quot;1048&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;h5&gt;idle &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; nodes:&lt;/h5&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/nGvL9TuLOb-2361.avif 2361w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/nGvL9TuLOb-2361.webp 2361w&quot;&gt;&lt;img alt=&quot;Sysprof flamegraph of WPE WebProcess layouting block elements.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/nGvL9TuLOb-2361.png&quot; width=&quot;2361&quot; height=&quot;1227&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;The first flamegraph shows no clear dependency between the call stack and the number of &lt;strong&gt;idle nodes&lt;/strong&gt;. The second one, on the other hand, shows exactly the opposite: each of the extra &lt;strong&gt;idle nodes&lt;/strong&gt; is
visible as adding extra calls. Moreover, each of the extra &lt;strong&gt;idle block nodes&lt;/strong&gt; adds some overhead, thus giving the flamegraph a pyramidal shape.&lt;/p&gt;
&lt;h4 id=&quot;whitespaces&quot; tabindex=&quot;-1&quot;&gt;Whitespaces &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Another nuance worth exploring is the overhead of text nodes created because of whitespace.&lt;/p&gt;
&lt;p&gt;When the DOM tree is created from HTML, a lot of text nodes are usually created just because of whitespace. That’s because HTML usually looks like:&lt;/p&gt;
&lt;pre class=&quot;language-html&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-html&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;br&gt;    (...)&lt;br&gt;  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;rather than:&lt;/p&gt;
&lt;pre class=&quot;language-html&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-html&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;(...)&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;span&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;which makes sense from the readability point of view. From the performance point of view, however, more text nodes naturally mean more overhead. When such redundant text nodes are combined with
&lt;strong&gt;idle nodes&lt;/strong&gt;, the net outcome may be that with each extra &lt;strong&gt;idle node&lt;/strong&gt;, some overhead will be added.&lt;/p&gt;
&lt;p&gt;To verify the above hypothesis, a demo similar to the one above can be used alongside it to perform a series of experiments comparing the approach with and without redundant whitespace:
&lt;a href=&quot;https://scony.github.io/web-examples/dom/random-number-changing-in-the-tree-w-whitespaces.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&quot;&gt;random-number-changing-in-the-tree-w-whitespaces.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&lt;/a&gt;.
The only difference between the demos is that the &lt;code&gt;w-whitespaces&lt;/code&gt; one creates the DOM tree with artificial whitespace, as if it had been written as a formatted document. The comparison results
from the experiments run on &lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt; are presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/ir10dqXHFS-1148.avif 1148w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/ir10dqXHFS-1148.webp 1148w&quot;&gt;&lt;img alt=&quot;Overhead of redundant whitespace nodes.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/ir10dqXHFS-1148.png&quot; width=&quot;1148&quot; height=&quot;771&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;As the numbers suggest, the overhead of redundant text nodes is rather small on a per-idle-node basis. However, as the number of &lt;strong&gt;idle nodes&lt;/strong&gt; scales, so does the overhead. At around 100 extra &lt;strong&gt;idle nodes&lt;/strong&gt;, the
overhead is already noticeable. Therefore, a natural conclusion is that redundant text nodes should be avoided, especially as the number of nodes in the tree becomes significant.&lt;/p&gt;
&lt;h4 id=&quot;parents-vs-siblings&quot; tabindex=&quot;-1&quot;&gt;Parents vs siblings &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The last topic that deserves a closer look is whether adding &lt;strong&gt;idle nodes&lt;/strong&gt; as siblings is better than adding them as parent nodes. In theory, having extra nodes added as siblings should be better: the layout engine
will still have to consider them, yet it won’t mark them with a dirty flag and hence won’t have to lay them out.&lt;/p&gt;
&lt;p&gt;As in other cases, the above can be examined using a series of experiments run on &lt;strong&gt;NXP i.MX8M Plus&lt;/strong&gt; using the demo from the &lt;a href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;Desktop considerations&lt;/a&gt; section and comparing against either the
&lt;a href=&quot;https://scony.github.io/web-examples/dom/random-number-changing-before-siblings.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&quot;&gt;random-number-changing-before-siblings.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&lt;/a&gt;
or &lt;a href=&quot;https://scony.github.io/web-examples/dom/random-number-changing-after-siblings.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&quot;&gt;random-number-changing-after-siblings.html?vr=0&amp;amp;ms=1&amp;amp;dv=0&amp;amp;ns=0&lt;/a&gt; demo. As both of those
yield similar results, either of them can be used. The results of the comparison are depicted in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/cwVF4ew89A-1148.avif 1148w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/cwVF4ew89A-1148.webp 1148w&quot;&gt;&lt;img alt=&quot;Overhead of idle nodes added as parents vs as siblings.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/cwVF4ew89A-1148.png&quot; width=&quot;1148&quot; height=&quot;771&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;The experiment results corroborate the theoretical considerations made above — &lt;strong&gt;idle nodes&lt;/strong&gt; added as siblings indeed introduce less layout overhead. The savings are not very large from a single &lt;strong&gt;idle node&lt;/strong&gt; perspective,
but once scaled enough, they are beneficial enough to justify DOM tree re-organization (if possible).&lt;/p&gt;
&lt;h2 id=&quot;conclusions&quot; tabindex=&quot;-1&quot;&gt;Conclusions &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/wpe-performance-considerations-dom-tree/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The above experiments mostly emphasized the &lt;strong&gt;idle nodes&lt;/strong&gt;; however, the results can be extrapolated to regular nodes in the DOM tree. With that in mind, the overall conclusion from the experiments in the former sections
is that &lt;strong&gt;DOM tree size and shape have a measurable impact on web application performance on embedded devices&lt;/strong&gt;. Therefore, web developers should try to optimize it as early as possible and follow the general rules of thumb that
can be derived from this article:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Nodes are not free, so they should always be added with extra care.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Idle nodes should be limited to ~20 on mid-end and ~10 on low-end embedded devices.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Idle nodes should be inline elements, not block ones.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Redundant whitespaces should be avoided — especially with idle nodes.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Nodes (especially idle ones) should be added as siblings.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Although the above serves as great guidance, for better results, it’s recommended to do proper browser benchmarking on a given target embedded device, as long as it’s feasible.&lt;/p&gt;
&lt;p&gt;Also, following the above set of rules is not recommended on &lt;strong&gt;desktop-class devices&lt;/strong&gt;, as in that case, it can be considered a premature optimization. Unless the particular web application yields an exceptionally large
DOM tree, the gains won’t be worth the time spent optimizing.&lt;/p&gt;
</content>
	</entry>
	
	<entry>
		<title>The problem of storing the damage</title>
		<link href="https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/"/>
		<updated>2025-09-05T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/</id>
		<content type="html">&lt;p&gt;This article is a continuation of the series on &lt;strong&gt;damage propagation&lt;/strong&gt;. While the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;previous article&lt;/a&gt; laid some foundation on the subject, this one
discusses the cost (increased CPU and memory utilization) that the feature incurs, as this is highly dependent on design decisions and the implementation of the data structure used for storing &lt;strong&gt;damage&lt;/strong&gt; information.&lt;/p&gt;
&lt;p&gt;From the perspective of this article, the two key things worth remembering from the previous one are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The damage propagation is an optional WPE/GTK WebKit feature that — when enabled — reduces the browser’s GPU utilization at the expense of increased CPU and memory utilization.&lt;/li&gt;
&lt;li&gt;On the implementation level, the &lt;strong&gt;damage&lt;/strong&gt; is almost always a collection of rectangles that cover the changed region.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;the-damage-information&quot; tabindex=&quot;-1&quot;&gt;The damage information &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Before diving into the problem and its solutions, it’s essential to understand basic properties of the &lt;strong&gt;damage&lt;/strong&gt; information.&lt;/p&gt;
&lt;h4 id=&quot;the-damage-nature&quot; tabindex=&quot;-1&quot;&gt;The damage nature &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;As mentioned in &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;the section about damage&lt;/a&gt; of the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;previous article&lt;/a&gt;,
the &lt;strong&gt;damage&lt;/strong&gt; information describes a &lt;strong&gt;region that changed and requires repainting&lt;/strong&gt;. It was also pointed out that such a description is usually done via a collection of rectangles. Although sometimes
it’s better to describe a region in a different way, the rectangles are a natural choice due to the very nature of the &lt;strong&gt;damage&lt;/strong&gt; in the web engines that originates from the &lt;a href=&quot;https://www.w3.org/TR/css-box-3/&quot;&gt;box model&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A more detailed description of the &lt;strong&gt;damage&lt;/strong&gt; nature can be inferred from the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;Pipeline details&lt;/a&gt; section of the
&lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;previous article&lt;/a&gt;. The bottom line is, in the end, the visual changes to the render tree yield the &lt;strong&gt;damage&lt;/strong&gt; information in the form of rectangles.
For the sake of clarity, such original rectangles may be referred to as &lt;strong&gt;raw damage&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In practice, the above means that it doesn’t matter whether, e.g. the circle is drawn on a 2D canvas or the background color of some block element changes — ultimately, the rectangles (&lt;strong&gt;raw damage&lt;/strong&gt;) are always produced
in the process.&lt;/p&gt;
&lt;h4 id=&quot;approximating-the-damage&quot; tabindex=&quot;-1&quot;&gt;Approximating the damage &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;As the &lt;strong&gt;raw damage&lt;/strong&gt; is a collection of rectangles describing a damaged region, the geometrical consequence is that &lt;strong&gt;there may be more than one set of rectangles describing the same region&lt;/strong&gt;.
It means that &lt;strong&gt;raw damage&lt;/strong&gt; could be stored by a different set of rectangles and still precisely describe the original damaged region — e.g. when &lt;strong&gt;raw damage&lt;/strong&gt; contains more rectangles than necessary.
An example of different approximations of a simple &lt;strong&gt;raw damage&lt;/strong&gt; is depicted in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qQhiWBzhuN-763.avif 763w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qQhiWBzhuN-763.webp 763w&quot;&gt;&lt;img alt=&quot;Raw damage approximated multiple ways.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/qQhiWBzhuN-763.svg&quot; width=&quot;763&quot; height=&quot;208&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;Changing the set of rectangles that describes the damaged region may be very tempting — especially when the size of the set could be reduced. However, the following consequences must be taken into account:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The damaged region could shrink if some damage information were lost, e.g. if too many rectangles were removed.&lt;/li&gt;
&lt;li&gt;The damaged region could expand if some damage information were added, e.g. if too many or too big rectangles were added.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The first consequence may lead to visual glitches when repainting. The second one, however, causes no visual issues but degrades performance since a larger area
(i.e. more pixels) must be repainted — typically increasing GPU usage. This means the &lt;strong&gt;damage information can be approximated&lt;/strong&gt; as long as the trade-off between the extra repainted area and the degree of simplification
in the underlying set of rectangles is acceptable.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;approximation&lt;/strong&gt; mentioned above refers to the situation where the approximated damaged region covers the original damaged region entirely, i.e. not a single pixel of information is lost. In that sense, the
approximation can only add extra information. Naturally, the smaller the extra area added to the original damaged region, the better.&lt;/p&gt;
&lt;p&gt;The approximation quality can be referred to as &lt;strong&gt;damage resolution&lt;/strong&gt;, which is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;low&lt;/strong&gt; — when the extra area added to the original damaged region is significant,&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;high&lt;/strong&gt; — when the extra area added to the original damaged region is small.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The examples of low (left) and high (right) &lt;strong&gt;damage resolutions&lt;/strong&gt; are presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/9KChyXTdq7-805.avif 805w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/9KChyXTdq7-805.webp 805w&quot;&gt;&lt;img alt=&quot;Various damage resolutions.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/9KChyXTdq7-805.svg&quot; width=&quot;805&quot; height=&quot;288&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
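&lt;p&gt;To make the notion of &lt;strong&gt;damage resolution&lt;/strong&gt; more tangible, the extra repainted area can be counted directly. The sketch below is only an illustration and not the WebKit implementation: rectangles are assumed to be &lt;code&gt;(x, y, width, height)&lt;/code&gt; tuples with integer pixel coordinates, and the helper names &lt;code&gt;covered_pixels&lt;/code&gt;, &lt;code&gt;bounding_box&lt;/code&gt; and &lt;code&gt;extra_area&lt;/code&gt; are hypothetical. A single bounding box serves as the low-resolution approximation:&lt;/p&gt;

```python
# Rectangles are hypothetical (x, y, width, height) tuples in integer pixels.
def covered_pixels(rects):
    """Return the set of (x, y) pixels covered by at least one rectangle."""
    pixels = set()
    for x, y, w, h in rects:
        for px in range(x, x + w):
            for py in range(y, y + h):
                pixels.add((px, py))
    return pixels

def bounding_box(rects):
    """Approximate the rectangles with one rectangle enclosing all of them."""
    x0 = min(x for x, y, w, h in rects)
    y0 = min(y for x, y, w, h in rects)
    x1 = max(x + w for x, y, w, h in rects)
    y1 = max(y + h for x, y, w, h in rects)
    return (x0, y0, x1 - x0, y1 - y0)

def extra_area(raw, approximation):
    """Count pixels the approximation repaints beyond what the raw damage needs."""
    raw_pixels = covered_pixels(raw)
    approx_pixels = covered_pixels(approximation)
    # A valid approximation must not lose any damaged pixel.
    assert raw_pixels.issubset(approx_pixels)
    return len(approx_pixels - raw_pixels)

raw_damage = [(0, 0, 10, 10), (90, 90, 10, 10)]
low_resolution = [bounding_box(raw_damage)]  # one big rectangle
high_resolution = raw_damage                 # rectangles kept as they are

print(extra_area(raw_damage, low_resolution))   # 9800 extra pixels
print(extra_area(raw_damage, high_resolution))  # 0 extra pixels
```

&lt;p&gt;Counting pixels one by one is far too slow for production use; it is done here only to keep the definition of the extra area explicit.&lt;/p&gt;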
&lt;h2 id=&quot;the-problem&quot; tabindex=&quot;-1&quot;&gt;The problem &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Given the description of the &lt;strong&gt;damage&lt;/strong&gt; properties presented in the sections above, it’s evident there’s a certain degree of flexibility when it comes to processing &lt;strong&gt;damage&lt;/strong&gt; information. Such a situation is very fortunate in the
context of storing the &lt;strong&gt;damage&lt;/strong&gt;, as it gives some freedom in designing a proper data structure. However, before jumping into the actual solutions, it’s necessary to understand the problem end-to-end.&lt;/p&gt;
&lt;h4 id=&quot;the-scale&quot; tabindex=&quot;-1&quot;&gt;The scale &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;Pipeline details&lt;/a&gt; section of the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;previous article&lt;/a&gt;
introduced two basic types of &lt;strong&gt;damage&lt;/strong&gt; in the &lt;strong&gt;damage propagation pipeline&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;layer damage&lt;/strong&gt; — the damage tracked separately for each layer,&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;frame damage&lt;/strong&gt; — the damage that aggregates individual layer damages and constitutes the final damage of a given frame.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Assuming there are &lt;math&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;/math&gt; layers and there is some data structure called &lt;code&gt;Damage&lt;/code&gt; that can store the &lt;strong&gt;damage&lt;/strong&gt; information, it’s easy to notice that there may be &lt;math&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt; instances
of &lt;code&gt;Damage&lt;/code&gt; present at the same time in the pipeline as the browser engine requires:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;/math&gt; &lt;code&gt;Damage&lt;/code&gt; objects for storing &lt;strong&gt;layer damage&lt;/strong&gt;,&lt;/li&gt;
&lt;li&gt;&lt;math&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt; &lt;code&gt;Damage&lt;/code&gt; object for storing &lt;strong&gt;frame damage&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As there may be a lot of layers in more complex web pages, the &lt;math&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt; mentioned above may be a very big number.&lt;/p&gt;
&lt;p&gt;The first consequence of the above is that the &lt;code&gt;Damage&lt;/code&gt; data structure in general should store the &lt;strong&gt;damage&lt;/strong&gt; information in a very compact way to &lt;strong&gt;avoid excessive memory usage&lt;/strong&gt; when &lt;math&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt; &lt;code&gt;Damage&lt;/code&gt; objects
are present at the same time.&lt;/p&gt;
&lt;p&gt;The second consequence of the above is that the &lt;code&gt;Damage&lt;/code&gt; data structure in general should be &lt;strong&gt;very performant&lt;/strong&gt; as each of &lt;math&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt; &lt;code&gt;Damage&lt;/code&gt; objects may be involved in a considerable amount of processing when there are
lots of updates across the web page (and hence huge numbers of &lt;strong&gt;damage&lt;/strong&gt; rectangles).&lt;/p&gt;
&lt;p&gt;To better understand the above consequences, it’s essential to examine the input and the output of such a hypothetical &lt;code&gt;Damage&lt;/code&gt; data structure more thoroughly.&lt;/p&gt;
&lt;h4 id=&quot;the-input&quot; tabindex=&quot;-1&quot;&gt;The input &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;There are 2 kinds of &lt;code&gt;Damage&lt;/code&gt; data structure input:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;other &lt;code&gt;Damage&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;raw damage&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;Damage&lt;/code&gt; becomes an input of other &lt;code&gt;Damage&lt;/code&gt; in some situations, happening in the middle of the &lt;strong&gt;damage propagation pipeline&lt;/strong&gt; when the broader &lt;strong&gt;damage&lt;/strong&gt; is being assembled from smaller chunks of &lt;strong&gt;damage&lt;/strong&gt;. What it consists
of depends purely on the &lt;code&gt;Damage&lt;/code&gt; implementation.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;raw damage&lt;/strong&gt;, on the other hand, becomes an input of the &lt;code&gt;Damage&lt;/code&gt; always at the very beginning of the &lt;strong&gt;damage propagation pipeline&lt;/strong&gt;. In practice, it consists of a set of rectangles that are potentially overlapping, duplicated, or empty. Moreover,
such a set is always as big as the set of changes causing visual impact. Therefore, in the worst case scenario such as drawing on a 2D canvas, the number of rectangles may be enormous.&lt;/p&gt;
&lt;p&gt;Given the above, it’s clear that the hypothetical &lt;code&gt;Damage&lt;/code&gt; data structure should support 2 distinct input operations in the most performant way possible:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;add(Damage)&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;add(Rectangle)&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&quot;the-output&quot; tabindex=&quot;-1&quot;&gt;The output &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;When it comes to the &lt;code&gt;Damage&lt;/code&gt; data structure output, there are 2 possibilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;other &lt;code&gt;Damage&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;platform&lt;/strong&gt; API.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;Damage&lt;/code&gt; becomes the output of other &lt;code&gt;Damage&lt;/code&gt; on each &lt;code&gt;Damage&lt;/code&gt;-to-&lt;code&gt;Damage&lt;/code&gt; append that was described in the subsection above.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;platform&lt;/strong&gt; API, on the other hand, becomes the output of &lt;code&gt;Damage&lt;/code&gt; at the very end of the pipeline, e.g. when the &lt;strong&gt;platform&lt;/strong&gt; API consumes the &lt;strong&gt;frame damage&lt;/strong&gt; (as described in the
&lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;pipeline details&lt;/a&gt; section of the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;previous article&lt;/a&gt;).
In this situation, what’s expected on the output technically depends on the particular &lt;strong&gt;platform&lt;/strong&gt; API. However, in practice, all platforms supporting &lt;strong&gt;damage&lt;/strong&gt; propagation require a set of rectangles that describe the damaged region.
Such a set of rectangles is fed into the &lt;strong&gt;platforms&lt;/strong&gt; via APIs by simply iterating the rectangles describing the damaged region and transforming them to whatever data structure the particular API expects.&lt;/p&gt;
&lt;p&gt;The natural consequence of the above is that the hypothetical &lt;code&gt;Damage&lt;/code&gt; data structure should support the following output operation — also in the most performant way possible:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;forEachRectangle(...)&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
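&lt;p&gt;The input and output operations listed above can be put together into a deliberately naive sketch. It is an assumption-laden illustration rather than the data structure used in WebKit: rectangles are modelled as &lt;code&gt;(x, y, width, height)&lt;/code&gt; tuples, a single &lt;code&gt;add&lt;/code&gt; method accepts both kinds of input, and &lt;code&gt;for_each_rectangle&lt;/code&gt; stands in for &lt;code&gt;forEachRectangle(...)&lt;/code&gt;:&lt;/p&gt;

```python
# A naive, hypothetical Damage sketch: it stores every non-empty rectangle as-is.
class Damage:
    def __init__(self):
        self._rects = []  # (x, y, width, height) tuples

    def add(self, item):
        """Accept either another Damage or a single rectangle tuple."""
        if isinstance(item, Damage):
            # Damage-to-Damage append, e.g. layer damage merged into frame damage.
            item.for_each_rectangle(self.add)
            return
        x, y, w, h = item
        # Raw damage may contain empty rectangles; they carry no information.
        if max(w, 0) * max(h, 0) == 0:
            return
        self._rects.append(item)

    def for_each_rectangle(self, callback):
        """Hand every stored rectangle to the callback, e.g. a platform API adapter."""
        for rect in list(self._rects):
            callback(rect)

layer_a = Damage()
layer_a.add((0, 0, 10, 10))
layer_a.add((5, 5, 0, 10))  # empty rectangle, ignored
layer_b = Damage()
layer_b.add((20, 20, 5, 5))

frame = Damage()  # aggregates the damage of both layers
frame.add(layer_a)
frame.add(layer_b)

collected = []
frame.for_each_rectangle(collected.append)
print(collected)
```

&lt;p&gt;Because this sketch keeps duplicated and overlapping rectangles, its memory use and iteration cost grow with every rectangle added, which is exactly the concern when the &lt;strong&gt;raw damage&lt;/strong&gt; is enormous.&lt;/p&gt;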
&lt;h4 id=&quot;the-problem-statement&quot; tabindex=&quot;-1&quot;&gt;The problem statement &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Given all the above perspectives, the problem of designing the &lt;code&gt;Damage&lt;/code&gt; data structure can be summarized as storing the input &lt;strong&gt;damage&lt;/strong&gt; information to be accessed (iterated) later in a way that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the performance of operations for adding and iterating rectangles is maximal &lt;strong&gt;(performance)&lt;/strong&gt;,&lt;/li&gt;
&lt;li&gt;the memory footprint of the data structure is minimal &lt;strong&gt;(memory footprint)&lt;/strong&gt;,&lt;/li&gt;
&lt;li&gt;the stored region covers the original region and has the area as close to it as possible &lt;strong&gt;(damage resolution)&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With the problem formulated this way, it’s obvious that this is a multi-criteria optimization problem with 3 criteria:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;performance&lt;/strong&gt; (maximize),&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;memory footprint&lt;/strong&gt; (minimize),&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;damage resolution&lt;/strong&gt; (maximize).&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&quot;damage-data-structure-implementations&quot; tabindex=&quot;-1&quot;&gt;&lt;code&gt;Damage&lt;/code&gt; data structure implementations &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Given the problem of storing &lt;strong&gt;damage&lt;/strong&gt; defined as above, it’s possible to propose various ways of solving it by implementing a &lt;code&gt;Damage&lt;/code&gt; data structure. Before diving into details, however, it’s important to emphasize
that the weights of criteria may be different depending on the situation. Therefore, before deciding how to design the &lt;code&gt;Damage&lt;/code&gt; data structure, one should consider the following questions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What is the proportion between the power of GPU and CPU in the devices I’m targeting?&lt;/li&gt;
&lt;li&gt;What are the memory constraints of the devices I’m targeting?&lt;/li&gt;
&lt;li&gt;What are the cache sizes on the devices I’m targeting?&lt;/li&gt;
&lt;li&gt;What is the balance between GPU and CPU usage in the applications I’m going to optimize for?
&lt;ul&gt;
&lt;li&gt;Are they more rendering-oriented (e.g. using WebGL, Canvas 2D, animations etc.)?&lt;/li&gt;
&lt;li&gt;Are they more computing-oriented (frequent layouts, a lot of JavaScript processing etc.)?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Although answering the above usually points in the direction of a specific implementation, the answers are often unknown and hence the implementation should be as generic as possible. In practice,
it means the implementation should not optimize with a strong focus on just one criterion. However, as there’s no silver bullet solution, it’s worth exploring multiple, quasi-generic solutions that have been researched as
part of Igalia’s work on the &lt;strong&gt;damage&lt;/strong&gt; propagation, and which are the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Damage&lt;/code&gt; storing all input rects,&lt;/li&gt;
&lt;li&gt;Bounding box &lt;code&gt;Damage&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Damage&lt;/code&gt; using WebKit’s &lt;code&gt;Region&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;R-Tree &lt;code&gt;Damage&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;Grid-based &lt;code&gt;Damage&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of the above implementations are evaluated against the 3 criteria in the following way:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;by specifying the time complexity of &lt;code&gt;add(Rectangle)&lt;/code&gt; operation as &lt;code&gt;add(Damage)&lt;/code&gt; can be transformed into the series of &lt;code&gt;add(Rectangle)&lt;/code&gt; operations,&lt;/li&gt;
&lt;li&gt;by specifying the time complexity of &lt;code&gt;forEachRectangle(...)&lt;/code&gt; operation.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;by specifying the space complexity of &lt;code&gt;Damage&lt;/code&gt; data structure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;by subjectively specifying the damage resolution.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&quot;damage-storing-all-input-rects&quot; tabindex=&quot;-1&quot;&gt;&lt;code&gt;Damage&lt;/code&gt; storing all input rects &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The most natural — yet very naive — &lt;code&gt;Damage&lt;/code&gt; implementation is one that wraps a simple collection (such as vector) of rectangles and hence stores the &lt;strong&gt;raw damage&lt;/strong&gt; in the original form.
In that case, the evaluation is as simple as evaluating the underlying data structure.&lt;/p&gt;
&lt;p&gt;Assuming a &lt;strong&gt;vector&lt;/strong&gt; data structure and &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; amortized time complexity of insertion, the evaluation of such a &lt;code&gt;Damage&lt;/code&gt; implementation is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;insertion is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;li&gt;iteration is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;perfect&lt;/strong&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Despite being trivial to implement, this approach is heavily skewed towards the &lt;strong&gt;damage resolution&lt;/strong&gt; criterion. Essentially, the &lt;strong&gt;damage&lt;/strong&gt; quality is the best possible, but at the expense of very poor
&lt;strong&gt;performance&lt;/strong&gt; and a substantial &lt;strong&gt;memory footprint&lt;/strong&gt;. This is because the number of input rects &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; can be very large, thus making the linear complexities unacceptable.&lt;/p&gt;
&lt;p&gt;The other problem with this solution is that it performs no filtering and hence may store a lot of redundant rectangles. While the empty rectangles can be filtered out in &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt;,
filtering out duplicates and some of the overlaps (one rectangle completely containing the other) would make insertion &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt;. Naturally, such filtering
would lead to a smaller &lt;strong&gt;memory footprint&lt;/strong&gt; and faster iteration in practice; however, the complexities of both would not change.&lt;/p&gt;
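&lt;p&gt;The behavior described above can be illustrated with a minimal sketch. It is not the WebKit code: the class name &lt;code&gt;RectListDamage&lt;/code&gt;, the method names, and the representation of a rectangle as a (left, top, right, bottom) tuple are all assumptions made for illustration only.&lt;/p&gt;

```python
class RectListDamage:
    """Keeps every non-empty input rectangle as a (left, top, right, bottom) tuple."""

    def __init__(self):
        self.rects = []

    def add(self, rect):
        left, top, right, bottom = rect
        # Dropping empty rectangles is a constant-time check.
        if right > left and bottom > top:
            self.rects.append(rect)  # amortized O(1)

    def for_each_rectangle(self, callback):
        # O(N): every stored rectangle is visited, duplicates included.
        for rect in self.rects:
            callback(rect)


damage = RectListDamage()
damage.add((0, 0, 10, 10))
damage.add((0, 0, 10, 10))  # duplicate, stored again
damage.add((5, 5, 5, 25))   # zero width, filtered out
visited = []
damage.for_each_rectangle(visited.append)
```

&lt;p&gt;In this sketch the duplicate is kept and only the empty rectangle is dropped, so iteration visits two rectangles that describe the same area.&lt;/p&gt;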
&lt;h4 id=&quot;bounding-box-damage&quot; tabindex=&quot;-1&quot;&gt;Bounding box &lt;code&gt;Damage&lt;/code&gt; &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The second simplest &lt;code&gt;Damage&lt;/code&gt; implementation one can possibly imagine is the implementation that stores just a single rectangle, which is a minimum bounding rectangle (bounding box) of all the &lt;strong&gt;damage&lt;/strong&gt;
rectangles that were added into the data structure. The minimum bounding rectangle — as the name suggests — is a minimal rectangle that can fit all the input rectangles inside. This is well demonstrated in the picture below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/C6E2nj1zlQ-305.avif 305w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/C6E2nj1zlQ-305.webp 305w&quot;&gt;&lt;img alt=&quot;Bounding box.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/C6E2nj1zlQ-305.svg&quot; width=&quot;305&quot; height=&quot;348&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;As this implementation stores just a single rectangle, and as the operation of taking the bounding box of two rectangles is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt;, the evaluation is as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;insertion is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;li&gt;iteration is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;usually &lt;strong&gt;low&lt;/strong&gt; ⚠️&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Contrary to the &lt;a href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;&lt;code&gt;Damage&lt;/code&gt; storing all input rects&lt;/a&gt;, this solution yields perfect &lt;strong&gt;performance&lt;/strong&gt; and &lt;strong&gt;memory footprint&lt;/strong&gt; at the expense of low &lt;strong&gt;damage resolution&lt;/strong&gt;. However,
in practice, the &lt;strong&gt;damage resolution&lt;/strong&gt; of this solution is not always low. More specifically:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in the optimistic cases (&lt;strong&gt;raw damage&lt;/strong&gt; clustered), the area of the bounding box is close to the area of the &lt;strong&gt;raw damage&lt;/strong&gt; inside,&lt;/li&gt;
&lt;li&gt;in the average cases, the approximation of the damaged region suffers from covering significant areas that were not damaged,&lt;/li&gt;
&lt;li&gt;in the worst cases (small damage rectangles at opposite ends of a viewport diagonal), the approximation is very poor, and it may be as bad as covering the whole viewport.&lt;/li&gt;
&lt;/ul&gt;
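&lt;p&gt;The worst case from the list above can be reproduced with a short sketch. The names &lt;code&gt;unite&lt;/code&gt;, &lt;code&gt;area&lt;/code&gt;, and &lt;code&gt;BoundingBoxDamage&lt;/code&gt;, the (left, top, right, bottom) tuple layout, and the 1920x1080 viewport are assumptions chosen for illustration, not taken from any of the engines mentioned here.&lt;/p&gt;

```python
def unite(a, b):
    """Bounding box of two (left, top, right, bottom) rectangles; O(1)."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(rect):
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


class BoundingBoxDamage:
    def __init__(self):
        self.bounds = None  # no damage recorded yet

    def add(self, rect):
        self.bounds = rect if self.bounds is None else unite(self.bounds, rect)

    def for_each_rectangle(self, callback):
        if self.bounds is not None:
            callback(self.bounds)


# Hypothetical 1920x1080 viewport with two 10x10 rectangles in opposite corners.
damage = BoundingBoxDamage()
damage.add((0, 0, 10, 10))
damage.add((1910, 1070, 1920, 1080))
raw_area = 2 * 10 * 10
covered_area = area(damage.bounds)
```

&lt;p&gt;Here 200 damaged pixels end up represented by a bounding box covering all 2073600 pixels of the assumed viewport.&lt;/p&gt;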
&lt;p&gt;As this solution requires a minimal overhead while still providing a relatively useful &lt;strong&gt;damage&lt;/strong&gt; approximation, in practice, it is a baseline solution used in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Chromium,&lt;/li&gt;
&lt;li&gt;Firefox,&lt;/li&gt;
&lt;li&gt;WPE and GTK WebKit when &lt;code&gt;UnifyDamagedRegions&lt;/code&gt; runtime preference is enabled, which means it’s used in GTK WebKit by default.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&quot;damage-using-webkit-s-region&quot; tabindex=&quot;-1&quot;&gt;&lt;code&gt;Damage&lt;/code&gt; using WebKit’s &lt;code&gt;Region&lt;/code&gt; &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;When it comes to more sophisticated &lt;code&gt;Damage&lt;/code&gt; implementations, the simplest approach in the case of WebKit is to wrap a data structure already implemented in WebCore called
&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/Region.h&quot;&gt;&lt;code&gt;Region&lt;/code&gt;&lt;/a&gt;. Its purpose
is just as the name suggests: to store a region. More specifically, it’s meant to store rectangles describing a region in a way that is efficient both for storage and for access, so as to take advantage
of scanline coherence during rasterization. The key characteristic of the data structure is that it stores rectangles without overlaps. This is achieved by storing y-sorted lists of x-sorted, non-overlapping
rectangles. Another important property is that due to the specific internal representation, the number of integers stored per rectangle is usually smaller than 4. The data structure has some other properties as well,
but they are not very relevant in the context of storing the &lt;strong&gt;damage&lt;/strong&gt;. More details on the data structure itself can be found in J. E. Steinhart’s 1991 paper titled
&lt;a href=&quot;https://www.sciencedirect.com/science/article/abs/pii/B9780080507545500190&quot;&gt;SCANLINE COHERENT SHAPE ALGEBRA&lt;/a&gt;
published as part of the &lt;a href=&quot;https://www.sciencedirect.com/book/9780080507545/graphics-gems-ii&quot;&gt;Graphics Gems II&lt;/a&gt; book.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;Damage&lt;/code&gt; implementation being a wrapper of the &lt;code&gt;Region&lt;/code&gt; was actually used by the GTK and WPE ports as the first version of a more sophisticated &lt;code&gt;Damage&lt;/code&gt; alternative to the &lt;strong&gt;bounding box &lt;code&gt;Damage&lt;/code&gt;&lt;/strong&gt;. Just as expected,
it provided better &lt;strong&gt;damage resolution&lt;/strong&gt; in some cases; however, it effectively degraded to a more expensive variant of &lt;strong&gt;bounding box &lt;code&gt;Damage&lt;/code&gt;&lt;/strong&gt; in the majority of situations.&lt;/p&gt;
&lt;p&gt;The above was inevitable as the implementation was falling back to &lt;strong&gt;bounding box &lt;code&gt;Damage&lt;/code&gt;&lt;/strong&gt; when the &lt;code&gt;Region&lt;/code&gt;’s internal representation was getting too complex. In essence, it was addressing the &lt;code&gt;Region&lt;/code&gt;’s biggest problem,
which is that it can effectively store &lt;math&gt;&lt;msup&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt; rectangles in the worst case due to the way it splits rectangles for storing purposes. More specifically, as the &lt;code&gt;Region&lt;/code&gt; stores ledges
and spans, each insertion of a new rectangle may lead to splitting &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; existing rectangles. Such a situation is depicted in the image below, where 3 rectangles are being split
into 9:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/TfPtWWJ35c-466.avif 466w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/TfPtWWJ35c-466.webp 466w&quot;&gt;&lt;img alt=&quot;WebKit&#39;s Region storing method.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/TfPtWWJ35c-466.svg&quot; width=&quot;466&quot; height=&quot;409&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
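&lt;p&gt;The quadratic growth can be reproduced with a simplified model of such a representation. The sketch below is not WebKit’s &lt;code&gt;Region&lt;/code&gt; code: it only counts how many non-overlapping rectangles result when the input is cut into horizontal bands, and the helper names &lt;code&gt;count_band_rectangles&lt;/code&gt; and &lt;code&gt;staircase&lt;/code&gt;, as well as the staircase arrangement itself (which is not necessarily the one from the picture), are assumptions made for illustration.&lt;/p&gt;

```python
def count_band_rectangles(rects):
    """Counts rectangles in a y-banded, non-overlapping decomposition.

    Each rect is a (left, top, right, bottom) tuple of integers.
    """
    ys = sorted({y for rect in rects for y in (rect[1], rect[3])})
    total = 0
    for band_top, band_bottom in zip(ys, ys[1:]):
        # x-spans of the rectangles that fully cover this horizontal band.
        spans = sorted(
            (r[0], r[2]) for r in rects
            if band_top >= r[1] and r[3] >= band_bottom
        )
        merged_end = None
        for left, right in spans:
            if merged_end is None or left > merged_end:
                total += 1  # a new disjoint span starts in this band
                merged_end = right
            else:
                merged_end = max(merged_end, right)
    return total


def staircase(n):
    # n rectangles with disjoint x-ranges and shifted, overlapping y-ranges.
    return [(2 * i, i, 2 * i + 1, i + n) for i in range(n)]


three = count_band_rectangles(staircase(3))
ten = count_band_rectangles(staircase(10))
```

&lt;p&gt;Under this model, 3 input rectangles arranged as a staircase produce 9 stored rectangles and 10 produce 100, which matches the quadratic worst case described above.&lt;/p&gt;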
&lt;p&gt;Putting the above fallback mechanism aside, the evaluation of &lt;code&gt;Damage&lt;/code&gt; being a simple wrapper on top of &lt;code&gt;Region&lt;/code&gt; is the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;insertion is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;log&lt;/mi&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;li&gt;iteration is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;perfect&lt;/strong&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With the fallback added, the evaluation is technically the same as that of &lt;strong&gt;bounding box &lt;code&gt;Damage&lt;/code&gt;&lt;/strong&gt; for &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; above the fallback point, yet with extra overhead. At the same time, for smaller &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt;, the above evaluation
did not really matter much, as in such cases the &lt;strong&gt;performance&lt;/strong&gt;, &lt;strong&gt;memory footprint&lt;/strong&gt;, and &lt;strong&gt;damage resolution&lt;/strong&gt; were all very good.&lt;/p&gt;
&lt;p&gt;Although this solution (with a fallback) yielded very good results for some simple scenarios (when &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; was small enough), it was not sustainable in the long run, as it did not address the majority of use cases,
where it was actually a bit slower than &lt;strong&gt;bounding box &lt;code&gt;Damage&lt;/code&gt;&lt;/strong&gt; while the results were similar.&lt;/p&gt;
&lt;h4 id=&quot;r-tree-damage&quot; tabindex=&quot;-1&quot;&gt;R-Tree &lt;code&gt;Damage&lt;/code&gt; &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;In the pursuit of more sophisticated &lt;code&gt;Damage&lt;/code&gt; implementations, one can think of wrapping/adapting data structures similar to quadtrees, KD-trees etc. However, in most of such cases, a lot of unnecessary overhead is added
as the data structures partition the space so that, in the end, the input is stored without overlaps. As overlaps are not necessarily a problem for storing &lt;strong&gt;damage&lt;/strong&gt; information, the list of candidate data structures
can be narrowed down to the most performant data structures allowing overlaps. One of the most interesting of such options is the R-Tree.&lt;/p&gt;
&lt;p&gt;In short, &lt;a href=&quot;https://en.wikipedia.org/wiki/R-tree&quot;&gt;R-Tree&lt;/a&gt; (rectangle tree) is a tree data structure that allows storing multiple entries (rectangles) in a single node. While the leaf nodes of such a tree store the original
rectangles inserted into the data structure, each of the intermediate nodes stores the bounding box (minimum bounding rectangle, MBR) of the children nodes. As the tree is balanced, the above means that with every next
tree level from the top, the list of rectangles (either bounding boxes or original ones) gets bigger and more detailed. An example of an R-Tree is depicted in Figure 5 of
the &lt;a href=&quot;https://www.researchgate.net/publication/225542188_Object_Trajectory_Analysis_in_Video_Indexing_and_Retrieval_Applications&quot;&gt;Object Trajectory Analysis in Video Indexing and Retrieval Applications&lt;/a&gt; paper:&lt;/p&gt;
&lt;center&gt;
&lt;a href=&quot;https://www.researchgate.net/figure/R-Tree-indexing-example-2D-visualization-a-hierarchical-dependencies-b_fig4_225542188&quot;&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/HcJsPmi_It-850.avif 850w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/HcJsPmi_It-850.webp 850w&quot;&gt;&lt;img alt=&quot;R-Tree indexing example.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/HcJsPmi_It-850.png&quot; width=&quot;850&quot; height=&quot;428&quot;&gt;&lt;/picture&gt;
&lt;/a&gt;
&lt;/center&gt;
&lt;p&gt;The above perfectly shows the differences between the rectangles on various levels and can also visually suggest some ideas when it comes to adapting such a data structure into &lt;code&gt;Damage&lt;/code&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The first possibility is to make &lt;code&gt;Damage&lt;/code&gt; a simple wrapper of R-Tree that would just build the tree and allow the &lt;code&gt;Damage&lt;/code&gt; consumer to pick the desired &lt;strong&gt;damage resolution&lt;/strong&gt; on iteration attempt. Such an approach is possible
as having the full R-Tree allows the iteration code to limit iteration to a certain level of the tree or to various levels from separate branches. The latter allows &lt;code&gt;Damage&lt;/code&gt; to offer a particularly interesting API where the
&lt;code&gt;forEachRectangle(...)&lt;/code&gt; function could accept a parameter specifying how many rectangles (at most) are expected to be iterated.&lt;/li&gt;
&lt;li&gt;The other possibility is to make &lt;code&gt;Damage&lt;/code&gt; an adaptation of R-Tree that conditionally prunes the tree during construction so that it does not grow too much, while still maintaining a certain height and hence a certain &lt;strong&gt;damage resolution&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Regardless of the approach, the R-Tree construction also allows one to implement a simple filtering mechanism that eliminates, on the fly, input rectangles that duplicate or are contained by existing rectangles. However,
such filtering is not very effective as it can only consider a limited set of rectangles, i.e. the ones encountered during the traversal required by insertion.&lt;/p&gt;
&lt;h5&gt;&lt;code&gt;Damage&lt;/code&gt; as a simple R-Tree wrapper&lt;/h5&gt;
&lt;p&gt;Although this option may be considered very interesting, in practice, storing all the input rectangles in the R-Tree means storing &lt;math&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; rectangles along with the overhead of a tree structure. In the worst case scenario
(node size of 2), the number of nodes in the tree may be as big as &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt;, thus adding a lot of overhead required to maintain the tree structure. This fact alone makes this solution have an
unacceptable &lt;strong&gt;memory footprint&lt;/strong&gt;. The other problem with this idea is that in practice,
the &lt;strong&gt;damage resolution&lt;/strong&gt; selection is usually done once, during browser startup. Therefore, the ability to select &lt;strong&gt;damage resolution&lt;/strong&gt; at runtime brings no benefit while introducing unnecessary overhead.&lt;/p&gt;
&lt;p&gt;The evaluation of the above is the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;insertion is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;log&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; where &lt;math&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt; is the node size ✅&lt;/li&gt;
&lt;li&gt;iteration is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; where &lt;math&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt; is a parameter and &lt;math&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;≤&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;≤&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;low&lt;/strong&gt; to &lt;strong&gt;high&lt;/strong&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h5&gt;&lt;code&gt;Damage&lt;/code&gt; as an R-Tree adaptation with pruning&lt;/h5&gt;
&lt;p&gt;Considering the problems of the previous idea, the option with pruning seems to address all of them:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the memory footprint can be controlled by specifying at which level of the tree the pruning should happen,&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;damage resolution&lt;/strong&gt; (level of the tree where pruning happens) can be picked on the implementation level (compile time), thus allowing some extra implementation tricks if necessary.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While it’s true that the above problems do not exist in this approach, the option with pruning unfortunately brings new problems that need to be considered. As a matter of fact, all the new problems it brings
originate from the fact that each pruning operation leads to a loss of information and hence to the deterioration of the tree over time.&lt;/p&gt;
&lt;p&gt;Before actually introducing those new problems, it’s worth understanding more about how insertions work in the R-Tree.&lt;/p&gt;
&lt;p&gt;When a rectangle is inserted into the R-Tree, the first step is to find a proper position for the new record (see the &lt;strong&gt;ChooseLeaf&lt;/strong&gt; algorithm from &lt;a href=&quot;https://dl.acm.org/doi/10.1145/971697.602266&quot;&gt;Guttman1984&lt;/a&gt;). When the target node is
found, there are two possibilities:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;adding the new rectangle to the target node does not cause overflow,&lt;/li&gt;
&lt;li&gt;adding the new rectangle to the target node causes overflow.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If no overflow happens, the new rectangle is just added to the target node. However, if overflow happens, i.e. the number of rectangles in the node exceeds the limit, the node splitting algorithm is invoked (see the &lt;strong&gt;SplitNode&lt;/strong&gt;
algorithm from &lt;a href=&quot;https://dl.acm.org/doi/10.1145/971697.602266&quot;&gt;Guttman1984&lt;/a&gt;) and the changes are propagated up the tree (see the &lt;strong&gt;AdjustTree&lt;/strong&gt; algorithm from &lt;a href=&quot;https://dl.acm.org/doi/10.1145/971697.602266&quot;&gt;Guttman1984&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The node splitting, along with adjusting the tree, are very important steps within insertion as those algorithms are the ones that are responsible for shaping and balancing the tree. For example, when all the nodes in the tree are
full and the new rectangle is being added, the node splitting will effectively be executed for some leaf node and all its ancestors, including root. It means that the tree will grow and possibly, its structure will change significantly.&lt;/p&gt;
&lt;p&gt;Due to the above mechanics of R-Tree, it can be reasonably asserted that the tree structure becomes better as a function of node splits. With that, the first problem of the tree pruning becomes obvious:
tree pruning on insertion limits the number of node splits (due to smaller cascades of node splits) and hence limits the quality of the tree structure. The second problem, also related to node splits, is that
with all the information lost due to pruning (as pruning is the same as removing a subtree and inserting its bounding box into the tree) each node split is less effective as the leaf rectangles themselves are
getting bigger and bigger due to them becoming bounding boxes of bounding boxes (…) of the original rectangles.&lt;/p&gt;
&lt;p&gt;The above problems become more visible in practice when the R-tree input rectangles tend to be sorted. In general, one of the R-Tree problems is that its structure tends to be biased when the input rectangles are sorted.
Although further insertions usually fix the structure of the biased tree, they do so only to some degree, as some tree nodes may not get split anymore. When pruning happens and the input is sorted (or partially sorted),
fixing the biased tree is much harder and sometimes even impossible. This is well illustrated by an example where a lot of rectangles from the same area are inserted into the tree. With the number of such rectangles
being big enough, a lot of pruning will happen and hence a lot of rectangles will be lost and replaced by larger bounding boxes. Then, if a series of new insertions starts adding rectangles from a different area which is
partially close to the original one, the new rectangles may end up being siblings of those large bounding boxes instead of the original rectangles that could be clustered within nodes in a much more reasonable way.&lt;/p&gt;
&lt;p&gt;Given the above problems, the evaluation of the whole idea of &lt;code&gt;Damage&lt;/code&gt; being the adaptation of R-Tree with pruning is the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;insertion is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;log&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; where &lt;math&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt; is the node size, &lt;math&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt; is a parameter, and &lt;math&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;&amp;lt;&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;≤&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;li&gt;iteration is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;low&lt;/strong&gt; to &lt;strong&gt;medium&lt;/strong&gt; ⚠️&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Although the above evaluation looks reasonable, in practice, it’s very hard to pick the proper pruning strategy. When the tree is allowed to be taller, the &lt;strong&gt;damage resolution&lt;/strong&gt; is usually better, but the increased &lt;strong&gt;memory footprint&lt;/strong&gt;,
logarithmic insertions, and increased iteration time combined pose a significant problem. On the other hand, when the tree is shorter, the &lt;strong&gt;damage resolution&lt;/strong&gt; tends to be low enough not to justify using R-Tree.&lt;/p&gt;
&lt;h4 id=&quot;grid-based-damage&quot; tabindex=&quot;-1&quot;&gt;Grid-based &lt;code&gt;Damage&lt;/code&gt; &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The last of the more sophisticated &lt;code&gt;Damage&lt;/code&gt; implementations uses some ideas from R-Tree and forms a very strict, flat structure. In short, the idea is to take some rectangular part of a plane and divide it into cells,
thus forming a grid with &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;/math&gt; columns and &lt;math&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt; rows. Given such a division, each cell of the grid is meant to store at most one rectangle that effectively is a bounding box of the rectangles matched to
that cell. The overview of the approach is presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/rSa-hrDW6c-1063.avif 1063w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/rSa-hrDW6c-1063.webp 1063w&quot;&gt;&lt;img alt=&quot;Grid-based Damage creation process.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/rSa-hrDW6c-1063.svg&quot; width=&quot;1063&quot; height=&quot;348&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
&lt;p&gt;As the above situation is very straightforward, one may wonder what would happen if a rectangle spanned multiple cells, i.e. how the matching algorithm would work in that case.&lt;/p&gt;
&lt;p&gt;Before diving into the matching, it’s important to note that from the algorithmic perspective, the matching is very important as it accounts for the majority of operations during new rectangle insertion into the &lt;code&gt;Damage&lt;/code&gt; data structure.
This is because once the matched cell is known, the remaining part of the insertion is just a matter of taking the bounding box of the existing rectangle stored in the cell and the new rectangle, which has
&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; time complexity.&lt;/p&gt;
&lt;p&gt;As for the matching itself, it can be done in various ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it can be done using strategies known from R-Tree, such as matching a new rectangle into the cell where the bounding box enlargement would be the smallest etc.,&lt;/li&gt;
&lt;li&gt;it can be done by maximizing the overlap between the new rectangle and the given cell,&lt;/li&gt;
&lt;li&gt;it can be done by matching the new rectangle’s center (or corner) into the proper cell,&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above matching strategies fall into 2 categories:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; matching algorithms that compare a new rectangle against existing cells while looking for the best match,&lt;/li&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; matching algorithms that calculate the target cell using a single formula.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Due to the nature of matching, the &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; strategies eventually lead to smaller bounding boxes stored in the &lt;code&gt;Damage&lt;/code&gt; and hence to better &lt;strong&gt;damage resolution&lt;/strong&gt; as compared to the
&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; algorithms. However, as the practical experiments show, the difference in &lt;strong&gt;damage resolution&lt;/strong&gt; is not big enough to justify &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt;
time complexity over &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt;. More specifically, the difference in &lt;strong&gt;damage resolution&lt;/strong&gt; is usually unnoticeable, while the difference between
&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; and &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; insertion complexity is major, as the insertion is the most critical operation of the &lt;code&gt;Damage&lt;/code&gt; data structure.&lt;/p&gt;
&lt;p&gt;Due to the above, the matching method that has proven to be the most practical is &lt;strong&gt;matching the new rectangle’s center into the proper cell&lt;/strong&gt;. It has &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; time complexity
as it requires just a few arithmetic operations to calculate the center of the incoming rectangle and to match it to the proper cell (see
&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/Damage.h&quot;&gt;the implementation in WebKit&lt;/a&gt;). The example of such matching is presented in the image below:&lt;/p&gt;
&lt;center&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/g6BW5LjYRp-1063.avif 1063w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/g6BW5LjYRp-1063.webp 1063w&quot;&gt;&lt;img alt=&quot;Matching rectangles to proper cells.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/g6BW5LjYRp-1063.svg&quot; width=&quot;1063&quot; height=&quot;349&quot;&gt;&lt;/picture&gt;
&lt;/center&gt;
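&lt;p&gt;The center-based matching can be sketched in a few lines. The snippet below is only an illustration under stated assumptions, not the WebKit code: the &lt;code&gt;Rect&lt;/code&gt; and &lt;code&gt;GridDamage&lt;/code&gt; names, the integer arithmetic, and the clamping of out-of-grid centers to the nearest edge cell are all hypothetical choices made for the example.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

    def unite(self, other):
        """Return the bounding box of this rectangle and another one."""
        left = min(self.x, other.x)
        top = min(self.y, other.y)
        right = max(self.x + self.width, other.x + other.width)
        bottom = max(self.y + self.height, other.y + other.height)
        return Rect(left, top, right - left, bottom - top)


class GridDamage:
    """Hypothetical grid-based damage store (illustration only)."""

    def __init__(self, area, columns, rows):
        self.area = area                      # rectangular part of the plane
        self.columns = columns                # C
        self.rows = rows                      # R
        self.cells = [None] * (columns * rows)

    def cell_index(self, rect):
        # Center of the incoming rectangle, relative to the grid origin.
        cx = rect.x + rect.width // 2 - self.area.x
        cy = rect.y + rect.height // 2 - self.area.y
        # Scale to cell coordinates; clamp centers that fall outside the grid
        # (an assumption of this sketch).
        col = min(max(cx * self.columns // self.area.width, 0), self.columns - 1)
        row = min(max(cy * self.rows // self.area.height, 0), self.rows - 1)
        return row * self.columns + col

    def add(self, rect):
        # O(1): one index computation plus one bounding-box union.
        i = self.cell_index(rect)
        self.cells[i] = rect if self.cells[i] is None else self.cells[i].unite(rect)

    def rects(self):
        # O(CR): iterate over all cells, skipping the empty ones.
        return [r for r in self.cells if r is not None]
```

&lt;p&gt;With an 800x400 area and an 8x4 grid, each cell covers 100x100 pixels, so two small rectangles near the top-left corner end up united in the first cell, while a rectangle centered at (200, 300) lands in the cell at column 2, row 3.&lt;/p&gt;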
&lt;p&gt;The overall evaluation of the grid-based &lt;code&gt;Damage&lt;/code&gt; constructed the way described in the above paragraphs is as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;performance&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;insertion is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;li&gt;iteration is &lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;memory footprint&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;damage resolution&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;low&lt;/strong&gt; to &lt;strong&gt;high&lt;/strong&gt; (depending on the &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;) ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Clearly, the fundamentals of the grid-based &lt;code&gt;Damage&lt;/code&gt; are strong, but the data structure is heavily dependent on the &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;. The good news is that, in practice, even a fairly small grid such as 8x4
(&lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/math&gt;)
yields a &lt;strong&gt;damage resolution&lt;/strong&gt; that is &lt;strong&gt;high&lt;/strong&gt;. It means that this &lt;code&gt;Damage&lt;/code&gt; implementation is a great alternative to bounding box &lt;code&gt;Damage&lt;/code&gt; as even with very small &lt;strong&gt;performance&lt;/strong&gt; and &lt;strong&gt;memory footprint&lt;/strong&gt; overhead,
it yields much better &lt;strong&gt;damage resolution&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Moreover, the grid-based &lt;code&gt;Damage&lt;/code&gt; implementation gives an opportunity for very handy optimizations that improve &lt;strong&gt;memory footprint&lt;/strong&gt;, &lt;strong&gt;performance&lt;/strong&gt; (iteration), and &lt;strong&gt;damage resolution&lt;/strong&gt; further.&lt;/p&gt;
&lt;p&gt;As the grid dimensions are given a priori, one can imagine that intrinsically, the data structure needs to allocate a fixed-size array of rectangles with &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt; entries to store cell bounding boxes.&lt;/p&gt;
&lt;p&gt;One possibility for improvement in such a situation (assuming small &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;) is to use a vector along with bitset so that only non-empty cells are stored in the vector.&lt;/p&gt;
&lt;p&gt;The other possibility (again, assuming small &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;) is to not use a grid-based approach at all as long as the number of rectangles inserted so far does not exceed &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;.
In other words, the data structure can allocate an empty vector of rectangles upon initialization and then just append new rectangles to the vector as long as the insertion does not extend the vector beyond
&lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt; entries. In such a case, when &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt; is e.g. 32, up to 32 rectangles can be stored in the original form. If at some point the data structure detects that it would need to
store 33 rectangles, it switches internally to a grid-based approach, thus always storing at most 32 rectangles for cells. Also, note that in such a case, the first improvement possibility (with bitset) can still be used.&lt;/p&gt;
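&lt;p&gt;The switch from exact rectangles to grid cells can be sketched as follows. This is a minimal illustration and not the WebKit implementation: the &lt;code&gt;HybridDamage&lt;/code&gt; name and its methods are hypothetical, rectangles are plain &lt;code&gt;(x, y, width, height)&lt;/code&gt; tuples, the grid is assumed to start at the origin, and a dictionary keyed by cell index stands in for the bitset-plus-vector storage of non-empty cells.&lt;/p&gt;

```python
def unite(a, b):
    """Bounding box of two (x, y, width, height) rectangles."""
    left, top = min(a[0], b[0]), min(a[1], b[1])
    right = max(a[0] + a[2], b[0] + b[2])
    bottom = max(a[1] + a[3], b[1] + b[3])
    return (left, top, right - left, bottom - top)


class HybridDamage:
    """Hypothetical sketch: exact rectangles first, grid cells after overflow."""

    def __init__(self, width, height, columns, rows):
        self.width, self.height = width, height
        self.columns, self.rows = columns, rows
        self.limit = columns * rows   # CR
        self.rects = []               # exact rectangles (before the switch)
        self.cells = {}               # cell index -> bounding box (after it)
        self.grid_mode = False

    def _cell_index(self, rect):
        # Match the rectangle's center to a cell, clamping to the grid.
        cx = rect[0] + rect[2] // 2
        cy = rect[1] + rect[3] // 2
        col = min(max(cx * self.columns // self.width, 0), self.columns - 1)
        row = min(max(cy * self.rows // self.height, 0), self.rows - 1)
        return row * self.columns + col

    def _add_to_grid(self, rect):
        i = self._cell_index(rect)
        self.cells[i] = unite(self.cells[i], rect) if i in self.cells else rect

    def add(self, rect):
        if not self.grid_mode and self.limit > len(self.rects):
            self.rects.append(rect)   # still room: keep the original form
            return
        if not self.grid_mode:
            # One rectangle too many: move everything stored so far to the grid.
            self.grid_mode = True
            for stored in self.rects:
                self._add_to_grid(stored)
            self.rects = []
        self._add_to_grid(rect)

    def result(self):
        if self.grid_mode:
            return [self.cells[i] for i in sorted(self.cells)]
        return list(self.rects)
```

&lt;p&gt;In this sketch the one-time switch costs as many cell insertions as there are stored rectangles, i.e. at most &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;, and every later insertion is back to constant time.&lt;/p&gt;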
&lt;p&gt;Summarizing the above, both improvements can be combined and they allow the data structure to have a limited, small &lt;strong&gt;memory footprint&lt;/strong&gt;, good &lt;strong&gt;performance&lt;/strong&gt;, and &lt;strong&gt;perfect damage resolution&lt;/strong&gt; as long as there
are not too many damage rectangles. If the number of input rectangles exceeds the limit, the data structure can still fall back to a grid-based approach and maintain very good results. In practice, situations
where the number of input damage rectangles does not exceed &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt; (e.g. 32) are very common, and hence the above improvements are very important.&lt;/p&gt;
&lt;p&gt;Overall, the grid-based approach with the above improvements has proven to be the best solution for all the embedded devices tried so far, and therefore, such a &lt;code&gt;Damage&lt;/code&gt; implementation is a baseline solution used in
WPE and GTK WebKit when the &lt;code&gt;UnifyDamagedRegions&lt;/code&gt; runtime preference is not enabled, which means it is used by default in WPE WebKit.&lt;/p&gt;
&lt;h2 id=&quot;conclusions&quot; tabindex=&quot;-1&quot;&gt;Conclusions &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/the-problem-of-storing-the-damage/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The former sections demonstrated various approaches to implementing the &lt;code&gt;Damage&lt;/code&gt; data structure meant to store damage information. The summary of the results is presented in the table below:&lt;/p&gt;
&lt;div&gt;
    &lt;template shadowrootmode=&quot;open&quot;&gt;
        &lt;style&gt;
            table {
                border-collapse: separate;
                border-spacing: 2px;
            }
            th {
                background-color: #666;
                color: #fff;
            }
            th, td {
                padding: 20px;
            }
            tr:nth-child(odd) {
                background-color: #fafafa;
            }
            tr:nth-child(even) {
                background-color: #f2f2f2;
            }
            .code {
                background-color: #e5e5e5;
                padding: .25rem;
                border-radius: 3px;
            }
        &lt;/style&gt;
        &lt;center&gt;
            &lt;table&gt;
                &lt;thead&gt;
                    &lt;tr&gt;
                        &lt;th&gt;Implementation&lt;/th&gt;
                        &lt;th&gt;Insertion&lt;/th&gt;
                        &lt;th&gt;Iteration&lt;/th&gt;
                        &lt;th&gt;Memory&lt;/th&gt;
                        &lt;th&gt;Overlaps&lt;/th&gt;
                        &lt;th&gt;Resolution&lt;/th&gt;
                    &lt;/tr&gt;
                &lt;/thead&gt;
                &lt;tbody&gt;
                    &lt;tr&gt;
                        &lt;td&gt;Bounding box&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;No&lt;/td&gt;
                        &lt;td&gt;usually &lt;b&gt;low&lt;/b&gt; ⚠️&lt;/td&gt;
                    &lt;/tr&gt;
                    &lt;tr&gt;
                        &lt;td&gt;Grid-based&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;Yes&lt;/td&gt;
                        &lt;td&gt;&lt;b&gt;low&lt;/b&gt; to &lt;b&gt;high&lt;/b&gt; ✅&lt;br&gt;(depending on the &lt;math&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;)&lt;/td&gt;
                    &lt;/tr&gt;
                    &lt;tr&gt;
                        &lt;td&gt;R-Tree (with pruning)&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;log&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;Yes&lt;/td&gt;
                        &lt;td&gt;&lt;b&gt;low&lt;/b&gt; to &lt;b&gt;medium&lt;/b&gt; ⚠️&lt;/td&gt;
                    &lt;/tr&gt;
                    &lt;tr&gt;
                        &lt;td&gt;R-Tree (without pruning)&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;log&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/td&gt;
                        &lt;td&gt;Yes&lt;/td&gt;
                        &lt;td&gt;&lt;b&gt;low&lt;/b&gt; to &lt;b&gt;high&lt;/b&gt; ✅&lt;/td&gt;
                    &lt;/tr&gt;
                    &lt;tr&gt;
                        &lt;td&gt;All rects&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/td&gt;
                        &lt;td&gt;Yes&lt;/td&gt;
                        &lt;td&gt;&lt;b&gt;perfect&lt;/b&gt; ✅&lt;/td&gt;
                    &lt;/tr&gt;
                    &lt;tr&gt;
                        &lt;td&gt;&lt;font class=&quot;code&quot;&gt;Region&lt;/font&gt;&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;log&lt;/mi&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ✅&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/td&gt;
                        &lt;td&gt;&lt;math&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/math&gt; ❌&lt;/td&gt;
                        &lt;td&gt;No&lt;/td&gt;
                        &lt;td&gt;&lt;b&gt;perfect&lt;/b&gt; ✅&lt;/td&gt;
                    &lt;/tr&gt;
                &lt;/tbody&gt;
            &lt;/table&gt;
        &lt;/center&gt;
    &lt;/template&gt;
&lt;/div&gt;
&lt;p&gt;While all the solutions have various pros and cons, the &lt;strong&gt;Bounding box&lt;/strong&gt; and &lt;strong&gt;Grid-based&lt;/strong&gt; &lt;code&gt;Damage&lt;/code&gt; implementations are the most lightweight and hence are most useful in generic use cases.&lt;/p&gt;
&lt;p&gt;On typical embedded devices — where CPUs are quite powerful compared to GPUs — both above solutions are acceptable, so the final choice can be determined based on the actual use case. If the actual web application
often yields clustered damage information, the &lt;strong&gt;Bounding box&lt;/strong&gt; &lt;code&gt;Damage&lt;/code&gt; implementation should be preferred. Otherwise (majority of use cases), the &lt;strong&gt;Grid-based&lt;/strong&gt; &lt;code&gt;Damage&lt;/code&gt; implementation will work better.&lt;/p&gt;
&lt;p&gt;On the other hand, on desktop-class devices – where CPUs are far less powerful than GPUs – the only acceptable solution is &lt;strong&gt;Bounding box&lt;/strong&gt; &lt;code&gt;Damage&lt;/code&gt; as it has minimal overhead while it still provides a
decent damage resolution.&lt;/p&gt;
&lt;p&gt;The above are the reasons for the default &lt;code&gt;Damage&lt;/code&gt; implementations used by the desktop-oriented &lt;strong&gt;GTK&lt;/strong&gt; WebKit port (&lt;strong&gt;Bounding box&lt;/strong&gt; &lt;code&gt;Damage&lt;/code&gt;) and the embedded-device-oriented &lt;strong&gt;WPE&lt;/strong&gt; WebKit port (&lt;strong&gt;Grid-based&lt;/strong&gt; &lt;code&gt;Damage&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;When it comes to non-generic situations such as unusual hardware, specific applications, etc., it’s always recommended to do a proper evaluation to determine which solution is the best fit. Also, the &lt;code&gt;Damage&lt;/code&gt; implementations
other than the two mentioned above should not be ruled out, as in some exotic cases, they may give much better results.&lt;/p&gt;
</content>
	</entry>
	
	<entry>
		<title>Introduction to damage propagation in WPE and GTK WebKit ports</title>
		<link href="https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/"/>
		<updated>2025-04-16T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/</id>
		<content type="html">&lt;p&gt;Damage propagation is an optional WPE/GTK WebKit feature that — when enabled — reduces browser’s GPU utilization at the expense of increased CPU and memory utilization. It’s very useful especially in the context of low- and mid-end
embedded devices, where GPUs are most often not too powerful and thus become a performance bottleneck in many applications.&lt;/p&gt;
&lt;h2 id=&quot;basic-definitions&quot; tabindex=&quot;-1&quot;&gt;Basic definitions &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The only two terms that require explanation to understand the feature on a surface level are the &lt;strong&gt;damage&lt;/strong&gt; and its &lt;strong&gt;propagation&lt;/strong&gt;.&lt;/p&gt;
&lt;h4 id=&quot;the-damage&quot; tabindex=&quot;-1&quot;&gt;The damage &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;In computer graphics, the &lt;strong&gt;damage&lt;/strong&gt; term is usually used in the context of repeatable rendering and means essentially &lt;strong&gt;“the region of a rendered scene that changed and requires repainting”&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In the context of WebKit, the above definition may be specialized a bit as WebKit’s rendering engine is about rendering web content to frames (passed further to the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt;) in response to changes within a web page.
Thus the definition of WebKit’s &lt;strong&gt;damage&lt;/strong&gt; refers, more specifically, to &lt;strong&gt;“the region of web page view that changed since previous frame and requires repainting”&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;On the implementation level, the &lt;strong&gt;damage&lt;/strong&gt; is almost always a collection of rectangles that cover the changed region. This is exactly the case for WPE and GTK WebKit ports.&lt;/p&gt;
&lt;p&gt;To better understand what the above means, it’s recommended to carefully examine the below screenshot of GTK MiniBrowser as it depicts the rendering of &lt;a href=&quot;https://webkit.org/blog-files/3d-transforms/poster-circle.html&quot;&gt;the poster circle demo&lt;/a&gt;
with the damage visualizer activated:
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/o-b3HpVCi--1677.avif 1677w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/o-b3HpVCi--1677.webp 1677w&quot;&gt;&lt;img alt=&quot;GTK MiniBrowser screenshot showing the damage visualization.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/o-b3HpVCi--1677.png&quot; width=&quot;1677&quot; height=&quot;1003&quot;&gt;&lt;/picture&gt;
In the image above, one can see the following elements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the &lt;strong&gt;web page view&lt;/strong&gt;, marked with a rectangle stroked in magenta,&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;damage&lt;/strong&gt;, marked with red rectangles,&lt;/li&gt;
&lt;li&gt;the browser elements, i.e. everything that lies above the rectangle stroked in magenta.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What the above image depicts in practice is that during the rendering of that particular frame, the area highlighted in red (the &lt;strong&gt;damage&lt;/strong&gt;) has changed and needs to be repainted. Thus, as expected, only the moving parts of the demo require repainting.
It’s also easy to see how small a fraction of the web page view requires repainting, and hence one can imagine the gains from the reduced amount of painting.&lt;/p&gt;
&lt;h4 id=&quot;the-propagation&quot; tabindex=&quot;-1&quot;&gt;The propagation &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Normally, the job of the rendering engine is to paint the contents of a &lt;strong&gt;web page view&lt;/strong&gt; to a &lt;strong&gt;frame&lt;/strong&gt; (or buffer in more general terms) and provide such rendering result to the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt; on every scene rendering iteration —
which usually is 60 times per second.
Without the damage propagation feature, the whole frame (the whole &lt;strong&gt;web page view&lt;/strong&gt;) is always marked as changed. Therefore, the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt; has to perform a full update of the pixels it has 60 times per second.&lt;/p&gt;
&lt;p&gt;While in most of the use cases, the above approach is good enough, in the case of embedded devices with less powerful GPUs, this can be optimized. The basic idea is to produce the &lt;strong&gt;frame&lt;/strong&gt; along with the &lt;strong&gt;damage&lt;/strong&gt; information i.e. a hint for
the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt; on what changed within the produced &lt;strong&gt;frame&lt;/strong&gt;. With the &lt;strong&gt;damage&lt;/strong&gt; provided (usually as an array of rectangles), the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt; can optimize a lot of its operations as — effectively — it can
perform just a partial update of its internal memory. In practice, this usually means that fewer pixels require updating on the screen.&lt;/p&gt;
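&lt;p&gt;To make the gain more tangible, the toy snippet below contrasts a full update with a partial one. It is only a sketch under stated assumptions: the &lt;code&gt;present&lt;/code&gt; function is hypothetical, frames are modelled as lists of pixel rows, and the damage is a list of &lt;code&gt;(x, y, width, height)&lt;/code&gt; tuples. Real platforms do the equivalent work through their own buffer and display APIs.&lt;/p&gt;

```python
def present(screen, frame, damage=None):
    """Copy `frame` into `screen`; return the number of pixels written.

    Without a damage hint the whole frame is copied. With one, only the
    pixels inside the damage rectangles are touched.
    """
    height, width = len(frame), len(frame[0])
    if damage is None:
        damage = [(0, 0, width, height)]   # no hint: full update
    written = 0
    for (x, y, w, h) in damage:
        # Clip each rectangle to the frame bounds before copying.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                screen[row][col] = frame[row][col]
                written += 1
    return written
```

&lt;p&gt;For a 4x4 frame in which only two neighbouring pixels changed, the partial update writes 2 pixels while the full update writes all 16, and both leave the screen in the same state.&lt;/p&gt;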
&lt;p&gt;For the above optimization to work, the damage has to be calculated by the rendering engine for each frame and then propagated along with the produced frame up to its final destination. Thus the &lt;strong&gt;damage propagation&lt;/strong&gt; can be summarized
as &lt;strong&gt;continuous damage calculation and propagation throughout the web engine&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id=&quot;damage-propagation-pipeline&quot; tabindex=&quot;-1&quot;&gt;Damage propagation pipeline &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Once the general idea has been highlighted, it’s possible to examine the damage propagation in more detail. Before reading further, however, it’s highly recommended for the reader to go carefully through the
famous &lt;a href=&quot;https://wpewebkit.org/blog/03-wpe-graphics-architecture.html&quot;&gt;&lt;em&gt;“WPE Graphics architecture”&lt;/em&gt;&lt;/a&gt; article that gives a good overview of the WebKit graphics pipeline in general and which introduces the basic terminology
used in that context.&lt;/p&gt;
&lt;h4 id=&quot;pipeline-overview&quot; tabindex=&quot;-1&quot;&gt;Pipeline overview &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The information on the visual changes within the &lt;strong&gt;web page view&lt;/strong&gt; has to travel a very long way before it reaches the final destination. As it traverses the thread and process boundaries in an orderly manner, it can be summarized
as forming a pipeline within the broader graphics pipeline. The image below presents an overview of such a damage propagation pipeline:&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/ZEycMrSEVA-4883.avif 4883w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/ZEycMrSEVA-4883.webp 4883w&quot;&gt;&lt;img alt=&quot;Damage propagation pipeline overview.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/ZEycMrSEVA-4883.svg&quot; width=&quot;4883&quot; height=&quot;1363&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;h4 id=&quot;pipeline-details&quot; tabindex=&quot;-1&quot;&gt;Pipeline details &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;This pipeline starts with the changes to the &lt;strong&gt;web page view&lt;/strong&gt; visual state (&lt;strong&gt;RenderTree&lt;/strong&gt;) being triggered by one of many possible sources. Such sources may include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;user interactions, e.g. moving the mouse cursor around (and hence hovering over elements), typing text using the keyboard, etc.,&lt;/li&gt;
&lt;li&gt;Web API usage, e.g. the web page changing the DOM, CSS, etc.,&lt;/li&gt;
&lt;li&gt;multimedia, e.g. the media player in a playing state,&lt;/li&gt;
&lt;li&gt;and many others.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once the changes are induced for certain &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/rendering/RenderObject.h&quot;&gt;RenderObjects&lt;/a&gt;, their visual impact is calculated and encoded as rectangles called &lt;strong&gt;dirty&lt;/strong&gt; as they
require re-painting within a &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/GraphicsLayer.h&quot;&gt;GraphicsLayer&lt;/a&gt; the particular &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/rendering/RenderObject.h&quot;&gt;RenderObject&lt;/a&gt;
maps to. At this point, the visual changes may simply be called &lt;strong&gt;layer damage&lt;/strong&gt; as the &lt;strong&gt;dirty&lt;/strong&gt; rectangles are stored in the layer coordinate space and as they describe what changed within that certain layer since the last frame was rendered.&lt;/p&gt;
&lt;p&gt;The next step in the pipeline is passing the &lt;strong&gt;layer damage&lt;/strong&gt; of each &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/GraphicsLayer.h&quot;&gt;GraphicsLayer&lt;/a&gt;
(&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/coordinated/GraphicsLayerCoordinated.h&quot;&gt;GraphicsLayerCoordinated&lt;/a&gt;) to the WebKit’s compositor. This is done along with any other layer
updates and is mostly covered by the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/coordinated/CoordinatedPlatformLayer.h&quot;&gt;CoordinatedPlatformLayer&lt;/a&gt;.
The &lt;strong&gt;“coordinated”&lt;/strong&gt; prefix of that name is not without meaning. As threaded accelerated compositing is usually used nowadays, passing the &lt;strong&gt;layer damage&lt;/strong&gt; to the WebKit’s compositor must be coordinated between the main thread and
the compositor thread.&lt;/p&gt;
&lt;p&gt;When the &lt;strong&gt;layer damage&lt;/strong&gt; of each layer is passed to the WebKit’s compositor, it’s stored in the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/TextureMapperLayer.h&quot;&gt;TextureMapperLayer&lt;/a&gt; that corresponds to the given
layer’s &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/coordinated/CoordinatedPlatformLayer.h&quot;&gt;CoordinatedPlatformLayer&lt;/a&gt;. With that — and with all other layer-level updates — the
WebKit’s compositor can start computing the &lt;strong&gt;frame damage&lt;/strong&gt;, i.e. the final &lt;strong&gt;damage&lt;/strong&gt; that is passed to the very end of the pipeline.&lt;/p&gt;
&lt;p&gt;The first step to building &lt;strong&gt;frame damage&lt;/strong&gt; is to process the layer updates. Layer updates describe changes of various layer properties such as size, position, transform, opacity, background color, etc. Many of those updates
have a visual impact on the final frame, therefore a portion of &lt;strong&gt;frame damage&lt;/strong&gt; must be inferred from those changes. For example, a layer’s transform change that effectively changes the layer position means that the layer
visually disappears from one place and appears in the other. Thus the &lt;strong&gt;frame damage&lt;/strong&gt; has to account for both the layer’s old and new position.&lt;/p&gt;
&lt;p&gt;Once the layer updates are processed, WebKit’s compositor has a full set of information to take the &lt;strong&gt;layer damage&lt;/strong&gt; of each layer into account. Thus in the second step, WebKit’s compositor traverses the tree formed out of
&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/TextureMapperLayer.h&quot;&gt;TextureMapperLayer&lt;/a&gt; objects and collects their &lt;strong&gt;layer damages&lt;/strong&gt;. Once the &lt;strong&gt;layer damage&lt;/strong&gt; of a certain layer
is collected, it’s transformed from the layer coordinate space into a global coordinate space so that it can be added to the &lt;strong&gt;frame damage&lt;/strong&gt; directly.&lt;/p&gt;
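&lt;p&gt;The two steps can be illustrated with a deliberately simplified model. The sketch below is not how the compositor is implemented: the &lt;code&gt;Layer&lt;/code&gt; class, its &lt;code&gt;move_to&lt;/code&gt; method, and the &lt;code&gt;collect_layer_damage&lt;/code&gt; function are hypothetical names, and layer transforms are assumed to be plain translations, so mapping a rectangle to the global coordinate space is just an offset.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Hypothetical stand-in for a compositor layer (translation only)."""
    x: int
    y: int
    width: int
    height: int
    dirty: list = field(default_factory=list)     # layer-space (x, y, w, h)
    children: list = field(default_factory=list)

    def move_to(self, x, y, frame_damage, origin=(0, 0)):
        # Step 1: a position change damages both the old and the new bounds.
        # `origin` is the global position of the parent layer.
        frame_damage.append((origin[0] + self.x, origin[1] + self.y,
                             self.width, self.height))
        self.x, self.y = x, y
        frame_damage.append((origin[0] + x, origin[1] + y,
                             self.width, self.height))


def collect_layer_damage(layer, frame_damage, origin=(0, 0)):
    # Step 2: walk the layer tree and map every dirty rectangle from the
    # layer coordinate space to the global one before adding it.
    gx, gy = origin[0] + layer.x, origin[1] + layer.y
    for (x, y, w, h) in layer.dirty:
        frame_damage.append((gx + x, gy + y, w, h))
    for child in layer.children:
        collect_layer_damage(child, frame_damage, (gx, gy))
```

&lt;p&gt;For a 200x100 child layer that moves from (100, 50) to (300, 50) and carries one dirty rectangle at (10, 10), the resulting list holds the old bounds, the new bounds, and the dirty rectangle shifted to (310, 60). In a real engine the list would be a &lt;code&gt;Damage&lt;/code&gt; structure and the mapping would apply the full layer transform.&lt;/p&gt;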
&lt;p&gt;After those two steps, the &lt;strong&gt;frame damage&lt;/strong&gt; is ready. At this point, it can be used for a couple of extra use cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;for WebKit’s compositor itself to perform some extra optimizations — as will be explained in the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/webkit-s-compositor-optimizations&quot;&gt;WebKit’s compositor optimizations&lt;/a&gt; section,&lt;/li&gt;
&lt;li&gt;for layout tests.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Eventually, regardless of the extra uses, WebKit’s compositor composes the frame and sends it (or rather a handle to it) to the UI process along with the &lt;strong&gt;frame damage&lt;/strong&gt; using the IPC mechanism.&lt;/p&gt;
&lt;p&gt;In the UI process, the &lt;strong&gt;frame damage&lt;/strong&gt; can meet one of two fates: it is either consumed or ignored, depending on the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt;-facing implementation. At the time of writing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the GTK port consumes the &lt;strong&gt;damage&lt;/strong&gt; (see &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebKit/UIProcess/gtk/AcceleratedBackingStoreDMABuf.cpp&quot;&gt;(…)/gtk/AcceleratedBackingStoreDMABuf.cpp&lt;/a&gt;),&lt;/li&gt;
&lt;li&gt;the WPE port consumes the &lt;strong&gt;damage&lt;/strong&gt; only if the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;new WPE platform API&lt;/a&gt; is used along with one of the following platforms:
&lt;ul&gt;
&lt;li&gt;Wayland (see &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebKit/WPEPlatform/wpe/wayland/WPEViewWayland.cpp&quot;&gt;(…)/WPEPlatform/wpe/wayland/WPEViewWayland.cpp&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;DRM (see &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebKit/WPEPlatform/wpe/drm/WPEViewDRM.cpp&quot;&gt;(…)/WPEPlatform/wpe/drm/WPEViewDRM.cpp&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once the &lt;strong&gt;frame damage&lt;/strong&gt; is consumed, it means that it reached the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt; and thus the pipeline ends for that frame.&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/XoiATRtEhC-4084.avif 4084w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/XoiATRtEhC-4084.webp 4084w&quot;&gt;&lt;img alt=&quot;Damage propagation pipeline details.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/XoiATRtEhC-4084.svg&quot; width=&quot;4084&quot; height=&quot;1263&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;h2 id=&quot;current-status-of-the-implementation&quot; tabindex=&quot;-1&quot;&gt;Current status of the implementation &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;At the moment of writing, the damage propagation feature is run-time-disabled by default (&lt;code&gt;PropagateDamagingInformation&lt;/code&gt; feature flag) and compile-time enabled by default for GTK and WPE (with &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;new platform API&lt;/a&gt;) ports.
Overall, the feature works pretty well in the majority of real-world scenarios. However, there are still some uncovered code paths that lead to visual glitches. Therefore it’s fair to say the feature is still a work in progress.
The work, however, is pretty advanced. Moreover, the feature is set to a &lt;strong&gt;testable&lt;/strong&gt; state and thus it’s active throughout all the &lt;a href=&quot;https://github.com/WebKit/WebKit/tree/main/LayoutTests&quot;&gt;layout test&lt;/a&gt; runs on CI.
Not only is the feature exercised by every layout test that involves any kind of rendering, but it also has quite a lot of &lt;a href=&quot;https://github.com/WebKit/WebKit/tree/main/LayoutTests/platform/glib/damage&quot;&gt;dedicated layout tests&lt;/a&gt;.
Not to mention the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Tools/TestWebKitAPI/Tests/WebCore/glib/Damage.cpp&quot;&gt;unit tests&lt;/a&gt; covering the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/Damage.h&quot;&gt;Damage&lt;/a&gt; class.&lt;/p&gt;
&lt;p&gt;In terms of functionalities, when the feature is enabled it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;activates the damage propagation pipeline and hence propagates the &lt;strong&gt;damage&lt;/strong&gt; up to the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;activates additional WebKit-compositor-level optimizations.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&quot;damage-propagation&quot; tabindex=&quot;-1&quot;&gt;Damage propagation &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;When the feature is enabled, the main goal is to activate the damage propagation pipeline so that eventually the &lt;strong&gt;damage&lt;/strong&gt; can be provided to the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;platform&lt;/a&gt;. However, in reality, a substantial part of the pipeline is always active
regardless of whether the feature is enabled or even compiled in. This part of the pipeline ends before the &lt;strong&gt;damage&lt;/strong&gt; reaches
&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/coordinated/CoordinatedPlatformLayer.h&quot;&gt;CoordinatedPlatformLayer&lt;/a&gt; and is always active because it was used for layer-level optimizations for a long time.
More specifically — this part of the pipeline existed long before the damage propagation feature and was using &lt;strong&gt;layer damage&lt;/strong&gt; to optimize the layer painting to the intermediate surfaces.&lt;/p&gt;
&lt;p&gt;Because of the above, when the feature is enabled, only the part of the pipeline that starts with &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/main/Source/WebCore/platform/graphics/texmap/coordinated/CoordinatedPlatformLayer.h&quot;&gt;CoordinatedPlatformLayer&lt;/a&gt;
is activated. It is, however, still a significant portion of the pipeline and therefore it implies additional CPU/memory costs.&lt;/p&gt;
&lt;h4 id=&quot;webkit-s-compositor-optimizations&quot; tabindex=&quot;-1&quot;&gt;WebKit’s compositor optimizations &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;When the feature is activated and the &lt;strong&gt;damage&lt;/strong&gt; flows through the WebKit’s compositor, it creates a unique opportunity for the compositor to utilize that information and reduce the amount of painting/compositing it has to perform.
At the moment of writing, the GTK/WPE WebKit’s compositor is using the &lt;strong&gt;damage&lt;/strong&gt; to optimize the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;to apply global &lt;code&gt;glScissor&lt;/code&gt; to define the smallest possible clipping rect for all the painting it does — thus reducing the amount of painting,&lt;/li&gt;
&lt;li&gt;to reduce the amount of painting when compositing the tiles of the layers using tiled backing stores.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Detailed descriptions of the above optimizations are well beyond the scope of this article and thus will be provided in one of the next articles on the subject of damage propagation.&lt;/p&gt;
&lt;h2 id=&quot;trying-it-out&quot; tabindex=&quot;-1&quot;&gt;Trying it out &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;As mentioned in the above sections, the feature only works in the GTK and the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;new-platform-API&lt;/a&gt;-powered WPE ports. This means that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In the case of GTK, one can use &lt;strong&gt;MiniBrowser&lt;/strong&gt; or any up-to-date GTK-WebKit-derived browser to test the feature.&lt;/li&gt;
&lt;li&gt;In the case of WPE with the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;new WPE platform API&lt;/a&gt;, the &lt;a href=&quot;https://github.com/Igalia/cog&quot;&gt;cog&lt;/a&gt; browser cannot be used, as it uses the old API. Therefore, one has to use &lt;strong&gt;MiniBrowser&lt;/strong&gt;
with the &lt;code&gt;--use-wpe-platform-api&lt;/code&gt; argument to activate the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;new WPE platform API&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Moreover, as the feature is run-time-disabled by default, it’s necessary to activate it. In the case of &lt;strong&gt;MiniBrowser&lt;/strong&gt;, the switch is &lt;code&gt;--features=+PropagateDamagingInformation&lt;/code&gt;.&lt;/p&gt;
&lt;h4 id=&quot;building-and-running-the-gtk-minibrowser&quot; tabindex=&quot;-1&quot;&gt;Building &amp;amp; running the GTK MiniBrowser &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;For quick testing, it’s highly recommended to use the latest revision of &lt;a href=&quot;https://github.com/WebKit/WebKit/tree/main/&quot;&gt;WebKit@main&lt;/a&gt; with the &lt;a href=&quot;https://github.com/Igalia/webkit-container-sdk&quot;&gt;wkdev SDK&lt;/a&gt; container and the GTK port.
Assuming one has set up the container, the commands to build and run GTK’s &lt;strong&gt;MiniBrowser&lt;/strong&gt; are as follows:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# building:&lt;/span&gt;&lt;br&gt;./Tools/Scripts/build-webkit &lt;span class=&quot;token parameter variable&quot;&gt;--gtk&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--release&lt;/span&gt;&lt;br&gt;&lt;br&gt;&lt;span class=&quot;token comment&quot;&gt;# running with visualizer&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token assign-left variable&quot;&gt;WEBKIT_SHOW_DAMAGE&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  Tools/Scripts/run-minibrowser &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--gtk&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--release&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--features&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;+PropagateDamagingInformation &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token string&quot;&gt;&#39;https://webkit.org/blog-files/3d-transforms/poster-circle.html&#39;&lt;/span&gt;&lt;br&gt;&lt;br&gt;&lt;span class=&quot;token comment&quot;&gt;# running without visualizer&lt;/span&gt;&lt;br&gt;Tools/Scripts/run-minibrowser &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--gtk&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--release&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--features&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;+PropagateDamagingInformation &lt;span class=&quot;token 
punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token string&quot;&gt;&#39;https://webkit.org/blog-files/3d-transforms/poster-circle.html&#39;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;h4 id=&quot;building-and-running-the-wpe-minibrowser&quot; tabindex=&quot;-1&quot;&gt;Building &amp;amp; running the WPE MiniBrowser &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Alternatively, the WPE port can be used. Assuming a Wayland display is available, the commands to build and run the &lt;strong&gt;MiniBrowser&lt;/strong&gt; are the following:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# building:&lt;/span&gt;&lt;br&gt;./Tools/Scripts/build-webkit &lt;span class=&quot;token parameter variable&quot;&gt;--wpe&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--release&lt;/span&gt;&lt;br&gt;&lt;br&gt;&lt;span class=&quot;token comment&quot;&gt;# running with visualizer&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token assign-left variable&quot;&gt;WEBKIT_SHOW_DAMAGE&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  Tools/Scripts/run-minibrowser &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--wpe&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--release&lt;/span&gt; --use-wpe-platform-api &lt;span class=&quot;token parameter variable&quot;&gt;--features&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;+PropagateDamagingInformation &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token string&quot;&gt;&#39;https://webkit.org/blog-files/3d-transforms/poster-circle.html&#39;&lt;/span&gt;&lt;br&gt;&lt;br&gt;&lt;span class=&quot;token comment&quot;&gt;# running without visualizer&lt;/span&gt;&lt;br&gt;Tools/Scripts/run-minibrowser &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--wpe&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--release&lt;/span&gt; --use-wpe-platform-api &lt;span class=&quot;token parameter variable&quot;&gt;--features&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;+PropagateDamagingInformation &lt;span 
class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br&gt;  &lt;span class=&quot;token string&quot;&gt;&#39;https://webkit.org/blog-files/3d-transforms/poster-circle.html&#39;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;h4 id=&quot;trying-various-urls&quot; tabindex=&quot;-1&quot;&gt;Trying various URLs &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;While any URL can be used to test the feature, below is a short list of recommendations to check:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://igalia.com&quot;&gt;https://igalia.com&lt;/a&gt; — great for testing regular web page interactions and scrolling,&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://webkit.org/blog-files/3d-transforms/poster-circle.html&quot;&gt;https://webkit.org/blog-files/3d-transforms/poster-circle.html&lt;/a&gt; — great to see CSS transformations and animations handling,&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://scony.github.io/web-examples/canvas-2d/drawing-noise-in-moving-rect.html&quot;&gt;https://scony.github.io/web-examples/canvas-2d/drawing-noise-in-moving-rect.html&lt;/a&gt; — great to see how damage works with canvas using &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D&quot;&gt;CanvasRenderingContext2D&lt;/a&gt;.
Please note that accelerated canvas is not supported at the moment, and hence &lt;code&gt;,-CanvasUsesAcceleratedDrawing&lt;/code&gt; must be appended to the &lt;code&gt;--features=(...)&lt;/code&gt; list.&lt;/li&gt;
&lt;/ul&gt;
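&lt;p&gt;For the canvas example above, the complete invocation could be assembled as in the minimal sketch below. The &lt;code&gt;features_arg&lt;/code&gt; helper is hypothetical and introduced only for illustration, and the GTK release build from the previous sections is assumed. The command is only printed, so the leading &lt;code&gt;echo&lt;/code&gt; has to be removed to actually launch the &lt;strong&gt;MiniBrowser&lt;/strong&gt;:&lt;/p&gt;

```shell
# Hypothetical helper: join all arguments with commas into a single
# --features=... argument. The function body runs in a subshell, so the
# IFS change does not leak into the calling shell.
features_arg() (
  IFS=','
  printf '%s%s\n' '--features=' "$*"
)

url='https://scony.github.io/web-examples/canvas-2d/drawing-noise-in-moving-rect.html'
features=$(features_arg +PropagateDamagingInformation -CanvasUsesAcceleratedDrawing)

# Only print the resulting command; remove the leading "echo" to actually
# run it from within a WebKit checkout.
echo Tools/Scripts/run-minibrowser --gtk --release "$features" "$url"
```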
&lt;p&gt;It’s also worth mentioning that the &lt;code&gt;WEBKIT_SHOW_DAMAGE=1&lt;/code&gt; environment variable disables the damage-driven optimizations of the GTK/WPE WebKit’s compositor, and therefore some glitches that are visible without the variable may not be visible
when it is set. &lt;a href=&quot;https://abotella.pages.igalia.com/past-and-future-of-server-side-runtimes&quot;&gt;This presentation&lt;/a&gt; is a great example for exploring various glitches that are yet to be fixed. To trigger them, it’s enough to navigate
around the presentation using the up/right/down/left arrow keys.&lt;/p&gt;
&lt;h2 id=&quot;coming-up-next&quot; tabindex=&quot;-1&quot;&gt;Coming up next &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;This article was meant to scratch the surface of the broad topic of damage propagation. While it focused mostly on introducing basic terminology and describing the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;damage propagation pipeline&lt;/a&gt; in more detail,
it only briefly mentioned, or skipped completely, the following aspects of the feature:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the problem of storing the damage information efficiently,&lt;/li&gt;
&lt;li&gt;the damage-driven optimizations of the GTK/WPE WebKit’s compositor,&lt;/li&gt;
&lt;li&gt;the most common use cases for the feature,&lt;/li&gt;
&lt;li&gt;the benchmark results on desktop-class and embedded devices.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore, in the next articles, the above topics will be examined to a larger extent.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&quot;references&quot; tabindex=&quot;-1&quot;&gt;References &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;new WPE platform API&lt;/strong&gt; is still not released and thus it’s not yet officially announced. Some information on it, however, is provided by
&lt;a href=&quot;https://www.slideshare.net/slideshow/the-new-wpe-api/262768862&quot;&gt;this presentation&lt;/a&gt; prepared for a WebKit contributors meeting.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;platform&lt;/strong&gt; that WebKit renders to depends on the WebKit port:
&lt;ul&gt;
&lt;li&gt;in the case of the GTK port, the platform is GTK, so the rendering is done to a &lt;strong&gt;GtkWidget&lt;/strong&gt;,&lt;/li&gt;
&lt;li&gt;in the case of the WPE port with the &lt;a href=&quot;https://blogs.igalia.com/plampe/introduction-to-damage-propagation-in-wpe-and-gtk-webkit-ports/&quot;&gt;new WPE platform API&lt;/a&gt;, the platform is one of the following:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;wayland&lt;/strong&gt;, in which case rendering is done to the system’s compositor,&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DRM&lt;/strong&gt;, in which case rendering is done directly to the screen,&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;headless&lt;/strong&gt;, in which case rendering is usually done into a memory buffer.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
</content>
	</entry>
	
	<entry>
		<title>Working with WebKit and GStreamer logs in Emacs.</title>
		<link href="https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/"/>
		<updated>2025-02-26T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/</id>
		<content type="html">&lt;p&gt;WebKit has grown into a massive codebase throughout the years. To make developers’ lives easier, it offers various subsystems and integrations.
One such subsystem is a logging subsystem that offers the recording of textual logs describing an execution of the internal engine parts.&lt;/p&gt;
&lt;p&gt;The logging subsystem in WebKit (as in any computer system) is usually used for both debugging and educational purposes. As WebKit is a widely-used piece of software that runs on
everything from desktop-class devices down to low-end embedded devices, it’s not uncommon for logging to be the only way of debugging when various limiting
factors come into play. Such limiting factors don’t have to be only technical - it may also be that the software runs on some restricted systems and direct debugging is not allowed.&lt;/p&gt;
&lt;h3 id=&quot;requirements-for-efficient-work-with-textual-logs&quot; tabindex=&quot;-1&quot;&gt;Requirements for efficient work with textual logs &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Regardless of the reasons why logging is used, once the set of logs is produced, one can work with it according to the particular need.
From my experience, efficient work with textual logs requires a tool with the following capabilities:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ability to search for a particular substring or regular expression.&lt;/li&gt;
&lt;li&gt;Ability to filter text lines according to a substring or regular expression.&lt;/li&gt;
&lt;li&gt;Ability to highlight particular substrings.&lt;/li&gt;
&lt;li&gt;Ability to mark certain lines for separate examination (with extra notes if possible).&lt;/li&gt;
&lt;li&gt;Ability to save and restore the current state of work.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;While all text editors should be able to provide requirement 1, requirements 2-5 are usually more tricky and text editors won’t support them out of the box.
Fortunately, any modern extensible text editor should be able to support requirements 2-5 after some extra configuration.&lt;/p&gt;
&lt;h3 id=&quot;setting-up-emacs-to-work-with-logs&quot; tabindex=&quot;-1&quot;&gt;Setting up Emacs to work with logs &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Throughout the following sections, I use Emacs, the classic “extensible, customizable, free/libre text editor”, to showcase how it can be set up and used to meet
the above criteria and to make work with logs a gentle experience.&lt;/p&gt;
&lt;p&gt;Emacs, just like any other text editor, provides the support for requirement 1 from the previous section out of the box.&lt;/p&gt;
&lt;h4 id=&quot;loccur-the-minor-mode-for-text-filtering&quot; tabindex=&quot;-1&quot;&gt;&lt;code&gt;loccur&lt;/code&gt; - the minor mode for text filtering &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Supporting requirement 2 requires an extra mode. My recommendation for that is &lt;a href=&quot;https://codeberg.org/fourier/loccur&quot;&gt;loccur&lt;/a&gt; - the minor mode
that acts just like a classic &lt;code&gt;grep&lt;/code&gt; *nix utility yet directly in the editor. The benefit of that mode (over e.g. &lt;a href=&quot;https://www.masteringemacs.org/article/searching-buffers-occur-mode&quot;&gt;occur&lt;/a&gt;)
is that it works in-place. Therefore it’s very ergonomic and - as I’ll show later - it works well in conjunction with bookmarking mode.&lt;/p&gt;
&lt;p&gt;Installation of loccur is very simple and can be done from within the built-in package manager:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;M-x package-install RET loccur RET
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With loccur installed, one can immediately start using it by calling &lt;code&gt;M-x loccur RET &amp;lt;regex&amp;gt; RET&lt;/code&gt;. The figure below depicts the example of filtering:
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/KIvXYAU2KL-1243.avif 1243w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/KIvXYAU2KL-1243.webp 1243w&quot;&gt;&lt;img alt=&quot;loccur - the minor mode for text filtering.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/KIvXYAU2KL-1243.png&quot; width=&quot;1243&quot; height=&quot;683&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
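&lt;p&gt;For comparison, the command-line counterpart of such filtering could look like the sketch below. The log lines are made up purely for illustration and do not come from WebKit or GStreamer. Unlike loccur, which hides the non-matching lines in place, &lt;code&gt;grep&lt;/code&gt; prints the matching lines as separate output:&lt;/p&gt;

```shell
# Made-up log lines used only for illustration; real logs look different.
sample_log() {
  printf '%s\n' \
    'component-a: start' \
    'component-b: start' \
    'component-a: stop'
}

# Keep only the lines matching the regular expression and prefix each of
# them with its line number in the input.
sample_log | grep -n -E 'component-a: (start|stop)'
```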
&lt;h4 id=&quot;highlight-symbol-the-package-with-utility-functions-for-text-highlighting&quot; tabindex=&quot;-1&quot;&gt;&lt;code&gt;highlight-symbol&lt;/code&gt; - the package with utility functions for text highlighting &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;To support requirement 3, Emacs also requires the installation of an extra module. In that case, my recommendation is &lt;a href=&quot;http://nschum.de/src/emacs/highlight-symbol/&quot;&gt;highlight-symbol&lt;/a&gt;,
which is a simple set of functions that enables basic text fragment highlighting on the fly.&lt;/p&gt;
&lt;p&gt;Installation of this module is also very simple and boils down to:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;M-x package-install RET highlight-symbol RET
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the above, it’s very easy to get results like in the figure below:
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/JnQXHs4adb-1243.avif 1243w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/JnQXHs4adb-1243.webp 1243w&quot;&gt;&lt;img alt=&quot;highlight-symbol - the package with utility functions for text highlighting.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/JnQXHs4adb-1243.png&quot; width=&quot;1243&quot; height=&quot;685&quot;&gt;&lt;/picture&gt;
just by moving the cursor around and using &lt;code&gt;C-c h&lt;/code&gt; to toggle the highlight of the text at the current cursor position.&lt;/p&gt;
&lt;h4 id=&quot;bm-the-package-with-utility-functions-for-buffer-lines-bookmarking&quot; tabindex=&quot;-1&quot;&gt;&lt;code&gt;bm&lt;/code&gt; - the package with utility functions for buffer lines bookmarking &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Finally, to support requirements 4 and 5, Emacs requires one last extra package. This time, my recommendation is &lt;a href=&quot;https://github.com/joodland/bm&quot;&gt;bm&lt;/a&gt;,
which is quite a powerful set of utilities for text bookmarking.&lt;/p&gt;
&lt;p&gt;In this case, installation is also very simple and is all about:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;M-x package-install RET bm RET
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a nutshell, the &lt;code&gt;bm&lt;/code&gt; package brings some visual capabilities like in the figure below:
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qgwA30P858-1243.avif 1243w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qgwA30P858-1243.webp 1243w&quot;&gt;&lt;img alt=&quot;bm - the package with utility functions for buffer lines bookmarking&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/qgwA30P858-1243.png&quot; width=&quot;1243&quot; height=&quot;683&quot;&gt;&lt;/picture&gt;
as well as non-visual capabilities that will be discussed in further sections.&lt;/p&gt;
&lt;h4 id=&quot;the-final-configuration&quot; tabindex=&quot;-1&quot;&gt;The final configuration &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Once all the necessary modules are installed, it’s worth spending some time on configuration. With just a few simple tweaks, it’s possible to make the work with logs
simple and easily reproducible.&lt;/p&gt;
&lt;p&gt;To avoid influencing other workflows, I recommend attaching as much configuration as possible to a chosen major mode and setting that mode as the default for
files with certain extensions. The configuration below uses a major mode called &lt;code&gt;text-mode&lt;/code&gt; as the one for working with logs and associates all the files with the
&lt;code&gt;.log&lt;/code&gt; suffix with it. Moreover, the most critical commands of the modes installed in the previous sections are bound to keyboard shortcuts. The last
thing is to enable truncating the lines (&lt;code&gt;(set-default &#39;truncate-lines t)&lt;/code&gt;) and highlighting the line that the cursor is currently at (&lt;code&gt;(hl-line-mode 1)&lt;/code&gt;).&lt;/p&gt;
&lt;pre class=&quot;language-elisp&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-elisp&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;add-to-list&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;auto-mode-alist&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#39;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;&#92;&#92;.log&#92;&#92;&#39;&quot;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt; text-mode&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;add-hook&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;text-mode-hook&lt;/span&gt;&lt;br&gt;          &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token lambda&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token arguments&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;define-key&lt;/span&gt; text-mode-map &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;C-c t&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;bm-toggle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token 
car&quot;&gt;define-key&lt;/span&gt; text-mode-map &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;C-c n&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;bm-next&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;define-key&lt;/span&gt; text-mode-map &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;C-c p&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;bm-previous&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;define-key&lt;/span&gt; text-mode-map &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;C-c h&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;highlight-symbol&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;define-key&lt;/span&gt; text-mode-map &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;C-c C-c&quot;&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;highlight-symbol-next&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;set-default&lt;/span&gt; &lt;span class=&quot;token quoted-symbol variable symbol&quot;&gt;&#39;truncate-lines&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token car&quot;&gt;hl-line-mode&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br&gt;            &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&quot;webkit-logs-case-study&quot; tabindex=&quot;-1&quot;&gt;WebKit logs case study &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;To show what the Emacs workflow looks like with the above configuration and modules, some logs are required first. It’s &lt;a href=&quot;https://trac.webkit.org/wiki/WebKitGTK/Debugging#Loggingsupport&quot;&gt;very easy&lt;/a&gt; to
get some logs out of WebKit, so I’ll additionally get some GStreamer logs as well. For that, I’ll build the WebKit GTK port from the latest revision of the &lt;a href=&quot;https://github.com/WebKit/WebKit&quot;&gt;WebKit repository&lt;/a&gt;.
To make the build process easier, I’ll use the &lt;a href=&quot;https://github.com/Igalia/webkit-container-sdk&quot;&gt;WebKit container SDK&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here’s the build command:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;./Tools/Scripts/build-webkit &lt;span class=&quot;token parameter variable&quot;&gt;--gtk&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--debug&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--cmakeargs&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;-DENABLE_JOURNALD_LOG=OFF&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The above command disables the &lt;code&gt;ENABLE_JOURNALD_LOG&lt;/code&gt; build option so that logs are printed to stderr. This will result in the WebKit and GStreamer logs being bundled together as intended.&lt;/p&gt;
&lt;p&gt;Once the build is ready, one can load any URL to get the logs. I’ve chosen the &lt;a href=&quot;https://ytlr-cert.appspot.com/2021/main.html&quot;&gt;YouTube conformance tests&lt;/a&gt; suite from 2021 and selected the test case &lt;strong&gt;“39. PlaybackRateChange”&lt;/strong&gt;
to get some interesting entries from multimedia-related subsystems:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token builtin class-name&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;GST_DEBUG_NO_COLOR&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token builtin class-name&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;GST_DEBUG&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;4&lt;/span&gt;,webkit*:7&lt;br&gt;&lt;span class=&quot;token builtin class-name&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;WEBKIT_DEBUG&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;Layout,Media&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;debug,Events&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;debug&lt;br&gt;&lt;span class=&quot;token builtin class-name&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;URL&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;https://ytlr-cert.appspot.com/2021/main.html&#39;&lt;/span&gt;&lt;br&gt;./Tools/Scripts/run-minibrowser &lt;span class=&quot;token parameter variable&quot;&gt;--gtk&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--debug&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--features&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;+LogsPageMessagesToSystemConsole &lt;span class=&quot;token string&quot;&gt;&quot;&lt;span class=&quot;token variable&quot;&gt;${URL}&lt;/span&gt;&quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;&amp;amp;&gt;&lt;/span&gt; log.log&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The commands above reveal some interesting aspects of how to get certain logs. First of all, they specify a few environment variables:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;GST_DEBUG=4,webkit*:7&lt;/code&gt; - to enable &lt;a href=&quot;https://gstreamer.freedesktop.org/documentation/tutorials/basic/debugging-tools.html?gi-language=c#the-debug-log&quot;&gt;GStreamer logs&lt;/a&gt; of level &lt;code&gt;INFO&lt;/code&gt; (for all categories) and of level &lt;code&gt;TRACE&lt;/code&gt;
for the &lt;code&gt;webkit*&lt;/code&gt; categories&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GST_DEBUG_NO_COLOR=1&lt;/code&gt; - to disable coloring of GStreamer logs&lt;/li&gt;
&lt;li&gt;&lt;code&gt;WEBKIT_DEBUG=Layout,Media=debug,Events=debug&lt;/code&gt; - to enable WebKit logs for a few interesting channels.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Moreover, the runtime preference &lt;code&gt;LogsPageMessagesToSystemConsole&lt;/code&gt; is enabled so that console messages produced by JavaScript code end up in the log as well.&lt;/p&gt;
&lt;h4 id=&quot;the-workflow&quot; tabindex=&quot;-1&quot;&gt;The workflow &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Once the logs are collected, one can open them using Emacs and start making sense out of them by gradually exploring the flow of execution. In the below exercise, I intend to understand
what happened from the multimedia perspective during the execution of the test case &lt;strong&gt;“39. PlaybackRateChange”&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The first step is usually to find the most critical lines that roughly mark the area in the file where the interesting things happen. In this case I propose using &lt;code&gt;M-x loccur RET CONSOLE LOG RET&lt;/code&gt; to check what the
console logs printed by the application itself are. Once some lines are filtered, one can use the &lt;code&gt;bm-toggle&lt;/code&gt; command (&lt;code&gt;C-c t&lt;/code&gt;) to mark some lines for later examination (highlighted in orange):
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/HNfjKsSQ7U-1242.avif 1242w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/HNfjKsSQ7U-1242.webp 1242w&quot;&gt;&lt;img alt=&quot;Effect of filtering and marking some console logs.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/HNfjKsSQ7U-1242.png&quot; width=&quot;1242&quot; height=&quot;683&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;For practice, I propose exiting the filtered view with &lt;code&gt;M-x loccur RET&lt;/code&gt; and trying again to see what events the browser was dispatching, e.g. using &lt;code&gt;M-x loccur RET on node node 0x7535d70700b0 VIDEO RET&lt;/code&gt;:
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/hR2A7ltmx1-1243.avif 1243w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/hR2A7ltmx1-1243.webp 1243w&quot;&gt;&lt;img alt=&quot;Effect of filtering and marking some video node events.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/hR2A7ltmx1-1243.png&quot; width=&quot;1243&quot; height=&quot;684&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;In general, the combination of loccur and substring/regexp searches should be very convenient to quickly explore various types of logs along with marking them for later. In case of very important log
lines, one can additionally use &lt;code&gt;bm-bookmark-annotate&lt;/code&gt; command to add extra notes for later.&lt;/p&gt;
&lt;p&gt;Once some interesting log lines are marked, the most basic thing to do is to jump between them using &lt;code&gt;bm-next&lt;/code&gt; (&lt;code&gt;C-c n&lt;/code&gt;) and &lt;code&gt;bm-previous&lt;/code&gt; (&lt;code&gt;C-c p&lt;/code&gt;). However, the true power of bm mode comes with
the use of &lt;code&gt;M-x bm-show RET&lt;/code&gt; to get the view containing only the lines marked with &lt;code&gt;bm-toggle&lt;/code&gt; (originally highlighted orange):
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/N_qSA4e869-2542.avif 2542w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/N_qSA4e869-2542.webp 2542w&quot;&gt;&lt;img alt=&quot;Effect of invoking bm-show.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/N_qSA4e869-2542.png&quot; width=&quot;2542&quot; height=&quot;1388&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;This view is especially useful as it shows only the lines deliberately marked using &lt;code&gt;bm-toggle&lt;/code&gt; and allows one to quickly jump to them in the original file. Moreover, the lines are displayed in
the order they appear in the original file. Therefore it’s very easy to see the unified flow of the system and start making sense out of the data presented. What’s even more interesting,
the view also contains the line numbers from the original file as well as manually added annotations, if any. The line numbers are especially useful as they may be used for resuming the work
after ending the Emacs session - which I’ll describe further in this section.&lt;/p&gt;
&lt;p&gt;When the &lt;code&gt;*bm-bookmarks*&lt;/code&gt; view is rendered, the only problem left is that the lines are hard to read as they are displayed using a single color. To overcome that problem one can use the commands from
the &lt;code&gt;highlight-symbol&lt;/code&gt; package, e.g. via the &lt;code&gt;C-c h&lt;/code&gt; shortcut defined in the configuration. The result of highlighting some strings is depicted in the figure below:
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/pXvKiDy3wH-2541.avif 2541w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/pXvKiDy3wH-2541.webp 2541w&quot;&gt;&lt;img alt=&quot;Highlighting strings in bm-show.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/pXvKiDy3wH-2541.png&quot; width=&quot;2541&quot; height=&quot;1389&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;With some colors added, it’s much easier to read the logs and focus on essential parts.&lt;/p&gt;
&lt;h4 id=&quot;saving-and-resuming-the-session&quot; tabindex=&quot;-1&quot;&gt;Saving and resuming the session &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;On some rare occasions it may be necessary to close the Emacs session even though the work with a certain log file is not done and needs to be resumed later. For that, the simple trick is to open the current
set of bookmarks with &lt;code&gt;M-x bm-show RET&lt;/code&gt; and then save that buffer to a file. Personally, I just create a file with the same name as the log file yet with a &lt;code&gt;.bm&lt;/code&gt; suffix - so for &lt;code&gt;log.log&lt;/code&gt; it’s &lt;code&gt;log.log.bm&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Once the session is resumed, it is enough to open both &lt;code&gt;log.log&lt;/code&gt; and &lt;code&gt;log.log.bm&lt;/code&gt; files side by side and create a simple ad-hoc macro to use line numbers from &lt;code&gt;log.log.bm&lt;/code&gt; to mark them again in the &lt;code&gt;log.log&lt;/code&gt;
file:
&lt;img src=&quot;https://scony.github.io/web-examples/media/emacs-logfile-flow-continuation.gif&quot; alt=&quot;Resuming the session&quot;&gt;&lt;/p&gt;
&lt;p&gt;As shown in the above gif, within a few seconds all the marks are applied in the buffer with &lt;code&gt;log.log&lt;/code&gt; file and the work can resume from that point i.e. one can jump around using &lt;code&gt;bm&lt;/code&gt;, add new marks etc.&lt;/p&gt;
&lt;h3 id=&quot;summary&quot; tabindex=&quot;-1&quot;&gt;Summary &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Although the above approach may not be ideal for everybody, I find it fairly ergonomic and smooth, and it covers all the &lt;a href=&quot;https://blogs.igalia.com/plampe/working-with-webkit-and-gstreamer-logs-in-emacs/&quot;&gt;requirements&lt;/a&gt; I identified earlier.
I’m certain that editors other than Emacs can be set up to allow the same or very similar flow, yet any particular configurations are left for the reader to explore.&lt;/p&gt;
</content>
	</entry>
	
	<entry>
		<title>Contributing to CSS Anchor Positioning in WebKit.</title>
		<link href="https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/"/>
		<updated>2024-12-27T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/</id>
		<content type="html">&lt;!-- - quick description of CSS Anchor Positioning spec --&gt;
&lt;p&gt;&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_anchor_positioning&quot;&gt;CSS Anchor Positioning&lt;/a&gt; is a novel &lt;a href=&quot;https://drafts.csswg.org/css-anchor-position-1/&quot;&gt;CSS specification module&lt;/a&gt;
that allows &lt;em&gt;positioned elements&lt;/em&gt; to size and position themselves relative to one or more &lt;em&gt;anchor elements&lt;/em&gt; anywhere on the web page.
In simpler terms, it is a new web platform API that simplifies advanced relative-positioning scenarios such as tooltips, menus, popups, etc.&lt;/p&gt;
&lt;h3 id=&quot;css-anchor-positioning-in-practice&quot; tabindex=&quot;-1&quot;&gt;CSS Anchor Positioning in practice &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;To better understand the true power it brings, let’s consider a &lt;a href=&quot;https://scony.github.io/web-examples/css-anchor-positioning/context-menu.html&quot;&gt;non-trivial layout&lt;/a&gt; presented in Figure 1:&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/hAjWZkihEe-781.avif 781w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/hAjWZkihEe-781.webp 781w&quot;&gt;&lt;img alt=&quot;Non-trivial layout.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/hAjWZkihEe-781.png&quot; width=&quot;781&quot; height=&quot;171&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;In the past, creating a context menu with &lt;code&gt;position: fixed&lt;/code&gt; and positioned relative to the button required doing positioning-related calculations manually.
The more complex the layout, the more complex the situation. For example, if the table in the above example was in a scrollable container,
the position of the context menu would have to be updated manually on every scroll event.&lt;/p&gt;
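&lt;p&gt;To give a feeling of what that manual approach involves, below is a minimal sketch. The helper names &lt;code&gt;computeMenuPosition&lt;/code&gt; and &lt;code&gt;keepMenuAttached&lt;/code&gt; are hypothetical, and the sketch assumes the menu uses &lt;code&gt;position: fixed&lt;/code&gt; and has no transformed ancestors, so that viewport coordinates can be applied directly:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;// Hypothetical helper: given a rect shaped like the result of
// getBoundingClientRect(), returns the viewport coordinates at which
// the top-left corner of the menu meets the bottom-right corner of the button.
function computeMenuPosition(buttonRect) {
  return { left: buttonRect.right, top: buttonRect.bottom };
}

// Hypothetical wiring: recompute the position now and on every scroll event.
function keepMenuAttached(button, menu, scrollContainer) {
  function update() {
    const position = computeMenuPosition(button.getBoundingClientRect());
    menu.style.left = position.left + &#39;px&#39;;
    menu.style.top = position.top + &#39;px&#39;;
  }
  scrollContainer.addEventListener(&#39;scroll&#39;, update);
  update();
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Even in this simplified form, the page has to do the bookkeeping itself, and a real application would likely also need to react to resizes and layout changes.&lt;/p&gt;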
&lt;p&gt;With CSS Anchor Positioning, the solution to the above problem becomes trivial and requires two parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;&amp;lt;button&amp;gt;&lt;/code&gt; element must be marked as an &lt;em&gt;anchor&lt;/em&gt; element by adding &lt;code&gt;anchor-name: --some-name&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The context menu element must position itself using the &lt;code&gt;anchor()&lt;/code&gt; function: &lt;code&gt;left: anchor(--some-name right); top: anchor(--some-name bottom)&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above is enough for the web engine to understand that the context menu element’s &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;top&lt;/code&gt; edges must be aligned with the anchor element’s &lt;code&gt;right&lt;/code&gt; and &lt;code&gt;bottom&lt;/code&gt; edges.
With that, the web engine can carry out the job under the hood, so the result is as in Figure 2:&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/mvapakqW4F-781.avif 781w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/mvapakqW4F-781.webp 781w&quot;&gt;&lt;img alt=&quot;Non-trivial layout with anchor-positioned context menu.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/mvapakqW4F-781.png&quot; width=&quot;781&quot; height=&quot;220&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;As the above demonstrates, even with a few simple API pieces, it’s now possible to address very complex scenarios in a very elegant fashion from the web developer’s perspective.
Moreover, CSS Anchor Positioning offers even more than that. There are numerous articles with great examples such as
&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_anchor_positioning&quot;&gt;this MDN article&lt;/a&gt;,
&lt;a href=&quot;https://css-tricks.com/css-anchor-positioning-guide/&quot;&gt;this css-tricks article&lt;/a&gt;,
or &lt;a href=&quot;https://developer.chrome.com/blog/anchor-positioning-api&quot;&gt;this Chrome blog post&lt;/a&gt;, but, long story short,
both positioning and sizing elements relative to anchors are now very simple.&lt;/p&gt;
&lt;!-- - current implementation status across engines (including WebKit) --&gt;
&lt;h3 id=&quot;implementation-status-across-web-engines&quot; tabindex=&quot;-1&quot;&gt;Implementation status across web engines &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The first draft of the specification &lt;a href=&quot;https://drafts.csswg.org/date/2023-02-14T21:19:31/css-anchor-position-1/&quot;&gt;was published&lt;/a&gt; in early 2023,
which in the web engines field is not that long ago.
Therefore - as one can imagine - not all the major web engines support it yet. The first (and so far the only) web engine
to support CSS Anchor Positioning was Chromium (see the &lt;a href=&quot;https://developer.chrome.com/blog/anchor-positioning-api&quot;&gt;introduction blog post&lt;/a&gt;) -
hence the support picture shown on &lt;a href=&quot;https://caniuse.com/css-anchor-positioning&quot;&gt;caniuse.com&lt;/a&gt;.
However, despite the information visible on the &lt;a href=&quot;https://wpt.fyi/results/css/css-anchor-position?label=experimental&amp;amp;label=master&amp;amp;aligned&quot;&gt;WPT results page&lt;/a&gt;,
the other web engines are currently implementing it (see the &lt;a href=&quot;https://bugzilla.mozilla.org/show_bug.cgi?id=1838746&quot;&gt;meta bug&lt;/a&gt; for Gecko and
&lt;a href=&quot;https://bugs.webkit.org/buglist.cgi?bug_status=UNCONFIRMED&amp;amp;bug_status=NEW&amp;amp;bug_status=ASSIGNED&amp;amp;bug_status=REOPENED&amp;amp;bug_status=RESOLVED&amp;amp;bug_status=VERIFIED&amp;amp;bug_status=CLOSED&amp;amp;list_id=11796448&amp;amp;order=changeddate%20DESC%2Cpriority%2Cbug_severity&amp;amp;query_format=advanced&amp;amp;short_desc=css-anchor-position-1&amp;amp;short_desc_type=allwordssubstr&quot;&gt;bug list&lt;/a&gt;
for WebKit). The lack of progress on the WPT results page is due to the feature not being enabled by default yet in those cases.&lt;/p&gt;
&lt;!-- - history of work in WebKit --&gt;
&lt;h3 id=&quot;implementation-status-in-webkit&quot; tabindex=&quot;-1&quot;&gt;Implementation status in WebKit &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;From the commits visible publicly, one can deduce that the work on CSS Anchor Positioning in WebKit was started by Apple in early 2024.
The implementation was initiated by adding a core part - support for &lt;code&gt;anchor-name&lt;/code&gt;, &lt;code&gt;position-anchor&lt;/code&gt;, and &lt;code&gt;anchor()&lt;/code&gt;. Those two properties and one function are enough to start using the feature
in real-world scenarios as well as more sophisticated WPT tests.&lt;/p&gt;
&lt;p&gt;The work on the above was finished by the end of Q3 2024, and then - in Q4 2024 - the work significantly intensified. Parsing/computing support was added for numerous
properties and functions, and a lot of new functionality and bug fixes landed afterwards. One could expect some more things to land by the end of the year even if there’s
not much time left.&lt;/p&gt;
&lt;p&gt;Overall, the implementation is in progress and is far from being done, but can already be tested in many real-world scenarios.
This can be done using custom WebKit builds (across various OSes) or using &lt;a href=&quot;https://developer.apple.com/safari/technology-preview/&quot;&gt;Safari Technology Preview&lt;/a&gt; on Mac.
The precondition for testing is, however, that the runtime preference called &lt;code&gt;CSSAnchorPositioning&lt;/code&gt; is enabled.&lt;/p&gt;
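&lt;p&gt;A quick way to check from a page whether the engine currently exposes the feature is to ask &lt;code&gt;CSS.supports()&lt;/code&gt;. The sketch below is only an illustration: the function name &lt;code&gt;supportsAnchorPositioning&lt;/code&gt; and the &lt;code&gt;--probe&lt;/code&gt; anchor name are made up, and it assumes that an engine with the feature disabled does not parse &lt;code&gt;anchor-name&lt;/code&gt; at all:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;// Hypothetical helper: reports whether the anchor-name property is recognized.
// The optional css argument exists only to make the function usable
// outside of a browser; by default the global CSS object is used.
function supportsAnchorPositioning(css) {
  const api = css || globalThis.CSS;
  if (!api || typeof api.supports !== &#39;function&#39;) {
    return false;
  }
  return api.supports(&#39;anchor-name&#39;, &#39;--probe&#39;);
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the call returns &lt;code&gt;false&lt;/code&gt; in a WebKit build, the first thing worth verifying is whether the &lt;code&gt;CSSAnchorPositioning&lt;/code&gt; preference is actually enabled.&lt;/p&gt;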
&lt;h3 id=&quot;my-contributions&quot; tabindex=&quot;-1&quot;&gt;My contributions &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Since CSS Anchor Positioning in WebKit is still a work in progress, and since the demand for the set of features this module brings is high, I’ve been privileged to contribute
a little to the implementation myself. My work so far has been focused on the parts of the API that allow creating menu-like elements that become visible on demand.&lt;/p&gt;
&lt;p&gt;The first challenge with the above was to fix various problems related to toggling visibility status such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://bugs.webkit.org/show_bug.cgi?id=279588&quot;&gt;WebProcess crash on toggling visibility status&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://bugs.webkit.org/show_bug.cgi?id=281728&quot;&gt;Layout broken&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The obvious first step towards addressing these problems was to isolate clean scenarios that reproduce them. In the process, I’ve created some test cases, and &lt;a href=&quot;https://github.com/web-platform-tests/wpt/pull/48496&quot;&gt;added them to WPT&lt;/a&gt;.
With tests in place, I’ve &lt;a href=&quot;https://github.com/WebKit/WebKit/pull/35212&quot;&gt;imported them&lt;/a&gt; into WebKit’s source tree and proceeded with actual bug fixing.
The result was the &lt;a href=&quot;https://github.com/WebKit/WebKit/pull/33696&quot;&gt;fix for the above crash&lt;/a&gt;, and the &lt;a href=&quot;https://github.com/WebKit/WebKit/pull/35418&quot;&gt;fix for the layout being broken&lt;/a&gt;.
With that in place, the visibility of menu-like elements can be changed without any problems now.&lt;/p&gt;
&lt;p&gt;The second challenge was about the missing features allowing automatic alignment to the anchor. In a nutshell, to get the alignment shown in Figure 3:&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/O5tUICqu8q-781.avif 781w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/O5tUICqu8q-781.webp 781w&quot;&gt;&lt;img alt=&quot;Non-trivial layout with centered anchor-positioned context menu.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/O5tUICqu8q-781.png&quot; width=&quot;781&quot; height=&quot;220&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;there are 2 possibilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/CSS/position-area&quot;&gt;&lt;code&gt;position-area&lt;/code&gt;&lt;/a&gt; CSS property can be used: &lt;code&gt;position-area: bottom center;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_anchor_positioning/Using#centering_on_the_anchor_using_anchor-center&quot;&gt;&lt;code&gt;anchor-center&lt;/code&gt;&lt;/a&gt; value of
&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/CSS/justify-self&quot;&gt;&lt;code&gt;justify-self&lt;/code&gt;&lt;/a&gt; can be used: &lt;code&gt;justify-self: anchor-center;&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At first, I wasn’t aware of &lt;code&gt;anchor-center&lt;/code&gt; and hence I started &lt;a href=&quot;https://github.com/WebKit/WebKit/pull/35202&quot;&gt;initial work&lt;/a&gt; towards supporting &lt;code&gt;position-area&lt;/code&gt;.
Once I became aware, however, I switched my focus to implementing &lt;code&gt;anchor-center&lt;/code&gt; and left the above for Apple to continue - so as not to block them.
So far, both the &lt;a href=&quot;https://github.com/WebKit/WebKit/pull/35517&quot;&gt;initial&lt;/a&gt; and &lt;a href=&quot;https://github.com/WebKit/WebKit/pull/36705&quot;&gt;core&lt;/a&gt; parts of the &lt;code&gt;anchor-center&lt;/code&gt; implementation have landed,
which means the basic support is in place.&lt;/p&gt;
&lt;p&gt;Despite &lt;code&gt;anchor-center&lt;/code&gt; layout tests passing, I’ve already discovered some problems such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://bugs.webkit.org/show_bug.cgi?id=283295&quot;&gt;The problem with default anchor resolution&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://bugs.webkit.org/show_bug.cgi?id=284619&quot;&gt;The problem with incorrect sizing in some corner-cases&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and I anticipate more problems may appear once the testing intensifies.&lt;/p&gt;
&lt;p&gt;To address the above, I’ll be focusing on adding extra WPT coverage along with fixing the problems one by one. The key is to
make sure that at the end of the day, all the unexpected problems are covered with WPT test cases. This way, other web engines
will also benefit.&lt;/p&gt;
&lt;h3 id=&quot;the-future&quot; tabindex=&quot;-1&quot;&gt;The future &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/contributing-to-css-anchor-positioning-in-webkit/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;With WebKit’s implementation of CSS Anchor Positioning in its current shape, the remaining work can be done largely in parallel. Assuming that Apple keeps
working on it at the same pace as over the past few months, I wouldn’t be surprised if CSS Anchor Positioning were pretty much
done by the end of 2025. If the implementation in Gecko doesn’t stall, I think one can also expect a lot of activity around testing in
WPT. With that, the quality of implementation across the web engines should improve, and eventually (perhaps in 2026?) CSS Anchor Positioning
should reach the state of full interoperability.&lt;/p&gt;
</content>
	</entry>
	
	<entry>
		<title>Nuts and bolts of Canvas2D - globalCompositeOperation and shadows.</title>
		<link href="https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/"/>
		<updated>2024-08-02T00:00:00Z</updated>
		<id>https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/</id>
		<content type="html">&lt;p&gt;In recent months I’ve been privileged to work on the &lt;a href=&quot;https://blogs.igalia.com/carlosgc/2024/02/19/webkit-switching-to-skia-for-2d-graphics-rendering/&quot;&gt;transition from Cairo to Skia&lt;/a&gt; for 2D graphics rendering in WPE and GTK WebKit ports. Big
reworks like this are a great opportunity to explore all kinds of graphics-related APIs. One of the broader APIs in this area is the &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D&quot;&gt;CanvasRenderingContext2D&lt;/a&gt; API from HTML
&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API&quot;&gt;Canvas&lt;/a&gt;. It’s a fairly straightforward yet extensive API allowing one to perform all kinds of drawing operations on the canvas. The comprehensiveness, however, comes at the expense of
some complex situations the web engine needs to handle under the hood. One such situation was the &lt;a href=&quot;https://bugs.webkit.org/show_bug.cgi?id=273239&quot;&gt;issue&lt;/a&gt; I was working on recently regarding broken test cases
involving drawing shadows when using Skia in WebKit. What makes it complex is that some problems are still visible due to multiple web engine layers being involved, but despite that I was eventually able to address the
broken test cases.&lt;/p&gt;
&lt;p&gt;In the next few sections I’m going to introduce the parts of the API that are involved in the problems while in the sections closer to the end I will gradually showcase the problems and explore potential paths toward fixing the entire situation.&lt;/p&gt;
&lt;h3 id=&quot;drawing-on-canvas2d-with-globalcompositeoperation&quot; tabindex=&quot;-1&quot;&gt;Drawing on Canvas2D with &lt;code&gt;globalCompositeOperation&lt;/code&gt; &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D&quot;&gt;Canvas2D API&lt;/a&gt; offers multiple methods for drawing various primitives such as rectangles, arcs, text etc. On top of that, it allows one to control compositing and clipping
using the &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/globalCompositeOperation&quot;&gt;&lt;code&gt;globalCompositeOperation&lt;/code&gt;&lt;/a&gt; property. The idea is very simple - the user of the API can set the property to one of the predefined
compositing operations and immediately after that, all new drawing operations will behave according to the rules the particular compositing operation specifies:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;canvas2DContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;fillRect&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;// Draws rect on top of existing content (default).&lt;/span&gt;&lt;br&gt;canvas2DContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;globalCompositeOperation &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;destination-atop&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br&gt;canvas2DContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;fillRect&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token comment&quot;&gt;// Draws rect according to &#39;destination-atop&#39;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/globalCompositeOperation#value&quot;&gt;many compositing operations&lt;/a&gt;, but I’ll be focusing mostly on the ones having &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;destination&lt;/code&gt; in their names.
The &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;destination&lt;/code&gt; terms refer to the new content to be drawn and the existing (already-drawn) content respectively.&lt;/p&gt;
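&lt;p&gt;To make the &lt;code&gt;source&lt;/code&gt;/&lt;code&gt;destination&lt;/code&gt; terminology more tangible, below is a minimal sketch that models four of the operations for a single pixel. It is not how any browser implements compositing - it only assumes the usual Porter-Duff formulation on premultiplied RGBA values, and the names &lt;code&gt;FACTORS&lt;/code&gt; and &lt;code&gt;composite&lt;/code&gt; are made up for this illustration:&lt;/p&gt;

```javascript
// Minimal per-pixel model of a few Canvas2D compositing operations.
// Assumption: pixels are premultiplied RGBA arrays with components in 0..1.
// Each operation boils down to: result = source * Fa + destination * Fb.
const FACTORS = {
  "source-over":      (sa, da) => [1, 1 - sa],
  "destination-over": (sa, da) => [1 - da, 1],
  "source-atop":      (sa, da) => [da, 1 - sa],
  "destination-atop": (sa, da) => [1 - da, sa],
};

// Hypothetical helper: composites one source pixel with one destination pixel.
function composite(operation, source, destination) {
  const [fa, fb] = FACTORS[operation](source[3], destination[3]);
  return source.map((s, i) => s * fa + destination[i] * fb);
}

const TRANSPARENT = [0, 0, 0, 0];
const RED = [1, 0, 0, 1];  // new content (source)
const BLUE = [0, 0, 1, 1]; // existing content (destination)

console.log(composite("source-over", RED, BLUE));              // [1, 0, 0, 1] - source wins
console.log(composite("destination-over", RED, BLUE));         // [0, 0, 1, 1] - destination wins
console.log(composite("source-atop", RED, TRANSPARENT));       // [0, 0, 0, 0] - no destination, nothing drawn
console.log(composite("destination-atop", TRANSPARENT, BLUE)); // [0, 0, 0, 0] - no source, destination removed
```

&lt;p&gt;The last two lines show why the &lt;code&gt;*-atop&lt;/code&gt; operations feel like clipping: the result is limited to the area covered by the other operand.&lt;/p&gt;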
&lt;p&gt;The images below present some &lt;a href=&quot;https://scony.github.io/web-examples/canvas-w-composite-operation/&quot;&gt;examples&lt;/a&gt; of compositing operations in action:&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/ukuOTO6ohc-878.avif 878w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/ukuOTO6ohc-878.webp 878w&quot;&gt;&lt;img alt=&quot;Compositing operations in action.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/ukuOTO6ohc-878.png&quot; width=&quot;878&quot; height=&quot;170&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;h3 id=&quot;drawing-on-canvas2d-with-shadows&quot; tabindex=&quot;-1&quot;&gt;Drawing on Canvas2D with shadows &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;When drawing primitives using the &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D&quot;&gt;Canvas2D API&lt;/a&gt; one can use &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D#shadows&quot;&gt;&lt;code&gt;shadow*&lt;/code&gt; properties&lt;/a&gt;
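to enable drawing of shadows along with any content that is being drawn. The usage is very simple - a shadow becomes visible once &lt;code&gt;shadowColor&lt;/code&gt; is set to a non-transparent color and at least one of &lt;code&gt;shadowOffsetX&lt;/code&gt;, &lt;code&gt;shadowOffsetY&lt;/code&gt;, or &lt;code&gt;shadowBlur&lt;/code&gt; is non-zero:&lt;/p&gt;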
&lt;pre class=&quot;language-js&quot; tabindex=&quot;0&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;canvas2DContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;shadowColor &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;#0f0&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br&gt;canvas2DContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;shadowOffsetX &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;token comment&quot;&gt;// From now on, any draw call will have a green shadow attached.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The above, combined with simple code to draw a circle, &lt;a href=&quot;https://scony.github.io/web-examples/canvas-w-shadow/&quot;&gt;produces&lt;/a&gt; the following effect:&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/V_cHmYogyw-295.avif 295w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/V_cHmYogyw-295.webp 295w&quot;&gt;&lt;img alt=&quot;Circle with shadow.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/V_cHmYogyw-295.png&quot; width=&quot;295&quot; height=&quot;169&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;h3 id=&quot;shadows-meet-globalcompositeoperation&quot; tabindex=&quot;-1&quot;&gt;Shadows meet &lt;code&gt;globalCompositeOperation&lt;/code&gt; &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Things get interesting once one starts thinking about how &lt;code&gt;globalCompositeOperation&lt;/code&gt; may affect the way shadows are drawn. When I thought about it for the first time, I imagined at least 3 possibilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Shadow and shadow origin are both treated as one entity (shadow always below the origin) and thus are drawn together.&lt;/li&gt;
&lt;li&gt;Shadow and shadow origin are combined and then drawn as one entity.&lt;/li&gt;
&lt;li&gt;Shadow and shadow origin are drawn separately - shadow first, then the content.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When I checked the above against the &lt;a href=&quot;https://html.spec.whatwg.org/multipage/canvas.html#drawing-model&quot;&gt;drawing model&lt;/a&gt; and &lt;a href=&quot;https://html.spec.whatwg.org/multipage/canvas.html#when-shadows-are-drawn&quot;&gt;shadows&lt;/a&gt; specifications,
it turned out the last guess was the correct one. The specification basically says that the shadow should be computed first and composited within the clipping region over the current canvas content, and only then should the shadow origin be composited
within the clipping region over the current canvas content (the original canvas content combined with the shadow).&lt;/p&gt;
&lt;p&gt;The above can be confirmed visually using a few &lt;a href=&quot;https://scony.github.io/web-examples/canvas-w-composite-operation-n-shadows/combo-w-4-operations.html&quot;&gt;examples&lt;/a&gt; (generated using Chromium v126.0.6478.126):&lt;/p&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/LUoPvfTK4L-1024.avif 1024w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/LUoPvfTK4L-1024.webp 1024w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/LUoPvfTK4L-1024.png&quot; width=&quot;1024&quot; height=&quot;171&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;source-over&lt;/code&gt; operation shows the drawing order - destination first, shadow second, and shadow origin third.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;destination-over&lt;/code&gt; operation shows the reversed drawing order - destination first, shadow second (below destination), and shadow origin third (below destination and shadow).&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;source-atop&lt;/code&gt; operation is trickier as it behaves like &lt;code&gt;source-over&lt;/code&gt; but with clipping to the destination content - therefore, the destination is drawn first, then clipping is set to the destination, then the shadow is drawn,
and finally the shadow origin is drawn.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;destination-atop&lt;/code&gt; operation is even trickier as it behaves like &lt;code&gt;destination-over&lt;/code&gt;, yet with a clipping region that changes with every drawing step. That difference can be seen in the image below, which presents the intermediate states of the canvas
after each drawing step:&lt;br&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/CuaioPCaaW-443.avif 443w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/CuaioPCaaW-443.webp 443w&quot;&gt;&lt;img alt=&quot;Breakdown of destination-atop operation.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/CuaioPCaaW-443.png&quot; width=&quot;443&quot; height=&quot;187&quot;&gt;&lt;/picture&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;initial state&lt;/strong&gt; shows a canvas after drawing the destination on it.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;after drawing shadow&lt;/strong&gt; state shows a shadow drawn below the destination. In this case, the clipping is set to the new content (shadow), and hence the part of the destination that is not “atop” the shadow is clipped out.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;after drawing shadow origin&lt;/strong&gt; state shows the final state after drawing the shadow origin below the previous canvas content (the new destination), which is at this point “a shadow combined with destination”. As in the previous step,
the clipping is set to the new content (shadow origin), and hence any part of the new destination that is not “atop” the shadow origin is clipped out.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
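&lt;p&gt;The &lt;code&gt;destination-atop&lt;/code&gt; breakdown above can also be traced per pixel. The sketch below is only a model under simplifying assumptions: premultiplied RGBA pixels, no explicit clipping region, and the shadow pixel passed in directly instead of being derived from the shadow origin. The helper names &lt;code&gt;composite&lt;/code&gt; and &lt;code&gt;drawWithShadow&lt;/code&gt; are hypothetical:&lt;/p&gt;

```javascript
// Two-pass drawing as described above: the shadow is composited first,
// then the shadow origin is composited over the intermediate result.
// Assumption: premultiplied RGBA pixels with components in 0..1.
const FACTORS = {
  "source-over":      (sa, da) => [1, 1 - sa],
  "destination-atop": (sa, da) => [1 - da, sa],
};

function composite(operation, source, destination) {
  const [fa, fb] = FACTORS[operation](source[3], destination[3]);
  return source.map((s, i) => s * fa + destination[i] * fb);
}

// Hypothetical helper returning the canvas pixel after each drawing step.
function drawWithShadow(operation, canvas, shadow, origin) {
  const afterShadow = composite(operation, shadow, canvas);
  const afterOrigin = composite(operation, origin, afterShadow);
  return { afterShadow, afterOrigin };
}

const NONE = [0, 0, 0, 0];
const BLUE = [0, 0, 1, 1];  // destination
const GREEN = [0, 1, 0, 1]; // shadow
const RED = [1, 0, 0, 1];   // shadow origin

// Pixel covered by the destination and the shadow, but not by the shadow origin:
// the destination survives the first step and is clipped out in the second one.
console.log(drawWithShadow("destination-atop", BLUE, GREEN, NONE));
// { afterShadow: [0, 0, 1, 1], afterOrigin: [0, 0, 0, 0] }

// Pixel covered by all three: the destination stays on top of everything.
console.log(drawWithShadow("destination-atop", BLUE, GREEN, RED));
// { afterShadow: [0, 0, 1, 1], afterOrigin: [0, 0, 1, 1] }

// Pixel covered by the shadow and the shadow origin only: the shadow ends up on top.
console.log(drawWithShadow("destination-atop", NONE, GREEN, RED));
// { afterShadow: [0, 1, 0, 1], afterOrigin: [0, 1, 0, 1] }
```

&lt;p&gt;Treating the shadow and the shadow origin as a single entity would give a different result for the first pixel, which is exactly the kind of difference visible between engines.&lt;/p&gt;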
&lt;h3 id=&quot;discrepancies-between-browser-engines&quot; tabindex=&quot;-1&quot;&gt;Discrepancies between browser engines &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;If drawing shadows with &lt;code&gt;globalCompositeOperation&lt;/code&gt; is tricky in general, it gets even trickier in particular browser engines, as virtually no graphics library
provides an API that matches the &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D&quot;&gt;Canvas2D API&lt;/a&gt; 1-to-1. This means that, depending on the graphics library used, the browser engine must implement more or less glue code
itself. For example, one can imagine that some graphics library may not have native support for shadows - that would mean the browser engine has to prepare shadows itself by e.g. drawing the shadow origin (no matter how complex) on an extra
surface, changing its color, blurring it etc. so that the shadow can be used as a whole once prepared.&lt;/p&gt;
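&lt;p&gt;As a rough illustration of such engine-side shadow preparation, the sketch below builds a shadow for a single row of pixels: it shifts the coverage (alpha) of the shadow origin, blurs it, and tints it with the shadow color. It is purely hypothetical - &lt;code&gt;buildShadowRow&lt;/code&gt; is a made-up name, a simple box blur stands in for the real blur, and no actual engine is claimed to work this way:&lt;/p&gt;

```javascript
// Hypothetical shadow preparation for one row of pixels.
// originAlpha: coverage (0..1) of the shadow origin for each pixel in the row.
// shadowColor: non-premultiplied [r, g, b, a] with components in 0..1.
// Returns premultiplied RGBA pixels of the shadow.
function buildShadowRow(originAlpha, offsetX, blurRadius, shadowColor) {
  // 1. Shift the origin coverage by the shadow offset (the "extra surface").
  const shifted = originAlpha.map((_, x) => originAlpha[x - offsetX] ?? 0);

  // 2. Blur the coverage. Assumption: a box blur is used only for brevity.
  const blurred = shifted.map((_, x) => {
    let sum = 0;
    for (let dx = -blurRadius; blurRadius >= dx; dx++) {
      sum += shifted[x + dx] ?? 0;
    }
    return sum / (2 * blurRadius + 1);
  });

  // 3. Replace the origin color with the shadow color (premultiplied).
  const [r, g, b, a] = shadowColor;
  return blurred.map((coverage) => {
    const alpha = a * coverage;
    return [r * alpha, g * alpha, b * alpha, alpha];
  });
}

// A 2-pixel-wide opaque origin, shadow shifted by 2 pixels, no blur, green shadow.
const row = buildShadowRow([0, 1, 1, 0, 0, 0], 2, 0, [0, 1, 0, 1]);
console.log(row[1]); // [0, 0, 0, 0] - no shadow under the origin's first pixel
console.log(row[3]); // [0, 1, 0, 1] - fully opaque green shadow
```

&lt;p&gt;Once prepared this way, the shadow can be composited as a whole using the current &lt;code&gt;globalCompositeOperation&lt;/code&gt;, followed by the shadow origin itself.&lt;/p&gt;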
&lt;p&gt;Given the above, one would expect all these aspects to be tested and implemented really well. After all, whenever the subject matter becomes complicated, extra care is required. It turns out, however, that this is not necessarily the
case when it comes to &lt;code&gt;globalCompositeOperation&lt;/code&gt; and shadows. As for the testing part, there are &lt;a href=&quot;https://github.com/web-platform-tests/wpt/tree/16a0c5a3a283b68970c7699060f0299a432d5bda/html/canvas/element/shadows&quot;&gt;very few tests&lt;/a&gt;
(&lt;code&gt;2d.shadow.composite*&lt;/code&gt;) in &lt;a href=&quot;https://github.com/web-platform-tests/wpt&quot;&gt;WPT&lt;/a&gt; (Web Platform Tests) covering the use cases described above. It’s also not much better for internal web engine test suites. As for implementations, they differ
from one another substantially.&lt;/p&gt;
&lt;h4 id=&quot;simple-examples&quot; tabindex=&quot;-1&quot;&gt;Simple examples &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;To show exactly what the situation is, the &lt;a href=&quot;https://scony.github.io/web-examples/canvas-w-composite-operation-n-shadows/combo-w-4-operations.html&quot;&gt;examples&lt;/a&gt; from section &lt;a href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;Shadows meet &lt;code&gt;globalCompositeOperation&lt;/code&gt;&lt;/a&gt;
can be used again, this time with browsers representing different web engines:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Chromium 126.0.6478.126&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/LUoPvfTK4L-1024.avif 1024w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/LUoPvfTK4L-1024.webp 1024w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Chromium.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/LUoPvfTK4L-1024.png&quot; width=&quot;1024&quot; height=&quot;171&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Firefox 128.0&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/IH-mDoFE9n-1018.avif 1018w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/IH-mDoFE9n-1018.webp 1018w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Firefox.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/IH-mDoFE9n-1018.png&quot; width=&quot;1018&quot; height=&quot;168&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GNOME Web (Epiphany) 45.0 (WebKit/Cairo)&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/9Z1SA4WyhU-1020.avif 1020w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/9Z1SA4WyhU-1020.webp 1020w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Epiphany.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/9Z1SA4WyhU-1020.png&quot; width=&quot;1020&quot; height=&quot;168&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WPE MiniBrowser build from WebKit@&lt;code&gt;098c58dd13bf40fc81971361162e21d05cb1f74a&lt;/code&gt; (WebKit/Skia)&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/U5yTjGvWP0-1017.avif 1017w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/U5yTjGvWP0-1017.webp 1017w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - WPE MiniBrowser.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/U5yTjGvWP0-1017.png&quot; width=&quot;1017&quot; height=&quot;165&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Safari 17.1 (WebKit/Core Graphics)&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/Bxdc-PzFmP-1017.avif 1017w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/Bxdc-PzFmP-1017.webp 1017w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Safari.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/Bxdc-PzFmP-1017.png&quot; width=&quot;1017&quot; height=&quot;167&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Servo release from 2024/07/04&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/3HBRpFE6K5-1017.avif 1017w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/3HBRpFE6K5-1017.webp 1017w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Servo.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/3HBRpFE6K5-1017.png&quot; width=&quot;1017&quot; height=&quot;169&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ladybird build from 2024/06/29&lt;/strong&gt;
&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/m4dqb8Tf5Q-1017.avif 1017w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/m4dqb8Tf5Q-1017.webp 1017w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Ladybird&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/m4dqb8Tf5Q-1017.png&quot; width=&quot;1017&quot; height=&quot;167&quot;&gt;&lt;/picture&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;First of all, it’s evident that experimental browsers such as Servo and Ladybird are falling behind the competition - Servo doesn’t seem to support shadows at all, while Ladybird doesn’t support anything other than drawing a rect filled with color.&lt;/p&gt;
&lt;p&gt;Second, the non-experimental browsers agree on most of the combinations presented above.&lt;/p&gt;
&lt;p&gt;Finally, the trickiest combination above seems to be the one including &lt;code&gt;destination-atop&lt;/code&gt; - in that case almost every mainstream browser renders a different result:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Chromium is the only one rendering correctly.&lt;/li&gt;
&lt;li&gt;Firefox and Epiphany are pretty close, but both suffer from a similar glitch where the red part is covered by the part of the destination that should already be clipped out.&lt;/li&gt;
&lt;li&gt;WPE MiniBrowser and Safari both render in the correct order, but the clipping is wrong.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&quot;more-sophisticated-examples&quot; tabindex=&quot;-1&quot;&gt;More sophisticated examples &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;So far, the discrepancies don’t seem to be very dramatic, and hence it’s time to present more sophisticated &lt;a href=&quot;https://scony.github.io/web-examples/canvas-w-composite-operation-n-shadows/combo-w-16-operations.html&quot;&gt;examples&lt;/a&gt; that are an extended
version of the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/249516a7a88e67310d6e96f00e62565ffa3a0ab0/LayoutTests/fast/canvas/canvas-composite.html&quot;&gt;test case&lt;/a&gt; from the WebKit source tree:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Chromium 126.0.6478.126&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/d5EkvwvM45-582.avif 582w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/d5EkvwvM45-582.webp 582w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Chromium.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/d5EkvwvM45-582.png&quot; width=&quot;582&quot; height=&quot;503&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Firefox 128.0&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/DXRahFIYsP-581.avif 581w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/DXRahFIYsP-581.webp 581w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Firefox.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/DXRahFIYsP-581.png&quot; width=&quot;581&quot; height=&quot;504&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;GNOME Web (Epiphany) 45.0 (WebKit/Cairo)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qxr0pqhsXP-585.avif 585w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/qxr0pqhsXP-585.webp 585w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Epiphany.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/qxr0pqhsXP-585.png&quot; width=&quot;585&quot; height=&quot;505&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;WPE MiniBrowser build from WebKit@&lt;code&gt;098c58dd13bf40fc81971361162e21d05cb1f74a&lt;/code&gt; (WebKit/Skia)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/zkzrK7BgDI-585.avif 585w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/zkzrK7BgDI-585.webp 585w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - WPE MiniBrowser.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/zkzrK7BgDI-585.png&quot; width=&quot;585&quot; height=&quot;503&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Safari 17.1 (WebKit/Core Graphics)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/plwqzYeNaD-585.avif 585w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/plwqzYeNaD-585.webp 585w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Safari.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/plwqzYeNaD-585.png&quot; width=&quot;585&quot; height=&quot;506&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Servo release from 2024/07/04&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/PdjBIRW-a6-585.avif 585w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/PdjBIRW-a6-585.webp 585w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Servo.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/PdjBIRW-a6-585.png&quot; width=&quot;585&quot; height=&quot;515&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ladybird build from 2024/06/29&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;picture&gt;&lt;source type=&quot;image/avif&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/RLEl8uUF_b-586.avif 586w&quot;&gt;&lt;source type=&quot;image/webp&quot; srcset=&quot;https://blogs.igalia.com/plampe/img/RLEl8uUF_b-586.webp 586w&quot;&gt;&lt;img alt=&quot;Shadows combined with compositing operation - Ladybird.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; src=&quot;https://blogs.igalia.com/plampe/img/RLEl8uUF_b-586.png&quot; width=&quot;586&quot; height=&quot;513&quot;&gt;&lt;/picture&gt;&lt;/p&gt;
&lt;p&gt;Other than &lt;code&gt;destination-out&lt;/code&gt;, &lt;code&gt;xor&lt;/code&gt;, and a few simple operations presented before, all the operations presented above pose serious problems to the majority of browsers. The only browser that is correct in all the cases
(to the best of my understanding) is Chromium, which uses a rendering engine called &lt;a href=&quot;https://www.chromium.org/blink/&quot;&gt;Blink&lt;/a&gt;, which in turn uses the Skia library. One may wonder if perhaps it’s Skia that’s responsible for Chromium’s success,
but given the above results, where e.g. WPE MiniBrowser uses Skia as well, it’s evident that the problems lie above the particular graphics library.&lt;/p&gt;
&lt;p&gt;Looking at the operations and browsers that render incorrectly, it’s clearly visible that even small problems - with either the ordering of draw calls or clipping - lead to spectacularly broken results. The pinnacle of misery is the &lt;code&gt;source-out&lt;/code&gt;
operation, which varies the most across browsers. One has to admit, however, that WPE MiniBrowser is slightly closer to being correct than the others.&lt;/p&gt;
&lt;h3 id=&quot;towards-unification&quot; tabindex=&quot;-1&quot;&gt;Towards unification &lt;a class=&quot;header-anchor&quot; href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Fixing the above problems is a long journey. After all, every single web engine has to be fixed in its own, specific way. If the specification were the problem, it would be the obvious place to start. However, as mentioned in the section
&lt;a href=&quot;https://blogs.igalia.com/plampe/nuts-and-bolts-of-canvas2d-globalcompositeoperation-and-shadows/&quot;&gt;Shadows meet &lt;code&gt;globalCompositeOperation&lt;/code&gt;&lt;/a&gt;, the specification is pretty clear on how drawing, shadows, and &lt;code&gt;globalCompositeOperation&lt;/code&gt; come together. In that case, the next obvious place to
start improving things is the &lt;a href=&quot;https://github.com/web-platform-tests/wpt&quot;&gt;WPT&lt;/a&gt; test suite.&lt;/p&gt;
&lt;p&gt;What makes WPT outstanding is that it is a de facto standard cross-browser test suite for testing the web platform stack. The test suite is developed as an open collaboration effort by developers from around the globe and hence is very broad
in terms of specification coverage. What’s also important, the test results are actively evaluated against the popular browser engines and published on &lt;a href=&quot;https://wpt.fyi/results&quot;&gt;wpt.fyi&lt;/a&gt;, therefore putting some pressure on web engine developers
to fix the problems so that they keep up with the competition.&lt;/p&gt;
&lt;p&gt;Given the above, extending the WPT test suite by adding test cases that cover &lt;code&gt;globalCompositeOperation&lt;/code&gt; operations combined with shadows is a reasonable first step towards the unification of browser implementations. This can be done either by
directly contributing tests to WPT, or by &lt;a href=&quot;https://github.com/web-platform-tests/wpt/issues&quot;&gt;creating an issue&lt;/a&gt;. Personally, I’ve decided to file an issue first (&lt;a href=&quot;https://github.com/web-platform-tests/wpt/issues/46544&quot;&gt;WPT#46544&lt;/a&gt;) and to add
tests once I have some time. I haven’t contributed to WPT yet, but I’m excited to work with it soon. Once I land my first pull request, I’ll start fixing WebKit and I won’t hesitate to post some updates on this blog.&lt;/p&gt;
</content>
	</entry>
</feed>
