Improving the editing code in WebKit

For a while now Igalia and Bloomberg have been collaborating to advance Web technologies. As part of that, I’ve lately been involved in improving some of WebKit’s editing capabilities (posts to follow soon).

As you probably know, in HTML5 any element can be editable. The feature was introduced some time ago and was eventually standardized by the WHATWG. It’s as easy as adding the attribute contenteditable=true and voilà, the magic unfolds (check it out!).

That’s indeed a very powerful feature that allows you to create fast and full-featured rich text editors on the Web. The thing is that, although the contenteditable attribute is standardized, some of its behaviors are not, so each Web engine must implement whichever behavior it considers the most correct.

Let me show an example of the above. Imagine that you have a typical HTML bulleted list inside an editable element (like a <div>). It would look something like this:

* one
* two
* three
* four

Nothing unusual so far. Now imagine that you select the item “two” and then drag it to the end of “four”. What would be the expected behavior? There is no clear answer to that. You might think that one possible outcome could be:

* one
* three
* fourtwo

which matches Firefox’s behavior (and WebKit’s as well, before my change landed). Another possible result would be this one:

* one
* three
* four
* two

Both of them seem pretty sensible. The question is whether you consider that you’re dragging just some text or the text inside a <li> element. It is not an easy decision, so let’s examine some other cases. For example, what happens if we try with two items instead of just one? In that case (assuming we select “two” and “three” and drop them after “four”) the result is the following:

* one
* four
* two
* three
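
As a rough illustration, the three outcomes shown so far can be modelled with plain Python lists. This is only a sketch of the observable results, not of how any browser implements them; the function names drop_as_text and drop_as_items are hypothetical, and the model assumes every item label is unique.

```python
def drop_as_text(items, dragged, target):
    """Append the dragged item's text to the target item (first outcome)."""
    remaining = [item for item in items if item != dragged]
    return [item + dragged if item == target else item for item in remaining]


def drop_as_items(items, dragged, target):
    """Re-insert the dragged items as new list items right after the target."""
    remaining = [item for item in items if item not in dragged]
    position = remaining.index(target) + 1
    return remaining[:position] + list(dragged) + remaining[position:]


items = ["one", "two", "three", "four"]
print(drop_as_text(items, "two", "four"))               # first outcome
print(drop_as_items(items, ["two"], "four"))            # second outcome
print(drop_as_items(items, ["two", "three"], "four"))   # two-item case
```

Note that a single function, drop_as_items, is enough to produce both the second outcome and the two-item result, whereas the first outcome needs a different rule.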

So although both outcomes seem equally correct, the behavior is a bit inconsistent: it depends on the number of items you select. I don’t know how this is implemented in Firefox, but I’ll try to explain why WebKit was behaving like that.

WebKit is actually pretty smart when dealing with this kind of operation. When doing a drag and drop of a selection, WebKit builds a markup representation (see this file for further details) of the selected contents. In the first case (one item selected) WebKit detects that the selection starts and ends inside the same node (a text node). That’s why the generated markup does not include any <li> tags, and thus the content is pasted at the end of the “four” item as text.

In the case of a multiple item selection, WebKit traverses the DOM tree looking for a common ancestor of the selected nodes in order to build a proper serialized markup representation of the dragged data. As you might have deduced, that operation generates markup that includes all the nodes it finds until the common ancestor is reached (to preserve the structure and appearance). So instead of having a simple “two” we’d end up with something similar to <li>two</li><li>three</li> for the multiple selection case.
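
To make the two cases concrete, here is a minimal Python sketch of the idea. It is not WebKit’s actual C++ code: the Node class and the functions build_list, common_ancestor, serialize and selection_markup are hypothetical names, and the sketch assumes the selection always covers whole text nodes.

```python
class Node:
    """Tiny DOM-like node: elements have a tag, text nodes have text."""

    def __init__(self, tag=None, text=None):
        self.tag = tag
        self.text = text
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node):
    """Return [node, parent, grandparent, ...] up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def common_ancestor(a, b):
    """Return the deepest node that contains both a and b."""
    seen = {id(n) for n in ancestors(a)}
    for candidate in ancestors(b):
        if id(candidate) in seen:
            return candidate
    return None


def serialize(node):
    """Serialize a node and its subtree to markup."""
    if node.tag is None:
        return node.text
    inner = "".join(serialize(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


def child_containing(root, node):
    """Return the direct child of root whose subtree contains node."""
    while node.parent is not root:
        node = node.parent
    return node


def selection_markup(start, end):
    """Markup for a selection running from text node start to text node end."""
    if start is end:
        # Selection starts and ends in the same text node: plain text only.
        return start.text
    root = common_ancestor(start, end)
    first = root.children.index(child_containing(root, start))
    last = root.children.index(child_containing(root, end))
    # Keep every node between the text nodes and the common ancestor.
    return "".join(serialize(c) for c in root.children[first:last + 1])


def build_list(labels):
    """Build <ul><li>label</li>...</ul> and return the list of text nodes."""
    ul = Node(tag="ul")
    return [ul.append(Node(tag="li")).append(Node(text=label)) for label in labels]


one, two, three, four = build_list(["one", "two", "three", "four"])
print(selection_markup(two, two))    # single item: no <li> wrapper
print(selection_markup(two, three))  # two items: <li> structure preserved
```

In this model the single-item selection yields bare text while the two-item selection keeps its <li> wrappers, which is the asymmetry described above.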

The actual code is a bit more complex than just “get this markup and insert it into the document” because it has to deal with several corner cases and also needs to perform some cleanups after the move operations. The generated markup is also much more complex (for a single selected item the markup is normally a <span> element which includes styling data, among other things), but I have simplified it here for the sake of clarity.

So after some work (tracked in bug 111556) I came up with a solution that makes WebCore’s editing code behave the same way regardless of the number of selected list items (note that partial selections inside a list item are not affected by this change).

I’d like to thank Ryosuke Niwa for his insightful reviews and Igalia for giving me the chance to work on exciting stuff around WebKit.