Box Alignment and Grid Layout

September 8th, 2014

As some of my readers already know, Igalia and Bloomberg are collaborating on the implementation of the Grid Layout specification for the Blink/Chromium and WebKit web engines. As part of this assignment, I had the opportunity to review and contribute to the implementation of another feature I consider quite useful for the web: the CSS Box Alignment Module (level 3).

The Box Alignment specification was designed to generalize the alignment of boxes within their containers, which is currently defined across multiple specifications. Several layout models are affected by this new specification: block, table, flex and grid. This post is about how it affects the Grid Layout implementation.

I think it is a good idea to begin with a brief introduction to some concepts related to alignment and CSS Writing Modes, which I consider quite relevant for understanding the implications of this specification for the Grid Layout implementation and, more importantly, for realizing its potential.

Examples are mandatory when analyzing W3C specifications; personally, I can’t see all the angles and implications of a feature described in a specification without the proper examples, both visual and source code.

Finally, I’d like to conclude the article with a development angle, describing some interesting implementation details and the technical challenges I faced while working on both the Blink and WebKit web engines, including, perhaps more interestingly, the ones I haven’t solved yet and am still working on. As always, comments and feedback are really welcome.

Introduction to Box Alignment and Writing-Modes

From the CSS Box Alignment specification:

features of CSS relating to the alignment of boxes within their containers in the various CSS box layout models: block layout, table layout, flex layout, and grid layout.

From the CSS Writing Modes specification:

CSS features to support various international writing modes, such as left-to-right (e.g. Latin or Indic), right-to-left (e.g. Hebrew or Arabic), bidirectional (e.g. mixed Latin and Arabic) and vertical (e.g. Asian scripts).

In order to get a better understanding of alignment, some abstract dimensional and directional terms should be explained and taken into account. I’m going to briefly describe the ones I consider most relevant for my exposition; a more detailed definition can be obtained from the Abstract Box Terminology section of the specification.

There are three sets of directional terms in CSS:

  • physical – Interpreted relative to the page, independent of writing mode. The physical directions are left, right, top, and bottom.
  • flow-relative – Interpreted relative to the flow of content. The flow-relative directions are start and end, or block-start, block-end, inline-start, and inline-end if the dimension is also ambiguous.
  • line-relative – Interpreted relative to the orientation of the line box. The line-relative directions are line-left, line-right, line-over, and line-under.

The abstract dimensions are defined below:

  • block dimension – The dimension perpendicular to the flow of text within a line, i.e. the vertical dimension in horizontal writing modes, and the horizontal dimension in vertical writing modes.
  • inline dimension – The dimension parallel to the flow of text within a line, i.e. the horizontal dimension in horizontal writing modes, and the vertical dimension in vertical writing modes.
  • block axis – The axis in the block dimension, i.e. the vertical axis in horizontal writing modes and the horizontal axis in vertical writing modes.
  • inline axis - The axis in the inline dimension, i.e. the horizontal axis in horizontal writing modes and the vertical axis in vertical writing modes.
  • extent or logical height - A measurement in the block dimension: refers to the physical height (vertical dimension) in horizontal writing modes, and to the physical width (horizontal dimension) in vertical writing modes.
  • measure or logical width - A measurement in the inline dimension: refers to the physical width (horizontal dimension) in horizontal writing modes, and to the physical height (vertical dimension) in vertical writing modes. (The term measure derives from its use in typography.)

Then there are the flow-relative and line-relative directions. For the time being, I’ll consider only the flow-relative directional terms, since they are more relevant for discussing alignment issues.

  • block-start - The side that comes earlier in the block progression, as determined by the writing-mode property: the physical top in horizontal-tb mode, the right in vertical-rl, and the left in vertical-lr.
  • block-end - The side opposite block-start.
  • inline-start - The side from which text of the inline base direction would start. For boxes with a used direction value of ltr, this means the line-left side. For boxes with a used direction value of rtl, this means the line-right side.
  • inline-end - The side opposite inline-start.
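The mapping from these flow-relative sides to physical ones can be sketched in a few lines of C++. This is only an illustration: the enums and function names below are hypothetical and do not come from WebKit or Blink, and the inline-start helper assumes that line-left is the physical left in horizontal-tb and the physical top in the two vertical modes.

```cpp
// Hypothetical types for this sketch; they are not the engines' real enums.
enum class WritingMode { HorizontalTb, VerticalRl, VerticalLr };
enum class Direction { Ltr, Rtl };
enum class PhysicalSide { Top, Right, Bottom, Left };

PhysicalSide oppositeSide(PhysicalSide side)
{
    switch (side) {
    case PhysicalSide::Top: return PhysicalSide::Bottom;
    case PhysicalSide::Bottom: return PhysicalSide::Top;
    case PhysicalSide::Left: return PhysicalSide::Right;
    case PhysicalSide::Right: return PhysicalSide::Left;
    }
    return side;
}

// block-start: top in horizontal-tb, right in vertical-rl, left in vertical-lr.
PhysicalSide blockStartSide(WritingMode mode)
{
    switch (mode) {
    case WritingMode::HorizontalTb: return PhysicalSide::Top;
    case WritingMode::VerticalRl: return PhysicalSide::Right;
    case WritingMode::VerticalLr: return PhysicalSide::Left;
    }
    return PhysicalSide::Top;
}

// block-end is simply the side opposite block-start.
PhysicalSide blockEndSide(WritingMode mode)
{
    return oppositeSide(blockStartSide(mode));
}

// inline-start: the line-left side for ltr, the line-right side for rtl.
// Assumption: line-left is the physical left in horizontal-tb and the
// physical top in vertical-rl and vertical-lr.
PhysicalSide inlineStartSide(WritingMode mode, Direction direction)
{
    PhysicalSide lineLeft = mode == WritingMode::HorizontalTb ? PhysicalSide::Left : PhysicalSide::Top;
    return direction == Direction::Ltr ? lineLeft : oppositeSide(lineLeft);
}
```

For instance, inlineStartSide(WritingMode::HorizontalTb, Direction::Rtl) gives the physical right side, which is why the start value ends up on the right edge of a cell in an RTL grid.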

writing-modes

So now that we have defined the box edges and flow direction concepts, we can review how they are used when defining the alignment properties and values inside a Grid Layout. The properties can be described along two axes:

  • which dimension they apply to (inline vs. stacking)
  • whether they control the position of the box within its parent, or the box’s content within itself.

alignment-properties
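One way to read that classification is as a small lookup from (dimension, target) to a property name. The sketch below is just a mnemonic with hypothetical names; it assumes the block and grid reading of the axes (flex layout maps them to its main and cross axes instead) and it includes the -items properties, which set the default -self value for a container's children.

```cpp
#include <string>

// Hypothetical types for this sketch.
enum class Axis { Inline, Block };            // "inline vs. stacking"
enum class Target { Self, Content, Items };   // box in its parent, content in the box, default for children

// justify-* properties work on the inline axis, align-* on the block axis.
std::string alignmentPropertyName(Axis axis, Target target)
{
    std::string prefix = axis == Axis::Inline ? "justify" : "align";
    switch (target) {
    case Target::Self: return prefix + "-self";
    case Target::Content: return prefix + "-content";
    case Target::Items: return prefix + "-items";
    }
    return prefix;
}
```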

Regarding the alignment values, there are two concepts that are important to understand:

  • alignment subject - The alignment subject is the thing or things being aligned by the property. For justify-self and align-self, the alignment subject is the margin box of the box the property is set on. For justify-content and align-content, the alignment subject is defined by the layout mode.
  • alignment container - The alignment container is the rectangle that the alignment subject is aligned within. This is defined by the layout mode, but is usually the alignment subject’s containing block.

Also, there are several kinds of alignment behaviors:

  • Positional Alignment - specify a position for an alignment subject with respect to its alignment container.
  • Baseline Alignment - form of positional alignment that aligns multiple alignment subjects within a shared alignment context (such as cells within a row or column) by matching up their alignment baselines.
  • Distributed Alignment - used by justify-content and align-content to distribute the items in the alignment subject evenly between the start and end edges of the alignment container.
  • Overflow Alignment - when the alignment subject is larger than the alignment container, it will overflow. To help combat this problem, an overflow alignment mode can be explicitly specified.

At the time of this writing, only Positional Alignment is implemented, so I’ll focus on its values in the rest of the article. I’m still working on implementing the specification, though, so there will be time to talk about the other values in future posts. The Positional Alignment values are:

  • center - Centers the alignment subject within its alignment container.
  • start - Aligns the alignment subject to be flush with the alignment container’s start edge.
  • end - Aligns the alignment subject to be flush with the alignment container’s end edge.
  • self-start - Aligns the alignment subject to be flush with the edge of the alignment container corresponding to the alignment subject’s start side. If the writing modes of the alignment subject and the alignment container are orthogonal, this value computes to start.
  • self-end - Aligns the alignment subject to be flush with the edge of the alignment container corresponding to the alignment subject’s end side. If the writing modes of the alignment subject and the alignment container are orthogonal, this value computes to end.
  • left - Aligns the alignment subject to be flush with the alignment container’s line-left edge. If the property’s axis is not parallel with the inline axis, this value computes to start.
  • right - Aligns the alignment subject to be flush with the alignment container’s line-right edge. If the property’s axis is not parallel with the inline axis, this value computes to start.
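For the three basic values, the position of the alignment subject along one axis boils down to distributing the free space of the alignment container. The following is a minimal sketch under that assumption; the names are hypothetical, sizes are plain integers in pixels, and it ignores overflow alignment, so a subject larger than its container simply gets a negative offset for center and end.

```cpp
// Hypothetical enum for this sketch; only the three basic positional values.
enum class Position { Start, Center, End };

// Offset of the alignment subject from the alignment container's start edge,
// measured along the axis the property applies to.
int alignmentOffset(Position position, int containerSize, int subjectSize)
{
    int freeSpace = containerSize - subjectSize;
    switch (position) {
    case Position::Start: return 0;
    case Position::Center: return freeSpace / 2;
    case Position::End: return freeSpace;
    }
    return 0;
}
```

With a 100px wide container and a 20px wide subject, center gives an offset of 40px and end gives 80px. The self-start, self-end, left and right values only change which physical edge counts as the start edge.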

So, after this introduction and with all these concepts in mind, it’s now time to get hands-on with the Grid Layout implementation of the Box Alignment specification. As mentioned before, I’ll try to use as many examples as possible.

Aligning items inside a Grid Layout

Before going into details with source code and examples, I’d like to summarize most of the concepts described above with some pretty simple diagrams:

2×2 Grid Layout (LTR)

grid-alignment-ltr

2×2 Grid Layout (RTL)

grid-alignment-rtl

The diagram below illustrates how items are placed inside the grid using different writing modes:

grid-writing-modes

At this point, some real examples will help to understand how the CSS alignment properties work in Grid Layout and why they are so important for unlocking the full potential of this new layout model.

Let’s consider this basic stylesheet which will be used in the examples from now on:

<style>
  .grid {
      grid-auto-columns: 100px;
      grid-auto-rows: 200px;
      width: -webkit-fit-content;
      margin-bottom: 20px;
  }
   .item {
      width: 20px;
      height: 40px;
  }
   .content {
      width: 10px;
      height: 20px;
      background: white;
  }
   .verticalRL {
      -webkit-writing-mode: vertical-rl;
  }
   .verticalLR {
      -webkit-writing-mode: vertical-lr;
  }
   .horizontalBT {
      -webkit-writing-mode: horizontal-bt;
  }
   .directionRTL {
      direction: rtl;
  }
</style>

The item style will be used for the grid items, while the content style will be used for the elements placed inside each grid item. There are also writing-mode related styles, which will be useful later to experiment with different flow and text directions.

In the first example we will center the content of every cell, so we get a fully aligned grid, which is particularly interesting for many web applications.

<div class="grid" style="align-items: center; 
                         justify-items: center">
  <div class="cell row1-column1">
    <div class="item"></div>
  </div>
  <div class="cell row1-column2">
    <div class="item"></div>
  </div>
  <div class="cell row2-column1">
    <div class="item"></div>
  </div>
  <div class="cell row2-column2">
    <div class="item"></div>
  </div>
</div>
grid-alignment-example1

In the next example we will combine the start, center and end Positional Alignment values on both axes to place nine items in the same grid cell.

 
<div class="grid">
  <div class="cell row1-column1"
     style="align-self: start; justify-self: start;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: center; justify-self: start;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: end; justify-self: start;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: start; justify-self: center;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: center; justify-self: center;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: end; justify-self: center;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: start; justify-self: end;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: center; justify-self: end;">
    <div class="item"></div>
  </div>
  <div class="cell row1-column1"
     style="align-self: end; justify-self: end;">
    <div class="item"></div>
  </div>
</div>
grid-alignment-example2

Let’s start playing with the inline and block-flow directions and see how they affect the different Positional Alignment values. I’ll start with the inline direction, which affects only the justify-xxx set of CSS properties.

<div class="grid" style="align-items: self-start; justify-items: self-start">
  <div class="cell row1-column1">
    <div class="item"></div>
  </div>
  <div class="cell row1-column2">
    <div class="item"></div>
  </div>
  <div class="cell row2-column1">
    <div class="item"></div>
  </div>
  <div class="cell row2-column2">
    <div class="item"></div>
  </div>
</div>
Direction LTR Direction RTL
grid-alignment-example3 grid-alignment-example4

The writing-mode CSS property applies to the block-flow direction, hence the align-xxx properties are the ones affected. In this case, orthogonal writing modes can be specified in the HTML source code; however, these use cases are not yet fully supported by the current implementation of Grid Layout.

<div class="grid"
      style="align-items: self-start; 
             justify-items: self-start">
  <div class="cell row1-column1">
    <div class="item"></div>
  </div>
  <div class="cell row1-column2">
    <div class="item"></div>
  </div>
  <div class="cell row2-column1">
    <div class="item"></div>
  </div>
  <div class="cell row2-column2">
    <div class="item"></div>
  </div>
</div>
grid-alignment-example3
Vertical LR Vertical RL
grid-alignment-example5 grid-alignment-example6

Technical challenges, accomplished and to be faced

Implementing the Box Alignment specification has been a long task and there is still quite a lot of work ahead for both the WebKit and Blink/Chromium web engines. Perhaps one of the most tedious tasks was the definition of a couple of new CSS properties, justify-self and justify-items, which required touching several core components: the CSS parser, the style builder and resolver and, finally, the rendering code.

Another important technical challenge comes from the fact that the Box Alignment properties already present in both web engines were implemented as part of the Flexible Box specification. As mentioned earlier in this post, the Box Alignment specification aims to generalize the alignment behavior for several layout models, hence these properties are no longer tied to the Flexible Box implementation; this led to many technical issues, as I’ll explain later.

The patch implemented for issue 333423005 is a good example of the files to touch and the logic to add in order to implement a new CSS property in Blink/Chromium. There is similar work to be done in the WebKit web engine; at the time of this writing the two are still very similar, even though some parts have changed considerably, like the CSS parsing and style builder logic. The patch implemented in bug 134419 is an example.

The following code is quite descriptive of the nature of the CSS Box Alignment properties and how they are applied during the style cascade:

void StyleAdjuster::adjustStyleForAlignment(RenderStyle& style, const RenderStyle& parentStyle)
{
    bool isFlexOrGrid = style.isDisplayFlexibleOrGridBox();
    bool absolutePositioned = style.position() == AbsolutePosition;
 
    // If the inherited value of justify-items includes the legacy keyword, 'auto'
    // computes to the inherited value.
    // Otherwise, auto computes to:
    //  - 'stretch' for flex containers and grid containers.
    //  - 'start' for everything else.
    if (style.justifyItems() == ItemPositionAuto) {
        if (parentStyle.justifyItemsPositionType() == LegacyPosition) {
            style.setJustifyItems(parentStyle.justifyItems());
            style.setJustifyItemsPositionType(parentStyle.justifyItemsPositionType());
        } else {
            style.setJustifyItems(isFlexOrGrid ? ItemPositionStretch : ItemPositionStart);
        }
    }
 
    // The 'auto' keyword computes to 'stretch' on absolutely-positioned elements,
    // and to the computed value of justify-items on the parent (minus
    // any 'legacy' keywords) on all other boxes (to be resolved during the layout).
    if ((style.justifySelf() == ItemPositionAuto) && absolutePositioned)
        style.setJustifySelf(ItemPositionStretch);
 
    // The 'auto' keyword computes to:
    //  - 'stretch' for flex containers and grid containers,
    //  - 'start' for everything else.
    if (style.alignItems() == ItemPositionAuto)
        style.setAlignItems(isFlexOrGrid ? ItemPositionStretch : ItemPositionStart);
 
    // The 'auto' keyword computes to 'stretch' on absolutely-positioned elements,
    // and to the computed value of align-items on the parent (minus
    // any 'legacy' keywords) on all other boxes (to be resolved during the layout).
    if ((style.alignSelf() == ItemPositionAuto) && absolutePositioned)
        style.setAlignSelf(ItemPositionStretch);
}

The WebKit web engine implements the same logic in the StyleResolver class; the StyleAdjuster class is just a helper class defined in the Blink/Chromium engine to assist the StyleResolver logic during the style cascade in order to make some final adjustments.

Issue 297483005 implements align-self CSS property support in Grid Layout; the following code, extracted from that patch, is a good example of how alignment interacts with the grid tracks.

LayoutUnit RenderGrid::rowPositionForChild(const RenderBox* child) const
{
    bool hasOrthogonalWritingMode = child->isHorizontalWritingMode() != isHorizontalWritingMode();
    ItemPosition alignSelf = resolveAlignment(style(), child->style());
 
    switch (alignSelf) {
    case ItemPositionSelfStart:
        // If orthogonal writing-modes, this computes to 'Start'.
        // FIXME: grid track sizing and positioning does not support orthogonal modes yet.
        if (hasOrthogonalWritingMode)
            return startOfRowForChild(child);
 
        // self-start is based on the child's block axis direction. That's why we need to check against the grid container's block flow.
        if (child->style()->writingMode() != style()->writingMode())
            return endOfRowForChild(child);
 
        return startOfRowForChild(child);
    case ItemPositionSelfEnd:
        // If orthogonal writing-modes, this computes to 'End'.
        // FIXME: grid track sizing and positioning does not support orthogonal modes yet.
        if (hasOrthogonalWritingMode)
            return endOfRowForChild(child);
 
        // self-end is based on the child's block axis direction. That's why we need to check against the grid container's block flow.
        if (child->style()->writingMode() != style()->writingMode())
            return startOfRowForChild(child);
 
        return endOfRowForChild(child);
 
    case ItemPositionLeft:
        // orthogonal modes make property and inline axes to be parallel, but in any case
        // this is always equivalent to 'Start'.
        //
        // align-self's axis is never parallel to the inline axis, except in orthogonal
        // writing-mode, so this is equivalent to 'Start'.
        return startOfRowForChild(child);
 
    case ItemPositionRight:
        // orthogonal modes make property and inline axes to be parallel.
        // FIXME: grid track sizing and positioning does not support orthogonal modes yet.
        if (hasOrthogonalWritingMode)
            return endOfRowForChild(child);
 
        // align-self's axis is never parallel to the inline axis, except in orthogonal
        // writing-mode, so this is equivalent to 'Start'.
        return startOfRowForChild(child);
 
    case ItemPositionCenter:
        return centeredRowPositionForChild(child);
        // Only used in flex layout, for other layout, it's equivalent to 'Start'.
    case ItemPositionFlexStart:
    case ItemPositionStart:
        return startOfRowForChild(child);
        // Only used in flex layout, for other layout, it's equivalent to 'End'.
    case ItemPositionFlexEnd:
    case ItemPositionEnd:
        return endOfRowForChild(child);
    case ItemPositionStretch:
        // FIXME: Implement the Stretch value. For now, we always start align the child.
        return startOfRowForChild(child);
    case ItemPositionBaseline:
    case ItemPositionLastBaseline:
        // FIXME: Implement the ItemPositionBaseline value. For now, we always start align the child.
        return startOfRowForChild(child);
    case ItemPositionAuto:
        break;
    }
 
    ASSERT_NOT_REACHED();
    return 0;
}

The resolveAlignment function call deserves a special mention, since it leads to the open issues I’m still working on. The Box Alignment specification states that the auto values must be resolved to either stretch or start depending on the kind of element. In theory this is performed during the style cascade, so it shouldn’t be necessary to resolve it at the rendering stage. The code is pretty simple:

static ItemPosition resolveAlignment(const RenderStyle* parentStyle, const RenderStyle* childStyle)
{
    ItemPosition align = childStyle->alignSelf();
    // The auto keyword computes to the parent's align-items computed value, or to "stretch", if not set or "auto".
    if (align == ItemPositionAuto)
        align = (parentStyle->alignItems() == ItemPositionAuto) ? ItemPositionStretch : parentStyle->alignItems();
    return align;
}
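To see how the two stages fit together, here is a self-contained sketch that mimics the cascade-time adjustment and the layout-time resolution for align-items and align-self only. The Style struct and the function names are hypothetical stand-ins, not the engines' RenderStyle API, and only a handful of ItemPosition values are modeled.

```cpp
// Hypothetical, reduced model of the style data used in this sketch.
enum class ItemPosition { Auto, Stretch, Start, Center, End };

struct Style {
    ItemPosition alignItems = ItemPosition::Auto;
    ItemPosition alignSelf = ItemPosition::Auto;
    bool isFlexOrGrid = false;
    bool isAbsolutelyPositioned = false;
};

// Cascade-time step: 'auto' on align-items computes to 'stretch' for flex and
// grid containers and to 'start' otherwise; 'auto' on align-self computes to
// 'stretch' only for absolutely positioned boxes and is otherwise left alone.
void adjustStyleForAlignment(Style& style)
{
    if (style.alignItems == ItemPosition::Auto)
        style.alignItems = style.isFlexOrGrid ? ItemPosition::Stretch : ItemPosition::Start;
    if (style.alignSelf == ItemPosition::Auto && style.isAbsolutelyPositioned)
        style.alignSelf = ItemPosition::Stretch;
}

// Layout-time step: a remaining 'auto' on the child takes the parent's
// align-items value, falling back to 'stretch' if that is still 'auto'.
ItemPosition resolveAlignSelf(const Style& parent, const Style& child)
{
    if (child.alignSelf != ItemPosition::Auto)
        return child.alignSelf;
    return parent.alignItems == ItemPosition::Auto ? ItemPosition::Stretch : parent.alignItems;
}
```

In this model a grid container with no alignment set ends up with align-items: stretch after the cascade, so a child with align-self: auto resolves to stretch, while a child that sets align-self explicitly keeps its own value.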

The RenderFlexibleBox implementation has to define similar logic and, more importantly, the default value of all the Box Alignment properties has been changed to auto, instead of stretch as stated in the Flexible Box specification.

To make things even more complicated, many HTML elements are rendered by RenderFlexibleBox objects as an implementation decision, without the proper display value set to reflect it. This causes many issues and layout test failures, since the resolved value for auto depends on the kind of element, which is defined by its display property value. Additionally, there are problems with the anonymous render objects added to the tree in certain implementations.

Both WebKit and Blink/Chromium are affected by these issues. MathML is a good example for the WebKit engine, since most of its render objects are implemented using a RenderFlexibleBox; it also assigns and manipulates the align-{self, items} properties during layout. The RenderFullScreen object is a source of problems for the Blink/Chromium web engine in this regard; it uses a RenderFlexibleBox because of its stretch default behavior, which is no longer the case according to the Box Alignment specification.

I’m still working on these issues in both web engines; this issue is an attempt to address part of the problems in Blink/Chromium. There is a similar bug in the WebKit engine with similar challenges.

Another pending issue present in both web engines is the lack of support for different writing modes. Even though the Grid Layout logic is prepared to support them, it’s still buggy and for certain combinations it does not produce the expected outcome.

I’d like to finish this post by pointing out that anybody can follow the progress of the Box Alignment implementation for Grid Layout by tracking these bugs in whichever web engine you are more interested in:

  • Blink/Chromium
    • bug 249451: [CSS Grid Layout] Implement row-axis Alignment
    • bug 376823: [CSS Grid Layout] Implement column-axis Alignment
  • WebKit
    • bug 133224 – [meta] [CSS Grid Layout] Implement column-axis Alignment
    • bug 133222 – [meta] [CSS Grid Layout] Implement row-axis Alignment

This work wouldn’t be possible without the support of Bloomberg and Igalia, who are committed to providing a better web platform for developers.

Igalia & Bloomberg logos

Igalia and Bloomberg working to build a better web platform

Categories: Blink, CSS Grid Layout, WebKit, WebKit Igalia

2014 WebKit Contributors Meeting

April 30th, 2014

A few weeks ago, I had the opportunity to represent Igalia at the WebKit Contributors Meeting. It was hosted by Apple at their campus in Cupertino and, unlike what some fellow Igalians had told me, it wasn’t the huge event it used to be, which gave me the chance to meet in person some of the well-known hackers I interact with on IRC and Bugzilla; that was definitely nice.

I liked the usual unconference-like environment a lot, especially how we scheduled the sessions on the fly; it was fun to see first-hand the audience’s interest in the talk I proposed. My involvement in WebKit during the last year has been improving and implementing CSS standards like CSS Regions and CSS Grid Layout, the latter being precisely what my talk was about.

CSS Grid Layout

When I learned I was going to attend the meeting, I saw the opportunity to spread the word about the work we are doing at Igalia, with the support of Bloomberg, on the implementation of the CSS Grid Layout standard, which we both consider very important for the Web because of the use cases it solves.

I had the feeling that the situation of this new feature could be considerably improved within the WebKit community, perhaps getting the attention of more hackers and reviewers willing to collaborate and accelerate the implementation and, eventually, ship it in future releases of the Safari browser. Considering that IE/Trident is already shipping it, Blink/Chromium plans to do so as soon as possible and Mozilla is also willing to do it in the long term (the E.D. is being written by members of those web engines), we at Igalia considered it sensible to suggest that the WebKit community go in that direction.

Regarding the talk itself, I think it helped to increase the visibility of the CSS Grid Layout implementation in WebKit, so let’s see what this means for our work in the following months. If you are interested in the slides, you can check them out here. We talked about ways to enable the feature by default in the nightly builds, so it can be tested by WebKit hackers more easily, and also about the work to be done in the next months. I was happy to see this point move to the mailing list and finally land as a patch. We have a web site with many examples and guidelines to activate the feature on different browsers:

http://igalia.github.io/css-grid-layout/

Selection

One of my goals in attending this meeting was to unblock the issue of Selection with CSS Regions, something that Igalia was involved in for some months last year in collaboration with Adobe. I discussed with David Hyatt the Sub-Trees Approach as a way of making Selection specification-compliant when using CSS Regions, even though the issues from the end user’s perspective are still there. I consider this an important milestone because it puts the CSS Regions standard in a similar position to the rest of the layout models affected by these Selection problems (CSS Grid Layout, MathML, Shadow DOM, Absolute Positioning and Multi-Column, among others).

Now a different challenge has to be addressed: how to provide a usable Selection for these layout models? Even though it was not in the approved schedule, we managed to set up a meeting among different people interested in Selection. It was a nice and interesting discussion, from which I’d extract these main conclusions:

  • We need to go beyond the editing/selection specification, since the DOM nature of its definition is very limited for the new layout models.
  • Emulating the iOS selection behavior was something most of the participants liked, so perhaps we should consider it in the future.
  • There are several approaches we can follow: multi-range, sub-trees, region-as-containers, … All of them have pros and cons and, what is worse, some of them address only the issues of specific layout models; the idea that there is no single solution for all the layout models was always present.
  • Perhaps we should define different implementations of Selection for these specific features; something incremental since it’s very important to keep such an old and important feature like Selection as stable as it’s been all these years.

These ideas are not actually breaking news, since they were already mentioned in previous meetings, but I perceived a better mood now to implement more ambitious approaches to improve Selection. Some of the standards affected by these issues, like CSS Regions and CSS Grid Layout, are really important features so these technical challenges must be faced as soon as possible.

One of the action points agreed was to analyze, case by case, the issues each of these layout models has, defining very specific examples and test cases; we could later start a discussion on the best way to solve them. At Igalia we have been thinking about this for some time already, so we have created a web site and a test cases repository to analyze the different issues present in the layout models we have considered most interesting so far. We’d like to invite any hacker interested in this topic to contribute to the wiki and the test cases repository, even adding different layout models to be included in the analysis.

http://igalia.github.io/web-selection-examples/

Misc topics

I also attended some other talks and discussions of interest for the future of the WebKit project, which Igalia is quite committed to, especially as maintainers of the WebKitGTK+ port.

We participated in the discussion of “How to make WebKit more awesome” representing the WebKitGTK+ port community; it was a nice discussion with plenty of good ideas.

I attended the CSS Regions and CSS Shapes talks, which were quite rich in terms of progress and roadmap announcements, with the usual awesome demos. The CSS Regions talk presented by Andrei Abucur got quite a lot of attention and generated an interesting discussion about its future.

I attended the session about Subpixel Layout as well, something I was involved in some months ago. I found some CSS Regions bugs related to the Subpixel Layout feature, whose root cause was not enabling the SATURATED_LAYOUT_ARITHMETIC flag. Actually, we’ve recently enabled it in the WebKitGTK+ port with quite good results, although there are still a few regressions I’m working on.

 

Categories: WebKit

New shorthand properties for CSS Grid Layout

April 11th, 2014

I’ve been working for a while on the implementation of the CSS Grid Layout standard for the WebKit and Blink web engines. It’s a complex specification, like most of them, so I really enjoyed deciphering all the angles behind the language used to define the different CSS properties, their usage and limits, exceptions and so on. It’s fair to start by thanking the WebKit reviewers and Blink owners for their patience and support reviewing patches. It is also worth mentioning that the E.D. is still a live document with frequent changes and active discussions on the www-style mailing list, which is very active and supportive in resolving doubts and attending to suggestions from the hackers working on the implementation.

Before reading on, I’d strongly recommend reading the previous posts of my colleagues Manuel and Sergio to understand the basic concepts of CSS Grid Layout and its main features and advantages for the web.

I had the chance to land several patches in WebKit and Blink that improved the current implementation of the standard, both fixing bugs and adapting it to the latest syntax changes introduced in the spec, but perhaps the most noticeable improvements are, so far, the new grid-template and grid shorthands added recently.
 

The “grid-template” shorthand

 
Quoting the CSS Grid Layout specs:

The grid-template property is a shorthand for setting grid-template-columns, grid-template-rows, and grid-template-areas in a single declaration. It has several distinct syntax forms:

none | subgrid | <'grid-template-columns'> / <'grid-template-rows'> | [ <track-list> / ]? [ <line-names>? <string> <track-size>? ]+

It’s always easier if we have some examples at hand:

grid-template: auto 1fr auto / auto 1fr;
grid-template: 10px / "a" 15px;
grid-template: 10px / (head1) "a" 15px (tail1)
                      (head2) "b" 20px (tail2);
grid-template: (first) 10px repeat(2, (nav nav2) 15px) /       "a b c" 100px (nav)
                                                        (nav2) "d e f" 25px  (nav)
                                                        (nav2) "g h i" 25px  (last);

It’s important to notice that postponing the subgrid functionality to level 2 of the specification is under discussion, hence it was not implemented, for the time being, in the shorthand either. This decision had the support of the IE and Chromium browsers; Mozilla partially agrees, although with some doubts.

There was something special in the implementation of this shorthand property. Usually, the CSS property parsing methods are implemented in a straightforward way, avoiding unnecessary or duplicated operations over the parsed value list. However, due to the ambiguity of the shorthand syntax, it’s not clear which form the expression belongs to until reaching the <string> clause. In order to reuse the <grid-template-{row, column}> parsing function, it was necessary to allow rewinding the parsedValue list when the parser detects that it has been processing the wrong form.
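The rewinding idea can be illustrated with a small JavaScript sketch. This is not the WebKit or Blink code (which is C++): ValueList, parseTrackListForm, parseStringForm and parseGridTemplate are hypothetical names, the input is assumed to be already tokenized, and the grammar is reduced to the two forms that collide (line names, none and subgrid are ignored).

```javascript
'use strict';

// Hypothetical token cursor; "index" can be saved and restored to rewind.
function ValueList(tokens) {
  this.tokens = tokens;
  this.index = 0;
}
ValueList.prototype.current = function() { return this.tokens[this.index]; };
ValueList.prototype.next = function() { this.index += 1; };
ValueList.prototype.atEnd = function() { return this.index >= this.tokens.length; };

function isString(token) {
  return typeof token === 'string' && token.charAt(0) === '"';
}

// Form: <columns> / <rows>. Fails (returns null) as soon as a string shows up.
function parseTrackListForm(list) {
  var columns = [], rows = [], target = columns;
  while (!list.atEnd()) {
    var token = list.current();
    if (isString(token)) return null;
    if (token === '/') {
      if (target === rows) return null; // a second slash is not allowed
      target = rows;
    } else {
      target.push(token);
    }
    list.next();
  }
  return target === rows ? { columns: columns, rows: rows, areas: [] } : null;
}

// Form: [<columns> /]? [<string> <size>?]+ (a missing size is taken as auto).
function parseStringForm(list) {
  var columns = [], rows = [], areas = [];
  if (list.tokens.indexOf('/', list.index) !== -1) {
    while (list.current() !== '/') { columns.push(list.current()); list.next(); }
    list.next(); // skip the slash
  }
  while (!list.atEnd()) {
    if (!isString(list.current())) return null;
    areas.push(list.current());
    list.next();
    if (!list.atEnd() && !isString(list.current())) {
      rows.push(list.current());
      list.next();
    } else {
      rows.push('auto');
    }
  }
  return areas.length ? { columns: columns, rows: rows, areas: areas } : null;
}

function parseGridTemplate(tokens) {
  var list = new ValueList(tokens);
  var saved = list.index;
  var result = parseTrackListForm(list);
  if (result) return result;
  list.index = saved; // wrong form: rewind and try the string-based one
  return parseStringForm(list);
}

console.log(parseGridTemplate(['10px', '/', '"a"', '15px']));
```

The first attempt only fails once the string token is reached, which is exactly the point where the real parser has to give up and start over from the saved position.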

Another remarkable implementation detail was the change in the gridLineName parsing function, required to join the line names that end one row with those that start the next one (nav and nav2 in the example). See below the longhand equivalent of the last case in the previous example:

grid-template-columns: (first) 10px repeat(2, (nav nav2) 15px);
grid-template-rows: 100px (nav nav2) 25px (nav nav2) 25px (last);
grid-template-areas: "a b c" 
                     "d e f"
                     "g h i";
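To make the joining of adjoining line names more concrete, here is a minimal JavaScript sketch. It is only an illustration, not the engine code: toLonghandRows and the head/tail/size/area fields are hypothetical names, and each row of the shorthand is assumed to be already parsed into one object.

```javascript
'use strict';

// Each entry describes one row of the shorthand: optional leading line
// names (head), the area string, the row size and optional trailing
// line names (tail).
function toLonghandRows(rows) {
  var parts = [];
  var pending = []; // names trailing the previous row
  rows.forEach(function(row) {
    // Names closing the previous row and names opening this one
    // refer to the same grid line, so they are emitted together.
    var names = pending.concat(row.head || []);
    if (names.length) parts.push('(' + names.join(' ') + ')');
    parts.push(row.size);
    pending = row.tail || [];
  });
  if (pending.length) parts.push('(' + pending.join(' ') + ')');
  return parts.join(' ');
}

var rows = [
  { area: 'a b c', size: '100px', tail: ['nav'] },
  { head: ['nav2'], area: 'd e f', size: '25px', tail: ['nav'] },
  { head: ['nav2'], area: 'g h i', size: '25px', tail: ['last'] }
];

console.log(toLonghandRows(rows));
// prints: 100px (nav nav2) 25px (nav nav2) 25px (last)
```

The rows used here are the ones from the last shorthand example; the trailing names of one row and the leading names of the next always end up inside the same pair of parentheses.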

 

The “grid” shorthand

 
Quoting the CSS Grid Layout specs:

The grid property is a shorthand that sets all of the explicit grid properties (grid-template-rows, grid-template-columns, and grid-template-areas) as well as all the implicit grid properties (grid-auto-rows, grid-auto-columns, and grid-auto-flow) in a single declaration.

<‘grid-template’> | [<‘grid-auto-flow’> [<‘grid-auto-columns’>[/ <‘grid-auto-rows’>]?]?]

Even though the shorthand sets both implicit and explicit grid properties, a single declaration can specify only one of the two groups; the missing properties will be set to their initial values. Now let’s see some examples:

grid: 15px / 10px;
grid: row 10px;
grid: column 10px / 20px;

The “grid” shorthand is the recommended mechanism even to define just the explicit grid properties, unless web authors are interested in cascading the implicit grid properties separately.
 

Current status and next steps

 
Both properties landed in Blink trunk recently (revisions 170552 and 171143) and they are waiting for the final review in WebKit; hopefully they will land soon. There are enough layout tests to cover the most common cases, but some additional cases might be added in the future. As mentioned, there are certain ambiguities in the syntax of both shorthands, and it’s also important to follow the www-style mailing list for changes that might require modifying the implementation and, hence, adding the proper test cases.

With the implementation of these two new shorthands, the property implementation tasks are almost completed. We are now working on fixing bugs and implementing the alignment features. There is quite an important gap between the Blink and WebKit implementations, but we are working on porting patches as soon as possible, since we think it’s important to keep both implementations in sync.

I’ll attend the WebKit Contributors Meeting next week, so perhaps I could speed up the landing of the patches for the shorthand properties. My main goal, though, will be to gather feedback from the WebKit community about the status of the CSS Grid Layout implementation, what features they miss the most and which bugs should have more priority, and to share with them our future plans at Igalia.

All this work was possible thanks to the collaboration between Igalia and Bloomberg. We are both working hard to help and promote the wide adoption of this standard, which will be shipped soon in IE and hopefully also in Chromium. We are also following the efforts Mozilla is making, which give us the impression that the interest of most browsers in this standard is quite high.

Categories: Blink, CSS Grid Layout, planet, WebKit Tags:

Improving selection in CSS Regions

January 22nd, 2014 Comments off

I would like to introduce in this post the main problems we have detected in the Selection implementation of two of the most important web engines, Blink and WebKit. I’ve already described some of these issues, particularly for CSSRegions, but I’ll go a bit further now, analyzing them and also introducing one of the proposals we have been working on at Igalia as part of the collaboration we have with Adobe.

Selection is a DOM Tree matter

At Igalia, we have been investigating how to adapt the selection to the new layout models which provide more complex ways of visualizing the content defined in the DOM Tree. Let’s consider the following basic example using CSSRegions layout:

base-case

Figure 1: base case

In the last post about this issue we have identified 4 main problems with selection in CSSRegions:

  • Selection direction
  • Highlighting and content mismatch
  • Incorrect block gaps filling
  • Clear the selection

I’ll describe some examples where these issues are present, what the root causes are and how they can be solved or, at least, improved. I’ll try as well to explain how the Selection works, starting from the mouse events the end user generates to perform a new selection, how those are mapped into a DOM Tree range and finally, how the rendering process produces the highlighting of the selected elements.

example-a1

Figure 2: Highlighting and selected content mismatch

I’ll use this first example (Figure 2) to briefly describe how the Selection is implemented and how all the involved components interact to generate both the selected DOM Range and the corresponding highlighting by the RenderTree. Obviously the end user selects content from the visualized elements, in this case the content of two regular blocks (no regions involved). The mouse events are translated to VisiblePosition instances (Start and End) in the DOM Tree using the positionForPoint method. Such VisiblePositions are then mapped into the corresponding RenderObjects in the Render Tree; these objects are the ones used to traverse the tree in the RenderView::setSelection method and mark the appropriate elements with one of the following flags: SelectionNone, SelectionStart, SelectionInside, SelectionEnd, SelectionBoth. These flags are also very important in the block gaps filling algorithm, implemented basically in the RenderBlock::selectionGaps method.

The algorithm implemented in the RenderView::setSelection method can be described, in a very simplified way, as the following steps:

  • gathering information (RenderSelectionInfo and RenderBlockSelectionInfo) of the old selection.
  • clearing the old selection (basically mark all the elements as SelectionNone)
  • updating the flags of the elements of the new selection.
  • gathering information of the new selection.
  • repainting old objects which might have changed.
  • painting the new selected objects.
  • repainting the old blocks which might have changed.
  • painting the new selected blocks.
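The clearing and flagging steps above can be sketched in a few lines of JavaScript. This is a toy model, not the WebKit code: node, nextInPreOrder and setSelection are hypothetical helpers working on plain objects, the info gathering and repainting steps are left out, and only the flag names are taken from the engine.

```javascript
'use strict';

// Builds a toy render object and links the children to their parent.
function node(name, children) {
  var n = { name: name, children: children || [], parent: null,
            state: 'SelectionNone' };
  n.children.forEach(function(c) { c.parent = n; });
  return n;
}

// Next object in pre-order: the first child, otherwise the next sibling
// of the nearest ancestor (or of the node itself) that has one.
function nextInPreOrder(n) {
  if (n.children.length) return n.children[0];
  while (n.parent) {
    var siblings = n.parent.children;
    var i = siblings.indexOf(n);
    if (i + 1 < siblings.length) return siblings[i + 1];
    n = n.parent;
  }
  return null;
}

function setSelection(root, start, end) {
  var n;
  // Clear the old selection.
  for (n = root; n; n = nextInPreOrder(n)) n.state = 'SelectionNone';
  if (start === end) { start.state = 'SelectionBoth'; return; }
  // Flag the new selection, walking downwards from start.
  for (n = start; n && n !== end; n = nextInPreOrder(n)) {
    n.state = 'SelectionInside';
  }
  start.state = 'SelectionStart';
  end.state = 'SelectionEnd';
}

var first = node('first');
var last = node('last');
var tree = node('root', [first, node('middle', [node('inner'), last]), node('after')]);
setSelection(tree, first, last);
console.log(first.state, last.state);
```

Note that the flagging loop only works if end can be reached from start by walking downwards; when it cannot, the loop runs until the end of the tree, which is the kind of situation the selection direction issues lead to.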

The algorithm traverses the RenderTree from the Start to the End using the RenderObject::nextInPreOrder function. Here is where the clear operation issues can appear: if not all objects can be reached by the pre-order traversal, the clear operation does not work properly. That’s why we introduced a way to traverse the tree backwards (r155058), looking for elements which might be unreachable. One of the causes of this issue is the selection direction change.

This first example shows the highlighting and content mismatch issue, since the DOM Range considers the source (flow-into) element, while it is not highlighted by the RenderTree.

The next example considers now selection from both regions and regular blocks and introduces also an interesting Selection topic: selection direction.

example-b1

Figure 3: Incorrect block gaps filling

As you can see in Figure 3, the user selected content upwards. In most cases the selection direction is not used at all, so the Start must always be above the End VisiblePosition in the DOM Tree. The VisibleSelection ensures that, but in this case, because of how the Source (flow-into) is defined according to the CSSRegions specification and where it was placed in the HTML code, the original Start and End positions are not flipped. We will talk more about this in the next example. However, the RenderObject associated to a DOM element with a flow-into property is located in the render tree under the RenderFlowThread object, which itself is placed at the end of the normal render tree, thus causing the start render object to be below the end render object. This causes the highlighted content to be exactly the opposite of what the user actually selected.

This example also illustrates the issue of incorrect block gaps filling, since the element with the id content-1 is considered a block gap to be filled. This happens because of the changes introduced in the already mentioned revision r155058: since the element with id content-2 and the body are flagged as SelectionEnd, the intermediate elements are considered as block gaps to be filled.

At this point it is quite clear that the way the Render Tree is traversed is very important to produce a coherent selection rendering; notice that in this case, highlight and selected content match perfectly. The idea of using the Region DIV as container of the Source DIV content portion, which is rendered inside such region, looks like a promising approach. The next example will go further into this idea.

example-c1

Figure 4: Selection direction

In this example (Figure 4) the Start and End VisiblePosition instances have to be flipped by the generated VisibleSelection, since the DIV with the id content-1 is above the original Start element defined by the end user. Flipping both positions makes the corresponding Start and End RenderObject instances consistent; that’s why there is no selection direction issue in this case. However, because of the position of the End element as child of the RenderFlowThread, the RenderElement with the id content-2 is selected, which, while seamless from the user experience point of view, does not match the selected content retrieved from the DOM Range.

The solution: Regions as containers

At this point it is clear that the selection direction issues are one of the most important sources of problems for the Selection with CSSRegions. The current algorithms, RenderView::setSelection and RenderBlock::selectionGaps, require traversing the RenderTree downwards from start to end. In fact, this is especially important for the block gaps filling algorithm.

It’s important to notice that the divergence of the DOM and Render trees when using CSSRegions comes from how these two concepts, the flow-into DOM element and the RenderFlowThread object, are managed and placed in each tree. Why not just use the region elements for the selection algorithms and consider both flow-into and RFT as “meta-elements” where the selected content is extracted from?

Considering the steps defined previously for the selection algorithm the regions as containers approach could be described as follows:

  • Case 1: Start and Stop elements, either both or none, are children of the RenderFlowThread.
    • The current RenderView::setSelection algorithm works fine.
  • Case 2: Only Start is child of the RenderFlowThread.
    • First, determining the RenderElements range [RegionStart, RegionEnd] in the RenderFlowThread associated to the RenderRegion the Start element is rendered by.
    • Then, applying the current algorithm to the range of elements [Start, RegionEnd]
    • Finally, applying the current algorithm from the NextInPreOrder element of the RenderRegion until the Stop, as usual.
  • Case 3: Only Stop is child of the RenderFlowThread.
    • First, applying the current algorithm from the Start element to the RenderRegion the Stop element is rendered by.
    • Then, determining the RenderElements range [RegionStart, RegionEnd] in the RenderFlowThread associated to the RenderRegion the Stop element is rendered by.
    • Finally, applying the current algorithm to the range of elements [RegionStart, Stop]
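A very reduced JavaScript model may help to see why the three cases above always end up traversing downwards. It is only a sketch under strong assumptions: the render tree is flattened into an array in pre-order, each region is replaced by the range of flow thread objects it renders, and visualOrder and selectedObjects are hypothetical names that do not exist in WebKit or Blink.

```javascript
'use strict';

// mainOrder: render objects of the normal tree, in pre-order.
// flowThread: children of the RenderFlowThread, in order.
// regionRanges: for each region, the [RegionStart, RegionEnd] indexes
// of the flow thread objects it renders.
function visualOrder(mainOrder, flowThread, regionRanges) {
  var order = [];
  mainOrder.forEach(function(name) {
    order.push(name);
    if (Object.prototype.hasOwnProperty.call(regionRanges, name)) {
      var range = regionRanges[name];
      for (var i = range[0]; i <= range[1]; i++) order.push(flowThread[i]);
    }
  });
  return order;
}

// Objects to flag, always walking downwards from the first to the last one.
function selectedObjects(order, start, stop) {
  var from = order.indexOf(start);
  var to = order.indexOf(stop);
  if (from === -1 || to === -1) return [];
  if (from > to) { var tmp = from; from = to; to = tmp; }
  return order.slice(from, to + 1);
}

var order = visualOrder(
  ['block-1', 'region-1', 'block-2'],
  ['content-1', 'content-2'],
  { 'region-1': [0, 1] });

// Case 2: only Start is a child of the flow thread.
console.log(selectedObjects(order, 'content-2', 'block-2'));
// Case 3: only Stop is a child of the flow thread.
console.log(selectedObjects(order, 'block-1', 'content-1'));
```

Since the flow thread content is visited where its region sits, and not at the end of the tree, the start of the selection is never below its end in this model.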

Determining the selection direction, at VisibleSelection, is also affected by the structure of the RenderTree; even though the editing module in both WebKit and Blink also uses rendering info for certain operations, this is perhaps one of the weakest points of this approach. Let’s use one of the examples defined before to illustrate this situation.

example-b2

Figure 5: Block gaps filling issues solved

While traversing the RenderTree, once a RenderRegion is reached its corresponding range of RenderObjects is determined in the RenderFlowThread. The entire range will be traversed for the blocks flagged as SelectionInside. For the ones flagged as SelectionStart or SelectionEnd, the steps previously defined are applied.

The key of this new approach is that traversing is always downwards, from the Start to the End, which solves also the block gaps related issues.

Let’s consider now a more complex example (Figure 6), with several regions between a number of regular blocks. Selection is more natural with this approach, coherent with what the user expects and also matching the DOM Tree range in most cases. This is, however, the biggest drawback of this approach, since it does not completely follow the Editing/Selection specs. I’ll talk more about this in the last lines of this post.

example-d

Figure 6: Selection direction issues solved

The following video showcases our proposal on the WebKit MiniBrowser testing application using a real HTML example based on the Figure 6 diagram.

Even though selection is more natural and coherent, as I already mentioned, it does not completely follow the Editing/Selection specs. As I stated at the beginning of this post, selection is a DOM matter, defined by a Range of elements in the DOM Tree. This very simple case (Figure 7) is enough to describe this issue:

example-a

Figure 7: Regions as containers NON specs compliant

The regions as containers approach does not consider the Source (flow-into) elements as actual DOM elements, so they will never be part of the selection. This breaks the Editing/Selection specification, since those are regular DOM elements as they are defined in the CSS Regions standard. This approach was our first try and perhaps too ambitious: providing a good user experience on selection with CSSRegions and being specs compliant at the same time. Even though it was a good experience, we can conclude that the problem is too complex and requires a different strategy.

We had the opportunity to introduce and discuss our proposal during the last WebKitGtk+ Hackfest, where Rego, Mihnea and I had the chance to work hand in hand, carefully analyzing and digesting this proposal. Even though we finally discarded it, we were able to design other approaches, some of which are really promising. Rego will introduce some of them shortly in a blog post. Hence, I can’t end this post without thanking Igalia and the GNOME Foundation for sponsoring such a productive and useful hackfest.

Categories: CSS Regions, planet, WebKit Tags:

CSS Regions and Selection

October 15th, 2013 Comments off

Back in early June, Adobe and Igalia announced a collaboration to work on the CSS Regions and CSS Shapes W3C standards. Our first challenge has been to improve the Selection use cases when using complex layout models, like CSS Regions.

The CSS Regions model allows content to flow across multiple areas called Regions. This new approach offers web content designers a way to build richer and more complex layouts, mapping content with specific visual areas of a document. Defining different Flow Threads with multiple Regions, associating them to specific content, and applying different styles to a set of Regions is very powerful in terms of design and user experience. If you are interested, here you can find some examples of what is possible with CSS Regions.

But having this flexibility in web design requires overcoming quite a few technical challenges. The current implementation of CSS Regions in WebKit changes the way the Render Tree is created from the DOM Tree. This poses the challenge of making selection work with regions since selection is DOM based. Given its importance and frequent use, improving the interaction of selection and CSS Regions has been the main goal of our collaboration.

The W3C selection specification, which has not been updated since last year, does not address the complexities introduced by new layout modules, like CSS Regions, CSS FlexBox and CSS Grid Layout. We found out very quickly that selection had many issues, with respect to both visual appearance and selected content. We have created a test suite to evaluate the different use cases of selection with CSS Regions.

test2

Selected content does not match the highlighted area.

test1

Selection direction issues

Let’s start with a very simple description of the concept of Selection Direction, which consist of the following points:

  • The WebCore::VisibleSelection class has two attributes called base and extent declared as dom::Position instances
  • Such positions refer respectively to the anchor and focus nodes in the DOM Tree.
  • Additionally, WebCore::VisibleSelection has two dom::Position attributes, start and end, which are used later during the rendering phase

Once the base and extent fields are set when instantiating a new VisibleSelection class, some adjustments and checks are performed to validate the selection. One of those checks is whether base position is before extent in the DOM Tree. Based on the result of this check, the start and end attributes will be set to either base or extent respectively.
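The base/extent check can be written down as a tiny JavaScript sketch. It is not the WebCore implementation: positions are modelled here as a node index in document order plus an offset, and comparePositions, makeVisibleSelection and isBaseFirst are names made up for the illustration.

```javascript
'use strict';

// A position is modelled as the index of its node in document order
// plus an offset inside that node (an assumption of this sketch).
function comparePositions(a, b) {
  if (a.nodeIndex !== b.nodeIndex) return a.nodeIndex - b.nodeIndex;
  return a.offset - b.offset;
}

// base and extent keep what the user did (anchor and focus);
// start and end are always sorted in document order.
function makeVisibleSelection(base, extent) {
  var baseIsFirst = comparePositions(base, extent) <= 0;
  return {
    base: base,
    extent: extent,
    start: baseIsFirst ? base : extent,
    end: baseIsFirst ? extent : base,
    isBaseFirst: baseIsFirst
  };
}

// The user drags upwards: the anchor is below the focus in the document.
var selection = makeVisibleSelection(
  { nodeIndex: 5, offset: 0 },
  { nodeIndex: 2, offset: 3 });
console.log(selection.start, selection.end);
```

The rendering phase only sees start and end, so any disagreement between document order and Render Tree order shows up after this point.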

In the first test, the selected content includes the entire region block, even when it was not selected by the user. The cause, as we will see later, derives from the position of the source block in the DOM Tree, which in this case is defined between the two regular blocks.

The second example shows some selection direction related issues; in this case, what the user selected is precisely the content between the two highlighted areas. The problem here is that the base node of the original selection is below the extent node in the DOM Tree, so they are swapped in the selection logic. In addition, the CSS Regions implementation builds the Render Tree in a way that the source content defined by a RenderNamedFlowThread block is positioned below where it was defined in the DOM Tree. The consequence of this is that the start node is below the end node, so the highlighted area starts in the region block and continues from the root element (usually the body) until reaching the end node.

Our first approach was trying to provide a better user experience during selection with CSS Regions. We thought that adding multi-range capabilities to the DOM Selection API was the best way to go and we provided a patch. However, this approach was rejected by some Apple reviewers because multi-range selection introduces many problems, such as those detailed in the selection specification.

We have opened the debate again on the mailing list, though, because we think there might be some advantages to this approach, even without modifying the selection API. For instance, being able to handle independent Ranges and compose the expected selection will provide the flexibility needed to implement complex use cases. But, after some discussion with some of the Adobe Web Platform contributors, we have decided to focus on improving the selection following the current specification. While we feel this approach may lead to a non-optimal user experience for certain use cases, we expect implementing it will help us discover the problems inherent in the current selection specification. We have also been discussing these issues with some reviewers from Apple, Ryosuke Niwa and David Hyatt, and looking for alternatives to the multi-range approach.

We have posted additional patches, one to improve the selection behavior and another to revert the current limitation of selections related to including content from different Flow Threads. We think that this approach provides better integration of CSS Regions with HTML documents. Plus, it will allow us to properly evaluate the performance issues of selection with CSS Regions.

The other challenges we faced during this collaboration include:

  • changing how the selection rendering traverses the Render Tree in order to deal with the special RenderFlowThread blocks
  • adjusting the block gaps filling algorithm
  • clearing the selection
  • selection direction issues derived from the Render Tree and DOM Tree divergence

We detected a number of other issues, such as how LayoutPoints are positioned in the DOM Tree when pointing to Region blocks, leading to an incorrect Selection extent node. We are confident that we will ultimately have a fully-compliant selection specification implementation for CSS Regions, but the improvements using this approach are limited. Even after solving all the issues we have detected so far, selection might still seem weird and buggy from an end user perspective. Thus we think that the final solution, one which will provide the user with a more consistent experience, will be to complement the selection specification to consider not only the DOM Tree, but also how the Render Tree is built by the Layout Model.

Categories: CSS Regions, planet, WebKit Tags:

Node.js + Socket.io = Real-Time IO.

December 19th, 2012 4 comments

The use of JavaScript for implementing server-side logic is not breaking news these days, but Node.js is definitely gaining relevance as one of the hottest technologies in this area. There are multiple reasons that explain this trend, but clearly one of those is the asynchronous event-driven model and its advantages for dealing with BigData problems.

When dealing with real-time requirements, Socket.io can play an important role in the web application architecture, providing an abstraction layer for the communication between the browser and the server. The Node.js event-driven model combined with the Socket.io real-time capabilities offers a good solution to face BigData challenges in domains where real-time capability is important.

I’ll try to show some examples of the combination of these two technologies for implementing real-time web operations.

Socket.io based client-server communication

Let’s consider the basic and default configuration of socket.io, described as follows:

  • Client side javascript
"use strict";
 
jQuery(document).ready(function() {
 
  var socket = io.connect();
 
  socket.on('connect', function() {
    console.log('connected');
  });
  socket.on('disconnect', function() {
    console.log('disconnected');
  });
  socket.on('error', function(err) {
    if(err === 'handshake error') {
      console.log('handshake error', err);
    } else {
      console.log('io error', err);
    }
  });
  socket.on('updates', function(newUpdates) {
  });
  $("#target").click(function() {
    socket.emit('myEvent');
  });
 });

The script uses jQuery to support the UI operations that manipulate the DOM from the ‘updates’ event handler. This event is emitted by the server’s StreamAssembler, which I’ll describe later. Obviously, Socket.io does not require jQuery at all, and the code could even be defined inside a script tag in the HTML page.

The client script can also emit events through the socket, which will be handled by the server’s Node.js event loop. It’s a bidirectional communication channel.

  • Server side javascript
'use strict';
 
var express = require('express');
var fs = require('fs');
var indexBuffer = fs.readFileSync('index.html').toString();
var app = express();
var io = require('socket.io');
var http = require('http');
var server = http.createServer(app);
 
app.use(express.bodyParser());
app.use('/scripts', express.static(__dirname + '/scripts'));
 
app.get('/',
  function(req, res) {
  console.log('Request to "/" ...');
  res.contentType('text/html');
  res.send(indexBuffer);
});
 
server.listen(8080);
io = io.listen(server);
 
io.configure(function (){
  io.set('log level', 1);
});
 
io.sockets.on('connection', function (socket) {
  console.log('got socket.io connection - id: %s', socket.id);
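  // keys and redisClient are assumed to be defined elsewhere in the
  // application; StreamAssembler is described later in this post.
  var assembler = new StreamAssembler(keys, socket, redisClient);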
 
  socket.on('myEvent', function() {
    console.log('"myEvent" event received');
  });
 
  socket.on('disconnect', function() {
    // needs to be stopped explicitly or it will continue
    // listening to redis for updates.
    if(assembler)
      assembler.stop();
  });
});

This code represents a minimalistic HTTP server with Socket.io support. It just creates the server using the express module and makes Socket.io listen on the HTTP server. The Socket.io configuration just sets the log level, but it might be used for other purposes, like authentication.

The server also sets up the StreamAssembler, which is responsible for collecting, aggregating and assembling the raw data retrieved from the database and emitting events for the clients (browsers) through the Socket.io communication channel.

Stateful processing and data handy in-memory

The Node.js event-driven model eases the development of client/server stateful logic, which is very important when implementing distributed systems devoted to online processing of stream data and for assembling in one place all the context required for servicing a web request. It also helps to define async state-machine patterns thanks to the single-threaded approach, so the implementation turns out simpler and easier to debug than the typical multi-thread based solutions.

Also, perhaps even more important when dealing with real-time requirements, the in-memory data processing is mandatory to really provide a real-time user experience. Node.js provides a programming model fully aligned with this real-time approach in mind.

So, let’s consider we have access to a large storage where huge data streams are stored and manipulated. Let’s consider Redis as a cache system for such large storage, to be used for real-time purposes.

The StreamAssembler component receives a constant raw data stream and produces structured data, aggregating data from different sources, always under the limit of the window size in order to ensure all the operations are executed in memory, taking into account the server’s HW specifications.

It uses the redis-sync module for exploiting the Redis Replication interface and monitoring the Redis storage, looking for commands that alter the database status on specific keys. It might also use the redis-sync module for replicating specific keys from the Redis (cache) database to the main storage (sediment), larger and usually offering better performance on write operations (Cassandra or HBase, for instance).

function StreamAssembler(keys, socket, redisClient) {
  var that = this;
 
  var updates = {};
  var monitor = null;
 
  function moveServerWindow() {
    console.info('Moving server Window');
    var serverList = [];
    var k;
    for (k in updates) { serverList.push([k, updates[k].t]);}
    // Sort by timestamp (the sorted set score), ascending.
    serverList.sort(function(a, b) {
      return Number(a[1]) - Number(b[1]);
    });
    while (serverList.length > serverWindowLimit) {
      var toDelete = serverList.pop();
      delete updates[toDelete[0]];
    }
  }
 
  function addUpdates(results) {
    var idList = [];
    var i, update, t, uk, u;
    for(i = 0; i < results.length; i += 2) {
      update = JSON.parse(results[i]);
      t = results[i + 1];
 
      uk = update.id;
      idList.push(uk);
 
      u = updates[uk];
 
      if(u === undefined) {
        //console.info(uk, 'not seen yet');
        u = {t:t, data:update};
        updates[uk] = u;
      }
    }
    return idList;
  }
 
  function getRawData(key, cb) {
    console.log('Getting raw data from: ' + key);
    redisClient.zrange(key, '-100', '-1', 'withscores',
                       function(err, results) {
      if(err) return cb(err);
      addUpdates(results);
      cb(null);
    });
  } 
 
  function initialUpdate() {
    console.log('initial update');
    moveServerWindow();
    socket.emit('updates', updates);
  } 
 
  that.addRawDataKeys = function addRawDataKeys(keys) {
    var rem = 0; var tlId;
    for(var key in keys) {
      ++rem;
      getRawData(keys[key], function(err) {
        if(err) console.error(err);
        --rem;
        if(rem === 0) {
          initialUpdate();
          that.addMonitorKeys(keys);
        }
      });
    }
    if(rem === 0) {
      console.log('No update keys'); // no updates to retrieve
      initialUpdate();
      that.addMonitorKeys(keys);
    }
  } 
 
  that.addMonitorKeys = function addMonitorKeys(keys) {
    if (monitor) {
      monitor.addKeys(keys);
    } else {
      console.log('Creating new monitor');
      monitor = new m.Monitor(keys);
      monitor.on('changed', handleCommand);
      monitor.connect();
    }
  } 
 
  that.stop = function() {
    if (monitor) {
      console.log('Stopping monitor');
      monitor.disconnect(handleCommand);
    }
  } 
 
  // Start retrieving data only once addMonitorKeys is defined, since
  // addRawDataKeys calls it synchronously when there are no keys.
  that.addRawDataKeys(keys); 
 
  function handleCommand(key, command, args) {
    var i, t, u;
    var values = [];
    var newUpdates = [];
    if(command === 'zadd') {
      for(i = 0; i < args.length; i += 2) {
        t = Buffer.concat(args[i]).toString();
        u = Buffer.concat(args[i + 1]).toString();
        values.push(u);
        values.push(t);
        newUpdates.push(JSON.parse(u));
      }
      addUpdates(values);
      moveServerWindow();
      socket.emit('dUpdates', newUpdates);
    }
  }
}

The StreamAssembler uses the specific socket, passed as argument and created by the Node.js server through the socket.io module, to emit two different events: “updates”, for the initial updates retrieved from the Redis database, and “dUpdates”, for incremental updates detected by the redis-sync monitor.
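On the browser side, both events can feed the same local view of the data. The following sketch shows one possible way of doing it; applyInitialUpdates and applyIncrementalUpdates are hypothetical helpers (they are not part of the original example), and letting an incremental update simply replace the stored entry with the same id is an assumption of mine.

```javascript
'use strict';

// "updates" carries the server window: an object keyed by update id,
// where each value has the form { t: timestamp, data: update }.
function applyInitialUpdates(state, updates) {
  Object.keys(updates).forEach(function(id) {
    state[id] = updates[id].data;
  });
  return state;
}

// "dUpdates" carries an array with the updates just added to Redis.
function applyIncrementalUpdates(state, newUpdates) {
  newUpdates.forEach(function(update) {
    state[update.id] = update;
  });
  return state;
}

// Wiring in the browser (socket is the object returned by io.connect()):
//   var state = {};
//   socket.on('updates', function(updates) {
//     applyInitialUpdates(state, updates);
//   });
//   socket.on('dUpdates', function(newUpdates) {
//     applyIncrementalUpdates(state, newUpdates);
//   });
```

Keeping the merge logic in plain functions makes it easy to test without a running server; the UI code only needs to redraw from state after each event.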

Some examples: system performance monitoring

With the diagram described above in mind, I’ve been playing with Node.js and Socket.io to implement some illustrative examples of how such an architecture works and how to implement real-time communication with browsers.

We could define a simple retriever of system performance data (e.g. top, netstat, …) and feed a Redis database with raw data from several hosts.

The StreamAssembler will transform the raw data into structured data, to be displayed by the Browser using the d3.js library.

There is a video of the code running available here; check also the source code here. It’s just a small example of the use of Node.js + Socket.io and the StreamAssembler pattern.

Categories: BigData Tags:

Exploiting the Redis replication interface with Node.js

November 22nd, 2012 Comments off

The Redis replication interface can be easily exploited for other purposes by creating a new TCP connection and issuing the SYNC command. The Redis server will use such a connection to stream any write command as soon as it’s executed.

'use strict';
 
var net = require('net');
var client = net.connect(6379);
 
client.on('connect', function(a) {
  console.log('syncing ...');
  client.write('sync\r\n');
});
client.on('data', function(data) {
  console.log('Streaming commands ...');
});
client.on('error', function(err) {
  console.log(err);
});
client.on('end', function() {
  console.log('end');
});

At Igalia, we’ve been working on building smart and distributed components for real-time data stream analysis in collaboration with Perceptive Constructs. We are using several of its Redis components to face our BigData and real-time challenges, but perhaps one of the most useful ones has been the redis-sync module.

The redis-sync component is a Node.js module that exploits the Redis replication interface in order to monitor all the write commands executed by the server. It emits different signals providing Node.js data structures for the command arguments, which can be handled by top-level JavaScript applications.

The redis-sync component can take advantage of the rdb-parser component, which helps to parse generic Redis RDB dumps, in order to load all the changes in the database prior to the sync call.

The rdb-parser

The rdb-parser module generates Node.js data structures from Redis RDB dumps or command replies, based on the new Redis Unified Request Protocol. In its current development status it offers an almost complete parser for all the Redis entities:

  • REDIS_STRING
  • REDIS_LIST
  • REDIS_SET
  • REDIS_ZSET
  • REDIS_HASH

The rdb-parser emits the ‘entity’ signal for every Bulk Reply detected in the buffer, with the following structure:

that.emit('entity', [REDIS_TYPE, key, data]);

The parsing process is triggered with the function write(data), which assigns a new buffer to parse. The process is implemented using a simple state machine pattern and it ensures that the data manipulation is binary safe, also according to the Redis unified protocol. The example provided in the repository is quite illustrative. Just type:

node ./test.js < ./dump.rdb

The redis-sync module

The redis-sync module uses the Redis replication interface to monitor and stream the Redis commands which modify the database state. This component is very useful for implementing real-time capabilities, but also for migrating data into larger databases, since Redis is a pure in-memory store and memory is precious.

Like the rdb-parser, the redis-sync module is implemented using a simple state machine pattern and supports both the unified protocol and inline commands; it is binary safe as well. You can check the usage examples here.
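To illustrate what such a state machine looks like, here is a minimal sketch of an incremental parser for unified-protocol requests (an argument count line, *<argc>, followed by one $<length> line and payload per argument, each terminated by \r\n). It is not the redis-sync implementation; createCommandParser and its state names are hypothetical, and inline commands and error replies are deliberately left out.

```javascript
'use strict';

// Minimal incremental parser for Redis unified-protocol requests:
//   *<argc>\r\n  then, per argument,  $<length>\r\n<bytes>\r\n
// Illustrative only; this is not the redis-sync implementation.
function createCommandParser(onCommand) {
  var state = 'start'; // current state of the machine
  var num = 0;         // number being read (argc or bulk length)
  var argc = 0;        // arguments still expected
  var remaining = 0;   // payload bytes still expected
  var chunks = [];     // Buffer chunks of the current argument
  var args = [];       // completed arguments (Buffers)

  function finishArg() {
    args.push(Buffer.concat(chunks));
    chunks = [];
    if (--argc === 0) {
      onCommand(args);
      args = [];
      state = 'start';
    } else {
      state = 'bulkMarker';
    }
  }

  // Feed the parser with a Buffer; input may be split at any byte.
  return function write(data) {
    var i = 0;
    var n;
    while (i < data.length) {
      var b = data[i];
      switch (state) {
        case 'start': // expect '*'
          if (b !== 42) { throw new Error('expected *'); }
          num = 0; state = 'argc'; i++; break;
        case 'argc':
        case 'bulkLen': // decimal digits terminated by \r
          if (b === 13) {
            state = (state === 'argc') ? 'argcLF' : 'bulkLenLF';
          } else {
            num = num * 10 + (b - 48);
          }
          i++; break;
        case 'argcLF': // consume \n after the argument count
          argc = num; state = argc > 0 ? 'bulkMarker' : 'start'; i++; break;
        case 'bulkMarker': // expect '$'
          if (b !== 36) { throw new Error('expected $'); }
          num = 0; state = 'bulkLen'; i++; break;
        case 'bulkLenLF': // consume \n after the bulk length
          remaining = num;
          state = remaining > 0 ? 'payload' : 'payloadCR';
          i++; break;
        case 'payload': // copy raw bytes without interpreting them
          n = Math.min(remaining, data.length - i);
          chunks.push(data.subarray(i, i + n));
          remaining -= n; i += n;
          if (remaining === 0) { state = 'payloadCR'; }
          break;
        case 'payloadCR': // consume \r after the payload
          state = 'payloadLF'; i++; break;
        case 'payloadLF': // consume \n, the argument is complete
          i++; finishArg(); break;
      }
    }
  };
}
```

The payload state never inspects the bytes it copies, which is what makes the sketch binary safe: a \r or \n inside an argument is treated as data because the length prefix, not a delimiter, decides where the argument ends.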

As mentioned above, it can use the rdb-parser internally for dealing with the initial sync Bulk Reply:

case 'bulkReplyLenR':
  if(data[i] === 10) { // \n
    ++i;
    if((that.listeners('entity').length > 0 || that.listeners('rdb').length > 0) && !readRDB) {
      if(!rdbParser) {
        rdbParser = new rdb.Parser();
        rdbParser.on('entity', function(e) {
          that.emit('entity', e);
        });
        if(that.listeners('rdb').length === 0) {
          rdbParser.on('error', function(err) {
            // stream is used internally, error handling is done at the outer level
          });
        }
      }
      that.emit('rdb', rdbParser);
      startReadingBytes(bulkReplyLen, false,
           function(buf) { rdbParser.write(buf); },
           function() { rdbParser.end(); readRDB = true; rdbParser = undefined; connectedOK(); state = 'ready';});
    } else {
      startReadingBytes(bulkReplyLen, false,
           function(buf) { that.emit('bulkReplyData', buf); },
           function() { that.emit('bulkReplyEnd'); readRDB = true; connectedOK(); state = 'ready';});
    }
  }
  break;

Once the initial sync is done and the corresponding Bulk Reply parsed by the rdb-parser, the readRDB variable determines whether a new sync (reconnection) or a regular command reply is being processed.

In order to receive new commands just listen to the “command” or “inlineCommand” events:

  • that.emit('command', command, unifiedArgs.slice(1));
  • that.emit('inlineCommand', inlineCommandBuffers);
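Judging from the Buffer.concat(args[i]) calls used in the examples of this post, each argument of the “command” event arrives as an array of Buffer chunks. Under that assumption, a small helper can turn the arguments into strings before handling them; decodeArgs and describeCommand are hypothetical names, not part of redis-sync.

```javascript
'use strict';

// Assumption: each argument is an array of Buffer chunks, as suggested by
// the Buffer.concat(args[i]) calls in the examples of this post.
function decodeArgs(args, encoding) {
  return args.map(function(chunks) {
    return Buffer.concat(chunks).toString(encoding || 'utf8');
  });
}

// Hypothetical formatter for the 'command' event, handy for logging.
function describeCommand(command, args) {
  var decoded = decodeArgs(args);
  return [String(command).toUpperCase()].concat(decoded).join(' ');
}
```

A listener could then be registered as sync.on('command', function(command, args) { console.log(describeCommand(command, args)); }). Decoding to strings is only appropriate for textual values; binary payloads should be kept as Buffers.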

Monitoring specific keys

A very basic use case for the redis-sync module would be to monitor individual keys and trigger specific actions. The listeners of such actions will be notified by the Node.js event loop, providing a kind of real-time capability to the client.

'use strict';
 
var util = require('util');
var EventEmitter = require('events').EventEmitter;
var redisSync = require('redis-sync');
 
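function Monitor(keys) {
  EventEmitter.call(this); // initialize the emitter state
  var that = this;
  var sync = new redisSync.Sync();
  sync.on('command', function(command, args) {
    // Commands without arguments carry no key to compare against.
    if (args.length === 0) { return; }
    // The first argument holds the key, as an array of Buffer chunks.
    var key = Buffer.concat(args[0]).toString();
    if (keys.indexOf(key) !== -1) {
      console.log('key %s changed: %s', key, command);
      that.emit('changed', key, command, args);
    }
  });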
  sync.on('error', function(err) {
    console.error(err);
  });
  that.connect = function(p, h) {
    sync.connect(p, h);
  };
}
 
util.inherits(Monitor, EventEmitter);
exports.Monitor = Monitor;

Categories: BigData, NoSQL Tags:

My first Strata Conference

October 24th, 2012 1 comment

This year was the first time the Strata Conference reached Europe and, thanks to Igalia, I could be there to evaluate the investment we have been making in BigData technologies.

This trip is part of the roadmap of the Distributed Computing team we created at Igalia, with the aim of exploring a field where Open Source is key and seeing how our more than ten years of experience as Open Source consultants would fit in such a competitive area.

We have lately been collaborating with the company Perceptive Constructs to increase our data modelling and machine learning capabilities. Both companies were present at the Strata Conference to showcase our work on Distributed and Smart components for Social Media stream analysis. We will unveil our achievements in future posts, but first I’ll share my impressions about the conference and the future of the BigData area.

O’Reilly Strata Conferences

These conferences used to be US events, with a presence on both coasts (New York, San Francisco and Santa Clara), but this was the first EU conference, so it was very important for us to attend. There is a lot of BigData activity in the UK, and the commitment to Open Data is very strong there, which is helping many start-ups to grow.

The conference is what you would expect from a big event like this: quite expensive, but very well organized and fully oriented to networking and business. Some events were very interesting, like the Start-up Competition, which connected young companies and independent developers with investors and entrepreneurs. The Office Hours gave us the chance to have face-to-face meetings with some of the speakers, which was a great thing in my opinion.

I’ll comment on the talks I considered most relevant, but let me mention first that I think the contents were very well structured, with a good mix of technical and inspiring material. The keynotes were a great warm-up for a conference which, I think, tries to show the social aspects behind the BigData field and how it could help us to acquire better knowledge in an era of unprecedented access to information.

The Talks

First of all, I think all the keynote videos are worth sharing, but I would like to comment on the ones I found most remarkable.

Good Data, Good Values

It was a really inspiring talk, describing how BigData can help to make a better world by supporting not-so-big companies and organizations in making sense of the data they generate. “No need to have big data for getting big insights”.

The Great Railway Caper: Big Data in 1955

The talk was interesting because it explained very well what BigData is and what the actual challenges are:

  • Partitioning.
  • Slow storage media.
  • Algorithms.
  • Lots of storage.

The current situation hasn’t changed since 1955:

  • Not enough RAM to solve the entire problem.
  • The algorithm doesn’t exist yet.
  • Machines busy with other stuff.
  • Secondary storage is slow.
  • Having to deal with tight deadlines.

Keynote by Liam Maxwell

The UK Government is really pushing for BigData and is committed to the Open Data initiative. I would like to see the Spanish government continue its efforts to increase transparency and openness regarding public data.

They really want to work with SMEs, avoiding big players and vendor lock-in, which I personally think is the right approach. As was stated many times during the conference:

  • Open Source + Open Data = Democratization of BigData.

BigData in retail

This talk was an excellent example of a domain where BigData could fit very well. On-line retail providers manage huge volumes of data from many countries, and statistical models apply pretty well to consumer habits and depot stock trends.

They basically use MATLAB, so I guess real-time analysis is not crucial. They focus on different angles:

  • Predicting how weather affects sales.
  • Reducing depot stock holding.
  • Improving promotions.

Transparency transformed

They have developed a very cool system, a kind of expert system for detecting, classifying and generating new knowledge on different topics: news and media channels, sports, real estate, financial services, … They are now approaching regular companies to analyse their business processes.

  • Scheme: data – facts – angles – structure – narrative.
  • Fully automated process: meta-journalism.

There are some case studies: financial analysis and on-line education.

  • Generating financial reports from isolated insights.
  • Interpretation of financial charts.
  • Giving advice to students and teachers.
  • Social networks are another example.
  • Challenge of dealing with unstructured data.

The core system is based on AI techniques (expert systems, probably) using pattern-matching rules.

  • They don’t predict, but it’s in the roadmap (long term).
  • They don’t expose an API.
  • They don’t do machine-learning.

Categories: BigData Tags:

GeoClue and Meego: QtMobility

October 1st, 2010 1 comment

As you probably know, GeoClue is part of the Meego architecture as the Geolocation component. However, the current plan is to use the QtMobility API for UI applications and to define GeoClue as one of the available backends.

The QtMobility software implements a set of APIs to ease the development of UI software focused on mobile devices. It provides some interesting features and tools for a great variety of mobile oriented development areas:

  • Contacts
  • Bearer (Network Management)
  • Location
  • Messaging
  • Multimedia
  • Sensors
  • Service Framework
  • System Information

All those software pieces are a kind of abstraction that exposes easy and comprehensive APIs to be used in UI application development. In regard to Geolocation, let’s describe the Location component in detail.

The first public implementation of a GeoClue based backend for the QtMobility Location API was recently announced. The starting point for implementing the GeoClue backend, as described in the QtMobility documentation, is the QGeoPositionInfoSource abstract class. Implementing this abstract class using GeoClue doesn’t seem too hard; however, the current GeoClue architecture has some limitations when it comes to fulfilling the QtMobility specifications:

  • The QGeoPositionInfo class, defined for storing the Geolocation data retrieved by the selected backend (GeoClue in this case), manages global location, direction and velocity together.
  • The GeoClue API has separate methods and classes for location, address and velocity. Independent signals are emitted whenever such parameters change.
  • The GeoClue Velocity interface is not implemented in the GeoClue Master provider.
  • Even though it is not too hard to implement the abstract methods of the QGeoPositionInfoSource class, the start/stop updating methods are not very efficient in regard to battery and memory consumption. There is no easy or direct way to remove a provider when it is not used.

As part of Igalia’s plans on Meego, I’ve been working on the implementation of such a GeoClue based backend for the Meego QtMobility framework. Now that part of my work is done, it’s time to share efforts and contribute to the public repository with some patches and the performance reports I’ve gathered during the last months. Some work is still needed before releasing it, but I hope I will be able to send something in the coming weeks, so stay tuned.

Even though the code is not ready to be made public, I can show a snapshot of the test application I implemented for the Meego Handset platform using the Meego Touch framework:

GeoClue test application for Meego Handset

The purpose of this application would be monitoring the DBus communication between the different location providers, creating some performance tests and evaluating the impact on a Mobile platform.

QGeoPositionInfo Class Reference

Categories: Geolocation, MeeGo, Mobile, planet Tags:

GeoClue and Meego: Connman support

September 28th, 2010 Comments off

As promised, GeoClue now supports Connman as the connectivity manager module for acquiring network based location data. This step has been essential to complete the integration of GeoClue in the Meego architecture.

Check the patch if you want to know the details.

Thanks to Bastien Nocera for reviewing and pushing the commit, which is now part of the master branch of GeoClue. Let’s see if it passes the appropriate tests before becoming part of an official release.

Network based positioning is one of the advantages of using GeoClue as the Location provider. That’s obvious for Desktop implementations, where GPS and Cell Id based methods are not the most common use cases. On the other hand, Mobile environments could also benefit from network based positioning, which can assist the GPS based methods to speed up fix acquisition; perhaps by indicating where the closest satellite network is or by showing a less accurate location while the GPS fix is being established.

Finally, I would like to remark that my work is part of Igalia’s bet on the Meego platform. I think the GeoClue project will be an important technology to invest in, since it’s also relevant for GNOME and Desktop technologies. In fact, GeoClue is also Ubuntu’s default Geolocation component.

Categories: Geolocation, GNOME, MeeGo, Mobile, planet Tags: