My first Strata Conference

This year was the first time the Strata Conference reached Europe and, thanks to Igalia, I could be there to evaluate the investment we have been making in BigData technologies.

This trip is part of the roadmap of the Distributed Computing team we created at Igalia with the aim of exploring a field where Open Source is key, and of finding out how our more than ten years of experience as Open Source consultants would fit in such a competitive area.

We have lately been collaborating with the company Perceptive Constructs to increase our data modelling and machine learning capabilities. Both companies were present at the Strata Conference to showcase our work on Distributed and Smart components for Social Media stream analysis. We will unveil our achievements in future posts, but first I’ll share my impressions of the conference and the future of the BigData area.

O’Reilly Strata Conferences

These conferences were usually US events, with a presence on both coasts (New York, San Francisco and Santa Clara), but this was the first EU conference, so it was very important for us to attend. There is a lot of BigData activity in the UK, and the commitment to Open Data is very strong there, which is helping a lot of start-ups to grow.

The conference is what you would expect from a big event like this: quite expensive, but very well organized and fully oriented to networking and business. There were some very interesting events, like the Start-up Competition, which connected young companies and independent developers with investors and entrepreneurs. The Office Hours gave us the chance to hold face-to-face meetings with some of the speakers, which was a great thing in my opinion.

I’ll comment on the talks I considered most relevant, but first I should mention that I think the contents were very well structured, with a good mix of technical and inspiring material. The keynotes were a great warm-up for a conference which, I think, tries to show the social aspects behind the BigData field and how it could help us acquire better knowledge in an era of unprecedented access to information.

The Talks

First of all, I think all the keynote videos are worth sharing, but I would like to comment on the ones I found most remarkable.

Good Data, Good Values

It was a really inspiring talk, describing how Big Data can help to make a better world by supporting not-so-big companies and organizations in making sense of the data they generate. “No need to have big data for getting big insights”.

The Great Railway Caper: Big Data in 1955

The talk was interesting because it explained very well what BigData is and what the actual challenges are:

  • Partitioning.
  • Slow storage media.
  • Algorithms.
  • Lots of storage.

The situation hasn’t changed since 1955:

  • Not enough RAM to solve the entire problem.
  • The algorithm doesn’t exist yet.
  • Machines busy with other stuff.
  • Secondary storage is slow.
  • Having to deal with tight deadlines.

Keynote by Liam Maxwell

The UK Government is really pushing for BigData and is committed to the Open Data initiative. I would like to see the Spanish government continue its efforts to increase transparency and openness regarding public data.

They really want to work with SMEs, avoiding big players and the vendor lock-in issue, which I personally think is the right approach. As was stated many times during the conference:

  • Open Source + Open Data = Democratization of BigData.

BigData in retail

This talk was an excellent example of a domain where BigData could fit very well. On-line retail providers manage huge volumes of data from many countries, and statistical models apply pretty well to consumer habits and depot stock trends.

They basically use MATLAB, so I guess real-time analysis is not crucial. They focus on different angles:

  • Predicting how weather affects sales.
  • Reducing depot stock holding.
  • Improving promotions.

Transparency transformed

They have developed a very cool system, a kind of expert system for detecting, classifying and generating new knowledge on different topics: news and media channels, sports, real estate, financial services, … They are now approaching regular companies to analyse their business processes.

  • Scheme: data – facts – angles – structure – narrative.
  • Fully automated process: meta-journalism.

There are some case studies: financial analysis and on-line education.

  • Generating financial reports from isolated insights.
  • Interpretation of financial charts.
  • Giving advice to students and teachers.
  • Social networks are another example.
  • Challenge of dealing with unstructured data.

The core system is based on AI techniques (expert systems, probably) using pattern-matching rules.

  • They don’t predict, but it’s in the roadmap (long term).
  • They don’t expose an API.
  • They don’t do machine-learning.


GeoClue and Meego: QtMobility

As you probably know, GeoClue is part of the Meego architecture as the Geolocation component. However, the current plan is to use the QtMobility API for UI applications and to define GeoClue as one of the available backends.

The QtMobility software implements a set of APIs to ease the development of UI software focused on mobile devices. It provides some interesting features and tools for a great variety of mobile oriented development areas:

  • Contacts
  • Bearer (Network Management)
  • Location
  • Messaging
  • Multimedia
  • Sensors
  • Service Framework
  • System Information

All those software pieces are a kind of abstraction that exposes easy and comprehensive APIs to be used in UI application development. With regard to Geolocation, let’s describe the Location component in detail.

The first public implementation of a GeoClue-based backend for the QtMobility Location API was recently announced. The starting point for implementing the GeoClue backend, as described in the QtMobility documentation, is the QGeoPositionInfoSource abstract class. Implementing this abstract class using GeoClue does not seem too hard; however, the current GeoClue architecture has some limitations when it comes to fulfilling the QtMobility specification:

  • The QGeoPositionInfo class, defined for storing the Geolocation data retrieved by the selected backend (GeoClue in this case), manages global location, direction and velocity together.
  • The GeoClue API has separate methods and classes for location, address and velocity. Independent signals are emitted whenever those parameters change.
  • The GeoClue Velocity interface is not implemented in the GeoClue Master provider.
  • Even though it is not too hard to implement the abstract methods of the QGeoPositionInfoSource class, the start/stop updating methods are not very efficient with regard to battery and memory consumption. There is no easy or direct way to remove a provider when it is not in use.
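
To illustrate the mismatch between the first two points, here is a minimal JavaScript sketch (not QtMobility or GeoClue code) of how a backend could merge position and velocity updates that arrive independently into one combined record, in the spirit of QGeoPositionInfo. The class name PositionAggregator, its methods and the field names are hypothetical, chosen only for this illustration.

```javascript
// Hypothetical aggregator: GeoClue-style signals arrive separately,
// while the consumer wants a single combined record.
class PositionAggregator {
    constructor(onUpdate) {
        this.onUpdate = onUpdate;   // called with the merged record
        this.current = { latitude: null, longitude: null, speed: null, direction: null, timestamp: null };
    }

    // Would be called from a "position changed" signal handler.
    positionChanged(timestamp, latitude, longitude) {
        this.merge({ timestamp, latitude, longitude });
    }

    // Would be called from a "velocity changed" signal handler.
    velocityChanged(timestamp, speed, direction) {
        this.merge({ timestamp, speed, direction });
    }

    merge(fields) {
        Object.assign(this.current, fields);
        // Only notify once a position is known; velocity alone is not a fix.
        if (this.current.latitude !== null && this.current.longitude !== null)
            this.onUpdate({ ...this.current });
    }
}

const updates = [];
const aggregator = new PositionAggregator((info) => updates.push(info));
aggregator.velocityChanged(1000, 1.5, 90);        // no position yet, nothing is emitted
aggregator.positionChanged(1001, 43.36, -8.41);   // emits position plus last known velocity
console.log(updates);
```

The design choice here is to keep the last known value of every field, so that a consumer always sees one coherent record regardless of which signal fired last.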

As part of Igalia’s plans for Meego, I’ve been working on the implementation of such a GeoClue-based backend for the Meego QtMobility framework. Now that part of my work is done, it’s time to share efforts and contribute to the public repository with some patches and the performance reports I’ve gathered during the last months. Some work is still needed before releasing my work, but I hope I will be able to send something in the coming weeks, so stay tuned.

Even though the code is not ready to be made public, I can show a snapshot of the test application I implemented for the Meego Handset platform using the Meego Touch framework:

GeoClue test application for Meego Handset

The purpose of this application is to monitor the DBus communication between the different location providers, create some performance tests and evaluate the impact on a mobile platform.

QGeoPositionInfo Class Reference

GeoClue and Meego: Connman support

As promised, GeoClue now supports Connman as the connectivity manager module for acquiring network-based location data. This step has been essential to complete the integration of GeoClue in the Meego architecture.

Check the patch if you want to know the details.

Thanks to Bastien Nocera for reviewing and pushing the commit, which is now part of the master branch of GeoClue. Let’s see if it passes the appropriate tests before becoming part of an official release.

Network-based positioning is one of the advantages of using GeoClue as the Location provider. That’s obvious for Desktop implementations, where GPS and Cell ID based methods are not the most common use cases. On the other hand, Mobile environments could also benefit from network-based positioning, which can assist GPS-based methods and improve the fix acquisition process, perhaps by indicating where the closest satellite network is or by showing a less accurate location while the GPS fix is being established.

Finally, I would like to remark that my work is part of Igalia’s bet on the Meego platform. I think the GeoClue project will be an important technology to invest in, since it’s also relevant for GNOME and Desktop technologies. In fact, GeoClue is also Ubuntu’s default Geolocation component.
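
The idea of showing a coarser location while the GPS fix is still being established can be sketched as follows. This is only an illustration, assuming each fix carries an accuracy radius in meters (smaller is better); the function name bestFix and the shape of the fix objects are hypothetical and not part of GeoClue.

```javascript
// Pick the most accurate fix currently available.
// `fixes` is a list of { source, latitude, longitude, accuracy } objects,
// where accuracy is an error radius in meters; a source with no fix yet is null.
function bestFix(fixes) {
    let best = null;
    for (const fix of fixes) {
        if (fix === null)
            continue;
        if (best === null || fix.accuracy < best.accuracy)
            best = fix;
    }
    return best;
}

const networkFix = { source: "network", latitude: 43.37, longitude: -8.40, accuracy: 5000 };
const gpsFix = { source: "gps", latitude: 43.3623, longitude: -8.4115, accuracy: 10 };

// While the GPS is still acquiring, the coarse network fix is used.
console.log(bestFix([networkFix, null]).source);
// Once the GPS fix arrives, it replaces the coarse one.
console.log(bestFix([networkFix, gpsFix]).source);
```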

GeoClue and Meego

As most of you probably know, GeoClue is the default component of the Meego architecture for supporting Geolocation services.

GeoClue on Meego

The geoclue packages are installed by default in both the Netbook and Handset Meego SDK environments. I’ve been playing a bit with the Meego simulator; GeoClue seems to be perfectly configured and the examples can be executed without any problem.

But here is the bad news 🙂 Some work is needed to adapt the GeoClue Connectivity module to the Meego connection manager component: connman.

I think I’m going to spend some time figuring out how much work is required and trying to propose a feasible approach. Another interesting task I have in mind is to implement some Meego-specific examples for GeoClue using the Meego Touch framework.

GUADEC experiences

After spending the last week attending GUADEC, it’s time to share some thoughts and impressions. The warm-up days were not so bad after all: some networking, and hacking face-to-face with people you only know from IRC or email. Some important meetings also took place during those days, and it’s a good forum for exchanging information and ideas about GNOME and Free Software.

Besides, I think those early, and probably more relaxed, days are the perfect moment for activities like the GNOME Developer Training. It was a very good experience, judging from the feedback I’ve got. I think it proved that it fits well with GUADEC’s purposes and, with more time for planning and some additional marketing actions, the success of future editions will be guaranteed. In my opinion, this training is a very interesting marketing tool to get more companies involved in GNOME and Free Software, providing some knowledge to ease the change.

Thinking about it, it could be a good idea to arrange some meetings with local companies wherever the next conferences take place. Some months before GUADEC, a marketing plan should target local companies and governments so that they embrace this kind of course as a way of understanding both the social and the business advantages of Free Software.

Once we reached the Core days, the technical talks got the spotlight. They were really interesting, as always, and full of new stuff. The Luis Villa keynote at the opening, even though it was not breaking news, gave me the clear message that the GNOME Desktop should look at the Web if it wants to keep, at least, its relevance in the desktop and mobile markets.

During the talk about WebKitGtk presented by Xan López and Gustavo Noronha, this point was made again in detail. They suggested a new path for GNOME to strengthen the community and get new developers interested in Web technologies that could help the Desktop to become a more integrated tool.

There were several other talks about the Web and JavaScript as the perfect language for UX development. Especially interesting was the talk given by John Palmieri, titled The future is Javascript; let’s see 🙂

Another interesting talk I would like to highlight is the one given by Bastien about GeoClue. Despite there not being too many improvements since the last GUADEC, I think this is one of the most promising projects in the GNOME/freedesktop environment. Geolocation will be one of the keys of the new devices, not only mobile phones, but any kind of device which is designed to be carried while traveling; I mean laptops, tablets, wayfinder navigators, portable TVs and the like.

And finally, the talk about Grilo presented by Iago. The Grilo component was presented to a quite interesting audience: its architecture, main features like searching and browsing, and potential uses of this piece of software. I think this component will play an important role in the future, since the Internet, as a media content provider, is reaching high rates of use, stealing users from the typical channels like TV, DVDs, or even direct downloads.

Geolocation unit tests for WebKit using GeoClue

As you probably know, the WebKit project is focused on the implementation of an open source web browser engine. Support for the W3C Geolocation API has been available since 2008, and GeoClue was the chosen option for the WebKitGtk+ port implementation.

Since Igalia is very interested in the WebKit project, I’ve had the chance to devote some time to exploring the integration of GeoClue in WebKit and learning more about this project, which will be the main bet for our Innovation area. It’s really great to set aside my regular tasks for a while and keep learning about and analyzing in depth such an interesting project.

The current state of the GeoClue-based implementation is somewhat preliminary. Basic location services are provided by GeoClue, but the integration with the WebKit core is not fully covered and most of the unit tests are disabled at this moment.

So, I thought it would be a good challenge to complete the implementation and check the unit test cases to see if I’m able to fix at least one of them. I think it would be a good start 🙂

There are 24 unit test cases defined and only 4 of them are enabled at this moment; they basically just test whether GeoClue is installed and configured, or check the types of the input arguments. I thought that working on the enabled.html test would be interesting, because it is very simple and it proves that the GeoClue API is used correctly and works as expected.

description("Tests Geolocation success callback using the mock service.");

var mockLatitude = 51.478;
var mockLongitude = -0.166;
var mockAccuracy = 100;

if (window.layoutTestController) {
    layoutTestController.setGeolocationPermission(true);
    layoutTestController.setMockGeolocationPosition(mockLatitude,
                                                    mockLongitude,
                                                    mockAccuracy);
} else
    debug('This test can not be run without the LayoutTestController');

var position;
navigator.geolocation.getCurrentPosition(function(p) {
    position = p;
    shouldBe('position.coords.latitude', 'mockLatitude');
    shouldBe('position.coords.longitude', 'mockLongitude');
    shouldBe('position.coords.accuracy', 'mockAccuracy');
    finishJSTest();
}, function(e) {
    testFailed('Error callback invoked unexpectedly');
    finishJSTest();
});

window.jsTestIsAsync = true;
window.successfullyParsed = true;

So, the first issue to face is the implementation of the setMockGeolocationPosition method, which is unimplemented (see bug 28264) in the Gtk port (LayoutTestControllerGtk.cpp). This method should set the mock position to be retrieved by the Geolocation API method.

The problem is that GeoClue has no such method, at least not as an API method. Depending on the location provider selected, it’s possible to set a dummy position through the DBus API.

Another problem is that the Master provider, the one used by the WebKitGtk+ port to implement the Geolocation API, does not have a very good provider selection algorithm, so one of the web-service-based providers is selected, causing the unit test to stall while waiting for a web response which never comes.

Hence, it seemed that the work should start on the GeoClue side: talking to the community to look for the proper approach, discussing the patches I’ve implemented and eventually pushing for those patches to be committed. After some weeks of hard work, the patches are already in the GeoClue bugzilla; let’s see how the discussion evolves.

Meanwhile, I started the WebKit tasks by implementing the mock operation. The first approach, perhaps the easiest one, would be to use the GeoClue API directly for setting the mock position. Using my own GeoClue branch with my patches applied, I was able to correctly execute the success.html unit test. The patch was not too complex, but it required a new dependency in the WebKitTools module, in order to use the GeoClue API from the LayoutTestControllerGtk component.

void LayoutTestController::setMockGeolocationPosition(double latitude, double longitude, double accuracy)
{
    // FIXME: Implement for Geolocation layout tests.
    // See https://bugs.webkit.org/show_bug.cgi?id=28264.
    // This relies on my own GeoClue branch with my patches applied.
    GeocluePosition* pos = NULL;
    const char* service = "org.freedesktop.Geoclue.Providers.Manual";
    const char* path = "/org/freedesktop/Geoclue/Providers/Manual";
    const char* iface = "org.freedesktop.Geoclue.Manual";
    GError* error = NULL;

    GeoclueMaster* master = geoclue_master_get_default();
    GeoclueMasterClient* client = geoclue_master_create_client(master, 0, 0);
    if (client && geoclue_master_client_set_requirements(client, GEOCLUE_ACCURACY_LEVEL_LOCALITY, 0,
                                                         false, GEOCLUE_RESOURCE_ALL, &error)) {
        // Get a Position object bound to the Manual provider and set the mock position on it.
        pos = geoclue_master_client_create_iface_position(client, service, path, iface, &error);
        if (pos) {
            geoclue_position_set_position(pos, accuracy, longitude, latitude, 0, &error);
            g_object_unref(pos);
        }
    }

    // Release any reported error and the GeoClue objects created above.
    g_clear_error(&error);
    if (client)
        g_object_unref(client);
    g_object_unref(master);
}

In spite of being functionally correct, after talking with some of the WebKitGtk+ developers it seems that this might not be the best approach to follow. The Chromium port has solved the problem by implementing its own Location Cache system inside the WebKit code, delegating to the specific Geolocation tools only when no valid location is available. The mock method just sets the mock position in this cache, so the unit tests don’t need any external dependency.

The next steps would be talking to the WebKitGtk+ team to evaluate my proposal and figuring out the best approach to follow, probably something similar to what Chromium has implemented.
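
The cache-based approach can be sketched in a few lines of JavaScript. This is not Chromium’s code, only an illustration of the idea as I understood it; the LocationCache class, its method names and the maximum-age parameter are all hypothetical.

```javascript
// Hypothetical cache sitting in front of a platform location backend.
class LocationCache {
    constructor(backend, maxAgeMs) {
        this.backend = backend;     // function returning { latitude, longitude, accuracy }
        this.maxAgeMs = maxAgeMs;   // how long a cached position stays valid
        this.cached = null;
        this.cachedAt = 0;
    }

    // Used by tests: store a position directly, bypassing the backend.
    setMockPosition(latitude, longitude, accuracy, now = Date.now()) {
        this.cached = { latitude, longitude, accuracy };
        this.cachedAt = now;
    }

    getCurrentPosition(now = Date.now()) {
        const valid = this.cached !== null && now - this.cachedAt <= this.maxAgeMs;
        if (!valid) {
            // Delegate to the real backend only when the cache holds nothing valid.
            this.cached = this.backend();
            this.cachedAt = now;
        }
        return this.cached;
    }
}

let backendCalls = 0;
const cache = new LocationCache(() => {
    backendCalls++;
    return { latitude: 0, longitude: 0, accuracy: 1000 };
}, 60000);

cache.setMockPosition(51.478, -0.166, 100, 0);
const position = cache.getCurrentPosition(1000);
console.log(position, backendCalls);
```

With this structure a layout test only has to call the mock setter; the backend (GeoClue, in the Gtk case) is never touched while the mock position is still valid.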

GeoClue: Analysis and Architecture

After the first post introducing this cool project, it is time to go further and analyze in depth the GeoClue internals and general behavior, briefly describing the most relevant components of its architecture.

The first thing I noticed during the analysis of this piece of software is that it is quite a complex component, at least from the architecture design point of view. It offers a set of interfaces and DBus bindings to provide a very general and flexible tool for handling and extending location provider implementations. The following picture illustrates this idea:

Let’s start analyzing the architecture, exploring each module and the relationships and interactions between the internal components.

Interfaces

GeoClue provides several interfaces to expose the different location services and configuration operations. The following interfaces are currently defined:

  • GcIfaceGeoclue: Interface for administrative and configuration operations.
  • GcIfaceAddress: Interface for address acquiring operations.
  • GcIfacePosition: Interface for global positioning operations.
  • GcIfaceGeocode: Interface for geocoding operations.
  • GcIfaceReverseGeocode: Interface for reverse-geocoding operations.
  • GcIfaceVelocity: Interface for velocity monitoring operations.

Some of those interfaces are exposed through DBus interfaces. An XML file defines the structure of the DBus bindings. The *Glue generated classes are created from those specification files.

Location Providers

The most direct way to obtain the location is through the specific Location Provider implementations, each one using a different strategy for acquiring the data. There are several implementations of both Position and Address providers, but let’s analyze a simple one for the time being, say Localnet.

Every Location Provider is defined through the following configuration files:

  • geoclue-localnet.xml: This file defines the exposed DBus methods and signals. The associated *Glue classes are generated from this file.
  • geoclue-localnet.provider: This file defines the settings of the Location Provider, like description, DBus specification (path, service, iface), accuracy and some special features offered by the provider (e.g. automatic updates).
  • org.freedesktop.Geoclue.providers.Localnet: DBus service launcher file.

The Location Provider class, in this case called GeoclueLocalnet, should inherit from the GcProvider abstract class, which implements the GcIfaceGeoclue interface. It should also implement the abstract methods defined in the corresponding generated *Glue class, which allows the provider to receive calls through the DBus system or session bus.

Location Data Containers

In order to retrieve different kinds of Location data there are several classes defined for that purpose, all of them derived from GeoclueProvider. This class holds a DBus proxy object to the instance which actually implements the Location mechanism. The specific provider container instantiated, GeoclueAddress for instance, should receive the DBus service specification (path, service); in the case of the Localnet example, it would be org.freedesktop.Geoclue.Providers.Localnet and /org/freedesktop/Geoclue/Providers/Localnet.
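
As a small illustration of that service specification, the following JavaScript sketch derives both strings from a provider name, following the pattern visible in the Localnet example. The helper name providerAddress is hypothetical, and the assumption that every provider follows the same naming convention is mine.

```javascript
// Derive the DBus service name and object path for a provider, assuming
// all providers follow the naming pattern of the Localnet example.
function providerAddress(name) {
    return {
        service: `org.freedesktop.Geoclue.Providers.${name}`,
        path: `/org/freedesktop/Geoclue/Providers/${name}`,
    };
}

const localnet = providerAddress("Localnet");
console.log(localnet.service);
console.log(localnet.path);
```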

Master Provider

The Master provider follows a client/server design, where the server holds a reference to a DBus proxy object for the selected Location provider. The client evaluates the available providers and chooses the best one, following a provider selection algorithm based on the user requirements.

The Master server can serve several clients, and it monitors the currently selected Location Provider, notifying all the clients of any status transitions and events, or even a change of the selected provider.

Both components, client and server, are accessed through DBus, using the org.freedesktop.Geoclue.MasterClient and org.freedesktop.Geoclue.Master interfaces respectively.
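
To give an idea of what a requirements-based selection could look like, here is a minimal JavaScript sketch. It is not GeoClue’s actual algorithm; the accuracy level names, the provider fields (needsNetwork, needsGps, available) and the accuracy assigned to each provider are assumptions made only for this example.

```javascript
// Accuracy levels ordered from coarse to fine (names are illustrative).
const ACCURACY = { none: 0, country: 1, region: 2, locality: 3, street: 4, detailed: 5 };

// Choose the most accurate available provider that satisfies the requirements.
// requirements: { minAccuracy, allowNetwork, allowGps }
function selectProvider(providers, requirements) {
    const candidates = providers.filter((p) =>
        p.available &&
        ACCURACY[p.accuracy] >= ACCURACY[requirements.minAccuracy] &&
        (!p.needsNetwork || requirements.allowNetwork) &&
        (!p.needsGps || requirements.allowGps));
    candidates.sort((a, b) => ACCURACY[b.accuracy] - ACCURACY[a.accuracy]);
    return candidates.length > 0 ? candidates[0] : null;
}

const providers = [
    { name: "Hostip", accuracy: "locality", needsNetwork: true, needsGps: false, available: true },
    { name: "Localnet", accuracy: "street", needsNetwork: false, needsGps: false, available: true },
    { name: "Gypsy", accuracy: "detailed", needsNetwork: false, needsGps: true, available: false },
];

const chosen = selectProvider(providers, { minAccuracy: "locality", allowNetwork: false, allowGps: true });
console.log(chosen ? chosen.name : "no provider");
```

In this run the web-based provider is filtered out because network access is not allowed, and the GPS provider because it is not available, so the selection falls back to the local one.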

Master client/server

As mentioned before, the Master Provider has a client/server based design; the server is defined as a singleton of the GeoclueMaster class and it is responsible for instantiating the clients.

The GeoclueMasterClient class instances are the ones used by external applications to interact with the Location framework. They create the necessary providers (Address, Position, Velocity, …) associated with the Master DBus interface (org.freedesktop.Geoclue.MasterClient). The corresponding GcMasterClient instance will receive all the requests, forwarded through the Master DBus interface (org.freedesktop.Geoclue.Master) to the Master server, which will route the call to the selected Location Provider, using the specific provider DBus interface (org.freedesktop.Geoclue.Providers.Localnet).

And that’s all; there are other components, like the connectivity manager and the related interfaces, or the web service supporting classes, but I think the contents described in this post are more than enough to understand how GeoClue works or, at least, to show what a complex structure it has.

The following sequence diagram could help to understand how the classes interact to get the address, for instance:

A more interesting debate would be whether such a complex design is necessary, whether the advantages it provides in terms of flexibility and generalization are enough to justify it, or even whether they could be obtained with a different, perhaps simpler, design. But such an interesting debate will take place in future posts 😉 stay tuned.

Master discounts to the Final Year Project contest participants

For those who are thinking of applying for a Master course on software, I would like to encourage you to check out the Master on Free Software website. This year Igalia, the Master organizer, wanted to promote the FYP prize initiative by offering an additional 10% discount to those who have presented their candidacy for the FYP contest.

With this step the Master project tries to support people who are already committed to the Free Software movement and who want to deepen their knowledge of this way of creating technology.

At Igalia we are very focused on promoting Free Software and we think students are one of the most relevant targets, since they are still open-minded and have enough energy and passion to push for this collaborative business model against the typical one based on licenses and patents, usually managed by a group of big companies.

Creating camera software with GDigicam

After the release of the GDigicam project some months ago, we received some requests for documentation and examples of how to use the GDigicam component for handling specific camera devices. I’ve finally found time to commit, truth be told, some very preliminary code which aims to be the first full GDigicam example, showing some of the most important features of this piece of software and how to interact with the GStreamer GstCamerabin component.

First of all, for those who still don’t know what GDigicam actually is, I would like to briefly introduce it. GDigicam is a framework for handling camera-related low-level software, inspired by the OpenMax standard. GDigicam provides a complete API for implementing a set of features that are very useful when building camera UI software:

  • ViewFinder
  • Flash Modes
  • Scene Modes
  • Resolution and Aspect Ratio
  • Autofocus
  • White Balance
  • Quality
  • Zoom
  • Video and Photo Capture

The GDigicam component is intended to ease setting up and handling the software components which actually control and implement the video and photography features and, in addition, to hide the technologies used in those lower layers.

The first implementation of the abstract API exposed by GDigicam is based on the GStreamer toolkit, using the GstCamerabin component. You can check it out from the git repository:

  • git clone git://gitorious.org/fremantle-gdigicam/gdigicam.git

The stable branch is totally focused on the MAEMO platform, so if you plan to work on any different platform you will have to use the master branch. The new example is only available in the unstable branch, since the GstCamerabin component is slightly different in MAEMO. Hopefully, I’ll be able to merge this example into the stable branch soon, but it will require some important design changes that could take some time.

When I was implementing the new GDigicam example I realized that other things could be built on top of the GDigicam component. I think a benchmarking tool fits the purpose of showing how to use GDigicam perfectly, and it also gives the community an interesting tool for comparing and analyzing different kinds of camera hardware and software platforms.

Here is a video briefly showing this tool running on the MAEMO platform with the N900 hardware. The UI is very simple, and perhaps a little rudimentary; user experience is not the key at this stage. In the video you can see how to configure the camera settings (flash, scene mode, resolution, quality and so on). After the configuration stage, you can enable or disable your own set of benchmarking tests. You can implement your own tests, group them in your own way and execute all of them in a row. In the video you can see the execution of Set1 – Test1: Capture still images in a row (default is 5 iterations).

There are lots of additional features, like a fully verbose log of what’s going on, performance metrics, and comparison and analysis of the different HW used for benchmarking. Of course, you can forget about testing and just build your own Camera for your device.
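
The way the tool groups tests into sets and repeats each one can be sketched as follows. This is a JavaScript illustration of the idea only, not the GDigicam example code; runBenchmarks, the set and test objects and the counter standing in for a real capture are all hypothetical.

```javascript
// Hypothetical benchmark runner: tests are grouped into named sets and each
// enabled test is repeated a number of times while its duration is measured.
function runBenchmarks(sets, iterations = 5) {
    const results = [];
    for (const set of sets) {
        for (const test of set.tests) {
            if (!test.enabled)
                continue;
            let totalMs = 0;
            for (let i = 0; i < iterations; i++) {
                const start = Date.now();
                test.run();
                totalMs += Date.now() - start;
            }
            results.push({ set: set.name, test: test.name, iterations, averageMs: totalMs / iterations });
        }
    }
    return results;
}

let captures = 0;   // stands in for a real still-image capture
const sets = [{
    name: "Set1",
    tests: [
        { name: "Test1: capture still images in a row", enabled: true, run: () => { captures++; } },
        { name: "Test2: disabled example", enabled: false, run: () => { captures += 100; } },
    ],
}];

const results = runBenchmarks(sets);
console.log(results);
```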

Besides, the GDigicam component could provide other interesting features, very useful for implementing camera UI applications:

  • Video/Audio resource policies.
  • Metadata management.
  • Geolocation.

My new trip: GeoClue, and beyond …

It has been a long time since the last post about software on my blog. However, far from meaning a lack of interest or passion regarding technology, it is because I lived a nice experience working on the MAEMO project, having the opportunity to participate in one of the most interesting GNOME-related projects I’ve ever been involved in. I have to admit that it was a time of working in the shadows, but you all know how this kind of project is 😉 ; even so, I think I contributed my two cents. Besides the passion and energy that my whole development group at Igalia put into the project, the GDigicam component was born from this effort as a small (for the time being) contribution to the community.

But now it’s time to live new experiences, perhaps not as relevant as the ones from such a famous project, but for sure more exciting in terms of technological challenges, and because I’m having the chance to work on what I love. Geolocation is one of my secret passions, and it has grown considerably after my experience in the mobile market, where I think this kind of technology is going to be one of the keys of such smart devices.

Since the Birmingham GUADEC, I’ve been following the progress of the GeoClue project and analyzing the strong and weak points of this technology. This is the first of a series of posts I’m working on, with the aim of providing a complete review of this project.

Let’s lay the first stone and talk about what GeoClue actually is. From an academic point of view, GeoClue could be understood as a framework, in the sense that it integrates several kinds of tools for acquiring Geolocation data; it does not provide position data by itself, but through different kinds of Geolocation providers:

  • Hostip: This provider uses http://www.hostip.info to get current position and address based on the current public IP address.
  • Plazes: This provider uses http://plazes.com to get current position and address based on the current router MAC address.
  • Manual: This provider exists to let the user specify the current address.
  • Localnet: This provider does not strictly speaking require an internet connection, but it does require a connection to a router: it uses the current router MAC address and a local keyfile to provide Address data.
  • Gsmloc: This provider uses the Gammu library and http://opencellid.org/ to provide position data. It gets cell identification data (MCC, MNC, LAC, CID) from Gammu and queries a position from opencellid with that data.
  • Gypsy: Gypsy is a GPS multiplexing daemon with a D-Bus interface. Gypsy provider requires the option org.freedesktop.Geoclue.GPSDevice to be set.
  • GPSd: GPSd is a GPS daemon that uses TCP sockets for communication. The daemon must be running when the provider starts.
  • Yahoo: The provider accepts two options org.freedesktop.Geoclue.GPSDevice and org.freedesktop.Geoclue.GPSHost.
  • Geonames: This provider uses the Geonames web service to provide geocoding/reverse geocoding service.

The GeoClue project has great potential, because it integrates several very different kinds of Geolocation services. Nowadays, Geolocation capabilities are a must for both handheld devices and the Desktop, but in a global world, and being “mobile people”, no single one of these services is enough by itself to provide location at any time and place.

I think that’s enough for this post. I don’t much like posts with huge amounts of information, which are usually hard to digest. So stay tuned, because I have a lot to say about GeoClue:
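
The point that no single service is enough can be pictured as a fallback chain: try each provider in turn and keep the first answer. The JavaScript below is only a sketch of that idea, not of how GeoClue is implemented; the locate function, the provider objects and their getPosition method are hypothetical.

```javascript
// Try each provider in order and return the first position obtained.
// A provider is a { name, getPosition } object; getPosition returns a
// position, or throws when the underlying service cannot be used.
function locate(providers) {
    const failures = [];
    for (const provider of providers) {
        try {
            const position = provider.getPosition();
            if (position)
                return { provider: provider.name, position, failures };
        } catch (error) {
            failures.push(`${provider.name}: ${error.message}`);
        }
    }
    return { provider: null, position: null, failures };
}

const result = locate([
    { name: "Gypsy", getPosition: () => { throw new Error("no GPS device"); } },
    { name: "Hostip", getPosition: () => ({ latitude: 43.36, longitude: -8.41 }) },
]);
console.log(result.provider, result.failures);
```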

Coming next

  • Technical review of GeoClue.
  • Geolocation: state of the art and business opportunities.
  • GeoClue and Augmented Reality.
  • GeoClue and WebKit.