My first Strata Conference

This year was the first time the Strata Conference came to Europe and, thanks to Igalia, I could be there to evaluate the investment we have been making in BigData technologies.

This trip is part of the roadmap of the Distributed Computing team we created at Igalia with the aim of exploring a field where Open Source is key, and of finding out how our more than ten years of experience as Open Source consultants fit into such a competitive area.

We have lately been collaborating with the company Perceptive Constructs to strengthen our data modelling and machine learning capabilities. Both companies were present at the Strata Conference to showcase our work on distributed and smart components for Social Media stream analysis. We will unveil our achievements in future posts, but first I’ll share my impressions of the conference and of the future of the BigData area.

O’Reilly Strata Conferences

These conferences have usually been US events, held on both coasts (New York, San Francisco and Santa Clara), but this was the first EU edition, so it was very important for us to attend. There is a lot of BigData activity in the UK, and the commitment to Open Data is very strong there, which is allowing many start-ups to grow.

The conference is what you would expect from a big event like this: quite expensive but very well organized and fully oriented to networking and business. Some events were very interesting, like the Start-up Competition, which connected young companies and independent developers with investors and entrepreneurs. The Office Hours gave us the chance to have face-to-face meetings with some of the speakers, which was a great thing in my opinion.

I’ll comment on the talks I considered most relevant, but let me first mention that I think the contents were very well structured, with a good mix of technical and inspiring material. The keynotes were a great warm-up for a conference which, I think, tries to show the social aspects behind the BigData field and how it could help us acquire better knowledge in an era of access to information we have never seen before.

The Talks

First of all, I think it’s worth sharing all the keynote videos, but I would like to comment on the ones I found most remarkable.

Good Data, Good Values

It was a really inspiring talk, describing how BigData can help to make a better world by supporting not-so-big companies and organizations in making sense of the data they are generating. “No need to have big data for getting big insights”.

The Great Railway Caper: Big Data in 1955

The talk was interesting because it explained very well what BigData is and what the actual challenges are:

  • Partitioning.
  • Slow storage media.
  • Algorithms.
  • Lots of storage.

The situation hasn’t changed much since 1955:

  • Not enough RAM to solve the entire problem.
  • The right algorithm doesn’t exist yet.
  • Machines are busy with other work.
  • Secondary storage is slow.
  • Tight deadlines to deal with.

Keynote by Liam Maxwell

The UK Government is really pushing for BigData and is committed to the Open Data initiative. I would like to see the Spanish government continue its efforts to increase transparency and openness regarding public data.

They really want to work with SMEs, avoiding big players and vendor lock-in, which I personally think is the right approach. As was stated many times during the conference:

  • Open Source + Open Data = Democratization of BigData.

BigData in retail

This talk was an excellent example of a domain where BigData fits very well. On-line retail providers manage huge volumes of data from many countries, and statistical models apply very well to consumer habits and depot stock trends.

They basically use MATLAB, so I guess real-time analysis is not crucial. They focus on different angles:

  • Predicting how weather affects sales.
  • Reducing depot stock holding.
  • Improving promotions.
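The speakers didn’t go into their models, but the first angle can be illustrated with a minimal sketch (entirely invented data and variable names, not their actual approach): an ordinary least-squares fit of daily sales against temperature, the kind of simple statistical model that works well on consumer habits.

```python
# Hypothetical sketch: fit y = a + b*x by ordinary least squares,
# where x is mean daily temperature and y is units sold.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Invented toy data: temperature (°C) vs. daily sales of a seasonal item.
temps = [5, 10, 15, 20, 25, 30]
sales = [120, 150, 200, 260, 310, 380]

a, b = fit_line(temps, sales)
predicted = a + b * 28  # expected sales on a 28 °C day
```

In practice a retailer would of course use many more variables (promotions, day of week, region), but the principle of regressing demand on weather is the same.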

Transparency transformed

They have developed a very cool system, a kind of expert system for detecting, classifying and generating new knowledge on different topics: news and media channels, sports, real estate, financial services, … They are now approaching regular companies to analyse their business processes.

  • Scheme: data – facts – angles – structure – narrative.
  • Fully automated process: meta-journalism.

There are some case studies, in financial analysis and on-line education:

  • Generating financial reports from isolated insights.
  • Interpretation of financial charts.
  • Giving advice to students and teachers.
  • Social networks are another example.
  • Challenge of dealing with unstructured data.

The core system is based on AI techniques (expert systems, probably) using pattern-matching rules.

  • They don’t predict yet, but it’s on the roadmap (long term).
  • They don’t expose an API.
  • They don’t do machine learning.
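To make the “data – facts – angles” idea concrete, here is a minimal sketch of a pattern-matching rule engine (all names and thresholds are invented for illustration; this is not their system): each rule is a condition over extracted facts, and matching rules produce narrative angles.

```python
# Hypothetical rule base: (condition over facts, angle it produces).
# Thresholds and fact names are invented for this sketch.
rules = [
    (lambda f: f["revenue_change"] > 0.10, "Revenue grew sharply"),
    (lambda f: f["revenue_change"] < -0.10, "Revenue dropped sharply"),
    (lambda f: f["debt_ratio"] > 0.8, "Company is highly leveraged"),
]

def derive_angles(facts):
    """Apply every rule whose condition matches the given facts."""
    return [angle for condition, angle in rules if condition(facts)]

facts = {"revenue_change": 0.15, "debt_ratio": 0.9}
angles = derive_angles(facts)
# A later stage would order these angles and render them as narrative text.
```

The appeal of this approach over machine learning is that every generated statement can be traced back to an explicit rule, which matters when the output is published as journalism.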