The CCC “Big Data Computing Study Group” helped organize two adjacent events in Sunnyvale in March: the “Hadoop Summit” and the “Data-Intensive Scalable Computing (DISC) Symposium”.

The Hadoop Summit was an open event, hosted by Yahoo! Research. Its goal was to build a community among users of the open-source Hadoop software suite for distributed programming in the map-reduce style. About 350 people attended, a much larger crowd than originally expected. The DISC Symposium was an invitation-only event (~125 attendees) whose goal was to build a community among DISC researchers.

The presentations at the Hadoop Summit were fascinating. While they varied greatly in technical depth, in total they gave a sense of rapid growth in the amount of ingenuity being directed towards solving large-scale data-intensive problems on scalable computing clusters. As one might expect, academic researchers were among the speakers, as well as people from industry research labs at Yahoo!, IBM, and Microsoft. But there were also technical talks by developers at places like Google, Amazon, Rapleaf, Facebook, and Autodesk, each one essentially doing a “show-and-tell” on interesting data-intensive problems being tackled in their companies. This gave the attendees a glimpse into the growing industry interest in Hadoop.

The DISC Symposium had attendees from a broad range of companies and research institutions. By design, the program was broad and shallow — the idea was to bring together researchers from all aspects of DISC. Among the highlights:

  • In the “DISC systems” arena, Randy Bryant laid out a broad range of research challenges, Jeff Dean gave a lightning-fast overview of Google’s tools (clusters, GFS, MapReduce, BigTable, Chubby), and Garth Gibson talked about challenges in large-scale data systems.
  • In the “middleware” arena, ChengXiang Zhai discussed text information management, and Joe Hellerstein promoted declarative programming as a universal elixir.
  • In the “applications” arena, Jill Mesirov described computational paradigms for genomic medicine, Jon Kleinberg talked about algorithms for analyzing large-scale social network data, and Alex Szalay described applications in the physical sciences.
  • Jeannette Wing and Christophe Bisciglia announced NSF’s new program supporting DISC research utilizing a large-scale cluster provided by Google and IBM.

Slides from all presentations at both the Hadoop Summit and the DISC Symposium, as well as videos of most presentations, are available here.

So what can we conclude about all of this? Well, at the Hadoop Summit, the speakers (especially the ones from industry) were not the “usual suspects”, especially considering the fairly hard-core technical nature of research in large-scale distributed systems. However, there is an overwhelming sense that a major wave is starting, and overall the excitement level at the meeting was extremely high.

Regarding the concept of “DISC”, here is our unabashed opinion about all of this: Ubiquitous cheap sensors (in gene sequencers, in telescopes, in buildings, on the sea floor, in the form of point-of-sale terminals or the readable web, etc.) are transforming many fields from data-poor to data-rich. The enormous volume of data makes “automated discovery” (machine learning, data mining, visualization) essential, requiring innovation throughout the stack. The traditional “high performance computing” crowd has missed the boat on this one. (The focus must be on the data.) Web companies such as Google, Yahoo!, and Microsoft have made significant strides. But there remains lots of room — and lots of need — for additional breakthroughs. Bluntly, a university that lacks this “big data” capability is not going to be competitive.

The job of the Computing Community Consortium is to facilitate the computing research community in envisioning, articulating, and pursuing longer-range, more audacious research challenges. “Visioning workshops” such as these are one route that the CCC is pursuing. This was the first CCC-sponsored meeting. While there’s room for improvement (more time for discussion, more younger attendees, …), most participants viewed this workshop as a success — there was a real buzz.

Let us know your thoughts!

– Ed Lazowska and Peter Lee

Comments

One Response to “Big Data Computing Group Kicks Off”

  1. Gagan Agrawal on May 12th, 2008 2:36 pm

    Though the dominant effort from the academic high-performance computing community has been on compute-intensive applications, it is not quite true that data-intensive applications did not receive any attention.

    Systems like ADR and FREERIDE pre-date Google Map-Reduce, but have many similarities with this system. Compiler support for data-intensive systems was also explored around 2000-2002.
