At a Capitol Hill briefing today titled “Big Data, Bigger Opportunities” and hosted by Tech America, the National Science Foundation (NSF) and the National Institutes of Health (NIH) announced $15 million in funding for Big Data research. These awards come nearly six months after the Obama Administration released its substantial R&D initiative in March of this year.
The initiative committed more than $200 million in new funding through six agencies and departments to improve the nation’s “ability to extract knowledge and insights from large and complex collections of digital data.”
Subra Suresh, Director of the National Science Foundation:
“I am delighted to provide such a positive progress report just six months after fellow federal agency heads joined the White House in launching the Big Data Initiative,” said NSF Director Subra Suresh. “By funding the fundamental research to enable new types of collaborations–multi-disciplinary teams and communities–and with the start of an exciting competition, today we are realizing plans to advance the foundational science and engineering of Big Data, fortifying U.S. competitiveness for decades to come.”
Suzanne Iacono, Senior Advisor at the National Science Foundation, announced the eight award recipients, listed below:
BIGDATA: Mid-Scale: DCM: A Formal Foundation for Big Data Management
University of Washington, Dan Suciu
This project explores the foundations of big data management with the ultimate goal of significantly improving the productivity in Big Data analytics by accelerating data exploration. It will develop open source software to express and optimize ad hoc data analytics. The results of this project will make it easier for domain experts to conduct complex data analysis on Big Data and on large computer clusters.
BIGDATA: Mid-Scale: DA: Analytical Approaches to Massive Data Computation with Applications to Genomics
Brown University, Eli Upfal
The goal of this project is to design and test mathematically well-founded algorithmic and statistical techniques for analyzing large-scale, heterogeneous and so-called noisy data. This project is motivated by the challenges in analyzing molecular biology data. The work will be tested on extensive cancer genome data, contributing to better health and new health information technologies, areas of national priority.
BIGDATA: Mid-Scale: DA: Distribution-based Machine Learning for High-dimensional Datasets
Carnegie Mellon University, Aarti Singh
The project aims to develop new statistical and algorithmic approaches to natural generalizations of a class of standard machine learning problems. The resulting novel machine learning approaches are expected to benefit other scientific fields in which data points can be naturally modeled by sets of distributions, such as physics, psychology, economics, epidemiology, medicine and social network analysis.
BIGDATA: Mid-Scale: DA: Collaborative Research: Genomes Galore – Core Techniques, Libraries, and Domain Specific Languages for High-Throughput DNA Sequencing
Iowa State University, Srinivas Aluru
Stanford University, Oyekunle Olukotun
Virginia Polytechnic Institute and State University, Wu-chun Feng
The goal of the project is to develop core techniques and software libraries to enable scalable, efficient, high-performance computing solutions for high-throughput DNA sequencing, also known as next-generation sequencing. The research will be conducted in the context of challenging problems in human genetics and metagenomics, in collaboration with domain specialists.
BIGDATA: Mid-Scale: DA: Collaborative Research: Big Tensor Mining: Theory, Scalable Algorithms and Applications
Carnegie Mellon University, Christos Faloutsos
University of Minnesota, Twin Cities, Nikolaos Sidiropoulos
The objective of this project is to develop theory and algorithms to tackle the complexity of language processing, and to develop methods that approximate how the human brain works in processing language. The research also promises better algorithms for search engines, new approaches to understanding brain activity, and better recommendation systems for retailers.
BIGDATA: Mid-Scale: ESCE: Collaborative Research: Discovery and Social Analytics for Large-Scale Scientific Literature
Rutgers University, Paul Kantor
Cornell University, Thorsten Joachims
Princeton University, David Blei
This project will focus on the problem of bringing massive amounts of data down to the human scale by investigating the individual and social patterns that relate to how text repositories are actually accessed and used. It will improve the accuracy and relevance of complex scientific literature searches.
The event also featured the announcement of a new challenge in Big Data science and engineering. The challenge is an interagency collaboration among the NSF, the National Aeronautics and Space Administration (NASA) and the Department of Energy (DOE). More information on the challenge from the NSF press release:
The competition will be run by the NASA Tournament Lab (NTL), a collaboration between Harvard University and TopCoder, a competitive community of digital creators.
The NTL platform and process allow U.S. government agencies to conduct high-risk/high-reward challenges in an open and transparent environment with predictable cost, measurable outcomes-based results and the potential to move quickly into unanticipated directions and new areas of software technology. Registration is open through Oct. 13, 2012, for the first of four idea generation competitions in the series. Full competition details and registration information are available at the Ideation Challenge Phase website.