This week, the National Science Foundation issued a solicitation for its new Exploiting Parallelism and Scalability (XPS) program. The program aims to support groundbreaking research leading to a new era of scalable computing. NSF estimates that $15 million in awards will be made in FY 2013 for this program.
As the solicitation notes, the Computing Community Consortium (CCC) furnished a white paper earlier this year titled 21st Century Computer Architecture, through which members of the computing research community contributed strategic thinking in this space. The white paper drew upon a number of earlier efforts, including CCC’s Advancing Computer Architecture Research (ACAR) visioning reports.
Here is a synopsis of the Exploiting Parallelism and Scalability (XPS) program from the National Science Foundation:
Computing systems have undergone a fundamental transformation from the single-processor devices of the turn of the century to today’s ubiquitous, networked devices and warehouse-scale computing via the cloud. Parallelism has become ubiquitous at many levels. The proliferation of multi- and many-core processors, ever-increasing numbers of interconnected high-performance and data-intensive edge devices, and the data centers servicing them is enabling a new set of global applications with large economic and social impact. At the same time, semiconductor technology is facing fundamental physical limits, and single-processor performance has plateaued. This means that the era of predictable performance improvements from improved processor technology alone has ended.
The Exploiting Parallelism and Scalability (XPS) program aims to support groundbreaking research leading to a new era of parallel computing. XPS seeks research re-evaluating, and possibly re-designing, the traditional computer hardware and software stack for today’s heterogeneous parallel and distributed systems, and exploring new holistic approaches to parallelism and scalability. Achieving the needed breakthroughs will require a collaborative effort among researchers representing all areas, from the application layer down to the micro-architecture, and will be built on new concepts and new foundational principles. New approaches to achieving scalable performance and usability need new abstract models and algorithms, programming models and languages, hardware architectures, compilers, operating systems, and run-time systems, and must exploit domain- and application-specific knowledge. Research should also focus on energy and communication efficiency and on enabling the division of effort between edge devices and clouds.
Proposals should address four focus areas:
Foundational principles (FP)
Research on foundational principles should engender a paradigm shift in the ways in which one conceives, develops, analyzes, and uses parallel algorithms, languages, and concurrency. Foundational research should be guided by crucial design principles and constraints impacting these principles. Topics include, but are not limited to:
- New computational models that free the programmer from many low-level details of specific parallel hardware while supporting the expression of properties of a desired computation that allow maximum parallel performance. Models should be simple enough to understand and use, have solid semantic foundations, and guide algorithm design choices for diverse parallel platforms.
- Algorithms and algorithmic paradigms that simultaneously allow reasoning about parallel performance, lead to provable performance guarantees, and allow optimizing for various resources, including energy, memory hierarchy, and communication bandwidth as well as parallel work and running time.
- New programming languages and language mechanisms that support new computational models, raise the level of abstraction, and lower the barrier to entry for parallel and concurrent programming, with programmability, verifiability, and scalable performance as design goals. Of particular interest are languages that move away from the traditional imperative programming model found in most sequential programming languages.
- Compilers and techniques for mapping high-level parallel languages and language mechanisms to efficient low-level, platform-specific code.
- Development of interfaces to express parallelism at a higher level while being able to express and analyze locality, communication, and other parameters that affect performance and scalability.
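To make the flavor of these abstractions concrete, here is a minimal illustrative sketch (not part of the solicitation) of a high-level parallel map in Python: the programmer expresses what to compute, while the runtime decides how work is scheduled across workers.

```python
# Illustrative sketch only: a high-level parallel map that hides worker
# management and scheduling behind a single abstraction.
from concurrent.futures import ThreadPoolExecutor

def parallel_map(fn, data, workers=4):
    """Apply fn to each element of data; the pool decides the schedule."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, data))

# The caller states the computation, not the parallel plumbing.
squares = parallel_map(lambda x: x * x, range(8))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

A thread pool is used here only for portability of the sketch; the same interface could be mapped to processes, GPUs, or clusters, which is exactly the kind of platform independence the FP focus area calls for.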
Cross-layer and Cross-cutting Approaches (CLCCA)
In order to fully exploit the power of current and emerging computer architectures, research is needed that re-evaluates, and possibly re-designs, the traditional computer hardware and software stack (applications, programming languages, compilers, run-time systems, virtual machines, operating systems, and architecture) for today’s heterogeneous parallel systems. A successful approach should be a collaboration that explores new holistic approaches to parallelism and cross-layer design. Topics include, but are not limited to:
- New abstractions, models, and software systems that expose fundamental attributes, such as energy use and communication costs, across all layers and that are portable across different platforms and architectural generations.
- New software and system architectures designed for exploitable locality, parallelism, and communication efficiency to minimize energy use, with on-chip and chip-to-chip communication that achieves low latency, high bandwidth, and power efficiency.
- New methods and metrics for evaluating, verifying and validating reliability, resilience, performance, and scalability of concurrent, parallel, and heterogeneous systems.
- Runtime systems to manage parallelism, memory allocation, synchronization, communication, I/O, and energy usage.
- Extracting general principles that can drive the future generation of computing architectures and tools with a focus on scalability, reliability, robustness, security and verifiability.
- Exploration of tradeoffs addressing an optimized “separation of concerns.” Which problems should be handled by which layers? What information and abstractions must flow between the layers to achieve optimal performance? Which aspects of system design can be automated and what is the optimal use of costly human ingenuity?
Scalable Distributed Architectures (SDA)
Many emerging applications require a rich environment in which sensing and computing devices communicate with each other and with warehouse-scale facilities via the cloud, which in turn process and supply information for edge devices such as smart phones. Research is needed into the components and the programming of such highly parallel and scalable distributed architectures. Topics include, but are not limited to:
- Novel approaches that enable smart sensor design under the constraints of low energy use, tight form factors, tight time constraints, adequate computational capacity, and low cost. Examples include innovative communication modalities and data-specific approximate-computing techniques.
- Runtime platforms and virtualization tools that allow programs to divide effort between and among portable platforms and the cloud while responding dynamically to changes in the reliability and energy efficiency of the cloud uplink. Possible questions to address include: How should computation be distributed between the nodes and cloud infrastructure? How can system architecture help preserve privacy by giving users more control over their data? Should compute engines and memory systems be co-designed?
- Research that enables conventionally trained engineers to program warehouse-scale computers, taking advantage of the highly parallel and distributed environment while remaining resilient to significant rates of component and communication failure. Such research may be based on novel hardware support, programming abstractions, new algorithms, storage systems, middleware, operating systems, and/or virtualization.
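One of the questions the SDA area raises, how computation should be divided between edge nodes and the cloud, can be illustrated with a hypothetical sketch (the cost model and numbers below are invented for illustration, not drawn from the solicitation):

```python
# Hypothetical sketch: decide where a task runs by comparing estimated
# completion times on the device versus the cloud. All rates are made up.
def place_task(work_units, uplink_up, device_rate=1.0, cloud_rate=50.0,
               transfer_cost=2.0):
    """Return 'device' or 'cloud' for a task of the given size."""
    # If the uplink is unreliable or down, only local execution is possible.
    if not uplink_up:
        return "device"
    device_time = work_units / device_rate
    # Offloading pays a fixed transfer cost before the faster cloud runs it.
    cloud_time = transfer_cost + work_units / cloud_rate
    return "cloud" if cloud_time < device_time else "device"

print(place_task(1.0, uplink_up=True))    # 'device' (too small to offload)
print(place_task(500.0, uplink_up=True))  # 'cloud'
```

A real runtime would estimate these quantities dynamically, and would also weigh energy use and privacy, per the questions posed above, but the core decision has this shape.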
Domain-specific Design (DSD)
Research is needed on how to exploit domain- and application-specific knowledge to improve programmability, reliability, and scalable parallel performance. Topics include, but are not limited to:
- Parallel domain-specific languages that provide both high-level programming models for domain experts and high performance across a range of parallel platforms, such as GPUs, SMPs, and clusters.
- Program synthesis tools that generate efficient parallel codes from high-level problem descriptions using domain-specific knowledge. Approaches might include optimizations based on mathematical and/or statistical reasoning, auto-vectorization techniques that exploit domain-specific properties, and auto-tuning techniques.
- Hardware-software co-design for domain-specific applications that pushes performance and energy efficiency while reducing cost, overhead, and inefficiencies.
- Integrated data management paradigms harnessing parallelism and concurrency, encompassing the entire data path from generation to transmission, to storage, use, security, and maintenance, to eventual archiving or destruction.
- Work that generalizes the approach of exploiting domain-specific knowledge, such as tools, frameworks, and libraries that support the development of domain-specific solutions to computational problems and are integrated with domain science.
- Novel approaches suitable for scientific application frameworks addressing domain-specific mapping of parallelism onto a variety of parallel computational models and scales.
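The core idea behind parallel domain-specific languages, that programs are expressed as high-level descriptions which a backend can map to different parallel platforms, can be sketched minimally (this toy embedded DSL is illustrative only, not from the solicitation):

```python
# Hypothetical sketch: a tiny embedded expression DSL. Programs are built
# as trees rather than executed directly, so a backend could map the same
# tree to GPUs, SMPs, or clusters.
class Expr:
    def __add__(self, other): return Op("add", self, other)
    def __mul__(self, other): return Op("mul", self, other)

class Var(Expr):
    def __init__(self, name): self.name = name

class Op(Expr):
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

def evaluate(expr, env):
    # A sequential reference interpreter; a parallel backend would walk
    # the same tree and emit platform-specific code instead.
    if isinstance(expr, Var):
        return env[expr.name]
    ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
    return ops[expr.op](evaluate(expr.left, env), evaluate(expr.right, env))

x, y = Var("x"), Var("y")
program = x * y + x          # the domain expert writes only this line
print(evaluate(program, {"x": 3, "y": 4}))  # 15
```

Because the program is data, domain-specific knowledge (algebraic identities, statistical structure) can be applied as tree rewrites before code generation, which is the kind of optimization the program-synthesis bullet above describes.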
Read complete details, including proposal deadlines, here.
(Contributed by Kenneth Hines, CCC Program Associate)