Computing Community Consortium Blog

The goal of the Computing Community Consortium (CCC) is to catalyze the computing research community to debate longer range, more audacious research challenges; to build consensus around research visions; to evolve the most promising visions toward clearly defined initiatives; and to work with the funding organizations to move challenges and visions toward funding initiatives. The purpose of this blog is to provide a more immediate, online mechanism for dissemination of visioning concepts and community discussion/debate about them.

The BD2K Guide to the Fundamentals of Data Science Section 2 and Upcoming Events

October 17th, 2016 / in Announcements, resources / by Khari Douglas

Section 2 of The BD2K Guide to the Fundamentals of Data Science online lecture series, titled Data Representation Overview, starts October 28th with an overview from Anita Bandrowski, UCSD. The lecture series, run by the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) program, features experts from around the country presenting on a wide range of topics in data science. This course is an introductory overview that assumes no prior knowledge or understanding of data science. The series began Friday, September 9th and runs once per week all year, from 12 noon to 1 pm ET.

Additionally, registration is now open for the 2016 Open Data Science Symposium: How Open Data and Open Science are Transforming Biomedical Research. The symposium will take place December 1 at the Bethesda North Marriott Conference Center in Bethesda, MD, in conjunction with the 2016 BD2K All Hands Meeting, November 29-30, and is free and open to the public. The symposium will be live cast. Please register here by November 18, 2016.

If you would like to join the meeting, please go to the BD2K Guide web page for the most up-to-date computer or mobile logins. 

This is a joint effort of the BD2K Training Coordinating Center (TCC), the BD2K Centers Coordination Center (BD2KCCC), and the NIH Office of the Associate Director of Data Science. For up-to-date information about the series and to see archived presentations, go to this website.

Tentative Schedule


11/4/16 Databases and data warehouses, Data: structures, types, integrations (Chaitan Baru, NSF)

11/11/16 No lecture: Veterans Day

11/18/16 Social networking data (TBD)

12/2/16 Data wrangling, normalization, preprocessing (Joseph Picone, Temple)

12/9/16 Exploratory Data Analysis (Brian Caffo, Johns Hopkins)

12/16/16 Natural Language Processing (Noemie Elhadad, Columbia)

1/6/17 SECTION 3: COMPUTING OVERVIEW (Patricia Kovatch, Icahn School of Medicine at Mount Sinai)

1/13/17 Workflows/pipelines

1/20/17 Programming and software engineering; API; optimization

1/27/17 Cloud, Parallel, Distributed Computing, and HPC

2/3/17 Commons: lessons learned, current state


2/17/17 Smoothing, Unsupervised Learning/Clustering/Density Estimation

2/24/17 Supervised Learning/prediction/ML, dimensionality reduction

3/3/17 Algorithms, incl. Optimization

3/10/17 Multiple testing, False Discovery rate

3/17/17 Data issues: Bias, Confounding, and Missing data

3/24/17 Causal inference

3/31/17 Data Visualization tools and communication

4/7/17 Modeling Synthesis


4/14/17 Open science

4/21/17 Data sharing (including social obstacles)

4/28/17 Ethical Issues

5/5/17 Extra considerations/limitations for clinical data

5/12/17 Reproducibility

5/19/17 SUMMARY and NIH context

Other Upcoming BD2K Opportunities 
  • BD2K FOA: RFA-CA-16-020 “BD2K Support for Meetings of Data Science Related Organizations (U13).” The purpose of this FOA is to support high quality and impactful conferences or meetings convened by community-based, data science-related organizations that help to carry out critical work related to biomedical data science and are aligned with the goals and Mission Statement of the NIH BD2K program. Applications due December 15, 2016. Permission to submit application letters is required. Applicants are urged to initiate contact well in advance of the chosen application due date and no later than 6 weeks before that date. Please visit the Frequently Asked Questions (FAQ) page for more information on this FOA.
  • BD2K FOA: RFA-LM-17-001 “Big Data to Knowledge (BD2K) Enhancing the Efficiency and Effectiveness of Digital Curation for Biomedical Big Data (U01).” Applications due December 15, 2016. For additional information, contact Valerie Florance.
  • BD2K FOA: RFA-ES-16-011 “BD2K Research Education Curriculum Development: Data Science Overview for Biomedical Scientists (R25).” Applications due December 7, 2016. For additional information, contact the BD2K Training Team.
  • BD2K FOA: RFA-MD-16-002 “NIH Big Data to Knowledge (BD2K) Enhancing Diversity in Biomedical Data Science (R25).” Applications due November 14, 2016. For additional information, contact the BD2K Training Team.
  • NIH call for Public Feedback for DataMed DDI Prototype: Developed through the BD2K biomedical and healthCAre Data Discovery Indexing Ecosystem project (bioCADDIE), the prototype allows users to find and access biomedical datasets from multiple sources based on key attributes. DataMed is an element of the NIH BD2K Commons, the vision for an interconnected digital ecosystem of resources around data and other research digital objects. DataMed is a work in progress and the bioCADDIE development team welcomes your feedback here.