The Missing Link: A Sustainability Plan
As the NIH National Centers for Biomedical Computing (NCBC) program enters its final years of support, there is an opportunity to reflect on how this program has made a lasting impact on the research community. The NCBCs were launched in response to the BISTI report [1], which called on NIH to make significant investments to train a new community of biomedical computational scientists; develop new methods in computational research; support efforts to make data available and usable; and foster a scalable national computer infrastructure to support biomedical research.
The NCBCs have succeeded brilliantly in their mission by focusing on important biomedical research questions, publishing volumes of high-impact research articles, training scores of new computational scientists, and producing professional-quality open-source software and resources that many research and clinical groups now depend on as an integral part of their research framework. However, the community-based research infrastructure developed by the NCBCs is now in jeopardy as the program winds down.
Now, with a sense of déjà vu, the cycle is starting anew, led by a report from the Data and Informatics Working Group [2]. Its recommendations parallel those of the BISTI report, but with a stronger focus on data management, integration and sharing, aimed at the huge challenge of making better use of the deluge of data generated by biomedical researchers.
In response to the report, NIH has launched the Big Data to Knowledge (BD2K) Initiative and set a high bar for the long-term impact of this program: “A BD2K Center application is expected to propose the development of specific and substantive ‘products’—e.g., approaches, methods, software, tools, and other resources to analyze data—and then distribute the products to the user community to dramatically enhance the research community’s capabilities for using Big Data in biomedical research.” [3]
This bold vision is a tall order, and lessons learned from the NCBC program indicate that new centers will face a significant challenge in ramping up and delivering, given the shortened time frame and reduced funding level of this new program.
The NCBCs have already provided innovative solutions to many Big Data challenges, as will the BD2K Centers in the future. What’s missing from both of these programs is a mechanism to address the major challenge of sustaining the infrastructure, software products and services necessary to support biomedical research communities.
There are no easy answers to the question of who will pay for the support of public access to research data, software and the infrastructure to support it. Fran Berman and Vint Cerf recently laid out several possibilities, including public-private investments, government support for some community data collections and new economic models such as a small fee for downloads of data or software [4].
The NIH has the opportunity to take on this challenge with the BD2K program by tasking the centers to develop sustainability plans for data collections and/or software in the first year of the project. Collectively, the consortium could evaluate the feasibility of these plans over time, with feedback from the community. If successful, the BD2K program could develop a realistic sustainability model for research resources that would be a huge benefit to the biomedical community.
1. The BISTI report: http://www.bisti.nih.gov/library/june_1999_Rpt.asp
2. The Data and Informatics Working Group Report: http://acd.od.nih.gov/Data%20and%20Informatics%20Working%20Group%20Repor...
3. Centers of Excellence in Big Data Computing FOA: http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-13-009.html
4. Berman, F. and V. Cerf (2013) Who will pay for public access to research data? Science vol. 341, pp. 616-617