The Bionimbus Cloud, the Emergence of Petabyte Scale Biomedical Data Commons, and High Performance Data Transport Services
Time 07/17/13 02:05PM-02:20PM
Bionimbus is an open-source, cloud-based computing platform designed for managing, analyzing, and sharing large genomics datasets. It includes a biomedical data commons containing commonly used genomic datasets. It is approximately 1 PB in size today and is expected to reach approximately 2-3 PB by the end of 2013. Researchers who are authorized by NIH to access large genomics datasets from dbGaP, such as The Cancer Genome Atlas (TCGA), can use Bionimbus so that they do not have to download the data to their local institution and set up a secure, compliant infrastructure for their analysis. Bionimbus is part of the Open Science Data Cloud (OSDC), a distributed cloud computing platform operated by the not-for-profit Open Cloud Consortium that supports researchers across a variety of scientific disciplines. The OSDC is approximately 4 PB in size today and is also expected to approximately double in size over the next year. The biomedical community is currently exploring architectures for petabyte-scale clouds, how they will support the research community, and how they might interoperate. In this talk, we discuss our experience with moving and synchronizing large genomics datasets to support Bionimbus and some of the datasets it contains, such as TCGA.
Speaker Robert Grossman, University of Chicago
Secondary tracks Cloud Architecture