New Network Transfer Protocol for Genomic Data
Time 01/14/13 11:40AM-12:00PM
UC Santa Cruz is under contract with NIH and the National Cancer Institute to construct and operate CGHub, a nation-scale library and user portal for genomic data. The contract covers growth of the library to 5 petabytes. The three NCI programs that feed into the library (TCGA, TARGET, and CGCI) currently produce about 10 terabytes of data each month. We will discuss the process that led to the choice of Annai-GT, a receiver-driven file transfer mechanism, for use with the library. Annai-GT uses multiple TCP streams from multiple computers at the library site to parallelize genome downloads. We will review our performance experience with the new transfer mechanism and explain the additions made to the transfer protocol to achieve FISMA security due diligence.
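To make the idea of a receiver-driven, multi-source transfer concrete, here is a minimal sketch. It is not Annai-GT's actual protocol or API: the names `fetch_piece`, `receiver_driven_download`, and `PIECE_SIZE` are hypothetical, the "servers" are in-memory byte strings standing in for computers at the library site, and threads stand in for parallel TCP streams. It shows only the general pattern in which the receiver decides which piece to request from which source and reassembles the pieces in order.

```python
from concurrent.futures import ThreadPoolExecutor

PIECE_SIZE = 4  # bytes per piece (assumption; tiny so the example is easy to follow)


def fetch_piece(server, index):
    """Hypothetical stand-in for one request on one TCP stream.

    `server` is an in-memory replica of the file; a real client would
    issue a network request for the byte range instead.
    """
    start = index * PIECE_SIZE
    return index, server[start:start + PIECE_SIZE]


def receiver_driven_download(servers, total_size, streams_per_server=2):
    """The receiver chooses which piece to pull from which server."""
    n_pieces = (total_size + PIECE_SIZE - 1) // PIECE_SIZE
    pieces = [None] * n_pieces
    workers = len(servers) * streams_per_server
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Round-robin assignment: piece i is requested from server i mod N.
        futures = [
            pool.submit(fetch_piece, servers[i % len(servers)], i)
            for i in range(n_pieces)
        ]
        for future in futures:
            index, data = future.result()
            pieces[index] = data  # place each piece at its original offset
    return b"".join(pieces)


genome = b"ACGTTGCAACGTAGCTAGCTAGGATC"
servers = [genome, genome, genome]  # three replicas of the same file
result = receiver_driven_download(servers, len(genome))
```

Because the receiver schedules the requests, adding sources or streams changes only the assignment step; the reassembly by piece index stays the same.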