Improving Data Mobility & Management for International Climate Science

Implementing Cloud Transfer for HPC: Abstracting the Network for Scientific Data Movement

Time 07/15/14 02:10PM-02:30PM

Room GC402

Session Abstract

NOAA has consistently been at the forefront of big data, even before that was a term. NOAA has several archives larger than 50 PB and has made business decisions to build a scientific grid with remotely deployed HPC capability. Single-link interstate data movement of more than 70 TB/day is the norm in our environment, with local data movement being much larger. Being one of the most data-intensive labs, with immense scrutiny for data integrity, means that we experience and notice failures in tools that other user communities either don't see or don't care about. In addition, our scientists are moving towards multi-run, multi-model ensembles, and to enable them to scale their work NOAA has implemented both a data movement tool and a workflow manager that offload responsibility for data movement to automation. The combination of the two has the effect of turning our grid into a cloud of sorts.
This paper describes the motivations and challenges associated with creating the latest version of our data movement tool: generalized copy (gcp). It also covers our development and testing process and how gcp fits in with our certificate infrastructure.


Speaker Chandin Wilson NOAA (National Oceanic & Atmospheric Administration, Washington, D.C.)

Presentation Media

Cloudy with a chance of transfer: Abstracting the Network for Data Movement
