Improving Data Mobility & Management for International Climate Science

GENI and Climate Science

Time 07/16/14 08:30AM-09:00AM

Room GC402

Session Abstract

This session is a combination of two talks:

1. Title: Federated Network Infrastructure-as-a-Service (NIaaS) for Data Management in Climate Science
Abstract: Federated Network Infrastructure-as-a-Service (NIaaS) has the potential to transform the way climate scientists use distributed computational infrastructure. The NSF Global Environment for Network Innovations (GENI) has taken steps toward this goal by providing a suite of infrastructure for at-scale networking experiments on future internets. ExoGENI is one of two computational testbeds to come out of the GENI effort. However, ExoGENI goes beyond the standard GENI goals by considering the requirements of the domain scientists who will use next-generation internets.
ExoGENI is not just another cloud provider. It is an international federated testbed based on the Open Resource Control Architecture (ORCA), which can manage the allocation of any programmable resources (aggregates). Current and planned aggregates include virtual compute nodes (OpenStack, Eucalyptus), bare-metal compute nodes (xCAT), compute cluster resources (SLURM), local networks (OpenFlow, VLANs), long-reach layer-2 networks (ESnet OSCARS - including the 100G testbed, Internet2 ION/AL2S, BEN, and many other regional providers), and storage (iSCSI targets). Users request slices of temporarily dedicated resources, and ORCA controls the allocation and instantiation of available resources for the user.
One class of users whose applications can use current ExoGENI resources at scale is domain scientists who already rely on high-throughput and high-performance computing facilities. These scientists are quickly running into the “big-data” problem, a solution to which will certainly include next-generation “big-network” technologies.
ExoGENI has already been used for computational domain science. Most notably, it has been used to deploy several Pegasus workflows on HTCondor clusters, to run the ADCIRC storm surge and tide model for the North Carolina Forecast System (MPI), and to run Solar Fuels simulations using virtual compute nodes in conjunction with NERSC’s Hopper petaflop system. It has also been used for experiments with Hadoop clusters, distributed filesystems (GlusterFS and Ceph), and the iRODS open-source data management software. This talk will present the current ExoGENI capabilities and uses, as well as future federated NIaaS features that are relevant to the climate science community. Topics include integration of resource allocation with the Pegasus workflow management system and integration of automated infrastructure allocation for distributed data and storage management.
2. Title: GENI, Domain Science, and Distributed Clouds for Climate Science Data and Models
Abstract: Climate science is a field dominated by large-scale real-time data collected by high-bandwidth, high-resolution sensors deployed broadly across the wide area. Effective real-time processing of this data requires a combination of in-situ computation and adaptive, customizable networking that can be reconfigured in real time. In this talk, we will describe the GENI facility, which has been under development by the National Science Foundation for the past five years. GENI can perhaps best be thought of as a distributed cloud with points of presence at over 50 sites across the United States, interconnected by high-bandwidth and (more importantly) highly reconfigurable software-defined networking. Using GENI, the take from a collection of high-bandwidth weather sensors can be processed near the sensor sites; this preprocessing can identify events of significant interest and importance. Takes from sensors indicating urgent events can then be routed over the customizable network to high-performance back-end sites for rigorous processing. We will describe the use of GENI in the CASA experiment and in NowCasting, which produces customized, local, up-to-the-second weather reports: a degree of local precision unattainable with a conventional network. We will also describe the GENI Experiment Engine, a real-time deployment engine for GENI applications and experiments.

Speakers

Speaker Paul Ruth RENCI

Speaker Rick McGeer HP