Data to Action: Increasing the Use and Value of Earth Science Data and Information

For 20 years, ESIP meetings have brought together the most innovative thinkers and leaders around Earth observation data, thus forming a community dedicated to making Earth observations more discoverable, accessible and useful to researchers, practitioners, policymakers, and the public.

The ESIP Summer Meeting has already taken place, but check out the ESIP Summer Meeting Highlights Webinar: https://youtu.be/vbA8CuQz9Rk.
Wednesday, July 17 • 3:30pm - 5:00pm
Scalable, data-proximate cloud computing for Earth Science research

Data-intensive scientific workflows have reached a pivotal point: traditional local computing resources can no longer meet the storage or computing demands of scientists. In the Earth Sciences, we are facing an explosion of data volumes sourced from models, in-situ observations, and remote sensing platforms. Some agencies are starting to move data to commercial Cloud providers to facilitate access (e.g., NASA on Amazon Web Services). Fully leveraging these opportunities will require new approaches to the way the scientific community handles data access, processing and analysis. In particular, we need to stop downloading data and start uploading algorithms to wherever large archives reside. This session is targeted at researchers who are pioneering such “data-proximate” computing on commercial Cloud infrastructure. We hope to hear current success stories, as well as failures, and identify ways to improve existing workflows.
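As a rough illustration of the "upload the algorithm instead of downloading the data" idea, here is a minimal toy sketch in Python. The `ToyArchive` class, its methods, and the transfer counter are hypothetical stand-ins invented for this example; they do not correspond to any real cloud provider or archive API, and a real workflow would run on actual cloud infrastructure rather than in one process.

```python
from statistics import fmean
from typing import Callable, Iterable, Sequence


class ToyArchive:
    """Hypothetical stand-in for a large cloud-hosted archive."""

    def __init__(self, values: Iterable[float]) -> None:
        self._values = list(values)
        self.values_transferred = 0  # counts values that leave the archive

    def download(self) -> list:
        """Traditional workflow: ship every value to the client."""
        self.values_transferred += len(self._values)
        return list(self._values)

    def run(self, algorithm: Callable[[Sequence[float]], float]) -> float:
        """Data-proximate workflow: run the algorithm next to the data."""
        result = algorithm(self._values)
        self.values_transferred += 1  # only the scalar result leaves
        return result


# A synthetic "archive" of 100,000 values (made-up data, not real observations).
archive = ToyArchive(float(i % 30) for i in range(100_000))

# Download-then-compute: every value crosses the network.
local_mean = fmean(archive.download())
downloaded = archive.values_transferred

# Compute-where-the-data-lives: only the result crosses the network.
archive.values_transferred = 0
remote_mean = archive.run(fmean)
uploaded = archive.values_transferred

print(f"same answer: {local_mean == remote_mean}")
print(f"values moved when downloading: {downloaded}")
print(f"values moved when sending the algorithm: {uploaded}")
```

Both paths give the same mean, but in this toy setup the data-proximate path moves a single value out of the archive rather than the whole dataset.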

  • 3:30 - 3:35 Scott Henderson (eScience Institute): Introduction to the session
    Slides: http://bit.ly/2YhbWnr
  • 3:35 - 3:55 Aimee Barciauskas (Development Seed): The Multi-Mission Algorithm and Analysis Platform (MAAP)
    Slides: https://doi.org/10.6084/m9.figshare.8942108
  • 3:55 - 4:15 Aji John (University of Washington): Analyzing satellite imagery on the Cloud to understand wildflower phenology at Mt Rainier
  • 4:15 - 4:35 Julien Chastang (UCAR/Unidata): Deploying a Unidata JupyterHub on the NSF Jetstream Cloud, Lessons Learned and Challenges Going Forward
    Slides: https://doi.org/10.6084/m9.figshare.8944964
  • 4:35 - 4:55 Rich Signell (USGS): Using the Pangeo ecosystem for model analysis and visualization
    Slides: https://doi.org/10.6084/m9.figshare.9115229
  • 4:55 - 5:00 Wrap-up discussion

Session recording is here.

Session Take-Aways
  1. A current challenge for cloud-based workflows is that datasets from different agencies are in different formats, are hosted in different regions, and often have similar but slightly different access APIs.
  2. Platforms such as MAAP and Pangeo are very promising and exciting. They bring the benefits of scalable computing to datasets stored in the cloud.
  3. The cost model for scalable cloud computing is unclear. It remains an open question how to support platforms into the future and how to regulate user access to cluster resources.


Aji John

University of Washington

Rich Signell

Oceanographer, USGS
I'm an oceanographer, geek and foodie.  You can talk to me about: Ocean Modeling, Python, Pangeo, Zarr, Xarray, HoloViz, Qhub, Cloud, HPC, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS,  Pizza Napoletana. 

Julien Chastang

Software Engineer, UCAR - Unidata
Scientific software developer at UCAR-Unidata.

Scott Henderson

Research Scientist, University of Washington

Aimee Barciauskas

Data engineer, Development Seed

Wednesday July 17, 2019 3:30pm - 5:00pm PDT
Ballrm D, Breakout