Data to Action: Increasing the Use and Value of Earth Science Data and Information

For 20 years, ESIP meetings have brought together the most innovative thinkers and leaders around Earth observation data, forming a community dedicated to making Earth observations more discoverable, accessible, and useful to researchers, practitioners, policymakers, and the public.

The ESIP Summer Meeting has already taken place, but check out the ESIP Summer Meeting Highlights Webinar: https://youtu.be/vbA8CuQz9Rk.


Monday, July 15
 

7:30am

Registration Desk Open
Stop by ahead of the official meeting start on Tuesday morning and pick up your badge! Registration will be located on the 3rd Floor of the Convention Center.

Monday July 15, 2019 7:30am - 5:30pm
TCC

8:00am

DataONE Community Meeting
The DataONE Community Meeting will be a 1-day event featuring plenary presentations, topical breakout sessions, and community-led discussions in addition to an early evening poster reception.

To participate, you must register at https://www.dataone.org/dataone-users-group/2019-meeting.

Monday July 15, 2019 8:00am - 7:30pm
Room 407

8:30am

LTER IM Meeting (CLOSED)
Monday July 15, 2019 8:30am - 5:30pm
Room 318

8:30am

GeoSemantics Symposium
Limited Capacity seats available

The ESIP Semantic Technologies Committee is hosting its annual GeoSemantics Symposium on Monday, July 15th, 2019, 8:30am to 5pm in Tacoma, WA, co-located with the ESIP Summer 2019 Meeting. This year's symposium theme is "Building Harmony between Data Semantics and Machine Learning." The symposium will act as, among other things, a platform for Semantic Technologies and Machine Learning enthusiasts to come together in an interdisciplinary manner.

The symposium aims to investigate and integrate data semantics as a first-class citizen within the pervasive machine learning technology space. We seek broad community input and encourage non-ESIP members and members of other professional societies to attend. Suggestions for topics to be covered are also welcome.

To register, simply add this session to your schedule.

Workshops:
  • Session I: Amazon Web Services - Amazon SageMaker 
  • Session II: Amazon Web Services - Amazon Neptune
  • Session III: ESRI Machine Learning
  • Session IV: Drone Data API Design Workshop

Note: The event itself is free; however, attendees will be responsible for providing their own travel and lodging.

This event is being held in conjunction with the ESIP Summer Meeting - we hope you join in the full week's activities. 

Speakers

Simon Cox

Research Scientist, CSIRO

Simon Handley

iBuild Global, Inc.

Lewis J. McGibbney

Chair, ESIP Semantic Technologies Committee, NASA, JPL
My name is Lewis John McGibbney. I am currently a Data Scientist at the NASA Jet Propulsion Laboratory in Pasadena, California, where I work in Computer Science and Data Intensive Applications. I enjoy floating up and down the tide of technologies @ The Apache Software Foundation having...



Monday July 15, 2019 8:30am - 6:15pm
Room 316

2:00pm

Council of Data Facilities General Assembly Meeting
Monday July 15, 2019 2:00pm - 5:00pm
Room 317

2:00pm

ESIP Board Meeting (CLOSED)
ESIP Board will meet for their quarterly meeting. This meeting is closed. 

Monday July 15, 2019 2:00pm - 6:30pm
Room 315
 
Tuesday, July 16
 

8:00am

Morning Plenary
View live-stream here: ESIP 2019 Summer Meeting - Day 1 Plenary

Welcoming Remarks
Erin Robinson, ESIP Executive Director and Karl Benedict, ESIP President

Toward Interoperable Microbiome Data: Bridging Earth-Systems and Life-Science Semantics
Kai Blumberg, 2019 Raskin Scholar, University of Arizona
Slides: https://doi.org/10.6084/m9.figshare.8966357

Legacy arsenic contamination in freshwater ecosystems: the unique vulnerability of shallow weakly stratified lakes
Dr. Becca Neumann, 2018 Falkenberg Winner, University of Washington

From Data Lakes to Rivers: Improving the Value and Reach of a Seismic Data Archive
Rob Casey, IRIS
Slides: https://doi.org/10.6084/m9.figshare.8945210

View session recording here.

Speakers

Kai Blumberg

PhD Student, University of Arizona
Kai Blumberg is a PhD student in the University of Arizona Biosystems Engineering department. He is working to create a model cyberinfrastructure system called Planet Microbe to integrate and provide analytical tools to analyze key marine 'omics and biogeochemical datasets. As a contributor...

Erin Robinson

Executive Director, ESIP

Rob Casey

Deputy Director of Cyberinfrastructure, IRIS DMC
Rob currently serves as Deputy Director of Cyberinfrastructure at the Incorporated Research Institutions for Seismology (IRIS) Data Management Center (DMC) in Seattle, WA. His responsibilities include management of software development and data services activities as well as leading...

Karl Benedict

ESIP President, ESIP
The ESIP President is a volunteer position, elected by the ESIP Community each year. The President works with the ESIP Staff on several of the presentations, speaker introductions, award ceremonies, and other speaking and participation aspects of ESIP meetings throughout the year.

Becca Neumann

Associate Professor, University of Washington
Dr. Neumann leads the hydro-biogeochemistry research group at the University of Washington, which works to understand how hydrologic, chemical and biological processes interact in soils, aquifers and surface waters to control chemical fate and transport. The group tackles societally...


Tuesday July 16, 2019 8:00am - 9:45am
TCC
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; Austria: +43 7 2081 5427; Belgium: +32 28 93 7018; Canada: +1 (647) 497-9391; Denmark: +45 32 72 03 82; Finland: +358 923 17 0568; France: +33 170 950 594; Germany: +49 692 5736 7317; Ireland: +353 15 360 728; Italy: +39 0 230 57 81 42; Netherlands: +31 207 941 377; New Zealand: +64 9 280 6302; Norway: +47 21 93 37 51; Spain: +34 932 75 2004; Sweden: +46 853 527 827; Switzerland: +41 225 4599 78; United Kingdom: +44 330 221 0088
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

9:45am

Break
Tuesday July 16, 2019 9:45am - 10:15am
TCC

10:15am

Cloud Security and Compliance in Public Sector Archives
Increasing user adoption of, and applications for, cloud technologies, as well as exponential growth in data volumes, demand that our public sector data archives accommodate cloud computing. Simultaneously, government-funded computing environments have constraints that present unique challenges in providing archives in the cloud, including Trusted Internet Connection mandates, funding models and legislation that do not allow unbounded costs, and security policies inherited from a pre-cloud world. Join us to discuss the progress members of the ESIP community have made in overcoming these hurdles toward moving large public sector archives to the cloud for valuable science applications.

View Full Recording on YouTube

Presenter: Nathan Clark
Talk Title: Cost Controls in the Cloud
Slides: https://doi.org/10.6084/m9.figshare.8938514

Presenter: Ben Williams
Talk Title: Cloud Governance at Scale
Slides: https://doi.org/10.6084/m9.figshare.8939981

Moderators

Patrick Quinn

Software Engineer, NASA / EED-2 Element 84

Speakers

Ben Williams

Cloud Operations

Nathan Clark

Software Engineer, EED2 Program



Tuesday July 16, 2019 10:15am - 11:45am
Ballrm A

10:15am

Big Gridded Data: The transition from legacy to next generation
This session explores several dimensions of the technology and operational systems that support archiving, cataloging, distributing, subsetting, and processing large structured data. For this session, large structured data is defined as any data with well-structured spatial, temporal, band, scenario, and ensemble dimensions and associated variables that exceeds the practical size constraints of commodity internet and personal computing resources. Typical examples are very high-resolution geospatial grids; outputs from ocean, landscape, weather, and climate models; and multi-spectral remote sensing archives. Use cases for such data range from meta-analyses and reanalyses that require run-time access to entire datasets at once to ad hoc investigations requiring small subsets of one or more dimensions. For example, a local science project may need a small spatial subset of an ensemble climate projection, or a remote sensing research project may need to sample 100 point locations from all scenes of a multispectral remote sensing product. Supporting this range of use cases, from access to terabytes or more of data to custom small-subset extraction, is a great challenge for data formats and computing infrastructure, especially as technology changes and what was once a sound implementation and investment becomes dated and unable to meet modern expectations.

This session will feature speakers who manage operations and maintenance of archives of large structured data, build software and standards designed to meet the needs of a wide range of large structured data use cases, and researchers working to evaluate and demonstrate the potential of next generation technical solutions.

View Slides: https://doi.org/10.6084/m9.figshare.8939780

View Session Recording on YouTube.

Speakers

Jay Su

Sr. Scientific Software Developer, NASA GES DISC/ADNET
Working at NASA Goddard bridging science and engineering.

Rich Signell

Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS

Ed Armstrong

Technologist, NASA JPL


Tuesday July 16, 2019 10:15am - 11:45am
Ballrm BC
  • Area Big Data, Subsetting, Archiving
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; Austria: +43 7 2081 5427; Belgium: +32 28 93 7018; Canada: +1 (647) 497-9391; Denmark: +45 32 72 03 82; Finland: +358 923 17 0568; France: +33 170 950 594; Germany: +49 692 5736 7317; Ireland: +353 15 360 728; Italy: +39 0 230 57 81 42; Netherlands: +31 207 941 377; New Zealand: +64 9 280 6302; Norway: +47 21 93 37 51; Spain: +34 932 75 2004; Sweden: +46 853 527 827; Switzerland: +41 225 4599 78; United Kingdom: +44 330 221 0088
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

10:15am

Cloud 101: How Do I Get Started In Cloud Computing Workshop
This workshop is structured to provide Earth scientists and practitioners with an authentic experience in making use of current cloud computing resources and related tools and machine learning services available. Participants should bring their own computers and plan on working through a use case and complete some data analysis on the cloud.

10:15 Introduction
  • Why and when should we use the cloud? 
  • Who is / are AWS? 
  • How do we use the cloud?
10:35  Storing data in the cloud
  • What are the three primary ways of talking to the cloud? 
  • What are the main activities supported by cloud consoles?
11:05  Doing computations in the cloud
  • What can we use a cloud machine for?

View Session Recording on YouTube.

Session Take-Aways
  1. Cloud computing is very powerful, yet complicated to set up even for people who are familiar with the tools. The workshop provides insight into the world of cloud computing, with lots of good concepts and vocabulary.
  2. We need to consider security issues when setting up VMs, particularly when dealing with controlled networks like those at government institutions.
  3. Amazon Machine Images (AMIs) can be a useful tool for reproducing instances in AWS. An AMI captures a snapshot of an instance at a point in time so that its parameters can be replicated.






Speakers

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems

Mike Little

AIST Program Manager, NASA


Tuesday July 16, 2019 10:15am - 11:45am
Ballrm D

10:15am

Drone Data API Design Hackathon
Drones are a valuable new platform for collecting data. We have the technology to make these data ubiquitously FAIR. We should build that data infrastructure; come help design it.

The Sloan Foundation-funded Drone Data API project aims to build a standards-based foundational tool stack that provides a linked-data-native, open-source-native, and networked-native foundation for domains to leverage in building the specific tools they require for efficient data capture with drones. These APIs will therefore leverage OGC, W3C, and engineering standards, along with GIS, library, and scientific domain best practices.

Participants are encouraged to join the GeoSemantics Symposium on Monday and then spend Tuesday on a design hackathon to workshop the provisional high-level API design into a concrete design plan.

See the full agenda at https://github.com/opengeospatial/LANDRS/blob/master/DesignDocs/DesignHack1/Agenda.md.

Questions? Contact jwyngaar@nd.edu.

Presentation: USGS Unmanned Aircraft Systems Technology Stack
Presenter: Joe Adams, USGS
Slides: https://doi.org/10.6084/m9.figshare.8939939

Presentation: CyVerse and Drone Data: An Open Source Pipeline for Agricultural Drone Data
Presenter: Christophe Schnaufer
Slides: https://doi.org/10.6084/m9.figshare.8940158

Session recording here.

Speakers

Jane Wyngaard

University of Notre Dame


Tuesday July 16, 2019 10:15am - 11:45am
Room 315

10:15am

FAIR Metadata
The FAIR principles provide high-level guidance for making data findable, accessible, interoperable, and reusable. Some of these principles describe repository characteristics and practices while others describe data and metadata characteristics. The metadata characteristics are described in very broad terms like “rich metadata”, “a plurality of accurate and relevant attributes”, and “detailed provenance”.

Data providers in the ESIP community use many metadata dialects to serve many disciplines. Implementing the FAIR Principles in this community requires understanding specific metadata practices and elements that support these broad disciplines.

The goal of this session is to make these metadata recommendations more specific and achieve community consensus on these recommendations.

We will break into groups to discuss metadata elements and checks that we are proposing to support FAIR Principles. Starting points are available for comment as a set of issues at:
F - https://github.com/NCEAS/metadig-checks/labels/Findable
A - https://github.com/NCEAS/metadig-checks/labels/Accessible
I - https://github.com/NCEAS/metadig-checks/labels/Interoperable and
R - https://github.com/NCEAS/metadig-checks/labels/Reusable

Presenter: Ted Habermann
Talk Title: Measuring the FAIR Principles
Slides: https://doi.org/10.6084/m9.figshare.9252914

View the Recording on YouTube

Session Take-Aways
  1. ESIP Members are interested in measuring and improving the FAIRness of their metadata and support the development of an ESIP recommendation for FAIR metadata content.
  2. A discussion of potential FAIR recommendations is ongoing. Please join that discussion here.


Speakers

Matt Jones

Director, DataONE Program, DataONE, UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Scientific Synthesis

Margaret O'Brien

Data Manager, University of California, Santa Barbara

Ted Habermann

Owner, Metadata Game Changers
I am interested in all facets of metadata needed to discover, access, use, and understand data of any kind. Also evaluation and improvement of metadata collections, translation proofing. Ask me about the Metadata Game.


Tuesday July 16, 2019 10:15am - 11:45am
Room 316

10:15am

Conveying Information Quality – Recent Progress
The Information Quality Cluster (IQC) has been active since 2014 improving understanding of various aspects of information quality and fostering collaborations nationally and internationally. During this period, NASA’s Earth Science Data System Working Groups included a Data Quality Working Group, which made several recommendations that have been documented, reviewed thoroughly and published. The IQC has had plenary and breakout sessions discussing ideas about uncertainty in Earth science datasets, which have evolved into a white paper. Significant progress has been made in defining and propagating maturity matrices for various aspects of data management including information quality. The purpose of this session is to summarize the status and accomplishments in each of these areas and discuss future directions that the IQC should take.

Agenda
1. Information Quality Cluster Introduction - H. K. Ramapriyan (Rama) - 10 mins
2. NASA Data Quality Working Group’s Recommendations and Publications – Yaxing Wei – 20 mins.
3. Uncertainty White Paper Status – David Moroni – 15 mins.
4. Data Quality @Open Geospatial Consortium – Ivana Ivanova - 15 mins.
5. Maturity Matrices Update – Ge Peng – 15 mins.
6. Discussion and Key Takeaways – All – 15 mins.

View All Slides: https://doi.org/10.6084/m9.figshare.9336248

Meeting Notes: http://bit.ly/IQC_20190716_Notes  (for group editing)

View Session Recording on YouTube.


Session Take-Aways
  1. Considerable progress has been made in several areas over the last 5 years. Progress was reported by NASA’s Data Quality Working Group, NOAA’s Data Stewardship Maturity Matrices, and OGC’s Data Quality Domain Working Group. Collaborations have developed during the last two years, with connections established with other clusters and non-US/international groups including E2SIP.
  2. Members and observers of the IQC are looking to the IQC chairs/co-chairs for new opportunities to engage the community outside of ESIP at external conferences, namely AGU. Community engagement efforts to seek outside collaboration at IQC-organized Fall AGU sessions have been quite successful, particularly in recruiting presenters for invited talks in monthly telecons and collaborators for the uncertainty white paper.
  3. Progress is being made on development of a white paper on Earth science data uncertainty. IQC should consider airborne and in situ data as well - some new use cases are needed.


Speakers

Hampapuram Ramapriyan

Research Scientist/SME, Science Systems and Applications, Inc.
Information Quality, Data Stewardship, Provenance, Preservation Standards

Ge Peng

Research Scholar, CICS-NC/NCEI
Dataset-centric scientific data stewardship, data quality management


Tuesday July 16, 2019 10:15am - 11:45am
Room 318

11:45am

Lunch
Tuesday July 16, 2019 11:45am - 12:45pm
Exhibit Hall B (4th Flr)

12:00pm

ESIP 101 & New to ESIP Intro
Are you new to ESIP? Join us for a quick primer on ESIP. 

Tuesday July 16, 2019 12:00pm - 12:30pm
Ballrm A

12:45pm

Toward Better Earth Science UX
The “order and download” paradigm is dying. NASA and other organizations are moving their data holdings to the cloud, and future missions will be producing so much data (petabytes per year in some cases) that the old way of viewing, subsetting, and analyzing this information needs to adapt. As this data grows in size and complexity, it demands more usable, accessible, and thoughtful designs and user interfaces that support science and help researchers answer important questions. This session will focus on how we’re developing better user interfaces that utilize remote sensing data, especially in a cloud environment, and on the role user experience plays in the search, discovery, and analysis of Earth science data.

View Full Recording on YouTube

Presenter: Mark Reese
Title: Earthdata Search UX Lessons Learned
Abstract: Crafting a great user experience is hard. Crafting a great user experience for Earth science applications is fraught with challenges. From the variability in metadata to the experience profile of various users, the possible permutations of use cases introduce layer upon layer of complexity that must be designed against. In this session, the Earthdata Search team would like to highlight lessons learned over the lifespan of the application: the good, the bad, and the ugly.
Slides: https://doi.org/10.6084/m9.figshare.8938244

Presenter: Grega Milcinski
Talk: Sentinel Hub Apps - Designing UI on EO API
Abstract: Sentinel Hub was one of the first truly interactive web services providing insight into a global archive of EO data. The API needed some applications to demonstrate its power, so we ended up coding these as well, starting with Postcards from the Space, Sentinel Playground and eventually EO Browser, each taking the experience one level further. We will walk through the process of designing each of these and share some ideas for the future.
Slides: https://doi.org/10.6084/m9.figshare.9121997

Presenter: Aimee Barciauskas
Title: How Dynamic Tiling meets OGC Standards
Abstract: For the Multi-Mission Algorithm and Analysis Platform (MAAP), ESA and NASA have adopted the OGC standards for data access, data processing, and data visualization to enable the sharing of ESA and NASA datasets. The data visualization component of the platform must be able to visualize both NASA and ESA archives, which presents challenges for complying with the OGC WMTS and WMS standards while NASA leverages a dynamic tiling backend.
Slides: https://doi.org/10.6084/m9.figshare.8939579

Presenter: Tyler Stevens
Title: Evolving UMM-Var To Improve How Users Can Access NASA EOSDIS Data Sets
Abstract: The UMM-Variables (Var) Metadata Model has been evolved to support an End-to-End Services (E2E) capability, which enables variable level subsetting, data transformation, and data reformatting. This talk will discuss what is new with the model, how users can get their metadata ready for the E2E capability, and include a demo of how the model is being used to drive and improve the user experience in Earthdata Search when accessing EOSDIS data sets.
Slides: https://doi.org/10.6084/m9.figshare.9108131

Speakers

Tyler Stevens

Senior Discipline Engineer, KBR/NASA EED-2

Mark Reese

Senior Project Manager, NASA/EED-2, Element 84

Jeff Siarto

EED Design Lead, Element 84

Drew Bollinger

Developer, Development Seed
Drew is a developer and data analyst at Development Seed. He has rich experience running advanced analysis and machine learning algorithms on large geospatial data sets. He is passionate about using powerful analysis and visualization techniques to promote social change. He is a firm...

Aimee Barciauskas

Development Seed

Grega Milcinski

CEO and Co-founder, Sinergise
Sentinel Hub and general availability of EO data in the clouds



Tuesday July 16, 2019 12:45pm - 2:15pm
Ballrm BC
  • Area ux, usability, interface design
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; Austria: +43 7 2081 5427; Belgium: +32 28 93 7018; Canada: +1 (647) 497-9391; Denmark: +45 32 72 03 82; Finland: +358 923 17 0568; France: +33 170 950 594; Germany: +49 692 5736 7317; Ireland: +353 15 360 728; Italy: +39 0 230 57 81 42; Netherlands: +31 207 941 377; New Zealand: +64 9 280 6302; Norway: +47 21 93 37 51; Spain: +34 932 75 2004; Sweden: +46 853 527 827; Switzerland: +41 225 4599 78; United Kingdom: +44 330 221 0088
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

12:45pm

Using Pangeo JupyterHubs to work with large public datasets
Bring your laptop to this hands-on workshop! Participants will learn about the open-source scientific Python ecosystem for analytic workflows with big data in Earth Science. Pangeo is first and foremost a community promoting open, reproducible, and scalable science (read more at https://pangeo.io). This community provides documentation, develops and maintains software, and deploys computing infrastructure to make scientific research and programming easier. The Pangeo software ecosystem involves open source tools such as xarray, iris, dask, jupyter, and many other packages. In this brief workshop, participants will familiarize themselves with writing code in Jupyter Notebooks that can be run on scalable computing clusters in the cloud, bypassing the common bottleneck of downloading ever-increasing volumes of remote sensing or modeling data. We will introduce key Python tools and have participants write simple code to work with large public datasets hosted on Amazon Web Services and Google Cloud.

Agenda
12:45 - 12:55 Quick introduction to Pangeo (http://bit.ly/esip-slides)
12:55 - 1:25 Introductory notebooks for jupyter, xarray, dask on Google Binder
1:25 - 1:45 Landsat-8 demo on AWS Binder
1:45 - 2:15 Time for participant experimentation and questions

View Session Recording on YouTube.

Speakers

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems

Scott Henderson

Research Scientist, University of Washington



Tuesday July 16, 2019 12:45pm - 2:15pm
Ballrm D

12:45pm

Drone Data API Design Hackathon
Drones are a valuable new platform for collecting data. We have the technology to make these data ubiquitously FAIR. We should build that data infrastructure; come help design it.

The Sloan Foundation-funded Drone Data API project aims to build a standards-based foundational tool stack that provides a linked-data-native, open-source-native, and networked-native foundation for domains to leverage in building the specific tools they require for efficient data capture with drones. These APIs will therefore leverage OGC, W3C, and engineering standards, along with GIS, library, and scientific domain best practices.

Participants are encouraged to join the GeoSemantics Symposium on Monday and then spend Tuesday on a design hackathon to workshop the provisional high-level API design into a concrete design plan.

See the full agenda at https://github.com/opengeospatial/LANDRS/blob/master/DesignDocs/DesignHack1/Agenda.md.

Questions? Contact jwyngaar@nd.edu.

Session recording here.

Speakers

Jane Wyngaard

University of Notre Dame


Tuesday July 16, 2019 12:45pm - 2:15pm
Room 315

12:45pm

Metadata Evaluation - Tools and Results
ESIP community members are actively working throughout the data life cycle from data management planning to collection and creation to archiving, discovery, and data reuse. They use many metadata dialects to address multiple data use cases and are exposed to metadata requirements and recommendations from many organizations, disciplines, and communities. Using these recommendations to guide metadata improvement requires being able to evaluate existing metadata collections with respect to these recommendations. These evaluations can take many forms and serve many purposes.
We will discuss the role of repositories in these evaluations beginning with insights from three repositories:
  1. DataONE/Arctic Data Center (Matt Jones)
  2. EDI/LTER (Margaret O’Brien)
  3. Dryad (Ted Habermann & Daniella Lowenberg)
with a focus on three questions:
  1. How can repositories/networks use metadata evaluation in curation of data and metadata?
  2. What supporting tools and infrastructure exist at repositories for evaluating metadata?
  3. How can metadata evaluation help motivate and measure evolution of metadata and dialects?
We plan to have plenty of time for discussion after these presentations.

View the Recording on YouTube

Presenter: Margaret O'Brien
Talk Title: Environmental Data Initiative/Long Term Ecological Research Network EML Congruence Checker
Slides: https://doi.org/10.6084/m9.figshare.9162197

Presenter: Daniella Lowenberg
Talk Title: Metadata and Dialect Evolution - Affiliations in the Dryad Data Repository
Slides: https://doi.org/10.6084/m9.figshare.9252824

Session recording here.

Speakers

Matt Jones

Director, DataONE Program, DataONE, UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Scientific Synthesis

Margaret O'Brien

Data Manager, University of California, Santa Barbara

Ted Habermann

Owner, Metadata Game Changers
I am interested in all facets of metadata needed to discover, access, use, and understand data of any kind. Also evaluation and improvement of metadata collections, translation proofing. Ask me about the Metadata Game.


Tuesday July 16, 2019 12:45pm - 2:15pm
Room 316

12:45pm

Data Product Developers' Guide Workshop
The Data Product Developers’ Guide Working Group within NASA’s Earth Science Data Systems Working Groups has been developing a guide to assist science data product developers in designing and producing products that are interoperable and conveniently usable by the community. A draft version of this document is expected to be available for broad review in early July 2019. While the initial target audience for this document is the NASA teams responsible for product generation, it is expected to be more broadly applicable. The purpose of this workshop session is to briefly present the contents of this document to interested ESIP members, promote broader participation in the review process, and facilitate improvements for the benefit of end user communities. During this session, the attendees will be divided into subgroups to review individual sections of the document and provide comments.

Presenters: Hampapuram Ramapriyan, Peter J.T. Leonard, Chris Lynnes

Agenda:
Session introduction - Ramapriyan - 3 mins.
Data Product Developers’ Guide - Motivation - Lynnes - 7 mins.
Summary of DPDG Content - Leonard - 15 mins.
Sign up for subgroups - 5 mins.
Breakout to review sections - 45 mins.
Feedback to full group by subgroups - 15 mins.

View Session Recording on YouTube.

Session Take-Aways
  1. Most attendees felt that the DPDG would be a useful contribution to help data producers make usable data products.
  2. Over 80 comments were collected during the session from four subgroups reviewing different sections of the document.
  3. Some of the key comments and suggestions included:
    1. Retitle the document to include “Data Producers” in the title. 
    2. Provide a paragraph upfront to tie sections to particular goals of particular readers
    3. Provide guidance on how to facilitate user feedback on data product usability and quality
    4. Rework discussion on chunk size for data
    5. Note how the Common Metadata Repository can make good use of certain types of metadata to make data more Findable


Speakers

Chris Lynnes

EOSDIS System Architect for Data Use, NASA

Hampapuram Ramapriyan

Research Scientist/SME, Science Systems and Applications, Inc.
Information Quality, Data Stewardship, Provenance, Preservation Standards


Tuesday July 16, 2019 12:45pm - 2:15pm
Room 318

2:15pm

Break
Tuesday July 16, 2019 2:15pm - 2:45pm
TCC

2:45pm

Cloud Data Optimization: Emerging Best Practices I
When data is shared in the cloud, anyone can analyze it without having to download it or store it themselves, which lowers the cost of new product development, reduces the time to scientific discovery, and can accelerate innovation. However, staging large-scale datasets for analysis in the cloud requires consideration of how data should be prepared and organized to allow fast, efficient, and programmatic access from distributed computing systems. This workshop provides a forum for members of the community to share lessons learned as they explore ways to use the cloud to expand data access. It seeks to encourage dialog between users interested in leveraging data in the AWS Cloud for research and application development for Earth Sciences.

View Session Recording

Workshop Format:
The workshop includes 1.5 hours of presentations (Cloud Data Optimization: Emerging Best Practices I), followed by 1.5 hours of discussion (Cloud Data Optimization: Emerging Best Practices II) on emerging best practices and on identifying what is needed to move this space forward.

Presentations (10 minutes each)
Full Abstracts can be found in the attached file.
  1. Title: STAC, sat-utils, and Open Data - Prioritizing Data Use (10 min)
    Presenter: Dan Pilone (Element 84)
  2. Title: Radiant ML Hub, A cloud based commons for geospatial training datasets (10 min)
    Presenter: Hamed Alemohammad (Radiant Earth Foundation)
  3. Title: One data format pattern to rule them all (10 min)
    Presenter: Grega Milcinski (Sinergise)
    Slides: https://doi.org/10.6084/m9.figshare.9121991
  4. Title: Improved Cloud Raster Format for multidimensional raster storage and analysis (10 min)
    Presenters: Hong Xu (Esri) & Sudhir Raj Shrestha (Esri)
  5. Title: Optimization of CESM LENS on AWS S3 (10 min)
    Presenter: Jeff de La Beaujardiere (NCAR)
    Slides: https://doi.org/10.6084/m9.figshare.9633314
  6. Title: The Zarr format
    Presenter: Rich Signell (USGS)
  7. Title: NOAA’s Big Data Project - A Data Broker’s Perspective
    Presenter: Otis Brown (NC State University/NCICS)
  8. Title: HDF Data Service for the Cloud
    Presenter: John Readey (The HDF Group)

Speakers

Jeff de La Beaujardiere

Director, Information Systems Division, NCAR/CISL
Big data, cloud computing, object storage, data management.

Dan Pilone

Chief Technologist, Element 84, Inc.

Rich Signell

Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS

Hamed Alemohammad

Chief Data Scientist, Radiant Earth Foundation

John Readey

The HDF Group

Sudhir R Shrestha

Solution Engineer Researcher, Esri
Solution Engineer and Scientific Data enthusiast with keen interest in making data easily Discoverable and Interoperable. Passionate about geospatially driven Hydrological Modeling and Heuristic Soil Modeling and develop, implement new and innovative geospatial methods, techniques...

Grega Milcinski

CEO and Co-founder, Sinergise
Sentinel Hub and general availability of EO data in the clouds



Tuesday July 16, 2019 2:45pm - 4:15pm
Ballrm A

2:45pm

Epic Fails in Earth Science Informatics: learning from the past to do better in the future
In research we tend to present and publish only our successes, because they are so integral to our career progression. Yet not everything we attempt is successful: no matter how hard we try, some of our research and development fails. Cyberinfrastructure projects carry a high risk of failure because technology changes rapidly and unpredictably while research culture changes slowly. Edwards et al. (2007) emphasized the value of honestly reporting failures “to supporting long-term and comparative learning across the varieties of cyberinfrastructural experience” and recommended that “through the disciplined and even-handed study of failure, funders and proponents of cyberinfrastructure must learn to stop hiding the bodies”. New trends in biochemical research and publishing show increased attention to sharing of negative results from early clinical trials (Kevin Kelly, “Speculations on the Future of Science”).

The purpose of this session is to provide a free and blameless environment that encourages honest reporting of where things went wrong. It is time to bring the skeletons out of the closet and showcase Epic Fails that you know about (particularly your own) in software, data infrastructures, samples, software delivery, services, etc. From these, we can build a portfolio of lessons learned that will inform the future and ultimately help accelerate progress in Earth science informatics. (Note: for those who may find presenting in this session stressful, we will ensure a supportive environment where you can reveal your fails without having to show your face.)

View the Recording on YouTube

Moderators

Kerstin Lehnert

Lamont-Doherty Earth Observatory, Columbia University
Kerstin Lehnert is Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the NSF-funded data facility IEDA (Interdisciplinary Earth Data Alliance). Kerstin holds a Ph.D in Petrology from the University of Freiburg in Germany. Over...

Tuesday July 16, 2019 2:45pm - 4:15pm
Ballrm BC
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

2:45pm

Hands on with Jetstream Atmosphere Part I
Hands on with the Atmosphere GUI on the Jetstream cloud

This tutorial will first give an overview of Jetstream, the National Science Foundation's first production research and education cloud, and various aspects of the system. Then we will take attendees through the basics of using Jetstream via the Atmosphere web interface. This will include a guided walk-through of the interface itself, the features provided, the image catalog, launching and using virtual machines on Jetstream, using volume-based storage, and best practices.

We are targeting users of every experience level. Atmosphere is well-suited to both HPC novices and advanced users. This tutorial is aimed mainly at those who are unfamiliar with cloud computing and who typically do their computation on laptops or departmental server resources. While we will not cover advanced topics in this particular tutorial, we will touch on the available advanced capabilities during the initial overview.

Attendees will need to bring a laptop with a modern web browser (Firefox, Chrome, or Safari).
----

Jetstream is a user-friendly cloud computing environment for researchers based on Atmosphere and OpenStack. It is designed to provide configurable cyberinfrastructure that gives researchers access to interactive computing and data analysis resources on demand, whenever and wherever they want to analyze their data. For a more in-depth description please see the System Overview - http://wiki.jetstream-cloud.org/System+Overview

Session recording is here.

Speakers

Jeremy Fischer

Manager, Jetstream Cloud, Jetstream - Indiana University
Cloud computing for research and education!


Tuesday July 16, 2019 2:45pm - 4:15pm
Ballrm D

2:45pm

Drone Data API Design Hackathon
Drones are a valuable new platform for collecting data. We have the technology to make these data ubiquitously FAIR. We should build that data infrastructure; come help design it.

The Sloan Foundation-funded Drone Data API project aims to build a standards-based foundational tool stack that provides a linked-data-, open-source-, and network-native foundation on which domains can build the specific tools they require for efficient data capture with drones. These APIs will therefore leverage OGC, W3C, and engineering standards, along with GIS, library, and scientific-domain best practices.

Participants are encouraged to join the GeoSemantics Symposium on Monday and then spend Tuesday on a design hackathon to workshop the provisional high-level API design towards a concrete design plan.

See the full agenda at https://github.com/opengeospatial/LANDRS/blob/master/DesignDocs/DesignHack1/Agenda.md.

Questions? Contact jwyngaar@nd.edu.

Session recording here.

Speakers

Jane Wyngaard

University of Notre Dame



Tuesday July 16, 2019 2:45pm - 4:15pm
Room 315

2:45pm

Metadata Improvement Lab 4: How FAIR is your metadata?
In the fourth installment of the Metadata Improvement Lab, participants will use Python, XSL, and Jupyter Notebooks to determine whether metadata collections contain the concepts needed to be FAIR. Participants can use their own metadata, regardless of standard, or choose from many sample collections from ESIP member organizations. Participants can load as many metadata collections as they would like to compare.

No coding experience is needed, though a basic understanding of XML will be helpful. Step-by-step setup instructions will be given for Google Colaboratory, a Jupyter-based, web-accessible computational environment. Participants need only a Google account and a connected web browser to access and run the repository, which will allow them to create a shape visualization that describes the FAIRness of their metadata. No changes will be made to the device or account used. Participants may also import the workshop repository into their own Jupyter environment.

Since there are many ideas of what it means to be FAIR, this workshop will allow participants to work together or on their own to create a recommendation, using Google Docs to facilitate collaboration. During the workshop we will discuss a draft of what FAIR means for EML-producing member nodes, compiled during a workshop at DataONE this March. The Documentation Cluster has built many wiki pages containing recommendations and the XPaths needed in many popular metadata standards; these will aid in creating a FAIR recommendation that works for the many standards used throughout ESIP’s member organizations.

The recommendation will then be applied to the collections that participants have chosen to analyze. The workshop framework is highly portable and reusable, even including the generation of the raw data needed to evaluate the content of the metadata, though only the structure of documents will be utilized in this workshop. A report on the outcomes of the analysis will be created as a sharable Google Sheet. The report generated allows for comparison of collections, so that improvement can be measured, documented and visualized.
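As a rough illustration of the structure-only check described above, the sketch below counts how many records in a collection contain an element for each recommended concept. It is not the workshop's notebook: the concept names, the paths in `RECOMMENDATION`, and the simplified, namespace-free EML-like sample records are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical recommendation: concept name -> path of the element that
# carries it. Real EML documents are namespaced; these paths are simplified.
RECOMMENDATION = {
    "Title": "./dataset/title",
    "Abstract": "./dataset/abstract",
    "Keyword": "./dataset/keywordSet/keyword",
    "License": "./dataset/intellectualRights",
}

# Two made-up records standing in for a metadata collection.
SAMPLE_COLLECTION = [
    "<eml><dataset><title>A</title><abstract>x</abstract>"
    "<keywordSet><keyword>reef</keyword></keywordSet></dataset></eml>",
    "<eml><dataset><title>B</title></dataset></eml>",
]


def concept_coverage(records, recommendation):
    """Return, per concept, the fraction of records containing its element.

    Only document structure is inspected (is the element present?),
    not the quality of the content inside it.
    """
    roots = [ET.fromstring(record) for record in records]
    coverage = {}
    for concept, path in recommendation.items():
        hits = sum(1 for root in roots if root.find(path) is not None)
        coverage[concept] = hits / len(roots) if roots else 0.0
    return coverage


if __name__ == "__main__":
    for concept, fraction in concept_coverage(SAMPLE_COLLECTION, RECOMMENDATION).items():
        print(f"{concept}: {fraction:.0%}")
```

Running the same function over two collections, or over one collection before and after curation, gives per-concept numbers that can be compared side by side.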

Presenter: Sean Gordon
Talk Title: Metadata Improvement Lab at ESIP 4: Visualizing FAIRness
Slides: https://doi.org/10.6084/m9.figshare.9273179

View the Recording on YouTube

Speakers

Sean Gordon

Information Engineer, The HDF Group
Talk to me about the ESIP Labs project and ESIPhub, a JupyterHub-based shared computational environment for workshops at meetings. My research focuses on the connections between documentation structures and the evaluation of content for the metadata needs of diverse communities of practice...



Tuesday July 16, 2019 2:45pm - 4:15pm
Room 316

2:45pm

A Metadata Database Built on Usage Patterns in the LTER Network
LTER-core-metabase is a relational database model based on the GCE LTER Metabase, with adaptations by MCR, SBC, and BLE LTER sites. The project provides a database schema for metadata about ecological data packages. The design is influenced heavily by the Ecological Metadata Language (EML). There is also an associated R package, MetaEgress, that produces EML for a data package, enabling the information manager to quickly generate metadata for package archiving. The schema and R package are available on GitHub, with the schema represented as a set of SQL scripts to facilitate diffs.

The Environmental Data Initiative (EDI) is defining a constrained profile of EML that streamlines scripts and software for the broader community but is not tied to any specific back-end storage system. EML was designed to be extensible, but we have observed EML creators converging on a set of elements. A de facto profile is therefore emerging, and that profile for EML can define the specs of a common interface, which facilitates writing shared tools against different back-end metadata storage systems. For example, LTER-core-metabase (a back end) maps to the EML profile through views that decouple the outward-facing appearance from the back-end implementation. This session will share the current state of LTER-core-metabase and discuss advancing the project and its ties to EDI's profile for EML to improve data management system sustainability.
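To illustrate the idea of a view separating the outward-facing interface from the back-end tables, here is a minimal sketch. It is not the LTER-core-metabase schema: the table `pkg_internal`, the view `vw_eml_dataset`, and all column names are hypothetical, and SQLite is swapped in only so that the example is self-contained.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a back-end table and a facing view."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        -- Back-end table: internal names that may change over time (hypothetical).
        CREATE TABLE pkg_internal (
            pkg_id  INTEGER PRIMARY KEY,
            ttl     TEXT NOT NULL,
            abs_txt TEXT
        );

        -- Outward-facing view: stable names that shared tools query (hypothetical).
        CREATE VIEW vw_eml_dataset AS
        SELECT pkg_id AS dataset_id, ttl AS title, abs_txt AS abstract
        FROM pkg_internal;
        """
    )
    conn.execute(
        "INSERT INTO pkg_internal (pkg_id, ttl, abs_txt) VALUES (?, ?, ?)",
        (1, "Example reef survey", "Hypothetical abstract text."),
    )
    conn.commit()
    return conn


def fetch_dataset(conn: sqlite3.Connection, dataset_id: int):
    """Read one dataset through the view only; the table is never named here."""
    row = conn.execute(
        "SELECT dataset_id, title, abstract FROM vw_eml_dataset WHERE dataset_id = ?",
        (dataset_id,),
    ).fetchone()
    return dict(row) if row is not None else None


if __name__ == "__main__":
    print(fetch_dataset(build_demo_db(), 1))
```

Because `fetch_dataset` depends only on the view, the back-end table could be renamed or restructured without touching tools written against the view, provided the view is redefined to match.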

View Session Recording on YouTube.

Speakers

Tim Whiteaker

University of Texas

Margaret O'Brien

Data Manager, University of California, Santa Barbara

M. Gastil-Buhl

Information Manager, Moorea Coral Reef Long Term Ecological Research
I curate datasets for an LTER site for reuse in future and current use by other research groups. I am interested in optimizing data usability and making the curation process more efficient. My favorite part of this work is the collegial spirit among LTER site information managers...


Tuesday July 16, 2019 2:45pm - 4:15pm
Room 317

2:45pm

NetCDF and CF: The Basics
This workshop will teach some of the basics of CF metadata for netCDF data files, with some hands-on work available in Jupyter Notebooks using Python. Along with an introduction to netCDF and CF, we will introduce the CF data model and discuss some netCDF implementation details to consider when deciding how to write data with CF and netCDF. We will cover gridded data as well as in situ data (stations, soundings, etc.) and touch on storing geometry data in CF.


Session recording here.

Session Take-Aways
  1. Writing your data into the netCDF format, which carries its metadata with it, using CF is easy for both in situ and gridded data. Learn how here: http://bit.ly/esip2019-cf-tut.
  2. Binder is cool.


Speakers

Ethan Davis

UCAR Unidata



Tuesday July 16, 2019 2:45pm - 4:15pm
Room 318

4:15pm

Quick Break
Tuesday July 16, 2019 4:15pm - 4:30pm
TCC

4:30pm

Cloud Data Optimization: Emerging Best Practices II
Event Title: Cloud Data Optimization: Emerging Best Practices II
Event Date/Time: July 16 | 4:30 PM-6:00 PM
Event Location: Greater Tacoma Convention Center, Tacoma, WA
ESIP URL: esipfed.org/summermeeting
Session Moderators: Ana Pinheiro Privette (Amazon), Joe Flasher (AWS) and Jeff de La Beaujardiere (NCAR)

View Session Recording on YouTube.

Session Description:
When data is shared in the cloud, anyone can analyze it without having to download it or store it themselves, which lowers the cost of new product development, reduces the time to scientific discovery, and can accelerate innovation. However, staging large-scale datasets for analysis in the cloud requires consideration of how data should be prepared and organized to allow fast, efficient, and programmatic access from distributed computing systems. This workshop provides a forum for members of the community to share lessons learned as they explore ways to use the cloud to expand data access. It seeks to encourage dialog between users interested in leveraging data in the AWS Cloud for research and application development for Earth Sciences.

Workshop Format:
The workshop includes 1.5 hours of presentations (Cloud Data Optimization: Emerging Best Practices I), followed by 1.5 hours of discussion (Cloud Data Optimization: Emerging Best Practices II) on emerging best practices and on identifying what is needed to move this space forward.

Session II - Round table discussion: emerging best practices in cloud data optimization

Moderators

Jeff de La Beaujardiere

Director, Information Systems Division, NCAR/CISL
Big data, cloud computing, object storage, data management.

Tuesday July 16, 2019 4:30pm - 6:00pm
Ballrm A

4:30pm

ESIP's International Connections: Sharing work that spans U.S., Australia and Europe
This session will highlight collaborative work between the ESIP community in the U.S. and counterparts in Australia and Europe. It will introduce E2SIP (Earth & Environmental Information Partners), the emerging Australian community being incubated by ESIP, and share how we have gone about establishing these connections.

Agenda
4:30-4:45 ESIP's current international strategy
4:45-5:15 E2SIP Australia focus
* Lesley Wyborn - Australian needs
* Jens Klump - Drone work
* Adrian Burton - ARDC approach to international collaboration
5:15-5:30 Europe focus - Helen Glaves
5:30-5:45 IGSN: truly international work linking the U.S., Europe, and Australia, with a focus on collaborative aspects - Kerstin Lehnert, IGSN President
5:45-6:00 Group discussion on lessons learned from this work so far

View the Recording on YouTube

Presenter: Erin Robinson
Presentation Title: The Earth Science Information Partners: Globally Connected Networks of Earth, Space and Environmental Science Data Practitioners Making Data Matter
Slides: https://doi.org/10.6084/m9.figshare.9118898


Moderators

Erin Robinson

Executive Director, ESIP

Speakers

Adrian Burton

Director, Data Policy and Services, ARDC
Adrian Burton is Director of Services, Policy, Collections with the Australian Research Data Commons, and has many years of experience building and supporting national data policy, infrastructure, and services.

Kerstin Lehnert

Lamont-Doherty Earth Observatory, Columbia University
Kerstin Lehnert is Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the NSF-funded data facility IEDA (Interdisciplinary Earth Data Alliance). Kerstin holds a Ph.D in Petrology from the University of Freiburg in Germany. Over...


Tuesday July 16, 2019 4:30pm - 6:00pm
Ballrm BC
  • Area international, collaboration
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

4:30pm

Hands on with Jetstream Atmosphere Part II
Hands on with the Atmosphere GUI on the Jetstream cloud

This tutorial will first give an overview of Jetstream, the National Science Foundation's first production research and education cloud, and various aspects of the system. Then we will take attendees through the basics of using Jetstream via the Atmosphere web interface. This will include a guided walk-through of the interface itself, the features provided, the image catalog, launching and using virtual machines on Jetstream, using volume-based storage, and best practices.

We are targeting users of every experience level. Atmosphere is well-suited to both HPC novices and advanced users. This tutorial is aimed mainly at those who are unfamiliar with cloud computing and who typically do their computation on laptops or departmental server resources. While we will not cover advanced topics in this particular tutorial, we will touch on the available advanced capabilities during the initial overview.

Attendees will need to bring a laptop with a modern web browser (Firefox, Chrome, or Safari).
----

Jetstream is a user-friendly cloud computing environment for researchers based on Atmosphere and OpenStack. It is designed to provide configurable cyberinfrastructure that gives researchers access to interactive computing and data analysis resources on demand, whenever and wherever they want to analyze their data. For a more in-depth description please see the System Overview - http://wiki.jetstream-cloud.org/System+Overview

Session recording is here.

Speakers

Jeremy Fischer

Manager, Jetstream Cloud, Jetstream - Indiana University
Cloud computing for research and education!


Tuesday July 16, 2019 4:30pm - 6:00pm
Ballrm D

4:30pm

Drone Data API Design Hackathon
Drones are a valuable new platform for collecting data. We have the technology to make these data ubiquitously FAIR. We should build that data infrastructure; come help design it.

The Sloan Foundation-funded Drone Data API project aims to build a standards-based foundational tool stack that provides a linked-data-, open-source-, and network-native foundation on which domains can build the specific tools they require for efficient data capture with drones. These APIs will therefore leverage OGC, W3C, and engineering standards, along with GIS, library, and scientific-domain best practices.

Participants are encouraged to join the GeoSemantics Symposium on Monday and then spend Tuesday on a design hackathon to workshop the provisional high-level API design towards a concrete design plan.

See the full agenda at https://github.com/opengeospatial/LANDRS/blob/master/DesignDocs/DesignHack1/Agenda.md.

Questions? Contact jwyngaar@nd.edu.

Session recording here.

Speakers

Jane Wyngaard

University of Notre Dame


Tuesday July 16, 2019 4:30pm - 6:00pm
Room 315

4:30pm

Bridging The Gap Between Discovery and Use (Data and Tools)
How do metadata repositories that hold vast amounts of varied data help users start working with the data quickly and easily? Connecting users to the data, and to the tools and services that can use the data, has been an ongoing challenge. Tools and services that can work with Earth science data are crucial to scientific research and to increasing the value and use of that data. This session will convey how metadata repositories are attempting to help users start working with their data immediately through metadata modeling and intuitive discovery tools. We will also capture best practices for connecting data to tools, which can be shared with other organizations that are trying to tackle this issue. https://docs.google.com/document/d/1_jvkxKe2zyz8T6p9Xh5asKegk6Akl_2bo7lqbU57NOk/edit?usp=sharing

List of Talks:
  • Evolving UMM-S To Better Accommodate NASA EOSDIS Tools (Web User Interfaces and Downloadable Tools) For Data Use (Tyler Stevens)
  • Telling the Whole Tale via Reproducible Data Reuse (Matt Jones)
  • Enhancing Data Discoverability and Access with NOAA OneStop (Anna Milan)

Session recording here.

Speakers

Matt Jones

Director, DataONE Program, DataONE, UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Scientific Synthesis

Tyler Stevens

Senior Discipline Engineer, KBR/NASA EED-2

Anna Milan

Metadata Standards Lead, NOAA NCEI
~*~Metadata Adds Meaning~*~



Tuesday July 16, 2019 4:30pm - 6:00pm
Room 316

4:30pm

What does it mean to be a Data Mentor?
Research data can become at risk for a variety of reasons, and the risks can occur throughout the data’s lifecycle. It takes dedicated resources to ensure data can be preserved for the long term and be made available, accessible, and usable to all. At DataAtRisk.org (https://dataatrisk.org), we rely on “Data Mentors,” or people who are committed to protecting data from risk, to make data secure and facilitate data rescue activities.

During this session, the DataAtRisk team invites attendees to help us formalize the “Data Mentor” role and its responsibilities for our Data Nomination Tool. We will first clarify the key characteristics (or personas) for the “Data Mentor”. Based on these personas, we will use user stories to describe the types of data rescue activities that the “Data Mentors” need to prioritize. Further, using these pieces of information, we will build a realistic workflow that represents the amount of effort it takes for the “Data Mentor” when facilitating data rescue activities submitted via the Data Nomination Tool. Finally, we will determine if “Data Mentor” is the appropriate name for this role.

The Data Nomination Tool facilitates community-driven rescue efforts for Earth and Environmental science data. In particular, the web-based tool connects people who can provide long-term data stewardship support with those who need the assistance. The tool is created and hosted by CloudBIRST (https://cloudbirst.com/, key contact: Joan Saez). DataAtRisk.org’s current members also consist of individuals from Earth Science Information Partners (see ESIP Partners here: https://www.esipfed.org/partners), Johns Hopkins University Sheridan Libraries, and representatives from several University Research Libraries.

View Session Recording on YouTube.

Session Take-Aways
  1. Training for advocates and allowing advocates to participate at different levels
  2. Data coordinator / advocate requires many skills - and is the connector between provider and hero.
  3. There are other models to use, such as Zooniverse and OpenStreetMap, that could inform skills, roles, and processes.



Moderators

Denise Hills

Director, Energy Investigations, Geological Survey of Alabama
Long tail data, data preservation, connecting physical samples to digital information, geoscience policy, science communication

Tuesday July 16, 2019 4:30pm - 6:00pm
Room 317

4:30pm

netCDF-CF Workshop Part I
Agenda
  • 16:30 - Introductions, Workshop Plans, and Goals (Jessica Hausman)
  • 17:00 - Summary of 2018 workshop - discussions, decisions, and status (Ethan Davis)
  • 17:30 - Review recent discussions of CF Governance and Process (Daniel Lee)
  • 19:00 - Dinner at Harmon Restaurant, 1938 Pacific Ave. Reservation under netcdf

Background

This is part 1 of a 4-part workshop.
The Climate and Forecast (CF) metadata convention for netCDF (netCDF-CF) is a community-developed standard first released in 2003. The CF conventions were originally developed to represent climate and forecast model output encoded in the netCDF binary format, with the specific goal of facilitating comparison of output from different models. Subsequent development of the convention has broadened its scope to include observational data and derived products.

This workshop is focused on discussing current and future efforts and directions for the CF conventions.

View Session Recording on YouTube.

Session Take-Aways
  1. CF is working to improve its governance to make it more adaptable to the needs of its users.


Speakers

Kevin O'Brien

Software Engineer, UW/JISAO, NOAA/PMEL

Jessica Hausman

Data Engineer, PO.DAAC JPL

Aleksandar Jelenak

The HDF Group

David Hassell

University of Reading

Daniel Lee

Software and data format engineer, EUMETSAT

Guilherme Castelao

Scripps Institution of Oceanography

Ethan Davis

UCAR Unidata



Tuesday July 16, 2019 4:30pm - 6:00pm
Room 318

6:00pm

End of Day 1
Tuesday July 16, 2019 6:00pm - 6:00pm
TCC
 
Wednesday, July 17
 

8:00am

Morning Plenary
View live stream here: ESIP 2019 Summer Meeting - Day 2 Plenary


View session recording here.

Speakers

Jennifer Hennessey

Senior Policy Advisor on Ocean Health, Washington State Government
Jennifer Hennessey is a senior policy advisor on ocean health for Washington State Governor Jay Inslee. Ms. Hennessey advises on policies to address ocean impacts from climate change, such as ocean acidification, ocean warming, hypoxia, and other ocean health topics. Previously, Ms...

Adam Mansur

Data Manager, Smithsonian Institution
Adam joined the Department of Mineral Sciences at the Smithsonian Institution in 2010 after completing an MS in geochemistry at the University of Maryland. At the Smithsonian, Adam oversees data about one of the largest, most comprehensive geological collections in the world. The...

Judy Twedt

Climate Data Sound Artist, University of Washington
Judy Twedt uses sound and music to create acoustic and emotionally expressive representations of climate data. Analogous to visualization, data sonification is the sonic representation of data. Judy's climate data soundtracks have been aired on KUOW, PBS, and NPR. She is a speaker...

Dawn Wright

Chief Scientist, Esri
As Chief Scientist of Esri, Dawn Wright aids in strengthening the scientific foundation for Esri software and services, while also representing Esri to the scientific community. A specialist in marine geology, Dawn has authored and contributed to some of the most definitive literature...


Wednesday July 17, 2019 8:00am - 10:00am
TCC
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

8:30am

netCDF-CF Workshop Part II
Agenda
  • 8:30 - CF Governance, Roles, and Process
  • 9:00 - GitHub process - where we are and where we want to be
  • 9:30 - Way forward

Background
This is part 2 of a 4-part workshop.

The Climate and Forecast (CF) metadata convention for netCDF (netCDF-CF) is a community-developed standard first released in 2003. The CF conventions were originally developed to represent climate and forecast model output encoded in the netCDF binary format, with the specific goal of facilitating comparison of output from different models. Subsequent development of the convention has broadened its scope to include observational data and derived products.

This workshop is focused on discussing current and future efforts and directions for the CF conventions.

Session recording is here.

Speakers

Jessica Hausman

Data Engineer, PO.DAAC JPL

Aleksandar Jelenak

The HDF Group

Kevin O'Brien

Software Engineer, UW/JISAO, NOAA/PMEL

David Hassell

University of Reading

Daniel Lee

Software and data format engineer, EUMETSAT

Guilherme Castelao

Scripps Institution of Oceanography

Ethan Davis

UCAR Unidata



Wednesday July 17, 2019 8:30am - 10:00am
Room 318

10:00am

Break
Wednesday July 17, 2019 10:00am - 10:30am
TCC

10:30am

Approaches to extending schema.org for Data APIs
PROBLEM: schema.org can describe static Datasets, but it is difficult to accurately describe services and APIs that provide access to data. This session will bring together data API managers and curators, conceptual modelers, and ontologists to model and develop a schema.org extension that addresses accessing data through APIs and services.
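For context, the static case that schema.org already covers can be sketched as below: a `Dataset` whose `distribution` is a `DataDownload` with a fixed `contentUrl`. The helper name and every value passed to it (dataset name, example.org URL, media type) are made up for illustration; only the schema.org type and property names are taken from the vocabulary.

```python
import json


def static_dataset_jsonld(name, description, download_url, media_type):
    """Build schema.org JSON-LD for a dataset offered as one static file."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": download_url,    # one fixed file to fetch
                "encodingFormat": media_type,  # how that file is encoded
            }
        ],
    }


if __name__ == "__main__":
    doc = static_dataset_jsonld(
        "Hypothetical CTD casts",
        "Made-up example dataset used only to show the markup shape.",
        "https://example.org/data/ctd.csv",
        "text/csv",
    )
    print(json.dumps(doc, indent=2))
```

A single `contentUrl` suits one downloadable file. It has no obvious place for query parameters, subsetting options, or response formats, which is the gap the session set out to model.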

Session recording is here.

Moderators
Adam Shepherd
Technical Director, Co-PI, BCO-DMO
schema.org | Data Containerization | Linked Data | Semantic Web | Knowledge Representation | Ontologies

Wednesday July 17, 2019 10:30am - 12:00pm
Ballrm A

10:30am

Assessment Frameworks and Dimensions for Educational & Training Resources
One goal of the ESIP-hosted Data Management Training Clearinghouse (DMTC), an Institute of Museum & Library Services National Leadership Grant recipient, is to identify and/or develop assessment frameworks that could be applied to the educational and training resource content in the DMTC. While a number of approaches to assessing educational resources seem promising (e.g., the Kirkpatrick framework, CLEAN evaluation criteria), the working group tasked with addressing the question of assessment would like to know more about these approaches, how they might apply to the DMTC resources or the DMTC itself, and the mechanisms or processes that others have developed to evaluate educational and training resources. The session will include invited speakers who will describe different frameworks and how they are used, and will allow ample time to discuss how the frameworks might apply to DMTC resources.

Session recording is here.

Moderators
Karl Benedict
ESIP President, ESIP
The ESIP President is a volunteer position, elected by the ESIP Community each year. The President works with the ESIP Staff on several of the presentations, speaker introductions, award ceremonies, and other speaking and participating aspects of ESIP meetings throughout the year.

Wednesday July 17, 2019 10:30am - 12:00pm
Ballrm BC
  • Area Training assessment, Assessment dimensions
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129 Australia: +61 2 8355 1050 France: +33 170 950 594 Norway: +47 21 93 37 51 Austria: +43 7 2081 5427 Germany: +49 692 5736 7317 Spain: +34 932 75 2004 Belgium: +32 28 93 7018 Ireland: +353 15 360 728 Sweden: +46 853 527 827 Canada: +1 (647) 497-9391 Italy: +39 0 230 57 81 42 Switzerland: +41 225 4599 78 Denmark: +45 32 72 03 82 Netherlands: +31 207 941 377 United Kingdom: +44 330 221 0088 Finland: +358 923 17 0568 New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

10:30am

Getting your data into the cloud: How to deploy and use Cumulus
This session will be an interactive walkthrough of how to deploy the open-source Cumulus tool for getting your data into the cloud and a live demo of using Cumulus to ingest a new set of science data into the cloud.

Presenter: Mark Boyd
Presentation Title: An Introduction to Cumulus
Slides: https://doi.org/10.6084/m9.figshare.8947106
Session recording is here.

Moderators
Mark Boyd
Engineer

Wednesday July 17, 2019 10:30am - 12:00pm
Ballrm D

10:30am

EnviroSensing: Sensor Data, Technology, and Best Practices
Session Overview
Sponsored by the ESIP EnviroSensing Cluster, this session is open to scientists, information managers, and technologists interested in in-situ environmental sensing for science and management. Our community of practitioners promotes conversation around, and the development and refinement of, techniques to observe natural Earth system processes over short and long timescales. In this session, we will hear short talks on new data types, interesting technology applications, project case studies, data management, related software tools, quality control processes, and other advances in the field.

Session Agenda
  1. Presenters: Scotty Strachan & Renée F. Brown, EnviroSensing Cluster Co-chairs
    Presentation Title: ESIP EnviroSensing Cluster & Session Introduction
    Slides: https://doi.org/10.6084/m9.figshare.9275468
  2. Invited Keynote: Joseph Bell, USGS
    Presentation Title: USGS Next Generation Water Observing System
    Slides: https://doi.org/10.6084/m9.figshare.9275462
  3. Presenter: Martha Apple, Montana Tech
    Presentation Title: Sensors in Snowy Alpine Environments: Part I, Microhabitats and Plant Functional Traits
    Slides: https://doi.org/10.6084/m9.figshare.9275465
  4. Presenter: James Gallagher, OPeNDAP
    Presentation Title: Sensors in Snowy Alpine Environments: Part II, Sensor Networks with LoRa
    Slides: https://doi.org/10.6084/m9.figshare.9275471
  5. Presenter: Renée F. Brown, McMurdo Dry Valleys LTER
    Presentation Title: Environmental Data Acquisition from the Bottom of the Earth
  6. Presenter: Connor Scully-Allison, University of Nevada Reno
    Presentation Title: 5 in 10: A Practical Breakdown of 5 Frontend Libraries to Speed Up the Development of Interactive Visualizations
    Slides: https://doi.org/10.6084/m9.figshare.8953652
  7. Group discussion, Future Directions, and Closing Remarks

Session recording is here.

Session Take-Aways
  1. Visualization Dashboards: Sensor users struggle with effective real-time visualization tools and data quality assessment pipelines. Network managers and scientists with deployments need effective dashboards. Viz libraries exist, but are not yet adapted into specific tools for our applications.
  2. Telemetry & Emerging Tech (LoRa): Small-scale sensor users and geographically-dense applications are leveraging emerging low-power, long-distance radio technologies, such as LoRa. This radio technology and related network topologies are driving “open” networks for data transfer in near-real-time. The USGS is applying LoRa in the Next-Gen Water Network.
  3. Sensor Metadata & Raw Data Archival: Software tools to capture uniform metadata for sensor deployments don’t exist. People at all scales are doing this manually or not at all, challenging FAIR Data practices. "Raw" sensor data are useful for science questions that leverage noise or unintended behavior. Documenting deployments and archiving raw data will improve workflows and allow use of sensor data in novel, unanticipated ways.



Speakers
James Gallagher
Contractor, OPeNDAP

Martha Apple
Montana Tech

Renée Brown
Information Manager, McMurdo Dry Valleys LTER
Environmental sensor networks, data management, long-term ecological research, aridland ecosystems, nitrogen and carbon biogeochemical cycles, climate change.

Scotty Strachan
Director of Cyberinfrastructure, University of Nevada, Reno
Institutional cyberinfrastructure, sensor-based science, mountain climate observatories!


Wednesday July 17, 2019 10:30am - 12:00pm
Room 316

10:30am

The Information Management Code Registry: Software Solutions for Information Management Needs
The Information Management Code Registry (IMCR) enhances the use and value of Earth Science data by facilitating discovery of software solutions for information management needs. Our primary goal with the IMCR is to create a comprehensive registry of information management software that is searchable by task (e.g. quality control) and other attributes (e.g. science domain). Our secondary goal is to highlight coverage gaps and help shift redundant effort to new development. In this session, we report on the accomplishment of the primary goal and present plans for attaining the secondary goal. Additionally, we will run group activities and discussions to: (1) test and refine discoverability, (2) inform identification of coverage gaps, (3) explore the benefit of adding non-generalized, but unique and useful, scripts developed for a single purpose, and (4) collectively cogitate on the general utility of the IMCR and how to maximize its value.

Presenter: Colin Smith
Presentation Title: The Information Management Code Registry: Software Solutions for Information Management Needs
Slides: https://doi.org/10.6084/m9.figshare.8947118

Session recording is here.

Speakers
Colin Smith
Data manager, Environmental Data Initiative (EDI)
I work on accelerating the archive and reuse of data in ecological science. My interests are in software development and data harmonization.

Kristin Vanderbilt
Research Associate Professor, University of New Mexico



Wednesday July 17, 2019 10:30am - 12:00pm
Room 317

10:30am

netCDF-CF Workshop Part III
Agenda
5 minute talks on current proposals:

Background
This is part of a four-part workshop.

The Climate and Forecast (CF) metadata convention for netCDF (netCDF-CF) is a community-developed standard first released in 2003. The CF conventions were originally developed to represent climate and forecast model output encoded in the netCDF binary format, with the specific goal of facilitating comparison of output from different models. Subsequent development of the convention has broadened its scope to include observational data and derived products.

This workshop is focused on discussing current and future efforts and directions for the CF conventions.

Session recording is here.

Speakers
Jessica Hausman
Data Engineer, PO.DAAC JPL

Aleksandar Jelenak
The HDF Group

Kevin O'Brien
Software Engineer, UW/JISAO, NOAA/PMEL

David Hassell
University of Reading

Daniel Lee
Software and data format engineer, EUMETSAT

Guilherme Castelao
Scripps Institution of Oceanography

Ethan Davis
UCAR Unidata



Wednesday July 17, 2019 10:30am - 12:00pm
Room 318

12:00pm

Lunch
Wednesday July 17, 2019 12:00pm - 1:30pm
Exhibit Hall B (4th Flr)

1:00pm

Data to Action Teacher Workshop
Limited Capacity seats available

Data to Action with Jupyter Notebooks and other Earth Science Tools
JOIN US for an afternoon deep-dive into two cutting-edge data science activities. First, we’ll use Jupyter Notebooks to investigate a 100-year hurricane dataset, modifying code to dig deeper. We’ll also examine GOES weather satellite tools, art history, geosciences, and biodiversity datasets using the SuAVE visual exploration tool.
Session Take-Aways
  • GOES-R Series WebApps (http://cimss.ssec.wisc.edu/education/goesr/webapps) provide learners with easy-to-use apps for learning about combining satellite channels into RGB, spatial resolution, and the spectral bands of GOES 16/17.
  • The Explore Atlantic Storms Jupyter Notebook gave secondary educators hands-on experience (some for the first time) with viewing, modifying, and playing with Python code.
  • SuAVE (Survey Analysis via Visual Exploration) is an incredibly powerful and fun tool to explore and use in the classroom.


Speakers
Sean Gordon
Information Engineer, The HDF Group
Talk to me about the ESIP Labs project ESIPhub, a JupyterHub-based shared computational environment for workshops at Meetings. My research focuses on the connections between documentation structures and the evaluation of content for the metadata needs of diverse communities of practice...

LuAnn Dahlman
Science Writer and Editor, NOAA Climate Program Office
The updated Climate Explorer application.

Shelley Olds
Science Education Specialist, UNAVCO
Data visualization tools, Earth science education, human dimensions of natural hazards, disaster risk reduction (DRR), resilience building.



Wednesday July 17, 2019 1:00pm - 5:00pm
Room 315

1:30pm

Advancing spatial and temporal aspects of schema.org
PROBLEM: schema.org is currently inconsistent with how standards organizations (W3C, OGC) represent spatial and temporal information. This session will bring together data curators, conceptual modelers, and ontologists to formulate solutions for extending schema.org's approach to spatial and temporal descriptions.

Moderators
Adam Shepherd
Technical Director, Co-PI, BCO-DMO
schema.org | Data Containerization | Linked Data | Semantic Web | Knowledge Representation | Ontologies

Wednesday July 17, 2019 1:30pm - 3:00pm
Ballrm A

1:30pm

Getting Stuff Done with R, Python and Jupyter Notebooks
Sometimes the hardest part of getting started with coding is to determine which is the best software to learn or use! The goal of this session is to provide a basic introduction to three commonly-used tools for data management and analysis and to provide examples of how they can be used for managing data, visualization, exploiting cloud resources, generating metadata, using or creating web services, manipulating XML documents, and facilitating reorganization of data.

A panel will provide brief overviews of R, Python, and Jupyter Notebooks, including examples of what they do best, drawn from real-world applications. Workshop attendees will be encouraged to participate in discussions of data challenges they have encountered and the relative merits of the different tools in meeting them. Participation in the session by coders experienced in one or more of the tools is encouraged, as is participation by those who have yet to use any of these very powerful tools.

NOTES: bit.ly/put_notes_here

Session recording is here.

Speakers
Stace Beaulieu
NES-LTER Information Manager, Woods Hole Oceanographic Institution
I'm the Information Manager for the Northeast U.S. Shelf LTER and Coordinator for WHOI's Ocean Informatics initiative (whoi.edu/ocean-informatics). Come talk with me about data science training in the ocean sciences!

Colin Smith
Data manager, Environmental Data Initiative (EDI)
I work on accelerating the archive and reuse of data in ecological science. My interests are in software development and data harmonization.

Chris Turner
Data Librarian, Axiom Data Science

M. Gastil-Buhl
Information Manager, Moorea Coral Reef Long Term Ecological Research
I curate datasets for an LTER site for reuse in future and current use by other research groups. I am interested in optimizing data usability and making the curation process more efficient. My favorite part of this work is the collegial spirit among LTER site information managers...


Wednesday July 17, 2019 1:30pm - 3:00pm
Ballrm BC
  • Area R, Python, Jupyter Notebooks
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129 Australia: +61 2 8355 1050 France: +33 170 950 594 Norway: +47 21 93 37 51 Austria: +43 7 2081 5427 Germany: +49 692 5736 7317 Spain: +34 932 75 2004 Belgium: +32 28 93 7018 Ireland: +353 15 360 728 Sweden: +46 853 527 827 Canada: +1 (647) 497-9391 Italy: +39 0 230 57 81 42 Switzerland: +41 225 4599 78 Denmark: +45 32 72 03 82 Netherlands: +31 207 941 377 United Kingdom: +44 330 221 0088 Finland: +358 923 17 0568 New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

1:30pm

Cloud Engineering in Practice
With the immense increase in the volume of data acquisition and archival comes the challenge of intensive data processing that we are all trying to solve. Many efforts are underway to address it by infusing cloud technologies into software infrastructure. In this session we will cover the various approaches being taken to move toward scalable storage and auto-scaled processing. We will talk about porting applications to the cloud, container-based deployment models, and a hybrid science data processing system that uses both on-premise and remote compute resources to meet latency requirements while handling large volumes of data. Cloud-based infrastructure is being used for running data analytics stacks, automated workflows for reprocessing campaigns, forward keep-up, and much more. Several projects, such as GRFN (Getting Ready for NISAR), PO.DAAC, and SWOT, have invested in cloud technologies. We have explored running our software on several platforms, including Azure, Google Cloud Platform, Amazon Web Services, High Performance Computing, and Kubernetes. We would like to shed some light on this work and the lessons learned in the process.

Presentations
  • Cloud-based Data Processing and Workflow Systems – Namrata Malarout (Jet Propulsion Laboratory, California Institute of Technology)
  • Steering the Ship: Making Sense of Multi Container Deployments with the Help of Kubernetes and AWS – Frank Greguska (Jet Propulsion Laboratory, California Institute of Technology)
    Packaging applications into containers is an easy and effective mechanism for delivering software that is runnable, repeatable, and reliable. However, that is just the first step. Scientific applications that deal with big data tend to require parallelism and multiple focused applications. In this case, it is necessary to manage multi-container deployments spread across multiple machines. In this talk, one solution for deploying and managing multi-container deployments will be explored in depth. The focus will be on Kubernetes, Amazon Web Services, and Apache SDAP as deployed for the NASA Sea Level Change Portal. You can look forward to a Kubernetes crash course followed by a detailed explanation of a production deployment of an Elastic Kubernetes Service (EKS) cluster.
  • Cumulus Lessons Learned: Building, testing, and sharing a cloud archive – Patrick Quinn (NASA / EED-2 / Element 84)
    Cumulus is a scalable, extensible cloud-based archive system which is capable of ingesting, archiving, and distributing data from both existing on-prem sources and new cloud-native missions. As we have built and evolved the system with contributions from seven NASA EOSDIS organizations, we have learned several lessons about how to build a robust, broadly-applicable, microservices-based cloud system for geospatial data which we will share in this talk.

Session recording is here.

Speakers
Patrick Quinn
Software Engineer, NASA / EED-2 Element 84

Frank Greguska
Scientific Applications Software Engineer, Jet Propulsion Laboratory, California Institute of Technology

Namrata Malarout
Scientific Applications Software Engineer, NASA / JPL


Wednesday July 17, 2019 1:30pm - 3:00pm
Ballrm D

1:30pm

Data to Action Teacher Workshop
Limited Capacity seats available

Data to Action with Jupyter Notebooks and other Earth Science Tools
JOIN US for an afternoon deep-dive into two cutting-edge data science activities. First, we’ll use Jupyter Notebooks to investigate a 100-year hurricane dataset, modifying code to dig deeper. We’ll also examine GOES weather satellite tools, art history, geosciences, and biodiversity datasets using the SuAVE visual exploration tool.
Session Take-Aways
  • GOES-R Series WebApps (http://cimss.ssec.wisc.edu/education/goesr/webapps) provide learners with easy-to-use apps for learning about combining satellite channels into RGB, spatial resolution, and the spectral bands of GOES 16/17.
  • The Explore Atlantic Storms Jupyter Notebook gave secondary educators hands-on experience (some for the first time) with viewing, modifying, and playing with Python code.
  • SuAVE (Survey Analysis via Visual Exploration) is an incredibly powerful and fun tool to explore and use in the classroom.



Speakers
LuAnn Dahlman
Science Writer and Editor, NOAA Climate Program Office
The updated Climate Explorer application.

Sean Gordon
Information Engineer, The HDF Group
Talk to me about the ESIP Labs project ESIPhub, a JupyterHub-based shared computational environment for workshops at Meetings. My research focuses on the connections between documentation structures and the evaluation of content for the metadata needs of diverse communities of practice...

Shelley Olds
Science Education Specialist, UNAVCO
Data visualization tools, Earth science education, human dimensions of natural hazards, disaster risk reduction (DRR), resilience building.



Wednesday July 17, 2019 1:30pm - 3:00pm
Room 315

1:30pm

Open Forum IGSN 2040: Maturing a PID Organization toward Sustainability
Globally unique, persistent, and resolvable identifiers (PIDs) are now an essential component of the modern research ecosystem and are used for many types of digital objects and research artefacts including data, software, samples, and instruments. The International Geo Sample Number (IGSN) is a specialized PID for physical samples that ensures unambiguous citation and tracking of these samples and links them to data and publications.  Originally developed for the Earth Sciences, the IGSN has evolved into an international PID system and is increasingly adopted by other disciplines that need to refer to physical samples. The growing number and range of stakeholders worldwide include, but are not limited to, researchers, collection curators, and data managers.

To date, nearly 6.9 million samples have been registered with IGSN. As the audience expands and the adoption rate accelerates, the governance and business models of the system need to be reassessed to support this growth. The IGSN 2040 project, funded in 2018 by an award from the Alfred P. Sloan Foundation, has enabled the participation of an international group of experts, from multiple domains, to re-design and improve the existing organization and technical architecture of the IGSN. The goal is to be able to respond to and support, in a sustainable manner, the rapidly growing demands of an increasingly multi-disciplinary samples user community in a landscape of maturing research data infrastructures.

The IGSN 2040 team invites the ESIP community to participate in an open forum to explore solutions for a scalable and sustainable future of the IGSN. This discussion will begin broadly, addressing essential criteria for trustworthiness and sustainability for PIDs in the rapidly growing ecosystem of globally unique, persistent, and resolvable identifiers (UPRI), and then narrow to focus on the optimal organizational foundations needed to ensure longevity, scalability, and effective governance of the IGSN. The results of this discussion will inform the work of the IGSN 2040 Governance Steering Committee Meeting, which is co-located with the 2019 ESIP Summer Meeting.

Session recording is here.

Moderators
Kerstin Lehnert
Lamont-Doherty Earth Observatory, Columbia University
Kerstin Lehnert is Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the NSF-funded data facility IEDA (Interdisciplinary Earth Data Alliance). Kerstin holds a Ph.D in Petrology from the University of Freiburg in Germany. Over...

Wednesday July 17, 2019 1:30pm - 3:00pm
Room 316

1:30pm

Large/Mission Scale Multiplatform Data Working Group
Are you working with large or mission-scale NASA data? Join this session to share experiences with other big data users and collaborate with members of this newly formed ESDIS working group as we formulate ideas on how to streamline access to these multi-platform datasets and reduce the strain on DAAC infrastructure.

Join in the discussion by adding your experiences to our doc:
http://bit.ly/LARGE_SCALE

Session recording is here.

Speakers
Ben Galewsky
Research Programmer, National Center for Supercomputing Applications


Wednesday July 17, 2019 1:30pm - 3:00pm
Room 317

1:30pm

netCDF-CF Workshop Part IV
Agenda
  • 13:30 - UDUNITS (Ethan Davis)
  • 13:45 - Calendars (Ethan Davis)
  • 14:00 - Leap seconds (Jim Biard)
  • 14:15 - CF in the cloud (Aleksandar Jelenak)
  • 14:30 - Next steps (Ethan Davis)

Background
This is part of a four-part workshop.

The Climate and Forecast (CF) metadata convention for netCDF (netCDF-CF) is a community-developed standard first released in 2003. The CF conventions were originally developed to represent climate and forecast model output encoded in the netCDF binary format, with the specific goal of facilitating comparison of output from different models. Subsequent development of the convention has broadened its scope to include observational data and derived products.

This workshop is focused on discussing current and future efforts and directions for the CF conventions.

Session recording is here.

Speakers
Jessica Hausman
Data Engineer, PO.DAAC JPL

Aleksandar Jelenak
The HDF Group

Kevin O'Brien
Software Engineer, UW/JISAO, NOAA/PMEL

David Hassell
University of Reading

Daniel Lee
Software and data format engineer, EUMETSAT

Guilherme Castelao
Scripps Institution of Oceanography

Ethan Davis
UCAR Unidata



Wednesday July 17, 2019 1:30pm - 3:00pm
Room 318

3:00pm

Break
Wednesday July 17, 2019 3:00pm - 3:30pm
TCC

3:30pm

Metadata harvesting through schema.org
Repositories have recognized the benefits of adopting schema.org metadata in their data catalog landing pages to improve discoverability, particularly with the incentive of inclusion in the Google Dataset search. While Google supports broad, general search and discovery, we can also use this mechanism to improve domain-specific aggregated search systems like DataONE. In this working session, we will focus on real world issues of implementing schema.org for repositories, how to link traditional metadata records into dataset landing pages, and how this can result in improved harvesting and representation by science focused aggregators such as DataONE. We will work through recommendations emerging from science-on-schema.org, optimizing JSON-LD to work with major search engines, and options for extending to include more detailed dataset information beyond the typical discovery-level metadata found in most records.

Moderators
Matt Jones
Director, DataONE Program, DataONE, UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Scientific Synthesis

Dave Vieglais
University of Kansas

Wednesday July 17, 2019 3:30pm - 5:00pm
Ballrm A

3:30pm

Data Citations: What Makes a Good Citation?
Citing data is important because it gives credit to the producers, improves the transparency and reproducibility of work, and supports the FAIR principles (Findable, Accessible, Interoperable, Reusable). While most researchers know how to cite scientific writings, citing data is not as obvious or well practiced. Most data repositories therefore provide citation formats for their datasets so users will know how the data should be cited. Repositories are also registering PIDs, typically DOIs, for the datasets, as tracking a PID is much easier than tracking the actual citation text in an article. But why do the citations and registered PIDs contain the information they contain? This session will look at the citation formats and the registered information that goes into a PID at USGS, NOAA, NASA, and other repositories. We will then compare the various citations and see why differences, if there are any, exist. Are they due to available metadata, or are they community driven, funder driven, etc.?
Speakers:
Reyna Jenkyns - "MINTED: Making Identifiers Necessary for Tracking Evolving Data"
Alex Bell - "A generalist perspective on data citation"
Madison Langseth - "USGS Data and Software Citations"
Heather Brown - "Citations at NCEI"
Jessica Hausman - "Citations at PO.DAAC and NASA" and the new ESIP Guidelines 

Discussion
Notes will be captured in this Google Doc: http://bit.ly/2Y3UL8J

Session recording is here.

Speakers
Madison Langseth
U.S. Geological Survey

Jessica Hausman
Data Engineer, PO.DAAC JPL

Heather Brown
Archive Data Management Specialist, Riverside for NESDIS/NCEI

Mark Parsons
Research Scientist, Rensselaer Polytechnic Institute

Reyna Jenkyns
Data Stewardship Manager, Ocean Networks Canada
dynamic data citations, ocean observing metadata and data



Wednesday July 17, 2019 3:30pm - 5:00pm
Ballrm BC
  • Area Citation, data citation, DOI, PID
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129 Australia: +61 2 8355 1050 France: +33 170 950 594 Norway: +47 21 93 37 51 Austria: +43 7 2081 5427 Germany: +49 692 5736 7317 Spain: +34 932 75 2004 Belgium: +32 28 93 7018 Ireland: +353 15 360 728 Sweden: +46 853 527 827 Canada: +1 (647) 497-9391 Italy: +39 0 230 57 81 42 Switzerland: +41 225 4599 78 Denmark: +45 32 72 03 82 Netherlands: +31 207 941 377 United Kingdom: +44 330 221 0088 Finland: +358 923 17 0568 New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

3:30pm

Scalable, data-proximate cloud computing for Earth Science research
Data-intensive scientific workflows are at a pivotal time in which traditional local computing resources are no longer capable of meeting the storage or computing demands of scientists. In the Earth Sciences, we are facing an explosion of data volumes sourced from models, in-situ observations, and remote sensing platforms. Some agencies are starting to move data to commercial Cloud providers to facilitate access (e.g. NASA on Amazon Web Services). Fully leveraging these opportunities will require new approaches in the way the scientific community handles data access, processing, and analysis. In particular, we need to stop downloading data and start uploading algorithms to wherever large archives reside. This session is targeted at researchers who are pioneering such “data-proximate” computing on commercial Cloud infrastructure. We hope to hear current success stories, as well as failures, and identify ways to improve existing workflows.

Agenda
  • 3:30 - 3:35 Scott Henderson (eScience Institute) Introduction to the session - slides: http://bit.ly/2YhbWnr
  • 3:35 - 3:55 Aimee Barciauskas (Development Seed): The Multi-Mission Algorithm and Analysis Platform (MAAP)
    Slides: https://doi.org/10.6084/m9.figshare.8942108
  • 3:55 - 4:15 Aji John (University of Washington) - Analyzing satellite imagery on the Cloud to understand wildflower phenology at Mt Rainier
  • 4:15 - 4:35 Julien Chastang (UCAR/Unidata) - Deploying a Unidata JupyterHub on the NSF Jetstream Cloud, Lessons Learned and Challenges Going Forward
    Slides: https://doi.org/10.6084/m9.figshare.8944964
  • 4:35 - 4:55 Rich Signell (USGS): Using the Pangeo ecosystem for model analysis and visualization
    Slides: https://doi.org/10.6084/m9.figshare.9115229
  • 4:55 - 5:00 Wrap-up discussion

Session recording is here.

Session Take-Aways
  1. A current challenge for cloud-based workflows is that datasets from different agencies are in different formats, are stored in different regions, and often have similar but slightly different access APIs.
  2. Platforms such as MAAP and Pangeo are very promising and exciting. They enable the benefits of scalable computing on datasets stored on the cloud.
  3. The cost model for scalable cloud computing is unclear. It remains an open question how to support platforms into the future and regulate user access to cluster resources.


Speakers
Aji John
University of Washington

Rich Signell
Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS

Julien Chastang
Software Engineer, UCAR - Unidata
Scientific software developer at UCAR-Unidata.

Scott Henderson
Research Scientist, University of Washington

Aimee Barciauskas
Development Seed


Wednesday July 17, 2019 3:30pm - 5:00pm
Ballrm D

3:30pm

Data to Action Teacher Workshop
Limited Capacity seats available

Data to Action with Jupyter Notebooks and other Earth Science Tools
JOIN US for an afternoon deep-dive into two cutting-edge data science activities. First, we’ll use Jupyter Notebooks to investigate a 100-year hurricane dataset, modifying code to dig deeper. We’ll also examine GOES weather satellite tools, art history, geosciences, and biodiversity data-sets using the SuAVE visual exploration tool.Session Take-Aways
  • GOES-R Series WebApps (http://cimss.ssec.wisc.edu/education/goesr/webapps) provide learners with easy-to-use apps to learn about combining satellite channels into RGB, spatial resolution, and the spectral bands of GOES 16/17.
  • The Explore Atlantic Storms Jupyter Notebook gave secondary educators hands-on experience (some for the first time) with viewing, modifying, and playing with Python code.
  • SuAVE (Survey Analysis via Visual Exploration) is an incredibly powerful and fun tool to explore and use in the classroom.



Speakers

LuAnn Dahlman

Science Writer and Editor, NOAA Climate Program Office
The updated Climate Explorer application.

Sean Gordon

Information Engineer, The HDF Group
Talk to me about the ESIP Labs project ESIPhub, a JupyterHub-based shared computational environment for workshops at Meetings. My research focuses on the connections between documentation structures and the evaluation of content for the metadata needs of diverse communities of practice... Read More →

Shelley Olds

Science Education Specialist, UNAVCO
Data visualization tools, Earth science education, human dimensions of natural hazards, disaster risk reduction (DRR), resilience building.



Wednesday July 17, 2019 3:30pm - 5:00pm
Room 315

3:30pm

Help Us Help You! Developing Data Pathfinders at Earthdata.nasa.gov
The NASA Earth Science Data Systems Program, which manages NASA’s Earth science data collections, is currently working to improve the discoverability of its data holdings and information through the earthdata.nasa.gov website. To this end, we have developed several data pathfinders that focus on a variety of themes in which remote Earth observation data can provide an added dimension to ground-based observations for forecasting, monitoring and responding to climate-related events. Currently we have data pathfinders for Agriculture and Water Resources, Health and Air Quality, and Wildfires.

In this interactive working session, we would like to learn about your data needs and ask you to test our new data pathfinders. Your valuable feedback will be used to improve our tools and increase the discoverability of NASA Earth science data.

Session recording is here.

Moderators

Teddy Gelabert

Web Strategist, NASA ESDS/SSAI
Earthdata user information needs and how to meet them via the web.

Cynthia Hall

Community Coordinator, NASA Earth Science Data Systems/SSAI
Accessing and using NASA Earth science data through Earthdata

Paula Land

Senior Content Strategist, NASA Earth Sciences Data Systems Communications Group
Impressions and experiences with using the earthdata.nasa.gov site.

Kevin Ward

NASA Earth Observatory

Wednesday July 17, 2019 3:30pm - 5:00pm
Room 316

3:30pm

The Critical Zones: Supporting Place Based Research
We look at the history of data management for the NSF CZO Network, an NSF-funded network of sites focused on how components of the Critical Zone interact, shape Earth's surface, and support life. Each site has its own data management practices, with a central catalog aggregating information about well-curated datasets. Each site leverages specific technologies such as Dendra, Geodashboard, Clowder, etc. We will discuss some of these local approaches and how, in the last few years, there has been an attempt to improve the central catalog by leveraging efforts such as CUAHSI HydroShare, together with some forward-looking approaches for a better federated data manager package.

Presentations
  • CZO Cyberinfrastructure History (Collin Bode, CZO, UC Berkeley)
  • Eel River CZO & Dendra (Collin Bode, CZO, UC Berkeley)
  • IML CZO & Clowder+Geodashboard (Luigi Marini, CZO, NCSA)
  • CUAHSI & HydroShare (Martin Seul, CUAHSI): https://doi.org/10.6084/m9.figshare.8968016
  • Migrating CZO (meta)data to HydroShare (David Lubinski, CZO, CU-Boulder)
  • “CZ Collaborative Network” NSF RFP (Luigi Marini, CZO, NCSA)
Find and access all slides: https://doi.org/10.6084/m9.figshare.9106955

Session recording is here.

Session Take-Aways
  1. Critical zone observatories (CZO) are by necessity multi-disciplinary research centers, and because of the size of the data collection network and the data stream, it is necessary to have a data manager to homogenize and monitor incoming data.
  2. The Dendra platform is a tool for collecting, cleaning, and working with time series data from distributed sensor networks. One takeaway from the use of this tool was that there are benefits to using a system designed for a specific purpose, but you need to know the limits of the system.
  3. The HydroShare system will become a holding place for CZO data as the project transitions to a new phase based on new NSF funding guidelines. Currently, there is a lack of standardization in terminology in the HydroShare system and this requires streamlining to increase usability.



Moderators

Ben Galewsky

Research Programmer, National Center for Supercomputing Applications

Speakers

Collin Bode

Data Manager, Eel River CZO, UC Berkeley
We work at the Angelo Coast Range Reserve (http://angelo.berkeley.edu) doing critical zone science. Our interests are in place-based science asking questions about the structure and processes of the critical zone. I have developed a realtime sensor management and curation system... Read More →

Luigi Marini

Lead Research Programmer, National Center for Supercomputing Applications



Wednesday July 17, 2019 3:30pm - 5:00pm
Room 317

5:30pm

Identifying Competencies Needed by Data “Specialists”
Many organizations and consortia around the globe are discussing how to identify the kinds of data skills that people interested in working as professional data stewards (or data curators or information specialists, for example) need to acquire during their academic careers (including undergraduate, graduate, PhDs, PostDocs and, possibly, early career professionals). Members of the ESIP community are involved in discussions of the data skills needed by the "generic" data steward, but want to inform those discussions with the needs of the data steward working with Earth Science data. The skills identified will be used by the American Geosciences Institute (AGI) to create a one-page "Career Compass" handout, which has been quite popular at a number of professional conferences and association meetings. An example of a Career Compass for Data Science can be seen at: https://www.americangeosciences.org/sites/default/files/CareerCompass_DataSciences.pdf. Those who are engaged in data steward activities or have similar responsibilities are invited to offer comments, suggestions, and feedback to the organizers. If there is enough interest in group discussion, an unconference session may also be proposed.

Speakers

Karl Benedict

ESIP President, ESIP
The ESIP President is a volunteer position, elected by the ESIP Community each year. The President works with the ESIP Staff on several of the presentations, speaker introductions, award ceremonies, and other speaking/participating aspects of ESIP meetings throughout the year.

Nancy Hoebelheinrich

Principal, Knowledge Motifs LLC
See my LinkedIn profile at: https://www.linkedin.com/in/nancy-hoebelheinrich-0576ba3


Wednesday July 17, 2019 5:30pm - 8:00pm
TCC

5:30pm

Research Showcase & Research as Art
The Research Showcase, including Research as Art, takes place on the evening of July 17th. This evening reception will provide participants with a wide variety of options for sharing their work: they can share posters or demo tools, and we also encourage them to use visual media to present their data and research as art.

We will also have a couple of special activities that you can take part in, including Identifying Competencies Needed by Data Specialists (more at https://sched.co/RH8D).

Wednesday July 17, 2019 5:30pm - 8:00pm
TCC

8:00pm

End of Day 2
Wednesday July 17, 2019 8:00pm - 8:00pm
TCC
 
Thursday, July 18
 

8:00am

ESIP Lab Plenary
View live-stream here: ESIP 2019 Summer Meeting - Day 3 Plenary

Lab Project Reports
Ziheng Sun: Geoweaver Update
Abdullah Alowairdhi: FAIRtool.org
Eric Sproles: UAS Snow Albedo
Amanda Tan: CubeSats and Snow Covered Area
Rich Signell: Conda Forge + Google Summer of Code Project XrVis

Plenary Speakers
Presenter: Jim Bednar
Talk Title: PyViz Tools for Geoscience: Easy, Flexible, High-Performance, Browser-Based Visualizations in Python

Presenter: Kelsey Jordahl
Talk Title: Planet data, applications, and interoperability
Slides: https://doi.org/10.6084/m9.figshare.8956526

Lightning Talk Slides here.

View session recording here.

Speakers

Annie Burgess

ESIP Lab Director, ESIP

Rich Signell

Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS

Amanda Tan

Data Scientist, University of Washington
Cloud computing, distributed systems

Eric Sproles

Montana State University

Abdullah Alowairdhi

PhD Candidate, U of Idaho

Kelsey Jordahl

Director, Data Pipeline Team, Planet
Kelsey Jordahl is the Director of the Data Pipeline team at Planet. Planet operates the largest private constellation of satellites in the world, including over 150 earth observation satellites, and processes terabytes of data every day. Prior to joining Planet in 2015, he worked... Read More →

Jim Bednar

Leader, PyViz/HoloViz Group, Anaconda
Dr. James A. Bednar was a faculty member in the School of Informatics at the University of Edinburgh from 2003 to 2015, and is currently Manager, Technical Services at Anaconda, Inc. At Anaconda, Jim is the project lead of a variety of open-source packages under the PyViz.org banner... Read More →


Thursday July 18, 2019 8:00am - 10:00am
TCC
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

9:00am

IGSN2040 Steering Committee Meeting (Closed Meeting)
Day 1 (18 July 2019)
  • 09.00 - 09.15 Workshop Opening
  • 09.15 - 10.15 Orienting Ourselves on the Strategic Context
      Intro to Organizational Steering Committee - scope, members
      Review of the TSC workshop outcomes
      Report out from the Open Forum at ESIP
  • 10.15 - 10.30 Coffee/Tea break
  • 10.30 - 11.00
  • 11.00 - 11.45
  • 11.45 - 12.00
  • 12.00 - 12.45 Lunch -- Exhibit Hall B (4th Flr)
  • 12.45 - 2.30
  • 2.30 - 3.00
  • 3.00 - 3.15 Coffee/Tea break
  • 3.15 - 4.30
  • 4.30 - 5.00
  • 5.00 Adjourn for the Evening

Speakers

Kerstin Lehnert

Lamont-Doherty Earth Observatory, Columbia University
Kerstin Lehnert is Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the NSF-funded data facility IEDA (Interdisciplinary Earth Data Alliance). Kerstin holds a Ph.D in Petrology from the University of Freiburg in Germany. Over... Read More →


Thursday July 18, 2019 9:00am - 5:00pm
Room 315

10:00am

Break
Thursday July 18, 2019 10:00am - 10:30am
TCC

10:30am

Meet The Maintainers: commoning for data infrastructure durability
Because they care about and for the infrastructure that houses every bit of data, every byte of the cloud, and every line of code, maintainers sustain the technology infrastructure that makes Earth data use possible. Maintainers work in many arenas, of course: they keep energy grids up, roadways repaired, and buildings secure. Data infrastructure experts are now in conversations with other maintainers. Recently, a group of maintainers (technicians, engineers, historians, social scientists, and sysadmins, the ones you call on to reboot the system when it’s down) started a conversation and created a group called The Maintainers. With support from the Alfred P. Sloan Foundation, ESIP is bringing the Maintainer conversation to Tacoma. We’ve invited several of them to talk about the real issues involved in stewarding hardware and systems, not just data. By caring for your hardware, they let you focus on other tasks. Join us to discover how ESIP’s goals of sustaining the Earth science data endeavor rely upon those who choose not to innovate today, but rather to navigate the problematics of keeping everything running most of the time.

Recording: https://2019esipsummermeeting.sched.com/event/PtQN (session starts at ~7:20)
Another recording link here.

Introduction:
Bruce Caron
Presentation Title: Culture, Kindness, and Care: Commoning for Earth Knowledge Sustainability
Slides: https://doi.org/10.6084/m9.figshare.8969912

Moderator: Mark Parsons
Slides: https://doi.org/10.6084/m9.figshare.8969915

Invited Presentations
Presenter: Emily Jane Sylak-Glassman
Presentation Title: The Importance of Maintaining Earth Observational Data for Long-Term Climate Record Reconstruction
Slides: https://doi.org/10.6084/m9.figshare.8969918

Presenter: Daniella Lowenberg
Presentation Title: Maintaining and Growing Research Data Publishing at CDL & Dryad
Slides: https://doi.org/10.6084/m9.figshare.8980454

Presenter: Jason A. Gallo
Presentation Title: The Scale and Value of Earth Observation Infrastructure
Slides: https://doi.org/10.6084/m9.figshare.8969924

Presenter: Fred C. Beach
Presentation Title: U.S. Energy Infrastructure: ‘What’s Past is Prologue’
Slides: https://doi.org/10.6084/m9.figshare.8969921

Session Take-Aways
  1. ESIP scientists and data scientists and project managers are only one part of the larger team that keeps Earth information active and durable. The Maintainer organization can be an “ESIP for the rest of the team”... where ESIP maintainers gather with others to solve their problems. ESIP will have a session at the next Information Maintainer conference in DC in October. Maintenance is about fostering and caring for relationships: Who decides to maintain is all of us. This requires awareness and kindness.
  2. Earth data resources often have multiple inputs, some of these quite complex. Just finding out and mapping these is an important maintainer activity. Also, archiving data and software at the same time makes good sense (Dryad and Zenodo). Maintenance is more complex than it seems (The US has no energy policy; predicting one variable (sea ice extent) requires dozens of inputs and complex interactions).
  3. Infrastructure (such as our energy infrastructure) can get to the point where the trillions of dollars needed to update it might be better spent replacing this with something highly distributed.



Speakers

Bruce Caron

Executive Director, New Media Research Institute

Mark Parsons

Research Scientist, Rensselaer Polytechnic Institute


Thursday July 18, 2019 10:30am - 12:00pm
Ballrm A

10:30am

Challenges and Opportunities in Adopting Cloud technologies for Data Intensive Science
The amount of data generated by public and private sector organizations has increased manyfold in the last decade. In recent years, consumers and providers of data have faced an increasing challenge in managing the quantity and quality of information produced. The advent of cloud technologies has been a boon for the big data era, offering a solution to the information overload. While cloud technologies have provided an excellent opportunity, the challenges and opportunities in utilizing them are still to be explored. The complex business/infrastructure aspect of the cloud technologies paradigm and the rapid changes in technical development have made transitions complex and confusing at times. In this session, we hope to share case studies of migration to and utilization of cloud technologies for data intensive science. We hope the challenges and opportunities revealed by those case studies will inform stakeholders, collaborators, and other interested parties, and that the lessons learned will inform future work and help expedite progress in the field of Earth Science informatics.

Developing Applications Using Earth Science Data in the AWS Cloud with PODPAC

Matt Ueckermann
Observational and modeled data products from NASA encompass petabytes of scientific data available for analysis, analytics, and exploitation. Unfortunately, these data sets are highly underutilized by the scientific community due to: (1) vast computational resource requirements; (2) disparate formats, projections, and resolutions that hinder data fusion and integrated analyses across different data sets; (3) complex and disjoint data access and retrieval protocols; and (4) task-specific and non-reusable code development processes that hinder algorithm sharing and collaboration. In response, NASA EOSDIS is actively investigating migration of their vast data archives to storage on commercial cloud services such as Amazon Web Services (AWS). However, to maximize the benefit of cloud-based data storage, cloud-based data analysis and analytics are needed to process data “close” to where it is stored. Recognizing that migrating workflows to the cloud requires a high degree of cloud computing expertise, we are developing the Pipeline for Observational Data Analysis and Collaboration (PODPAC). PODPAC is a Python library designed to automatically harmonize disparate data sources, seamlessly access NASA Earth science data, and analyze data in the AWS cloud. PODPAC is built around the tools of the Python data ecosystem (NumPy, SciPy, xarray) and aims to bridge the gap between data sources, analysis, and the cloud. In this talk, we will introduce PODPAC, and demonstrate on-demand cloud computation of a value-added derived product using NASA data.
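
As a rough illustration of what "harmonizing" sources with different resolutions can involve, the sketch below resamples a coarse grid onto a finer one by nearest-neighbor lookup so the two can be combined cell by cell. It does not use or reproduce the PODPAC API; the function names, coordinates, and values are all hypothetical, and nearest-neighbor resampling is only one of several possible techniques.

```python
from bisect import bisect_left


def nearest_index(sorted_coords, value):
    """Return the index of the coordinate closest to value (coords ascending)."""
    pos = bisect_left(sorted_coords, value)
    if pos == 0:
        return 0
    if pos == len(sorted_coords):
        return len(sorted_coords) - 1
    before, after = sorted_coords[pos - 1], sorted_coords[pos]
    # Ties go to the lower coordinate.
    return pos - 1 if value - before <= after - value else pos


def resample_nearest(lats, lons, grid, target_lats, target_lons):
    """Resample a 2-D grid (rows = lats, cols = lons) onto target coordinates."""
    rows = [nearest_index(lats, lat) for lat in target_lats]
    cols = [nearest_index(lons, lon) for lon in target_lons]
    return [[grid[r][c] for c in cols] for r in rows]


# Hypothetical coarse source (1-degree spacing) and fine source (0.5-degree spacing).
coarse_lats, coarse_lons = [0.0, 1.0], [10.0, 11.0]
coarse = [[1.0, 2.0],
          [3.0, 4.0]]

fine_lats, fine_lons = [0.0, 0.5, 1.0], [10.0, 10.5, 11.0]
fine = [[10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0],
        [16.0, 17.0, 18.0]]

# Put both sources on the fine grid, then combine them cell by cell.
coarse_on_fine = resample_nearest(coarse_lats, coarse_lons, coarse, fine_lats, fine_lons)
combined = [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(coarse_on_fine, fine)]
```

Real data would also need matching projections and units before a step like this makes sense.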
Opportunities for Accelerating Science in the Cloud
Christopher Lynnes
As the data holdings of the Earth Observing System Data and Information System expand over the next several years, the typical data analysis process of downloading data to local compute resources will become increasingly inefficient. However, cloud computing promises to mitigate that by allowing the user to process close to the data. These improvements will be obtained via a variety of mechanisms: 1) improving the ability of data transformation services to reduce the data prior to analysis; 2) providing cloud-native analysis capabilities for common analysis functions; and 3) providing the ability to work directly with data in Web Object Storage.
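
The first mechanism, reducing data before analysis, can be pictured with a small sketch: a service keeps only the points inside a requested bounding box and returns a single mean, so the user receives one number instead of the full file. The record layout and the function name are assumptions made for illustration and do not correspond to any EOSDIS service.

```python
def reduce_before_transfer(records, lat_range, lon_range):
    """Keep records inside the bounding box and return their mean value."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    selected = [
        r["value"] for r in records
        if lat_min <= r["lat"] <= lat_max and lon_min <= r["lon"] <= lon_max
    ]
    if not selected:
        return None  # nothing falls inside the requested region
    return sum(selected) / len(selected)


# Hypothetical granule contents: one value per (lat, lon) point.
records = [
    {"lat": 46.8, "lon": -121.7, "value": 2.0},
    {"lat": 47.2, "lon": -122.4, "value": 4.0},
    {"lat": 10.0, "lon": 20.0, "value": 100.0},
]

# Only the two points inside the box contribute to the result.
regional_mean = reduce_before_transfer(records, (46.0, 48.0), (-123.0, -121.0))
```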

The role of data stewards in a cloud-based platform
Amanda Leon

Google Earth Engine has a growing user community as a cloud-based platform for analysis and visualization of geospatial data. This adoption is heavily driven by the ease of access Earth Engine’s Data Catalog provides to a wealth of satellite imagery and other geospatial data.  As stewards of NASA EOSDIS data, Distributed Active Archive Centers (DAACs) can play a key role in supporting and maximizing the utility of Earth Engine for the scientific community.  The NSIDC DAAC has been assessing various data stewardship topics to support the sustainment and expansion of NASA EOSDIS data in Google Earth Engine including: 1) data inclusion decisions based on science use cases; 2) optimized workflows for preparing 

Open Source Data-Intensive Platform for the Cloud
 
Thomas Huang
JPL has a long history of building many innovative solutions for onboard instruments, ground operations and data systems, and archive and distribution for our missions. As the rate of data generation from our missions continues to increase and is expected to rise significantly in the near future, JPL is engaging in reusable data-intensive technologies for mission operations and to enable science. This talk discusses open source solutions we have developed for the Cloud platform to address three challenges from our growing collections of scientific data (interactive analysis, in situ match-up, and search relevancy) and their applications.

Developing a roadmap for cloud services
Suresh Vannan

The Physical Oceanography Distributed Active Archive Center (PO.DAAC) will be the data repository for the Surface Water and Ocean Topography (SWOT) mission. SWOT presents new challenges and opportunities for PO.DAAC: a large data volume (20 TB/day) and a new community of users (hydrologists). This presentation will show how PO.DAAC plans on addressing those. PO.DAAC first assessed what tools and services current and new users will need to discover, access and utilize SWOT data. This analysis provided information for developing a roadmap that shows what services PO.DAAC (and ESDIS) will migrate and/or develop in a Cloud-based environment for the user community.

Leveraging an interoperable scalable data platform to support Earth Observation Data
Sudhir Raj Shrestha (sshrestha@esri.com)
With an ever-increasing wealth of scientific data produced from various sources and platforms, including earth observations, models, and forecasts, come exciting and challenging opportunities to exploit such vast amounts of data to produce valuable information products. These data are widely used by government agencies like NOAA, NASA, and USGS and by private industry for monitoring and analysis of measurements associated with physical, chemical, and biological phenomena across earth’s oceans, atmosphere, and land masses. The volume, diversity, and complexity of multidimensional earth science data have posed challenges in the past with how they are shared with a diverse community, visualized intuitively, and integrated for answering scientific questions. With advances in geospatial science and technology, these data and analytics can now advantageously be hosted in the cloud. This will have a tremendous impact on how scientists, policy makers, and the public ingest, manage, analyze, visualize, and share complex scientific data. GIS software is evolving in step with the technology industry to help meet these challenges. In this presentation, I will briefly discuss how the current technology trend is driving more scalable, interoperable, and format-agnostic capabilities. We will share how the ArcGIS platform supports this “Open Science” and share use cases in place at NOAA and NASA. We will also share recent advancements in the cloud, spatial machine learning, and geospatial data science that support various domains of science applications.

Session recording here.

Speakers

Sudhir R Shrestha

Solution Engineer Researcher, Esri
Solution Engineer and Scientific Data enthusiast with keen interest in making data easily Discoverable and Interoperable. Passionate about geospatially driven Hydrological Modeling and Heuristic Soil Modeling and develop, implement new and innovative geospatial methods, techniques... Read More →

Amanda Leon

DAAC Manager, NASA National Snow and Ice Data Center DAAC

Thomas Huang

Technical Group Supervisor, JPL

Chris Lynnes

EOSDIS System Architect for Data Use, NASA

Suresh Vannan

Project Manager, NASA/Caltech Jet Propulsion Laboratory



Thursday July 18, 2019 10:30am - 12:00pm
Ballrm BC
  • Area cloud, data intensive science, data management, user communities, adoption
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

10:30am

Current Approaches for Tracking and Exposing Research Object Usage Metrics
Many publishers and funders have implemented open data policies in efforts to make research more transparent and re-usable. These policies also aim to support data, software, and other research objects as valuable output of the research process. To begin to assess impact and give credit to researchers for sharing research objects, however, the community needs to take additional steps to promote standardized measurement of research object usage and proper citation. This means different things for different stakeholders: researchers need to be informed on how and why research object citations should be included in articles and other publications, publishers need to promote and index research object citations, repositories need to standardize and display research object usage information, and institutions need to value these metrics.

Several stakeholders have begun improving capabilities for tracking and exposing research object usage metrics. For example, Make Data Count highlights the value of research data by providing the infrastructure for repositories to display data usage and citation metrics. The project has worked with COUNTER to develop a Code of Practice to enable standardization and has also developed mechanisms for repositories to expose data usage metrics, including implementation examples from California Digital Library, the Arctic Data Center, and DataONE. In this session, we will hear 1) how repositories are currently tracking research object citations, and 2) how the Make Data Count project and other efforts can help these repositories standardize their reporting approach to support accurate representation of the value of research objects.
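
To make "standardized usage metrics" a little more concrete, here is a minimal sketch of one ingredient such standards typically include: collapsing rapid repeat requests by the same user for the same dataset so they count once. The 30-second window, the event layout, and the function name are assumptions for illustration; this is a simplification and not the Make Data Count or COUNTER reference processing.

```python
from collections import defaultdict

# Assumed window: repeats within this many seconds count as one request.
DOUBLE_CLICK_WINDOW_SECONDS = 30


def count_requests(events):
    """Count requests per dataset, collapsing rapid repeats by the same user.

    events: iterable of (timestamp_seconds, user_id, dataset_id) tuples.
    """
    last_seen = {}  # (user, dataset) -> timestamp of the most recent request
    totals = defaultdict(int)
    for timestamp, user, dataset in sorted(events):
        key = (user, dataset)
        previous = last_seen.get(key)
        if previous is None or timestamp - previous > DOUBLE_CLICK_WINDOW_SECONDS:
            totals[dataset] += 1
        last_seen[key] = timestamp
    return dict(totals)


# Hypothetical access log: user u1 requests ds-a twice within 10 seconds.
events = [
    (0, "u1", "ds-a"),
    (10, "u1", "ds-a"),
    (50, "u1", "ds-a"),
    (5, "u2", "ds-a"),
    (7, "u1", "ds-b"),
]
usage = count_requests(events)
```

With a shared rule like this, two repositories logging the same behavior would report the same counts, which is the point of agreeing on a code of practice.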

Session recording here.

Moderators

Amber Budden

Director for Community Engagement and Outreach, DataONE

Bob Downs

CIESIN
Dr. Robert R. Downs serves as the senior digital archivist and acting head of cyberinfrastructure and informatics research and development at CIESIN, the Center for International Earth Science Information Network, a research and data center of the Earth Institute of Columbia University... Read More →

Matt Jones

Director, DataONE Program, DataONE, UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Scientific Synthesis

Madison Langseth

U.S. Geological Survey

Dave Vieglais

University of Kansas

Speakers

Jessica Hausman

Data Engineer, PO.DAAC JPL



Thursday July 18, 2019 10:30am - 12:00pm
Ballrm D

10:30am

Advanced Geospatial Cyberinfrastructure for Deep Learning
The deep stacks and tremendous number of computational parameters in deep learning models greatly increase the challenges of pre-processing, training, testing, and post-processing geospatial datasets quickly and efficiently. This session will discuss the latest progress on constructing advanced cyberinfrastructure for deep learning on satellite-based or field-observed geospatial datasets. The goal is to bring community experiences together and collaborate on building advanced geospatial cyberinfrastructure addressing the big questions raised in solving fundamental geoscience problems using deep learning models.

Presenter: Ziheng Sun
Presentation Title: Geoweaver for Better Deep Learning: A Review of Cyberinfrastructure
Slides: https://doi.org/10.6084/m9.figshare.9037091

View Full Recording on YouTube

Moderators

Annie Burgess

ESIP Lab Director, ESIP

Speakers

Thursday July 18, 2019 10:30am - 12:00pm
Room 316

10:30am

Multi-sensor data integration for cryosphere and hydrosphere monitoring
In keeping with this year’s Summer Meeting theme of “Increasing the Use and Value of Earth Science Data and Information,” this session aims to explore different data streams used for monitoring the hydrosphere and cryosphere. Earth science data for water resources monitoring have existed for decades as field-collected, remotely sensed, modeled, and in situ data. Relatively recent advances in computational capabilities (e.g. cloud computing platforms), data storage and integration, and processing methods like machine learning now allow researchers to draw on data from multiple sources and typologies to answer complex questions about water resources critical to humans and ecosystems. To emphasize the ‘use and value of earth science data’ this session will incorporate presentations on data generation and processing methods as well as applied uses of data products for water resources monitoring.

Presenter: Eric Sproles
Presentation Title: Bridging the Scaling Issues of Earth Observations
Slides: https://doi.org/10.6084/m9.figshare.8980400

Presenter: Jeffrey Deems
Presentation Title: New Data, Old Problems: Integrating Novel Data Sources for Study & Management of Snowmelt Systems
Slides: https://doi.org/10.6084/m9.figshare.8980406

Presenter: Yuhan Rao
Presentation Title: Integrating Satellite Observations and In Situ Measurements to Study Snow-Albedo-Temperature Interactions Over the Tibetan Plateau
Slides: https://doi.org/10.6084/m9.figshare.8980409

Presenter: Scott Oviatt
Presentation Title: Natural Resources Conservation Service SNOTEL Network
Slides: https://doi.org/10.6084/m9.figshare.8980415

Presenter: Ruth Duerr
Presentation Title: Polar Data Activities
Slides: https://doi.org/10.6084/m9.figshare.8980397

Session Take-Aways
  1. NRCS plans to convert long-term snow courses to SNOTEL, continue to pursue tech upgrades, develop new methodologies to improve accuracy
  2. Machine learning can integrate satellite observations and in situ measurements to create a more complete measurement
  3. UAVs provide higher-density albedo measurements and can cover remote locations and multiple field sites
  4. Creating an integrated system for the future to track cryospheric changes
  5. Arctic Data Committee has technical and semantic guidance for integrating cryospheric data

View the Recording on YouTube

Moderators
Speakers

Yuhan Rao

Ph.D., University of Maryland, College Park

Eric Sproles

Montana State University

Scott Oviatt

Snow Survey Supervisory Hydrologist, USDA - Natural Resources Conservation Service
Mr. Oviatt graduated from the University of Missouri, B.S. Agriculture, Atmospheric Scientist. Upon graduation, Mr. Oviatt worked for 3 different consulting firms as a consulting meteorologist in the western U.S. For the past 20 years he has worked for the USDA. First with the... Read More →


Thursday July 18, 2019 10:30am - 12:00pm
Room 318

12:00pm

Lunch
Thursday July 18, 2019 12:00pm - 1:30pm
Exhibit Hall B (4th Flr)

1:30pm

ESIP Geoscience Community Ontology Engineering Workshop (GCOEW)
"Brains! Brains! Give us your brains!"               
      - Friendly neighbourhood machine minds
The collective knowledge in the ESIP community is immense and invaluable. During this session, we'd like to make sure that this knowledge drives the semantic technology (ontologies) being developed to move data with machine-readable knowledge in Earth and planetary science.
What we'll do:
  1. In the first half hour of this session, we'll a) sketch out how and why we build ontologies and b) show you how to request that your knowledge gets added to ontologies (with nanocrediting).
  2. We'll then have a 30-minute crowdsourcing jam session, during which participants can share their geoscience knowledge on the SWEET issue tracker. With a simple post, you can shape how the semantic layer will behave, making sure it does your field justice! Request content and share knowledge here: https://github.com/ESIPFed/sweet/issues
  3. In the last 30 minutes, we'll take one request and demonstrate how we go about "ontologising" it in ENVO and how we link that to SWEET to create interoperable ontologies across the Earth and life sciences.
Come join us and help us shape the future of Geo-semantics!

Stuff you'll need:
  1. A GitHub account available at https://github.com/
  2. An ORCID (for nanocrediting your contributions) available at https://orcid.org
Notes for this session can be found at https://docs.google.com/document/d/1iupSeRRGmgjMBSjVAWIX0MFr1N4yr7DsIxnZhN3G1Zk/edit?usp=sharing

Session recording here.

Moderators

Pier Luigi Buttigieg

Data Scientist, Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung
Responsibilities My work focuses on the combination of semantics, bioinformatics, and data analysis in aid of meaningfully mobilising ecological data and detecting structure in this complex and plastic context. See my ORCID for more: orcid.org/0000-0002-4366-3088 I apply... Read More →

Speakers

Lewis J. McGibbney

Chair, ESIP Semantic Technologies Committee, NASA, JPL
My name is Lewis John McGibbney, I am currently a Data Scientist at the NASA Jet Propulsion Laboratory in Pasadena, California where I work in Computer Science and Data Intensive Applications. I enjoy floating up and down the tide of technologies @ The Apache Software Foundation having... Read More →

Beth Huffer

Lingua Logica


Thursday July 18, 2019 1:30pm - 3:00pm
Ballrm A

1:30pm

Geospatial Data Analytics and Visualization for Sustainability in the Cloud
Session Title: Geospatial Data Analytics and Visualization for Sustainability in the Cloud
Session Convener(s): Sudhir Raj Shrestha (Esri), Ana Pinheiro Privette (Amazon) and Joe Flasher (AWS) 


Session Description:

Sustainability’s geospatial processes are complex since environmental, societal, and economic systems are deeply interconnected. This creates challenges for researchers working in this field because the impacts of changes in one system are not always well understood or predictable for the other systems. As a result, extracting timely and meaningful insights for sustainable environmental decision making often requires large datasets from many different domains, and tools capable of capturing the multidimensional nature of the problem. To address these challenges, many users are exploring the use of cloud computing to leverage its scalable storage and geospatial analytical capabilities. In this session, we are soliciting presentations that utilize cloud-based workflows and applications of GIS technology to derive insights for sustainability.

Workshop structure: Each presenter will have 12 mins of presentation time and 2 mins of Q&A.

Presentations

Title: Amazon Sustainability Data Initiative: promoting innovation and problem solving for sustainability
Presenter: Ana Pinheiro Privette (Amazon)
Abstract: Last December, Amazon launched its Sustainability Data Initiative (ASDI) to promote sustainability research, innovation, and problem-solving by making key data easily accessible and even more widely available. ASDI leverages Amazon Web Services’ technology and scalable infrastructure to stage, analyze, and distribute data. The initiative identifies foundational data for sustainability and works closely with data providers like NOAA, NASA and the UK Met Office to stage their data in the AWS Cloud while giving them complete ownership and control over how their data is shared. While these datasets have always been freely available, they aren’t always easily accessible, and researchers may not have the compute power necessary to take advantage of these resources through their own on-premises data centers. To encourage application development, researchers can apply for AWS Promotional Credits through the AWS Cloud Credits for Research program. Offsetting these costs will encourage experimentation and promote innovative solutions. Amazon believes that providing easier access to massive datasets (i.e. petabyte-scale) in the cloud and providing access to analytical tools will help researchers and innovators address a wide range of sustainability challenges, such as the impacts of climate change and weather extremes.
Contact the ASDI team if you would like to learn more or get involved!

Title: Blue Dot Water Observatory
Presenter: Grega Milcinski (Sinergise)
Abstract: Water lies at the heart of economic and social development. As it is becoming scarce, stakeholders need innovative ways to better understand water conditions, predict risks, and tackle problems. Cost-effective yet reliable solutions for monitoring water resources are needed, as ground-based monitoring networks are often too costly and, due to network deterioration, in some cases also unreliable. This is even more true for developing countries.
Inspired by JRC’s Global Surface Water project, we have built a service that not only shows historic data but is also up to date. The Copernicus Sentinel mission, with its global coverage and short revisit time, combined with an efficient use of AWS infrastructure resources, makes it feasible to do a global-scale project with limited resources. The Blue Dot Water Observatory is an EO-based solution that provides reliable and timely information about surface water levels of water bodies across the globe.
With this service, we also wish to demonstrate how global monitoring of the environment using Earth observation data can be done efficiently and orders of magnitude cheaper than before, if done in an intelligent way. To make it possible for others to build on our experience, we share all the code as open source.
Slides: https://doi.org/10.6084/m9.figshare.9121994

Title: Systematic Data Transformation to Enable ArcGIS Image Services and Web Coverage Services (WCS) within the NASA Earth Science Data System’s Cloud
Presenter: Jason Barnett (Booz Allen Hamilton)
Abstract: This presentation will provide an overview of current efforts to develop and deploy scalable Amazon Web Services (AWS) Step Functions and serverless Lambda Functions that orchestrate a workflow of customized micro-services executing GDAL transformations, in order to geospatially enable and serve new cloud-optimized MetaRaster Format (MRF) NASA Earth science data products. These analysis-ready data products will be served to end users as cloud-based multidimensional ArcGIS Image Services and OGC Web Coverage Services, to be eventually discoverable within catalogs such as NASA Earthdata Search, NASA ArcGIS Online, Esri Living Atlas, etc. This will enable NASA Earth Science datasets to serve as usable inputs for analysis within ArcGIS, QGIS, and custom web mapping applications, and will support deriving insights for sustainability across multiple domains.

Title: Understanding Bob the bias by using true diversity of thought
Presenter: Alexis Hannah Smith (IMGeospatial)
Abstract: IMGeospatial is developing an open source QA app for Android devices to provide a proof of concept for our central objective: having individuals who sign up to our scheme take part in the quality assurance and validation of features extracted from remotely sensed data. This project forms part of a wider collaboration with the European Space Agency (ESA), Anglian Water, Affinity Water and the World Bank. Our motivation is to clearly demonstrate that a freelancer sitting outside his or her dwelling in the heart of an African desert, in a Finnish forest (or indeed anywhere) can be part of, and make a significant contribution to, the AI revolution, while also enriching the lives of those in our global community who need support from the developed world. IMGeospatial believes that by developing and deploying this system using true diversity of thought, we can not only improve the quality of AI-derived data for the whole community, but also understand and measure the dormant devil, Bob the Bias.

Title: Improving Information and Communications in a Disaster Scenario with AWS Snowball Edge
Presenter: Dan Pilone (Element 84)
Abstract: Volunteers and emergency personnel carefully coordinate their response to natural disasters. This coordination requires data, and making data actionable and accessible at the tactical edge remains a challenge. We'll give a quick overview of the results of our disaster response user needs study and demonstrate a prototype disaster response pipeline for field data management. The serverless, cloud-based pipeline combines public and private data sources with open source software. It can provide the field with a ruggedized remote data center (AWS Snowball Edge), preloaded with critical information, including reach-back capabilities. You'll see how this works and learn ways first responders can update data from in-situ sources such as drones.

Title: Upstream Ancillary Ingest: Keep Up Best You Can 
Presenter: Namrata Malarout (JPL) 
Abstract: The ARIA project generates products process

Speakers

Namrata Malarout

Scientific Applications Software Engineer, NASA / JPL

Dan Pilone

Chief Technologist, Element 84, Inc.

Sudhir R Shrestha

Solution Engineer Researcher, Esri
Solution Engineer and Scientific Data enthusiast with keen interest in making data easily Discoverable and Interoperable. Passionate about geospatially driven Hydrological Modeling and Heuristic Soil Modeling and develop, implement new and innovative geospatial methods, techniques... Read More →

Jason Barnett

Geospatial Specialist, Booz Allen Hamilton

Grega Milcinski

CEO and Co-founder, Sinergise
Sentinel Hub and general availability of EO data in the clouds

Alexis Hannah Smith

Founder and CEO, IMGeospatial


Thursday July 18, 2019 1:30pm - 3:00pm
Ballrm BC
  • Area Cloud computing, GIS, Earth Science, Sustainability, Raster Analytics
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; Austria: +43 7 2081 5427; Belgium: +32 28 93 7018; Canada: +1 (647) 497-9391; Denmark: +45 32 72 03 82; Finland: +358 923 17 0568; France: +33 170 950 594; Germany: +49 692 5736 7317; Ireland: +353 15 360 728; Italy: +39 0 230 57 81 42; Netherlands: +31 207 941 377; New Zealand: +64 9 280 6302; Norway: +47 21 93 37 51; Spain: +34 932 75 2004; Sweden: +46 853 527 827; Switzerland: +41 225 4599 78; United Kingdom: +44 330 221 0088
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

1:30pm

How to build your data "groups" for optimal discovery?
How does your Earth Science community define a collection that is discoverable in catalogs AND yet can be simply understood by humans? Many areas need to be evaluated, such as definitions, elements, vocabularies, and more...oh, my! Help us create a cheat sheet for the data management community, and have fun doing it.

Whether you come to our session or not, please help us by filling in the blanks for the following statement in Slido.
I work with _______ datatype and it is aggregated by ___________. An example answer would be: sonar data; single cruise + instrument type. 

https://app.sli.do/event/dttuqvzw
or
Go to https://www.slido.com and enter the code #C690
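
The fill-in-the-blank statement above asks which fields you aggregate by, and that choice is what turns individual files into a "collection". The short Python sketch below shows how the example answer (single cruise + instrument type) yields different collections than aggregating by cruise alone. The records, field names, and cruise labels are hypothetical and exist only for illustration; they do not come from any NOAA catalog.

```python
from collections import defaultdict

# Hypothetical granule records; field names and values are illustrative assumptions.
granules = [
    {"id": "g1", "datatype": "sonar", "cruise": "CRUISE-A", "instrument": "multibeam"},
    {"id": "g2", "datatype": "sonar", "cruise": "CRUISE-A", "instrument": "multibeam"},
    {"id": "g3", "datatype": "sonar", "cruise": "CRUISE-A", "instrument": "singlebeam"},
    {"id": "g4", "datatype": "sonar", "cruise": "CRUISE-B", "instrument": "multibeam"},
]


def group_into_collections(records, keys):
    """Group record ids into collections keyed by the chosen aggregation fields."""
    collections = defaultdict(list)
    for record in records:
        collection_key = tuple(record[k] for k in keys)
        collections[collection_key].append(record["id"])
    return dict(collections)


# "single cruise + instrument type" from the example answer
by_cruise_and_instrument = group_into_collections(granules, ("cruise", "instrument"))

# A coarser alternative: one collection per cruise
by_cruise = group_into_collections(granules, ("cruise",))

for key, ids in by_cruise_and_instrument.items():
    print(key, ids)
```

The same four files form three collections under one rule and two under the other, which is the kind of difference a cheat sheet would need to make explicit.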

Session recording here or possibly here.

Speakers

Anna Milan

Metadata Standards Lead, NOAA NCEI
~*~Metadata Adds Meaning~*~

Heather Brown

Archive Data Management Specialist, Riverside for NESDIS/NCEI



Thursday July 18, 2019 1:30pm - 3:00pm
Ballrm D

1:30pm

An ESIP community's working session on machine learning: introducing adoptable use cases and beyond
The large volume of freely available Earth and environmental data and the fast-developing computational capacity have powered the rapid growth of discovery and exploration using machine learning (ML) within the Earth science information community.

In a community survey conducted by the ESIP Machine Learning Cluster in 2018, the majority of participants expressed a desire for introductory materials on ML applications as well as interest in curated ML datasets. With this in mind, the ML cluster has since focused its efforts on 1) generating Earth science-specific ML application examples and 2) curating ML data repositories for the ESIP community.
In this working session, we invite everyone in the ESIP community who is interested in ML applications to contribute to a discussion on how to move these efforts forward in the most efficient and community-driven manner. The session will begin with presentations from members of the cluster on 1) the development of sample use cases for learning about ML, 2) the curation of a centralized metadata repository of open source, ML-suitable data sets from various repositories, and 3) a demo of Data Driven Discovery of Models (D3M) for composing ML pipelines to simultaneously solve various problems. The session will conclude with an open discussion on the cluster’s priorities and efforts for the following six months to best serve the ESIP community.

We encourage all ESIP ML enthusiasts to join our session and contribute to the cluster’s initiatives.
Presentations
  • Yuhan Rao (University of Maryland, College Park): Machine learning applications using R and open source earth science datasets
    • In this presentation, I will share the status of the efforts to develop sample ML applications using open source earth science data sets (from the UCI repository) and ML packages in R. The use cases can be adopted by beginners to start their own applications.
  • Arif Albayrak (NASA Goddard Space Flight Center): Enabling machine learning applications in earth science community through curated open source data sets
    • This presentation features the cluster’s initiative to gather available open source data sets which are suitable for ML applications in earth science. The goal of this effort is to enable future development of ML applications for users with different levels of experience.
  • Sujen Shah (NASA Jet Propulsion Laboratory): Demonstration of automated machine learning application pipeline through DARPA Data Driven Discovery of Models (D3M)

View the Recording on YouTube

Speakers

Anne Wilson

Senior Software Engineer, Laboratory for Atmospheric and Space Physics


Thursday July 18, 2019 1:30pm - 3:00pm
Room 316

1:30pm

HDF Town Hall
Data in HDF file formats continues to play an important role for Earth Scientists in the U.S. and around the world. The HDF Group will update ESIP members on the state of HDF software and HDF5 Roadmap, and will share our experience on working with HDF5 in the Cloud. We will discuss our technical approaches, and lessons learned from different projects including a NASA ACCESS project that transformed NASA HDF data into GeoTIFF in AWS. We will also update ESIP members on our involvement in standardization efforts and demonstrate how HDF tools support ESDIS data from product initial design to production, and to compliance with the standards. We will encourage ESIP members participating in the session to share their experiences with the HDF software and to contribute to the HDF5 Roadmap.

Talks   
Google Colaboratory for HDF-EOS - Joe Lee

Abstract: Google provides a free Jupyter notebook environment called Colaboratory (also known as Colab). It is a simple, easy, and awesome Python environment for data scientists. We present how NASA Earthdata in HDF can be used with Google Colab using the existing comprehensive example on the HDF-EOS Tools and Information Center website (http://hdfeos.org/zoo). We also present how OPeNDAP can be used with Colab to achieve 100%-cloud data analysis.

Keywords: Python, Google Colab, Jupyter notebook, HDF-EOS, OPeNDAP, Cloud computing.
Slides: https://doi.org/10.6084/m9.figshare.8976464

Leveraging the Cloud for HDF Software Testing - Larry Knox

Abstract: In this talk we will discuss how we leverage the Cloud for daily regression testing of HDF software, including testing of the HDF5 parallel library on a Cloud cluster using OrangeFS.
Keywords: HDF5, Cloud, CI testing.

Parallel Computing with HDF Server - John Readey

Abstract: To deal with really big data you need to be able to harness the power of multiple machines, but many users are put off by the complexity involved in setting up a cluster and then figuring out how to effectively utilize it. However, by using HDF Server (HSDS) with Kubernetes, it’s much easier than you would think. In this talk we’ll walk through some examples of using xarray, h5netcdf, and h5py with HSDS to illustrate how you can scale up your compute to match your data size.
Keywords: HDF5, h5netcdf, h5py

HDF5 Roadmap 2019-2020 - Elena Pourmal

Abstract: In this talk we will give an overview of the new features of the upcoming HDF5 release 1.12.0, and outline the HDF5 roadmap for the next year. We will demonstrate new open source file drivers to access HDF5 files via Amazon Simple Storage Service (Amazon S3) and on the Hadoop Distributed File System (HDFS). We will use this presentation to get feedback on the HDF5 roadmap from the ESDIS users and application developers.
Keywords: HDF5, Amazon S3, HDFS, Cloud, Object Store.

Session recording here.

Moderators

Aleksandar Jelenak

The HDF Group

Speakers

John Readey

The HDF Group

Larry Knox

The HDF Group

Elena Pourmal

Engineering Director, HDF Group
HDF

Joe Lee

Software Engineer, The HDF Group
HDF Product Designer HDF(-EOS) / netCDF / GDAL OPeNDAP / Hyrax / THREDDS / PydapBig data / Spark / Hadoop / Elasticsearch / Logstash / KibanaCloud / S3 / Lambda / Docker / CondaMinecraft / AR / VR / WebGL Machine Learning / Deep Learning / Keras.io / H2O.ai / Rekognition / AlexaAI... Read More →



Thursday July 18, 2019 1:30pm - 3:00pm
Room 318

3:00pm

Break
Thursday July 18, 2019 3:00pm - 3:30pm
TCC

3:30pm

Unconference Session I: Schedule in the Session Description
Ballroom A -  Tools, Tools, Tools: Share your scientific computing tools with your peers.
Ballroom BC -  Data on the DWeb: Hack Session on Linking P2P Data to Repositories
Ballroom D -  The Metadata Game
316 -  Data Activism: Power to the People
317 -  2020 is the 50th Anniversary of Earth Day: What could/should ESIP do to celebrate?


______
An "unconference" is particularly useful when participants generally have a high level of expertise or knowledge in the field the conference convenes to discuss. So ESIP is the perfect place to unconference!

At the ESIP unconference, the agenda is created by the attendees through the first 2.5 days of the meeting. Anyone who wants to initiate a discussion on a topic can add ideas to the Unconference Board at Registration. Participants will also have dots included with their name badges. You can vote on your preferred sessions throughout the first 2.5 days. At lunch, before this session starts, we will co-create the schedule based on session popularity and attendee input. There will be 5 minutes to move between sessions during each unconference block.

Each ESIP unconference session is led by the participant who suggested its topic. Sessions can also be geared around working on a particular topic or a hack-a-thon; whatever you need at this point in the meeting, make it your session! In an effort to accommodate as many unconference sessions as we can, there will be limited access to A/V at this time. Please plan accordingly.



Thursday July 18, 2019 3:30pm - 4:10pm
TCC

3:30pm

Data Risk Matrix Do-A-Thon I
Defining risks for data can be a daunting task. The risk factors for data collections may vary from collection to collection, or vary over time for a single collection. These factors could additionally vary by the priorities and resources available at any given time. The Data Stewardship Committee held a session at the ESIP Summer Meeting 2018 (https://2018esipsummermeeting.sched.com/event/Eypr/building-a-data-risk-factor-matrix) where participants undertook a “card sorting” exercise, an established method for developing categorizations of concepts. The outcome of that exercise indicated more than one way to categorize data risks, thus indicating that any approach may need adjustment depending on the situation at hand.

This working session is intended to further develop and evolve the Data Risk Categorization Matrix (http://bit.ly/2IX3VM5) begun by the Data Stewardship Committee, and to work through test cases for its application. We invite volunteers to use the Categorization Matrix on a data collection they are familiar with prior to the session, then during the session we will discuss issues, comments, concerns, or improvements to the matrix. Participants are encouraged to bring information on a data collection to the session to conduct live assessments with input from other participants.

Session recording here.

Moderators

Denise Hills

Director, Energy Investigations, Geological Survey of Alabama
Long tail data, data preservation, connecting physical samples to digital information, geoscience policy, science communication

Thursday July 18, 2019 3:30pm - 4:15pm
Room 318

4:15pm

Unconference Session II - Schedule in the Session Description
Ballroom A -  Challenges in Ocean Data Discoverability: schema.org, metadata stacks, etc...
Ballroom BC - Fine-grained access to netCDF data in the cloud
Ballroom D - How to decide upon underlying goals framework for educational content in the DMT Clearinghouse
316 - Creating a robust environment for hosting Jupyterhub workshops





Thursday July 18, 2019 4:15pm - 4:55pm
TCC

4:15pm

Data Risk Matrix Do-A-Thon II
Defining risks for data can be a daunting task. The risk factors for data collections may vary from collection to collection, or vary over time for a single collection. These factors could additionally vary by the priorities and resources available at any given time. The Data Stewardship Committee held a session at the ESIP Summer Meeting 2018 (https://2018esipsummermeeting.sched.com/event/Eypr/building-a-data-risk-factor-matrix) where participants undertook a “card sorting” exercise, an established method for developing categorizations of concepts. The outcome of that exercise indicated more than one way to categorize data risks, thus indicating that any approach may need adjustment depending on the situation at hand.

This working session is intended to further develop and evolve the Data Risk Categorization Matrix (http://bit.ly/2IX3VM5) begun by the Data Stewardship Committee, and to work through test cases for its application. We invite volunteers to use the Categorization Matrix on a data collection they are familiar with prior to the session, then during the session we will discuss issues, comments, concerns, or improvements to the matrix. Participants are encouraged to bring information on a data collection to the session to conduct live assessments with input from other participants.

Session recording here.

Moderators

Denise Hills

Director, Energy Investigations, Geological Survey of Alabama
Long tail data, data preservation, connecting physical samples to digital information, geoscience policy, science communication

Thursday July 18, 2019 4:15pm - 5:00pm
Room 318

5:00pm

Unconference Wrap-up
We will come back to plenary for a quick session to report out on key insights, lessons learned and anything that is being carried forward. This could be great fodder for FUNding Friday team forming and poster making too!

We will also enjoy the Climate Fables Virtual Reality Project & Fashion Show

Thursday July 18, 2019 5:00pm - 5:30pm
TCC

6:30pm

FUNding Friday Poster Making Session
Join us at 7 Seas to make a poster or find a team + make a poster for FUNding Friday (FF).

FUNding Friday is an annual mini-grant competition associated with ESIP’s Summer Meeting. The mini-grants are available to ESIP members ($5000) and to students and Education Committee workshop participants ($3000), with total number of awards specified annually and generally 2-4 awards per participant group.

Interested participants must exhibit a poster describing the project during the Poster Pitch session (Friday morning, check the Summer Meeting schedule for specific time and place). The poster should be hung in the provided space before the pitch session begins.

The poster size is limited to 25 by 30 inches. It can be hand-drawn; materials for the posters are provided to interested participants during the FF Poster event Thursday night.

Thursday July 18, 2019 6:30pm - 8:30pm
7 Seas Brewery 2101 Jefferson Ave, Tacoma, WA 98402

8:00pm

End of Day 3
Thursday July 18, 2019 8:00pm - 8:00pm
TCC
 
Friday, July 19
 

8:00am

FUNding FRIDAY & Morning Plenary
View live stream here: ESIP 2019 Summer Meeting - Day 4 Plenary

  • FUNding Friday Pitches
  • Increasing Earth Data Usage through Public Data and Machine Learning
    Shane Glass, Google
  • Best-value Data-intensive Analysis Architecture Deduced Using “Geo-lly” Beans
    Kwo-Sen Kuo, UMD/NASA Goddard/Bayesics LLC

View session recording here.

Speakers

Kwo-Sen Kuo

UMD/NASA Goddard/Bayesics LLC
Kwo-Sen Kuo is a “disruptive thinker” (commonly known as “boat-rocker” or “troublemaker”) because he likes to question the existing ways of doing things. Although he considers that to be completely rational, it is not always appreciated as so by others. His disruptiveness... Read More →

Shane Glass

Public Data Lead, Google
Shane is a program manager with Google Cloud's Developer Relations, where he leads the Google Cloud Public Datasets Program. The Cloud Public Datasets Program facilitates access to high-demand public datasets in order to make it easy for data users to access and uncover new insights... Read More →


Friday July 19, 2019 8:00am - 9:45am
TCC
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; Austria: +43 7 2081 5427; Belgium: +32 28 93 7018; Canada: +1 (647) 497-9391; Denmark: +45 32 72 03 82; Finland: +358 923 17 0568; France: +33 170 950 594; Germany: +49 692 5736 7317; Ireland: +353 15 360 728; Italy: +39 0 230 57 81 42; Netherlands: +31 207 941 377; New Zealand: +64 9 280 6302; Norway: +47 21 93 37 51; Spain: +34 932 75 2004; Sweden: +46 853 527 827; Switzerland: +41 225 4599 78; United Kingdom: +44 330 221 0088
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

8:30am

IGSN2040 Steering Committee Meeting (Closed Meeting)
Speakers

Kerstin Lehnert

Lamont-Doherty Earth Observatory, Columbia University
Kerstin Lehnert is Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the NSF-funded data facility IEDA (Interdisciplinary Earth Data Alliance). Kerstin holds a Ph.D. in Petrology from the University of Freiburg in Germany. Over... Read More →


Friday July 19, 2019 8:30am - 1:30pm
Room 315

9:45am

Break
Friday July 19, 2019 9:45am - 10:00am
TCC

10:00am

Conceptual modeling for earth science
Data repositories often rely upon conceptual models that provide formal representation information and identity conditions for digital resources -- for instance, the ontologies that underlie semantic data, or conceptual models like FRBR that underlie digital libraries. Though these latter two cases represent extremely well documented conceptual models, there are many other instances where underlying conceptual models are tacit or inexplicit, and rarely published by practitioners and researchers. This makes it hard to build on one another's work, identify weaknesses in our models or modeling approaches, or forge new innovative collaborations. Furthermore, even in cases where conceptual models are well articulated, we believe there is a need for further discussion related to the methods used in modeling work, and the open research questions regarding conceptual modeling.

To that end, we'd like to see ESIP become a home for conversations about conceptual modeling for earth science data! We (https://sig-cm.github.io) are a group of information scientists who believe that sustaining a rich tradition of research and development in conceptual modeling in LIS requires collaboration with, and contributions from, communities like ESIP. This session would be the second in a series of interdisciplinary workshops, panels, and working sessions with the goal of building community and a research agenda around conceptual modeling work in libraries, archives, museums, and data repositories.

Plan for session:
- short lightning talks from presenters, setting the stage and outlining the topic
- a working session, in which participants are split into small groups to discuss areas of unmet need and to develop research questions and possible future research/development directions for ESIP + conceptual/data modeling efforts.

Session recording here.

Moderators
Friday July 19, 2019 10:00am - 11:30am
Ballrm A

10:00am

Location, Location, Location: Enabling Data Discovery by Place
Controlled vocabularies and ontologies are used to annotate datasets in the environmental sciences to improve data discoverability. However, they typically focus on data content and uses, rather than the location where data is collected. Although selecting terms for the theme of a dataset is usually straightforward, identifying terms for the location of data collection is a more complicated issue. Places where research is conducted vary in location and size. Some named locations may be subsumed by other named locations (e.g., a city in a state) and sometimes multiple names need to be specified to be clear (e.g., Springfield, IL, USA vs. Springfield, MO, USA vs. Springfield, ON, CA). Moreover, many geographic name databases work well for terrestrial locations, but not for aquatic ones (e.g., coral reefs). The nearest named place from a gazetteer may be quite distant from a study site in the wilderness. Additionally, data for a given study may be collected in many distinct locations with gaps in between. For discoverability, is it preferable to identify a place as part of a study where many types of data are collected, or as a set of coordinates? In this working group, we will consider use cases from the perspective of environmental researchers to evaluate how well gazetteers and other resources such as the NGA GEOnet Names Server (GNS) could enable data discovery by researchers searching for data. Our aim is to provide recommendations for specifying location using geographic naming resources, or failing that, to better define how various resources might be evaluated for fitness.
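
Two of the problems named above, ambiguous place names and study sites far from any named place, can be made concrete with a small Python sketch. It assumes a hypothetical three-entry gazetteer with approximate coordinates chosen only for illustration; it is not the GNS data model or any real gazetteer API, and the study site coordinates are invented.

```python
import math

# Hypothetical mini-gazetteer; coordinates are approximate and for illustration only.
GAZETTEER = [
    {"name": "Springfield", "region": "IL", "country": "USA", "lat": 39.78, "lon": -89.65},
    {"name": "Springfield", "region": "MO", "country": "USA", "lat": 37.21, "lon": -93.29},
    {"name": "Springfield", "region": "ON", "country": "CA", "lat": 42.83, "lon": -80.93},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def find_by_name(name):
    """Return every gazetteer entry sharing a name, which exposes ambiguity."""
    return [place for place in GAZETTEER if place["name"] == name]


def nearest_place(lat, lon):
    """Return the closest gazetteer entry and its distance from the given point."""
    best = min(GAZETTEER, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))
    return best, haversine_km(lat, lon, best["lat"], best["lon"])


# An invented study site, given only as coordinates.
place, distance_km = nearest_place(40.5, -89.0)
print(len(find_by_name("Springfield")), "places share the name 'Springfield'")
print(f"Nearest named place: {place['name']}, {place['region']} ({distance_km:.0f} km away)")
```

A bare name matches three entries, so region and country are needed to disambiguate, and the distance to the nearest entry is one simple number for judging whether a place name describes a site well enough.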

Presenters: John Porter, Kristin Vanderbilt
Presentation Title: Location, Location, Location: Enabling Data Discovery by Place
Slides: https://doi.org/10.6084/m9.figshare.8980052
Session recording here.

Speakers

Kristin Vanderbilt

Research Associate Professor, University of New Mexico



Friday July 19, 2019 10:00am - 11:30am
Ballrm BC
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

10:00am

Current Status in Cloud Data Access
Cloud computing holds the promise of novel data analysis capabilities for geoscientists by providing affordable on-demand computing system resources. One of the major differences with the traditional computing systems is web-based object storage which requires new data access methods with a different set of performance parameters.

The aim of this session is to provide the ESIP community with an opportunity to learn about the current capabilities for accessing data in cloud object stores. The emphasis will be on the actual software, data servers or libraries, which are capable of accessing cloud object stores, performance issues and bottlenecks, and best practices that can be adopted when migrating data to the cloud. When considering end-user applications, this session is about how those tools access data from the novel data storage systems available with cloud computing and not about the algorithms, etc., associated with data visualization, analytics, or machine learning.

Session recording here.

Moderators

James Gallagher

Contractor, OPeNDAP

Aleksandar Jelenak

The HDF Group

Friday July 19, 2019 10:00am - 11:30am
Ballrm D

10:00am

Preparing climate and hydrological time series data for submission to CUAHSI
In this working session we will introduce CUAHSI Data services to manage point time series data, such as streamflow and precipitation. This standardized data format will enable data synthesis and comparison across different sites and locations. Specifically, we will demonstrate how to convert a climate dataset into this format and upload to CUAHSI’s data repository. Participants may follow along using their own data. Please bring one year’s worth of climate station data in your local format to this working session along with a laptop containing your favorite scripting environment. We will also provide an example dataset and expertise in various scripting languages. The goal is for a data manager to obtain a good understanding of the workflow involved for converting their local climate station data for submission to CUAHSI’s data repository.

Session recording here.

Speakers

Margaret O'Brien

Data Manager, University of California, Santa Barbara

Suzanne Remillard

Oregon State University


Friday July 19, 2019 10:00am - 11:30am
Room 316

10:00am

Community Ontology Repository (COR) Administration, Development and Planning
This session will consist of two main themes: discussion of current and pending important administrative tasks; and hands-on exercises covering the development aspect toward improvement of the software itself, as well as possible complementary tools that could be integrated (e.g., ontology viewers/visualizers). Participants will gain understanding of how this particular instance of the ORR software is deployed on Amazon and how they can contribute in various ways including further core development and integration of new tools and client libraries to leverage the powerful API and SPARQL endpoint capabilities of the COR server.

NOTES: HERE

Session recording here.

Moderators

Annie Burgess

ESIP Lab Director, ESIP

John Graybeal

Technical Program Manager, CEDAR and BioPortal, Stanford University
Metadata, semantics, and cool repositories for metadata and semantics. Cool Earth Science (or biomedical) projects that will change the world. Or at least, change the way we manage metadata about the world.

Speakers

Lewis J. McGibbney

Chair, ESIP Semantic Technologies Committee, NASA, JPL
My name is Lewis John McGibbney, I am currently a Data Scientist at the NASA Jet Propulsion Laboratory in Pasadena, California where I work in Computer Science and Data Intensive Applications. I enjoy floating up and down the tide of technologies @ The Apache Software Foundation having...

Carlos Rueda

Sr Software Engineer, MBARI
My areas of expertise and interest include scientific data management, visualization, data integration and interoperability, programming languages, and semantic web. https://www.mbari.org/rueda-carlos/


Friday July 19, 2019 10:00am - 11:30am
Room 317

10:00am

Surprising and Novel Ways to Integrate Community Data Systems with Each Other
We know all the standard mechanisms for integrating data systems with each other: standards, APIs, standards-based APIs, etc., etc., etc. But new possibilities are opening up due to new technologies and approaches: Jupyter, Eclipse Che, Everything-as-a-Service, Slack, JSON-LD... Do you have a novel integration mechanism you want more developers to adopt so we can hook more things together? Come to this session and talk it up!

Agenda:
  • Dave Blodgett (USGS) - "Non information resource - Meta information resource - Data information resources: A resource model for integration of data from multiple organizations about the same real-world feature. Summary of the ongoing Second Environmental Linked Features Interoperability Experiment (SELFIE)"
  • Kevin O'Brien/Bob Simons (NOAA/PMEL) - Integrating Data with ERDDAP
  • Namrata Malarout (NASA/JPL) - Centralized MAAP* API: Simplifying Algorithm Collaboration
  • Daven Quinn (U. Wisconsin) - Sparrow: an in-house, interoperable data system for individual geochronology labs

*MAAP = Multi-mission Algorithm and Analysis Platform

Session recording here.

Speakers

Chris Lynnes

EOSDIS System Architect for Data Use, NASA



Friday July 19, 2019 10:00am - 11:30am
Room 318

11:30am

Break
Friday July 19, 2019 11:30am - 11:45am
TCC

11:45am

Identifying Trusted Data Sources for Operational Decision Making & the Role of “Fitness for Use” as ORL Criteria
The Disaster Lifecycle cluster is hosting a breakout session to explore sources of trusted datasets from various agencies and what constitutes operational readiness for these data. A key issue is how ‘Fitness for Use’ criteria can apply across ORLs, the Operational Readiness Levels.

With FEMA’s encouragement, collaborators at the All Hazards Consortium are “operationalizing” ORLs for data-driven decision-making support to improve situational awareness in response to power outages, transportation, fuel and lodging after major disasters. An interesting development is the need to assign fixed ORLs to datasets, rather than determining the ORL value based on specific use cases. The GIS ORL team within the Sensitive Information Sharing Environment (SISE) committee of the Fleet Response Working Group (FRWG) recognizes that latency, resolution, and coverage features have a significant impact on dataset readiness for most critical infrastructure and many weather and other EO datasets. However, the inherent confusion created by changing a trusted dataset’s ORL assessment is a bigger problem for operator training and response efforts. Currently, most of their critical datasets are logistical in nature (what roads have been closed by state authorities, where can truck drivers get fuel/food/lodging, where are the authorized staging locations, etc.) and amenable to fixed ORL assignments.

The recent wildfires in CA and associated mud and debris flows are impacting lives and property. Earthquake exercises are leading to data needs by decision makers that can drive situational awareness and decision making criteria. For example, soil condition information in burn scar areas is critical for NWS forecasters to know so they can accurately identify rainfall thresholds for issuing flood warnings in burn scar areas.

Looking forward to successfully using more trusted EO data for disaster operations, we plan to hear about current and planned datasets for disaster response needs. We are also seeking ways to clarify fitness for use criteria (especially latency, resolution and coverage) for these datasets that otherwise would meet the current readiness criteria of ORL1.

Agenda

Trusting Data Sets, Needs & Use Cases [60 min]
  • [TRUST] Fleet Response WG approach to ORL - Chris McIntosh/ Bent Ear Solutions and All Hazards Consortium
  • [NEED] NWS Hydrology needs for EO data - Katherine Rowden/NOAA NWS Western Region Hydrology
  • [USE] Building a Community Based Housing Disaster Recovery GIS Application - Ashley Tseng/NCDP
  • [RISK] Talk by Maximilian Dixon, Hazards and Outreach Program Supervisor at Washington State Emergency Management Division

Federal Agency Panel [20 min Conversation]
What are the issues, and how can discovery of and access to federal datasets be improved across independent portals, à la Radiant.Earth?
  • NASA Maggi Glasscoe (Remote) - Disasters GIS Team Lead 
  • USGS Marie Peppler (Remote) - Emergency Management Coordinator, Acting
  • NOAA Kari Sheets (Remote) - NOAA/NWS Geospatial Data Lead
  • DHS / FEMA Chris Vaughan (Remote) - Geospatial Information Officer
Please note: Our panelists participated in our ESIP webinar on "Trusted Federal Data Sources for Hazard Response and Decision Making", which was recorded on June 24 to highlight current data services and upcoming plans. The YouTube video is available: https://youtu.be/ueUhJJIYCII

Session recording here.


Speakers

Karen Moe

NASA Goddard Emeritus


Friday July 19, 2019 11:45am - 1:15pm
Ballrm A

11:45am

Improving Airborne Data Discovery and Use
Join us to talk about Airborne Data issues!
 
Airborne earth observations are typically collected in field campaigns aimed at satellite data validation or intensive observation of a particular geophysical feature or physical relationship. This results in a wealth of coincident observations of Earth system processes from a wide variety of instruments. However, these heterogeneous data have diverse temporal and spatial scales, variables, and data formats and organization. Compared to satellite data, airborne data typically have a much smaller user community and consist of more data types containing fewer and smaller data files. In many cases, the users of airborne data may be limited to just those involved with the airborne campaign due to the complexity of the data and the difficulty visualizing and using the data. Individual data centers have developed their own particular way of serving the needs of a particular community of airborne data users effectively. In this session, we aim to bring together data providers and data users to gather effective ideas for broadening airborne data user communities beyond the campaign scientists. 

Session Notes Doc (please add your name to attendees list):  http://bit.ly/ESIPAirborneDataNotes 

Session Agenda:  
Introduction - 10 min
Speakers - 45 min
  • Heather Holbach (NOAA/AOML/HRD) - Hurricane airborne data use/issues
  • Tristan Goulden (NEON) - Ecological airborne data use/issues
  • Jeff Deems (NSIDC) - Snow and Ice airborne data use/issues
  • Helen Conover (UAH/ITSC) - Technology examples for airborne data exploration
Discussion - 30 min
Wrap up - 5 min

Goals of Session:
  • Build interest in this topic
  • Bring together data providers, data users, and data managers
  • Gather ideas for increasing airborne data use among more communities
  • Explore effort vs. outcome as a consideration in prioritizing airborne data improvements
  • Identify future webinar topics that would aid users/data managers

Session recording here.




Speakers

Heather Holbach

Assistant in Research, NOAA/AOML/HRD

Deborah Smith

Airborne Data Management Group, IMPACT/ UAH
I am the lead scientist of the IMPACT Airborne Data Management Group (ADMG). I work towards improving airborne data knowledge, use, access and value.

Amanda Leon

DAAC Manager, NASA National Snow and Ice Data Center DAAC

Helen Conover

ESDIS Standards Office, UAH/ITSC
Data stewardship, metadata, standards, lightning observations from space



Friday July 19, 2019 11:45am - 1:15pm
Ballrm BC
  • Area Drones, Science Communication
  • REMOTE PARTICIPATION LINK: https://global.gotomeeting.com/join/670434781
  • REMOTE PARTICIPATION PHONE #: United States: +1 (646) 749-3129; Australia: +61 2 8355 1050; France: +33 170 950 594; Norway: +47 21 93 37 51; Austria: +43 7 2081 5427; Germany: +49 692 5736 7317; Spain: +34 932 75 2004; Belgium: +32 28 93 7018; Ireland: +353 15 360 728; Sweden: +46 853 527 827; Canada: +1 (647) 497-9391; Italy: +39 0 230 57 81 42; Switzerland: +41 225 4599 78; Denmark: +45 32 72 03 82; Netherlands: +31 207 941 377; United Kingdom: +44 330 221 0088; Finland: +358 923 17 0568; New Zealand: +64 9 280 6302
  • REMOTE PARTICIPATION ACCESS CODE 670-434-781

11:45am

Beyond the cookbook: Connecting workflows, data and people for sustainable interdisciplinary Earth Sciences
This interactive workshop intends to add to the Throughput cookbook, by having participants work through and annotate workflows. Additionally, participants will use the API to look at the networks already built to ascertain what additional tools are needed to make sense of it all.


Session recording here.

Friday July 19, 2019 11:45am - 1:15pm
Ballrm D

11:45am

Data Management Training Clearinghouse Advisory Board Meeting (Continuing through lunch)
The Third Quarter 2019 Advisory Board for the ESIP-hosted and IMLS (Institute of Museum & Library Services) grant funded Data Management Training Clearinghouse would like to hold a face to face meeting at ESIP Summer. AB meetings are scheduled for each quarter of the year, and this would be the first time that AB members meet face to face (although remote participation would also be welcomed). Part of the impetus for the face to face meeting is to bring those board members to an ESIP meeting who have not yet had the opportunity to attend. Ideally, the meeting could be held for about 3 hours on the Friday after the ESIP meeting in hopes that more members could attend both.

Session recording here.

Moderators

Karl Benedict

ESIP President, ESIP
The ESIP President is a volunteer position, elected by the ESIP Community each year. The President works with the ESIP Staff on several of the presentations, speaker introductions, award ceremonies, and other speaking/participating aspects of ESIP meetings throughout the year.

Friday July 19, 2019 11:45am - 1:15pm
Room 316

11:45am

Semantic Technologies Committee Business Meeting
As the adoption of the FAIR principles accelerates, the use of semantic resources by research and operational communities worldwide is intensifying. There is thus a pressing need to coordinate our semantic development efforts in the Earth science community such that we can maintain diversity, reduce duplication of effort, and present a more unified and demonstrably interoperable interface to external stakeholders.

We will use this session to address the need above, and develop a plan to link our activities to emerging semantic frameworks in Earth observation, such as the UN Sustainable Development Goals and the Essential Variables for Climate, Oceans, Biodiversity, and Geodiversity.

This plan will feed forward into the planning of the 4th Geosemantics Symposium.

Session recording here.

Moderators

Pier Luigi Buttigieg

Data Scientist, Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung
Responsibilities: My work focuses on the combination of semantics, bioinformatics, and data analysis in aid of meaningfully mobilising ecological data and detecting structure in this complex and plastic context. See my ORCID for more: orcid.org/0000-0002-4366-3088 I apply...

Lewis J. McGibbney

Chair, ESIP Semantic Technologies Committee, NASA, JPL
My name is Lewis John McGibbney, I am currently a Data Scientist at the NASA Jet Propulsion Laboratory in Pasadena, California where I work in Computer Science and Data Intensive Applications. I enjoy floating up and down the tide of technologies @ The Apache Software Foundation having...

Friday July 19, 2019 11:45am - 1:15pm
Room 317

11:45am

New paradigms for alternative data packaging of geolocation information in EO satellite data
Earth observation (EO) satellite files can encode the geolocation of their observations in a variety of ways, often depending on the processing level of the data. Gridded rasters can be described by concise map projection information, while Level 1 and 2 data in native satellite coordinates require more detail. Often these files encode the pixel-level geolocation information as multidimensional variables internal to the file. In the past there have been example implementations of storing geolocation in an external file (early NASA MODIS) or subsampling geolocation information (early NASA SeaWiFS) that did not work out very well for various reasons. Storing the geolocation data or map projection references in each file (granule) has many advantages, the most important being that it plays "nicely" with tools, services, and software and promotes interoperability. However, the geolocation data for L1/L2 EO files are often the storage-heaviest individual component, because their precision requires at least float data types (the information cannot be elegantly "packed"), so it is worthwhile to revisit ideas and methodologies for reducing their footprint. How best could geolocation information be shared across different variables, different files from the same sensor, or even different sensors on the same platform? Furthermore, in the age of cloud and database tiled storage of satellite information, how is geolocation (and other) information best packaged and utilized to improve data access and processing? In this session we will look at this problem and its potential solution space via a number of presentations, historical lessons learned, and dynamic discussion.
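
As a rough back-of-the-envelope illustration of the storage claim in the abstract above, the sketch below compares uncompressed sizes. All numbers and names are assumptions chosen for this example, not figures from the session: a hypothetical swath of 2000 x 1000 pixels, latitude and longitude stored as 4-byte floats, and each science variable packed into 2-byte integers (for instance with a scale factor and offset). Compression is ignored.

```python
def geolocation_share(n_pixels, n_packed_vars, geo_bytes=4, packed_bytes=2):
    """Fraction of uncompressed per-pixel storage taken by latitude plus longitude.

    geo_bytes: bytes per latitude or longitude value (4 = 32-bit float, assumed).
    packed_bytes: bytes per packed science value (2 = 16-bit integer, assumed).
    """
    geo = 2 * n_pixels * geo_bytes                   # one latitude and one longitude per pixel
    data = n_packed_vars * n_pixels * packed_bytes   # packed science variables
    return geo / (geo + data)


if __name__ == "__main__":
    n_pixels = 2000 * 1000  # hypothetical swath dimensions
    for n_vars in (1, 4, 8):
        share = geolocation_share(n_pixels, n_vars)
        print(f"{n_vars} packed variable(s): geolocation is {share:.0%} of the total")
```

Under these assumptions, latitude and longitude together take as much space as four packed variables, so sharing or externalizing geolocation matters most for files that carry only a few variables.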


Presentations:
  • Introduction: Ed Armstrong and Aleksandar Jelenak
  • Kwo-Sen Kuo: STARE and data packaging
  • Robert Wolf: MODIS Experience with External Geolocation
  • Ed Armstrong: Reducing geolocation storage: a CF success story, and introducing a novel (and complicated) EUMETSAT L2 data model
  • Aleksandar Jelenak: HDF5 and external references
Find and access all slides: https://doi.org/10.6084/m9.figshare.8986325

Session recording here.

Speakers

Aleksandar Jelenak

The HDF Group

Ed Armstrong

Technologist, NASA JPL

Kwo-Sen Kuo

UMD/NASA Goddard/Bayesics LLC
Kwo-Sen Kuo is a “disruptive thinker” (commonly known as “boat-rocker” or “troublemaker”) because he likes to question the existing ways of doing things. Although he considers that to be completely rational, it is not always appreciated as so by others. His disruptiveness...


Friday July 19, 2019 11:45am - 1:15pm
Room 318

1:15pm

 

