Controlled vocabularies and ontologies are used to annotate datasets in the environmental sciences to improve data discoverability. However, they typically focus on data content and uses rather than on the location where data are collected. Although selecting terms for the theme of a dataset is usually straightforward, identifying terms for the location of data collection is more complicated. Research sites vary in both location and size. Some named locations may be subsumed by other named locations (e.g., a city in a state), and sometimes multiple names need to be specified for clarity (e.g., Springfield, IL, USA vs. Springfield, MO, USA vs. Springfield, ON, CA). Moreover, many geographic name databases work well for terrestrial locations but not for aquatic ones (e.g., coral reefs). The nearest named place from a gazetteer may be quite distant from a study site in the wilderness. Additionally, data for a given study may be collected in many distinct locations separated by gaps. For discoverability, is it preferable to identify a place as part of a study where many types of data are collected, or as a set of coordinates? In this working group, we will consider use cases from the perspective of environmental researchers to evaluate how well gazetteers and other resources, such as the NGA GEOnet Names Server (GNS), could enable researchers to discover data. Our aim is to provide recommendations for specifying location using geographic naming resources or, failing that, to better define how various resources might be evaluated for fitness.
Presenters: John Porter, Kristin Vanderbilt
Presentation Title: Location, Location, Location: Enabling Data Discovery by Place
Slides: https://doi.org/10.6084/m9.figshare.8980052