A paper recently published in Future Generation Computer Systems (whose authors include CMCC researcher S. Fiore) presents and describes the Earth System Grid Federation (ESGF), an open infrastructure for access to distributed geospatial data.
ESGF is a multi-agency, international collaboration that aims to develop the software infrastructure needed to facilitate and empower the study of climate change on a global scale. In more detail, ESGF provides services for the discovery, access, analysis and visualization of model output, observations, and reanalysis data, and it is a successful example of integrating disparate open source technologies into a cohesive, wide functional system. Furthermore, ESGF operationally supports the CMIP5 globally distributed archive (3 PB).
The abstract of the paper:
The Earth System Grid Federation (ESGF) is a multi-agency, international collaboration that aims at developing the software infrastructure needed to facilitate and empower the study of climate change on a global scale. The ESGF’s architecture employs a system of geographically distributed peer nodes, which are independently administered yet united by the adoption of common federation protocols and application programming interfaces (APIs). The cornerstones of its interoperability are the peer-to-peer messaging that is continuously exchanged among all nodes in the federation; a shared architecture and API for search and discovery; and a security infrastructure based on industry standards (OpenID, SSL, GSI and SAML). The ESGF software stack integrates custom components (for data publishing, searching, user interface, security and messaging), developed collaboratively by the team, with popular application engines (Tomcat, Solr) available from the open source community. The full ESGF infrastructure has now been adopted by multiple Earth science projects and allows access to petabytes of geophysical data, including the entire Fifth Coupled Model Intercomparison Project (CMIP5) output used by the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) and a suite of satellite observations (obs4MIPs) and reanalysis data sets (ANA4MIPs). This paper presents ESGF as a successful example of integration of disparate open source technologies into a cohesive, wide functional system, and describes our experience in building and operating a distributed and federated infrastructure to serve the needs of the global climate science community.
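As an illustration of the shared search and discovery API mentioned in the abstract, the following is a minimal sketch of how a client might compose a query URL for an ESGF index node. It is not taken from the paper: the host name is a placeholder, the helper `build_search_url` is hypothetical, and the `/esg-search/search` path and the `type`, `format`, `limit` and `distrib` parameters are assumed to follow the publicly documented ESGF search REST interface. The sketch only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode

# Placeholder host (assumption): replace with the address of a real ESGF index node.
BASE_URL = "https://esgf-node.example.org/esg-search/search"


def build_search_url(base_url=BASE_URL, limit=10, distributed=True, **facets):
    """Compose a search URL from facet constraints (hypothetical helper).

    The parameter names are assumed to match the ESGF search REST interface:
    type, format, limit, distrib, plus facet constraints such as project.
    """
    params = {
        "type": "Dataset",                       # search datasets rather than files
        "format": "application/solr+json",       # ask for a JSON response
        "limit": limit,                          # maximum number of results returned
        "distrib": "true" if distributed else "false",  # query the whole federation or one node
    }
    params.update(facets)                        # e.g. project="CMIP5", variable="tas"
    return f"{base_url}?{urlencode(params)}"


url = build_search_url(project="CMIP5", variable="tas")
print(url)
```

With `distrib` set to true, a single request to one node is expected to return results from all peers, which is the practical consequence of the federated design the abstract describes.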
Read the full version of the paper:
Cinquini L., Crichton D., Mattmann C., Harney J., Shipman G., Wang F., Ananthakrishnan R., Miller N., Denvil S., Morgan M., Pobre Z., Bell G. M., Doutriaux C., Drach R., Williams D., Kershaw P., Pascoe S., Gonzalez E., Fiore S., Schweitzer R. (2014)
The Earth System Grid Federation: An open infrastructure for access to distributed geospatial data
Future Generation Computer Systems, Volume 36, July 2014, Pages 400-417, ISSN 0167-739X, http://dx.doi.org/10.1016/j.future.2013.07.002.