The Scientific Computing and Operations (SCO) Division carries out R&D activities in Computational Science applied to the climate change domain. In particular, it focuses on the optimization of numerical models on HPC architectures and on the management of large volumes of scientific data, with a view to exascale scenarios. The Division is also in charge of the management and operation of the CMCC Supercomputing Center. Its main activities are:
- the optimization and parallelization of the numerical models for climate change simulations (both climate and impact models);
- the design and implementation of open source solutions addressing efficient access, analysis and mining of scientific data in the climate change domain;
- the system management of the High Performance Computing facilities owned by the CMCC SuperComputing Center, along with research on Green Computing for an energy-driven, efficient use of computational resources.
HPC Systems Management (HSM)
This macro-activity mainly focuses on the system management of the High Performance Computing facilities owned by the CMCC SuperComputing Center. In addition to the ordinary computing and storage resource management activities, the team conducts research on Green Computing topics to develop optimal and innovative solutions for an energy-driven, efficient use of computational resources.
- High Performance Computing systems management
The main objective of this activity is to ensure the operation of the High Performance Computing systems available at the CMCC Supercomputing Center and the full life-cycle management of the data produced by CMCC researchers in national and international projects. The research group is also in charge of managing the network interconnection among the CMCC sites and of defining proper data center IT security policies.
- Development of advanced solutions for monitoring and resource usage optimization in High Performance Computing
This activity focuses on the development of advanced software tools based on power-saving techniques, starting from a critical analysis of the energy consumption of the CMCC computing resources. The main objective is the development of proactive and reactive monitoring systems enabling automatic node power management that minimizes energy costs (Green Computing). Developing this kind of software is rather complex, because actions that minimize energy costs must also take into account the multifaceted constraints on the systems and the QoS level required by CMCC users.
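The interplay between proactive and reactive actions can be illustrated with a minimal sketch of a node power policy. All names, metrics and thresholds here are hypothetical, chosen only to show the idea; a real system would read node load and idle times from the cluster's monitoring infrastructure.

```python
# Minimal sketch of a reactive/proactive power-management policy.
# Metrics, thresholds and action names are illustrative assumptions,
# not the actual CMCC monitoring system.

def power_action(load, idle_minutes, *, idle_threshold=0.05, grace_minutes=30):
    """Return a power action for one node given its CPU load (0..1)
    and how long it has been idle, in minutes."""
    if load < idle_threshold and idle_minutes >= grace_minutes:
        return "power_off"        # reactive: node has been idle long enough
    if load < idle_threshold:
        return "lower_frequency"  # proactive: scale down before powering off
    return "keep_on"

def plan(nodes):
    """Map each node name to an action; `nodes` is {name: (load, idle_minutes)}."""
    return {name: power_action(load, idle) for name, (load, idle) in nodes.items()}
```

For example, `plan({"n01": (0.01, 45), "n02": (0.02, 5), "n03": (0.9, 0)})` powers off the long-idle node, throttles the briefly idle one, and leaves the busy one alone. QoS constraints would enter as extra conditions guarding the `power_off` branch.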
Scientific Data Management (SDM)
The main goal of this macro-activity is the design and implementation of open source solutions addressing efficient access, analysis and mining of scientific data in the climate change domain. In particular, the activities focus on: the management of geographically distributed data in the context of the international ESGF (Earth System Grid Federation) and CMIP5 initiatives; the management of data banks applied to scientific data, to identify novel storage models and efficient parallel I/O libraries; and knowledge discovery from data, i.e. inferring new knowledge from large volumes of scientific data.
- Distributed scientific data management in large scale environments
The main goal of this activity is the transparent, secure and efficient distributed management of large volumes of data on a geographical scale. In particular, the activity focuses on the management of distributed data in the context of the ESGF (Earth System Grid Federation) initiative. In this regard, part of the work concerns relevant extensions to the data node component, related to the proactive distributed monitoring system foreseen in the ESGF P2P system architecture.
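The core of a proactive peer-monitoring scheme is flagging federation nodes whose heartbeats have gone stale. The sketch below is a minimal illustration under that assumption; the data layout and timeout are hypothetical and do not describe the actual ESGF protocol.

```python
# Minimal sketch of heartbeat-based peer monitoring in a P2P federation.
# The timestamp map and timeout are illustrative assumptions.

def stale_peers(last_heartbeat, now, timeout=120.0):
    """Return the peers whose last heartbeat is older than `timeout` seconds.

    `last_heartbeat` maps peer name -> timestamp of its last heartbeat;
    `now` is the current time in the same clock.
    """
    return sorted(p for p, t in last_heartbeat.items() if now - t > timeout)
```

A monitoring daemon would call this periodically and trigger notifications or failover for the peers returned.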
- Storage models and parallel I/O applied to scientific data
This activity aims at studying, analyzing and designing novel storage models for scientific data in the climate change context, with special regard to the NetCDF format. Through the definition of these new storage models for the management of climate change data (to be implemented on HPC platforms through the adoption of parallel paradigms such as MPI and OpenMP), the activity aims at optimizing data access efficiency (through new I/O primitives) as well as storage space allocation.
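A common family of storage models splits a multidimensional variable into fixed-shape chunks, in the spirit of NetCDF-4/HDF5 chunking; the chunks a read request touches determine its I/O cost. The sketch below shows the index arithmetic behind such a model; the (time, lat, lon) layout and chunk shapes are illustrative assumptions, not a specific library's API.

```python
# Sketch of chunked-storage index arithmetic for a multidimensional
# variable, e.g. shaped (time, lat, lon). Shapes are illustrative.
from itertools import product

def chunk_of(index, chunk_shape):
    """Return (chunk_id, offset_within_chunk) for one element index."""
    chunk_id = tuple(i // c for i, c in zip(index, chunk_shape))
    offset = tuple(i % c for i, c in zip(index, chunk_shape))
    return chunk_id, offset

def chunks_touched(start, stop, chunk_shape):
    """List the chunk ids intersected by the hyperslab [start, stop):
    the set of blocks a reader must fetch, which drives I/O efficiency."""
    ranges = [range(s // c, (e - 1) // c + 1)
              for s, e, c in zip(start, stop, chunk_shape)]
    return list(product(*ranges))
```

Choosing a chunk shape aligned with the dominant access pattern (e.g. whole time series at one grid point vs. whole spatial fields at one time step) minimizes the number of chunks each read touches.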
- Knowledge Discovery from Data (KDD) applied to scientific data
This activity aims at inferring knowledge from large volumes of data. Starting from the access primitives defined in the activity “Storage models and parallel I/O applied to scientific data”, it defines and implements new interfaces (“data operators”) to carry out analysis and mining on multidimensional data in the climate change context. The design of the KDD platform takes into account the evolution of the ESGF architecture, in order to study convergences and possible integrations.
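A typical data operator combines a subsetting primitive with a reduction over one dimension of a multidimensional array. The sketch below illustrates this for a (time, lat, lon) variable; the function names and the nested-list layout are illustrative assumptions, not the actual platform's interfaces.

```python
# Sketch of two composable "data operators" over a (time, lat, lon)
# array stored as nested lists. Names and layout are illustrative.

def subset(data, t_range, y_range, x_range):
    """Extract the hyperslab data[t0:t1, y0:y1, x0:x1]."""
    t0, t1 = t_range
    y0, y1 = y_range
    x0, x1 = x_range
    return [[row[x0:x1] for row in frame[y0:y1]] for frame in data[t0:t1]]

def time_mean(data):
    """Reduce along the time axis, returning a (lat, lon) mean field."""
    nt, ny, nx = len(data), len(data[0]), len(data[0][0])
    return [[sum(data[t][y][x] for t in range(nt)) / nt
             for x in range(nx)] for y in range(ny)]
```

Chaining them, `time_mean(subset(data, ...))` computes a climatological mean over a region, the kind of analysis such operators are meant to express without moving raw data to the client.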
High End Computing (HEC)
The objectives of this research unit are: (I) the optimization and parallelization of the numerical models for climate change simulations (both climate and impact models); (II) the evaluation of how emerging technologies impact the implementation and design of current climate models; (III) the innovation of the main computational kernels by means of a re-design at the numerical, algorithmic and software levels.
- Models optimization and parallelization
This activity supports all of the CMCC divisions and aims at applying “co-design” techniques to the optimization of the climate models used at CMCC. Moreover, the activity focuses on the analysis of the main computational kernels that characterize the models. The aim is the optimization of these kernels on hybrid “multi-core” architectures, taking into consideration the performance impact of optimizing compilers, numerical libraries and new parallel paradigms.
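One classic kernel optimization on multi-core, cache-based architectures is loop tiling (cache blocking), which restructures the loop nest so each data tile is reused while it is still in cache. The sketch below illustrates the technique on a matrix product; it is a pure-Python illustration chosen for readability, whereas production climate kernels would use Fortran/C with OpenMP or MPI.

```python
# Illustrative sketch of loop tiling (cache blocking) on an n-by-n
# matrix product; pure Python for clarity, not a production kernel.

def matmul_tiled(a, b, n, tile=2):
    """Multiply n-by-n matrices (lists of lists) tile by tile, so each
    tile of `b` is reused while it would still be hot in cache."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # compute the contribution of one (ii, kk, jj) tile triple
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c
```

The result is identical to the untiled triple loop; only the traversal order changes, which is what makes blocking attractive when porting kernels across cache hierarchies.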
- Re-design of computational kernels towards Exascale
This activity focuses on the study of the impact of exascale computing architectures on the numerical algorithms used in the main climate models studied at CMCC. In particular, the following aspects are analyzed from an exascale point of view: (I) the optimization of coupled Earth System Models; (II) the use of advanced parallel algorithmic structures to reduce the current “dynamical cores” communication overhead; (III) the optimal management of “multi-model” and “multi-emission” ensemble experiments; (IV) the analysis of the “super-parameterization” approaches used to solve CRMs (Cloud Resolving Models) at the highest resolutions; (V) the optimal management of the memory system and its hierarchies; (VI) the rationalization of I/O operations; (VII) fault-tolerance management; (VIII) the adoption of new parallel communication paradigms and parallel task synchronization mechanisms. This activity is carried out under the international initiatives IESP (International Exascale Software Project) and EESI (European Exascale Software Initiative).