Objectives

Develop new multi-objective metrics for quantitative assessment and analysis of the energy profile of algorithms

The target of this objective is to develop new energy metrics that quantify the total energy demanded by running a given simulation on a supercomputer. These metrics are a first step towards determining the minimum energy consumption attainable for a given simulation problem. They will pave the way for energy-aware simulations, in which energy is minimised in conjunction with time to solution and subject to other constraints such as accuracy and performance.

 

The main challenge is twofold: we need not only to identify the right setup of measuring sensors and hardware counters to gain detailed insight into the power consumption of the hardware, but also to establish the link between the different layers of the algorithm and the software stack underlying the considered application. This will be achieved by means of multi-objective metrics in which energy consumption, performance and accuracy are treated together, drawing on both mathematical models and measurements of the energy footprint. This analysis will lead to the definition of optimal scenarios in which energy consumption, treated as a constraint, is weighted appropriately.
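As a rough illustration of treating energy, performance and accuracy together, the following Python sketch compares a few candidate runs in two ways: by Pareto dominance and by a weighted sum normalised against a reference run. It is not the project's metric. The `Run` record, the function names, the weights and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One measured or modelled run of a simulation (hypothetical record)."""
    name: str
    energy_j: float  # total energy to solution, in joules
    time_s: float    # time to solution, in seconds
    error: float     # e.g. relative error of the result


def dominates(a: Run, b: Run) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    no_worse = (a.energy_j <= b.energy_j and a.time_s <= b.time_s
                and a.error <= b.error)
    better = (a.energy_j < b.energy_j or a.time_s < b.time_s
              or a.error < b.error)
    return no_worse and better


def pareto_front(runs):
    """Runs that are not dominated by any other run."""
    return [r for r in runs if not any(dominates(o, r) for o in runs)]


def weighted_score(run, ref, w_energy=1.0, w_time=1.0, w_error=1.0):
    """Weighted sum of objectives, each divided by the reference run's value.

    Lower is better. The weights are assumptions, not project values.
    """
    return (w_energy * run.energy_j / ref.energy_j
            + w_time * run.time_s / ref.time_s
            + w_error * run.error / ref.error)


# Made-up numbers, for illustration only.
runs = [
    Run("baseline", 1000.0, 100.0, 1e-6),
    Run("low_freq", 800.0, 130.0, 1e-6),
    Run("mixed_precision", 700.0, 90.0, 1e-4),
    Run("wasteful", 1200.0, 140.0, 1e-6),
]

for r in pareto_front(runs):
    print(r.name, round(weighted_score(r, runs[0], w_energy=2.0), 3))
```

The Pareto front keeps every run that represents a genuine trade-off, while the weighted score collapses the objectives into one number once a weighting has been agreed on. How energy should be weighted against time and accuracy is exactly the open question the objective describes.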

Develop advanced and detailed power consumption monitoring and profiling

This objective aims at developing a new generic methodology to quantify the power consumption of individual hardware components for the different software layers of HPC systems. The main challenge lies in the complex heterogeneity of future hardware platforms, which will aggregate a very large number of nodes (e.g. more than 100,000), each featuring different devices. Only detailed monitoring and profiling of the power consumption will make it possible to identify precisely where communication between devices is energy-intensive, for example through inefficient idle states and latencies when energy saving modes are applied. The outcome will be an inter-device communication mechanism aimed at minimising energy consumption.
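To make the idea of per-component power profiling concrete, here is a minimal Python sketch that turns sampled power readings into energy per component using the trapezoidal rule. The trace layout, the component names and the sample values are assumptions for illustration. The sketch does not read any real sensor or hardware counter.

```python
def energy_joules(samples):
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule.

    The samples are assumed to be sorted by timestamp.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def energy_by_component(trace):
    """Map each component name to its integrated energy in joules."""
    return {name: energy_joules(samples) for name, samples in trace.items()}


# Hypothetical one-second samples for two devices of a single node.
trace = {
    "cpu": [(0.0, 50.0), (1.0, 90.0), (2.0, 90.0), (3.0, 50.0)],
    "gpu": [(0.0, 20.0), (1.0, 20.0), (2.0, 150.0), (3.0, 150.0)],
}

print(energy_by_component(trace))
```

In practice the sampling rate and the alignment of timestamps across devices determine how well short-lived effects, such as transitions into and out of idle states, show up in the integrated figures.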

Develop new smart algorithms using energy-efficient software models

This objective aims at developing highly adaptable and versatile algorithms at different programming levels, from assembly to high-level programming languages, by developing new software models and programming methodologies that strive for minimal energy consumption on the given hardware. The main challenge is to develop highly flexible models and methodologies that can be easily ported to any architecture without being redesigned for each one. The flexibility of the developed algorithms is achieved by maintaining the modularity of the small kernels and subroutines (popularly known as algorithmic dwarfs), so that they can be combined efficiently in advanced libraries such as linear system solvers, preconditioners and adaptive mesh generators/refiners. This flexibility will ensure the viability of the approach, since adapting code separately for each architecture is time-intensive and, above all, cost-intensive. The resulting libraries and linear system solvers will be used in COSMO-ART as part of the proof of concept.

Develop a smart, power-aware scheduling technology for High Performance Clusters

This objective addresses the vital subject of smart and power-aware hardware setups for high performance clusters. This will be a critical issue for exascale computing, because heat dissipation becomes increasingly important as system complexity grows. The main challenge is to avoid hotspots and to maintain spatially homogeneous heat dissipation. The more homogeneous the heat distribution, the more efficiently the cooling systems can work. This in turn will contribute significantly to the foundation of hardware technology with minimised energy consumption. In addition, it will improve the reliability and lifetime of the individual devices and therefore reduce environmental impact by lowering the number of new acquisitions and the amount of “hardware waste”. To achieve this goal, new schemes will be developed to schedule tasks efficiently across the computing nodes of modern hardware platforms, focusing on avoiding concentrated data-intensive and flop-intensive areas and on reducing hotspot potential. The novelty of the approach is that the scalability properties of the considered application will be taken into account in the scheduling strategy.
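One simple way to picture hotspot avoidance is a greedy heuristic that always places the next most power-hungry task on the node with the lowest accumulated load. The Python sketch below shows this heuristic under strong simplifications: it ignores the physical placement of nodes and the scalability properties of the application, and it is not the scheduling scheme developed in the project. The function name `balance_heat`, the task names and the power estimates are hypothetical.

```python
import heapq


def balance_heat(task_power_w, n_nodes):
    """Greedily spread tasks so that per-node power draw stays balanced.

    task_power_w maps a task name to its estimated power draw in watts.
    Returns (assignment, loads), both keyed by node index.
    """
    # Min-heap of (accumulated power, node index).
    heap = [(0.0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}

    # Place the most power-hungry tasks first.
    ordered = sorted(task_power_w.items(), key=lambda kv: kv[1], reverse=True)
    for task, power in ordered:
        load, node = heapq.heappop(heap)  # currently coolest node
        assignment[node].append(task)
        heapq.heappush(heap, (load + power, node))

    loads = {node: sum(task_power_w[t] for t in tasks)
             for node, tasks in assignment.items()}
    return assignment, loads


# Hypothetical power estimates in watts.
tasks = {"a": 120.0, "b": 100.0, "c": 80.0, "d": 60.0, "e": 40.0}
assignment, loads = balance_heat(tasks, 2)
print(assignment, loads)
```

The heuristic is not guaranteed to find the most even distribution, but it keeps the spread between the hottest and coolest node small at low cost. A realistic scheduler would also have to account for node adjacency, cooling layout and communication patterns.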

Conduct a proof of concept using the weather forecast model COSMO-ART

The target of this objective is to provide a proof of concept for the methodologies and technologies developed in the Exa2Green project. The COSMO model will serve as a highly relevant example of a compute-intensive simulation whose energy profile is currently far from optimal. COSMO is a weather forecast model which has become a standard all over Europe. Besides the national weather services of Germany (Deutscher Wetterdienst, DWD), Switzerland (MeteoSchweiz, MCH), Italy (Ufficio Generale Spazio Aereo e Meteorologia, USAM), Greece (Hellenic National Meteorological Service, HNMS), Poland (Institute of Meteorology and Water Management, IMGW), Romania (National Meteorological Administration, NMA), and Russia (Federal Service for Hydrometeorology and Environmental Monitoring, RHM), a large number of other agencies, including military and research institutions, base their forecasts on COSMO. The main challenge of this objective stems from the aerosol and chemistry models used in simulations with COSMO-ART or COSMO-HAM. Given the widespread use of COSMO, improving its power and energy profile should be seen not only as a model of how the energy footprint of simulations can be optimised, but also as a milestone towards energy-efficient weather forecast simulation in Europe.