
Accurately simulating the climate system has been both a challenge and an aspiration since the advent of numerical modeling. Here, we use the spatial pattern of 2 m surface temperature to discuss the evolution of model performance from the beginning of the Coupled Model Intercomparison Project (CMIP) in the 1990s to today's kilometer-scale models. We find that the kilometer-scale IFS-FESOM model outperforms even the best CMIP6 models, while other kilometer-scale models still have considerable deficits. These results demonstrate the potential of kilometer-scale models to surpass established CMIP models despite having undergone only limited tuning, while also highlighting the considerable effort still needed to realize their full potential. We place this performance in the context of 10 observation-based references and 150 coupled global climate models developed over the past three decades, and argue that increasing resolution might be necessary, but is not sufficient, for improving model skill.




