The multiple correlation and/or regression information that two competing forecast systems have on the same observations is decomposed into four components, adapting the method of multivariate information decomposition of

A classic method for determining the potential skill of a forecast system versus observations is the use of the Pearson correlation coefficient, which is directly related to information entropy. The comparison of competing forecast systems is another basic issue in the evaluation of forecasting systems, for example, for quality assurance purposes. For the Pearson correlation metric, this problem is often addressed using correlation differences. However, a central problem in the comparison of different forecasting systems is the strong collinearity of the two forecasting systems because, by construction, both systems aim at a reproduction of the same observations.

Another way to overcome this problem is the use of partial correlations, which at the same time offers new views for forecast evaluation. The use of partial correlations has the advantage of well-known methods for inference; for example, hypothesis testing. Partial correlations can be tested like conventional correlations after reducing the degrees of freedom accordingly
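As a minimal sketch of such a test (the function names and sample numbers are illustrative assumptions, not values from this study), a first-order partial correlation can be computed from the three pairwise Pearson correlations and converted to a Student t statistic with n − 3 instead of n − 2 degrees of freedom. The sketch assumes independent samples; serially correlated series would call for a smaller effective sample size.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y given z, from pairwise Pearson correlations."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))


def t_statistic(r_partial, n, n_given=1):
    """Student t statistic for a (partial) correlation.

    One degree of freedom is removed for each conditioning variable,
    so a first-order partial correlation uses n - 3.
    """
    dof = n - 2 - n_given
    t = r_partial * math.sqrt(dof / (1.0 - r_partial**2))
    return t, dof


# Hypothetical pairwise correlations: observations (1), forecasts (2) and (3).
r12, r13, r23 = 0.6, 0.5, 0.4
r13_2 = partial_corr(r13, r12, r23)      # forecast 3 vs. observations, given forecast 2
t_value, dof = t_statistic(r13_2, n=52)  # 52 is an assumed sample size
```

The statistic is then compared with the quantiles of the Student t distribution for the returned degrees of freedom.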

The method of partial correlations can be related to the partial information decomposition (PID) which has been proposed by

Here a partial correlation decomposition (PCD) is applied under the assumption of continuous Gaussian distributed variables for which the mutual information is directly related to multiple correlation. Multiple correlation of one target variable and multiple predictors is the classic Pearson correlation between a variable and its regression on the predictors
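The following sketch illustrates this definition on synthetic data (the series, coefficients, and function names are hypothetical). It computes the multiple correlation as the Pearson correlation between the target and its least-squares regression on two collinear predictors, compares it with the closed form based on pairwise correlations, and converts it to Gaussian mutual information via I = −½ ln(1 − R²).

```python
import numpy as np


def multiple_correlation(y, X):
    """Pearson correlation between y and its least-squares regression on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.corrcoef(y, A @ coef)[0, 1]


def gaussian_mutual_information(R):
    """Mutual information (in nats) between a Gaussian target and its predictors."""
    return -0.5 * np.log(1.0 - R**2)


# Synthetic example: two collinear "forecasts" of the same "observations".
rng = np.random.default_rng(0)
n = 500
f2 = rng.standard_normal(n)
f3 = 0.7 * f2 + 0.3 * rng.standard_normal(n)
obs = 0.5 * f2 + 0.2 * f3 + rng.standard_normal(n)

R = multiple_correlation(obs, np.column_stack([f2, f3]))

# Closed form for two predictors, using only the pairwise correlations.
C = np.corrcoef(np.vstack([obs, f2, f3]))
r12, r13, r23 = C[0, 1], C[0, 2], C[1, 2]
R2_closed = (r12**2 + r13**2 - 2.0 * r12 * r13 * r23) / (1.0 - r23**2)

info = gaussian_mutual_information(R)
```

Both routes give the same squared multiple correlation, which is never smaller than the larger of the two squared single correlations.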

The redundant or shared information between the forecasting systems on the observations is named target redundance here. The unique information is the added value that one forecast provides when given the other. The newly defined component is non-target redundance, which is that part of the shared variance between the models not verified by observations. The first three components can be depicted as four maps of explained variances on the observations, whereby the added values are separately determined for each of the models. The non-target redundance is based on the same partial correlation for both models but is mapped using the individual unexplained variance of the respective model, given the observations. Details of the derivation and definitions are outlined in Sects. 2 and 3.
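One plausible reading of these verbal definitions can be sketched as follows. This is an assumption for illustration only: the function and key names are hypothetical, observations are indexed 1 and the two forecasts 2 and 3, and the target redundance is taken as the remainder of the squared multiple correlation after subtracting both added values. The authoritative definitions are those of Sects. 2 and 3.

```python
import math


def pcd_terms(r12, r13, r23):
    """Explained-variance terms from pairwise correlations (1 = obs, 2 and 3 = forecasts)."""
    # Squared multiple correlation of the observations on both forecasts.
    R2 = (r12**2 + r13**2 - 2.0 * r12 * r13 * r23) / (1.0 - r23**2)

    # Added value of one forecast given the other
    # (squared partial correlation times the variance left unexplained by the other).
    added_3_given_2 = R2 - r12**2
    added_2_given_3 = R2 - r13**2

    # Assumed here: shared part = total minus both added values.
    target_redundance = R2 - added_2_given_3 - added_3_given_2

    # Partial correlation between the forecasts, given the observations.
    r23_1 = (r23 - r12 * r13) / math.sqrt((1.0 - r12**2) * (1.0 - r13**2))

    # Same partial correlation, scaled by each model's variance not explained by the observations.
    nontarget_2 = r23_1**2 * (1.0 - r12**2)
    nontarget_3 = r23_1**2 * (1.0 - r13**2)

    return {
        "R2": R2,
        "added_2_given_3": added_2_given_3,
        "added_3_given_2": added_3_given_2,
        "target_redundance": target_redundance,
        "nontarget_redundance_2": nontarget_2,
        "nontarget_redundance_3": nontarget_3,
    }


terms = pcd_terms(0.6, 0.5, 0.4)  # hypothetical correlations
```

Under this reading the two added values and the target redundance sum to the squared multiple correlation, which keeps all terms in the same units of explained variance.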

We will present results of the PCD for two examples covering different timescales. The first is an application to decadal climate forecasts using two versions of the ensemble prediction system developed within the German decadal climate prediction project MiKlip

The comparison of the pros and cons of two different forecast systems can be done within the framework of correlation. The starting point here is the multiple correlation coefficient (

By indexing the observations with 1 and the two forecast systems with 2 and 3, the multiple linear regression of the observations on the two forecasts is given by the following:
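One standard way to write this regression, assuming all three series are standardized to zero mean and unit variance, uses textbook notation that is an assumption here (the symbols $x_i$, $b$, $\varepsilon$, and $r_{ij}$ for the pairwise Pearson correlations):

```latex
\begin{align}
  x_1 &= b_{12.3}\, x_2 + b_{13.2}\, x_3 + \varepsilon, \\
  b_{12.3} &= \frac{r_{12} - r_{13}\, r_{23}}{1 - r_{23}^2}, \qquad
  b_{13.2} = \frac{r_{13} - r_{12}\, r_{23}}{1 - r_{23}^2}, \\
  R_{1.23}^2 &= b_{12.3}\, r_{12} + b_{13.2}\, r_{13}
            = \frac{r_{12}^2 + r_{13}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^2}.
\end{align}
```

In this form the squared multiple correlation $R_{1.23}^2$ is the fraction of observed variance explained jointly by both forecasts.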

In the case of Gaussian random observations and simulations, it can be shown that the multiple correlation is directly linked to information entropy (

The partial (conditional) correlation

From Eqs. (

Besides the target redundance term, there is a portion of the variance of the competing forecasting systems which is potentially not represented in the observations. This portion we define as non-target redundance. It goes beyond the PID of

The variables of the PCD form a uniform tool that can be used in various combinations. Some are well established, while others are new (Table

PCD terms.

The concept of partial correlation is also useful for distinguishing actual model forecasts from simplified forecasts such as damped persistence and/or autoregressive models of the observations.
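A damped persistence reference of this kind can be sketched as below. This is an illustrative assumption rather than the procedure used in this study: the function name is hypothetical, the test series is a synthetic AR(1) process, and the mean and lag autocorrelation are estimated in-sample from the full series.

```python
import numpy as np


def damped_persistence(obs, lag):
    """Forecast obs[t] from obs[t - lag], damped by the lag autocorrelation.

    Returns the forecasts, the verifying observations aligned with them,
    and the estimated lag autocorrelation.
    """
    anom = obs - obs.mean()
    rho = np.corrcoef(anom[:-lag], anom[lag:])[0, 1]
    forecast = obs.mean() + rho * anom[:-lag]
    return forecast, obs[lag:], rho


# Synthetic AR(1) series standing in for an observed time series.
rng = np.random.default_rng(1)
n, phi = 2000, 0.8
series = np.zeros(n)
for t in range(1, n):
    series[t] = phi * series[t - 1] + rng.standard_normal()

forecast, verification, rho = damped_persistence(series, lag=1)
```

The resulting series can take the place of one of the two forecast systems, so that the partial correlation of a dynamic forecast given damped persistence measures what the model adds beyond the memory of the observations.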

In the following, the decomposition of multiple correlation analysis using partial correlations is applied to the verification of a multi-annual ensemble mean 2 m temperature time series (1961–2013) from two decadal ensemble prediction systems of the German MiKlip decadal forecast project

First, annual mean 2 m temperatures for the period 1962–2013 averaged over lead years 2 to 5 are examined. They are taken from initialized retrospective decadal climate hindcasts as part of the MiKlip decadal climate prediction system

All decadal forecasts start on 1 November each year and consider the first 2 months (November and December) as the spin-up phase. Thus, lead year 2 actually starts after 14 months of integration time. The correlation analysis is done on the ensemble mean of 10 available ensemble members in each forecast system, beginning in January in the year following the initialization. The forecasts are evaluated using the HadCRUT4

The second example compares daily mean ECMWF forecasts at lead

Within the MiKlip project for decadal hindcasts, the EnKF model version, based on the ensemble Kalman initialization technique for ocean temperature initialization, has been tested and compared to nudging methods to initialize the model ensemble for ocean and atmosphere reanalyses

The test of the null hypothesis of vanishing added values has been determined using the Student

The non-target redundance of EnKF and PREOP is shown in Fig.

Squared multiple correlation coefficients

Lead day 4 dynamic model forecasts and lag day

The multiple correlations for the winter of 2016–2017 are given in Fig.

Generally similar results are found for the other two winters. We want, however, to analyse the interannual differences. Figure

Next, we look for the general tendency for compensation between changes in the lag day

Squared multiple correlations between ECMWF reanalyses and ECMWF lead day 4 dynamic forecasts and damped lag day

Scatter diagrams in which the target redundance differences between the winters of 2016–2017 and 2014–2015 of 4 d dynamic and damped lag day

In this paper, we proposed an as yet unexplored method for evaluating and comparing two competing forecasting systems against the respective observations on the basis of a partial decomposition of correlation and/or regression. Apart from the classic views provided by regressions, correlations, and differences, two new variates are presented. One is the shared variance between both models and the observations, and the other is the shared variance between the models that is not verified by the observations. These classic and new variates are directly related and comparable as they are based on the same units. The PCD is an application of the PID proposed by

Multiple correlation is directly related to mutual information for Gaussian distributed variables and replaces it in our PCD analysis. Thus, the multiple correlation provides the total information of the forecasts on the target and/or observation. The added values are determined from the partial correlations. The multiple correlation links to the multiple regression method used by

This PCD toolkit has been applied to decadal hindcasts of the MiKlip ensemble prediction system and, similarly, to synoptic daily ECMWF forecasts of 2 m temperature. For the MiKlip forecasts averaged over lead years 2 to 5, the analysis shows the added values of the EnKF compared to the PREOP forecasts, especially over the North Atlantic and in regions influenced by the El Niño–Southern Oscillation (ENSO) over the Pacific Ocean. The target redundance is mainly equal to the information of the PREOP model because PREOP provides only a small added value beyond the EnKF version. The map of non-target redundant information shows large values, especially in the subtropical Pacific Ocean and the southern tropical Atlantic region; such a representation is new.

On the synoptic timescale, the PCD is used to show the benefit of 4 d dynamic forecasts of 2 m temperature with respect to lagged

A non-local multivariate extension of the PCD can be done using basis functions, such as empirical orthogonal functions in the spatial domain. The determination of the redundance terms in the realm of partial correlations becomes quite difficult if more than two predictors are to be compared. Here, too, the problem might be solved by using empirical orthogonal functions in a first step, this time applied to the estimated correlation matrix among the predictors. If the common variance is large enough, it should be reflected in one of the estimated modes. The unique information of single forecasts or groups of forecasts might be reflected by other modes if they can be estimated with enough confidence.
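The idea can be sketched as follows, under stated assumptions: the function name is hypothetical, the three predictor series are synthetic and share one common signal, and the modes are obtained by a plain eigen-decomposition of the predictor correlation matrix. It is meant as an illustration of the proposal, not as a tested extension of the PCD.

```python
import numpy as np


def predictor_modes(F):
    """Eigen-decomposition of the correlation matrix among predictors.

    F has shape (n_times, n_forecasts). Returns the explained-variance
    fractions (descending), the eigenvectors as columns, and the
    corresponding principal-component time series.
    """
    C = np.corrcoef(F, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)          # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z = (F - F.mean(axis=0)) / F.std(axis=0, ddof=1)
    pcs = Z @ eigvecs
    return eigvals / eigvals.sum(), eigvecs, pcs


# Three synthetic forecasts that share one common signal plus individual noise.
rng = np.random.default_rng(2)
n = 1000
common = rng.standard_normal(n)
F = np.column_stack([common + 0.5 * rng.standard_normal(n) for _ in range(3)])

fractions, vectors, pcs = predictor_modes(F)
```

In this synthetic case the leading mode loads on all three forecasts with the same sign and carries the common variance, while the remaining modes describe differences between the forecasts. Each principal component could then be correlated with the observations in place of the individual predictors.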

The analysis has been done with a bash script using Climate Data Operators (CDO) and NetCDF data input. The data analysis tools are available at

The MiKlip data used for this paper are from the German Federal Ministry of Education and Research (BMBF)-funded project MiKlip. Model data of the described predictions are made available at Climate and Environmental Retrieval and Archive (CERA), a long-term data archive at the German Climate Computing Centre (DKRZ). The ECMWF data are freely accessible through the ECMWF TIGGE portal, after registration (

RG-H provided the regression decomposition and the data processing, SB and JB developed the EnKF forecast system, and all co-authors formulated the text.

The authors declare that they have no conflict of interest.

We would like to thank ECMWF for providing the daily forecasts in the TIGGE databank. Special thanks are due to the anonymous reviewers, who gave very helpful and detailed comments.

Rita Glowienka-Hense has been funded by the Federal Ministry of Education and Research in Germany (BMBF) through the research programme MiKlip (grant no. FKZ 01 LP 1520G) and through the DeepRain research programme (grant no. IS18047A-E). Sebastian Brune has also been funded by the Federal Ministry of Education and Research in Germany (BMBF) under the MiKlip project (grant no. FKZ 01 LP 1516A). Johanna Baehr has been funded by the German Research Foundation (Deutsche Forschungsgemeinschaft – DFG) under Germany's Excellence Strategy – EXC 2037 and the “CLICCS – Climate, Climatic Change, and Society” project (grant no. 390683824) at the Center for Earth System Research and Sustainability (CEN) of the Universität Hamburg.

This paper was edited by Chris Forest and reviewed by three anonymous referees.