Nonstationary extreme value analysis for event attribution combining climate models and observations

We develop an extension of the statistical approach by Ribes et al. (2020), which was designed for Gaussian variables, to generalized extreme value (GEV) distributions. We fit nonstationary GEV distributions to extremely hot temperatures from an ensemble of Coupled Model Intercomparison Project phase 5 (CMIP5) models. In order to select a common statistical model, we discuss which GEV parameters have to be nonstationary and which do not. Our tests suggest that the location and scale parameters of the GEV distributions should be considered nonstationary. Then, a multimodel distribution is constructed and constrained by observations using a Bayesian method. The new method is applied to the July 2019 French heat wave. Our results show that both the probability and the intensity of that event have increased significantly in response to human influence. Remarkably, we find that the heat wave considered might not have been possible without climate change. Our results also suggest that combining model data with observations can improve the description of the distribution of hot temperatures.


Introduction
In the context of climate change, extreme events such as heat waves can happen more frequently due to the shift of temperatures to higher values. More generally, climate change signals alter the probability distribution of many climate variables, with impacts on the frequency of rare events. More frequent and/or more intense extreme events such as heat waves, extreme rainfall or storms have been shown to have critical impacts on human health, human activities and the broader environment. Over the last decade, there has been extensive research on the attribution of extreme weather and climate events to human influence. In order to model extremes of a physical variable in a statistical sense, generalized extreme value (GEV) distributions are commonly used. Attribution analysis requires deriving the probability of an event occurring in the factual world (our world) and in a counterfactual world (without the anthropogenic signal), which can be done from their respective GEV distributions (see e.g., Stott et al., 2004; Pall et al., 2011). The ratio between these two probabilities measures the human influence on the extreme event considered. Two approaches have been proposed to infer this ratio. The first uses a large ensemble of simulations to sample the factual and counterfactual worlds (see e.g., Wehner et al., 2018; Yiou and Déandréis, 2019). The second approach infers the trend from observations using a nonstationary GEV fit and then derives the probabilities of interest (see e.g., Rahmstorf and Coumou, 2011; Hansen et al., 2014; van Oldenborgh et al., 2015a; Philip et al., 2018; Otto et al., 2018). In the latter case, the counterfactual world is typically a time period in the past (e.g., the early 20th century), which potentially requires extrapolation of the trend if observations are not available.
Recently, Ribes et al. (2020) introduced a new approach in which transient climate change simulations are merged with observations to infer the desired probabilities, an idea also explored by Gabda et al. (2019). Using transient simulations enables the use of large multimodel ensembles, such as the Coupled Model Intercomparison Project phase 5 (CMIP5, 2011), and, as a consequence, a better sampling of model uncertainty. The authors first fit nonstationary statistical models to individual climate model outputs to estimate the changes occurring in these models. Then, a multimodel synthesis is made, which provides a prior for the real-world parameters. Lastly, they implement observational constraints to select values which are consistent with observations. Their entire procedure is based on nonstationary Gaussian distributions, which severely limits the range of application of the method, and relies on a covariate describing the response to external forcings, such as a regional mean temperature.
Here, we overcome two limitations of their approach. First, we extend the procedure to nonstationary GEV distributions. Second, while only the stationary parameters and the covariate were constrained by observations in Ribes et al. (2020), here we propose a more comprehensive Bayesian approach to constrain all parameters in a consistent way, including the nonstationary parameters. Our overall strategy is to construct a prior of the statistical parameters of interest, using an ensemble of climate model simulations, and then derive the posterior given available observations. As a main guideline, we propose studying the French heat wave of July 2019. This heat wave has already been investigated in a fast attribution study, summarized by Vautard et al. (2020), using some of the methods listed above.
In Sect. 2, we present the data set used and the event definition, the mathematical framework of attribution, and the Bayesian description of our study. In Sect. 3, we describe our statistical method and present our main results. A large part of our approach is directly taken from Ribes et al. (2020). We also investigate which nonstationary GEV model is the most appropriate. A discussion of the method and the results obtained for the 2019 heat wave is provided in Sect. 4.

Data set and event considered
To present our methodology, we propose implementing an attribution analysis of the extreme heat wave of July 2019. We focus on France (42-51° N, 5° W-10° E), and we consider the random variable of annual maxima of a 3 d average of the mean temperature anomaly with respect to the period 1961-1990, noted as T .
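As an illustration, the event variable T can be computed from daily data as the annual maximum of a 3 d moving average. The sketch below uses simulated daily anomalies; the series is a stand-in for the Météo-France thermal index, which is not reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical daily mean-temperature anomalies (K), 1947-2019.
rng = np.random.default_rng(0)
days = pd.date_range("1947-01-01", "2019-12-31", freq="D")
anom = pd.Series(rng.normal(0.0, 3.0, len(days)), index=days)

# 3 d moving average, then the annual maximum of that average:
# one realization of the random variable T per year.
rolling3 = anom.rolling(window=3, center=True).mean()
T_obs = rolling3.groupby(rolling3.index.year).max()
```

The resulting series has one value per year (73 values for 1947-2019), matching the block-maxima setting assumed by the GEV framework below.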
The time series of observations of T , denoted as T o (note the difference between the random variable, which is a mathematical abstraction, and the realization, which is the observations), comes from the Météo-France thermal index. This index is built as an average of observations from 30 ground stations, with data available between 1947 and 2019. The time series of T o is represented in Fig. 1a. The annual 2019 maximum occurred in July during the heat wave. The anomaly of this event is equal to 4.98 K, slightly higher than the 2003 maximum (4.93 K; second highest).
For climate models, we extract a simulated time series of T from the CMIP5 data set (CMIP5, 2011). We select all models for which simulations were produced between 1850 and 2100 and merge their historical runs with their representative concentration pathway 8.5 (RCP8.5; van Vuuren et al., 2011; Riahi et al., 2007) scenario simulations. All models selected are summarized in Table 1. In all, we have 26 climate models over France representing the random variable T .
We use the summer mean temperature anomaly with respect to the period 1961-1990 over Europe as a covariate, noted as X t , consistent with previous attribution studies (van Oldenborgh et al., 2015b; van der Wiel et al., 2017). For observations of the covariate, the HadCRUT4 data set (Jones et al., 1999, 2001; Kennedy et al., 2011; Morice et al., 2012) is used and noted as X o t . For each CMIP5 model included in our analysis, we also extract this covariate by taking the mean temperature anomaly over Europe.
To summarize, we have the following:
-The observation T o of T , given by the Météo-France thermal index.
-The observation X o t of X t , given by the HadCRUT4 data set.
-A total of 26 time series of T from 26 CMIP5 models.
-A total of 26 time series of X t coming from 26 CMIP5 models.

Goal of attribution
A goal, in attribution, is to find the probability of the realization of the 2019 event in the factual world (our world) and in the counterfactual world (without human influence). Noting P F t and P C t the probability distributions of T in the factual and counterfactual worlds, respectively, and s = 4.98 K the observed value of the 2019 event, we are looking for the following probabilities:

p F t := 1 − P F t (s) and p C t := 1 − P C t (s).

Observe the dependence in time of P F t and P C t . It is justified by the anthropogenic and natural forcings (such as volcanic eruptions), which alter the probability distribution over time. From the two probability distributions, we can also derive the intensity of the event. Noting Q F t and Q C t the quantile functions, the intensities I F t and I C t in the factual and counterfactual worlds are defined by:

I F t := Q F t (1 − p F 2019 ) and I C t := Q C t (1 − p F 2019 ).

Finally, the attribution is performed with the two following indicators, namely the probability ratio and the change in intensity, which measure the influence of anthropogenic forcing:

PR t := p F t /p C t and I t := I F t − I C t .

Consequently, an attribution study requires knowing the probability distributions P F t and P C t . The next sections describe the statistical model and how to use the climate models to infer these distributions with a Bayesian approach.
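Given fitted GEV distributions for the two worlds, these indicators can be computed directly. The sketch below uses scipy.stats.genextreme with illustrative parameter values (not the paper's estimates); note that scipy's shape convention is c = −ξ relative to the climate convention used here.

```python
from scipy.stats import genextreme

# Illustrative factual and counterfactual GEV parameters for one year t.
mu_F, sigma_F, xi = 3.0, 1.0, -0.2
mu_C, sigma_C = 1.5, 0.9

s = 4.98  # observed 2019 anomaly (K), the event threshold

# Exceedance probabilities p_t = P(T >= s) in each world.
p_F = genextreme.sf(s, c=-xi, loc=mu_F, scale=sigma_F)
p_C = genextreme.sf(s, c=-xi, loc=mu_C, scale=sigma_C)

PR = p_F / p_C if p_C > 0 else float("inf")  # probability ratio

# Intensities: the value with exceedance probability p_F in each world.
I_F = genextreme.isf(p_F, c=-xi, loc=mu_F, scale=sigma_F)
I_C = genextreme.isf(p_F, c=-xi, loc=mu_C, scale=sigma_C)
dI = I_F - I_C  # change in intensity
```

By construction I_F recovers the threshold s, and with a warmer factual world PR exceeds 1 and dI is positive.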

Statistical models
Classically, for the maximum temperature, the random variable T is assumed to follow a generalized extreme value distribution in which the parameters vary with a covariate X t (either linearly or through a link function) depending on time and describing the response to external forcings. Mathematically, we write T ∼ P F t = GEV(µ(t), σ (t), ξ (t)), with

µ(t) = µ 0 + µ 1 X t , σ (t) = σ 0 exp( (σ 1 /σ 0 ) X t ), ξ (t) = ξ 0 + ξ 1 X t . (1)

The parameters µ(t), σ (t) and ξ (t) are, respectively, the location (similar to the mean), scale (similar to the variance) and shape parameters. µ 0 , σ 0 and ξ 0 are called the stationary parameters, while µ 1 , σ 1 and ξ 1 are called the nonstationary parameters. The exponential link function is used for the scale parameter to ensure its positivity. We note this model M µ,σ,ξ . The inference of the probability distribution of θ := (µ 0 , µ 1 , σ 0 , σ 1 , ξ 0 , ξ 1 , X F t , X C t ) allows us to know the probability distributions P F t and P C t just by replacing the covariate X t with X F t and X C t , respectively, in Eq. (1). Note that we assume that the coefficients µ 0 , µ 1 , σ 0 , σ 1 , ξ 0 and ξ 1 are the same in the factual and counterfactual worlds; only the value of X t differs between these two worlds.
Our goal is to infer the probability distribution of θ and to constrain it, with Bayesian techniques, by the observations X o t and T o . Mathematically, we want to derive the posterior distribution of

θ | T o ∩ X o t . (2)

Our strategy is as follows. First, we estimate θ in each individual climate model considered. Second, we use the ensemble of climate models, in particular the spread in the estimated values, to derive a prior for the real-world value of θ . Third, we apply Bayesian techniques to derive the posterior of θ given observations X o t of X F t and observations T o of T .

The covariate in the factual and counterfactual worlds
Here, we estimate the covariate X t , i.e., the forced response of the European summer mean temperature, for each single climate model over the period 1850-2100. This approach implies some uncertainty in the value of X t . Indeed, the value of X t is noised by internal variability, and there is no link between the internal variability of X t and that of T . This is done following the Ribes et al. (2020) approach closely, and it provides estimates of X F t and X C t . Our goal is to find the forced response from X t , which requires a smoothing procedure. A generalized additive model (GAM; see e.g., Hastie and Tibshirani, 1990) is used to decompose the time series of X t as

X t = X 0 + X N t + X A t + ε, (3)

where

- X 0 is a constant,
- X N t is the response to natural forcings only (such as volcanoes) and is inferred from an energy balance model (Held et al., 2010; Geoffroy et al., 2013),
- X A t is the response to anthropogenic forcings; it is assumed to be a smooth function of time and is estimated using a smoothing procedure,
- ε is a Gaussian random term due to internal variability.
Each of these terms has been estimated, and the response of X t to external forcings in the factual and counterfactual worlds is easy to derive:

X F t = X 0 + X N t + X A t and X C t = X 0 + X N t . (4)

Furthermore, a covariance matrix is associated with the GAM decomposition in Eq. (3), describing the uncertainty of X 0 , X N t and X A t (i.e., the covariance matrix of X 0 , the linear regression coefficient involved in X N t and the spline coefficients involved in X A t ). We use it to draw 1000 perturbed realizations of the pair (X F t , X C t ) to describe their probability distribution.
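A much simplified stand-in for this decomposition can be sketched as follows: the natural response is regressed out, and the residual is smoothed to estimate the anthropogenic term. The inputs (a volcanic-like pulse for X N t and a sigmoid warming signal) are simulated here; the paper instead uses an energy balance model and a full GAM fit with its covariance matrix.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
years = np.arange(1850, 2101)

# Stand-ins for the inputs: a volcanic-like natural response and a
# smooth anthropogenic warming, plus internal-variability noise.
X_nat = 0.3 * np.exp(-((years - 1991) / 3.0) ** 2)
forced = 4.0 / (1.0 + np.exp(-(years - 2020) / 25.0))
X_t = forced + X_nat + rng.normal(0.0, 0.15, years.size)

# Step 1: regress the covariate on the natural response (coefficient beta).
beta = np.polyfit(X_nat, X_t - X_t.mean(), 1)[0]
# Step 2: smooth the residual to estimate X_0 + X_A(t).
resid = X_t - beta * X_nat
spline = UnivariateSpline(years, resid, s=years.size * 0.15 ** 2)
X0_plus_XA = spline(years)

X_F = X0_plus_XA + beta * X_nat          # factual: X_0 + X_N + X_A
# Counterfactual: X_0 + X_N, approximating X_0 by the 1850 value,
# where the anthropogenic response is assumed negligible.
X_C = X0_plus_XA[0] + beta * X_nat
```

The factual series carries the full warming trend, while the counterfactual series stays flat apart from the volcanic pulse, mirroring Eq. (4).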
We have represented, in Fig. S1, the two covariates and their respective uncertainties for three climate models. We can see different behaviors, with the model MIROC-ESM-CHEM increasing to +8 K at the end of the 21st century in the factual world, whereas the two others are between 4 and 5 K. Over the historical period, the CNRM model stays flat, whereas the two others increase or decrease. Anticipating a little, the synthesis of models shown in Fig. 4a and b depicts a large range of uncertainty, justifying taking the uncertainty in the covariate into account.
This procedure (decomposition and perturbed realizations) is applied to each of the 26 realizations of X t from our CMIP5 models. Thus, at this step, we have inferred 26 distributions of the pair (X F t , X C t ), which is a subdistribution of our target θ . The next step involves fitting GEV models to each climate model and selecting the most appropriate statistical model.

Selection of the GEV model
We start by defining four submodels of Eq. (1), namely M µ,σ , M µ,ξ , M µ and M 0 , which assume, respectively, that ξ 1 ≡ 0, σ 1 ≡ 0, σ 1 ≡ ξ 1 ≡ 0 and µ 1 ≡ σ 1 ≡ ξ 1 ≡ 0. We want to determine which GEV model out of M µ,σ,ξ , M µ,σ , M µ,ξ , M µ and M 0 is the most relevant. We propose using a likelihood ratio test (LRT) from theorem 2.7 of Coles et al. (2001). We start from two models, M a and M b , such that M a is a submodel of M b , and we note k as the number of supplementary parameters in M b with respect to M a . For example, if M a = M µ and M b = M µ,σ , then k = 1. The likelihood ratio test rejects M a in favor of M b at the significance level α if

2 ( l(M b ) − l(M a ) ) > c α ,

where l is the maximum of the log-likelihood function, and c α is the 1 − α quantile of the χ 2 k distribution with k degrees of freedom. We propose identifying the best common GEV model by increasing the complexity by 1 degree of freedom at a time, i.e., k ≡ 1. The pairs of models tested are represented as a tree in Fig. 2. Each edge of the tree is a test in which the left model represents M a , while the right model represents M b .

The parameters of all GEV models are fitted from the realizations of T for each CMIP5 model, with the associated covariate X F t from the GAM decomposition, using maximum likelihood estimation (MLE). More details about the fit are given in Appendix A. First, all models agree that µ must be nonstationary (the p values in the first line are very close to 0). This result is expected because the climate change signal in the RCP8.5 scenario is very strong and directly alters the trend. Results are less obvious for the other parameters. A total of seven CMIP5 models do not show evidence of any other form of nonstationarity, while 14 models suggest that either the scale or the shape is nonstationary. A total of eight (respectively, four) of these models suggest that the scale (respectively, shape) is nonstationary.
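A minimal version of this likelihood ratio test, comparing a stationary GEV against one with a nonstationary location (k = 1), might look as follows. The data are simulated, and the fit uses a generic optimizer rather than the procedure of Appendix A.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2, genextreme

# Simulated annual maxima with a linear trend in the location
# (a stand-in for one CMIP5 series fitted against its covariate X_t).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 4.0, 251)  # covariate values
T = genextreme.rvs(c=0.2, loc=1.0 + 0.8 * x, scale=1.0, random_state=rng)

def nll(params, nonstat_loc):
    """Negative GEV log-likelihood; scipy's shape is c = -xi."""
    if nonstat_loc:
        mu0, mu1, sigma, xi = params
        mu = mu0 + mu1 * x
    else:
        mu0, sigma, xi = params
        mu = mu0
    if sigma <= 0:
        return np.inf
    return -genextreme.logpdf(T, c=-xi, loc=mu, scale=sigma).sum()

# M_a: stationary location; M_b: nonstationary location (k = 1).
fit_a = minimize(nll, [T.mean(), T.std(), 0.0], args=(False,),
                 method="Nelder-Mead", options={"maxiter": 2000})
fit_b = minimize(nll, [T.mean(), 0.0, T.std(), 0.0], args=(True,),
                 method="Nelder-Mead", options={"maxiter": 2000})

# Reject M_a in favor of M_b if the deviance exceeds the 1 - alpha
# quantile of the chi-squared distribution with k degrees of freedom.
deviance = 2.0 * (fit_a.fun - fit_b.fun)
reject = deviance > chi2.ppf(0.95, df=1)
```

With a strong prescribed trend, the stationary model is rejected, which is the situation observed for the location parameter in all 26 CMIP5 models.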
For the two other models, it is not clear whether the scale and shape must be simultaneously nonstationary or whether there is a transfer of information between these two parameters.
Overall, these results suggest that we have to take into account nonstationarity in the trend and in the scale or the shape. Because the MLE can swap information between the scale and the shape, and a small error of estimation in the shape can have a big impact on the fit, we choose to select the model M µ,σ . So, in the rest of this paper, we assume that ξ 1 ≡ 0, and our nonstationary GEV model is written as

µ(t) = µ 0 + µ 1 X t , σ (t) = σ 0 exp( (σ 1 /σ 0 ) X t ), ξ (t) = ξ 0 . (5)

Consequently, θ is now written as

θ = (µ 0 , µ 1 , σ 0 , σ 1 , ξ 0 , X F t , X C t ).

We can now infer the probability distribution of θ for each CMIP5 model. A bootstrap procedure is used for each climate model in which the variable T is resampled 1000 times, corresponding to the 1000 estimates of X F t derived in the previous section. Thus, 1000 perturbed estimates of µ 0 , µ 1 , σ 0 , σ 1 and ξ 0 are obtained per CMIP5 model. The goodness of fit of this model is assessed using a Kolmogorov-Smirnov test (see Fig. S2 and Table S1 in the Supplement) and shows good agreement with the CMIP5 models. The best fit is shown in Fig. 1b for the CNRM-CM5 model (Fig. S3 in the Supplement shows the distribution of all fitted parameters and exhibits a large variability). First, in that case, the shape parameter is negative, equal to −0.23, implying the existence of an upper bound. Second, the upper bound follows the trend, so an extreme event occurring in the future might have been impossible in the past.
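The MLE fit of the M µ,σ model, together with a bootstrap of the parameters, can be sketched as below. The negative log-likelihood encodes the exponential link for the scale; the data, sample size and bootstrap scheme are illustrative (the paper pairs each refit with one of the 1000 perturbed covariates rather than resampling pairs).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

rng = np.random.default_rng(3)
n = 251
X = np.linspace(0.0, 4.0, n)  # one realization of the covariate X_F
T = genextreme.rvs(c=0.2, loc=1.0 + 0.8 * X, scale=np.exp(0.1 * X),
                   random_state=rng)

def nll(theta, X, T):
    """Negative log-likelihood of the M_{mu,sigma} model (xi constant)."""
    mu0, mu1, sigma0, sigma1, xi0 = theta
    if sigma0 <= 0:
        return np.inf
    mu = mu0 + mu1 * X
    sigma = sigma0 * np.exp(sigma1 / sigma0 * X)  # exponential link: sigma > 0
    return -genextreme.logpdf(T, c=-xi0, loc=mu, scale=sigma).sum()

theta0 = np.array([T.mean(), 0.0, T.std(), 0.0, 0.0])
best = minimize(nll, theta0, args=(X, T), method="Nelder-Mead",
                options={"maxiter": 5000}).x

# Bootstrap: refit on resampled (X, T) pairs to sample parameter uncertainty.
boot = []
for _ in range(50):
    idx = rng.integers(0, n, n)
    boot.append(minimize(nll, best, args=(X[idx], T[idx]),
                         method="Nelder-Mead").x)
boot = np.array(boot)  # draws of (mu0, mu1, sigma0, sigma1, xi0)
```

The spread of the bootstrap sample plays the role of the per-model parameter uncertainty entering the multimodel synthesis below.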
At this stage, we have 26 distributions of θ -one for each CMIP5 model -modeling the uncertainty of the covariates and the GEV parameters. Uncertainty of each distribution is quantified through a sample of 1000 bootstrapped parameters. We now focus on how to merge them into a single distribution representing multimodel uncertainty.

Deriving a prior from an ensemble of models
The goal is to synthesize our 26 distributions of θ into one single multimodel distribution. In a nutshell, we assume that each single-model θ is a realization of this multimodel distribution, which is a multivariate Gaussian law of dimension 251 + 251 + 5 = 507 (251 years each for the factual and counterfactual covariates and 5 GEV parameters). This distribution is therefore representative of the model spread, which is usually much larger than the uncertainty resulting from estimating θ in a single model. We subsequently use the "models are statistically indistinguishable from the truth" paradigm (Annan and Hargreaves, 2010) and assume that this multimodel distribution is a good prior for the real-world value. In practice, this synthesis is made following Ribes et al. (2017); details are given in Appendix B.
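Under the indistinguishability paradigm, a minimal version of the synthesis treats the per-model best estimates as draws from a common Gaussian and uses their empirical mean and covariance as the prior. The sketch below works in 5 dimensions with synthetic parameter vectors, not the paper's 507-dimensional θ or the exact Ribes et al. (2017) estimator.

```python
import numpy as np

# theta_hat[i]: illustrative best estimate of the 5-dimensional GEV
# parameter vector (mu0, mu1, sigma0, sigma1, xi0) from climate model i.
rng = np.random.default_rng(4)
n_models, dim = 26, 5
theta_hat = rng.normal([1.0, 0.8, 1.0, 0.1, -0.2], 0.2, size=(n_models, dim))

# "Models are statistically indistinguishable from the truth": each model
# is a draw from a common Gaussian, whose moments define the prior.
prior_mean = theta_hat.mean(axis=0)
prior_cov = np.cov(theta_hat, rowvar=False)

# Draws from the multimodel prior for the real-world theta.
draws = rng.multivariate_normal(prior_mean, prior_cov, size=1000)
```

The covariance of the model spread, not the per-model estimation uncertainty, dominates this prior, as noted in the text.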
The effect of the multimodel synthesis is shown in Fig. 4. Panels (a) and (b) depict the synthesis of the covariates X F t and X C t , respectively. The dotted blue line is the best estimate, and the light blue shading is the 95 % confidence interval. These two panels show a large uncertainty across models for the covariates; note that the uncertainty is almost null between 1961 and 1990 because the models are given in anomalies with respect to this period. Panel (c) shows the covariance matrix of the distribution of (σ 0 , ξ ) as ellipses of the 95 % confidence interval. The dotted red ellipses are the individual models, depicting many different behaviors. The solid red ellipse is the multimodel synthesis, which includes almost all the best estimates from the individual models.
So, with this approach, we obtain a good candidate for our Bayesian prior of θ , described in Eq. (2), inferred from the climate models.

Figure 4. (a) Covariate X F t in the factual world. The dotted blue line and the solid blue line are the best estimates of the multimodel synthesis before and after applying the Bayesian constraint, respectively; the light blue and dark blue shadings are the corresponding 95 % confidence intervals. (b) Covariate X C t in the counterfactual world, with the same conventions. (c) Ellipses are the 95 % confidence intervals given by the covariance matrices for the 26 CMIP5 models (dotted red) and the multimodel synthesis before (solid red) and after (blue) the Bayesian constraint. The crosses are the best estimates of the individual models.

Using observations to derive a posterior
To derive the posterior of θ | T o ∩ X o t , we start by applying Bayes' theorem:

P(θ | T o ∩ X o t ) ∝ P(T o | θ ∩ X o t ) P(θ | X o t ). (6)

In other words, to constrain θ by T o and X o t , we can, first, constrain θ by X o t and, second, constrain the new random variable (θ | X o t ) by T o . Note the two steps; the prior of θ derived in the previous section is used to infer (θ | X o t ). This new random variable is then used as a prior to derive the posterior of the constraint by T o .
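The first step, constraining a Gaussian θ by observations of one of its subvectors, follows the standard Gaussian conditioning (Kalman-type) formulas. The dimensions, covariances and observation operator below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
dim, n_obs = 6, 3

# Prior on theta: N(mean, cov), with cov positive definite.
mean = np.zeros(dim)
A = rng.normal(size=(dim, dim))
cov = A @ A.T + dim * np.eye(dim)

# H selects the observed components of theta; R is the
# observational-error covariance.
H = np.zeros((n_obs, dim))
H[np.arange(n_obs), np.arange(n_obs)] = 1.0
R = 0.1 * np.eye(n_obs)
x_obs = np.array([0.5, -0.2, 1.0])

# Gaussian conditioning: theta | x_obs is Gaussian with these moments.
S = H @ cov @ H.T + R
K = cov @ H.T @ np.linalg.inv(S)   # gain matrix
post_mean = mean + K @ (x_obs - H @ mean)
post_cov = cov - K @ H @ cov
```

Because the update subtracts a positive semidefinite term, every marginal variance shrinks, which is the uncertainty reduction visible in Fig. 4a and b.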
The posterior of (θ | X o t ) has already been computed and used by Ribes et al. (2020) under the name of the CX constraint. This distribution is easy to derive because both θ and X F t are assumed to follow Gaussian distributions, and X F t is a subvector of θ , so the Gaussian conditioning theorem (see e.g., Eaton, 1983) applies, and (θ | X o t ) also follows an explicit normal law. The effect of the constraint is represented in Fig. 4a and b. The blue line is the constrained best estimate, and the blue shading is the new confidence interval. We can see a significant decrease in the confidence interval. In the historical period, the covariate is closer to 0, except during volcanic episodes. For the projection period, the signal is increased by removing the lowest covariates.

The last step is to draw from the posterior of Eq. (6). To draw a fully constrained θ from the posterior, we use a Markov chain Monte Carlo (MCMC) method, namely the Metropolis-Hastings algorithm (Hastings, 1970). We propose, for each sampling, performing 10 000 iterations of the MCMC algorithm (see Appendix C), and we assume that the last 5000 samples are drawn from the posterior of θ | T o ∩ X o t . We uniformly draw one θ among these 5000 values and consider it as our new parameter. We use the median of these posterior drawings as a best estimate (e.g., for the final indicators I and PR). As an example of the effect of the constraint, we have added, in Fig. 4c, the covariance matrix of the pair of parameters (σ 0 , ξ ) in blue. We can see the significant decrease in the uncertainty and the shift of the parameters.
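A minimal random-walk Metropolis-Hastings sampler with the same burn-in convention (10 000 iterations, last 5000 kept) can be sketched as follows. The log-posterior here is a toy 2-d Gaussian standing in for the product of the multimodel prior and the GEV likelihood of T o.

```python
import numpy as np

rng = np.random.default_rng(6)

def log_post(theta):
    # Toy log-posterior: Gaussian centered on (1.0, -0.5).
    return -0.5 * np.sum((theta - np.array([1.0, -0.5])) ** 2)

theta = np.zeros(2)
lp = log_post(theta)
chain = []
for _ in range(10_000):
    prop = theta + rng.normal(0.0, 0.5, size=2)   # random-walk proposal
    lp_prop = log_post(prop)
    # Accept with probability min(1, exp(lp_prop - lp)).
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain.append(theta)

# Keep the last 5000 iterations as draws from the posterior.
posterior = np.array(chain[5000:])
```

The retained sample approximates the posterior; its median (or, in the paper, a uniform draw among the retained values) then serves as the constrained parameter set.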

Summary
Finally, we obtain a set of nonstationary parameters of a GEV distribution from a multimodel synthesis, fully constrained via Bayesian techniques. We have a synthetic description of the variable T as a GEV law, with best estimates of the parameters of that law and 1000 bootstrapped samples characterizing the uncertainty in these parameters. Furthermore, using the decomposition of the covariate X t , we can switch between the factual and counterfactual worlds.
Once the parameters of the GEV distributions have been inferred, we can compute the cumulative distribution function at any time t and derive the instantaneous probability of our event in the factual world, denoted as p F t , and in the counterfactual world, denoted as p C t . The usual attribution indicators, such as the probability (or risk) ratio PR t := p F t /p C t , the fraction of attributable risk FAR t := 1 − 1/PR t and also the change in intensity I t , can be derived consistently from our description of T .

About the statistical model
In the literature (see e.g., van Oldenborgh et al., 2015b; van der Wiel et al., 2017), it is assumed (i) that σ 1 ≡ ξ 1 ≡ 0 (forcings do not affect the variability and shape), (ii) that the covariate X t is given by the observations of the 4-year moving average of the global mean surface temperature or the European mean temperature, (iii) that the GEV distribution is directly fitted to the observations T o , and (iv) that the fitted GEV distribution describes only the factual world, while the counterfactual world is defined as a particular time in the past (e.g., the year 1900). In other words, P C t ≡ P F 1900 .
First, the main argument for assuming σ 1 = ξ 1 = 0 is the large uncertainty in the fit, but we have seen that the climate models can exhibit at least a σ 1 that is significantly different from 0. The uncertainty can be due to the small size of T o (73 realizations currently). Second, the uncertainty of the covariate X t is not taken into account, whereas we have seen that it is not negligible, including over the period when observations are available. Third, the definition of the counterfactual world depends on the date selected as being representative of preindustrial conditions (e.g., 1900). In particular, the counterfactual world can be affected by some anthropogenic forcings already present at that time. Our approach allows us to take this into account.

About the constraint
In this section, we discuss the effect of the Bayesian constraint on GEV model parameters. The multimodel parameters and their confidence intervals are summarized in Fig. 5.
For the location parameter µ(t) in the factual world (Fig. 5a), we can see the trend of the RCP8.5 simulations for each set of parameters. The effect of natural forcings is discernible over the period 1850-2000. The constraint tends to reduce the confidence interval, by between 40 % and 60 %. The best estimate of the change in µ(t) is slightly increased by the constraint, by about 0.5 K at the end of the 21st century. Note that, in the counterfactual world (Fig. 5d), the trend is flat during the 21st century due to the absence of prescribed natural forcing variations (e.g., volcanic eruptions) in RCP scenarios over that period.
For the scale parameter in the factual world (Fig. 5b), application of the constraint leads to a large reduction in uncertainty, by a factor of 2. A large trend in the scale becomes visible in the confidence interval over the 21st century, and the Bayesian constraint suggests a much higher upward trend in σ (t). Note that the trend can be negative or positive, depending on the climate model. No trend is visible in the counterfactual world (Fig. 5e).

Figure 5c shows the shape parameter, which is the same in the factual and the counterfactual worlds (the shape does not depend on the covariate). The shape, as estimated in the multimodel distribution, is always negative and is in the interval [−0.35, −0.15]. Applying the Bayesian constraint, the uncertainty range is reduced to [−0.3, −0.1], with a best estimate equal to −0.2. An upper bound always exists in this case.
We have represented, in Fig. 6, the parameters σ 0 , σ 1 and ξ for the prior (i.e., the multimodel), for the Bayesian constraint and for the observations T o . The parameters fitted to the observations are derived from the same GEV model, and the covariate used is a spline smoothing of X o t . The observed time series is bootstrapped 5000 times to sample the joint distribution of all three parameters. The Bayesian constraint is located at the intersection between the prior and the observations. The prior and the distribution derived from observations often exhibit some overlap. This suggests that there is no clear evidence of observations being inconsistent with model-simulated parameters, given the sampling uncertainty in observations, i.e., the difficulty of directly fitting a nonstationary GEV distribution to a very small sample (71 observations). We can also deduce that the prior has a relatively strong influence on the posterior; in particular, the range of σ 0 is narrowed.

Figure 5f shows the upper bound of the fitted GEV distribution in the factual world. This bound is given by µ(t) − σ (t)/ξ if the shape is negative and is infinite if the shape is positive. Here, the effects of the constraint are critical. In the unconstrained multimodel distribution, the estimated upper bound is sometimes lower than observed events; such events should be impossible, which suggests that this estimate is unreliable. After applying the constraint, the estimated upper bound is always higher than the observations, consistent with expectations. Furthermore, we notice that the observed 2019 value is above plausible values of the GEV upper bound over the 19th and 20th centuries (i.e., within the confidence range), suggesting that the 2019 event might have had a null probability of occurring at that time.
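The upper bound used in Fig. 5f can be computed directly from the fitted parameters; the values below are illustrative.

```python
import numpy as np

def gev_upper_bound(mu, sigma, xi):
    """Upper bound of GEV(mu, sigma, xi): mu - sigma/xi if xi < 0, else +inf."""
    mu, sigma, xi = map(np.asarray, (mu, sigma, xi))
    safe_xi = np.where(xi == 0, np.nan, xi)  # guard the xi = 0 (Gumbel) case
    return np.where(xi < 0, mu - sigma / safe_xi, np.inf)

# Illustrative constrained parameters for one year (not the paper's values):
# with xi = -0.2 the distribution is bounded above.
bound = gev_upper_bound(mu=3.0, sigma=1.0, xi=-0.2)
print(bound)  # 8.0: events above this value have probability zero
```

An observed event above this bound contradicts the fitted distribution, which is exactly the diagnostic used to flag the unconstrained multimodel estimate as unreliable.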

About the heat wave
In this section, we discuss the attribution results derived from our methodology, based on the two indicators defined above. The first is the probability ratio PR t := p F t /p C t , shown in Fig. 7a and b. The second is the change in intensity I t between the factual and counterfactual worlds, shown in Fig. 7c and d. The statistics for the years 2019 and 2040 are summarized in Table 2, where RT F t := 1/p F t is the factual world return time.
The probability ratio can be undetermined, e.g., if p F t = p C t = 0. In this case, if the proportion of undetermined values is greater than 0.05, we assume the confidence interval to be [0, ∞] (∞ included). Specifically, the upper (respectively, lower) bound of the confidence interval is computed by assuming that all undetermined values are equal to ∞ (respectively, 0), corresponding to the least confident case. The proportion of undetermined values is indicated in gray along the timeline below each panel. After the 1990s, the proportion of undetermined values is lower than 0.05. We note that, to a certain extent, undetermined values indicate that the odds of the event have not changed (i.e., similar to PR = 1) because the event is impossible in both the factual and the counterfactual worlds. The key point is that the heat wave is assessed as impossible in the factual and counterfactual worlds over the 19th and 20th centuries, with a probability between 5 % and 50 %, particularly when volcanic eruptions occur.
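The handling of undetermined values when building the PR confidence interval can be sketched as follows, with illustrative bootstrap probabilities.

```python
import numpy as np

# Bootstrap samples of the two exceedance probabilities (illustrative);
# pairs with p_F = p_C = 0 give an undetermined ratio 0/0.
p_F = np.array([0.0, 1e-3, 2e-3, 0.0, 5e-4])
p_C = np.array([0.0, 1e-5, 0.0, 0.0, 1e-5])

with np.errstate(divide="ignore", invalid="ignore"):
    PR = p_F / p_C  # inf where p_C = 0 < p_F, nan where 0/0

undetermined = np.isnan(PR)
if undetermined.mean() > 0.05:
    ci = (0.0, np.inf)  # least confident case
else:
    # Upper bound: undetermined -> inf; lower bound: undetermined -> 0.
    hi = np.quantile(np.where(undetermined, np.inf, PR), 0.975)
    lo = np.quantile(np.where(undetermined, 0.0, PR), 0.025)
    ci = (lo, hi)
```

Here 2 of the 5 samples are undetermined (40 %), so the interval collapses to [0, ∞], mirroring the pre-1990 behavior in Fig. 7a.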
Before 1990, the PR is not well defined, indicating no evidence of human influence on such an event. After 1990, the lower bound of the PR confidence interval increases quickly above 1. The probability of an event like the 2019 event is found to have increased by a factor in the range 19-∞ in 2019 and by a factor in the range 75-∞ in 2040. The value ∞ comes from realizations in which the event has a null probability in the counterfactual world. The estimated return time is in the range 13-153 years in 2019 and 3-20 years in 2040.
Estimated changes in intensity are consistent with the above picture. No change is detected before the year 1980, and a sharp increase is found after that date, with I t reaching +2 K in the year 2019 (95 % confidence interval of 1.5-2.7 K), which is consistent with the long-term climate change signal estimated over France in summer (currently nearing +2 K; see e.g., Ribes et al., 2016). According to the RCP8.5 scenario, I t will continue to increase, to +3.6 K in 2040 (2.6-4.6 K). We note that the upper bound in the year 2100 approaches +12 K, which corresponds to an annual maximum of mean temperature of around 39 °C (29 °C in 2019).

About the world weather attribution (WWA) paper
We finish with a comparison to the fast attribution paper by Vautard et al. (2019, hereafter V19), who investigated the same event. V19 considers a T time series extracted from E-OBS observations (Cornes et al., 2018) and fits a GEV model with a nonstationary location parameter (a smoothed global mean temperature is used as a covariate) and a constant scale and shape. The method used to build confidence intervals has not been published, so the discussion below focuses on the best estimates.
V19 reports a return period of around 134 years (i.e., p F 2019 = 7 × 10 −3 ), with a probability ratio at least equal to 10 and a change in intensity between 1.5 and 3 K from E-OBS. Results from CMIP5 models were rejected due to the scale parameter in models (= 1.78, with uncertainty range [1.74, 1.83]) being in sharp disagreement with that estimated from observations. Here, by making the best use of all available information, we find a return period between 13 and 153 years, a probability ratio at least equal to 19 and a change in intensity between 1.5 and 2.6 K. Attribution results, i.e., PR and I, agree reasonably well between the two studies. However, there is a substantial discrepancy in the estimated return time. We can propose some explanations for this discrepancy. (i) Our GEV model assumes a nonstationary scale parameter. In practice, we find a significant change in that parameter after the end of the 20th century, contributing to a shorter return time of the event. (ii) We use the entire historical record to apply our Bayesian constraint, while V19 excluded the observed 2019 value from their estimation procedure.
Another noticeable difference between the two studies involves the method of combining climate models and observations. In V19, the parameters of the GEV distribution are fitted from observations, and all models that disagree (e.g., a scale parameter that is too far from the observed one) are rejected. This assumes that the rejected climate models lack some physical process vital for the generation of extremes in the real world. We instead assume that the physical processes are present but potentially misrepresented in some way. After exploring the model uncertainty comprehensively, we find no inconsistency between models and observations. We therefore derive parameter ranges which are consistent with these two sources of information. In the end, model data do not play the same role in these two studies.

[Figure 7. (a) Probability ratio, PR_t := p_F,t / p_C,t, of the multimodel synthesis after the Bayesian constraint. The gray line is the proportion of undetermined values (i.e., 0/0). If the gray line is larger than 0.05 (the dotted gray line), the confidence interval is [0, ∞]. The red shading is the 95 % confidence interval of determined values (p_F,t > 0 or p_C,t > 0), and the red line is the best estimate. (b) Change in intensity, I_t, i.e., the difference between the value of an event with probability p_F,2019 in the factual and counterfactual worlds for each year. The red shading is the 95 % confidence interval, and the red line is the best estimate.]

Conclusions
In this paper, we propose an extension of the method of Ribes et al. (2020), which was designed for Gaussian variables, to the analysis of extreme events and, in particular, generalized extreme value distributions. This extension can be used as soon as the event is sufficiently extreme, in both the factual and counterfactual worlds, for GEV modeling to be applicable. We also provide new insights into observational constraints for attribution, and a new constraint based on Bayesian techniques is developed. This approach shows a good capacity to restrict the CMIP5 multimodel distribution to trajectories which are consistent with the observed time series. The method is applied to annual maxima of a 3 d moving average of temperature over France, with a focus on the July 2019 heat wave.
This method illustrates how CMIP5 models can be used to estimate the human influence on extremely hot events. A key point is the nonstationarity of the scale parameter, which is often assumed to be constant. Some CMIP5 models exhibit a strong change in this parameter, and for a small subset of these models the shape parameter could also be nonstationary. Over the observed period, such changes are limited and hidden by internal variability, so they cannot be ruled out by the observational constraint.
Potentially, our new method can be applied to any variable by considering its block maxima. Illustrating the potential of this technique on another type of extreme event, such as extreme rainfall, is an important area for future work. Examining the response of hot extremes to climate change in the new generation of climate models (CMIP6) would also be an attractive way of assessing the nonstationarity of the scale and shape parameters from a broader ensemble.
Appendix A: Fitting the nonstationary GEV distribution

This minimization cannot be solved explicitly, and we use the classic Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS; Nocedal and Wright, 2006). This is a gradient method (with an estimation of the Hessian matrix) and requires a starting point in order to converge to the solution. The gradient of the negative log-likelihood can be written explicitly, so we focus here on selecting the starting point. Classically, the L-moment estimators are used to initialize µ0, σ0 and ξ0, and it is assumed, for the starting point, that µ1 = σ1 = 0. This choice, especially if µ1 and σ1 are far from 0, can make the likelihood and/or its gradient undefined at the starting point, leading to a failed minimization. We propose the following modification of the initialization. We perform a quantile regression (Koenker and Bassett, 1978; Koenker and Hallock, 2001) of T on the covariate X_t for quantiles from 0.05 to 0.95 in steps of 0.01; the Frisch–Newton algorithm (Koenker and Ng, 2005) is used to solve the quantile regression problem. This yields 91 samples per unit time. For each time, we compute the L moments and, with a least squares regression, we obtain an estimate of the nonstationary L moments. Applying the equations of Hosking et al. (1985), we find a first approximation of µ, σ and ξ0. Finally, with a last least squares regression, we can find an estimate of θ and use it to initialize the BFGS algorithm.
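As an illustration, the sketch below (not the article's SDFC implementation) minimizes the negative log-likelihood of a nonstationary GEV with BFGS; for brevity it uses a simple moment-based starting point in place of the quantile-regression initialization described above, and all variable names and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def gev_nll(theta, y, x):
    """Negative log-likelihood of a nonstationary GEV with
    mu_t = mu0 + mu1*x_t, sigma_t = exp(s0 + s1*x_t) and constant shape xi."""
    mu0, mu1, s0, s1, xi = theta
    mu = mu0 + mu1 * x
    sigma = np.exp(s0 + s1 * x)          # log link keeps the scale positive
    z = (y - mu) / sigma
    if abs(xi) < 1e-8:                    # Gumbel limit as xi -> 0
        return np.sum(np.log(sigma) + z + np.exp(-z))
    t = 1.0 + xi * z
    if np.any(t <= 0.0):                  # outside the GEV support
        return 1e10                       # large finite penalty (BFGS-friendly)
    return np.sum(np.log(sigma) + (1.0 + 1.0 / xi) * np.log(t) + t ** (-1.0 / xi))

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)                    # covariate (e.g. smoothed GMST)
y = 25.0 + 2.0 * x + rng.gumbel(0.0, 1.0, n)    # synthetic annual maxima

# Simple moment-based starting point: stationary fit, zero slopes.
start = np.array([y.mean(), 0.0, np.log(y.std()), 0.0, 0.1])
res = minimize(gev_nll, start, args=(y, x), method="BFGS")
mu0, mu1, s0, s1, xi = res.x
```

A poor starting point can land outside the GEV support, which is exactly the failure mode discussed above; returning a large finite penalty rather than infinity keeps the finite-difference gradients usable.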

Appendix B: Multimodel synthesis
The main hypothesis of this multimodel synthesis is the paradigm that models are statistically indistinguishable from the truth. Following Ribes et al. (2017), we denote by θ* the multimodel synthesis and by µ_θ the truth. We decompose each model θ_i as

θ_i ∼ N(µ_θ + µ_i, Σ_i), with µ_i ∼ N(0, Σ_u),

where µ_θ + µ_i is the mean of model θ_i, Σ_u is the climate modeling uncertainty (assumed equal for each model) and Σ_i is the covariance of the internal variability of model θ_i. The paradigm allows us to assume that, marginally,

θ_i ∼ N(µ_θ, Σ_u + Σ_i).

Taking the multimodel mean θ̄ = (1/n) Σ_{i=1..n} θ_i, it follows that

θ̄ ∼ N(µ_θ, Σ_u/n + (1/n²) Σ_{i=1..n} Σ_i).
We then have to find an estimator of Σ_u. The difference between an individual model and the multimodel mean satisfies

Cov(θ_i − θ̄) = (1 − 1/n)² (Σ_u + Σ_i) + (1/n²) Σ_{j≠i} (Σ_u + Σ_j).

Consequently, the empirical covariance matrix Σ̂_e, given by the moment method from all samples of all models, has expectation

E(Σ̂_e) = Σ_u + (1/n) Σ_{i=1..n} Σ_i.

Thus, denoting by (·)_+ the positive part of a matrix, we finally obtain

Σ̂_u = ( Σ̂_e − (1/n) Σ_{i=1..n} Σ_i )_+.

https://doi.org/10.5194/ascmo-6-205-2020 Adv. Stat. Clim. Meteorol. Oceanogr., 6, 205–221, 2020
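The moment estimator above can be sketched numerically as follows; this is an illustrative reimplementation, not code from the NSSEA library, and the function names and synthetic inputs are ours.

```python
import numpy as np

def positive_part(mat):
    """The (.)_+ operation: project a symmetric matrix onto the
    positive semidefinite cone by zeroing its negative eigenvalues."""
    w, v = np.linalg.eigh(mat)
    return (v * np.maximum(w, 0.0)) @ v.T

def estimate_sigma_u(thetas, sigmas_i):
    """Moment estimator of the climate modeling uncertainty Sigma_u.
    thetas   : (n, p) array of fitted parameter vectors, one per model.
    sigmas_i : length-n list of (p, p) internal-variability covariances.
    Since E[Sigma_e] = Sigma_u + (1/n) * sum_i Sigma_i, we subtract the
    internal-variability term and take the positive part."""
    n = len(sigmas_i)
    emp = np.cov(np.asarray(thetas), rowvar=False)   # empirical covariance
    return positive_part(emp - sum(sigmas_i) / n)

# Synthetic check with n models and p parameters
rng = np.random.default_rng(1)
n, p = 20, 3
sigmas_i = [0.1 * np.eye(p) for _ in range(n)]       # internal variability
thetas = np.array([rng.multivariate_normal(np.zeros(p), np.eye(p) + s)
                   for s in sigmas_i])               # Sigma_u = identity here
sigma_u_hat = estimate_sigma_u(thetas, sigmas_i)
```

The positive-part projection guarantees that the estimated Σ̂_u is a valid covariance matrix even when sampling noise makes the raw difference indefinite.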

Appendix C: Metropolis-Hastings algorithm
In this section, we describe a Markov chain Monte Carlo method, namely the Metropolis–Hastings algorithm (Hastings, 1970). We have a random variable, X, with law P_X parameterized by a set of parameters θ. It is assumed that θ is also a random variable, with law P_θ, of which we have an estimate. Finally, we have an observation X_o of X. Our problem is to constrain the law of θ, given the observation X_o; mathematically, we want to draw samples from the posterior distribution P(θ | X_o). Using Bayes' theorem, we can write

P(θ | X_o) = P(X_o | θ) P(θ) / P(X_o).
Focusing on the right-hand side, the numerator is known; it involves only the laws P_X and P_θ, which are assumed known. The problem is the denominator, because the law P(X_o) is unknown. The Metropolis–Hastings algorithm starts from the following remark: for two realizations, θ_0 and θ_1, the ratio

ρ = [P(X_o | θ_1) P(θ_1)] / [P(X_o | θ_0) P(θ_0)]

can be evaluated, and it does not depend on P(X_o). Thus, starting from a random θ_0, a perturbation is applied to it, generating a θ_1. The ratio ρ is computed; if ρ > 1, then θ_1 is better than θ_0 with respect to the observation X_o, and it is accepted. Otherwise, a random number u is drawn from the uniform distribution on [0, 1]; if ρ > u, then θ_1 is accepted (and we accept a degradation), or else θ_0 is kept. This procedure is repeated. After a sufficiently large number of iterations, the sequence of θ_i generated is a Markov chain whose samples are drawn from the posterior distribution θ | X_o. Here, in Sect. 3.5, we discard the first 5000 iterations (burn-in) before drawing 5000 new samples from the posterior.
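The procedure just described can be sketched in a self-contained toy example with a Gaussian prior and likelihood, so that the posterior is known in closed form; the burn-in and sample sizes mirror Sect. 3.5, but the distributions and all names here are illustrative, not those used in the article.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy setup: prior theta ~ N(0, 2^2), likelihood X | theta ~ N(theta, 1),
# one observation x_obs. The posterior is then N(2.4, 0.8) analytically.
x_obs = 3.0

def log_post(theta):
    """Unnormalized log posterior: log P(x_obs | theta) + log P(theta).
    The unknown normalizer P(x_obs) cancels in the Metropolis ratio rho."""
    return -0.5 * (x_obs - theta) ** 2 - 0.5 * theta ** 2 / 4.0

def metropolis(n_burn=5000, n_keep=5000, step=1.0):
    theta = 0.0
    samples = []
    for i in range(n_burn + n_keep):
        prop = theta + step * rng.normal()   # random perturbation of theta
        # accept with probability min(1, rho); log scale avoids under/overflow
        if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
            theta = prop
        if i >= n_burn:                      # discard the burn-in iterations
            samples.append(theta)
    return np.array(samples)

samples = metropolis()
```

Comparing the empirical mean and variance of `samples` with the analytic posterior N(2.4, 0.8) is a quick sanity check that the chain has converged.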
Code and data availability. The CMIP5 database is freely available. Source codes realizing the GEV fit are freely available in the R/Python package SDFC under the CeCILL-C license (https://doi.org/10.5281/zenodo.4263886, Robin, 2020a). Source codes to reproduce the analysis of this article are packed into the library NSSEA under the CeCILL-C license (https://doi.org/10.5281/zenodo.4263904, Robin, 2020b).
Author contributions. YR performed the analyses. The experiments were codesigned by YR and AR. All the authors contributed to the writing of the paper.