Changes in extreme weather may produce some of the largest societal impacts
of anthropogenic climate change. However, it is intrinsically difficult to
estimate changes in extreme events from the short observational record. In
this work we use millennial runs from the Community Climate System Model
version 3 (CCSM3) in equilibrated pre-industrial and possible future (700 and
1400 ppm

As the Earth's mean climate changes under increased
concentrations of human-emitted greenhouse gases, the intensity and frequency
of extreme weather conditions may change as well (

Over the past 2 decades, numerous studies have sought to identify changes in
temperature extremes both in observations

Analysis of changes in extremes is complicated by the fact that there is no
unique definition of “extremes”. One common definition is as exceedance of
certain defined thresholds, with threshold values often defined based on past
climate distributions

An alternative and potentially more useful definition of “extremes” is
based on occurrences in the far tail of the distribution of the quantity of
interest. Extreme value theory (EVT)

The major limitation of the block extremes approach is that it discards all
but the most extreme value within each block. In contrast, the peak over
threshold (POT) method uses all data above a specified threshold. The POT
method models exceedances over a (sufficiently high/low) threshold as a
generalized Pareto distribution (GPD)

Wintertime (December–February) temperatures from a location in
Idaho from 1000-year CCSM3 model runs under pre-industrial and 700 ppm

An important advantage of using any EVT-based approach for the study of
climate extremes is that it allows estimation of the probability of events
that are rarer than the “moderate extremes” analyzed in threshold
exceedance studies. The threshold exceedance study of

In the past 2 decades, EVT has been applied in numerous studies to climate
variables, generally temperature and precipitation

Prior GEV studies are limited to some extent by two factors: the length and
non-stationarity of the analyzed time series. For observational studies,
existing data records are relatively short (often only several decades).
Model runs may be longer, but the model runs used in these studies typically
extend only around 100 years from the present. For some models and scenarios,
“initial condition ensembles” are available, i.e., multiple runs of the
same model and forcing scenario that differ only in their initial conditions.
Such ensembles provide further information about extreme events,
but they generally include only a few model realizations. The length of the
time series has a large influence on the ability to detect changes in
extremes, and a record that is too short can lead to large uncertainty in
estimated return levels at long periods (see Sect.

In this study we avoid the limitations of short transient runs by using three
long (millennial) climate model runs in which climate is fully equilibrated.
Although numerical simulations of future climates provide only suggestions
for possible changes in climate variables, not direct evidence of changes,
they are important complements to observational studies. We use temperature
output from the widely used Community Climate System Model version 3 (CCSM3)

The paper is structured as follows: in Sect.

The GCM output used is part of an ensemble of climate simulations completed
by the Center for Robust Decision Making on Climate and Energy Policy (RDCEP)

Locations of centers of grid cells for model output used here. The
rainbow color scale for latitudes is used throughout this study. Idaho (ID),
California (CA), and Texas (TX) grid cells are used as examples in Figs.

The GEV distribution is widely applicable in the sciences because it arises,
at least approximately, in many cases of natural data. The simplest situation
where a GEV distribution can arise is in distributions of maxima taken from
sequences of

In this work, when studying high temperature extremes, the random variables

We give here a brief review of EVT for block maxima. For further background,

To study low temperature extremes, we must consider minima rather than
maxima. Equation (

In the analysis that follows, we also describe extremes by their return
levels, a widely used measure of extreme events. For warm temperature
extremes, the

Note that Eq. (

In the warmer climate conditions that result from higher atmospheric

Illustration of the effect on return levels of changing individual GEV parameters. We show consequences for both warm (red, left columns) and cold (blue, right columns) extremes. Columns 1 and 3 show GEV distributions for baseline (solid) and future (dashed) climates. Columns 2 and 4 show resulting changes in return levels for different return periods. Note that 1000-year periods are on the right for warm and left for cold extremes, to conform with percentiles. Location and scale parameters used here are chosen as representative of our model results, with larger effects in cold extremes than warm extremes, while shape parameters are identical in both cases. Top row: changing the location parameter shifts return levels uniformly across return periods. Middle row: increasing the scale parameter produces effects dependent on return period. All return levels increase (decrease for cold extremes), but more so for longer return periods. Bottom row: increasing the shape parameter produces dramatic increases in return levels at very long return periods.
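
The behavior described in this caption follows from the standard GEV return level formula. The sketch below is illustrative only: the function name `gev_return_level` and all parameter values are assumptions chosen for demonstration, not fitted values from the CCSM3 runs. It computes return levels for warm extremes (block maxima) and the change produced by perturbing one parameter at a time.

```python
import numpy as np

def gev_return_level(mu, sigma, xi, period):
    """Level exceeded by a block maximum with probability 1/period."""
    period = np.asarray(period, dtype=float)
    y = -np.log1p(-1.0 / period)      # y = -log(1 - 1/T), positive for T > 1
    if abs(xi) < 1e-12:               # Gumbel limit as xi -> 0
        return mu - sigma * np.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)

# Hypothetical parameter values, for illustration only.
periods = np.array([2.0, 20.0, 100.0, 1000.0])
mu, sigma, xi = 35.0, 1.5, -0.2
base = gev_return_level(mu, sigma, xi, periods)

# Perturb one parameter at a time and record the change in return levels.
shift_loc = gev_return_level(mu + 1.0, sigma, xi, periods) - base
shift_scale = gev_return_level(mu, sigma + 0.5, xi, periods) - base
shift_shape = gev_return_level(mu, sigma, xi + 0.1, periods) - base

for name, delta in [("location", shift_loc),
                    ("scale", shift_scale),
                    ("shape", shift_shape)]:
    print(name, np.round(delta, 2))
```

Under these assumed values, the location shift is identical at every return period, whereas the scale and shape perturbations produce changes that grow with the return period. Cold extremes can be handled with the same function after negating the minima.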

By fitting the annual maxima and minima to a GEV distribution (see Appendix

We show in Fig.

Under warmer future climate conditions, we have strong reasons to expect positive shifts in location parameters: extremes should shift to warmer values for both warm and cold extremes. There are, however, no simple physical arguments that guide expectations for changes in the scale and shape parameters. The 1000-year model runs used here allow us to accurately estimate changes in all three parameters under this model.

The estimated CCSM3 GEV parameters and their changes in possible
future climate conditions. Left: fitted GEV parameters (location

Assessment of whether

Illustration of the changes in return levels and their relationship with the changes in seasonal means. The histograms are for the 1000 annual maxima (warm extremes) and minima (cold extremes) for the pre-industrial (289 ppm, blue) and future (700 ppm, red). Smooth curves are corresponding estimated GEV distributions. Return level plots show changes in estimated return levels (dark curve) and uncertainty from the bootstrap procedure (lighter curves). The corresponding changes in seasonal means are marked with a cross (change in mean summertime daily maximum for warm extremes and mean wintertime daily minimum for cold extremes).
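
The bootstrap envelopes mentioned in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the figure: the data are synthetic draws with hypothetical parameters, the fit uses `scipy.stats.genextreme` (whose shape parameter `c` has the opposite sign to the usual GEV shape), the helper name `return_levels` is ours, and the resampling is a simple nonparametric bootstrap over individual years with only a small number of replicates.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)

# Synthetic stand-in for 1000 annual maxima (hypothetical parameters).
# scipy's shape c has the opposite sign to the usual GEV shape xi.
true_xi, true_mu, true_sigma = -0.2, 35.0, 1.5
maxima = genextreme.rvs(-true_xi, loc=true_mu, scale=true_sigma,
                        size=1000, random_state=rng)

def return_levels(sample, periods):
    """ML fit of a GEV, then levels exceeded with probability 1/period."""
    c, loc, scale = genextreme.fit(sample)
    return genextreme.isf(1.0 / periods, c, loc=loc, scale=scale)

periods = np.array([20.0, 50.0, 100.0])
estimate = return_levels(maxima, periods)

# Nonparametric bootstrap: resample years with replacement and refit.
n_boot = 40
boot = np.empty((n_boot, periods.size))
for b in range(n_boot):
    resample = rng.choice(maxima, size=maxima.size, replace=True)
    boot[b] = return_levels(resample, periods)

std_err = boot.std(axis=0, ddof=1)
lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
print("estimate:", np.round(estimate, 2))
print("std err :", np.round(std_err, 2))
```

Resampling whole blocks of consecutive years rather than single years would give a block bootstrap. A change in return level between two climate states could be bootstrapped the same way by refitting both series in each replicate.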

Changes in the location parameters for both warm and cold extremes are, as
expected, positive everywhere in the study region
(Fig.

Estimated changes in warm extremes return levels for possible future
climate conditions in our CCSM3 runs. Top: 700 ppm

As in Fig.

GEV parameter estimates of warm and cold extremes for pre-industrial climate state and the estimated changes from pre-industrial to 700 ppm climate state at ID, CA, and TX locations.

For both warm and cold extremes, scale parameters show changes that are
geographically complicated but statistically significant relative to
pre-industrial values over much of the region (see
Fig.

A statistical diagnostic of the assumption that annual blocks are
sufficiently long for the GEV approximation. Differences (10- vs. 1-year
blocks) in estimated shape parameters are plotted for warm (left) and cold
(right) extremes against longitude. Estimates are shown for the
pre-industrial climate state, for all model grid cell locations in the study
area, with the same color and symbol scheme as in previous figures.
Subscripts in notation denote the block length; e.g.,

As discussed in Sect.

The different changes in warm and cold extremes shown in the example
locations of Fig. 6 are characteristic of the whole contiguous United States.
In CCSM3 model output, throughout the region, annual maximum return level
changes follow changes in summer means, but annual minimum return level
changes exceed changes in winter means, with stronger influence of the scale and shape
parameters. Figures

The relationship between sample errors for return levels and data
lengths for warm extremes. Estimates of changes (from pre-industrial to
700 ppm

Same as Fig.

Assessment of sampling errors in estimates of return level changes
using comparable data lengths for all model grid cells in the study area.
Values shown are the estimated return level change minus the “ground-truth”
change determined from the entire 1000-year segment (

In this analysis, as in many climate applications of EVT, we have assumed
that an annual block is long enough that the GEV distribution is
approximately valid. Our long time series allow us to explicitly evaluate
this assumption. The evaluation relies on the fact that the GEV distribution
has the property of “max-stability”. In the context of distributions of
block maxima (

In our case, if an annual block is long enough that the GEV distribution is
approximately valid for a given time series, then the shape parameter
estimate

We refit GEV distributions for each grid cell in our study region with block
maxima/minima sizes of 2, 5 and 10 years, and compare shape parameters. We
find that for annual maxima, the differences in
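
The max-stability property behind this diagnostic can be checked numerically. The sketch below is an illustration under assumed conditions: the annual maxima are drawn from an exact GEV with hypothetical parameters, and the function names are ours. If annual maxima are exactly GEV with parameters (mu, sigma, xi), then maxima over k-year blocks are again GEV with the same shape xi, scale sigma * k**xi, and location mu + sigma * (k**xi - 1) / xi. The code compares empirical quantiles of simulated 10-year maxima with that prediction, without any fitting.

```python
import numpy as np

rng = np.random.default_rng(1)

def gev_quantile(mu, sigma, xi, prob):
    """Quantile of a GEV with shape xi != 0 at non-exceedance probability prob."""
    return mu + sigma / xi * ((-np.log(prob)) ** (-xi) - 1.0)

def sample_gev(mu, sigma, xi, size, rng):
    """Draw GEV variates by inverting the distribution function."""
    return gev_quantile(mu, sigma, xi, rng.uniform(size=size))

mu, sigma, xi = 35.0, 1.5, -0.2   # hypothetical annual-maximum parameters
k = 10                            # years per longer block

# 20000 blocks of k simulated annual maxima; keep the largest in each block.
annual = sample_gev(mu, sigma, xi, size=(20000, k), rng=rng)
decadal = annual.max(axis=1)

# Max-stability: same shape, rescaled scale and shifted location.
sigma_k = sigma * k ** xi
mu_k = mu + sigma * (k ** xi - 1.0) / xi

probs = np.array([0.1, 0.5, 0.9])
empirical = np.quantile(decadal, probs)
predicted = gev_quantile(mu_k, sigma_k, xi, probs)
print(np.round(empirical, 2), np.round(predicted, 2))
```

When annual blocks are too short for the GEV approximation to hold, shape parameters fitted at different block lengths need not agree, which is what the comparison across block sizes is designed to detect.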

The 1000-year model runs used in this work provide fairly accurate estimates
of changes in return levels even for long return periods. Estimated
uncertainties in return level changes – the bootstrapped envelopes in
Fig.

To assess how uncertainties increase with shorter model runs, we divide the
pre-industrial and 700 ppm time series into segments of 20 or 50 years and
refit the GEV parameters for each pair of segments (so, for example, pairing
the first 20 years of the pre-industrial run with the first 20 years of the
700 ppm run). The resulting distributions of estimates of changes in return
levels for warm and cold extremes are shown in
Figs.

The results show that sampling error can be large, as expected, when using
climate simulations of length comparable to the return periods of interest.
In both warm and cold extremes, the distributions of estimates of return level
changes derived from short model segments are centered around their “true”
values but with large spread (Figs.

The example locations shown in Figs.

It is important to note that spatial modeling approaches may be able to reduce estimation variability. (See further discussion in the following section.) Without such approaches, this assessment suggests that sampling error may be quite large for single model runs or observational data when the length of the series is comparable to the return period of interest.
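
The segment experiment described in this section can be imitated on synthetic data. The following is a sketch under loudly stated assumptions: the two "climate states" are exact GEV distributions with hypothetical parameters, the function names are ours, and the fits use the closed-form probability-weighted moments (PWM) estimator of Hosking et al. (1985) instead of maximum likelihood, purely because it needs no numerical optimization for short segments.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def gev_quantile(mu, sigma, xi, prob):
    """Quantile of a GEV with shape xi != 0 at non-exceedance probability prob."""
    return mu + sigma / xi * ((-np.log(prob)) ** (-xi) - 1.0)

def fit_gev_pwm(sample):
    """Closed-form PWM estimates of (mu, sigma, xi), after Hosking et al. (1985)."""
    x = np.sort(sample)
    n = x.size
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    c = (2 * b1 - b0) / (3 * b2 - b0) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c ** 2      # Hosking's shape k equals -xi
    gam = math.gamma(1 + k)
    sigma = (2 * b1 - b0) * k / (gam * (1 - 2 ** (-k)))
    mu = b0 + sigma * (gam - 1) / k
    return mu, sigma, -k

def return_level_change(base, future, period):
    """Change in the period-year return level between two fitted samples."""
    prob = 1.0 - 1.0 / period
    return (gev_quantile(*fit_gev_pwm(future), prob)
            - gev_quantile(*fit_gev_pwm(base), prob))

base_par = (35.0, 1.5, -0.2)      # hypothetical baseline (mu, sigma, xi)
future_par = (37.0, 2.0, -0.2)    # hypothetical warmer state
n_years, period = 1000, 20.0
base = gev_quantile(*base_par, rng.uniform(size=n_years))
future = gev_quantile(*future_par, rng.uniform(size=n_years))

full_run = return_level_change(base, future, period)
centers, spreads = {}, {}
for seg_len in (20, 50):
    # Pair the j-th segment of one run with the j-th segment of the other.
    pairs = zip(base.reshape(-1, seg_len), future.reshape(-1, seg_len))
    changes = np.array([return_level_change(b, f, period) for b, f in pairs])
    centers[seg_len], spreads[seg_len] = changes.mean(), changes.std(ddof=1)
    print(f"{seg_len}-year segments: mean {centers[seg_len]:.2f}, "
          f"spread {spreads[seg_len]:.2f}, full run {full_run:.2f}")
```

The printed spread across segments gives a rough sense of the sampling error to expect from a single short run, relative to the estimate from the full synthetic series.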

In this study, we have used extreme value theory to study changing
temperature extremes in model projections of future climate states. Following
prior work, we assume the GEV model provides a reasonable approximation for
the distribution of annual temperature extremes (see Figs.

For our model runs, the results suggest that for the contiguous United
States, much of the behavior of temperature extremes in higher CO

The millennial-scale model runs used here allow us to assess the validity of
the use of annual block sizes in studies of GEV distributions of temperature
extremes. Our results suggest that annual blocks are sufficiently long for
representing warm extremes, but may be insufficient for cold extremes,
especially for inland locations at high latitudes. In these locations,
altering the block length alters the shape parameter of the estimated GEV
distribution (Fig.

Millennial-scale model runs also enable detailed investigation of the
relationship between data length and GEV sampling error. Previous studies
have suggested that extrapolation should be carried out with caution

The computational demands of millennial-scale climate simulations restrict us
here to examining a single climate model, run at fairly coarse resolution. A
single model should not be taken as robust guidance on how temperature
extremes may change in future climate conditions. However, few modeling
groups have performed runs of comparable length, and the multi-model data in
public archives are not ideal for estimating changes in extremes: run lengths
are much shorter (

In the absence of large ensembles, one could potentially overcome some of the
challenges of limited data by exploiting the spatial structure of temperature
extremes (see Fig.

Extreme value theory (EVT) has become popular for studying climate extremes, but our results here suggest that one should be aware of the underlying assumptions, the corresponding implications, and the potential limitations when applying it. EVT using block extremes can potentially allow characterization of tail behavior that differs from that of the underlying distribution, but it involves a corresponding penalty, as fitting a distribution of block extremes necessarily involves throwing out most of one's data. For short time series, care must be taken to ensure that the drawbacks do not outweigh the benefits. In the climate model output studied here, for example, shifts in the location parameter for both warm and cold extremes are well explained by changes in the mean and standard deviation of the underlying temperature distribution. The millennial time series used here allow us to also identify significant changes in the scale parameter, especially for cold extremes, but with shorter time series, sampling errors can be too large for estimated return levels to be of much practical value. Long model runs such as those used here therefore provide an important tool for the study of climate extremes, helping both in clarifying the contexts in which EVT should best be used and in devising approaches for working with shorter time series.

We show the comparison of the overall distribution and the distribution of
extremes, as in Fig.

As in Fig.

In this appendix, we briefly describe the fundamental result in extreme value
theory that justifies the use of GEV models for block maxima. Let

We assess the goodness of fit of GEV distributions used to model temperature
extremes with quantile–quantile plots. That is, we plot the quantiles of our
fitted GEV distributions against those of the empirical distribution of
annual maxima and minima. The plots show good agreement for both warm and cold
extremes (Figs.

The Q–Q plots for the fitted GEVs of warm extremes in the model run
simulating pre-industrial conditions. For clarity we display only every other
row and column of the grid cells shown in Fig.

As in Fig.

In Sect.

As in Fig.

As in Fig.

In this work we fit the parameters of the extreme value distributions for the
annual maxima/minima of each grid cell using maximum likelihood. For the
numerical optimization to find these estimates, we use the

Boxplots of estimated changes in 20-, 50-, and 100-year return levels obtained with two estimation procedures, PWM (blue) and ML (red), for three example locations. Results are generally similar, but PWM does avoid extreme outliers produced by ML for cold extremes in one location (TX).
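
A maximum likelihood fit of the kind described in this appendix can be sketched with `scipy.stats.genextreme`. This is an assumption made for illustration, not the software used in this work, and the data are synthetic draws with hypothetical parameters. Two conventions matter: scipy's shape parameter `c` equals the negative of the usual GEV shape, and annual minima can be fitted as maxima of the negated series.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(3)

# Hypothetical "true" parameters for 1000 synthetic annual maxima.
mu, sigma, xi = 35.0, 1.5, -0.2
maxima = genextreme.rvs(-xi, loc=mu, scale=sigma, size=1000, random_state=rng)

# Maximum likelihood fit; convert scipy's shape c to the usual xi = -c.
c_hat, mu_hat, sigma_hat = genextreme.fit(maxima)
xi_hat = -c_hat
print(f"warm: mu={mu_hat:.2f} sigma={sigma_hat:.2f} xi={xi_hat:.2f}")

# Cold extremes: min(x) = -max(-x), so fit the negated minima as maxima.
minima = -genextreme.rvs(-xi, loc=20.0, scale=3.0, size=1000, random_state=rng)
c_neg, loc_neg, scale_neg = genextreme.fit(-minima)

# 20-year cold return level: negate the level of the negated series.
cold_level_20 = -genextreme.isf(1.0 / 20.0, c_neg, loc=loc_neg, scale=scale_neg)
print(f"cold 20-year return level: {cold_level_20:.2f}")
```

With this sign handling, the cold return level is the value that an annual minimum falls below with probability 1/20 in any given year.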

We assess the uncertainties for GEV parameters and return levels by using
bootstrap resampling. Both simple nonparametric bootstrap

Assessment of estimation uncertainty in return levels using
different block sizes for block bootstrap. The scatterplot shows block
bootstrapped standard errors by resampling years (

Here we explore the relationship of changes in the location parameter of
extremes with changes in the corresponding seasonal means
(Figs.

An investigation of the relationship of changes in the location
parameter of warm extremes (i.e.,

As in Fig.

Here we present changes in the estimated shape parameter of extremes in the transition from pre-industrial to 1400 ppm climate states.

Changes in the estimated CCSM3 GEV shape parameter in the transition from pre-industrial to 1400 ppm climate states. Top: warm extremes. Bottom: cold extremes. Changes in the shape parameter are in general significant for inland cold extremes.

This work was conducted as part of the