The occurrence of rainfall dry spells and wet spells (ds and ws, respectively) can be modeled jointly using interarrival times (its). While this modeling has the advantage of requiring a single fitting to describe all rainfall time characteristics (including wet and dry chains, an extension of the concept of spells), it relies on the assumption that the renewal times (its) are independent and identically distributed, which may not hold in some cases. In this study, two different methods for the modeling of rainfall time characteristics at the station scale have been applied: (i) a direct method (DM) that fits the discrete Lerch distribution to the its and (ii) an indirect method (IM). Results show that the its are modeled well by the Lerch distribution. Improved performances are obtained with the IM thanks to the relaxation of the assumption of the independence and identical distribution of the renewal times. A further improvement of the fittings is obtained when the datasets are separated into two periods, suggesting that the inferences may benefit from accounting for the local seasonality.
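The Lerch distribution is not available in common statistical libraries, so a direct fit of the kind described above has to be coded explicitly. The Python sketch below assumes the three-parameter form p(k) ∝ c^k / (k + a)^s for k = 1, 2, …, one common parameterization of the Lerch distribution, and fits it by maximum likelihood; the function names and the example interarrival-time data are illustrative, not taken from the study.

```python
import numpy as np
from scipy.optimize import minimize

def lerch_logpmf(k, c, s, a, kmax=10_000):
    """Log-pmf of a Lerch-type distribution p(k) ∝ c**k / (k + a)**s, k = 1, 2, ...
    The normalizing constant is approximated by truncating the series at kmax."""
    grid = np.arange(1, kmax + 1)
    log_terms = grid * np.log(c) - s * np.log(grid + a)
    log_norm = np.log(np.sum(np.exp(log_terms - log_terms.max()))) + log_terms.max()
    return k * np.log(c) - s * np.log(k + a) - log_norm

def lerch_negloglik(params, data):
    c, s, a = params
    if not (0 < c < 1 and a > 0):   # keep the series summable and the pmf well defined
        return np.inf
    return -np.sum(lerch_logpmf(data, c, s, a))

# interarrival_times: 1-D integer array of observed its (in days); hypothetical example data
interarrival_times = np.array([1, 1, 2, 1, 3, 5, 1, 2, 8, 1, 1, 4, 2, 13, 1])
fit = minimize(lerch_negloglik, x0=[0.7, 1.0, 1.0],
               args=(interarrival_times,), method="Nelder-Mead")
c_hat, s_hat, a_hat = fit.x
```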
We present a method for the analysis and compact description of large-scale multivariate weather extremes. Spatial patterns of extreme events are identified using the tail pairwise dependence matrix (TPDM) proposed by Cooley and Thibaud (2019). We also introduce the cross-TPDM to identify patterns of common extremes in two variables. An extremal pattern index (EPI) is developed to provide a pattern-based aggregation of temperature. A heat wave definition based on EPI is able to detect the most important heat waves over Europe. As an extension for considering simultaneous extremes in two variables, we propose the threshold-based EPI (TEPI) that captures the compound character of spatial extremes. We investigate daily temperature maxima and precipitation deficits at different accumulation times and find evidence that preceding precipitation deficits have a significant influence on the development of heat waves and that heat waves often co-occur with short-term drought conditions. Using the European heat waves of 2003 and 2010 as examples, we show that TEPI is suitable for describing the large-scale compound character of heat waves.
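For orientation, the sketch below shows a simple empirical estimator of a tail pairwise dependence matrix in the spirit of Cooley and Thibaud (2019): margins are assumed already transformed to a regularly varying scale (Fréchet-type, α = 2), and products of angular components are averaged over the largest radii. The scaling convention and the toy data are assumptions made for illustration, not the estimator exactly as used in the study.

```python
import numpy as np

def empirical_tpdm(x, q=0.95):
    """Empirical tail pairwise dependence matrix.

    x : (n, d) array with margins assumed transformed to Fréchet(alpha=2) scale.
    q : radial quantile defining the extreme set.
    """
    r = np.linalg.norm(x, axis=1)            # radial component
    w = x / r[:, None]                       # angular components on the unit sphere
    extreme = r > np.quantile(r, q)          # keep only the largest radii
    d = x.shape[1]
    # one possible scaling; normalization conventions differ across formulations
    return d * (w[extreme].T @ w[extreme]) / extreme.sum()

# toy usage with a heavy-tailed stand-in for transformed anomalies (n days, d grid points)
rng = np.random.default_rng(0)
x = rng.pareto(2.0, size=(5000, 10)) + 1.0
sigma = empirical_tpdm(x, q=0.95)
```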
This paper develops a method for determining whether two vector time series originate from a common stochastic process. The stochastic process considered incorporates both serial correlations and multivariate annual cycles. Specifically, the process is modeled as a vector autoregressive model with periodic forcing, referred to as a VARX model (where X stands for exogenous variables). The hypothesis that two VARX models share the same parameters is tested using the likelihood ratio method. The resulting test can be further decomposed into a series of tests to assess whether disparities in the VARX models stem from differences in noise parameters, autoregressive parameters, or annual cycle parameters. A comprehensive procedure for compressing discrepancies between VARX models into a minimal number of components is developed based on discriminant analysis. Using this method, the realism of climate model simulations of monthly mean North Atlantic sea surface temperatures is assessed. As expected, different simulations from the same climate model cannot be distinguished stochastically. Similarly, observations from different periods cannot be distinguished. However, every climate model differs stochastically from observations. Furthermore, each climate model differs stochastically from every other model, except when they originate from the same center. In essence, each climate model possesses a distinct fingerprint that sets it apart stochastically from both observations and models developed by other research centers. The primary factor contributing to these differences is the difference in annual cycles. The difference in annual cycles is often dominated by a single component, which can be extracted and illustrated using discriminant analysis.
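A reduced Python sketch of the core comparison is given below, assuming Gaussian noise and a VAR(1) with annual-cycle (harmonic) forcing fitted by least squares: a likelihood-ratio statistic compares two separate fits against a fit with shared parameters. It illustrates the structure of the test only; the decomposition into noise, autoregressive, and annual cycle terms and the discriminant analysis are not reproduced here, and all names are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def varx_fit(y, t):
    """Least-squares fit of a VAR(1) with annual-cycle forcing.
    y : (n, m) array of monthly vector observations; t : (n,) time index in months."""
    resp = y[1:]                                           # y_t
    z = np.column_stack([np.ones(len(resp)),
                         np.cos(2 * np.pi * t[1:] / 12),   # annual cycle forcing
                         np.sin(2 * np.pi * t[1:] / 12),
                         y[:-1]])                          # y_{t-1}
    b, *_ = np.linalg.lstsq(z, resp, rcond=None)
    resid = resp - z @ b
    sigma = resid.T @ resid / len(resp)                    # ML noise covariance
    loglik = -0.5 * len(resp) * np.linalg.slogdet(sigma)[1]
    return z, resp, b, loglik

def lr_same_process(y1, t1, y2, t2):
    """Likelihood-ratio test that two vector series share one VARX model (Gaussian sketch)."""
    z1, r1, b1, ll1 = varx_fit(y1, t1)
    z2, r2, b2, ll2 = varx_fit(y2, t2)
    # constrained fit: common coefficients and common noise covariance
    z0, r0 = np.vstack([z1, z2]), np.vstack([r1, r2])
    b0, *_ = np.linalg.lstsq(z0, r0, rcond=None)
    resid0 = r0 - z0 @ b0
    sigma0 = resid0.T @ resid0 / len(r0)
    ll0 = -0.5 * (len(r1) + len(r2)) * np.linalg.slogdet(sigma0)[1]
    m, k = y1.shape[1], z1.shape[1]
    df = m * k + m * (m + 1) // 2            # extra free parameters under the unconstrained fit
    stat = 2 * (ll1 + ll2 - ll0)
    return stat, chi2.sf(stat, df)
```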
We develop a framework to forecast 24 h averaged particulate matter (PM2.5) concentrations 4 d in advance at ground-based stations over the metropolitan area of the Aburrá Valley, Colombia. The input variables are gathered from a highly diverse set of sources, including in situ real-time PM2.5 measurements. To contribute to understanding the sources of predictability and uncertainty of air quality in the city, we perform a feature importance analysis revealing that the relevance of the different independent variables is a function of the lead time. In particular, apart from the past concentrations, the variables that most affect the predictability are the forecasted aerosol optical depth (AOD), the integrated fire radiative power over a forecasted back trajectory (BT-IFRP), and the predicted planetary boundary layer height (PBLH). In the testing period, the models showed the ability to forecast poor-air-quality events in the valley with more than 1 d of lead time. This study serves as a framework for developing and evaluating ML-based air quality forecasting models over the Andean region.
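A minimal sketch of such a lead-time-dependent feature-importance analysis is given below, using a gradient-boosting regressor and permutation importance from scikit-learn. The feature names follow the variables mentioned above (past PM2.5, forecast AOD, BT-IFRP, PBLH), but the column names, model configuration, and data handling are placeholders rather than the authors' pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

FEATURES = ["pm25_lag24h", "aod_forecast", "bt_ifrp", "pblh_forecast"]  # illustrative names

def importance_by_lead_time(df, lead_times=(24, 48, 72, 96)):
    """Fit one model per lead time (hours) and return permutation importances."""
    rows = []
    for lead in lead_times:
        target = f"pm25_mean_t+{lead}h"      # hypothetical column holding the 24 h averaged target
        x_tr, x_te, y_tr, y_te = train_test_split(
            df[FEATURES], df[target], test_size=0.25, shuffle=False)
        model = GradientBoostingRegressor().fit(x_tr, y_tr)
        imp = permutation_importance(model, x_te, y_te, n_repeats=20, random_state=0)
        rows.append(pd.Series(imp.importances_mean, index=FEATURES, name=f"{lead} h"))
    return pd.DataFrame(rows)
```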
Initial steps in statistical downscaling involve being able to compare observed data with output from regional climate models (RCMs). This comparison requires (1) regridding RCM outputs from their native grids, at differing spatial resolutions, to a common grid so that they are comparable to observed data and (2) bias correcting RCM data, for example via quantile mapping, for future modeling and analysis. The uncertainty associated with (1) is not always considered for downstream operations in (2). This work examines this uncertainty, which is not often made available to the user of a regridded data product. The analysis is applied to RCM solar radiation data from the NA-CORDEX (North American Coordinated Regional Climate Downscaling Experiment) data archive and observed data from the National Solar Radiation Database housed at the National Renewable Energy Lab. A case study of these methods over California is presented.
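As an illustration of step (1), the sketch below regrids a 2-D RCM field from its native (possibly curvilinear) grid to a regular target grid with scipy; comparing two interpolation choices gives one crude view of the regridding uncertainty discussed above. This is a generic sketch, not the workflow used in the study, and the variable names are placeholders.

```python
import numpy as np
from scipy.interpolate import griddata

def regrid(field, src_lon, src_lat, tgt_lon, tgt_lat, method="linear"):
    """Interpolate a 2-D RCM field from its native grid onto a regular target grid.

    field, src_lon, src_lat : 2-D arrays on the native (possibly curvilinear) grid.
    tgt_lon, tgt_lat        : 1-D arrays defining the regular target grid.
    """
    points = np.column_stack([src_lon.ravel(), src_lat.ravel()])
    tgt_x, tgt_y = np.meshgrid(tgt_lon, tgt_lat)
    return griddata(points, field.ravel(), (tgt_x, tgt_y), method=method)

# crude view of method-induced spread (rsds stands in for the solar radiation field):
# spread = np.abs(regrid(rsds, lon2d, lat2d, lon_t, lat_t, "linear")
#                 - regrid(rsds, lon2d, lat2d, lon_t, lat_t, "nearest"))
```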
Recent heatwaves such as the 2021 Pacific Northwest heatwave have shattered temperature records across the globe. The likelihood of experiencing extreme temperature events today is already strongly increased by anthropogenic climate change, but it remains challenging to determine to what degree prevalent atmospheric and land surface conditions aggravated the intensity of a specific heatwave event. Quantifying the respective contributions is therefore paramount not only for process understanding but also for attribution and future projection statements conditional on the state of atmospheric circulation or land surface conditions. Here we propose and evaluate a statistical framework based on extreme value theory, which enables us to learn the respective statistical relationship between extreme temperature and process variables in initial-condition large ensemble climate model simulations. Elements of statistical learning theory are implemented in order to integrate the effect of the governing regional circulation pattern. The learned statistical models can be applied to reanalysis data to quantify the relevance of physical process variables in observed heatwave events. The method also allows us to make conditional attribution statements and answer “what if” questions. For instance, how much would a heatwave intensify given the same dynamic conditions but at a different warming level? How much additional warming is needed for the same heatwave intensity to occur under average circulation conditions? Changes in the exceedance probability under varying large- and regional-scale conditions can also be assessed. We show that each additional degree of global warming increases the 7 d maximum temperature for the Pacific Northwest area by almost 2 ∘C, and likewise, we quantify the direct effect of anti-cyclonic conditions on heatwave intensity. Based on this, we find that the combined global warming and circulation effect of at least 2.9 ∘C accounts for 60 %–80 % of the 2021 excess event intensity relative to average pre-industrial heatwave conditions.
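To illustrate the statistical core of such a framework, the sketch below fits a non-stationary GEV to 7 d maximum temperature with the location parameter depending linearly on a global-mean warming covariate and a regional circulation index, using maximum likelihood with scipy. The covariate choice, the linear form, and all names are assumptions for illustration, not the authors' model specification.

```python
import numpy as np
from scipy.stats import genextreme
from scipy.optimize import minimize

def negloglik(params, tx7d, gmst, circ):
    """Negative log-likelihood of a GEV whose location varies with covariates:
    mu = mu0 + b_gmst * GMST anomaly + b_circ * circulation index."""
    mu0, b_gmst, b_circ, log_sigma, shape = params
    mu = mu0 + b_gmst * gmst + b_circ * circ
    # scipy's genextreme uses c = -xi relative to the usual climate-science shape convention
    return -np.sum(genextreme.logpdf(tx7d, c=-shape, loc=mu, scale=np.exp(log_sigma)))

def fit_conditional_gev(tx7d, gmst, circ):
    x0 = [np.mean(tx7d), 1.0, 0.0, np.log(np.std(tx7d)), -0.1]
    res = minimize(negloglik, x0, args=(tx7d, gmst, circ), method="Nelder-Mead")
    return res.x

# "what if" use: evaluate the fitted location at the observed circulation but +1 K warming
# mu_plus1 = mu0 + b_gmst * (gmst_event + 1.0) + b_circ * circ_event
```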
Many marine activities, such as designing ocean structures and planning marine operations, require the characterization of sea-state climate. This study investigates the statistical relationship between wind and sea states, considering its spatiotemporal behavior. A transfer function is established between wind fields over the North Atlantic (predictors) and the significant wave height (predictand) at three locations: southwest of the French coast (Gironde), the English Channel, and the Gulf of Maine. The developed method considers both wind seas and swells by including local and global predictors. Using a fully data-driven approach, the global predictors' spatiotemporal structure is defined to account for the non-local and non-instantaneous relationship between wind and waves. Weather types are constructed using a regression-guided clustering method, and the resulting clusters correspond to different wave systems (swells and wind seas). Then, in each weather type, a penalized linear regression model is fitted between the predictor and the predictand. The validation analysis demonstrates the model's skill in predicting the significant wave height, with a root mean square error of approximately 0.3 m at the three considered locations. Additionally, the study discusses the physical insights underlying the proposed method.
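The weather-type construction can be sketched in simplified form as below: predictor fields are clustered (here with plain k-means as a stand-in for the regression-guided clustering described above), and a penalized linear regression for significant wave height is fitted within each type. Function and variable names are illustrative, not from the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import RidgeCV

def fit_weather_type_regression(wind_predictors, hs, n_types=4):
    """Cluster predictor fields into weather types, then fit one penalized
    linear regression per type for significant wave height (hs)."""
    kmeans = KMeans(n_clusters=n_types, n_init=10, random_state=0).fit(wind_predictors)
    models = {}
    for k in range(n_types):
        mask = kmeans.labels_ == k
        models[k] = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(wind_predictors[mask], hs[mask])
    return kmeans, models

def predict_hs(kmeans, models, wind_predictors):
    labels = kmeans.predict(wind_predictors)
    out = np.empty(len(wind_predictors))
    for k, model in models.items():
        out[labels == k] = model.predict(wind_predictors[labels == k])
    return out
```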
In this study we detect and quantify changes in the distribution of the annual maximum daily maximum temperature (TXx) in a large observation-based gridded data set of European daily temperature during the years 1950–2018. Several statistical models are considered, each of which analyses TXx using a generalized extreme-value (GEV) distribution with the GEV parameters varying smoothly over space. In contrast to several previous studies which fit independent GEV models at the grid-box level, our models pull information from neighbouring grid boxes for more efficient parameter estimation. The GEV location and scale parameters are allowed to vary in time using the log of atmospheric CO2 as a covariate. Averaged across our spatial domain, TXx in the 2018 climate is estimated to be approximately 2.1 ∘C (95 % confidence interval of [2.03, 2.12] ∘C) hotter than in the 1950 climate. Moreover, averaged across our spatial domain, the 100-year return level of TXx based on the 1950 climate corresponds approximately to a 6-year return level in the 2018 climate.
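The return-level statement at the end follows mechanically from any fitted GEV parameters: compute the 100-year return level under the 1950-climate parameters and ask what return period that level has under the 2018-climate parameters. The sketch below does this with scipy; the parameter values are placeholders, not estimates from the study.

```python
from scipy.stats import genextreme

def return_level(p_loc, p_scale, p_shape, period_years):
    """Level exceeded on average once every `period_years` (scipy uses c = -shape)."""
    return genextreme.ppf(1.0 - 1.0 / period_years, c=-p_shape, loc=p_loc, scale=p_scale)

def effective_return_period(level, p_loc, p_scale, p_shape):
    """Return period of a given level under (possibly different) GEV parameters."""
    exceed_prob = genextreme.sf(level, c=-p_shape, loc=p_loc, scale=p_scale)
    return 1.0 / exceed_prob

# placeholder parameters for one grid box (1950 climate vs 2018 climate)
rl_100_1950 = return_level(p_loc=33.0, p_scale=2.0, p_shape=-0.2, period_years=100)
rp_in_2018 = effective_return_period(rl_100_1950, p_loc=35.1, p_scale=2.0, p_shape=-0.2)
```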
Daily meteorological data such as temperature or precipitation from climate models are needed for many climate impact studies, e.g., in hydrology or agriculture, but direct model output can contain large systematic errors. A large variety of methods exist to adjust the bias of climate model outputs. Here we review existing statistical bias-adjustment methods and their shortcomings, and compare quantile mapping (QM), scaled distribution mapping (SDM), quantile delta mapping (QDM) and an empiric version of PresRAT (PresRATe). We then test these methods using real and artificially created daily temperature and precipitation data for Austria. We compare the performance in terms of the following demands: (1) the model data should match the climatological means of the observational data in the historical period; (2) the long-term climatological trends of means (climate change signal), either defined as difference or as ratio, should not be altered during bias adjustment; and (3) even models with too few wet days (precipitation above 0.1 mm) should be corrected accurately, so that the wet day frequency is conserved. QDM and PresRATe combined fulfill all three demands. For (2) for precipitation, PresRATe already includes an additional correction that assures that the climate change signal is conserved.
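For reference, a minimal sketch of one of the compared methods, quantile delta mapping with additive deltas (as used for temperature), is given below using empirical quantiles. It omits the wet-day handling and the ratio formulation needed for precipitation, as well as the additional trend-preserving correction of PresRATe.

```python
import numpy as np

def qdm_additive(obs_hist, mod_hist, mod_fut, n_quantiles=100):
    """Quantile delta mapping with additive deltas (suitable for temperature).

    Each future model value keeps its own quantile-specific change signal:
    corrected = observed quantile + (future model value - historical model value
    at the same quantile).
    """
    probs = np.linspace(0.5 / n_quantiles, 1 - 0.5 / n_quantiles, n_quantiles)
    q_obs = np.quantile(obs_hist, probs)
    q_mod_hist = np.quantile(mod_hist, probs)
    q_mod_fut = np.quantile(mod_fut, probs)
    # non-exceedance probability of each future model value in the future model climate
    tau = np.interp(mod_fut, q_mod_fut, probs)
    delta = np.interp(tau, probs, q_mod_fut - q_mod_hist)   # quantile-wise change signal
    return np.interp(tau, probs, q_obs) + delta
```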
General circulation model (GCM) outputs are a primary source of information for climate change impact assessments. However, raw GCM data rarely are used directly for regional-scale impact assessments as they frequently contain systematic error or bias. In this article, we propose a novel extension to standard quantile mapping that allows for a continuous seasonal change in bias magnitude using localized regression. Our primary goal is to examine the efficacy of this tool in the context of larger statistical downscaling efforts on the tropical island of Puerto Rico, where localized downscaling can be particularly challenging. Along the way, we utilize a multivariate infilling algorithm to estimate missing data within an incomplete climate data network spanning Puerto Rico. Next, we apply a combination of multivariate downscaling methods to generate in situ climate projections at 23 locations across Puerto Rico from three general circulation models in two carbon emission scenarios: RCP4.5 and RCP8.5. Finally, our bias-correction methods are applied to these downscaled GCM climate projections. These bias-correction methods allow GCM bias to vary as a function of a user-defined season (here, Julian day). Bias is estimated using a continuous curve rather than a moving window or monthly breaks. Results from the selected ensemble agree that Puerto Rico will continue to warm through the coming century. Under the RCP4.5 forcing scenario, our methods indicate that the dry season will have increased rainfall, while the early and late rainfall seasons will likely have a decline in total rainfall. Our methods applied to the RCP8.5 forcing scenario favor a wetter climate for Puerto Rico, driven by an increase in the frequency of high-magnitude rainfall events during Puerto Rico's early rainfall season (April to July) as well as its late rainfall season (August to November).
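The idea of estimating bias as a continuous function of the Julian day can be sketched with a circular kernel smoother, a simple stand-in for the localized regression used in the paper; day-of-year distances are wrapped so that 31 December joins 1 January. Variable names are illustrative.

```python
import numpy as np

def seasonal_bias_curve(doy, bias, bandwidth_days=30.0):
    """Smooth daily bias estimates into a continuous curve over the Julian day,
    using a wrapped (circular) Gaussian kernel."""
    days = np.arange(1, 367)
    # circular distance between each target day and each observation's day of year
    diff = np.abs(days[:, None] - doy[None, :])
    diff = np.minimum(diff, 366 - diff)
    weights = np.exp(-0.5 * (diff / bandwidth_days) ** 2)
    return days, (weights @ bias) / weights.sum(axis=1)

# doy: day of year of each paired value; bias: model minus observation (or a
# quantile-specific difference) for that day; both are placeholders for the study's inputs
```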
The performance of a new statistical framework, developed for the evaluation of simulated temperature responses to climate forcings against temperature reconstructions derived from climate proxy data for the last millennium, is evaluated in a so-called pseudo-proxy experiment, where the true unobservable temperature is replaced with output data from a selected simulation with a climate model. Being an extension of the statistical model used in many detection and attribution (D&A) studies, the framework under study involves two main types of statistical models, each of which is based on the concept of latent (unobservable) variables: confirmatory factor analysis and structural equation modelling.
Evaluation of climate model simulations is a crucial task in climate research. Here, a new statistical framework is proposed for evaluation of simulated temperature responses to climate forcings against temperature reconstructions derived from climate proxy data for the last millennium. The framework includes two types of statistical models, each of which is based on the concept of latent (unobservable) variables: confirmatory factor analysis and structural equation modelling.
This study develops a statistical conditional approach to evaluate climate model performance in wind speed and direction and to project their future changes under the Representative Concentration Pathway (RCP) 8.5 scenario over inland and offshore locations across the continental United States (CONUS). The proposed conditional approach extends the scope of existing studies by a combined characterization of the wind direction distribution and conditional distribution of wind on the direction, hence enabling an assessment of the joint wind speed and direction distribution and their changes. A von Mises mixture distribution is used to model wind directions across models and climate conditions. Wind speed distributions conditioned on wind direction are estimated using two statistical methods, i.e., a Weibull distributional regression model and a quantile regression model, both of which enforce the circular constraint on their resulting estimated distributions. Projected uncertainties associated with different climate models and model internal variability are investigated and compared with the climate change signal to quantify the robustness of the future projections. In particular, this work extends the concept of internal variability in the climate mean to the standard deviation and high quantiles in order to assess their magnitudes relative to the projected changes. The evaluation results show that the studied climate model captures both historical wind speed and wind direction and their dependencies reasonably well over both inland and offshore locations. Under the RCP8.5 scenario, most of the studied locations show no significant changes in the mean wind speeds in both winter and summer, while the changes in the standard deviation and 95th quantile show some robust changes over certain locations in winter. Specifically, high wind speeds (95th quantile) conditioned on direction in winter are projected to decrease in the northwestern, Colorado, and northern Great Plains locations in our study. In summer, high wind speeds conditioned on direction over the southern Great Plains increase slightly, while high wind speeds conditioned on direction over offshore locations do not change much. The proposed conditional approach enables a combined characterization of the wind speed distributions conditioned on direction and wind direction distributions, which offers a flexible alternative that can provide additional insights for the joint assessment of speed and direction.
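The Weibull distributional regression conditioned on direction can be sketched as below: the log shape and log scale are expressed as first-order harmonics of wind direction, which automatically satisfies the circular constraint mentioned above, and the parameters are fitted by maximum likelihood with scipy. This is a schematic illustration, not the study's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

def harmonics(theta):
    """First-order circular harmonics of wind direction theta (radians)."""
    return np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])

def negloglik(params, speed, theta):
    h = harmonics(theta)
    log_shape = h @ params[:3]          # shape k(theta), periodic in direction
    log_scale = h @ params[3:]          # scale lambda(theta), periodic in direction
    return -np.sum(weibull_min.logpdf(speed, c=np.exp(log_shape), scale=np.exp(log_scale)))

def fit_directional_weibull(speed, theta_deg):
    theta = np.deg2rad(theta_deg)
    x0 = np.array([np.log(2.0), 0.0, 0.0, np.log(np.mean(speed)), 0.0, 0.0])
    res = minimize(negloglik, x0, args=(speed, theta), method="Nelder-Mead",
                   options={"maxiter": 5000})
    return res.x
```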
This paper derives a test for deciding whether two time series come from the same stochastic model, where the time series contains periodic and serially correlated components. This test is useful for comparing dynamical model simulations to observations. The framework for deriving this test is the same as in the previous three parts: the time series are first fit to separate autoregressive models, and then the hypothesis that their parameters are equal is tested. This paper generalizes the previous tests to a limited class of nonstationary processes, namely, those represented by an autoregressive model with deterministic forcing terms. The statistic for testing differences in parameters can be decomposed into independent terms that quantify differences in noise variance, differences in autoregression parameters, and differences in forcing parameters (e.g., differences in annual cycle forcing). A hierarchical procedure for testing individual terms and quantifying the overall significance level is derived from standard methods. The test is applied to compare observations of the meridional overturning circulation from the RAPID array to Coupled Model Intercomparison Project Phase 5 (CMIP5) models. Most CMIP5 models are inconsistent with observations, with the strongest differences arising from having too little noise variance, though differences in annual cycle forcing also contribute significantly to discrepancies from observations. This appears to be the first use of a rigorous criterion to decide “equality of annual cycles” with regard to all their attributes (e.g., phases, amplitudes, frequencies) while accounting for serial correlations.
The description and analysis of compound extremes affecting mid- and high latitudes in the winter requires an accurate estimation of snowfall. This variable is often missing from in situ observations and biased in climate model outputs, both in magnitude and in the number of events. While climate models can be adjusted using bias correction (BC), snowfall presents additional challenges compared to other variables, preventing one from applying traditional univariate BC methods. We extend the existing literature on the estimation of the snowfall fraction from near-surface temperature, which usually involves binary thresholds or nonlinear least-squares fitting of sigmoidal functions. We show that, considering methods such as segmented and spline regressions and nonlinear least-squares fitting, it is possible to obtain accurate out-of-sample estimates of snowfall over Europe in ERA5 reanalysis and to perform effective BC on the IPSL_WRF high-resolution EURO-CORDEX climate model when only relying on bias-adjusted temperature and precipitation. In particular, we find that cubic spline regression offers the best tradeoff as a feasible and accurate way to reconstruct or adjust snowfall observations, without requiring multivariate or conditional bias correction and stochastic generation of unobserved events.
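For concreteness, the sketch below shows two of the simplest variants: estimating the snowfall fraction as a function of near-surface temperature with either a sigmoid fitted by nonlinear least squares or a smoothing spline clipped to [0, 1], and then reconstructing snowfall as fraction × precipitation. It is a generic illustration, not the exact regression specifications compared in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.interpolate import UnivariateSpline

def sigmoid_fraction(t2m, t_half, slope):
    """Snowfall fraction as a smooth, decreasing function of near-surface temperature."""
    return 1.0 / (1.0 + np.exp((t2m - t_half) / slope))

def fit_snowfall_fraction(t2m, frac, method="sigmoid"):
    """t2m in deg C; frac = observed snowfall / total precipitation on wet timesteps."""
    if method == "sigmoid":
        popt, _ = curve_fit(sigmoid_fraction, t2m, frac, p0=[1.0, 1.0])
        return lambda t: sigmoid_fraction(t, *popt)
    order = np.argsort(t2m)
    spline = UnivariateSpline(t2m[order], frac[order], k=3, s=len(t2m))
    return lambda t: np.clip(spline(t), 0.0, 1.0)

# reconstruction from bias-adjusted temperature and precipitation only:
# snowfall_est = fit_snowfall_fraction(t2m_train, frac_train)(t2m_target) * precip_target
```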
Human-driven climate change has caused a wide range of extreme weather events to become more frequent in recent decades. Although increased and intense periods of extreme weather are expected consequences of anthropogenic climate warming, it remains challenging to rapidly and continuously assess the degree to which human activity alters the probability of specific events. This study introduces a new framework to enable the production and communication of global real-time estimates of how human-driven climate change has changed the likelihood of daily weather events. The framework's multi-method approach implements one model-based and two observation-based methods to provide ensemble attribution estimates with accompanying confidence levels. The framework is designed to be computationally lightweight to allow attributable probability changes to be rapidly calculated using forecasts or the latest observations. The framework is particularly suited for highlighting ordinary weather events that have been altered by human-caused climate change. An example application using daily maximum temperature in Phoenix, AZ, USA, highlights the framework's effectiveness in estimating the attributable human influence on observed daily temperatures (and deriving associated confidence levels). Global analyses show that the framework is capable of producing worldwide complementary observational- and model-based assessments of how human-caused climate change changes the likelihood of daily maximum temperatures. For instance, over 56 % of the Earth's total land area, all three framework methods agree that maximum temperatures greater than the preindustrial 99th percentile have become at least twice as likely in today's human-influenced climate. Additionally, over 52 % of land in the tropics, human-caused climate change is responsible for at least five-fold increases in the likelihood of preindustrial 99th percentile maximum temperatures. By systematically applying this framework to near-term forecasts or daily observations, local attribution analyses can be provided in real time worldwide. These new analyses create opportunities to enhance communication and provide input and/or context for policy, adaptation, human health, and other ecosystem/human system impact studies.
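The headline quantity of such a framework, the change in likelihood of exceeding the preindustrial 99th percentile, can be written in a few lines. The sketch below shows a purely empirical, observation-based variant with two samples (a preindustrial-like baseline and present-day data); it is only one ingredient of the multi-method ensemble described above.

```python
import numpy as np

def probability_ratio(baseline_tmax, current_tmax, percentile=99.0):
    """How much more likely is exceeding the baseline's extreme threshold today?"""
    threshold = np.percentile(baseline_tmax, percentile)
    p_baseline = 1.0 - percentile / 100.0                 # by construction of the threshold
    p_current = np.mean(current_tmax > threshold)
    return p_current / p_baseline

# e.g. a ratio >= 2 means exceeding the preindustrial 99th percentile is at least
# twice as likely in the current sample
```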
Climate models are critical tools for developing strategies to manage the risks posed by sea-level rise to coastal communities. While these models are necessary for understanding climate risks, there is a level of uncertainty inherent in each parameter in the models. This model parametric uncertainty leads to uncertainty in future climate risks. Consequently, there is a need to understand how those parameter uncertainties impact our assessment of future climate risks and the efficacy of strategies to manage them. Here, we use random forests to examine the parametric drivers of future climate risk and how the relative importances of those drivers change over time. In this work, we use the Building blocks for Relevant Ice and Climate Knowledge (BRICK) semi-empirical model for sea-level rise. We selected this model because of its balance of computational efficiency and representation of the many different processes that contribute to sea-level rise. We find that the equilibrium climate sensitivity and a factor that scales the effect of aerosols on radiative forcing are consistently the most important climate model parametric uncertainties throughout the 2020 to 2150 interval for both low and high radiative forcing scenarios. The near-term hazards of high-end sea-level rise are driven primarily by thermal expansion, while the longer-term hazards are associated with mass loss from the Antarctic and Greenland ice sheets. Our results highlight the practical importance of considering time-evolving parametric uncertainties when developing strategies to manage future climate risks.
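The time-evolving importance analysis can be sketched as follows: for each projection year, a random forest maps sampled parameter vectors to the projected sea level and its impurity-based feature importances are recorded. Parameter names and array shapes are placeholders, not the BRICK configuration used in the study.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

PARAM_NAMES = ["climate_sensitivity", "aerosol_scaling", "ocean_diffusivity", "ais_melt_rate"]

def importance_over_time(params, sea_level, years):
    """params: (n_samples, n_params) sampled parameter vectors;
    sea_level: (n_samples, n_years) projected sea level for each sample;
    returns a DataFrame of feature importances indexed by year."""
    rows = []
    for j, year in enumerate(years):
        rf = RandomForestRegressor(n_estimators=300, random_state=0)
        rf.fit(params, sea_level[:, j])
        rows.append(pd.Series(rf.feature_importances_, index=PARAM_NAMES, name=year))
    return pd.DataFrame(rows)
```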
In parts I and II of this paper series, rigorous tests for equality of stochastic processes were proposed. These tests provide objective criteria for deciding whether two processes differ, but they provide no information about the nature of those differences. This paper develops a systematic and optimal approach to diagnosing differences between multivariate stochastic processes. Like the tests, the diagnostics are framed in terms of vector autoregressive (VAR) models, which can be viewed as a dynamical system forced by random noise. The tests depend on two statistics, one that measures dissimilarity in dynamical operators and another that measures dissimilarity in noise covariances. Under suitable assumptions, these statistics are independent and can be tested separately for significance. If a term is significant, then the linear combination of variables that maximizes that term is obtained. The resulting indices contain all relevant information about differences between data sets. These techniques are applied to diagnose how the variability of annual-mean North Atlantic sea surface temperature differs between climate models and observations. For most models, differences in both noise processes and dynamics are important. Over 40 % of the differences in noise statistics can be explained by one or two discriminant components, though these components can be model dependent. Maximizing dissimilarity in dynamical operators identifies situations in which some climate models predict large-scale anomalies with the wrong sign.
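The noise-covariance part of the diagnostic reduces to a generalized eigenvalue problem: find the linear combinations of variables whose noise variance ratio between the two fitted models is most extreme. A minimal sketch with scipy follows; it covers only the noise term, not the companion statistic for the dynamical operators, and the variable names are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def noise_discriminants(sigma_model, sigma_obs):
    """Directions maximizing the ratio of noise variances between two VAR models.

    Solves sigma_model v = lambda sigma_obs v; very large or very small eigenvalues
    flag components whose noise statistics differ most between model and observations."""
    eigvals, eigvecs = eigh(sigma_model, sigma_obs)   # generalized symmetric eigenproblem
    order = np.argsort(eigvals)[::-1]                 # most model-inflated variance first
    return eigvals[order], eigvecs[:, order]

# variance_ratio, patterns = noise_discriminants(sigma_model, sigma_obs)
# patterns[:, 0] is the loading vector of the leading discriminant component
```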
Numerous marine applications require the prediction of medium- and long-term sea states. Climate models are mainly focused on the description of the atmosphere and global ocean variables, most often on a synoptic scale. Downscaling models exist to move from these atmospheric variables to the integral descriptors of the surface state; however, they are most often complex numerical models based on physics equations that entail significant computational costs. Statistical downscaling models provide an alternative to these models by constructing an empirical relationship between large-scale atmospheric variables and local variables, using historical data. Among the existing methods, deep learning methods are attracting increasing interest because of their ability to build hierarchical representations of features. To our knowledge, these models have not yet been tested in the case of sea state downscaling. In this study, a convolutional neural network (CNN)-type model for the prediction of significant wave height from wind fields in the Bay of Biscay is presented. The performance of this model is evaluated at several points and compared to other statistical downscaling methods and to WAVEWATCH III hindcast databases. The results obtained from these different stations show that the proposed method is suitable for predicting sea states. The observed performances are superior to those of the other statistical downscaling methods studied but remain inferior to those of the physical models. The low computational cost and the ease of implementation are, however, important assets for this method.
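As an indication of the kind of model involved, a minimal convolutional regression network mapping a wind field (two channels, zonal and meridional components) to significant wave height at a point can be written in a few lines of Keras. Layer sizes and shapes are illustrative, not the architecture evaluated in the study.

```python
import tensorflow as tf

def build_cnn(height, width, channels=2):
    """Small CNN mapping a (height, width, 2) wind field (u, v) to one Hs value."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(height, width, channels)),
        tf.keras.layers.Conv2D(16, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),                     # significant wave height (m)
    ])

model = build_cnn(height=64, width=64)
model.compile(optimizer="adam", loss="mse")
# model.fit(wind_fields, hs_observations, epochs=50, validation_split=0.2)
```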
The 2014–2018 drought over South Africa's winter rainfall zone (WRZ) led to a critical water crisis that highlighted the region's vulnerability to drought and climate change. Consequently, it is imperative to better understand the climatic characteristics of the drought in order to inform regional adaptation to projected climate change. In this paper we investigate the spatio-temporal patterns of drought intensity and the recent rainfall trends, focusing on assessing the consistency of the prevailing conceptual model of drought drivers with observed patterns. For this we use the new spatial subdivision for the region encompassing the WRZ introduced in our companion paper (Conradie et al., 2022).
Compared to previous droughts since 1979, the 2014–2018 drought in the WRZ core was characterised by a markedly lower frequency of very wet days (exceeding the climatological 99.5th percentile daily rainfall – including dry days) and of wet months (SPI1>0.5), a sub-seasonal attribute not previously reported. There was considerable variability in the spatial footprint of the drought. Short-term drought began in the south-western core WRZ in spring 2014. The peak intensity gradually spread north-eastward, although a spatially near-uniform peak is seen during mid-2017. The overall drought intensity for the 2015–2017 period transitions radially from most severe in the WRZ core to least severe in the surroundings. During 2014 and 2015, the drought was most severe at those stations receiving the largest proportion of their rainfall from westerly and north-westerly winds; by 2018, those stations receiving the most rain from the south and south-east were most severely impacted. This indicates an evolving set of dynamic drivers associated with distinct rain-bearing synoptic flows. No evidence is found to support the suggestion that the drought was more severe in the mountain catchments of Cape Town's major supply reservoirs than elsewhere in the core nor that rain day frequency trends since 1979 are more negative in this subdomain. Rainfall and rain day trend rates also exhibit some connections to the spatial seasonality structure of the WRZ, although this is weaker than for drought intensity. Caution should be applied in assessing South African rain day trends given their high sensitivity to observed data shortcomings. Our findings suggest an important role for zonally asymmetric dynamics in the region's drought evolution. This analysis demonstrates the utility of the spatial subdivisions proposed in the companion paper by highlighting spatial structure in drought intensity evolution linked to rainfall dynamics.