Analysis of the Evolution of Parametric Drivers of High-End Sea-Level Hazards

Climate models are critical tools for developing strategies to manage the risks posed by sea-level rise to coastal communities. While these models are necessary for understanding climate risks, there is a level of uncertainty inherent in each parameter in the models. This model parametric uncertainty leads to uncertainty in future climate risks. Consequently, there is a need to understand how those parameter uncertainties impact our assessment of future climate risks and the efficacy of strategies to manage them. Here, we use random forests to examine the parametric drivers of future climate risk and how the relative importances of those drivers change over time. We find that the equilibrium climate sensitivity and a factor that scales the effect of aerosols on radiative forcing are consistently the most important climate model parametric uncertainties throughout the 2020 to 2150 interval for both low and high radiative forcing scenarios. The near-term hazards of high-end sea-level rise are driven primarily by thermal expansion, while the longer-term hazards are associated with mass loss from the Antarctic and Greenland ice sheets. Our results highlight the practical importance of considering time-evolving parametric uncertainties when developing strategies to manage future climate risks.


Introduction
A rising sea level poses a threat to island and coastal regions around the world. More than 3.1 billion people globally live within 100 km of the coast (FAO, 2014). Given the large populations in these regions, the respective governing bodies need to assess and manage sea-level risk (e.g., Exec. Order No. 14008, 2021; New Orleans Health Department, 2018; Hinkel et al., 2014; Le Cozannet et al., 2015). Climate models provide a valuable tool for understanding future climate risks and testing the efficacy of risk management strategies in a computational experimental setting.
Climate models are based on a variety of modeling techniques. Semi-empirical models (SEMs) are both flexible and computationally efficient, which makes them appropriate for uncertainty quantification, resolving the high-risk upper tails of probability distributions, and informing decision analysis. Other, more detailed process-based models (e.g., CLARA (Fischbach et al., 2012) or SLOSH (Jelesnianski et al., 1992)) are also useful because they resolve more specific processes and have finer geographical resolution. This work uses a SEM called the Building blocks for Relevant Ice and Climate Knowledge model (BRICK; Wong et al., 2017b).
However, climate models have numerous model parameters, and multiple potentially conflicting data sets may be used to calibrate them, which poses a challenge when interpreting climate model outputs (Flato et al., 2013; Giorgi, 2019). The BRICK model is a simple climate model for simulating global mean surface temperature and global mean sea-level rise, as well as regional sea-level rise (Wong et al., 2017b). BRICK uses the Diffusion-Ocean-Energy balance CLIMate model (DOECLIM; Kriegler, 2005), the Glaciers and Ice Caps portion of the MAGICC climate model (GIC-MAGICC; Meinshausen et al., 2011), a Thermal Expansion (TE) module based on the semi-empirical relationships used by (e.g.) Grinsted et al. (2010) and Mengel et al. (2016), the Simple Ice-sheet Model for Projecting Large Ensembles (SIMPLE; Bakker et al., 2016), the ANTarctic Ocean temperature model (ANTO; Bakker et al., 2017), and the Danish Center for Earth System Science Antarctic Ice Sheet model (DAIS; Shaffer, 2014). Since our main focus is on changes in global mean sea level (GMSL), it is important to note that BRICK determines GMSL change by summing the sea-level contributions from changes in land water storage (Church et al., 2013), glaciers and ice caps, the Greenland ice sheet, the Antarctic ice sheet, and thermal expansion. We use the model output data from the Representative Concentration Pathway (RCP; Moss et al., 2010) 2.6 and 8.5 scenarios from the BRICK simulations of Vega-Westhoff et al. (2020). Comprehensive discussions of the Hector model are given in Hartin et al. (2015), of the BRICK model in Wong et al. (2017b), and of the Hector-BRICK coupled model in Vega-Westhoff et al. (2020). This work focuses on the BRICK model for sea-level rise and how uncertainty in its parameters relates to uncertainty in future coastal hazards, so we use only the BRICK model parameters and their relation to the sea-level rise scenarios. A list of the 38 parameters of the BRICK model is shown in Table A1.
These model parameters include, but are not limited to, equilibrium climate sensitivity (ECS), a factor that scales the effect of aerosols on radiative forcing (α_DOECLIM), thermal expansion (α_TE), the temperature associated with the onset of fast dynamical disintegration of the Antarctic ice sheet (T_crit), and the rate of fast dynamical disintegration of the Antarctic ice sheet (λ) (Wong et al., 2017a). We use the model output from Vega-Westhoff et al. (2020) that has been calibrated using observations of global mean surface temperature and sea-level rise due to thermal expansion, glaciers and ice caps, the Greenland ice sheet, and the Antarctic ice sheet to constrain model parameters and projections of future sea levels and temperatures. We use the 10,000 parameter sample values from the "TTEGICGISAIS.csv" file along with projected GMSL values for the 2020 to 2150 time period (Vega-Westhoff, 2019). This corresponds to the results presented in the main text of that work. The original data from Vega-Westhoff (2019) were converted from their original RData file format to CSV using R version 3.6.1 (R Core Team, 2019). The subsequent analyses and plots were done in Python (Python 3.7.4, 2019), using the sklearn library to make the decision trees and random forests (Pedregosa et al., 2011).
The aim of the present work is to explore high-end sea-level rise scenarios and analyze how the factors driving sea-level hazards change over time. Toward this end, we preprocessed the data as follows. Our output data of interest are the GMSL model output at 5-year intervals between 2020 and 2150. For each of these years and each RCP scenario, we calculated the 90th percentile of the GMSL ensemble. We used the 90th percentile as the threshold for classifying "high-end" scenarios of sea-level rise: any state-of-the-world (SOW, or ensemble member) that meets or exceeds this value is classified as "high-end." A SOW comprises a set of model parameters, an RCP forcing, and the resultant temperature change and GMSL change. SOWs with GMSL in a given target year below this threshold are classified as "non-high-end." It is possible for a SOW to have non-high-end GMSL in one 5-year time period and high-end GMSL later. In addition to the 90th percentile threshold, we considered the 80th percentile in a supplemental experiment and found that the results were not sensitive to the selection of the percentile threshold (see Appendix).
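The labeling step above can be sketched in a few lines; the function name and the synthetic GMSL values below are illustrative stand-ins, not the actual Vega-Westhoff (2019) ensemble files.

```python
import numpy as np

# Hedged sketch of the high-end labeling step: gmsl_ensemble is a hypothetical
# (n_sow,) array of GMSL values for one target year and RCP scenario.
def label_high_end(gmsl_ensemble, percentile=90.0):
    """Return boolean labels: True where a SOW meets or exceeds the threshold."""
    threshold = np.percentile(gmsl_ensemble, percentile)
    return gmsl_ensemble >= threshold

rng = np.random.default_rng(0)
gmsl_2100 = rng.normal(loc=0.8, scale=0.3, size=10_000)  # illustrative values (m)
labels = label_high_end(gmsl_2100)
print(labels.mean())  # about 0.10 of SOWs are labeled "high-end"
```

Repeating this for each 5-year interval yields the per-year classification targets used to train the forests.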

Decision trees
We are interested in examining how a given SOW's model parameters are related to whether that SOW is more/less likely to be a high-end scenario of GMSL. Decision trees are a supervised learning approach used in classification and regression applications (James et al., 2013). We use decision trees to classify each set of model parameters as leading to high-end or non-high-end sea-level rise by successively splitting training outcomes into different outcome regions.
As a running example, Fig. 1 shows a graphical representation of a hypothetical decision tree. Figure 1 splits on equilibrium climate sensitivity (ECS) and the parameters P_0 and b_SIMPLE. ECS is defined as the equilibrium increase in global mean surface temperature that results from doubling the atmospheric CO2 concentration relative to pre-industrial conditions, and is related to the climate component of the BRICK model. Meanwhile, P_0 and b_SIMPLE pertain to the major ice sheets' components of the model. P_0 represents the Antarctic annual precipitation for an Antarctic surface temperature of 0 °C, and b_SIMPLE is the equilibrium Greenland ice sheet volume for a temperature anomaly of 0 °C.
Once the splits in a tree reach a specified depth or another specified stopping criterion is met, the outcome from each leaf node is determined. The following decision tree hyperparameters can be used as stopping criteria in the sklearn library (Pedregosa et al., 2011):
- max_depth, the maximum allowed depth of a tree;
- min_samples_leaf, the minimum number of samples needed at a node in order for it to be a leaf node; and
- min_samples_split, the minimum number of samples needed at a node in order for a split to occur.
If the number of samples at a node is less than min_samples_split, then the node will be a leaf node.
In the example depicted in Fig. 1, the maximum depth (max_depth) we use is 3, so the tree splits on three levels before creating leaf nodes. In our case, the leaf nodes would be classified as "non-high-end" or "high-end" as described above by a simple majority vote among the data points allocated to that node. The values for the parent split nodes are determined by considering the information gain of each possible split. Information gain quantifies the reduction in impurity, such as entropy, that would occur as a result of that split. A large value of information gain is desirable. Therefore, the potential parameter choice and value for that parameter that give the largest information gain will be selected as the split.
After this training procedure, in which data from the ensembles of Vega-Westhoff et al. (2020) are used to determine the split node values and leaf outcome classifications, the decision tree can predict outcomes based on input feature data. For example, a feature data point x with ECS equal to 5 °C, P_0 equal to 0.2 m yr⁻¹, and b_SIMPLE equal to 9 m would be classified as "high-end." Starting at the top of the tree, we consider the ECS value. Since x has an ECS greater than 3.25 °C, we move down the right branch of the tree to the P_0 node. x's P_0 value, 0.2 m yr⁻¹, is less than 0.5 m yr⁻¹, so we continue to the left child of the P_0 node, which is the b_SIMPLE node. Because x has a b_SIMPLE value greater than 7.9 m, we go to the right child of the b_SIMPLE node. This node is a "high-end" leaf node, so we classify x as "high-end."

Figure 1. Hypothetical decision tree demonstrating the general decision tree structure using BRICK model parameters and our "high-end" and "non-high-end" classification outcomes. Since we used a maximum depth of 3 as the stopping criterion, the tree made three levels of splits before stopping to create leaf nodes.
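The train-then-predict workflow above can be sketched with scikit-learn. The data below are synthetic stand-ins for the BRICK parameters, and the learned split thresholds will differ from the hypothetical tree in Fig. 1; the high-ECS rule used to generate the labels is purely illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-ins for three BRICK parameters: ECS, P_0, b_SIMPLE.
rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([
    rng.uniform(1.0, 8.0, n),   # "ECS" (deg C)
    rng.uniform(0.0, 1.0, n),   # "P_0" (m/yr)
    rng.uniform(5.0, 12.0, n),  # "b_SIMPLE" (m)
])
# Toy labeling rule: high ECS drives "high-end" outcomes (1) in this sketch.
y = (X[:, 0] > 6.0).astype(int)

tree = DecisionTreeClassifier(max_depth=3, criterion="entropy", random_state=0)
tree.fit(X, y)

# Classify a new feature point, as in the worked example in the text.
x_new = np.array([[7.0, 0.2, 9.0]])
print(tree.predict(x_new)[0])  # prints 1 ("high-end" under the toy rule)
```

Because the toy labels are separable on a single feature, the fitted tree recovers a split near the generating threshold of 6.0 °C.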

Random Forests
As can be the case with many machine learning algorithms, decision trees can overfit the training data used to create the tree (James et al., 2013). Random forests, which are an ensemble method, are one way to reduce overfitting. Random forests are a collection of many decision trees, created by using a random subset of the training data to build each tree and by using a random subset of the features at each split. In this work, the features of the random forests are the parameters of the BRICK model, and the response to be classified is whether or not the time series of GMSL associated with those parameters is a high-end GMSL scenario (above the 90th percentile). Taking a random subset of the training data when creating a tree is called bootstrapped aggregation, or bagging. When bagging, the random subset of the training data is taken with replacement. In addition to bootstrap subsets of training data, random forests take a random subset of features from which to split the data. This further helps to address the issue of overfitting the training data.
The following general process outlines how we construct each random forest. We repeat this process to classify the high-end GMSL scenarios in 5-year increments from 2020 through 2150 for each of RCP2.6 and RCP8.5. This leads to a total of 27 random forests for each of the two RCP scenarios.
To create a random forest of decision trees, we first split the data into the parameters and the output for the given year. We then create training and test subsets of those data. The training set is used to train the model and tune the model's hyperparameters, and the test set is used to test the model's performance. In this work, the training subset comprises 80% of the original data, and the test subset comprises 20%. We use the RandomForestClassifier from the sklearn.ensemble library to make a forest of decision trees using entropy as the criterion to determine the splits of each tree (Pedregosa et al., 2011).

Table 1. Hyperparameter values of the best estimator from the grid search. The max_features hyperparameter is fixed at 'sqrt' to ensure variability across the decision trees within the random forest. The 'sqrt' value for max_features means the square root of the total number of features will be used as the numerical value for max_features.

Prior to fitting the model, we tune the hyperparameters of the RandomForestClassifier by performing a five-fold cross-validation grid search. This process takes the training dataset, divides it into five equal-sized subsets (or "folds"), and uses four subsets to train the model and one subset as the validation set. This is repeated five times, so that each fold is used once as the validation set. The same hyperparameter combination is therefore evaluated on five different train/validation splits, and this is done for each combination of hyperparameters in the grid. In addition to the three decision tree-specific hyperparameters noted in Sect. 2.2, we explored values for the following hyperparameters in our grid search (Pedregosa et al., 2011):
- max_features, the number of features a tree can consider at each split; and
- n_estimators, the number of trees in the forest.
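The tuning workflow above can be sketched as follows. The data are synthetic stand-ins for the 38 BRICK parameters and the high-end labels, and the grid values are illustrative, not the ranges or best values of Table 1.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: rows are SOWs, columns are model parameters.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)  # toy high-end labels

# 80/20 train/test split, as in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Illustrative grid; the paper's actual ranges appear in Table 1.
grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, 5],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(
    RandomForestClassifier(criterion="entropy", max_features="sqrt",
                           random_state=0),
    param_grid=grid,
    cv=5,  # five-fold cross validation on the training set
)
search.fit(X_train, y_train)
print(search.best_params_)
print(round(search.score(X_test, y_test), 3))  # held-out test accuracy
```

GridSearchCV handles the five-fold rotation internally, so each hyperparameter combination is scored as the mean validation accuracy over the five folds.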
We ran a few preliminary grid searches over these hyperparameters to narrow down their ranges before the final grid search using the ranges shown in Table 1. To select hyperparameters for all of the main experiments, we used these ranges in grid searches for the years 2050, 2075, and 2100 for both RCP2.6 and RCP8.5. The cross-validation scores for the hyperparameter sets for a given RCP and year are very similar (Tables A2, A3, A4), so the results are not sensitive to changing the hyperparameters within the ranges given in Table 1. We chose the set of best values in Table 1 as the hyperparameter values because that specific combination consistently had one of the top cross-validation scores across the years and RCP scenarios covered by the grid search (Tables A2, A3, A4).
Using the random forest hyperparameter settings from the "Best value" row of Table 1, we fit random forests using the training data for each 5-year interval for each of RCP2.6 and RCP8.5. We then input the corresponding test parameter values into each random forest model to predict output values, compare the predicted values to the actual output values from the test data, and calculate the percentage of test values that the model correctly predicted. The same process is applied to the training subset that was used to create the forest. The training and testing accuracies for the forests can be found in Table A5.
Given the initial hypothesis that the equilibrium climate sensitivity (ECS) parameter is an important indicator of future sea-level rise (Vega-Westhoff et al., 2020), we examine the set of all split values for the ECS parameter for each tree in the forest.
Likewise, we examine the distribution of the maximum ECS value of each tree in the forest. The maximum split values are informative because they differentiate the highest cases of sea-level rise when the trees are branching. Hence, the maximum splits help quantify the threshold of ECS that separates the high-risk situations from the non-high-risk (lower) ECS splits.
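The extraction of per-tree maximum split values can be sketched as below, assuming a fitted scikit-learn forest. Feature index 0 is a hypothetical stand-in for ECS, and the data are synthetic rather than the BRICK ensemble.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: outcome driven by "ECS" (feature 0) exceeding 6.0.
rng = np.random.default_rng(7)
X = rng.uniform(1.0, 8.0, size=(2000, 4))
y = (X[:, 0] > 6.0).astype(int)

forest = RandomForestClassifier(n_estimators=50, criterion="entropy",
                                random_state=0).fit(X, y)

def max_split_values(forest, feature_idx):
    """Maximum split threshold on feature_idx in each tree (NaN if unused)."""
    out = []
    for est in forest.estimators_:
        t = est.tree_
        # Internal nodes that split on this feature (leaves have feature == -2).
        mask = t.feature == feature_idx
        out.append(t.threshold[mask].max() if mask.any() else np.nan)
    return np.array(out)

max_splits = max_split_values(forest, feature_idx=0)
print(np.nanmedian(max_splits))  # clusters near the generating 6.0 threshold
```

The distribution of these per-tree maxima is what Fig. 3 summarizes for the ECS parameter across forests.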

Feature Importances
We use feature importances to assess which parameters play the largest role in determining whether or not a given SOW is a high-end GMSL scenario. We compute the importances for each feature (i.e., BRICK model parameter) using our forests fit for each year from 2020 to 2150 in 5-year increments. We use Gini importances as the feature importances, which are the normalized reductions of node impurity by each feature chosen for splitting the tree (Pedregosa et al., 2011). We calculate the importance of each node in a tree using Eq. (1):

ni_j = w_j C_j − w_left C_left − w_right C_right,   (1)

where ni_j is the node importance of node j, w_j is the weighted number of samples at node j, C_j is the impurity of node j, the "left" subscript represents the left child from the split on node j, and the "right" subscript represents the right child from the split on node j. The weighted number of samples is used as a coefficient of the impurity terms because the impurity of a node addresses the proportion of data points in each class at that node, but not the total number of data points associated with that node. The calculation of ni_j in Eq. (1) addresses this by giving greater weight to nodes that split a large proportion of the samples than to nodes that split small numbers of samples.
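Eq. (1) can be applied directly to the node arrays of a fitted scikit-learn tree; summing the node importances per feature and normalizing recovers the library's reported feature importances. The data below are synthetic and the function is an illustrative sketch, not the paper's analysis code.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; class depends only on feature 0.
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 3))
y = (X[:, 0] > 0).astype(int)
tree = DecisionTreeClassifier(max_depth=3, criterion="entropy",
                              random_state=0).fit(X, y)
t = tree.tree_

def node_importance(t, j):
    """ni_j = w_j*C_j - w_left*C_left - w_right*C_right (0 for leaf nodes)."""
    if t.children_left[j] == -1:  # leaf node: no split, no importance
        return 0.0
    w = t.weighted_n_node_samples / t.weighted_n_node_samples[0]
    left, right = t.children_left[j], t.children_right[j]
    return (w[j] * t.impurity[j]
            - w[left] * t.impurity[left]
            - w[right] * t.impurity[right])

ni = np.array([node_importance(t, j) for j in range(t.node_count)])
print(ni.round(3))  # the root split carries most of the importance here
```

Grouping these ni_j values by the feature each node splits on, then normalizing the sums to 1, matches the tree's feature_importances_ attribute.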
The impurity of a node is a measure of how well the data at that node are separated into the outcome classes. We use entropy as the impurity criterion in our forests, which for two classes ranges in value from 0 to 1. If the data at a given node are split evenly between the two classes, then the node impurity is maximized at 1; the more asymmetrically the data at that node are divided between the classes, the lower the node's entropy will be, reaching 0 for a pure node. The closer the entropy is to 1, the more difficult it is to draw conclusions from the data at that node. We calculate entropy using Eq. (3):

H = − Σ_c p_c log2(p_c),   (3)

where p_c is the fraction of examples in class c (where c here is either high-end GMSL or non-high-end GMSL).
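The entropy criterion of Eq. (3) is simple to verify numerically; this short sketch checks the two limiting cases described above.

```python
import numpy as np

# Entropy of a set of class labels: H = -sum_c p_c * log2(p_c),
# where p_c is the fraction of samples in class c
# (here: high-end vs. non-high-end GMSL).
def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum() + 0.0)  # + 0.0 avoids printing -0.0

print(entropy([0, 0, 1, 1]))  # evenly mixed classes: 1.0
print(entropy([0, 0, 0, 0]))  # pure node: 0.0
```

Note that only the classes present at the node contribute, which sidesteps the 0·log2(0) term for absent classes.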
Since we are constructing forests of trees, the feature importances that we present for a given forest are the mean feature importances over all the trees in the forest.
We compute the feature importances for each feature for each forest that we fit, in 5-year intervals from 2020 to 2150. With these importances, we construct a stacked bar graph from 2020 to 2150 in 5-year increments. The bar for each year shows the breakdown of the feature importances for that specific year; hence, all of the individual feature importance bars for a given year add up to 1. We define an "other" category such that model parameters with an importance less than 4% are grouped into "other." There are 38 model parameters, so if the importances were uniform across all the parameters, each importance would be about 2.6%. Hence, any "other" category threshold of 3% or less would capture importances that were not substantially different from the average. Because of that, we use 4% as the threshold for the "other" category.

Figure 2. Feature importances of the BRICK model parameters using (a) the RCP2.6 radiative forcing scenario, and (b) the RCP8.5 forcing scenario. All model parameters with an importance less than 4% were grouped into an "other" category, which is shown with hatch marks to denote its difference from the parameters. Stippling was added to alternating parameters in the legend to aid in telling the difference between similar colors.
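The "other" grouping step can be sketched as below. The parameter names and importance values are hypothetical (and not normalized over all 38 parameters); only the 4% threshold comes from the text.

```python
import numpy as np

# Hypothetical mean feature importances for one year's forest.
names = ["ECS", "alpha_DOECLIM", "T_crit", "alpha_TE", "p_a", "p_b"]
importances = np.array([0.28, 0.16, 0.06, 0.09, 0.03, 0.02])

threshold = 0.04  # parameters below 4% importance are pooled into "other"
keep = importances >= threshold
grouped = {n: float(v) for n, v in zip(np.array(names)[keep], importances[keep])}
grouped["other"] = float(importances[~keep].sum())
print(grouped)
```

Applying this per year produces the stacked-bar segments plotted in Fig. 2, with one "other" segment per bar.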

Feature Importances
Based on Fig. 2, equilibrium climate sensitivity (ECS) (darkest solid blue boxes) and the aerosol scaling factor (α_DOECLIM) (darkest stippled blue boxes) are consistently associated with the greatest high-end sea-level risk throughout both RCP2.6 and RCP8.5. Both of these model parameters are associated with the climate component of the BRICK model. ECS and α_DOECLIM account for 9.3% and 6.4% (respectively) of the overall feature importance in RCP2.6 in the year 2020 and increase to 28.1% and 15.9% by the year 2150 (Fig. 2a). In the higher forcing RCP8.5 scenario, ECS accounts for 9.2% and α_DOECLIM accounts for 6.6% of the overall feature importance in 2020; by 2150, they increase to 13.6% and 12.0%, respectively (Fig. 2b). The relatively higher importance of these climate module parameters in the lower forcing RCP2.6 scenario is indicative of the large influence exerted by those parameters on the severity of the resulting sea-level rise. The only parameter in RCP2.6 that has an importance greater than 4% and belongs to a non-climate component of BRICK is the temperature associated with Antarctic ice sheet fast disintegration, T_crit (Fig. 2a). This suggests that in the high-end SOWs of sea-level rise, even in the low-forcing RCP2.6 scenario, Antarctic ice sheet dynamics can still drive severe risks to coastal areas by the middle of the 21st century and beyond.

Characterization of risk over time
In contrast to high-end sea-level rise being driven primarily by climate uncertainties under RCP2.6, in the higher forcing RCP8.5 scenario the importances of the ECS and aerosol scaling factor are relatively lower. This is indicative of a transition toward uncertainty in sea-level processes driving high-end risks under the higher forcing scenario. The near-term risk in both RCP2.6 and RCP8.5 is driven by sea-level rise from thermal expansion (α_TE). Figure 2 (solid black boxes) shows that the parameters related to thermal expansion are important from 2020 until the middle of the century, becoming less important as time goes on. In RCP2.6, α_TE comprises 9.0% of the overall feature importance in 2020, which then decreases to 4.4% in 2050. Likewise, in RCP8.5, α_TE accounts for 8.8% in 2020 and decreases to 4.3% in 2040.
As for the long-term risks, ice loss from the Antarctic ice sheet is a driver in both emissions scenarios. In particular, the T_crit and λ parameters within the Antarctic ice sheet component of BRICK are important (Fig. 2, stippled dark brown boxes and solid taupe boxes, respectively). Within the BRICK model, T_crit is the temperature associated with the onset of fast dynamical disintegration of the Antarctic ice sheet. In RCP8.5, T_crit is particularly important from 2040 to 2060. This is consistent with predictions that major Antarctic ice sheet disintegration will occur between 2040 and 2070 in high radiative forcing scenarios (Kopp et al., 2017; DeConto et al., 2021; Wong et al., 2017a; Nauels et al., 2017).
In addition to the Antarctic ice sheet, ice loss from the Greenland ice sheet also poses a risk in the long term. In the case of the Greenland ice sheet, the sea-level rise associated with its ice loss has substantial importance only in RCP8.5. The specific model parameters associated with the Greenland ice sheet are b_SIMPLE, α_SIMPLE, β_SIMPLE, and a_SIMPLE (Fig. 2b; stippled green boxes, solid light green boxes, stippled yellow boxes, and solid tan boxes, respectively).

ECS threshold of high-end sea-level rise
Previous work by Vega-Westhoff et al. (2020) used 5 °C as the value of ECS that separates high-end and non-high-end climate risk scenarios. Using our collection of random forests, we select the highest ECS split value that each decision tree in the forest split on. These split values should separate the highest-risk scenarios of GMSL in each time interval. Figure 3 shows the distribution of the maximum ECS splits for every 5 years in the 2020 to 2150 period.
There is a considerable number of outliers of the maximum ECS split value in RCP2.6, particularly in the high-risk upper tail (Fig. 3a). However, the interquartile ranges are quite small. In most cases, the interquartile range for RCP2.6 falls between approximately 5.75 °C and 7 °C (Fig. 3a). The greatest interquartile range width is 1.6 °C in 2020, and the smallest width is 0.5 °C in 2120, with an average of 0.9 °C (Table A6). Likewise, the median is consistently between 6 °C and 6.5 °C: in RCP2.6, the medians of the maximum ECS split over the given years range from 5.9 °C to 6.6 °C (Table A6).
There are fewer outliers in RCP8.5 than in RCP2.6, but RCP8.5 has wider uncertainty ranges: the interquartile ranges of the maximum ECS split value in RCP8.5 span a larger range than those of RCP2.6. Here, the interquartile ranges are mostly contained between 5.75 °C and 7.25 °C (Fig. 3b). The interquartile range width in RCP8.5 is between 0.9 °C (in 2050) and 1.9 °C (in 2020), with an average of 1.3 °C (Table A6). However, the median is consistently between 6 °C and 6.5 °C: in RCP8.5, the medians of the maximum ECS split over the given years range from 6.29 °C to 6.64 °C (Table A6).

Figure 3. Distributions of the maximum ECS split value from each decision tree in the fitted random forests. The outliers are the data points less than Q1 − 1.5 * IQR or greater than Q3 + 1.5 * IQR, where Q1 is Quartile 1, Q3 is Quartile 3, and IQR is the interquartile range (Q3 − Q1). The blue boxes show the IQR, and the line within the IQR is the median.
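The boxplot statistics quoted above follow the standard Tukey convention and can be sketched as follows; the synthetic split values are a stand-in for one year's distribution of per-tree maximum ECS splits.

```python
import numpy as np

# Boxplot summary: points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are outliers.
def box_stats(values):
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = values[(values < lo) | (values > hi)]
    return med, iqr, outliers

rng = np.random.default_rng(5)
max_ecs_splits = rng.normal(6.3, 0.5, size=200)  # synthetic stand-in (deg C)
med, iqr, out = box_stats(max_ecs_splits)
print(round(med, 2), round(iqr, 2), out.size)
```

Applying this per year and per RCP yields the medians, IQR widths, and outlier counts reported in Table A6 and Fig. 3.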
Since the median is consistently between 6 °C and 6.5 °C in both the low and high radiative forcing scenarios, our random forests suggest that an ECS value between 6 °C and 6.5 °C could be used as a threshold for high-end scenarios of global mean sea-level rise. In a supplemental experiment, we reproduced the same experiment using the 80th percentile of the global mean sea level to denote the high-end cases (as opposed to the 90th percentile in the main experiments). We find that the distribution of the maximum ECS split value from each decision tree in the random forests fitted using the 80th percentile of the data is very similar to that of the 90th percentile (Fig. A2). Hence, our results are not sensitive to our choice to use the 90th percentile.
In this work, we present an approach to classify global mean sea level from the BRICK semi-empirical model as either high-end or non-high-end. We use a previously published model output data set to construct random forests for 5-year increments from 2020 to 2150 for RCP2.6 and RCP8.5. We explore the parametric drivers of risk by examining the feature importances of the random forests. Results show that climate components of the BRICK model, specifically equilibrium climate sensitivity (ECS) and the aerosol scaling factor (α_DOECLIM), are consistently the most important parametric uncertainties in both radiative forcing scenarios (Fig. 2). Our results also show that thermal expansion and glacier and ice cap mass loss both pose a risk in the near term, and that long-term risks are driven by mass loss from the Greenland and Antarctic ice sheets (Fig. 2). In addition to the feature importances, we find that an ECS value greater than 6 °C to 6.5 °C indicates a high-end sea-level rise scenario, based on the maximum ECS split values that each decision tree in a forest split on (Fig. 3). This result holds for both RCP2.6 and RCP8.5, as well as when using either the 80th or 90th percentile to characterize the sea-level data as "high-end."
These results demonstrate the nonstationary risks posed by climate change and the related hazards driven by sea-level change. In turn, climate risk management strategies must include near-term actions to mitigate near-term risks, such as sea-level rise from thermal expansion and from glaciers and ice caps, while also guarding against the long-term risks driven by mass loss from the major ice sheets. While this work centered on the impact of model parametric uncertainties on sea-level hazards, the same machine learning approaches can be generalized to incorporate the socioeconomic uncertainties that relate future climate hazards (e.g., changes in temperatures and sea levels) to financial and human risks. These approaches offer promise for a more holistic view of the uncertainties affecting future climate risk and the efficacy of human strategies to mitigate and manage these risks.
Code and data availability. All data and modeling and analysis codes are freely available from https://drive.google.com/drive/folders/1iHSPr7qpBcQDwrKLQyem8o10duvaMNAG?usp=sharing. (After the review process, this Google Drive link will be replaced by a stable Zenodo repository URL and this italicized statement will be deleted.)

Appendix A

Table A6. Quartile descriptions of the distributions of the maximum equilibrium climate sensitivity (ECS) split value from each decision tree in the fitted forests. Q1 denotes Quartile 1, which is the 25th percentile of the data. The median is the 50th percentile of the data. Q3 denotes Quartile 3, which is the 75th percentile of the data. IQR stands for the interquartile range, which is calculated as Q3 − Q1.

Figure A1. Distributions of the maximum equilibrium climate sensitivity (ECS) split value from each decision tree in the fitted random forests. The left column of plots depicts the maximum ECS split distributions in the RCP2.6 forests, and the right column depicts the maximum ECS split distributions in the RCP8.5 forests.

Figure A2. Distributions of the maximum equilibrium climate sensitivity (ECS) split value from each decision tree in the random forests fitted using the 80th percentile to classify the global mean sea level (GMSL) data. (a) depicts the maximum ECS split distributions in the RCP2.6 forests, and (b) depicts the maximum ECS split distributions in the RCP8.5 forests. The outliers are the data points less than Q1 − 1.5 * IQR or greater than Q3 + 1.5 * IQR, where Q1 is Quartile 1, Q3 is Quartile 3, and IQR is the interquartile range (Q3 − Q1). The blue boxes show the IQR, and the line within the IQR is the median.