Numerous marine applications require the prediction of medium- and long-term sea states. Climate models are mainly focused on the description of the atmosphere and global ocean variables, most often on a synoptic scale. Downscaling models exist to move from these atmospheric variables to the integral descriptors of the surface state; however, they are most often complex numerical models based on physics equations that entail significant computational costs. Statistical downscaling models provide an alternative to these models by constructing an empirical relationship between large-scale atmospheric variables and local variables, using historical data. Among the existing methods, deep learning methods are attracting increasing interest because of their ability to build hierarchical representations of features. To our knowledge, these models have not yet been tested in the case of sea state downscaling. In this study, a convolutional neural network (CNN)-type model for the prediction of significant wave height from wind fields in the Bay of Biscay is presented. The performance of this model is evaluated at several points and compared to other statistical downscaling methods and to WAVEWATCH III hindcast databases. The results obtained from these different stations show that the proposed method is suitable for predicting sea states. The observed performances are superior to those of the other statistical downscaling methods studied but remain inferior to those of the physical models. The low computational cost and the ease of implementation are, however, important assets for this method.

The prediction of sea states meets multiple maritime needs. In particular, waves are the major environmental forcing at sea. Their prediction is, therefore, required for the sizing of marine structures, for the implementation of marine energy converters, or for seakeeping. Because of the long life of these structures, engineers need medium- and long-term future projections, as well as extensive time series. Indeed, the design of offshore structures usually requires a 100-year return period (Ewans and Jonathan, 2012). Other applications, such as coastal flood risk prevention, demand the characterization of wave climatology (Idier et al., 2020). For atmospheric variables, these predictions are provided by general circulation models (GCMs), but these models are coarse and hardly predict oceanic variables. To move from GCMs to the oceanographic predictions needed by industry and policy makers, dynamical and statistical downscaling (SD) methods have been developed.

In perfect prognosis statistical downscaling, an empirical relationship is built between large-scale atmospheric predictors and local wave parameters. Its main advantage is its low computational cost compared to numerical models. The efficiency of these methods makes it possible to explore different GCM scenarios and models and to carry out several runs to estimate the uncertainties (Trzaska and Schnarr, 2014). This type of model has already been implemented in several studies for the prediction of ocean waves (Laugel et al., 2014; Wang et al., 2010). Among the possible approaches for statistical downscaling problems, machine learning approaches and, more precisely, deep learning ones have proven valuable, thanks to their ability to extract high-level feature representations in a hierarchical way (Baño-Medina et al., 2020). However, these approaches are still perceived as black boxes, which explains the lack of confidence in them within the climate community, especially for climate change issues. Nevertheless, there are increasing calls to encourage research towards the understanding of deep neural networks in climate science (Reichstein et al., 2019).

In this study, a deep neural network is developed to describe the relationship between global wind and local sea state while accounting for the spatial structure of this relationship. The choice was made to focus on the prediction of the significant wave height (

The following three paragraphs describe the main characteristics of the datasets used in this study.

The Climate Forecast System Reanalysis (CFSR) is a hindcast database (Saha et al., 2010). Wind and other environmental variables (humidity, pressure, etc.) are recomputed with a homogeneous model and data assimilation system to eliminate fictitious trends caused by model and data assimilation changes in real time. The main purpose of this type of treatment is to obtain consistent datasets for the study of climate.

Only 10 m wind components

Predicting wave climatology is particularly complex due to the random nature of the ocean, the lack of data, and the difference in scale between the phenomena. Numerical wave prediction models, which are based on the physics of the phenomenon, are powerful tools to address this problem (Thomas and Dwarakish, 2015). WAVEWATCH III is a phase-averaged spectral wave model based on a discretization of the energy balance equation with finite differences in time, as follows:

The accuracy of WAVEWATCH III depends on the accuracy of the forcing fields, on the parameterization of the source terms, and on the effects of the numerical schemes (Roland and Ardhuin, 2014). In deep water, the predominant forcing is the wind, followed by currents and water level. This order is reversed in coastal areas.

In this study, data from two hindcast databases using the WAVEWATCH III model are exploited as follows:

In situ data are difficult to use in statistical downscaling models. Indeed, deep learning models usually require a large amount of data to be fitted, whereas in situ records are often too short (less than a decade) and contain missing values. Sea state data from satellite altimetry are particularly complex to use because of their sparsity. Buoys provide high-frequency wave measurements at their location. However, buoys with a sufficiently long and continuous history are rather rare, and their spatial distribution is sparse.

The idea of a ground truth can be misleading, and there are challenges in using existing in situ wave measurements. Indeed, data from physical measurements are noisy, which may induce bias and extra variance in the models. Moreover, a buoy estimates the significant wave height from its own motion. Thus, the characteristics of the buoy, such as its size, composition, or structure, as well as the characteristics of the sensor and the processing chain, can alter the estimation of

The statistical downscaling method presented in this study is a deep learning model. Deep learning algorithms have recently become very popular because they outperform other prediction algorithms in many fields. Their rapid expansion has been enabled by the increase in processing power, the growing amount of available data, and the development of more advanced optimization algorithms. Deep learning models usually involve a huge number of parameters; however, they have good generalization capacities, and methods exist to help interpret the different levels of the network (Gagne et al., 2019). To obtain better performance and to facilitate learning by reducing the dimensionality of the predictor, this model does not work directly on the 10 m wind but on a projected and time-shifted wind. The following paragraph describes the construction of the predictors used as input to the proposed model.

The predictors introduced here have been proposed in Obakrim et al. (2022) for a linear statistical downscaling model for the significant wave height. They are based on some simplified physics rules. In particular, the deep-water hypothesis (depth

At a given time, the wave field is the combination of swells generated over long distances, often several days before, and wind sea waves generated by the local wind. It is, therefore, important that the predictors account for the complex phenomenon of wave generation and the non-instantaneous and non-local relationship between wind and waves. Consequently, two sets of predictors are needed, i.e., a local and a global predictor, denoted, respectively, as

Distance to the point of interest.

The global predictor component,

The estimated temporal width

Estimated travel time (left). Estimated temporal width in hours (right; from Obakrim et al., 2022).
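To give an order of magnitude for such travel times, the sketch below combines a great-circle distance with the deep-water group velocity, c_g = gT/(4π). It is only a physically motivated illustration and not the estimation procedure of Obakrim et al. (2022); the function names and the representative wave period are assumptions introduced here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (m)
G = 9.81  # gravitational acceleration (m/s^2)


def great_circle_distance_m(lat1, lon1, lat2, lon2):
    """Haversine distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def deep_water_group_velocity(period_s):
    """Group velocity (m/s) of a wave of the given period in deep water."""
    return G * period_s / (4 * math.pi)


def travel_time_hours(distance_m, period_s):
    """Time (h) needed by wave energy to cover distance_m in deep water."""
    return distance_m / deep_water_group_velocity(period_s) / 3600.0


# Hypothetical example: a 12 s swell travelling 2800 km.
hours = travel_time_hours(2.8e6, 12.0)
```

With these assumed values, the swell energy needs roughly 83 h (about 3.5 d) to reach the target, which is consistent with the statement that swells are often generated several days before they are observed.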

The local predictor

Deep learning techniques are promising approaches for statistical downscaling because of their ability to learn spatial features from huge spatiotemporal datasets. The model presented in this paper is a convolutional neural network (CNN). It is inspired by the CNN-PR (pattern recognition) network, using standard topologies from pattern recognition for precipitation downscaling presented in Baño-Medina et al. (2020).

The use of CNNs for the statistical downscaling of the sea state is motivated by properties of CNNs that are relevant to the prediction of wind-generated waves, as follows:

The sea state results from the combination of wind sea and swell. Thus, the model needs to take both wave systems into consideration. This is achieved by implementing a hybrid CNN model, HCNN, which takes two inputs, i.e., the global and the local predictor. Figure 3 shows a schematic view of the network architecture.

Architecture of the hybrid neural network.
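The two-input idea can be illustrated with a minimal forward pass written in plain NumPy: a convolutional branch processes the gridded global predictor, a dense branch processes the local predictor, and both are concatenated before a final regression layer. This is a sketch only; the layer sizes, the grid shape, and the names `init_params` and `hybrid_forward` are hypothetical and do not reproduce the architecture of Fig. 3.

```python
import numpy as np


def conv2d_valid(x, kernels):
    """Valid 2-D cross-correlation.

    x: (H, W, C_in); kernels: (kh, kw, C_in, C_out).
    """
    kh, kw, _, c_out = kernels.shape
    h_out, w_out = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((h_out, w_out, c_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i:i + kh, j:j + kw, :]
            out[i, j, :] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out


def relu(x):
    return np.maximum(x, 0.0)


def init_params(grid_shape, n_local, rng, n_filters=4, k=3, n_hidden=8):
    """Random weights for one conv layer, one dense layer, and the output layer."""
    h, w, c = grid_shape
    n_feat = (h - k + 1) * (w - k + 1) * n_filters
    return {
        "conv_k": rng.normal(0.0, 0.1, (k, k, c, n_filters)),
        "conv_b": np.zeros(n_filters),
        "loc_w": rng.normal(0.0, 0.1, (n_local, n_hidden)),
        "loc_b": np.zeros(n_hidden),
        "out_w": rng.normal(0.0, 0.1, n_feat + n_hidden),
        "out_b": 0.0,
    }


def hybrid_forward(global_pred, local_pred, params):
    """Predict one scalar from a gridded global input and a local vector."""
    # Convolutional branch: extracts spatial features from the global predictor.
    feat = relu(conv2d_valid(global_pred, params["conv_k"]) + params["conv_b"]).ravel()
    # Dense branch: processes the local predictor.
    loc = relu(local_pred @ params["loc_w"] + params["loc_b"])
    # Merge both branches and apply a linear output layer.
    merged = np.concatenate([feat, loc])
    return float(merged @ params["out_w"] + params["out_b"])


rng = np.random.default_rng(0)
params = init_params(grid_shape=(10, 12, 1), n_local=3, rng=rng)
prediction = hybrid_forward(rng.normal(size=(10, 12, 1)), rng.normal(size=3), params)
```

Keeping the two branches separate until the merge lets each predictor be processed by a layer type suited to its structure (gridded versus vector-valued).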

This neural network has been implemented with the Python deep learning library Keras (

The local predictor is normalized before being fed into the neural network to improve numerical stability of the model and its computational efficiency (Shanker et al., 1996). The Adam optimizer is a stochastic gradient descent algorithm in which individual adaptive learning rates are computed for different parameters from the estimates of the first and second moments of the gradients. It was chosen because its hyperparameters have an intuitive interpretation and typically require little tuning (Kingma and Ba, 2015). In addition, the Adam optimization algorithm can handle sparse gradients on noisy problems. The selected loss function is the mean squared error (MSE).
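The three ingredients named above (input normalization, the Adam update, and the MSE loss) can be sketched on a toy linear model. The example uses full-batch gradients for brevity, whereas Adam is normally run on mini-batches; the learning rate, the number of steps, and the function names are assumptions made for this illustration, while the decay rates and epsilon follow the defaults suggested by Kingma and Ba (2015).

```python
import numpy as np


def standardize(x, mean=None, std=None):
    """Zero-mean, unit-variance scaling; statistics should come from the training set."""
    mean = x.mean(axis=0) if mean is None else mean
    std = x.std(axis=0) if std is None else std
    return (x - mean) / std, mean, std


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


def adam_fit_linear(x, y, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, n_steps=2000):
    """Fit y ~ x @ w + b by minimizing the MSE with Adam (full batch for simplicity)."""
    n, d = x.shape
    xb = np.hstack([x, np.ones((n, 1))])  # append a column of ones for the bias
    theta = np.zeros(d + 1)
    m = np.zeros_like(theta)  # first-moment estimate of the gradient
    v = np.zeros_like(theta)  # second-moment estimate of the gradient
    for t in range(1, n_steps + 1):
        grad = 2.0 / n * xb.T @ (xb @ theta - y)  # gradient of the MSE
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)  # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        theta -= lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return theta[:-1], theta[-1]


rng = np.random.default_rng(0)
x, x_mean, x_std = standardize(rng.normal(5.0, 3.0, size=(200, 2)))
y = 2.0 * x[:, 0] - 1.0 * x[:, 1] + 0.5
w, b = adam_fit_linear(x, y)
```

The division by the square root of the second-moment estimate is what gives each parameter its own effective learning rate.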

This neural network was trained using data from a point in the Bay of Biscay (located at coordinates 45.25

A 3 h time series

To evaluate the performance of downscaling methods, classical metrics are used.

The root mean squared error (RMSE) is the measure of the accuracy, as follows:
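For reference, the scores used in this study can be computed as in the sketch below. The RMSE, bias, and correlation follow their usual definitions; the scatter index (RMSE divided by the mean of the reference values) and RMSEQ95 (RMSE restricted to reference values above their 95th percentile) are written under assumed definitions, and the function names are introduced here for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def bias(y_true, y_pred):
    """Mean error (positive when the prediction overestimates)."""
    return float(np.mean(y_pred - y_true))


def scatter_index(y_true, y_pred):
    # Assumed definition: RMSE normalized by the mean of the reference values.
    return rmse(y_true, y_pred) / float(np.mean(y_true))


def correlation(y_true, y_pred):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def rmse_q95(y_true, y_pred):
    # Assumed definition: RMSE over reference values above their 95th percentile.
    mask = y_true > np.quantile(y_true, 0.95)
    return rmse(y_true[mask], y_pred[mask])


# Synthetic example: a prediction that is uniformly 1 unit too high.
y_ref = np.arange(1.0, 101.0)
y_hat = y_ref + 1.0
scores = {
    "rmse": rmse(y_ref, y_hat),
    "bias": bias(y_ref, y_hat),
    "si": scatter_index(y_ref, y_hat),
    "corr": correlation(y_ref, y_hat),
    "rmse_q95": rmse_q95(y_ref, y_hat),
}
```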

The performance of the HCNN model is evaluated through seven-fold cross-validation. To do so, the dataset is divided into seven periods of 3 consecutive years. Each of these periods is then successively taken as a validation dataset, with the rest of the data being used for learning. For each of these iterations, five models are trained, and the grand mean of each score is computed on each fold. The training of five individual models is a robust approach to evaluating the skills of deep learning models that are stochastic because of their use of randomness during training, particularly because of the initialization to a random state and the use of stochastic gradient descent algorithms.
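The validation protocol described above can be sketched as follows. The helper `year_block_folds` builds validation blocks of consecutive years, and `cross_validate` averages the score of several randomly initialized models on each fold. Both names, the `fit_and_score` callback, and the synthetic 1996 to 2016 year range are hypothetical and only illustrate the procedure.

```python
import numpy as np


def year_block_folds(years, n_folds=7, years_per_fold=3):
    """Yield (train_mask, val_mask); each validation block spans consecutive years."""
    unique_years = np.unique(years)
    if unique_years.size != n_folds * years_per_fold:
        raise ValueError("expected n_folds * years_per_fold distinct years")
    for k in range(n_folds):
        block = unique_years[k * years_per_fold:(k + 1) * years_per_fold]
        val_mask = np.isin(years, block)
        yield ~val_mask, val_mask


def cross_validate(x, y, years, fit_and_score, n_models=5):
    """Return one score per fold, averaged over n_models random restarts."""
    fold_scores = []
    for train, val in year_block_folds(years):
        scores = [
            fit_and_score(x[train], y[train], x[val], y[val], seed)
            for seed in range(n_models)
        ]
        fold_scores.append(float(np.mean(scores)))
    return fold_scores


# Synthetic example: four samples per year over 21 years.
years = np.repeat(np.arange(1996, 2017), 4)
folds = list(year_block_folds(years))
```

Using contiguous blocks of years, rather than random samples, keeps strongly autocorrelated neighboring time steps from being split between the training and validation sets.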

Statistical scores for each fold (the orange curve is the average score for each fold).

According to Fig. 5, HCNN gives results of the same order of magnitude over each validation period. The variations observed can be explained by the differences in climatology specific to each period. The standard deviation of the HOMERE

Standard deviation of

Since the performances of the HCNN model are homogeneous on each of the folds, only the last one is kept to conduct an in-depth study of the performances of the model. The selected training and validation datasets are as follows:

Train – 13 January 1994 to 31 December 2013 (size of 58 341)

Validation – 1 January 2014 to 31 December 2016 (size of 8767).

The results over the test period are satisfactory. The RMSE is 27 cm, which is 14 % of the mean value of the target

The plots are consistent with the indicators previously outlined. The scatterplot shows a rather weak dispersion around the diagonal line. Moreover, there are no outliers (points very far from this line). The quantile–quantile diagram shows that the distribution of

The significant wave height derived from the energy spectrum gives an overall description of the sea state. However, as highlighted above, a sea state is generally made up of wind waves caused by local winds and several swell trains propagated over large distances. To better understand the performance of the method, it is interesting to know how well each of these components is predicted. Energy spectrum partitioning methods have been developed to identify these components. In the output of HOMERE, the partitioning is obtained using the WaveSEP (Wave Spectrum Energy Partitioning) method, which is based on a watershed algorithm (Tracy et al., 2007).

The developed neural network is unable to determine this partitioning. Yet, it is possible to use the classification of wave systems described in Maisondieu (2017) to analyze the performance of the HCNN model. The following six possible combinations of sea states are identified in increasing order of complexity: wind sea wave, swell, wind sea wave

Proportion and mean of target

The following bar plot shows the values of the four statistical indicators previously introduced, plus the bias, for each of the six possible combinations of sea states.

Bar plot of prediction scores for each sea state during the test period selected for node no. 7818.

According to Fig. 8, RMSE and RMSEQ95 are higher for wind sea waves and are almost identical for the other sea states. However, the correlation decreases and the scatter index, which gives similar information, increases with the complexity of the sea states. Two hypotheses can explain this result. On the one hand, the chosen cost function (MSE) penalizes errors identically for low and high

The previous results were obtained using all available history and all points that are not masked by an island or continent. However, it is reasonable to assume that, beyond a certain distance, the waves generated by the wind do not reach the target location (or only a small fraction of their energy) and that the model does not need such an amount of data. To determine the width of the spatial and temporal window, several models are trained by gradually widening them. The two RMSE indicators are used to quantify the evolution of the prediction error. Due to the geometry of the global predictor (see Fig. 1), the search for the spatial window is simplified to the search for the optimal range of longitudes, which is hereafter referred to as

RMSE and RMSEQ95 versus

The plot in Fig. 9 (top left panel) shows that the accuracy of the prediction does not improve by taking winds located at more than 50

In the second figure, one can observe that the prediction improves by increasing the learning period. However, a few years are enough to obtain sufficiently accurate predictions (RMSE less than 30 cm, with 5 years of training). Moreover, the slope of the RMSE curve becomes less steep after about 10 years. The considered learning period must also be long enough to cover the variations in the ocean–atmosphere regime over the region. Concerning the Bay of Biscay, 10 years seems to be a sufficient period to observe positive or negative North Atlantic Oscillation (NAO) phases.

The two methods used for the comparison were developed by our team and use the previously introduced predictors.

Scores of

The analog method, originally introduced by Lorenz (1969), is a classical and simple statistical downscaling method that performs, in general, as well as the more complicated methods (Zorita and von Storch, 1999). The recent proliferation of data in atmospheric and oceanographic sciences has strengthened the scientific interest in this method (see, e.g., Platzer et al., 2021 and references therein).

The proposed method is composed of the following four main steps:

The dissipation of wave energy over long distances is difficult to quantify, but some studies show that swells can lose a significant part of their energy (65 % over 2800 km; Ardhuin et al., 2009). To account for the decreasing influence of wind with distance to the target location, a Gaussian kernel is applied to the global predictor before the PCA. The radius of the Gaussian kernel determined after optimization is

Local predictor – local wind

Mean distance to neighbors – the further away from the nearest neighbors, the greater the correction to be applied.
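The building blocks mentioned for the analog method (Gaussian distance weighting, PCA, and a nearest-neighbor search that also returns the mean distance to the neighbors) can be sketched generically in NumPy. This is not the authors' implementation: the kernel radius, the number of components, the number of neighbors, and all function names are placeholders chosen for the example.

```python
import numpy as np


def gaussian_weights(distance_km, radius_km):
    """Weights that decay with distance to the target location."""
    return np.exp(-0.5 * (distance_km / radius_km) ** 2)


def pca_fit(x, n_components):
    """Return the mean and the leading principal directions of x (samples in rows)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(x, mean, components):
    return (x - mean) @ components.T


def analog_predict(z_train, y_train, z_new, k=10):
    """Average the targets of the k nearest training states in PCA space.

    Also returns the mean distance to these neighbors for each new state.
    """
    d = np.linalg.norm(z_train[None, :, :] - z_new[:, None, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    mean_dist = np.take_along_axis(d, idx, axis=1).mean(axis=1)
    return y_train[idx].mean(axis=1), mean_dist


# Synthetic example: 50 states described by 6 grid points.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(50, 6))
y_train = rng.normal(size=50)
weights = gaussian_weights(np.linspace(0.0, 3000.0, 6), radius_km=1000.0)
mean, components = pca_fit(x_train * weights, n_components=3)
z_train = pca_transform(x_train * weights, mean, components)
y_pred, mean_dist = analog_predict(z_train, y_train, z_train[:5], k=1)
```

In this toy call, each queried state is its own nearest neighbor, so the prediction reproduces the stored target and the mean distance is zero; with unseen states, a large mean distance signals that the analogs are poor.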

The second model compared with HCNN is a linear regression model (Obakrim et al., 2022).

Both models were evaluated on the station presented in the previous section. The results of the three statistical downscaling methods are summarized in Table 2.

The HCNN method gives better results for the four selected indicators. Next comes the linear regression and then the analog method. It should be noted, however, that while the HCNN method outperforms the linear model, it is much less interpretable. There is a tradeoff between interpretability and performance. The analog method, even if it is the conceptually simplest of the three, obtains performances quite close to the linear model.

Scores of

In the previous sections, the HCNN neural network was only trained on the outputs of the HOMERE numerical model. It is also interesting to evaluate the ability of the network to learn from in situ data. Here, the reference data are from station no. 62001
(

The waves are, on average, higher than for the previous station (mean of 2.43 m; min 0.20 m and max 11.75 m).

The selected train and test periods are as follows:

Train – 29 July 1998 to 31 December 2013 (size of 41 696)

Test – 1 January 2014 to 31 December 2016 (size of 8541).

According to Table 3 and Fig. 10, the prediction made by RSCODE is closer to the

A 3 h time series of

In Fig. 11, one can notice the smoothness of the prediction of

The deep learning model presented in this study achieves good performance in predicting local sea states from local and global winds. These results confirm the suitability of these approaches for statistical downscaling problems. The HCNN model even outperforms the other two statistical downscaling methods studied. However, this performance comes at the cost of interpretability. Numerical models remain more accurate than statistical downscaling methods, but this result alone is not sufficient to judge the value of the latter. Indeed, unlike numerical models, the prediction of

Many perspectives remain open for the development of this model. First, we have so far been limited to predicting the significant wave height at a single station. Since the global predictor is defined on a large scale, it remains valid near the target location (the distance of validity remains to be defined), and thus, the prediction could be extended to the neighboring points by varying only the local predictor. Second, the method should be able to account for tidal currents and variations in water level in order to be applied in coastal environments. This point deserves further study and has not been addressed here in order to focus on the comparison with other models. Finally, the low computational cost makes it possible to use ensemble weather forecasts to predict the significant wave height and a confidence interval, thus reinforcing the value of the prediction made and its usability for marine applications. The proposed methodology could be useful to extend the available history of sea state conditions at locations where such information is sparse. For example, one could be interested in estimating the energy producible by a WEC (Payne et al., 2021) or estimating weather windows for the operation and maintenance of offshore structures (Walker et al., 2013).

Python scripts and data used in Sect. 4 are available online at

SO proposed the project of using DL to predict

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors thank the two anonymous reviewers for their numerous and very useful comments and suggestions.

This paper was edited by William Hsieh and reviewed by two anonymous referees.