Spatial heterogeneity in rain-bearing winds, seasonality and rainfall variability in southern Africa’s winter rainfall zone

A renewed focus on southern Africa’s winter rainfall zone (WRZ) following the Day Zero drought and water crisis has not shed much light on the spatial patterns of its rainfall variability and climatological seasonality. However, such understanding remains essential in studying past and potential future climate changes. Using a dense station network covering the region encompassing the WRZ, we study spatial heterogeneity in rainfall seasonality and temporal variability. These spatial patterns are compared to those of rainfall occurring under each ERA5 synoptic-scale wind direction sector. A well-defined “true” WRZ is identified with strong spatial coherence between temporal variability and seasonality not previously reported. The true WRZ is composed of a core and periphery beyond which lies a transition zone to the surrounding year-round rainfall zone (YRZ) and late summer rainfall zone. In places, this transition is highly complex, including where the YRZ extends much further westward along the southern mountains than has previously been reported. The core receives around 80 % of its rainfall with westerly or north-westerly flow compared to only 30 % in the south-western YRZ incursion, where below-average rainfall occurs on days with (usually pre-frontal) north-westerly winds. This spatial pattern corresponds closely to those of rainfall seasonality and temporal variability. Rainfall time series of the core and surroundings are very weakly correlated (R² < 0.1), also in the winter half-year, implying that the YRZ is not simply the superposition of summer and winter rainfall zones. In addition to rain-bearing winds, latitude and annual rain day climatology appear to influence the spatial structure of rainfall variability but have little effect on seasonality.
Mean annual rainfall in the true WRZ exhibits little association with the identified patterns of seasonality and rainfall variability despite the driest core WRZ stations being an order of magnitude drier than the wettest stations. This is consistent with the general pattern of near homogeneity within the true WRZ, in contrast to steep and complex spatial change outside it.

W. S. Conradie et al.: SA winter rainfall zone seasonality and variability
biodiversity hotspot (Bayer, 1984; Born et al., 2007; Myers et al., 2000), which has been the focus of extensive ecological and palaeoclimatic research that has contributed substantial climatic insights (e.g. Quick et al., 2016; Engelbrecht et al., 2019; Slingsby et al., 2021, and references therein). The eco-region's spatial evolution over long timescales is seen as key to understanding changes in the Southern Hemisphere (SH) westerlies (Chase and Meadows, 2007; Stager et al., 2012). However, both ecological diversity and palaeoecological change patterns differ between the WRZ and the YRZ, with considerable uncertainty remaining about patterns of past climate change, particularly in the transitional zones (Engelbrecht et al., 2019). Disagreements regarding rainfall seasonality and variability in the transitional regions from the WRZ into its surroundings (Roffe et al., 2019) may be a contributing factor.
There is broad consensus that the WRZ is largely confined to a narrow band along the South African west coast, but in some undersampled transition regions considerable differences in classification occur by data source and classification method. Roffe et al. (2019) review a range of approaches to defining South African rainfall seasonality, finding particular disagreement regarding the boundaries between the WRZ, YRZ and summer rainfall zone (SRZ) along the south-western sections of southern Africa's Great Escarpment (henceforth referred to as "the south-western escarpment"; see Sect. 2 and Fig. 1). Following Chase and Meadows (2007), in recent palaeoclimatic and ecological studies, the WRZ conventionally has been defined as the region where more than 66 % of mean annual rainfall (MAR) occurs during the winter half-year (April-September or AMJJAS; henceforth, we follow the conventional practice of denoting all seasons by the capitalised, ordered sequence of the first letter of the months in the season). Locations receiving 33 % to 66 % of MAR during AMJJAS are then classified as YRZ and those receiving less than 33 % as SRZ (e.g. Chase et al., 2019; Granger et al., 2018; Procheş et al., 2005), usually using WorldClim data (Hijmans et al., 2005; Fick and Hijmans, 2017). In some ecological studies, MJJASO (the 6 coolest months in the WRZ and YRZ) has been used as an alternative winter half-year definition (e.g. Aschmann, 1973; Bradshaw and Cowling, 2014). A continuous measure such as mean seasonal rainfall contributions helps identify spatially gradual and abrupt seasonality transitions (Bradshaw and Cowling, 2014). However, a shortcoming of these approaches is that they are unable to capture differences in monthly rainfall climatology, which are important when considering locations with a distinct late summer or bimodal rainfall peak (cf. Roffe et al., 2021b).
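The conventional threshold classification described above can be sketched in a few lines of Python. This is only an illustration of the 66 % and 33 % rule as stated in the text; the function names `pct_amjjas` and `classify_zone` are hypothetical, and the input is assumed to be twelve climatological monthly means ordered from January to December.

```python
def pct_amjjas(monthly_means):
    """Percentage of mean annual rainfall (MAR) falling in April-September.

    monthly_means: 12 climatological monthly totals, January first (assumed).
    """
    if len(monthly_means) != 12:
        raise ValueError("expected 12 monthly means")
    mar = sum(monthly_means)
    if mar <= 0:
        raise ValueError("MAR must be positive")
    winter = sum(monthly_means[3:9])  # indices 3..8 are April..September
    return 100.0 * winter / mar


def classify_zone(pct):
    """Apply the conventional thresholds: >66 % WRZ, 33-66 % YRZ, <33 % SRZ."""
    if pct > 66:
        return "WRZ"
    if pct >= 33:
        return "YRZ"
    return "SRZ"


# Illustrative (made-up) station with a strong winter peak
winter_station = [10, 10, 10, 50, 50, 50, 50, 50, 50, 10, 10, 10]
print(classify_zone(pct_amjjas(winter_station)))
```

Because the thresholds act on a single scalar, two stations with very different monthly profiles (for example a bimodal and a flat one) can receive the same label, which is the shortcoming noted in the text.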
The WRZ is usually studied as a single unit characterised by a smooth west-east gradient in winter contribution to MAR (Bradshaw and Cowling, 2014). Climatic studies have tended to focus on the southern WRZ, studied as a uniform whole (e.g. Reason and Rouault, 2005; Blamey et al., 2017). The southern WRZ falls within South Africa's Western Cape Province. It is often studied as a seasonality domain distinct from the province's YRZ and SRZ (e.g. Van Niekerk and Joubert, 2011; Du Plessis and Schloms, 2017; Abba Omar and Abiodun, 2020). Other studies considered the province to be a uniform winter-rainfall climate unit (e.g. Masante et al., 2018; Lakhraj-Govender et al., 2019; Naik and Abiodun, 2019; Kam et al., 2021). This is, however, inaccurate, as less than half the provincial area experiences a distinct winter precipitation peak (see Results). Furthermore, the northern WRZ falls mostly outside the province. Northern WRZ climatology has been studied mostly in relation to ecological diversity (Desmet, 2007; MacKellar et al., 2007). The relationship between southern WRZ and northern WRZ variability, seasonality and weather systems has not been rigorously assessed.
Figure caption (beginning not recovered): Table A1 for meanings of abbreviations used. Blue labels refer to reservoirs or other river features, red labels to point locations and black labels to regions. Purple lines represent the Great Escarpment (in different shades by section) and orange lines the Cape Fold Belt (CFB) (in different shades for the meridional and zonal branches). Map tiles are by Stamen Design (http://stamen.com/, last access: 3 March 2022) under CC BY 3.0 (http://creativecommons.org/licenses/by/3.0, last access: 3 March 2022) and underlying data (OpenStreetMap, http://openstreetmap.org/, last access: 3 March 2022) under ODbL (http://www.openstreetmap.org/copyright, last access: 3 March 2022).
The WRZ and its surroundings form a spatially complex region with large gradients in MAR, seasonality, temporal rainfall variability and various other climatic variables (Lynch, 2004; Bradshaw and Cowling, 2014; Mbokodo et al., 2020). This heterogeneity has previously been considered in studies by e.g. Favre et al. (2013), Philippon et al. (2012), Mahlalela et al. (2019) and Abba Omar and Abiodun (2021). However, it is not well understood from the perspective of large-scale drivers and local responses. It has also not been subject to focused analysis with a dense station network. Hence, the spatial pattern of seasonality and variability remains poorly understood. The spatial pattern of transition out of the WRZ to its surroundings, in particular, has not been systematically studied. Locations near the northern and north-eastern boundaries of the WRZ have been documented to display highly variable seasonality over time (Desmet and Cowling, 1999; Nicholson, 2000; Roffe et al., 2021b), but the phenomenon is poorly understood. Mahlalela et al. (2019) and Philippon et al. (2012) explored subregional structure in the southern WRZ using cluster analysis of stations by rainfall annual cycle. However, this variable is dominated by amplitude differences between dry and wet regions. Differences in seasonality within the region have not been considered explicitly. Focusing on standardised variability, Wolski et al. (2021) detect three spatially continuous subregions of the WRZ, displaying a high degree of internal coherence that remains underexplored, since most of the northern WRZ was not considered and relatively few stations were used. Of the Day Zero drought studies, only Abba Omar and Abiodun (2020) have shown a link between spatial rainfall variability patterns and variability in rainfall drivers (cut-off lows, COLs).
It has been proposed that spatial differences in recent rainfall variability and trends may be related to wind directions from which rainfall is primarily received (Deacon et al., 1992; Wolski et al., 2021), but this hypothesis has not been systematically tested.
To understand drivers of rainfall variability and trend, the spatial units sharing variability characteristics need to be investigated. Given that specific wind directions are closely associated with the stages of passage of the primary synoptic rain-bearing system (viz. cold fronts; Burls et al., 2019), studying the spatial wind direction-rainfall relationship in the WRZ should contribute to such understanding. This, in turn, would allow for more robust, locally relevant, climate change signal projections in this water-sensitive region. Hence, to complement the trend focus of Wolski et al. (2021), here we look at variability and seasonality and relate them to spatial and seasonal heterogeneity in rain-bearing winds. Section 2 provides a brief overview of the study domain, Sect. 3 describes the methods and data employed, Sect. 4 presents the primary findings of the study, Sect. 5 introduces a new spatial subdivision of the region, Sect. 6 considers the results and framework in the context of recent studies of the WRZ and Sect. 7 summarises the findings and their implications and limitations.

Physical geography of the WRZ
The WRZ stretches roughly from Cape Agulhas, the southernmost tip of Africa, along the west coast beyond the Gariep River mouth into far south-western Namibia (see Fig. 1). The south-western mountains are very wet (MAR > 1500 mm), but the WRZ also encompasses hyper-arid areas (MAR < 100 mm) in northern coastal Namaqualand and the Tanqua Karoo (Lynch, 2004; Schulze and Maharaj, 2007; Slingsby et al., 2021).
The WRZ's complex topography (it has the second-highest median topographic heterogeneity among Mediterranean-type climate regions; Bradshaw and Cowling, 2014) is responsible for much of the steep MAR gradients in the region. Topographically, the southern and western coastal belts, encompassing most of the WRZ, are dominated by the Cape Fold Belt. The Cape Fold Belt is composed of two series of mountain chains, one running zonally, roughly parallel to the south coast and warm Agulhas Current, and the other meridionally, parallel to the cold Benguela Current. In addition to high topographic heterogeneity, the unique proximity to both an eastern and a western boundary current (Veitch and Penven, 2017; Mahlalela et al., 2019) and alignment of these currents with the orientation of the Cape Fold Belt chains contribute to large gradients in MAR, rainfall seasonality and rain-bearing winds and systems (Cowling and Lombard, 2002; Procheş et al., 2005; Bradshaw and Cowling, 2014). In the north-east, beyond the Cape Fold Belt, the southern, south-western and western sections of the escarpment form a prominent barrier to southerly and westerly flow into the vast interior plateau beyond. The Cape Peninsula, jutting out from the south-western coast to Cape Point, is immediately apparent and is usually the location of landfall for cold fronts affecting the WRZ (Conradie et al., 2022c). On closer inspection, this peninsula is best conceptualised as a small but prominent protrusion from a larger, "blunt" peninsula, stretching from Verlorenvlei to the Breede River mouth (Compton, 2004).
The meridional and zonal chains of the Cape Fold Belt meet in an intricate set of mountain ranges, the Boland Mountain Cluster, where the wettest observing stations in extratropical southern Africa (MAR > 3000 mm) are located (Slingsby et al., 2021; de Buys, 2019; South African Forestry Research Institute, SAFRI, S). The Boland Mountain Cluster is also where the highest rates of biodiversity and endemism in the Cape Floristic Region are observed (Rebelo et al., 2006; Cowling et al., 2017) and where the highest proportion of MAR falls during winter (see Results). The wet Boland Mountain Cluster is the source of almost all water for the Western Cape Water Supply System (WCWSS) of interconnected reservoirs (Sinclair-Smith and Winter, 2019) upon which Cape Town depends as well as of most of the other major reservoirs of the WRZ. However, there are very few high-altitude stations with reasonable-quality multi-decadal daily data in the Boland Mountain Cluster. Although the WRZ is frequently described as having a Mediterranean-type climate (e.g. Barrable et al., 2002; Van Niekerk and Joubert, 2011; Polade et al., 2017), in a single 0.5° latitude-longitude grid cell in the Boland Mountain Cluster are stations recording MAR too low and others too high to qualify for such a classification (Lynch, 2004; Schulze and Maharaj, 2007) using any of the definitions employed by Aschmann (1973), Engelbrecht and Landman (2016) and Deitch et al. (2017). Consequently, we use the term Mediterranean-like, instead of Mediterranean-type, climate regions to refer collectively to all subtropical and temperate regions with dominant winter rainfall seasonality.

Rainfall data
Two station datasets are employed: one covering southern Africa, the other the WRZ domain only. A list of stations included in the WRZ dataset together with associated metadata is provided in Table A2. A gridded dataset (Global Precipitation Climatology Centre, GPCC) is used to supplement station-based results.

Southern Africa dataset
To delineate the WRZ and characterise it in the southern African context, we use the southern African station dataset of Dent et al. (1987) and Lynch (2004) updated with South African Weather Service (SAWS) data to 2015. These daily rainfall data are subject to the quality control procedures described in Wolski et al. (2021). We use this dataset primarily to gain qualitative insights into rainfall seasonality from a dense station network covering a large area; consequently, only a limited effort is made to eliminate biases due to differences in the periods of record between stations. We require non-missing values for the equivalent of at least 25 complete years (25×365.25 d) unless otherwise indicated. Where time series of annual totals are regressed against time series of seasonal totals, at least 25 calendar years of data with at least 90 % coverage in each season are required. In all cases, means and variances are computed across the full available record and missing values are not infilled.
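As a small illustration of the record-length criterion above, the following sketch counts non-missing daily values and compares the count with the equivalent of 25 complete years (25 × 365.25 d). The function name `has_enough_record` and the representation of missing values as `None` or NaN are assumptions made for this example only.

```python
import math


def has_enough_record(daily_values, min_years=25):
    """True if the number of non-missing daily values is at least the
    equivalent of min_years complete years (min_years * 365.25 days).

    Missing values are assumed to be encoded as None or NaN.
    """
    n_valid = sum(
        1 for v in daily_values
        if v is not None and not math.isnan(v)
    )
    return n_valid >= min_years * 365.25


# 25 * 365.25 = 9131.25, so at least 9132 valid daily values are needed
print(has_enough_record([0.0] * 9132))
```

Note that this criterion says nothing about how the missing days are distributed through the year, which is why the text separately requires seasonal coverage where seasonal totals are regressed against annual totals.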

WRZ dataset
From the stations in the southern African dataset still reporting to SAWS in 2017, stations were selected with at least 95 % daily data coverage for the period 1979-2015 that fall within South-Western Southern Africa (defined here as Africa west of 22° E and south of 26° S; depicted in Fig. A1). Stations are retained only if %AMJJAS ≥ 50, where %AMJJAS denotes the climatological percentage of MAR falling during AMJJAS. Additional stations were selected with at least 90 % data availability in undersampled regions. Preference was given to automatic weather stations as these were found to report rain days more consistently with nearby stations (see point 6 in Sect. A2). For these stations, updated data were received from SAWS to the end of September 2018 in most cases (see Table A2). Further station data were obtained from the South African Department of Water Affairs and Sanitation (DWS) for stations in South-Western Southern Africa. See Sect. A1.1 for an overview of DWS station selection. The City of Cape Town (CoCT) also provided daily data from January 2000 to the end of 2018 for selected stations. Owing to their short periods of record, only CoCT stations for which a record was also available from SAWS and/or DWS were included in the main dataset (it is relatively common for there to be overlap in the periods of record held by different institutions for the same station; see Lynch, 2004). Selected stations with short periods of record but reliable coverage therein were included as a nearest-neighbour dataset to support infilling and time shifting (see Sect. A1.2 for details).

GPCC data
The GPCC version 2018 precipitation climatology (preferentially for 1950Schneider et al., 2017; Meyer-Christoffer et al., 2018) and full data monthly (Becker et al., 2013; Schneider et al., 2018) datasets, both at 0.25° spatial resolution, are used to assess rainfall seasonality in a spatially continuous gauge-based dataset. GPCC is chosen due to its extensive temporal coverage and demonstrated veracity over the WRZ.

Outline of cleaning methodology
Since our intention with the WRZ dataset is to conduct detailed daily, monthly, seasonal and annual analyses, having an unbiased, continuous series is necessary. For this purpose, a new cleaning methodology is developed and applied to the WRZ dataset (described in Sect. 3.1.2). A paper is in preparation describing the methodology in greater detail and demonstrating its effectiveness, but an overview of each component is provided in Sect. A2. The iterative procedure includes fitting the Generalised Additive Model for Location, Shape and Scale (GAMLSS; Rigby and Stasinopoulos, 2005) to station series (see Sect. A2, point 10). The procedure is intended to correct as far as possible for common quality shortcomings in South African rainfall data. These are identified by comparing different data sources for the same station, where station records are held by multiple institutions. These data irregularities include spurious sequences of zeroes and repeated equal non-zero values, rainfall recorded a day early or late, decimal errors, miscellaneous differences in values recorded for the same station on the same day and recording cumulative totals as daily rainfall (cf. Roffe et al., 2021a). An initial infilling is done by correlation coefficient weighting (CCW; see Smithers and Schulze, 2000) on rank correlation and then updated with estimates from the model fit. Where infilling is done with a GAMLSS fit, rain days are selected stochastically, with probability of rain on each day determined by the model fit.
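To make the initial infilling step more concrete, the sketch below shows one simple form of correlation-weighted infilling: a missing daily value is estimated as the average of neighbouring stations' values on that day, weighted by each neighbour's (rank) correlation with the target station. This is an assumed, simplified formulation for illustration; the exact weighting used in the cleaning methodology and in Smithers and Schulze (2000) may differ, and the function name `ccw_estimate` is hypothetical. The correlations are taken as precomputed inputs.

```python
def ccw_estimate(neighbour_values, correlations):
    """Correlation-weighted estimate of a missing daily rainfall value.

    neighbour_values: rainfall (mm) at neighbouring stations on the day
        to be infilled; None marks a neighbour that is also missing.
    correlations: rank correlation of each neighbour with the target
        station (assumed precomputed on the overlapping record).
    Neighbours with missing values or non-positive correlation are ignored.
    Returns None if no usable neighbour remains.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value, r in zip(neighbour_values, correlations):
        if value is None or r <= 0:
            continue
        weighted_sum += r * value
        weight_total += r
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


# Two neighbours, the better-correlated one dominating the estimate
print(ccw_estimate([6.0, 12.0], [0.9, 0.3]))
```

A weighted average of this kind tends to smooth extremes and produce too many small non-zero values, which is one reason a subsequent model-based update with stochastic rain day selection, as described above, can be useful.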

Wind direction
The relationship between synoptic-scale wind direction and rainfall is assessed using a substantial adaptation to the WRZ of the circulation weather type (CWT) methodology of Trigo and DaCamara (2000). Following a similar approach to Sousa et al. (2018), given the small WRZ domain, we distinguish 10 classes based on the large-scale boundary-layer wind direction and refer to these as CWTs. However, in contrast to Sousa et al. (2018), since the WRZ is most frequently subject to relatively uniform synoptic-scale flows with dominant local- rather than regional-scale modulations, we do not include cyclonic and anti-cyclonic CWTs. Instead, in addition to the CWTs corresponding to each of the eight major (cardinal plus intercardinal) wind directions, weak synoptic flow (WF) and incoherent synoptic flow (IF) patterns are classified. Trigo and DaCamara (2000) use sea-level pressure to deduce surface geostrophic flow. However, given the complex topography of the WRZ and frequent sea breezes and offshore flows due to high temperature gradients (e.g. van Rooy, 1936), synoptic flow is directly assessed from the European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis 0.75° (ERA-Interim; Dee et al., 2011) and ECMWF Re-Analysis-5 0.25° (ERA5; Hersbach et al., 2020) 6-hourly 900 hPa reanalysis u (zonal) and v (meridional) wind vector components. The domain considered is −35 ≤ φ ≤ −30.5° N, 17 ≤ θ ≤ 21° E (for latitude φ and longitude θ). The procedure involves the following steps (in what follows, x̄ denotes the spatial mean of x).
- Remaining time steps when V ≤ 2.5 m s⁻¹ are classified as WF.
- All remaining time steps are classified according to the direction of the mean wind vector and assigned to one of the eight major 45°-wide wind-direction-bearing sectors (e.g. 22.5 to 67.5° is classified as NE).
- Since South African daily precipitation measurements are taken for the 24 h observation period ending at 06:00 UTC (08:00 SAST) the next morning, each time step other than 06:00 UTC is weighted as contributing 0.25 to the rainfall total for the daily observation period within which it falls. The wind directions for the 06:00 UTC data points are assigned 0.125 contribution weightings to both the observation period ending and the one starting at that time.
- Total rainfall at each station occurring under each CWT is calculated and normalised by MAR, yielding the proportion of rainfall contributed under that CWT.
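The steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the criterion for incoherent flow (IF) is not given in the text here, so it is passed in as a precomputed flag; the weak-flow speed is assumed to be the magnitude of the domain-mean wind vector; and all function names are hypothetical. Wind direction follows the meteorological convention (the direction the wind blows from).

```python
import math
from collections import defaultdict

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
WEAK_FLOW_MAX = 2.5  # m/s, weak-flow threshold quoted in the text


def classify_cwt(u_mean, v_mean, incoherent=False):
    """Assign one 6-hourly time step to a CWT from domain-mean u and v."""
    if incoherent:  # IF test itself is not specified here (assumed input)
        return "IF"
    if math.hypot(u_mean, v_mean) <= WEAK_FLOW_MAX:
        return "WF"
    # Bearing the wind blows FROM, in degrees clockwise from north
    bearing = math.degrees(math.atan2(-u_mean, -v_mean)) % 360.0
    # 45-degree sectors centred on N, NE, ...; e.g. 22.5-67.5 is NE
    return SECTORS[int(((bearing + 22.5) % 360.0) // 45.0)]


def daily_cwt_weights(start_06, at_12, at_18, at_00, end_06):
    """Weights of each CWT for one observation day ending at 06:00 UTC."""
    weights = defaultdict(float)
    steps = [(start_06, 0.125), (at_12, 0.25), (at_18, 0.25),
             (at_00, 0.25), (end_06, 0.125)]
    for label, weight in steps:
        weights[label] += weight
    return dict(weights)


def cwt_rain_proportions(daily_rain, daily_weights):
    """Proportion of a station's total rainfall attributed to each CWT."""
    totals = defaultdict(float)
    for rain, weights in zip(daily_rain, daily_weights):
        for label, weight in weights.items():
            totals[label] += rain * weight
    grand_total = sum(daily_rain)
    if grand_total <= 0:
        return {}
    return {label: total / grand_total for label, total in totals.items()}


day = daily_cwt_weights("NW", "NW", "W", "W", "SW")
print(cwt_rain_proportions([10.0], [day]))
```

Dividing by the station's total rainfall over the record is equivalent to normalising mean annual CWT rainfall by MAR when the same set of years is used for both.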

Clustering
Hierarchical Ward clustering (Ward, 1963) is used to group stations in line with previous studies in the WRZ (e.g. Wolski et al., 2021; Burls et al., 2019). Cluster analysis is performed on the following three variables.
Climatological normalised seasonality clustering. Monthly mean rainfall at each station is computed over the period of record, expressed as the proportion of MAR falling in that month.
Annual and winter semester interannual variability clustering. Clustering is performed on z scores, using all complete years as variables.
Annual CWT contribution clustering. Clustering is performed on the proportion of MAR occurring under each of the considered CWTs.
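As an illustration of the first of these clustering variables, the sketch below normalises monthly means by MAR and applies Ward clustering using SciPy's `linkage` and `fcluster`. The choice of SciPy, the helper names and the synthetic station profiles are assumptions made for this example; the source does not state which software was used.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def normalised_seasonality(monthly_means):
    """Express each station's monthly means as a proportion of its MAR.

    monthly_means: array of shape (n_stations, 12).
    """
    monthly = np.asarray(monthly_means, dtype=float)
    return monthly / monthly.sum(axis=1, keepdims=True)


def ward_clusters(features, n_clusters):
    """Cut a Ward hierarchical clustering tree into n_clusters groups."""
    tree = linkage(features, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic profiles: a dry and a wet winter-peak station, and a dry and
# a wet summer-peak station (values are made up for illustration)
winter = np.array([1, 1, 1, 3, 5, 6, 6, 5, 3, 1, 1, 1], dtype=float)
summer = np.array([6, 5, 3, 1, 1, 1, 1, 1, 1, 3, 5, 6], dtype=float)
stations = np.vstack([winter, 10 * winter, summer, 10 * summer])

labels = ward_clusters(normalised_seasonality(stations), n_clusters=2)
print(labels)
```

Because the profiles are normalised first, the dry and wet winter-peak stations fall in the same cluster; clustering the raw monthly means instead would tend to separate stations by rainfall amount rather than by seasonality.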

Analysis and plotting
A rain day is defined as occurring when at least 1 mm is recorded (see Conradie et al., 2022b). In Fig. 5b a Daniell smoother is used to provide a more consistent estimate of the true spectral density (von Storch and Zwiers, 1999, Sect. 12.3

Southern African context
We use normalised seasonality clustering and %AMJJAS to investigate seasonality patterns and gradients around the WRZ in the southern African dataset (Fig. 1). It should be borne in mind that some stations in this dataset exhibit systematic patterns of data gaps that affect seasonality assessments adversely. Hence, we focus only on broader patterns and ignore individual stations that contradict these. We identify five rainfall seasonality domains which we consider to be meaningfully distinct and reasonably spatially coherent. The dominant one is the SRZ; the others are (in the order that they are distinguished from the SRZ, with the parent cluster in brackets, where applicable) (2) WRZ, (3) YRZ (SRZ), (4) late SRZ (SRZ), and (5) extended SRZ (late SRZ). The late SRZ is characterised by a pronounced JFM or FMA peak in rainfall. It is commonly distinguished from the SRZ in South African seasonality studies (cf. Hofmeyr, 1984; Schulze and Maharaj, 2007). The extended SRZ is identified in a broad band along the east coast and southern interior, characterised by a much flatter summer rainfall peak than the (late) SRZ. Maxima occur in early autumn and late spring, suggesting a possible combination of SRZ and YRZ influences. We are not aware of any previous studies noting the similarity in seasonality between the southern interior and east coast. However, it is the least stable cluster; its spatial extent is quite sensitive to period-of-record criteria for station selection. As such, it should perhaps not be conceptualised as a single zone with coherent drivers of seasonality but rather as a combination of two distinct zones exhibiting similar seasonality characteristics resulting from two distinct combinations of seasonality drivers. The section on the east coast corresponds largely to region 14 of Schumann and Hofmeyr (1938) and to the long SRZ of Roffe et al. (2020, in whose study no stations in the southern interior were considered).
Schumann and Hofmeyr (1938) identify three rainfall seasonality domains in the southern interior, all exhibiting somewhat similar seasonality to our extended SRZ cluster. Together, this suggests that the southern interior may be conceptualised as a broad transition zone between all four other seasonality zones. In contrast to the extended SRZ cluster, the WRZ cluster is remarkably stable: the same stations are selected for it when identifying between 2 and 10 clusters. The WRZ extent exhibits little change with varying period-of-record criteria for station selection. The scalar quantity %AMJJAS effectively captures the primary features of South African rainfall seasonality, both in station and GPCC data, supporting the common use of %AMJJAS in ecological and palaeoclimatic studies (see Sect. 1). The first four clusters are clearly apparent in the %AMJJAS map (Fig. 2a): the WRZ as a narrow west coast band with %AMJJAS above 50, the broad interior SRZ by %AMJJAS below 50 and the YRZ by %AMJJAS ≈ 50. Together, the extended and late SRZs are characterised by %AMJJAS between about 25 and 35. They can be distinguished by further considering climatological OND : JFM or SON : MAM rainfall ratios. In general, seasonality features not apparent in %AMJJAS relate to (a) the relative contributions from shoulder seasons (e.g. very high in the YRZ) and/or (b) early vs. late summer contributions (e.g. low in the late SRZ). The classifications are consistent with a generally westward progression of the summer wet season (e.g. Liebmann et al., 2012; Roffe et al., 2021b). In addition to effectively delineating the primary rainfall seasonality zones, %AMJJAS shows a generally smooth decreasing south-west to north-east gradient and a secondary pattern of decreasing winter contributions away from the east coast (Roffe et al., 2021b).
Notable deviations occur on the western (relatively high %AMJJAS) and eastern (very low %AMJJAS) slopes of the Lesotho Drakensberg (∼ 28° E, 30° S) and around St Lucia (∼ 32° E, 28° S), where %AMJJAS > 40, thus qualifying as YRZ (cf. Roffe et al., 2021b; Hofmeyr, 1984).

Annual rainfall climatology
It is clear from Fig. 2 that the WRZ is confined to South-Western Southern Africa, as defined in Sect. 3.1.2. We now consider the seasonality of the WRZ station dataset (see Sect. 3.1.2) in detail (Fig. 3). Comparative results from the southern African dataset stations in this domain with at least the equivalent of 20 complete years of daily data (as opposed to 25 years for Sect. 4.1, yielding a much denser station network in undersampled regions) are presented in Fig. A1, allowing for clearer assessment of the patterns of transition out of the WRZ.
The WRZ dataset includes mostly stations from the WRZ cluster in Fig. 2 but also a few that are classified as YRZ or late SRZ. From the WRZ dataset stations, six distinct seasonality subclusters are identified (Fig. 3b). A spatially coherent but complex transition from the WRZ into the surroundings is apparent. The WRZ of Fig. 2 is subdivided into WRZ, "core" WRZ and southern and northern margin subclusters. The WRZ and WRZ core subclusters together coincide closely with %AMJJAS above about 70 (Figs. 3d and A1). They display remarkable zonal uniformity in %AMJJAS, considering the MAR range covered from ≈ 50 to ≈ 2000 mm (Figs. 3a and A1a). The core WRZ receives a higher proportion of rainfall during JJ than the other subclusters and is situated mostly along the spine of the meridional Cape Fold Belt.
The margin and transition subclusters exhibit a still distinct but lower-amplitude mid-winter peak in rainfall and %AMJJAS between about 60 and 70. The margin's width and longitudinal position vary substantially with latitude, indicating previously unreported fine-scale complexity in the WRZ edge. Margin_S is a coastal subcluster transitioning to the YRZ and receiving a greater proportion of rainfall during SON. The Margin_N subcluster is on the western escarpment slopes and the Transition along its eastern slopes. Together they represent the transition to the late SRZ, exhibiting a pronounced April rainfall peak together with a lower AS contribution. The Transition subcluster, additionally, features a markedly higher March contribution (see Figs. 2 and A1).
Two regions with dramatic changes in seasonality over a short distance justify particular attention.
1. Along the southern mountains, the YRZ extends much further westward than has previously been reported (Roffe et al., 2019, and references therein) or is suggested by gridded rainfall products, such as GPCC, CRU and WorldClim (see also Sect. 1, and references therein). This narrow band exhibits the characteristic bimodal YRZ rainfall seasonality (see Figs. 3c and A1c) with equinoctial peaks (cf. Schumann and Hofmeyr, 1938; Engelbrecht et al., 2015).
2. Along a zonal transect from the eastern Cedarberg to the north-western plateau, %AMJJAS first falls from > 80 to ∼ 65 in the Tanqua Karoo and then rises to over 70 on the escarpment before falling steeply across the plateau to below 45 in less than 50 km. When more stations in this band of steep seasonality change are included (Fig. A1), stations with %AMJJAS between about 40 and 55 are clustered together as a separate very late SRZ. Although often classified as YRZ (e.g. Rebelo et al., 2006), it is characterised, uniquely in southern Africa, by a distinct unimodal MA peak (see also Schumann and Hofmeyr, 1938, and Engelbrecht et al., 2015). Given the steep seasonality gradients, competing WRZ, YRZ and late SRZ seasonality drivers and the low station density, the sensitivity to the seasonality classification approach is unsurprising.
Although seasonality patterns in the margins largely reflect a damped WRZ pattern, the summer season appears to be the dominant driver of interannual variability here: ONDJFM series are generally better correlated with annual time series than AMJJAS series (Fig. 3e). Correlations by standard 3-month seasons (not shown) confirm this. Although %AMJJAS is near uniform across the western WRZ, ONDJFM influence on annual variability increases notably from south-west to north-west. This is important not only for understanding present-day drought risk and climate change responses in the WRZ margins, but may also help explain the dramatic fine-scale spatial differences in climate responses since the Last Glacial Maximum observed near the present WRZ margin in the north-eastern Cape Fold Belt and northern WRZ, which have been posited to relate to summer- vs. winter-dominated signals (MacPherson et al., 2018; Chase et al., 2011, 2015, 2019).

Rain day seasonality
Whereas Burls et al. (2019) detect much larger trends in the rain day frequency than in total rainfall over the southern WRZ, no thorough assessment of the spatial patterns or variability in WRZ rain days has been conducted. The most striking characteristic of rain day seasonality is that it is much weaker than in rainfall (Fig. 3f). In the south-west, %AMJJAS in rainfall is much greater than that in rain days, whereas in the western YRZ, where %AMJJAS ≈ 50, many more rain days occur during AMJJAS than ONDJFM. Furthermore, clustering by rain day seasonality (Fig. A2) reveals much smaller amplitude difference in seasonal cycle than for rainfall. These findings are consistent with winter rainfall in the WRZ margin being more reliable (as defined and discussed in Cowling et al., 2005; Desmet, 2007) and summer rainfall being more variable and thus disproportionately impacting interannual variability. In the far south-western WRZ, rain day frequency peaks later in winter (often during August) than rainfall (almost always in June); May and September receive similar numbers of rain days, but May mean rainfall tends to be about 50 % greater. This implies greater rainfall intensity in early winter and more frequent gentle rains in late winter. In contrast, the northern WRZ exhibits the highest winter rain day concentration (Fig. 3d), and rain day seasonality closely resembles rainfall seasonality, suggesting that weaker late winter and spring fronts rarely result in rainfall here. YRZ shoulder season rainfall peaks appear to result from higher rainfall intensities rather than greater rain day frequencies.
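The contrast between seasonality in rainfall and in rain days can be illustrated with a short sketch that computes %AMJJAS for both quantities from the same daily record, using the 1 mm rain day threshold defined earlier. The function name `winter_percentages`, the `(month, rain_mm)` input layout and the sample values are assumptions made for this example.

```python
RAIN_DAY_MIN_MM = 1.0  # rain day threshold used in the text
WINTER_MONTHS = frozenset({4, 5, 6, 7, 8, 9})  # AMJJAS


def winter_percentages(daily_records):
    """Return (%AMJJAS of rainfall, %AMJJAS of rain days).

    daily_records: iterable of (month, rain_mm) pairs, month in 1..12.
    """
    rain_total = rain_winter = 0.0
    days_total = days_winter = 0
    for month, rain_mm in daily_records:
        is_rain_day = rain_mm >= RAIN_DAY_MIN_MM
        rain_total += rain_mm
        days_total += is_rain_day
        if month in WINTER_MONTHS:
            rain_winter += rain_mm
            days_winter += is_rain_day
    if rain_total == 0 or days_total == 0:
        raise ValueError("record contains no rainfall or no rain days")
    return (100.0 * rain_winter / rain_total,
            100.0 * days_winter / days_total)


# Made-up record: heavy winter days, light summer days, one trace amount
records = [(6, 20.0)] * 3 + [(12, 4.0)] * 3 + [(12, 0.5)]
rain_pct, day_pct = winter_percentages(records)
print(round(rain_pct, 1), round(day_pct, 1))
```

In this toy record winter and summer have equal numbers of rain days, yet most of the rainfall falls in winter, mirroring the weaker seasonality in rain days than in rainfall totals described above.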

WRZ recent variability pattern: seasonal and spatial structure
In Fig. 4, for the WRZ dataset, we present eight annual and seven AMJJAS temporal variability clusters (henceforth variability clusters), each with a high degree of spatial coherence and distinct patterns of variability. Having a higher station density over a larger domain allows us to distinguish more clusters than, for example, Wolski et al. (2021), who perform a similar analysis. Comparing the time series of annual and AMJJAS clusters confirms that AMJJAS dominates variability in the west, whereas summer dominates particularly in the SE variability cluster. Comparing Fig. 4a and b to Fig. 3a reveals substantial coherence between the spatial patterns of temporal variability and of seasonality not previously reported. The annual SE and AMJJAS SE variability clusters are identical and differ from the YRZ seasonality cluster (Fig. 3a) only by including one additional station. The S annual and AMJJAS variability clusters are also identical and correspond largely to the Margin_S seasonality cluster. This suggests common drivers of variability and seasonality. Further support is provided by the cluster mean time series (Fig. 4c and d); between the western variability clusters (CPT, SW, W and NW, collectively largely the same stations as the WRZ and WRZ core seasonality clusters) correlations are strong (all R² > 0.5). However, variability patterns in the north-east, south and, most notably, south-east are remarkably different from the western clusters. Some variability is shared between the SE and S cluster series (AMJJAS R² ≈ 0.44, p < 10⁻⁵; annual R² ≈ 0.48, p < 10⁻⁶) and between the SE and E annual cluster series (annual R² ≈ 0.25, p < 0.005), but all other clusters exhibit their lowest pairwise annual and AMJJAS correlations with the SE cluster (R² < 0.1; p > 0.1 for all western clusters). 
The fact that correlations are very low not only for annual, but also for AMJJAS series clearly indicates that the differences in variability patterns are not simply the result of summer dominance of SE cluster variability but that the drivers of YRZ winter variability are also distinct from those operating over the WRZ.
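As an illustration of the pairwise comparison of cluster mean series, the following sketch computes the squared correlation between two series. It assumes that R² denotes the squared Pearson correlation coefficient, which is not stated explicitly in this section; the function name and the example values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Two hypothetical cluster mean series (mm) that vary in step, and one that does not.
cluster_a = [400.0, 520.0, 610.0, 480.0]
cluster_b = [200.0, 260.0, 305.0, 240.0]   # exactly half of cluster_a
cluster_c = [300.0, 310.0, 295.0, 320.0]

shared_ab = r_squared(cluster_a, cluster_b)
shared_ac = r_squared(cluster_a, cluster_c)
```

Because the measure is scale-invariant, clusters with very different mean rainfall can still share nearly all of their variance, as in the first pair.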
In addition to seasonality, temporal variability patterns also exhibit influences of latitude, distance from Cape Point (closely related to annual rain day climatology) and rain day seasonality. Falling in the Winter-Spring rain day seasonality cluster (Fig. A2) appears to distinguish the CPT from the SW annual variability cluster. The CPT stations saw more intense winter drought than the SW stations in 2010 and 2011 and less intense drought during 2017, when weak late-season fronts appear to have reduced drought intensity in the far south-west (see Conradie et al., 2022b). The N AMJJAS and annual variability clusters are distinguished by their northerly latitude, fewer frontal impacts and consequently higher dependence on COL rainfall (cf. Favre et al., 2013). The low number of AMJJAS rain days along the south-western escarpment suggests that only the strongest fronts penetrate this far inland. These systems presumably drive winter variability in the NE annual variability cluster.
To assess the characteristic variability patterns by cluster, we conduct Fourier decomposition of the station time series and cluster mean series. For selected frequencies judged to capture variability characteristic of each of the annual variability clusters, we plot the relative phase and normalised amplitude for all stations by annual variability cluster in Fig. 5a. Almost all WRZ stations exhibit high-amplitude variability at a period of about 6 years with a remarkably similar phase. The largest amplitude is found in stations in the western annual variability clusters. The S, E and, especially, SE variability clusters exhibit large variability with periods near 4 years with a similar characteristic phase. High-frequency variability (< 5-year period) is often almost antiphase between the SW and SE clusters. The S cluster stations show variability patterns generally intermediate between the SW and SE clusters but also have the highest-amplitude quasi-decadal variability. Northern stations tend to have a spectral peak at around 13 years not apparent in the AMJJAS series (not shown), suggesting a summer driver.
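The amplitude and phase at a selected period can be sketched by projecting an annual series onto a single complex exponential. This is a simplified stand-in for the Fourier decomposition described above: the function name `harmonic` is hypothetical, the normalisation used for Fig. 5a is not reproduced, and the result is exact only when the record length is a whole multiple of the period (otherwise spectral leakage makes it approximate).

```python
import cmath
import math


def harmonic(series, period):
    """Amplitude and phase (radians) of the component with the given period.

    `period` is expressed in samples (years for annual series). The series
    mean is removed before projection.
    """
    n = len(series)
    mean = sum(series) / n
    coeff = sum(
        (x - mean) * cmath.exp(-2j * math.pi * t / period)
        for t, x in enumerate(series)
    )
    return 2.0 * abs(coeff) / n, cmath.phase(coeff)


# Synthetic 40-year series: mean 500 mm plus a 20-year cycle of amplitude 3 mm.
years = range(40)
west = [500.0 + 3.0 * math.cos(2 * math.pi * t / 20 + 0.5) for t in years]
east = [500.0 + 3.0 * math.cos(2 * math.pi * t / 20 + 0.5 + math.pi) for t in years]

amp_west, phase_west = harmonic(west, 20)
amp_east, phase_east = harmonic(east, 20)
# A phase difference close to pi indicates antiphase variability.
phase_gap = abs(phase_west - phase_east)
```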
In all clusters the AMJJAS series are characterised by two 20-year quasi-cycles (Fig. 4d). The largest amplitude is seen in the CPT variability cluster (not shown), for which the annual time series also shows a marked spectral peak at this frequency (see Fig. 5b). A number of stations in the SW, CPT, N and NE clusters also show pronounced variability at the 40-year timescale, modulating the 20-year cycles. Phases of the 40-year components are generally consistent with low-frequency quasi-periodic variability found in WRZ long-term datasets: substantial satellite-era drying occurs in the southern WRZ, the onset of which is more recent in the northern WRZ (cf. Wolski et al., 2021; Burls et al., 2019; Ndebele et al., 2019). Interannual variability during the second 20-year cycle (1999-2018) is notably greater than during the first 20-year cycle (1979-1998) in the western clusters (Fig. 4c and d). Applying an F test for equality of variance (von Storch and Zwiers, 1999, Sect. 6.7) between the two 20-year periods yields a significant difference at 10 of the 105 stations (p < 0.05); all 10 stations are in the western clusters or N cluster and in each case variance is higher during the second 20-year period. The CPT annual cluster mean exhibits a particularly large increase in variance between the two periods (F ≈ 2.5, p < 0.05, where F > 1 implies larger variance in the latter period). Large but statistically insignificant differences are also observed in the SW (F ≈ 2), NW (F ≈ 1.7) and S (F ≈ 1.6) annual variability clusters; for all other clusters F is close to 1.
Figure 4 (caption fragment). In (a, b) a map of the cluster distributions is shown and in (c, d) the cluster mean time series with a first-order loess smoother with a 16-year span overlain. Note that, whereas clustering was performed on standardised rainfall anomalies, in order to convey the amplitude and mean climatology of each subregion, the time series shown are of raw spatial mean rainfall.
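The variance comparison between the two 20-year periods reduces to a ratio of sample variances. The sketch below computes only the F statistic, oriented so that F > 1 means larger variance in the later period, as in the text. The function name and the example values are hypothetical, and the p-value (which requires the F distribution with the appropriate degrees of freedom) is deliberately not computed here.

```python
from statistics import variance


def variance_ratio(first_period, second_period):
    """F statistic: sample variance of the later period over the earlier one."""
    return variance(second_period) / variance(first_period)


# Hypothetical annual totals (mm) with the same mean but different spread.
early = [480.0, 500.0, 520.0, 490.0, 510.0]
late = [460.0, 500.0, 540.0, 480.0, 520.0]   # deviations doubled

f_stat = variance_ratio(early, late)
```

Doubling every deviation from the mean quadruples the variance, so the ratio in this toy case is 4.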

CWT-rainfall associations and their relation to seasonality and variability
In search of mechanisms driving the observed spatial structure of variability and seasonality in the WRZ, we now consider observed rainfall by CWT (in this study, effectively, wind direction). The observed seasonal frequencies of CWTs are consistent with the conventional description of Cape Town's wind climate (Fig. 6). Two wind directions are dominant: W-NW (most common overall) winds in winter and S-SE winds in summer, as the austral westerlies migrate poleward and the WRZ comes under the dominant influence of the eastern flank of the South Atlantic High Pressure Cell (Reason, 2017; Sun et al., 2017). More interesting is the notable asymmetry between the two shoulder seasons. Spring (SON) is characterised by more frequent winds from southerly than northerly sectors (≈ 36 vs. 26 d per season), autumn (MAM) by the opposite (≈ 29 vs. 16 d per season). Since heavier rains tend to fall with northerly winds compared to southerly winds in the southern WRZ (in the northern WRZ southerly flow rarely leads to any rain; not shown), this shoulder season asymmetry may explain the differences between rainfall seasonality and rain day seasonality described in Sect. 4.2.2. Results presented here are from ERA5, but qualitative patterns are equivalent in ERA-Interim. The largest difference between them is the less frequent occurrence of WF events in ERA5 (≈ 5 vs. 10 d yr⁻¹) associated with higher mean wind speeds (not shown).
In Fig. 7 the spatial pattern of rainfall by CWT and grouped CWTs is shown, which explains much of the spatial structure of seasonality and variability. In particular, %AMJJAS closely corresponds to the CWT contributions from westerly sectors (Fig. 7b) (Fig. 7c) are similar in magnitude but opposite in sign. Contributions by CWT closely align with relative location in the study domain; contributions from S-SE winds are greatest along the south- and south-east-facing zonal Cape Fold Belt (Fig. 7c), from SW winds in the far south-west, from N-NE winds along the north-eastern slopes of the escarpment and from the W-NW winds from the meridional Cape Fold Belt westward. Significant topographic barriers result in steep changes in CWT contributions, likely explaining sharp seasonality and variability boundaries described in Sects. 4.2 and 4.3.
Clustering by CWT contributions yields very similar patterns to seasonality and variability clusters (Fig. 8a). The same stations identified in Fig. 3 as YRZ stations are clustered together as the All-Year CWT cluster. They are clearly distinguished from the other clusters by a maximum CWT contribution from S (compared to W in all other clusters) and by receiving little more than 10 % of rainfall from the NW (compared to about 35 % at south-western stations). In fact, the All-Year CWT cluster is the only CWT cluster receiving below-average daily rainfall under the NW CWT and the only CWT cluster receiving above-average daily rainfall under the SE CWT.
The three primary patterns of seasonal variation in CWT rainfall contributions can be summarised using three CWT groups (Figs. 7d and A3). The CPT annual variability cluster stations mostly receive > 70 % of summer (DJF) rainfall from W_all and ≈ 60 % from W-NW CWTs (Fig. 7d). The rest of the Core CWT cluster receives 40 %-60 % of DJF rainfall from W-NW CWTs, while very few stations in the other CWT clusters receive > 30 % of DJF rainfall from W-NW CWTs. During JJA, all stations except those in the All-Year CWT cluster receive > 50 % of rainfall from the W-NW CWTs (Fig. 7d). Stations in the south and south-east, especially in the All-Year CWT cluster, receive most of their rain from the S-SE in SON and DJF (Fig. A3a). Rain east of the escarpment during DJF (and to a lesser extent MAM) occurs much more frequently with N-NE (Fig. A3b) and IF (not shown) CWTs than elsewhere in the study domain, suggesting local convective activity.
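The idea of a rainfall contribution by wind direction can be sketched as follows. The code assigns each record to one of eight compass sectors from eastward (u) and northward (v) wind components and sums rainfall per sector. This plain eight-sector scheme is an assumption made for illustration; it is not the CWT classification used in this study (which also includes classes such as IF and WF), and the function names and sample values are hypothetical.

```python
import math
from collections import defaultdict

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def wind_sector(u, v):
    """Compass sector the wind blows FROM, given u (eastward) and v (northward)."""
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return SECTORS[int((direction + 22.5) // 45) % 8]


def sector_contributions(records):
    """Percentage of total rainfall received under each wind sector.

    `records` is an iterable of (u, v, rainfall_mm) tuples with non-zero
    total rainfall.
    """
    totals = defaultdict(float)
    for u, v, mm in records:
        totals[wind_sector(u, v)] += mm
    grand_total = sum(totals.values())
    return {s: 100.0 * totals[s] / grand_total for s in SECTORS}


# Hypothetical records: north-westerly, westerly and south-easterly flow.
sample = [(5.0, -5.0, 30.0), (8.0, 0.0, 50.0), (-3.0, 4.0, 20.0)]
shares = sector_contributions(sample)
```

Grouped contributions such as W-NW follow by adding the relevant sector shares.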

New framework for South-Western Southern African WRZ seasonality structure

Approach used to develop the framework
Using rainfall seasonality alone cannot distinguish the southern WRZ core from the northern WRZ periphery, despite the latter differing from the former in variability, rain-bearing wind seasonality, rain-bearing systems and the summer contribution to variability. We propose that understanding the WRZ spatial structure requires understanding of both the mean and variability of seasonality. Hence we use the 5th percentile AMJJAS contribution to annual totals (denoted by q5 %AMJJAS ) as a measure of the reliability of the winter season contribution to the annual total. Using a percentile-based approach allows meaningful comparison between datasets. We propose a spatial subdivision of South-Western Southern Africa based on %AMJJAS and q5 %AMJJAS . The hierarchical structure is conceptually illustrated in Fig. 9. Thresholds are chosen to yield as spatially coherent and meaningful a classification as possible, but in reality all boundaries are fluid. Data quality shortcomings and extreme events within the chosen time period, especially at arid stations, limit the robustness of the approach applied at the station scale. We present a simple decomposition of the study domain (Fig. 10a-c), which is highly consistent between the three datasets described in Sect. 3.1, and an extended decomposition (Fig. 10d and e;
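A minimal sketch of the two metrics follows. It computes the AMJJAS share of each year's total, then its mean and 5th percentile. Two choices here are assumptions not confirmed by the text: the mean %AMJJAS is taken as the average of yearly shares, and the percentile uses linear interpolation between order statistics. Function names and data are hypothetical, and no classification thresholds are applied because they are not given in this section.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0-100) of a non-empty list."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def winter_reliability(annual_totals, amjjas_totals):
    """Return (mean %AMJJAS, 5th percentile of yearly %AMJJAS)."""
    shares = [100.0 * w / a for w, a in zip(amjjas_totals, annual_totals)]
    return sum(shares) / len(shares), percentile(shares, 5)


# Hypothetical 21-year record: annual total 100 mm, winter total 60 to 80 mm.
annual = [100.0] * 21
winter = [float(w) for w in range(60, 81)]
mean_share, q5_share = winter_reliability(annual, winter)
```

A station with a high mean share but a low 5th percentile would have a winter contribution that is large on average yet unreliable in individual years.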

Subdomains
The obtained spatial structure is illustrated in Fig. 10. The subdomains are the following.
- The true WRZ is a coherent area between the west coast and the spine of the western escarpment. Seasonality of rainfall and rain days is qualitatively similar across this area. Low- and high-frequency rainfall variability is dominated by common signals. There is a unimodal winter rainfall peak, almost always in June.
- The core WRZ lies within the true WRZ, distinguished by having interannual variability dominantly driven by AMJJAS. It is situated in the south-west, including Cape Town, the Boland Mountain Cluster, meridional Cape Fold Belt chains, and the Swartland (see Fig. 1). Seasonality and rainfall contributions by CWT are near uniform spatially. Summer rainfall is still largely from the western sector CWTs and hence relatively reliable, presumably resulting in the AMJJAS dominance of interannual variability. Despite the fine-scale complexity apparent in the core, there is little spatially coherent substructure that can be diagnosed with the available station density and quality, as it is inconsistent between the three datasets. Specifically, in contrast to previous studies (e.g. Rebelo et al., 2006) and GPCC, WRZ stations with %AMJJAS ≥ 80 form a spatially disjointed set rather than being concentrated around the Berg River mouth. There are pockets where %AMJJAS ≥ 80 in the north-western Boland Mountain Cluster, the Cedarberg, Cape Town east of Table Mountain and even the Kamiesberg.
- The periphery is the true WRZ outside the core. Relative to the core, the periphery receives a considerably larger rainfall contribution from COLs (Favre et al., 2013; Abba Omar and Abiodun, 2020). At least one shoulder season (usually spring in the south and autumn in the north) tends to receive a larger share of MAR relative to the core.
• The southern periphery is a small coastal belt south and immediately south-east of the core. It is similar to the core in almost all respects but with a reduced-amplitude seasonality driven largely by more light rain from the south and south-west in SON and DJF. Much of this rain is presumably frontal or post-frontal (Deacon et al., 1992; Cowling et al., 2009), associated with the passage of weaker fronts that brush past the southern tip of the continent. The low-frequency variability amplitude is lower than in the core.
• The northern periphery is a much larger region, including most of Namaqualand and stretching along a narrow margin in the lee of the meridional Cape Fold Belt. In contrast to the southern periphery, its seasonality is very similar to the core but with far fewer rain days and consequently less stable seasonality. It receives a notably larger rainfall contribution from northerly and IF CWTs, especially in summer and autumn. The 13-year quasi-oscillatory mode is enhanced relative to the core. Roughly 65 % of variance in annual totals is shared between the core and northern periphery compared to only about 50 % with the southern periphery.
- The transition zones combine seasonality and variability characteristics of both the WRZ and YRZ (southern transition) or late SRZ (northern transition). A second rainfall peak (in addition to June) occurs in April across the transition zone, and in the south a third peak in October-November is apparent. A mid-summer rainfall minimum still occurs, but ONDJFM tends to dominate temporal rainfall variability, which is strongly correlated with the adjacent periphery (R² ≈ 0.75), especially in the north (R² > 0.8). However, rainfall variability of the northern transition is almost unrelated to that of the southern transition (R² ≈ 0.12).
- The WRZ's surroundings have no winter peak in rainfall and as such are definitely not part of the WRZ. The northern surroundings mostly fall in the late SRZ, although a broad and complex transitional area exists characterised by multiple competing seasonality influences. The southern surroundings fall within the YRZ.

CWT rainfall contributions and spatial structure
Coherence of variability, seasonality and CWT contributions between wet and dry regions in the true WRZ suggests common primary drivers exist; this is particularly true within the core. Consequently, to maximise robustness of findings, spatial disaggregation within the core for rainfall variability studies is not advisable. However, there is also strong evidence for heterogeneity in drivers between the true WRZ, the transitions and the surroundings. The spatial gradient in %AMJJAS from the core to the surroundings is closely aligned with a corresponding gradient in the proportion of rainfall received from westerly CWTs and, hence, presumably, from weather systems in the westerlies with associated westerly moisture sources. Orographic enhancement and its relationship with CWT contributions have long been identified as crucial in the region (e.g. Wicht, 1940; Desmet and Cowling, 1999; Reason, 2017) but appear to be very complex: we find robust large-scale influences but also suggestions of local-scale nuances (cf. Moses, 2008; Mbali, 2016) that would require much higher station densities to explore.
CWT-rainfall relationships provide insight into the distinction between the northern periphery and the core. Despite having similar seasonality to the core, temporal variability and rain-bearing wind patterns in the northern WRZ are peripheral: during all seasons, less rainfall is received with westerly (mostly frontal) winds. Presumably this is because only the most intense fronts make landfall in the northern WRZ (as indicated by uniformly much lower rain day climatologies across sizeable MAR gradients), with a consequently greater influence of COLs on interannual rainfall variability. Additionally, irregular late summer thunderstorms, usually in conjunction with IF, N, NE or E CWTs (see Fig. A3c), have a large impact on annual rainfall totals in some years (Desmet and Cowling, 1999; Bradshaw and Cowling, 2014; Helme and Schmiedel, 2020).
The core WRZ is better characterised by the relative rainfall contributions of rain-bearing systems than by rainfall seasonality. Even during DJF, the core, uniquely, receives most of its rainfall from systems in the westerlies (Fig. A3a). This is probably related to the South Atlantic High exhibiting the smallest seasonal variation in latitude among the SH subtropical anticyclones and the most poleward position in summer, allowing for a more northerly frontal track (Rudeva et al., 2019). Hence, probably uniquely among subtropical Mediterranean-like climate zones, rainfall seasonality in the WRZ core is driven almost exclusively by variability in landfalling westerly storms, with little influence from systems embedded in the tropical easterlies. As wind and moisture belts shift in response to climate change (e.g. Sousa et al., 2018; Burls et al., 2019), this difference between the core WRZ and its surroundings may result in strongly differing climate change responses.

6.2 COL rainfall contributions as a determinant of spatial structure
Wolski et al. (2021) identified COL rainfall contributions from Favre et al. (2013) and Abba Omar and Abiodun (2020) as closely related to variability clusters in the southern WRZ.
The periphery and transition (as defined in Sect. 5) are associated with progressively larger COL contributions to total rainfall, which may be an important contributor to seasonality and variability differences relative to the core, where COL rainfall contributions are low. COLs may be particularly influential in the less stable seasonality in the periphery and transition. The surroundings also have a larger COL contribution to total rainfall.

Implications for the YRZ and WRZ-YRZ transition
The sharp distinction between the temporal rainfall variability patterns in the WRZ and south-western YRZ is presumably a consequence of NW winds yielding below-average and SE winds above-average daily rainfall in the YRZ, unlike in the WRZ and transitional regions. This finding lends support to the modelling results of Engelbrecht et al. (2019), suggesting that the YRZ, and particularly the south-facing zonal Cape Fold Belt slopes, received less rainfall when the austral westerly storm track was shifted equatorward during the Last Glacial Maximum, resulting in enhanced north-westerly flow. More research is, however, required to understand the drivers of spatial patterns of winter rainfall variability in the YRZ and its relation to the WRZ. In particular, the spatial coherence of YRZ variability and the relationship between core YRZ climatology and the climatology of its north-western extension are poorly understood (Engelbrecht et al., 2015). However, when considered in conjunction with Engelbrecht et al. (2015), Engelbrecht and Landman (2016) and Reason (2017), the findings presented here clearly support the notion that the WRZ and YRZ have fundamentally distinct climatic drivers (e.g. Desmet and Cowling, 1999; Cowling et al., 1999; Goldblatt and Manning, 2002) rather than the conceptualisation of the YRZ as essentially a transitional zone between the WRZ and SRZ that combines climatological influences of each (e.g. Faith et al., 2019; Chase et al., 2017; Dieppois et al., 2016; Tyson, 1999). Furthermore, the importance of summer rainfall for annual variability suggests that understanding drought and future change in the periphery and transition requires greater focus on drivers of summer variability than has previously been forthcoming. These findings are also consistent with suggestions that, on palaeoclimatic timescales, summer rainfall variability has dominated overall drought stress signals in the WRZ periphery (MacPherson et al., 2018; Chase et al., 2019).

Conclusions
We study rainfall seasonality and variability in southern Africa's WRZ in its regional context (South-Western Southern Africa) through the lens of rainfall-wind direction relationships. Our primary findings regarding the WRZ can be summarised thus:
- A well-defined, relatively homogeneous true WRZ with spatially coherent variability and seasonality is situated along the west coast, meridional Cape Fold Belt and western escarpment between Cape Town and the Gariep River mouth.
- The true WRZ corresponds very closely to those locations receiving 70 % of MAR from winds in the westerly sectors, mostly (70 %), and reliably, in AMJJAS.
- The true WRZ can be subdivided meaningfully into three further subdomains, the core and the southern and northern peripheries, corresponding roughly to the SW, SE and N clusters of Wolski et al. (2021).
- The core covers most of the southern WRZ. It is surprisingly homogeneous in seasonality and variability despite the fact that the MAR ratio between the driest and wettest core WRZ stations exceeds 10, while the MAR range in the true WRZ spans a factor of 50.
- Variability patterns in the core are dominated by AMJJAS signals, with evidence for important quasi-periodic variability at approximately 40-, 20- and 6-year timescales.
- With distance from the core, the summer influence on interannual variability increases at a greater rate than its contribution to MAR.
The most important findings regarding South-Western Southern Africa are the following.
- The region is characterised by large heterogeneity in rainfall seasonality and temporal variability patterns. Spatially abrupt transitions occur from the WRZ to the surrounding YRZ and late SRZ. The complementary use of continuous and discrete seasonality metrics to understand this heterogeneity is advised, as is done in Roffe et al. (2020).
- The eastern Boland Mountain Cluster and western escarpment are regions of spatially abrupt and complex seasonality transitions, including areas of non-monotonic change in %AMJJAS.
- Along a narrow band immediately south (windward) of the zonal Cape Fold Belt, the YRZ extends much further westward than has previously been reported.
- This western YRZ incursion is characterised by fundamentally distinct rainfall variability patterns, annually and in AMJJAS, tied to clearly distinct CWT-rainfall contributions. This is inconsistent with the common conceptualisation of the YRZ as a simple combination of WRZ and SRZ influences.
The coherence of spatial patterns of seasonality and variability between coastal plains, mountains and intermontane valleys, which differ greatly in MAR, strongly supports the use of relative or normalised anomalies when studying WRZ rainfall variability. In particular, we would discourage using MAR explicitly or implicitly for station subsetting (as in e.g. Philippon et al., 2012; Burls et al., 2019). However, despite this coherence and the small size of the WRZ relative to Mediterranean-like climate regions globally (Aschmann, 1973; Seager et al., 2019), care should be taken in selecting regions of interest for WRZ rainfall variability and change assessments. Otherwise distinct variability signals driven by fine-scale heterogeneity in seasonality and variability in South-Western Southern Africa may easily be conflated.
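To illustrate why relative or normalised anomalies make stations with very different MAR comparable, the sketch below computes both for a wet and a dry series. The function names and rainfall values are hypothetical and the definitions shown (percentage departure from the mean, and departure divided by the sample standard deviation) are common conventions assumed here rather than taken from this study.

```python
from statistics import mean, stdev


def relative_anomalies(series):
    """Departure from the series mean, as a percentage of that mean."""
    m = mean(series)
    return [100.0 * (x - m) / m for x in series]


def standardised_anomalies(series):
    """Departure from the series mean, in units of sample standard deviation."""
    m = mean(series)
    s = stdev(series)
    return [(x - m) / s for x in series]


# Hypothetical annual totals (mm) at a wet mountain and a dry plain station.
mountain = [1500.0, 2000.0, 2500.0]
plain = [150.0, 200.0, 250.0]

rel_mountain = relative_anomalies(mountain)
rel_plain = relative_anomalies(plain)
std_mountain = standardised_anomalies(mountain)
std_plain = standardised_anomalies(plain)
```

Although the raw anomalies differ by an order of magnitude, both transformed series are identical in this toy case, so neither station dominates a spatial mean.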
It should be noted that the 40-year data period of this study is relatively short given the dominant role of multi-decadal variability in WRZ rainfall (Burls et al., 2019; Ndebele et al., 2019; Dieppois et al., 2016). Furthermore, spatial aggregation and generalisability of the results are hampered by the absence of reasonable-quality long-term daily records in the wettest mountain locations. As such, some quality concerns remain for stations included in this study in the interest of spatial representativity, even after extensive quality control is applied. While it is unlikely to dramatically alter the qualitative patterns observed, the assumption of uniform rain rates during each 6 h subperiod of the 24 h rainfall recording period is crude. It is made to allow comparison between the available rainfall data (daily) and higher temporal resolution wind data (6-hourly) required to capture transient weather systems. Finally, the clustering methodologies employed are all somewhat sensitive to arbitrary parameter choices. Future studies seeking a more nuanced understanding of spatial patterns and transitions in South-Western Southern Africa would be well advised to apply fuzzy clustering approaches. However, for the purpose of this work (gaining improved qualitative understanding of the finer-scale spatial patterns of variability and seasonality of the WRZ and their relationship with wind direction) the approaches employed allow us to demonstrate and explain both remarkable homogeneity in rainfall patterns in the west and complex fine-scale patterns of transition to the east.
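The uniform rain rate assumption mentioned above can be sketched in a few lines: a 24 h total is divided equally among the 6 h subperiods and each share is attributed to the CWT of that subperiod. The function name and example labels are hypothetical, and details such as the alignment of the recording day with the 6-hourly wind fields are not addressed here.

```python
from collections import defaultdict


def split_daily_rainfall(daily_mm, six_hourly_cwts):
    """Attribute a 24 h total equally to the CWT of each 6 h subperiod."""
    share = daily_mm / len(six_hourly_cwts)
    by_cwt = defaultdict(float)
    for cwt in six_hourly_cwts:
        by_cwt[cwt] += share
    return dict(by_cwt)


# A 20 mm day during which the wind backs from NW through W to SW.
attributed = split_daily_rainfall(20.0, ["NW", "NW", "W", "SW"])
```

If most of the rain actually fell during the pre-frontal NW phase, this equal split would understate the NW contribution, which is why the assumption is described as crude.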

A1.1 DWS station selection
All South-Western Southern African DWS stations for which daily data are available from October 1980 (the start of DWS daily rainfall records) were included, with the exception of Theewaterskloof Dam, whose records were judged to be inconsistent with surrounding stations. The decision to include DWS stations under less stringent conditions than SAWS stations was motivated by their location in important water source regions, frequently at higher elevations. The Ceres Dam station, which started reporting only in 1999, is also included, as its rainfall is found to be closely related to surrounding lower-elevation stations at the daily timescale, making infilling of the missing data relatively reliable, and because it is the only known station north of the Boland Mountain Cluster recording MAR > 1500 mm. Data for all DWS stations were obtained up to December 2018, except for Clanwilliam Dam, which stopped reporting in 2014; the nearby Clanwilliam AWS SAWS station (reporting since 2001) was found to be sufficiently similar during the period of overlap, so that, after mean adjustment, it could serve as a replacement series for 2014-2018.
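The mean adjustment of a replacement series can be sketched as below. The text does not specify the form of the adjustment, so a multiplicative scaling by the ratio of overlap-period means is assumed here; the function name and all values are hypothetical.

```python
def mean_adjust(donor_overlap, target_overlap, donor_new):
    """Scale donor values so that overlap-period means match the target.

    `donor_overlap` and `target_overlap` are paired observations on the same
    days, so the ratio of their sums equals the ratio of their means.
    """
    factor = sum(target_overlap) / sum(donor_overlap)
    return [value * factor for value in donor_new]


# Hypothetical overlap in which the target station is 20 % wetter than the donor.
donor_overlap = [10.0, 20.0, 30.0]
target_overlap = [12.0, 24.0, 36.0]
replacement = mean_adjust(donor_overlap, target_overlap, [5.0, 0.0, 10.0])
```

A multiplicative adjustment keeps dry days dry, which an additive offset would not.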
SAWS stations not satisfying the 95 % or 90 % criteria outlined in Sect. 3.1.2, but for which a record is also available from DWS and/or CoCT, were also included in the study.

A1.2 Nearest-neighbour stations
Gauges selected as nearest neighbours have shorter but reliable periods of record and are situated near stations in the WRZ dataset or in poorly sampled regions. These stations include CoCT stations and a number of newer automatic SAWS gauges.

A2 Description of the WRZ dataset cleaning methodology
The iterative procedure applied for data cleaning involves the following 12 steps.
1. The data cleaning procedure used by Wolski et al. (2021) is first applied. Data obtained from SAWS have been subjected to SAWS internal data cleaning, although a description of this procedure is not publicly available and has evolved in time, so that differences may have occurred in the cleaning conducted between different updates received from SAWS. DWS also performs data cleaning on their daily data, but this procedure, too, is not publicly documented.
2. Spurious sequences of dry days are identified, which presumably should have been missing. If the estimated probability of a given sequence of dry days is less than 0.25 %, the sequence is replaced by missing values. This procedure identified a number of such instances, most notably where 25 SAWS manual gauges in the south-western Cape reported no rainfall at all throughout JAS 1999, while surrounding stations received frequent, near-normal rainfall.
3. Sequences with more than 2 consecutive days with equal non-zero rainfall are also removed. Such sequences generally appear to result from cumulative rainfall totals (often, but not always, for a month) being distributed uniformly across all days in that period.
4. To ensure convergence of the GAMLSS fits, as large as possible a sample of days is required and bias in missing value occurrence needs to be minimised. Hence an initial infilling is done using mean-adjusted CCW. Because of the large differences in mean rainfall occurring over short distances, we use the squared Spearman rank correlation as the correlation measure. The estimated rainfall is computed using four to six nearest neighbours.
5. For manually selected stations, over periods for which climatologically very similar surrounding stations (up to a maximum of three) are reporting, CCW on this fixed set of stations is used instead. This predictor is preferentially used over other predictors, throughout the procedure, although the set of stations for which this nearest-neighbour (NN; Pegram and Bárdossy, 2013) approach is used is reduced once the GAMLSS predictor is available (see point 10). Some of these stations (e.g. Brooklands, Molteno Reservoir, Porterville AWS, and Hermanus AWS) are not part of the main WRZ dataset and are used only for infilling due to short periods of record. Automatic stations are preferred for this purpose as they are found to report rain days more consistently (see points 6 and 8).
6. Incorrect reporting of the day of rainfall in South African records is common (Smithers and Schulze, 2000). Daily rainfall is supposed to be measured over 24 h periods from 08:00 to 08:00 SAST and recorded under the date of the start of this period. However, often it is recorded under the day when the reading is taken, and this malpractice may persist for periods of days, weeks or months at a time before reverting to the correct practice. Consequently we apply time shifting in cases where the occurrence of rainfall is found to be inconsistent with surrounding stations and the sequence shifted forward or backward by 1 d is much more consistent with surrounding observations. More precisely, rainfall series are shifted back by 1 d, when, over the period from day t_0 to t_n, where r(t) is the rainfall recorded at the station on day t and m(t) is the corresponding value expected (or modelled) using the approaches from points 4 or 5 above; an equivalent procedure is employed for forward shifting. Time shifting is applied iteratively until near convergence to allow for shifts by a few days and for adjustments if corrections to surrounding stations render a potential shift more desirable or a prior shift unnecessary. At very dry stations the procedure can get stuck in a loop, so complete convergence is not achievable.
7. Where records from different institutions for the same station on the same day disagree, source selection is conducted. In such cases, the value most consistent with the surrounding stations (using an approach analogous to expression A1) is selected. This is done on time-shifted versions of each of the institutional time series to ensure that differences are unlikely to be due to one record being a time-shifted version of another.
8. Suspiciously large single-day rainfall values following a sequence of zeroes or missing values are assumed to represent cumulative totals since the last non-zero non-missing value. These cumulative totals are then distributed over previous days in proportion to the CCW-NN (using NN estimates where available and CCW otherwise) expected rainfall values.
9. Other suspiciously large rainfall values, greater than 9 times the value expected from CCW-NN, are assumed to be the result of decimal errors and are divided by 10.
10. After the above cleaning and infilling operations have been conducted, a GAMLSS is fit for the zero-inflated Gamma distribution (ZIG; Stasinopoulos, 2005; see Schoenach et al., 2020, Appendix A, for a brief overview of the GAMLSS framework). This approach and distribution allow one to separately model the mean rainfall value, the variance of this estimate and the probability of non-zero rainfall at a station on a given day as a function of rainfall values at selected surrounding stations and the day of the year. Potential predictor stations are iteratively added to the model until at least three terms are included and thereafter are retained if the Schwarz Bayesian criterion (SBC) (Schwarz, 1978) is reduced by their inclusion. The next predictor selected is the station with the highest value of (ρ_S)^2 with the model residuals, multiplied by a measure of the frequency of co-occurrence of rainfall at the target and candidate stations. Missing values are replaced by 0 when a binomial random number generator with probability set to the modelled probability of rainfall at that station returns 0; otherwise they are replaced by the mean value of the fit, i.e. rainfall occurrence is stochastically modelled, but rainfall amounts on rain days are deterministic.
11. Since the probability of a dry day and the probability of exceeding a particular rainfall amount can now be estimated rigorously using the fits obtained in point 10, points 2 and 9 can be repeated with more robust probability estimates.
12. After this more robust cleaning procedure is run, the GAMLSS fit is performed again. This final output is then used as the WRZ dataset.
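The redistribution of cumulative totals (point 8), the decimal-error correction (point 9) and the stochastic infilling of missing values (point 10) can be illustrated with a minimal Python sketch. It is not the code used in the study, which is run in R. All function and variable names are hypothetical. The sketch assumes that the expected values (from CCW-NN) and the modelled rain probability and mean are already available as plain numbers, that the day carrying the cumulative total is itself included in the redistribution, and that the decimal-error test is skipped when the expected value is zero.

```python
import random


def redistribute_cumulative(observed, expected, idx):
    """Spread a suspected cumulative total at position idx (point 8).

    The total is distributed over the preceding run of zero or missing
    (None) days, plus day idx itself (an assumption), in proportion to
    the expected rainfall on each of those days.
    """
    start = idx
    while start > 0 and (observed[start - 1] is None or observed[start - 1] == 0):
        start -= 1
    span = range(start, idx + 1)
    weight_sum = sum(expected[i] for i in span)
    result = list(observed)
    if weight_sum <= 0:
        # No rain expected anywhere in the span: leave the record unchanged.
        return result
    total = observed[idx]
    for i in span:
        result[i] = total * expected[i] / weight_sum
    return result


def correct_decimal_error(value, expected, factor=9.0):
    """Divide by 10 any value exceeding factor times the expected value (point 9).

    The guard on expected > 0 is an assumption of this sketch.
    """
    if value is not None and expected > 0 and value > factor * expected:
        return value / 10.0
    return value


def infill_missing(p_rain, mean_amount, rng):
    """Infill one missing day (point 10).

    Occurrence is a random draw with probability p_rain; the amount on a
    wet day is the deterministic mean of the fit.
    """
    wet = rng.random() < p_rain
    return mean_amount if wet else 0.0


# Usage example with made-up numbers.
observed = [5.0, 0.0, None, 0.0, 60.0]
expected = [4.0, 10.0, 20.0, 0.0, 30.0]
cleaned = redistribute_cumulative(observed, expected, idx=4)
fixed = correct_decimal_error(120.0, 12.0)
filled = infill_missing(p_rain=0.3, mean_amount=4.2, rng=random.Random(1))
```

In this made-up example the 60 mm total is split over the four days following the last non-zero reading, in the ratio 10:20:0:30, so the station total over the span is preserved. Passing an explicitly seeded generator to `infill_missing` keeps the stochastic occurrence step reproducible.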
A3 Software used
- The data cleaning procedure is run in R (R Core Team, 2018) using the gamlss package (Stasinopoulos and Rigby, 2007).
- Simple statistical manipulation on the WRZ dataset is performed using the tidyverse (Wickham et al., 2019) and time series plotting using ggplot2 (Wickham, 2009) in R.
- Spatial plots and maps are produced using the Python libraries basemap and cartopy (Met Office, 2010-2020).
- For preprocessing, the climate data operators (cdo; Schulzweida, 2019) and the KNMI climate explorer (van Oldenborgh et al., 2009) are used.

Figure A1. Equivalent of Fig. 3 for the southern African dataset, including all southern African dataset stations over south-western southern Africa with at least the equivalent of 20 years of complete data.

Table A2. The first column ("Station name") contains the names data providers use for their stations; the second column ("Name used") contains the name of the station used in our study. This name may differ from the original to ensure that the same name is used for all records from a given station, to ensure uniqueness where multiple stations have the same name, or in the interest of shortness and familiarity. The "Source" column identifies the institution providing the data; the latitude ("Lat"), longitude ("Long") and elevation ("Elev.") of the station provided by this institution are then listed (NA if none is provided), followed by the ID code it uses to identify the station and the start and end dates of their record used here. The "Other" column lists other institutions with records for the same station. The "Nearest neighbour" column indicates whether nearest neighbour infilling is preferentially employed for a given station, in this case listing the stations used as its neighbours; "NA" indicates that nearest-neighbour infilling is not applied for this station. The "Type" column indicates whether the record is a primary record for a station in a dataset ("Primary"), used exclusively as a nearest neighbour ("NN"), or to extend the period of record of a primary record forward or backward ("Completion").