Learning from single extreme events

1. Introduction

After an extreme storm in New England in 1898, Bumpus found 136 injured house sparrows (Passer domesticus), of which 64 later died [1]. When he compared the morphology of survivors with that of deceased individuals, he found that the former were clearly less variable than the latter. This outcome was precisely as predicted by the theory of natural selection, with elimination of those individuals that deviate the most from the norm. A century later, Keller et al. [2] examined inbreeding in a population of song sparrows (Melospiza melodia) on Mandarte Island, British Columbia, before and after a population crash caused by a severe winter storm and found that the survivors were less inbred than the individuals that had died. Both studies were based on a single extreme event. Bumpus made use of a fortuitous opportunity and was one of the first to observe natural selection in action. Keller et al. analysed an event that occurred during a long-term study during which they had collected detailed pedigree data and provided one of the first demonstrations of selection against inbreeding. Both studies became landmark studies, significantly contributing to our understanding of how natural selection works in the wild.

Extreme climatic events (ECEs) are changing in frequency and magnitude [3], and the concern is that they may have a disproportionate effect on ecosystems [4]. ECE studies therefore need to provide information on the likely biological effects of a climatic event with a particular extremeness, and whether this type of event is likely to push the biological system across thresholds from which it will only recover slowly, if at all [5]. ECEs often induce delayed and cascading responses [5-7], and understanding the underlying mechanisms (e.g. physiological, demographic) of the biological reaction is, therefore, important.
Here, we ask what can be learned from studies based on a single ECE and what characteristics can make such studies particularly informative. We suggest ways of making better use of existing studies and of improving the design of future ECE studies.

(a) Characteristics of single-event studies

A substantial proportion of our knowledge about the effects of ECEs on natural biological systems (below, we will use the term system for any biological system, which could be an individual, population or community) is based on observing a single event. Are these studies a biased sample of all ECE studies? We examined the characteristics of 242 studies listed in a recent review [8]. More than half (59%) of these studies were based on a single ECE, but they covered similar ecological responses, climatic events, habitats and taxonomic groups as the studies with multiple ECEs (figure 1; electronic supplementary material, appendix S1 with figures S1-3). The single-event ECE studies in our sample broadly fell into three categories (see electronic supplementary material, appendix S1 for details):

(i) Opportunistic observational studies were initiated after an ECE and tended to be short in duration (31% of the studies, median duration 1.5 years).
(ii) Long-term observational studies usually followed a system both before the extreme event occurred and after it passed and were generally able to detect delayed responses (38% of the studies, median duration 10 years).
(iii) Experimental studies tended to be short, on small spatial scales and were generally restricted to systems that can be manipulated relatively easily (31% of the studies, median duration 0.7 years).

(b) Definitions matter

Most climatic and biological variables of interest are continuous, and whether an event is considered extreme (or how many events in a time series are labelled extreme) therefore depends on the definition used [8,9].
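To illustrate how the label "extreme" depends on the definition, the following minimal sketch classifies each occasion in a time series according to whether the climate variable, the biological response, or both are unusually high. The 95th-percentile cutoff, the function names and the data are assumptions made for illustration only; they are not taken from the studies reviewed here.

```python
from statistics import quantiles

def upper_threshold(values, percentile=95):
    """Empirical percentile of `values`; the cutoff of 95 is an assumed example."""
    # quantiles(..., n=100) returns the 1st to 99th percentiles as a list
    return quantiles(values, n=100, method="inclusive")[percentile - 1]

def classify_events(climate, response, percentile=95):
    """Label each occasion under three possible definitions (hypothetical helper)."""
    c_cut = upper_threshold(climate, percentile)
    r_cut = upper_threshold(response, percentile)
    labels = []
    for c, r in zip(climate, response):
        clim_extreme = c > c_cut
        resp_extreme = r > r_cut
        labels.append({
            "climatological": clim_extreme,            # climate variable is unusual
            "biological": resp_extreme,                # response is unusual
            "hybrid": clim_extreme and resp_extreme,   # both are unusual
        })
    return labels

# Made-up series: 40 years of a climate variable and a biological response
climate = list(range(1, 41))
response = [10] * 40
response[38] = 50   # large response in a climatically extreme year
response[5] = 60    # large response in a climatically ordinary year
labels = classify_events(climate, response)
```

In this made-up series, the same 40 years contain different sets of "extreme" events depending on which of the three labels is used, which is the point of the sketch.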
Most definitions of an ECE require that either the climatic variable (climatological definition), the biological variable (biological definition) or both (hybrid definition) exceed a certain value or are expected to occur sufficiently rarely [9], e.g. in less than 5% of the years. Single-event studies report a biological response to one event (or treatment in the case of experiments) that is considered extreme according to one of these definitions (an observation falling into areas I, II or III in figure 2a). Beyond that, however, single-event studies vary greatly in their design, which affects what can be learned from them.

(c) Opportunistic single-event studies

Some single-event studies literally just observe a single extreme event and some biological response to it (opportunistic studies, figure 2a). Opportunistic single-event studies using a climatological definition (climate event is to the right of the vertical grey line in figure 2a) typically were initiated after an ECE occurred (e.g. the 2003 heat wave in Europe [10]) and examined ecological responses after the event. For example, a mid-winter rainstorm in the polar desert of continental Antarctica and the subsequent freezing did not damage lichens [11]. Single-event studies using a biological definition (a response above the horizontal grey line in figure 2a) typically observe an unusual ecological response and then examine what may have caused it. For example, an extreme case of heather (Calluna vulgaris) dieback in Scotland was attributed to low humidity combined with low temperatures and an ageing plant population [12]. Opportunistic single-event studies sometimes use a hybrid definition (area I in figure 2a, [5]). For example, Knapp & Soule [13] examined the spatial extent of an extreme frost event after it had led to widespread tree mortality in the Pacific Northwest of the United States. Opportunistic single-event studies show what can happen to a system under certain climatic conditions.
However, observational studies generally cannot attribute the response to the ECE unequivocally because the effect of unobserved confounding variables can never be ruled out.

(d) Single extreme climatic events observed during long-term studies

The other main type of observational single-event study reports ecological responses to an ECE that occurred during a long-term study (figure 2b-d). These studies are able to quantify how much the ecological response to the extreme climate deviates from the response of the system to non-extreme climate. For example, one could model the ecological response, $Y_i$ at occasion $i$, as a function of the observed climate ($\mathrm{clim}_i$) as

$$g(Y_i) = f(\mathrm{clim}_i) + \delta E_i + \varepsilon_i, \qquad \varepsilon_i \sim N\big(0,\ \sigma^2 h(\mathrm{clim}_i)\big), \tag{1.1}$$

where $g$ is a suitable transformation of the response and $f$ a suitable function for the relationship between climate and response, e.g. constant, linear, etc. The model above assumes that the errors $\varepsilon_i$ are normally distributed with variance $\sigma^2$, and related to climate through function $h$, which may be constant or generalized to account for autocorrelation. $\delta$ estimates the difference between a normal ($E_i = 0$) and an extreme ($E_i = 1$) event, and therefore how unusual it is, compared with non-extreme events. However, if the response to a single ECE is judged unusual, one cannot distinguish whether the event led to an extreme mean (event is not well described by function $f$; open triangles in figure 2b) or an increased variance (event is not well described by function $h$; open circles in figure 2b) in the response, as an extreme response is possible in both cases. Conversely, if a climate extreme does not lead to an observed extreme response (figure 2c), this does not necessarily mean that function $f$ in the above model provides a good description of how the system responds to this type of extreme event. It is possible that the response is more variable under these conditions but the observed event happened to lead to a response that looked typical given the observations from non-extreme events (figure 2c).
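As a rough illustration of the comparison described above, the sketch below takes the simplest case: an untransformed response, a linear relationship with climate and constant error variance (all assumptions chosen for simplicity). With a single extreme occasion, the least-squares estimate of the deviation equals the observed extreme response minus the prediction from a line fitted to the non-extreme occasions, so the sketch fits that line and scales the deviation by its prediction standard error. Function names and data are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares line through the non-extreme occasions."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(resid_ss / (n - 2))   # residual standard deviation
    return intercept, slope, sigma, xbar, sxx

def extreme_deviation(clim, resp, clim_ext, resp_ext):
    """Deviation of one extreme occasion from the non-extreme relationship.

    Returns the deviation and the deviation divided by its standard error
    (a t-like statistic with len(clim) - 2 degrees of freedom).
    """
    intercept, slope, sigma, xbar, sxx = fit_line(clim, resp)
    n = len(clim)
    delta = resp_ext - (intercept + slope * clim_ext)
    se = sigma * math.sqrt(1 + 1 / n + (clim_ext - xbar) ** 2 / sxx)
    return delta, delta / se

# Made-up data: five non-extreme years and one extreme year
clim = [0, 1, 2, 3, 4]
resp = [1.1, 2.9, 5.0, 6.9, 9.1]
delta, t_like = extreme_deviation(clim, resp, clim_ext=8, resp_ext=10.0)
```

A large standardized deviation suggests the response lies outside what the non-extreme relationship predicts. As noted in the text, a single event cannot tell whether that reflects a shifted mean or an increased variance.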
An increased variance in the ecological response could be expected if climate interacted with other variables. For example, extreme winters lead to high mortality in Dutch oystercatchers (Haematopus ostralegus) only when food availability is low [14]. The model above (equation (1.1)) can be used to test whether an observed response falls outside of the system response expected under non-extreme climatic variation, which could indicate a threshold-like response. Conversely, if this model predicts an extreme response well, that could indicate that the system does not cross any thresholds over the range of observed climatic events (figure 2d). For example, juvenile survival in barn owls (Tyto alba) was lowest after unusually harsh winters, but in line with what was expected given the extremeness of these years [15].

2. Inference from single-event studies

To understand the effects of ECEs on biological systems, we need to be able to estimate the magnitude of the effect (e.g. as described above) and also to attribute that effect to the ECE [9]. Different types of ECE studies have different strengths and limitations in this regard. Experimental studies adhering to the three statistical principles of study design (replication, randomization and control) allow effects to be attributed to treatments. Experiments are, therefore, the most powerful tool for examining the effect of particular climatic conditions on a system. However, experiments may not estimate the magnitude of the natural response well [16] and are often not possible to carry out at the desired scale. Well-designed observational studies are important because they estimate the response under real conditions, but attribution is more difficult since one can never rule out the possible effects of unobserved covariates [17,18].

(a) Design of observational single-event studies

Observational single-event studies allow for stronger inference if they use random sampling as opposed to convenience sampling.
Random sampling ensures representativeness. Bumpus' study [1], for example, was based on sparrows that were injured during the storm, and they may not be representative of the sparrow population in general. We, therefore, cannot tell whether this storm reduced morphological variance in the sparrow population as a whole. More generally, single-event studies based on a convenience sample of observational units (individuals, study sites, etc.), rather than a random one, can demonstrate that a particular phenomenon can happen, but they cannot support inference about the wider population of affected biological systems.

Inference from single-event studies can be improved by employing some kind of control. The classical designs for such impact studies [19] involve either temporal or spatial controls, or both. Where a single ECE was observed during a long-term study, its impact can be inferred from temporal changes in the response (e.g. equation (1.1) above). Methods for inferring step-changes in time series are also known as intervention analysis [20]. If no data are available from before the impact occurred (as is the case in opportunistic single-event studies), spatial controls could be used instead, e.g. by comparing the system affected by the ECE with one that was not affected. Spatial replication has occasionally been used in single ECE studies (e.g. case study 1, box 1, [21]), allowing researchers to examine interactions between the ECE and factors that vary spatially, like habitat quality [21]. Since the extremeness of an event often varies spatially [10,13], spatial replication can give important information on the shape of the relationship between climate and response, e.g. whether critical thresholds lie within the range of observed values of the climatic driver. Spatial replication could also be used to separate climate extremes from other drivers, if they are not strongly correlated.
The strongest designs include spatial and temporal controls, known as Before-After-Control-Impact designs [19], but are difficult to apply to ECEs because we do not know when and where these are going to happen.

(b) Replication at a lower level

Single-event studies are unreplicated at the level of the ECE and inference is limited to the particular event that was observed. We therefore also cannot estimate the natural variability in the biological response to the observed type of ECE (figure 2b,c), and whether it depends on the state of the system [24] or other variables that may interact with climate [21]. However, the precision with which we can estimate the biological response to a particular event is at least partly under our control as it depends on replication at a lower level. For example, to estimate the effect of a particular heat wave on the mortality of a certain tree population, we could count the number of trees that have died ($d$) and those that survived ($s$) in a particular sample and estimate survival ($\phi$) during this particular event as

$$\hat{\phi} = \frac{s}{n},$$

where $n = d + s$ is the total number of trees observed. By increasing $n$, we can estimate tree mortality during this particular event more precisely. Stronger inference is possible if we have data on tree stands during normal climatic conditions, e.g. by sampling other stands that were not impacted by the heat wave (spatial control), or by observing the same stand during non-extreme years (temporal control). We could then estimate the variability in survival, for example using a generalized linear mixed effects model, written here with a logit link,

$$Y_i \sim \mathrm{Binomial}(n_i, \phi_i), \qquad \mathrm{logit}(\phi_i) = \mu + \delta E_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),$$

where $Y_i$ is the number of surviving trees out of $n_i$ in stand or year $i$, $\mu$ estimates mean survival under non-extreme conditions, $\delta$ estimates the difference between a normal ($E_i = 0$) and an extreme ($E_i = 1$) event and $\varepsilon_i$ is a random effect. The variability in survival among stands or years is captured by the estimate of $\sigma^2$.
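The effect of lower-level replication on precision can be sketched numerically. The example below estimates survival from counts of surviving and dead trees, attaches the usual binomial standard error, and computes a crude difference on the logit scale between an impacted stand and a pooled control. It deliberately ignores the random effect among stands or years, so it is a simplification for illustration and not the mixed model itself; function names and counts are hypothetical.

```python
import math

def survival_estimate(survived, died):
    """Survival proportion and its binomial standard error."""
    n = survived + died
    phi = survived / n
    se = math.sqrt(phi * (1 - phi) / n)
    return phi, se

def logit(p):
    return math.log(p / (1 - p))

def logit_difference(surv_ext, died_ext, surv_ctrl, died_ctrl):
    """Extreme-minus-control survival on the logit scale.

    Pools all control trees and ignores among-stand or among-year
    variability, which the mixed model would capture.
    """
    phi_ext, _ = survival_estimate(surv_ext, died_ext)
    phi_ctrl, _ = survival_estimate(surv_ctrl, died_ctrl)
    return logit(phi_ext) - logit(phi_ctrl)

# Made-up counts: the same survival proportion from 40 and from 400 trees
phi_small, se_small = survival_estimate(30, 10)
phi_large, se_large = survival_estimate(300, 100)

# Made-up counts: impacted stand versus pooled control stands
delta_hat = logit_difference(20, 20, 90, 10)
```

A tenfold increase in the number of trees shrinks the standard error by a factor of the square root of ten, while the estimate itself still refers only to the one observed event.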
With this model, we can quantify and test the effect of this particular ECE on survival and we can improve precision by increasing $n_i$ or the number of years or stands that we observe.

(c) Mechanistic understanding through ancillary information

Statistically attributing biological responses to an ECE is difficult in observational studies [17,18]. However, a convincing case can usually be made if the mechanistic pathways that led to the response are known and ancillary data are collected that give insights into the mechanisms of the ecological response we are interested in [25]. For example, Grant & Grant ([22], case study 2, box 1) showed how an El Niño event changed selection on beak size of Galapagos finches by favouring plants that produced softer seeds. Observing multiple demographic responses might give insights into the demographic mechanisms, such as life-history trade-offs and constraints [26], and observing multiple species helps in understanding community-level effects [27]. Godfree et al. ([23], case study 3, box 1) combined detailed observations with an experiment and models to understand how an extreme drought affected the local occurrence and range boundaries of a grass. Observing how the system recovers from the impact [28] or how a collection of similar systems reacts to the same event (e.g. [21], case study 1, box 1) are additional means for strengthening the inference that can be made from single-event studies.

For a mechanistic understanding of the biological effects of ECEs, physiologically important variables such as temperature, or available water, are more useful than measures of climate without a clear mechanistic link to the ecological response, e.g. a climate index [29]. Ideally, the measured climatic conditions should closely reflect the microclimate that a biological system experiences, which may not necessarily be what regular meteorological stations record.
(d) Clear hypotheses lead to stronger tests

There is usually some prior knowledge on the system we are interested in, or on similar systems. If we understand the relevant processes (e.g. physiological limits) well enough to be able to generate specific predictions, observing the response to a single ECE can be a powerful test of our knowledge. Bumpus' [1] study was so influential because it tested a key hypothesis of an important new theory. A single careful observation can also indicate gaps in our understanding. For example, if a known physiological threshold is exceeded during an extreme event, but the expected ecological reaction does not happen, the organisms must have ways to protect themselves [30].

3. Ways to knowledge: learning by data accumulation versus learning through theory development

Progress in our understanding of ecological systems alternates between inductive and deductive inference: we observe a phenomenon, develop hypotheses that might explain the observation, develop a theory, collect more observations, refine or re-develop the theory, etc. [31]. The goal is to progress from a situation of no data and little understanding to a situation of having rich data and thorough understanding. Along this path, observations, experiments and theory (models) are tools that complement each other. We need to observe natural events to make sure that the phenomenon we study is relevant in nature. We need experiments to establish causation. And we need theory to deduce testable predictions and reach a more general understanding. How a new observation (e.g. of the effect of an ECE) contributes to this process depends on the level of background information that is already available [32]. When we lack understanding of a system, observing one event can tell us what type of responses are possible. A single observation greatly reduces uncertainty compared with the prior state of not having any information at all [33], and can be useful for decision-making [34].
Observations, even patchy or anecdotal ones, are a starting point for the process of gaining knowledge. Conversely, having a lot of data does not necessarily lead to a good understanding of a system or process. Holling [35] distinguished between situations with lots of data but little understanding (area I in figure 3), lots of understanding but little data (area II in figure 3), little data and little understanding (area IV in figure 3) and lots of data and good understanding (area III in figure 3).

(a) Theory-driven pathway

Most importantly, we think, single-event studies can contribute to the theory-driven pathway to knowledge (figure 3) in critical ways. In systems with little prior knowledge (area IV in figure 3), observing a large ecological response to an unusual climatic event can suggest that critical thresholds have been crossed. For example, Salewski et al. [36] observed high mortality among migratory birds arriving at an oasis south of the Sahara desert during the post-breeding southwards migration, following temperatures around 50 °C. Because little was known about the migration ecology of these birds crossing the Sahara, it is not clear how unusual such events are or what effect they have on the bird populations. Nevertheless, the observation clearly showed that these conditions can push many individual birds over the limit, which led to further studies that clarified how migratory birds cross the Sahara desert [37,38].

Where more background information is available, detailed observations of ecological effects of a single ECE can improve our confidence in our understanding of the system. A good understanding of the mechanistic pathways of how extreme events affect ecological systems exists for many situations. For example, we know a lot about the mechanisms by which plants respond to droughts and under what conditions they reach limits [39].
We also have clear hypotheses of the effects of extreme precipitation on terrestrial systems [40], and we understand the pathways of how ECEs affect carbon fluxes [41], riverine systems [42] and arid ecosystems [43]. These reviews provide frameworks against which each new observation can be evaluated.

The slow accumulation of observations of extreme events is similar to the slow accumulation of evidence in some situations where natural resources are harvested. There, adaptive management has been suggested as a tool for making decisions while at the same time learning about a system [44]. Adaptive management relies on a number of alternative models that represent the uncertainty about how the system works, according to the current knowledge at the time. Learning happens by comparing model predictions to observed outcomes and re-evaluating one's confidence in each model. Ideas of adaptive management could be used to learn about the ecological responses to extreme events more effectively, even if no management decisions are involved. We applied this method to the question of how extreme winters affect barn owl survival [15] and found that learning continued long after the last extreme event happened (box 2). Generally, by building alternative models to predict ecological responses to ECEs, it becomes clear what kind of information is needed to help distinguish between alternative hypotheses. Each time an extreme event occurs, one can focus on collecting that type of information and thereby make progress along the theory-driven pathway (figure 3). Adaptive management provides a sound framework for learning from consecutive events. This approach is particularly powerful when observations are hard to come by and one needs to make the most out of each observation [46,47] and for detecting ecological surprises [48].
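The step of re-evaluating confidence in each alternative model can be sketched with Bayes' rule, which is one common way to formalize it. The text does not specify the update used in box 2, so this is an assumed formulation. Each model's weight is multiplied by the likelihood of the new observation under that model's prediction (a normal likelihood with a known standard deviation, also an assumption), and the weights are then renormalized. Model names, predictions and the observation are made up.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def update_weights(weights, predictions, observed, sd):
    """Bayes' rule update of model weights after one observed outcome.

    weights     -- current confidence in each model (sums to 1)
    predictions -- each model's predicted response for this event
    observed    -- the response that was actually observed
    sd          -- assumed observation standard deviation (same for all models)
    """
    likelihoods = [normal_pdf(observed, p, sd) for p in predictions]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Two hypothetical models of survival after an extreme winter:
# a "threshold" model predicting 0.3 and a "linear" model predicting 0.6
weights = [0.5, 0.5]
weights = update_weights(weights, predictions=[0.3, 0.6], observed=0.3, sd=0.1)
```

Repeating the update after each new event, extreme or not, shifts weight towards the models whose predictions keep matching the observations.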
Adopting ideas from adaptive management may only work for effects that we can anticipate and have enough knowledge to model, although even with little knowledge, simple models can be very powerful [49].

(b) Data-driven pathways to learning from single-event studies: meta-analyses

Studies on single extreme events can contribute to the data-driven pathway most effectively if they are reported in a way that makes them comparable with other studies, for example, through formal meta-analysis [50]. This requires reporting effect sizes in a comparable way: a meta-analysis is only possible if the effect sizes from different studies reflect the same thing. The most relevant effect size is often the magnitude of the ecological response to the ECE, i.e. the difference in the ecological variable (e.g. survival, growth) after an extreme event compared to its value under normal climatic conditions. Having some kind of control (2a) is therefore particularly important. To make single-event studies comparable, one also needs to know how extreme the observed climatic event was, since the magnitude of the ecological response likely depends on how climatically extreme the event was. Quantifying the magnitude of climatic extremeness in a biologically relevant way is challenging as long-term climatic data are needed, and the frequency of such events is changing with climate change. Meta-analyses can help attribute extreme responses to particular drivers (different components of climate or non-climatic drivers) if single-event studies report those.

To demonstrate how meta-analyses can be used to draw information from multiple single-event studies, we examined whether the 2003 heatwave that affected much of Europe [10] had different effects on fecundity, growth and survival across different organisms (example 1, box 3). We found some evidence that survival and fecundity declined more than growth, but there was a lot of variability among observations.
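To show why comparable effect sizes and their sampling variances need to be reported, the sketch below pools effect sizes from several single-event studies with a random-effects model, using the DerSimonian and Laird moment estimator for the between-study variance. This estimator is chosen here for illustration; it is not stated to be the method behind box 3. The effect sizes and variances are made-up numbers.

```python
import math

def random_effects_meta(effects, variances):
    """Random-effects pooled estimate (DerSimonian-Laird estimator).

    effects   -- one effect size per study (e.g. change in survival)
    variances -- sampling variance of each effect size
    Returns (pooled effect, its standard error, between-study variance tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # heterogeneity Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                    # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Made-up effect sizes from three hypothetical single-event studies
pooled, se, tau2 = random_effects_meta(
    effects=[-0.40, -0.10, -0.65],
    variances=[0.02, 0.05, 0.03],
)
```

A study that reports only that survival "declined", without a magnitude and a measure of its uncertainty, cannot enter a calculation of this kind.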
As a second example, we examined whether a change in survival in response to heatwaves depended on the extremeness of the event (example 2, box 3). In the two examples, we had to exclude 24% and 44% of the studies because they lacked critical information on the effect size or extremeness of the climatic event, suggesting that future studies should pay more attention to reporting critical information.

One issue that needs particular attention when conducting formal meta-analyses of ecological responses to ECEs is that there could be bias due to particular definitions of ECEs. For example, studies observing no large ecological response after a climatological ECE (area I in figure 2a) would not be called extreme events under the biological or hybrid definitions. Meta-analyses might consequently overestimate the effects of ECEs, particularly where sample sizes are low [52].

4. Conclusion: how to make single-event studies most useful

Because ECEs happen rarely, we get few opportunities to study them. Single-event studies will therefore remain an important source of knowledge about the biological effects of ECEs. We found that single-event studies broadly fall into three categories that each have their strengths and limitations but can contribute to our knowledge of the biological effects of ECEs in complementary ways [53]:

(i) Long-term studies can collect information on a system before and after an extreme event. However, they need a lot of investment to be maintained over long enough time spans.
(ii) Opportunistic studies are easier to set up, but lack information on the system dynamics before the extreme event.
(iii) Experiments can uncover causal relationships, but tend to be limited to certain types of systems, and relatively small spatial and temporal scales.

We examined factors that make single-event studies more useful contributions to both a theory-driven (3a) and a data-driven pathway (3b) of learning, and summarise these factors in box 4.
The location and timing of the next extreme event is uncertain and it can therefore be difficult to measure the right thing in the right place (2c). However, having clear hypotheses (2d) and some prior understanding of the mechanisms (2c) help when deciding what responses should be measured and for how long. To make the most of each opportunity, we argue that attention to rigorous study design is particularly important. This involves using appropriate controls and random sampling (2a), and enough power to estimate the effect size reliably (2b). Studies reporting on ecological responses to a single ECE may have their limitations [8]. However, due to the difficulty of studying multiple extreme events, these studies play an important role in our understanding of such effects. With this paper we hope to improve the value of single-event studies by taking a critical look at the value and limitations of such studies, by suggesting ways of making better use of existing studies and by suggesting ways to improve on the design of future studies.