
  • Bureaucrat incentives reduce crop burning and child mortality in South Asia



    Data

    For our empirical analysis, we use data from a variety of sources.

    Fires: we measure crop burning with data on fires from NASA satellites: the Moderate Resolution Imaging Spectroradiometer (MODIS) (1 km × 1 km resolution) and the Visible Infrared Imaging Radiometer Suite (VIIRS) (375 m × 375 m resolution), from September 2012, the first year in which VIIRS was operational. These data are generated by detecting thermal signatures, which are not impeded by cloud cover or smoke in the atmosphere. We plot the total number of fires over time in Supplementary Fig. A.1: each year, there are two peaks, in April through May and October through November, corresponding to the wheat and rice crop burning seasons, respectively49.

    Crops: we gather data on average crop area, production and yield in 2010—the latest year in which micro-level crop data are available before the start of our analysis sample (2012–2022). These data are retrieved from Food and Agriculture Organization of the United Nations (FAO) maps (https://mapspam.info/index.php/methodology/). They have a resolution of 5 × 5 arcmin, corresponding approximately to the size of a grid cell in our data.

    Winds: the data on wind direction and speed are from the European Centre for Medium-Range Weather Forecasts (ECMWF; https://www.ecmwf.int/en/forecasts/datasets) and are precise at a 25 × 25 km resolution. We proxy for the expected wind direction of bureaucrats in a given month–year by computing ten-year rolling averages of wind direction in each month and year by using wind data starting in the 2000s. We plot actual and rolling averages of wind directions in Supplementary Fig. A.1 (right). We can see that rolling averages closely follow the actual wind direction. The two major wind switches happen on average in June and August each year.
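    This rolling-average proxy can be sketched as follows (an illustration only, with hypothetical function names; wind direction is in degrees and is averaged on unit vectors so that, for example, 350° and 10° average to 0° rather than 180°):

    ```python
    import numpy as np

    def circular_mean_deg(angles_deg):
        """Mean of angles in degrees, computed on unit vectors so that
        directions near 0/360 average correctly."""
        rad = np.deg2rad(np.asarray(angles_deg, dtype=float))
        return float(np.rad2deg(np.arctan2(np.sin(rad).mean(), np.cos(rad).mean()))) % 360.0

    def expected_direction(history, month, year, window=10):
        """Proxy for the direction expected in (month, year): the circular
        mean over the previous `window` years of that calendar month.
        `history` maps (month, year) -> observed direction in degrees."""
        past = [history[(month, y)] for y in range(year - window, year)
                if (month, y) in history]
        return circular_mean_deg(past)
    ```

    The circular mean matters here because naive arithmetic averaging of directions that straddle 0°/360° would point the proxy in roughly the opposite direction.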

    Administrative borders: we use the latest 2017 census boundaries for Pakistan, which are available on the Humanitarian Data Exchange website50. For the Indian administrative boundaries, we rely on the latest version generated by Community Created Maps of India51. These maps compile information from various government websites to create district-level polygons. During our analysis period, some districts underwent splits: Charki Dadri was carved out from Bhiwani in Haryana on 1 December 2016, and Malerkotla district was bifurcated from Sangrur in Punjab on 14 May 2021. We account for these splits by modifying our shapefiles accordingly. Our main sample of analyses comprises states in which the problem of crop stubble burning is particularly salient. These states are Punjab in Pakistan, and Punjab, Haryana, Uttar Pradesh, Delhi and Bihar in India. These are the same states considered in other papers focusing on air pollution in South Asia5.

    Combining all the data at a 5 km2 grid-cell level: we combine the above datasets into a single gridded dataset of 5 km2 cells. Wind and fire grids do not overlap precisely. To reduce errors in the assignment of fires to wind grids, we split the 25 × 25 km wind grid cells into squares of 5 km2 (around 2.25 km × 2.25 km in size). This guarantees that, even if wind and fire grids do not perfectly overlap, data on fires are assigned to the wind grid cell that contains most of the fire grid cell. Splitting wind grids into smaller parts also enables us to address a second measurement concern: some grid cells might overlap across jurisdiction boundaries. By reducing their size, we minimize the amount of misattribution and are able to assign each grid cell to the appropriate jurisdiction. Note that any remaining error should bias our findings towards zero, as these errors are uncorrelated with the dependent and independent variables. In total, 11% of the grid cells fall across more than one district. These are assigned to the district with the larger overlapping area (and we show that our results are robust to dropping them from the analysis). For crop data, we assign the pre-period crop production to each 5 km2 grid cell. This enables us to classify which crops are produced in each grid-cell location. Figure 2a illustrates our final gridded dataset for Hardoi District in Uttar Pradesh, where each square refers to a 5 km2 grid cell. Within the sample area, we observe 149,830 grid cells in each month and year from September 2012 to August 2022, for a total of 17,979,600 observations. Supplementary Table A.1 presents descriptive statistics on these data.
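    The splitting-and-assignment step can be sketched as follows (a simplified illustration in planar kilometre coordinates; `split_cell` and `assign_fire` are hypothetical names, and real grids would be handled in a projected GIS system):

    ```python
    import numpy as np

    def split_cell(x0, y0, size_km=25.0, n=11):
        """Split one wind grid cell (lower-left corner at x0, y0) into an
        n x n lattice of subcells of roughly (size_km / n)^2 ~ 5 km^2 each.
        Returns a list of (x0, y0, step) squares, row-major in i then j."""
        step = size_km / n
        return [(x0 + i * step, y0 + j * step, step)
                for i in range(n) for j in range(n)]

    def assign_fire(fx, fy, subcells):
        """Return the index of the subcell whose square contains the fire
        centroid, so each fire is attributed to the subcell that holds
        most of its (smaller) fire grid cell."""
        for idx, (x0, y0, step) in enumerate(subcells):
            if x0 <= fx < x0 + step and y0 <= fy < y0 + step:
                return idx
        return None
    ```

    With 25 km cells split 11 ways per side, each subcell is about 2.27 km across, matching the roughly 5 km2 squares described above.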

    Air pollution: we use data on air pollution from the Copernicus Atmosphere Monitoring Service (CAMS; https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-global-reanalysis-eac4?tab=overview), measuring the concentration in the air of particulate matter with a diameter of 2.5 μm or less (PM2.5). The data are available at a resolution of approximately 80 km. In Supplementary Fig. A.2, we map the total concentration of PM2.5 in the air in 2014. The north of the subcontinent is covered by a dense cloud of air pollution (darker red), unlike any other parts of the globe.

    PM2.5 data from CAMS are valuable as they offer high-resolution information on air quality in all the regions of our sample. However, these data are estimates obtained by combining satellite observations, ground-based measurements and atmospheric models. To validate these estimates, we gather data that are purely ground based: measurements of air pollution from air monitors52. The database contains data from 2,110 monitors operated by the Central Pollution Control Board (CPCB) in the period 2015–2019. These data cover 76 CAMS grids out of the 2,156 in our entire study area, and not in all months, making them too sparse for use as a regressor. We instead use air monitors to validate CAMS estimates in Supplementary Fig. E.2, which shows a positive correlation (0.726) between the total pollution recorded each month by the monitoring stations and CAMS estimates.

    Bureaucratic punishment: for India, penalty data come from the Socioeconomic High-resolution Rural-Urban Geographic Platform for India (SHRUG) database (https://www.devdatalab.org/judicial-data), which contains all judicial case records from the Indian eCourts platform. From this database, we extracted all judicial cases involving a violation of the Air (Prevention and Control of Pollution) Act of 1981, which covers crop burning offences, for 2010–2018, the years during which the data overlap with our sample. Measurement error in these data should bias our results downwards. In the case of Pakistan, we obtained new data on police records of criminal activity from the Central Police Office in Lahore, called First Information Reports (FIRs), that relate specifically to crop burning offences for the years 2017–2022.

    Bureaucratic control of fires

    Treatment definition

    As discussed above, the treatment turns on when smoke from a potential fire in a grid cell would pollute the home district of the administrator. For each month–year, we proxy for administrators’ expectations of actual wind directions by using the ten-year rolling average of wind direction for that month. In practice, these beliefs are formed through long-term local knowledge, as well as through weather data that are made available as part of bureaucrats’ routine work53.

    Once we have this measure of wind direction, we use it to calculate the area of the jurisdiction that is downwind, as well as upwind, from each grid cell in each month. We do this by drawing a line that passes through the grid-cell centroid and is perpendicular to the wind direction vector, such that the district is split into a downwind area and an upwind area54. Using this definition, a grid cell is treated in a month–year if the area of the jurisdiction downwind from the grid cell is larger than the area upwind. Figure 2b provides a visual illustration of this treatment definition. Figure 2c illustrates the actual treatment status variation in different months.
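    A minimal sketch of this treatment rule, approximating the district polygon by uniform sample points so that the downwind/upwind area comparison reduces to a dot product with the wind vector (hypothetical names; a production version would split the actual polygon geometry):

    ```python
    import numpy as np

    def is_treated(district_points, cell_centroid, wind_dir_deg):
        """Treated if the area of the district downwind of the grid cell
        exceeds the area upwind. The district is approximated by uniform
        sample points; a point is downwind if it lies on the side of the
        line through the cell centroid perpendicular to the wind, in the
        direction the wind blows."""
        theta = np.deg2rad(wind_dir_deg)
        wind = np.array([np.cos(theta), np.sin(theta)])
        rel = np.asarray(district_points, dtype=float) - np.asarray(cell_centroid, dtype=float)
        downwind = (rel @ wind) > 0  # positive projection onto the wind vector
        return bool(downwind.sum() > (~downwind).sum())
    ```

    With uniform sampling, the share of points on each side converges to the share of district area on each side, which is what the treatment definition compares.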

    Estimation

    $${\mathrm{Fires}}_{it}=\mathop{\sum }\limits_{r=-s}^{z}{\zeta }_{r}{D}_{ir}+{\alpha }_{i}+{\gamma }_{j}\times {\theta }_{t}+\delta {{\bf{X}}}_{it}+{\nu }_{it}$$

    (1)

    where our dependent variable Firesit is the number of fires in a grid cell and month. The coefficient of interest ζr is the effect of treatment in each period r before and after treatment, relative to the switching month–year (the omitted category). For example, if changes in wind direction convert grid cell i from control to treatment in May, relative time will take value 0 in May, +1 in June, −1 in April, and so on. We restrict attention to observations in a six-month window around treatment, where the vast majority of our sample lies (Supplementary Fig. A.3), and drop treated observations in the first period in the sample, September 2012, as their relative time cannot be established.
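    The relative-time coding can be sketched for a single grid cell's monthly treatment path (a simplified illustration with a hypothetical helper; it handles only the first switch into treatment):

    ```python
    def relative_time(treat_path, window=6):
        """Given a list of 0/1 treatment statuses by calendar month for one
        grid cell, return {month_index: relative_time} around the first
        switch into treatment (relative time 0 at the switching month),
        restricted to a +/- `window` month band."""
        switch = next((t for t in range(1, len(treat_path))
                       if treat_path[t] == 1 and treat_path[t - 1] == 0), None)
        if switch is None:
            return {}  # never switches: contributes no event-time variation
        return {t: t - switch for t in range(len(treat_path))
                if abs(t - switch) <= window}
    ```

    For a path switching in the fourth month, the month before the switch gets −1, the switch month 0, and the following month +1, matching the example in the text.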

    The high dimensionality of our data allows for the inclusion of fine-grained fixed effects, although as we show in the Supplementary Information, the treatment effect size is quite stable when these are included. First, αi are 5 km2 grid-cell fixed effects that absorb time-invariant factors for over 149,000 grid cells in our data, such as crop suitability or economic characteristics of small locations. Second, γj × θt are jurisdiction by month–year fixed effects that account for time-varying shocks at the district level, such as changes in administrative personnel. X is a vector of controls that includes average wind speed and direction. Standard errors are clustered at the grid cell and month–year by district levels to account for temporal dependence of observations as well as spatial correlation in wind.

    In addition to the event study design, we also run a difference in differences strategy. For a 5 km2 grid-cell i in jurisdiction (district) j and month–year t, we estimate:

    $${\mathrm{Fires}}_{it}=\beta \,{\mathrm{Downwind}}_{it}+{\alpha }_{i}+{\gamma }_{j}\times {\theta }_{t}+\delta {{\bf{X}}}_{it}+{\epsilon }_{it}$$

    (2)

    The coefficient of interest β captures the overall effect of treatment on the number of fires in that grid cell. Downwind represents the treatment variable: a grid switches to treatment when, depending on wind direction, it would expose most of the home district to smoke. Fixed effects, controls and clustering are as in the event study specification described above.

    Action and inaction in the control of fires

    We build on the above specification to separate the effect of action and inaction towards the control of fires. For each grid cell, we calculate the distance from its respective borders and code whether these borders are upwind or downwind in a given month. This yields a grid-border–grid-cell pair database with spatial variation in the distance of each grid cell to the border and monthly variation in whether pollution exits or enters the border corresponding to each grid cell.

    The key advantage of this strategy is that we can separately estimate the effort bureaucrats exert to prevent fires in areas close to upwind borders from the effort not exerted to prevent fires in areas close to downwind borders. Calling a border segment b, we estimate, for each segment b corresponding to grid-cell i in month–year t:

    $${\mathrm{Fires}}_{ibt}=\mathop{\sum }\limits_{q=1}^{5}{\kappa }_{q}\,{\mathrm{Distance}}_{it}^{q}+\mathop{\sum }\limits_{q=1}^{5}{\lambda }_{q}\,{\mathrm{Distance}}_{it}^{q}\times {\mathrm{DownwindNeighbour}}_{ibt}+{\mu }_{i}+{\eta }_{b}\times {\theta }_{t}+{\nu }_{ibt}$$

    (3)

    where q are quintiles of the distance of each grid cell to the bordering district segment b, and we control for grid cell and for border times month–year fixed effects. We use the fifth quintile (the farthest) as the omitted category. We are interested in two coefficients: κ captures the effect of proximity to the upwind border, an area where bureaucrats have the largest incentives to curb fires, as those would affect their own district the most. Instead, λ captures the differential effect of what happens as we approach the downwind border, which might be the same location as an upwind area in other periods. Our expectation is that bureaucrats have little incentive to act on fires when those are located close to downwind borders, as smoke would exit their district. The logic of this estimation is further illustrated in Fig. 3a, where κ coefficients are represented in blue and λ in red.
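    The distance quintiles entering equation (3) can be constructed as in this sketch (hypothetical helper; the farthest quintile serves as the reference category in the regression):

    ```python
    import numpy as np

    def distance_quintiles(distances):
        """Assign each grid cell's distance-to-border to quintiles 1..5,
        with 5 the farthest fifth of cells. Cut points are the 20th, 40th,
        60th and 80th percentiles of the distance distribution."""
        d = np.asarray(distances, dtype=float)
        cuts = np.percentile(d, [20, 40, 60, 80])
        return 1 + np.searchsorted(cuts, d, side='left')
    ```

    In the regression itself, each quintile 1–4 would enter as a dummy (alone and interacted with the downwind-neighbour indicator), with quintile 5 absorbed.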

    Bureaucratic action and farmer deterrence

    We code a new district-level variable called PostPunishment, taking value one if a farmer was punished in a district in the previous 12 months. To stay consistent with our calendar-month analysis, we start the 12-month window in the filing month itself if the penalty was filed before the 15th day of the month; otherwise, it starts the following month. We use this variable in a triple difference-in-differences event study design by interacting it with the down > up treatment dummy from equation (1). We show the event study results in Fig. 3c. We also plot these results in 1-month windows and find consistent, but noisier, results (Supplementary Fig. D.1). Finally, we show that these results are robust in a simple triple difference-in-differences setup in Supplementary Table D.2.
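    The 12-month window rule can be sketched as follows (a hypothetical helper built directly on the description above):

    ```python
    from datetime import date

    def punishment_window(filing, months=12):
        """Return the list of (year, month) pairs during which
        PostPunishment = 1 for a penalty filed on `filing`. The window
        starts in the filing month itself if the penalty was filed before
        the 15th, and in the following month otherwise."""
        y, m = filing.year, filing.month
        if filing.day >= 15:  # filed on/after the 15th: start next month
            y, m = (y + 1, 1) if m == 12 else (y, m + 1)
        out = []
        for _ in range(months):
            out.append((y, m))
            y, m = (y + 1, 1) if m == 12 else (y, m + 1)
        return out
    ```

    A penalty filed on 10 November 2016 thus flags November 2016 through October 2017, while one filed on 20 November 2016 flags December 2016 through November 2017.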

    Mortality impacts of crop burning

    Linking fires to downstream recipients of pollution

    We measure infant and child mortality using geolocalized individual-level health information from surveys collected by the Demographic and Health Surveys (DHS) program (https://dhsprogram.com/data/) in India (2015–2016 and 2019–2021) and Pakistan (2016–2017). The DHS data enable us to match each pregnancy to a specific small location, called a DHS cluster, across India and Pakistan. In total, we have 542,150 pregnancies across 58,103 DHS clusters for which we observed child, mother and family characteristics.

    For each birth, we calculate the pollution exposure from crop burning fires as the total number of particles coming from fires started in our analysis sample (Punjab in India and Pakistan, Haryana, Delhi, Uttar Pradesh and Bihar) to the home DHS cluster location of the birth, for the expected length of the pregnancy. This process enables us to link pollution particles from each upwind fire to pregnancies, thereby generating a precise measure of the expected air pollution from crop burning in our sample for each pregnant mother. Besides the number of particles received in the home DHS cluster location of the pregnancy, we also calculate the total PM2.5 pollution experienced in that location using satellite data from the CAMS.

    We proceed to collapse the data at the pregnancy level, calculating the total local air pollution, as well as total particles from crop burning fires received by each child in utero. Children born in the same location and from a similar family will experience different levels of in utero pollution from PM2.5 depending on the timing of the pregnancy.
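    The pregnancy-level collapse can be sketched as follows (an illustration with a hypothetical helper; the paper's actual particle-trajectory linkage from each upwind fire is far more involved):

    ```python
    import numpy as np

    def in_utero_exposure(monthly_particles, conception_idx, months=9):
        """Total fire particles received at the mother's DHS cluster over
        the expected 9 months of pregnancy, starting at `conception_idx`
        in the cluster's monthly particle series."""
        series = np.asarray(monthly_particles, dtype=float)
        return float(series[conception_idx:conception_idx + months].sum())
    ```

    Two children from the same cluster with different conception months sum different slices of the same series, which is exactly the within-location timing variation the design exploits.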

    Estimation strategy

    Our quantity of interest is the impact of air pollution (PM2.5 concentration) caused by crop burning on child and infant mortality. This quantity is generally confounded by characteristics of the location, period of birth and of the family, which might be correlated with both air pollution and worse health outcomes. We can address these concerns in our setting using time variation in exposure to pollution. We consider two models. First, we use a reduced form regression of the impact of total crop burning fire particles on mortality outcomes. Second, we capture the impact that fires have on overall levels of air pollution using an instrumental variable strategy. For this, we first estimate the impact of total fire particles on PM2.5 concentration experienced by each child in utero (first stage). We then estimate the impact of the predicted air pollution from fires on children’s mortality (second stage). For each child (c) born in district d during month–year t, we estimate:

    $$(\text{first stage})\,{\rm{PM}}{2.5}_{ct}=\pi {{\rm{FireParticles}}}_{ct}+{\xi }_{d}+{\tau }_{t}+{\phi }_{b}+\sigma {X}_{t,dhs}+{\upsilon }_{ct}$$

    (4)

    $$(\text{second stage})\,{\mathrm{Mortality}}_{ct}=\rho \,{\widehat{\mathrm{PM2.5}}}_{ct}+{\xi }_{d}+{\tau }_{t}+{\phi }_{b}+\sigma {X}_{t,dhs}+{\omega }_{ct}$$

    (5)

    Both the first and second stage control for district ξ, month–year of birth τ, and birth order ϕ fixed effects. All regressions also control for the total number of fires during the pregnancy within a 10 km range of the DHS cluster, X: while most of the particles affecting a pregnancy are carried from different and farther locations, fires originating around the DHS cluster are likely to have a larger impact on exposure, and locations with more fires might differ on average from other places. Our coefficient of interest, ρ, identifies the effect of exposure to fire-related air pollution in utero on child and infant mortality. We provide evidence consistent with the validity of the IV identifying assumptions in Supplementary Table E.1 (c), which shows a strong first stage of fire particles on PM2.5 concentration, and Supplementary Fig. E.3, which shows that their relationship is monotonic. The extent to which a pregnancy is affected by pollution particles from a specific fire is likely as-if random, because it is determined by a host of upwind meteorological factors interacting with the timing of the pregnancy. Finally, the case for the exclusion restriction in our setting is strong because it is unlikely that particles from fires affect mortality through channels other than air pollution.
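    A minimal numerical sketch of the two-stage logic in equations (4) and (5), omitting the fixed effects and using plain least squares (the `two_sls` helper is hypothetical; real estimation would absorb district, month–year and birth-order fixed effects and cluster standard errors):

    ```python
    import numpy as np

    def two_sls(y, x_endog, z_instr, exog):
        """Just-identified 2SLS: regress the endogenous regressor (PM2.5)
        on the instrument (fire particles) plus exogenous controls, then
        regress the outcome (mortality) on the fitted values and the same
        controls. Returns the coefficient on the instrumented regressor
        (rho in the paper's notation)."""
        Z = np.column_stack([z_instr, exog])
        first = np.linalg.lstsq(Z, x_endog, rcond=None)[0]
        x_hat = Z @ first                      # first-stage fitted values
        X2 = np.column_stack([x_hat, exog])
        second = np.linalg.lstsq(X2, y, rcond=None)[0]
        return float(second[0])
    ```

    On synthetic data with a confounder that biases OLS, this recovers the true coefficient because the instrument shifts exposure independently of the confounder.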

    Linking bureaucratic incentives with mortality outcomes

    In the main section, we have shown that (1) bureaucratic strategic behaviour impacts the incidence of fires, and that (2) fires have a substantial impact on child and infant mortality. We then connect these two findings to assess how much of the increase in mortality determined by fires can be attributed to jurisdictional incentives. We do a back-of-the-envelope calculation to understand what would happen if bureaucrats had the same incentives and capacity across the entire district area, and exerted control in the downwind parts of their districts the same way they do in the upwind parts. Here we describe the details of our calculation.

    We have shown in Supplementary Table B.1 that treatment reduces fires by 10–13% on average in the upwind part of the district, or by approximately 5–7.5% in the full district. We have also estimated that a 1-log increase in particles from fires over the 9 months of the pregnancy raises child mortality by up to 36 deaths per 1,000 births (Supplementary Table E.1). Now, consider what would happen if exposure to fires in utero dropped by the extent that bureaucrats can affect fires in the district. If bureaucrats, given proper incentives and capacity, reduced crop burning in the downwind parts of their district to match the upwind parts, child mortality would fall by 1.8 to 2.7 deaths per 1,000 births, which corresponds to a 4.4 to 6.6% reduction relative to average child mortality in the sample. Of course, a problem with bureaucratic action is the lack of capacity, but this scenario illustrates one way in which such effects could be achieved: creating slack in bureaucratic capacity by prioritizing environmental management.
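    The arithmetic above can be reproduced in a few lines (a minimal illustration of the paper's own numbers; `mortality_reduction` is a hypothetical helper, and the calculation treats a small percentage drop in fires as an equal drop in log particle exposure):

    ```python
    def mortality_reduction(fire_drop_pct, deaths_per_log=36.0):
        """Back-of-the-envelope: a fire_drop_pct reduction in fires is,
        for small changes, roughly a fire_drop_pct/100 drop in log
        particle exposure; scale by the estimated 36 deaths per 1,000
        births per 1-log increase in particles."""
        return fire_drop_pct / 100.0 * deaths_per_log

    # 5-7.5% district-wide fire reductions map to about 1.8-2.7 fewer
    # deaths per 1,000 births
    low, high = mortality_reduction(5.0), mortality_reduction(7.5)
    ```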

    Reporting summary

    Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


  • Mortality caused by tropical cyclones in the United States


    Data

    Wind speed data

    We measure TC incidence using the LICRICE model (version 4)28. LICRICE is a parametric wind-field model that estimates the maximum sustained winds experienced by every location throughout the lifetime of each TC recorded in the International Best Track Archive for Climate Stewardship (IBTrACS) database2,29,31,37,53,54,55,56. LICRICE uses observed maximum wind speeds to reconstruct wind fields throughout the storm based on internal storm structure, location, and each storm’s observed translational velocity. Many summaries of storm wind incidence can be constructed using LICRICE, such as integrated power dissipation28,29; however, prior analyses have shown that, among numerous parsimonious metrics previously analysed, the maximum wind experienced within storms is the most predictive of social and economic outcomes. This result is consistent with most components of built structures failing catastrophically once a threshold of stress is exceeded. LICRICE does not explicitly model storm surge, rainfall or flooding; however, these dimensions of impact are captured in the analysis to the extent that they are correlated with integrated maximum wind speeds. For example, we find that LICRICE wind speed and NCEP rainfall from TCs are correlated within storms at the pixel level (Supplementary Fig. 3) and at the state-by-storm level (R2 = 0.31, P < 0.001; Fig. 1e) for a limited sample of storms for which granular rainfall data are available. Iterations of the LICRICE model have been used to measure various social and economic impacts of TCs, including direct deaths and damages29, changes to household income and expenditure2, infant mortality2, GDP growth28,37 and depreciation3. We also compare our measure of wind speed against total direct national economic damages (normalized by GDP) from a limited sample of TCs estimated by Nordhaus51. We find that national wind speed exposure is a meaningful predictor of total national damages (Fig. 1d and Supplementary Fig. 2), although we note that this outcome is highly uncertain and widely understood to be biased.

    Here we use a new reconstruction of incidence at the sub-national level within CONUS. We reconstruct incidence for each 0.125° × 0.125° pixel of CONUS in each of 1,230 Atlantic storms between 1930 and 2015. Supplementary Fig. 1 shows decadal averages of this output (four example maps are also shown in Fig. 1a), illustrating the TC climatology for CONUS; however, these aggregates over time are not themselves used in subsequent analysis.

    To match state-level mortality data, TC incidence is collapsed from pixels to states for every month. If multiple storms impact a cell within a month, the maximum incidence at the cell level is recorded, and monthly averages are computed across pixels in each state. This spatial averaging causes our measures of incidence to be substantially lower than the maximum sustained wind speed commonly reported for storms, since only a small number of pixels experience those extreme conditions within each storm. We note that TC events generate the highest average state wind speeds compared to wind speeds from other intense storm phenomena, such as tornadoes. For reference, the minimum monthly state average wind speed we compute from TCs in our sample is 3.34 × 10−4 m s−1 and the 1st percentile is 8.4 × 10−3 m s−1. By contrast, the maximum monthly state average wind speed from tornadoes in CONUS between 1950 and 2022 is 9.6 × 10−4 m s−1 and the 99th percentile is 3.8 × 10−5 m s−1, and for non-TC and non-tornado wind/hail events the maximum is 1.1 × 10−3 m s−1 and the 99th percentile is 1.3 × 10−4 m s−1. Therefore, the maximum non-TC wind events are comparable to the minimum (non-zero) TC events; absent a TC, states do not experience average wind speeds of a magnitude similar to those from TCs.
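    The collapse from pixels to a state-month value can be sketched as follows (hypothetical helper; real data carry pixel, storm and state identifiers):

    ```python
    import numpy as np

    def state_month_incidence(winds):
        """Collapse pixel-by-storm maximum winds to one state-month value:
        take the max across storms at each pixel (multiple storms in a
        month), then average across the state's pixels. `winds` has shape
        (n_storms, n_pixels), in m/s."""
        w = np.asarray(winds, dtype=float)
        return float(w.max(axis=0).mean())
    ```

    Because the mean is taken over all of a state's pixels, including the many far from the storm track, the resulting state value sits well below the headline maximum sustained wind, as described above.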

    Prior analysis by Hsiang & Jina3 demonstrated that spatial aggregates of TC exposure can be used as independent variables in a regression framework to obtain unbiased average effects that are expressed at finer spatial resolutions (footnote 13 on pages 16–17 of ref. 3). As long as there is no systematic correlation between the average intensity of a storm and the likelihood that the most intense regions within that storm strike the most populated (or economically active or vulnerable) pixels within a state, regression coefficients will not be biased by spatial aggregation. This condition would be violated if, for example, there were systematic patterns such that the eyes of Category 3 hurricanes tended to pass directly over dense cities while the eyes of Category 2 hurricanes tended to miss them. However, given that the paths of storms are primarily controlled by random steering winds at high altitude, interacting with the beta-effect induced by the Earth’s meridional vorticity gradient, we have strong reason to believe that the spatial distribution of TC incidence within each state is orthogonal to the spatial distribution of underlying populations, and further that this covariance is independent of average TC intensity. Thus far, we know of no evidence that stronger (or weaker) storms systematically strike more vulnerable locations on land.

    Of the 1,230 TCs that we reconstruct, 501 come within 250 km of a CONUS coastline. Intersecting these storms with state boundaries generates a total of 3,317 state-by-TC events. These longitudinal data reveal rich variation in the timing and intensity of TC incidence for individual states52 (Extended Data Fig. 1). Within-state variation in incidence season-to-season and month-to-month provides substantial variation in TC impulses that enables us to identify the impulse-response of mortality empirically.

    All-cause mortality data

    We analyse all-cause mortality at the state-year-month level between 1930 and 2015 using data from multiple sources. Data from 1900 to 2004 were digitized and assembled by Barreca et al.31 in their study identifying the impact of temperature on mortality; according to Barreca et al., these are the most comprehensive mortality data assembled in this context. The remaining data were assembled by the authors using the CDC Underlying Cause of Death database. Data prior to 1959 were digitized from the Mortality Statistics of the United States annual volumes and are not otherwise available in a machine-readable format; consequently, data for years prior to 1959 do not include cause of death (for example, cardiovascular disease) or demographic information (for example, age 1–44, Black)31. From 1959 to 2004, the data are from the machine-readable MCOD files, which include cause of death and demographic data.

    For the years 2005–2015, we analyse mortality data from the public CDC Underlying Cause of Death database (2017). The data are based on death certificates for U.S. residents, which give a single underlying cause of death and demographic data. Cause of death was indexed using four-digit ICD-9 codes prior to 2000 and four-digit ICD-10 codes from 2000 onwards. We harmonized the causes of death into five categories that matched the cause-of-death variables from Barreca et al. We also construct a sixth category, called ‘other’, which is the difference between all-cause mortality and the sum of the five cause-specific categories. Notably, the change in CDC ICD code methodology resulted in a shift in the counts of deaths from specific causes, particularly infectious diseases and cardiovascular disease.

    To account for differences in underlying age-specific mortality, we decomposed the effect of TCs on all-cause mortality by four age groups in the data: <1, 1–44, 45–64, and 65+ years of age. We were limited to these age groups because they are the designations in our historical data. We computed mortality with respect to the underlying population for these same four age groups. We compute mortality by race with respect to the population by race in each state and year. Black and white are the only race categories available for the entire sample. Extended Data Fig. 3 shows the monthly all-cause mortality rate, and our predicted monthly mortality rate, for all the states in our sample.

    Direct deaths from TCs

    Direct deaths from TCs are deaths that are officially attributed to a storm by the US government. We combine official death counts from two NOAA data sources. For storms between 1950 and 1996 we use the NOAA National Hurricane Center and Central Pacific Hurricane Center’s Hurricane in History6. Storms from 1997 to 2015 are from the NOAA Storm Events Database7.

    Population data

    We normalize state mortality by the population (per 100,000 people) in the state each month. Similar to the all-cause mortality data, these data must be combined from multiple sources. Pre-1968 population estimates are from Haines57; estimates for 1969–2000 are from the National Cancer Institute (2008); estimates for 2000–2010 are from the US Census Bureau, Intercensal Population and Housing Unit Estimates: 2000 to 2010 (ref. 58); estimates for 2010–2017 are from the US Census Bureau, US Population Estimates59.

    Temperature data

    Average monthly temperature data are from Berkeley Earth Surface Temperatures (BEST) land surface air temperature. BEST provides a monthly mean of average, minimum and maximum surface air temperature over land covering 1753 to the present56,60. The temperature data are based on a large inventory of observations from over 30,000 weather stations. Using these observations, gridded temperature fields are reconstructed statistically, incorporating the reliability of individual weather stations and the spatial variability of temperature56,60. Gridded BEST temperature data are then spatially aggregated, weighted by population, to the state-month level.
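    The population-weighted aggregation can be sketched as follows (hypothetical helper operating on the grid cells belonging to one state in one month):

    ```python
    import numpy as np

    def popweighted_state_mean(temps, pops):
        """Population-weighted mean of gridded monthly temperatures over
        the cells belonging to one state: sum of temperature times
        population, divided by total population."""
        t = np.asarray(temps, dtype=float)
        p = np.asarray(pops, dtype=float)
        return float((t * p).sum() / p.sum())
    ```

    Population weighting means the state value tracks the temperatures people actually experience rather than the state's land area.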

    Analysis

    The econometric approach that we apply here is a top-down strategy, commonly called a ‘reduced-form’ analysis, that describes the overall net change of an aggregate outcome y (mortality) in response to exogenous treatments z (TC incidence). Under suitable conditions, this approach can identify causal effects on the outcome y induced by exogenous changes in independent variable z without explicitly describing all underlying mechanisms that link z to y, without observing intermediary variables x (for example, retirement savings accounts or healthcare infrastructure) that might link z to y, or without explicitly tracking other determinants of y unrelated to z (such as demographic trends or health policy), denoted w (refs. 23,61,62). Let f(•) describe a complex and unobserved process that generates state-level mortality rates \({y}_{{t}_{2}}\), occurring at time t2 based on x, w and z that occur both at times t1 and t2 (t1 < t2):

    $${y}_{{t}_{2}}=f({z}_{{t}_{2}},{x}_{{t}_{1}}^{1}({z}_{{t}_{1}}),\ldots ,{x}_{{t}_{1}}^{K}({z}_{{t}_{1}}),\,{x}_{{t}_{2}}^{1}({z}_{{t}_{1}},{z}_{{t}_{2}}),\ldots ,{x}_{{t}_{2}}^{K}({z}_{{t}_{1}},{z}_{{t}_{2}}),\,{w}_{{t}_{1}\,}^{1}\ldots \,{w}_{{t}_{1}\,}^{J},\,{w}_{{t}_{2}\,}^{1}\ldots \,{w}_{{t}_{2}}^{J})$$

    (1)

    where \({x}_{{t}_{1}}^{k}\left({z}_{{t}_{1}}\right)\) indicates that the kth factor xk, which influences mortality rates y, at time t1 is itself affected by TCs at time t1. At time t2, xk may be influenced both by TCs at t2 and those that occur in the past at t1. Here, we let there be K pathways through which y is impacted by intermediary variables (x) and J ways through which determinants unrelated to TCs (w) impact y.

    In this framework, the direct mortality impact of TC incidence usually reported by government agencies is the partial derivative:

    $$direct{\rm{\_}}{deaths}_{{t}_{2}}=\frac{{\rm{\partial }}{y}_{{t}_{2}}}{{\rm{\partial }}{z}_{{t}_{2}}}$$

    (2)

    which are the deaths that occur contemporaneously and directly as a result of the geophysical event itself, holding fixed all other factors.

    In this analysis, we directly estimate the total change in mortality that results from TC incidence in both current and prior moments in time, allowing for the possibility that changes in other factors that are influenced by TCs may indirectly affect mortality. In this case, the overall total change in mortality from TC incidence in t2 is the total derivative:

    $$\frac{{\rm{d}}}{{\rm{d}}{z}_{{t}_{2}}}({mortality{\rm{\_}}rate}_{{t}_{2}})=\frac{{\rm{d}}{y}_{{t}_{2}}}{{\rm{d}}{z}_{{t}_{2}}}=\mathop{\underbrace{\frac{{\rm{\partial }}{y}_{{t}_{2}}}{{\rm{\partial }}{z}_{{t}_{2}}}}}\limits_{{\rm{d}}{\rm{i}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}\,{\rm{d}}{\rm{e}}{\rm{a}}{\rm{t}}{\rm{h}}{\rm{s}}}+\mathop{\underbrace{\mathop{\sum }\limits_{k=1}^{K}\frac{{\rm{\partial }}{y}_{{t}_{2}}}{{\rm{\partial }}{x}_{{t}_{2}}^{k}}\frac{{\rm{\partial }}{x}_{{t}_{2}}^{k}}{{\rm{\partial }}{z}_{{t}_{2}}}}}\limits_{{\rm{i}}{\rm{n}}{\rm{d}}{\rm{i}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}\,{\rm{d}}{\rm{e}}{\rm{a}}{\rm{t}}{\rm{h}}{\rm{s}}}$$

    (3)

    which includes both direct deaths and deaths that result from any of the K possible pathways that depend on the intermediate variables xk at time t2. Empirically, we find that direct deaths are much smaller than indirect deaths in CONUS.

    In addition, we also account for the possibility that deaths are delayed relative to TC incidence. Because direct deaths are usually tabulated immediately following storms, there are negligible direct deaths that are delayed. However, once we begin considering indirect deaths, it becomes possible for substantial delays to emerge due to the dynamics of different pathways xk. In our analysis, we also estimate the total deaths that occur at time t2 as a result of TC incidence that occurs at an earlier time t1, which is the total derivative

    $$\frac{{\rm{d}}}{{\rm{d}}{z}_{{t}_{1}}}({mortality{\rm{\_}}rate}_{{t}_{2}})=\frac{{\rm{d}}{y}_{{t}_{2}}}{{\rm{d}}{z}_{{t}_{1}}}=\mathop{\underbrace{\mathop{\sum }\limits_{k=1}^{K}\frac{{\rm{\partial }}{y}_{{t}_{2}}}{{\rm{\partial }}{x}_{{t}_{2}}^{k}}\frac{{\rm{\partial }}{x}_{{t}_{2}}^{k}}{{\rm{\partial }}{z}_{{t}_{1}}}}}\limits_{{\rm{i}}{\rm{n}}{\rm{d}}{\rm{i}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}\,{\rm{v}}{\rm{i}}{\rm{a}}\,{x}_{{t}_{2}}}+\mathop{\underbrace{\mathop{\sum }\limits_{k=1}^{K}\frac{{\rm{\partial }}{y}_{{t}_{2}}}{{\rm{\partial }}{x}_{{t}_{1}}^{k}}\frac{{\rm{\partial }}{x}_{{t}_{1}}^{k}}{{\rm{\partial }}{z}_{{t}_{1}}}}}\limits_{{\rm{i}}{\rm{n}}{\rm{d}}{\rm{i}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}\,{\rm{v}}{\rm{i}}{\rm{a}}\,{x}_{{t}_{1}}}$$

    (4)

    This expression does not contain a term for direct deaths, but it contains two summations which capture the effects of past TC incidence \(({{z}_{t}}_{1})\) on current mortality \((\,{y}_{{t}_{2}})\) via past intermediate variables \(({x}_{{t}_{1}})\) and current intermediate variables \(({x}_{{t}_{2}})\). In practice, we explore the possibility of indirect effects that emerge over the course of 240 months following TC incidence; one could generalize this framing to a corresponding number of summations.

    The possibility of delayed indirect deaths has two major implications regarding how indirect mortality is estimated and how those results are interpreted. First, because TC incidence at multiple points in the past, as well as the present, might affect current mortality, we must account for both the present and past influence of TC incidence simultaneously for each instance of the outcome. This is accomplished via deconvolution25,26,27,56,63,64, implemented here using a distributed-lag model solved via ordinary least squares, detailed below.

    Second, each TC event affects mortality outcomes at multiple points in time; thus, computing the full impact of a TC event requires summing these impacts, which might emerge gradually. In the simplified two-period framework above, the total impact from TC incidence at t1 is then

    $$\frac{{\rm{d}}}{{\rm{d}}{z}_{{t}_{1}}}(mortality{\rm{\_}}{rate}_{{t}_{1}}\,+\,mortality{\rm{\_}}{rate}_{{t}_{2}})\,=\,\frac{{\rm{d}}{y}_{{t}_{1}}}{{\rm{d}}{z}_{{t}_{1}}}\,+\,\frac{{\rm{d}}{y}_{{t}_{2}}}{{\rm{d}}{z}_{{t}_{1}}}$$

    (5)

    which can be expanded further by substituting from the equations above. These terms, if plotted separately, characterize the impulse response of y in reaction to the TC impulse \({z}_{{t}_{1}}\). In our actual analysis, we compute the average cumulative impact of a single TC that occurs at t0 over 240 subsequent months (t1 through t240). Following substitution and simplification, this can be expressed as

    $$\frac{{\rm{d}}}{{\rm{d}}{z}_{{t}_{0}}}\left(\mathop{\sum }\limits_{\ell =0}^{240}{mortality{\rm{\_}}rate}_{{t}_{\ell }}\right)=\mathop{\underbrace{\frac{{\rm{\partial }}{y}_{{t}_{0}}}{{\rm{\partial }}{z}_{{t}_{0}}}}}\limits_{{\rm{d}}{\rm{i}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}\,{\rm{d}}{\rm{e}}{\rm{a}}{\rm{t}}{\rm{h}}{\rm{s}}}+\,\mathop{\underbrace{\mathop{\sum }\limits_{\ell =0}^{240}\mathop{\sum }\limits_{k=1}^{K}\left(\mathop{\sum }\limits_{j=\ell }^{240}\frac{{\rm{\partial }}{y}_{{t}_{j}}}{{\rm{\partial }}{x}_{{t}_{\ell }}^{k}}\right)\frac{{\rm{\partial }}{x}_{{t}_{\ell }}^{k}}{{\rm{\partial }}{z}_{{t}_{0}}}}}\limits_{{\rm{i}}{\rm{n}}{\rm{d}}{\rm{i}}{\rm{r}}{\rm{e}}{\rm{c}}{\rm{t}}\,{\rm{d}}{\rm{e}}{\rm{a}}{\rm{t}}{\rm{h}}{\rm{s}}}$$

    (6)

    which describes the overall total impact of a storm through all pathways across all possible delays ℓ ∈ [0, 240]. Note that neither K nor xk need ever be specified explicitly in our estimation below. This expansion reveals that, accounting for numerous possible pathways operating over different delays, a single TC event can potentially generate a total mortality impact much larger than the direct deaths traditionally reported.

    To compute the overall mortality burden imposed by the TC climate of CONUS, we compute the full mortality response across all age groups in each state, accounting for the incidence of each storm on the state:

    $$\begin{array}{l}mortality{\rm{\_}}burden\,=\,\mathop{\sum }\limits_{{\ell }=0}^{240}\sum _{t\in {\rm{m}}{\rm{o}}{\rm{n}}{\rm{t}}{\rm{h}}{\rm{s}}}\sum _{s\in {\rm{s}}{\rm{t}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{s}}}\sum _{i\in {\rm{s}}{\rm{t}}{\rm{a}}{\rm{t}}{\rm{e}}{\rm{s}}}{population}_{i,t+{\ell }}\\ \,\,\,\,\,\cdot {z}_{sit}\cdot \frac{{\rm{d}}}{{\rm{d}}z}{(mortality{\rm{\_}}rate)}_{i,t+{\ell }}\end{array}$$

    (7)

    where zsit is the TC incidence of storm s on state i in month t, populationi,t+ℓ is the population in state i in month t + ℓ, and \(\frac{{\rm{d}}}{{\rm{d}}z}{(mortality{\rm{\_}}rate)}_{i,t+{\ell }}\) is our estimate for the total impact of TC incidence on the mortality rate in state i in month t + ℓ. In practice, this effect is nonlinear, but it is expressed linearly here for simplicity. Information on state i affects the impulse responses used in these calculations because TC risk is computed by state and affects the structure of the impulse-response function.
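A minimal sketch of the burden calculation in equation (7), treating the marginal effect as linear (as the text notes, the actual effect is nonlinear) and assuming the lag-specific marginal effects have already been estimated; all names and toy dimensions are hypothetical.

```python
import numpy as np

def mortality_burden(population, z, d_mortality_rate, max_lag=240):
    """Sketch of equation (7): sum excess deaths over storms, states, months and lags.

    population:       array [n_states, n_months] of population counts.
    z:                array [n_storms, n_states, n_months] of TC wind incidence.
    d_mortality_rate: array [max_lag + 1] of marginal effects on the mortality
                      rate (deaths per capita per unit wind) at each lag.
    """
    n_storms, n_states, n_months = z.shape
    burden = 0.0
    for lag in range(max_lag + 1):
        for t in range(n_months - lag):
            # Deaths occurring at t + lag attributable to incidence at t.
            burden += np.sum(z[:, :, t] * population[:, t + lag]
                             * d_mortality_rate[lag])
    return burden
```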

    Econometric implementation

    Identification

    Our econometric analysis exploits the quasi-random variation in the location and intensity of TC incidence to estimate the impact of TCs on mortality separately from other known and unknown factors that affect mortality across locations and over time. As described above, this reduced-form approach captures the effect of all possible channels of influence that may increase mortality after a TC33,65. Because the location, timing and intensity of TC incidence is determined by oceanic and atmospheric conditions that are beyond the control of individual states, we assume TC incidence is as good as randomly assigned23,61. For reference, Extended Data Fig. 1 shows the sequence of monthly TC incidence by state for all the states in our sample.

    We note that some early analyses of natural disaster impacts utilized social outcomes (for example, direct economic damage66 or direct mortality) as a proxy measure of physical hazard severity. However, it is now understood that use of these metrics as independent variables may confound estimated treatment effects, since they are endogenously determined by many of the same underlying covariates (for example, healthcare, infrastructure, inequality and institutions) that mediate other outcomes from disasters22,29,33,34,35,67. Thus, use of these proxy measures for hazard severity exposes analyses to selection biases, since population characteristics may cause observational units to ‘select’ into more or less severe treatment23. We therefore focus this analysis strictly on independent variables that are physical measures of TC incidence (wind speed), because they are exogenous and cannot be influenced by the populations that are impacted22.

    Deconvolution

    In considering the long-run impact of TCs on mortality, we hypothesize that there may be a delay between the geophysical event and components of the mortality response. Because TCs are regular events that occur frequently in CONUS, the possibility of this delay means that the time series of mortality outcomes we observe in data may be the result of overlapping responses from multiple storms. Extended Data Fig. 2 displays a cartoon of this data-generating process. In such a context, the empirical challenge is isolating the impact from individual storms which might be partially confounded by the overlapping TC signals from earlier or later storms. We use the well-established signal-processing approach of deconvolution25,26,27,33,63,64 to recover the characteristic impulse-response function for a TC impulse. Conceptually, this approach searches for an impulse-response function that, if applied to all TCs in the data simultaneously, best fits the observed outcome data. Stated another way, this approach estimates the effect of each TC accounting for the potential overlapping impact of all other TCs, subject to the constraint that TCs share a characteristic impulse-response function.

    This method assumes that the overlapping responses influencing mortality at a moment in time are additively separable, an assumption that we think is reasonable given the overall small impact that any individual storm event has on mortality rates at a moment in time in a particular region (0.019% on average, 0.04% for storms at the 95th percentile). We solve for the structure of the impulse response, characterized by a set of coefficients β, using ordinary least squares (OLS). This is a standard procedure that is commonly applied in a wide range of disciplines26. In some fields, such as econometrics, deconvolution is frequently described as estimation of distributed lags27.
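The logic of deconvolution via distributed-lag OLS can be illustrated on simulated data: a known impulse response generates overlapping responses to sparse impulses, and OLS on a matrix of lagged impulses recovers it. This is a toy sketch under simplified assumptions (no fixed effects or controls), not the paper's estimation code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a single series hit by overlapping impulses z with a known
# impulse-response function beta_true, plus noise.
T, L = 2000, 5
beta_true = np.array([1.0, 0.6, 0.3, 0.1, 0.05])
z = rng.poisson(0.1, T).astype(float)            # sparse "storm" impulses
y = np.convolve(z, beta_true)[:T] + rng.normal(0, 0.01, T)

# Design matrix of current and lagged impulses: X[t, l] = z[t - l].
X = np.column_stack([np.concatenate([np.zeros(l), z[:T - l]])
                     for l in range(L)])

# Distributed-lag OLS recovers the impulse response despite overlap.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 2))
```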

    Baseline specification

    Our main results are based on a linear model of TC incidence on mortality rate. Indexing states by i and month of sample by t, we solve the model

    $$\begin{array}{l}mortality{\rm{\_}}{rate}_{it}=\mathop{\sum }\limits_{{\ell }=-72}^{240}({\beta }_{{\ell }}\cdot wind{\rm{\_}}{speed}_{i,t-{\ell }})\\ \,\,\,\,\,\,\,\,+{\delta }_{1,i}\cdot {temp}_{it}\cdot {s}_{i}+\,{\delta }_{2,{\rm{i}}}\cdot {temp}_{it}^{2}\cdot {s}_{i}+{\mu }_{1}\cdot {m}_{it}\\ \,\,\,\,\,\,\,\,+\,\mathop{\sum }\limits_{n=1}^{8}({\eta }_{n}\cdot {s}_{i}\cdot {t}^{n})\,+\,{\mu }_{2}\cdot {m}_{it}\cdot t+{\mu }_{3}\cdot {h}_{t}+{{\epsilon }}_{it}\end{array}$$

    (8)

    via OLS. Here wind_speedi,t−ℓ is TC maximum wind speed ℓ months prior to month t, si is a vector of state-specific dummies, mit are state-by-month-of-the-year dummies (for example, an indicator variable for whether state = Florida and month = January), ht are month-of-sample dummies (for example, an indicator variable for whether the month = January, 1974), \({temp}_{it}\cdot {s}_{i}\) is month-of-sample temperature interacted with state dummies, and \({temp}_{it}^{2}\cdot {s}_{i}\) is squared month-of-sample temperature (also interacted with state). Each coefficient βℓ measures the marginal effect of an additional 1 ms−1 of wind speed incidence on mortality ℓ months after a TC, conditional on the effect of any prior TC. We include 72 lead terms in equation (8) as a falsification test, also known as negative exposure controls, since idiosyncratic future TC incidence should not alter current health outcomes.

    This model accounts for state-specific quadratic effects of temperature on mortality based on prior literature, which has shown that very hot and very cold temperatures cause higher levels of mortality relative to more moderate temperatures31,33,34,35. Each state is allowed to express a different mortality response to temperature extremes, implemented via interaction with the state dummy variable si. Supplementary Fig. 6 shows the state-specific shape of the quadratic functions we estimate for the temperature-mortality response. Consistent with prior findings studying patterns of adaptation31,33,34,35, we observe that some states have a flatter response at temperatures that are more common for that state (for example, cold in Minnesota) while other states have steeper curves at those same temperatures if they are less common (for example, cold in Florida). In an effort to balance parsimony with model richness, we omit extended lags of temperature based on prior literature demonstrating that impacts on mortality dissipate within a month33,68.

    This model also non-parametrically accounts for:

    • State-by-month-specific constants (fixed effects) that capture average differences between states, as well as unique seasonal patterns within states (\({\mu }_{1}\cdot {m}_{it}\)). These terms will account for differences in mortality driven by unobserved factors at the state level, such as health policies, as well as factors that cause seasons within a state to exhibit higher mortality, such as holidays.

    • State-specific nonlinear trends in mortality, captured by eighth-order polynomials in month-of-sample interacted with state fixed effects (\({\sum }_{n=1}^{8}({\eta }_{n}\cdot {s}_{i}\cdot {t}^{n})\)). These trends account for unobserved factors that have caused mortality within states to change over time, such as changing health policies or demographic trends.

    • Trends in state-specific seasonal patterns of mortality, captured by a linear trend in month-of-sample interacted with state and month fixed effects (\({\mu }_{2}\cdot {m}_{it}\cdot t\)). These trends are additive to the state-specific polynomial, allow the model to express gradual convergence or divergence in the seasonality of mortality within a year, and allow these changes to differ by state. They account for unobserved factors that drive gradual changes over time that may cause mortality in certain times of year (for example, January) to change relative to other times of year (for example, June), such as adoption of safety standards that has reduced wintertime mortality from motor vehicle accidents, or improvements in medical care that have reduced summertime deaths from infectious diseases. Extended Data Fig. 4a illustrates the combined effect of these state-specific seasonal trends, state-specific polynomials and state-by-month-specific constants on model predictions for Florida and New Jersey. For example, the seasonality of mortality in Florida has lessened over time, in conjunction with other nonlinear trends.

    • National month-of-sample fixed effects that capture nonlinear and/or discontinuous changes in mortality rates nation-wide (\({\mu }_{3}\cdot {h}_{t}\)). These terms are particularly important for capturing idiosyncratic spikes in mortality that result from nation-wide conditions, such as influenza outbreaks, as well as any systematic changes in the accounting methodology of mortality by the CDC. Comparison of Extended Data Fig. 4b,c illustrates how the inclusion of these terms alters the ability of the model to capture unusual spikes in mortality that are not captured by other model elements, including the trends listed above, TCs, and temperature.

    Overall, the fit for this model is high (in-sample adjusted R2 = 0.93 with 25,062 degrees of freedom). Extended Data Fig. 3 overlays predictions with observations for all states (same as in Fig. 1c). We find that all of the non-parametric controls listed above are important for passing standard specification checks. For example, failure to account for trends flexibly enough causes estimated leads to deviate from zero or randomization-based placebo tests (described below) to recover non-zero central estimates. These results are unchanged if we use a Poisson regression specification (Extended Data Fig. 6a).

    In a robustness test, we interact the month-of-sample fixed effects with three region indicator variables. We continue to obtain our main findings after introducing these additional 2,062 parameters to the model, although the estimates become much noisier and attenuate slightly. Both of these effects are well-understood consequences of including a large number of highly flexible variables that absorb a meaningful fraction of the true variation in the independent variable69.

    We evaluate the distribution of the unmodelled variation represented by the error term ϵit and find that it essentially follows a Normal distribution except with slightly positive kurtosis (Supplementary Fig. 5a). The distribution of these residuals appears stationary throughout the sample period and independent over time (Supplementary Fig. 5b). The consistency of the distribution of these errors is attributed to the high degree of flexibility in the non-parametric terms of our econometric specification, which are able to capture those components of the data-generating process that would otherwise appear as auto-correlated errors. On the basis of this evaluation, we construct OLS standard-error estimates as the underlying assumptions for these estimates appear to be reasonably satisfied. In addition, we find strong support for this modelling choice when we conduct a variety of permutation tests for statistical significance (Extended Data Fig. 5), all of which indicate that our asymptotic estimates for confidence intervals are correctly (and possibly conservatively) sized and our tests for statistical significance correctly powered. Notably, these permutation tests do not rely on the assumptions used to estimate these confidence intervals, thus they can be considered independent corroboration for the validity of this approach.

    Cumulative effects

    To compute the total effect after the TC makes landfall, we estimate the cumulative sum of βℓ for each ℓ ∈ [−72, 240]. We compute \({\Omega }_{{\ell }}={\sum }_{k=0}^{{\ell }}{\beta }_{k}\), which denotes the cumulative impact of an additional 1 ms−1 of wind speed incidence on mortality ℓ months after a TC event. We account for the estimated covariance of β when estimating uncertainty in Ω. We normalize the sums relative to the impact one month prior to the TC, ℓ = −1, such that Ω−1 = 0.
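The cumulative sums and their uncertainty can be sketched as follows, assuming an estimated coefficient vector and covariance matrix are available; the variance of the partial sum follows from Var(w′β) = w′Vw for a selection vector w. Names and toy values are illustrative.

```python
import numpy as np

def cumulative_effect(beta, vcov, ell):
    """Cumulative impact Omega_ell = sum of beta_0..beta_ell, with its standard
    error computed from the full coefficient covariance matrix vcov.
    """
    w = np.zeros(len(beta))
    w[: ell + 1] = 1.0              # selection vector for the partial sum
    omega = w @ beta
    se = np.sqrt(w @ vcov @ w)      # Var(w'beta) = w' V w
    return omega, se

# Toy example: three lag coefficients with uncorrelated errors.
beta = np.array([0.5, 0.3, 0.2])
vcov = np.diag([0.01, 0.01, 0.01])
omega, se = cumulative_effect(beta, vcov, 2)
```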

    Randomization-based placebo tests

    Given the complexity of our model, the long delays we study, and the absence of prior analyses of long-run total mortality from TCs, it is not possible to subjectively evaluate our econometric analysis against any prior benchmark. In such a context, there is risk of unknowingly recovering a spurious estimate generated as an artefact of our model specification. A strong test designed to avoid such artefacts is to ensure that model estimates of TC impacts on mortality are unbiased in a variety of situations where the structure of the association has been manipulated. In four tests, we shuffle the true TC data in different ways. In each case, this shuffling should break any correlation between TC incidence and mortality such that an unbiased estimate of the effect of shuffled TCs on mortality is zero. However, in each case, some of the structure in the original TC data are allowed to remain in the shuffled TC data. For example, randomization within a state over time retains the average cross-sectional patterns of TC incidence, but destroys any time-series structure. Thus, these tests allow us to examine whether, in each case, the remaining structure generates artefacts in the model that would produce a spurious result, also known as a negative exposure control70. Any non-zero correlation, on average, would indicate a biased model where the bias is driven by the non-randomized components of the original TC data.

    Within each type of randomization we scramble TC assignment 1,000 times and run the linear version of the model (equation (8)) on each re-sampled version of the data. Our four randomizations are illustrated graphically in Supplementary Fig. 7 and described below:

    • Total randomization shuffles TC events across all state-by-month observations. This tests whether the unconditional marginal distribution of TC events, which has a long right tail, could generate bias. Results are shown by light blue boxes in Fig. 1g.

    • Within-state randomization shuffles the sequencing of TCs that a state experiences over time. TCs are always assigned to the correct state, but the month and year assigned to each storm is random. The cross-sectional average pattern of storm incidence is preserved in the data. Thus, this tests whether time-invariant cross-sectional patterns across states generate spurious correlations. Results are shown by dark blue boxes in Fig. 1g.

    • Within-month randomization shuffles the TC incidence across states within each month-of-sample. TCs are always assigned to the correct month and year, but the state assigned to each storm is random. The average time-series structure of TC incidence nation-wide is preserved. Thus, this tests whether national or seasonal trends, which are nonlinear, could bias estimates produced by this model. Results are shown by maroon boxes in Fig. 1g.

    • Across-state randomization shuffles complete TC time series across states, keeping the timing and sequence of storms correct as blocks. TCs are always assigned to the correct month and year, and the sequence of storms experienced by a state is always a continuous sequence that is observed in the data. However, the state that is assigned that sequence is randomly chosen. This tests whether trends within a state and within the sequence of storms that a state experiences could generate bias. This test differs from the within-month randomization because state-level trends often differ across states (see Extended Data Figs. 1 and 3) and there are complex seasonal patterns that could potentially affect estimates. Results are shown by red boxes in Fig. 1g.

    The estimated impact of TCs in each of these placebo tests is zero on average, to within a high degree of precision. Extended Data Fig. 5 illustrates distributions of estimates for all lags. These results demonstrate that non-exchangeability across states within a month, across months within a state, or across states (conditional on month of sample) does not confound our analysis, indicating that the rich set of fixed effects and trends successfully adjusts for many patterns of TC incidence and/or mortality such that the remaining conditional variation is as good as random.
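The four shuffling schemes can be sketched for a state-by-month incidence panel; the function and scheme labels below are hypothetical shorthand for the randomizations described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def shuffle_tc(z, scheme):
    """Placebo shuffles of a TC-incidence panel z[state, month].

    'total'        : shuffle all state-month cells jointly.
    'within_state' : permute each state's time series independently.
    'within_month' : permute states within each month of sample.
    'across_state' : reassign whole state time series (blocks) across states.
    """
    out = np.asarray(z, dtype=float).copy()
    n_states, n_months = out.shape
    if scheme == "total":
        flat = out.ravel()
        rng.shuffle(flat)                       # shuffles the view in place
        out = flat.reshape(n_states, n_months)
    elif scheme == "within_state":
        for i in range(n_states):
            rng.shuffle(out[i, :])
    elif scheme == "within_month":
        for t in range(n_months):
            rng.shuffle(out[:, t])
    elif scheme == "across_state":
        out = out[rng.permutation(n_states), :]
    return out
```

Each scheme preserves a different piece of structure: within-state shuffles keep each state's total incidence, within-month shuffles keep the national time series, and across-state shuffles keep entire storm sequences intact.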

    Permutation tests for statistical significance

    In addition to establishing the unbiasedness of our main point estimates, the four randomizations above can be utilized to serve a second function: estimating the statistical significance of our estimates. These randomizations enable approximate permutation tests71, allowing for different types of autocorrelation to remain in the TC data. We use these randomizations to examine the likelihood of randomly obtaining an entire impulse-response function similar to our actual estimate, if in reality no such relationship exists in the data. To do this, we jointly test the significance of all true cumulative estimates Ω against the null hypothesis that a similarly extreme sequence of estimates is generated randomly. Extended Data Fig. 5 overlays the true estimates for Ω on distributions of similar estimates from each randomization. P values for individual lag terms \(\left({p}_{{\ell }}=\Pr \left(\left|{\Omega }_{{\ell }}^{{\rm{randomized}}}\right| > \left|{\Omega }_{{\ell }}\right|\right)\right)\) are plotted in the right subpanels and are all individually statistically significant (P < 0.05) for ℓ < 150 months in each randomization. However, the significance of the complete sequence of coefficients that together compose the entire impulse-response function is far greater. We compute a joint P value for the full impulse response between 0 and 172 months \((p=\Pr ({\cap }_{{\ell }=0}^{172}(| {\Omega }_{{\ell }}^{{\rm{randomized}}}| > | {\Omega }_{{\ell }}| )))\) that ranges from P = 0.012 to P = 0.0012 across the randomization approaches (Extended Data Fig. 5). We conclude that it would be extremely unlikely to obtain an impulse-response function as extreme as our main result due to chance.
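An empirical permutation p-value of this kind can be sketched as the share of placebo estimates at least as extreme as the actual estimate; the names and toy draws below are illustrative.

```python
import numpy as np

def permutation_pvalue(omega_true, omega_null):
    """Two-sided empirical p-value: fraction of placebo estimates whose
    magnitude is at least as large as the actual cumulative estimate.

    omega_true: scalar, actual cumulative effect at some lag.
    omega_null: array of cumulative effects from randomized TC assignments.
    """
    omega_null = np.asarray(omega_null, dtype=float)
    return float(np.mean(np.abs(omega_null) >= abs(omega_true)))

# 1,000 placebo draws centred on zero; a large true effect is rarely exceeded.
rng = np.random.default_rng(0)
p = permutation_pvalue(3.0, rng.normal(0.0, 1.0, 1000))
```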

    Subsamples by age, race and cause of death

    In addition to the all-cause mortality rate for the entire population, we also present the results stratified by age, race, and cause of death. For the six cause-specific mortality rates we compute mortality per 100,000 of the total population in that state in the time period (for example, total number of deaths from cardiovascular disease divided by total population times 100,000). ‘Other’ mortality is the difference between the total deaths and the sum of all the other causes. For age groups and race we report mortality risk as the outcome of interest. For example, for the Black population, we construct the mortality rate for Black people as the number of deaths of Black people in the state divided by the Black population. We do the same procedure by age group. We also report the mortality by these strata as a proportion of the total deaths traceable to TC incidence, see Extended Data Fig. 7.

    Subsamples by average TC risk

    To evaluate whether there is heterogeneity in the mortality response of states that are frequently exposed to TCs compared to those infrequently exposed, we stratify the sample by the average TC incidence they experience. We allow the mortality impulse response to differ based on quartiles of states, sorted by their average TC incidence. We implement this by including an interaction with an indicator variable for the quartile of their average wind speed incidence, following the general approach for modelling adaptation developed in refs. 29,33. Average wind incidence is a measure of the expected TC risk a population bears, which informs preventive risk reduction investments, behaviours, or other adaptive actions they take to reduce the expected harm from TCs. We approximate this measure by computing the mean wind speed in each state i across the periods t in our sample

    $$wind{\rm{\_}}{speed}_{i}=\frac{{\sum }_{t=1}^{1,032}wind{\rm{\_}}{speed}_{it}}{\mathrm{1,032}}$$

    and assigning the quartiles of these means to each state. We estimate a model that allows each quartile to express a different impulse response to TCs and observe little difference in the impact of TCs on mortality between the second through fourth quartiles (Extended Data Fig. 6c). The effects for the second through fourth quartiles are not statistically significantly different from one another (P = 0.38). Thus, to improve the efficiency of our model and limit unnecessary noise in our estimates, we pool quartiles 2–4 to create the ‘high average incidence’ group in the main results (shown in Fig. 2f) (‘high incidence’ in equation (9)). The ‘low average incidence’ group is the first quartile of average wind speed alone (‘low incidence’ in equation (9)). We additionally evaluate whether the spatial distribution of populations relative to the coast, within states, alters the mortality response to TCs. Stratifying states on the basis of the average fraction of the population that lives in coastal counties, we fail to find evidence that states with high concentrations of coastal populations are systematically different from states with little or no coastal population (Extended Data Fig. 6d). Lastly, we evaluate whether the overall spatial correlation between populations and average wind speed incidence, within each state, alters the mortality response to TCs. Stratifying states on the basis of within-state spatial correlation across 0.125° × 0.125° pixels, we fail to find evidence that states with higher spatial correlations are systematically different from states with little or negative correlations (Extended Data Fig. 6e).
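The grouping step can be sketched as follows, assuming a state-by-month wind-incidence panel; the function name and the 'low'/'high' labels are illustrative shorthand for the first-quartile versus pooled quartiles 2–4 split described above.

```python
import numpy as np

def incidence_groups(wind_panel):
    """Assign each state to 'low' (first quartile of mean wind incidence) or
    'high' (quartiles 2-4), mirroring the pooling described in the text.

    wind_panel: array [n_states, n_months] of wind-speed incidence.
    """
    # Mean incidence per state over the sample period.
    mean_wind = np.asarray(wind_panel, dtype=float).mean(axis=1)
    q1 = np.quantile(mean_wind, 0.25)
    return np.where(mean_wind <= q1, "low", "high")
```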

    Nonlinear effects of TCs

    We evaluate whether the mortality impact of a TC is nonlinear in the physical intensity of the event. This could occur, for example, if more extreme TC events generate exponentially more physical damage51,72 or if they elicit different government responses18,73. Empirically, we find that excess mortality 180 months after a TC is well approximated by a linear function of max wind incidence, particularly for TCs with area-average max wind speeds between 0 and 20 ms−1, which is the majority of events (93%) in our sample (Extended Data Fig. 8). However, for the most extreme events (>30 ms−1, 1.4% of events) we find that excess mortality is generally lower than a linear function would predict, although these nonlinear effects are not themselves statistically significant. We lack the data to fully evaluate the underlying causes of this nonlinearity, but believe it is an important topic for future study. For example, it is possible that societal responses to the most extreme events (for example, disaster relief) are more effective at alleviating mortality impacts of TCs because these events attract a disproportionate quantity of attention, compared to less extreme events that are also harmful but less salient74. Regardless of their cause, we account for these non-linearities in calculations below as they contain information on how populations in CONUS have adapted to their TC climates5,73.

    To estimate nonlinear effects of TCs, we estimate a model that is identical to the benchmark linear model in equation (8), but it allows the magnitude of the TC mortality impulse-response function to be cubic in TC incidence. The motivation for this approach is the possibility that the relationship between wind speed and long-run mortality does not increase linearly. For example, very high wind speeds may cause extreme damages and/or elicit greater governmental and humanitarian responses, which would mean that a unit increase from 40 to 41 ms−1 wind speed may have a larger or smaller mortality impact compared to an increase from 5 to 6 ms−1. Since the nonlinear impact of TCs may be influenced by the historical TC experience and baseline TC risk of populations, this model also allows for the nonlinear response to differ based on the risk categorization for each state:

    $$\begin{array}{l}mortality{\rm{\_}}{rate}_{it}=\\ \,\mathop{\sum }\limits_{{\ell }=-72}^{240}\mathop{\sum }\limits_{r=1}^{3}({\theta }_{r,{\ell }}^{low{\rm{\_}}incidence}\cdot wind{\rm{\_}}{speed}_{i,t-{\ell }}^{r}\cdot {Q}_{i}^{low{\rm{\_}}incidence}\\ \,+\,{\theta }_{r,{\ell }}^{high{\rm{\_}}incidence}\cdot wind{\rm{\_}}{speed}_{i,t-{\ell }}^{r}\cdot {Q}_{i}^{high{\rm{\_}}incidence})\\ \,+\,controls+{{\epsilon }}_{it}\end{array}$$

    (9)

    where r indicates an exponent and the controls are identical to those in equation (8). \({Q}_{i}^{low{\rm{\_}}incidence}\) (\({Q}_{i}^{high{\rm{\_}}incidence}\)) is an indicator variable that is set to one if state i is in the low incidence (high incidence) group. The coefficients θ1, θ2 and θ3 separately capture the cubic relationship between wind speed incidence and mortality for low- and high-risk states in each lag period. Extended Data Fig. 8 displays the cumulative impact estimated using both the linear and nonlinear models after 180 months. These results are unchanged if we alternatively use a cubic spline regression specification (Extended Data Fig. 8). The low-risk response is slightly convex relative to the linear estimate, while the high-risk response is slightly concave. In both cases, impacts are relatively well approximated by the linear version of the model and only diverge (insignificantly) at very high levels of incidence that are rare in sample. Distributions of in-sample frequency are shown in the lower panels of Extended Data Fig. 8.
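Constructing the cubic interaction regressors for a single lag of equation (9) can be sketched as follows; the function name and toy inputs are hypothetical.

```python
import numpy as np

def cubic_terms(wind_speed, high_incidence):
    """Regressors for equation (9) at a single lag: wind^r interacted with
    low/high-incidence indicators, for r = 1, 2, 3.

    wind_speed:     array [n_obs] of wind-speed incidence.
    high_incidence: boolean array [n_obs], True for high-incidence states.
    """
    w = np.asarray(wind_speed, dtype=float)
    q_hi = np.asarray(high_incidence, dtype=float)
    q_lo = 1.0 - q_hi
    # Columns ordered (r=1 low, r=1 high, r=2 low, r=2 high, r=3 low, r=3 high).
    cols = [w**r * q for r in (1, 2, 3) for q in (q_lo, q_hi)]
    return np.column_stack(cols)   # shape [n_obs, 6]
```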

    Computing mortality burdens

    The total impacts of all TCs on mortality are estimated using each version of the model and are presented in Supplementary Table 1. We compute the excess mortality from TCs by state, month and TC. These estimates are presented in Figs. 3 and 4 and Supplementary Tables 1 and 2.

    Figure 4 displays the estimated excess mortality from TCs by state, age and race, computed using equation (8) applied to equation (7). Figure 4a presents our estimated full TC mortality burden, similar to equation (7) but by state (mortality_burdenit) as an average proportion of total deaths (mortalityit) in each state between 1950 and 2015:

    $${proportion}_{i}=\frac{{\sum }_{t\in {\rm{m}}{\rm{o}}{\rm{n}}{\rm{t}}{\rm{h}}}mortality{\rm{\_}}{burden}_{it}}{{\sum }_{t\in {\rm{m}}{\rm{o}}{\rm{n}}{\rm{t}}{\rm{h}}}{mortality}_{it}}$$

    Similarly, we estimate the proportion by state and age group (proportioni,a), shown in Fig. 4d,f. The proportion and total excess mortality for the Black population are based on the mortality burden estimated with the mortality risk for the Black population; therefore, \({proportion}_{Black}=\frac{{\sum }_{t\in {\rm{m}}{\rm{o}}{\rm{n}}{\rm{t}}{\rm{h}}}{\sum }_{i\in {\rm{s}}{\rm{t}}{\rm{a}}{\rm{t}}{\rm{e}}}mortality{\rm{\_}}{burden}_{it,{\rm{B}}{\rm{l}}{\rm{a}}{\rm{c}}{\rm{k}}}}{{\sum }_{t\in {\rm{m}}{\rm{o}}{\rm{n}}{\rm{t}}{\rm{h}}}{\sum }_{i\in {\rm{s}}{\rm{t}}{\rm{a}}{\rm{t}}{\rm{e}}}{mortality}_{it,{\rm{B}}{\rm{l}}{\rm{a}}{\rm{c}}{\rm{k}}}}\).
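
    The share calculation above reduces to a simple aggregation, sketched here with hypothetical state-month numbers; the paper's estimates use the state-month burden and mortality series described above.

```python
from collections import defaultdict

def tc_death_proportion(records):
    """records: (state, month, mortality_burden, total_mortality) tuples.
    Returns state -> sum_t burden_it / sum_t mortality_it."""
    burden = defaultdict(float)
    total = defaultdict(float)
    for state, _month, b, m in records:
        burden[state] += b
        total[state] += m
    return {s: burden[s] / total[s] for s in total}

# hypothetical state-month records (excess TC deaths, total deaths)
records = [
    ("FL", 1, 50.0, 10000.0),
    ("FL", 2, 30.0, 9000.0),
    ("ME", 1, 0.0, 5000.0),
]
print(tc_death_proportion(records))
```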

    Figure 4g illustrates the impact of the TC climate on the geographic differences in average annual all-cause mortality rates between states that do not experience TCs (‘non-TC states’) and states that do (‘TC states (actual)’). We also subtract the average annual TC mortality burden from the actual average annual mortality for each TC-impacted state (‘TC states without TCs’).

    Decomposing trends in mortality burden

    We examine the differences in TC events and population distribution before and after 2001 to understand why the mortality burden has increased more rapidly since 2001 (Fig. 4h–j). Figure 4h shows the distribution of the number of TCs that made landfall each year before and after 2001. Similarly, Fig. 4i shows the maximum annual wind speed per year and Fig. 4j plots the average wind speed per year as experienced by a proportion of the CONUS population. The changes in the distribution of TC events affecting CONUS after 2001 were themselves probably caused by a combination of factors, including warmer sea surface temperatures11,75 and reductions of anthropogenic aerosol emissions76,77 (which create an environment more amenable to TC intensification), as well as shifts in steering winds78 (which direct a larger fraction of TCs to landfall in CONUS after formation). We note that identifying the factors driving the TC climate remains an active area of research56,79.

    To understand the 1950 to 2015 trend in the national aggregate mortality burden, we re-estimate mortality_burdent for each month of the sample with different populations to decompose the long-term trend based on various population patterns. We first replace populationi,t+ from equation (7) with fixed 1950 or 2015 populations (Fig. 4k, red and maroon lines). To generate the yellow line in Fig. 4k, we replace populationi,t+ with an estimate of the 2015 population ‘deflated’ to 1950 levels. Specifically, we first compute a national population deflation fraction,

    $$\Delta =\frac{{\sum }_{i\in {\rm{s}}{\rm{t}}{\rm{a}}{\rm{t}}{\rm{e}}}{population}_{i,2015}-{\sum }_{i\in {\rm{s}}{\rm{t}}{\rm{a}}{\rm{t}}{\rm{e}}}{population}_{i,1950}}{{\sum }_{i\in {\rm{s}}{\rm{t}}{\rm{a}}{\rm{t}}{\rm{e}}}{population}_{i,1950}}$$

    where populationi,2015 is the state-specific population in 2015 and populationi,1950 is the state population in 1950. We then calculate

    $$deflated\_\,{population}_{i,2015}=\frac{{population}_{i,2015}}{1+\Delta }$$

    and apply deflated_populationi,2015 to equation (7). This adjusted state-level population makes the total national population match the 1950 level while retaining the relative spatial distribution of 2015.
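
    Read state by state, the deflation can be sketched as follows, with hypothetical two-state populations; Δ is the national growth fraction defined above.

```python
def deflate_2015_population(pop_1950, pop_2015):
    """Scale each state's 2015 population by 1 / (1 + delta) so the
    national total matches 1950 while keeping the 2015 spatial pattern."""
    total_1950 = sum(pop_1950.values())
    delta = (sum(pop_2015.values()) - total_1950) / total_1950
    return {s: p / (1.0 + delta) for s, p in pop_2015.items()}

# hypothetical populations: the nation doubles, with growth concentrated in A
pop_1950 = {"A": 100.0, "B": 100.0}
pop_2015 = {"A": 300.0, "B": 100.0}
deflated = deflate_2015_population(pop_1950, pop_2015)
print(deflated)  # national total is back at 200, but the 3:1 split is kept
```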

    Reporting summary

    Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


  • The economic commitment of climate change


    Historical climate data

    Historical daily 2-m temperature and precipitation totals (in mm) are obtained for the period 1979–2019 from the W5E5 database. The W5E5 dataset comes from ERA-5, a state-of-the-art reanalysis of historical observations, but has been bias-adjusted by applying version 2.0 of the WATCH Forcing Data to ERA-5 reanalysis data and precipitation data from version 2.3 of the Global Precipitation Climatology Project to better reflect ground-based measurements49,50,51. We obtain these data on a 0.5° × 0.5° grid from the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) database. Notably, these historical data have been used to bias-adjust future climate projections from CMIP-6 (see the following section), ensuring consistency between the distribution of historical daily weather on which our empirical models were estimated and the climate projections used to estimate future damages. These data are publicly available from the ISIMIP database. See refs. 7,8 for robustness tests of the empirical models to the choice of climate data reanalysis products.

    Future climate data

    Daily 2-m temperature and precipitation totals (in mm) are taken from 21 climate models participating in CMIP-6 under a high (RCP8.5) and a low (RCP2.6) greenhouse gas emission scenario from 2015 to 2100. The data have been bias-adjusted and statistically downscaled to a common half-degree grid to reflect the historical distribution of daily temperature and precipitation of the W5E5 dataset using the trend-preserving method developed by the ISIMIP50,52. As such, the climate model data reproduce observed climatological patterns exceptionally well (Supplementary Table 5). Gridded data are publicly available from the ISIMIP database.

    Historical economic data

    Historical economic data come from the DOSE database of sub-national economic output53. We use a recent revision to the DOSE dataset that provides data across 83 countries and 1,660 sub-national regions, with varying temporal coverage from 1960 to 2019. Sub-national units constitute the first administrative division below the national level, for example, states for the USA and provinces for China. Data come from measures of gross regional product per capita (GRPpc) or income per capita in local currencies, reflecting the values reported in national statistical agencies, yearbooks and, in some cases, the academic literature. We follow previous literature3,7,8,54 and assess real sub-national output per capita by first converting values from local currencies to US dollars to account for diverging national inflationary tendencies and then accounting for US inflation using a US deflator. Alternatively, one might first account for national inflation and then convert between currencies. Supplementary Fig. 12 demonstrates that our conclusions are consistent when accounting for price changes in the reversed order, although the magnitude of estimated damages varies. See the documentation of the DOSE dataset for further discussion of these choices. Conversions between currencies are conducted using exchange rates from the FRED database of the Federal Reserve Bank of St. Louis55 and the national deflators from the World Bank56.

    Future socio-economic data

    Baseline gridded gross domestic product (GDP) and population data for the period 2015–2100 are taken from the middle-of-the-road scenario SSP2 (ref. 15). Population data have been downscaled to a half-degree grid by the ISIMIP following the methodologies of refs. 57,58, which we then aggregate to the sub-national level of our economic data using the spatial aggregation procedure described below. Because current methodologies for downscaling the GDP of the SSPs use downscaled population to do so, per-capita estimates of GDP with a realistic distribution at the sub-national level are not readily available for the SSPs. We therefore use national-level GDP per capita (GDPpc) projections for all sub-national regions of a given country, assuming homogeneity within countries in terms of baseline GDPpc. Here we use projections that have been updated to account for the impact of the COVID-19 pandemic on the trajectory of future income, while remaining consistent with the long-term development of the SSPs59. The choice of baseline SSP alters the magnitude of projected climate damages in monetary terms, but when assessed in terms of percentage change from the baseline, the choice of socio-economic scenario is inconsequential. Gridded SSP population data and national-level GDPpc data are publicly available from the ISIMIP database. Sub-national estimates as used in this study are available in the code and data replication files.

    Climate variables

    Following recent literature3,7,8, we calculate an array of climate variables for which substantial impacts on macroeconomic output have been identified empirically, supported by further evidence at the micro level for plausible underlying mechanisms. See refs. 7,8 for an extensive motivation for the use of these particular climate variables and for detailed empirical tests on the nature and robustness of their effects on economic output. To summarize, these studies have found evidence for independent impacts on economic growth rates from annual average temperature, daily temperature variability, total annual precipitation, the annual number of wet days and extreme daily rainfall. Assessments of daily temperature variability were motivated by evidence of impacts on agricultural output and human health, as well as macroeconomic literature on the impacts of volatility on growth when manifest in different dimensions, such as government spending, exchange rates and even output itself7. Assessments of precipitation impacts were motivated by evidence of impacts on agricultural productivity, metropolitan labour outcomes and conflict, as well as damages caused by flash flooding8. See Extended Data Table 1 for detailed references to empirical studies of these physical mechanisms. Marked impacts of daily temperature variability, total annual precipitation, the number of wet days and extreme daily rainfall on macroeconomic output were identified robustly across different climate datasets, spatial aggregation schemes, specifications of regional time trends and error-clustering approaches. They were also found to be robust to the consideration of temperature extremes7,8. 
Furthermore, these climate variables were identified as having independent effects on economic output7,8, which we further explain here using Monte Carlo simulations to demonstrate the robustness of the results to concerns of imperfect multicollinearity between climate variables (Supplementary Methods Section 2), as well as by using information criteria (Supplementary Table 1) to demonstrate that including several lagged climate variables provides a preferable trade-off between optimally describing the data and limiting the possibility of overfitting.

    We calculate these variables from the distribution of daily, d, temperature, Tx,d, and precipitation, Px,d, at the grid-cell, x, level for both the historical and future climate data. As well as annual mean temperature, \({\bar{T}}_{x,y}\), and annual total precipitation, Px,y, we calculate annual, y, measures of daily temperature variability, \({\widetilde{T}}_{x,y}\):

    $${\widetilde{T}}_{x,y}=\frac{1}{12}\mathop{\sum }\limits_{m=1}^{12}\sqrt{\frac{1}{{D}_{m}}\mathop{\sum }\limits_{d=1}^{{D}_{m}}{({T}_{x,d,m,y}-{\bar{T}}_{x,m,y})}^{2}},$$

    (1)

    the number of wet days, Pwdx,y:

    $${{\rm{Pwd}}}_{x,y}=\mathop{\sum }\limits_{d=1}^{{D}_{y}}H\left({P}_{x,d}-1\,{\rm{mm}}\right)$$

    (2)

    and extreme daily rainfall:

    $${{\rm{Pext}}}_{x,y}=\mathop{\sum }\limits_{d=1}^{{D}_{y}}H\left({P}_{x,d}-{P99.9}_{x}\right)\times {P}_{x,d},$$

    (3)

    in which Tx,d,m,y is the grid-cell-specific daily temperature in month m and year y, \({\bar{T}}_{x,m,{y}}\) is the year and grid-cell-specific monthly, m, mean temperature, Dm and Dy the number of days in a given month m or year y, respectively, H the Heaviside step function, 1 mm the threshold used to define wet days and P99.9x is the 99.9th percentile of historical (1979–2019) daily precipitation at the grid-cell level. Units of the climate measures are degrees Celsius for annual mean temperature and daily temperature variability, millimetres for total annual precipitation and extreme daily precipitation, and simply the number of days for the annual number of wet days.
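
    A sketch of these three calculations for a single grid cell and year follows, using hypothetical daily data; months are simplified to 30 days, and a day with rainfall exactly at a threshold is counted as exceeding it (the Heaviside convention at the threshold is an assumption here).

```python
import numpy as np

rng = np.random.default_rng(1)
temp = rng.normal(15, 8, (12, 30))     # T_{x,d,m,y}: 12 months x 30 days, deg C
precip = rng.gamma(0.5, 4.0, 360)      # P_{x,d}: daily totals, mm

# equation (1): average over months of the within-month daily std dev
monthly_sd = np.sqrt(np.mean((temp - temp.mean(axis=1, keepdims=True)) ** 2, axis=1))
t_var = monthly_sd.mean()

# equation (2): wet days have at least 1 mm of rain
wet_days = int((precip >= 1.0).sum())

# equation (3): total rainfall on days at or above the 99.9th percentile
p999 = np.quantile(precip, 0.999)      # in practice from the 1979-2019 record
extreme_rain = float(precip[precip >= p999].sum())

print(t_var, wet_days, extreme_rain)
```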

    We also calculated weighted standard deviations of monthly rainfall totals, as also used in ref. 8, but do not include them in our projections because we find that, when accounting for delayed effects, their effect becomes statistically indistinguishable from zero and is better captured by changes in total annual rainfall.

    Spatial aggregation

    We aggregate grid-cell-level historical and future climate measures, as well as grid-cell-level future GDPpc and population, to the level of the first administrative unit below national level of the GADM database, using an area-weighting algorithm that estimates the portion of each grid cell falling within an administrative boundary. We use this as our baseline specification following previous findings that the effect of area or population weighting at the sub-national level is negligible7,8.
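
    Assuming the overlap fractions between grid cells and an administrative boundary have already been computed (in practice from the GADM polygons), the aggregation itself is a weighted average:

```python
def area_weighted_aggregate(cell_values, overlap_fractions):
    """Aggregate grid-cell values to one admin unit, weighting each cell by
    the fraction of its area that falls inside the unit's boundary."""
    weighted = sum(v * w for v, w in zip(cell_values, overlap_fractions))
    return weighted / sum(overlap_fractions)

# hypothetical unit: one interior cell and two partially overlapping cells
print(area_weighted_aggregate([10.0, 20.0, 30.0], [1.0, 0.5, 0.25]))
```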

    Empirical model specification: fixed-effects distributed lag models

    Following a wide range of climate econometric literature16,60, we use panel regression models with a selection of fixed effects and time trends to isolate plausibly exogenous variation with which to maximize confidence in a causal interpretation of the effects of climate on economic growth rates. The use of region fixed effects, μr, accounts for unobserved time-invariant differences between regions, such as prevailing climatic norms and growth rates owing to historical and geopolitical factors. The use of yearly fixed effects, ηy, accounts for regionally invariant annual shocks to the global climate or economy such as the El Niño–Southern Oscillation or global recessions. In our baseline specification, we also include region-specific linear time trends, kry, to exclude the possibility of spurious correlations resulting from common slow-moving trends in climate and growth.

    The persistence of climate impacts on economic growth rates is a key determinant of the long-term magnitude of damages. Methods for inferring the extent of persistence in impacts on growth rates have typically used lagged climate variables to evaluate the presence of delayed effects or catch-up dynamics2,18. For example, consider starting from a model in which a climate condition, Cr,y, (for example, annual mean temperature) affects the growth rate, Δlgrpr,y (the first difference of the logarithm of gross regional product) of region r in year y:

    $${\Delta {\rm{lgrp}}}_{r,y}={\mu }_{r}+{\eta }_{y}+{k}_{r}y+\alpha {C}_{r,y}+{\varepsilon }_{r,y},$$

    (4)

    which we refer to as a ‘pure growth effects’ model in the main text. Typically, further lags are included,

    $${\Delta {\rm{lgrp}}}_{r,y}={\mu }_{r}+{\eta }_{y}+{k}_{r}y+\mathop{\sum }\limits_{L=0}^{{\rm{NL}}}{\alpha }_{L}{C}_{r,y-L}+{\varepsilon }_{r,y},$$

    (5)

    and the cumulative effect of all lagged terms is evaluated to assess the extent to which climate impacts on growth rates persist. Following ref. 18, in the case that,

    $$\mathop{\sum }\limits_{L=0}^{{\rm{NL}}}{\alpha }_{L} < 0\,{\rm{for}}\,{\alpha }_{0} < 0\,{\rm{or}}\,\mathop{\sum }\limits_{L=0}^{{\rm{NL}}}{\alpha }_{L} > 0\,{\rm{for}}\,{\alpha }_{0} > 0,$$

    (6)

    the implication is that impacts on the growth rate persist up to NL years after the initial shock (possibly to a weaker or a stronger extent), whereas if

    $$\mathop{\sum }\limits_{L=0}^{{\rm{NL}}}{\alpha }_{L}=0,$$

    (7)

    then the initial impact on the growth rate is recovered after NL years and the effect is only one on the level of output. However, we note that such approaches are limited by the fact that, when including an insufficient number of lags to detect a recovery of the growth rates, one may find equation (6) to be satisfied and incorrectly conclude that a change in climatic conditions affects the growth rate indefinitely. In practice, given a limited record of historical data, it is likely that too few lags are included, leading one to conclude, incorrectly, that the impact on the growth rate is infinitely persistent, particularly over the long timescales over which future climate damages are often projected2,24. To avoid this issue, we instead begin our analysis with a model for which the level of output, lgrpr,y, depends on the level of a climate variable, Cr,y:

    $${{\rm{lgrp}}}_{r,y}={\mu }_{r}+{\eta }_{y}+{k}_{r}y+\alpha {C}_{r,y}+{\varepsilon }_{r,y}.$$

    (8)

    Given the non-stationarity of the level of output, we follow the literature19 and estimate such an equation in first-differenced form as,

    $${\Delta {\rm{lgrp}}}_{r,y}={\mu }_{r}+{\eta }_{y}+{k}_{r}y+\alpha {\Delta C}_{r,y}+{\varepsilon }_{r,y},$$

    (8)

    which we refer to as a model of ‘pure level effects’ in the main text. This model constitutes a baseline specification in which a permanent change in the climate variable produces an instantaneous impact on the growth rate and a permanent effect only on the level of output. By including lagged variables in this specification,

    $${\Delta {\rm{lgrp}}}_{r,y}={\mu }_{r}+{\eta }_{y}+{k}_{r}y+\mathop{\sum }\limits_{L=0}^{{\rm{NL}}}{\alpha }_{L}{\Delta C}_{r,y-L}+{\varepsilon }_{r,y},$$

    (9)

    we are able to test whether the impacts on the growth rate persist any further than instantaneously by evaluating whether the αL for L > 0 are statistically significantly different from zero. Even though this framework is also limited by the possibility of including too few lags, the choice of a baseline model specification in which impacts on the growth rate do not persist means that, in the case of including too few lags, the framework reverts to the baseline specification of level effects. As such, this framework is conservative with respect to the persistence of impacts and the magnitude of future damages. It naturally avoids assumptions of infinite persistence and allows us to interpret any persistence that we identify with equation (9) as a lower bound on the extent of climate impact persistence on growth rates. See the main text for further discussion of this specification choice, in particular its conservative nature compared with previous literature estimates, such as refs. 2,18.
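
    The lagged first-difference test can be sketched on simulated data. In the simulation below, the true model is a pure level effect (α0 = −0.5, no persistence), so the lagged coefficients recovered by least squares should be near zero; the fixed effects, trends and region panel are omitted for brevity (the paper's regressions use fixest in R).

```python
import numpy as np

def lagged_first_differences(c, n_lags):
    """Design matrix with Delta C_t, Delta C_{t-1}, ..., Delta C_{t-NL}
    for a single region's time series (equation (9) without controls)."""
    dc = np.diff(c)
    return np.asarray([[dc[t - L] for L in range(n_lags + 1)]
                       for t in range(n_lags, len(dc))])

rng = np.random.default_rng(2)
c = rng.normal(20.0, 1.0, 200)                        # climate variable level
dlgrp = -0.5 * np.diff(c) + rng.normal(0, 0.01, 199)  # level effect only

X = lagged_first_differences(c, n_lags=3)
alpha, *_ = np.linalg.lstsq(X, dlgrp[3:], rcond=None)
print(alpha.round(2))  # alpha_0 near -0.5, lagged coefficients near zero
```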

    We allow the response to climatic changes to vary across regions, using interactions of the climate variables with historical average (1979–2019) climatic conditions, reflecting heterogeneous effects identified in previous work7,8. Following this previous work, the moderating variable of each interaction term is the historical average of either the variable itself or, in the cases of daily temperature variability7 and extreme daily rainfall8, the seasonal temperature difference, \({\hat{T}}_{r}\), and the annual mean temperature, \({\bar{T}}_{r}\), respectively.

    The resulting regression equation, with N lags for the temperature variables and M lags for the precipitation variables, reads:

    $$\begin{array}{l}{\Delta {\rm{lgrp}}}_{r,y}\,=\,{\mu }_{r}+{\eta }_{y}+{k}_{r}y+\mathop{\sum }\limits_{L=0}^{N}({{\alpha }_{1,L}\Delta \bar{T}}_{r,y-L}+{\alpha }_{2,L}{\Delta \bar{T}}_{r,y-L}\times {\bar{T}}_{r})\\ \,\,\,\,\,+\mathop{\sum }\limits_{L=0}^{N}({{\alpha }_{3,L}\Delta \widetilde{T}}_{r,y-L}+{\alpha }_{4,L}{\Delta \widetilde{T}}_{r,y-L}\times {\widehat{T}}_{r})\\ \,\,\,\,\,+\mathop{\sum }\limits_{L=0}^{M}({\alpha }_{5,L}\Delta {P}_{r,y-L}+{\alpha }_{6,L}\Delta {P}_{r,y-L}\times {P}_{r})\\ \,\,\,\,\,+\mathop{\sum }\limits_{L=0}^{M}({\alpha }_{7,L}\Delta {{\rm{Pwd}}}_{r,y-L}+{\alpha }_{8,L}\Delta {{\rm{Pwd}}}_{r,y-L}\times {{\rm{Pwd}}}_{r})\\ \,\,\,\,\,+\mathop{\sum }\limits_{L=0}^{M}({\alpha }_{9,L}\Delta {{\rm{Pext}}}_{r,y-L}+{\alpha }_{10,L}\Delta {{\rm{Pext}}}_{r,y-L}\times {\bar{T}}_{r})+{{\epsilon }}_{r,y}\end{array}$$

    (10)

    in which Δlgrpr,y is the annual, regional GRPpc growth rate, measured as the first difference of the logarithm of real GRPpc, following previous work2,3,7,8,18,19. Fixed-effects regressions were run using the fixest package in R (ref. 61).

    Estimates of the coefficients of interest αi,L are shown in Extended Data Fig. 1 for N = M = 10 lags and for our preferred choice of the number of lags in Supplementary Figs. 1–3. In Extended Data Fig. 1, errors are shown clustered at the regional level, but for the construction of damage projections, we block-bootstrap the regressions by region 1,000 times to provide a range of parameter estimates with which to sample the projection uncertainty (following refs. 2,31).
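
    The resampling scheme can be sketched as follows; in practice, the regressions themselves are re-estimated (with fixest in R) on each resampled panel, so this fragment only illustrates drawing whole regions with replacement. The region codes are hypothetical.

```python
import random

def block_bootstrap_regions(regions, n_draws, seed=0):
    """Each draw is a resampled set of regions (with replacement); all
    observations of a drawn region enter the re-estimation together."""
    rng = random.Random(seed)
    return [[rng.choice(regions) for _ in regions] for _ in range(n_draws)]

regions = ["USA.1", "USA.2", "CHN.1", "CHN.2"]   # hypothetical region codes
draws = block_bootstrap_regions(regions, n_draws=1000)
print(len(draws), len(draws[0]))                 # 1000 draws of 4 regions each
```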

    Spatial-lag model

    In Supplementary Fig. 14, we present the results from a spatial-lag model that explores the potential for climate impacts to ‘spill over’ into spatially neighbouring regions. We measure the distance between the centroids of each pair of sub-national regions and construct spatial lags that take the average of the first-differenced climate variables and their interaction terms over neighbouring regions at distances of 0–500, 500–1,000, 1,000–1,500 and 1,500–2,000 km (spatial lags, ‘SL’, 1 to 4). For simplicity, we then estimate a spatial-lag model without temporal lags to assess spatial spillovers of contemporaneous climate impacts. This model takes the form:

    $$\begin{array}{c}{\Delta {\rm{lgrp}}}_{r,y}\,=\,{\mu }_{r}+{\eta }_{y}+{k}_{r}y+\mathop{\sum }\limits_{{\rm{SL}}=0}^{N}({\alpha }_{1,{\rm{SL}}}\Delta {\bar{T}}_{r-{\rm{SL}},y}+{\alpha }_{2,{\rm{SL}}}\Delta {\bar{T}}_{r-{\rm{SL}},y}\times {\bar{T}}_{r-{\rm{SL}}})\\ \,\,\,\,\,+\mathop{\sum }\limits_{{\rm{SL}}=0}^{N}({\alpha }_{3,{\rm{SL}}}\Delta {\widetilde{T}}_{r-{\rm{SL}},y}+{\alpha }_{4,{\rm{SL}}}\Delta {\widetilde{T}}_{r-{\rm{SL}},y}\times {\hat{T}}_{r-{\rm{SL}}})\\ \,\,\,\,\,+\mathop{\sum }\limits_{{\rm{SL}}=0}^{N}({\alpha }_{5,{\rm{SL}}}\Delta {P}_{r-{\rm{SL}},y}+{\alpha }_{6,{\rm{SL}}}\Delta {P}_{r-{\rm{SL}},y}\times {P}_{r-{\rm{SL}}})\\ \,\,\,\,\,+\mathop{\sum }\limits_{{\rm{SL}}=0}^{N}({\alpha }_{7,{\rm{SL}}}\Delta {{\rm{Pwd}}}_{r-{\rm{SL}},y}+{\alpha }_{8,{\rm{SL}}}\Delta {{\rm{Pwd}}}_{r-{\rm{SL}},y}\times {{\rm{Pwd}}}_{r-{\rm{SL}}})\\ \,\,\,\,\,+\mathop{\sum }\limits_{{\rm{SL}}=0}^{N}({\alpha }_{9,{\rm{SL}}}\Delta {{\rm{Pext}}}_{r-{\rm{SL}},y}+{\alpha }_{10,{\rm{SL}}}\Delta {{\rm{Pext}}}_{r-{\rm{SL}},y}\times {\bar{T}}_{r-{\rm{SL}}})+{{\epsilon }}_{r,y},\end{array}$$

    (11)

    in which SL indicates the spatial lag of each climate variable and interaction term. In Supplementary Fig. 14, we plot the cumulative marginal effect of each climate variable at different baseline climate conditions by summing the coefficients for each climate variable and interaction term, for example, for average temperature impacts as:

    $${\rm{M}}{\rm{E}}=\mathop{\sum }\limits_{{\rm{S}}{\rm{L}}=0}^{N}({\alpha }_{1,{\rm{S}}{\rm{L}}}+{\alpha }_{2,{\rm{S}}{\rm{L}}}{\bar{T}}_{r-{\rm{S}}{\rm{L}}}).$$

    (12)

    These cumulative marginal effects can be regarded as the overall spatially dependent impact on an individual region given a one-unit shock to a climate variable in that region and all neighbouring regions, at a given value of the moderating variable of the interaction term.

    Constructing projections of economic damage from future climate change

    We construct projections of future climate damages by applying the coefficients estimated in equation (10) and shown in Supplementary Tables 2–4 (when including only lags with statistically significant effects in specifications that limit overfitting; see Supplementary Methods Section 1) to projections of future climate change from the CMIP-6 models. Year-on-year changes in each primary climate variable of interest are calculated to reflect the year-to-year variations used in the empirical models. 30-year moving averages of the moderating variables of the interaction terms are calculated to reflect the long-term average of climatic conditions that were used for the moderating variables in the empirical models. By using moving averages in the projections, we account for the changing vulnerability to climate shocks based on the evolving long-term conditions (Supplementary Figs. 10 and 11 show that the results are robust to the precise choice of the window of this moving average). Although these climate variables are not differenced, the fact that the bias-adjusted climate models reproduce observed climatological patterns across regions for these moderating variables very accurately (Supplementary Table 6) with limited spread across models (<3%) precludes the possibility that any considerable bias or uncertainty is introduced by this methodological choice. However, we impose caps on these moderating variables at the 95th percentile at which they were observed in the historical data to prevent extrapolation of the marginal effects outside the range in which the regressions were estimated. This is a conservative choice that limits the magnitude of our damage projections.
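
    The treatment of the moderating variables can be sketched as follows: a 30-year moving average of the projected series, capped at the 95th percentile of the historical record. The series below are hypothetical.

```python
import numpy as np

def capped_moderator(projected, historical, window=30):
    """Moving average of a projected moderating variable, capped at the
    95th percentile of the historical (1979-2019) record."""
    cap = np.quantile(historical, 0.95)
    smoothed = np.convolve(projected, np.ones(window) / window, mode="valid")
    return np.minimum(smoothed, cap)

hist = np.linspace(10.0, 20.0, 41)    # hypothetical 1979-2019 annual values
proj = np.linspace(15.0, 30.0, 80)    # hypothetical warming projection
mod = capped_moderator(proj, hist)
print(mod.max() <= np.quantile(hist, 0.95))  # True: the cap binds
```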

    Time series of primary climate variables and moderating climate variables are then combined with estimates of the empirical model parameters to evaluate the regression coefficients in equation (10), producing a time series of annual GRPpc growth-rate reductions for a given emission scenario, climate model and set of empirical model parameters. The resulting time series of growth-rate impacts reflects those occurring owing to future climate change. By contrast, a future scenario with no climate change would be one in which climate variables do not change (other than with random year-to-year fluctuations) and hence the time-averaged evaluation of equation (10) would be zero. Our approach therefore implicitly compares the future climate-change scenario to this no-climate-change baseline scenario.

    The time series of growth-rate impacts owing to future climate change in region r and year y, δr,y, are then added to the future baseline growth rates, πr,y (in log-diff form), obtained from the SSP2 scenario to yield trajectories of damaged GRPpc growth rates, ρr,y. These trajectories are aggregated over time to estimate the future trajectory of GRPpc with future climate impacts:

    $${{\rm{GRPpc}}}_{r,Y}={{\rm{GRPpc}}}_{r,2020}\mathop{\prod }\limits_{y=2020}^{Y}\left(1+{\rho }_{r,y}\right)={{\rm{GRPpc}}}_{r,2020}\mathop{\prod }\limits_{y=2020}^{Y}\left(1+{\pi }_{r,y}+{\delta }_{r,y}\right),$$

    (13)

    in which GRPpcr,y=2020 is the initial level of GRPpc. We begin damage estimates in 2020 to reflect the damages occurring since the end of the period for which we estimate the empirical models (1979–2019) and to match the timing of mitigation-cost estimates from most IAMs (see below).
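
    Reading the accumulation in equation (13) multiplicatively, the construction of a damaged trajectory can be sketched with hypothetical growth rates:

```python
def grp_trajectory(grp_2020, baseline_growth, damage_impacts):
    """Compound damaged growth rates rho = pi + delta forward from the
    2020 level to obtain GRPpc_{r,Y}."""
    levels = [grp_2020]
    for pi, delta in zip(baseline_growth, damage_impacts):
        levels.append(levels[-1] * (1.0 + pi + delta))
    return levels

# hypothetical region: 2% baseline growth, damages of -0.5% per year
path = grp_trajectory(30000.0, [0.02] * 5, [-0.005] * 5)
print(round(path[-1], 2))
```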

    For each emission scenario, this procedure is repeated 1,000 times while randomly sampling from the selection of climate models, the selection of empirical models with different numbers of lags (shown in Supplementary Figs. 1–3 and Supplementary Tables 2–4) and bootstrapped estimates of the regression parameters. The result is an ensemble of future GRPpc trajectories that reflect uncertainty from both physical climate change and the structural and sampling uncertainty of the empirical models.

    Estimates of mitigation costs

    We obtain IPCC estimates of the aggregate costs of emission mitigation from the AR6 Scenario Explorer and Database hosted by IIASA23. Specifically, we search the AR6 Scenarios Database World v1.1 for IAMs that provided estimates of global GDP and population under both a SSP2 baseline and a SSP2-RCP2.6 scenario to maintain consistency with the socio-economic and emission scenarios of the climate damage projections. We find five IAMs that provide data for these scenarios, namely, MESSAGE-GLOBIOM 1.0, REMIND-MAgPIE 1.5, AIM/CGE 2.0, GCAM 4.2 and WITCH-GLOBIOM 3.1. Of these five IAMs, we use the results only from the first three, which passed the IPCC vetting procedure for reproducing historical emission and climate trajectories. We then estimate global mitigation costs as the percentage difference in global per capita GDP between the SSP2 baseline and the SSP2-RCP2.6 emission scenario. In the case of one of these IAMs, estimates of mitigation costs begin in 2020, whereas in the case of the two others, mitigation costs begin in 2010. The mitigation cost estimates before 2020 in these two IAMs are mostly negligible, and our choice to begin comparison with damage estimates in 2020 is conservative with respect to the relative weight of climate damages compared with mitigation costs for these two IAMs.
