A dialogue on examining datasets in the nuclear vs renewable energy debate

Posted on 11 August 2021 by Francisco Dominguez

Following the publication of their paper “Differences in carbon emissions reduction between countries pursuing renewable electricity versus nuclear power” in Nature Energy last October, Prof. Benjamin K Sovacool, Prof. Andy Stirling and their co-authors received a number of responses and challenges to the paper’s findings.

To advance scientific debate around independent research, they engaged in a series of dialogues with researchers offering critiques of their work. Below, they share an exchange with Daniel Perez, PhD student at École Normale Supérieure in Paris.

Mr Perez’s paper, “On Sovacool’s et al. study on the differences in carbon emissions reduction between countries pursuing renewable electricity versus nuclear power”, offers a critical perspective on the models and statistical analysis in Sovacool et al.’s paper.

The exchange below begins with their response to Mr Perez’s paper, followed by Mr Perez’s response to theirs.

By sharing the exchange here, Profs Sovacool and Stirling hope to encourage collegiate debate and support the critical importance of independent research, an issue considered in their earlier blog, Nuclear vs renewable energy and the critical importance of independent research.

Thanks to Mr Perez for his original response and for participating in this exchange.

Response to Daniel Perez’s Matters Arising

Benjamin K. Sovacool, Patrick Schmid, Andy Stirling, Goetz Walter & Gordon MacKerron

We thank Mr. Perez for engaging with our article. But we do not believe any of the concerns he raises are novel, nor do they hit the point on many aspects.

First of all, if he had read thoroughly, Mr. Perez might have noticed that we never talk about greenhouse gases in their full generality. To say we use GDP to “confound something” is a serious misrepresentation. We actually use GDP as a “control”.

Mr. Perez also seems to misunderstand us when he says: “despite the fact that decarbonated energy sources are not good predictors of GHG emissions” and “Fossil fuels as the real predictor and the ‘crowding out’ hypothesis”. Just as we never address GHG in their entirety, so we never claim that clean energy sources are a “predictor” of CO2 emissions. Ours is not a predictive but a correlative study. The reader might wonder why Mr. Perez puts so much emphasis on such obvious red herrings.

With respect to Mr. Perez’s point that the crowding out hypothesis is not surprising at all (since “renewables and nuclear power are structurally incompatible, so there is an anti-correlation between them”), we would note that he is directly endorsing (without duly emphasized acknowledgement) one of the most crucial findings of our paper.

And to be clear, we do not emphasize mere theoretical properties of random variables, which need opaque assumptions and are devoid of empirical data. That Mr. Perez states on such an ostensibly precise theoretical basis, “little to no surprise”, detracts from his idiom of precision. It raises the question: is it really no surprise or is there something to be investigated? It is the empirical findings we obtain – together with our qualifications – that strike us without doubt as being something to be investigated.

With respect to the important role played by hydroelectricity in the earlier period we examine, Perez again deploys a misleading polemic. Why should this unavoidable empirical reality be treated as if it were somehow a deficiency of our study? The relative importance of hydroelectricity in the early stages of renewables uptake is simply a reflection of the established historical trajectory in renewable development. In later stages, the effects we document in this regard become much more influenced by wind and solar. With all these sources anyhow counting as ‘renewable’, why would this count as a ‘flaw’?

With respect to timeframes, the question raised is (as we acknowledge) about nuanced differences of approach, not about “mistakes”. We are ourselves clear that there are multiple things to consider on this issue. This is exactly why we have chosen a robust data-averaging approach with several triangulation procedures. Together with our openness to the many conditionalities, this is the way to properly address uncertainties and ambiguities that are unavoidable in this kind of research. If Mr. Perez really wants to claim that there exists just one single definitive approach to this complexity, then he is arguably reproducing the kind of technocratic authoritarianism that has led for so long to the neglect of the kinds of questions we are raising.

As to Mr. Perez’s argument that “you do not have stationarity” and “you need stationarity for time series analysis”, we agree. But this is again a strangely misleading point. It is this need for stationarity in time series approaches that constitutes a key reason why we do not adopt such an approach.

We choose the stated time lag without involvement of a second category of assumptions that would not compellingly fit the purpose of an initial pioneering study. As we explain, the indicated time lag was chosen to optimally use the data set. Otherwise, we might have disregarded precious data points which would then in turn have raised the objection that we intentionally and deliberately used only some parts of the data, but not all of it, thereby wasting parts of the available data set. Crucial here is that we still have to consider a directional effect since power plants typically involve a lot of down-stream processes (such as maintenance) that stretch over time but need to be clearly attributed.

To Mr. Perez’s statement that “distribution of the residuals is not exactly normal”, we respond that any expert should be aware that any test of assumptions only gives hints for acceptance within defined error intervals. When invoking statistical pretests, all issues surrounding statistical tests, like “false positives”, power and efficiency of tests, have to be mentioned.

On a further technical point, Mr. Perez refers to “confounding variable with a power law and not just a linear model”. But there is no part of our work that relies on identifying a “best fit” curve. This would be difficult to motivate from a theoretical perspective – for example: why squared or the root. We are not aiming to build a “causal model”. We never claim to do so. Why does Perez imply otherwise?

In similar vein, Mr. Perez makes statements about the “predictive power” of our model that compound a further diversion with a misquote. What we actually said was “Crucially, renewable energy strategies are, to an evidently noteworthy degree, associated with lower levels of national carbon emissions”. This is not an attribution of causality. Whether one might have chosen a different analytical approach is a moot point that we acknowledge. But all methods hold pros and cons. Oddly for someone so focused on precision, Mr. Perez does not demonstrate that alternatives do not display their own more serious specific disadvantages.

With regard to Mr. Perez’s statement that the original data would yield a “bias”: we are not adjusting/distorting the original data, we simply analyze it as it is. His slurs about “the poor study of the data set” and “suboptimal modeling” can be qualified in light of our response to his other misleading language addressed above.

Multivariate linear regression is actually quite robust with respect to its assumptions. What is most crucial here is that it was not our aim in this pioneering study to test any particular model versus another as a candidate for an “optimal fit”. What we are instead doing is investigating prevailing understandings of the form “the more … energy, the less emissions”. So our methodology stands in this regard. Given that the associated issues are so prominent and so high stakes, it is remarkable that our research question has not been posed before.

In conclusion, we would urge that the reader cut through the many technicalities to see the underlying picture. Our study asks a very basic empirical question. We do not claim to have answered this definitively, but merely pointed to the significant implications and the grounds for further research. Our findings remain valid and salient.

Response to Sovacool et al.’s response

Daniel Perez

Benjamin K Sovacool et al.: We thank Mr Perez for engaging with our article. But we do not believe any of the concerns he raises are novel, nor do they hit the point on many aspects.

First of all, if he had read thoroughly, Mr Perez might have noticed that we never talk about greenhouse gases in their full generality. To say we use GDP to “confound something” is a serious misrepresentation. We actually use GDP as a “control”.

Mr Perez also seems to misunderstand us when he says: “despite the fact that decarbonated energy sources are not good predictors of GHG emissions” and “Fossil fuels as the real predictor and the ‘crowding out’ hypothesis”. Just as we never address GHG in their entirety, so we never claim that clean energy sources are a “predictor” of CO2 emissions. Ours is not a predictive but a correlative study. The reader might wonder why Mr Perez puts so much emphasis on such obvious red herrings.

Daniel Perez: The terms “confounding variable” and “predictors” are widespread and well-known concepts in statistics. Both of these terms are standard terminology in the context of regression analysis, as can be corroborated by looking at any statistics textbook. It’s in the statistical sense that the correlative study made in Sovacool et al.’s paper explicitly uses both nuclear and renewables as predictors of GHG emissions.

Sovacool et al.: With respect to Mr Perez’s point that the crowding out hypothesis is not surprising at all (since “renewables and nuclear power are structurally incompatible, so there is an anti-correlation between them”), we would note that he is directly endorsing (without duly emphasized acknowledgement) one of the most crucial findings of our paper.

Perez: This is a misquote; we were simply explaining Sovacool et al.’s reasoning. The full statement reads as follows: “Moreover, the reasoning behind the “crowding out” hypothesis is flawed. Indeed, the authors of [16] motivate the proposal of the “crowding out” hypothesis as follows: Intermittent renewables require a decentralized electrical infrastructure as soon as they occupy a significant fraction of the electricity produced. By contrast, the optimal electrical infrastructure of non-intermittent power sources, such as fossil fuels, hydroelectricity and nuclear power is centralized [2]. The authors then suggest that, for these reasons, there should be an anticorrelation between R and N, which is the statement of the so-called “crowding out” hypothesis. They back this statement by verifying that R and N are indeed anticorrelated and use this to justify their statements.” It is clear that nowhere are we agreeing with their conclusions; we are merely explaining the reasoning Sovacool et al. offer for their “crowding out” hypothesis.

Sovacool et al.: And to be clear, we do not emphasize mere theoretical properties of random variables, which need opaque assumptions and are devoid of empirical data. That Mr Perez states on such an ostensibly precise theoretical basis, “little to no surprise”, detracts from his idiom of precision. It raises the question: is it really no surprise or is there something to be investigated? It is the empirical findings we obtain – together with our qualifications – that strike us without doubt as being something to be investigated.

Perez: Whether or not Sovacool et al. were aware that they were emphasizing a phenomenon that arises whenever fractions of the same whole are studied in a regression analysis is irrelevant to the demonstration that their “findings” are mere artefacts of this fact, as clearly demonstrated in our work.
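The “fractions of the same whole” effect that Mr Perez refers to can be illustrated with a minimal sketch on purely synthetic data. Everything in it is an assumption made for illustration: the three generation sources are drawn independently from the same exponential distribution, and the function names `pearson` and `simulate_shares` are hypothetical. It is not a reproduction of either paper's analysis; it only shows that shares of a common total can be negatively correlated even when the underlying quantities are independent.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_shares(n_countries=5000, seed=0):
    """Draw independent generation volumes, then compare raw vs share correlations."""
    rng = random.Random(seed)
    raw_r, raw_n, share_r, share_n = [], [], [], []
    for _ in range(n_countries):
        # Synthetic, mutually independent generation volumes (arbitrary units).
        ren = rng.expovariate(1.0)   # renewables
        nuc = rng.expovariate(1.0)   # nuclear
        fos = rng.expovariate(1.0)   # fossil fuels
        total = ren + nuc + fos
        raw_r.append(ren)
        raw_n.append(nuc)
        share_r.append(ren / total)  # R as a fraction of the whole
        share_n.append(nuc / total)  # N as a fraction of the whole
    return pearson(raw_r, raw_n), pearson(share_r, share_n)


corr_raw, corr_shares = simulate_shares()
print("correlation of raw volumes:", round(corr_raw, 3))
print("correlation of shares:     ", round(corr_shares, 3))
```

Under these assumptions the raw volumes are uncorrelated up to sampling noise, while the shares are clearly anticorrelated (the theoretical value for three identically distributed components is -0.5). Whether and how strongly this mechanism operates in the real data set is exactly the point in dispute between the two parties.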

Sovacool et al.: With respect to the important role played by hydroelectricity in the earlier period we examine, Perez again deploys a misleading polemic. Why should this unavoidable empirical reality be treated as if it were somehow a deficiency of our study? The relative importance of hydroelectricity in the early stages of renewables uptake is simply a reflection of the established historical trajectory in renewable development. In later stages, the effects we document in this regard become much more influenced by wind and solar. With all these sources anyhow counting as ‘renewable’, why would this count as a ‘flaw’?

Perez: That “the effects in this regard become much more influenced by wind and solar” remains to be shown, as hydroelectricity accounts for a much higher percentage of energy produced worldwide than both of these sources of energy combined, particularly so in both the timeframes considered by Sovacool et al. To extrapolate their findings to a regime where solar and wind power were to become dominant deserves at the very least a justification, which is not present in their paper. Let us stress that, although hydro, wind and solar share the characteristic of being renewable, the large uncontrolled variability of wind and solar production makes them very different from hydroelectricity in that respect.

Sovacool et al.: With respect to timeframes, the question raised is (as we acknowledge) about nuanced differences of approach, not about “mistakes”. We are ourselves clear that there are multiple things to consider on this issue. This is exactly why we have chosen a robust data-averaging approach with several triangulation procedures. Together with our openness to the many conditionalities, this is the way to properly address uncertainties and ambiguities that are unavoidable in this kind of research. If Mr Perez really wants to claim that there exists just one single definitive approach to this complexity, then he is arguably reproducing the kind of technocratic authoritarianism that has led for so long to the neglect of the kinds of questions we are raising.

As to Mr Perez’s argument that “you do not have stationarity” and “you need stationarity for time series analysis”, we agree. But this is again a strangely misleading point. It is this need for stationarity in time series approaches that constitutes a key reason why we do not adopt such an approach.

Here, the arguments that we should have used panel data and that our analysis is unduly time-averaged actually go together. While panel data analysis may be an alternative, we intentionally chose time averaging since this procedure enables more robust statements to be made in the context of random variables (the underlying modelling for statistics). Such approaches are often used as a means to average nuisance contributions in an environment with a presence of many influencing factors which clearly is our case at hand. We choose the stated time lag without involvement of a second category of assumptions that would not compellingly fit the purpose of an initial pioneering study. As we explain, the indicated time lag was chosen to optimally use the data set. Otherwise, we might have disregarded precious data points which would then in turn have raised the objection that we intentionally and deliberately used only some parts of the data, but not all of it, thereby wasting parts of the available data set. Crucial here is that we still have to consider a directional effect since power plants typically involve a lot of down-stream processes (such as maintenance) that stretch over time but need to be clearly attributed.

Perez: Non-stationarity is an important phenomenon in this particular timeframe, as many countries underwent rapid development in the 90s and the 00s. It was never claimed in our paper that “time series require stationarity”, which is a false statement. Time series analysis can be performed even in a non-stationary setting, for instance by using a Moving Average (MA) or Moving Average Exogenous (MAX) model, which was not the case in Sovacool et al.’s work. Other standard tools in this context are Autoregressive processes (ARPs) or Autoregressive exogenous processes (ARXs). All of these tools are well suited to studying whether the claims of Sovacool et al. regarding the nature of the time lag are justified or not. However, non-stationarity in particular implies that considering only two time steps with an a priori arbitrary lag is an incorrect approach from a statistical point of view. The averaging chosen by the authors is not justified from a time-series analysis perspective and does not exploit the data in any sense of optimality (from a statistical standpoint). That this procedure is “robust” remains to be shown by, for instance, demonstrating its stability, i.e. whether a change in the time step and number of timeframes considered changes the conclusions of the regression analysis or not. This was never made explicit by the authors in their paper. Furthermore, whether their justification for the lag is correct or not would also require a finer time series analysis. Regardless, this was not the main argument provided in our paper, although we point out that more adequate tools exist for treating the data. As we stated in our paper, even taking the averaged-out data from Sovacool et al., there are many other problems regarding their analysis, which are not related to these time series considerations.
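The stability check described above (does the conclusion change when the averaging window or the lag changes?) could be sketched roughly as follows. This is only an illustration under loud assumptions: the panel is synthetic, the coefficients (a slope of -2, a common upward drift) are arbitrary, and `make_panel`, `windowed_slope` and `ols_slope` are hypothetical helper names, not anything taken from either paper.

```python
import random


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def make_panel(n_countries=100, n_years=20, seed=1):
    """Synthetic panel: one list of yearly (share, emissions) pairs per country."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_countries):
        base = rng.uniform(0.0, 1.0)  # persistent country-level share
        series = []
        for t in range(n_years):
            share = base + 0.01 * t + rng.gauss(0.0, 0.02)  # drifting predictor
            emissions = 10.0 - 2.0 * share + 0.1 * t + rng.gauss(0.0, 0.2)
            series.append((share, emissions))
        panel.append(series)
    return panel


def windowed_slope(panel, start, width, lag):
    """Cross-country slope: emissions averaged over a lagged window vs share
    averaged over the window [start, start + width)."""
    xs, ys = [], []
    for series in panel:
        xs.append(sum(s for s, _ in series[start:start + width]) / width)
        ys.append(sum(e for _, e in series[start + lag:start + lag + width]) / width)
    return ols_slope(xs, ys)


panel = make_panel()
# Repeat the same regression for several window widths and lags.
slopes = {(width, lag): windowed_slope(panel, 0, width, lag)
          for width in (3, 5, 7) for lag in (0, 3, 5, 7)}
for key, slope in sorted(slopes.items()):
    print("width, lag =", key, "slope =", round(slope, 3))
```

On this synthetic panel the slope is stable by construction, so the output says nothing about the real data. The point is the shape of the check: if the estimated coefficient changed sign or size markedly across reasonable choices of window and lag, the conclusion would be sensitive to those choices.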

Sovacool et al.: To Mr Perez’s statement that “distribution of the residuals is not exactly normal”, we respond that any expert should be aware that any test of assumptions only gives hints for acceptance within defined error intervals. When invoking statistical pretests, all issues surrounding statistical tests, like “false positives”, power and efficiency of tests, have to be mentioned.

Perez: The only time we make this remark is when we are reporting the t-statistics, standard errors and p-values of our regressions. We remark that looking at p-values is irrelevant here, and that the standard errors and t-statistics are thus the relevant metrics to look at.

Sovacool et al.: On a further technical point, Mr Perez refers to “confounding variable with a power law and not just a linear model”. But there is no part of our work that relies on identifying a “best fit” curve. This would be difficult to motivate from a theoretical perspective – for example: why squared or the root.

Perez: The full quote is “despite the fact that going forwards we should consider accounting for the confounding variable with a power law and not just a linear model.” As is clear from inspection of the data in a log-log chart, the relationship between GDP and CO2eq emissions is better described by a power law than by a linear model, as was shown in our analysis. That this should be the case is not necessarily a surprise, as the data is clearly heteroskedastic and spans many orders of magnitude. This is often a sign that the underlying distribution should be Pareto, hence our inspection of whether this hypothesis holds or not.
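As a hedged illustration of what fitting a power law on a log-log chart means in practice: a relationship of the form y = a·x^b becomes the straight line log y = log a + b·log x after taking logarithms, so ordinary least squares on the logged values estimates the exponent b. The sketch below uses synthetic data only; the prefactor 3.0, the exponent 0.8, the noise level and the function names are arbitrary assumptions for illustration and are not estimates from the GDP and CO2eq data discussed here.

```python
import math
import random


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    b = sxy / sxx                 # exponent = slope in log-log space
    a = math.exp(my - b * mx)     # prefactor = exp(intercept)
    return a, b


def synthetic_power_law(n=500, a=3.0, b=0.8, sigma=0.3, seed=2):
    """Synthetic data spanning four orders of magnitude with multiplicative noise."""
    rng = random.Random(seed)
    xs = [10 ** rng.uniform(2.0, 6.0) for _ in range(n)]
    ys = [a * x ** b * math.exp(rng.gauss(0.0, sigma)) for x in xs]
    return xs, ys


xs, ys = synthetic_power_law()
a_hat, b_hat = fit_power_law(xs, ys)
print("estimated prefactor:", round(a_hat, 3))
print("estimated exponent: ", round(b_hat, 3))
```

Multiplicative noise of this kind has a constant spread in log space but a spread that grows with x in the original units, which is one common way heteroskedastic data spanning many orders of magnitude arises. Whether the real data are better described this way is the empirical question Mr Perez raises.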

Sovacool et al.: We are not aiming to build a “causal model”. We never claim to do so. Why does Perez imply otherwise?

Perez: On this point, let us quote the authors on the conclusions of their paper: “When taken together with the finding that renewables seem significantly more positive for carbon abatement, important adverse implications arise for nuclear power. As the evidently less generally favourable of the two broad carbon emissions abatement strategies, a tendency of nuclear not to coexist well with its renewable alternative, does (all else being equal) raise doubts about the opportunity costs of investments in nuclear power rather than renewable energy. The direction of cost and learning trends discussed here, intensify this point. Given the current state of climate debates internationally and in many countries, it is troubling that nuclear and renewable energy pathways appear (both historically and, here, empirically) to display such mutual tension. It appears that countries planning large-scale investments in new nuclear power are risking suppression of greater climate benefits from alternative renewable energy investments. That the converse may also be true (with renewables tending to suppress nuclear investments) is evidently less important, because it is renewable strategies that are on balance evidently more effective at carbon emissions mitigation.” If the authors did not seek to exploit a causal model in which the link between the variables studied was properly understood, drawing such conclusions from a simple correlation study exhibiting the several statistical caveats mentioned in our paper is at best unjustified. Alternatively, if the objective of the paper was to make policy recommendations, then the study of a causal model becomes necessary (albeit, not necessarily sufficient).

Sovacool et al.: In similar vein, Mr Perez makes statements about the “predictive power” of our model that compound a further diversion with a misquote. What we actually said was “Crucially, renewable energy strategies are, to an evidently noteworthy degree, associated with lower levels of national carbon emissions”. This is not an attribution of causality.

Perez: Once again, “predictive power” is a common expression in the statistical jargon typically used in regression analysis.

Sovacool et al.: Whether one might have chosen a different analytical approach is a moot point that we acknowledge. But all methods hold pros and cons. Oddly for someone so focused on precision, Mr Perez does not demonstrate that alternatives do not display their own more serious specific disadvantages.

Perez: The matter is not whether a particular method holds pros or cons, but rather to point out that there are many methodological mistakes in the analysis in Sovacool et al.’s paper. Examples include computing correlations over fractions of the same whole, and disregarding that data concerning nuclear power necessarily has a considerably smaller variance than that of renewables, by circumstance (lots of countries have little to no nuclear power, whereas there are very few countries with a large portion of nuclear in their electrical mix). It is not a matter of a pro or con but simply a methodological mistake. Finally, we are explicit in our paper in stating that our goal was to reproduce the study of Sovacool et al., not to do our own on the same subject.

Sovacool et al.: With regard to Mr Perez’s statement that the original data would yield a “bias”: we are not adjusting/distorting the original data, we simply analyse it as it is. His slurs about “the poor study of the data set” and “suboptimal modelling” can be qualified in light of our response to his other misleading language addressed above.

Perez: cf. our previous discussion on time series analysis considerations.

Sovacool et al.: Multivariate linear regression is actually quite robust with respect to its assumptions. What is most crucial here is that it was not our aim in this pioneering study to test any particular model versus another as a candidate for an “optimal fit”. What we are instead doing is investigating prevailing understandings of the form “the more … energy, the less emissions”.

Perez: Precisely, and what we show in our paper is that it is not possible to conclude, using the methodology of Sovacool et al., anything other than “fossil fuels emit CO2”.

Sovacool et al.: So our methodology stands in this regard. Given that the associated issues are so prominent and so high stakes, it is remarkable that our research question has not been posed before. In conclusion, we would urge that the reader cut through the many technicalities to see the underlying picture. Our study asks a very basic empirical question. We do not claim to have answered this definitively, but merely pointed to the significant implications and the grounds for further research. Our findings remain valid and salient.

Having considered the points Mr Perez raises, Prof. Sovacool, Prof. Stirling and their co-authors feel that these points have been adequately covered in their initial response.

Profs Sovacool and Stirling will share further reflections on the response to the paper and the challenge of maintaining open debate on energy in another blog post, to follow shortly on this site.
