Our latest EPSRC-funded research, carried out in CIED and led by SPRU, Sussex, has something new to say about the testing of statistical models. The paper, authored by Lee Stapleton, Steve Sorrell and Tim Schwanen, is available now in the journal Energy Economics; you can access the paper here.
The paper estimates the so-called ‘direct rebound effect’ associated with personal car travel in Great Britain. This effect relates to the increased driving that may occur when fuel efficiency improvements make car travel cheaper. Our results suggest that the direct rebound effect has been in the region of 20% in GB over the last 40 years. Put differently, a 1% fall in average fuel prices over this period led to a 0.2% increase in total distance travelled.
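To make that arithmetic concrete, here is a minimal sketch. It assumes a constant-elasticity relationship between fuel price and distance travelled, which is a simplification for illustration and not the specification of any model in the paper. The function name `distance_change_pct` is hypothetical, and the default elasticity of -0.2 is simply the 20% figure quoted above.

```python
def distance_change_pct(price_change_pct: float, elasticity: float = -0.2) -> float:
    """Percentage change in distance travelled for a given percentage
    change in fuel price, ASSUMING a constant price elasticity.

    The constant-elasticity form is an illustrative assumption, not the
    functional form estimated in the paper.
    """
    price_ratio = 1.0 + price_change_pct / 100.0
    if price_ratio <= 0:
        raise ValueError("price change must be greater than -100%")
    # distance_new / distance_old = (price_new / price_old) ** elasticity
    return (price_ratio ** elasticity - 1.0) * 100.0


# A 1% fall in fuel price gives an increase in distance of roughly 0.2%.
one_pct_fall = distance_change_pct(-1.0)
print(f"Change in distance for a 1% price fall: {one_pct_fall:.2f}%")
```

For small price changes the result is close to the elasticity multiplied by the price change; for larger changes the two drift apart, which is why the sketch uses the ratio form.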
To arrive at this conclusion we developed and estimated a total of 108 different regression models. Some of these models were quite fancy, some were quite simple and together they produced a range of parameter estimates in terms of rebound and other determinants of distance travelled. So, how did we choose between them?
To do this, we quantified the robustness (strength or quality) of each model: the extent to which it adhered to governing rules and assumptions about structure, stability, parsimony and the behaviour of parameter estimates and non-fitted data (residuals). Unfortunately, most applied research which uses these kinds of models pays insufficient attention to these rules and assumptions. Many of them are well documented in textbooks and routinely covered in courses on research methods, so why are they so often ignored? Others are confined to the more technical, specialist literature, which makes their limited diffusion in applied research easier to understand.
Capturing multi-dimensional concepts such as robustness can be achieved by constructing so-called composite indicators. In other words, uni-dimensional constituents of robustness can be combined mathematically into weighted, aggregated, multi-dimensional representations of the concept. And that’s what we did. Specifically, we assessed 96 of our 108 models in terms of 13 ‘quality indicators’ to create aggregate, composite measures of robustness for each model. We took two approaches based on different weightings. The first (unequally weighted) is based on our judgement of the ‘relative importance’ of each robustness constituent. The second (equally weighted) does not differentiate in terms of importance. We assessed the remaining 12 models in terms of a reduced set of 6 quality indicators because the robustness of these models was trickier to assess. Here, again, we developed unequally and equally weighted robustness composites.
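The weighting idea can be sketched in a few lines. This is an illustration only: the indicator names, the scores and the weights below are invented, each indicator is assumed to be scored between 0 and 1, and the composite is assumed to be a simple weighted average. The paper's actual 13 indicators, their scoring and their weights are not reproduced here, and `composite_robustness` is a hypothetical helper.

```python
from typing import Dict, Optional


def composite_robustness(scores: Dict[str, float],
                         weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of indicator scores (each assumed to lie in [0, 1]).

    With weights=None every indicator counts equally (the 'equally
    weighted' composite). Passing weights gives the 'unequally weighted'
    composite. Weights are normalised, so they need not sum to 1.
    """
    if weights is None:
        weights = {name: 1.0 for name in scores}
    if set(weights) != set(scores):
        raise ValueError("weights and scores must cover the same indicators")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Invented scores for one hypothetical model (1.0 = passes, 0.0 = fails).
scores = {"structure": 1.0, "stability": 0.0, "parsimony": 1.0, "residuals": 1.0}

# Invented judgement of relative importance: stability matters most here.
weights = {"structure": 2.0, "stability": 4.0, "parsimony": 1.0, "residuals": 1.0}

equal = composite_robustness(scores)
unequal = composite_robustness(scores, weights)
print(f"Equally weighted: {equal:.2f}, unequally weighted: {unequal:.2f}")
```

In this toy case the model fails the indicator given the most weight, so its unequally weighted score is lower than its equally weighted one. That is the kind of difference the two approaches are designed to expose.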
Doing this allowed us to explore the relationship between model robustness and parameter estimates. We haven’t seen this done before with multi-dimensional indicators: previous work has approached the question using relatively narrow operationalisations of robustness.
So, the $64,000 question is: do bad methods lead to biased results? And conversely, do good methods lead to less biased results? We can provide answers using our robustness indicators. If there are systematic relationships between parameter estimates and robustness, we should be concerned that bad methods do indeed lead to ‘wrong’ answers. But if there is no relationship between parameter estimates and robustness, we can afford to be less concerned about these bad methods.
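One simple way to look for such a systematic relationship is to correlate each model's parameter estimate with its robustness score. The sketch below uses a plain Pearson correlation as an assumed stand-in; it is not necessarily the test used in the paper, and the numbers are invented purely to show the mechanics. The helper `pearson_r` is written out by hand so the example has no dependencies.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Invented data: one robustness score and one rebound estimate per model.
robustness = [0.2, 0.4, 0.6, 0.8]
rebound_estimates = [0.35, 0.30, 0.22, 0.20]

r = pearson_r(robustness, rebound_estimates)
print(f"Correlation between robustness and estimate: {r:.2f}")
```

A correlation near zero would suggest that estimates do not depend much on robustness; a strong positive or negative value would suggest that less robust models give systematically different answers. With real data a rank-based measure or a regression might be preferable, and the sample size would need to be taken into account before reading much into the number.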
In this latest work, we found some evidence to suggest that bad methods lead to biased results, but only some: it depends on which model parameter estimates you look at. More studies which apply comparable robustness indicators to other models are needed to get a better handle on this. Regardless of the answer, it is a good idea to choose and use statistical models which are multi-dimensionally robust. Hence, we are currently applying the indicators developed in this work to models being developed in other projects.
Lee Stapleton was a Research Fellow at CIED until 2016.