The latest scandal in economics? After considering 67 recent empirical papers published in 13 prestigious journals, Andrew C. Chang and Phillip Li found that "[b]ecause we are able to replicate less than half of the papers in our sample even with help from the authors [more precisely: 1/3 totally independently, less than 1/2 with the authors' help], we assert that economics research is usually not replicable." (PDF)
Sounds bad, doesn't it? Don't worry about it, because it gets worse. Like, a lot worse.
William G. Dewald from the Bureau of Economic and Business Affairs, and Jerry G. Thursby and Richard G. Anderson, both from the Ohio State University -- DTA from here on -- writing for the American Economic Review, reported strikingly similar results … in September 1986!
Thirty years ago, in their paper ("Replication in Empirical Economics: The Journal of Money, Credit, and Banking Project", American Economic Review, vol. 76, no. 4, p. 587), DTA proposed remedial measures:
"On the basis of our findings, we recommend that journals require the submission of programs and data at the time empirical papers are submitted. The description of sources, data transformations, and economic estimators should be so exact that another researcher could replicate the study and, it goes without saying, obtain the same results".As the Reinhart-Rogoff fiasco indicates -- a couple of years ago -- those correctives are not universally applied, although progress has been made as the Chang and Li paper shows. [*]
And yet, empirical economic research still isn't "replicable", even when procedures, software and data sets are made available for scrutiny.
It doesn't take a philosopher of science to understand why: a careful reading of the DTA recommendations (above), of the procedure Chang and Li followed, or of how Thomas Herndon found errors in the Reinhart-Rogoff spreadsheets is enough (note the words emphasized above).
What these researchers did was to verify, check, inspect, given a pre-existing data set, whether the protocols followed by the authors of the papers under scrutiny produced the results reported. They found the results reported did not correspond to the procedures followed.
That's an important, meritorious, and useful job; it requires honesty, patience, hard work, strength of character, an eye for detail, and a willingness to make enemies in high places. It contributes to the confidence placed in economic findings. All of that may be true, but it's not what scientists call "replication".
Replication, for scientists, is a byword for repetition: you repeat someone else's experiment from scratch, under identical circumstances. And you generate a new data set, which you analyse. When your conclusions parallel those of the original work, you say you replicated their results. Period.
Clearly, that is way beyond what most of economics (and the social sciences, or even astronomy) can normally achieve. It's not the economists' fault, mind you. It's just how things are.
Valuable as the work of DTA, Chang and Li, and Herndon is, it is more akin to what teachers do with their students' homework or what proof-readers do with texts before publication. They verify, check, inspect, scrutinise for errors (whether involuntary or deliberate), and correct them. A teacher doesn't say he/she did not replicate the student's results. He/she says the student failed the test, got an "F".
The solution to this? To get over it. Paraphrasing "Dirty" Harry Callahan: an economist's got to know his/her limitations. To call that "replication" -- as economists are wont to do -- either demonstrates an appalling ignorance of science or is childish false advertising. Call it "science envy".
If one wants to blame economists for something, it's for this. Sorry.
[*] Thomas Herndon (at the time a graduate student at the University of Massachusetts Amherst) discovered a series of formula errors in the Excel spreadsheets used by Carmen Reinhart and Kenneth Rogoff to produce their notorious work on public debt. Reinhart and Rogoff did not make their data (or their spreadsheets) publicly available. Thomas Piketty, on the other hand, may have also made mistakes, but he, unlike Reinhart and Rogoff, went to great lengths to make his data and spreadsheets available.