This is a pretty exciting time in the social sciences, I think. I (and others – um, Drek? Maybe Jesse?) have often compared this point in the development of social science with the point when astronomers acquired telescopes. We’ve finally got computers – computers that let you do crazy things like run OLS and ordered logit regressions on 43,000 people in 32 countries in about half an hour, while automatically saving the results to files (thanks, R!). I knew that computers were a big deal for us, but I’ve also started to realize just how recent they are – one of my stats professors this spring had used punchcards, for God’s sake! And he still knows how to program in Fortran 66 (Beezy, read as: the programming equivalent of speaking Sanskrit).
Now that our scales are tipping a bit more from art toward science, it seems like replication should be (and is) becoming a more important part of our research. People definitely grouse about it, but I don’t think we go to quite the same lengths that those in, say, biology do to ensure that our results aren’t just flukes.
I’ve been trying to build a lot of replicability and transparency into my thesis; I’m doing all of the analysis programmatically in R, from data inputs and transformations to automatic outputs to .txt and .csv files. Ultimately, I’d like for someone to be able to download this from my (ahem) faculty web page, run it all through R, get the same results, and say, “Well, your results are technically correct, but you used (readily-available method from default package) instead of (insanely obscure method from undocumented package), which would have been a better fit.” Or perhaps: “Dude, you suck.”
That rather extreme degree of transparency seems somewhat necessary, even if only due to the complexity of most modern statistical software. I mean, it’s easy to write in a paper that you ran an ordered logit regression. But that doesn’t tell me about the defaults or algorithms that your software used; and it especially doesn’t demonstrate that you even knew which defaults and algorithms were used. And that’s okay – I sure as hell don’t know what’s going on in R’s multilevel package half the time – but it’s something that should be made apparent. Unfortunately, our current review system isn’t well-designed for such things – you send in your article, and that’s that. Nobody is required or even really allowed to test your analyses to see if they’re correct. That leaves us with a big honkin’ blind spot to both error and fraud.
Actually, we’re double-exposed to fraud because of the nature of our subjects. Yes, people can falsify their stats (or, I should add, just screw them up). But our data are a challenge, too. It’s not like physics or astronomy – people are far more variable than atoms, and not nearly as accessible as the stars. How many qualitative researchers have tweaked their participants’ words to enhance their impact? How many survey researchers have quietly fiddled with their numbers to nudge a p value down a tenth or so? It’s all too easy, and the pressure to publish is great. Yeah, we have our professional ethics and all, and I do believe that most of us value that identity enough to stay honest – but that identity doesn’t mean much if you can’t get tenure.