Wednesday, February 02, 2011

empiricism, faith, computational linguistics

Mike sent me a fantastic piece by Ted Pedersen, calling for NLP/CL researchers to care more about having reproducible results and maintainable software.

Empiricism Is Not a Matter of Faith.

It's sad that this is a problem; it should be easy to get other researchers' software up and running, reproduce the results reported in papers, and plug things into other things -- but I think we're moving in that direction. At least one CL conference, CICLing, explicitly calls for open software and reproducible results. Which is pretty cool.


edde addad said...

Thanks for the post, I ended up citing Pedersen in a publication.

I think most people don't distribute their software because support is a time sink, early versions can contain embarrassing shortcuts, and (sometimes) funders and university intellectual property offices disallow it without an extensive review. A couple of PIs I know have released resources only after they've gotten as many papers as they could out of them.

Still, it's a good idea, especially if funders encourage it.

Alex Rudnick said...

Glad to help :)

After The Revolution, we'll all be able to move faster by sharing code and data and ideas... one hopes.