Wednesday, February 02, 2011

empiricism, faith, computational linguistics

Mike sent me a fantastic piece by Ted Pedersen, calling for NLP/CL researchers to care more about having reproducible results and maintainable software.

Empiricism Is Not a Matter of Faith.

It's sad that this is a problem; it should be easy to get other researchers' software up and running, reproduce the results reported in papers, and plug things into other things -- but I think we're moving in that direction. At least one CL conference, CICLing, explicitly calls for open software and reproducible results. Which is pretty cool.

2 comments:

edde addad said...

Thanks for the post, I ended up citing Pedersen in a publication.

I think most people don't distribute their software because support is a time sink, early versions can contain embarrassing shortcuts, and (sometimes) funders and university intellectual property offices disallow it without an extensive review. A couple of PIs I know have released resources only after they've gotten as many papers as they could out of them.

Still, it's a good idea, especially if funders encourage it.

Unknown said...

Glad to help :)

After The Revolution, we'll all be able to move faster by sharing code and data and ideas... one hopes.