Sunday, June 23, 2013

NAACL 2013 review

Just recently, I was in Atlanta for NAACL. So much fun! The hallway track is always the best -- I saw a bunch of friends from the NLP world (especially a lot of Googlers) and met a bunch of new people! I also managed to be present while Ray Mooney, David Forsyth, and some other professors disagreed animatedly about internal representations of meaning and about the extent to which you need to take the intentional stance toward other people.

Lots of really interesting papers this time around. There is of course Hal Daumé's expert opinion about the interesting papers at the main conference -- I saw a lot of those same talks, having mostly been hanging out at the machine translation and syntax/parsing tracks. On a personal note, it's exciting to see people I know and have worked with getting mentions on Hal's blog! (so, congratulations Greg Durrett and John DeNero and Juri Ganitkevitch!)

Additionally, here's what I thought was cool:
  • Training Parsers on Incompatible Treebanks by Richard Johansson. You want to build a parser for your language, and you've got a treebank. No, wait -- you've got two treebanks. Even better, right? But what if those two treebanks use entirely different annotation schemes? ...
  • In the invited talk on Wednesday, Kathy McKeown talked about, among other things, the idea that as NLP people we can provide evidence for or against ideas in comparative literature or literary theory, in collaboration with literature folks -- "well, the theory is that narrative works like this -- let's check!"
  • At *Sem, but also in the main conference, people are talking about using richer, more structured semantic models in our applications again. The really major change in the field in the early 1990s was to not do this -- but now we've got bigger computers and more data, and as a community we know a lot more about stats! Kevin Knight and his group are launching their Abstract Meaning Representation project ("It's like a treebank, but for semantics.") -- maybe it'll work this time!
  • Also at *Sem, Yoav Goldberg talked about the unreasonably enormous Syntactic Ngrams dataset -- it's basically chunks of parse trees from the English part of the Google Books corpus, indexed by time. That's going to be super useful. (I've sketched out how you might read it just after this list.)
  • I popped in to some of the Computational Linguistics for Literature talks -- Mark Riedl's invited talk about programmatically generating stories for games (slides) was especially good!
  • SemEval! There were fourteen different tasks -- lots of different aspects of understanding text! And people used all these wildly different techniques to tackle them. An introductory talk about a task followed by a single presentation about one system for that task isn't always enough to really understand the problem, though...
  • I think my presentation went pretty well! People I've been citing for a while were in the audience, and they seemed engaged and asked good questions! (slides, paper)
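
A quick aside about that Syntactic Ngrams dataset, since I keep telling people about it: here's roughly how I'd read one of its files in Python. This is just a sketch from my memory of the release's format -- tab-separated fields, with each token written as word/POS/dep-label/head-index, followed by year,count pairs -- and the parse_line helper and Token tuple are my own invention, so check the dataset's README before relying on any of this.

    from collections import namedtuple

    # One token of a syntactic ngram: surface word, POS tag, dependency
    # label, and the index of its head within the ngram.
    Token = namedtuple("Token", "word pos label head")

    def parse_line(line):
        """Parse one (tab-separated) line of a Syntactic Ngrams file."""
        fields = line.rstrip("\n").split("\t")
        head_word, ngram, total = fields[0], fields[1], int(fields[2])
        # Each space-separated token looks like word/POS/label/head-index;
        # rsplit guards against slashes inside the word itself.
        tokens = []
        for tok in ngram.split(" "):
            word, pos, label, head = tok.rsplit("/", 3)
            tokens.append(Token(word, pos, label, int(head)))
        # The remaining fields are "year,count" pairs -- the counts over time.
        counts_by_year = {}
        for pair in fields[3:]:
            year, count = pair.split(",")
            counts_by_year[int(year)] = int(count)
        return head_word, tokens, total, counts_by_year
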
Alright! So now, full of encouragement and ideas -- back to work.