Wednesday, November 15, 2006

The Enron Email Dataset!

So they might not have really been the smartest guys in the room... but one really good thing to come out of Enron is all those emails. 2.6 gigabytes' worth, for your perusal or your data-mining, social-network-mapping, and language-modeling pleasure. Thanks, guys!

The corpus is here!
(thanks, CMU!)

Andrew said...

Hi Alex,

If you actually end up doing anything with the Enron corpus, or want to ask others what they're doing with it, there's an Enron email corpus mailing list I set up a while back that you can subscribe to.

It's pretty quiet right now, but I'd love to hear from people who are using the Enron corpus for research or even just for fun!

Have you got any plans for how you might use the dataset yet?