Monday, November 16, 2009

explanatory power of working examples

The NLP algorithms I've been studying since I started back at school aren't particularly complex. But they're often described with really dense notation: maybe your field does this too! Here's a description, for example, of how to calculate an "outside probability" -- that's the joint probability that a particular nonterminal symbol covers a certain span of the text, together with all the words outside that span. This is from Fei Xia's lecture slides (which I think are pretty good).



Maybe what I need is more practice picking apart dense notation, but in all honesty I have trouble keeping track of what the different letters mean. Maybe a nice dynamic programming implementation springs to mind for people smarter than me, but I have to stare at it (and the surrounding slides) for quite a while!

I think I'd be making a pretty good contribution to the world if I took the algorithms I'm learning and wrote down the most straightforward pseudocode and prose versions I can, with a running Python implementation and descriptive variable names. Surely many people out there would find code easier to digest!

Somebody's already done precisely this on the Wikipedia page for the Viterbi algorithm, and I'm very grateful to that somebody.

Wednesday, November 04, 2009

Lenovo: you have to buy Windows, As Per Policy

I got a pretty quick response from the Lenovo sales people -- complete with verbiage at the bottom emphasizing how the email was confidential and legally privileged, and any retransmission, dissemination, or other public use is strictly prohibited. They should have put the EULA for the email at the top, before I scrolled down! I might not have agreed to read the email! Geez, or worse, what if somebody accidentally read it over my shoulder in a coffee shop!

Anyway, they said:
We do not have option to sell any unit without operating system as per policy.

So I guess I won't buy a ThinkPad. I'm just not willing to pay The Microsoft Tax when I'm not going to use Windows.

Python generator expressions

I just found out about this: Python has a really concise way to make new generators.

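It looks like a list comprehension, just with parentheses instead of square brackets (and when it's the only argument to a function call, you can drop the extra parentheses too). Before I knew about this feature, the code I was reading looked pretty mysterious.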
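Here's a minimal sketch of the difference, as far as I understand it. The variable names are just made up for illustration:

```python
# A list comprehension builds the whole list in memory right away.
squares_list = [n * n for n in range(5)]

# The same expression in parentheses makes a generator, which computes
# its values lazily, one at a time, as something iterates over it.
squares_gen = (n * n for n in range(5))

first = next(squares_gen)   # pulls just the first value
rest = list(squares_gen)    # consumes whatever is left

# When a generator expression is the only argument to a function call,
# the extra parentheses can be dropped.
total = sum(n * n for n in range(5))
```

Note that a generator can only be consumed once: after the `list(...)` call above, `squares_gen` is exhausted.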

There are some nice examples of cases where you might want to use this sort of thing in the relevant PEP (PEP 289). Especially pleasant uses from the PEP include passing a generator to the dictionary constructor, like so:
d = dict((k, func(k)) for k in keylist)

... and, useful for me personally, getting the set of words in a file, all in one go:
s = set(word for line in f for word in line.split())
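To try both of those out without any setup, here's a small runnable sketch. The key list, the choice of `len` as `func`, and the in-memory "file" are all stand-ins I made up for illustration; they aren't from the PEP:

```python
import io

keylist = ["apple", "kiwi", "fig"]   # hypothetical keys
func = len                           # hypothetical stand-in for func

# Build a dict from (key, value) pairs produced lazily by a generator.
d = dict((k, func(k)) for k in keylist)

# io.StringIO stands in for an open text file; iterating over it
# yields one line at a time, just like a real file object.
f = io.StringIO("the cat sat\non the mat\n")

# Nested loops in one generator expression: for each line, for each word.
s = set(word for line in f for word in line.split())
```

The second one reads the file line by line, so the full list of words never has to be built before it goes into the set.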

Good to know!