So far, I've got:
- calculating the entropy of a discrete random variable
- a cute implementation of finite-state automata with matrix multiplication
- calculating the probability of a Markov process going to a particular state, again with matrix multiplication
- a simple CYK-style chart parser for probabilistic grammars (computing inside probabilities, outside probabilities, and the most probable parse)
- a parse evaluator that gives precision and recall for parse trees
- probabilistic part-of-speech taggers that take bigrams into account: one that tries every combination of tags for the words, and one that uses the Viterbi algorithm
- some pretty clean code for hidden Markov models in general
I've already checked these in over on narorumo. They're all in Python, but some depend on nltk or numpy.
They'll get cleaner and better documented over the next week or so. I hope these are helpful to somebody!
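
For a flavor of two of the simpler items on the list, here is a minimal sketch of the entropy calculation and of the Markov-process probability done with matrix multiplication. It is an illustration only, not the code checked into narorumo: the function names `entropy` and `state_distribution`, the choice of bits as the unit, and the row-vector convention for the transition matrix are all assumptions made for this example.

```python
import math

import numpy as np


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution.

    `probs` is an iterable of probabilities that sum to 1.
    Zero-probability outcomes contribute nothing, so they are skipped.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def state_distribution(transition, start, steps):
    """Distribution over states after `steps` transitions of a Markov chain.

    Assumed convention: transition[i][j] = P(next state = j | current state = i),
    and `start` is the initial distribution as a row vector.
    """
    transition = np.asarray(transition, dtype=float)
    start = np.asarray(start, dtype=float)
    # Raising the matrix to the n-th power composes n single-step transitions.
    return start @ np.linalg.matrix_power(transition, steps)


if __name__ == "__main__":
    print(entropy([0.5, 0.5]))  # a fair coin: 1.0 bit

    # Hypothetical two-state chain, starting in state 0 with certainty.
    weather = [[0.9, 0.1],
               [0.5, 0.5]]
    print(state_distribution(weather, [1.0, 0.0], steps=2))
```

The probability of ending up in one particular state is just the corresponding entry of the returned vector, for example `state_distribution(weather, [1.0, 0.0], 2)[1]` for state 1.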