Monday, October 12, 2009

normal distributions and R

When I'm using R to do statistical things (such as homework), I feel somewhat torn -- it's got so many nice functions that come built in, but the language itself is slightly clunky, and integrating code that I've written in R with bigger projects seems like it would be kind of a pain. That's a general problem with picking any special-purpose language, though -- I might make similar complaints about Matlab/Octave or even Prolog...

I note, though, that I haven't jumped ship to NumPy yet.

pnorm and qnorm

I just wanted to mention these fantastically easy-to-use functions that come built right in: pnorm and qnorm.

pnorm is what you use if you have a z-score and you want the probability that a value drawn from the distribution would come out less than that score. This is equivalent to looking up probability values in the "z" tables in the back of your stats book. pnorm(0) gives you 0.5, since half of the standard normal distribution lies below 0.
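
Here's a quick sketch of what that looks like at the R prompt. The z-scores are just ones I picked for illustration, and the values in the comments are rounded:

```r
# P(Z < z) for a standard normal distribution (mean 0, sd 1)
p_zero  <- pnorm(0)      # 0.5
p_upper <- pnorm(1.96)   # about 0.975
p_lower <- pnorm(-1)     # about 0.1587

# The probability of landing ABOVE a z-score is the complement
p_tail <- 1 - pnorm(1.96)  # about 0.025

print(c(p_zero, p_upper, p_lower, p_tail))
```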

qnorm does the inverse -- you give it a probability and it gives you back the z-score below which that much of the probability mass lies. So if you give it 0.5, it gives you back 0.
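
A small sketch of qnorm going the other direction, plus a round trip through both functions. The probabilities and the variable names are my own picks for illustration, and the commented values are rounded:

```r
# z-score below which a given share of the probability mass lies
z_median <- qnorm(0.5)    # 0
z_975    <- qnorm(0.975)  # about 1.96

# qnorm undoes pnorm, so a round trip should hand back the original z-score
z <- 1.5
z_back <- qnorm(pnorm(z))  # 1.5, up to floating-point error

print(c(z_median, z_975, z_back))
```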

Both of these functions can take more parameters -- for example, you can pass the mean and sd arguments to specify your distribution's mean and standard deviation, so you don't have to convert to z-scores first. Type "?qnorm" for the docs!
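
For instance, here's a sketch with a made-up distribution (mean 100, standard deviation 15 -- those numbers are purely hypothetical), using the mean, sd, and lower.tail arguments. The commented values are rounded:

```r
# Hypothetical distribution: mean 100, standard deviation 15
mu    <- 100
sigma <- 15

# P(X < 130), no manual conversion to a z-score needed
p_below <- pnorm(130, mean = mu, sd = sigma)   # about 0.977

# Same answer as standardizing by hand
p_by_z <- pnorm((130 - mu) / sigma)

# P(X > 130), via the lower.tail argument
p_above <- pnorm(130, mean = mu, sd = sigma, lower.tail = FALSE)  # about 0.023

# The value below which 90% of the distribution lies
x_90 <- qnorm(0.9, mean = mu, sd = sigma)      # about 119.2

print(c(p_below, p_by_z, p_above, x_90))
```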
