Tuesday, October 31, 2006

JES Prerelease!

Today I packaged up the next release of JES, our "Media Computation" environment for helping beginning programmers. Honestly, if I were teaching kids about programming, I'd probably use this. It's a lot faster than the previous versions, it has a better layout and clever integrated help, it has LOGO turtles!, and like always -- it's really good for doing funky stuff to pictures (gallery 1, gallery 2).

Fresh code! Here!

Any feedback would be greatly appreciated!
(Slides for cs1315, the course where we make use of JES, can be found here, in case you want to play around with JES but don't know what it's good for yet!)

Monday, October 30, 2006

Beautiful educational software :)

So many of you are probably aware that my job (i.e., research assistantship, i.e., how I'm getting grad school paid for) is to work on JES, the Jython Environment for Students -- it's the little IDE that we use at Georgia Tech to teach introductory CS classes to non-majors and for summer camps as part of our "media computation" effort. A number of other schools use our stuff too, which makes me pretty happy. We're releasing a new version soon, and it's going to rock hard and be faster and include LOGO-style turtles! I'll put up links when it's out.

But speaking of LOGO-style turtles and media computation:
- The KDE Project has KTurtle out now, which looks pretty cool. It's part of their bigger KDE Edutainment effort.
- Tux Paint is also pretty cool! My friend (and noted cartoonist) Brett finds it pretty amusing. It reminds me of Kid Pix, which you might remember from your childhood if you're about as old as me -- and that's still out there, apparently?
- Tux Paint comes to us from New Breed Software, which has a bunch of other cool educational products.
- And if you only click one link on this post: Pictures Kids Made With TuxPaint

And, for completeness:
- Alice is a programmable 3D environment up outta CMU that rocks harder than I can describe in a quick blog post. As they put it on their site, "Alice will make programming a means to an exciting end." That exciting end is really easy 3D storytelling.

Friday, October 27, 2006

reading on the web

I'm not a fan of most web design out there. I know that's not an uncommon opinion. But say you're a site that has some nice articles that people might want to read -- why clutter up the page with sidebars and crap so that the majority of the screen is taken up with things that aren't the intended article?

Take for example: this DevX article about J2ME. Horrible! And you have to click through to get the different pages! Why do we have "pages" on articles on the web? Non-techy news sites are usually even worse about this.

Articles, I'm thinking, should be "printer friendly" to begin with. Having to look for the "make this readable" button is dumb. I'll let you know if the article is any good.

Monday, October 16, 2006

More on the horoscope remix project...

So my plan so far for remixing horoscopes doesn't seem to be going so well. It went like this:

- Take in the text or texts to be mixed and tag them, so you know what every word's part of speech is.
- Shuffle all the words together randomly.
- Do something like simulated annealing to hill-climb towards a more sensible output text: at every step, swap two words somewhere in the shuffled text, and keep the swap if the three-word series it produces is more likely (according to our transition-probability model) than the three-word series that was there before -- and sometimes make bad choices, probabilistically, to avoid getting stuck in local maxima. The transition-probability model works in terms of parts of speech ("tags", to those hip to the NLP lingo), and we learned its tables from tagging a previous corpus...
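
Here's a rough sketch of the steps above, not my actual code. The tagged words, the trigram table, the `UNSEEN` penalty, and the linear cooling schedule are all made up for illustration, and for simplicity it rescores the whole sequence after each swap instead of only the trigrams near the swapped words.

```python
import math
import random

# Hypothetical toy data: (word, tag) pairs as a tagger might produce them.
TAGGED = [("the", "DT"), ("stars", "NNS"), ("favor", "VBP"),
          ("a", "DT"), ("bold", "JJ"), ("move", "NN")]

# Hypothetical log-probabilities for tag trigrams. A real table would be
# estimated from a tagged corpus; unseen trigrams get a low default score.
TRIGRAM_LOGP = {("DT", "NNS", "VBP"): -1.0, ("NNS", "VBP", "DT"): -1.0,
                ("VBP", "DT", "JJ"): -1.0, ("DT", "JJ", "NN"): -1.0}
UNSEEN = -8.0


def score(seq):
    """Sum the tag-trigram log-probabilities over the whole sequence."""
    tags = [tag for _, tag in seq]
    return sum(TRIGRAM_LOGP.get(tuple(tags[i:i + 3]), UNSEEN)
               for i in range(len(tags) - 2))


def anneal(tagged, steps=5000, start_temp=2.0, seed=0):
    """Shuffle the words, then swap pairs, usually keeping improvements."""
    rng = random.Random(seed)
    current = list(tagged)
    rng.shuffle(current)
    current_score = score(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Linear cooling; the small constant avoids dividing by zero.
        temp = start_temp * (1 - step / steps) + 1e-6
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        new_score = score(current)
        delta = new_score - current_score
        # Always accept improvements; accept worse orderings with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current_score = new_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best, best_score


best, best_score = anneal(TAGGED)
print(" ".join(word for word, _ in best), best_score)
```

Because the score only looks at tags, any ordering with a likely tag sequence counts as "good", whether or not the words make sense together.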

I'm not quite sure why this isn't working. It might be that using trigram transition probabilities (on the tags) doesn't capture enough structure to get coherent sentences? Because this method doesn't produce coherent sentences very well.
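
For the curious, this is roughly what a trigram transition table over tags looks like when you estimate it by plain counting. It's a minimal sketch with a made-up tag sequence and a hypothetical `trigram_probs` helper; it does no smoothing, which a real model would probably want.

```python
from collections import Counter


def trigram_probs(tag_sequence):
    """Estimate P(t3 | t1, t2) by relative frequency (no smoothing)."""
    trigrams = Counter(zip(tag_sequence, tag_sequence[1:], tag_sequence[2:]))
    # Count how often each two-tag context starts a trigram.
    contexts = Counter()
    for (t1, t2, _), n in trigrams.items():
        contexts[(t1, t2)] += n
    return {tri: n / contexts[tri[:2]] for tri, n in trigrams.items()}


# Made-up tag sequence standing in for a tagged corpus.
tags = ["DT", "NN", "VBZ", "DT", "NN", "IN", "DT", "NN", "VBZ"]
probs = trigram_probs(tags)
print(probs[("DT", "NN", "VBZ")])  # two of the three "DT NN" contexts
```

Each entry only knows about two tags of history, so nothing in the table can enforce structure that spans a whole sentence.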

I'm thinking about what else I could do. Perhaps I could use longer n-grams in the model -- or maybe I should look to use a parser, and reward the hill-climbing when it produces longer and longer parsable sentence chunks. The other possibility is that maybe my tagger isn't as accurate as I think it is... it could be mislabeling more words than expected (and it's expected to mislabel a bunch of them.)

The problem is totally not that I'm training my models on James Joyce.

Sunday, October 08, 2006

NLTK and generating horoscopes

Have I mentioned that NLTK is the hotness? NLTK is the hotness, particularly if you want to do your language-y things in Python.

It's intended for educational use, but it has what you need, and it lets you compare different algorithms for tokenizing, tagging, and parsing chunks of text, very pluggably. The API is nice too -- you can very easily tell taggers (the parts of your program that decide which part-of-speech a given word is) to make calls to one another in case they can't figure out the right tag independently.
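
To show the idea of taggers calling one another, here's a tiny pure-Python sketch of a backoff chain. These classes are hypothetical stand-ins written for this post, not NLTK's actual API, and the training data is made up.

```python
from collections import Counter, defaultdict


class DefaultTagger:
    """Last resort: tags every word with the same fallback tag."""

    def __init__(self, tag):
        self.tag = tag

    def tag_word(self, word):
        return self.tag


class LookupTagger:
    """Tags words seen in training; defers to `backoff` for the rest."""

    def __init__(self, tagged_words, backoff=None):
        counts = defaultdict(Counter)
        for word, tag in tagged_words:
            counts[word.lower()][tag] += 1
        # Keep only the most frequent tag for each known word.
        self.table = {w: c.most_common(1)[0][0] for w, c in counts.items()}
        self.backoff = backoff

    def tag_word(self, word):
        tag = self.table.get(word.lower())
        if tag is None and self.backoff is not None:
            return self.backoff.tag_word(word)
        return tag


# Made-up training data; "sings" is unseen, so it falls through to "NN".
training = [("the", "DT"), ("moon", "NN"), ("rises", "VBZ"), ("the", "DT")]
tagger = LookupTagger(training, backoff=DefaultTagger("NN"))
tagged = [(w, tagger.tag_word(w)) for w in ["The", "moon", "sings"]]
print(tagged)
```

The nice part of this design is that each tagger only needs to know about the next one in the chain, so you can stack as many as you like.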

There are of course the OpenNLP tools in Java, but they don't seem nearly as quick or awesome.

Soon: using transition probabilities on parts-of-speech to shuffle chunks of text and generate new moderately-sensible horoscopes? Yes! (also: this should be helpful in the long term for my automatic poetry project, which will eventually be more Python-and-ML than Lisp-and-formal-rules)

Thursday, October 05, 2006

Wednesday, October 04, 2006

Type-inference weirdness in Java 1.5 with Generics

Surely this has come up for somebody out there before, but I haven't yet found anything about it. It came up today. Essentially: in Java, if Chair is a sort of Thing, a Vector of Chairs is not recognized as a subtype of Vector of Things... this seems wrong.

What I wanted was a method that takes a vector of a certain type (an interface, specifically), and then to pass it, as an argument, a vector of a type that implements that interface.

So for example:
- Make an interface "Thing" that requires its implementing classes to provide "doSomething()"
- ... and a class Chair that implements Thing.
- A variable Vector<Chair> chairs
- And a method doEachThing(Vector<Thing> things).
So you should be able to pass chairs as an argument to doEachThing, right?

I get the error:
ThingDoer.java:17: doEachThing(java.util.Vector<Thing>) in ThingDoer cannot be applied to (java.util.Vector<Chair>)

I don't understand this at all. Surely the type-checker should be able to do that inference, that every element in a list of Chairs is in fact a Thing? Am I totally missing the point, or is this a design wrongness about Java?

Anybody else run into something along these lines?

Tuesday, October 03, 2006

Red pandas and rhinoceri and turtles? ...

Why is there not a client-side JavaScript version of LOGO?

How cool would that be? Turtles in your browser? ... Small children having access to Papertian goodness wherever the Internets are tubin'?

Monday, October 02, 2006

Lisp is still really pretty.

So I'm building some decision trees for a project I've been working on -- automatic typo correction for tiny BlackBerry-style keyboards. The motivation for this is pretty obvious -- we have to decide when to do the correction and when to let the user type what they're typing!

I'm going through some code I'd had for an earlier project, this one done in Common Lisp. And I've been really excited about Python recently -- but man, Lisp is so nice for data structures. There's so little punctuation for building things. I mean, goodness -- it's as if the language was made for dealing with recursively nested things. I'm a little reluctant to translate this stuff -- why am I not just using CL, again? ...

Oh yeah, because I need to make it talk to other programs.

Also! Roughly the hippest thing ever is the __call__ construct on Python classes. If an object has a __call__() method defined on it, then for object foo, you just go foo(arg), and it does its Pre-Defined Thing. Check it out! ("emulating callable objects")
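
A minimal sketch of what that looks like (the `Scaler` class is just a made-up example):

```python
class Scaler:
    """Callable object that multiplies its argument by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


double = Scaler(2)
result = double(21)             # same as double.__call__(21)
is_callable = callable(double)  # instances now pass the callable() check
print(result, is_callable)
```

Since the object behaves like a function, you can hand it to anything that expects one, such as `map(double, [1, 2, 3])`, while it still carries its own state around.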