Saturday, October 06, 2007
ye can't get ye flask
And it's pretty easy to sign up, and you get a fairly nice-looking avatar by default (mine was "city chic", which pretty much describes me in First Life as well), and you download the client, which they probably have one for your platform, and it all works. And you find yourself on this friendly-looking island and the tutorial tells you how to walk around and say things and stuff. And there are a few other virtual people standing around.
But it's not clear if they can hear you when you speak -- and if they're trying to chat with you, how would you find out? Maybe there's a "chat" window that you can pull up. I found a window that maybe wanted to be that one ("History"?) ... but only some of my utterances seemed to show up there.
So I walked around on that little island for a few minutes, trying to figure out how to get to another island -- and there's this "teleport" button, but how do I use it, and why is it grayed out?
... after a while, I found that I'd hit something that turned off my walking. My arrow keys would turn me in place, but I couldn't walk around anymore. And there's no clear "oh, you're in 'don't walk anymore' mode" indicator. Buh?
And that was enough to end my second foray into Second Life. There are only so many minutes in the day.
I think my experience was rather more anticlimactic than Drew's. He at least found out how to go places. But neither of us could figure out how to hit people.
Tuesday, September 11, 2007
possible upcoming projects
New Things To Build
- Sketch out and build an online community where kids can make stuff in code and share it with each other, like MOOSE Crossing only awesome and on the web and easy to use and linkable. Does this want to be made of Scratch, or similar to it? Can we include proper inheritance? Make it work for the OLPC.
- TEB. Make TEB awesome. Does TEB want to use/be part of/merge with gnoetry? Related: That horoscope remixer thing that never worked right.
- Some automated way to calculate Erdos numbers, probably with the help of DBLP.
- A Scrabble bot. I've been thinking about this a lot, actually, looking at different algorithms for permuting strings -- but I think I can do Scrabble without looking at permutations... maybe all you really care about is whether certain groups of letters constitute the same bag. If it turns out we don't have to permute things (in n!), then we can probably plan several turns ahead. This might be similar to playing Backgammon, but maybe we can't do something like that kind of policy learning; your set of possible actions changes so much every turn, and you'd have to use probability estimates to look into future turns...
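The "same bag" idea from the Scrabble item can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the word list is a tiny made-up one, and the helper names (bag_key, build_index, playable) are hypothetical. The point is that comparing letter multisets sidesteps enumerating permutations entirely.

```python
from collections import Counter, defaultdict

def bag_key(word):
    """Canonical key for a multiset of letters: the letters in sorted order."""
    return "".join(sorted(word.lower()))

def build_index(words):
    """Map each letter-bag key to the words spelled with exactly that bag."""
    index = defaultdict(list)
    for word in words:
        index[bag_key(word)].append(word.lower())
    return index

def playable(rack, words):
    """Words that use only letters available in the rack (no permuting needed)."""
    have = Counter(rack.lower())
    # Counter subtraction keeps only positive counts, so an empty result
    # means the rack covers every letter the word needs.
    return [w for w in words if not (Counter(w.lower()) - have)]

# Tiny hypothetical word list, just for illustration.
WORDS = ["tea", "eat", "ate", "tee", "at"]
index = build_index(WORDS)
```

Looking up index[bag_key("eta")] finds all the anagrams in one dictionary hit, and playable("tae", WORDS) skips "tee" because the rack only holds one "e".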
Interesting Exercises, way already done by other people
- Build a language and virtual machine. Educational for me, not very useful for other people. Recommended by Strick. Building a VM at a pretty high level of abstraction might not be that hard, just think about stack frames and returning things. At what level do things like the Python VM work? The JVM? Surely there are different approaches used for this; what are they? Would it be hard to build something with multithreading in mind from the bottom up? What about a purely functional lambda-calculus implementation?
- Write a checkers bot. Tree search, AB pruning, etc. are pretty well understood, even by me -- the hard thing would be an evaluation function.
- Build a MUD, or at least a MUD framework. But in Scheme. ynniv and I somehow never got around to that...
- Another Sudoku solver. But in Haskell.
Thursday, August 23, 2007
James Gosling, it turns out, thought harder about type inference than I did
The mighty Toby R, in a conversation with me and Strick, shed some light on the situation and led me to understand why this is not the case. The particular use-case is: when you're in the method that takes List<Thing>
(also, somehow I missed an anonymous comment, probably from my mother, well-respected for her work on type theory, which explained that exact case)
Now in a purely functional language, where you don't go around getting references to objects and modifying them, I'd like to posit that this wouldn't be a problem -- but perhaps there are other problematic situations? ...
Wednesday, August 22, 2007
typing about typing about types
But! All of this means that by the next GWT release, you'll likely have some code that I touched in your hands. Woo.
Sunday, August 12, 2007
this blog post: for you, $50.
A few links out, I ran into the Serials Crisis article. Apparently (and Wikipedia articles close to the "Library Science" one are never wrong), the costs of subscribing to scholarly journals keep on going up -- libraries only have so much money for subscriptions, but there are ever-more academics and subfields, thus more journals. And if a given library cancels its subscription to a particular journal, that publisher's fixed costs are still fixed, so prices increase for the remaining subscribers.
The traditional academic journal system had seemed pretty shaky, especially in light of the Web; upsetting publisher websites (Springer, ACM Portal, IEEE's site...) seem like their sole purpose is to keep the enterprising students from reading an article. In light of how most of the science behind the articles is publicly funded in the first place, the articles seem like they should be public as well.
I wouldn't mind seeing companies like Springer just going away; universities seem totally capable of hosting journals -- over the web especially! There may be some compelling reason for the current system, and I'll try to find it out... but for the short term, tools like Google Scholar could go a little further out of their way to help us find the full text of an article!
Also:
http://en.wikipedia.org/wiki/Open_access
http://en.wikipedia.org/wiki/Open_access_journal
http://en.wikipedia.org/wiki/Open_access_publishing
The Serials Crisis: A White Paper for the UNC-Chapel Hill Scholarly Communications Convocation
The Crisis in Scholarly Publishing
Friday, August 10, 2007
You get spoiled by languages with first-class functions
[fn for fn in fns if fn.endswith(".html")]
Or even:
filter(lambda x: x.endswith(".html"), fns)
Or the equivalent Lisp. Y'know, with "loop" and "collect". (Common Lisp has, as they say, the Cadillac of loop syntax)
Or something to that extent. But then I remembered that I was writing in Java, and I ended up writing a for-loop. There's no clean, idiomatic way to say that in Java, is there? Do you have to build up the list procedurally? ...
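For the record, here is a small runnable Python version of the two spellings, using a made-up list of file names. The helper is_html is my own name; the only point is that the predicate is an ordinary value you can hand to filter.

```python
# Hypothetical file names, just to have something to filter.
fns = ["index.html", "notes.txt", "about.html", "Makefile"]

def is_html(name):
    """Predicate passed around as a first-class function."""
    return name.endswith(".html")

# The list-comprehension spelling.
by_comprehension = [fn for fn in fns if fn.endswith(".html")]

# The filter spelling; in Python 3, filter returns a lazy iterator,
# so wrap it in list() to get an actual list back.
by_filter = list(filter(is_html, fns))
```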
Sunday, August 05, 2007
"If I knew the answer ahead of time, I wouldn't be writing this program!"
On the plane back from San Francisco to Atlanta, I read Thomas and Hunt's Pragmatic Unit Testing. It's a very quick read, about 120 pages plus appendices. And while it's very light, it has what seems like a lot of helpful advice for good design and development practice. To be honest, I've never done much unit testing in the past, thinking "oh, that's what those corporate software engineer guys do -- pssh." But! It turns out that these days, I am a corporate software engineer, and anyway one should always be looking out for new ways to hack more effectively.
There's a bunch of important principles to take away from Pragmatic Unit Testing. The one that stuck with me the most, though, is that when designing and writing code, you've got to think: "how am I going to test this?". The significance of this is not just "oh man, I'm going to have to write a unit test for my method"; it gets back to that generally-understood but oft-ignored idea that every logical unit of your code really wants to be its own method, a modular thing that you can use separately -- because you're going to have to use that same calculation again somewhere else. And Don't Repeat Yourself.
For example: if you have a method that calculates how to do something and then does it (say with a call to another library), maybe you want to make that two methods: the calculation and then a call to that calculation coupled with the library call. This will be easier to test -- you're not really interested in testing a third-party library, just your own calculations -- and as a happy side-effect, your code is now cleaner and more reusable!
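Here's a rough Python sketch of that split. Everything in it is hypothetical: scaled_size is the pure calculation, resize_image is the thin wrapper, and resize_fn stands in for whatever third-party call does the real work. Only the calculation gets unit tests.

```python
import unittest

def scaled_size(width, height, max_side):
    """Pure calculation: shrink (width, height) so the longer side fits max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    return width * max_side // longest, height * max_side // longest

def resize_image(image, max_side, resize_fn):
    """Thin wrapper: run the calculation, then hand off to the library call."""
    new_w, new_h = scaled_size(image["width"], image["height"], max_side)
    return resize_fn(image, new_w, new_h)

class ScaledSizeTest(unittest.TestCase):
    def test_small_image_is_unchanged(self):
        self.assertEqual(scaled_size(100, 50, 200), (100, 50))

    def test_large_image_keeps_aspect_ratio(self):
        self.assertEqual(scaled_size(400, 200, 100), (100, 50))
```

Saved to a file, the tests run with python -m unittest. The library never gets involved, and scaled_size is now free to be reused anywhere else a size needs computing.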
Similarly, separating out the backend code from the GUI (two of the things that get my hackles up the most in this life are the terms "business logic" and "MVC") lets you properly test the Stuff That Does Stuff on its own. One example in the book hit particularly close to home -- a small GUI application where all the calculations and I/O happened mixed in with the Swing code.
That example, and in fact the whole book, brought on flashbacks to a project I'd worked on recently. One of my friends and I (and he's one of the sharpest guys I know) inherited a fairly involved program and ended up sinking months trying to fix it up. This system suffered from pretty much every pitfall in the book: random silently-caught exceptions, real work happening mixed in with the GUI code, needlessly long, opaque, nigh-untestable (let alone "tested") methods, repetition all over the place. Worse! It was built by a guy who'd supposedly specialized in software engineering -- and he did pretty much everything that Thomas and Hunt warn against! Not a pleasant situation. I'm sure you can relate.
Of course, I knew at the time that this was atrocious code. But now maybe I'll be a bit more principled in my development, working with a lean towards easy testability. Unit testing will probably be a good discipline to get into.
On the other hand, I've been reading about (and writing a bit of) Haskell. All this murky business of setting up and tearing down state, "proving" to yourself that each function does what you think it does "for the boundary cases" -- it all relies on the idea that you're going to be able to predict where you're going to make the bugs (by heuristic, habits, and mnemonics). And if you're smart enough to predict where the bugs are going to pop up, it seems like there's something better you could do to keep them from being introduced at all. In ML for the Working Programmer, L.C. Paulson suggests that a mark of the professional in the future will be writing functionally (in ML). The modularity practices seem like the Right Thing, useful for the functional programmer as well as the OO, but if your code could be formally verified, how much more confident would you be that it was correct for the general case? Simple testing is nothing like a proper proof.
But who ever writes proof-carrying code?
Saturday, August 04, 2007
further readings: one day, I'll have something coherent to say about type systems
The first finder of any error in my books receives $2.56; significant suggestions are also worth $0.32 each. If you are really a careful reader, you may be able to recoup more than the cost of the books this way.
However, people who have read the book Eats, Shoots & Leaves should not expect a reward for criticizing the ways in which I use commas. Punctuation is extremely important to me, but I insist on doing it my own way.
*laughs* We love you, Don.
Also: there's so much to read in this life. I've recently picked up books on proper C++ technique, security, and proper unit testing, and I'll probably give those some priority on my ever-growing Queue. Of course, there's all those books on statistical NLP that need to get read in the near future if I'm going to be of any help to anyone.
I've been in on a few conversations recently about C++ and pitfalls and bugs that can arise when using it. The more I think about these, particularly random language-specific casting rules and several different competing ways to represent strings, the more I think that Haskell is going to be a good idea. Or possibly SML.
Thursday, July 26, 2007
(ping)
But! In the last few months, I've graduated, got a paper accepted to a conference, taught six weeks of summer camps to enthusiastic middle- and high-school kids, and made all the preparations to start up work with the Goog. My first day is Monday.
Topics that I've been looking into and will hopefully post about soon:
- Haskell -- maybe one day I'll have a more consistent opinion about how I feel about static vs. dynamic typing. Writing here on the issue will probably help sort it out. Or just building something big with Haskell.
- Statistics. I went out and bought All of Statistics and I've been thumbing through it a bit.
- CS Education and edutech. Good goodness, education. I've spent most of my summer with the childrens, trying out different ways of convincing them they want to know what I think they should know. It's been going well, for the most part.
- GWT and associated topics in Javascript and Ajax. I'm joining the GWT team in just a few days, so I'm working pretty hard on learning it!
Friday, May 25, 2007
more exciting edutech!
It's really easy to figure out, especially if you've seen things like the LEGO Mindstorms interface, or Alice -- commands snap together with a familiar building-blocks metaphor. This is to say that there's a fairly standard vocabulary for children's programming environments, these days...
The Scratch intro "Facilitorial" video is here.
Also fairly interesting (and brand new on my radar as of today), is Greenfoot, which is another educational programming environment, perhaps for slightly older kids. It makes it easy to do simulations with different kinds of "actors" on these nice 2D worlds (they can be grid-worlds, but they don't have to be)... although you have to write some Java, it looks like, to build up your new behaviors. Maybe this is awesome too.
Wednesday, April 25, 2007
you'll probably tell me that Emacs already does this
How useful would that be for you?
Monday, April 23, 2007
retrocomputing from the other side of the pond
In the early 1980s, the BBC started an initiative called the BBC Computer Literacy Project, a major part of which was the production of the BBC Micro, a machine produced by Acorn Computers, complete with its own line of peripherals including expandable memory and various pluggable co-processors. There was an associated television show, The Computer Programme, which ran in various incarnations through the decade and featured music from Kraftwerk. The computers came with BBC BASIC, a rather more advanced system than the BASICs that were shipping stateside -- it had proper named subroutines and if/then/else, features most users on the MS-DOS side of things wouldn't see until QBASIC.
The mind-blowing part of the project was Telesoftware, whereby computer programs were sent embedded in the broadcast television signal, using Teletext, which is how the closed-captioning data was sent in Britain at the time. Analogue technology like broadcast TV feels so alien these days... but the Beeb was busy using it to send example programs to eager learners at home.
There seems to be a pretty active online community of BBC/Acorn enthusiasts out there, two and a half decades later.
You know how to use the Googles, of course, but here's another, more detailed overview of the BBC/Acorn system.
Wednesday, April 18, 2007
java 6: apparently even less of a loss than java 5!
Well, I'm excited anyway. The new Scripting API provides a standard interface for embedding other languages in Java and making calls between the two. See if there's already a project to handle your favorite language here -- there probably is, unless you like Common Lisp. There's even a mechanism for manipulating namespaces in the embedded language, pretty snappy.
Apparently recent releases of Jython already have hooks to support the new API. Maybe the next version of JES should be rewritten with that in mind, say once Jython 2.2 is stable. And perhaps we'll see ABCL ported to the new standard...
Also in Java 6, the built-in support for splash screens is kinda cute. And they're saying that the whole shebang is faster and prettier. Good job, guys!
Tuesday, April 17, 2007
tools for blogging and reading
Graham suggests that this would be better with online storage -- it could sync up with your del.icio.us bookmarks and keep track of what you've already blogged about. Perhaps someday soon I'll be cool enough to use del.icio.us.
Speaking of reading things on the web and managing one's reading -- please allow me to direct your attention to BibDesk and Skim, a pair of apps for the Mac designed with your reading pleasure in mind. The first is a bibliography manager that works with BibTeX format and has a lovely UI and lets you drag references around and whatnot, and the latter is for reading, highlighting, and annotating your papers, which is traditionally pretty difficult with a PDF.
The downside of these is that they're Cocoa apps and Mac-only, but they're pretty much what I'll want to build when I get around to putting together that cross-platform Python paper manager thing I've been thinking about...
Thursday, April 12, 2007
things that start with p
I've taken to using jar to deal with .zip files, so I can use consistent tar syntax and don't have to remember how to use the zip options. But! The bash_completion file for Ubuntu doesn't include ".zip" as an extension that it looks for when tab-completing files for jar, oh noes!
Easy enough to fix, right? I pop open /etc/bash_completion and start searching for "jar". There's a section near the second occurrence that looks like:
_filedir '?(e|j|w)ar'
After some fiddling, I change that one line to " _filedir '?(ear|jar|war|zip)' ". And it works! For reference, the _filedir function looks like this. It's painfully obvious to everyone what this does, yes?
To be totally fair, there's some explanatory comments right above it... but it's obtuse things like this that make me want to switch to a shell with a more sensible scripting language. bash is often line noise. We claim that allowing users to modify their environments to fit their needs is one of the major benefits of Free Software, but are we doing enough to encourage that? How is your mom supposed to pick up bash script? It seems like scsh isn't meant for interactive use as your daily shell, but what if your everyday environment had a more modern language embedded in it? Are things like that already out there?
Also: speaking of Scheme embedded in things, JScheme is a dialect of Scheme with a very simple interface to Java, called the Javadot notation. It's by Peter Norvig and crew, fairly recently updated and feature-complete. Also on Peter's (fantastic) site, you can find his older "mercilessly small, easily modifiable version". I very badly want to embed this in JES. Media Computation in Scheme ahoy.
Monday, April 09, 2007
Alice wants him. Bob fears him. Charlie wants to be him.
My favorite so far: "Bruce Schneier writes his books and essays by generating random alphanumeric text of an appropriate length and then decrypting it." -- Bruce Schneier in the comments
Saturday, March 31, 2007
list comprehensions!
import random
tenpercent = len(lines) // 10  # integer division, so random.sample gets an int
testset = random.sample(lines, tenpercent)
trainingset = [line for line in lines if line not in testset]
Python makes me warm and fuzzy on the inside. Also, random.sample() is pretty sexy!
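One wrinkle with the snippet above: if the same line shows up more than once, "line not in testset" throws out every copy, not just the sampled one. Here's a hedged variant that samples positions instead of values. The function name split_lines and the seed parameter are my own additions for illustration.

```python
import random

def split_lines(lines, seed=None):
    """Hold out ten percent of the lines by position, keeping duplicates intact."""
    rng = random.Random(seed)  # a seed makes the split repeatable
    tenpercent = len(lines) // 10
    test_indices = set(rng.sample(range(len(lines)), tenpercent))
    testset = [line for i, line in enumerate(lines) if i in test_indices]
    trainingset = [line for i, line in enumerate(lines) if i not in test_indices]
    return trainingset, testset
```

The set lookup also keeps the membership check cheap when the file gets long.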
Wednesday, March 28, 2007
brains 'n' balancing training data
If building classifiers is your thing, you may be interested to take a look at these articles:
- Gustavo E. A. P. A. Batista, Ana L. C. Bazzan, and Maria Carolina Monard: Balancing Training Data for Automated Annotation of Keywords: a Case Study.
Three researchers, seven middle names, one novel technique for building balanced data sets out of unbalanced ones for training classifiers: generate new instances of your minority class by interpolating between examples actually in your dataset. I'm still trying to decide whether this approach should work for the general case -- does it make too many assumptions about the shape of the space? Particularly: can you arbitrarily draw lines (in higher-dimensional space) between positive instances? What if there are negative instances between those two? Which dimensions do you look at first, and how is this better than just adding some noise or weighting positive examples higher? (is that last option the same as simply counting them several times?)
- Foster Provost: Machine Learning from Imbalanced Data Sets 101.
A basic overview of the problem, examining the motivation for building classifiers at all and some different approaches to sampling. The award for Best Name Ever goes to Dr. Foster Provost.
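The interpolation idea from the first article can be sketched roughly like this. To be clear about the assumptions: this picks pairs of minority examples uniformly at random, which is my own simplification and may not match how the authors choose pairs, and the function names are hypothetical.

```python
import random

def interpolate(a, b, t):
    """Point a fraction t of the way along the line from a to b."""
    return [x + t * (y - x) for x, y in zip(a, b)]

def oversample_minority(minority, n_new, seed=None):
    """Make n_new synthetic examples on segments between minority examples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)  # two distinct real examples
        synthetic.append(interpolate(a, b, rng.random()))
    return synthetic
```

Every synthetic point lands on a straight segment between two real positives, which is exactly where the worry above comes from: nothing here checks whether a negative example is sitting on that segment.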
Friday, March 23, 2007
reading feeds over the web
It is the future. We've got dynabooks and memexes, and we use them to distribute pictures of cats doing cute things.
O my vast readership, I address to you this question: how do you read news online? Do you have some separate feed reader program? Do you use your browser's RSS features? Google Reader? Your LiveJournal friends page? Something else?
And moreover: for the LiveJournal denizens, does anyone know of a good method for reading "friends-only" posts through Google Reader? There are a few posts out in the world on this topic, but nobody seems to have a decisive answer yet... perhaps we can answer the question definitively.
Monday, February 26, 2007
Ripping DVDs with Free Software!
But Apple's DVD player won't play DRM'd disk images, of course.
However! Here's a very nice howto for some very friendly software for Linux, Mac OS X, and BeOS (I know you've all got BeBoxes out there) that'll make video files from your DVDs, no sweat. Super-easy to use. Your mom could do it.
Thanks for the link, Cory Doctorow!
programmable tab completion: you may already have it!
It turns out that this is a feature known as Programmable Completion, available in modern versions of bash and enabled by default in Ubuntu! Who knew?
For example, the Ubuntu version of the programmable completion only fills in ".java" files when you're issuing a "javac" command, and it auto-completes class names (but not .class files) when you're trying to run a Java program from the command line. Flippin' sweet.
Monday, February 05, 2007
Programmatic poetry: oh noetry!
Anyway, it turns out that there are non-me people interested in this sort of thing, and the ever-helpful Graham has pointed out a bunch of interesting things happening in the field!
- The prosthetic imagination is a blog by one Jim Carpenter, who's been working on Erica T. Carter (aka "the electronic text composition project", mentioned on GrandTextAuto here), which uses probabilistic grammars to generate free verse poems. I think the output is pretty convincing ("convincingly what?"); according to Mr. Carpenter, it's rather unnerving to readers who've been informed that they were composed by machine.
It's interesting how people react, when confronted with "creativity" from a non-human source; one is reminded of Douglas Hofstadter's surprising reaction to David Cope's lovely work with algorithmic music composition, which makes music, in a sense, in the style of other composers.
I'll have to read more, but I'm not entirely sure, if it's just using hierarchical grammars, how Erica is different from The Postmodernism Generator (the best-known use of the world-famous dada engine)... but I'll report back on this later.
- There is an Electronic Poetry Center at Buffalo. Interesting!
- Gnoetry is another system out there, and a very prolific one at that, apparently connected to this super-fascinating Beard of Bees publishing group. Language is a prosthesis of an ancient neuro-chemical regime; but now the chemical author is dead. Gnoetry places language at a remove from its typical sources: pre-conscious governance, psycho-historical flux, conscious-mind narration.
- At upenn, they have a series of readings, M^<4|\3, with all sorts of "literary uses of technology" things going on, including, next week, Flarf poetry (!).
- Speaking of literary uses of technology, the GTR Language Workbench looks like something between Eclipse and a word processor... I'm not quite sure what to make of it yet.
I'm all excited. Let's get hacking.
Tuesday, January 30, 2007
begging the question, language composition and orthogonality
And I've come to the point in my life where this doesn't bother me anymore, despite the fact that I know the technical rhetorical sense of "begging the question" -- an argument presupposing what it's trying to prove, often implicitly. I prove that unicorns exist thus: all those magical one-horned horses out there are unicorns. I prove that there's an objectively extant material world by kicking a rock and hurting my foot.
This post, of course, begs the question: will I be secure enough as an armchair philosopher to start using the phrase in the vernacular sense? I'm torn: there are few things I like less in the world than prescriptive grammar, but few things I like quite as much as precise, expressive expression.
Wednesday, January 24, 2007
gmaps and quicksilver
Also: my ATLhack compatriot Erik introduced me to this really nice interface tweak for Mac OS X -- quicksilver. It lets you do a lot less mousing on the Mac, which is a pretty welcome change -- a quick key-tap, and it pops up a window where you type the first few letters of something, say an application or a folder or whatever, and it searches out what you probably mean! It seems like it's more efficient than reaching for the mouse, and for right now I've taken everything off my Dock to see if quicksilver is a viable replacement. Thanks, Erik!
Thursday, January 18, 2007
Run and jump on that gmaps bandwagon!
Maybe not as immediately useful as Gmaps Pedometer, but it'd be fun to put together. The API looks kinda neat, and I should learn this newfangled Web 2.0-AJAX-web-services schlock one of these days...
Wednesday, January 17, 2007
Mapping and reducing is like popping and locking for programmers
Previously, I'd thought about mapping functions onto lists as an operational thing, a set of steps to complete, but that's not the cleanest way to think about it. Mapping is actually a special case of "reduce", an operation where you just go through and replace all the "cons" functions in a list expression with something else, then evaluate the expression again. "map" functions have a cons in the function they're reducing with, so the end result is another list.
For example: you might write, in lisp: "(mapcar #'(lambda(x) (* x 2)) '(1 2 3 4))", yielding (2 4 6 8)... and you might think of that procedurally... but a map is just a reduce where "cons composed with the mapped function" replaces every "cons". Append can be written similarly; it's all essentially just replacement, function composition and evaluation. The paper also goes into beautiful issues like lazy evaluation (the author says that if he wrote the paper now, the examples would be in Haskell!) and continues to do some lovely examples, some numerical and one very close to our heart: a bot that plays tic-tac-toe, optimally, with pruned game trees.
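Here's the same idea as a small Python sketch, with Python lists standing in for cons cells. The names foldr, cons, map_via_fold, and append_via_fold are mine, not the paper's; foldr is the "replace every cons with f, and the empty list with initial" operation.

```python
from functools import reduce

def foldr(f, initial, items):
    """Right fold: f(x1, f(x2, ... f(xn, initial)))."""
    return reduce(lambda acc, x: f(x, acc), reversed(items), initial)

def cons(head, tail):
    """Stand-in for Lisp's cons, built on Python lists."""
    return [head] + tail

def map_via_fold(g, items):
    """map is a fold where 'cons composed with g' replaces every cons."""
    return foldr(lambda x, acc: cons(g(x), acc), [], items)

def append_via_fold(xs, ys):
    """append is a fold over xs where ys replaces the empty list."""
    return foldr(cons, ys, xs)

doubled = map_via_fold(lambda x: x * 2, [1, 2, 3, 4])
```

Swap in addition for cons and 0 for the empty list, and the very same foldr sums the list instead.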
Years ago, Kurt Eiselt told me that the future of computing was going to be functional languages on very-parallel hardware; functional languages, at least in principle, make synchronization easier by limiting or removing-altogether side effects. (although reconciling this idea with stateful, event-driven end-user applications is another issue!) To an extent, it looks like he was right, and the future is here! MapReduce is the method Google is using to crunch super-giant datasets with enormously parallel clusters. It's not just for functional languages, of course, but the idea is there.
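To make the shape of the idea concrete, here is a toy, single-process word count in the map-then-reduce style. This is emphatically not Google's API: the function names are made up, and a real system would spread the map and reduce calls across many machines. It only shows why the pieces parallelize, since each call depends on nothing but its own arguments.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.lower().split()]

def shuffle(pairs):
    """Group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse all the values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```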
Friday, January 05, 2007
Possibly interesting, but almost definitely not useful!
- Self. It's been ported to Linux. It's one of those languages that one feels like one should learn more about. It has interesting family relationships with Smalltalk and Dylan and JavaScript...
- Thinking Meat celebrates the holidays. I just found this blog, but over at the Thinking Meat Project, she has a lovely article about taking part in the culture around the holidays and coming to terms with the cognitive dissonance from enjoying good Bach choral music while feeling like one shouldn't be participating in religious rites, for consistency's sake. It can be hard to balance these things, particularly soon after giving up a faith.