Thursday, May 10, 2012

command line tricks: ps, grep, awk, xargs, kill

I recently learned a little bit of awk; if it's not in your command line repertoire, it's worth looking into! awk lets you do things like this:

$ ps auxww | grep weka | grep -v grep | awk '{print $2}' | xargs kill -9

Let's unpack what's going on here. First, we list every process on the system with full, untruncated command lines (that's what the "auxww" options to ps do), then we filter with grep to keep only the lines that contain "weka". When I wrote this line, I was debugging a long-running machine learning task: I would start it running, and if (when) I found a bug and wanted to restart, I used this command to kill it.

Now we have all the lines of output from ps that include "weka". Unfortunately, this includes the grep process that's searching for "weka"! No problem, just use "grep -v" to filter out the lines that include "grep".

This is where awk comes in. We want to get the process numbers out of the ps output. It seems like we could use cut to just get the second column, but we don't know how wide that column is going to be! Maybe there's a cut option for that, but I don't think there is. Instead, we just use a tiny awk script that prints the second whitespace-delimited thing on each line.
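Here's a quick sketch of the difference, run on two made-up ps-style lines rather than real ps output (the user names, PIDs, and class names are invented for illustration):

```shell
# Two fake ps-style lines; the number of spaces between columns varies,
# as it does in real ps output.
sample='alex      1234  0.0  0.1 java weka.Run
root        77  0.0  0.0 java weka.Other'

# awk splits each line on runs of whitespace, so $2 is the PID on every line.
pids=$(printf '%s\n' "$sample" | awk '{print $2}')
echo "$pids"

# cut -d' ' treats every single space as a separator, so "field 2" is the
# empty string between the first two spaces.
cut_out=$(printf '%s\n' "$sample" | cut -d' ' -f2)
echo "[$cut_out]"
```

As far as I know, cut has no option for merging repeated delimiters. A common workaround is to squeeze the spaces with tr -s ' ' before cut, but the awk version is shorter.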

Finally, we use xargs to take the process numbers and turn them into arguments to kill. xargs is great: it reads whitespace-separated items from its standard input and passes them as arguments to a program (the one named by xargs's own first argument). Usually I use xargs in combination with find, "svn status", or "git status -s" to do the same thing to batches of files: maybe delete them, or add them to version control, or whatever.
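A small sketch of what xargs does, with echo standing in for kill so nothing actually gets killed. The PIDs and file names are made up, and the second half assumes a find and xargs that support -print0 and -0 (the GNU and BSD versions do):

```shell
# xargs reads whitespace-separated items from stdin and appends them as
# arguments to the given command. echo stands in for kill here.
result=$(printf '1234\n77\n' | xargs echo kill -9)
echo "$result"    # prints: kill -9 1234 77

# File names may contain spaces, so NUL-separated input is safer for them.
tmp=$(mktemp -d)                       # scratch directory for the demo
touch "$tmp/a file.txt" "$tmp/b.txt"
find "$tmp" -type f -name '*.txt' -print0 | xargs -0 rm
remaining=$(ls -A "$tmp")              # empty if both files were removed
rmdir "$tmp"
```

Without -print0 and -0, the file "a file.txt" would be split into two arguments, "a" and "file.txt", and rm would complain about both.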

Thoughts? Better ways to do this sort of thing?

5 comments:

Alec Story said...

Do you know about killall -9? I think it does just what you want here, although it's not as general as awk, of course.

Alex Rudnick said...

@Alec: Good catch! I was considering bringing that up in the post. I know about killall, yeah :)

The reason why I'm not using it here is because weka is a Java process, so the name of the process would be java; I didn't want to kill all the running java processes.

sajith said...

What about this -

$ ps auxww | grep 'wek[a]' | awk '{print $2}' | xargs kill -9

(Captcha: "untaro nomandog")

Alex Rudnick said...

@Sajith: Haha yeahhh, that's really clever! I have trouble imagining myself remembering to do that, though...
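For anyone wondering why the bracket trick works, here is a sketch on fake ps output (the lines below are invented, not real ps output). The pattern wek[a] matches the text "weka", but the grep process's own command line contains the literal text "wek[a]", which the pattern does not match:

```shell
# Fake ps output: one weka process and the grep that is searching for it.
sample='alex 1234 java weka.Run
alex 5678 grep wek[a]'

# Only the first line contains "weka", so only its PID comes out.
matched=$(printf '%s\n' "$sample" | grep 'wek[a]' | awk '{print $2}')
echo "$matched"    # prints: 1234
```

The pattern is quoted here so the shell cannot expand it as a glob, which would happen if a file named weka were in the current directory.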

Anonymous said...

Remember to always use kill without -9 before kill -9. Programs typically use the regular kill signal (SIGTERM) as a hint to start cleaning up their dirty work. E.g. a simple shell script might trap SIGTERM and remove its temporary/lock files, a java program might flush the database. When you kill -9 you don't give the program a chance to clean up, so the database might end up "dirty" or even unusable, perhaps there are stray lock files you have to hunt for and delete, etc. Yes, it might take a few seconds for regular kill to get done, but that's typically because it's doing stuff. If not, it's a buggy program that you shouldn't use =P
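A minimal sketch of that cleanup pattern, assuming a made-up background job whose "lock file" is just a temp file. Plain kill sends SIGTERM, which the job traps so it can tidy up before exiting:

```shell
lock=$(mktemp)   # stands in for a lock file the job would own

# A background job that removes its lock file when it receives SIGTERM.
(
  trap 'rm -f "$lock"; exit 0' TERM
  while :; do sleep 1; done
) &
pid=$!

sleep 1          # give the job time to install its trap
kill "$pid"      # no -9: sends SIGTERM, which the trap handles
wait "$pid"      # returns once the job has cleaned up and exited

[ -e "$lock" ] && echo "lock left behind" || echo "lock cleaned up"
```

With kill -9 instead, the trap would never run, because SIGKILL cannot be caught, and the lock file would be left behind.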