Monday, September 3, 2007

Another reason why Java loses big

WARNING: this is a pure geek post - don't read if you are not into programming or (lucky you) never have to touch Java.

Because of a certain library and some other reasons, I was recently forced to program in Java. Along the way, I had to make a command line call (to dump flatfiles into my PostgreSQL database). Knowing many languages, I thought this would be pretty straightforward in Java, too - nada! Here's the code to do a command line call in Java, as safe as I could get it:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

/** Executes cmd on the Runtime.
 *
 * This method throws a RuntimeException if any output
 * is found on STDERR - so catch it if the command you
 * call uses STDERR for logging only. The content of
 * STDERR is put, sans newlines, into the message of the RE.
 *
 * @return exitVal - anything other than 0 is invalid */
private int runtimeExec(String cmd)
    throws InterruptedException, IOException {

  // execute the command
  Process child = Runtime.getRuntime().exec(cmd);

  // check STDERR (read it before waiting, so the child
  // cannot block on a full STDERR pipe)
  StringBuilder error = new StringBuilder();
  InputStream stderr = child.getErrorStream();
  InputStreamReader isr = new InputStreamReader(stderr);
  BufferedReader br = new BufferedReader(isr);
  String line = null;
  while ( (line = br.readLine()) != null)
    error.append(line).append(" ");

  // wait for the command to finish
  int exitVal = child.waitFor();

  // close the streams
  stderr.close();
  child.getInputStream().close();
  child.getOutputStream().close();

  // throw RE on STDERR output
  if (error.length() > 0)
    throw new RuntimeException(error.toString());

  return exitVal;
}
Pretty nasty, eh?! So which precautions are required? Here's the explanation of the above code:
  • Although a program should terminate with a non-zero exit value on error, I found that dumping into postgres using psql -d mydb -f mydump.pg actually terminates with 0 when an encoding error is encountered. This means your program happily continues while your database has been corrupted. The only way I found to detect this situation is to throw a RE on any text found on STDERR.
  • The second really bad thing: if you call this code, say, a few thousand times, you will run into a "too many open filehandles" error. Why? Because every call opens three streams (IN, OUT, ERR), each one holding an operating system file handle, and you have to close all three yourself. The Java GC gives no guarantee about when it will clean them up, so here it just sucks. My process runs for days and goes through this segment about 1000 times, so on about the last day the program crashes. Nice, just love garbage collection...
Oh, and it actually still ain't completely safe: if you wanted to be perfect, you would have to wrap all this in a try...catch...finally so the streams are closed even when an exception is thrown!
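For the record, here is a rough sketch of what that "perfect" version might look like. It is not the code I actually ran, just an illustration under a few assumptions: the class name SafeExec is made up, the STDOUT of the command is assumed to be of no interest (so it is thrown away, which needs Java 9 or newer), and the command is passed as separate arguments through ProcessBuilder instead of as one string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper class, made up for this sketch.
final class SafeExec {

    /**
     * Runs cmd and returns its exit value. Throws a RuntimeException
     * carrying the STDERR text if anything was written to STDERR.
     */
    static int runtimeExec(String... cmd)
            throws InterruptedException, IOException {

        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Assumption: STDOUT is not needed, so discard it (Java 9+).
        // The child can then never block on a STDOUT pipe nobody reads.
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);

        Process child = pb.start();
        StringBuilder error = new StringBuilder();
        int exitVal;
        try {
            // Nothing is sent to the child, so close its STDIN right away.
            child.getOutputStream().close();

            // try-with-resources closes STDERR even if readLine() throws.
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(child.getErrorStream()))) {
                String line;
                while ((line = br.readLine()) != null) {
                    error.append(line).append(' ');
                }
            }
            // Wait only after STDERR has been drained.
            exitVal = child.waitFor();
        } finally {
            // Runs on success and on failure: release the remaining
            // handle and make sure no child process is left behind.
            child.getInputStream().close();
            child.destroy();
        }

        // Same policy as above: any STDERR output counts as an error.
        if (error.length() > 0) {
            throw new RuntimeException(error.toString().trim());
        }
        return exitVal;
    }
}
```

With this, the psql call from above would become something like SafeExec.runtimeExec("psql", "-d", "mydb", "-f", "mydump.pg"). The finally block is the whole point: the file handles are released no matter how the method is left, so nothing depends on the GC.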

So my advice, again, is: keep away from this hideous language and develop something real! I hope this was my last "me vs. Java"-encounter for a long time!