Tuesday, November 1, 2011

64-bit goodness?

so I've been doing load testing with Empirix eLoad / Oracle Application Testing Suite. among the many issues I've had with it: running out of heap on the controller and agent processes

starting in version 9, they switched from a fairly light install ( about 400 MB ) to a rather hefty 1.5 GB download. they used to run JBoss 4 ( which was old even at the time ), it was 100% windows based, and it was 32-bit only

starting in 9.2 they added some support for 64-bit. I've regularly had OOM issues on both the controller and the agent. because it's 32-bit I'm restricted to a 2 GB heap, which is a shame. their support tech tells me "well, you clearly need more heap - upgrade the hosts to 64-bit windows and that should fix your problem"
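the 32-bit ceiling is easy to demonstrate for yourself on any 32-bit jvm: ask for a heap bigger than the process can map and it dies before it even starts ( exact error text varies by jvm version ):

C:\> java -Xmx3g -version
Error occurred during initialization of VM
Could not reserve enough space for object heap

the same flag on a 64-bit jvm just prints the version banner.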

I ran the upgrade to 64-bit 2008 R2. guess what: I installed 9.3 ( latest as of today ) and it's ... still 32-bit. great ;) opened a ticket 24 hours ago and guess what - no response yet ;) I'm so shocked!

Sunday, October 30, 2011

alternative to jpackage for java alternatives on rhel/centos

I find this much easier :->

# build a --slave entry for every binary shipped alongside java
cd /usr/java/default/bin || exit 1
slaves=""
for bin in *
do
  if [ "$bin" != "java" ]
  then
    slaves="$slaves --slave /usr/bin/$bin $bin /usr/java/default/bin/$bin"
  fi
done
# $slaves is left unquoted on purpose so it splits into separate arguments
sudo alternatives --install /usr/bin/java java /usr/java/default/bin/java 2 $slaves
sudo alternatives --set java /usr/java/default/bin/java

$ java -version
java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)
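you can also sanity-check that all the slave links registered ( output trimmed here, exact format varies a bit by alternatives version ):

$ alternatives --display java
java - status is manual.
 link currently points to /usr/java/default/bin/java
/usr/java/default/bin/java - priority 2
 slave javac: /usr/java/default/bin/javac
...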

I have always found the jpackage rpms and various dependencies just annoying...

Friday, September 30, 2011

Java, Oracle and Hadoop

Was reading the summary of a /. article ( http://developers.slashdot.org/story/11/09/30/1855204/oracle-proud-self-reliant-increasingly-isolated ) and it got me wondering again - what is the fate of Hadoop and the JDK?

Kind of wonder what it would take to build a new implementation of Hadoop ( or something similar ) in a different language. What if Oracle reached the point where the freely available Java went basically stagnant while a for-pay commercial version suddenly exploded the cost of your cluster beyond the servers and some switches?

Sure, there are other frameworks ( MongoDB, Cassandra, probably many others ), but nothing quite like the marriage of HDFS and the map/reduce framework itself.

Would OpenJDK even be an option? I've not tried using it for anything. Then again, I just administer hadoop clusters - I don't program for them ;)
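at least it's easy to tell which JDK a node is actually running - the vendor is right there in the version banner ( output below is just illustrative, not from one of my hosts ):

$ java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.10)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)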

Wednesday, July 20, 2011

speculative execution

Running some benchmarks of hadoop using teragen/terasort. One of the recommendations I was given was to disable speculative execution. I noticed something rather strange when I forced it off in the config.
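for reference, this is the sort of thing I mean - the 0.20-era properties can live in mapred-site.xml or be flipped per-job on the command line ( jar name and paths below are just placeholders for your setup ):

$ hadoop jar hadoop-examples.jar terasort \
    -Dmapred.map.tasks.speculative.execution=false \
    -Dmapred.reduce.tasks.speculative.execution=false \
    /benchmarks/teragen /benchmarks/terasort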

Runtime with speculative execution: 18.5 minutes
Runtime without speculative execution: 1 hour

Seems that 2-3 map tasks are taking longer than the rest.

The question now is: why? Each map task is responsible for generating the same % of the data, so why would speculative execution make the job run quicker? Does this point to hardware differences ( if so, the slow tasks land on different machines - I have not noticed a pattern yet ), configuration problems elsewhere, or just random bad luck?