One of the projects I’m currently working on is a fast linear algebra matrix library for Java. I know that some alternatives already exist, but I usually found that they were either no longer actively maintained or much slower than what, for example, MATLAB would give you.
So in principle, all I wanted was a matrix library which directly interfaces with some fast BLAS or LAPACK library, for example ATLAS, but hides all of the Fortran greatness behind a nice object-oriented wrapper. It turns out that something like that does not seem to exist, so we started to work on our own little library called jBLAS, which is going to be released soon.
Anyway, while working on that library, I got a bit worried about the performance of the stuff I’m doing in Java. These are mainly element-wise operations like multiplying all elements of a matrix by the corresponding elements of another matrix.
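To make "element-wise operation" concrete, here is a minimal sketch of such a multiplication on plain double arrays. The class and method names (ElementwiseSketch, mul) are made up for illustration and are not part of jBLAS; the matrix is assumed to be stored as one flat array.

```java
// Hypothetical helper for illustration only; not the jBLAS API.
final class ElementwiseSketch {

    /** Returns a new array c with c[i] = a[i] * b[i] for every index i. */
    static double[] mul(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "length mismatch: " + a.length + " vs " + b.length);
        }
        double[] c = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] * b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        // Two 1x3 "matrices" stored as flat arrays (an assumption of this sketch).
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {4.0, 5.0, 6.0};
        System.out.println(java.util.Arrays.toString(mul(a, b)));
    }
}
```

A loop like this does no BLAS call at all, which is why its speed depends entirely on how well the Java side is compiled.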
Since we’re interfacing to C (actually Fortran) code a lot, all of the matrix data is stored in direct buffers, in particular java.nio.DoubleBuffer, and I wondered how their put and get access methods compared to array access and of course to C. So I wrote a little program in Eclipse which did just that and compared the outcome with what I got in C, and to my surprise, Java was very much on par, and the performance difference between accessing arrays and direct buffers was very small.
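For readers who have not used direct buffers, the following is a rough sketch of the kind of comparison described above: copying through array indexing versus copying through DoubleBuffer's get and put. It is not the benchmark from this post. The class name BufferCopySketch, the repetition count, and the naive nanoTime timing (no JIT warm-up) are all assumptions of this sketch, so its numbers should not be compared with the ones reported here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical sketch; not the original benchmark.
final class BufferCopySketch {

    /** Allocates a direct DoubleBuffer holding n doubles in native byte order. */
    static DoubleBuffer allocateDirect(int n) {
        return ByteBuffer.allocateDirect(8 * n)      // 8 bytes per double
                .order(ByteOrder.nativeOrder())      // match what native code expects
                .asDoubleBuffer();
    }

    /** Element-by-element copy using plain array indexing. */
    static void copyArray(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    /** Element-by-element copy using absolute get/put on the buffers. */
    static void copyBuffer(DoubleBuffer src, DoubleBuffer dst) {
        int n = src.capacity();
        for (int i = 0; i < n; i++) {
            dst.put(i, src.get(i));
        }
    }

    public static void main(String[] args) {
        int size = 1000;
        int reps = 100000; // arbitrary choice for this sketch

        double[] srcArr = new double[size];
        double[] dstArr = new double[size];
        DoubleBuffer srcBuf = allocateDirect(size);
        DoubleBuffer dstBuf = allocateDirect(size);

        long t0 = System.nanoTime();
        for (int r = 0; r < reps; r++) copyArray(srcArr, dstArr);
        long t1 = System.nanoTime();
        for (int r = 0; r < reps; r++) copyBuffer(srcBuf, dstBuf);
        long t2 = System.nanoTime();

        System.out.printf("array:  %.3fs%n", (t1 - t0) / 1e9);
        System.out.printf("buffer: %.3fs (%s)%n", (t2 - t1) / 1e9, ByteOrder.nativeOrder());
    }
}
```

The absolute get(int) and put(int, double) variants are used so that the buffer positions never need to be reset between repetitions.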
But when I tried to repeat the results at home, I got very different numbers for both arrays and direct buffers, and the direct buffers suddenly took five times longer than the arrays. At first I suspected that the different processor vendor was the issue, but after some more tweaking I found out, again to my great surprise, that the issue was the compiler.
It turns out that the Eclipse compiler ecj is much better at compiling my benchmark than Sun’s own javac. To see this for yourself, here is my benchmark:
So let’s compare the run-times of this benchmark using the latest Java JDK 6 update 7 and the stand-alone Eclipse compiler 3.4.
Then, I compile the source with
and run it with java -cp . Main:
copying array of size 1000 1000000 times... (2.978s)
copying DoubleBuffer of size 1000 1000000 times... (LITTLE_ENDIAN) (4.291s)
Now the same with the Eclipse compiler: I compile it with
and get the following results:
copying array of size 1000 1000000 times... (0.547s)
copying DoubleBuffer of size 1000 1000000 times... (LITTLE_ENDIAN) (0.633s)
Note that the two runs differ only in the bytecode, yet the Eclipse compiler manages to produce code which is roughly five to seven times faster (2.978s vs. 0.547s for arrays, 4.291s vs. 0.633s for direct buffers), and with it the direct buffer is only slightly slower than the array.
I’ll have to look into this quite some more, but if this is true, this is pretty amazing!
Posted by at 2008-08-25 12:23:00 +0000