jblas release 1.0

I've just released jblas 1.0. For those of you who don't know it yet, jblas is a matrix library for Java based on the BLAS and LAPACK routines. It currently uses ATLAS and lapack-lite for maximum performance.

I'm using jblas myself for most of my research, together with a JRuby wrapper which I plan to release pretty soon as well.

In any case, here are the most important new features of release 1.0:

The default jar file now contains all the libraries for Windows, Linux, and Mac OS X. Both 32-bit and 64-bit are supported, although the 64-bit version for Windows is still missing right now; I'll take care of that after the holidays. The libraries are built for the Core2 platform, which in my experience gives the best combined performance on Intel and AMD processors (the Hammer or AMDk10h settings give slightly better performance on AMD, but inferior performance on Intel).
New LAPACK functions in NativeBlas: GEEV (eigenvalues of general, non-symmetric matrices), GETRF (LU factorization), and POTRF (Cholesky factorization).
Decompose class providing high-level access to LU and Cholesky factorization.
Improved support for the standard Java framework: matrix classes are serializable and provide read-only AbstractList views for elements, rows, and vectors.
Permutation class for random permutations and subsets.
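
For readers unfamiliar with what the Cholesky factorization mentioned above produces, here is a minimal plain-Java sketch. It is not jblas code and does not use the jblas API: jblas delegates this work to LAPACK's POTRF, while the class and method names below (CholeskySketch, cholesky) are made up for illustration. The sketch assumes the input is symmetric positive definite and returns a lower-triangular L with A = L * L^T.

```java
// Illustrative only: a textbook Cholesky factorization on plain arrays.
// jblas itself calls the native LAPACK routine POTRF instead.
class CholeskySketch {
    // Returns lower-triangular L such that A = L * L^T.
    // Assumes a is square, symmetric, and positive definite.
    static double[][] cholesky(double[][] a) {
        int n = a.length;
        double[][] l = new double[n][n];
        for (int j = 0; j < n; j++) {
            // Diagonal entry: subtract squares of the entries already computed in row j.
            double diag = a[j][j];
            for (int k = 0; k < j; k++) {
                diag -= l[j][k] * l[j][k];
            }
            if (diag <= 0.0) {
                throw new IllegalArgumentException("matrix is not positive definite");
            }
            l[j][j] = Math.sqrt(diag);
            // Entries below the diagonal in column j.
            for (int i = j + 1; i < n; i++) {
                double sum = a[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                l[i][j] = sum / l[j][j];
            }
        }
        return l;
    }

    public static void main(String[] args) {
        double[][] a = { {4.0, 2.0}, {2.0, 3.0} };
        double[][] l = cholesky(a);
        for (double[] row : l) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

For the 2x2 example in main, the factor has 2 and 1 in the first column and the square root of 2 in the lower right corner.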

The new jar file also comes with a little command line tool for checking the installation and running a small benchmark. If you run java -server -jar jblas-1.0.jar on my machine (Linux, 32 bit, Core 2 Duo @ 2 GHz), you get:


Simple benchmark for jblas

Running sanity benchmarks.

checking vector addition... ok
-- org.jblas CONFIG BLAS native library not found in path. Copying native library from the archive. Consider installing the library somewhere in the path (for Windows: PATH, for Linux: LD_LIBRARY_PATH).
-- org.jblas CONFIG Loading libjblas.so from /lib/static/Linux/i386/libjblas.so.
checking matrix multiplication... ok
checking existence of dsyev...... ok
checking XERBLA... ok
Sanity checks passed.

Each benchmark will take about 5 seconds...

Running benchmark "Java matrix multiplication, double precision".
n = 10   :  424.4 MFLOPS (1061118 iterations in 5.0 seconds)
n = 100  : 1272.6 MFLOPS (3182 iterations in 5.0 seconds)
n = 1000 :  928.5 MFLOPS (3 iterations in 6.5 seconds)

Running benchmark "Java matrix multiplication, single precision".
n = 10   :  445.0 MFLOPS (1112397 iterations in 5.0 seconds)
n = 100  : 1273.0 MFLOPS (3183 iterations in 5.0 seconds)
n = 1000 : 1330.9 MFLOPS (4 iterations in 6.0 seconds)

Running benchmark "ATLAS matrix multiplication, double precision".
n = 10   :  428.2 MFLOPS (1070428 iterations in 5.0 seconds)
n = 100  : 3293.9 MFLOPS (8235 iterations in 5.0 seconds)
n = 1000 : 5383.2 MFLOPS (14 iterations in 5.2 seconds)

Running benchmark "ATLAS matrix multiplication, single precision".
n = 10   :  465.2 MFLOPS (1162905 iterations in 5.0 seconds)
n = 100  : 5997.3 MFLOPS (14994 iterations in 5.0 seconds)
n = 1000 : 9186.6 MFLOPS (23 iterations in 5.0 seconds)
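
As a rough cross-check, the MFLOPS figures above are consistent with counting 2*n^3 floating-point operations per n x n matrix multiplication. That formula is an assumption inferred from the numbers, not taken from the benchmark's source, and the class name MflopsCheck is made up for this sketch. The iteration counts and times are the ones printed for the double precision ATLAS run.

```java
// Sketch: recompute MFLOPS from the printed iteration counts,
// assuming 2 * n^3 floating-point operations per n x n multiplication.
class MflopsCheck {
    static double mflops(int n, long iterations, double seconds) {
        double flopsPerMultiply = 2.0 * n * n * n; // n^3 multiplies + n^3 adds (assumed)
        return flopsPerMultiply * iterations / seconds / 1e6;
    }

    public static void main(String[] args) {
        // "ATLAS matrix multiplication, double precision" rows from the output above.
        System.out.printf("n = 10   : %.1f MFLOPS%n", mflops(10, 1070428, 5.0));
        System.out.printf("n = 100  : %.1f MFLOPS%n", mflops(100, 8235, 5.0));
        System.out.printf("n = 1000 : %.1f MFLOPS%n", mflops(1000, 14, 5.2));
    }
}
```

The recomputed values will not match the printed ones to the last digit, because the elapsed times in the output are rounded to one decimal place.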

