One of the most exciting aspects of MTL is the high level of performance that it delivers.

Here we present performance results for several MTL algorithms on various platforms. Currently, we include results for matrix-matrix product and matrix-vector product on the Sun UltraSPARC and the IBM RS6000. Once the performance tests become part of our nightly tests, we will add results for more architectures, including SGI and Pentium, as well as for more algorithms.

About the Tests

The tests here illustrate several points.

The MTL matrix-matrix multiply timed in the experiments below is the recursive algorithm described in the POOSC BLAIS paper.
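To give a rough feel for what a recursive matrix-matrix multiply looks like, here is a minimal sketch. It is not the MTL or BLAIS code: the function names, the row-major layout, the splitting rule (halve the largest dimension), and the cutoff of 16 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not MTL's or BLAIS's interface.
// Matrices are row-major and addressed through a pointer plus a leading dimension.

// Base case: plain triple loop computing C += A * B for an (m x k) * (k x n) block.
void kernel_multiply(const double* A, const double* B, double* C,
                     std::size_t m, std::size_t n, std::size_t k,
                     std::size_t lda, std::size_t ldb, std::size_t ldc) {
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p) {
            const double a = A[i * lda + p];
            for (std::size_t j = 0; j < n; ++j)
                C[i * ldc + j] += a * B[p * ldb + j];
        }
}

// Recursive case: halve the largest dimension until every dimension is small.
void recursive_multiply(const double* A, const double* B, double* C,
                        std::size_t m, std::size_t n, std::size_t k,
                        std::size_t lda, std::size_t ldb, std::size_t ldc,
                        std::size_t cutoff = 16) {
    assert(cutoff >= 1);
    if (m <= cutoff && n <= cutoff && k <= cutoff) {
        kernel_multiply(A, B, C, m, n, k, lda, ldb, ldc);
    } else if (m >= n && m >= k) {
        const std::size_t h = m / 2;  // split the rows of A and C
        recursive_multiply(A, B, C, h, n, k, lda, ldb, ldc, cutoff);
        recursive_multiply(A + h * lda, B, C + h * ldc, m - h, n, k,
                           lda, ldb, ldc, cutoff);
    } else if (n >= k) {
        const std::size_t h = n / 2;  // split the columns of B and C
        recursive_multiply(A, B, C, m, h, k, lda, ldb, ldc, cutoff);
        recursive_multiply(A, B + h, C + h, m, n - h, k,
                           lda, ldb, ldc, cutoff);
    } else {
        const std::size_t h = k / 2;  // split the inner dimension
        recursive_multiply(A, B, C, m, n, h, lda, ldb, ldc, cutoff);
        recursive_multiply(A + h, B + h * ldb, C, m, n, k - h,
                           lda, ldb, ldc, cutoff);
    }
}
```

The appeal of this structure is that the sub-blocks shrink until they fit in cache without the code having to know the cache size, while the base case stays small enough to be tuned separately.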

UltraSPARC Results

The machine used was an UltraSPARC 170E. The compiler was KAI C++ v3.3c, used in conjunction with Sun's C compiler.

Sun UltraSPARC dense matrix-matrix multiply performance.

Matrix-matrix multiply kernel.

The above graph shows dense matrix-matrix multiply performance for small matrices. This test focuses on the inner loop, or kernel, of the algorithm, which was implemented using the BLAIS library.
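As an illustration of what such a kernel might look like, the sketch below multiplies small blocks whose dimensions are template parameters, so every loop bound is known at compile time and the compiler is free to unroll the loops. This is only a sketch under that assumption; the name fixed_block_multiply and its signature are hypothetical and are not taken from BLAIS.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size kernel; the name and signature are illustrative only.
// Computes C += A * B for an (M x K) block times a (K x N) block, row-major.
template <std::size_t M, std::size_t N, std::size_t K>
void fixed_block_multiply(const double* A, std::size_t lda,
                          const double* B, std::size_t ldb,
                          double* C, std::size_t ldc) {
    static_assert(M > 0 && N > 0 && K > 0, "block dimensions must be positive");
    assert(lda >= K && ldb >= N && ldc >= N);
    for (std::size_t i = 0; i < M; ++i) {
        double acc[N];  // running sums for one row of C
        for (std::size_t j = 0; j < N; ++j) acc[j] = C[i * ldc + j];
        for (std::size_t p = 0; p < K; ++p) {
            const double a = A[i * lda + p];
            for (std::size_t j = 0; j < N; ++j) acc[j] += a * B[p * ldb + j];
        }
        for (std::size_t j = 0; j < N; ++j) C[i * ldc + j] = acc[j];
    }
}
```

A call such as fixed_block_multiply<2, 2, 3>(A, 3, B, 2, C, 2) multiplies a 2 x 3 block by a 3 x 2 block and accumulates the result into a 2 x 2 block of C.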

Sparse matrix-vector multiply performance (row oriented).
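For readers unfamiliar with the operation being timed, a row-oriented sparse matrix-vector product can be sketched as follows. We assume a compressed-row layout here; the CompressedRowMatrix struct and the sparse_matvec function are hypothetical names for this example and are not MTL types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compressed-row container, not an MTL type.
struct CompressedRowMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_start;  // rows + 1 offsets into the arrays below
    std::vector<std::size_t> col_index;  // column of each stored entry
    std::vector<double> value;           // stored entries, row by row
};

// y = A * x, computed as one sparse dot product per row.
std::vector<double> sparse_matvec(const CompressedRowMatrix& A,
                                  const std::vector<double>& x) {
    assert(x.size() == A.cols);
    assert(A.row_start.size() == A.rows + 1);
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = A.row_start[i]; k < A.row_start[i + 1]; ++k)
            sum += A.value[k] * x[A.col_index[k]];
        y[i] = sum;
    }
    return y;
}
```

The indirect access x[col_index[k]] is what typically keeps sparse performance well below dense performance on the same machine.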

Dense matrix-vector multiply performance (column oriented).
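A column-oriented dense matrix-vector product walks the matrix one column at a time, adding a scaled copy of each column to the result. The sketch below assumes column-major storage in a flat std::vector; the function name dense_matvec_by_columns is hypothetical and is not part of MTL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an MTL function.
// A is stored column-major: element (i, j) lives at A[j * rows + i].
std::vector<double> dense_matvec_by_columns(const std::vector<double>& A,
                                            std::size_t rows, std::size_t cols,
                                            const std::vector<double>& x) {
    assert(A.size() == rows * cols);
    assert(x.size() == cols);
    std::vector<double> y(rows, 0.0);
    for (std::size_t j = 0; j < cols; ++j) {
        const double xj = x[j];
        const double* column = A.data() + j * rows;
        // y += xj * column; both y and the column are read with unit stride.
        for (std::size_t i = 0; i < rows; ++i)
            y[i] += xj * column[i];
    }
    return y;
}
```

With this loop order the inner loop touches memory contiguously, which is the reason for matching the traversal to the storage orientation.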

RS6000 Results

The machine used was an RS6000 590. The compiler was KAI C++ v3.2, used with IBM's XLC compiler.

IBM RS6000 dense matrix-matrix multiply performance.