Category: Benchmarks
-
High-Performance Matrix Multiplication with FLAME/BLIS: A Deep Dive into DGEMM
When it comes to scientific computing and machine learning, efficient matrix multiplication is a fundamental building block. Among the most critical operations in linear algebra libraries is DGEMM (Double-precision General Matrix Multiply), which computes the product of two double-precision matrices. In the quest for optimal performance, the BLAS (Basic Linear Algebra Subprograms) interface has been…
-
Empirical Roofline Tool (ERT) – a benchmark for machine performance characterization
A well-known and very useful benchmark for characterizing machine performance is the Empirical Roofline Tool (ERT). ERT automatically generates roofline data for a given computer. This includes the maximum bandwidth for the various levels of the memory hierarchy and the maximum GFLOP rate. This data is obtained using…