CppTensority

Prerequisites

How to speed it up

If portability is not a concern, you can try to optimize it by replacing the cblas_dgemm() call with an implementation from:

  • MKL on an Intel machine, or even
  • a GPU computing library, for example cuBLAS
    • cublasGemmEx may be of interest; you would have to convert the element types and set up the matrices beforehand
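
The "convert the types and set up the matrices" step for cuBLAS could look roughly like the sketch below. It is only an illustration under assumptions: the helper name `to_col_major_float` is hypothetical and not part of CppTensority, the input is assumed to be a row-major array of `double` (the layout typically passed to cblas_dgemm with `CblasRowMajor`), and `float` is assumed as the target element type. cuBLAS expects column-major storage, so the helper transposes the layout while narrowing the type. Copying the result to the GPU and the cublasGemmEx call itself are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from CppTensority): converts a row-major
// rows x cols matrix of double into a column-major buffer of float.
// Assumption: float is precise enough for the values being multiplied.
std::vector<float> to_col_major_float(const std::vector<double>& src,
                                      std::size_t rows, std::size_t cols) {
    // The caller must pass exactly rows * cols elements.
    assert(src.size() == rows * cols);

    std::vector<float> dst(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Row-major index is r * cols + c; column-major is c * rows + r.
            dst[c * rows + r] = static_cast<float>(src[r * cols + c]);
        }
    }
    return dst;
}
```

For a 2 x 3 input stored as `{1, 2, 3, 4, 5, 6}`, the returned buffer holds the same matrix column by column. Whether narrowing from `double` is acceptable depends on the value range of the matrices, so check the results against the cblas_dgemm path before relying on it.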