tlc-pack/cutlass_fpA_intB_gemm: A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer

CUTLASS GEMM kernels with an fp16 A operand (activation) and an int8 or int4 B operand (quantized weight), extracted from FasterTransformer for easier integration into third-party projects. See the original code below.

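Build with (cutlass is a git submodule, so fetch it first if the cutlass directory is empty):

git submodule update --init --recursive
mkdir build && cd build
cmake ..
make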
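
For orientation, the computation that an "fp16 A, int8/4 B" GEMM performs can be sketched as a plain CPU reference. This is not the repository's code and does not use its API: the function names `ref_gemm_fp_a_int8_b` and `unpack_int4x2` are hypothetical, `float` stands in for fp16, and the per-output-column scale layout and the low-nibble-first int4 packing are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference semantics of a weight-only quantized GEMM: C = A * dequant(B).
// A is m x k (float as a stand-in for fp16), B is k x n int8, both row-major.
// ASSUMPTION: one dequantization scale per output column j, so
// dequant(B)[p][j] = B[p][j] * scales[j]. The real kernels may use another layout.
std::vector<float> ref_gemm_fp_a_int8_b(const std::vector<float>& a,
                                        const std::vector<std::int8_t>& b,
                                        const std::vector<float>& scales,
                                        std::size_t m, std::size_t n, std::size_t k) {
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                // Convert the int8 weight to floating point before multiplying.
                acc += a[i * k + p] * static_cast<float>(b[p * n + j]);
            }
            // The scale is constant along k, so it is applied once per output.
            c[i * n + j] = acc * scales[j];
        }
    }
    return c;
}

// Unpack two signed 4-bit weights stored in one byte.
// ASSUMPTION: the low nibble holds the first value (illustrative packing only).
inline void unpack_int4x2(std::uint8_t packed, std::int8_t& lo, std::int8_t& hi) {
    // (v ^ 8) - 8 sign-extends a 4-bit two's-complement value v in [0, 15].
    lo = static_cast<std::int8_t>(((packed & 0x0F) ^ 0x08) - 8);
    hi = static_cast<std::int8_t>((((packed >> 4) & 0x0F) ^ 0x08) - 8);
}
```

A reference like this is mainly useful for checking a GPU kernel's output on small shapes; the point of the extracted kernels is to perform the int-to-fp16 conversion and scaling inside the GEMM rather than materializing a dequantized copy of B.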

Folders and files:

cmake/utils
cutlass @ cc85b64
cutlass_extensions/include/cutlass_extensions
cutlass_kernels
tvm_binding
utils
weightOnlyBatchedGemv
.clang-format
.gitmodules
CMakeLists.txt
LICENSE
README.md
