Kernel fusion for Llama-v2 #538
TejaGollapudi
started this conversation in
Ideas
Hi,
https://twitter.com/pommedeterre33/status/1681935636129873920?t=VaxYpkbwNLKxly7icie8kw&s=19
I came across this great thread showing the benefits of kernel fusion for Llama-2: up to a 1.8x speedup using OpenAI's Triton kernels. (It may work with torch kernel fusion too.)
Not sure if this would be beneficial for vLLM, but it might be worth taking a look at 😄
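For anyone unfamiliar with the idea: fusion replaces several small kernels, each of which reads and writes whole tensors to memory, with one kernel that does all the arithmetic in a single pass. Here's a minimal CPU-only sketch of that principle (not actual Triton or vLLM code, and the function names are made up for illustration), using the SiLU-gated product that appears in Llama's SwiGLU MLP:

```python
import math

def swiglu_unfused(a, b):
    # Three separate "kernels", each materializing an intermediate list
    # (analogous to three GPU kernel launches plus extra memory traffic).
    sig = [1.0 / (1.0 + math.exp(-x)) for x in a]   # kernel 1: sigmoid(a)
    silu = [x * s for x, s in zip(a, sig)]          # kernel 2: a * sigmoid(a)
    return [s * y for s, y in zip(silu, b)]         # kernel 3: silu(a) * b

def swiglu_fused(a, b):
    # One "kernel": a single pass over the data, no intermediate buffers.
    return [x * (1.0 / (1.0 + math.exp(-x))) * y for x, y in zip(a, b)]

a = [0.5, -1.0, 2.0]
b = [1.0, 2.0, 3.0]
assert all(abs(u - f) < 1e-12
           for u, f in zip(swiglu_unfused(a, b), swiglu_fused(a, b)))
```

On a GPU the win comes from fewer kernel launches and far less global-memory traffic, which is presumably where the reported 1.8x is coming from.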