Pull requests: HabanaAI/vllm-fork

- #451: Config hidden layer number to run in 1 lazy graph (opened Nov 1, 2024 by libinta)
- #446: [CI] Add Llama2 to torch compile tests (opened Oct 30, 2024 by anko-intel)
- #444: Fix branch version in README_GAUDI.md (opened Oct 29, 2024 by michalkuligowski)
- #442: to make repetition penalty faster (opened Oct 29, 2024 by ccrhx4)
- #439: Oct 28 rebase (opened Oct 28, 2024 by kzawora-intel)
- #430: Add HPU information to collect_env script (opened Oct 25, 2024 by michalkuligowski)
- #422: fix profiler end for prepare_input_tensor (opened Oct 24, 2024 by jikunshang)
- #421: GPTQ Support (opened Oct 23, 2024 by maktukmak)
- #399: Create run-lm-eval-mmlu.sh (opened Oct 16, 2024 by michalkuligowski, draft)
- #322: [DO NOT MERGE] Upstream test PR (opened Sep 23, 2024 by kzawora-intel)
- #285: Optimize LoRA mask creation (opened Sep 13, 2024 by SanjuCSudhakaran, draft; label: habana, "Issues or PRs submitted by Habana Labs")
- #253: Draft: Add max-num-prefill-seqs parameter (opened Sep 6, 2024 by kzawora-intel, draft; label: habana, "Issues or PRs submitted by Habana Labs")
- #218: enabling multi-node serving on Gaudi ray cluster (opened Aug 29, 2024 by vishnumadhu365; label: intel, "Issues or PRs submitted by Intel")