Releases · vllm-project/llm-compressor

What's Changed

Correct Typo in SparseAutoModelForCausalLM docstring by @kylesayrs in #56
Disable Default Bitmask Compression by @Satrat in #60
TRL Example fix by @rahul-tuli in #59
Fix typo by @rahul-tuli in #63
Correct typo by @kylesayrs in #61
correct import in README.md by @zzc0430 in #66
Fix for issue #43 -- starcoder model by @horheynm in #71
Update README.md by @robertgshaw2-neuralmagic in #74
Layer by Layer Sequential GPTQ Updates by @Satrat in #47
[ Docs ] Update main readme by @robertgshaw2-neuralmagic in #77
[ Docs ] gemma2 examples by @robertgshaw2-neuralmagic in #78
[ Docs ] Update FP8 example to use dynamic per token by @robertgshaw2-neuralmagic in #75
[ Docs ] Overhaul accelerate user guide by @robertgshaw2-neuralmagic in #76
Support kv_cache_scheme for quantizing KV Cache by @mgoin in #88
Propagate trust_remote_code Argument by @kylesayrs in #90
Fix for issue #81 by @horheynm in #84
Fix for issue 83 by @horheynm in #85
[ DOC ] Big Model Example by @robertgshaw2-neuralmagic in #99
Enable obcq/finetune integration tests with commit cadence by @dsikka in #101
metric logging on GPTQ path by @horheynm in #65
Update test config files by @dsikka in #97
remove workflows + update runners by @dsikka in #103
metrics by @horheynm in #104
add debug by @horheynm in #108
Add FP8 KV Cache quant example by @mgoin in #113
Add vLLM e2e tests by @dsikka in #117
Fix style, fix noqa by @kylesayrs in #123
GPTQ Algorithm Cleanup by @kylesayrs in #120
GPTQ Activation Ordering by @kylesayrs in #94
demote recipe string initialization to debug and make more descriptive by @kylesayrs in #116
compressed-tensors main dependency for base-tests by @kylesayrs in #125
Set ready label for transformer tests; add message reminder on PR opened by @dsikka in #126
Fix markdown check test by @dsikka in #127
Naive Run Compressed Pt. 2 by @Satrat in #62
Fix transformer test conditions by @dsikka in #131
Run Compressed Tests by @Satrat in #132
Correct typo by @kylesayrs in #124
Activation Ordering Strategies by @kylesayrs in #121
Fix README Issue by @robertgshaw2-neuralmagic in #139
update by @dsikka in #143
Update finetune and oneshot tests by @dsikka in #114
Validate Recipe Parsing Output by @kylesayrs in #100
fix build error for nightly by @dhuangnm in #145
Fix recipe nested in configs by @kylesayrs in #140
MOE example with warning by @rahul-tuli in #87
Bug Fix: recipe stages were not being concatenated by @rahul-tuli in #150
fix package name bug for nightly by @dhuangnm in #155
Add descriptions for pytest marks by @kylesayrs in #156
Fix Sparsity Unit Test by @Satrat in #153
Fix: Error during model saving with shared tensors by @rahul-tuli in #158
Update 2:4 Examples by @dsikka in #161
DeepSeek: Fix Hessian Estimation by @Satrat in #157
bump up main to 0.2.0 by @dhuangnm in #163
Fix help dialogue by @kylesayrs in #151
Add MoE and Compressed Inference Examples by @Satrat in #160
Separate trust_remote_code args by @kylesayrs in #152
Enable a skipped finetune test by @dsikka in #169
Fix filename in example command by @dbarbuzzi in #173
Add DeepSeek V2.5 Example by @dsikka in #171
fix quality by @dsikka in #176
Patch log function name in gptq by @kylesayrs in #168
README for Modifiers by @Satrat in #165
Fix default for sequential updates by @dsikka in #186
fix default test case by @dsikka in #193
Fix Initalize typo by @Imss27 in #190
Update MoE examples by @mgoin in #192