This documents the main changes to the `candle` crate.
- Added the Mistral 7b v0.1 model (#983).
- Quantized version of the Mistral model (#1009).
- Add the gelu-erf op and activation function (#969).
- Add the mixformer/phi-v1.5 model (#930).
- Add the slice-scatter op (#927).
- Add the Wuerstchen diffusion model (#911).
- Support for simd128 intrinsics in some quantized vecdots (#982).
- Optimize the index-select cuda kernel (#976).
- Self-contained safetensor wrappers (#946).
- Add some RNNs (GRU and LSTM) in candle-nn (#674, #688).
- gguf v2 support (#725).
- Quantized llama example in Python using the pyo3 api (#716).
- Add a candle-nn layer for conv2d-transposed (#760).
- Add the Segment-Anything Model (SAM) as an example (#773).
- TinyViT backbone for the segment anything example (#787).
- Shape with holes support (#770).
- Dilations are now supported in conv-transpose2d (#671).
- Interactive mode for the quantized model (#690).
- Faster softmax operation (#747).
- Faster convolution operations on CPU and CUDA via im2col (#802).
- Moving some models to a more central location (#796).
- Add the powf op (#664).
- Stable Diffusion XL support (#647).
- Add the conv-transpose2d op (#635).
- Refactor the VarBuilder api (#627).
- Add some quantization commands (#625).
- Support more quantized types, e.g. Q2K, Q4K, Q5K... (#586).
- Add pose estimation to the yolo example (#589).
- Api to write GGUF files (#585).
- Support more quantization types (#580).
- Add EfficientNet as an example Computer Vision model (#572).
- Add a group parameter to convolutions (#566).
- New dtype: int64 (#563).
- Handling of the GGUF file format (#559).
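The gelu-erf entry above refers to the exact GELU activation, computed with the error function rather than the common tanh approximation. A minimal sketch of the formula in Python (the function name is illustrative, not candle's API):

```python
import math

def gelu_erf(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
    # expressed via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

The erf-based form is slightly more expensive than the tanh approximation but matches the reference GELU definition exactly.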
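The slice-scatter op mentioned above overwrites a contiguous slice of a destination tensor with the contents of a source tensor. A 1-D, list-based Python sketch of the semantics (the real op works on tensors along an arbitrary dimension; this helper is hypothetical and only illustrates the behavior):

```python
def slice_scatter(dst, src, start):
    # Return a copy of dst in which the slice dst[start : start + len(src)]
    # has been replaced by src; dst itself is left untouched.
    out = list(dst)
    out[start:start + len(src)] = src
    return out
```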
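The im2col speed-up listed above works by unrolling every kernel-sized patch of the input into the row of a matrix, so that convolution reduces to a single matrix multiplication that can be served by optimized GEMM kernels. A pure-Python sketch for a single-channel 2-D input, stride 1, no padding (illustrative only, not candle's implementation):

```python
def im2col(img, kh, kw):
    # Collect each kh x kw patch of the 2-D image as one flattened row.
    h, w = len(img), len(img[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([img[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows

def conv2d(img, kernel):
    # Convolution as (im2col matrix) x (flattened kernel), then reshape.
    kh, kw = len(kernel), len(kernel[0])
    flat_k = [v for row in kernel for v in row]
    flat = [sum(a * b for a, b in zip(patch, flat_k))
            for patch in im2col(img, kh, kw)]
    out_w = len(img[0]) - kw + 1
    return [flat[r * out_w:(r + 1) * out_w]
            for r in range(len(flat) // out_w)]
```

The memory cost of materializing the patch matrix is traded for the much better cache and SIMD behavior of a dense matrix multiply.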