Releases: ggerganov/llama.cpp
b3902
cmake : do not build common library by default when standalone (#9804)
b3901
perplexity : fix integer overflow (#9783)
* perplexity : fix integer overflow ggml-ci
* perplexity : keep n_vocab as int and make appropriate casts ggml-ci
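The kind of overflow this entry refers to can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names `logits_row_offset` and `would_overflow_int` are hypothetical, and it assumes a flat row-major logits buffer indexed by `i * n_vocab`, with `n_vocab` kept as an `int` and widened at the point of use.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from llama.cpp): flat offset of row i in a
// row-major buffer that holds n_vocab values per row.
// Assumes i and n_vocab are non-negative. Both operands are widened
// before the multiplication, so the product is computed in 64 bits
// rather than in int.
inline std::uint64_t logits_row_offset(int i, int n_vocab) {
    return static_cast<std::uint64_t>(i) * static_cast<std::uint64_t>(n_vocab);
}

// Hypothetical check: true when i * n_vocab would not fit in an int.
// The product is formed in 64 bits first, then compared to INT_MAX.
inline bool would_overflow_int(int i, int n_vocab) {
    const std::int64_t wide = static_cast<std::int64_t>(i) * n_vocab;
    return wide > std::numeric_limits<int>::max();
}
```

With illustrative values such as a row index of 20000 and a vocabulary size of 128256, the product exceeds the range of a 32-bit `int`, so the multiplication has to happen after the cast, not before it.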
b3899
ggml : fix BLAS with unsupported types (#9775)
* ggml : do not use BLAS with types without to_float
* ggml : return pointer from ggml_internal_get_type_traits to avoid unnecessary copies
* ggml : rename ggml_internal_get_type_traits -> ggml_get_type_traits (it's not really internal if everybody uses it)
b3898
server : better security control for public deployments (#9776)
* server : more explicit endpoint access settings
* protect /props endpoint
* fix tests
* update server docs
* fix typo
* fix tests
b3896
ggml : add backend registry / device interfaces to BLAS backend (#9752)
* ggml : add backend registry / device interfaces to BLAS backend
* fix mmap usage when using host buffers
b3895
Update building for Android (#9672)
* docs : clarify building Android on Termux
* docs : update building Android on Termux
* docs : add cross-compiling for Android
* cmake : link dl explicitly for Android
b3892
metal : single allocation of encode_async block (#9747)
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m
Co-authored-by: Georgi Gerganov
b3889
sync : llama.cpp
b3887
rerank : use [SEP] token instead of [BOS] (#9737)
* rerank : use [SEP] token instead of [BOS] ggml-ci
* common : sanity check for non-NULL tokens ggml-ci
* ci : adjust rank score interval ggml-ci
* ci : add shebang to run.sh ggml-ci
b3886
sync : ggml