Releases · ggerganov/llama.cpp

09 Oct 18:09

b3902

c81f3bb

cmake : do not build common library by default when standalone (#9804)

09 Oct 15:05

github-actions

b3901

e702206

perplexity : fix integer overflow (#9783)

* perplexity : fix integer overflow

ggml-ci

* perplexity : keep n_vocab as int and make appropriate casts

ggml-ci
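The release note does not show the code, so the following is only a minimal sketch of the general pattern it describes: keeping a count such as `n_vocab` as `int` while widening it before it is multiplied into a buffer offset. The helper names `logits_offset` and `logits_offset_wrapped32` are hypothetical and are not taken from llama.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not llama.cpp code): offset of row `i` in a flat
// buffer laid out as n_rows * n_vocab floats.
// Both operands are widened to 64 bits BEFORE the multiplication, so the
// product cannot overflow a 32-bit int.
uint64_t logits_offset(int i, int n_vocab) {
    assert(i >= 0 && n_vocab >= 0);
    return static_cast<uint64_t>(i) * static_cast<uint64_t>(n_vocab);
}

// For contrast: the same product truncated to 32 bits. Unsigned arithmetic
// is used here because signed overflow is undefined behaviour in C++.
uint32_t logits_offset_wrapped32(int i, int n_vocab) {
    assert(i >= 0 && n_vocab >= 0);
    return static_cast<uint32_t>(i) * static_cast<uint32_t>(n_vocab);
}
```

With `i = 40000` and `n_vocab = 128256` (illustrative values), the widened product is 5130240000, while the 32-bit version wraps to 835272704, which would address the wrong row.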

08 Oct 13:28

github-actions

b3899

dca1d4b

ggml : fix BLAS with unsupported types (#9775)

* ggml : do not use BLAS with types without to_float

* ggml : return pointer from ggml_internal_get_type_traits to avoid unnecessary copies

* ggml : rename ggml_internal_get_type_traits -> ggml_get_type_traits

it's not really internal if everybody uses it
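The two ideas in these commit notes (returning a pointer to a traits entry instead of a copy, and skipping BLAS for types that have no `to_float` conversion) can be sketched as follows. This is a simplified illustration under stated assumptions: every name prefixed with `demo_` is hypothetical, and the struct is a stand-in, not the real ggml type traits definition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins; not the real ggml definitions.
enum demo_type { DEMO_TYPE_A = 0, DEMO_TYPE_B = 1, DEMO_TYPE_COUNT = 2 };

typedef void (*demo_to_float_t)(const void * src, float * dst, size_t n);

struct demo_type_traits {
    const char *    name;
    size_t          type_size;
    demo_to_float_t to_float;  // nullptr when the type has no conversion
};

// Trivial converter used only so that one type has a non-null to_float.
static void demo_copy_f32(const void * src, float * dst, size_t n) {
    const float * s = static_cast<const float *>(src);
    for (size_t i = 0; i < n; ++i) dst[i] = s[i];
}

static const demo_type_traits k_demo_traits[DEMO_TYPE_COUNT] = {
    { "a", sizeof(float), demo_copy_f32 },
    { "b", 2,             nullptr       },
};

// Returns a pointer into the static table: no struct copy per call.
const demo_type_traits * demo_get_type_traits(demo_type t) {
    assert(t >= 0 && t < DEMO_TYPE_COUNT);
    return &k_demo_traits[t];
}

// Guard: only take the BLAS path when the type can be converted to float.
bool demo_can_use_blas(demo_type t) {
    return demo_get_type_traits(t)->to_float != nullptr;
}
```

Returning a `const` pointer keeps callers from modifying the shared table while avoiding a copy of the whole struct on every lookup.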

08 Oct 12:39

github-actions

b3898

458367a

server : better security control for public deployments (#9776)

* server : more explicit endpoint access settings

* protect /props endpoint

* fix tests

* update server docs

* fix typo

* fix tests

07 Oct 21:46

github-actions

b3896

6374743

ggml : add backend registry / device interfaces to BLAS backend (#9752)

* ggml : add backend registry / device interfaces to BLAS backend

* fix mmap usage when using host buffers

07 Oct 17:35

github-actions

b3895

f1af42f

Update building for Android (#9672)

* docs : clarify building Android on Termux

* docs : update building Android on Termux

* docs : add cross-compiling for Android

* cmake : link dl explicitly for Android

07 Oct 13:28

github-actions

b3892

96b6912

metal : single allocation of encode_async block (#9747)

* Single allocation of encode_async block with non-ARC capture in ggml-metal.m

* Moving Block_release to the deallocation code

* Release encode block when re-setting encoding buffer count if needed

* Update ggml/src/ggml-metal.m

---------

Co-authored-by: Georgi Gerganov

06 Oct 11:01

github-actions

b3889

b6d6c52

sync : llama.cpp

05 Oct 16:25

github-actions

b3887

8c475b9

rerank : use [SEP] token instead of [BOS] (#9737)

* rerank : use [SEP] token instead of [BOS]

ggml-ci

* common : sanity check for non-NULL tokens

ggml-ci

* ci : adjust rank score interval

ggml-ci

* ci : add shebang to run.sh

ggml-ci

05 Oct 15:55

github-actions

b3886

58b1669

sync : ggml
