ext_server
set `shutting_down` to `false` once shutdown is complete (#2484)
2024-02-13 17:48:41 -08:00
generate
Explicitly disable AVX2 on GPU builds
2024-02-15 14:50:11 -08:00
llama.cpp @ 6c00a06692
Revert "Revert "bump submodule to `6c00a06` (#2479)"" (#2485)
2024-02-13 18:18:41 -08:00
patches
patch: always add token to cache_tokens (#2459)
2024-02-12 08:10:16 -08:00
dyn_ext_server.c
Switch to local dlopen symbols
2024-01-19 11:37:02 -08:00
dyn_ext_server.go
Shutdown faster
2024-02-08 22:22:50 -08:00
dyn_ext_server.h
Always dynamically load the llm server library
2024-01-11 08:42:47 -08:00
ggml.go
add max context length check
2024-01-12 14:54:07 -08:00
gguf.go
refactor tensor read
2024-01-24 10:48:31 -08:00
llama.go
use `llm.ImageData`
2024-01-31 19:13:48 -08:00
llm.go
Ensure the libraries are present
2024-02-07 17:27:49 -08:00
payload_common.go
Detect AMD GPU info via sysfs and block old cards
2024-02-12 08:19:41 -08:00
payload_darwin_amd64.go
Add multiple CPU variants for Intel Mac
2024-01-17 15:08:54 -08:00
payload_darwin_arm64.go
Add multiple CPU variants for Intel Mac
2024-01-17 15:08:54 -08:00
payload_linux.go
Add multiple CPU variants for Intel Mac
2024-01-17 15:08:54 -08:00
payload_test.go
Fix up the CPU fallback selection
2024-01-11 15:27:06 -08:00
payload_windows.go
Add multiple CPU variants for Intel Mac
2024-01-17 15:08:54 -08:00
utils.go
partial decode ggml bin for more info
2023-08-10 09:23:10 -07:00