Commit Graph

6 Commits

Author SHA1 Message Date
Gabe Goodhart 73d089bb90 feat: Update all patches
A number of patches are no longer needed at all:

- 0003-embeddings: Embeddings entirely overhauled on master
- 0008-ensure-KV-cache-is-fully-defragmented: KV caching entirely
    overhauled on master
- 0019-metal-add-mean-kernel-14267: Merged upstream
- 0020-CUDA-add-mean-operation-14313: Merged upstream

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 16:57:05 -06:00
Jeffrey Morgan 0cefd46f23
llama: update to commit de4c07f93 (#10655) 2025-05-12 12:17:26 -07:00
Jeffrey Morgan 8dd12c873d
llama: update to commit e1e8e099 (#10513) 2025-05-01 18:24:09 -07:00
Jeffrey Morgan e9e5f61c45
llama: update to commit 2016f07b (#10352) 2025-04-24 17:26:02 -07:00
Jeffrey Morgan 943464ccb8
llama: update to commit 71e90e88 (#10192) 2025-04-16 15:14:01 -07:00
Jesse Gross ccb7eb8135 ggml: Free ggml_backend_buffer_t when releasing buffer
When ggml_backend_buffer_free() is called, the device memory
is released but not all backends consistently release the actual
ggml_backend_buffer_t in system RAM, causing a memory leak.

Bug #10040
2025-04-15 15:29:58 -07:00
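
The leak described in ccb7eb8135 can be illustrated with a minimal sketch. This is not the ggml source: `demo_buffer`, `backend_free_device`, `demo_buffer_free_leaky`, `demo_buffer_free_fixed`, and the `live_handles` counter are all hypothetical names invented for illustration, and plain `malloc` stands in for device memory. The sketch assumes the backend hook releases only the device allocation, so the host-side handle survives unless the generic free path releases it too.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a backend buffer handle; NOT the real ggml type. */
typedef struct demo_buffer {
    void *device_mem;                          /* stands in for device memory */
    void (*free_device)(struct demo_buffer *); /* backend-specific release hook */
} demo_buffer;

/* Number of host-side handles currently allocated (illustration only). */
static int live_handles = 0;

/* Assumed backend behaviour: releases the device allocation only,
 * leaving the handle struct itself untouched. */
static void backend_free_device(demo_buffer *buf) {
    free(buf->device_mem);
    buf->device_mem = NULL;
}

static demo_buffer *demo_buffer_alloc(size_t size) {
    demo_buffer *buf = malloc(sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    buf->device_mem = malloc(size);
    if (buf->device_mem == NULL) {
        free(buf);
        return NULL;
    }
    buf->free_device = backend_free_device;
    live_handles++;
    return buf;
}

/* Leaky variant: trusts the backend hook to release everything,
 * so the handle in system RAM is never freed. */
static void demo_buffer_free_leaky(demo_buffer *buf) {
    if (buf == NULL) {
        return;
    }
    buf->free_device(buf);
}

/* Fixed variant: the generic path always releases the handle
 * after the backend hook has run. */
static void demo_buffer_free_fixed(demo_buffer *buf) {
    if (buf == NULL) {
        return;
    }
    buf->free_device(buf);
    assert(live_handles > 0);
    free(buf);
    live_handles--;
}
```

Freeing the handle in one generic place, rather than in each backend hook, means the result no longer depends on every backend remembering to do it.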