There are a number that are no longer needed at all:
- 0003-embeddings: Embeddings entirely overhauled on master
- 0008-ensure-KV-cache-is-fully-defragmented: KV caching entirely
overhauled on master
- 0019-metal-add-mean-kernel-14267: Merged upstream
- 0020-CUDA-add-mean-operation-14313: Merged upstream
Branch: GraniteFour
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
When ggml_backend_buffer_free() is called, the device memory
is released but not all backends consistently release the actual
ggml_backend_buffer_t in system RAM, causing a memory leak.
Bug #10040