ollama/llm
Blake Mizerany acbffa59e9 llm: suppress large allocations for GGUF arrays
This introduces a little array type for holding GGUF arrays that
prevents the array from growing too large. It preserves the total size
of the array, but limits the number of elements that are actually
allocated.

GGUF arrays that are extremely large, such as token lists, are generally
uninteresting to users and are not worth the memory overhead or the
time spent allocating and freeing them. They are necessary for
inference, but not for inspection.

The size of these arrays is, however, important in Ollama, so it is
preserved in a separate field on the array type.
2024-06-23 14:26:56 -07:00
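
The exact type lives in gguf.go; the sketch below only illustrates the idea. The names (array, newArray, append) and the cap of 1024 elements are assumptions for illustration, not necessarily the identifiers or limit used in the real code.

```go
package main

import "fmt"

// maxArraySize caps how many elements are actually kept in memory for a
// decoded GGUF array; anything beyond it is read but not stored.
// (Illustrative value, not necessarily the one used in Ollama.)
const maxArraySize = 1024

// array records the declared length of a GGUF array while only allocating
// storage for up to maxArraySize elements.
type array struct {
	// size is the total element count declared in the GGUF file, preserved
	// even when values holds fewer entries.
	size   int
	values []any
}

// newArray pre-sizes the backing slice so huge metadata arrays (e.g. token
// lists) never allocate more than maxArraySize slots.
func newArray(size int) *array {
	a := &array{size: size}
	n := size
	if n > maxArraySize {
		n = maxArraySize
	}
	a.values = make([]any, 0, n)
	return a
}

// append stores v only while capacity remains; the declared size is
// unaffected, so callers can still report how long the array was.
func (a *array) append(v any) {
	if len(a.values) < cap(a.values) {
		a.values = append(a.values, v)
	}
}

func main() {
	a := newArray(1_000_000) // e.g. a tokenizer vocabulary
	for i := 0; i < 1_000_000; i++ {
		a.append(i)
	}
	fmt.Println("declared:", a.size, "allocated:", len(a.values))
}
```

With this shape, the declared size stays available for inspection (for example, reporting vocabulary size) while the backing slice never grows past the cap.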
ext_server remove confusing log message 2024-06-19 11:14:11 -07:00
generate Merge pull request #5072 from dhiltgen/windows_path 2024-06-19 09:13:39 -07:00
llama.cpp@7c26775adb llm: update llama.cpp commit to `7c26775` (#4896) 2024-06-17 15:56:16 -04:00
patches llm: update llama.cpp commit to `7c26775` (#4896) 2024-06-17 15:56:16 -04:00
filetype.go Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322) 2024-05-23 13:21:49 -07:00
ggla.go simplify safetensors reading 2024-05-21 11:28:22 -07:00
ggml.go llm: suppress large allocations for GGUF arrays 2024-06-23 14:26:56 -07:00
gguf.go llm: suppress large allocations for GGUF arrays 2024-06-23 14:26:56 -07:00
llm.go revert tokenize ffi (#4761) 2024-05-31 18:54:21 -07:00
llm_darwin_amd64.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_darwin_arm64.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_linux.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_windows.go Move nested payloads to installer and zip file on windows 2024-04-23 16:14:47 -07:00
memory.go handle asymmetric embedding KVs 2024-06-20 09:57:27 -07:00
memory_test.go review comments and coverage 2024-06-14 14:55:50 -07:00
payload.go Move libraries out of users path 2024-06-17 13:12:18 -07:00
server.go Refine mmap default logic on linux 2024-06-20 11:07:04 -07:00
status.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00