ollama

Author	SHA1	Message	Date
Daniel Hiltgen	e890be4814	Revert "More parallelism on windows generate" This reverts commit `0577af98f4`.	2024-06-17 13:32:46 -07:00
Daniel Hiltgen	b2799f111b	Move libraries out of users path We update the PATH on windows to get the CLI mapped, but this has an unintended side effect of causing other apps that may use our bundled DLLs to get terminated when we upgrade.	2024-06-17 13:12:18 -07:00
Jeffrey Morgan	152fc202f5	llm: update llama.cpp commit to `7c26775` (#4896) * llm: update llama.cpp submodule to `7c26775` * disable `LLAMA_BLAS` for now * `-DLLAMA_OPENMP=off` (tag: v0.1.45-rc2)	2024-06-17 15:56:16 -04:00
Lei Jitang	4ad0d4d6d3	Fix a build warning (#5096) Signed-off-by: Lei Jitang <leijitang@outlook.com>	2024-06-17 14:47:48 -04:00
Jeffrey Morgan	163cd3e77c	gpu: add env var for detecting Intel oneapi gpus (#5076) * gpu: add env var for detecting intel oneapi gpus * fix build error	2024-06-16 20:09:05 -04:00
Daniel Hiltgen	4c2c8f93dd	Merge pull request #5080 from dhiltgen/debug_intel_crash Add some more debugging logs for intel discovery	2024-06-16 14:42:41 -07:00
Daniel Hiltgen	fd1e6e0590	Add some more debugging logs for intel discovery Also removes an unused overall count variable	2024-06-16 07:42:52 -07:00
royjhan	89c79bec8c	Add ModifiedAt Field to /api/show (#5033) * Add Mod Time to Show * Error Handling	2024-06-15 20:53:56 -07:00
Jeffrey Morgan	c7b77004e3	docs: add missing powershell package to windows development instructions (#5075) * docs: add missing instruction for powershell build The powershell script for building Ollama on Windows now requires the `ThreadJob` module. Add this to the instructions and dependency list. * Update development.md	2024-06-15 23:08:09 -04:00
pufferffish	b6554e9b8c	fix vulkan handle releasing	2024-06-15 21:11:07 +01:00
Daniel Hiltgen	07d143f412	Merge pull request #5058 from coolljt0725/fix_build_warning gpu: Fix build warning	2024-06-15 11:52:36 -07:00
Daniel Hiltgen	a12283e2ff	Implement custom github release action This implements the release logic we want via gh cli to support updating releases with rc tags in place and retain release notes and other community reactions.	2024-06-15 11:36:56 -07:00
Daniel Hiltgen	4b0050cf0e	Merge pull request #5037 from dhiltgen/faster_win_build More parallelism on windows generate (tag: v0.1.45-rc1)	2024-06-15 08:03:05 -07:00
Daniel Hiltgen	0577af98f4	More parallelism on windows generate Make the build faster	2024-06-15 07:44:55 -07:00
Daniel Hiltgen	17ce203a26	Merge pull request #4875 from dhiltgen/rocm_gfx900_workaround Rocm gfx900 workaround	2024-06-15 07:38:58 -07:00
Daniel Hiltgen	d76555ffb5	Merge pull request #4874 from dhiltgen/rocm_v6_bump Rocm v6 bump	2024-06-15 07:38:32 -07:00
Daniel Hiltgen	2786dff5d3	Merge pull request #4264 from dhiltgen/show_gpu_visible_settings Centralize GPU configuration vars	2024-06-15 07:33:52 -07:00
DSLstandard	b958cd2848	remove cap_get_bound check	2024-06-15 20:19:19 +08:00
KOISHI KOMEIJI FROM TOUHOU 11	e3f9ca4009	fix check_perfmon len	2024-06-15 20:13:15 +08:00
pufferffish	38466f1821	fix build	2024-06-15 12:06:43 +01:00
pufferffish	18f3f960b0	update gpu.go	2024-06-15 12:05:01 +01:00
pufferffish	e77ea68e11	Merge branch 'refs/heads/main' into vulkan # Conflicts: # gpu/gpu.go	2024-06-15 12:01:36 +01:00
pufferffish	11c55fab81	fix total memory monitor	2024-06-15 10:58:12 +01:00
pufferffish	257364cb3c	fix free memory monitor	2024-06-15 10:52:34 +01:00
pufferffish	e4e8a5d25a	fix compilation	2024-06-15 09:44:10 +01:00
pufferffish	724fac470f	fix segfault	2024-06-15 08:05:48 +01:00
pufferffish	24c8840037	it builds	2024-06-15 07:49:28 +01:00
Lei Jitang	225f0d1219	gpu: Fix build warning Signed-off-by: Lei Jitang <leijitang@outlook.com>	2024-06-15 14:26:23 +08:00
pufferffish	93c4d69daa	add support in gen_linux.sh	2024-06-15 05:42:59 +01:00
pufferffish	9c6b049567	add support in gpu.go	2024-06-15 05:27:14 +01:00
Daniel Hiltgen	532db58311	Merge pull request #4972 from jayson-cloude/main fix: "Skip searching for network devices"	2024-06-14 17:04:40 -07:00
Daniel Hiltgen	6be309e1bd	Centralize GPU configuration vars This should aid in troubleshooting by capturing and reporting the GPU settings at startup in the logs along with all the other server settings.	2024-06-14 15:59:10 -07:00
Daniel Hiltgen	da3bf23354	Workaround gfx900 SDMA bugs Implement support for GPU env var workarounds, and leverage this for the Vega RX 56 which needs HSA_ENABLE_SDMA=0 set to work properly	2024-06-14 15:38:13 -07:00
Daniel Hiltgen	26ab67732b	Bump ROCm linux to 6.1.1	2024-06-14 15:37:54 -07:00
Daniel Hiltgen	45cacbaf05	Merge pull request #4517 from dhiltgen/gpu_incremental Enhanced GPU discovery and multi-gpu support with concurrency	2024-06-14 15:35:00 -07:00
Daniel Hiltgen	17df6520c8	Remove mmap related output calc logic	2024-06-14 14:55:50 -07:00
Daniel Hiltgen	6f351bf586	review comments and coverage	2024-06-14 14:55:50 -07:00
Daniel Hiltgen	ff4f0cbd1d	Prevent multiple concurrent loads on the same gpus While models are loading, the VRAM metrics are dynamic, so try to load on a GPU that doesn't have a model actively loading, or wait to avoid races that lead to OOMs	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	fc37c192ae	Refine CPU load behavior with system memory visibility	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	434dfe30c5	Reintroduce nvidia nvml library for windows This library will give us the most reliable free VRAM reporting on windows to enable concurrent model scheduling.	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	4e2b7e181d	Refactor intel gpu discovery	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	48702dd149	Harden unload for empty runners	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	68dfc6236a	refined test timing adjust timing on some tests so they don't timeout on small/slow GPUs	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	5e8ff556cb	Support forced spreading for multi GPU Our default behavior today is to try to fit into a single GPU if possible. Some users would prefer the old behavior of always spreading across multiple GPUs even if the model can fit into one. This exposes that tunable behavior.	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	6fd04ca922	Improve multi-gpu handling at the limit Still not complete, needs some refinement to our prediction to understand the discrete GPUs available space so we can see how many layers fit in each one since we can't split one layer across multiple GPUs we can't treat free space as one logical block	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	206797bda4	Fix concurrency integration test to work locally This worked remotely but wound up trying to spawn multiple servers locally which doesn't work	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	43ed358f9a	Refine GPU discovery to bootstrap once Now that we call the GPU discovery routines many times to update memory, this splits initial discovery from free memory updating.	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	b32ebb4f29	Use DRM driver for VRAM info for amd The amdgpu drivers free VRAM reporting omits some other apps, so leverage the upstream DRM driver which keeps better tabs on things	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	fb9cdfa723	Fix server.cpp for the new cuda build macros	2024-06-14 14:51:40 -07:00
Daniel Hiltgen	efac488675	Revert "Limit GPU lib search for now (#4777)" This reverts commit `476fb8e892`.	2024-06-14 14:51:40 -07:00

4610 Commits