Commit Graph

141 Commits

Author SHA1 Message Date
pufferffish 7fe16eaab8 Merge remote-tracking branch 'upstream/main' into vulkan 2024-06-28 08:47:37 +01:00
Josh Yan 662568d453 err!=nil check 2024-06-20 09:30:59 -07:00
Josh Yan 4ebb66c662 reformat error check 2024-06-20 09:23:43 -07:00
Josh Yan 23e899f32d skip os.removeAll() if PID does not exist 2024-06-20 08:51:35 -07:00
Daniel Hiltgen 96624aa412 Merge pull request #5072 from dhiltgen/windows_path
Move libraries out of users path
2024-06-19 09:13:39 -07:00
Daniel Hiltgen 10f33b8537 Merge pull request #5146 from dhiltgen/backout
Put back temporary intel GPU env var
2024-06-19 09:12:45 -07:00
Daniel Hiltgen d34d88e417 Revert "Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)""
This reverts commit 755b4e4fc2.
2024-06-19 08:57:41 -07:00
Daniel Hiltgen 52ce350b7a Fix bad symbol load detection
pointer deref's weren't correct on a few libraries, which explains
some crashes on older systems or miswired symlinks for discovery libraries.
2024-06-19 08:39:07 -07:00
Wang,Zhe badf975e45 get real func ptr. 2024-06-19 09:00:51 +08:00
Wang,Zhe 755b4e4fc2 Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)"
This reverts commit 163cd3e77c.
2024-06-19 08:59:58 +08:00
Daniel Hiltgen b2799f111b Move libraries out of users path
We update the PATH on windows to get the CLI mapped, but this has
an unintended side effect of causing other apps that may use our bundled
DLLs to get terminated when we upgrade.
2024-06-17 13:12:18 -07:00
Lei Jitang 4ad0d4d6d3 Fix a build warning (#5096)
Signed-off-by: Lei Jitang <leijitang@outlook.com>
2024-06-17 14:47:48 -04:00
Jeffrey Morgan 163cd3e77c gpu: add env var for detecting Intel oneapi gpus (#5076)
* gpu: add env var for detecting intel oneapi gpus
* fix build error
2024-06-16 20:09:05 -04:00
Daniel Hiltgen fd1e6e0590 Add some more debugging logs for intel discovery
Also removes an unused overall count variable
2024-06-16 07:42:52 -07:00
pufferffish b6554e9b8c fix vulkan handle releasing 2024-06-15 21:11:07 +01:00
Daniel Hiltgen 07d143f412 Merge pull request #5058 from coolljt0725/fix_build_warning
gpu: Fix build warning
2024-06-15 11:52:36 -07:00
Daniel Hiltgen 17ce203a26 Merge pull request #4875 from dhiltgen/rocm_gfx900_workaround
Rocm gfx900 workaround
2024-06-15 07:38:58 -07:00
DSLstandard b958cd2848 remove cap_get_bound check 2024-06-15 20:19:19 +08:00
KOISHI KOMEIJI FROM TOUHOU 11 e3f9ca4009 fix check_perfmon len 2024-06-15 20:13:15 +08:00
pufferffish 38466f1821 fix build 2024-06-15 12:06:43 +01:00
pufferffish 18f3f960b0 update gpu.go 2024-06-15 12:05:01 +01:00
pufferffish e77ea68e11 Merge branch 'refs/heads/main' into vulkan
# Conflicts:
#	gpu/gpu.go
2024-06-15 12:01:36 +01:00
pufferffish 11c55fab81 fix total memory monitor 2024-06-15 10:58:12 +01:00
pufferffish 257364cb3c fix free memory monitor 2024-06-15 10:52:34 +01:00
pufferffish e4e8a5d25a fix compilation 2024-06-15 09:44:10 +01:00
pufferffish 724fac470f fix segfault 2024-06-15 08:05:48 +01:00
pufferffish 24c8840037 it builds 2024-06-15 07:49:28 +01:00
Lei Jitang 225f0d1219 gpu: Fix build warning
Signed-off-by: Lei Jitang <leijitang@outlook.com>
2024-06-15 14:26:23 +08:00
pufferffish 9c6b049567 add support in gpu.go 2024-06-15 05:27:14 +01:00
Daniel Hiltgen 6be309e1bd Centralize GPU configuration vars
This should aid in troubleshooting by capturing and reporting the GPU
settings at startup in the logs along with all the other server settings.
2024-06-14 15:59:10 -07:00
Daniel Hiltgen da3bf23354 Workaround gfx900 SDMA bugs
Implement support for GPU env var workarounds, and leverage
this for the Vega RX 56 which needs
HSA_ENABLE_SDMA=0 set to work properly
2024-06-14 15:38:13 -07:00
Daniel Hiltgen 6f351bf586 review comments and coverage 2024-06-14 14:55:50 -07:00
Daniel Hiltgen fc37c192ae Refine CPU load behavior with system memory visibility 2024-06-14 14:51:40 -07:00
Daniel Hiltgen 434dfe30c5 Reintroduce nvidia nvml library for windows
This library will give us the most reliable free VRAM reporting on windows
to enable concurrent model scheduling.
2024-06-14 14:51:40 -07:00
Daniel Hiltgen 4e2b7e181d Refactor intel gpu discovery 2024-06-14 14:51:40 -07:00
Daniel Hiltgen 6fd04ca922 Improve multi-gpu handling at the limit
Still not complete, needs some refinement to our prediction to understand the
discrete GPUs available space so we can see how many layers fit in each one
since we can't split one layer across multiple GPUs we can't treat free space
as one logical block
2024-06-14 14:51:40 -07:00
Daniel Hiltgen 43ed358f9a Refine GPU discovery to bootstrap once
Now that we call the GPU discovery routines many times to
update memory, this splits initial discovery from free memory
updating.
2024-06-14 14:51:40 -07:00
Daniel Hiltgen b32ebb4f29 Use DRM driver for VRAM info for amd
The amdgpu drivers free VRAM reporting omits some other apps, so leverage the
upstream DRM driver which keeps better tabs on things
2024-06-14 14:51:40 -07:00
Daniel Hiltgen efac488675 Revert "Limit GPU lib search for now (#4777)"
This reverts commit 476fb8e892.
2024-06-14 14:51:40 -07:00
pufferffish f46b4a6fa2 implement the vulkan C backend 2024-06-14 19:56:35 +01:00
Daniel Hiltgen aac367636d Actually skip PhysX on windows 2024-06-13 13:17:19 -07:00
Michael Yang e919f6811f lint windows 2024-06-04 11:13:30 -07:00
Michael Yang bf7edb0d5d lint linux 2024-06-04 11:13:30 -07:00
Michael Yang e40145a39d lint 2024-06-04 11:13:30 -07:00
Jeffrey Morgan 476fb8e892 Limit GPU lib search for now (#4777)
* fix oneapi errors on windows 10
2024-06-01 19:24:33 -07:00
Daniel Hiltgen 646371f56d Merge pull request #3278 from zhewang1-intc/rebase_ollama_main
Enabling ollama to run on Intel GPUs with SYCL backend
2024-05-28 16:30:50 -07:00
Patrick Devine 4cc3be3035 Move envconfig and consolidate env vars (#4608) 2024-05-24 14:57:15 -07:00
Wang,Zhe fd5971be0b support ollama run on Intel GPUs 2024-05-24 11:18:27 +08:00
Daniel Hiltgen 30a7d7096c Bump VRAM buffer back up
Under stress scenarios we're seeing OOMs so this should help stabilize
the allocations under heavy concurrency stress.
2024-05-10 09:15:28 -07:00
Daniel Hiltgen 354ad9254e Wait for GPU free memory reporting to converge
The GPU drivers take a while to update their free memory reporting, so we need
to wait until the values converge with what we're expecting before proceeding
to start another runner in order to get an accurate picture.
2024-05-09 14:56:01 -07:00