ollama

Bruce MacDonald 057cc54b66 benchmark: compare backend graph computation times	2025-02-19 15:22:53 -08:00

Track execution time of individual tensor operations (views, copies, reshapes, etc.) during LLM forward passes using CGo bindings to the native graph runtime. This helps identify performance bottlenecks in the computation graph and optimize memory operations that can significantly impact inference latency.

backend	benchmark: compare backend graph computation times	2025-02-19 15:22:53 -08:00
nn	next ollama runner (#7913)	2025-02-13 16:31:21 -08:00
backend.go	benchmark: compare backend graph computation times	2025-02-19 15:22:53 -08:00