ollama

Commit Graph

Author	SHA1	Message	Date
Bruce MacDonald	057cc54b66	benchmark: compare backend graph computation times. Track execution time of individual tensor operations (views, copies, reshapes, etc.) during LLM forward passes using CGo bindings to the native graph runtime. This helps identify performance bottlenecks in the computation graph and optimize memory operations that can significantly impact inference latency.	2025-02-19 15:22:53 -08:00

1 Commit