Change HumanBytes to use binary prefixes (1024-based) instead of decimal
(1000-based) for file size formatting. This fixes the rounding error where
3072 bytes was displayed as '3.1 KB' instead of '3 KB'.
The decimal constants (KiloByte, MegaByte, etc.) are preserved for other
uses like buffer sizes and memory capacity.
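
As a rough illustration of the 1024-based formatting described above, here is a
minimal sketch. The function name humanBytesBinary and the constants kibi, mebi,
gibi, and tebi are hypothetical and are not taken from the actual change. The
rounding rule (one decimal place, trailing ".0" dropped) is also an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Binary (1024-based) thresholds. These names are hypothetical; they are kept
// separate from the decimal KiloByte/MegaByte constants, which stay 1000-based.
const (
	kibi = 1 << 10
	mebi = 1 << 20
	gibi = 1 << 30
	tebi = 1 << 40
)

// humanBytesBinary is an illustrative stand-in, not the real HumanBytes.
// It picks the largest binary unit that fits and formats with one decimal.
func humanBytesBinary(b uint64) string {
	var value float64
	var unit string
	switch {
	case b >= tebi:
		value, unit = float64(b)/tebi, "TB"
	case b >= gibi:
		value, unit = float64(b)/gibi, "GB"
	case b >= mebi:
		value, unit = float64(b)/mebi, "MB"
	case b >= kibi:
		value, unit = float64(b)/kibi, "KB"
	default:
		return fmt.Sprintf("%d B", b)
	}
	// Round to one decimal place, then drop a trailing ".0" (assumed style).
	s := strconv.FormatFloat(value, 'f', 1, 64)
	s = strings.TrimSuffix(s, ".0")
	return s + " " + unit
}
```

With this sketch, 3072 bytes divides evenly by 1024 and formats as "3 KB",
whereas dividing by 1000 gives 3.072, which rounds to "3.1 KB".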
Fixes #13405
Removed redundant checks and streamlined the switch-case structure.
Added test cases for both HumanBytes and HumanBytes2 to cover a wide range of scenarios.
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. By default, each
model handles 1 concurrent request and only 1 model is loaded at a time;
both limits can be adjusted by setting OLLAMA_NUM_PARALLEL and
OLLAMA_MAX_LOADED_MODELS.