* Bring back escape valve for LLM libraries
If the new discovery logic picks the wrong library, this gives users the
ability to force a specific one using the same pattern as before. This
can also speed up bootstrap discovery when one of the libraries takes a
long time to load and ultimately binds to no devices. For example,
unsupported AMD iGPUs can sometimes take a while to discover and rule out.
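
A minimal sketch of the override idea, not the actual implementation. The
environment variable name LLM_LIBRARY_OVERRIDE, the function
selectLibraries, and the library names are hypothetical; the fallback to
full discovery on an unknown value is also an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// overrideEnv is a hypothetical variable name used only for this sketch.
const overrideEnv = "LLM_LIBRARY_OVERRIDE"

// selectLibraries returns the libraries that discovery should load.
// When forced names one of the candidates, only that library is kept,
// so slow or unusable libraries are never loaded at all.
func selectLibraries(candidates []string, forced string) []string {
	if forced == "" {
		return candidates
	}
	for _, c := range candidates {
		if c == forced {
			return []string{c}
		}
	}
	// Assumed behavior: an unknown override falls back to full discovery.
	return candidates
}

func main() {
	candidates := []string{"cpu", "cuda_v12", "rocm"}
	fmt.Println(selectLibraries(candidates, os.Getenv(overrideEnv)))
}
```

With the variable unset, every candidate is probed as usual; setting it to
one of the candidate names skips loading the others.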
* Bypass extra discovery on jetpack systems
On at least Jetpack6, cuda_v12 appears to expose the iGPU but crashes
later on in cublasInit, so if we detect a Jetpack system, short-circuit
and use the matching Jetpack variant.
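
The short-circuit can be sketched roughly as follows. This is an
illustration under assumptions: variantsToProbe, the "jetpack6" string, and
the "cuda_" + version naming are hypothetical, and how a Jetpack system is
actually detected is left out.

```go
package main

import "fmt"

// variantsToProbe returns the library variants that discovery should try.
// jetpack is the detected Jetpack identifier, or "" on other systems.
// On a Jetpack system only the matching variant is returned, so a generic
// variant that would crash later is never loaded.
func variantsToProbe(jetpack string, discovered []string) []string {
	if jetpack != "" {
		// Hypothetical naming scheme for the Jetpack-specific variant.
		return []string{"cuda_" + jetpack}
	}
	return discovered
}

func main() {
	discovered := []string{"cuda_v11", "cuda_v12"}
	fmt.Println(variantsToProbe("jetpack6", discovered))
	fmt.Println(variantsToProbe("", discovered))
}
```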