The Accelerate framework provides high-performance functions for working with vectors and matrices via vDSP and vForce, but it also provides access to functions in BLAS and LAPACK. Consequently, some of the functions in Accelerate seem redundant. For example, vDSP_mmulD performs double-precision matrix multiplication, but cblas_dgemm can also be used to multiply double-precision matrices. Other than the interface, both of these functions seem to do the same thing. So in general, when should I use a vDSP or vForce function instead of a similar BLAS or LAPACK function? Do functions in vDSP and vForce automatically take advantage of Apple Silicon features whereas BLAS and LAPACK functions do not?
This is really a question for the Apple Developer Forums, but since I was one of the maintainers of these libraries for many years, I'll answer it here and save you the trip:
Generally speaking, there is not a meaningful difference between using the BLAS, vDSP, BNNS, or vImage interfaces for the same operation. They support slightly different operations with respect to strides (vDSP and BLAS have different conventions for negative strides, for example) and offer different control over threading in some cases (e.g., vImage lets you specify kvImageDoNotTile on individual operations), but they are essentially equivalent.
There are sometimes cases where one may not be as well optimized as the other. When this happens, it is a bug that will be fixed if you report it, but someone has to report it. In general, your best bet to avoid this is to use the API that best matches the operation. If you are doing matrix multiplications, use the BLAS API for that. If an API makes an operation more idiomatic, that's a good hint that it's the preferred API to use for that operation.
Thank you for the suggestions. I guess I'll stick with using BLAS and LAPACK through Accelerate for vector and matrix arithmetic and linear algebra, and turn to vDSP, vImage, vForce, and so on for everything else.