To give a bit more background, here's a pointer to the autodiff_function and autodiff_function_extract instructions in the SIL language manual: swift/SIL.rst at tensorflow · apple/swift · GitHub.
At a high level, the autodiff_function instruction turns a normal function of type (T) -> U into a differentiable function of type @differentiable (T) -> U. At the SIL level, this is achieved by forming a bundle of the original function and two derivative functions, called the "JVP" (Jacobian-vector products function) and the "VJP" (vector-Jacobian products function).
autodiff_function [wrt 0] [order 1] %original : $(T) -> U
with {%jvp : $(T) -> (U, (T.A) -> U.A), %vjp : $(T) -> (U, (U.B) -> T.B)}
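For concreteness, here's a hypothetical instantiation of the same form, assuming T = U = Float and that Float's tangent and cotangent associated types are both Float (so T.A, U.A, U.B, and T.B all become Float):
autodiff_function [wrt 0] [order 1] %original : $(Float) -> Float
with {%jvp : $(Float) -> (Float, (Float) -> Float), %vjp : $(Float) -> (Float, (Float) -> Float)}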
The autodiff_function_extract instruction takes a @differentiable function and extracts either the original function or one of the derivative functions.
autodiff_function_extract [original] [order 1] %f : $@differentiable (T) -> U
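The extractee kind can also be [jvp] or [vjp]. The result types shown in the comments below are a sketch of what is expected to be inferred from the original function type, reusing the T.A/T.B shorthand from above (the exact extractee spelling may differ):
autodiff_function_extract [jvp] [order 1] %f : $@differentiable (T) -> U
// expected result type: $(T) -> (U, (T.A) -> U.A)
autodiff_function_extract [vjp] [order 1] %f : $@differentiable (T) -> U
// expected result type: $(T) -> (U, (U.B) -> T.B)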
Since the derivative functions' types are not carried in the @differentiable (T) -> U type, we must be able to infer the expected derivative function types from the original function type alone. Thus, in an autodiff_function instruction, the derivative functions' types must align directly with the original function's type.
However, what makes this difficult is that the derivative functions involve a few extra types: the associated types T.A, T.B, U.A, and U.B. LoadableByAddress doesn't yet respect this type correspondence requirement for autodiff_function's original and derivative functions. So if T happens to be a large loadable type but T.A isn't, LoadableByAddress generates the following incorrect SIL, where we would expect the T.A argument to also be made indirect to match:
// Wrong! We expect `T.A` to also become indirect.
autodiff_function [wrt 0] [order 1] %original : $(@in_guaranteed T) -> U
with {%jvp : $(@in_guaranteed T) -> (U, (T.A) -> U.A), %vjp : $(@in_guaranteed T) -> (U, (U.B) -> T.B)}
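For comparison, here's a sketch of the SIL we'd expect instead, where LoadableByAddress also rewrites the JVP differential's T.A argument to be indirect so that the derivative types stay in correspondence with the original (the exact conventions here are illustrative):
// Expected: `T.A` also becomes indirect.
autodiff_function [wrt 0] [order 1] %original : $(@in_guaranteed T) -> U
with {%jvp : $(@in_guaranteed T) -> (U, (@in_guaranteed T.A) -> U.A), %vjp : $(@in_guaranteed T) -> (U, (U.B) -> T.B)}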
As you can see, autodiff_function is like a function conversion, except that it involves more function operands and requires those operands' types to stay in correspondence. What would be a good way to fix this?
@shajrawi