Differentiable programming for gradient-based machine learning

I'm trying to figure out whether there's a problem with MPS that I can solve in the S4TF backend. The GitHub repository may not have compared the two GPUs fairly (other factors that its authors didn't realize were affecting performance could be involved), which makes it hard for me to draw conclusions that I can use for optimization.