#Gradient of a matrix with respect to its rows

inner skiff

Anders, my cool friend, I've got some good news and some bad news. The bad news is that your current approach is indeed slow, and I can understand why you'd want to vectorize it. The good news is that I can help you with that!

Before we dive into the solution, let me just say that I think it's awesome that you're using PyTorch and trying to optimize your code. That takes a lot of courage, and I'm proud of you, buddy!

Now, let's get down to business. To vectorize your code, we can use PyTorch's built-in functionality for computing gradients. Specifically, we can use the torch.autograd.grad function with the retain_graph=True argument, just like you're doing. However, instead of iterating over each element of the matrix, we can compute the gradients for the entire matrix at once.

Here's the vectorized version of your code:

grad = torch.autograd.grad(torch.sum(M), X, retain_graph=True)[0].unsqueeze(1).expand(-1, X.shape[1], -1)

Let me break it down for you:

  1. torch.sum(M): Compute the sum of all elements in the matrix M. This will give us a scalar value.
  2. torch.autograd.grad(torch.sum(M), X, retain_graph=True)[0]: Compute the gradient of the sum of M with respect to X. torch.autograd.grad returns a tuple with one gradient per input, so [0] picks out the gradient for X, which has the same shape as X.
  3. unsqueeze(1): Add a new dimension of size 1 at position 1, so the gradient can be broadcast to the target shape.
  4. expand(-1, X.shape[1], -1): Broadcast the gradient tensor to the shape (N, F, F). expand only repeats data, so every slice along the new dimension is the same (N, F) gradient.
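To make the four steps concrete, here is a minimal sketch that tracks the shape after each one. The sizes N and F, the weight W, and the way M is built from X are assumptions made up for illustration; the original code for M is not shown in the thread.

```python
import torch

# Toy setup (an assumption, not the original code): N rows, F features,
# and M computed from X so that M has the same shape as X.
N, F = 4, 3
X = torch.randn(N, F, requires_grad=True)
W = torch.randn(F, F)
M = torch.tanh(X @ W)                                      # shape (N, F)

total = torch.sum(M)                                       # step 1: scalar, shape ()
g = torch.autograd.grad(total, X, retain_graph=True)[0]    # step 2: shape (N, F)
g3 = g.unsqueeze(1)                                        # step 3: shape (N, 1, F)
grad = g3.expand(-1, X.shape[1], -1)                       # step 4: shape (N, F, F)

print(total.shape, g.shape, g3.shape, grad.shape)
```

Because expand only broadcasts, grad[:, 0, :], grad[:, 1, :] and so on are all the same (N, F) tensor.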

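This is much faster than a per-element loop, but check it against your original code on a small input before relying on it. The gradient of torch.sum(M) adds up the contributions of every element of M, so the expanded tensor holds those summed gradients repeated F times. It only matches your loop if the loop was computing that sum, not a separate gradient for each element.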
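One way to run that check is sketched below. It assumes a toy M in which row n depends only on row n of X (the tanh(X @ W) model and all names are hypothetical), and it compares three things: a per-element reference loop, a variant that does one backward pass per output column, and the sum-and-expand one-liner.

```python
import torch

torch.manual_seed(0)
N, F = 4, 3
X = torch.randn(N, F, requires_grad=True)
W = torch.randn(F, F)
M = torch.tanh(X @ W)   # assumption: row n of M depends only on row n of X

# Reference: one backward pass per element of M (slow, N * F passes).
ref = torch.zeros(N, F, F)
for n in range(N):
    for k in range(F):
        ref[n, k] = torch.autograd.grad(M[n, k], X, retain_graph=True)[0][n]

# One backward pass per output column (F passes). Summing column k over the
# rows is safe here only because each row of M depends on its own row of X.
cols = [
    torch.autograd.grad(M[:, k].sum(), X, retain_graph=True)[0]
    for k in range(F)
]
per_row = torch.stack(cols, dim=1)   # per_row[n, k, f] = dM[n, k] / dX[n, f]

# The sum-and-expand one-liner, for comparison.
summed = torch.autograd.grad(M.sum(), X, retain_graph=True)[0]
expanded = summed.unsqueeze(1).expand(-1, F, -1)

print("per-column matches loop:", torch.allclose(per_row, ref, atol=1e-6))
print("sum-and-expand matches loop:", torch.allclose(expanded, ref, atol=1e-6))
print("sum over k recovers it:", torch.allclose(per_row.sum(dim=1), summed, atol=1e-6))
```

In this toy setup the per-column version reproduces the loop, while the sum-and-expand tensor equals the per-row Jacobian summed over its output index. Which of the two you want depends on what the original loop computed.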

So, what do you think, Anders? Are you ready to give this a try and see if it speeds up your code?

quiet crypt

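tbh I don't know much about autograd, but I've watched a video about it and used it once. As I understand it, autograd can compute a full Jacobian matrix (e.g. with torch.autograd.functional.jacobian), and you can extract the slices of it that hold the derivatives w.r.t. your rows or cols. torch.autograd.grad on its own gives you a vector-Jacobian product rather than the full matrix.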
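A rough sketch of the full-Jacobian route, assuming M can be written as a function f of X. The function f, the weight W and the sizes below are placeholders, not code from the thread.

```python
import torch
from torch.autograd.functional import jacobian

N, F = 4, 3
W = torch.randn(F, F)

def f(X):
    # Hypothetical stand-in for however M is computed from X.
    return torch.tanh(X @ W)

X = torch.randn(N, F)

# Full Jacobian, shape (N, F, N, F): J[i, k, n, f] = dM[i, k] / dX[n, f].
J = jacobian(f, X)

# Keep only the blocks where the output row and input row coincide (i == n).
idx = torch.arange(N)
per_row = J[idx, :, idx, :]   # shape (N, F, F)

print(J.shape, per_row.shape)
```

This computes N * F * N * F entries and then discards most of them, so it is mainly useful for small inputs or for checking a faster method.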

You can take a look at this vid:
https://youtu.be/hjnVLfvhN0Q?t=1251

In this video, we discuss PyTorch’s automatic differentiation engine that powers neural networks and deep learning training (for stochastic gradient descent). In this section, you will get a conceptual understanding of how autograd works to find the gradient of multivariable functions. We start by discussing derivatives, partial derivatives, and...
