Suppose I have a differential equation

$\frac{d}{dt}y(t,\lambda) = f(y(t,\lambda),\lambda)$

where $\lambda$ is some parameter of the system. The initial value $y(0,\lambda)$ is specified.

I am looking for a good way to estimate $\frac{d}{d\lambda}y(t,\lambda)$ at finite $t$.

Using the RK4 method, I can approximate $y$ (taking small steps of size $h$) at time $t=(n+1)h$ as follows:

$y(t,\lambda) \simeq y_{n+1} = y_n + \frac{h}{6}\left( k_1 + 2k_2 + 2k_3 + k_4 \right)$

where $y_0 = y(0,\lambda)$ and

$k_1=f(y_n,\lambda)$

$k_2=f(y_n + \frac{h}{2}k_1,\lambda)$

$k_3=f(y_n + \frac{h}{2}k_2,\lambda)$

$k_4=f(y_n + hk_3,\lambda)$

So far, so good.
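For concreteness, here is a minimal Python sketch of this iteration (the function names `rk4_step` and `integrate` and the calling convention `f(y, lam)` are my own illustrative choices, not part of the question):

```python
def rk4_step(f, y, lam, h):
    """One classical RK4 step for dy/dt = f(y, lambda)."""
    k1 = f(y, lam)
    k2 = f(y + 0.5 * h * k1, lam)
    k3 = f(y + 0.5 * h * k2, lam)
    k4 = f(y + h * k3, lam)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate(f, y0, lam, h, n_steps):
    """Iterate RK4 from y(0) = y0 to approximate y(n_steps * h, lambda)."""
    y = y0
    for _ in range(n_steps):
        y = rk4_step(f, y, lam, h)
    return y
```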

Of course, I can estimate the derivative numerically: plug in a slightly perturbed value $\lambda+\epsilon$, iterate RK4 $(n+1)$ times from the initial condition to approximate $y(t,\lambda+\epsilon)$, and then compute $\frac{y(t,\lambda+\epsilon) - y(t,\lambda)}{\epsilon}$.
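As a sketch of this finite-difference estimate (reusing `integrate` from above; the example $f(y,\lambda) = -\lambda y$, the initial condition, and the value of `eps` are arbitrary choices for illustration):

```python
def dydlam_fd(f, y0_of_lam, lam, h, n_steps, eps=1e-6):
    """Forward-difference estimate of dy/dlam at t = n_steps * h.
    y0_of_lam maps lambda to the (possibly lambda-dependent) initial value."""
    y_plus = integrate(f, y0_of_lam(lam + eps), lam + eps, h, n_steps)
    y_base = integrate(f, y0_of_lam(lam), lam, h, n_steps)
    return (y_plus - y_base) / eps

# Example: dy/dt = -lam * y has y(t) = y0 * exp(-lam * t),
# so dy/dlam at t = 1, lam = 0.5 should be about -exp(-0.5) ~ -0.6065.
f = lambda y, lam: -lam * y
print(dydlam_fd(f, lambda lam: 1.0, lam=0.5, h=0.01, n_steps=100))
```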

Is there a systematic improvement on this? I started trying to compute the derivatives of $k_1$, $k_2$, etc. by making the substitution $\lambda \to \lambda+\epsilon$, but the expressions got messy quickly and seemed to indicate that I'd need higher-order derivatives to get correct results (which doesn't seem much in the spirit of RK4). It will also be even more complicated once I chain multiple RK4 steps (though for the system I'm interested in, $f$ varies slowly enough that a single Runge-Kutta step might be sufficient; I'm not sure).

Anyway, this seems like an obvious enough problem that I'm probably reinventing the wheel...

Update (4/25 4:24pm): corrected RHS of ODE from $f(t,\lambda)$ to $f(y(t,\lambda),\lambda)$.

  • The usual way is to introduce an additional component $z=\partial_\lambda y(t,\lambda)$ and augment the system with $\dot z=\partial_\lambda f(y,\lambda)+ \partial_y f(y,\lambda)\,z$ (see the sketch after these comments). – Lutz Lehmann Apr 25 '23 at 22:30
  • Thanks @LutzLehmann, that does make sense; do I understand you correctly that this is still a numerical derivative using a finite difference $\delta_\lambda$? – Ian Holmes Apr 25 '23 at 23:26
  • Yes, it is a numerical derivative if a numerical ODE solver is used, but no, it is not a finite difference approximation. – Lutz Lehmann Apr 26 '23 at 04:36
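To make the first comment concrete, here is a sketch of that augmented-system ("sensitivity equation") approach, reusing `rk4_step` from above; the partial derivatives `f_y` and `f_lam` must be supplied by the user, and the example $f$ is again my own assumption:

```python
import numpy as np

def augmented(f, f_y, f_lam):
    """Right-hand side for the augmented state (y, z), z = dy/dlam,
    using the variational equation z' = f_y * z + f_lam."""
    def g(state, lam):
        y, z = state
        return np.array([f(y, lam), f_y(y, lam) * z + f_lam(y, lam)])
    return g

# Example: dy/dt = -lam * y, so f_y = -lam and f_lam = -y;
# z(0) = 0 because the initial condition here does not depend on lam.
g = augmented(lambda y, lam: -lam * y,
              lambda y, lam: -lam,
              lambda y, lam: -y)
state = np.array([1.0, 0.0])
for _ in range(100):
    state = rk4_step(g, state, 0.5, 0.01)
print(state)  # ~ [exp(-0.5), -exp(-0.5)]
```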

1 Answer

Thinking more about this, I guess the "obvious" way is just to view the Runge-Kutta iteration as a computation graph with repeating structure and apply the chain rule through it repeatedly, backpropagation-style; that is, since

$y_{n+1} = y_n + \frac{h}{6}\left( k_{1,n} + 2k_{2,n} + 2k_{3,n} + k_{4,n} \right)$

$k_{1,n}=f(y_n,\lambda)$

$k_{2,n}=f(y_n + \frac{h}{2}k_{1,n},\lambda)$

$k_{3,n}=f(y_n + \frac{h}{2}k_{2,n},\lambda)$

$k_{4,n}=f(y_n + hk_{3,n},\lambda)$

then if we write $D_\lambda \equiv \frac{d}{d\lambda}$ for the (backpropagated) differentiation operator, and $f_y \equiv \frac{\partial f}{\partial y}$ and $f_\lambda \equiv \frac{\partial f}{\partial \lambda}$ for the partial derivatives of $f$, we have

$D_\lambda y_{n+1} = D_\lambda y_n + \frac{h}{6}\left( D_\lambda k_{1,n} + 2D_\lambda k_{2,n} + 2D_\lambda k_{3,n} + D_\lambda k_{4,n} \right)$

$D_\lambda k_{1,n} = f_y(y_n,\lambda) \cdot D_\lambda y_n + f_\lambda(y_n,\lambda)$

$D_\lambda k_{2,n} = f_y(y_n + \frac{h}{2} k_{1,n},\lambda) \cdot \left( D_\lambda y_n + \frac{h}{2} D_\lambda k_{1,n} \right) + f_\lambda(y_n + \frac{h}{2} k_{1,n},\lambda)$

$D_\lambda k_{3,n} = f_y(y_n + \frac{h}{2} k_{2,n},\lambda) \cdot \left( D_\lambda y_n + \frac{h}{2} D_\lambda k_{2,n} \right) + f_\lambda(y_n + \frac{h}{2} k_{2,n},\lambda)$

$D_\lambda k_{4,n} = f_y(y_n + h k_{3,n},\lambda) \cdot \left( D_\lambda y_n + h D_\lambda k_{3,n} \right) + f_\lambda(y_n + h k_{3,n},\lambda)$
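In code, one such differentiated step might look like the following sketch, propagating the pair $(y_n, D_\lambda y_n)$ together (again, `f_y` and `f_lam` are user-supplied partial derivatives, and the function name is my own):

```python
def rk4_step_with_sensitivity(f, f_y, f_lam, y, Dy, lam, h):
    """One RK4 step that also propagates Dy = D_lambda y via the chain rule."""
    k1 = f(y, lam)
    Dk1 = f_y(y, lam) * Dy + f_lam(y, lam)

    y2 = y + 0.5 * h * k1
    k2 = f(y2, lam)
    Dk2 = f_y(y2, lam) * (Dy + 0.5 * h * Dk1) + f_lam(y2, lam)

    y3 = y + 0.5 * h * k2
    k3 = f(y3, lam)
    Dk3 = f_y(y3, lam) * (Dy + 0.5 * h * Dk2) + f_lam(y3, lam)

    y4 = y + h * k3
    k4 = f(y4, lam)
    Dk4 = f_y(y4, lam) * (Dy + h * Dk3) + f_lam(y4, lam)

    y_next = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    Dy_next = Dy + (h / 6.0) * (Dk1 + 2.0 * Dk2 + 2.0 * Dk3 + Dk4)
    return y_next, Dy_next
```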

  • It is only back propagation if some gradient gets propagated backwards. Otherwise it is the forward mode of automatic/algorithmic differentiation. – Lutz Lehmann Apr 26 '23 at 04:35