CSCI 5636

Generating Optimal Path for
Non-holonomic vehicles using the Hamilton-Jacobi Formulation

Slides borrowed from my colleague William Pope

Motivation

Tree-based online (PO)MDP solvers use rollout simulations to initialize new nodes
- Reward is sparse in navigation scenarios, so a good rollout policy is needed to distinguish good future positions from bad ones
- Value estimate must be generated quickly to build search tree online
HJB solution provides on-demand optimal trajectories from any point in state space; used as rollout policy in pedestrian navigation problem

Multi-Query Motion Planning

Can use multi-query planning as policy for rollout simulations
Single run of planner algorithm provides complete trajectories on-demand from any point in state space
Common multi-query methods:
- Fast marching method (FMM)
- Probabilistic roadmaps (PRM)

Figure panels: PRM, FMM

However: these methods don't work well for systems with differential constraints

Curvature-Constrained Dynamics

Simple nonlinear kinematic model used to approximate motion of a car
Assumes instantaneous changes in speed and steering angle

\dot{x} = f(x,u) = \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} u_{v} \cos{\theta} \\ u_{v} \sin{\theta} \\ u_{v}\frac{1}{l}\tan{u_{\phi}} \end{bmatrix}

u = \begin{bmatrix} u_{v} \\ u_{\phi} \end{bmatrix}

x = \begin{bmatrix} x \\ y \\ \theta \end{bmatrix}
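The kinematic model above can be written as a small function. This is a minimal sketch, not the author's implementation: the names `car_dynamics`, `euler_step`, and `wheelbase` (standing in for l) are assumptions, and the forward-Euler step is included only to show how the model would be rolled forward in a simulation.

```python
import math


def car_dynamics(state, control, wheelbase=1.0):
    """Return (x_dot, y_dot, theta_dot) for the kinematic car model.

    state   = (x, y, theta)
    control = (u_v, u_phi): speed and steering angle, both applied instantly
    wheelbase is the length l in the model (value here is an assumption)
    """
    _, _, theta = state
    u_v, u_phi = control
    return (
        u_v * math.cos(theta),
        u_v * math.sin(theta),
        u_v * math.tan(u_phi) / wheelbase,
    )


def euler_step(state, control, dt, wheelbase=1.0):
    """Advance the state by one forward-Euler step of length dt."""
    x, y, theta = state
    x_dot, y_dot, theta_dot = car_dynamics(state, control, wheelbase)
    return (x + x_dot * dt, y + y_dot * dt, theta + theta_dot * dt)
```

For example, `car_dynamics((0.0, 0.0, 0.0), (1.0, 0.0))` describes a car at the origin facing along +x and driving straight at unit speed.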

Reeds-Shepp Curves

For a vehicle with car-like steering, the shortest path between two states is always a sequence of straight segments and minimum-radius arcs, driven at max speed (forward or reverse)

So for first-order dynamics, the continuous-time optimal action will always be one of six:

U_{opt} = \begin{Bmatrix} (\max(u_{v}), 0.0) & (\min(u_{v}), 0.0) \\ (\max(u_{v}), \max(u_{\phi})) & (\min(u_{v}), \max(u_{\phi})) \\ (\max(u_{v}), \min(u_{\phi})) & (\min(u_{v}), \min(u_{\phi})) \end{Bmatrix}
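The six candidate actions can be enumerated directly from the control limits. A minimal sketch, assuming the limits are passed in as plain numbers; the function name `optimal_action_set` and its parameter names are hypothetical.

```python
def optimal_action_set(v_min, v_max, phi_min, phi_max):
    """Return the six (u_v, u_phi) candidates: extreme speeds paired with
    zero, maximum, and minimum steering angle."""
    return [
        (u_v, u_phi)
        for u_phi in (0.0, phi_max, phi_min)  # straight, max steer, min steer
        for u_v in (v_max, v_min)             # max speed, min speed
    ]
```

With `v_min` negative, the second entry of each pair corresponds to driving in reverse.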

Hamilton-Jacobi-Bellman Equation

PDE for finding global optimal control of a system (Bellman, 1950s)
Value function: minimum cost-to-go over given time interval
- C – cost rate function
- D – terminal value

V(x(t_{0}),t_{0}) = \min_{u}\{\int_{t_{0}}^{t_{f}} C(x(\tau),u(\tau)) d\tau + D(x(t_{f}))\}

Rewriting with dynamic programming principle:

V(x(t_{0}),t_{0}) = \min_{u}\{V(x(t_{0}+dt),t_{0}+dt) + \int_{t_{0}}^{t_{0}+dt} C(x(\tau),u(\tau)) d\tau\}

Hamilton-Jacobi-Bellman Equation

Applying Taylor series expansion to right side:

V(x(t),t) = \min_{u}\{V(x(t+dt),t+dt) + \int_{t}^{t+dt} C(...) d\tau\}

V(x(t),t) = \min_{u}\{V(x(t),t) + \frac{\partial V}{\partial t}dt + \frac{\partial V}{\partial x}dx + \int_{t}^{t+dt} C(...) d\tau\}

V(x(t),t) = V(x(t),t) + \frac{\partial V}{\partial t}dt + \min_{u}\{\frac{\partial V}{\partial x}\dot{x}dt + \int_{t}^{t+dt} C(...) d\tau\}

0 = \frac{\partial V}{\partial t} + \min_{u}\{\frac{\partial V}{\partial x}\dot{x} + \frac{1}{dt}\int_{t}^{t+dt} C(...) d\tau\}

Hamilton-Jacobi-Bellman partial differential equation:

0 = \frac{\partial V}{\partial t} + \min_{u}\{\frac{\partial V}{\partial x}f(x,u) + C(x,u) \}

Solving HJB

Apply system information to HJB PDE:

0 = \frac{\partial V}{\partial t} + \min_{u}\{\frac{\partial V}{\partial x}f(x,u) + C(x,u) \}

Value function doesn't change over time:

\frac{\partial V}{\partial t} = 0

Cost = time elapsed, cost rate:

C(x,u) = 1

-1 = \min_{u}\{\frac{\partial V}{\partial x} \cdot f(x,u) \}

-1 = \min_{(u_{v},u_{\phi})}\{V_{x}(u_{v}\cos{\theta}) + V_{y}(u_{v}\sin{\theta}) + V_{\theta}(u_{v}\frac{1}{l}\tan{u_{\phi}}) \}

Solving HJB

Finite difference method (FDM) – numerical method for approximating derivatives
- Forward/backward:

V_{x} = \frac{V_{i+1,j,k} - V_{ijk}}{h_{xy}}

V_{x} = \frac{-(V_{i-1,j,k} - V_{ijk})}{h_{xy}}

Upwind scheme
- Value at given state only depends on value of "upwind" states (closer to target)
- In FDM, need to pull value from upwind states
  - Upwind direction is determined by state/action

\begin{aligned} i_{uw} &= i + sgn(\dot{x}) \\ j_{uw} &= j + sgn(\dot{y}) \\ k_{uw} &= k + sgn(\dot{\theta}) \end{aligned}

V_{x} = \frac{sgn(\dot{x})(V_{i_{uw},j,k} - V_{ijk})}{h_{xy}}
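The upwind difference for V_x can be sketched in a few lines. This assumes the value grid is stored as nested lists indexed `V[i][j][k]`; the names `sign` and `upwind_dx` are hypothetical, and bounds checking is left out for brevity.

```python
def sign(a):
    """Return -1, 0, or +1 according to the sign of a."""
    return (a > 0) - (a < 0)


def upwind_dx(V, i, j, k, x_dot, h_xy):
    """Upwind finite-difference approximation of V_x at node (i, j, k).

    The neighbor is chosen in the direction of motion: i+1 when x_dot > 0
    (forward difference), i-1 when x_dot < 0 (backward difference).
    """
    s = sign(x_dot)
    i_uw = i + s
    return s * (V[i_uw][j][k] - V[i][j][k]) / h_xy
```

On a grid where V grows linearly in i, both signs of `x_dot` return the same slope, which is a quick sanity check. When `x_dot` is exactly zero the function returns zero, which is harmless because V_x is multiplied by `x_dot` in the PDE.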

Solving HJB

Plugging in upwind FDM:

-1 = \frac{sgn(\dot{x})(V_{i_{uw}} - V)}{h_{xy}}(u_{v}\cos{\theta}) + \frac{sgn(\dot{y})(V_{j_{uw}} - V)}{h_{xy}}(u_{v}\sin{\theta}) + \frac{sgn(\dot{\theta})(V_{k_{uw}} - V)}{h_{\theta}}(u_{v}\frac{1}{l}\tan{u_{\phi}})

Solution, writing s_{\dot{x}} = sgn(\dot{x}), s_{\dot{y}} = sgn(\dot{y}), s_{\dot{\theta}} = sgn(\dot{\theta}):

V_{i,j,k} = \frac{\frac{h_{xy}}{u_{v}} + V_{i_{uw},j,k} s_{\dot{x}}\cos{\theta_{k}} + V_{i,j_{uw},k} s_{\dot{y}}\sin{\theta_{k}} + V_{i,j,k_{uw}} s_{\dot{\theta}}(\frac{h_{xy}}{h_{\theta}l})\tan{u_{\phi}}}{s_{\dot{x}}\cos{\theta_{k}} + s_{\dot{y}}\sin{\theta_{k}} + s_{\dot{\theta}}(\frac{h_{xy}}{h_{\theta}l})\tan{u_{\phi}}}
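The closed-form node update can be sketched as a function of one node and one action. Assumptions, none of which come from the slides: the grid is nested lists `V[i][j][k]` holding finite numbers (a large value rather than infinity, so zero-weight terms vanish), `thetas[k]` gives the heading of slice k, the heading index wraps around periodically, and `fdm_node_update` is a hypothetical name. Bounds checks in x and y are omitted.

```python
import math


def sign(a):
    """Return -1, 0, or +1 according to the sign of a."""
    return (a > 0) - (a < 0)


def fdm_node_update(V, i, j, k, action, thetas, h_xy, h_theta, wheelbase):
    """Value at node (i, j, k) implied by its upwind neighbors for one action."""
    u_v, u_phi = action  # u_v must be nonzero
    cos_t, sin_t = math.cos(thetas[k]), math.sin(thetas[k])
    tan_p = math.tan(u_phi)

    # Signs of the state derivatives select the upwind neighbors
    s_x = sign(u_v * cos_t)
    s_y = sign(u_v * sin_t)
    s_t = sign(u_v * tan_p / wheelbase)
    k_uw = (k + s_t) % len(thetas)  # assumption: theta is periodic

    # Weights multiplying each upwind value in the formula
    a_x = s_x * cos_t
    a_y = s_y * sin_t
    a_t = s_t * (h_xy / (h_theta * wheelbase)) * tan_p

    numerator = (h_xy / u_v
                 + V[i + s_x][j][k] * a_x
                 + V[i][j + s_y][k] * a_y
                 + V[i][j][k_uw] * a_t)
    return numerator / (a_x + a_y + a_t)
```

As a sanity check, a node facing along +x with a zero-valued target one cell ahead gets the value h_xy / u_v for the straight-ahead action, which is the travel time across one cell.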

Solving HJB

Implementation:
1. Set initial value at every node in grid (target: 0, else: large)
2. Iterate through all nodes in free space
  1. Calculate V_ijk through FDM with upwind neighbors for each of 6 possible optimal actions, keep lowest value
3. Repeat sweeps until all nodes have converged
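The three implementation steps can be sketched as a small Gauss-Seidel style solver. This is an illustration under stated assumptions, not the author's code: `solve_hjb` and all parameter names are hypothetical, the target is a set of (i, j) cells covering every heading, an action is skipped when its upwind neighbor lies off the grid or in an obstacle, values are only ever lowered, and a sweep cap guards against non-termination.

```python
import math

LARGE = 1e6  # assumed "large" initial value for non-target nodes


def sign(a):
    return (a > 0) - (a < 0)


def solve_hjb(nx, ny, n_theta, h_xy, wheelbase, actions, targets,
              obstacles=frozenset(), tol=1e-6, max_sweeps=200):
    h_theta = 2.0 * math.pi / n_theta
    thetas = [k * h_theta for k in range(n_theta)]

    # Step 1: target nodes start at 0, everything else at a large value
    V = [[[0.0 if (i, j) in targets else LARGE for _ in range(n_theta)]
          for j in range(ny)] for i in range(nx)]

    def candidate(i, j, k, u_v, u_phi):
        """Upwind FDM value at (i, j, k) for a single action."""
        cos_t, sin_t = math.cos(thetas[k]), math.sin(thetas[k])
        tan_p = math.tan(u_phi)
        s_x, s_y, s_t = sign(u_v * cos_t), sign(u_v * sin_t), sign(u_v * tan_p)
        i_uw, j_uw, k_uw = i + s_x, j + s_y, (k + s_t) % n_theta
        if not (0 <= i_uw < nx and 0 <= j_uw < ny):
            return LARGE  # upwind neighbor is off the grid
        if (i_uw, j) in obstacles or (i, j_uw) in obstacles:
            return LARGE  # upwind neighbor is blocked
        terms = ((s_x * cos_t, V[i_uw][j][k]),
                 (s_y * sin_t, V[i][j_uw][k]),
                 (s_t * h_xy / (h_theta * wheelbase) * tan_p, V[i][j][k_uw]))
        num, den = h_xy / u_v, 0.0
        for weight, v_uw in terms:
            num += weight * v_uw
            den += weight
        return num / den

    # Steps 2 and 3: sweep free-space nodes until no value changes by more than tol
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(nx):
            for j in range(ny):
                if (i, j) in targets or (i, j) in obstacles:
                    continue
                for k in range(n_theta):
                    best = min(candidate(i, j, k, u_v, u_phi)
                               for u_v, u_phi in actions)
                    if best < V[i][j][k]:  # keep the lowest value seen
                        max_change = max(max_change, V[i][j][k] - best)
                        V[i][j][k] = best
        if max_change < tol:
            break
    return V, sweep
```

A possible call, with made-up grid and vehicle parameters: `solve_hjb(7, 7, 8, 1.0, 1.0, actions, targets={(3, 3)})`, where `actions` is the list of six (u_v, u_phi) pairs. Nodes that cannot reach the target keep very large values, so a rollout policy built on this table would need to treat those as unreachable.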