Autonomous Thermalling as a Partially Observable Markov Decision Process

Iain Guilliard, Richard Rogahn, Jim Piavis, Andrey Kolobov

Robotics: Science and Systems, 2018

 

Presented by Zachary Sunberg, August 25, 2021

Motivation

Thermals can help aircraft fly with less power.

 

 

 

Video by Tobias Kemmerer

Motivation

Birds use thermalling extensively.

Motivation

Thermals are difficult to detect.

Contributions

  1. Formulate the thermalling problem as a POMDP with key assumptions that simplify the problem without crippling performance.
  2. Propose an algorithm simple enough to run on a Pixhawk microcontroller.
  3. Implement POMDSoar as an open-source ArduPlane module.
  4. Demonstrate convincingly that POMDSoar outperforms ArduSoar.

Background: POMDPs

  • \(\mathcal{S}\) - State space
  • \(T:\mathcal{S}\times \mathcal{A} \times\mathcal{S} \to \mathbb{R}\) - Transition probability distribution
  • \(\mathcal{A}\) - Action space
  • \(R:\mathcal{S}\times \mathcal{A} \to \mathbb{R}\) - Reward
  • \(\mathcal{O}\) - Observation space
  • \(Z:\mathcal{S} \times \mathcal{A}\times \mathcal{S} \times \mathcal{O} \to \mathbb{R}\) - Observation probability distribution

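[Diagram: the Environment holds the true state (example: \(s = 7\)) and emits an observation (example: \(o = -0.21\)); the Belief Updater turns observations into a belief \(b\); the Policy/Planner maps \(b\) to an action \(a\), which is applied to the Environment.]

\[b_t(s) = P\left(s_t = s \mid a_1, o_1, \ldots, a_{t-1}, o_{t-1}\right)\]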
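The belief \(b\) in the loop above can be maintained recursively. A minimal sketch for a finite state space, assuming \(T\) and \(Z\) are supplied as plain Python functions (the names `update_belief`, `transition`, and `obs_likelihood` are illustrative, not from the paper):

```python
def update_belief(belief, action, observation, transition, obs_likelihood):
    """One step of a discrete Bayes filter.

    belief:          dict mapping state -> probability
    transition:      transition(s, a, s2) = P(s2 | s, a), i.e. T
    obs_likelihood:  obs_likelihood(s, a, s2, o) = P(o | s, a, s2), i.e. Z
    """
    states = list(belief)
    unnormalized = {}
    for s2 in states:
        # Sum over previous states, weighting by transition and observation models.
        unnormalized[s2] = sum(
            obs_likelihood(s, action, s2, observation)
            * transition(s, action, s2)
            * belief[s]
            for s in states
        )
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero likelihood under the current belief")
    return {s: p / total for s, p in unnormalized.items()}
```

The continuous thermalling problem cannot enumerate states like this, which is why a Gaussian approximation of the belief is used instead.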

Background: POMDPs

Background: Kalman Filter
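
This slide carries no equations in the export. As a reminder of the idea, here is a minimal sketch of one predict/update step of the textbook linear Kalman filter in the scalar case. It is not the filter used in the paper, and the parameter names (`a`, `q`, `h`, `r`) are assumptions chosen for this illustration:

```python
def kalman_step(mean, var, z, a=1.0, q=0.0, h=1.0, r=1.0):
    """One scalar Kalman filter step.

    Model (assumed): x_next = a * x + process noise with variance q
                     z      = h * x_next + measurement noise with variance r
    Returns the posterior (mean, variance) after observing z.
    """
    # Predict: push the Gaussian belief through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * a * var + q

    # Update: correct the prediction with the measurement.
    innovation = z - h * mean_pred
    innovation_var = h * h * var_pred + r
    gain = var_pred * h / innovation_var
    mean_new = mean_pred + gain * innovation
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new
```

The belief stays Gaussian after every step, so only a mean and a (co)variance need to be stored, which suits a small microcontroller.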

Context in Literature

  • ArduSoar - Included in ArduPlane
  • RL approaches - episodic (is this actually a problem?)
  • Heuristics, e.g. Reichmann rules (5.3 hour record)
  • Other methods (e.g. work by John Bird) focus on extending endurance and range with flight path planning

Problem Formulation

Assumptions:

  • Thermal vertical velocity distribution (\(w\)) is Gaussian
  • Thermal does not move w.r.t. surrounding air
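
A minimal sketch of what a Gaussian updraft profile looks like in code. The exact functional form \(w = W_0 \exp(-d^2 / R_0^2)\), with \(d\) the horizontal distance to the thermal center, is an assumption made for this illustration; the parameter names mirror the thermal-state symbols \(p^{th}, W_0, R_0\) used in the problem formulation:

```python
import math

def thermal_updraft(x, y, x_th, y_th, w0, r0):
    """Vertical air velocity at (x, y) for a Gaussian-shaped thermal.

    (x_th, y_th): thermal center; w0: updraft at the center; r0: radius scale.
    Assumed form: w0 * exp(-d^2 / r0^2), d = horizontal distance to the center.
    """
    d2 = (x - x_th) ** 2 + (y - y_th) ** 2
    return w0 * math.exp(-d2 / r0 ** 2)
```

Under this form the updraft equals `w0` at the center and has dropped to `w0 / e` at a distance of `r0`.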

Problem Formulation

  • \(s = (s^u, s^{th}); \quad s^u = (p^u, v, \psi, \phi, \dot{\phi}, h); s^{th} = (p^{th}, W_0, R_0)\)
  • \(\mathcal{A} = \{-45^\circ, -30^\circ, -15^\circ, 0^\circ, 15^\circ, 30^\circ, 45^\circ\}\) roll angle
  • \(T\): vehicle dynamics with process noise
  • \(R(s, a, s') = h_{s'} - h_s\)
  • \(\mathcal{O}\): sensor readings
  • \(Z\): Gaussian

No pitch because that would be "cheating"
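
The formulation above maps onto a few small data structures. This is a sketch only: the class and field names, the units, and the reading of \(W_0\) and \(R_0\) as thermal strength and radius are assumptions made for illustration.

```python
from dataclasses import dataclass

# Action space from the slide: commanded roll angle in degrees.
ROLL_ACTIONS_DEG = (-45, -30, -15, 0, 15, 30, 45)

@dataclass(frozen=True)
class UavState:          # s^u
    x: float             # p^u, horizontal position (assumed m)
    y: float
    v: float             # airspeed (assumed m/s)
    psi: float           # heading (assumed rad)
    phi: float           # roll angle (assumed rad)
    phi_dot: float       # roll rate (assumed rad/s)
    h: float             # altitude (assumed m)

@dataclass(frozen=True)
class ThermalState:      # s^th
    x: float             # p^th, thermal center
    y: float
    w0: float            # W_0 (assumed: updraft strength)
    r0: float            # R_0 (assumed: thermal radius)

@dataclass(frozen=True)
class State:             # s = (s^u, s^th)
    uav: UavState
    thermal: ThermalState

def reward(s, s_next):
    """Altitude gained over one step: R(s, a, s') = h_{s'} - h_s."""
    return s_next.uav.h - s.uav.h
```

Only the UAV part of the state is directly measured; the thermal part has to be inferred, which is what makes the problem partially observable.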

Approach

\(R(s, a, s') = h_{s'} - h_s\)

\(R(b) = \operatorname{tr}(\operatorname{cov}(b))\)

  • Online
  • Multi-forecast model-predictive control

4 second horizon

12 second horizon
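
A rough sketch of the multi-forecast idea, not the paper's implementation: sample several thermal hypotheses from the belief, roll out each candidate roll angle against every hypothesis, and pick the angle with the best average altitude gain. Everything here is an assumption made for illustration: the function names, the constant-bank coordinated-turn kinematics, the Gaussian updraft form, the constant sink rate, and the independent per-parameter Gaussian belief. Only the altitude-gain objective is covered; the \(\operatorname{tr}(\operatorname{cov}(b))\) objective is omitted.

```python
import math
import random

ROLL_ACTIONS_DEG = (-45, -30, -15, 0, 15, 30, 45)
G = 9.81  # gravitational acceleration, m/s^2

def updraft(x, y, thermal):
    """Assumed Gaussian updraft; thermal = (x_th, y_th, w0, r0)."""
    x_th, y_th, w0, r0 = thermal
    return w0 * math.exp(-((x - x_th) ** 2 + (y - y_th) ** 2) / r0 ** 2)

def altitude_gain(x, y, psi, v, roll_deg, thermal, horizon_s, dt=0.5, sink=1.0):
    """Euler-integrate a constant-bank coordinated turn and sum the climb rate.

    sink is a hypothetical constant sink rate; a real glider sinks faster
    when banked, which this sketch ignores.
    """
    psi_dot = G * math.tan(math.radians(roll_deg)) / v
    gain = 0.0
    for _ in range(int(horizon_s / dt)):
        x += v * math.cos(psi) * dt
        y += v * math.sin(psi) * dt
        psi += psi_dot * dt
        gain += (updraft(x, y, thermal) - sink) * dt
    return gain

def sample_thermals(mean, std, n, rng):
    """Draw n thermal hypotheses from an independent Gaussian belief."""
    samples = []
    for _ in range(n):
        x_th, y_th, w0, r0 = (rng.gauss(m, s) for m, s in zip(mean, std))
        samples.append((x_th, y_th, w0, max(r0, 1.0)))  # keep the radius positive
    return samples

def choose_roll(x, y, psi, v, thermals, horizon_s):
    """Return the roll angle with the highest mean altitude gain over all forecasts."""
    def expected_gain(roll_deg):
        gains = [altitude_gain(x, y, psi, v, roll_deg, th, horizon_s) for th in thermals]
        return sum(gains) / len(gains)
    return max(ROLL_ACTIONS_DEG, key=expected_gain)
```

A call such as `choose_roll(0.0, 0.0, 0.0, 10.0, sample_thermals((0, 50, 3, 30), (5, 5, 0.5, 5), 20, random.Random(0)), 12.0)` evaluates all seven roll angles against 20 forecasts; the 4 s or 12 s horizons from the slide can be passed as `horizon_s`. Evaluating a handful of fixed-action rollouts is far cheaper than a full tree search, which is what keeps the per-action cost low.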

Approach

Computational Resources

 

  • Implemented on a Pixhawk
  • "32-bit ARM processor with only 168MHz clock speed and 256KB RAM"
  • FPU?
  • < 1s per action computation

 

Hardware Experiments

Radian Pro (2m wingspan)

"Nearly constant cloud cover, frequent gusty winds and rain"

Sim-to-real gap

Hardware Experiments

Typical flight time: 2000 s ≈ 33 min (roughly half an hour)

Hardware Experiment Results

Critique

  • Negatives:
    • Doesn't actually solve the hard part of the POMDP (the exploration vs exploitation tradeoff)
    • Unimodal beliefs
    • Does not take information on where thermals are likely to occur into account
  • Positives:
    • Hardware experiments
    • Controlled for many confounding factors (airframe, battery, wave lift)

Impact and Legacy

Future Work / Reading

"companion computers on larger sUAVs may be able to run a full-fledged POMDP solver in real time. Design and evaluation of a controller based on solving the thermalling POMDP near-optimally, e.g., using an approach similar to Slade et al. [36]’s, is a direction for future work."

[36] P. Slade, P. Culbertson, Z. Sunberg, and M. J. Kochenderfer. Simultaneous active parameter estimation and control using sampling-based Bayesian reinforcement learning. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017. URL https://arxiv.org/abs/1707.09055.

Contributions (Recap)

 

  1. Formulate the thermalling problem as a POMDP with key assumptions that simplify the problem without crippling performance.
  2. Propose an algorithm simple enough to run on a Pixhawk microcontroller.
  3. Implement POMDSoar as an open-source ArduPlane module.
  4. Demonstrate convincingly that POMDSoar outperforms ArduSoar.
