| Rating (subjective) | Method | Idea | Notes |
|---|---|---|---|
| ★★★★★ | Certainty Equivalence | Control as if the mean (or median, or mode) of the belief were the true state | Optimal for LQG |
| ★★★★★ | QMDP | Assume full observability after one step (hindsight knowledge of state uncertainty) | Upper bound on the true value |
| ★★★★☆ | Hindsight Optimization | Assume hindsight knowledge of both state and outcome uncertainty | Looser upper bound than QMDP |
| ★★☆☆☆ | Fast Informed Bound (FIB) | Take one observation into account | Tighter upper bound than QMDP |
| ★★★★☆ | \(k\)-Markov | Pretend the last \(k\) observations make up the state and solve the resulting MDP | Great for Atari! |
| ★★★☆☆ | Open Loop | Choose a sequence of actions that optimizes the objective in expectation | Good when aleatoric uncertainty is low and epistemic uncertainty is hard to reduce |
| ★★★☆☆ | Most Likely Observation | Plan assuming \(b' = \tau(b, a, \hat{o}(b))\) | No observation branching; good when the observation model \(Z\) is unimodal |

Minimal sketches of each of these approximations follow below.
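A certainty-equivalence sketch on a small discrete POMDP (the toy `T` and `R` arrays, their sizes, and the `certainty_equivalent_action` helper are illustrative assumptions, not from the table): solve the fully observed MDP once, then act as if a point estimate of the belief, here the mode, were the true state. For LQG, the analogous controller acting on the mean of the Gaussian belief is exactly optimal.

```python
import numpy as np

# Hypothetical toy POMDP: transitions T[a, s, s'] and rewards R[s, a].
rng = np.random.default_rng(0)
S, A, gamma = 4, 3, 0.95
T = rng.dirichlet(np.ones(S), size=(A, S))
R = rng.normal(size=(S, A))

# Solve the underlying fully observed MDP by value iteration.
V = np.zeros(S)
for _ in range(500):
    Q = R + gamma * np.einsum("ast,t->sa", T, V)  # Q[s, a]
    V = Q.max(axis=1)

def certainty_equivalent_action(b):
    """Act as if the mode of the belief b were the true state."""
    s_hat = int(np.argmax(b))        # point estimate of the state
    return int(np.argmax(Q[s_hat]))  # optimal fully observed action there

b = np.array([0.1, 0.6, 0.2, 0.1])
print(certainty_equivalent_action(b))
```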
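A QMDP sketch under the same toy-array assumptions: the alpha vectors are simply the fully observed Q-values, so the agent behaves as if all state uncertainty vanishes after one step. The score `b @ alpha` doubles as an upper bound on the true optimal value at `b`.

```python
import numpy as np

# Hypothetical toy POMDP: transitions T[a, s, s'] and rewards R[s, a].
rng = np.random.default_rng(0)
S, A, gamma = 4, 3, 0.95
T = rng.dirichlet(np.ones(S), size=(A, S))
R = rng.normal(size=(S, A))

# QMDP alpha vectors are the fully observed Q-values:
#   alpha[s, a] = R(s, a) + gamma * sum_s' T(s'|s, a) max_a' alpha[s', a']
alpha = np.zeros((S, A))
for _ in range(500):
    alpha = R + gamma * np.einsum("ast,t->sa", T, alpha.max(axis=1))

def qmdp_action(b):
    """Pretend all state uncertainty vanishes after one step."""
    return int(np.argmax(b @ alpha))

b = np.array([0.1, 0.6, 0.2, 0.1])
print(qmdp_action(b))        # (b @ alpha).max() upper-bounds the true value
```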
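A hindsight-optimization sketch, again with hypothetical toy arrays and helper names: each sampled scenario fixes the initial state and every future transition outcome up front, the now-deterministic problem is solved exactly by backward induction, and the results are averaged. Because this clairvoyant knows strictly more than QMDP's one-step assumption grants, the resulting bound is looser.

```python
import numpy as np

# Hypothetical toy POMDP: transitions T[a, s, s'] and rewards R[s, a].
rng = np.random.default_rng(0)
S, A, gamma, H, N = 4, 3, 0.95, 10, 200
T = rng.dirichlet(np.ones(S), size=(A, S))
R = rng.normal(size=(S, A))
cumT = T.cumsum(axis=2)      # inverse-CDF tables for outcome sampling
cumT[:, :, -1] = 1.0         # guard against floating-point rounding

def hindsight_action(b):
    """Sample scenarios resolving state and outcome uncertainty up front,
    solve each now-deterministic problem exactly, and average."""
    q_first = np.zeros(A)
    for _ in range(N):
        s0 = rng.choice(S, p=b)          # hindsight knowledge of the state
        u = rng.random(H)                # hindsight knowledge of outcomes
        # Deterministic successor nxt[t, a, s] under this noise trace.
        nxt = np.array([[[np.searchsorted(cumT[a, s], u[t])
                          for s in range(S)]
                         for a in range(A)]
                        for t in range(H)])
        V = np.zeros(S)
        for t in reversed(range(H)):     # backward induction
            Q = R + gamma * V[nxt[t]].T  # Q[s, a] given known outcomes
            V = Q.max(axis=1)
        q_first += Q[s0]                 # clairvoyant Q at the root
    return int(np.argmax(q_first / N))

b = np.array([0.1, 0.6, 0.2, 0.1])
print(hindsight_action(b))
```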
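A fast informed bound sketch (toy arrays again hypothetical): the backup resembles QMDP's but folds one observation into each step, which is exactly what tightens the bound.

```python
import numpy as np

# Hypothetical toy POMDP: T[a, s, s'], R[s, a], observation model Z[a, s', o].
rng = np.random.default_rng(0)
S, A, O, gamma = 4, 3, 2, 0.95
T = rng.dirichlet(np.ones(S), size=(A, S))
R = rng.normal(size=(S, A))
Z = rng.dirichlet(np.ones(O), size=(A, S))

# FIB backup, with one observation folded in:
#   alpha[s, a] = R(s, a) +
#       gamma * sum_o max_a' sum_s' Z(o|a, s') T(s'|s, a) alpha[s', a']
alpha = np.zeros((S, A))
for _ in range(500):
    inner = np.einsum("ato,ast,tb->asob", Z, T, alpha)  # sum over s' (= t)
    alpha = R + gamma * inner.max(axis=3).sum(axis=2).T

def fib_action(b):
    """Same action rule as QMDP, but with tighter alpha vectors."""
    return int(np.argmax(b @ alpha))

b = np.array([0.1, 0.6, 0.2, 0.1])
print(fib_action(b))
```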
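A \(k\)-Markov sketch: a wrapper that stacks the last \(k\) observations into a pseudo-state that any MDP solver can consume. The `reset`/`step` interface and the `NoisyCounterEnv` demo are illustrative assumptions; the Atari connection is real, since DQN stacks the last \(k = 4\) frames and treats the stack as the state.

```python
from collections import deque

class KMarkovWrapper:
    """Treat the last k observations as the state; assumes a hypothetical
    env interface reset() -> obs and step(a) -> (obs, reward, done)."""
    def __init__(self, env, k):
        self.env, self.k = env, k
        self.frames = deque(maxlen=k)

    def reset(self):
        obs = self.env.reset()
        self.frames.clear()
        for _ in range(self.k):        # pad by repeating the first obs
            self.frames.append(obs)
        return tuple(self.frames)      # stacked pseudo-state

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.frames.append(obs)
        return tuple(self.frames), reward, done

class NoisyCounterEnv:
    """Tiny illustrative partially observed environment."""
    def reset(self):
        self.t = 0
        return self.t % 2              # the observation hides the counter

    def step(self, action):
        self.t += 1
        return self.t % 2, float(action == self.t % 2), self.t >= 10

env = KMarkovWrapper(NoisyCounterEnv(), k=4)
state = env.reset()                    # any MDP solver can run on `state`
state, reward, done = env.step(1)
print(state, reward, done)
```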
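An open-loop sketch (toy arrays and helper names hypothetical): score each fixed action sequence by its Monte Carlo expected return and commit to the best one, ignoring all future observations. This works exactly when outcomes are nearly deterministic (low aleatoric uncertainty) and observations would not reduce the remaining (epistemic) uncertainty much anyway.

```python
import numpy as np
from itertools import product

# Hypothetical toy POMDP: transitions T[a, s, s'] and rewards R[s, a].
rng = np.random.default_rng(0)
S, A, gamma, H, N = 4, 2, 0.95, 4, 500
T = rng.dirichlet(np.ones(S), size=(A, S))
R = rng.normal(size=(S, A))

def expected_return(b, plan):
    """Monte Carlo estimate of the return of one fixed action sequence."""
    total = 0.0
    for _ in range(N):
        s, g, disc = rng.choice(S, p=b), 0.0, 1.0
        for a in plan:
            g += disc * R[s, a]
            s = rng.choice(S, p=T[a, s])
            disc *= gamma
        total += g
    return total / N

def open_loop_plan(b):
    """Commit to the best fixed sequence, ignoring future observations."""
    return max(product(range(A), repeat=H),
               key=lambda plan: expected_return(b, plan))

b = np.array([0.1, 0.6, 0.2, 0.1])
print(open_loop_plan(b))
```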
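A most-likely-observation sketch (toy arrays hypothetical): instead of branching on every possible observation, update the belief with the single most likely one, \(b' = \tau(b, a, \hat{o}(b))\). This collapses the planning tree to a single path per action sequence, but it can mislead when the observation distribution \(Z\) is multimodal.

```python
import numpy as np

# Hypothetical toy POMDP: T[a, s, s'] and observation model Z[a, s', o].
rng = np.random.default_rng(0)
S, A, O = 4, 3, 2
T = rng.dirichlet(np.ones(S), size=(A, S))
Z = rng.dirichlet(np.ones(O), size=(A, S))

def tau(b, a, o):
    """Exact Bayesian belief update b' = tau(b, a, o)."""
    bp = Z[a, :, o] * (b @ T[a])         # P(o | a, s') * P(s' | b, a)
    return bp / bp.sum()

def most_likely_obs_update(b, a):
    """Branch only on the single most likely observation o_hat(b)."""
    pred = b @ T[a]                      # predicted next-state distribution
    o_hat = int(np.argmax(pred @ Z[a]))  # o_hat(b): most likely observation
    return tau(b, a, o_hat)

b = np.array([0.1, 0.6, 0.2, 0.1])
print(most_likely_obs_update(b, a=0))
```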