Professor Zachary Sunberg
March 18th, 2022
Aleatory
Epistemic (Static)
Epistemic (Dynamic)
Interaction
Markov Decision Process
Reinforcement Learning
POMDP
Game
All drivers normal
Outcome only
Omniscient
Mean MPC
QMDP
POMCPOW
Simulation results
Human Behavior Model: IDM and MOBIL
M. Treiber, et al., “Congested traffic states in empirical observations and microscopic simulations,” Physical Review E, vol. 62, no. 2 (2000).
A. Kesting, et al., “General lane-changing model MOBIL for car-following models,” Transportation Research Record, vol. 1999 (2007).
A. Kesting, et al., “Agents for Traffic Simulation,” in Multi-Agent Systems: Simulation and Applications, CRC Press (2009).
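As a rough illustration of the car-following half of this behavior model, the IDM acceleration law from Treiber et al. can be sketched as below. The function name and the default parameter values are illustrative assumptions, not the settings used in the simulations on these slides, and the optional \(s_1\) term of the original model is taken to be zero.

```python
import math


def idm_acceleration(v, gap, dv, v0=33.3, T=1.5, a_max=1.4, b=2.0, s0=2.0, delta=4.0):
    """Intelligent Driver Model (IDM) longitudinal acceleration.

    v     : speed of the following vehicle [m/s]
    gap   : bumper-to-bumper distance to the leader [m], must be > 0
    dv    : approach rate, v - v_leader [m/s]
    v0    : desired speed [m/s]              (illustrative default)
    T     : desired time headway [s]         (illustrative default)
    a_max : maximum acceleration [m/s^2]     (illustrative default)
    b     : comfortable deceleration [m/s^2] (illustrative default)
    s0    : minimum standstill gap [m]       (illustrative default)
    delta : acceleration exponent            (illustrative default)
    """
    # Desired dynamic gap: standstill gap + headway term + braking term.
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

Varying parameters such as `v0`, `T`, and `b` from driver to driver is one way to represent a range of driving styles in simulation; MOBIL then builds lane-change decisions on top of accelerations like this one.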
For any \(\epsilon>0\) and \(\delta>0\), if \(C\) (number of particles) is high enough,
\[|Q_{\mathbf{P}}^*(b,a) - Q_{\mathbf{M}_{\mathbf{P}}}^*(\bar{b},a)| \leq \epsilon \quad \text{w.p. } 1-\delta\]
\(\mathbf{M}_\mathbf{P}\) = Particle belief MDP approximation of POMDP \(\mathbf{P}\)
[Lim, Becker, Kochenderfer, Tomlin, & Sunberg, 2023 (?)]
No dependence on \(|\mathcal{S}|\) or \(|\mathcal{O}|\)!
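To make the particle belief \(\bar{b}\) concrete, here is a minimal sketch of one likelihood-weighted belief step of the kind a particle belief MDP is built on: each of the \(C\) particles is pushed through a generative transition model and reweighted by the observation likelihood. The function names, the 1D toy dynamics, and the noise levels are all assumptions chosen for illustration; this is not the construction or code from the cited paper.

```python
import math
import random


def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution N(mean, sigma^2) at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def toy_transition(s, a, rng):
    """Hypothetical 1D dynamics: move by the action plus small Gaussian noise."""
    return s + a + rng.gauss(0.0, 0.1)


def toy_obs_likelihood(o, sp, a):
    """Hypothetical sensor model: noisy observation of the position."""
    return gaussian_pdf(o, sp, 0.5)


def particle_belief_step(particles, weights, action, observation,
                         transition, obs_likelihood, rng):
    """Propagate every particle and reweight it by the observation likelihood.

    Only generative access to the transition model is needed, and the cost
    is linear in the number of particles C, not in the size of the state
    or observation space.
    """
    new_particles = [transition(s, action, rng) for s in particles]
    new_weights = [w * obs_likelihood(observation, sp, action)
                   for w, sp in zip(weights, new_particles)]
    total = sum(new_weights)
    if total == 0.0:
        raise ValueError("all particle weights are zero; observation is inconsistent")
    new_weights = [w / total for w in new_weights]
    return new_particles, new_weights


if __name__ == "__main__":
    rng = random.Random(0)
    C = 1000  # number of particles
    particles = [rng.uniform(-5.0, 5.0) for _ in range(C)]
    weights = [1.0 / C] * C
    particles, weights = particle_belief_step(
        particles, weights, 0.0, 2.0, toy_transition, toy_obs_likelihood, rng)
    print(sum(w * s for w, s in zip(weights, particles)))  # weighted mean position
```

Nothing in the update enumerates states or observations, which is the practical side of the bound having no dependence on \(|\mathcal{S}|\) or \(|\mathcal{O}|\).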
Conventional 1D POMDP
2D POMDP
Intention-Aware Navigation in Crowds with Extended-Space POMDP Planning. Gupta, H.; Hayes, B.; and Sunberg, Z. AAMAS, 2022.
\(\mathcal{A} = \mathbb{R}^{N\times N}\)
\(1, 2, \ldots, N\)
Tyler Becker and Zachary Sunberg. “Imperfect Information Games and Counterfactual Regret Minimization in Space Domain Awareness”. Abstract under review for the Advanced Maui Optical and Space Surveillance Technologies conference.
POMDPs.jl - An interface for defining and solving MDPs and POMDPs in Julia