ADCL Research Overview

Professor Zachary Sunberg

March 18th, 2022

Research Focus: Decision Making under Uncertainty

Aleatory

Epistemic (Static)

Epistemic (Dynamic)

Interaction

Markov Decision Process

Reinforcement Learning

POMDP

Game

Online Planning in Large POMDPs

https://arxiv.org/abs/1709.06196

https://arxiv.org/abs/2005.14549

https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7963408

Simulation results (plot labels): All drivers normal, Outcome only, Omniscient, Mean MPC, QMDP, POMCPOW

https://arxiv.org/abs/2005.14549

Human Behavior Model: IDM and MOBIL

\[\ddot{x}_\text{IDM} = a \left[ 1 - \left( \frac{\dot{x}}{\dot{x}_0} \right)^{\delta} - \left(\frac{g^*(\dot{x}, \Delta \dot{x})}{g}\right)^2 \right]\]

\[g^*(\dot{x}, \Delta \dot{x}) = g_0 + T \dot{x} + \frac{\dot{x}\Delta \dot{x}}{2 \sqrt{a b}}\]

M. Treiber, et al., “Congested traffic states in empirical observations and microscopic simulations,” Physical Review E, vol. 62, no. 2 (2000).

A. Kesting, et al., “General lane-changing model MOBIL for car-following models,” Transportation Research Record, vol. 1999 (2007).

A. Kesting, et al., “Agents for Traffic Simulation,” Multi-Agent Systems: Simulation and Applications, CRC Press (2009).
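The IDM acceleration above can be evaluated directly. The following is a minimal Python sketch, assuming that \(\Delta \dot{x}\) is the closing speed (follower speed minus leader speed) and that \(g\) is the current gap to the leader. The function name and the default parameter values are illustrative placeholders, not values taken from these slides or from the cited papers.

```python
import math


def idm_acceleration(v, dv, gap, a=1.0, b=1.5, v0=30.0, T=1.5, g0=2.0, delta=4.0):
    """Longitudinal acceleration from the IDM equations above.

    v     : follower speed (x-dot), m/s
    dv    : closing speed (Delta x-dot), assumed follower minus leader, m/s
    gap   : current gap g to the leader, m
    a, b  : maximum acceleration and comfortable deceleration, m/s^2
    v0    : desired speed (x-dot_0), m/s
    T     : desired time headway, s
    g0    : minimum gap, m
    delta : acceleration exponent
    All default values are illustrative assumptions.
    """
    if gap <= 0.0:
        raise ValueError("gap must be positive")
    # Desired gap g*(x-dot, Delta x-dot)
    desired_gap = g0 + T * v + v * dv / (2.0 * math.sqrt(a * b))
    # Free-road term minus interaction term
    return a * (1.0 - (v / v0) ** delta - (desired_gap / gap) ** 2)
```

With these placeholder parameters, a stopped vehicle on an empty road accelerates at close to `a`, and a vehicle closing quickly on a nearby leader gets a negative (braking) acceleration.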

Efficient POMDP Approximations

For any \(\epsilon>0\) and \(\delta>0\), if \(C\) (number of particles) is high enough,

\[|Q_{\mathbf{P}}^*(b,a) - Q_{\mathbf{M}_{\mathbf{P}}}^*(\bar{b},a)| \leq \epsilon \quad \text{w.p. } 1-\delta\]

\(\mathbf{M}_\mathbf{P}\) = Particle belief MDP approximation of POMDP \(\mathbf{P}\)

[Lim, Becker, Kochenderfer, Tomlin, & Sunberg, 2023 (?)]

No dependence on \(|\mathcal{S}|\) or \(|\mathcal{O}|\)!
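To make the particle belief MDP idea concrete, here is a minimal Python sketch of one possible generative step in which the "state" is a set of \(C\) weighted particles. The toy 1D dynamics, observation model, reward, noise levels, and all function names are assumptions made for illustration. This is not the implementation from the cited work.

```python
import math
import random


def transition(s, a, rng):
    # Hypothetical toy dynamics: move by the action plus Gaussian noise.
    return s + a + rng.gauss(0.0, 0.1)


def obs_likelihood(o, s_next, sigma=0.5):
    # Unnormalized Gaussian likelihood of observing o from state s_next.
    return math.exp(-0.5 * ((o - s_next) / sigma) ** 2)


def reward(s, a):
    # Hypothetical reward: stay close to the origin.
    return -abs(s)


def particle_belief_step(particles, weights, a, rng):
    """One generative transition on a weighted particle belief."""
    # Sample a state from the belief and use it to generate an observation.
    s = rng.choices(particles, weights=weights, k=1)[0]
    o = transition(s, a, rng) + rng.gauss(0.0, 0.5)
    # Propagate every particle, then reweight by the observation likelihood.
    new_particles = [transition(p, a, rng) for p in particles]
    new_weights = [w * obs_likelihood(o, sp) for w, sp in zip(weights, new_particles)]
    total = sum(new_weights)
    if total == 0.0:
        # Fall back to uniform weights if every likelihood underflows.
        new_weights = [1.0 / len(new_particles)] * len(new_particles)
    else:
        new_weights = [w / total for w in new_weights]
    # Belief reward: weighted average of the state reward.
    r = sum(w * reward(p, a) for w, p in zip(weights, particles))
    return new_particles, new_weights, r
```

A planner for ordinary MDPs could then, in principle, be run on top of `particle_belief_step`. Each step costs time proportional to the number of particles and never enumerates the state or observation space.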

Conventional 1D POMDP

2D POMDP

Online Planning in Large POMDPs

Intention-Aware Navigation in Crowds with Extended-Space POMDP Planning. Gupta, H.; Hayes, B.; and Sunberg, Z. AAMAS, 2022.

https://arxiv.org/abs/2112.09456

POMDP-based Weather Info-Gathering

Space Domain Awareness Games

\(\mathcal{A} = \mathbb{R}^{N\times N}\)

...

\(N\)

Tyler Becker and Zachary Sunberg. “Imperfect Information Games and Counterfactual Regret Minimization in Space Domain Awareness”. Abstract under review for the Advanced Maui Optical and Space Surveillance Technologies conference.

Resolving Equilibrium Uncertainty

https://arxiv.org/abs/2002.04354

POMDPs.jl - An interface for defining and solving MDPs and POMDPs in Julia

Open Source Software

https://github.com/JuliaPOMDP/POMDPs.jl

ADCL Students

Thank You!

zachary.sunberg.net