Himanshu Gupta
Master's Student, Computer Science, University of Colorado Boulder
Pedestrian Modeling
Reactive Controllers
Issues
Predict-and-act controllers
Issues
OUTCOME
MODEL
STATE
Markov Model
Markov Decision Process (MDP)
Solving MDPs - The Value Function
$$V^*(s) = \underset{a\in\mathcal{A}}{\max} \left\{R(s, a) + \gamma E\Big[V^*\left(s_{t+1}\right) \mid s_t=s, a_t=a\Big]\right\}$$
Involves all future time
Involves only \(t\) and \(t+1\)
$$\underset{\pi:\, \mathcal{S}\to\mathcal{A}}{\mathop{\text{maximize}}} \, V^\pi(s) = E\left[\sum_{t=0}^{\infty} \gamma^t R(s_t, \pi(s_t)) \bigm| s_0 = s \right]$$
$$Q(s,a) = R(s, a) + \gamma E\Big[V^* (s_{t+1}) \mid s_t = s, a_t=a\Big]$$
Value = expected sum of future rewards
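The Bellman recursion above can be sketched as a short value-iteration loop. A hedged toy example: the 3-state, 2-action MDP below (rewards `R`, transitions `P`, discount `gamma`) is made up for illustration, not a model from these slides.

```python
# Toy value iteration on a made-up 3-state, 2-action MDP.
import numpy as np

np.random.seed(0)
n_states, n_actions, gamma = 3, 2, 0.9
R = np.array([[0.0, 1.0], [2.0, 0.0], [0.0, 0.5]])   # R[s, a], illustrative
P = np.random.rand(n_states, n_actions, n_states)    # P[s, a, s'], illustrative
P /= P.sum(axis=2, keepdims=True)                    # normalize transitions

V = np.zeros(n_states)
for _ in range(500):
    # Q(s, a) = R(s, a) + gamma * E[V(s') | s, a]
    Q = R + gamma * (P @ V)                          # shape (S, A)
    V_new = Q.max(axis=1)                            # V(s) = max_a Q(s, a)
    if np.max(np.abs(V_new - V)) < 1e-8:             # converged
        break
    V = V_new

policy = Q.argmax(axis=1)                            # greedy policy pi(s)
```

Note that each sweep only couples \(t\) and \(t+1\), yet the fixed point accounts for all future time.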
Online Decision Process Tree Approaches
Time
Estimate \(Q(s, a)\) based on children
$$Q(s,a) = R(s, a) + \gamma E\Big[V^* (s_{t+1}) \mid s_t = s, a_t=a\Big]$$
\[V(s) = \max_a Q(s,a)\]
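A minimal sketch of the tree backup: an action node's \(Q\) value is estimated from the values of its child state nodes, and a state node's \(V\) is the max over its action children. The probabilities, rewards, and child values below are made-up numbers, not from the slides.

```python
gamma = 0.95  # illustrative discount

def q_from_children(reward, children):
    """Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * V(s').

    children: list of (probability, child_value) pairs for s_{t+1}.
    """
    return reward + gamma * sum(p * v for p, v in children)

def v_from_actions(q_values):
    return max(q_values)  # V(s) = max_a Q(s, a)

# Toy backup: two action children of one state node.
q_left = q_from_children(1.0, [(0.7, 2.0), (0.3, 0.0)])  # 1 + 0.95 * 1.4
q_right = q_from_children(0.0, [(1.0, 3.0)])             # 0 + 0.95 * 3.0
v = v_from_actions([q_left, q_right])
```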
Partially Observable Markov Decision Process (POMDP)
State
Timestep
Accurate Observations
Goal: \(a=0\) at \(s=0\)
Optimal Policy
Localize
\(a=0\)
Environment
Belief Updater
Policy
\(b\)
\(a\)
\[b_t(s) = P\left(s_t = s \mid a_1, o_1, \ldots, a_{t-1}, o_{t-1}\right)\]
True State
\(s = 7\)
Observation \(o = -0.21\)
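The belief update \(b_t\) can be sketched as a discrete Bayes filter. This is only a sketch under stated assumptions: the 1-D grid of states, the Gaussian observation model, and the shift dynamics below are illustrative, not the models behind the slide's \(s = 7\), \(o = -0.21\) example.

```python
# Discrete Bayes-filter belief update over a 1-D grid of states.
import numpy as np

states = np.arange(-10, 11)                        # assumed 1-D positions
belief = np.full(states.size, 1 / states.size)     # uniform prior b_0

def update(belief, action, observation, sigma=2.0):
    # Prediction step: assumed deterministic shift by the action
    # (np.roll wraps at the edges -- a simplification).
    predicted = np.roll(belief, action)
    # Correction step: weight by P(o | s), assumed Gaussian around each state.
    likelihood = np.exp(-0.5 * ((observation - states) / sigma) ** 2)
    posterior = predicted * likelihood
    return posterior / posterior.sum()             # renormalize

belief = update(belief, action=0, observation=-0.21)
```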
[1] Christos H. Papadimitriou and John N. Tsitsiklis. 1987. The Complexity of Markov Decision Processes. Mathematics of Operations Research 12, 3 (1987), 441–450.
Action Nodes
Belief Nodes
[Ross, 2008] [Silver, 2010]
*(Partially Observable Monte Carlo Planning)
Roll-out Policy is important
\(1D\)-\(A^*\)
$$ C(\rho) = \sum_{i=0}^{n} \lambda^i \left( C_{st}(x_i, y_i) + C_{ped}(x_i, y_i) \right) $$
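The discounted path cost \(C(\rho)\) transcribes directly into code. The cost functions and \(\lambda\) value below are placeholders, not the ones used in the experiments.

```python
# Discounted path cost: sum of lambda^i * (C_st(x_i, y_i) + C_ped(x_i, y_i)).
def path_cost(path, c_static, c_ped, lam=0.95):
    return sum(
        lam**i * (c_static(x, y) + c_ped(x, y))
        for i, (x, y) in enumerate(path)
    )

# Toy usage: constant static cost, pedestrian cost decaying with x.
cost = path_cost(
    [(0, 0), (1, 0), (2, 0)],
    c_static=lambda x, y: 1.0,
    c_ped=lambda x, y: 1.0 / (1 + x),
)
```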
$$s(t) = (x_c(t), y_c(t), \theta_c(t), v_c(t), [\,(x_1(t), y_1(t), v_1(t), g_1), \\ (x_2(t), y_2(t), v_2(t), g_2), \ldots, (x_{n_{ped}}(t), y_{n_{ped}}(t), v_{n_{ped}}(t), g_{n_{ped}})\,])$$
$$\delta_s(t) \in \{\textbf{Increase Speed, Decrease Speed, Maintain Speed, Sudden Brake}\}$$
$$o(t) = (x_c(t), y_c(t), [\,(x_1'(t), y_1'(t)), (x_2'(t), y_2'(t)), \ldots, (x_{n_{ped}}'(t), y_{n_{ped}}'(t))\,])$$
$$s(t) = (x_c(t), y_c(t), \theta_c(t), v_c(t), [\,(x_1(t), y_1(t), v_1(t), g_1), \\ (x_2(t), y_2(t), v_2(t), g_2), \ldots, (x_{n_{ped}}(t), y_{n_{ped}}(t), v_{n_{ped}}(t), g_{n_{ped}})\,])$$
$$a(t) = \left( \delta_\theta(t), \delta_s(t) \right)$$
$$o(t) = (x_c(t), y_c(t), [\,(x_1'(t), y_1'(t)), (x_2'(t), y_2'(t)), \ldots, (x_{n_{ped}}'(t), y_{n_{ped}}'(t))\,])$$
An effective rollout policy
Executes a path from the vehicle's current position to its goal location
Relies on Multi-query Path Planning Techniques
Probabilistic RoadMaps
Fast Marching Method
The Probabilistic Roadmap (\(PRM\)) is a method for high-dimensional path planning for robots in static environments.
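A minimal \(PRM\) sketch: sample collision-free configurations, connect nearby pairs whose straight segment is collision-free, then answer queries with any graph search on the roadmap. The 2-D unit square, the single circular obstacle, the sample count, and the connection radius below are all illustrative assumptions.

```python
import math
import random
from collections import deque

random.seed(0)
obstacles = [((0.5, 0.5), 0.2)]              # assumed (center, radius) pairs

def collision_free(p):
    return all(math.dist(p, c) > r for c, r in obstacles)

# 1. Sample collision-free configurations (start and goal included).
nodes = [(0.05, 0.05), (0.95, 0.95)]
while len(nodes) < 60:
    p = (random.random(), random.random())
    if collision_free(p):
        nodes.append(p)

# 2. Connect nearby pairs whose straight segment stays collision-free.
def edge_free(a, b, steps=20):
    return all(
        collision_free((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
        for t in (s / steps for s in range(steps + 1))
    )

edges = {i: [] for i in range(len(nodes))}
for i in range(len(nodes)):
    for j in range(i + 1, len(nodes)):
        if math.dist(nodes[i], nodes[j]) < 0.3 and edge_free(nodes[i], nodes[j]):
            edges[i].append(j)
            edges[j].append(i)

# 3. Any graph search answers queries on the roadmap (BFS reachability here).
seen, queue = {0}, deque([0])
while queue:
    u = queue.popleft()
    for v in edges[u]:
        if v not in seen:
            seen.add(v)
            queue.append(v)
reachable = 1 in seen                        # is the goal node connected?
```

Because the roadmap is built once and reused for many start/goal queries, \(PRM\) is a natural fit for the multi-query setting above.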
The Fast Marching Method (\(FMM\)) is an algorithm for tracking and modeling the motion of a physical wave interface.
\(FMM\) calculates the time the wave takes to reach every point in the environment.
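The arrival-time idea can be sketched with a Dijkstra-style wavefront on a 4-connected grid. This is a coarse illustration, not a faithful \(FMM\): the real method solves the Eikonal equation with an upwind finite-difference update, and the grid size and speed map below are assumptions.

```python
# Dijkstra-style wavefront: first-arrival time of a wave at every cell.
import heapq

rows, cols = 5, 5
speed = [[1.0] * cols for _ in range(rows)]        # assumed uniform wave speed

def arrival_times(source):
    T = [[float("inf")] * cols for _ in range(rows)]
    T[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if t > T[r][c]:
            continue                               # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nt = t + 1.0 / speed[nr][nc]       # time = distance / speed
                if nt < T[nr][nc]:
                    T[nr][nc] = nt
                    heapq.heappush(heap, (nt, (nr, nc)))
    return T

T = arrival_times((0, 0))                          # wave released at the corner
```

The resulting arrival-time field can serve as a cost-to-go for steering the rollout toward the goal.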
\(2D-FMM\)
Scenario 1
(Open Field)
Scenario 2 (Cafeteria Setting)
Scenario 3
(L shaped lobby)
Limited Space Planner: \(1D\)-\(A^*\)
Extended Space Planners: \(2D\)-\(PRM\), \(2D\)-\(FMM\)
# humans = 100
# humans = 200
# humans = 300
# humans = 400
Scenario 1
Evaluation Metrics:
Evaluation Metric: Travel Time (in s)
Evaluation Metric: #Outperformed
Scenario 1: \(1D\)-\(A^*\), \(2D\)-\(FMM\), \(2D\)-\(PRM\)
Scenario 2: \(1D\)-\(A^*\), \(2D\)-\(FMM\), \(2D\)-\(PRM\)
Scenario 3: \(1D\)-\(A^*\), \(2D\)-\(FMM\), \(2D\)-\(PRM\)
Evaluation Metric: #SB action
Limited Space Planner: \(1D\)-\(A^*\)
Extended Space Planner: \(2D\)-\(NHV\)
Evaluation Metric: Travel Time (in s)
Evaluation Metric: #SB action
Evaluation Metric: #Outperformed
\(1D\)-\(A^*\) vs \(2D\)-\(NHV\)