Stochastic Processes and Simple Decisions
Review
Markov Blanket
- The Markov blanket of \(X\) is the minimal set of nodes that, if their values were known, would make \(X\) conditionally independent of all other nodes.
- Parents, children, and the other parents of its children.
Guiding Question
- What does "Markov" mean in "Markov Decision Process"?
- How do we find an optimal action based on maximizing expected utility?
Stochastic Process
- A stochastic process is a collection of R.V.s indexed by time.
- \(\{X_0, X_1, X_2, \ldots\}\) or \(\{X_t\}_{t=0}^\infty\) or just \(\{X_t\}\)
Example: Positive, Uniform Random Walk
\(X_0 = 0\)
\(X_{t+1} = X_t + V_t\)
\(V_t \sim \text{Bernoulli}(0.5)\) (i.i.d.)
In a stationary stochastic process (all in this class), this relationship does not change with time
\(P(X_{t+1} \mid X_{0:t}) = \begin{cases} 0.5 & \text{if } X_{t+1} = X_t\\ 0.5 & \text{if } X_{t+1} = X_t + 1 \\ 0 & \text{otherwise}\end{cases}\)
Bayes Net
Trajectories
Stochastic Process
- A stochastic process is a collection of R.V.s indexed by time.
- \(\{x_0, x_1, x_2, \ldots\}\) or \(\{x_t\}_{t=0}^\infty\) or just \(\{x_t\}\)
Example: Positive, Uniform Random Walk
\(x_0 = 0\)
\(x_{t+1} = x_t + v_t\)
\(v_t \sim \mathcal{U}(\{0,1\})\) (i.i.d.)
Shorthand: \(x' = x + v\)
In a stationary stochastic process (all in this class), this relationship does not change with time
\(P(x' \mid x) = \text{SparseCat}([x, x + 1],\ [0.5, 0.5])\)
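A minimal sketch of this transition model in Julia, assuming the POMDPTools.jl package that provides `SparseCat`:

```julia
using POMDPTools: SparseCat   # assumed package; provides the sparse categorical distribution

# Transition distribution of the random walk: x' is x or x + 1, each with probability 0.5
T(x) = SparseCat([x, x + 1], [0.5, 0.5])

d = T(0)        # distribution over x' given x = 0
x′ = rand(d)    # sample a successor state: 0 or 1, each with probability 0.5
```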
Bayes Net
Dynamic Bayes Net (DBN)
Stochastic Process
Joint
| x0 | x1 | x2 | P(x0, x1, x2) |
|---|---|---|---|
| 0 | 0 | 0 | 0.25 |
| 0 | 0 | 1 | 0.25 |
| 0 | 1 | 1 | 0.25 |
| 0 | 1 | 2 | 0.25 |
\(x_0 = 0\)
\(x_{t+1} = x_t + v_t\)
\(v_t \sim \mathcal{U}(\{0,1\})\) (i.i.d.)
\[P(x_{1:n}) = \prod_{t=1}^n P(x_t \mid \text{pa}(x_t))\]
For this particular process,
\[P(x_{1:n}) = \prod_{t=1}^n P(x_t \mid x_{t-1})\]
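For example, each row of the joint table above has probability
\[P(x_0, x_1, x_2) = P(x_0)\, P(x_1 \mid x_0)\, P(x_2 \mid x_1) = 1 \times 0.5 \times 0.5 = 0.25\]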
Marginal
For this particular process, since \(\text{pa}(x_t) = x_{t-1}\), if \(P(x_{t-1})\) is known,
\[P(x_t) = \sum_{k \in x_{t-1}} P\left(x_t \mid x_{t-1}=k\right) P(x_{t-1} = k) = 0.5 \, P(x_{t-1} = x_t - 1) + 0.5 \, P(x_{t-1} = x_t)\]
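A minimal sketch of this forward recursion in plain Julia (illustrative, not from the notebook):

```julia
# Propagate the marginal P(x_t) forward from P(x_0 = 0) = 1 using
# P(x_t = k) = 0.5 P(x_{t-1} = k - 1) + 0.5 P(x_{t-1} = k).
function marginal(t::Integer)
    p = Dict(0 => 1.0)                     # P(x_0)
    for _ in 1:t
        p_next = Dict{Int,Float64}()
        for (k, pk) in p
            p_next[k]     = get(p_next, k, 0.0)     + 0.5 * pk   # step of 0
            p_next[k + 1] = get(p_next, k + 1, 0.0) + 0.5 * pk   # step of +1
        end
        p = p_next
    end
    return p
end

marginal(2)   # Dict(0 => 0.25, 1 => 0.5, 2 => 0.25)
```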
Stochastic Process
Expectation
\[E[x_t] = \sum_{x \in x_t} x P(x_t = x)\]
For this particular process, \(x_t = \sum_{i=0}^{t-1} v_i\), so
\[E[x_t] = E\left[\sum_{i=0}^{t-1} v_i\right] = \sum_{i=0}^{t-1} E[v_i] = 0.5\, t\]
Expectation of a function (such as reward)
\[E[f(x_t)] = \sum_{x \in x_t} f(x) P(x_t = x)\]
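A quick Monte Carlo check of \(E[x_t] = 0.5\, t\) (illustrative sketch):

```julia
using Statistics: mean

# x_t is the sum of t i.i.d. steps v ∈ {0, 1}, so E[x_t] should be 0.5t
sample_xt(t) = sum(rand(0:1) for _ in 1:t)

t = 10
mean(sample_xt(t) for _ in 1:100_000)   # ≈ 0.5 * t = 5.0
```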
Causal Stochastic Processes
In a causal stochastic process, \(x_t\) may depend on any \(x_\tau\) with \(\tau <t\).
In general, stochastic processes may have connections between any times in their Bayesian Network.
Simulating a Stochastic Process
030-Stochastic-Processes.ipynb
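A minimal standalone sketch of simulating one trajectory (the notebook itself may differ):

```julia
# Simulate one length-n trajectory of x_{t+1} = x_t + v_t with v_t ~ U({0, 1})
function simulate_walk(n::Integer; x0 = 0)
    xs = [x0]
    for _ in 1:n
        push!(xs, xs[end] + rand(0:1))
    end
    return xs
end

simulate_walk(5)   # e.g. [0, 1, 1, 2, 3, 3]
```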
A More Complex Example
Markov Process
- A stochastic process \(\{S_t\}\) is Markov if \[P(S_{t+1} \mid S_{0:t}) = P(S_{t+1} \mid S_{t})\] \[S_{t+1}\, \bot \, S_{t-\tau} \mid \, S_t \quad \forall \tau \in 1:t\]
- \(S_t\) is called the "state" of the process
Hidden Markov Model
(Often you can't measure the whole state)
Bayesian Networks
A Bayesian Network is a directed acyclic graph (DAG) that encodes probabilistic relationships between R.V.s
- Nodes: R.V.s
- Edges: Direct probabilistic relationships
Concretely:
\(P(x_{1:n}) = \prod_i P(x_i \mid \text{pa}(x_i))\)
\(P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid A)\)
Markov Process
Hidden Markov Model
Dynamic Bayesian Network
(One step)
Dynamic Bayesian Networks
Break
Suppose you want to create a Markov process model that describes how many new COVID cases will start in a particular week. What information should be in the state of the model?
Assume:
- The population mixes thoroughly (i.e. there are no geographic considerations).
- COVID patients may be contagious up to 2 weeks after they contract the disease.
- Researchers have determined a probabilistic model for the number of new cases given the number of people in the first week of the disease and the number of people in the second week of the disease.
Simple Decisions
Simple Decisions
von Neumann-Morgenstern Axioms
Lottery
A lottery has outcomes \(S_1, \ldots, S_n\) with probabilities \(p_1, \ldots, p_n\), written \([S_1: p_1; \ldots; S_n: p_n]\).
- Completeness: Exactly one holds: \(A\succ B\), \(B \succ A\), or \(A \sim B\)
- Transitivity: If \(A \succeq B\) and \(B \succeq C\), then \(A \succeq C\)
- Continuity: If \(A\succeq C \succeq B\), then there exists a probability \(p\) such that \([A:p;\ B:1-p] \sim C\)
- Independence: If \(A \succ B\), then for any \(C\) and probability \(p\), \([A:p;\ C:1-p] \succeq [B:p;\ C:1-p]\)
These constraints imply a utility function \(U\) with the properties:
- \(U(A) > U(B)\) iff \(A \succ B\)
- \(U(A) = U(B)\) iff \(A \sim B\)
- \(U([S_1: p_1; \ldots; S_n: p_n]) = \sum_{i=1}^n p_i \, U(S_i)\)
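A small illustration of the last property with made-up outcomes and utilities:

```julia
# Hypothetical outcomes and utilities, just to illustrate U([S1:p1; ...; Sn:pn]) = Σᵢ pᵢ U(Sᵢ)
U = Dict("win" => 100.0, "draw" => 20.0, "lose" => 0.0)

lottery_A = ["win" => 0.5, "lose" => 0.5]   # [win: 0.5; lose: 0.5]
lottery_B = ["draw" => 1.0]                 # a sure draw

expected_utility(lottery) = sum(p * U[s] for (s, p) in lottery)

expected_utility(lottery_A)   # 50.0
expected_utility(lottery_B)   # 20.0, so A ≻ B for an agent with this utility
```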
Decision Networks
Value of Information
Decision Networks and MDPs
Decision Network
Chance node
Decision node
Utility node
MDP Dynamic Decision Network
MDP Optimization problem
\[\text{maximize} \quad \text{E}\left[\sum_{t=1}^\infty r_t\right]\]
Not well formulated! The expected sum can be infinite.
Markov Decision Process
Finite MDP Objectives
- Finite time: \(\text{E} \left[ \sum_{t=0}^T r_t \right]\)
- Average reward: \(\lim_{n \rightarrow \infty} \frac{1}{n}\, \text{E} \left[\sum_{t=0}^n r_t \right]\)
- Discounting: \(\text{E} \left[\sum_{t=0}^\infty \gamma^t r_t\right]\) with discount \(\gamma \in [0, 1)\), typically 0.9, 0.95, or 0.99; if \(\underline{r} \leq r_t \leq \bar{r}\), then \(\frac{\underline{r}}{1-\gamma} \leq \sum_{t=0}^\infty \gamma^t r_t \leq \frac{\bar{r}}{1-\gamma}\)
- Terminal states: infinite time, but a terminal state (no reward, no leaving) is always reached with probability 1.
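A quick numeric sketch of the discounted objective and its bound (illustrative values):

```julia
# Discounted return Σₜ γ^t r_t with γ = 0.95 and r_t = 1 for 100 steps (both reward bounds equal 1)
γ = 0.95
rs = ones(100)
discounted = sum(γ^(t - 1) * rs[t] for t in eachindex(rs))   # exponent t - 1 since t starts at 0 on the slide
bound = 1 / (1 - γ)   # upper bound r̄ / (1 - γ) = 20.0
# discounted ≈ 19.88, below the bound and approaching it as the horizon grows
```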
Maximizing Expected Utility
Value of Information
Guiding Question
- What does "Markov" mean in "Markov Decision Process"?
030 Stochastic Processes
By Zachary Sunberg