Example:
\(x_0 = 0\)
\(x_{t+1} = x_t + v_t\)
\(v_t \sim \mathcal{U}(\{0,1\})\) (i.i.d.)
Shorthand: \(x' = x + v\)
In a stationary stochastic process (all processes in this class are stationary), this relationship does not change with time.
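This process can be simulated directly from its definition. A minimal Python sketch (the function name `simulate` is my own):

```python
import random

def simulate(n_steps, seed=0):
    """Simulate x_{t+1} = x_t + v_t with v_t ~ Uniform({0, 1}), i.i.d."""
    rng = random.Random(seed)
    x = 0            # x_0 = 0
    path = [x]
    for _ in range(n_steps):
        v = rng.choice([0, 1])   # i.i.d. increment
        x = x + v
        path.append(x)
    return path

path = simulate(10)   # one sample trajectory x_0, ..., x_10
```

Every step of a sampled trajectory increases by 0 or 1, matching the transition rule \(x' = x + v\).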
Joint
x0 | x1 | x2 | P(x1, x2) |
---|---|---|---|
0 | 0 | 0 | 0.25 |
0 | 0 | 1 | 0.25 |
0 | 1 | 1 | 0.25 |
0 | 1 | 2 | 0.25 |
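The joint table above can be recovered by enumerating the two increment sequences \((v_1, v_2)\), each occurring with probability \(0.5 \times 0.5 = 0.25\). A quick check in Python:

```python
from itertools import product

# Enumerate all increment sequences (v_1, v_2) to recover the joint over (x_1, x_2).
joint = {}
for v1, v2 in product([0, 1], repeat=2):
    x1 = 0 + v1          # x_0 = 0
    x2 = x1 + v2
    joint[(x1, x2)] = joint.get((x1, x2), 0.0) + 0.25   # each sequence has prob 0.25
```

`joint` ends up with exactly the four rows of the table, each with probability 0.25.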
\(x_0 = 0\)
\(x_{t+1} = x_t + v_t\)
\(v_t \sim \mathcal{U}(\{0,1\})\) (i.i.d.)
\[P(x_{1:n}) = \prod_{t=1}^n P(x_t \mid \text{pa}(x_t))\]
For this particular process,
\[P(x_{1:n}) = \prod_{t=1}^n P(x_t \mid x_{t-1})\]
Marginal
\[P(x_t) = \sum_{k \in x_{t-1}} P\left(x_t \mid x_{t-1}=k\right) P(x_{t-1} = k)\]
For this particular process, since \(\text{pa}(x_t) = x_{t-1}\), if \(P(x_{t-1})\) is known,
\[P(x_t) = 0.5 \, P(x_{t-1} = x_t - 1) + 0.5 \, P(x_{t-1} = x_t)\]
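The one-step recursion for the marginal can be sketched in Python by propagating a dictionary of probabilities forward in time (the function name `marginal` is my own):

```python
def marginal(t):
    """P(x_t) via the recursion P(x_t) = sum_k P(x_t | x_{t-1}=k) P(x_{t-1}=k)."""
    p = {0: 1.0}                       # P(x_0 = 0) = 1
    for _ in range(t):
        nxt = {}
        for k, pk in p.items():
            for v in (0, 1):           # v ~ Uniform({0, 1}), each with prob 0.5
                nxt[k + v] = nxt.get(k + v, 0.0) + 0.5 * pk
        p = nxt
    return p

p2 = marginal(2)   # {0: 0.25, 1: 0.5, 2: 0.25}
```

Note that `marginal(2)` agrees with the joint table: summing \(P(x_1, x_2)\) over \(x_1\) gives \(P(x_2 = 1) = 0.5\).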
Expectation
\[E[x_t] = \sum_{x \in x_t} x P(x_t = x)\]
For this particular process, \(x_t = \sum_{i=1}^t v_i\), so
\[E[x_t] = E\left[\sum_{i=1}^t v_i\right] = \sum_{i=1}^t E[v_i] = 0.5 \, t\]
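The result \(E[x_t] = 0.5\,t\) can be sanity-checked by Monte Carlo, sampling \(x_t = v_1 + \cdots + v_t\) directly (sample count and function name are my choices):

```python
import random

def mc_mean(t, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[x_t]; the exact value is 0.5 * t."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        total += sum(rng.randint(0, 1) for _ in range(t))   # x_t = v_1 + ... + v_t
    return total / n_samples
```

With 100,000 samples the estimate for \(t = 10\) lands very close to the exact value 5.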
Expectation of a function (such as reward)
\[E[f(x_t)] = \sum_{x \in x_t} f(x) P(x_t = x)\]
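Since \(x_t\) is a sum of \(t\) i.i.d. Bernoulli(0.5) increments, its marginal is Binomial\((t, 0.5)\), so \(E[f(x_t)]\) can be computed exactly by summing over the support. A sketch (the function name `expected_f` is my own):

```python
from math import comb

def expected_f(t, f):
    """E[f(x_t)] = sum_x f(x) P(x_t = x), with x_t ~ Binomial(t, 0.5)."""
    return sum(f(x) * comb(t, x) * 0.5**t for x in range(t + 1))
```

For example, `expected_f(2, lambda x: x)` recovers \(E[x_2] = 0.5 \cdot 2 = 1\), and a nonlinear \(f\) such as a reward function plugs in the same way.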
030-Stochastic-Processes.ipynb
Suppose you want to create a Markov process model that describes how many new COVID cases will start on a particular day. What information should be in the state of the model?
Assume:
(Often you can't measure the whole state)
A Bayesian Network is a directed acyclic graph (DAG) that encodes probabilistic relationships between R.V.s
Concretely:
\(P(x_{1:n}) = \prod_i P(x_i \mid \text{pa}(x_i))\)
\(P(A, B, C) = P(A) \, P(B \mid A) \, P(C \mid A)\)
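The factorization can be made concrete with a toy network \(A \rightarrow B\), \(A \rightarrow C\). The conditional probability tables below are hypothetical values chosen for illustration:

```python
# Hypothetical CPTs for a Bayesian network A -> B, A -> C.
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P_C_given_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

def joint(a, b, c):
    """P(A, B, C) = P(A) P(B | A) P(C | A)."""
    return P_A[a] * P_B_given_A[a][b] * P_C_given_A[a][c]

# The factorization defines a valid distribution: the joint sums to 1.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
```

Any query on the network (marginals, conditionals) reduces to sums over this product of local factors.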
Markov Process
Hidden Markov Model
Dynamic Bayesian Network
(One step)
Outcomes
Probabilities
\(S_1 \ldots S_n\)
\(p_1 \ldots p_n\)
von Neumann-Morgenstern Axioms
Lottery
\([S_1: p_1; \ldots; S_n: p_n]\)
These constraints imply a utility function \(U\) with the properties:
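One key property is that the utility of a lottery is its expected utility, \(U([S_1{:}\,p_1; \ldots; S_n{:}\,p_n]) = \sum_i p_i \, U(S_i)\). A sketch with hypothetical outcomes and utility values:

```python
# Hypothetical lottery [S_1: p_1; ...; S_n: p_n] and utility function U.
lottery = [("win", 0.3), ("draw", 0.5), ("lose", 0.2)]
U = {"win": 1.0, "draw": 0.4, "lose": 0.0}

# Utility of the lottery = expected utility of its outcomes.
expected_utility = sum(p * U[s] for s, p in lottery)   # 0.3*1.0 + 0.5*0.4 + 0.2*0.0
```

A rational agent satisfying the axioms prefers whichever lottery has the higher `expected_utility`.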
Decision Network
Chance node
Decision node
Utility node
MDP Dynamic Decision Network
MDP Optimization problem
\[\text{maximize} \quad \text{E}\left[\sum_{t=1}^\infty r_t\right]\]
Not well formulated! The expected sum can be infinite. Well-posed alternatives:
Finite horizon:
\[\text{E} \left[ \sum_{t=0}^T r_t \right]\]
Infinite horizon (limit):
\[\lim_{n \rightarrow \infty} \text{E} \left[\sum_{t=0}^n r_t \right]\]
Discounted:
\[\text{E} \left[\sum_{t=0}^\infty \gamma^t r_t\right]\]
Episodic: infinite time, but a terminal state (no reward, no leaving) is always reached with probability 1.
discount \(\gamma \in [0, 1)\)
typically 0.9, 0.95, 0.99
if \(\underline{r} \leq r_t \leq \bar{r}\)
then \[\frac{\underline{r}}{1-\gamma} \leq \sum_{t=0}^\infty \gamma^t r_t \leq \frac{\bar{r}}{1-\gamma} \]
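This geometric bound on the discounted return can be checked numerically. The reward distribution and truncation horizon below are assumptions for the demonstration (the tail beyond the horizon contributes at most \(\gamma^H / (1-\gamma) \cdot \max(|\underline{r}|, |\bar{r}|)\), which is negligible here):

```python
import random

gamma = 0.95
r_lo, r_hi = -1.0, 2.0          # assumed reward bounds: r_lo <= r_t <= r_hi
horizon = 10_000                # truncation; gamma**horizon is negligible

rng = random.Random(0)
# Discounted return for one sampled reward sequence.
ret = sum(gamma**t * rng.uniform(r_lo, r_hi) for t in range(horizon))

lower = r_lo / (1 - gamma)      # -20.0
upper = r_hi / (1 - gamma)      #  40.0
```

Any bounded reward sequence produces a return inside \([\underline{r}/(1-\gamma),\; \bar{r}/(1-\gamma)]\), which is why discounting with \(\gamma \in [0, 1)\) makes the objective well defined.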