Bayesian Network Learning
Last time:
- Conditional independence in Bayesian Networks
- Sampling from Bayesian Networks
Today:
- Inference: given a Bayesian network and the values of some variables, how do we calculate the probability of other variables' values?
- Learning: given data, how do we fit a Bayesian network?
Bayesian Network
Structure
Parameters
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
Inference
Inputs
- Bayesian network structure
- Bayesian network parameters
- Values of evidence variables
Outputs
- Posterior distribution of query variables
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
Given that you have detected a trajectory deviation and the battery has not failed, what is the probability of a solar panel failure?
\(P(S=1 \mid D=1, B=0)\)
Exact
Approximate
Exact Inference
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
\[P(S{=}1 \mid D{=}1, B{=}0)\]
\[= \frac{P(S{=}1, D{=}1, B{=}0)}{P(D{=}1, B{=}0)}\]
\[P(S{=}1, D{=}1, B{=}0) = \sum_{e, c}P(B{=}0, S{=}1, E{=}e, D{=}1, C{=}c)\]
\[P(B{=}0, S{=}1, E, D{=}1, C)\]
\[= P(B{=}0)\,P(S{=}1)\,P(E\mid B{=}0, S{=}1)\,P(D{=}1\mid E)\,P(C\mid E)\]
With 5 binary variables there are only \(2^5 = 32\) possible assignments, but the number of assignments grows exponentially with the number of variables.
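A minimal sketch of this enumeration, assuming the structure implied by the factorization above (B and S are parents of E; E is the parent of D and C). The CPT numbers and the function names are made up for illustration and do not come from the slides.

```python
from itertools import product

# Hypothetical CPT values (assumptions, not from the slides); all variables are binary.
P_B1 = 0.1                                                       # P(B=1)
P_S1 = 0.2                                                       # P(S=1)
P_E1 = {(0, 0): 0.05, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95}    # P(E=1 | B, S)
P_D1 = {0: 0.1, 1: 0.8}                                          # P(D=1 | E)
P_C1 = {0: 0.15, 1: 0.85}                                        # P(C=1 | E)


def bern(p1, x):
    """Probability that a binary variable equals x when P(x=1) = p1."""
    return p1 if x == 1 else 1.0 - p1


def joint(b, s, e, d, c):
    """Chain-rule factorization P(B) P(S) P(E|B,S) P(D|E) P(C|E)."""
    return (bern(P_B1, b) * bern(P_S1, s) * bern(P_E1[(b, s)], e)
            * bern(P_D1[e], d) * bern(P_C1[e], c))


def posterior_s1(d=1, b=0):
    """P(S=1 | D=d, B=b), summing the joint over the unobserved variables."""
    numer = sum(joint(b, 1, e, d, c) for e, c in product((0, 1), repeat=2))
    denom = sum(joint(b, s, e, d, c) for s, e, c in product((0, 1), repeat=3))
    return numer / denom


posterior = posterior_s1()
print(posterior)
```

The denominator is the same sum as the numerator with S also summed out, so one joint function serves both.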
Exact Inference
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463883/pasted-from-clipboard.png)
Product
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463887/pasted-from-clipboard.png)
Condition
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463892/pasted-from-clipboard.png)
Marginalize
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463897/pasted-from-clipboard.png)
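The three factor operations can be sketched on small tables. This is an illustrative implementation for binary variables only; the `Factor` class, the function names, and the example numbers are assumptions, not taken from the slides.

```python
from itertools import product


class Factor:
    """A table over binary variables: variable names plus {assignment tuple: value}."""

    def __init__(self, names, table):
        self.names = tuple(names)
        self.table = dict(table)


def factor_product(f, g):
    """Multiply two factors, matching values on their shared variables."""
    names = f.names + tuple(n for n in g.names if n not in f.names)
    table = {}
    for values in product((0, 1), repeat=len(names)):
        a = dict(zip(names, values))
        fv = f.table[tuple(a[n] for n in f.names)]
        gv = g.table[tuple(a[n] for n in g.names)]
        table[values] = fv * gv
    return Factor(names, table)


def condition(f, name, value):
    """Keep only the rows where `name` equals `value`, then drop that variable."""
    i = f.names.index(name)
    table = {k[:i] + k[i + 1:]: v for k, v in f.table.items() if k[i] == value}
    return Factor(f.names[:i] + f.names[i + 1:], table)


def marginalize(f, name):
    """Sum `name` out of the factor."""
    i = f.names.index(name)
    table = {}
    for k, v in f.table.items():
        key = k[:i] + k[i + 1:]
        table[key] = table.get(key, 0.0) + v
    return Factor(f.names[:i] + f.names[i + 1:], table)


# Made-up example factors phi(X, Y) and phi(Y, Z).
phi_xy = Factor(("X", "Y"), {(0, 0): 0.3, (0, 1): 0.4, (1, 0): 0.2, (1, 1): 0.1})
phi_yz = Factor(("Y", "Z"), {(0, 0): 0.2, (0, 1): 0.0, (1, 0): 0.3, (1, 1): 0.5})

phi_xyz = factor_product(phi_xy, phi_yz)   # factor over (X, Y, Z)
phi_x_given_y1 = condition(phi_xy, "Y", 1) # factor over (X,)
phi_x = marginalize(phi_xy, "Y")           # factor over (X,)
```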
Exact Inference: Variable Elimination
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463915/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463916/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9463919/pasted-from-clipboard.png)
Start with
Eliminate \(D\) and \(C\) (evidence) to get \(\phi_6(E)\) and \(\phi_7(E)\)
Eliminate \(E\)
Eliminate \(S\)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464065/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464066/pasted-from-clipboard.png)
vs
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464070/pasted-from-clipboard.png)
Choosing the optimal elimination order is NP-hard
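The elimination steps listed above can be sketched for this five-variable network. The sketch assumes the query is \(P(B \mid D{=}1, C{=}1)\), which the steps suggest but the text does not state. The CPT numbers are hypothetical, and the names `phi8` and `phi9` are labels chosen here.

```python
# Hypothetical CPTs (assumptions, not from the slides); all variables are binary.
P_B = [0.9, 0.1]                                                 # P(B=b)
P_S = [0.8, 0.2]                                                 # P(S=s)
P_E1 = {(0, 0): 0.05, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95}    # P(E=1 | B, S)
P_D1 = [0.1, 0.8]                                                # P(D=1 | E=e)
P_C1 = [0.15, 0.85]                                              # P(C=1 | E=e)


def p_e(e, b, s):
    """P(E=e | B=b, S=s)."""
    p1 = P_E1[(b, s)]
    return p1 if e == 1 else 1.0 - p1


# Condition on the evidence D=1 and C=1: each leaves a factor over E only.
phi6 = [P_D1[e] for e in (0, 1)]        # phi_6(E)
phi7 = [P_C1[e] for e in (0, 1)]        # phi_7(E)

# Eliminate E: multiply every factor that mentions E, then sum E out.
phi8 = {(b, s): sum(p_e(e, b, s) * phi6[e] * phi7[e] for e in (0, 1))
        for b in (0, 1) for s in (0, 1)}

# Eliminate S: multiply by P(S) and sum S out.
phi9 = [sum(P_S[s] * phi8[(b, s)] for s in (0, 1)) for b in (0, 1)]

# Multiply by P(B) and normalize to get the posterior over B.
unnormalized = [P_B[b] * phi9[b] for b in (0, 1)]
total = sum(unnormalized)
posterior_B = [u / total for u in unnormalized]
print(posterior_B)
```

Each intermediate factor here involves at most two variables. A different elimination order can create larger intermediate factors, which is why the order matters.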
Approximate Inference
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
Approximate Inference: Direct Sampling
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464133/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464136/pasted-from-clipboard.png)
Analogous to unweighted particle filtering
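A rough sketch of direct sampling for the query \(P(S{=}1 \mid D{=}1, B{=}0)\): draw full samples from the network in topological order and keep only those consistent with the evidence. The CPT numbers, the dictionary layout, and the function names are assumptions made for illustration.

```python
import random

# Hypothetical CPTs (assumptions, not from the slides): P(X=1 | parents).
PARENTS = {"B": (), "S": (), "E": ("B", "S"), "D": ("E",), "C": ("E",)}
P_ONE = {
    "B": {(): 0.1},
    "S": {(): 0.2},
    "E": {(0, 0): 0.05, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95},
    "D": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.15, (1,): 0.85},
}
ORDER = ["B", "S", "E", "D", "C"]  # a topological order


def prob(name, value, x):
    """P(name=value | parents of name as assigned in x)."""
    p1 = P_ONE[name][tuple(x[p] for p in PARENTS[name])]
    return p1 if value == 1 else 1.0 - p1


def sample_joint(rng):
    """Draw one full assignment, sampling each variable after its parents."""
    x = {}
    for name in ORDER:
        x[name] = int(rng.random() < prob(name, 1, x))
    return x


def direct_sampling(query, evidence, n, rng):
    """Fraction of evidence-consistent samples that also match the query."""
    kept = hits = 0
    for _ in range(n):
        x = sample_joint(rng)
        if all(x[k] == v for k, v in evidence.items()):
            kept += 1
            hits += all(x[k] == v for k, v in query.items())
    return hits / kept if kept else float("nan")


estimate = direct_sampling({"S": 1}, {"D": 1, "B": 0}, 50_000, random.Random(0))
print(estimate)
```

Samples that contradict the evidence are thrown away, so the method wastes most of its work when the evidence is unlikely.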
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
Approximate Inference: Weighted Sampling
Analogous to weighted particle filtering
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464155/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464160/pasted-from-clipboard.png)
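A sketch of weighted sampling (often called likelihood weighting) for the same query \(P(S{=}1 \mid D{=}1, B{=}0)\): evidence variables are fixed to their observed values, and each sample is weighted by the probability of that evidence given its parents. The CPT numbers and names are hypothetical.

```python
import random

# Hypothetical CPTs (assumptions, not from the slides): P(X=1 | parents).
PARENTS = {"B": (), "S": (), "E": ("B", "S"), "D": ("E",), "C": ("E",)}
P_ONE = {
    "B": {(): 0.1},
    "S": {(): 0.2},
    "E": {(0, 0): 0.05, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95},
    "D": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.15, (1,): 0.85},
}
ORDER = ["B", "S", "E", "D", "C"]  # a topological order


def prob(name, value, x):
    """P(name=value | parents of name as assigned in x)."""
    p1 = P_ONE[name][tuple(x[p] for p in PARENTS[name])]
    return p1 if value == 1 else 1.0 - p1


def weighted_sample(evidence, rng):
    """Return one sample consistent with the evidence, plus its weight."""
    x, w = {}, 1.0
    for name in ORDER:
        if name in evidence:
            x[name] = evidence[name]          # clamp the evidence variable
            w *= prob(name, x[name], x)       # weight by its likelihood
        else:
            x[name] = int(rng.random() < prob(name, 1, x))
    return x, w


def likelihood_weighting(query, evidence, n, rng):
    """Weighted fraction of samples that match the query."""
    total = matched = 0.0
    for _ in range(n):
        x, w = weighted_sample(evidence, rng)
        total += w
        if all(x[k] == v for k, v in query.items()):
            matched += w
    return matched / total


estimate = likelihood_weighting({"S": 1}, {"D": 1, "B": 0}, 50_000, random.Random(0))
print(estimate)
```

No sample is rejected, but samples with tiny weights contribute little, so unlikely evidence still reduces the effective sample size.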
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
Approximate Inference: Gibbs Sampling
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464172/pasted-from-clipboard.png)
Markov Chain Monte Carlo (MCMC)
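A sketch of Gibbs sampling for \(P(S{=}1 \mid D{=}1, B{=}0)\): hold the evidence fixed and repeatedly resample each unobserved variable from its distribution given the current values of all the others. For simplicity this version evaluates the full joint instead of only the factors in each variable's Markov blanket. The CPT numbers, burn-in length, and names are assumptions.

```python
import random

# Hypothetical CPTs (assumptions, not from the slides): P(X=1 | parents).
PARENTS = {"B": (), "S": (), "E": ("B", "S"), "D": ("E",), "C": ("E",)}
P_ONE = {
    "B": {(): 0.1},
    "S": {(): 0.2},
    "E": {(0, 0): 0.05, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95},
    "D": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.15, (1,): 0.85},
}
ORDER = ["B", "S", "E", "D", "C"]


def prob(name, value, x):
    """P(name=value | parents of name as assigned in x)."""
    p1 = P_ONE[name][tuple(x[p] for p in PARENTS[name])]
    return p1 if value == 1 else 1.0 - p1


def joint(x):
    """Joint probability of a full assignment x."""
    p = 1.0
    for name in ORDER:
        p *= prob(name, x[name], x)
    return p


def gibbs_sampling(query, evidence, n_sweeps, burn_in, rng):
    """Estimate P(query | evidence) from a single Markov chain."""
    x = dict(evidence)
    hidden = [n for n in ORDER if n not in evidence]
    for name in hidden:
        x[name] = rng.randint(0, 1)           # arbitrary starting state
    hits = 0
    for sweep in range(burn_in + n_sweeps):
        for name in hidden:
            x[name] = 1
            p1 = joint(x)
            x[name] = 0
            p0 = joint(x)
            # Resample from P(name | all other variables), proportional to the joint.
            x[name] = int(rng.random() < p1 / (p0 + p1))
        if sweep >= burn_in:
            hits += all(x[k] == v for k, v in query.items())
    return hits / n_sweeps


estimate = gibbs_sampling({"S": 1}, {"D": 1, "B": 0}, 30_000, 1_000, random.Random(0))
print(estimate)
```

Successive states of the chain are correlated, so the estimate converges more slowly than the same number of independent samples would.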
Learning
Bayesian Network Learning
Inputs
- Data, \(D\)
- Priors (?)
Outputs
- Bayesian network structure, \(G\)
- Bayesian network parameters, \(\theta\)
Counting Parameters
For discrete random variables:
\[\text{dim}(\theta_X) = \left(|\text{support}(X)|-1\right)\prod_{Y \in Pa(X)} |\text{support}(Y)|\]
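As a quick check of the formula, the sketch below counts independent parameters for the five-variable binary network used in the inference examples (B and S are parents of E; E is the parent of D and C). The function name and dictionary layout are illustrative choices.

```python
from math import prod


def num_parameters(support_sizes, parents):
    """Independent parameters per variable: (|support(X)| - 1) * prod of parent support sizes."""
    return {
        x: (support_sizes[x] - 1) * prod(support_sizes[y] for y in parents[x])
        for x in support_sizes
    }


support_sizes = {"B": 2, "S": 2, "E": 2, "D": 2, "C": 2}
parents = {"B": (), "S": (), "E": ("B", "S"), "D": ("E",), "C": ("E",)}

dims = num_parameters(support_sizes, parents)
total = sum(dims.values())
print(dims, total)  # B: 1, S: 1, E: 4, D: 2, C: 2, for a total of 10
```

A full joint table over five binary variables would need \(2^5 - 1 = 31\) independent parameters, so the factored form is considerably smaller.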
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9454664/pasted-from-clipboard.png)
Structure Learning Example
Parameter Learning
Maximum Likelihood
Bayesian
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464203/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464204/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464205/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464206/pasted-from-clipboard.png)
Multinomial:
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464208/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464209/pasted-from-clipboard.png)
Multinomial:
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464211/pasted-from-clipboard.png)
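A sketch of count-based parameter learning for one discrete variable given its parents. It uses the standard estimates, which may differ in notation from the slide images: with `alpha = 0` it returns the maximum likelihood estimate (normalized counts), and with `alpha > 0` it returns the posterior mean under a symmetric Dirichlet prior with pseudocount `alpha`. The data set and all names are made up.

```python
from collections import Counter


def estimate_cpt(data, child, parents, values, alpha=0.0):
    """Estimate P(child | parents) from a list of dict-valued samples.

    alpha = 0 gives the maximum likelihood estimate; alpha > 0 gives the
    posterior mean under a symmetric Dirichlet(alpha) prior.
    Only parent configurations that appear in the data are returned.
    """
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    configs = {config for config, _ in counts}
    cpt = {}
    for config in configs:
        total = sum(counts[(config, v)] + alpha for v in values)
        for v in values:
            cpt[(config, v)] = (counts[(config, v)] + alpha) / total
    return cpt


# Made-up observations of E and D.
data = [
    {"E": 0, "D": 0}, {"E": 0, "D": 0}, {"E": 0, "D": 1},
    {"E": 1, "D": 1}, {"E": 1, "D": 1},
]

mle = estimate_cpt(data, "D", ("E",), (0, 1))
bayes = estimate_cpt(data, "D", ("E",), (0, 1), alpha=1.0)
print(mle[((1,), 1)], bayes[((1,), 1)])
```

With only two observations of E=1, the maximum likelihood estimate assigns zero probability to D=0, while the pseudocounts keep the Bayesian estimate away from 0 and 1.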
Structure Learning
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464214/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464217/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464218/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464222/pasted-from-clipboard.png)
NP-Hard
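One common way to compare candidate structures is a Bayesian score computed from counts. The sketch below assumes the standard Bayesian Dirichlet score with a uniform pseudocount `alpha` for every parameter and omits the structure prior; the slides may use a different prior or notation. The data set and names are invented.

```python
from collections import Counter
from math import lgamma


def bayesian_score(data, parents, values, alpha=1.0):
    """Log marginal likelihood of the data under a structure (structure prior omitted).

    parents: {variable: tuple of parent names}
    values:  {variable: list of possible values}
    """
    score = 0.0
    for child, pa in parents.items():
        r = len(values[child])
        counts = Counter((tuple(row[p] for p in pa), row[child]) for row in data)
        totals = Counter(tuple(row[p] for p in pa) for row in data)
        # Parent configurations that never occur contribute zero to the score.
        for config, m in totals.items():
            score += lgamma(r * alpha) - lgamma(r * alpha + m)
            for v in values[child]:
                score += lgamma(alpha + counts[(config, v)]) - lgamma(alpha)
    return score


# Made-up data in which Y always copies X.
data = [{"X": i % 2, "Y": i % 2} for i in range(20)]
values = {"X": [0, 1], "Y": [0, 1]}

score_empty = bayesian_score(data, {"X": (), "Y": ()}, values)
score_edge = bayesian_score(data, {"X": (), "Y": ("X",)}, values)
print(score_empty, score_edge)
```

A search procedure would evaluate a score like this for many candidate graphs. The number of directed acyclic graphs grows super-exponentially with the number of variables, so practical methods rely on heuristics such as local search.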
Markov Equivalence Class
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464223/pasted-from-clipboard.png)
Markov Equivalent iff
- Same edges when direction is ignored (same skeleton)
- Same set of immoral v-structures
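The two conditions above can be checked directly on edge lists. In this sketch an immoral v-structure is a pair of parents of a common child that are not themselves connected. It assumes both graphs are DAGs over the same nodes, given as lists of `(parent, child)` pairs; the function names are illustrative.

```python
from itertools import combinations


def skeleton(edges):
    """Set of edges with direction ignored."""
    return {frozenset(e) for e in edges}


def immoralities(edges):
    """Set of (parent, child, parent) triples whose two parents are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    found = set()
    for child, pas in parents.items():
        for a, b in combinations(sorted(pas), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """True if both graphs share a skeleton and the same immoral v-structures."""
    return (skeleton(edges1) == skeleton(edges2)
            and immoralities(edges1) == immoralities(edges2))


chain = [("A", "B"), ("B", "C")]       # A -> B -> C
reverse = [("C", "B"), ("B", "A")]     # A <- B <- C
collider = [("A", "B"), ("C", "B")]    # A -> B <- C

print(markov_equivalent(chain, reverse))   # True
print(markov_equivalent(chain, collider))  # False
```

The chain and its reversal encode the same conditional independences, while the collider introduces an immorality that neither of them has.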
![](https://s3.amazonaws.com/media-p.slid.es/uploads/870752/images/9464224/pasted-from-clipboard.png)
Recap
Inference
Learning
230 Bayesian Network Learning
By Zachary Sunberg