4.5 Directed Graphical Models

Directed graphical models, also known as Bayesian networks, provide a graphical language for specifying probabilistic models. They offer a compact and intuitive way to visualize dependencies between random variables and to represent how a joint distribution can be decomposed into simpler conditional distributions.

The key idea is that nodes represent random variables, and edges (arrows) represent probabilistic or conditional relationships. The joint distribution is expressed as a product of conditional probabilities. Graphical models help identify independence and conditional independence relationships among variables.

Some of the benefits of graphical models are that they:

Provide a simple way to visualize model structure.
Facilitate the design and understanding of statistical models.
Make independence properties easy to identify.
Enable simplified inference and learning computations through graphical manipulations.

4.5.1 Graph Semantics

A directed graphical model (or Bayesian network) represents conditional dependencies among random variables.

Example 4.7 A joint distribution: \[ p(a, b, c) = p(c | a, b) \, p(b | a) \, p(a) \] implies the following:

\(c\) depends on \(a\) and \(b\)
\(b\) depends on \(a\)
\(a\) is independent of \(b\) and \(c\)

This factorization can be represented as a directed graph with arrows from \(a \to b\) and \(a, b \to c\).

4.5.1.1 General Rule for Constructing Graphs

Create a node for each random variable.
For each conditional probability, draw arrows from the conditioning variables to the conditioned variable.

4.5.1.2 General Form of the Joint Distribution

For random variables \(\mathbf{x}_1, \ldots, \mathbf{x}_K\), the joint distribution factorizes as: \[ p(\mathbf{x}) = p(\mathbf{x}_1, \ldots, \mathbf{x}_K) = \prod_{k=1}^K p(\mathbf{x}_k | \text{Pa}_k) \] where \(\text{Pa}_k\) denotes the parent nodes of \(\mathbf{x}_k\) (nodes with arrows pointing to \(\mathbf{x}_k\)).

Example 4.8 Coin Flip Experiment

For a Bernoulli trial with parameter \(\mu\): \[ p(\mathbf{x} | \mu) = \text{Ber}(\mu). \] Repeating the experiment \(N\) times: \[ p(\mathbf{x}_1, \ldots, \mathbf{x}_N | \mu) = \prod_{n=1}^N p(\mathbf{x}_n | \mu). \]

This can be represented graphically:

Each outcome \(\mathbf{x}_n\) depends on the same latent variable \(\mu\).
Plate notation is used to represent the repetition \(N\) times.

We can also assign a hyperprior to \(\mu\), for instance: \[ \mu \sim \text{Beta}(\alpha, \beta) \] where \(\alpha\) and \(\beta\) may be treated as deterministic parameters.

4.5.2 Conditional Independence and d-Separation

Directed graphical models can reveal conditional independence relationships directly from the graph using the concept of d-separation (Pearl, 1988).

Definition 4.13 Given disjoint node sets \(A, B, C\) in a directed acyclic graph, \(A\) is d-separated from \(B\) given \(C\), written as: \[ A \perp\!\!\!\perp B \,|\, C \] if all paths between nodes in \(A\) and \(B\) are blocked.

A path is blocked if:

The arrows meet head-to-tail or tail-to-tail at a node in \(C\) (a path of the form \(a \rightarrow c \rightarrow b\) or \(a \leftarrow c \rightarrow b\)), or
The arrows meet head-to-head at a node not in \(C\), and none of its descendants are in \(C\) (a path of the form \(a \rightarrow c \leftarrow b\)).

If all paths are blocked, \(A\) and \(B\) are conditionally independent given \(C\).

Example 4.9 In the graphical model:

A E-separation example

We can infer:

\(b \perp d \,|\, a, c\)
\(a \perp c \,|\, b\)
\(b \not\!\perp d \,|\, c\) since \(d\) is a descendant of \(c\)
\(a \not\!\perp c \,|\, b, e\) since \(d\) is a descendant of \(c\)

Example 4.10 d-Separation in a Bayesian Network

Consider the following directed acyclic graph (DAG):

\[ A \longrightarrow C \longrightarrow D \longleftarrow B \] We will analyze conditional independence relationships using d-separation.

We examine whether variables \(A\) and \(B\) are independent under different conditioning sets. There is exactly one undirected path between \(A\) and \(B\): \[ A \rightarrow C \rightarrow D \leftarrow B. \] This path contains a collider at node \(D\).

If we condition on nothing:

The path contains a collider \(D\)
Colliders block paths unless conditioned on (or their descendants are)

The path is blocked \[ A \;\perp\!\!\!\perp\; B \]

Now condition on \(C\):

\(C\) is a chain node
Conditioning on a chain node blocks the path
The collider at \(D\) remains unconditioned

The path is still blocked \[ A \;\perp\!\!\!\perp\; B \mid C \]

Now condition on \(D\):

Conditioning on a collider opens the path

The path is now unblocked \[ A \;\not\!\perp\!\!\!\perp\; B \mid D \] Knowing \(A\) gives information about \(B\) once \(D\) is known.

Suppose \(E\) is a descendant of \(D\): \[ A \rightarrow C \rightarrow D \leftarrow B, \quad D \rightarrow E \] Condition on \(E\):

Conditioning on a descendant of a collider also opens the path

The path is unblocked \[ A \;\not\!\perp\!\!\!\perp\; B \mid E \]

Summary of Results

Conditioning Set	Are \(A\) and \(B\) d-separated?
None	Yes
\(C\)	Yes
\(D\)	No
\(E\)	No

Exercises

Exercise 4.7 Chain (No Conditioning)

\[ X \rightarrow Y \rightarrow Z \]

Question: Are \(X\) and \(Z\) independent?

The path is a chain
No conditioning blocks the path

\[ X \;\not\!\perp\!\!\!\perp\; Z \]

Solution

Exercise 4.8 Chain (Condition on the Middle)

\[ X \rightarrow Y \rightarrow Z \]

Condition on \(Y\):

Conditioning on a chain node blocks the path

\[ X \;\perp\!\!\!\perp\; Z \mid Y \]

Solution

Exercise 4.9 Fork (Common Cause)

\[ X \leftarrow Y \rightarrow Z \]

No conditioning:

\(Y\) is a common cause
Path is open

\[ X \;\not\!\perp\!\!\!\perp\; Z \]

Solution

Exercise 4.10 Fork (Condition on the Common Cause)

\[ X \leftarrow Y \rightarrow Z \]

Condition on \(Y\):

Conditioning on a fork node blocks the path

\[ X \;\perp\!\!\!\perp\; Z \mid Y \]

Solution

Exercise 4.11 Collider (No Conditioning)

\[ X \rightarrow Y \leftarrow Z \]

No conditioning:

\(Y\) is a collider
Colliders block paths by default

\[ X \;\perp\!\!\!\perp\; Z \]

Solution

Exercise 4.12 Collider (Condition on the Collider)

\[ X \rightarrow Y \leftarrow Z \]

Condition on \(Y\):

Conditioning on a collider opens the path

\[ X \;\not\!\perp\!\!\!\perp\; Z \mid Y \]

Solution