# How p affects your posterior judgements in BHMs

There may be simpler/more elegant ways to show what I show below, but I find that this question comes up often and it’s good to have a toy example handy.

Consider the hierarchical model $Z = Y + v,$ $Y = X\beta + e,$

where for simplicity, we assume that $v \sim \mathcal{N}(0,I),$ $e \sim \mathcal{N}(0,I).$

How does increasing the number of covariates $p$ affect our posterior judgements on $Y$ and $\beta$? We will analyse the simple case in which the columns of $X$ are orthogonal. Then the answer to this question is:

Let $X$ be an $n \times p$ matrix with orthogonal columns, where $n = \dim(Y)$ and $p = \dim(\beta)$. Then

- Var(Y | Z) increases with p (in the positive-semidefinite sense), and
- Var( $\beta_i$ | Z), i = 1,…,p, is independent of p.

To illustrate (not prove) why these two claims are true, we first rewrite the model as $Z = CV + v,$ $AV = e,$

where $C = [I \quad 0]$, $A = [I \quad -X]$ and $V = [Y^T \quad \beta^T]^T$. In typical applications Z would constitute the observations, Y the hidden state, X the regressors and $\beta$ the weights attached to each component in $X$. For our simple model $\textrm{Var}(V | Z) = (C^TC + A^TA)^{-1}.$
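The claims below can be checked numerically. Here is a minimal NumPy sketch that builds $C$, $A$ and $\textrm{Var}(V | Z)$; the dimensions and the random design matrix are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 2
X = rng.standard_normal((n, p))  # illustrative design matrix (assumption)

C = np.hstack([np.eye(n), np.zeros((n, p))])  # Z = C V + v
A = np.hstack([np.eye(n), -X])                # A V = e

# Posterior covariance of V = [Y; beta] given Z.
var_V = np.linalg.inv(C.T @ C + A.T @ A)
assert var_V.shape == (n + p, n + p)
```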

## Var(Y | Z)

Here we are interested in the effect of the number of columns in $X$ on the posterior uncertainty of Y, that is, Var(Y | Z). Using Schur complements on $\textrm{Var}(V | Z)$ we obtain $\textrm{Var}(Y | Z) = (2I - X(X^TX)^{-1}X^T)^{-1}.$
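As a sanity check (a sketch with an assumed random $X$), the Schur-complement expression for $\textrm{Var}(Y | Z)$ can be compared against the corresponding block of the full inverse:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 2
X = rng.standard_normal((n, p))  # illustrative design matrix (assumption)

# Full posterior covariance of V = [Y; beta]:
# C^T C + A^T A = [[2I, -X], [-X^T, X^T X]].
M = np.block([[2 * np.eye(n), -X], [-X.T, X.T @ X]])
var_V = np.linalg.inv(M)

# Schur-complement expression for the Y block.
var_Y = np.linalg.inv(2 * np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T)
assert np.allclose(var_V[:n, :n], var_Y)
```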

Now, if $X = x_1$, where $x_1$ is a vector (a single covariate), then $\textrm{Var}_1(Y | Z) = \left(2I - \frac{x_1x_1^T}{\|x_1\|^2}\right)^{-1},$

while if $X = [x_1 \quad x_2]$ (two covariates) then it can be shown that $\textrm{Var}_2(Y | Z) = \left(2I - \frac{\|x_2\|^2x_1x_1^T + \|x_1\|^2x_2x_2^T - (x_1\cdot x_2)(x_1x_2^T + x_2x_1^T)}{\|x_1\|^2\|x_2\|^2 - (x_1\cdot x_2)^2}\right)^{-1},$

where $x_1 \cdot x_2$ denotes the inner product between $x_1$ and $x_2$.

Note that both $\textrm{Var}_1(Y | Z)$ and $\textrm{Var}_2(Y | Z)$ can be written in the form $\textrm{Var}_i(Y | Z) = (2I - B_i)^{-1}.$

Therefore if $\textrm{Var}_1(Y | Z) <\textrm{Var}_2(Y | Z)$, then $B_1 < B_2$ and vice-versa. Now, when $x_1$ and $x_2$ are orthogonal, $B_2 = \frac{\|x_2\|^2x_1x_1^T + \|x_1\|^2x_2x_2^T}{\|x_1\|^2\|x_2\|^2} = \frac{x_1x_1^T}{\|x_1\|^2} +\frac{x_2x_2^T}{\|x_2\|^2} \geq B_1 =\frac{x_1x_1^T}{\|x_1\|^2},$ since $B_2 - B_1 = x_2x_2^T/\|x_2\|^2$ is positive semidefinite.

Therefore the posterior variance of Y increases with the number of orthogonal regressors in X.
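This can be illustrated numerically. The sketch below (with two assumed orthogonal regressors built via a QR factorisation) checks that the difference $\textrm{Var}_2(Y | Z) - \textrm{Var}_1(Y | Z)$ is positive semidefinite and that the total posterior variance grows:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
# Two orthogonal regressors (assumed for illustration), built via QR.
Q = np.linalg.qr(rng.standard_normal((n, 2)))[0]
x1, x2 = 3.0 * Q[:, :1], 2.0 * Q[:, 1:]

def var_Y(X):
    """Var(Y | Z) = (2I - X (X^T X)^{-1} X^T)^{-1}."""
    return np.linalg.inv(2 * np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T)

V1 = var_Y(x1)                   # one regressor
V2 = var_Y(np.hstack([x1, x2]))  # two orthogonal regressors

# V2 - V1 is positive semidefinite, and the total variance grows.
assert np.linalg.eigvalsh(V2 - V1).min() >= -1e-10
assert np.trace(V2) > np.trace(V1)
```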

## Var($\beta$ | Z)

Here we are interested in the effect of the number of columns in $X$ on the posterior uncertainty of $\beta$, that is, Var( $\beta$ | Z). Using Schur complements on $\textrm{Var}(V | Z)$ we obtain $\textrm{Var}(\beta | Z) = 2(X^TX)^{-1},$
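As a quick check (again with an assumed random $X$), this expression matches the $\beta$ block of the full posterior covariance:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 2
X = rng.standard_normal((n, p))  # illustrative design matrix (assumption)

# Posterior precision of V = [Y; beta]: [[2I, -X], [-X^T, X^T X]].
M = np.block([[2 * np.eye(n), -X], [-X.T, X.T @ X]])
var_V = np.linalg.inv(M)

# The beta block of Var(V | Z) matches 2 (X^T X)^{-1}.
assert np.allclose(var_V[n:, n:], 2 * np.linalg.inv(X.T @ X))
```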

and therefore when $p = 1$ (one regressor), $\textrm{Var}_1(\beta | Z) = \frac{2}{\|x_1\|^2}.$

When $p=2$ we obtain $\textrm{Var}_2(\beta | Z) = \frac{2}{\|x_1\|^2\|x_2\|^2 - (x_1\cdot x_2)^2} \left[ \begin{array}{cc} \|x_2\|^2 & -x_1\cdot x_2\\ -x_1\cdot x_2 & \|x_1\|^2 \end{array} \right].$

If $x_1$ is orthogonal to $x_2$ then we obtain $\textrm{Var}_2(\beta | Z) = \left[ \begin{array}{cc} 2/\|x_1\|^2 & 0\\ 0 & 2/\|x_2\|^2 \end{array} \right].$

Note that $\textrm{Var}_1(\beta_1 | Z) =\textrm{Var}_2(\beta_1 | Z) = 2/\|x_1\|^2$: with orthogonal regressors, the posterior variance of each $\beta_i$ does not depend on $p$.
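A numerical sketch of this last point, with two assumed orthogonal regressors: the posterior variance of $\beta_1$ is unchanged when the second regressor is added.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
# Two orthogonal regressors (assumed for illustration), built via QR.
Q = np.linalg.qr(rng.standard_normal((n, 2)))[0]
x1, x2 = 3.0 * Q[:, :1], 2.0 * Q[:, 1:]  # ||x1||^2 = 9, ||x2||^2 = 4

def var_beta(X):
    """Var(beta | Z) = 2 (X^T X)^{-1}."""
    return 2 * np.linalg.inv(X.T @ X)

v1 = var_beta(x1)[0, 0]                   # p = 1: 2 / ||x1||^2
v2 = var_beta(np.hstack([x1, x2]))[0, 0]  # p = 2: same (1,1) entry
assert np.isclose(v1, v2)                 # Var(beta_1 | Z) is unchanged
```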