Appendix B — Random vectors

B.1 Vector mean

If we have a collection of random variables \(x_1,\ldots,x_p\), we can form them into a vector \(\mathbf{x}=(x_1,\ldots,x_p)^T\). The joint distribution of the collection of random variables \((x_1,\ldots,x_p)\) defines the distribution of the random vector \(\mathbf{x}\).

We define the mean of \(\mathbf{x}\) to be the vector of means of its components: \[E(\mathbf{x})=\left(\begin{array}{c} E(x_1) \\ \vdots \\ E(x_p)\end{array}\right)\]
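As a small numerical sketch (in Python with numpy; the sample size and distribution are our own illustrative choices), the mean vector can be estimated by averaging sampled vectors component by component:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n samples of a p = 3 dimensional random vector x,
# stored as the rows of an (n, p) array.
n, p = 10_000, 3
x = rng.normal(loc=[1.0, -2.0, 0.5], scale=1.0, size=(n, p))

# The sample mean vector estimates E(x) componentwise.
print(x.mean(axis=0))  # roughly [ 1.0, -2.0,  0.5 ]
```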

B.2 Covariance matrix

The obvious way to define the variance of a vector random variable would be to make it the vector of variances of its components. But it is also useful to capture the covariances between the various components of \(\mathbf{x}\). Accordingly, we define it to be the \(p\times p\) matrix \[\mbox{Var}(\mathbf{x})=\left( \begin{array}{cccc} \mbox{Var}(x_1) & \mbox{Cov}(x_1,x_2) & \cdots & \mbox{Cov}(x_1,x_p) \\ \mbox{Cov}(x_2,x_1) & \mbox{Var}(x_2) & \cdots & \mbox{Cov}(x_2,x_p) \\ \vdots & \vdots & \ddots & \vdots \\ \mbox{Cov}(x_p,x_1) & \mbox{Cov}(x_p,x_2) & \cdots & \mbox{Var}(x_p) \end{array} \right)\] The above is usually referred to as the variance-covariance matrix of \(\mathbf{x}\), or just as the covariance matrix of \(\mathbf{x}\).

As the covariance of a random variable with itself is its variance (that is, \(\mbox{Cov}(x_i,x_i)=\mbox{Var}(x_i)\)), we sometimes use the notation \(\mbox{Cov}(\mathbf{x})\) to mean the same thing as \(\mbox{Var}(\mathbf{x})\):

\[\mbox{Cov}(\mathbf{x}):=\mbox{Var}(\mathbf{x})\]

Notice that since covariance is a symmetric relation, \(\mbox{Cov}(x_i,x_j) = \mbox{Cov}(x_j,x_i)\), the covariance matrix is symmetric. Thus we can write \(\mbox{Var}(\mathbf{x})^T=\mbox{Var}(\mathbf{x})\).
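A short numpy sketch (the variable names and correlation structure are ours) of estimating this matrix from data and checking its symmetry:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a 3-component random vector with correlated components:
# x2 depends on x1, while x3 is independent of both.
n = 10_000
x1 = rng.normal(size=n)
x2 = x1 + 0.5 * rng.normal(size=n)
x3 = rng.normal(size=n)
x = np.column_stack([x1, x2, x3])   # rows are samples of the vector x

V = np.cov(x, rowvar=False)         # sample estimate of Var(x)
print(V)                            # variances on the diagonal, covariances off it
print(np.allclose(V, V.T))          # True: Var(x) is symmetric
```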

B.3 Vector covariance

More generally, if \(\mathbf{x}=(x_1,\ldots,x_p)^T\) is a \(p\times 1\) random vector and \(\mathbf{y}=(y_1,\ldots,y_q)^T\) is a \(q\times 1\) random vector, we can define \[\mbox{Cov}(\mathbf{x},\mathbf{y})=\left( \begin{array}{cccc} \mbox{Cov}(x_1,y_1) & \mbox{Cov}(x_1,y_2) & \cdots & \mbox{Cov}(x_1,y_q) \\ \mbox{Cov}(x_2,y_1) & \mbox{Cov}(x_2,y_2) & \cdots & \mbox{Cov}(x_2,y_q) \\ \vdots & \vdots & \ddots & \vdots \\ \mbox{Cov}(x_p,y_1) & \mbox{Cov}(x_p,y_2) & \cdots & \mbox{Cov}(x_p,y_q) \end{array} \right)\] This matrix is not symmetric in general; indeed, since it is \(p\times q\), it is not even square unless \(p = q\). Notice, however, that \(\mbox{Cov}(\mathbf{y},\mathbf{x})=\mbox{Cov}(\mathbf{x},\mathbf{y})^T\). Also, \(\mbox{Cov}(\mathbf{x},\mathbf{x})=\mbox{Var}(\mathbf{x})\).

Sometimes \(\mbox{Cov}(\mathbf{x},\mathbf{y})\) is referred to as the covariance of \(\mathbf{x}\) and \(\mathbf{y}\).
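This matrix can be estimated from data the same way; a minimal sketch (the computation below is our own, written out from the definition rather than taken from a library):

```python
import numpy as np

rng = np.random.default_rng(1)

n, p, q = 10_000, 3, 2
x = rng.normal(size=(n, p))
y = x[:, :q] + rng.normal(size=(n, q))   # y is correlated with x

# Sample version of Cov(x, y) = E((x - E(x))(y - E(y))^T):
# center each variable, then average the outer products.
xc = x - x.mean(axis=0)
yc = y - y.mean(axis=0)
C_xy = xc.T @ yc / (n - 1)               # p x q matrix
C_yx = yc.T @ xc / (n - 1)               # q x p matrix

print(C_xy.shape)                        # (3, 2): not square, since p != q
print(np.allclose(C_yx, C_xy.T))         # True: Cov(y, x) = Cov(x, y)^T
```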

B.4 Some useful results

We know many useful results about expectations, such as \(E(ax+b)=aE(x)+b\), when \(a\) and \(b\) are constants. Here are some vector generalizations.

  1. \(E(A\mathbf{x}+\mathbf{b})=AE(\mathbf{x})+\mathbf{b}\), when \(A\) is a \(q\times p\) matrix of constants and \(\mathbf{b}\) is a \(q\times 1\) vector of constants. Notice that this expresses the mean of the \(q\times 1\) random vector \(A\mathbf{x}+\mathbf{b}\) in terms of that of the \(p\times 1\) random vector \(\mathbf{x}\).

  2. \(\mbox{Var}(A\mathbf{x}+\mathbf{b})=A\mbox{Var}(\mathbf{x})A^T\).

  3. \(\mbox{Cov}(A\mathbf{x}+\mathbf{b},C\mathbf{y}+\mathbf{d})=A\mbox{Cov}(\mathbf{x},\mathbf{y})C^T\), where the dimensions of the matrices \(A\) and \(C\) and the vectors \(\mathbf{b}\) and \(\mathbf{d}\) are as required to allow the matrix multiplication.

To understand result 3, note that for scalar random variables, we define \[ \mbox{Cov}(x, y) = E((x - E(x))(y - E(y))), \] and so the covariance matrix for vectors can be written as

\[ \mbox{Cov}(\mathbf{x}, \mathbf{y}) = E((\mathbf{x} - E(\mathbf{x}))(\mathbf{y} - E(\mathbf{y}))^T). \] It then follows that \[\begin{align} \mbox{Cov}(A\mathbf{x}+\mathbf{b},C\mathbf{y}+\mathbf{d})&=E((A\mathbf{x}+\mathbf{b} - E(A\mathbf{x}+\mathbf{b}))(C\mathbf{y}+\mathbf{d} - E(C\mathbf{y}+\mathbf{d}))^T) \\ &= E((A\mathbf{x} - E(A\mathbf{x}))(C\mathbf{y} - E(C\mathbf{y}))^T)\\ &= E(A(\mathbf{x} - E(\mathbf{x}))(\mathbf{y} - E(\mathbf{y}))^T C^T)\\ &= AE((\mathbf{x} - E(\mathbf{x}))(\mathbf{y} - E(\mathbf{y}))^T )C^T\\ &=A\mbox{Cov}(\mathbf{x},\mathbf{y})C^T. \end{align}\]
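Note that result 2 is the special case of result 3 obtained by taking \(\mathbf{y}=\mathbf{x}\), \(C=A\) and \(\mathbf{d}=\mathbf{b}\), since \(\mbox{Cov}(\mathbf{x},\mathbf{x})=\mbox{Var}(\mathbf{x})\).

Because sample means and sample covariances are linear and bilinear in just the same way, all three results can be checked numerically. Here is a minimal sketch in Python with numpy (the matrices, vectors, and the helper cross_cov are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)

n, p, q = 500, 3, 2
x = rng.normal(size=(n, p))
y = x[:, :q] + rng.normal(size=(n, q))      # y correlated with x

A = rng.normal(size=(q, p)); b = rng.normal(size=q)
C = rng.normal(size=(q, q)); d = rng.normal(size=q)

def cross_cov(u, v):
    """Sample cross-covariance matrix Cov(u, v) from rows of samples."""
    return (u - u.mean(0)).T @ (v - v.mean(0)) / (len(u) - 1)

xA = x @ A.T + b                            # rows are samples of Ax + b
yC = y @ C.T + d                            # rows are samples of Cy + d

# Result 1: E(Ax + b) = A E(x) + b
print(np.allclose(xA.mean(0), A @ x.mean(0) + b))
# Result 2: Var(Ax + b) = A Var(x) A^T
print(np.allclose(cross_cov(xA, xA), A @ cross_cov(x, x) @ A.T))
# Result 3: Cov(Ax + b, Cy + d) = A Cov(x, y) C^T
print(np.allclose(cross_cov(xA, yC), A @ cross_cov(x, y) @ C.T))
```

All three checks print True (up to floating-point error): the identities hold exactly for the sample moments themselves, not merely in expectation.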