3.4 Inversion and related concepts
Suppose \(\mathbf{A}\mathbf{x}=\mathbf{b}\) and we want to solve for \(\mathbf{x}\) … can we “divide” by \(\mathbf{A}\)? The answer is: “sort of”. There is no such thing as matrix division, but we can multiply both sides by the inverse of \(\mathbf{A}\). If a matrix \(\mathbf{A}^{-1}\) satisfies \(\mathbf{A}\mathbf{A}^{-1}=\mathbf{A}^{-1}\mathbf{A}=\mathbf{I}\), then \(\mathbf{A}^{-1}\) is the inverse of \(\mathbf{A}\). If we know \(\mathbf{A}^{-1}\), then \(\mathbf{A}^{-1}\mathbf{A}\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}\), so \(\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}\). Note that we must left-multiply both sides by the inverse: matrix multiplication is not commutative, so the order of multiplication matters.
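As a concrete illustration, here is a minimal NumPy sketch (the particular system is made up for the example). It solves \(\mathbf{A}\mathbf{x}=\mathbf{b}\) both by forming the inverse explicitly and with a dedicated solver:

```python
import numpy as np

# An illustrative 2x2 system Ax = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Left-multiply by the inverse: x = A^{-1} b
A_inv = np.linalg.inv(A)
x = A_inv @ b

# In practice, solving directly is preferred over forming the inverse
x_solve = np.linalg.solve(A, b)

print(x, x_solve)             # both give the same solution: [1. 3.]
print(np.allclose(A @ x, b))  # True: multiplying back by A recovers b
```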
If two vectors \(\mathbf{u}\) and \(\mathbf{v}\) satisfy \(\mathbf{u}^{\scriptstyle\top}\mathbf{v}=0\), they are said to be orthogonal to each other. If the columns of a square matrix \(\mathbf{A}\) are mutually orthogonal and each column \(\mathbf{a}\) satisfies \(\mathbf{a}^{\scriptstyle\top}\mathbf{a}=1\), then the transpose of \(\mathbf{A}\) serves as its inverse: \(\mathbf{A}^{\scriptstyle\top}\mathbf{A}=\mathbf{A}\mathbf{A}^{\scriptstyle\top}=\mathbf{I}\) (the rows then satisfy the same conditions). In this case, the matrix \(\mathbf{A}\) is said to be an orthogonal matrix. If a matrix \(\mathbf{X}\) is not square, then it is possible that \(\mathbf{X}^{\scriptstyle\top}\mathbf{X}=\mathbf{I}\) but \(\mathbf{X}\mathbf{X}^{\scriptstyle\top}\neq \mathbf{I}\); in this case, the matrix is said to be column orthogonal, although in statistics it is common to refer to these matrices as orthogonal also. A somewhat related definition is that a matrix is said to be idempotent if \(\mathbf{A}\mathbf{A}=\mathbf{A}\).
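To make these definitions concrete, here is a short NumPy sketch; the specific matrices (a 2×2 rotation, a QR-based tall matrix, and a projection matrix) are my own illustrative choices:

```python
import numpy as np

# A rotation matrix is a classic orthogonal matrix
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q^T Q = I
print(np.allclose(Q @ Q.T, np.eye(2)))  # True: Q Q^T = I

# A tall matrix with orthonormal columns is column orthogonal:
# X^T X = I, but X X^T is not the identity
X, _ = np.linalg.qr(np.random.rand(4, 2))
print(np.allclose(X.T @ X, np.eye(2)))  # True
print(np.allclose(X @ X.T, np.eye(4)))  # False

# A projection matrix H = X(X^T X)^{-1} X^T is idempotent: HH = H
Z = np.array([[1.0, 2.0],
              [1.0, 5.0],
              [1.0, 9.0]])
H = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
print(np.allclose(H @ H, H))  # True: projecting twice changes nothing
```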
Does every matrix have one and only one inverse? If a matrix has an inverse, it is said to be invertible, and an invertible matrix has exactly one inverse. However, not every matrix is invertible. For example, there are no values of \(a, b, c\), and \(d\) that satisfy
\[ \left[ \begin{array}{rr} 2 & 4 \\ 1 & 2 \end{array} \right] \left[ \begin{array}{rr} a & b \\ c & d \end{array} \right]= \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \]
Why doesn’t this matrix have an inverse? There are four equations and four unknowns, but some of those equations contradict each other. The term for this situation is linear dependence. If you have a collection of vectors \(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\), then you can form new vectors from linear combinations of the old vectors: \(c_1\mathbf{v}_1+c_2\mathbf{v}_2+\cdots+c_n\mathbf{v}_n\). A collection of vectors is said to be linearly independent if none of them can be written as a linear combination of the others; if any of them can, the collection is linearly dependent. This is the key to whether a matrix is invertible or not: a matrix \(\mathbf{A}\) is invertible if and only if its columns (or rows) are linearly independent. Note that the columns of our earlier matrix were not linearly independent, since \(2\left[ \begin{array}{r} 2 \\ 1 \end{array} \right]=\left[ \begin{array}{r} 4 \\ 2 \end{array} \right]\).
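A quick numerical check (a NumPy sketch) confirms that this matrix is singular:

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 2.0]])

print(np.linalg.det(A))  # 0.0: a zero determinant signals no inverse

try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as e:
    print("inversion fails:", e)  # "Singular matrix"
```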
The rank of a matrix is the number of linearly independent columns (equivalently rows; the two counts are always equal) it has; if they’re all linearly independent, then the matrix is said to be of full rank.
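In NumPy, the rank can be checked directly; here is an illustrative sketch comparing our singular matrix to a full-rank one:

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 2.0]])
B = np.array([[2.0, 4.0],
              [1.0, 3.0]])

print(np.linalg.matrix_rank(A))  # 1: the second column is twice the first
print(np.linalg.matrix_rank(B))  # 2: full rank, so B is invertible
```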
Additional helpful identities (the ones involving inverses assume the relevant matrices are invertible):
\[\begin{align*} (\mathbf{A}+\mathbf{B}) ^{\scriptstyle\top}&= \mathbf{A}^{\scriptstyle\top}+ \mathbf{B}^{\scriptstyle\top}\\ (\mathbf{A}\mathbf{B}) ^{\scriptstyle\top}&= \mathbf{B}^{\scriptstyle\top}\mathbf{A}^{\scriptstyle\top}\\ (\mathbf{A}\mathbf{B})^{-1} &= \mathbf{B}^{-1}\mathbf{A}^{-1} \\ (\mathbf{A}^{\scriptstyle\top})^{-1} &= (\mathbf{A}^{-1}) ^{\scriptstyle\top} \end{align*}\]
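These identities are easy to spot-check numerically. In the sketch below, the random matrices are arbitrary (a singular draw is possible in principle but vanishingly unlikely):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.allclose((A + B).T, A.T + B.T))                    # transpose of a sum
print(np.allclose((A @ B).T, B.T @ A.T))                    # transpose of a product
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))     # inverse of a product
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))  # inverse of a transpose
```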