Matrix algebra is a key mathematical tool in modern quantum-mechanical calculations on molecules. Matrices also furnish a convenient way to formulate much of the theory of quantum mechanics. Matrix methods will be used in some later chapters, but this book is written so that the material on matrices can be omitted if time does not allow it to be covered.
A matrix is a rectangular array of numbers. The numbers that compose a matrix are called the matrix elements. Let the matrix A have $m$ rows and $n$ columns, and let $a_{i j}$ $(i=1,2, \ldots, m$ and $j=1,2, \ldots, n)$ denote the element in row $i$ and column $j$. Then
\(
\mathbf{A}=\left(\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1 n} \\
a_{21} & a_{22} & \cdots & a_{2 n} \\
\vdots & \vdots & & \vdots \\
a_{m 1} & a_{m 2} & \cdots & a_{m n}
\end{array}\right)
\)
$\mathbf{A}$ is said to be an $m$ by $n$ matrix. Do not confuse $\mathbf{A}$ with a determinant (Section 8.3); a matrix need not be square and is not equal to a single number.
A row matrix (also called a row vector) is a matrix having only one row. A column matrix or column vector has only one column.
Two matrices $\mathbf{R}$ and $\mathbf{S}$ are equal if they have the same number of rows, the same number of columns, and corresponding elements equal. If $\mathbf{R}=\mathbf{S}$, then $r_{j k}=s_{j k}$ for $j=1, \ldots, m$ and $k=1, \ldots, n$, where $m$ and $n$ are the dimensions of $\mathbf{R}$ and $\mathbf{S}$. A matrix equation is thus equivalent to $m n$ scalar equations.
The sum of two matrices $\mathbf{A}$ and $\mathbf{B}$ is defined as the matrix formed by adding corresponding elements of $\mathbf{A}$ and $\mathbf{B}$; the sum is defined only if $\mathbf{A}$ and $\mathbf{B}$ have the same dimensions. If $\mathbf{P}=\mathbf{A}+\mathbf{B}$, then we have the $m n$ scalar equations $p_{j k}=a_{j k}+b_{j k}$ for $j=1, \ldots, m$ and $k=1, \ldots, n$.
\(
\begin{equation}
\text { If } \mathbf{P}=\mathbf{A}+\mathbf{B}, \text { then } p_{j k}=a_{j k}+b_{j k} \tag{7.105}
\end{equation}
\)
The product of the scalar $c$ and the matrix $\mathbf{A}$ is defined as the matrix formed by multiplying every element of $\mathbf{A}$ by $c$.
\(
\begin{equation}
\text { If } \mathbf{D}=c \mathbf{A}, \quad \text { then } \quad d_{j k}=c a_{j k} \tag{7.106}
\end{equation}
\)
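As a quick illustration (the numerical values below are arbitrary), rules (7.105) and (7.106) correspond to elementwise addition and elementwise scaling, which is what the following NumPy sketch does:

```python
# Illustrative sketch: matrix addition (7.105) and multiplication by a
# scalar (7.106) for two arbitrary 2 x 3 matrices.
import numpy as np

A = np.array([[1.0, -2.0, 0.0],
              [3.0,  4.0, 5.0]])
B = np.array([[0.5,  1.0, -1.0],
              [2.0, -3.0,  6.0]])

P = A + B      # p_jk = a_jk + b_jk; defined only for equal dimensions
D = 2.5 * A    # d_jk = c a_jk

print(P)
print(D)
```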
If $\mathbf{A}$ is an $m$ by $n$ matrix and $\mathbf{B}$ is an $n$ by $p$ matrix, the matrix product $\mathbf{R}=\mathbf{A B}$ is defined to be the $m$ by $p$ matrix whose elements are
\(
\begin{equation}
r_{j k} \equiv a_{j 1} b_{1 k}+a_{j 2} b_{2 k}+\cdots+a_{j n} b_{n k}=\sum_{i=1}^{n} a_{j i} b_{i k} \tag{7.107}
\end{equation}
\)
To calculate $r_{j k}$ we take row $j$ of $\mathbf{A}$ (this row's elements are $a_{j 1}, a_{j 2}, \ldots, a_{j n}$), multiply each element of this row by the corresponding element in column $k$ of $\mathbf{B}$ (this column's elements are $b_{1 k}, b_{2 k}, \ldots, b_{n k}$), and add the $n$ products. For example, suppose
\(
\mathbf{A}=\left(\begin{array}{rrr}
-1 & 3 & \frac{1}{2} \\
0 & 4 & 1
\end{array}\right) \quad \text { and } \quad \mathbf{B}=\left(\begin{array}{rrr}
1 & 0 & -2 \\
2 & 5 & 6 \\
-8 & 3 & 10
\end{array}\right)
\)
The number of columns of $\mathbf{A}$ equals the number of rows of $\mathbf{B}$, so the matrix product $\mathbf{A B}$ is defined. $\mathbf{A B}$ is the product of the 2 by 3 matrix $\mathbf{A}$ and the 3 by 3 matrix $\mathbf{B}$, so $\mathbf{R} \equiv \mathbf{A B}$ is a 2 by 3 matrix. The element $r_{21}$ is found from the second row of $\mathbf{A}$ and the first column of $\mathbf{B}$ as follows: $r_{21}=0(1)+4(2)+1(-8)=0$. Calculation of the remaining elements gives
\(
\mathbf{R}=\left(\begin{array}{rrr}
1 & 16 \frac{1}{2} & 25 \\
0 & 23 & 34
\end{array}\right)
\)
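This result is easy to check numerically; a minimal sketch using NumPy (whose `@` operator implements rule (7.107)) reproduces $\mathbf{R}$:

```python
# Numerical check of the worked example R = AB.
import numpy as np

A = np.array([[-1.0, 3.0, 0.5],
              [ 0.0, 4.0, 1.0]])
B = np.array([[ 1.0, 0.0, -2.0],
              [ 2.0, 5.0,  6.0],
              [-8.0, 3.0, 10.0]])

R = A @ B      # r_jk = sum_i a_ji b_ik
print(R)       # [[ 1.  16.5 25. ]
               #  [ 0.  23.  34. ]]
```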
Matrix multiplication is not commutative; the products $\mathbf{A B}$ and $\mathbf{B A}$ need not be equal. (In the preceding example, the product $\mathbf{B A}$ happens to be undefined.) Matrix multiplication can be shown to be associative, meaning that $\mathbf{A}(\mathbf{B C})=(\mathbf{A B}) \mathbf{C}$, and distributive, meaning that $\mathbf{A}(\mathbf{B}+\mathbf{C})=\mathbf{A B}+\mathbf{A C}$ and $(\mathbf{B}+\mathbf{C}) \mathbf{D}=\mathbf{B D}+\mathbf{C D}$.
A matrix with equal numbers of rows and columns is a square matrix. The order of a square matrix equals the number of rows.
If $\mathbf{A}$ is a square matrix, its square, cube, $\ldots$ are defined by $\mathbf{A}^{2} \equiv \mathbf{A A}$, $\mathbf{A}^{3} \equiv \mathbf{A A A}, \ldots$
The elements $a_{11}, a_{22}, \ldots, a_{n n}$ of a square matrix of order $n$ lie on its principal diagonal. A diagonal matrix is a square matrix having zero as the value of each element not on the principal diagonal.
The trace of a square matrix is the sum of the elements on the principal diagonal. If $\mathbf{A}$ is a square matrix of order $n$, its trace is $\operatorname{Tr} \mathbf{A}=\sum_{i=1}^{n} a_{i i}$.
A diagonal matrix whose diagonal elements are each equal to 1 is called a unit matrix or an identity matrix. The $(j, k)$ th element of a unit matrix is the Kronecker delta $\delta_{j k}$; $(\mathbf{I})_{j k}=\delta_{j k}$, where $\mathbf{I}$ is a unit matrix. For example, the unit matrix of order 3 is
\(
\left(\begin{array}{lll}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array}\right)
\)
Let $\mathbf{B}$ be a square matrix of the same order as a unit matrix $\mathbf{I}$. The $(j, k)$ th element of the product $\mathbf{I B}$ is given by (7.107) as $(\mathbf{I B})_{j k}=\sum_{i}(\mathbf{I})_{j i} b_{i k}=\sum_{i} \delta_{j i} b_{i k}=b_{j k}$. Since the $(j, k)$ th elements of $\mathbf{I B}$ and $\mathbf{B}$ are equal for all $j$ and $k$, we have $\mathbf{I B}=\mathbf{B}$. Similarly, we find $\mathbf{B I}=\mathbf{B}$. Multiplication by a unit matrix has no effect.
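A brief numerical illustration (the matrix $\mathbf{B}$ below is arbitrary) of the trace and of the property $\mathbf{I B}=\mathbf{B I}=\mathbf{B}$:

```python
# Illustrative sketch: trace of a square matrix and multiplication by the
# order-3 unit matrix.
import numpy as np

B = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
I = np.eye(3)                    # unit (identity) matrix of order 3

print(np.trace(B))               # 1 + 5 + 9 = 15
print(np.allclose(I @ B, B))     # True: IB = B
print(np.allclose(B @ I, B))     # True: BI = B
```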
A matrix all of whose elements are zero is called a zero matrix, symbolized by $\mathbf{0}$. A nonzero matrix has at least one element not equal to zero. These definitions apply to row vectors and column vectors.
Most matrices in quantum chemistry are either square matrices or row or column matrices.
Matrices and Quantum Mechanics
In Section 7.1, the integral $\int f_{m}^{*} \hat{A} f_{n} d \tau$ was called a matrix element of $\hat{A}$. We now justify this name by showing that such integrals obey the rules of matrix algebra.
Let the functions $f_{1}, f_{2}, \ldots$ be a complete, orthonormal set and let the symbol $\left\{f_{i}\right\}$ denote this complete set. The numbers $A_{m n} \equiv\left\langle f_{m}\right| \hat{A}\left|f_{n}\right\rangle \equiv \int f_{m}^{*} \hat{A} f_{n} d \tau$ are called matrix elements of the linear operator $\hat{A}$ in the basis $\left\{f_{i}\right\}$. The square matrix
\(
\begin{equation}
\mathbf{A}=\left(\begin{array}{cccc}
A_{11} & A_{12} & A_{13} & \cdots \\
A_{21} & A_{22} & A_{23} & \cdots \\
A_{31} & A_{32} & A_{33} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{array}\right) \tag{7.108}
\end{equation}
\)
is called the matrix representative of the linear operator $\hat{A}$ in the $\left\{f_{i}\right\}$ basis. Since $\left\{f_{i}\right\}$ usually consists of an infinite number of functions, $\mathbf{A}$ is an infinite-order matrix.
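In practice one works with a truncated basis. As an illustration (the basis, the operator, the truncation size, and the grid below are arbitrary choices, not anything prescribed here), the following sketch computes the matrix elements $A_{m n}=\left\langle f_{m}\right| \hat{x}\left|f_{n}\right\rangle$ of the position operator in the first few particle-in-a-box eigenfunctions by numerical integration:

```python
# A minimal sketch (basis, truncation, and grid are arbitrary choices):
# matrix elements A_mn = <f_m| x |f_n> of the position operator in the
# particle-in-a-box eigenfunctions f_n(x) = sqrt(2/L) sin(n pi x / L).
import numpy as np

L = 1.0                          # box length (arbitrary units)
nbasis = 4                       # truncate the infinite basis
x = np.linspace(0.0, L, 4001)
dx = x[1] - x[0]

def f(n):
    """Normalized particle-in-a-box eigenfunction f_n(x)."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

A = np.empty((nbasis, nbasis))
for m in range(1, nbasis + 1):
    for n in range(1, nbasis + 1):
        A[m - 1, n - 1] = np.sum(f(m) * x * f(n)) * dx   # <f_m| x |f_n>

print(np.round(A, 4))   # real, symmetric; diagonal elements equal L/2
```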
Consider the addition of matrix-element integrals. Suppose $\hat{C}=\hat{A}+\hat{B}$. A typical matrix element of $\hat{C}$ in the $\left\{f_{i}\right\}$ basis is
\(
\begin{aligned}
C_{m n} & =\left\langle f_{m}\right| \hat{C}\left|f_{n}\right\rangle=\left\langle f_{m}\right| \hat{A}+\hat{B}\left|f_{n}\right\rangle=\int f_{m}^{*}(\hat{A}+\hat{B}) f_{n} d \tau \\
& =\int f_{m}^{*} \hat{A} f_{n} d \tau+\int f_{m}^{*} \hat{B} f_{n} d \tau=A_{m n}+B_{m n}
\end{aligned}
\)
Thus, if $\hat{C}=\hat{A}+\hat{B}$, then $C_{m n}=A_{m n}+B_{m n}$, which is the rule (7.105) for matrix addition. Hence, if $\hat{C}=\hat{A}+\hat{B}$, then $\mathbf{C}=\mathbf{A}+\mathbf{B}$, where $\mathbf{A}, \mathbf{B}$, and $\mathbf{C}$ are the matrix representatives of the operators $\hat{A}, \hat{B}, \hat{C}$.
Similarly, if $\hat{P}=c \hat{S}$, where $c$ is a constant, then we find (Prob. 7.52) $P_{j k}=c S_{j k}$, which is the rule for multiplication of a matrix by a scalar.
Finally, suppose that $\hat{R}=\hat{S} \hat{T}$. We have
\(
\begin{equation}
R_{m n}=\int f_{m}^{*} \hat{R} f_{n} d \tau=\int f_{m}^{*} \hat{S} \hat{T} f_{n} d \tau \tag{7.109}
\end{equation}
\)
The function $\hat{T} f_{n}$ can be expanded in terms of the complete orthonormal set $\left\{f_{i}\right\}$ as [Eq. (7.41)]:
\(
\hat{T} f_{n}=\sum_{i} c_{i} f_{i}=\sum_{i}\left\langle f_{i} \mid \hat{T} f_{n}\right\rangle f_{i}=\sum_{i}\left\langle f_{i}\right| \hat{T}\left|f_{n}\right\rangle f_{i}=\sum_{i} T_{i n} f_{i}
\)
and $R_{m n}$ becomes
\(
\begin{equation}
R_{m n}=\int f_{m}^{*} \hat{S} \sum_{i} T_{i n} f_{i} d \tau=\sum_{i} \int f_{m}^{*} \hat{S} f_{i} d \tau \, T_{i n}=\sum_{i} S_{m i} T_{i n} \tag{7.110}
\end{equation}
\)
The equation $R_{m n}=\sum_{i} S_{m i} T_{i n}$ is the rule (7.107) for matrix multiplication. Hence, if $\hat{R}=\hat{S} \hat{T}$, then $\mathbf{R}=\mathbf{S T}$.
We have proved that the matrix representatives of linear operators in a complete orthonormal basis set obey the same equations that the operators obey. Combining Eqs. (7.109) and (7.110), we have the useful sum rule
\(
\begin{equation}
\sum_{i}\langle m| \hat{S}|i\rangle\langle i| \hat{T}|n\rangle=\langle m| \hat{S} \hat{T}|n\rangle \tag{7.111}
\end{equation}
\)
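This correspondence can be checked numerically in a truncated basis. With $\hat{S}=\hat{T}=\hat{x}$ in the particle-in-a-box basis used in the earlier sketch (again, all choices are arbitrary illustrations), the matrix of $\hat{x}^{2}$ approximately equals the square of the matrix of $\hat{x}$; the small discrepancy comes from truncating the sum over $i$ in (7.110) and (7.111):

```python
# A minimal sketch (assumptions as noted above): in a truncated
# particle-in-a-box basis, the matrix of x^2 approximately equals the
# square of the matrix of x, per (7.110)/(7.111).
import numpy as np

nbasis = 8
x = np.linspace(0.0, 1.0, 4001)              # box length L = 1
dx = x[1] - x[0]
f = lambda n: np.sqrt(2.0) * np.sin(n * np.pi * x)

def matrix_of(op):
    """Matrix elements <f_m| op(x) |f_n> in the truncated basis."""
    return np.array([[np.sum(f(m) * op(x) * f(n)) * dx
                      for n in range(1, nbasis + 1)]
                     for m in range(1, nbasis + 1)])

X  = matrix_of(lambda t: t)                  # matrix of x
X2 = matrix_of(lambda t: t**2)               # matrix of x^2

print(np.max(np.abs(X @ X - X2)))            # small; shrinks as nbasis grows
```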
Suppose the basis set $\left\{f_{i}\right\}$ is chosen to be the complete, orthonormal set of eigenfunctions $g_{i}$ of $\hat{A}$, where $\hat{A} g_{i}=a_{i} g_{i}$. Then the matrix element $A_{m n}$ is
\(
A_{m n}=\left\langle g_{m}\right| \hat{A}\left|g_{n}\right\rangle=\left\langle g_{m} \mid \hat{A} g_{n}\right\rangle=\left\langle g_{m} \mid a_{n} g_{n}\right\rangle=a_{n}\left\langle g_{m} \mid g_{n}\right\rangle=a_{n} \delta_{m n}
\)
The matrix that represents $\hat{A}$ in the basis of orthonormal $\hat{A}$ eigenfunctions is thus a diagonal matrix whose diagonal elements are the eigenvalues of $\hat{A}$. Conversely, one can prove (Prob. 7.53) that if the matrix representative of $\hat{A}$ in a complete orthonormal basis is a diagonal matrix, then the basis functions are eigenfunctions of $\hat{A}$ and the diagonal matrix elements are the eigenvalues of $\hat{A}$.
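As an illustration (the system and the units $\hbar=m=L=1$ are arbitrary choices), computing the matrix of the particle-in-a-box Hamiltonian in the basis of its own eigenfunctions gives, to numerical accuracy, a diagonal matrix with the energies $E_{n}=n^{2} \pi^{2} / 2$ on the principal diagonal:

```python
# A minimal sketch (hbar = m = L = 1 by choice): the particle-in-a-box
# Hamiltonian represented in its own eigenfunction basis is diagonal,
# with the eigenvalues E_n = n^2 pi^2 / 2 on the principal diagonal.
import numpy as np

nbasis = 4
x = np.linspace(0.0, 1.0, 4001)
dx = x[1] - x[0]
g = lambda n: np.sqrt(2.0) * np.sin(n * np.pi * x)

H = np.empty((nbasis, nbasis))
for n in range(1, nbasis + 1):
    d2 = np.gradient(np.gradient(g(n), dx), dx)    # g_n'' by finite differences
    Hgn = -0.5 * d2                                # H g_n = -(1/2) g_n''
    for m in range(1, nbasis + 1):
        H[m - 1, n - 1] = np.sum(g(m) * Hgn) * dx  # <g_m| H |g_n>

print(np.round(H, 2))                              # ~ diag(4.93, 19.74, 44.41, 78.96)
print(np.round([n**2 * np.pi**2 / 2 for n in range(1, nbasis + 1)], 2))
```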
We have used the complete, orthonormal basis $\left\{f_{i}\right\}$ to represent the operator $\hat{A}$ by the matrix $\mathbf{A}$ of (7.108). The basis $\left\{f_{i}\right\}$ can also be used to represent an arbitrary function $u$, as follows. We expand $u$ in terms of the complete set $\left\{f_{i}\right\}$, according to $u=\sum_{i} u_{i} f_{i}$, where the expansion coefficients $u_{i}$ are numbers (not functions) given by Eq. (7.40) as $u_{i}=\left\langle f_{i} \mid u\right\rangle$. The set of expansion coefficients $u_{1}, u_{2}, \ldots$ is formed into a column matrix (column vector), which we call $\mathbf{u}$, and $\mathbf{u}$ is said to be the representative of the function $u$ in the $\left\{f_{i}\right\}$ basis. If $\hat{A} u=w$, where $w$ is another function, then we can show (Prob. 7.54) that $\mathbf{A u}=\mathbf{w}$, where $\mathbf{A}, \mathbf{u}$, and $\mathbf{w}$ are the matrix representatives of $\hat{A}, u$, and $w$ in the $\left\{f_{i}\right\}$ basis. Thus, the effect of the linear operator $\hat{A}$ on an arbitrary function $u$ can be found if the matrix representative $\mathbf{A}$ of $\hat{A}$ is known. Hence, knowing the matrix representative $\mathbf{A}$ is equivalent to knowing what the operator $\hat{A}$ is.
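A sketch of this idea (the function $u$, the basis, and the truncation below are arbitrary choices): form the column vector of expansion coefficients of $u(x)=x(1-x)$ in a truncated particle-in-a-box basis and check that $\mathbf{A u} \approx \mathbf{w}$ for $w=\hat{x} u$, with $\mathbf{A}$ the matrix of $\hat{x}$ in the same basis; truncation makes the agreement approximate.

```python
# A minimal sketch (function u, basis, and truncation are arbitrary):
# column-vector representatives of u and of w = x u in a truncated
# particle-in-a-box basis, and a check that A u is approximately w.
import numpy as np

nbasis = 8
x = np.linspace(0.0, 1.0, 4001)               # box length L = 1
dx = x[1] - x[0]
f = lambda n: np.sqrt(2.0) * np.sin(n * np.pi * x)

u = x * (1.0 - x)                             # an arbitrary well-behaved function
w = x * u                                     # w = (operator x) applied to u

# column-vector representatives u_i = <f_i|u> and w_i = <f_i|w>
uvec = np.array([np.sum(f(i) * u) * dx for i in range(1, nbasis + 1)])
wvec = np.array([np.sum(f(i) * w) * dx for i in range(1, nbasis + 1)])

# matrix representative of the operator x in the same basis
A = np.array([[np.sum(f(m) * x * f(n)) * dx for n in range(1, nbasis + 1)]
              for m in range(1, nbasis + 1)])

print(np.max(np.abs(A @ uvec - wvec)))        # small; shrinks as nbasis grows
```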