
Comprehensive Study Notes


Comprehensive Study Notes for the full course

Matrix algebra (Section 7.10) was developed during the period 1855-1858 by the British mathematician Arthur Cayley as a shorthand way of dealing with simultaneous linear equations and linear transformations from one set of variables to another. [Cayley had to support himself by working as a lawyer for 14 years until he obtained a professorship in 1860. While working as a lawyer, he published hundreds of papers in mathematics and regularly discussed mathematics with his fellow lawyer John Joseph Sylvester. Sylvester coined the term matrix for rectangular arrays (after the Latin word for "womb") and obtained a mathematics professorship in 1855.] Matrix algebra remained unknown to most physicists for many years, and when Heisenberg discovered the matrix-mechanics version of quantum mechanics in 1925, he did not realize that the entities he was dealing with were matrices. When he showed his work to Born, Born, who was well-trained in mathematics, recognized that Heisenberg was using matrices. Nowadays, matrix algebra is an essential tool in quantum chemistry computations and is widely used in most branches of physics.

The set of linear inhomogeneous equations (8.35) can be written as the matrix equation

\(
\begin{gather}
\left(\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{array}\right)\left(\begin{array}{c}
x_{1} \\
x_{2} \\
\vdots \\
x_{n}
\end{array}\right)=\left(\begin{array}{c}
b_{1} \\
b_{2} \\
\vdots \\
b_{n}
\end{array}\right) \tag{8.76}\\
\mathbf{A x}=\mathbf{b} \tag{8.77}
\end{gather}
\)

where $\mathbf{A}$ is the coefficient matrix and $\mathbf{x}$ and $\mathbf{b}$ are column matrices. The equivalence of (8.35) and (8.76) is readily verified using the matrix-multiplication rule (7.107).

The determinant of a square matrix $\mathbf{A}$ is the determinant whose elements are the same as the elements of $\mathbf{A}$. If $\operatorname{det} \mathbf{A} \neq 0$, the matrix $\mathbf{A}$ is said to be nonsingular.

The inverse of a square matrix $\mathbf{A}$ of order $n$ is the square matrix whose product with $\mathbf{A}$ is the unit matrix of order $n$. Denoting the inverse by $\mathbf{A}^{-1}$, we have

\(
\begin{equation}
\mathbf{A A}^{-1}=\mathbf{A}^{-1} \mathbf{A}=\mathbf{I} \tag{8.78}
\end{equation}
\)

where $\mathbf{I}$ is the unit matrix. One can prove that $\mathbf{A}^{-1}$ exists if and only if $\operatorname{det} \mathbf{A} \neq 0$. (For efficient methods of computing $\mathbf{A}^{-1}$, see Press et al., Section 2.3; Shoup, Section 3.3; Prob. 8.51. Many spreadsheets have a built-in capability to find the inverse of a matrix.)

If det $\mathbf{A} \neq 0$ for the coefficient matrix $\mathbf{A}$ in (8.76), then we can multiply each side of (8.77) by $\mathbf{A}^{-1}$ on the left to get $\mathbf{A}^{-1}(\mathbf{A x})=\mathbf{A}^{-1} \mathbf{b}$. Since matrix multiplication is associative (Section 7.10), we have $\mathbf{A}^{-1}(\mathbf{A x})=\left(\mathbf{A}^{-1} \mathbf{A}\right) \mathbf{x}=\mathbf{I} \mathbf{x}=\mathbf{x}$. Thus, left multiplication of (8.77) by $\mathbf{A}^{-1}$ gives $\mathbf{x}=\mathbf{A}^{-1} \mathbf{b}$ as the solution for the unknowns in a set of linear inhomogeneous equations.
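
As a quick numerical illustration of $\mathbf{x}=\mathbf{A}^{-1} \mathbf{b}$ (a minimal sketch added to these notes, assuming NumPy; the numbers are arbitrary):

```python
import numpy as np

# An arbitrary nonsingular coefficient matrix A and constant vector b, as in Eq. (8.76).
A = np.array([[2.0, 1.0, -1.0],
              [1.0, 3.0,  2.0],
              [4.0, 0.0,  1.0]])
b = np.array([1.0, 5.0, 2.0])

assert abs(np.linalg.det(A)) > 1e-12      # det A != 0, so A^{-1} exists

x_inv   = np.linalg.inv(A) @ b            # x = A^{-1} b, as in the text
x_solve = np.linalg.solve(A, b)           # the usual numerical route (no explicit inverse)

print(np.allclose(A @ x_inv, b))          # True: the equations A x = b are satisfied
print(np.allclose(x_inv, x_solve))        # True: both routes give the same x
```

In practice `np.linalg.solve` is preferred over forming the explicit inverse, since it is faster and numerically more stable; the result is the same $\mathbf{x}$.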

The linear variation method is widely used to find approximate molecular wave functions, and matrix algebra gives the most computationally efficient method to solve the equations of the linear variation method. If the functions $f_{1}, \ldots, f_{n}$ in the linear variation function $\phi=\sum_{k=1}^{n} c_{k} f_{k}$ are made to be orthonormal, then $S_{ij}=\int f_{i}^{*} f_{j}\, d\tau=\delta_{ij}$, and the homogeneous set of equations (8.55) for the coefficients $c_{k}$ that minimize the variational integral becomes

\(
\begin{gather}
H_{11} c_{1}+H_{12} c_{2}+\cdots+H_{1n} c_{n}=W c_{1} \tag{8.79a}\\
H_{21} c_{1}+H_{22} c_{2}+\cdots+H_{2n} c_{n}=W c_{2} \\
\vdots \\
H_{n1} c_{1}+H_{n2} c_{2}+\cdots+H_{nn} c_{n}=W c_{n} \tag{8.79b}\\
\left(\begin{array}{cccc}
H_{11} & H_{12} & \cdots & H_{1n} \\
H_{21} & H_{22} & \cdots & H_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
H_{n1} & H_{n2} & \cdots & H_{nn}
\end{array}\right)\left(\begin{array}{c}
c_{1} \\
c_{2} \\
\vdots \\
c_{n}
\end{array}\right)=W\left(\begin{array}{c}
c_{1} \\
c_{2} \\
\vdots \\
c_{n}
\end{array}\right) \tag{8.79c}
\end{gather}
\)

where $\mathbf{H}$ is the square matrix whose elements are $H_{ij}=\left\langle f_{i}\right| \hat{H}\left|f_{j}\right\rangle$ and $\mathbf{c}$ is the column vector of coefficients $c_{1}, \ldots, c_{n}$. In (8.79c), $\mathbf{H}$ is a known matrix and $\mathbf{c}$ and $W$ are unknowns to be solved for.

If

\(
\begin{equation}
\mathbf{A c}=\lambda \mathbf{c} \tag{8.80}
\end{equation}
\)

where $\mathbf{A}$ is a square matrix, $\mathbf{c}$ is a column vector with at least one nonzero element, and $\lambda$ is a scalar, then $\mathbf{c}$ is said to be an eigenvector (or characteristic vector) of $\mathbf{A}$ and $\lambda$ is an eigenvalue (or characteristic value) of $\mathbf{A}$.

Comparison of (8.80) with (8.79c) shows that solving the linear variation problem with $S_{ij}=\delta_{ij}$ amounts to finding the eigenvalues and eigenvectors of the matrix $\mathbf{H}$. The matrix eigenvalue equation $\mathbf{H c}=W \mathbf{c}$ is equivalent to the set of homogeneous equations (8.55), which has a nontrivial solution for the $c$'s if and only if $\operatorname{det}\left(H_{ij}-\delta_{ij} W\right)=0$ [Eq. (8.57) with $S_{ij}=\delta_{ij}$]. For a general square matrix $\mathbf{A}$ of order $n$, the corresponding equation satisfied by the eigenvalues is

\(
\begin{equation}
\operatorname{det}\left(A_{ij}-\delta_{ij} \lambda\right)=0 \tag{8.81}
\end{equation}
\)

Equation (8.81) is called the characteristic equation of matrix $\mathbf{A}$. When the $n$ th-order determinant in (8.81) is expanded, it gives a polynomial in $\lambda$ (called the characteristic polynomial) whose highest power is $\lambda^{n}$. The characteristic polynomial has $n$ roots for $\lambda$ (some of which may be equal to each other and some of which may be imaginary), so a square matrix of order $n$ has $n$ eigenvalues. (The set of eigenvalues of a square matrix $\mathbf{A}$ is sometimes called the spectrum of $\mathbf{A}$.)
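
As a small check of these statements (a sketch, assuming NumPy; the matrix is arbitrary), the roots of the characteristic polynomial coincide with the eigenvalues returned by a direct eigenvalue routine:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

coeffs = np.poly(A)              # coefficients of the characteristic polynomial det(A - lambda*I)
roots  = np.roots(coeffs)        # its n roots
eigs   = np.linalg.eigvals(A)    # eigenvalues of A computed directly

print(np.sort(roots), np.sort(eigs))   # the same n numbers: the spectrum of A
```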

The matrix equation (8.79c) for $\mathbf{H}$ corresponds to (8.80) for $\mathbf{A}$. The elements of the eigenvectors of $\mathbf{A}$ satisfy the following set of equations that corresponds to (8.79a):

\(
\begin{align}
& A{11} c{1}+A{12} c{2}+\cdots+A{1 n} c{n}=\lambda c{1} \
& \vdots \tag{8.82}\
& \vdots
\end{aligned} \ddots \quad \begin{aligned}
& \vdots \
& A{n 1} c{1}+A{n 2} c{2}+\cdots+A{n n} c{n}=\lambda c{n}
\end{align}
\)

For each different eigenvalue, we have a different set of equations (8.82) and a different set of numbers $c_{1}, c_{2}, \ldots, c_{n}$, giving a different eigenvector.

If all the eigenvalues of a matrix are different, one can show that solving (8.82) leads to $n$ linearly independent eigenvectors (see Strang, Section 5.2), where linear independence means that no eigenvector can be written as a linear combination of the other eigenvectors. If some eigenvalues are equal, then the matrix may have fewer than $n$ linearly independent eigenvectors. The matrices that occur in quantum mechanics are usually Hermitian (this term is defined later in this section), and a Hermitian matrix of order $n$ always has $n$ linearly independent eigenvectors even if some of its eigenvalues are equal (see Strang, Section 5.6 for the proof).

If $\mathbf{A}$ is a diagonal matrix ( $a_{i j}=0$ for $i \neq j$ ), then the determinant in (8.81) is diagonal. A diagonal determinant equals the product of its diagonal elements [Eq. (8.33)], so the characteristic equation for a diagonal matrix is

\(
\left(a_{11}-\lambda\right)\left(a_{22}-\lambda\right) \cdots\left(a_{nn}-\lambda\right)=0
\)

The roots of this equation are $\lambda_{1}=a_{11}, \lambda_{2}=a_{22}, \ldots, \lambda_{n}=a_{nn}$. The eigenvalues of a diagonal matrix are equal to its diagonal elements. (For the eigenvectors, see Prob. 8.46.)

If $\mathbf{c}$ is an eigenvector of $\mathbf{A}$, then clearly $\mathbf{d} \equiv k \mathbf{c}$ is also an eigenvector of $\mathbf{A}$, where $k$ is any constant. If $k$ is chosen so that

\(
\begin{equation}
\sum_{i=1}^{n}\left|d_{i}\right|^{2}=1 \tag{8.83}
\end{equation}
\)

then the column vector $\mathbf{d}$ is said to be normalized. Two column vectors $\mathbf{b}$ and $\mathbf{c}$ that each have $n$ elements are said to be orthogonal if

\(
\begin{equation}
\sum_{i=1}^{n} b_{i}^{*} c_{i}=0 \tag{8.84}
\end{equation}
\)
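
Equations (8.83) and (8.84) are easy to check numerically; a minimal sketch assuming NumPy (the vectors are arbitrary):

```python
import numpy as np

c = np.array([2.0j, 1.0])                      # an unnormalized complex column vector
d = c / np.linalg.norm(c)                      # choose k = 1/||c||
print(np.isclose(np.sum(np.abs(d)**2), 1.0))   # Eq. (8.83): d is normalized

b = np.array([-1.0j, 2.0])
# Eq. (8.84): sum_i b_i^* c_i = 0; np.vdot conjugates its first argument.
print(np.isclose(np.vdot(b, c), 0.0))          # True: b and c are orthogonal
```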

Let us denote the $n$ eigenvalues and the corresponding eigenvectors of $\mathbf{H}$ in the variation-method equations (8.79) by $W_{1}, W_{2}, \ldots, W_{n}$ and $\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(n)}$, so that

\(
\begin{equation}
\mathbf{H c}^{(i)}=W_{i} \mathbf{c}^{(i)} \quad \text { for } i=1,2, \ldots, n \tag{8.85}
\end{equation}
\)

where $\mathbf{c}^{(i)}$ is a column vector whose elements are $c_{1}^{(i)}, \ldots, c_{n}^{(i)}$ and the basis functions $f_{i}$ are orthonormal. Furthermore, let $\mathbf{C}$ be the square matrix whose columns are the eigenvectors of $\mathbf{H}$, and let $\mathbf{W}$ be the diagonal matrix whose diagonal elements are the eigenvalues of $\mathbf{H}$:

\(
\mathbf{C}=\left(\begin{array}{cccc}
c_{1}^{(1)} & c_{1}^{(2)} & \cdots & c_{1}^{(n)} \\
c_{2}^{(1)} & c_{2}^{(2)} & \cdots & c_{2}^{(n)} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n}^{(1)} & c_{n}^{(2)} & \cdots & c_{n}^{(n)}
\end{array}\right), \quad \mathbf{W}=\left(\begin{array}{cccc}
W_{1} & 0 & \cdots & 0 \\
0 & W_{2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & W_{n}
\end{array}\right) \tag{8.86}
\)

The set of $n$ eigenvalue equations (8.85) can be written as the single equation:

\(
\begin{equation}
\mathbf{H C}=\mathbf{C W} \tag{8.87}
\end{equation}
\)

To verify the matrix equation (8.87), we show that each element $(\mathbf{H C})_{ij}$ of the matrix $\mathbf{H C}$ equals the corresponding element $(\mathbf{C W})_{ij}$ of $\mathbf{C W}$. The matrix-multiplication rule (7.107) gives $(\mathbf{H C})_{ij}=\sum_{k} H_{ik}(\mathbf{C})_{kj}=\sum_{k} H_{ik} c_{k}^{(j)}$. Consider the eigenvalue equation $\mathbf{H} \mathbf{c}^{(j)}=W_{j} \mathbf{c}^{(j)}$ [Eq. (8.85)]. $\mathbf{H} \mathbf{c}^{(j)}$ and $W_{j} \mathbf{c}^{(j)}$ are column matrices. Using (7.107) to equate the elements in row $i$ of each of these column matrices, we have $\sum_{k} H_{ik} c_{k}^{(j)}=W_{j} c_{i}^{(j)}$. Then

\(
(\mathbf{H C})_{ij}=\sum_{k} H_{ik} c_{k}^{(j)}=W_{j} c_{i}^{(j)}, \quad(\mathbf{C W})_{ij}=\sum_{k}(\mathbf{C})_{ik}(\mathbf{W})_{kj}=\sum_{k} c_{i}^{(k)} \delta_{kj} W_{k}=c_{i}^{(j)} W_{j}
\)

Hence $(\mathbf{H C})_{ij}=(\mathbf{C W})_{ij}$ and (8.87) is proved.
Provided $\mathbf{C}$ has an inverse (see below), we can multiply each side of (8.87) by $\mathbf{C}^{-1}$ on the left to get $\mathbf{C}^{-1} \mathbf{H C}=\mathbf{C}^{-1}(\mathbf{C W})$. [Since matrix multiplication is not commutative, when we multiply each side of $\mathbf{H C}=\mathbf{C W}$ by $\mathbf{C}^{-1}$, we must put the factor $\mathbf{C}^{-1}$ on the left of $\mathbf{H C}$ and on the left of $\mathbf{C W}$ (or on the right of $\mathbf{H C}$ and the right of $\mathbf{C W}$ ).] We have $\mathbf{C}^{-1} \mathbf{H C}=\mathbf{C}^{-1}(\mathbf{C W})=\left(\mathbf{C}^{-1} \mathbf{C}\right) \mathbf{W}=\mathbf{I} \mathbf{W}=\mathbf{W}:$

\(
\begin{equation}
\mathbf{C}^{-1} \mathbf{H C}=\mathbf{W} \tag{8.88}
\end{equation}
\)

To simplify (8.88), we must learn more about matrices.
A square matrix $\mathbf{B}$ is a symmetric matrix if all its elements satisfy $b_{mn}=b_{nm}$. The elements of a symmetric matrix are symmetric about the principal diagonal; for example, $b_{12}=b_{21}$. A square matrix $\mathbf{D}$ is a Hermitian matrix if all its elements satisfy $d_{mn}=d_{nm}^{*}$. For example, if

\(
\mathbf{M}=\left(\begin{array}{ccc}
2 & 5 & 0 \\
5 & i & 2 i \\
0 & 2 i & 4
\end{array}\right), \quad \mathbf{N}=\left(\begin{array}{ccc}
6 & 1+2 i & 8 \\
1-2 i & -1 & -i \\
8 & i & 0
\end{array}\right) \tag{8.89}
\)

then $\mathbf{M}$ is symmetric and $\mathbf{N}$ is Hermitian. (Note that the diagonal elements of a Hermitian matrix must be real: $d_{mm}=d_{mm}^{*}$.) A real matrix is one whose elements are all real numbers. A real Hermitian matrix is a symmetric matrix.

The transpose $\mathbf{A}^{\mathrm{T}}$ (often written $\widetilde{\mathbf{A}}$) of the matrix $\mathbf{A}$ is the matrix formed by interchanging rows and columns of $\mathbf{A}$, so that column 1 becomes row 1, column 2 becomes row 2, and so on. The elements $a_{mn}^{\mathrm{T}}$ of $\mathbf{A}^{\mathrm{T}}$ are related to the elements of $\mathbf{A}$ by $a_{mn}^{\mathrm{T}}=a_{nm}$. For a square matrix, the transpose is found by reflecting the elements about the principal diagonal. A symmetric matrix is equal to its transpose. Thus, for the matrix $\mathbf{M}$ in (8.89), we have $\mathbf{M}^{\mathrm{T}}=\mathbf{M}$.

The complex conjugate $\mathbf{A}^{*}$ of $\mathbf{A}$ is the matrix formed by taking the complex conjugate of each element of $\mathbf{A}$. The conjugate transpose $\mathbf{A}^{\dagger}$ (read as A dagger) of the matrix $\mathbf{A}$ is formed by taking the transpose of $\mathbf{A}^{*}$; thus $\mathbf{A}^{\dagger}=\left(\mathbf{A}^{*}\right)^{\mathrm{T}}$ and

\(
\begin{equation}
a_{mn}^{\dagger}=\left(a_{nm}\right)^{*} \tag{8.90}
\end{equation}
\)

(Physicists call $\mathbf{A}^{\dagger}$ the adjoint of $\mathbf{A}$, a name that has been used by mathematicians to denote an entirely different matrix.) An example is

\(
\mathbf{B}=\left(\begin{array}{cc}
2 & 3+i \\
0 & 4 i
\end{array}\right), \quad \mathbf{B}^{\mathrm{T}}=\left(\begin{array}{cc}
2 & 0 \\
3+i & 4 i
\end{array}\right), \quad \mathbf{B}^{\dagger}=\left(\begin{array}{cc}
2 & 0 \\
3-i & -4 i
\end{array}\right)
\)

For a Hermitian matrix $\mathbf{A}$, Eq. (8.90) gives $a_{mn}^{\dagger}=\left(a_{nm}\right)^{*}=a_{mn}$, so $\left(\mathbf{A}^{\dagger}\right)_{mn}=(\mathbf{A})_{mn}$ and $\mathbf{A}^{\dagger}=\mathbf{A}$. A Hermitian matrix is equal to its conjugate transpose. (Physicists often use the term self-adjoint for a Hermitian matrix.)
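
These definitions are easy to check numerically; a short sketch (assuming NumPy) using $\mathbf{B}$ from the example above and $\mathbf{N}$ from (8.89):

```python
import numpy as np

B = np.array([[2, 3 + 1j],
              [0, 4j]])
print(B.T)             # the transpose B^T
print(B.conj().T)      # the conjugate transpose B^dagger = (B^*)^T

N = np.array([[6, 1 + 2j, 8],
              [1 - 2j, -1, -1j],
              [8, 1j, 0]])
print(np.allclose(N, N.conj().T))   # True: N of Eq. (8.89) is Hermitian
```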

An orthogonal matrix is a square matrix whose inverse is equal to its transpose:

\(
\begin{equation}
\mathbf{A}^{-1}=\mathbf{A}^{\mathrm{T}} \quad \text { if } \mathbf{A} \text { is orthogonal } \tag{8.91}
\end{equation}
\)

A unitary matrix is one whose inverse is equal to its conjugate transpose:

\(
\begin{equation}
\mathbf{U}^{-1}=\mathbf{U}^{\dagger} \quad \text { if } \mathbf{U} \text { is unitary } \tag{8.92}
\end{equation}
\)

From the definition (8.92), we have $\mathbf{U}^{\dagger} \mathbf{U}=\mathbf{I}$ if $\mathbf{U}$ is unitary. By equating $\left(\mathbf{U}^{\dagger} \mathbf{U}\right)_{mn}$ to $(\mathbf{I})_{mn}$, we find (Prob. 8.43)

\(
\begin{equation}
\sum_{k} u_{km}^{*} u_{kn}=\delta_{mn} \tag{8.93}
\end{equation}
\)

for columns $m$ and $n$ of a unitary matrix. Thus the columns of a unitary matrix (viewed as column vectors) are orthogonal and normalized (orthonormal), as defined by (8.84) and (8.83). Conversely, if (8.93) is true for all columns, then $\mathbf{U}$ is a unitary matrix. If $\mathbf{U}$ is unitary and real, then $\mathbf{U}^{\dagger}=\mathbf{U}^{\mathrm{T}}$, and $\mathbf{U}$ is an orthogonal matrix.
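
One way to check (8.93) numerically (a sketch, assuming NumPy) is to take the eigenvector matrix of a Hermitian matrix, anticipating the discussion below:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (X + X.conj().T) / 2                 # an arbitrary 4x4 Hermitian matrix

w, U = np.linalg.eigh(H)                 # real eigenvalues, orthonormal eigenvectors
print(np.allclose(U.conj().T @ U, np.eye(4)))        # U^dagger U = I, i.e. Eq. (8.93)
print(np.isclose(np.vdot(U[:, 0], U[:, 1]), 0.0))    # two different columns are orthogonal
```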

One can prove that two eigenvectors of a Hermitian matrix $\mathbf{H}$ that correspond to different eigenvalues are orthogonal (see Strang, Section 5.5). For eigenvectors of $\mathbf{H}$ that correspond to the same eigenvalue, one can take linear combinations of them that will be orthogonal eigenvectors of $\mathbf{H}$. Moreover, the elements of an eigenvector can be multiplied by a constant to normalize the eigenvector. Hence, the eigenvectors of a Hermitian matrix can be chosen to be orthonormal. If the eigenvectors are chosen to be orthonormal, then the eigenvector matrix $\mathbf{C}$ in (8.86) is a unitary matrix, and $\mathbf{C}^{-1}=\mathbf{C}^{\dagger}$; Eq. (8.88) then becomes

\(
\begin{equation}
\mathbf{C}^{\dagger} \mathbf{H C}=\mathbf{W} \quad \text { if } \mathbf{H} \text { is Hermitian } \tag{8.94}
\end{equation}
\)

For the common case that $\mathbf{H}$ is real as well as Hermitian (that is, $\mathbf{H}$ is real and symmetric), the $c$'s in (8.79a) are real (since $W$ and the $H_{ij}$'s are real) and $\mathbf{C}$ is real as well as unitary; that is, $\mathbf{C}$ is orthogonal, with $\mathbf{C}^{-1}=\mathbf{C}^{\mathrm{T}}$; Eq. (8.94) becomes

\(
\begin{equation}
\mathbf{C}^{\mathrm{T}} \mathbf{H C}=\mathbf{W} \quad \text { if } \mathbf{H} \text { is real and symmetric } \tag{8.95}
\end{equation}
\)

The eigenvalues of a Hermitian matrix can be proven to be real numbers (Strang, Section 5.5).

EXAMPLE

Find the eigenvalues and normalized eigenvectors of the Hermitian matrix

\(
\mathbf{A}=\left(\begin{array}{cc}
3 & 2 i \
-2 i & 0
\end{array}\right)
\)

by solving algebraic equations. Then verify that $\mathbf{C}^{\dagger} \mathbf{A C}$ is diagonal, where $\mathbf{C}$ is the eigenvector matrix.
The characteristic equation (8.81) for the eigenvalues $\lambda$ is $\operatorname{det}\left(a_{ij}-\delta_{ij} \lambda\right)=0$, which becomes

\(
\begin{aligned}
& \left|\begin{array}{cc}
3-\lambda & 2 i \\
-2 i & -\lambda
\end{array}\right|=0 \\
& \lambda^{2}-3 \lambda-4=0 \\
& \lambda_{1}=4, \quad \lambda_{2}=-1
\end{aligned}
\)

A useful theorem in checking eigenvalue calculations is the following (Strang, Exercise 5.1.9): The sum of the diagonal elements of a square matrix $\mathbf{A}$ of order $n$ is equal to the sum of the eigenvalues $\lambda_{i}$ of $\mathbf{A}$; that is, $\sum_{i=1}^{n} a_{ii}=\sum_{i=1}^{n} \lambda_{i}$. In this example, $\sum_{i} a_{ii}=3+0=3$, which equals the sum $4-1=3$ of the eigenvalues.
For the root $\lambda_{1}=4$, the set of simultaneous equations (8.82) is

\(
\begin{array}{r}
\left(3-\lambda_{1}\right) c_{1}^{(1)}+2 i c_{2}^{(1)}=0 \\
-2 i c_{1}^{(1)}-\lambda_{1} c_{2}^{(1)}=0
\end{array}
\)

or

\(
\begin{aligned}
& -c_{1}^{(1)}+2 i c_{2}^{(1)}=0 \\
& -2 i c_{1}^{(1)}-4 c_{2}^{(1)}=0
\end{aligned}
\)

Discarding either one of these equations, we find

\(
c_{1}^{(1)}=2 i c_{2}^{(1)}
\)

Normalization gives

\(
\begin{gathered}
1=\left|c_{1}^{(1)}\right|^{2}+\left|c_{2}^{(1)}\right|^{2}=4\left|c_{2}^{(1)}\right|^{2}+\left|c_{2}^{(1)}\right|^{2} \\
\left|c_{2}^{(1)}\right|=1 / \sqrt{5}, \quad c_{2}^{(1)}=1 / \sqrt{5} \\
c_{1}^{(1)}=2 i c_{2}^{(1)}=2 i / \sqrt{5}
\end{gathered}
\)

where the phase of $c_{2}^{(1)}$ was chosen to be zero.
Similarly, we find for $\lambda_{2}=-1$ (Prob. 8.49)

\(
c_{1}^{(2)}=-i / \sqrt{5}, \quad c_{2}^{(2)}=2 / \sqrt{5}
\)

The normalized eigenvectors are then

\(
\mathbf{c}^{(1)}=\binom{2 i / \sqrt{5}}{1 / \sqrt{5}}, \quad \mathbf{c}^{(2)}=\binom{-i / \sqrt{5}}{2 / \sqrt{5}}
\)

Because the eigenvalues $\lambda_{1}$ and $\lambda_{2}$ of the Hermitian matrix $\mathbf{A}$ differ, $\mathbf{c}^{(1)}$ and $\mathbf{c}^{(2)}$ are orthogonal (as you should verify). Also, $\mathbf{c}^{(1)}$ and $\mathbf{c}^{(2)}$ are normalized. Therefore, $\mathbf{C}$ is unitary and $\mathbf{C}^{-1}=\mathbf{C}^{\dagger}$. Forming $\mathbf{C}$ and its conjugate transpose, we have

\(
\begin{aligned}
\mathbf{C}^{-1} \mathbf{A} \mathbf{C}=\mathbf{C}^{\dagger} \mathbf{A} \mathbf{C} & =\left(\begin{array}{rr}
-2 i / \sqrt{5} & 1 / \sqrt{5} \\
i / \sqrt{5} & 2 / \sqrt{5}
\end{array}\right)\left(\begin{array}{rr}
3 & 2 i \\
-2 i & 0
\end{array}\right)\left(\begin{array}{rr}
2 i / \sqrt{5} & -i / \sqrt{5} \\
1 / \sqrt{5} & 2 / \sqrt{5}
\end{array}\right) \\
& =\left(\begin{array}{rr}
-2 i / \sqrt{5} & 1 / \sqrt{5} \\
i / \sqrt{5} & 2 / \sqrt{5}
\end{array}\right)\left(\begin{array}{rr}
8 i / \sqrt{5} & i / \sqrt{5} \\
4 / \sqrt{5} & -2 / \sqrt{5}
\end{array}\right)=\left(\begin{array}{rr}
4 & 0 \\
0 & -1
\end{array}\right)
\end{aligned}
\)

which is the diagonal matrix of eigenvalues.
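
The arithmetic above is easily verified numerically; a short sketch assuming NumPy (the ascending ordering of the eigenvalues and the overall phases of the eigenvectors may differ from the hand calculation):

```python
import numpy as np

A = np.array([[3, 2j],
              [-2j, 0]])

lam, C = np.linalg.eigh(A)                              # eigenvalues -1 and 4
print(lam)
print(np.allclose(C.conj().T @ A @ C, np.diag(lam)))    # C^dagger A C is the diagonal matrix of eigenvalues

c1 = np.array([2j, 1]) / np.sqrt(5)                     # the hand-calculated eigenvector for lambda_1 = 4
print(np.allclose(A @ c1, 4 * c1))                      # True
```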

We have shown that if $\mathbf{H}$ is a real symmetric matrix with eigenvalues $W_{i}$ and orthonormal eigenvectors $\mathbf{c}^{(i)}$ (that is, if $\mathbf{H} \mathbf{c}^{(i)}=W_{i} \mathbf{c}^{(i)}$ for $i=1,2, \ldots, n$), then $\mathbf{C}^{\mathrm{T}} \mathbf{H C}=\mathbf{W}$ [Eq. (8.95)], where $\mathbf{C}$ is the real orthogonal matrix whose columns are the eigenvectors $\mathbf{c}^{(i)}$ and $\mathbf{W}$ is the diagonal matrix of eigenvalues $W_{i}$. The converse of this theorem is readily proved; that is, if $\mathbf{H}$ is a real symmetric matrix, $\mathbf{B}$ is a real orthogonal matrix, and $\mathbf{B}^{\mathrm{T}} \mathbf{H B}$ equals a diagonal matrix $\boldsymbol{\Lambda}$, then the columns of $\mathbf{B}$ are the eigenvectors of $\mathbf{H}$ and the diagonal elements of $\boldsymbol{\Lambda}$ are the eigenvalues of $\mathbf{H}$.

To find the eigenvalues and eigenvectors of a Hermitian matrix of order $n$, we can use either of the following procedures: (1) Solve the characteristic equation $\operatorname{det}\left(H_{ij}-\delta_{ij} W\right)=0$ [Eq. (8.81)] for the eigenvalues $W_{1}, \ldots, W_{n}$. Then substitute each $W_{k}$ into the set of algebraic equations (8.79a) and solve for the elements $c_{1}^{(k)}, \ldots, c_{n}^{(k)}$ of the $k$th eigenvector. (2) Search for a unitary matrix $\mathbf{C}$ such that $\mathbf{C}^{\dagger} \mathbf{H C}$ is a diagonal matrix. The diagonal elements of $\mathbf{C}^{\dagger} \mathbf{H C}$ are the eigenvalues of $\mathbf{H}$, and the columns of $\mathbf{C}$ are the orthonormal eigenvectors of $\mathbf{H}$. For the large matrices that occur in quantum chemistry, procedure (2) (called matrix diagonalization) is computationally much faster than procedure (1).

One reason that expanding the characteristic determinant and solving the characteristic equation is not a good way to find the eigenvalues of large matrices is that, for large matrices, a very small change in a coefficient in the characteristic polynomial may produce a large change in the eigenvalues (see Prob. 8.54). Hence we might have to calculate the coefficients in the characteristic polynomial to hundreds or thousands of decimal places in order to get eigenvalues accurate to a few decimal places. Although it is true that for certain matrices a tiny change in the value of an element of that matrix might produce large changes in the eigenvalues, one can prove that for Hermitian matrices, a small change in a matrix element always produces only small changes in the eigenvalues. Hence method (2) of the preceding paragraph is the correct way to get accurate eigenvalues.

A systematic way to diagonalize a real symmetric matrix $\mathbf{H}$ is as follows. Construct an orthogonal matrix $\mathbf{O}_{1}$ such that the matrix $\mathbf{H}_{1} \equiv \mathbf{O}_{1}^{\mathrm{T}} \mathbf{H} \mathbf{O}_{1}$ has zeros in place of the off-diagonal elements $H_{12}$ and $H_{21}$ of $\mathbf{H}$. (Because $\mathbf{H}$ is symmetric, we have $H_{12}=H_{21}$. Also, the transformed matrices $\mathbf{H}_{1}, \mathbf{H}_{2}, \ldots$ are symmetric.) Then construct an orthogonal matrix $\mathbf{O}_{2}$ such that $\mathbf{H}_{2} \equiv \mathbf{O}_{2}^{\mathrm{T}} \mathbf{H}_{1} \mathbf{O}_{2}=\mathbf{O}_{2}^{\mathrm{T}} \mathbf{O}_{1}^{\mathrm{T}} \mathbf{H} \mathbf{O}_{1} \mathbf{O}_{2}$ has zeros in place of the elements $\left(\mathbf{H}_{1}\right)_{13}$ and $\left(\mathbf{H}_{1}\right)_{31}$ of $\mathbf{H}_{1}$; and so on. Unfortunately, when a given pair of off-diagonal elements is made zero in a step, some off-diagonal elements made zero in a previous step are likely to become nonzero, so one has to cycle through the off-diagonal elements over and over again. In general, an infinite number of steps would be required to make all off-diagonal elements exactly zero. In practice, one skips a step if the absolute value of the off-diagonal element to be zeroed in that step is less than some tiny number, and one stops the procedure when the absolute values of all off-diagonal elements are less than some tiny number. The eigenvalues are then the diagonal elements of the transformed matrix $\cdots \mathbf{O}_{3}^{\mathrm{T}} \mathbf{O}_{2}^{\mathrm{T}} \mathbf{O}_{1}^{\mathrm{T}} \mathbf{H} \mathbf{O}_{1} \mathbf{O}_{2} \mathbf{O}_{3} \cdots$, and the eigenvector matrix is the product $\mathbf{O}_{1} \mathbf{O}_{2} \mathbf{O}_{3} \cdots$. This method (the cyclic Jacobi method) is not very efficient for large matrices when run on a serial computer but is efficient on a parallel computer.
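
A bare-bones version of this cyclic procedure (a teaching sketch in Python/NumPy, not production code; the routine name and the test matrix are my own):

```python
import numpy as np

def jacobi_eigh(H, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi diagonalization of a real symmetric matrix H (teaching sketch)."""
    A = np.array(H, dtype=float)
    n = A.shape[0]
    O = np.eye(n)                       # accumulates the product O1 O2 O3 ...
    for _ in range(max_sweeps):
        off = np.sqrt(np.sum(A**2) - np.sum(np.diag(A)**2))
        if off < tol:                   # all off-diagonal elements are tiny: done
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol:  # skip a step if this element is already tiny
                    continue
                # Rotation angle chosen to zero A[p, q] (and A[q, p]).
                theta = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(theta), np.sin(theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q] = s
                J[q, p] = -s
                A = J.T @ A @ J         # orthogonal similarity transformation
                O = O @ J               # accumulate the eigenvector matrix
    return np.diag(A), O

H = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.5],
              [2.0, 0.5, 1.0]])
w, O = jacobi_eigh(H)
print(np.sort(w))
print(np.sort(np.linalg.eigvalsh(H)))   # agrees with NumPy's eigenvalues
```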

More-efficient approaches than the Jacobi method for diagonalizing real symmetric matrices begin by carrying out a series of orthogonal transformations to reduce the original matrix $\mathbf{H}$ to a symmetric tridiagonal matrix $\mathbf{T}$. A tridiagonal matrix is one whose elements are all zero except for those on the principal diagonal (elements $t_{ii}$) and those on the diagonals immediately above and immediately below the principal diagonal (elements $t_{i-1,i}$ and $t_{i+1,i}$, respectively). The relation between $\mathbf{T}$ and $\mathbf{H}$ is $\mathbf{T}=\mathbf{O}^{\mathrm{T}} \mathbf{H O}$, where $\mathbf{O}$ is a real orthogonal matrix that is the product of the orthogonal matrices used in the individual steps of going from $\mathbf{H}$ to $\mathbf{T}$. Two efficient methods of transforming $\mathbf{H}$ to tridiagonal form are due to Givens and to Householder. An efficient method to find the eigenvalues of a symmetric tridiagonal matrix is the QR method. Here, $\mathbf{T}$ is expressed as the product of an orthogonal matrix $\mathbf{Q}$ and an upper triangular matrix $\mathbf{R}$ (one whose elements below the principal diagonal are all zero). A series of iterative steps yields matrices converging to a diagonal matrix whose diagonal elements are the eigenvalues of $\mathbf{T}$, which equal the eigenvalues of $\mathbf{H}$ (Prob. 8.55). With certain refinements, the QR method is a very efficient way to find eigenvalues and eigenvectors (see Strang, Sections 5.3 and 7.3 for details).
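
The tridiagonalize-then-solve strategy can be sketched with SciPy's linear-algebra routines (the library calls and the random test matrix are my choices, not the text's; for a symmetric input, the Householder-based Hessenberg reduction yields a tridiagonal matrix):

```python
import numpy as np
from scipy.linalg import hessenberg, eigh_tridiagonal

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 6))
H = (X + X.T) / 2                        # an arbitrary real symmetric matrix

T, O = hessenberg(H, calc_q=True)        # T = O^T H O; tridiagonal because H is symmetric
print(np.allclose(O.T @ H @ O, T))

d = np.diag(T)                           # principal diagonal t_ii
e = np.diag(T, k=-1)                     # subdiagonal t_{i+1,i}
w, V = eigh_tridiagonal(d, e)            # eigenvalues/eigenvectors of the tridiagonal matrix
print(np.allclose(np.sort(w), np.sort(np.linalg.eigvalsh(H))))   # same spectrum as H
```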

Details of matrix diagonalization procedures and computer programs are given in Press et al., Chapter 11; Acton, Chapters 8 and 13; Shoup, Chapter 4.

A major compilation of procedures and computer programs for scientific and engineering calculations is Press et al.; the text of older editions of this book is available free on the Internet at www.nr.com. For comments on older editions, see amath.colorado.edu/computing/Fortran/numrec.html.

Programs for mathematical and scientific calculations can be found at www.netlib.org and at gams.nist.gov. Downloadable free personal-computer mathematical software and demonstration software for such commercial programs as Mathcad and Maple can be found at archives.math.utk.edu.

The procedure for using matrix algebra to solve the linear variation equations when nonorthonormal basis functions are used is outlined in Prob. 8.57.

The Excel spreadsheet can be used to find eigenvalues and eigenvectors; see Prob. 8.53.
Computer algebra systems such as Maple, Mathematica, and Mathcad and some electronic calculators have built-in commands to easily find eigenvalues and eigenvectors.

The methods for finding matrix eigenvalues and eigenvectors discussed in this section are useful for matrices of order up to $10^{3}$. Special methods are used to find the lowest few eigenvalues and corresponding eigenvectors of matrices of order up to $10^{9}$ that occur in certain quantum-chemistry calculations (see Section 16.2).

As noted after Eq. (8.80), for the linear variation function $\sum_{i=1}^{n} c_{i} f_{i}$ with orthonormal basis functions $f_{i}$, the eigenvalues of the matrix $\mathbf{H}$ formed from the matrix elements $\left\langle f_{i}\right| \hat{H}\left|f_{k}\right\rangle$ are the roots of the secular equation, and the eigenvector corresponding to the eigenvalue $W_{m}$ gives the coefficients in the variation function that corresponds to $W_{m}$. Problems 8.60 to 8.65 apply the linear variation method to problems such as the double well and the harmonic oscillator, using particle-in-a-box wave functions as basis functions and using a computer algebra program such as Mathcad to find the eigenvalues and eigenvectors of the $\mathbf{H}$ matrix.
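
A sketch of such a calculation in Python/NumPy rather than Mathcad (the parameter choices are my own; units with $\hbar=m=k=1$, a box of length $l$ centered on the harmonic potential, and simple grid quadrature for the potential-energy integrals):

```python
import numpy as np

hbar = m = k = 1.0           # harmonic-oscillator potential V = 0.5*k*(x - l/2)^2 in these units
l, n = 12.0, 20              # box length and number of basis functions (my choices)
x = np.linspace(0.0, l, 4001)
dx = x[1] - x[0]

# Orthonormal particle-in-a-box basis functions f_i(x) = sqrt(2/l) sin(i*pi*x/l).
f = np.array([np.sqrt(2.0 / l) * np.sin(i * np.pi * x / l) for i in range(1, n + 1)])
V = 0.5 * k * (x - l / 2.0) ** 2

# H_ij = <f_i|T|f_j> + <f_i|V|f_j>; the kinetic part is diagonal in this basis.
H = np.zeros((n, n))
for i in range(n):
    H[i, i] = hbar**2 * (i + 1) ** 2 * np.pi**2 / (2.0 * m * l**2)
    for j in range(n):
        H[i, j] += dx * np.sum(f[i] * V * f[j])     # grid approximation to the integral

W, C = np.linalg.eigh(H)     # eigenvalues W_m and eigenvector (coefficient) matrix C
print(W[:4])                 # should lie close to the exact values 0.5, 1.5, 2.5, 3.5
phi0 = C[:, 0] @ f           # approximate ground-state variation function sum_k c_k f_k
```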

We have discussed matrix diagonalization in the context of the linear variation method. However, finding the eigenvalues $a_{k}$ and eigenfunctions $g_{k}$ of any Hermitian operator $(\hat{A} g_{k}=a_{k} g_{k})$ can be formulated as a matrix-diagonalization problem. If we choose a complete, orthonormal basis set $\left\{f_{i}\right\}$ and expand the eigenfunctions as $g_{k}=\sum_{i} c_{i}^{(k)} f_{i}$, then (Prob. 8.59) the eigenvalues of the matrix $\mathbf{A}$ whose elements are $a_{ij}=\left\langle f_{i}\right| \hat{A}\left|f_{j}\right\rangle$ are the eigenvalues of the operator $\hat{A}$, and the elements $c_{i}^{(k)}$ of the eigenvectors $\mathbf{c}^{(k)}$ of $\mathbf{A}$ give the coefficients in the expansions of the eigenfunctions $g_{k}$.

The material of this section further emphasizes the correspondence between linear operators and matrices and the correspondence between functions and column vectors (Section 7.10).

Google computes the PageRanks of the Web pages it indexes by constructing a certain matrix (called the Google matrix) and finding its eigenvector corresponding to the eigenvalue 1. The order of the Google matrix equals the number of indexed pages and was between $10^{10.5}$ and $10^{11}$ in 2012. The components of this eigenvector are the PageRanks. For details, see www.ams.org/samplings/featurecolumn/fcarc-pagerank.
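
The idea can be illustrated on a toy link network (entirely my own construction, using the commonly quoted damping factor 0.85; not Google's actual procedure or data):

```python
import numpy as np

# Link structure of a 4-page toy web: links[i] = pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n, d = 4, 0.85                            # d is the usual damping factor

# Column-stochastic link matrix S, then a Google-type matrix G.
S = np.zeros((n, n))
for i, outs in links.items():
    S[outs, i] = 1.0 / len(outs)
G = d * S + (1 - d) / n * np.ones((n, n))

# Repeated multiplication (power iteration) converges to the eigenvector with eigenvalue 1.
r = np.ones(n) / n
for _ in range(100):
    r = G @ r
print(r / r.sum())                        # the PageRanks of the four toy pages
print(np.allclose(G @ r, r))              # r is (numerically) an eigenvector with eigenvalue 1
```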

