Comprehensive Study Notes

Site: PULSE, Pondicherry University Learning Management System
Course: CHEM412 Electronic Structure
Book: Comprehensive Study Notes

Description

Comprehensive Study Notes for the full course


Keywords

Quantum Chemistry: The application of quantum mechanics to solve chemical problems. It influences all branches of chemistry, helping calculate thermodynamic properties, interpret molecular spectra, understand intermolecular forces, and more

Classical Mechanics: The laws of motion of macroscopic objects discovered by Isaac Newton in the late seventeenth century. It does not correctly describe the behavior of very small particles such as electrons and nuclei

Quantum Mechanics: A set of laws that describe the behavior of very small particles like electrons and nuclei. It was developed in the early twentieth century

Wave-Particle Duality: The concept that electrons and other microscopic particles exhibit both wave-like and particle-like properties. This duality is fundamental to understanding quantum mechanics

Uncertainty Principle: Werner Heisenberg's principle stating that it is impossible to simultaneously know the exact position and momentum of a particle. This principle imposes a limit on the precision of measurements at the microscopic level

Schrödinger Equation: An equation that describes how the wave function of a quantum system evolves over time. The time-independent Schrödinger Equation applies to systems with constant energy and is simpler to apply in many chemical problems

Wave Function: A function that provides information about the probability of finding a particle in a particular region of space. The probability density is given by the square of the absolute value of the wave function

Probability Density: The square of the absolute value of the wave function, which gives the probability of finding a particle at various places on the x-axis

Complex Numbers: Numbers that have both a real part and an imaginary part. They are essential for understanding and solving the Schrödinger Equation

Diffraction: The bending of a wave around an obstacle. It is observed when light goes through two adjacent pinholes

Interference: The combining of two waves of the same frequency to give a wave whose disturbance at each point in space is the algebraic or vector sum of the disturbances at that point resulting from each interfering wave

Electromagnetic Waves: Waves consisting of oscillating electric and magnetic fields. Light is an electromagnetic wave

Blackbody Radiation: The radiation emitted by a heated blackbody, an object that absorbs all light falling on it

Energy Quantization: The concept that the energy of a resonator is restricted to be a whole-number multiple of a certain value. This concept was introduced by Max Planck

Photoelectric Effect: The emission of electrons from a metal when light shines on it. The energy of the emitted electrons depends on the frequency of the light

Photons: Particle-like entities that make up light. Each photon has an energy proportional to its frequency

Atomic Structure: Atoms are composed of electrons, protons, and neutrons. The positive charge is concentrated in a tiny, heavy nucleus

Bohr Model: Niels Bohr's model of the atom in which electrons revolve around the nucleus in quantized orbits

de Broglie Wavelength: The wavelength associated with a particle, given by the equation λ = h/p, where h is Planck's constant and p is the momentum

Stationary States: States of constant energy in quantum mechanics. The probability density does not change with time in these states

Normalization: The requirement that the integral of the probability density over all space is equal to one

Probability: The likelihood of an event occurring. In quantum mechanics, it is used to predict the probabilities of various possible results

Complex Conjugate: The complex conjugate of a complex number is obtained by replacing i with -i

Absolute Value: The distance of a complex number from the origin in the complex plane

Phase: The angle that the radius vector to the point representing a complex number makes with the positive horizontal axis

SI Units: The International System of Units, which includes the meter (m), kilogram (kg), and second (s) as units of length, mass, and time

Calculus: A branch of mathematics heavily used in quantum chemistry. It includes differentiation and integration

This glossary should serve as brief study notes to help students understand the key concepts and terms in quantum chemistry and the Schrödinger Equation

Q1: What is Quantum Chemistry? A1: Quantum chemistry applies quantum mechanics to solve chemical problems. It influences all branches of chemistry, helping calculate thermodynamic properties, interpret molecular spectra, understand intermolecular forces, and more

Q2: What is the historical background of Quantum Mechanics? A2: Quantum mechanics began with Planck's study of blackbody radiation in 1900. Key developments include the wave nature of light, Maxwell's equations, and the concept of energy quantization introduced by Planck

Q3: What is Wave-Particle Duality? A3: Wave-particle duality refers to the concept that electrons and other microscopic particles exhibit both wave-like and particle-like properties. This duality is fundamental to understanding quantum mechanics

Q4: What is the Uncertainty Principle? A4: Heisenberg's Uncertainty Principle states that it is impossible to simultaneously know the exact position and momentum of a particle. This principle imposes a limit on the precision of measurements at the microscopic level

Q5: What is the Schrödinger Equation? A5: The Schrödinger Equation describes how the wave function of a quantum system evolves over time. The time-independent Schrödinger Equation applies to systems with constant energy and is simpler to apply in many chemical problems

Q6: What is the significance of the wave function in Quantum Mechanics? A6: The wave function provides information about the probability of finding a particle in a particular region of space. The probability density is given by the square of the absolute value of the wave function

Q7: How are complex numbers and calculus used in Quantum Chemistry? A7: These notes review the mathematics of complex numbers and calculus, which are essential for understanding and solving the Schrödinger Equation

Q8: What are some applications of Quantum Mechanics in Chemistry? A8: Quantum mechanics is applied to real-world chemical systems through various examples and problems in these notes. It helps in understanding molecular properties, reaction mechanisms, and more

Q9: What is the Time-Independent Schrödinger Equation? A9: The time-independent Schrödinger Equation is used for systems with constant energy and is simpler to apply in many chemical problems. It is derived from the time-dependent Schrödinger Equation for the one-particle, one-dimensional case

Q10: How does Quantum Mechanics differ from Classical Mechanics? A10: Classical mechanics applies to macroscopic particles, while quantum mechanics is required for microscopic particles. Quantum mechanics involves probabilities and wave functions, whereas classical mechanics involves deterministic equations


The development of quantum mechanics began in 1900 with Planck's study of the light emitted by heated solids, so we start by discussing the nature of light.

In 1803, Thomas Young gave convincing evidence for the wave nature of light by observing diffraction and interference when light went through two adjacent pinholes. (Diffraction is the bending of a wave around an obstacle. Interference is the combining of two waves of the same frequency to give a wave whose disturbance at each point in space is the algebraic or vector sum of the disturbances at that point resulting from each interfering wave. See any first-year physics text.)

In 1864, James Clerk Maxwell published four equations, known as Maxwell's equations, which unified the laws of electricity and magnetism. Maxwell's equations predicted that an accelerated electric charge would radiate energy in the form of electromagnetic waves consisting of oscillating electric and magnetic fields. The speed predicted by Maxwell's equations for these waves turned out to be the same as the experimentally measured speed of light. Maxwell concluded that light is an electromagnetic wave.

In 1888, Heinrich Hertz detected radio waves produced by accelerated electric charges in a spark, as predicted by Maxwell's equations. This convinced physicists that light is indeed an electromagnetic wave.

All electromagnetic waves travel at speed $c=2.998 \times 10^{8} \mathrm{~m} / \mathrm{s}$ in vacuum. The frequency $\nu$ and wavelength $\lambda$ of an electromagnetic wave are related by

\( \begin{equation*} \lambda \nu=c \tag{1.1} \end{equation*} \)

(Equations that are enclosed in a box should be memorized. The Appendix gives the Greek alphabet.) Various conventional labels are applied to electromagnetic waves depending on their frequency. In order of increasing frequency are radio waves, microwaves, infrared radiation, visible light, ultraviolet radiation, X-rays, and gamma rays. We shall use the term light to denote any kind of electromagnetic radiation. Wavelengths of visible and ultraviolet radiation were formerly given in angstroms ($\AA$) and are now given in nanometers (nm):

\( \begin{equation*} 1 \mathrm{~nm}=10^{-9} \mathrm{~m}, \quad 1 \AA=10^{-10} \mathrm{~m}=0.1 \mathrm{~nm} \tag{1.2} \end{equation*} \)
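Equation (1.1) makes wavelength-frequency conversions routine. The short Python sketch below evaluates $\nu = c/\lambda$; the 500 nm input is an arbitrary illustrative value, not one taken from the text.

```python
# Sketch of Eq. (1.1): lambda * nu = c. The 500 nm wavelength is an
# illustrative choice (green visible light).
C = 2.998e8  # speed of light in vacuum, m/s

def frequency_from_wavelength(lam_m):
    """Return the frequency (Hz) of light of wavelength lam_m (in meters)."""
    return C / lam_m

print(f"{frequency_from_wavelength(500e-9):.3e} Hz")  # ~5.996e14 Hz
```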

In the 1890s, physicists measured the intensity of light at various frequencies emitted by a heated blackbody at a fixed temperature, and did these measurements at several temperatures. A blackbody is an object that absorbs all light falling on it. A good approximation to a blackbody is a cavity with a tiny hole. In 1896, the physicist Wien proposed the following equation for the dependence of blackbody radiation on light frequency and blackbody temperature: $I=a \nu^{3} / e^{b \nu / T}$, where $a$ and $b$ are empirical constants, and $I d \nu$ is the energy with frequency in the range $\nu$ to $\nu+d \nu$ radiated per unit time and per unit surface area by a blackbody, with $d \nu$ being an infinitesimal frequency range. Wien's formula gave a good fit to the blackbody radiation data available in 1896, but his theoretical arguments for the formula were considered unsatisfactory.

In 1899-1900, measurements of blackbody radiation were extended to lower frequencies than previously measured, and the low-frequency data showed significant deviations from Wien's formula. These deviations led the physicist Max Planck to propose in October 1900 the following formula: $I=a \nu^{3} /\left(e^{b \nu / T}-1\right)$, which was found to give an excellent fit to the data at all frequencies.

Having proposed this formula, Planck sought a theoretical justification for it. In December 1900, he presented a theoretical derivation of his equation to the German Physical Society. Planck assumed the radiation emitters and absorbers in the blackbody to be harmonically oscillating electric charges ("resonators") in equilibrium with electromagnetic radiation in a cavity. He assumed that the total energy of those resonators whose frequency is $\nu$ consisted of $N$ indivisible "energy elements," each of magnitude $h \nu$, where $N$ is an integer and $h$ (Planck's constant) was a new constant in physics. Planck distributed these energy elements among the resonators. In effect, this restricted the energy of each resonator to be a whole-number multiple of $h \nu$ (although Planck did not explicitly say this). Thus the energy of each resonator was quantized, meaning that only certain discrete values were allowed for a resonator energy. Planck's theory showed that $a=2 \pi h / c^{2}$ and $b=h / k$, where $k$ is Boltzmann's constant. By fitting the experimental blackbody curves, Planck found $h=6.6 \times 10^{-34} \mathrm{~J} \cdot \mathrm{~s}$.
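To see how much Planck's $-1$ in the denominator matters, the sketch below compares Wien's and Planck's formulas using $a = 2\pi h/c^{2}$ and $b = h/k$ from above; the temperature and frequencies are illustrative assumptions.

```python
import numpy as np

# Compare Wien's I = a v^3 / exp(b v / T) with Planck's
# I = a v^3 / (exp(b v / T) - 1), with a = 2 pi h / c^2 and b = h / k.
# The temperature and frequencies below are illustrative, not from the text.
h, k, c = 6.626e-34, 1.381e-23, 2.998e8
a, b = 2 * np.pi * h / c**2, h / k

def wien(nu, T):
    return a * nu**3 / np.exp(b * nu / T)

def planck(nu, T):
    return a * nu**3 / np.expm1(b * nu / T)  # expm1(x) = exp(x) - 1

T = 1500.0  # K
for nu in (1e12, 1e13, 1e14):
    print(f"nu = {nu:.0e} Hz: Wien/Planck = {wien(nu, T) / planck(nu, T):.3f}")
# The ratio approaches 1 at high frequency but drops far below 1 at low
# frequency, mirroring the low-frequency deviations from Wien's formula.
```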

Planck's work is usually considered to mark the beginning of quantum mechanics. However, historians of physics have debated whether Planck in 1900 viewed energy quantization as a description of physical reality or as merely a mathematical approximation that allowed him to obtain the correct blackbody radiation formula. [See O. Darrigol, Centaurus, 43, 219 (2001); C. A. Gearhart, Phys. Perspect., 4, 170 (2002) (available online at employees.csbsju.edu/cgearhart/Planck/PQH.pdf); S. G. Brush, Am. J. Phys., 70, 119 (2002) (www.punsterproductions.com/~sciencehistory/cautious.htm).] The physics historian Kragh noted that "If a revolution occurred in physics in December 1900, nobody seemed to notice it. Planck was no exception, and the importance ascribed to his work is largely a historical reconstruction" (H. Kragh, Physics World, Dec. 2000, p. 31).

The concept of energy quantization is in direct contradiction to all previous ideas of physics. According to Newtonian mechanics, the energy of a material body can vary continuously. However, only with the hypothesis of quantized energy does one obtain the correct blackbody-radiation curves.

The second application of energy quantization was to the photoelectric effect. In the photoelectric effect, light shining on a metal causes emission of electrons. The energy of a wave is proportional to its intensity and is not related to its frequency, so the electromagnetic-wave picture of light leads one to expect that the kinetic energy of an emitted photoelectron would increase as the light intensity increases but would not change as the light frequency changes. Instead, one observes that the kinetic energy of an emitted electron is independent of the light's intensity but increases as the light's frequency increases.

In 1905, Einstein showed that these observations could be explained by regarding light as composed of particlelike entities (called photons), with each photon having an energy

\( \begin{equation*} E_{\text {photon }}=h \nu \tag{1.3} \end{equation*} \)

When an electron in the metal absorbs a photon, part of the absorbed photon energy is used to overcome the forces holding the electron in the metal; the remainder appears as kinetic energy of the electron after it has left the metal. Conservation of energy gives $h \nu=\Phi+T$, where $\Phi$ is the minimum energy needed by an electron to escape the metal (the metal's work function), and $T$ is the maximum kinetic energy of an emitted electron. An increase in the light's frequency $\nu$ increases the photon energy and hence increases the kinetic energy of the emitted electron. An increase in light intensity at fixed frequency increases the rate at which photons strike the metal and hence increases the rate of emission of electrons, but does not change the kinetic energy of each emitted electron. (According to Kragh, a strong "case can be made that it was Einstein who first recognized the essence of quantum theory"; Kragh, Physics World, Dec. 2000, p. 31.)
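The energy balance $h\nu = \Phi + T$ is easy to explore numerically. In the sketch below, the work function (2.28 eV, roughly that of sodium) and the wavelengths are assumed illustrative values.

```python
# Photoelectric energy balance h*nu = Phi + T. The work function
# (2.28 eV, roughly sodium's) and the wavelengths are assumed examples.
h, c, eV = 6.626e-34, 2.998e8, 1.602e-19
phi = 2.28 * eV  # assumed work function, J

for lam_nm in (600, 400, 300):
    T = h * c / (lam_nm * 1e-9) - phi  # max kinetic energy of photoelectron
    if T > 0:
        print(f"{lam_nm} nm: T_max = {T / eV:.2f} eV")
    else:
        print(f"{lam_nm} nm: photon energy below the work function; no emission")
```

Raising the intensity at 600 nm would change none of these outputs; only a higher frequency produces faster photoelectrons, exactly the observation the wave picture could not explain.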

The photoelectric effect shows that light can exhibit particlelike behavior in addition to the wavelike behavior it shows in diffraction experiments.

In 1907, Einstein applied energy quantization to the vibrations of atoms in a solid element, assuming that each atom's vibrational energy in each direction $(x, y, z)$ is restricted to be an integer times $h \nu_{\text {vib }}$, where the vibrational frequency $\nu_{\text {vib }}$ is characteristic of the element. Using statistical mechanics, Einstein derived an expression for the constant-volume heat capacity $C_{V}$ of the solid. Einstein's equation agreed fairly well with known $C_{V}$-versus-temperature data for diamond.

Now let us consider the structure of matter. In the late nineteenth century, investigations of electric discharge tubes and natural radioactivity showed that atoms and molecules are composed of charged particles. Electrons have a negative charge. The proton has a positive charge equal in magnitude but opposite in sign to the electron charge and is 1836 times as heavy as the electron. The third constituent of atoms, the neutron (discovered in 1932), is uncharged and slightly heavier than the proton.

Starting in 1909, Rutherford, Geiger, and Marsden repeatedly passed a beam of alpha particles through a thin metal foil and observed the deflections of the particles by allowing them to fall on a fluorescent screen. Alpha particles are positively charged helium nuclei obtained from natural radioactive decay. Most of the alpha particles passed through the foil essentially undeflected, but, surprisingly, a few underwent large deflections, some being deflected backward. To get large deflections, one needs a very close approach between the charges, so that the Coulombic repulsive force is great. If the positive charge were spread throughout the atom (as J. J. Thomson had proposed in 1904), once the high-energy alpha particle penetrated the atom, the repulsive force would fall off, becoming zero at the center of the atom, according to classical electrostatics. Hence Rutherford concluded that such large deflections could occur only if the positive charge were concentrated in a tiny, heavy nucleus.

An atom contains a tiny ( $10^{-13}$ to $10^{-12} \mathrm{~cm}$ radius), heavy nucleus consisting of neutrons and $Z$ protons, where $Z$ is the atomic number. Outside the nucleus there are $Z$ electrons. The charged particles interact according to Coulomb's law. (The nucleons are held together in the nucleus by strong, short-range nuclear forces, which will not concern us.) The radius of an atom is about one angstrom, as shown, for example, by results from the kinetic theory of gases. Molecules have more than one nucleus.

The chemical properties of atoms and molecules are determined by their electronic structure, and so the question arises as to the nature of the motions and energies of the electrons. Since the nucleus is much more massive than the electron, we expect the motion of the nucleus to be slight compared with the electrons' motions.

In 1911, Rutherford proposed his planetary model of the atom in which the electrons revolved about the nucleus in various orbits, just as the planets revolve about the sun. However, there is a fundamental difficulty with this model. According to classical electromagnetic theory, an accelerated charged particle radiates energy in the form of electromagnetic (light) waves. An electron circling the nucleus at constant speed is being accelerated, since the direction of its velocity vector is continually changing. Hence the electrons in the Rutherford model should continually lose energy by radiation and therefore would spiral toward the nucleus. Thus, according to classical (nineteenth-century) physics, the Rutherford atom is unstable and would collapse.

A possible way out of this difficulty was proposed by Niels Bohr in 1913, when he applied the concept of quantization of energy to the hydrogen atom. Bohr assumed that the energy of the electron in a hydrogen atom was quantized, with the electron constrained to move only on one of a number of allowed circles. When an electron makes a transition from one Bohr orbit to another, a photon of light whose frequency $\nu$ satisfies

\( \begin{equation*} E_{\text {upper }}-E_{\text {lower }}=h \nu \tag{1.4} \end{equation*} \)

is absorbed or emitted, where $E_{\text {upper }}$ and $E_{\text {lower }}$ are the energies of the upper and lower states (conservation of energy). With the assumption that an electron making a transition from a free (ionized) state to one of the bound orbits emits a photon whose frequency is an integral multiple of one-half the classical frequency of revolution of the electron in the bound orbit, Bohr used classical mechanics to derive a formula for the hydrogen-atom energy levels. Using (1.4), he got agreement with the observed hydrogen spectrum. However, attempts to fit the helium spectrum using the Bohr theory failed. Moreover, the theory could not account for chemical bonds in molecules.

The failure of the Bohr model arises from the use of classical mechanics to describe the electronic motions in atoms. The evidence of atomic spectra, which show discrete frequencies, indicates that only certain energies of motion are allowed; the electronic energy is quantized. However, classical mechanics allows a continuous range of energies. Quantization does occur in wave motion; consider, for example, the fundamental and overtone frequencies of a violin string. Hence Louis de Broglie suggested in 1923 that the motion of electrons might have a wave aspect: that an electron of mass $m$ and speed $v$ would have a wavelength

\( \begin{equation*} \lambda=\frac{h}{m v}=\frac{h}{p} \tag{1.5} \end{equation*} \)

associated with it, where $p$ is the linear momentum. De Broglie arrived at Eq. (1.5) by reasoning in analogy with photons. The energy of a photon can be expressed, according to Einstein's special theory of relativity, as $E=p c$, where $c$ is the speed of light and $p$ is the photon's momentum. Using $E_{\text {photon }}=h \nu$, we get $p c=h \nu=h c / \lambda$ and $\lambda=h / p$ for a photon traveling at speed $c$. Equation (1.5) is the corresponding equation for an electron.
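Equation (1.5) shows why wave behavior matters for electrons but is unobservable for everyday objects. The sketch below contrasts the two cases; the masses and speeds are illustrative assumptions.

```python
# de Broglie wavelengths from Eq. (1.5), lambda = h / (m v).
# The electron speed and the baseball mass/speed are illustrative values.
h = 6.626e-34  # J s
print(f"electron: {h / (9.109e-31 * 1.0e6):.2e} m")  # ~7.3e-10 m, atomic scale
print(f"baseball: {h / (0.145 * 40.0):.2e} m")       # ~1.1e-34 m, negligible
```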

In 1927, Davisson and Germer experimentally confirmed de Broglie's hypothesis by reflecting electrons from metals and observing diffraction effects. In 1932, Stern observed the same effects with helium atoms and hydrogen molecules, thus verifying that the wave effects are not peculiar to electrons, but result from some general law of motion for microscopic particles. Diffraction and interference have been observed with molecules as large as $\mathrm{C}_{48}\mathrm{H}_{26}\mathrm{F}_{24}\mathrm{N}_{8}\mathrm{O}_{8}$ passing through a diffraction grating [T. Juffmann et al., Nat. Nanotechnol., 7, 297 (2012)]. A movie of the buildup of an interference pattern involving $\mathrm{C}_{32}\mathrm{H}_{18}\mathrm{N}_{8}$ molecules is available online.

Thus electrons behave in some respects like particles and in other respects like waves. We are faced with the apparently contradictory "wave-particle duality" of matter (and of light). How can an electron be both a particle, which is a localized entity, and a wave, which is nonlocalized? The answer is that an electron is neither a wave nor a particle, but something else. An accurate pictorial description of an electron's behavior is impossible using the wave or particle concept of classical physics. The concepts of classical physics have been developed from experience in the macroscopic world and do not properly describe the microscopic world. Evolution has shaped the human brain to allow it to understand and deal effectively with macroscopic phenomena. The human nervous system was not developed to deal with phenomena at the atomic and molecular level, so it is not surprising if we cannot fully understand such phenomena.

Although both photons and electrons show an apparent duality, they are not the same kinds of entities. Photons travel at speed $c$ in vacuum and have zero rest mass; electrons always have $v<c$ and a nonzero rest mass. Photons must always be treated relativistically, but electrons whose speed is much less than $c$ can be treated nonrelativistically.

Let us consider what effect the wave-particle duality has on attempts to measure simultaneously the $x$ coordinate and the $x$ component of linear momentum of a microscopic particle. We start with a beam of particles with momentum $p$, traveling in the $y$ direction, and we let the beam fall on a narrow slit. Behind this slit is a photographic plate. See Fig. 1.1.

Particles that pass through the slit of width $w$ have an uncertainty $w$ in their $x$ coordinate at the time of going through the slit. Calling this spread in $x$ values $\Delta x$, we have $\Delta x=w$.

Since microscopic particles have wave properties, they are diffracted by the slit producing (as would a light beam) a diffraction pattern on the plate. The height of the graph in Fig. 1.1 is a measure of the number of particles reaching a given point. The diffraction pattern shows that when the particles were diffracted by the slit, their direction of motion was changed so that part of their momentum was transferred to the $x$ direction. The $x$ component of momentum $p_{x}$ equals the projection of the momentum vector $\mathbf{p}$ in the $x$ direction. A particle deflected upward by an angle $\alpha$ has $p_{x}=p \sin \alpha$. A particle deflected downward by $\alpha$ has $p_{x}=-p \sin \alpha$. Since most of the particles undergo deflections in the range $-\alpha$ to $\alpha$, where $\alpha$ is the angle to the first minimum in the diffraction pattern, we shall take one-half the spread of momentum values in the central diffraction peak as a measure of the uncertainty $\Delta p_{x}$ in the $x$ component of momentum: $\Delta p_{x}=p \sin \alpha$.

Hence at the slit, where the measurement is made,

\( \begin{equation*} \Delta x \Delta p_{x}=p w \sin \alpha \tag{1.6} \end{equation*} \)

FIGURE 1.1 Diffraction of electrons by a slit.

The angle $\alpha$ at which the first diffraction minimum occurs is readily calculated. The condition for the first minimum is that the difference in the distances traveled by particles passing through the slit at its upper edge and particles passing through the center of the slit should be equal to $\frac{1}{2} \lambda$, where $\lambda$ is the wavelength of the associated wave. Waves originating from the top of the slit are then exactly out of phase with waves originating from the center of the slit, and they cancel each other. Waves originating from a point in the slit at a distance $d$ below the slit midpoint cancel with waves originating at a distance $d$ below the top of the slit. Drawing $A C$ in Fig. 1.2 so that $A D=C D$, we have the difference in path length as $B C$. The distance from the slit to the screen is large compared with the slit width. Hence $A D$ and $B D$ are nearly parallel. This makes the angle $A C B$ essentially a right angle, and so angle $B A C=\alpha$. The path difference $B C$ is then $\frac{1}{2} w \sin \alpha$. Setting $B C$ equal to $\frac{1}{2} \lambda$, we have $w \sin \alpha=\lambda$, and Eq. (1.6) becomes $\Delta x \Delta p_{x}=p \lambda$. The wavelength $\lambda$ is given by the de Broglie relation $\lambda=h / p$, so $\Delta x \Delta p_{x}=h$. Since the uncertainties have not been precisely defined, the equality sign is not really justified. Instead we write

\( \begin{equation*} \Delta x \Delta p_{x} \approx h \tag{1.7} \end{equation*} \)

indicating that the product of the uncertainties in $x$ and $p_{x}$ is of the order of magnitude of Planck's constant.

Although we have demonstrated (1.7) for only one experimental setup, its validity is general. No matter what attempts are made, the wave-particle duality of microscopic "particles" imposes a limit on our ability to measure simultaneously the position and momentum of such particles. The more precisely we determine the position, the less accurate is our determination of momentum. (In Fig. 1.1, $\sin \alpha=\lambda / w$, so narrowing the slit increases the spread of the diffraction pattern.) This limitation is the uncertainty principle, discovered in 1927 by Werner Heisenberg.
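A quick numeric check of the slit argument: since $\sin \alpha=\lambda / w$ at the first minimum, the product $\Delta x \Delta p_{x}=w p \sin \alpha=p \lambda=h$ no matter how the slit width is chosen. The electron speed and slit widths in the sketch below are illustrative assumptions.

```python
# Check of Eq. (1.7) for the single-slit setup: Delta x = w and
# Delta p_x = p sin(alpha) with sin(alpha) = lambda / w, so the product
# equals p * lambda = h for any slit width. Inputs are illustrative.
h = 6.626e-34
p = 9.109e-31 * 1.0e6  # electron momentum at 10^6 m/s, kg m/s
lam = h / p            # de Broglie wavelength, m

for w in (1e-9, 1e-8, 1e-7):       # slit widths, m
    product = w * (p * lam / w)    # Delta x * Delta p_x
    print(f"w = {w:.0e} m: product / h = {product / h:.6f}")  # always 1
```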

Because of the wave-particle duality, the act of measurement introduces an uncontrollable disturbance in the system being measured. We started with particles having a precise value of $p_{x}$ (zero). By imposing the slit, we measured the $x$ coordinate of the particles to an accuracy $w$, but this measurement introduced an uncertainty into the $p_{x}$ values of the particles. The measurement changed the state of the system.

Classical mechanics applies only to macroscopic particles. For microscopic "particles" we require a new form of mechanics, called quantum mechanics. We now consider some of the contrasts between classical and quantum mechanics. For simplicity a one-particle, one-dimensional system will be discussed.

In classical mechanics the motion of a particle is governed by Newton's second law:

\(\begin{equation*}F=m a=m \frac{d^{2} x}{d t^{2}} \tag{1.8}\end{equation*}\)

where $F$ is the force acting on the particle, $m$ is its mass, and $t$ is the time; $a$ is the acceleration, given by $a=d v / d t=(d / d t)(d x / d t)=d^{2} x / d t^{2}$, where $v$ is the velocity. Equation (1.8) contains the second derivative of the coordinate $x$ with respect to time. To solve it, we must carry out two integrations. This introduces two arbitrary constants $c_{1}$ and $c_{2}$ into the solution, and

\(\begin{equation*}x=g\left(t, c_{1}, c_{2}\right) \tag{1.9}\end{equation*}\)

where $g$ is some function of time. We now ask: What information must we possess at a given time $t_{0}$ to be able to predict the future motion of the particle? If we know that at $t_{0}$ the particle is at point $x_{0}$, we have

\(\begin{equation*}x_{0}=g\left(t_{0}, c_{1}, c_{2}\right) \tag{1.10}\end{equation*}\)

Since we have two constants to determine, more information is needed. Differentiating (1.9), we have

\(\frac{d x}{d t}=v=\frac{d}{d t} g\left(t, c_{1}, c_{2}\right)\)

If we also know that at time $t_{0}$ the particle has velocity $v_{0}$, then we have the additional relation

\(\begin{equation*}v_{0}=\left.\frac{d}{d t} g\left(t, c_{1}, c_{2}\right)\right|_{t=t_{0}} \tag{1.11}\end{equation*}\)

We may then use (1.10) and (1.11) to solve for $c_{1}$ and $c_{2}$ in terms of $x_{0}$ and $v_{0}$. Knowing $c_{1}$ and $c_{2}$, we can use Eq. (1.9) to predict the exact future motion of the particle.

As an example of Eqs. (1.8) to (1.11), consider the vertical motion of a particle in the earth's gravitational field. Let the $x$ axis point upward. The force on the particle is downward and is $F=-m g$, where $g$ is the gravitational acceleration constant. Newton's second law (1.8) is $-m g=m d^{2} x / d t^{2}$, so $d^{2} x / d t^{2}=-g$. A single integration gives $d x / d t=-g t+c_{1}$. The arbitrary constant $c_{1}$ can be found if we know that at time $t_{0}$ the particle had velocity $v_{0}$. Since $v=d x / d t$, we have $v_{0}=-g t_{0}+c_{1}$ and $c_{1}=v_{0}+g t_{0}$. Therefore, $d x / d t=-g t+g t_{0}+v_{0}$. Integrating a second time, we introduce another arbitrary constant $c_{2}$, which can be evaluated if we know that at time $t_{0}$ the particle had position $x_{0}$. We find (Prob. 1.7) $x=x_{0}-\frac{1}{2} g\left(t-t_{0}\right)^{2}+v_{0}\left(t-t_{0}\right)$. Knowing $x_{0}$ and $v_{0}$ at time $t_{0}$, we can predict the future position of the particle.
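The same result can be checked by stepping Newton's second law forward numerically. The sketch below uses a simple Euler integration; the initial values $x_{0}$ and $v_{0}$ are illustrative assumptions.

```python
# Numerically integrate d^2x/dt^2 = -g and compare with the closed form
# x = x0 - (1/2) g (t - t0)^2 + v0 (t - t0). Initial values are illustrative.
g = 9.81
x0, v0, t0 = 10.0, 2.0, 0.0

x, v, t, dt = x0, v0, t0, 1e-5
while t < 1.0:
    v -= g * dt  # dv/dt = -g
    x += v * dt  # dx/dt = v
    t += dt

exact = x0 - 0.5 * g * (1.0 - t0) ** 2 + v0 * (1.0 - t0)
print(f"numeric x(1)  = {x:.5f}")
print(f"analytic x(1) = {exact:.5f}")  # agreement to ~1e-4
```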

The classical-mechanical potential energy $V$ of a particle moving in one dimension is defined to satisfy

\(\begin{equation*}\frac{\partial V(x, t)}{\partial x}=-F(x, t) \tag{1.12}\end{equation*}\)

For example, for a particle moving in the earth's gravitational field, $\partial V / \partial x=-F=m g$ and integration gives $V=m g x+c$, where $c$ is an arbitrary constant. We are free to set the zero level of potential energy wherever we please. Choosing $c=0$, we have $V=m g x$ as the potential-energy function.

The word state in classical mechanics means a specification of the position and velocity of each particle of the system at some instant of time, plus specification of the forces
acting on the particles. According to Newton's second law, given the state of a system at any time, its future state and future motions are exactly determined, as shown by Eqs. (1.9)-(1.11). The impressive success of Newton's laws in explaining planetary motions led many philosophers to use Newton's laws as an argument for philosophical determinism. The mathematician and astronomer Laplace (1749-1827) assumed that the universe consisted of nothing but particles that obeyed Newton's laws. Therefore, given the state of the universe at some instant, the future motion of everything in the universe was completely determined. A super-being able to know the state of the universe at any instant could, in principle, calculate all future motions.



Although classical mechanics is deterministic, many classical-mechanical systems (for example, a pendulum oscillating under the influence of gravity, friction, and a periodically varying driving force) show chaotic behavior for certain ranges of the systems' parameters. In a chaotic system, the motion is extraordinarily sensitive to the initial values of the particles' positions and velocities and to the forces acting, and two initial states that differ by an experimentally undetectable amount will eventually lead to very different future behavior of the system. Thus, because the accuracy with which one can measure the initial state is limited, prediction of the long-term behavior of a chaotic classical-mechanical system is, in practice, impossible, even though the system obeys deterministic equations. Computer calculations of solar-system planetary orbits over tens of millions of years indicate that the motions of the planets are chaotic [I. Peterson, Newton's Clock: Chaos in the Solar System, Freeman, 1993; J. J. Lissauer, Rev. Mod. Phys., 71, 835 (1999)].


Given exact knowledge of the present state of a classical-mechanical system, we can predict its future state. However, the Heisenberg uncertainty principle shows that we cannot determine simultaneously the exact position and velocity of a microscopic particle, so the very knowledge required by classical mechanics for predicting the future motions of a system cannot be obtained. We must be content in quantum mechanics with something less than complete prediction of the exact future motion.

Our approach to quantum mechanics will be to postulate the basic principles and then use these postulates to deduce experimentally testable consequences such as the energy levels of atoms. To describe the state of a system in quantum mechanics, we postulate the existence of a function $\Psi$ of the particles' coordinates called the state function or wave function (often written as wavefunction). Since the state will, in general, change with time, $\Psi$ is also a function of time. For a one-particle, one-dimensional system, we have $\Psi=\Psi(x, t)$. The wave function contains all possible information about a system, so instead of speaking of "the state described by the wave function $\Psi$," we simply say "the state $\Psi$." Newton's second law tells us how to find the future state of a classical-mechanical system from knowledge of its present state. To find the future state of a quantum-mechanical system from knowledge of its present state, we want an equation that tells us how the wave function changes with time. For a one-particle, one-dimensional system, this equation is postulated to be

\(\begin{equation*}-\frac{\hbar}{i} \frac{\partial \Psi(x, t)}{\partial t}=-\frac{\hbar^{2}}{2 m} \frac{\partial^{2} \Psi(x, t)}{\partial x^{2}}+V(x, t) \Psi(x, t) \tag{1.13}\end{equation*}\)

where the constant $\hbar$ (h-bar) is defined as

\(\begin{equation*}\hbar \equiv \frac{h}{2 \pi} \tag{1.14}\end{equation*}\)

The concept of the wave function and the equation governing its change with time were discovered in 1926 by the Austrian physicist Erwin Schrödinger (1887-1961). In this equation, known as the time-dependent Schrödinger equation (or the Schrödinger wave equation), $i=\sqrt{-1}, m$ is the mass of the particle, and $V(x, t)$ is the potential-energy function of the system. (Many of the historically important papers in quantum mechanics are available at dieumsnh.qfb.umich.mx/archivoshistoricosmq.)

The time-dependent Schrödinger equation contains the first derivative of the wave function with respect to time and allows us to calculate the future wave function (state) at any time, if we know the wave function at time $t_{0}$.

The wave function contains all the information we can possibly know about the system it describes. What information does $\Psi$ give us about the result of a measurement of the $x$ coordinate of the particle? We cannot expect $\Psi$ to involve the definite specification of position that the state of a classical-mechanical system does. The correct answer to this question was provided by Max Born shortly after Schrödinger discovered the Schrödinger equation. Born postulated that for a one-particle, one-dimensional system,

\(\begin{equation*}|\Psi(x, t)|^{2} d x \tag{1.15}\end{equation*}\)

gives the probability at time $t$ of finding the particle in the region of the $x$ axis lying between $x$ and $x+d x$. In (1.15) the bars denote the absolute value and $d x$ is an infinitesimal length on the $x$ axis. The function $|\Psi(x, t)|^{2}$ is the probability density for finding the particle at various places on the $x$ axis. (Probability is reviewed in Section 1.6.) For example, suppose that at some particular time $t_{0}$ the particle is in a state characterized by the wave function $a e^{-b x^{2}}$, where $a$ and $b$ are real constants. If we measure the particle's position at time $t_{0}$, we might get any value of $x$, because the probability density $a^{2} e^{-2 b x^{2}}$ is nonzero everywhere. Values of $x$ in the region around $x=0$ are more likely to be found than other values, since $|\Psi|^{2}$ is a maximum at the origin in this case.

To relate $|\Psi|^{2}$ to experimental measurements, we would take many identical noninteracting systems, each of which was in the same state $\Psi$. Then the particle's position in each system is measured. If we had $n$ systems and made $n$ measurements, and if $d n_{x}$ denotes the number of measurements for which we found the particle between $x$ and $x+d x$, then $d n_{x} / n$ is the probability for finding the particle between $x$ and $x+d x$. Thus

\(\frac{d n_{x}}{n}=|\Psi|^{2} d x\)

and a graph of $(1 / n) d n_{x} / d x$ versus $x$ gives the probability density $|\Psi|^{2}$ as a function of $x$. It might be thought that we could find the probability-density function by taking one system that was in the state $\Psi$ and repeatedly measuring the particle's position. This procedure is wrong because the process of measurement generally changes the state of a system. We saw an example of this in the discussion of the uncertainty principle (Section 1.3).
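This prescription is easy to simulate. For the state $\Psi=a e^{-b x^{2}}$ discussed above, $|\Psi|^{2}$ is a normalized Gaussian, so we can mimic one position measurement on each of many identically prepared systems and compare the histogram $(1 / n) d n_{x} / d x$ with $|\Psi|^{2}$. The value $b=1$ (arbitrary units) is an assumed illustration.

```python
import numpy as np

# Simulate one position measurement on each of n identically prepared
# systems in the state Psi = a exp(-b x^2). Then |Psi|^2 = a^2 exp(-2 b x^2)
# is a normal density with sigma = 1/(2 sqrt(b)), so it can be sampled
# directly. b = 1 (arbitrary units) is an assumed illustrative value.
rng = np.random.default_rng(1)
b = 1.0
sigma = 0.5 / np.sqrt(b)
samples = rng.normal(0.0, sigma, size=200_000)  # one measurement per system

counts, edges = np.histogram(samples, bins=9, range=(-0.9, 0.9), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
exact = np.exp(-2 * b * centers**2) / (sigma * np.sqrt(2 * np.pi))  # |Psi|^2
for xc, est, ex in zip(centers, counts, exact):
    print(f"x = {xc:+.1f}: histogram {est:.3f}, |Psi|^2 {ex:.3f}")
```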

Quantum mechanics is statistical in nature. Knowing the state, we cannot predict the result of a position measurement with certainty; we can only predict the probabilities of various possible results. The Bohr theory of the hydrogen atom specified the precise path of the electron and is therefore not a correct quantum-mechanical picture.

Quantum mechanics does not say that an electron is distributed over a large region of space as a wave is distributed. Rather, it is the probability patterns (wave functions) used to describe the electron's motion that behave like waves and satisfy a wave equation.

How the wave function gives us information on other properties besides the position is discussed in later chapters.

The postulates of thermodynamics (the first, second, and third laws of thermodynamics) are stated in terms of macroscopic experience and hence are fairly readily understood. The postulates of quantum mechanics are stated in terms of the microscopic world and appear quite abstract. You should not expect to fully understand the postulates of quantum mechanics at first reading. As we treat various examples, understanding of the postulates will increase.

It may bother the reader that we wrote down the Schrödinger equation without any attempt to prove its plausibility. By using analogies between geometrical optics and classical mechanics on the one hand, and wave optics and quantum mechanics on the other hand, one can show the plausibility of the Schrödinger equation. Geometrical optics is an approximation to wave optics, valid when the wavelength of the light is much less than the size of the apparatus. (Recall its use in treating lenses and mirrors.) Likewise, classical mechanics is an approximation to wave mechanics, valid when the particle's wavelength is much less than the size of the apparatus. One can make a plausible guess as to how to get the proper equation for quantum mechanics from classical mechanics based on the known relation between the equations of geometrical and wave optics. Since many chemists are not particularly familiar with optics, these arguments have been omitted. In any case, such analogies can only make the Schrödinger equation seem plausible. They cannot be used to derive or prove this equation. The Schrödinger equation is a postulate of the theory, to be tested by agreement of its predictions with experiment. (Details of the reasoning that led Schrödinger to his equation are given in Jammer, Section 5.3. A reference with the author's name italicized is listed in the Bibliography.)

Quantum mechanics provides the law of motion for microscopic particles. Experimentally, macroscopic objects obey classical mechanics. Hence for quantum mechanics to be a valid theory, it should reduce to classical mechanics as we make the transition from microscopic to macroscopic particles. Quantum effects are associated with the de Broglie wavelength $\lambda=h / m v$. Since $h$ is very small, the de Broglie wavelength of macroscopic objects is essentially zero. Thus, in the limit $\lambda \rightarrow 0$, we expect the time-dependent Schrödinger equation to reduce to Newton's second law. We can prove this to be so (see Prob. 7.59).

A similar situation holds in the relation between special relativity and classical mechanics. In the limit $v / c \rightarrow 0$, where $c$ is the speed of light, special relativity reduces to classical mechanics. The form of quantum mechanics that we will develop will be nonrelativistic. A complete integration of relativity with quantum mechanics has not been achieved.

Historically, quantum mechanics was first formulated in 1925 by Heisenberg, Born, and Jordan using matrices, several months before Schrödinger's 1926 formulation using differential equations. Schrödinger proved that the Heisenberg formulation (called matrix mechanics) is equivalent to the Schrödinger formulation (called wave mechanics). In 1926, Dirac and Jordan, working independently, formulated quantum mechanics in an abstract version called transformation theory that is a generalization of matrix mechanics and wave mechanics. In 1948, Feynman devised the path integral formulation of quantum mechanics [R. P. Feynman, Rev. Mod. Phys., 20, 367 (1948); R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, 1965].
The time-dependent Schrödinger equation (1.13) is formidable looking. Fortunately, many applications of quantum mechanics to chemistry do not use this equation. Instead, the simpler time-independent Schrödinger equation is used. We now derive the
time-independent from the time-dependent Schrödinger equation for the one-particle, one-dimensional case.

We begin by restricting ourselves to the special case where the potential energy $V$ is not a function of time but depends only on $x$. This will be true if the system experiences no time-dependent external forces. The time-dependent Schrödinger equation reads

\(
\begin{equation*}
-\frac{\hbar}{i} \frac{\partial \Psi(x, t)}{\partial t}=-\frac{\hbar^{2}}{2 m} \frac{\partial^{2} \Psi(x, t)}{\partial x^{2}}+V(x) \Psi(x, t) \tag{1.16}
\end{equation*}
\)


We now restrict ourselves to looking for those solutions of (1.16) that can be written as the product of a function of time and a function of $x$ :

\(
\begin{equation*}
\Psi(x, t)=f(t) \psi(x) \tag{1.17}
\end{equation*}
\)


Capital psi is used for the time-dependent wave function and lowercase psi for the factor that depends only on the coordinate $x$. States corresponding to wave functions of the form (1.17) possess certain properties (to be discussed shortly) that make them of great interest. [Not all solutions of (1.16) have the form (1.17); see Prob. 3.51.] Taking partial derivatives of (1.17), we have

\(
\frac{\partial \Psi(x, t)}{\partial t}=\frac{d f(t)}{d t} \psi(x), \quad \frac{\partial^{2} \Psi(x, t)}{\partial x^{2}}=f(t) \frac{d^{2} \psi(x)}{d x^{2}}
\)


Substitution into (1.16) gives

\(
\begin{gather*}
-\frac{\hbar}{i} \frac{d f(t)}{d t} \psi(x)=-\frac{\hbar^{2}}{2 m} f(t) \frac{d^{2} \psi(x)}{d x^{2}}+V(x) f(t) \psi(x) \\
-\frac{\hbar}{i} \frac{1}{f(t)} \frac{d f(t)}{d t}=-\frac{\hbar^{2}}{2 m} \frac{1}{\psi(x)} \frac{d^{2} \psi(x)}{d x^{2}}+V(x) \tag{1.18}
\end{gather*}
\)


where we divided by $f \psi$. In general, we expect the quantity to which each side of (1.18) is equal to be a certain function of $x$ and $t$. However, the right side of (1.18) does not depend on $t$, so the function to which each side of (1.18) is equal must be independent of $t$. The left side of (1.18) is independent of $x$, so this function must also be independent of $x$. Since the function is independent of both variables, $x$ and $t$, it must be a constant. We call this constant $E$.

Equating the left side of (1.18) to $E$, we get

\(
\frac{d f(t)}{f(t)}=-\frac{i E}{\hbar} d t
\)


Integrating both sides of this equation with respect to $t$, we have

\(
\ln f(t)=-i E t / \hbar+C
\)


where $C$ is an arbitrary constant of integration. Hence

\(
f(t)=e^{C} e^{-i E t / \hbar}=A e^{-i E t / \hbar}
\)


where the arbitrary constant $A$ has replaced $e^{C}$. Since $A$ can be included as a factor in the function $\psi(x)$ that multiplies $f(t)$ in (1.17), $A$ can be omitted from $f(t)$. Thus

\(
f(t)=e^{-i E t / \hbar}
\)


Equating the right side of (1.18) to $E$, we have

\(
\begin{equation*}
-\frac{\hbar^{2}}{2 m} \frac{d^{2} \psi(x)}{d x^{2}}+V(x) \psi(x)=E \psi(x) \tag{1.19}
\end{equation*}
\)


Equation (1.19) is the time-independent Schrödinger equation for a single particle of mass $m$ moving in one dimension.

What is the significance of the constant $E$ ? Since $E$ occurs as $[E-V(x)]$ in (1.19), $E$ has the same dimensions as $V$, so $E$ has the dimensions of energy. In fact, we postulate that $E$ is the energy of the system. (This is a special case of a more general postulate to be discussed in a later chapter.) Thus, for cases where the potential energy is a function of $x$ only, there exist wave functions of the form

\(
\begin{equation*}
\Psi(x, t)=e^{-i E t / \hbar} \psi(x) \tag{1.20}
\end{equation*}
\)


and these wave functions correspond to states of constant energy $E$. Much of our attention in the next few chapters will be devoted to finding the solutions of (1.19) for various systems.

The wave function in (1.20) is complex, but the quantity that is experimentally observable is the probability density $|\Psi(x, t)|^{2}$. The square of the absolute value of a complex quantity is given by the product of the quantity with its complex conjugate, the complex conjugate being formed by replacing $i$ with $-i$ wherever it occurs. (See Section 1.7.) Thus

\(
\begin{equation*}
|\Psi|^{2}=\Psi^{*} \Psi \tag{1.21}
\end{equation*}
\)


where the star denotes the complex conjugate. For the wave function (1.20),

\(
\begin{align*}
|\Psi(x, t)|^{2} & =\left[e^{-i E t / \hbar} \psi(x)\right]^{*} e^{-i E t / \hbar} \psi(x) \\
& =e^{i E t / \hbar} \psi^{*}(x) e^{-i E t / \hbar} \psi(x) \\
& =e^{0} \psi^{*}(x) \psi(x)=\psi^{*}(x) \psi(x) \\
|\Psi(x, t)|^{2} & =|\psi(x)|^{2} \tag{1.22}
\end{align*}
\)


In deriving (1.22), we assumed that $E$ is a real number, so $E=E^{*}$. This fact will be proved in Section 7.2.

Hence for states of the form (1.20), the probability density is given by $|\psi(x)|^{2}$ and does not change with time. Such states are called stationary states. Since the physically significant quantity is $|\Psi(x, t)|^{2}$, and since for stationary states $|\Psi(x, t)|^{2}=|\psi(x)|^{2}$, the function $\psi(x)$ is often called the wave function, although the complete wave function of a stationary state is obtained by multiplying $\psi(x)$ by $e^{-i E t / \hbar}$. The term stationary state should not mislead the reader into thinking that a particle in a stationary state is at rest. What is stationary is the probability density $|\Psi|^{2}$, not the particle itself.
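The cancellation in (1.22) can also be seen numerically: multiplying any $\psi(x)$ by the phase factor $e^{-i E t / \hbar}$ leaves $|\Psi|^{2}$ unchanged at every time. The sketch below uses units with $\hbar=1$; the values of $E$ and the spatial factor are arbitrary illustrative assumptions.

```python
import numpy as np

# Numeric illustration of Eq. (1.22): |exp(-i E t / hbar) psi(x)|^2 equals
# |psi(x)|^2 at every t. Units with hbar = 1, and the values of E and the
# spatial factor psi, are arbitrary illustrative assumptions.
hbar, E = 1.0, 2.5
x = np.linspace(-3.0, 3.0, 7)
psi = np.exp(-x**2)  # any real or complex spatial factor works

for t in (0.0, 0.7, 5.0):
    Psi = np.exp(-1j * E * t / hbar) * psi
    print(t, np.allclose(np.abs(Psi)**2, np.abs(psi)**2))  # True at every t
```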

We will be concerned mostly with states of constant energy (stationary states) and hence will usually deal with the time-independent Schrödinger equation (1.19). For simplicity we will refer to this equation as "the Schrödinger equation." Note that the Schrödinger equation contains two unknowns: the allowed energies $E$ and the allowed wave functions $\psi$. To solve for two unknowns, we need to impose additional conditions (called boundary conditions) on $\psi$ besides requiring that it satisfy (1.19). The boundary conditions determine the allowed energies, since it turns out that only certain values of $E$ allow $\psi$ to satisfy the boundary conditions. This will become clearer when we discuss specific examples in later chapters.

Probability plays a fundamental role in quantum mechanics. This section reviews the mathematics of probability.

There has been much controversy about the proper definition of probability. One definition is the following: If an experiment has $n$ equally probable outcomes, $m$ of which are favorable to the occurrence of a certain event $A$, then the probability that $A$ occurs is $m / n$. Note that this definition is circular, since it specifies equally probable outcomes when probability is what we are trying to define. It is simply assumed that we can recognize equally probable outcomes. An alternative definition is based on actually performing the experiment many times. Suppose that we perform the experiment $N$ times and that in $M$ of these trials the event $A$ occurs. The probability of $A$ occurring is then defined as

\(
\lim _{N \rightarrow \infty} \frac{M}{N}
\)


Thus, if we toss a coin repeatedly, the fraction of heads will approach $1 / 2$ as we increase the number of tosses.

For example, suppose we ask for the probability of drawing a heart when a card is picked at random from a standard 52-card deck containing 13 hearts. There are 52 cards and hence 52 equally probable outcomes. There are 13 hearts and hence 13 favorable outcomes. Therefore, $m / n=13 / 52=1 / 4$. The probability for drawing a heart is $1 / 4$.

Sometimes we ask for the probability of two related events both occurring. For example, we may ask for the probability of drawing two hearts from a 52-card deck, assuming we do not replace the first card after it is drawn. There are 52 possible outcomes of the first draw, and for each of these possibilities there are 51 possible second draws. We have $52 \cdot 51$ possible outcomes. Since there are 13 hearts, there are $13 \cdot 12$ different ways to draw two hearts. The desired probability is $(13 \cdot 12) /(52 \cdot 51)=1 / 17$. This calculation illustrates the theorem: The probability that two events $A$ and $B$ both occur is the probability that $A$ occurs, multiplied by the conditional probability that $B$ then occurs, calculated with the assumption that $A$ occurred. Thus, if $A$ is the event of drawing a heart on the first draw, the probability of $A$ is $13 / 52$. The probability of drawing a heart on the second draw, given that the first draw yielded a heart, is $12 / 51$ since there remain 12 hearts in the deck. The probability of drawing two hearts is then $(13 / 52)(12 / 51)=1 / 17$, as found previously.
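The two-hearts result is small enough to verify by brute force; the sketch below enumerates all $52 \cdot 51$ ordered draws.

```python
from fractions import Fraction
from itertools import permutations

# Brute-force check of the two-hearts probability: enumerate all 52*51
# ordered pairs of distinct cards from a deck with 13 hearts ("H").
deck = ["H"] * 13 + ["X"] * 39
pairs = list(permutations(range(52), 2))
both = sum(1 for i, j in pairs if deck[i] == "H" and deck[j] == "H")
print(Fraction(both, len(pairs)))  # 1/17, as derived above
```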

In quantum mechanics we must deal with probabilities involving a continuous variable, for example, the $x$ coordinate. It does not make much sense to talk about the probability of a particle being found at a particular point such as $x=0.5000 \ldots$, since there are an infinite number of points on the $x$ axis, and for any finite number of measurements we make, the probability of getting exactly $0.5000 \ldots$ is vanishingly small. Instead we talk of the probability of finding the particle in a tiny interval of the $x$ axis lying between $x$ and $x+d x, d x$ being an infinitesimal element of length. This probability will naturally be proportional to the length of the interval, $d x$, and will vary for different regions of the $x$ axis. Hence the probability that the particle will be found between $x$ and $x+d x$ is equal to $g(x) d x$, where $g(x)$ is some function that tells how the probability varies over the $x$ axis. The function $g(x)$ is called the probability density, since it is a probability per unit length. Since probabilities are real, nonnegative numbers, $g(x)$ must be a real function that is everywhere nonnegative. The wave function $\Psi$ can take on negative and complex values and is not a probability density. Quantum mechanics postulates that the probability density is $|\Psi|^{2}$ [Eq. (1.15)].

What is the probability that the particle lies in some finite region of space $a \leq x \leq b$ ? To find this probability, we sum up the probabilities $|\Psi|^{2} d x$ of finding the particle in all
the infinitesimal regions lying between $a$ and $b$. This is just the definition of the definite integral

\(
\begin{equation*}
\int_{a}^{b}|\Psi|^{2} d x=\operatorname{Pr}(a \leq x \leq b) \tag{1.23}
\end{equation*}
\)


where Pr denotes a probability. A probability of 1 represents certainty. Since it is certain that the particle is somewhere on the $x$ axis, we have the requirement

\(
\begin{equation*}
\int_{-\infty}^{\infty}|\Psi|^{2} d x=1 \tag{1.24}
\end{equation*}
\)


When $\Psi$ satisfies (1.24), it is said to be normalized. For a stationary state, $|\Psi|^{2}=|\psi|^{2}$ and $\int_{-\infty}^{\infty}|\psi|^{2} d x=1$.

## EXAMPLE

A one-particle, one-dimensional system has $\Psi=a^{-1 / 2} e^{-|x| / a}$ at $t=0$, where $a=1.0000 \mathrm{~nm}$. At $t=0$, the particle's position is measured. (a) Find the probability that the measured value lies between $x=1.5000 \mathrm{~nm}$ and $x=1.5001 \mathrm{~nm}$. (b) Find the probability that the measured value is between $x=0$ and $x=2 \mathrm{~nm}$. (c) Verify that $\Psi$ is normalized.
(a) In this tiny interval, $x$ changes by only 0.0001 nm , and $\Psi$ goes from $e^{-1.5000} \mathrm{~nm}^{-1 / 2}=0.22313 \mathrm{~nm}^{-1 / 2}$ to $e^{-1.5001} \mathrm{~nm}^{-1 / 2}=0.22311 \mathrm{~nm}^{-1 / 2}$, so $\Psi$ is nearly constant in this interval, and it is a very good approximation to consider this interval as infinitesimal. The desired probability is given by (1.15) as

\(
\begin{aligned}
|\Psi|^{2} d x=a^{-1} e^{-2|x| / a} d x & =(1 \mathrm{~nm})^{-1} e^{-2(1.5 \mathrm{~nm}) /(1 \mathrm{~nm})}(0.0001 \mathrm{~nm}) \\
& =4.979 \times 10^{-6}
\end{aligned}
\)


(See also Prob. 1.14.)
(b) Use of Eq. (1.23) and $|x|=x$ for $x \geq 0$ gives

\(
\begin{aligned}
\operatorname{Pr}(0 \leq x \leq 2 \mathrm{~nm}) & =\int_{0}^{2 \mathrm{~nm}}|\Psi|^{2} d x=a^{-1} \int_{0}^{2 \mathrm{~nm}} e^{-2 x / a} d x \\
& =-\left.\frac{1}{2} e^{-2 x / a}\right|_{0} ^{2 \mathrm{~nm}}=-\frac{1}{2}\left(e^{-4}-1\right)=0.4908
\end{aligned}
\)


(c) Use of $\int_{-\infty}^{\infty} f(x) d x=\int_{-\infty}^{0} f(x) d x+\int_{0}^{\infty} f(x) d x,|x|=-x$ for $x \leq 0$, and $|x|=x$ for $x \geq 0$, gives

\(
\begin{aligned}
\int_{-\infty}^{\infty}|\Psi|^{2} d x & =a^{-1} \int_{-\infty}^{0} e^{2 x / a} d x+a^{-1} \int_{0}^{\infty} e^{-2 x / a} d x \\
& =a^{-1}\left(\left.\frac{1}{2} a e^{2 x / a}\right|_{-\infty} ^{0}\right)+a^{-1}\left(-\left.\frac{1}{2} a e^{-2 x / a}\right|_{0} ^{\infty}\right)=\frac{1}{2}+\frac{1}{2}=1
\end{aligned}
\)


EXERCISE For a system whose state function at the time of a position measurement is $\Psi=\left(32 a^{3} / \pi\right)^{1 / 4} x e^{-a x^{2}}$, where $a=1.0000 \mathrm{~nm}^{-2}$, find the probability that the particle is found between $x=1.2000 \mathrm{~nm}$ and 1.2001 nm . Treat the interval as infinitesimal.
(Answer: 0.0000258.)
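
The three parts of the example above can also be checked numerically. Here is a minimal Python sketch, assuming NumPy and SciPy are available for the quadrature:

```python
import numpy as np
from scipy.integrate import quad

a = 1.0  # nm
psi_sq = lambda x: np.exp(-2 * abs(x) / a) / a   # |Psi|^2 for Psi = a**-0.5 * exp(-|x|/a)

# (a) treat the 0.0001 nm interval as infinitesimal: |Psi|^2 dx
print(psi_sq(1.5) * 0.0001)            # ~4.98e-06

# (b) Pr(0 <= x <= 2 nm) by quadrature, Eq. (1.23)
print(quad(psi_sq, 0, 2)[0])           # ~0.4908

# (c) normalization, Eq. (1.24); by symmetry, integrate half the axis and double
print(2 * quad(psi_sq, 0, np.inf)[0])  # ~1.0
```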

We have seen that the wave function can be complex, so we now review some properties of complex numbers.

A complex number $z$ is a number of the form

\(
\begin{equation}
z=x+i y, \quad \text { where } i \equiv \sqrt{-1} \tag{1.25}
\end{equation}
\)

and where $x$ and $y$ are real numbers (numbers that do not involve the square root of a negative quantity). If $y=0$ in (1.25), then $z$ is a real number. If $y \neq 0$, then $z$ is an imaginary number. If $x=0$ and $y \neq 0$, then $z$ is a pure imaginary number. For example, 6.83 is a real number, $5.4-3 i$ is an imaginary number, and $0.60 i$ is a pure imaginary number. Real and pure imaginary numbers are special cases of complex numbers. In (1.25), $x$ and $y$ are called the real and imaginary parts of $z$, respectively: $x=\operatorname{Re}(z) ; y=\operatorname{Im}(z)$.

The complex number $z$ can be represented as a point in the complex plane (Fig. 1.3), where the real part of $z$ is plotted on the horizontal axis and the imaginary part on the vertical axis. This diagram immediately suggests defining two quantities that characterize the complex number $z$ : the distance $r$ of the point $z$ from the origin is called the absolute value or modulus of $z$ and is denoted by $|z|$; the angle $\theta$ that the radius vector to the point $z$ makes with the positive horizontal axis is called the phase or argument of $z$. We have

\(
\begin{gather}
|z|=r=\left(x^{2}+y^{2}\right)^{1 / 2}, \quad \tan \theta=y / x \tag{1.26}\\
x=r \cos \theta, \quad y=r \sin \theta
\end{gather}
\)

So we may write $z=x+i y$ as

\(
\begin{equation}
z=r \cos \theta+i r \sin \theta=r e^{i \theta} \tag{1.27}
\end{equation}
\)

since (Prob. 4.3)

\(
\begin{equation}
e^{i \theta}=\cos \theta+i \sin \theta \tag{1.28}
\end{equation}
\)

The angle $\theta$ in these equations is in radians.
If $z=x+i y$, the complex conjugate $z^{*}$ of the complex number $z$ is defined as

\(
\begin{equation}
z^{*} \equiv x-i y=r e^{-i \theta} \tag{1.29}
\end{equation}
\)

FIGURE 1.3 (a) Plot of a complex number $z=x+i y$. (b) Plot of the number $-2+i$.

If $z$ is a real number, its imaginary part is zero. Thus $z$ is real if and only if $z=z^{*}$. Taking the complex conjugate twice, we get $z$ back again, $\left(z^{*}\right)^{*}=z$. Forming the product of $z$ and its complex conjugate and using $i^{2}=-1$, we have

\(
\begin{gather}
z z^{*}=(x+i y)(x-i y)=x^{2}+i y x-i y x-i^{2} y^{2} \\
z z^{*}=x^{2}+y^{2}=r^{2}=|z|^{2} \tag{1.30}
\end{gather}
\)
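
Python's cmath module illustrates these relations conveniently; the short sketch below uses the number $-2+i$ plotted in Fig. 1.3(b):

```python
import cmath

z = -2 + 1j                                   # the number plotted in Fig. 1.3(b)
r, theta = cmath.polar(z)                     # |z| and the phase in radians, Eq. (1.26)
print(r, theta)                               # 2.236..., 2.677...

print(abs(z)**2, (z * z.conjugate()).real)    # both ~5: |z|^2 = z z*  [Eq. (1.30)]
print(cmath.rect(r, theta))                   # rebuilds z = r e^{i theta}  [Eq. (1.27)]
```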

For the product and quotient of two complex numbers $z_{1}=r_{1} e^{i \theta_{1}}$ and $z_{2}=r_{2} e^{i \theta_{2}}$, we have

\(
\begin{equation}
z_{1} z_{2}=r_{1} r_{2} e^{i\left(\theta_{1}+\theta_{2}\right)}, \quad \frac{z_{1}}{z_{2}}=\frac{r_{1}}{r_{2}} e^{i\left(\theta_{1}-\theta_{2}\right)} \tag{1.31}
\end{equation}
\)

It is easy to prove, either from the definition of complex conjugate or from (1.31), that

\(
\begin{equation}
\left(z_{1} z_{2}\right)^{*}=z_{1}^{*} z_{2}^{*} \tag{1.32}
\end{equation}
\)

Likewise,

\(
\begin{equation}
\left(z_{1} / z_{2}\right)^{*}=z_{1}^{*} / z_{2}^{*}, \quad\left(z_{1}+z_{2}\right)^{*}=z_{1}^{*}+z_{2}^{*}, \quad\left(z_{1}-z_{2}\right)^{*}=z_{1}^{*}-z_{2}^{*} \tag{1.33}
\end{equation}
\)

For the absolute values of products and quotients, it follows from (1.31) that

\(
\begin{equation}
\left|z_{1} z_{2}\right|=\left|z_{1}\right|\left|z_{2}\right|, \quad\left|\frac{z_{1}}{z_{2}}\right|=\frac{\left|z_{1}\right|}{\left|z_{2}\right|} \tag{1.34}
\end{equation}
\)

Therefore, if $\psi$ is a complex wave function, we have

\(
\begin{equation}
\left|\psi^{2}\right|=|\psi|^{2}=\psi^{*} \psi \tag{1.35}
\end{equation}
\)

We now obtain a formula for the $n$th roots of the number 1. We may take the phase of the number 1 to be 0 or $2 \pi$ or $4 \pi$, and so on. Hence $1=e^{i 2 \pi k}$, where $k$ is any integer, zero, negative, or positive. Now consider the number $\omega$, where $\omega \equiv e^{i 2 \pi k / n}, n$ being a positive integer. Using (1.31) $n$ times, we see that $\omega^{n}=e^{i 2 \pi k}=1$. Thus $\omega$ is an $n$th root of unity. There are $n$ different complex $n$th roots of unity, and taking $n$ successive values of the integer $k$ gives us all of them:

\(
\begin{equation}
\omega=e^{i 2 \pi k / n}, \quad k=0,1,2, \ldots, n-1 \tag{1.36}
\end{equation}
\)

Any other value of $k$ besides those in (1.36) gives a number whose phase differs by an integral multiple of $2 \pi$ from one of the numbers in (1.36) and hence is not a different root. For $n=2$ in (1.36), we get the two square roots of 1 ; for $n=3$, the three cube roots of 1 ; and so on.
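
A short sketch, assuming Python's cmath module, generates the roots from Eq. (1.36) for $n=3$ and confirms that each one satisfies $\omega^{n}=1$:

```python
import cmath

n = 3
roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]   # Eq. (1.36)
for w in roots:
    print(w, abs(w**n - 1) < 1e-12)    # every root satisfies w**n = 1
```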


Click the keywords to view related YouTube videos.

Differential Equations: These are equations involving derivatives of a function. In this chapter, ordinary differential equations with one independent variable are discussed, as well as linear and nonlinear differential equations.

Boundary Conditions: These are conditions that specify the value of a function or its derivatives at specific points. They are used to determine the constants in the general solution of a differential equation.

Linear Differential Equation: A type of differential equation where the dependent variable and its derivatives appear to the first power and are not multiplied together.

Nonlinear Differential Equation: A differential equation that cannot be written in the form of a linear differential equation. It involves terms where the dependent variable or its derivatives appear to a power other than one or are multiplied together.

Homogeneous Differential Equation: A linear differential equation where the function on the right-hand side is zero.

Inhomogeneous Differential Equation: A linear differential equation where the function on the right-hand side is not zero.

Schrödinger Equation: A fundamental equation in quantum mechanics that describes how the quantum state of a physical system changes over time. The time-independent Schrödinger equation is used in this chapter to solve for the stationary-state wave functions and energy levels of a particle in a one-dimensional box.

Wave Function (ψ): A mathematical function that describes the quantum state of a particle. The square of the wave function's magnitude gives the probability density of finding the particle at a given position.

Quantum Number (n): A number that quantizes the energy levels of a particle in a box. Each quantum number corresponds to a different wave function and energy state.

Nodes: Points where the wave function is zero. The number of nodes increases with the quantum number.

Ground State: The lowest energy state of a particle in a box. It corresponds to the quantum number n=1.

Excited States: Energy states higher than the ground state. They correspond to quantum numbers n=2, 3, etc.

Bohr Correspondence Principle: A principle stating that the predictions of quantum mechanics converge to those of classical mechanics as the quantum numbers become very large.

Orthogonality: A property of wave functions where the integral of the product of two different wave functions over all space is zero.

Normalization: The process of adjusting the wave function so that the total probability of finding the particle is one.

Tunneling: A quantum mechanical phenomenon where a particle can pass through a potential energy barrier that it classically should not be able to pass.

2.1 Differential Equations

This section considers only ordinary differential equations, which are those with only one independent variable. [A partial differential equation has more than one independent variable. An example is the time-dependent Schrödinger equation (1.16), in which $t$ and $x$ are the independent variables.] An ordinary differential equation is a relation involving an independent variable $x$, a dependent variable $y(x)$, and the first, second, $\ldots, n$th derivatives of $y\left(y^{\prime}, y^{\prime \prime}, \ldots, y^{(n)}\right)$. An example is

\( \begin{equation} y^{\prime \prime \prime}+2 x\left(y^{\prime}\right)^{2}+y^{2} \sin x=3 e^{x} \tag{2.1} \end{equation} \)

The order of a differential equation is the order of the highest derivative in the equation. Thus, (2.1) is of third order.

A special kind of differential equation is the linear differential equation, which has the form

\( \begin{equation} A_{n}(x) y^{(n)}+A_{n-1}(x) y^{(n-1)}+\cdots+A_{1}(x) y^{\prime}+A_{0}(x) y=g(x) \tag{2.2} \end{equation} \)

where the $A$'s and $g$ (some of which may be zero) are functions of $x$ only. In the $n$th-order linear differential equation (2.2), $y$ and its derivatives appear to the first power. A differential equation that cannot be put in the form (2.2) is nonlinear. If $g(x)=0$ in (2.2), the linear differential equation is homogeneous; otherwise it is inhomogeneous. The one-dimensional Schrödinger equation (1.19) is a linear homogeneous second-order differential equation.

By dividing by the coefficient of $y^{\prime \prime}$, we can put every linear homogeneous second-order differential equation into the form

\( \begin{equation} y^{\prime \prime}+P(x) y^{\prime}+Q(x) y=0 \tag{2.3} \end{equation} \)

Suppose $y_{1}$ and $y_{2}$ are two independent functions, each of which satisfies (2.3). By independent, we mean that $y_{2}$ is not simply a multiple of $y_{1}$. Then the general solution of the linear homogeneous differential equation (2.3) is

\( \begin{equation} y=c_{1} y_{1}+c_{2} y_{2} \tag{2.4} \end{equation} \)

where $c_{1}$ and $c_{2}$ are arbitrary constants. This is readily verified by substituting (2.4) into the left side of (2.3):

\( \begin{align} & c_{1} y_{1}^{\prime \prime}+c_{2} y_{2}^{\prime \prime}+P(x) c_{1} y_{1}^{\prime}+P(x) c_{2} y_{2}^{\prime}+Q(x) c_{1} y_{1}+Q(x) c_{2} y_{2} \\ & \quad=c_{1}\left[y_{1}^{\prime \prime}+P(x) y_{1}^{\prime}+Q(x) y_{1}\right]+c_{2}\left[y_{2}^{\prime \prime}+P(x) y_{2}^{\prime}+Q(x) y_{2}\right] \\ & \quad=c_{1} \cdot 0+c_{2} \cdot 0=0 \tag{2.5} \end{align} \)

where the fact that $y_{1}$ and $y_{2}$ satisfy (2.3) has been used. The general solution of a differential equation of $n$th order usually has $n$ arbitrary constants. To fix these constants, we may have boundary conditions, which are conditions that specify the value of $y$ or various of its derivatives at a point or points. For example, if $y$ is the displacement of a vibrating string held fixed at two points, we know $y$ must be zero at these points.

An important special case is a linear homogeneous second-order differential equation with constant coefficients:

\( \begin{equation} y^{\prime \prime}+p y^{\prime}+q y=0 \tag{2.6} \end{equation} \)

where $p$ and $q$ are constants. To solve (2.6), let us tentatively assume a solution of the form $y=e^{s x}$. We are looking for a function whose derivatives when multiplied by constants will cancel the original function. The exponential function repeats itself when differentiated and is thus the correct choice. Substitution in (2.6) gives

\( \begin{gather} s^{2} e^{s x}+p s e^{s x}+q e^{s x}=0 \\ s^{2}+p s+q=0 \tag{2.7} \end{gather} \)

Equation (2.7) is called the auxiliary equation. It is a quadratic equation with two roots $s_{1}$ and $s_{2}$ that, provided $s_{1}$ and $s_{2}$ are not equal, give two independent solutions to (2.6). Thus, the general solution of (2.6) is

\( \begin{equation} y=c_{1} e^{s_{1} x}+c_{2} e^{s_{2} x} \tag{2.8} \end{equation} \)

For example, for $y^{\prime \prime}+6 y^{\prime}-7 y=0$, the auxiliary equation is $s^{2}+6 s-7=0$. The quadratic formula gives $s_{1}=1, s_{2}=-7$, so the general solution is $c_{1} e^{x}+c_{2} e^{-7 x}$.
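
The roots of the auxiliary equation can also be found numerically. A minimal sketch, assuming NumPy, for the example just given:

```python
import numpy as np

# y'' + 6y' - 7y = 0 has auxiliary equation s^2 + 6s - 7 = 0  [Eq. (2.7)]
p, q = 6.0, -7.0
roots = np.roots([1.0, p, q])
print(roots)                           # [-7.  1.]

# Each root gives a solution y = e^{sx}: y'' + p y' + q y = (s^2 + ps + q) e^{sx} = 0
for s in roots:
    print(s**2 + p * s + q)            # 0.0 for both roots
```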

2.2 Particle in a One-Dimensional Box

This section solves the time-independent Schrödinger equation for a particle in a one-dimensional box. By this we mean a particle subjected to a potential-energy function that is infinite everywhere along the $x$ axis except for a line segment of length $l$, where the potential energy is zero. Such a system may seem physically unreal, but this model can be applied with some success to certain conjugated molecules; see Prob. 2.17. We put the origin at the left end of the line segment (Fig. 2.1).

FIGURE 2.1 Potential energy function $V(x)$ for the particle in a one-dimensional box.

We have three regions to consider. In regions I and III, the potential energy $V$ equals infinity and the time-independent Schrödinger equation (1.19) is

\( -\frac{\hbar^{2}}{2 m} \frac{d^{2} \psi}{d x^{2}}=(E-\infty) \psi \)

Neglecting $E$ in comparison with $\infty$, we have

\( \frac{d^{2} \psi}{d x^{2}}=\infty \psi, \quad \psi=\frac{1}{\infty} \frac{d^{2} \psi}{d x^{2}} \)

and we conclude that $\psi$ is zero outside the box:

\( \begin{equation} \psi_{\mathrm{I}}=0, \quad \psi_{\mathrm{III}}=0 \tag{2.9} \end{equation} \)

For region II, $x$ between zero and $l$, the potential energy $V$ is zero, and the Schrödinger equation (1.19) becomes

\( \begin{equation} \frac{d^{2} \psi_{\mathrm{II}}}{d x^{2}}+\frac{2 m}{\hbar^{2}} E \psi_{\mathrm{II}}=0 \tag{2.10} \end{equation} \)

where $m$ is the mass of the particle and $E$ is its energy. We recognize (2.10) as a linear homogeneous second-order differential equation with constant coefficients. The auxiliary equation (2.7) gives

\( \begin{gather} s^{2}+2 m E \hbar^{-2}=0 \\ s= \pm(-2 m E)^{1 / 2} \hbar^{-1} \tag{2.11}\\ s= \pm i(2 m E)^{1 / 2} / \hbar \tag{2.12} \end{gather} \)

where $i=\sqrt{-1}$. Using (2.8), we have

\( \begin{equation} \psi_{\mathrm{II}}=c_{1} e^{i(2 m E)^{1 / 2} x / \hbar}+c_{2} e^{-i(2 m E)^{1 / 2} x / \hbar} \tag{2.13} \end{equation} \)

Temporarily, let

\( \begin{aligned} \theta & \equiv(2 m E)^{1 / 2} x / \hbar \\ \psi_{\mathrm{II}} & =c_{1} e^{i \theta}+c_{2} e^{-i \theta} \end{aligned} \)

We have $e^{i \theta}=\cos \theta+i \sin \theta$ [Eq. (1.28)] and $e^{-i \theta}=\cos (-\theta)+i \sin (-\theta)=\cos \theta-i \sin \theta$, since

\( \begin{equation} \cos (-\theta)=\cos \theta \quad \text { and } \quad \sin (-\theta)=-\sin \theta \tag{2.14} \end{equation} \)

Therefore,

\( \begin{aligned} \psi_{\mathrm{II}} & =c_{1} \cos \theta+i c_{1} \sin \theta+c_{2} \cos \theta-i c_{2} \sin \theta \\ & =\left(c_{1}+c_{2}\right) \cos \theta+\left(i c_{1}-i c_{2}\right) \sin \theta \\ & =A \cos \theta+B \sin \theta \end{aligned} \)

where $A$ and $B$ are new arbitrary constants. Hence,

\( \begin{equation} \psi_{\mathrm{II}}=A \cos \left[\hbar^{-1}(2 m E)^{1 / 2} x\right]+B \sin \left[\hbar^{-1}(2 m E)^{1 / 2} x\right] \tag{2.15} \end{equation} \)

FIGURE 2.2 Lowest four energy levels for the particle in a one-dimensional box.

Now we find $A$ and $B$ by applying boundary conditions. It seems reasonable to postulate that the wave function will be continuous; that is, it will make no sudden jumps in value (see Fig. 3.4). If $\psi$ is to be continuous at the point $x=0$, then $\psi_{\mathrm{I}}$ and $\psi_{\mathrm{II}}$ must approach the same value at $x=0$:

\( \begin{aligned} \lim _{x \rightarrow 0} \psi_{\mathrm{I}} & =\lim _{x \rightarrow 0} \psi_{\mathrm{II}} \\ 0 & =\lim _{x \rightarrow 0}\left\{A \cos \left[\hbar^{-1}(2 m E)^{1 / 2} x\right]+B \sin \left[\hbar^{-1}(2 m E)^{1 / 2} x\right]\right\} \\ 0 & =A \end{aligned} \)

since

\( \begin{equation} \sin 0=0 \quad \text { and } \quad \cos 0=1 \tag{2.16} \end{equation} \)

With $A=0$, Eq. (2.15) becomes

\( \begin{equation} \psi_{\mathrm{II}}=B \sin \left[(2 \pi / h)(2 m E)^{1 / 2} x\right] \tag{2.17} \end{equation} \)

Applying the continuity condition at $x=l$, we get

\( \begin{equation} B \sin \left[(2 \pi / h)(2 m E)^{1 / 2} l\right]=0 \tag{2.18} \end{equation} \)

$B$ cannot be zero because this would make the wave function zero everywhere; we would have an empty box. Therefore,

\( \sin \left[(2 \pi / h)(2 m E)^{1 / 2} l\right]=0 \)

The zeros of the sine function occur at $0, \pm \pi, \pm 2 \pi, \pm 3 \pi, \ldots= \pm n \pi$. Hence,

\( \begin{equation} (2 \pi / h)(2 m E)^{1 / 2} l= \pm n \pi \tag{2.19} \end{equation} \)

The value $n=0$ is a special case. From (2.19), $n=0$ corresponds to $E=0$. For $E=0$, the roots (2.12) of the auxiliary equation are equal and (2.13) is not the complete solution of the Schrödinger equation. To find the complete solution, we return to (2.10), which for $E=0$ reads $d^{2} \psi_{\mathrm{II}} / d x^{2}=0$. Integration gives $d \psi_{\mathrm{II}} / d x=c$ and $\psi_{\mathrm{II}}=c x+d$, where $c$ and $d$ are constants. The boundary condition that $\psi_{\mathrm{II}}=0$ at $x=0$ gives $d=0$, and the condition that $\psi_{\mathrm{II}}=0$ at $x=l$ then gives $c=0$. Thus, $\psi_{\mathrm{II}}=0$ for $E=0$, and therefore $E=0$ is not an allowed energy value. Hence, $n=0$ is not allowed.

Solving (2.19) for $E$, we have

\( \begin{equation} E=\frac{n^{2} h^{2}}{8 m l^{2}} \quad n=1,2,3, \ldots \tag{2.20} \end{equation} \)

Only the energy values (2.20) allow $\psi$ to satisfy the boundary condition of continuity at $x=l$. Application of a boundary condition has forced us to the conclusion that the values of the energy are quantized (Fig. 2.2). This is in striking contrast to the classical result that the particle in the box can have any nonnegative energy. Note that there is a minimum value, greater than zero, for the energy of the particle. The state of lowest energy is called the ground state. States with energies higher than the ground-state energy are excited states. (In classical mechanics, the lowest possible energy of a particle in a box is zero. The classical particle sits motionless inside the box with zero kinetic energy and zero potential energy.)

EXAMPLE

A particle of mass $2.00 \times 10^{-26} \mathrm{~g}$ is in a one-dimensional box of length 4.00 nm . Find the frequency and wavelength of the photon emitted when this particle goes from the $n=3$ to the $n=2$ level.

By conservation of energy, the energy $h \nu$ of the emitted photon equals the energy difference between the two stationary states [Eq. (1.4); see also Section 9.9]:

\( \begin{gathered} h \nu=E_{\text {upper }}-E_{\text {lower }}=\frac{n_{u}^{2} h^{2}}{8 m l^{2}}-\frac{n_{l}^{2} h^{2}}{8 m l^{2}} \\ \nu=\frac{\left(n_{u}^{2}-n_{l}^{2}\right) h}{8 m l^{2}}=\frac{\left(3^{2}-2^{2}\right)\left(6.626 \times 10^{-34} \mathrm{~J} \mathrm{~s}\right)}{8\left(2.00 \times 10^{-29} \mathrm{~kg}\right)\left(4.00 \times 10^{-9} \mathrm{~m}\right)^{2}}=1.29 \times 10^{12} \mathrm{~s}^{-1} \end{gathered} \)

where $u$ and $l$ stand for upper and lower. Use of $\lambda \nu=c$ gives $\lambda=2.32 \times 10^{-4} \mathrm{~m}$. (A common student error is to set $h \nu$ equal to the energy of one of the states instead of the energy difference between states.)

EXERCISE For an electron in a certain one-dimensional box, the longest-wavelength transition occurs at 400 nm . Find the length of the box. (Answer: 0.603 nm .)
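
Both the example and the exercise are easy to reproduce with a few lines of Python; the helper E below implements Eq. (2.20), and the physical constants come from SciPy (an assumed dependency):

```python
import scipy.constants as const

h, m_e, c = const.h, const.m_e, const.c

def E(n, m, l):
    """Particle-in-a-box energy levels, Eq. (2.20), in SI units."""
    return n**2 * h**2 / (8 * m * l**2)

# The example: 3 -> 2 emission for m = 2.00e-29 kg in a 4.00 nm box
m, l = 2.00e-29, 4.00e-9
nu = (E(3, m, l) - E(2, m, l)) / h
print(nu, c / nu)          # ~1.29e12 s^-1 and ~2.32e-4 m

# The exercise: the longest wavelength belongs to the smallest gap, n = 1 -> 2,
# so h c / lam = 3 h^2 / (8 m_e l^2), giving l = sqrt(3 h lam / (8 m_e c))
lam = 400e-9
print((3 * h * lam / (8 * m_e * c)) ** 0.5)   # ~6.03e-10 m = 0.603 nm
```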

Substitution of (2.19) into (2.17) gives for the wave function

\( \begin{equation} \psi_{\mathrm{II}}=B \sin \left(\frac{n \pi x}{l}\right), \quad n=1,2,3, \ldots \tag{2.21} \end{equation} \)

The use of the negative sign in front of $n \pi$ does not give us another independent solution. Since $\sin (-\theta)=-\sin \theta$, we would simply get a constant, -1 , times the solution with the plus sign.

The constant $B$ in Eq. (2.21) is still arbitrary. To fix its value, we use the normalization requirement, Eqs. (1.24) and (1.22):

\( \begin{gather} \int_{-\infty}^{\infty}|\Psi|^{2} d x=\int_{-\infty}^{\infty}|\psi|^{2} d x=1 \\ \int_{-\infty}^{0}\left|\psi_{\mathrm{I}}\right|^{2} d x+\int_{0}^{l}\left|\psi_{\mathrm{II}}\right|^{2} d x+\int_{l}^{\infty}\left|\psi_{\mathrm{III}}\right|^{2} d x=1 \\ |B|^{2} \int_{0}^{l} \sin ^{2}\left(\frac{n \pi x}{l}\right) d x=1=|B|^{2} \frac{l}{2} \tag{2.22} \end{gather} \)

where the integral was evaluated by using Eq. (A.2) in the Appendix. We have

\( |B|=(2 / l)^{1 / 2} \)

Note that only the absolute value of $B$ has been found. $B$ could be $-(2 / l)^{1 / 2}$ as well as $(2 / l)^{1 / 2}$. Moreover, $B$ need not be a real number. We could use any complex number with absolute value $(2 / l)^{1 / 2}$. All we can say is that $B=(2 / l)^{1 / 2} e^{i \alpha}$, where $\alpha$ is the phase of $B$ and could be any value in the range 0 to $2 \pi$ (Section 1.7). Choosing the phase to be zero, we write as the stationary-state wave functions for the particle in a box

\( \begin{equation} \psi_{\mathrm{II}}=\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n \pi x}{l}\right), \quad n=1,2,3, \ldots \tag{2.23} \end{equation} \)

Graphs of the wave functions and the probability densities are shown in Figs. 2.3 and 2.4.

The number $n$ in the energies (2.20) and the wave functions (2.23) is called a quantum number. Each different value of the quantum number $n$ gives a different wave function and a different state.

FIGURE 2.3 Graphs of $\psi$ for the three lowest-energy particle-in-a-box states.

FIGURE 2.4 Graphs of $|\psi|^{2}$ for the lowest particle-in-a-box states.

The wave function is zero at certain points; these points are called nodes. For each increase of one in the value of the quantum number $n, \psi$ has one more node. The existence of nodes in $\psi$ and $|\psi|^{2}$ may seem surprising. Thus, for $n=2$, Fig. 2.4 says that there is zero probability of finding the particle in the center of the box at $x=l / 2$. How can the particle get from one side of the box to the other without at any time being found in the center? This apparent paradox arises from trying to understand the motion of microscopic particles using our everyday experience of the motions of macroscopic particles. However, as noted in Chapter 1, electrons and other microscopic "particles" cannot be fully and correctly described in terms of concepts of classical physics drawn from the macroscopic world.

Figure 2.4 shows that the probability of finding the particle at various places in the box is quite different from the classical result. Classically, a particle of fixed energy in a box bounces back and forth elastically between the two walls, moving at constant speed. Thus it is equally likely to be found at any point in the box. Quantum mechanically, we find a maximum in probability at the center of the box for the lowest energy level. As we go to higher energy levels with more nodes, the maxima and minima of probability come closer together, and the variations in probability along the length of the box ultimately become undetectable. For very high quantum numbers, we approach the classical result of uniform probability density.

This result, that in the limit of large quantum numbers quantum mechanics goes over into classical mechanics, is known as the Bohr correspondence principle. Since Newtonian mechanics holds for macroscopic bodies (moving at speeds much less than the speed of light), we expect nonrelativistic quantum mechanics to give the same answer as classical mechanics for macroscopic bodies. Because of the extremely small size of Planck's constant, quantization of energy is unobservable for macroscopic bodies. Since the mass of the particle and the length of the box squared appear in the denominator of Eq. (2.20), a macroscopic object in a macroscopic box having a macroscopic energy of motion would have a huge value for $n$, and hence, according to the correspondence principle, would show classical behavior.
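
As a rough numerical illustration of this point, solve Eq. (2.20) for $n$; the macroscopic values below are assumed for illustration and are not from the text:

```python
import scipy.constants as const

# Solving Eq. (2.20) for the quantum number: n = (8 m l^2 E)**0.5 / h.
# Illustrative macroscopic values (assumed, not from the text):
m, l, E = 1.0e-3, 1.0e-2, 1.0e-3    # a 1 g particle in a 1 cm box with 1 mJ of energy
n = (8 * m * l**2 * E) ** 0.5 / const.h
print(n)                             # ~4e28, an enormously large quantum number
```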

We have a whole set of wave functions, each corresponding to a different energy and characterized by the quantum number $n$, which is a positive integer. Let the subscript $i$ denote a particular wave function with the value $n_{i}$ for its quantum number:

\( \begin{gathered} \psi_{i}=\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n_{i} \pi x}{l}\right), \quad 0<x<l \\ \psi_{i}=0 \quad \text { elsewhere } \end{gathered} \)

Since the wave function has been normalized, we have

\( \begin{equation} \int_{-\infty}^{\infty} \psi_{i}^{*} \psi_{j} d x=1 \quad \text { if } i=j \tag{2.24} \end{equation} \)

We now ask for the value of this integral when we use wave functions corresponding to different energy levels:

\( \int_{-\infty}^{\infty} \psi_{i}^{*} \psi_{j} d x=\int_{0}^{l}\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n_{i} \pi x}{l}\right)\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n_{j} \pi x}{l}\right) d x, \quad n_{i} \neq n_{j} \)

Use of Eq. (A.5) in the Appendix gives

\( \begin{equation} \int_{-\infty}^{\infty} \psi_{i}^{*} \psi_{j} d x=\frac{2}{l}\left[\frac{\sin \left[\left(n_{i}-n_{j}\right) \pi\right]}{2\left(n_{i}-n_{j}\right) \pi / l}-\frac{\sin \left[\left(n_{i}+n_{j}\right) \pi\right]}{2\left(n_{i}+n_{j}\right) \pi / l}\right]=0 \tag{2.25} \end{equation} \)

since $\sin m \pi=0$ for $m$ an integer. We thus have

\( \begin{equation} \int_{-\infty}^{\infty} \psi_{i}^{*} \psi_{j} d x=0, \quad i \neq j \tag{2.26} \end{equation} \)

When (2.26) holds, the functions $\psi_{i}$ and $\psi_{j}$ are said to be orthogonal to each other for $i \neq j$. We can combine (2.24) and (2.26) by writing

\( \begin{equation} \int_{-\infty}^{\infty} \psi_{i}^{*} \psi_{j} d x=\delta_{i j} \tag{2.27} \end{equation} \)

The symbol $\delta_{i j}$ is called the Kronecker delta (after a mathematician). It equals 1 when the two indexes $i$ and $j$ are equal, and it equals 0 when $i$ and $j$ are unequal:

\( \delta_{i j} \equiv \begin{cases}0 & \text { for } i \neq j \tag{2.28}\\ 1 & \text { for } i=j\end{cases} \)

The property (2.27) of the wave functions is called orthonormality. We proved orthonormality only for the particle-in-a-box wave functions. We shall prove it more generally in Section 7.2.
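
The orthonormality relation (2.27) for the functions (2.23) can be confirmed numerically. A minimal sketch, assuming NumPy and SciPy and taking $l=1$ for convenience:

```python
import numpy as np
from scipy.integrate import quad

l = 1.0
psi = lambda n, x: (2 / l) ** 0.5 * np.sin(n * np.pi * x / l)   # Eq. (2.23)

for ni in (1, 2, 3):
    for nj in (1, 2, 3):
        val, _ = quad(lambda x: psi(ni, x) * psi(nj, x), 0, l)  # psi is real, so psi* = psi
        print(ni, nj, round(val, 10))    # ~1 for ni == nj, ~0 otherwise  [Eq. (2.27)]
```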

You might be puzzled by Eq. (2.26) and wonder why we would want to multiply the wave function of one state by the wave function of a different state. We will later see (Section 7.3, for example) that it is often helpful to use equations that contain a sum involving all the wave functions of a system, and such equations can lead to integrals like that in (2.26).

A more rigorous way to look at the particle in a box with infinite walls is to first treat the particle in a box with a finite jump in potential energy at the walls and then take the limit as the jump in $V$ becomes infinite. The results, when the limit is taken, will be the same as (2.20) and (2.23) (see Prob. 2.22).

We have considered only the stationary states of the particle in a one-dimensional box. For an example of a nonstationary state of this system, see the example near the end of Section 7.8.

Some online computer simulations of the particle in a box can be found at www.chem.uci.edu/undergraduate/applets/dwell/dwell.htm (shows the effects on the wave functions and energy levels when a barrier of variable height and width is introduced into the middle of the box); web.williams.edu/wp-etc/chemistry/dbingemann/Chem153/particle.html (shows quantization by plotting the solution to the Schrödinger equation as the energy is varied and as the box length is varied); and falstad.com/qm1d/ (shows both time-independent and time-dependent states; see Prob. 7.47).

The Free Particle in One Dimension

By a free particle, we mean a particle subject to no forces whatever. For a free particle, integration of (1.12) shows that the potential energy remains constant no matter what the value of $x$ is. Since the choice of the zero level of energy is arbitrary, we may set $V(x)=0$. The Schrödinger equation (1.19) becomes

\( \begin{equation} \frac{d^{2} \psi}{d x^{2}}+\frac{2 m}{\hbar^{2}} E \psi=0 \tag{2.29} \end{equation} \)

Equation (2.29) is the same as Eq. (2.10) (except for the boundary conditions). Therefore, the general solution of $(2.29)$ is (2.13):

\( \begin{equation} \psi=c_{1} e^{i(2 m E)^{1 / 2} x / \hbar}+c_{2} e^{-i(2 m E)^{1 / 2} x / \hbar} \tag{2.30} \end{equation} \)

What boundary condition might we impose? It seems reasonable to postulate (since $\psi^{*} \psi d x$ represents a probability) that $\psi$ will remain finite as $x$ goes to $\pm \infty$. If the energy $E$ is less than zero, then this boundary condition will be violated, since for $E<0$ we have

\( i(2 m E)^{1 / 2}=i(-2 m|E|)^{1 / 2}=i \cdot i \cdot(2 m|E|)^{1 / 2}=-(2 m|E|)^{1 / 2} \)

and therefore the first term in (2.30) will become infinite as $x$ approaches minus infinity. Similarly, if $E$ is negative, the second term in (2.30) becomes infinite as $x$ approaches plus infinity. Thus the boundary condition requires

\( \begin{equation} E \geq 0 \tag{2.31} \end{equation} \)

for the free particle. The wave function is oscillatory and is a linear combination of a sine and a cosine term [Eq. (2.15)]. For the free particle, the energy is not quantized; all nonnegative energies are allowed. Since we set $V=0$, the energy $E$ is in this case all kinetic energy. If we try to evaluate the arbitrary constants $c_{1}$ and $c_{2}$ by normalization, we will find that the integral $\int_{-\infty}^{\infty} \psi^{*}(x) \psi(x) d x$ is infinite. In other words, the free-particle wave function is not normalizable in the usual sense. This is to be expected on physical grounds because there is no reason for the probability of finding the free particle to approach zero as $x$ goes to $\pm \infty$.

The free-particle problem is an unreal situation because we could not actually have a particle that had no interaction with any other particle in the universe.

Particle in a Rectangular Well

Consider a particle in a one-dimensional box with walls of finite height (Fig. 2.5a). The potential-energy function is $V=V_{0}$ for $x<0$, $V=0$ for $0 \leq x \leq l$, and $V=V_{0}$ for $x>l$. There are two cases to examine, depending on whether the particle's energy $E$ is less than or greater than $V_{0}$.

FIGURE 2.5 (a) Potential energy for a particle in a one-dimensional rectangular well. (b) The ground-state wave function for this potential. (c) The first excited-state wave function.

We first consider $E<V_{0}$. The Schrödinger equation (1.19) in regions I and III is $d^{2} \psi / d x^{2}+\left(2 m / \hbar^{2}\right)\left(E-V_{0}\right) \psi=0$. This is a linear homogeneous differential equation with constant coefficients, and the auxiliary equation (2.7) is $s^{2}+\left(2 m / \hbar^{2}\right)\left(E-V_{0}\right)=0$ with roots $s= \pm\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2}$. Therefore,

\( \begin{aligned} \psi_{\mathrm{I}} & =C \exp \left[\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2} x\right]+D \exp \left[-\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2} x\right] \\ \psi_{\mathrm{III}} & =F \exp \left[\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2} x\right]+G \exp \left[-\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2} x\right] \end{aligned} \)

where $C, D, F$, and $G$ are constants. As in Section 2.3, we must prevent $\psi_{\mathrm{I}}$ from becoming infinite as $x \rightarrow-\infty$. Since we are assuming $E<V_{0}$, the quantity $\left(V_{0}-E\right)^{1 / 2}$ is a real, positive number, and to keep $\psi_{\mathrm{I}}$ finite as $x \rightarrow-\infty$, we must have $D=0$. Similarly, to keep $\psi_{\mathrm{III}}$ finite as $x \rightarrow+\infty$, we must have $F=0$. Therefore,

\( \psi_{\mathrm{I}}=C \exp \left[\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2} x\right], \quad \psi_{\mathrm{III}}=G \exp \left[-\left(2 m / \hbar^{2}\right)^{1 / 2}\left(V_{0}-E\right)^{1 / 2} x\right] \)

In region II, $V=0$, the Schrödinger equation is (2.10) and its solution is (2.15):

\( \begin{equation} \psi_{\mathrm{II}}=A \cos \left[\left(2 m / \hbar^{2}\right)^{1 / 2} E^{1 / 2} x\right]+B \sin \left[\left(2 m / \hbar^{2}\right)^{1 / 2} E^{1 / 2} x\right] \tag{2.32} \end{equation} \)

To complete the problem, we must apply the boundary conditions. As with the particle in a box with infinite walls, we require the wave function to be continuous at $x=0$ and at $x=l$; so $\psi_{\mathrm{I}}(0)=\psi_{\mathrm{II}}(0)$ and $\psi_{\mathrm{II}}(l)=\psi_{\mathrm{III}}(l)$. The wave function has four arbitrary constants, so more than these two boundary conditions are needed. As well as requiring $\psi$ to be continuous, we shall require that its derivative $d \psi / d x$ be continuous everywhere. To justify this requirement, we note that if $d \psi / d x$ changed discontinuously at a point, then its derivative (its instantaneous rate of change) $d^{2} \psi / d x^{2}$ would become infinite at that point. However, for the particle in a rectangular well, the Schrödinger equation $d^{2} \psi / d x^{2}=\left(2 m / \hbar^{2}\right)(V-E) \psi$ does not contain anything infinite on the right side, so $d^{2} \psi / d x^{2}$ cannot become infinite. [For a more rigorous argument, see D. Branson, Am. J. Phys., 47, 1000 (1979).] Therefore, $d \psi_{\mathrm{I}} / d x=d \psi_{\mathrm{II}} / d x$ at $x=0$ and $d \psi_{\mathrm{II}} / d x=d \psi_{\mathrm{III}} / d x$ at $x=l$.

From $\psi_{\mathrm{I}}(0)=\psi_{\mathrm{II}}(0)$, we get $C=A$. From $\psi_{\mathrm{I}}^{\prime}(0)=\psi_{\mathrm{II}}^{\prime}(0)$, we get (Prob. 2.21a) $B=\left(V_{0}-E\right)^{1 / 2} A / E^{1 / 2}$. From $\psi_{\mathrm{II}}(l)=\psi_{\mathrm{III}}(l)$, we get a complicated equation that allows $G$ to be found in terms of $A$. The constant $A$ is found by normalization.

Taking $\psi_{\mathrm{II}}^{\prime}(l)=\psi_{\mathrm{III}}^{\prime}(l)$, dividing it by $\psi_{\mathrm{II}}(l)=\psi_{\mathrm{III}}(l)$, and expressing $B$ in terms of $A$, we get the following equation for the energy levels (Prob. 2.21b):

\( \begin{equation} \left(2 E-V_{0}\right) \sin \left[(2 m E)^{1 / 2} l / \hbar\right]=2\left(V_{0} E-E^{2}\right)^{1 / 2} \cos \left[(2 m E)^{1 / 2} l / \hbar\right] \tag{2.33} \end{equation} \)

[Although $E=0$ satisfies (2.33), it is not an allowed energy value, since it gives $\psi=0$ (Prob. 2.30).] Defining the dimensionless constants $\varepsilon$ and $b$ as

\( \begin{equation} \varepsilon \equiv E / V_{0} \quad \text { and } \quad b \equiv\left(2 m V_{0}\right)^{1 / 2} l / \hbar \tag{2.34} \end{equation} \)

we divide (2.33) by $V_{0}$ to get

\( \begin{equation} (2 \varepsilon-1) \sin \left(b \varepsilon^{1 / 2}\right)-2\left(\varepsilon-\varepsilon^{2}\right)^{1 / 2} \cos \left(b \varepsilon^{1 / 2}\right)=0 \tag{2.35} \end{equation} \)

Only the particular values of $E$ that satisfy (2.33) give a wave function that is continuous and has a continuous derivative, so the energy levels are quantized for $E<V_{0}$. To find the allowed energy levels, we can plot the left side of (2.35) versus $\varepsilon$ for $0<\varepsilon<1$ and find the points where the curve crosses the horizontal axis (see also Prob. 4.31c). A detailed study (Merzbacher, Section 6.8) shows that the number of allowed energy levels with $E<V_{0}$ is $N$, where $N$ satisfies

\( \begin{equation} N-1<b / \pi \leq N, \quad \text { where } b \equiv\left(2 m V_{0}\right)^{1 / 2} l / \hbar \tag{2.36} \end{equation} \)

For example, if $V_{0}=h^{2} / m l^{2}$, then $b / \pi=2\left(2^{1 / 2}\right)=2.83$, and $N=3$.
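
The roots of (2.35) can be located numerically by scanning for sign changes and refining each bracket with a standard root finder. A sketch, assuming SciPy, for the case $b / \pi=2.83$ just discussed:

```python
import numpy as np
from scipy.optimize import brentq

b = 2 * np.sqrt(2) * np.pi      # the case V0 = h^2/(m l^2), so b/pi = 2.83

def f(eps):                      # left side of Eq. (2.35), with eps = E/V0
    return ((2 * eps - 1) * np.sin(b * np.sqrt(eps))
            - 2 * np.sqrt(eps - eps**2) * np.cos(b * np.sqrt(eps)))

# Scan (0, 1) for sign changes, then polish each bracketed root with brentq.
eps = np.linspace(1e-6, 1 - 1e-6, 2000)
vals = f(eps)
roots = [brentq(f, eps[i], eps[i + 1])
         for i in range(len(eps) - 1) if vals[i] * vals[i + 1] < 0]
print(roots)                     # three values of E/V0, consistent with N = 3
```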

Figure 2.5 shows $\psi$ for the lowest two energy levels. The wave function is oscillatory inside the box and dies off exponentially outside the box. It turns out that the number of nodes increases by one for each higher level.

So far we have considered only states with $E<V_{0}$. For $E>V_{0}$, the quantity $\left(V_{0}-E\right)^{1 / 2}$ is imaginary, and instead of dying off to zero as $x$ goes to $\pm \infty$, $\psi_{\mathrm{I}}$ and $\psi_{\mathrm{III}}$ oscillate (similar to the free-particle $\psi$). We no longer have any reason to set $D$ in $\psi_{\mathrm{I}}$ and $F$ in $\psi_{\mathrm{III}}$ equal to zero, and with these additional constants available to satisfy the boundary conditions on $\psi$ and $\psi^{\prime}$, one finds that $E$ need not be restricted to obtain properly behaved wave functions. Therefore, all energies above $V_{0}$ are allowed.

A state in which $\psi \rightarrow 0$ as $x \rightarrow \infty$ and as $x \rightarrow-\infty$ is called a bound state. For a bound state, significant probability for finding the particle exists in only a finite region of space. For an unbound state, $\psi$ does not go to zero as $x \rightarrow \pm \infty$ and is not normalizable. For the particle in a rectangular well, states with $E<V_{0}$ are bound, and states with $E>V_{0}$ are unbound. For the particle in a box with infinitely high walls, all states are bound. For the free particle, all states are unbound.

For an online simulation of the particle in a well, go to www.falstad.com/qm1d and choose Finite Well in the Setup box. You can vary the well width and depth and see the effect on the energy levels and wave functions.

Tunneling

For the particle in a rectangular well (Section 2.4), Fig. 2.5 and the equations for $\psi_{\mathrm{I}}$ and $\psi_{\mathrm{III}}$ show that for the bound states there is a nonzero probability of finding the particle in regions I and III, where its total energy $E$ is less than its potential energy $V=V_{0}$. Classically, this behavior is not allowed. The classical equations $E=T+V$ and $T \geq 0$, where $T$ is the kinetic energy, mean that $E$ cannot be less than $V$ in classical mechanics.

Consider a particle in a one-dimensional box with walls of finite height and finite thickness (Fig. 2.6). Classically, the particle cannot escape from the box unless its energy is greater than the potential-energy barrier $V_{0}$. However, a quantum-mechanical treatment (which is omitted) shows that there is a finite probability for a particle of total energy less than $V_{0}$ to be found outside the box.

The term tunneling denotes the penetration of a particle into a classically forbidden region (as in Fig. 2.5) or the passage of a particle through a potential-energy barrier whose height exceeds the particle's energy. Since tunneling is a quantum effect, its probability of occurrence is greater the less classical the behavior of the particle is. Therefore, tunneling is most prevalent with particles of small mass. (Note that the greater the mass $m$, the more rapidly the functions $\psi_{\mathrm{I}}$ and $\psi_{\mathrm{III}}$ of Section 2.4 die away to zero.) Electrons tunnel quite readily. Hydrogen atoms and ions tunnel more readily than heavier atoms.

The emission of alpha particles from a radioactive nucleus involves tunneling of the alpha particles through the potential-energy barrier produced by the short-range attractive nuclear forces and the Coulombic repulsive force between the daughter nucleus and the alpha particle. The $\mathrm{NH}_{3}$ molecule is pyramidal. There is a potential-energy barrier to inversion of the molecule, with the potential-energy maximum occurring at the planar configuration. The hydrogen atoms can tunnel through this barrier, thereby inverting the molecule. In $\mathrm{CH}_{3} \mathrm{CH}_{3}$ there is a barrier to internal rotation, with a potential-energy maximum at the eclipsed position of the hydrogens. The hydrogens can tunnel through this barrier from one staggered position to the next. Tunneling of electrons is important in oxidation-reduction reactions and in electrode processes. Tunneling usually contributes significantly to the rate of chemical reactions that involve transfer of hydrogen atoms. See R. P. Bell, The Tunnel Effect in Chemistry, Chapman \& Hall, 1980.

FIGURE 2.6 Potential energy for a particle in a one-dimensional box of finite height and thickness.

Tunneling of H atoms occurs in some enzyme-catalyzed reactions; see Quantum Tunnelling in Enzyme-Catalyzed Reactions, R. Allemann and N. Scrutton (eds.), RSC Publishing, 2009.

The scanning tunneling microscope, invented in 1981, uses the tunneling of electrons through the space between the extremely fine tip of a metal wire and the surface of an electrically conducting solid to produce images of individual atoms on the solid's surface. A small voltage is applied between the solid and the wire, and as the tip is moved across the surface at a height of a few angstroms, the tip height is adjusted to keep the current flow constant. A plot of tip height versus position gives an image of the surface.

Quantum Mechanical Operators

Click the keywords to read more about them.

Operator: A rule that transforms a given function into another function. Operators are fundamental in quantum mechanics for describing physical quantities and their interactions.

Energy Operator: An operator that, when applied to the wave function, returns the wave function multiplied by an allowed value of the energy.

Differentiation Operator: An operator that differentiates a function with respect to a variable, often denoted with a circumflex (e.g., $\hat{D}$).

Sum and Difference of Operators: The sum of two operators $\hat{A}$ and $\hat{B}$ applied to a function $f(x)$ is defined as $(\hat{A}+\hat{B}) f(x)=\hat{A} f(x)+\hat{B} f(x)$. The difference is similarly defined.

Product of Operators: The product of two operators $\hat{A}$ and $\hat{B}$ applied to a function $f(x)$ is defined as $\hat{A} \hat{B} f(x)=\hat{A}(\hat{B} f(x))$.

Operator Algebra: A set of rules and operations for manipulating operators, including the associative law of multiplication and commutative properties.

Commutator: The commutator of two operators $\hat{A}$ and $\hat{B}$ is defined as $[\hat{A}, \hat{B}]=\hat{A} \hat{B}-\hat{B} \hat{A}$. If the commutator is zero, the operators are said to commute.

Eigenfunction: A function $f(x)$ that, when operated on by a linear operator $\hat{A}$, results in the function being multiplied by a constant $k$. This constant is called the eigenvalue.

Eigenvalue: The constant $k$ that results from operating on an eigenfunction with a linear operator. It represents a characteristic value associated with the operator.

Hamiltonian Operator: The operator corresponding to the total energy of a system, often used in the Schrödinger equation to find the energy eigenvalues and eigenfunctions.

Linear Operator: An operator $\hat{A}$ that satisfies $\hat{A}[f(x)+g(x)]=\hat{A} f(x)+\hat{A} g(x)$ and $\hat{A}[c f(x)]=c \hat{A} f(x)$, where $f$ and $g$ are functions and $c$ is a constant.

Laplacian Operator: An operator denoted by $\nabla^{2}$ (del squared), which is the sum of the second partial derivatives with respect to each spatial coordinate.

Expectation Value: The average value of a physical property $B$ for a system in state $\psi$, calculated as $\langle B\rangle=\int \psi^{*} \hat{B} \psi \, d \tau$, where $\hat{B}$ is the operator corresponding to $B$.

Normalization: The process of adjusting the wave function $\psi$ so that the total probability of finding the particle is 1. This ensures that $\int|\psi|^{2} \, d \tau=1$.

Quadratically Integrable: A function $\psi$ is quadratically integrable if $\int|\psi|^{2} \, d \tau$ is finite. This property is necessary for the function to be normalized.

We now develop the theory of quantum mechanics in a more general way than previously. We begin by writing the one-particle, one-dimensional, time-independent Schrödinger equation (1.19) in the form

\(
\begin{equation}
\left[-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+V(x)\right] \psi(x)=E \psi(x) \tag{3.1}
\end{equation}
\)

The entity in brackets in (3.1) is an operator. Equation (3.1) suggests that we have an energy operator, which, operating on the wave function, gives us the wave function back again, but multiplied by an allowed value of the energy. We therefore discuss operators.

An operator is a rule that transforms a given function into another function. For example, let $\hat{D}$ be the operator that differentiates a function with respect to $x$. We use a circumflex to denote an operator. Provided $f(x)$ is differentiable, the result of operating on $f(x)$ with $\hat{D}$ is $\hat{D} f(x)=f^{\prime}(x)$. For example, $\hat{D}\left(x^{2}+3 e^{2 x}\right)=2 x+6 e^{2 x}$. If $\hat{3}$ is the operator that multiplies a function by 3 , then $\hat{3}\left(x^{2}+3 e^{x}\right)=3 x^{2}+9 e^{x}$. If tan is the operator that takes the tangent of a function, then application of $\tan$ to the function $x^{2}+1$ gives $\tan \left(x^{2}+1\right)$. If the operator $\hat{A}$ transforms the function $f(x)$ into the function $g(x)$, we write $\hat{A} f(x)=g(x)$.

We define the sum and the difference of two operators $\hat{A}$ and $\hat{B}$ by

\(
\begin{align}
& (\hat{A}+\hat{B}) f(x) \equiv \hat{A} f(x)+\hat{B} f(x) \tag{3.2}\\
& (\hat{A}-\hat{B}) f(x) \equiv \hat{A} f(x)-\hat{B} f(x)
\end{align}
\)

For example, if $\hat{D} \equiv d / d x$, then

\(
(\hat{D}+\hat{3})\left(x^{3}-5\right) \equiv \hat{D}\left(x^{3}-5\right)+\hat{3}\left(x^{3}-5\right)=3 x^{2}+\left(3 x^{3}-15\right)=3 x^{3}+3 x^{2}-15
\)

An operator can involve more than one variable. For example, the operator $\partial^{2} / \partial x^{2}+\partial^{2} / \partial y^{2}$ has the following effect:

\(
\left(\partial^{2} / \partial x^{2}+\partial^{2} / \partial y^{2}\right) g(x, y)=\partial^{2} g / \partial x^{2}+\partial^{2} g / \partial y^{2}
\)

The product of two operators $\hat{A}$ and $\hat{B}$ is defined by

\(
\begin{equation}
\hat{A} \hat{B} f(x) \equiv \hat{A}[\hat{B} f(x)] \tag{3.3}
\end{equation}
\)

In other words, we first operate on $f(x)$ with the operator on the right of the operator product, and then we take the resulting function and operate on it with the operator on the left of the operator product. For example, $\hat{3} \hat{D} f(x)=\hat{3}[\hat{D} f(x)]=\hat{3} f^{\prime}(x)=3 f^{\prime}(x)$.

The operators $\hat{A} \hat{B}$ and $\hat{B} \hat{A}$ may not have the same effect. Consider, for example, the operators $d / d x$ and $\hat{x}$ (where $\hat{x}$ means multiplication by $x$ ):

\(
\begin{gather}
\hat{D} \hat{x} f(x)=\frac{d}{d x}[x f(x)]=f(x)+x f^{\prime}(x)=(\hat{1}+\hat{x} \hat{D}) f(x) \tag{3.4}\\
\hat{x} \hat{D} f(x)=\hat{x}\left[\frac{d}{d x} f(x)\right]=x f^{\prime}(x)
\end{gather}
\)

Thus $\hat{A} \hat{B}$ and $\hat{B} \hat{A}$ are different operators in this case.
We can develop an operator algebra as follows. Two operators $\hat{A}$ and $\hat{B}$ are said to be equal if $\hat{A} f=\hat{B} f$ for all functions $f$. Equal operators produce the same result when they operate on a given function. For example, (3.4) shows that

\(
\begin{equation}
\hat{D} \hat{x}=1+\hat{x} \hat{D} \tag{3.5}
\end{equation}
\)

The operator $\hat{1}$ (multiplication by 1) is the unit operator. The operator $\hat{0}$ (multiplication by 0 ) is the null operator. We usually omit the circumflex over operators that are simply multiplication by a constant. We can transfer operators from one side of an operator equation to the other (Prob. 3.7). Thus (3.5) is equivalent to $\hat{D} \hat{x}-\hat{x} \hat{D}-1=0$, where circumflexes over the null and unit operators were omitted.

Operators obey the associative law of multiplication:

\(
\begin{equation}
\hat{A}(\hat{B} \hat{C})=(\hat{A} \hat{B}) \hat{C} \tag{3.6}
\end{equation}
\)

The proof of (3.6) is outlined in Prob. 3.10. As an example, let $\hat{A}=d / d x, \hat{B}=\hat{x}$, and $\hat{C}=3$. Using (3.5), we have

\(
\begin{array}{ll}
(\hat{A} \hat{B})=\hat{D} \hat{x}=1+\hat{x} \hat{D}, & {[(\hat{A} \hat{B}) \hat{C}] f=(1+\hat{x} \hat{D}) 3 f=3 f+3 x f^{\prime}} \\
(\hat{B} \hat{C})=3 \hat{x}, & {[\hat{A}(\hat{B} \hat{C})] f=\hat{D}(3 x f)=3 f+3 x f^{\prime}}
\end{array}
\)

A major difference between operator algebra and ordinary algebra is that numbers obey the commutative law of multiplication, but operators do not necessarily do so; $a b=b a$ if $a$ and $b$ are numbers, but $\hat{A} \hat{B}$ and $\hat{B} \hat{A}$ are not necessarily equal operators. We define the commutator $[\hat{A}, \hat{B}]$ of the operators $\hat{A}$ and $\hat{B}$ as the operator $\hat{A} \hat{B}-\hat{B} \hat{A}$ :

\(
\begin{equation}
[\hat{A}, \hat{B}] \equiv \hat{A} \hat{B}-\hat{B} \hat{A} \tag{3.7}
\end{equation}
\)

If $\hat{A} \hat{B}=\hat{B} \hat{A}$, then $[\hat{A}, \hat{B}]=0$, and we say that $\hat{A}$ and $\hat{B}$ commute. If $\hat{A} \hat{B} \neq \hat{B} \hat{A}$, then $\hat{A}$ and $\hat{B}$ do not commute. Note that $[\hat{A}, \hat{B}] f=\hat{A} \hat{B} f-\hat{B} \hat{A} f$. Since the order in which we apply the operators 3 and $d / d x$ makes no difference, we have

\(
\left[\hat{3}, \frac{d}{d x}\right]=\hat{3} \frac{d}{d x}-\frac{d}{d x} \hat{3}=0
\)

From Eq. (3.5) we have

\(
\begin{equation}
\left[\frac{d}{d x}, \hat{x}\right]=\hat{D} \hat{x}-\hat{x} \hat{D}=1 \tag{3.8}
\end{equation}
\)

The operators $d / d x$ and $\hat{x}$ do not commute.
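
The commutation relation (3.8) can be checked symbolically by applying both operator orderings to an arbitrary function. A minimal sketch, assuming SymPy:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)

# Apply D-hat x-hat and x-hat D-hat to an arbitrary f(x) and subtract.
comm = sp.diff(x * f, x) - x * sp.diff(f, x)
print(sp.simplify(comm))    # f(x): the commutator [d/dx, x] is the unit operator, Eq. (3.8)
```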

EXAMPLE

Find $\left[z^{3}, d / d z\right]$.
To find $\left[z^{3}, d / d z\right]$, we apply this operator to an arbitrary function $g(z)$. Using the commutator definition (3.7) and the definitions of the difference and product of two operators, we have

\(
\begin{aligned}
{\left[z^{3}, d / d z\right] g=\left[z^{3}(d / d z)-(d / d z) z^{3}\right] g } & =z^{3}(d / d z) g-(d / d z)\left(z^{3} g\right) \\
& =z^{3} g^{\prime}-3 z^{2} g-z^{3} g^{\prime}=-3 z^{2} g
\end{aligned}
\)

Deleting the arbitrary function $g$, we get the operator equation $\left[z^{3}, d / d z\right]=-3 z^{2}$.

EXERCISE Find $\left[d / d x, 5 x^{2}+3 x+4\right]$. (Answer: $10 x+3$.)

The square of an operator is defined as the product of the operator with itself: $\hat{B}^{2}=\hat{B} \hat{B}$. Let us find the square of the differentiation operator:

\(
\begin{aligned}
\hat{D}^{2} f(x) & =\hat{D}(\hat{D} f)=\hat{D} f^{\prime}=f^{\prime \prime} \\
\hat{D}^{2} & =d^{2} / d x^{2}
\end{aligned}
\)

As another example, the square of the operator that takes the complex conjugate of a function is equal to the unit operator, since taking the complex conjugate twice gives the original function. The operator $\hat{B}^{n}(n=1,2,3, \ldots)$ is defined to mean applying the operator $\hat{B} n$ times in succession.

It turns out that the operators occurring in quantum mechanics are linear. $\hat{A}$ is a linear operator if and only if it has the following two properties:

\(
\begin{equation}
\hat{A}[f(x)+g(x)]=\hat{A} f(x)+\hat{A} g(x) \tag{3.9}
\end{equation}
\)

\(
\begin{equation}
\hat{A}[c f(x)]=c \hat{A} f(x) \tag{3.10}
\end{equation}
\)

where $f$ and $g$ are arbitrary functions and $c$ is an arbitrary constant (not necessarily real). Examples of linear operators include $\hat{x}^{2}, d / d x$, and $d^{2} / d x^{2}$. Some nonlinear operators are cos and ()$^{2}$, where ()$^{2}$ squares the function it acts on.

EXAMPLE

Is $d / d x$ a linear operator? Is $\sqrt{ }$ a linear operator?
We have

\(
\begin{gathered}
(d / d x)[f(x)+g(x)]=d f / d x+d g / d x=(d / d x) f(x)+(d / d x) g(x) \\
(d / d x)[c f(x)]=c d f(x) / d x
\end{gathered}
\)

so $d / d x$ obeys (3.9) and (3.10) and is a linear operator. However,

\(
\sqrt{f(x)+g(x)} \neq \sqrt{f(x)}+\sqrt{g(x)}
\)

so $\sqrt{ }$ does not obey (3.9) and is nonlinear.
EXERCISE Is the operator $x^{2} \times$ (multiplication by $x^{2}$ ) linear? (Answer: Yes.)

Useful identities in linear-operator manipulations are

\(
\begin{equation}
(\hat{A}+\hat{B}) \hat{C}=\hat{A} \hat{C}+\hat{B} \hat{C} \tag{3.11}
\end{equation}
\)

\(
\begin{equation}
\hat{A}(\hat{B}+\hat{C})=\hat{A} \hat{B}+\hat{A} \hat{C} \tag{3.12}
\end{equation}
\)

EXAMPLE

Prove the distributive law (3.11) for linear operators.
A good way to begin a proof is to first write down what is given and what is to be proved. We are given that $\hat{A}, \hat{B}$, and $\hat{C}$ are linear operators. We must prove that $(\hat{A}+\hat{B}) \hat{C}=\hat{A} \hat{C}+\hat{B} \hat{C}$.

To prove that the operator $(\hat{A}+\hat{B}) \hat{C}$ is equal to the operator $\hat{A} \hat{C}+\hat{B} \hat{C}$, we must prove that these two operators give the same result when applied to an arbitrary function $f$. Thus we must prove that

\(
[(\hat{A}+\hat{B}) \hat{C}] f=(\hat{A} \hat{C}+\hat{B} \hat{C}) f
\)

We start with $[(\hat{A}+\hat{B}) \hat{C}] f$. This expression involves the product of the two operators $\hat{A}+\hat{B}$ and $\hat{C}$. The operator-product definition (3.3) with $\hat{A}$ replaced by $\hat{A}+\hat{B}$ and $\hat{B}$ replaced by $\hat{C}$ gives $[(\hat{A}+\hat{B}) \hat{C}] f=(\hat{A}+\hat{B})(\hat{C} f)$. The entity $\hat{C} f$ is a function, and use of the definition (3.2) of the sum $\hat{A}+\hat{B}$ of the two operators $\hat{A}$ and $\hat{B}$ gives $(\hat{A}+\hat{B})(\hat{C} f)=\hat{A}(\hat{C} f)+\hat{B}(\hat{C} f)$. Thus

\(
[(\hat{A}+\hat{B}) \hat{C}] f=(\hat{A}+\hat{B})(\hat{C} f)=\hat{A}(\hat{C} f)+\hat{B}(\hat{C} f)
\)

Use of the operator-product definition (3.3) gives $\hat{A}(\hat{C} f)=\hat{A} \hat{C} f$ and $\hat{B}(\hat{C} f)=\hat{B} \hat{C} f$. Hence

\(
\begin{equation}
[(\hat{A}+\hat{B}) \hat{C}] f=\hat{A} \hat{C} f+\hat{B} \hat{C} f \tag{3.13}
\end{equation}
\)

Use of the operator-sum definition (3.2) with $\hat{A}$ replaced by $\hat{A} \hat{C}$ and $\hat{B}$ replaced by $\hat{B} \hat{C}$ gives $(\hat{A} \hat{C}+\hat{B} \hat{C}) f=\hat{A} \hat{C} f+\hat{B} \hat{C} f$, so (3.13) becomes

\(
[(\hat{A}+\hat{B}) \hat{C}] f=(\hat{A} \hat{C}+\hat{B} \hat{C}) f
\)

which is what we wanted to prove. Hence $(\hat{A}+\hat{B}) \hat{C}=\hat{A} \hat{C}+\hat{B} \hat{C}$.
Note that we did not need to use the linearity of $\hat{A}, \hat{B}$, and $\hat{C}$. Hence (3.11) holds for all operators. However, (3.12) holds only if $\hat{A}$ is linear (see Prob. 3.17).

EXAMPLE

Find the square of the operator $d / d x+\hat{x}$.
To find the effect of $(d / d x+\hat{x})^{2}$, we apply this operator to an arbitrary function $f(x)$. Letting $\hat{D} \equiv d / d x$, we have

\(
\begin{aligned}
(\hat{D}+\hat{x})^{2} f(x)= & (\hat{D}+\hat{x})[(\hat{D}+\hat{x}) f]=(\hat{D}+\hat{x})\left(f^{\prime}+x f\right) \\
= & f^{\prime \prime}+f+x f^{\prime}+x f^{\prime}+x^{2} f=\left(\hat{D}^{2}+2 \hat{x} \hat{D}+\hat{x}^{2}+1\right) f(x) \\
& (\hat{D}+\hat{x})^{2}=\hat{D}^{2}+2 \hat{x} \hat{D}+\hat{x}^{2}+1
\end{aligned}
\)

Let us repeat this calculation, using only operator equations:

\(
\begin{aligned}
(\hat{D}+\hat{x})^{2} & =(\hat{D}+\hat{x})(\hat{D}+\hat{x})=\hat{D}(\hat{D}+\hat{x})+\hat{x}(\hat{D}+\hat{x}) \\
& =\hat{D}^{2}+\hat{D} \hat{x}+\hat{x} \hat{D}+\hat{x}^{2}=\hat{D}^{2}+\hat{x} \hat{D}+1+\hat{x} \hat{D}+\hat{x}^{2} \\
& =\hat{D}^{2}+2 x \hat{D}+x^{2}+1
\end{aligned}
\)

where (3.11), (3.12), and (3.5) have been used and the circumflex over the operator "multiplication by $x$ " has been omitted. Until you have become thoroughly experienced with operators, it is safest when doing operator manipulations always to let the operator operate on an arbitrary function $f$ and then delete $f$ at the end.

EXERCISE Find $\left(d^{2} / d x^{2}+x\right)^{2}$. (Answer: $d^{4} / d x^{4}+2 x d^{2} / d x^{2}+2 d / d x+x^{2}$.)
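
These operator manipulations are easy to check symbolically. Below is a minimal sketch, assuming Python with the SymPy library is available; it applies $d/dx + x$ twice to an arbitrary $f(x)$ and confirms the expansion found in the example above.

```python
# Sketch (SymPy assumed): apply the operator d/dx + x twice to an arbitrary
# f(x) and confirm (D + x)^2 = D^2 + 2xD + x^2 + 1.
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)

def D_plus_x(g):
    # the operator d/dx + x acting on a function g
    return sp.diff(g, x) + x*g

lhs = D_plus_x(D_plus_x(f))                                   # (D + x)^2 f
rhs = sp.diff(f, x, 2) + 2*x*sp.diff(f, x) + (x**2 + 1)*f
print(sp.simplify(sp.expand(lhs - rhs)))   # prints 0, confirming the identity
```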


Suppose that the effect of operating on some function $f(x)$ with the linear operator $\hat{A}$ is simply to multiply $f(x)$ by a certain constant $k$. We then say that $f(x)$ is an eigenfunction of $\hat{A}$ with eigenvalue $k$. (Eigen is a German word meaning characteristic.) As part of the definition, we shall require that the eigenfunction $f(x)$ is not identically zero. By this we mean that, although $f(x)$ may vanish at various points, it is not everywhere zero. We have

\(
\begin{equation}
\hat{A} f(x)=k f(x) \tag{3.14}
\end{equation}
\)

As an example of (3.14), $e^{2 x}$ is an eigenfunction of the operator $d / d x$ with eigenvalue 2:

\(
(d / d x) e^{2 x}=2 e^{2 x}
\)

However, $\sin 2 x$ is not an eigenfunction of $d / d x$, since $(d / d x)(\sin 2 x)=2 \cos 2 x$, which is not a constant times $\sin 2 x$.

EXAMPLE

If $f(x)$ is an eigenfunction of the linear operator $\hat{A}$ and $c$ is any constant, prove that $c f(x)$ is an eigenfunction of $\hat{A}$ with the same eigenvalue as $f(x)$.

A good way to see how to do a proof is to carry out the following steps:

  1. Write down the given information and translate this information from words into equations.
  2. Write down what is to be proved in the form of an equation or equations.
  3. (a) Manipulate the given equations of step 1 so as to transform them to the desired equations of step 2. (b) Alternatively, start with one side of the equation that we want to prove and use the given equations of step 1 to manipulate this side until it is transformed into the other side of the equation to be proved.

We are given three pieces of information: $f$ is an eigenfunction of $\hat{A} ; \hat{A}$ is a linear operator; $c$ is a constant. Translating these statements into equations, we have [see Eqs. (3.14), (3.9), and (3.10)]

\(
\begin{gather}
\hat{A} f=k f \tag{3.15}\\
\hat{A}(f+g)=\hat{A} f+\hat{A} g \quad \text { and } \quad \hat{A}(b f)=b \hat{A} f \tag{3.16}\\
c=\text { a constant }
\end{gather}
\)

where $k$ and $b$ are constants and $f$ and $g$ are functions.
We want to prove that $c f$ is an eigenfunction of $\hat{A}$ with the same eigenvalue as $f$, which, written as an equation, is

\(
\hat{A}(c f)=k(c f)
\)

Using the strategy of step 3(b), we start with the left side $\hat{A}(c f)$ of this last equation and try to show that it equals $k(c f)$. Using the second equation in the linearity definition (3.16), we have $\hat{A}(c f)=c \hat{A} f$. Using the eigenvalue equation (3.15), we have $c \hat{A} f=c k f$. Hence

\(
\hat{A}(c f)=c \hat{A} f=c k f=k(c f)
\)

which completes the proof.

EXAMPLE

(a) Find the eigenfunctions and eigenvalues of the operator $d / d x$. (b) If we impose the boundary condition that the eigenfunctions remain finite as $x \rightarrow \pm \infty$, find the eigenvalues.
(a) Equation (3.14) with $\hat{A}=d / d x$ becomes

\(
\begin{align}
\frac{d f(x)}{d x} & =k f(x) \tag{3.17}\\
\frac{1}{f} d f & =k d x
\end{align}
\)

Integration gives

\(
\begin{align}
\ln f & =k x+\text { constant } \\
f & =e^{\text {constant }} e^{k x} \\
f & =c e^{k x} \tag{3.18}
\end{align}
\)

The eigenfunctions of $d / d x$ are given by (3.18). The eigenvalues are $k$, which can be any number whatever and (3.17) will still be satisfied. The eigenfunctions contain an arbitrary multiplicative constant $c$. This is true for the eigenfunctions of every linear operator, as was proved in the previous example. Each different value of $k$ in (3.18) gives a different eigenfunction. However, eigenfunctions with the same value of $k$ but different values of $c$ are not independent of each other.
(b) Since $k$ can be complex, we write it as $k=a+i b$, where $a$ and $b$ are real numbers. We then have $f(x)=c e^{a x} e^{i b x}$. If $a>0$, the factor $e^{a x}$ goes to infinity as $x$ goes to infinity. If $a<0$, then $e^{a x} \rightarrow \infty$ in the limit $x \rightarrow-\infty$. Thus the boundary conditions require that $a=0$, and the eigenvalues are $k=i b$, where $b$ is real.
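
The eigenvalue relation is simple to verify symbolically. A short sketch (assuming Python with SymPy) checks that $f = c e^{k x}$ of (3.18) satisfies $d f / d x = k f$ for any constants $c$ and $k$:

```python
# Sketch (SymPy assumed): f = c e^{kx} of Eq. (3.18) satisfies df/dx = k f,
# the eigenvalue equation (3.17).
import sympy as sp

x, k, c = sp.symbols('x k c')
f = c * sp.exp(k*x)
print(sp.simplify(sp.diff(f, x) - k*f))   # 0: f is an eigenfunction of d/dx
```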

In the first example in Section 3.1, we found that $\left[z^{3}, d / d z\right] g(z)=-3 z^{2} g(z)$ for every function $g$, and we concluded that $\left[z^{3}, d / d z\right]=-3 z^{2}$. In contrast, the eigenvalue equation $\hat{A} f(x)=k f(x)$ [Eq. (3.14)] does not hold for every function $f(x)$, and we cannot conclude from this equation that $\hat{A}=k$. Thus the fact that $(d / d x) e^{2 x}=2 e^{2 x}$ does not mean that the operator $d / d x$ equals multiplication by 2 .


We now examine the relationship between operators and quantum mechanics. Comparing Eq. (3.1) with (3.14), we see that the Schrödinger equation is an eigenvalue problem. The values of the energy $E$ are the eigenvalues. The eigenfunctions are the time-independent wave functions $\psi$. The operator whose eigenfunctions and eigenvalues are desired is $-\left(\hbar^{2} / 2 m\right) d^{2} / d x^{2}+V(x)$. This operator is called the Hamiltonian operator for the system.

Sir William Rowan Hamilton (1805-1865) devised an alternative form of Newton's equations of motion involving a function $H$, the Hamiltonian function for the system. For a system where the potential energy is a function of the coordinates only, the total energy remains constant with time; that is, $E$ is conserved. We shall restrict ourselves to such conservative systems. For conservative systems, the classical-mechanical Hamiltonian function turns out to be simply the total energy expressed in terms of coordinates and conjugate momenta. For Cartesian coordinates $x, y, z$, the conjugate
momenta are the components of linear momentum in the $x, y$, and $z$ directions: $p_{x}, p_{y}$, and $p_{z}$:

\(
\begin{equation}
p_{x} \equiv m v_{x}, \quad p_{y} \equiv m v_{y}, \quad p_{z} \equiv m v_{z} \tag{3.19}
\end{equation}
\)

where $v_{x}, v_{y}$, and $v_{z}$ are the components of the particle's velocity in the $x, y$, and $z$ directions.

Let us find the classical-mechanical Hamiltonian function for a particle of mass $m$ moving in one dimension and subject to a potential energy $V(x)$. The Hamiltonian function is equal to the energy, which is composed of kinetic and potential energies. The familiar form of the kinetic energy, $\frac{1}{2} m v_{x}^{2}$, will not do, however, since we must express the Hamiltonian as a function of coordinates and momenta, not velocities. Since $v_{x}=p_{x} / m$, the form of the kinetic energy we want is $p_{x}^{2} / 2 m$. The Hamiltonian function is

\(
\begin{equation}
H=\frac{p_{x}^{2}}{2 m}+V(x) \tag{3.20}
\end{equation}
\)

The time-independent Schrödinger equation (3.1) indicates that, corresponding to the Hamiltonian function (3.20), we have a quantum-mechanical operator

\(
-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+V(x)
\)

whose eigenvalues are the possible values of the system's energy. This correspondence between physical quantities in classical mechanics and operators in quantum mechanics is general. It is a fundamental postulate of quantum mechanics that every physical property (for example, the energy, the $x$ coordinate, the momentum) has a corresponding quantum-mechanical operator. We further postulate that the operator corresponding to the property $B$ is found by writing the classical-mechanical expression for $B$ as a function of Cartesian coordinates and corresponding momenta and then making the following replacements. Each Cartesian coordinate $q$ is replaced by the operator multiplication by that coordinate:

\(
\hat{q}=q \times
\)

Each Cartesian component of linear momentum $p_{q}$ is replaced by the operator

\(
\hat{p}_{q}=\frac{\hbar}{i} \frac{\partial}{\partial q}=-i \hbar \frac{\partial}{\partial q}
\)

where $i=\sqrt{-1}$ and $\partial / \partial q$ is the operator for the partial derivative with respect to the coordinate $q$. Note that $1 / i=i / i^{2}=i /(-1)=-i$.

Consider some examples. The operator corresponding to the $x$ coordinate is multiplication by $x$ :

\(
\begin{equation}
\hat{x}=x \times \tag{3.21}
\end{equation}
\)

Also,

\(
\begin{equation}
\hat{y}=y \times \quad \text { and } \quad \hat{z}=z \times \tag{3.22}
\end{equation}
\)

The operators for the components of linear momentum are

\(
\begin{equation}
\hat{p}_{x}=\frac{\hbar}{i} \frac{\partial}{\partial x}, \quad \hat{p}_{y}=\frac{\hbar}{i} \frac{\partial}{\partial y}, \quad \hat{p}_{z}=\frac{\hbar}{i} \frac{\partial}{\partial z} \tag{3.23}
\end{equation}
\)

The operator corresponding to $p_{x}^{2}$ is

\(
\begin{equation}
\hat{p}_{x}^{2}=\left(\frac{\hbar}{i} \frac{\partial}{\partial x}\right)^{2}=\frac{\hbar}{i} \frac{\partial}{\partial x} \frac{\hbar}{i} \frac{\partial}{\partial x}=-\hbar^{2} \frac{\partial^{2}}{\partial x^{2}} \tag{3.24}
\end{equation}
\)

with similar expressions for $\hat{p}_{y}^{2}$ and $\hat{p}_{z}^{2}$.
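
As a check on (3.24), the following sketch (SymPy assumed) applies $(\hbar / i)(\partial / \partial x)$ twice to an arbitrary $f(x)$ and confirms that the result is $-\hbar^{2} f^{\prime \prime}$:

```python
# Sketch (SymPy assumed): applying (hbar/i)(d/dx) twice to an arbitrary f(x)
# gives -hbar^2 f'', the p_x^2 operator of Eq. (3.24).
import sympy as sp

x = sp.symbols('x')
hbar = sp.symbols('hbar', positive=True)
f = sp.Function('f')(x)
p = lambda g: (hbar/sp.I) * sp.diff(g, x)   # the operator (hbar/i) d/dx

print(sp.simplify(p(p(f)) + hbar**2 * sp.diff(f, x, 2)))   # 0
```
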
What are the potential-energy and kinetic-energy operators in one dimension? Suppose a system has the potential-energy function $V(x)=a x^{2}$, where $a$ is a constant. Replacing $x$ with $x \times$, we see that the potential-energy operator is simply multiplication by $a x^{2}$; that is, $\hat{V}(x)=a x^{2} \times$. In general, we have for any potential-energy function

\(
\begin{equation}
\hat{V}(x)=V(x) \times \tag{3.25}
\end{equation}
\)

The classical-mechanical expression for the kinetic energy $T$ in (3.20) is

\(
\begin{equation}
T=p_{x}^{2} / 2 m \tag{3.26}
\end{equation}
\)

Replacing $p_{x}$ by the corresponding operator (3.23), we have

\(
\begin{equation}
\hat{T}=-\frac{\hbar^{2}}{2 m} \frac{\partial^{2}}{\partial x^{2}}=-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}} \tag{3.27}
\end{equation}
\)

where (3.24) has been used, and the partial derivative becomes an ordinary derivative in one dimension. The classical-mechanical Hamiltonian (3.20) is

\(
\begin{equation}
H=T+V=p_{x}^{2} / 2 m+V(x) \tag{3.28}
\end{equation}
\)

The corresponding quantum-mechanical Hamiltonian (or energy) operator is

\(
\begin{equation}
\hat{H}=\hat{T}+\hat{V}=-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+V(x) \tag{3.29}
\end{equation}
\)

which agrees with the operator in the Schrödinger equation (3.1). Note that all these operators are linear.

How are the quantum-mechanical operators related to the corresponding properties of a system? Each such operator has its own set of eigenfunctions and eigenvalues. Let $\hat{B}$ be the quantum-mechanical operator that corresponds to the physical property $B$. Letting $f_{i}$ and $b_{i}$ symbolize the eigenfunctions and eigenvalues of $\hat{B}$, we have [Eq. (3.14)]

\(
\begin{equation}
\hat{B} f_{i}=b_{i} f_{i}, \quad i=1,2,3, \ldots \tag{3.30}
\end{equation}
\)

The operator $\hat{B}$ has many eigenfunctions and eigenvalues, and the subscript $i$ is used to indicate this. $\hat{B}$ is usually a differential operator, and (3.30) is a differential equation whose solutions give the eigenfunctions and eigenvalues. Quantum mechanics postulates that (no matter what the state function of the system happens to be) a measurement of the property $B$ must yield one of the eigenvalues $b_{i}$ of the operator $\hat{B}$. For example, the only values that can be found for the energy of a system are the eigenvalues of the energy (Hamiltonian) operator $\hat{H}$. Using $\psi_{i}$ to symbolize the eigenfunctions of $\hat{H}$, we have as the eigenvalue equation (3.30)

\(
\begin{equation}
\hat{H} \psi_{i}=E_{i} \psi_{i} \tag{3.31}
\end{equation}
\)

Using the Hamiltonian (3.29) in (3.31), we obtain for a one-dimensional, one-particle system

\(
\begin{equation}
\left[-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+V(x)\right] \psi_{i}=E_{i} \psi_{i} \tag{3.32}
\end{equation}
\)

which is the time-independent Schrödinger equation (3.1). Thus our postulates about operators are consistent with our previous work. We shall later further justify the choice (3.23) for the momentum operator by showing that in the limiting transition to classical mechanics this choice yields $p_{x}=m(d x / d t)$, as it should. (See Prob. 7.59.)

In Chapter 1 we postulated that the state of a quantum-mechanical system is specified by a state function $\Psi(x, t)$, which contains all the information we can know about the system. How does $\Psi$ give us information about the property $B$ ? We postulate that if $\Psi$ is an eigenfunction of $\hat{B}$ with eigenvalue $b_{k}$, then a measurement of $B$ is certain to yield the value $b_{k}$. Consider, for example, the energy. The eigenfunctions of the energy operator are the solutions $\psi(x)$ of the time-independent Schrödinger equation (3.32). Suppose the system is in a stationary state with state function [Eq. (1.20)]

\(
\begin{equation}
\Psi(x, t)=e^{-i E t / \hbar} \psi(x) \tag{3.33}
\end{equation}
\)

Is $\Psi(x, t)$ an eigenfunction of the energy operator $\hat{H}$ ? We have

\(
\hat{H} \Psi(x, t)=\hat{H} e^{-i E t / \hbar} \psi(x)
\)

$\hat{H}$ contains no derivatives with respect to time and therefore does not affect the exponential factor $e^{-i E t / \hbar}$. We have

\(
\begin{align}
\hat{H} \Psi(x, t)=e^{-i E t / \hbar} \hat{H} \psi(x) & =E e^{-i E t / \hbar} \psi(x)=E \Psi(x, t) \\
\hat{H} \Psi & =E \Psi \tag{3.34}
\end{align}
\)

where (3.31) was used. Hence, for a stationary state, $\Psi(x, t)$ is an eigenfunction of $\hat{H}$, and we are certain to obtain the value $E$ when we measure the energy.

As an example of another property, consider momentum. The eigenfunctions $g$ of $\hat{p}_{x}$ are found by solving

\(
\begin{align}
\hat{p}_{x} g & =k g \\
\frac{\hbar}{i} \frac{d g}{d x} & =k g \tag{3.35}
\end{align}
\)

We find (Prob. 3.29)

\(
\begin{equation}
g=A e^{i k x / \hbar} \tag{3.36}
\end{equation}
\)

where $A$ is an arbitrary constant. To keep $g$ finite for large $|x|$, the eigenvalues $k$ must be real. Thus the eigenvalues of $\hat{p}_{x}$ are all the real numbers

\(
\begin{equation}
-\infty<k<\infty \tag{3.37}
\end{equation}
\)

which is reasonable. Any measurement of $p_{x}$ must yield one of the eigenvalues (3.37) of $\hat{p}_{x}$. Each different value of $k$ in (3.36) gives a different eigenfunction $g$. It might seem surprising that the operator for the physical property momentum involves the imaginary number $i$. Actually, the presence of $i$ in $\hat{p}_{x}$ ensures that the eigenvalues $k$ are real. Recall that the eigenvalues of $d / d x$ are imaginary (Section 3.2).
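
The eigenvalue equation (3.35) and its solution (3.36) can likewise be verified symbolically; a sketch (SymPy assumed, with $A$ and $k$ taken as real symbols) follows:

```python
# Sketch (SymPy assumed): g = A exp(i k x / hbar) of Eq. (3.36) is an
# eigenfunction of p_x = (hbar/i)(d/dx) with eigenvalue k.
import sympy as sp

x, k, A = sp.symbols('x k A', real=True)
hbar = sp.symbols('hbar', positive=True)

g = A * sp.exp(sp.I * k * x / hbar)
p_g = (hbar / sp.I) * sp.diff(g, x)
print(sp.simplify(p_g - k*g))   # 0: g is an eigenfunction with eigenvalue k
```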

Comparing the free-particle wave function (2.30) with the eigenfunctions (3.36) of $\hat{p}_{x}$, we note the following physical interpretation: The first term in (2.30) corresponds to positive momentum and represents motion in the $+x$ direction; the second term in (2.30) corresponds to negative momentum and represents motion in the $-x$ direction.

Now consider the momentum of a particle in a box. The state function for a particle in a stationary state in a one-dimensional box is [Eqs. (3.33), (2.20), and (2.23)]

\(
\begin{equation}
\Psi(x, t)=e^{-i E t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n \pi x}{l}\right) \tag{3.38}
\end{equation}
\)

where $E=n^{2} h^{2} / 8 m l^{2}$. Does the particle have a definite value of $p_{x}$? That is, is $\Psi(x, t)$ an eigenfunction of $\hat{p}_{x}$? Looking at the eigenfunctions (3.36) of $\hat{p}_{x}$, we see that there is no numerical value of the real constant $k$ that will make the exponential function in (3.36) become a sine function, as in (3.38). Hence $\Psi$ is not an eigenfunction of $\hat{p}_{x}$. We can verify this directly; we have

\(
\hat{p}_{x} \Psi=\frac{\hbar}{i} \frac{\partial}{\partial x} e^{-i E t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n \pi x}{l}\right)=\frac{n \pi \hbar}{i l} e^{-i E t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \cos \left(\frac{n \pi x}{l}\right)
\)

Since $\hat{p}_{x} \Psi \neq$ constant $\cdot \Psi$, the state function $\Psi$ is not an eigenfunction of $\hat{p}_{x}$.
Note that the system's state function $\Psi$ need not be an eigenfunction $f_{i}$ of the operator $\hat{B}$ in (3.30) that corresponds to the physical property $B$ of the system. Thus, the particle-in-a-box stationary-state wave functions are not eigenfunctions of $\hat{p}_{x}$. Despite this, we still must get one of the eigenvalues (3.37) of $\hat{p}_{x}$ when we measure $p_{x}$ for a particle-in-a-box stationary state.

Are the particle-in-a-box stationary-state wave functions eigenfunctions of $\hat{p}_{x}^{2}$ ? We have [Eq. (3.24)]

\(
\begin{align}
& \hat{p}_{x}^{2} \Psi=-\hbar^{2} \frac{\partial^{2}}{\partial x^{2}} e^{-i E t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n \pi x}{l}\right)=\frac{n^{2} \pi^{2} \hbar^{2}}{l^{2}} e^{-i E t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n \pi x}{l}\right) \\
& \hat{p}_{x}^{2} \Psi=\frac{n^{2} h^{2}}{4 l^{2}} \Psi \tag{3.39}
\end{align}
\)

Hence a measurement of $p_{x}^{2}$ will always give the result $n^{2} h^{2} / 4 l^{2}$ when the particle is in the stationary state with quantum number $n$. This should come as no surprise: The potential energy in the box is zero, and the Hamiltonian operator is

\(
\hat{H}=\hat{T}+\hat{V}=\hat{T}=\hat{p}_{x}^{2} / 2 m
\)

We then have [Eq. (3.34)]

\(
\begin{gather}
\hat{H} \Psi=E \Psi=\frac{\hat{p}_{x}^{2}}{2 m} \Psi \\
\hat{p}_{x}^{2} \Psi=2 m E \Psi=2 m \frac{n^{2} h^{2}}{8 m l^{2}} \Psi=\frac{n^{2} h^{2}}{4 l^{2}} \Psi \tag{3.40}
\end{gather}
\)

in agreement with (3.39). The only possible value for $p_{x}^{2}$ is

\(
\begin{equation}
p_{x}^{2}=n^{2} h^{2} / 4 l^{2} \tag{3.41}
\end{equation}
\)

Equation (3.41) suggests that a measurement of $p_{x}$ would necessarily yield one of the two values $\pm \frac{1}{2} n h / l$, corresponding to the particle moving to the right or to the left in the box. This plausible suggestion is not accurate. An analysis using the methods of Chapter 7 shows that there is a high probability that the measured value will be close to one of the two values $\pm \frac{1}{2} n h / l$, but that any value consistent with (3.37) can result from a measurement of $p_{x}$ for the particle in a box; see Prob. 7.41.

We postulated that a measurement of the property $B$ must give a result that is one of the eigenvalues of the operator $\hat{B}$. If the state function $\Psi$ happens to be an eigenfunction
of $\hat{B}$ with eigenvalue $b$, we are certain to get $b$ when we measure $B$. Suppose, however, that $\Psi$ is not one of the eigenfunctions of $\hat{B}$. What then? We still assert that we will get one of the eigenvalues of $\hat{B}$ when we measure $B$, but we cannot predict which eigenvalue will be obtained. We shall see in Chapter 7 that the probabilities for obtaining the various eigenvalues of $\hat{B}$ can be predicted.

EXAMPLE

The energy of a particle of mass $m$ in a one-dimensional box of length $l$ is measured. What are the possible values that can result from the measurement if at the time the measurement begins, the particle's state function is (a) $\Psi=\left(30 / l^{5}\right)^{1 / 2} x(l-x)$ for $0 \leq x \leq l ;$ (b) $\Psi=(2 / l)^{1 / 2} \sin (3 \pi x / l)$ for $0 \leq x \leq l$ ?
(a) The possible outcomes of a measurement of the property $E$ are the eigenvalues of the system's energy (Hamiltonian) operator $\hat{H}$. Therefore, the measured value must be one of the numbers $n^{2} h^{2} / 8 m l^{2}$, where $n=1,2,3, \ldots$ Since $\Psi$ is not one of the eigenfunctions $(2 / l)^{1 / 2} \sin (n \pi x / l)$ [Eq. (2.23)] of $\hat{H}$, we cannot predict which one of these eigenvalues will be obtained for this nonstationary state. (The probabilities for obtaining these eigenvalues are found in the last example in Section 7.6.)
(b) Since $\Psi$ is an eigenfunction of $\hat{H}$ with eigenvalue $3^{2} h^{2} / 8 m l^{2}$ [Eq. (2.20)], the measurement must give $9 h^{2} / 8 m l^{2}$.


Up to now we have restricted ourselves to one-dimensional, one-particle systems. The operator formalism developed in the last section allows us to extend our work to three-dimensional, many-particle systems. The time-dependent Schrödinger equation for the time development of the state function is postulated to have the form of Eq. (1.13):

\(
\begin{equation}
i \hbar \frac{\partial \Psi}{\partial t}=\hat{H} \Psi \tag{3.42}
\end{equation}
\)

The time-independent Schrödinger equation for the energy eigenfunctions and eigenvalues is

\(
\begin{equation}
\hat{H} \psi=E \psi \tag{3.43}
\end{equation}
\)

which is obtained from (3.42) by taking the potential energy as independent of time and applying the separation-of-variables procedure used to obtain (1.19) from (1.13).

For a one-particle, three-dimensional system, the classical-mechanical Hamiltonian is

\(
\begin{equation}
H=T+V=\frac{1}{2 m}\left(p_{x}^{2}+p_{y}^{2}+p_{z}^{2}\right)+V(x, y, z) \tag{3.44}
\end{equation}
\)

Introducing the quantum-mechanical operators [Eq. (3.24)], we have for the Hamiltonian operator

\(
\begin{equation}
\hat{H}=-\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}+\frac{\partial^{2}}{\partial z^{2}}\right)+V(x, y, z) \tag{3.45}
\end{equation}
\)

The operator in parentheses in (3.45) is called the Laplacian operator $\nabla^{2}$ (read as "del squared"):

\(
\begin{equation}
\nabla^{2} \equiv \frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}+\frac{\partial^{2}}{\partial z^{2}} \tag{3.46}
\end{equation}
\)

The one-particle, three-dimensional, time-independent Schrödinger equation is then

\(
\begin{equation}
-\frac{\hbar^{2}}{2 m} \nabla^{2} \psi+V \psi=E \psi \tag{3.47}
\end{equation}
\)

Now consider a three-dimensional system with $n$ particles. Let particle $i$ have mass $m_{i}$ and coordinates $\left(x_{i}, y_{i}, z_{i}\right)$, where $i=1,2,3, \ldots, n$. The kinetic energy is the sum of the kinetic energies of the individual particles:

\(
T=\frac{1}{2 m_{1}}\left(p_{x_{1}}^{2}+p_{y_{1}}^{2}+p_{z_{1}}^{2}\right)+\frac{1}{2 m_{2}}\left(p_{x_{2}}^{2}+p_{y_{2}}^{2}+p_{z_{2}}^{2}\right)+\cdots+\frac{1}{2 m_{n}}\left(p_{x_{n}}^{2}+p_{y_{n}}^{2}+p_{z_{n}}^{2}\right)
\)

where $p_{x_{i}}$ is the $x$ component of the linear momentum of particle $i$, and so on. The kinetic-energy operator is

\(
\begin{gather}
\hat{T}=-\frac{\hbar^{2}}{2 m_{1}}\left(\frac{\partial^{2}}{\partial x_{1}^{2}}+\frac{\partial^{2}}{\partial y_{1}^{2}}+\frac{\partial^{2}}{\partial z_{1}^{2}}\right)-\cdots-\frac{\hbar^{2}}{2 m_{n}}\left(\frac{\partial^{2}}{\partial x_{n}^{2}}+\frac{\partial^{2}}{\partial y_{n}^{2}}+\frac{\partial^{2}}{\partial z_{n}^{2}}\right) \\
\hat{T}=-\sum_{i=1}^{n} \frac{\hbar^{2}}{2 m_{i}} \nabla_{i}^{2} \tag{3.48}\\
\nabla_{i}^{2} \equiv \frac{\partial^{2}}{\partial x_{i}^{2}}+\frac{\partial^{2}}{\partial y_{i}^{2}}+\frac{\partial^{2}}{\partial z_{i}^{2}} \tag{3.49}
\end{gather}
\)

We shall usually restrict ourselves to cases where the potential energy depends only on the $3 n$ coordinates:

\(
V=V\left(x_{1}, y_{1}, z_{1}, \ldots, x_{n}, y_{n}, z_{n}\right)
\)

The Hamiltonian operator for an $n$-particle, three-dimensional system is then

\(
\begin{equation}
\hat{H}=-\sum_{i=1}^{n} \frac{\hbar^{2}}{2 m_{i}} \nabla_{i}^{2}+V\left(x_{1}, \ldots, z_{n}\right) \tag{3.50}
\end{equation}
\)

and the time-independent Schrödinger equation is

\(
\begin{equation}
\left[-\sum_{i=1}^{n} \frac{\hbar^{2}}{2 m_{i}} \nabla_{i}^{2}+V\left(x_{1}, \ldots, z_{n}\right)\right] \psi=E \psi \tag{3.51}
\end{equation}
\)

where the time-independent wave function is a function of the $3 n$ coordinates of the $n$ particles:

\(
\begin{equation}
\psi=\psi\left(x_{1}, y_{1}, z_{1}, \ldots, x_{n}, y_{n}, z_{n}\right) \tag{3.52}
\end{equation}
\)

The Schrödinger equation (3.51) is a linear partial differential equation.
As an example, consider a system of two particles interacting so that the potential energy is inversely proportional to the distance between them, with $c$ being the proportionality constant. The Schrödinger equation (3.51) becomes

FIGURE 3.1 An infinitesimal box-shaped region located at $x^{\prime}, y^{\prime}, z^{\prime}$.

\(
\begin{gather}
{\left[-\frac{\hbar^{2}}{2 m_{1}}\left(\frac{\partial^{2}}{\partial x_{1}^{2}}+\frac{\partial^{2}}{\partial y_{1}^{2}}+\frac{\partial^{2}}{\partial z_{1}^{2}}\right)-\frac{\hbar^{2}}{2 m_{2}}\left(\frac{\partial^{2}}{\partial x_{2}^{2}}+\frac{\partial^{2}}{\partial y_{2}^{2}}+\frac{\partial^{2}}{\partial z_{2}^{2}}\right)\right.} \\
\left.+\frac{c}{\left[\left(x_{1}-x_{2}\right)^{2}+\left(y_{1}-y_{2}\right)^{2}+\left(z_{1}-z_{2}\right)^{2}\right]^{1 / 2}}\right] \psi=E \psi \tag{3.53}\\
\psi=\psi\left(x_{1}, y_{1}, z_{1}, x_{2}, y_{2}, z_{2}\right)
\end{gather}
\)

Although (3.53) looks formidable, we shall solve it in Chapter 6.
For a one-particle, one-dimensional system, the Born postulate [Eq. (1.15)] states that $\left|\Psi\left(x^{\prime}, t\right)\right|^{2} d x$ is the probability of observing the particle between $x^{\prime}$ and $x^{\prime}+d x$ at time $t$, where $x^{\prime}$ is a particular value of $x$. We extend this postulate as follows. For $a$ three-dimensional, one-particle system, the quantity

\(
\begin{equation}
\left|\Psi\left(x^{\prime}, y^{\prime}, z^{\prime}, t\right)\right|^{2} d x d y d z \tag{3.54}
\end{equation}
\)

is the probability of finding the particle in the infinitesimal region of space with its $x$ coordinate lying between $x^{\prime}$ and $x^{\prime}+d x$, its $y$ coordinate lying between $y^{\prime}$ and $y^{\prime}+d y$, and its $z$ coordinate between $z^{\prime}$ and $z^{\prime}+d z$ (Fig. 3.1). Since the total probability of finding the particle is 1 , the normalization condition is

\(
\begin{equation}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\Psi(x, y, z, t)|^{2} d x d y d z=1 \tag{3.55}
\end{equation}
\)

For a three-dimensional, $n$-particle system, we postulate that

\(
\begin{equation}
\left|\Psi\left(x_{1}^{\prime}, y_{1}^{\prime}, z_{1}^{\prime}, x_{2}^{\prime}, y_{2}^{\prime}, z_{2}^{\prime}, \ldots, x_{n}^{\prime}, y_{n}^{\prime}, z_{n}^{\prime}, t\right)\right|^{2} d x_{1} d y_{1} d z_{1} d x_{2} d y_{2} d z_{2} \ldots d x_{n} d y_{n} d z_{n} \tag{3.56}
\end{equation}
\)

is the probability at time $t$ of simultaneously finding particle 1 in the infinitesimal rectangular box-shaped region at $\left(x_{1}^{\prime}, y_{1}^{\prime}, z_{1}^{\prime}\right)$ with edges $d x_{1}, d y_{1}, d z_{1}$, particle 2 in the infinitesimal box-shaped region at $\left(x_{2}^{\prime}, y_{2}^{\prime}, z_{2}^{\prime}\right)$ with edges $d x_{2}, d y_{2}, d z_{2}, \ldots$, and particle $n$ in the infinitesimal box-shaped region at $\left(x_{n}^{\prime}, y_{n}^{\prime}, z_{n}^{\prime}\right)$ with edges $d x_{n}, d y_{n}, d z_{n}$. The total probability of finding all the particles is 1, and the normalization condition is

\(
\begin{equation}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\Psi|^{2} d x_{1} d y_{1} d z_{1} \cdots d x_{n} d y_{n} d z_{n}=1 \tag{3.57}
\end{equation}
\)

It is customary in quantum mechanics to denote integration over the full range of all the coordinates of a system by $\int d \tau$. A shorthand way of writing (3.55) or (3.57) is

\(
\begin{equation}
\int|\Psi|^{2} d \tau=1 \tag{3.58}
\end{equation}
\)

Although (3.58) may look like an indefinite integral, it is understood to be a definite integral. The integration variables and their ranges are understood from the context.

For a stationary state, $|\Psi|^{2}=|\psi|^{2}$, and

\(
\begin{equation}
\int|\psi|^{2} d \tau=1 \tag{3.59}
\end{equation}
\)


For the present, we confine ourselves to one-particle problems. In this section we consider the three-dimensional case of the problem solved in Section 2.2, the particle in a box.

There are many possible shapes for a three-dimensional box. The box we consider is a rectangular parallelepiped with edges of length $a, b$, and $c$. We choose our coordinate system so that one corner of the box lies at the origin and the box lies in the first octant of space (Fig. 3.2). Within the box, the potential energy is zero. Outside the box, it is infinite:

\(
\begin{align}
V(x, y, z) & =0 \text { in the region }\left\{\begin{array}{l}
0<x<a \\
0<y<b \\
0<z<c
\end{array}\right. \tag{3.60}\\
V & =\infty \quad \text { elsewhere }
\end{align}
\)

Since the probability for the particle to have infinite energy is zero, the wave function must be zero outside the box. Within the box, the potential-energy operator is zero and the Schrödinger equation (3.47) is

\(
\begin{equation}
-\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2} \psi}{\partial x^{2}}+\frac{\partial^{2} \psi}{\partial y^{2}}+\frac{\partial^{2} \psi}{\partial z^{2}}\right)=E \psi \tag{3.61}
\end{equation}
\)

To solve (3.61), we assume that the solution can be written as the product of a function of $x$ alone times a function of $y$ alone times a function of $z$ alone:

\(
\begin{equation}
\psi(x, y, z)=f(x) g(y) h(z) \tag{3.62}
\end{equation}
\)

It might be thought that this assumption throws away solutions that are not of the form (3.62). However, it can be shown that, if we can find solutions of the form (3.62) that satisfy the boundary conditions, then there are no other solutions of the Schrödinger equation that will satisfy the boundary conditions. (For a proof, see G. F. D. Duff and

FIGURE 3.2 Inside the box-shaped region, $V=0$.
D. Naylor, Differential Equations of Applied Mathematics, Wiley, 1966, pp. 257-258.) The method we are using to solve (3.61) is called separation of variables.

From (3.62), we find

\(
\begin{equation}
\frac{\partial^{2} \psi}{\partial x^{2}}=f^{\prime \prime}(x) g(y) h(z), \quad \frac{\partial^{2} \psi}{\partial y^{2}}=f(x) g^{\prime \prime}(y) h(z), \quad \frac{\partial^{2} \psi}{\partial z^{2}}=f(x) g(y) h^{\prime \prime}(z) \tag{3.63}
\end{equation}
\)

Substitution of (3.62) and (3.63) into (3.61) gives

\(
\begin{equation}
-\left(\hbar^{2} / 2 m\right) f^{\prime \prime} g h-\left(\hbar^{2} / 2 m\right) f g^{\prime \prime} h-\left(\hbar^{2} / 2 m\right) f g h^{\prime \prime}-E f g h=0 \tag{3.64}
\end{equation}
\)

Division of this equation by $f g h$ gives

\(
\begin{gather}
-\frac{\hbar^{2} f^{\prime \prime}}{2 m f}-\frac{\hbar^{2} g^{\prime \prime}}{2 m g}-\frac{\hbar^{2} h^{\prime \prime}}{2 m h}-E=0 \tag{3.65}\\
-\frac{\hbar^{2} f^{\prime \prime}(x)}{2 m f(x)}=\frac{\hbar^{2} g^{\prime \prime}(y)}{2 m g(y)}+\frac{\hbar^{2} h^{\prime \prime}(z)}{2 m h(z)}+E \tag{3.66}
\end{gather}
\)

Let us define $E_{x}$ as equal to the left side of (3.66):

\(
\begin{equation}
E_{x} \equiv-\hbar^{2} f^{\prime \prime}(x) / 2 m f(x) \tag{3.67}
\end{equation}
\)

The definition (3.67) shows that $E_{x}$ is independent of $y$ and $z$. Equation (3.66) shows that $E_{x}$ equals $\hbar^{2} g^{\prime \prime}(y) / 2 m g(y)+\hbar^{2} h^{\prime \prime}(z) / 2 m h(z)+E$; therefore, $E_{x}$ must be independent of $x$. Being independent of $x, y$, and $z$, the quantity $E_{x}$ must be a constant.

Similar to (3.67), we define $E_{y}$ and $E_{z}$ by

\(
\begin{equation}
E_{y} \equiv-\hbar^{2} g^{\prime \prime}(y) / 2 m g(y), \quad E_{z} \equiv-\hbar^{2} h^{\prime \prime}(z) / 2 m h(z) \tag{3.68}
\end{equation}
\)

Since $x, y$, and $z$ occur symmetrically in (3.65), the same reasoning that showed $E_{x}$ to be a constant shows that $E_{y}$ and $E_{z}$ are constants. Substitution of the definitions (3.67) and (3.68) into (3.65) gives

\(
\begin{equation}
E_{x}+E_{y}+E_{z}=E \tag{3.69}
\end{equation}
\)

Equations (3.67) and (3.68) are

\(
\begin{gather}
\frac{d^{2} f(x)}{d x^{2}}+\frac{2 m}{\hbar^{2}} E_{x} f(x)=0 \tag{3.70}\\
\frac{d^{2} g(y)}{d y^{2}}+\frac{2 m}{\hbar^{2}} E_{y} g(y)=0, \quad \frac{d^{2} h(z)}{d z^{2}}+\frac{2 m}{\hbar^{2}} E_{z} h(z)=0 \tag{3.71}
\end{gather}
\)

We have converted the partial differential equation in three variables into three ordinary differential equations. What are the boundary conditions on (3.70)? Since the wave function vanishes outside the box, continuity of $\psi$ requires that it vanish on the walls of the box. In particular, $\psi$ must be zero on the wall of the box lying in the $y z$ plane, where $x=0$, and it must be zero on the parallel wall of the box, where $x=a$. Therefore, $f(0)=0$ and $f(a)=0$.

Now compare Eq. (3.70) with the Schrödinger equation [Eq. (2.10)] for a particle in a one-dimensional box. The equations are the same in form, with $E_{x}$ in (3.70) corresponding to $E$ in (2.10). Are the boundary conditions the same? Yes, except that we have $x=a$ instead of $x=l$ as the second point where the independent variable vanishes. Thus we can use the work in Section 2.2 to write as the solution [see Eqs. (2.23) and (2.20)]

\(
\begin{aligned}
f(x) & =\left(\frac{2}{a}\right)^{1 / 2} \sin \left(\frac{n_{x} \pi x}{a}\right) \\
E_{x} & =\frac{n_{x}^{2} h^{2}}{8 m a^{2}}, \quad n_{x}=1,2,3, \ldots
\end{aligned}
\)

The same reasoning applied to the $y$ and $z$ equations gives

\(
\begin{gathered}
g(y)=\left(\frac{2}{b}\right)^{1 / 2} \sin \left(\frac{n_{y} \pi y}{b}\right), \quad h(z)=\left(\frac{2}{c}\right)^{1 / 2} \sin \left(\frac{n_{z} \pi z}{c}\right) \\
E_{y}=\frac{n_{y}^{2} h^{2}}{8 m b^{2}}, \quad n_{y}=1,2,3, \ldots \quad \text { and } \quad E_{z}=\frac{n_{z}^{2} h^{2}}{8 m c^{2}}, \quad n_{z}=1,2,3, \ldots
\end{gathered}
\)

From (3.69), the energy is

\(
\begin{equation}
E=\frac{h^{2}}{8 m}\left(\frac{n_{x}^{2}}{a^{2}}+\frac{n_{y}^{2}}{b^{2}}+\frac{n_{z}^{2}}{c^{2}}\right) \tag{3.72}
\end{equation}
\)

As with the particle in a one-dimensional box, the ground-state energy is greater than the classical-mechanical, lowest-energy value of zero.
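
Equation (3.72) is easy to evaluate numerically. Below is a minimal sketch in Python, assuming SI units and a hypothetical 1.0 nm × 1.5 nm × 2.0 nm box containing an electron; the box dimensions are illustrative only.

```python
# Sketch: evaluate the particle-in-a-box energies of Eq. (3.72) for an
# electron in an illustrative 1.0 nm x 1.5 nm x 2.0 nm box (SI units).
h = 6.62607015e-34               # Planck constant, J s
m = 9.1093837015e-31             # electron mass, kg
a, b, c = 1.0e-9, 1.5e-9, 2.0e-9   # hypothetical box edges, m

def E(nx, ny, nz):
    # Eq. (3.72): E = (h^2/8m)(nx^2/a^2 + ny^2/b^2 + nz^2/c^2)
    return (h**2 / (8*m)) * (nx**2/a**2 + ny**2/b**2 + nz**2/c**2)

print(E(1, 1, 1))   # ground-state energy, about 1.0e-19 J
print(E(2, 1, 1))   # one of the excited levels
```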

From (3.62), the wave function inside the box is

\(
\begin{equation}
\psi(x, y, z)=\left(\frac{8}{a b c}\right)^{1 / 2} \sin \left(\frac{n_{x} \pi x}{a}\right) \sin \left(\frac{n_{y} \pi y}{b}\right) \sin \left(\frac{n_{z} \pi z}{c}\right) \tag{3.73}
\end{equation}
\)

The wave function has three quantum numbers, $n_{x}, n_{y}, n_{z}$. We can attribute this to the three-dimensional nature of the problem. The three quantum numbers vary independently of one another.

In a one-particle, one-dimensional problem such as the particle in a one-dimensional box, the nodes are where $\psi(x)=0$, and solving this equation for $x$, we get points where $\psi=0$. In a one-particle, three-dimensional problem, the nodes are where $\psi(x, y, z)=0$, and solving this equation for $z$, we get solutions of the form $z=f(x, y)$. Each such solution is the equation of a nodal surface in three-dimensional space. For example, for the stationary state with $n_{x}=1, n_{y}=1, n_{z}=2$, the wave function $\psi$ in (3.73) is zero on the surface where $z=c / 2$; this is the equation of a plane that lies parallel to the top and bottom faces of the box and is midway between these faces. Similarly, for the state $n_{x}=2$, $n_{y}=1, n_{z}=1$ the plane $x=a / 2$ is a nodal surface.

Since the $x, y$, and $z$ factors in the wave function are each independently normalized, the wave function is normalized:

\(
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\psi|^{2} d x d y d z=\int_{0}^{a}|f(x)|^{2} d x \int_{0}^{b}|g(y)|^{2} d y \int_{0}^{c}|h(z)|^{2} d z=1
\)

where we used (Prob. 3.40)

\(
\begin{equation}
\iiint F(x) G(y) H(z) d x d y d z=\int F(x) d x \int G(y) d y \int H(z) d z \tag{3.74}
\end{equation}
\)

What are the dimensions of $\psi(x, y, z)$ in (3.73)? For a one-particle, three-dimensional system, $|\psi|^{2} d x d y d z$ is a probability, and probabilities are dimensionless. Since the dimensions of $d x d y d z$ are length$^{3}$, $\psi(x, y, z)$ must have dimensions of length$^{-3 / 2}$ to make $|\psi|^{2} d x d y d z$ dimensionless.

Suppose that $a=b=c$. We then have a cube. The energy levels are then

\(
\begin{equation}
E=\left(h^{2} / 8 m a^{2}\right)\left(n_{x}^{2}+n_{y}^{2}+n_{z}^{2}\right) \tag{3.75}
\end{equation}
\)

Let us tabulate some of the allowed energies of a particle confined to a cube with infinitely strong walls:

\(
\begin{array}{l|ccccccccccc}
n_{x} n_{y} n_{z} & 111 & 211 & 121 & 112 & 122 & 212 & 221 & 113 & 131 & 311 & 222 \\
\hline
E\left(8 m a^{2} / h^{2}\right) & 3 & 6 & 6 & 6 & 9 & 9 & 9 & 11 & 11 & 11 & 12
\end{array}
\)

FIGURE 3.3 Energies of the lowest few states of a particle in a cubic box.

Note that states with different quantum numbers may have the same energy (Fig. 3.3). For example, the states $\psi_{211}, \psi_{121}$, and $\psi_{112}$ (where the subscripts give the quantum numbers) all have the same energy. However, Eq. (3.73) shows that these three sets of quantum numbers give three different, independent wave functions and therefore do represent different states of the system. When two or more independent wave functions correspond to states with the same energy eigenvalue, the eigenvalue is said to be degenerate. The degree of degeneracy (or, simply, the degeneracy) of an energy level is the number of states that have that energy. Thus the second-lowest energy level of the particle in a cube is threefold degenerate. We got the degeneracy when we made the edges of the box equal. Degeneracy is usually related to the symmetry of the system. Note that the wave functions $\psi_{211}, \psi_{121}$, and $\psi_{112}$ can be transformed into one another by rotating the cubic box. Usually, the bound-state energy levels in one-dimensional problems are nondegenerate.
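
The table and the degeneracies just discussed can be reproduced by enumerating the quantum numbers. A short sketch (plain Python; the quantum numbers are truncated at 3 for illustration, so counts for higher levels may be incomplete):

```python
# Sketch: tabulate E(8ma^2/h^2) = nx^2 + ny^2 + nz^2 for a particle in a cube
# and count the degeneracy of each level, reproducing the table above.
from collections import Counter
from itertools import product

levels = Counter(nx**2 + ny**2 + nz**2
                 for nx, ny, nz in product(range(1, 4), repeat=3))
for e, g in sorted(levels.items()):
    print(f"E(8ma^2/h^2) = {e:2d}   degeneracy = {g}")
# The lowest entries print 3 (degeneracy 1), 6 (3), 9 (3), 11 (3), 12 (1), ...
```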

In the statistical-mechanical evaluation of the molecular partition function of an ideal gas, the translational energy levels of each gas molecule are taken to be the levels of a particle in a three-dimensional rectangular box (the box is the container holding the gas); see Levine, Physical Chemistry, Sections 21.6 and 21.7.

In the free-electron theory of metals, the valence electrons of a nontransition metal are treated as noninteracting particles in a box, the sides of the box being the surfaces of the metal. This approximation, though crude, gives fairly good results for some properties of metals.


For an $n$-fold degenerate energy level, there are $n$ independent wave functions $\psi_{1}, \psi_{2}, \ldots, \psi_{n}$, each having the same energy eigenvalue $w$ :

\(
\begin{equation}
\hat{H} \psi_{1}=w \psi_{1}, \quad \hat{H} \psi_{2}=w \psi_{2}, \quad \ldots, \quad \hat{H} \psi_{n}=w \psi_{n} \tag{3.76}
\end{equation}
\)

We wish to prove the following important theorem: Every linear combination

\(
\begin{equation}
\phi \equiv c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n} \tag{3.77}
\end{equation}
\)

of the wave functions of a degenerate level with energy eigenvalue $w$ is an eigenfunction of the Hamiltonian operator with eigenvalue $w$. [A linear combination of the functions $\psi_{1}, \psi_{2}, \ldots, \psi_{n}$ is defined as a function of the form (3.77) where the $c$ 's are constants, some of which might be zero.] To prove this theorem, we must show that $\hat{H} \phi=w \phi$ or

\(
\begin{equation}
\hat{H}\left(c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n}\right)=w\left(c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n}\right) \tag{3.78}
\end{equation}
\)

Since $\hat{H}$ is a linear operator, we can apply Eq. (3.9) $n-1$ times to the left side of (3.78) to get

\(
\hat{H}\left(c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n}\right)=\hat{H}\left(c_{1} \psi_{1}\right)+\hat{H}\left(c_{2} \psi_{2}\right)+\cdots+\hat{H}\left(c_{n} \psi_{n}\right)
\)

Use of Eqs. (3.10) and (3.76) gives

\(
\begin{aligned}
\hat{H}\left(c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n}\right) & =c_{1} \hat{H} \psi_{1}+c_{2} \hat{H} \psi_{2}+\cdots+c_{n} \hat{H} \psi_{n} \\
& =c_{1} w \psi_{1}+c_{2} w \psi_{2}+\cdots+c_{n} w \psi_{n} \\
\hat{H}\left(c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n}\right) & =w\left(c_{1} \psi_{1}+c_{2} \psi_{2}+\cdots+c_{n} \psi_{n}\right)
\end{aligned}
\)

which completes the proof.
For example, the stationary-state wave functions $\psi_{211}, \psi_{121}$, and $\psi_{112}$ for the particle in a cubic box are degenerate, and the linear combination $c_{1} \psi_{211}+c_{2} \psi_{121}+c_{3} \psi_{112}$ is an eigenfunction of the particle-in-a-cubic-box Hamiltonian with eigenvalue $6 h^{2} / 8 m a^{2}$, the same eigenvalue as for each of $\psi_{211}, \psi_{121}$, and $\psi_{112}$.
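
This cubic-box example can be confirmed symbolically. The sketch below (SymPy assumed, using $h = 2\pi\hbar$ so that $6 h^{2} / 8 m a^{2}=3 \pi^{2} \hbar^{2} / m a^{2}$) applies the box Hamiltonian to an arbitrary linear combination:

```python
# Sketch (SymPy assumed): phi = c1 psi_211 + c2 psi_121 + c3 psi_112 is an
# eigenfunction of the cubic-box Hamiltonian with eigenvalue 6h^2/8ma^2.
import sympy as sp

x, y, z, a, m, hbar = sp.symbols('x y z a m hbar', positive=True)
c1, c2, c3 = sp.symbols('c1 c2 c3')

def psi(nx, ny, nz):
    # normalized cubic-box wave function, Eq. (3.73) with a = b = c
    return (sp.sqrt(8/a**3) * sp.sin(nx*sp.pi*x/a)
            * sp.sin(ny*sp.pi*y/a) * sp.sin(nz*sp.pi*z/a))

def H(f):
    # Hamiltonian inside the box, where V = 0
    return -hbar**2/(2*m) * (sp.diff(f, x, 2) + sp.diff(f, y, 2) + sp.diff(f, z, 2))

phi = c1*psi(2, 1, 1) + c2*psi(1, 2, 1) + c3*psi(1, 1, 2)
w = 3 * sp.pi**2 * hbar**2 / (m * a**2)        # 6 h^2 / 8 m a^2 with h = 2 pi hbar
print(sp.simplify(sp.expand(H(phi) - w*phi)))  # 0
```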

Note that the linear combination $c_{1} \psi_{1}+c_{2} \psi_{2}$ is not an eigenfunction of $\hat{H}$ if $\psi_{1}$ and $\psi_{2}$ correspond to different energy eigenvalues ($\hat{H} \psi_{1}=E_{1} \psi_{1}$ and $\hat{H} \psi_{2}=E_{2} \psi_{2}$ with $E_{1} \neq E_{2}$).

Since any linear combination of the wave functions corresponding to a degenerate energy level is an eigenfunction of $\hat{H}$ with the same eigenvalue, we can construct an infinite number of different wave functions for any degenerate energy level. Actually, we are only interested in eigenfunctions that are linearly independent. The $n$ functions $f_{1}, \ldots, f_{n}$ are said to be linearly independent if the equation $c_{1} f_{1}+\cdots+c_{n} f_{n}=0$ can only be satisfied with all the constants $c_{1}, \ldots, c_{n}$ equal to zero. This means that no member of a set of linearly independent functions can be expressed as a linear combination of the remaining members. For example, the functions $f_{1}=3 x, f_{2}=5 x^{2}-x$, $f_{3}=x^{2}$ are not linearly independent, since $f_{2}=5 f_{3}-\frac{1}{3} f_{1}$. The functions $g_{1}=1$, $g_{2}=x, g_{3}=x^{2}$ are linearly independent, since none of them can be written as a linear combination of the other two.

The degree of degeneracy of an energy level is equal to the number of linearly independent wave functions corresponding to that value of the energy. The one-dimensional free-particle wave functions (2.30) are linear combinations of two linearly independent functions that are each an eigenfunction with the same energy eigenvalue $E$. Thus each such energy eigenvalue (except $E=0$ ) is doubly degenerate (meaning that the degree of degeneracy is two).


It was pointed out in Section 3.3 that, when the state function $\Psi$ is not an eigenfunction of the operator $\hat{B}$, a measurement of $B$ will give one of a number of possible values (the eigenvalues of $\hat{B}$ ). We now consider the average value of the property $B$ for a system whose state is $\Psi$.

To find the average value of $B$ experimentally, we take many identical, noninteracting systems each in the same state $\Psi$ and we measure $B$ in each system. The average value of $B$, symbolized by $\langle B\rangle$, is defined as the arithmetic mean of the observed values $b_{1}, b_{2}, \ldots, b_{N}$:

\(
\begin{equation}
\langle B\rangle=\frac{\sum_{j=1}^{N} b_{j}}{N} \tag{3.79}
\end{equation}
\)

where $N$, the number of systems, is extremely large.
Instead of summing over the observed values of $B$, we can sum over all the possible values of $B$, multiplying each possible value by the number of times it is observed, to get the equivalent expression

\(
\begin{equation}
\langle B\rangle=\frac{\sum_{b} n_{b} b}{N} \tag{3.80}
\end{equation}
\)

where $n_{b}$ is the number of times the value $b$ is observed. An example will make this clear. Suppose a class of nine students takes a quiz that has five questions and the students receive these grades: $0,20,20,60,60,80,80,80,100$. Calculating the average grade according to (3.79), we have

\(
\frac{1}{N} \sum_{j=1}^{N} b_{j}=\frac{0+20+20+60+60+80+80+80+100}{9}=\frac{500}{9} \approx 56
\)

To calculate the average grade according to (3.80), we sum over the possible grades: $0, 20, 40, 60, 80, 100$. We have

\(
\frac{1}{N} \sum_{b} n_{b} b=\frac{1(0)+2(20)+0(40)+2(60)+3(80)+1(100)}{9}=\frac{500}{9} \approx 56
\)
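
Both averaging formulas are easy to mirror in code; here is a minimal sketch (plain Python, with collections.Counter supplying the counts $n_{b}$):

```python
# Sketch: the quiz average computed by Eq. (3.79) (sum over observed values)
# and by Eq. (3.80) (sum over possible values weighted by their counts).
from collections import Counter

grades = [0, 20, 20, 60, 60, 80, 80, 80, 100]

avg_direct = sum(grades) / len(grades)                             # Eq. (3.79)
counts = Counter(grades)                                           # n_b for each grade b
avg_grouped = sum(n * b for b, n in counts.items()) / len(grades)  # Eq. (3.80)

print(avg_direct, avg_grouped)   # both print 500/9 = 55.55..., about 56
```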

Equation (3.80) can be written as

\(
\langle B\rangle=\sum_{b}\left(\frac{n_{b}}{N}\right) b
\)

Since $N$ is very large, $n_{b} / N$ is the probability $P_{b}$ of observing the value $b$, and

\(
\begin{equation}
\langle B\rangle=\sum_{b} P_{b} b \tag{3.81}
\end{equation}
\)

Now consider the average value of the $x$ coordinate for a one-particle, one-dimensional system in the state $\Psi(x, t)$. The $x$ coordinate takes on a continuous range of values, and the probability of observing the particle between $x$ and $x+d x$ is $|\Psi|^{2} d x$. The summation over the infinitesimal probabilities is equivalent to an integration over the full range of $x$, and (3.81) becomes

\(
\begin{equation}
\langle x\rangle=\int_{-\infty}^{\infty} x|\Psi(x, t)|^{2} d x \tag{3.82}
\end{equation}
\)

For the one-particle, three-dimensional case, the probability of finding the particle in the volume element at point $(x, y, z)$ with edges $d x, d y, d z$ is

\(
\begin{equation}
|\Psi(x, y, z, t)|^{2} d x d y d z \tag{3.83}
\end{equation}
\)

If we want the probability that the particle is between $x$ and $x+d x$, we must integrate (3.83) over all possible values of $y$ and $z$, since the particle can have any values for its $y$ and $z$ coordinates while its $x$ coordinate lies between $x$ and $x+d x$. Hence, in the three-dimensional case (3.82) becomes

\(
\begin{align}
& \langle x\rangle=\int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\Psi(x, y, z, t)|^{2} d y d z\right] x d x \\
& \langle x\rangle=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\Psi(x, y, z, t)|^{2} x d x d y d z \tag{3.84}
\end{align}
\)

Now consider the average value of some physical property $B(x, y, z)$ that is a function of the particle's coordinates. An example is the potential energy $V(x, y, z)$. The same reasoning that gave Eq. (3.84) yields

\(
\begin{gather}
\langle B(x, y, z)\rangle=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\Psi(x, y, z, t)|^{2} B(x, y, z) d x d y d z \tag{3.85}\\
\langle B(x, y, z)\rangle=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Psi^{*} B \Psi d x d y d z \tag{3.86}
\end{gather}
\)

The form (3.86) might seem like a bit of whimsy, since it is no different from (3.85). In a moment we shall see its significance.

In general, the property $B$ depends on both coordinates and momenta:

\(
B=B\left(x, y, z, p_{x}, p_{y}, p_{z}\right)
\)

for the one-particle, three-dimensional case. How do we find the average value of $B$ ? We postulate that $\langle B\rangle$ for a system in state $\Psi$ is

\(
\begin{align}
& \langle B\rangle=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Psi^{*} B\left(x, y, z, \frac{\hbar}{i} \frac{\partial}{\partial x}, \frac{\hbar}{i} \frac{\partial}{\partial y}, \frac{\hbar}{i} \frac{\partial}{\partial z}\right) \Psi d x d y d z \\
& \langle B\rangle=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Psi^{*} \hat{B} \Psi d x d y d z \tag{3.87}
\end{align}
\)

where $\hat{B}$ is the quantum-mechanical operator for the property $B$. [Later we shall provide some justification for this postulate by using (3.87) to show that the time-dependent Schrödinger equation reduces to Newton's second law in the transition from quantum to classical mechanics; see Prob. 7.59.] For the $n$-particle case, we postulate that

\(
\begin{equation}
\langle B\rangle=\int \Psi^{*} \hat{B} \Psi d \tau \tag{3.88}
\end{equation}
\)

where $\int d \tau$ denotes a definite integral over the full range of the $3 n$ coordinates. The state function in (3.88) must be normalized, since we took $\Psi^{*} \Psi$ as the probability density. It is important to have the operator properly sandwiched between $\Psi^{*}$ and $\Psi$. The quantities $\hat{B} \Psi^{*} \Psi$ and $\Psi^{*} \Psi \hat{B}$ are not the same as $\Psi^{*} \hat{B} \Psi$, unless $B$ is a function of coordinates only. In $\int \Psi^{*} \hat{B} \Psi d \tau$, one first operates on $\Psi$ with $\hat{B}$ to produce a new function $\hat{B} \Psi$, which is then multiplied by $\Psi^{*}$; one then integrates over all space to produce a number, which is $\langle B\rangle$.

For a stationary state, we have [Eq. (1.20)]

\(
\Psi^{*} \hat{B} \Psi=e^{i E t / \hbar} \psi^{*} \hat{B} e^{-i E t / \hbar} \psi=e^{0} \psi^{*} \hat{B} \psi=\psi^{*} \hat{B} \psi
\)

since $\hat{B}$ contains no time derivatives and does not affect the time factor in $\Psi$. Hence, for a stationary state,

\(
\begin{equation}
\langle B\rangle=\int \psi^{*} \hat{B} \psi d \tau \tag{3.89}
\end{equation}
\)

Thus, if $\hat{B}$ is time-independent, then $\langle B\rangle$ is time-independent in a stationary state.
Consider the special case where $\Psi$ is an eigenfunction of $\hat{B}$. When $\hat{B} \Psi=k \Psi$, Eq. (3.88) becomes

\(
\langle B\rangle=\int \Psi^{*} \hat{B} \Psi d \tau=\int \Psi^{*} k \Psi d \tau=k \int \Psi^{*} \Psi d \tau=k
\)

since $\Psi$ is normalized. This result is reasonable, since when $\hat{B} \Psi=k \Psi, k$ is the only possible value we can find for $B$ when we make a measurement (Section 3.3).

The following properties of average values are easily proved from Eq. (3.88) (see Prob. 3.49):

\(
\begin{equation}
\langle A+B\rangle=\langle A\rangle+\langle B\rangle \quad\langle c B\rangle=c\langle B\rangle \tag{3.90}
\end{equation}
\)

where $A$ and $B$ are any two properties and $c$ is a constant. However, the average value of a product need not equal the product of the average values: $\langle A B\rangle \neq\langle A\rangle\langle B\rangle$.

The term expectation value is often used instead of average value. The expectation value is not necessarily one of the possible values we might observe.

EXAMPLE

Find $\langle x\rangle$ and $\left\langle p_{x}\right\rangle$ for the ground stationary state of a particle in a three-dimensional box.
Substitution of the stationary-state wave function $\psi=f(x) g(y) h(z)$ [Eq. (3.62)] into the average-value postulate (3.89) gives

\(
\langle x\rangle=\int \psi^{*} \hat{x} \psi d \tau=\int_{0}^{c} \int_{0}^{b} \int_{0}^{a} f^{*} g^{*} h^{*} x f g h d x d y d z
\)

since $\psi=0$ outside the box. Use of (3.74) gives

\(
\langle x\rangle=\int_{0}^{a} x|f(x)|^{2} d x \int_{0}^{b}|g(y)|^{2} d y \int_{0}^{c}|h(z)|^{2} d z=\int_{0}^{a} x|f(x)|^{2} d x
\)

since $g(y)$ and $h(z)$ are each normalized. For the ground state, $n_{x}=1$ and $f(x)=(2 / a)^{1 / 2} \sin (\pi x / a)$. So

\(
\begin{equation}
\langle x\rangle=\frac{2}{a} \int_{0}^{a} x \sin ^{2}\left(\frac{\pi x}{a}\right) d x=\frac{a}{2} \tag{3.91}
\end{equation}
\)

where the Appendix integral (A.3) was used. A glance at Fig. 2.4 shows that this result is reasonable.

Also,

\(
\begin{align}
& \left\langle p_{x}\right\rangle=\int \psi^{*} \hat{p}_{x} \psi d \tau=\int_{0}^{c} \int_{0}^{b} \int_{0}^{a} f^{*} g^{*} h^{*} \frac{\hbar}{i} \frac{\partial}{\partial x}[f(x) g(y) h(z)] d x d y d z \\
& \left\langle p_{x}\right\rangle=\frac{\hbar}{i} \int_{0}^{a} f^{*}(x) f^{\prime}(x) d x \int_{0}^{b}|g(y)|^{2} d y \int_{0}^{c}|h(z)|^{2} d z \\
& \left\langle p_{x}\right\rangle=\frac{\hbar}{i} \int_{0}^{a} f(x) f^{\prime}(x) d x=\left.\frac{\hbar}{2 i} f^{2}(x)\right|_{0}^{a}=0 \tag{3.92}
\end{align}
\)

where the boundary conditions $f(0)=0$ and $f(a)=0$ were used. The result (3.92) is reasonable since the particle is equally likely to be headed in the $+x$ or $-x$ direction.
EXERCISE Find $\left\langle p_{x}^{2}\right\rangle$ for the ground state of a particle in a three-dimensional box. (Answer: $h^{2} / 4 a^{2}$.)
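
The integrals (3.91) and (3.92) can be checked symbolically; a sketch (SymPy assumed) for the ground-state factor $f(x)=(2 / a)^{1 / 2} \sin (\pi x / a)$:

```python
# Sketch (SymPy assumed): verify <x> = a/2 and <p_x> = 0, Eqs. (3.91)-(3.92),
# for the ground-state factor f(x) = (2/a)^(1/2) sin(pi x / a).
import sympy as sp

x, a, hbar = sp.symbols('x a hbar', positive=True)
f = sp.sqrt(2/a) * sp.sin(sp.pi*x/a)

x_avg = sp.integrate(x * f**2, (x, 0, a))                         # <x>
p_avg = sp.integrate(f * (hbar/sp.I) * sp.diff(f, x), (x, 0, a))  # <p_x>

print(sp.simplify(x_avg), sp.simplify(p_avg))   # a/2  0
```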


In solving the particle in a box, we required $\psi$ to be continuous. We now discuss other requirements the wave function must satisfy.

Since $|\psi|^{2} d \tau$ is a probability, we want to be able to normalize the wave function by choosing a suitable normalization constant $N$ as a multiplier of the wave function. If $\psi$ is unnormalized and $N \psi$ is normalized, the normalization condition (3.59) gives

\(
\begin{align}
1 & =\int|N \psi|^{2} d \tau=|N|^{2} \int|\psi|^{2} d \tau \\
|N| & =\left(\int|\psi|^{2} d \tau\right)^{-1 / 2} \tag{3.93}
\end{align}
\)

FIGURE 3.4 Function (a) is continuous, and its first derivative is continuous. Function (b) is continuous, but its first derivative has a discontinuity. Function (c) is discontinuous.

The definite integral $\int|\psi|^{2} d \tau$ will equal zero only if the function $\psi$ is zero everywhere. However, $\psi$ cannot be zero everywhere (this would mean no particles were present), so this integral is never zero. If $\int|\psi|^{2} d \tau$ is infinite, then the magnitude $|N|$ of the normalization constant is zero and $\psi$ cannot be normalized. We can normalize $\psi$ if and only if $\int|\psi|^{2} d \tau$ has a finite, rather than infinite, value. If the integral over all space $\int|\psi|^{2} d \tau$ is finite, $\psi$ is said to be quadratically integrable. Thus we generally demand that $\psi$ be quadratically integrable. The important exception is a particle that is not bound. Thus the wave functions for the unbound states of the particle in a well (Section 2.4) and for a free particle are not quadratically integrable.
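
Equation (3.93) can be applied directly; a sketch (SymPy assumed) computes the normalization constant for the unnormalized function $\psi=x(l-x)$ used in the measurement example of Section 3.3:

```python
# Sketch (SymPy assumed): normalization constant (3.93) for psi = x(l - x)
# on 0 <= x <= l.
import sympy as sp

x, l = sp.symbols('x l', positive=True)
psi = x * (l - x)
N = 1 / sp.sqrt(sp.integrate(psi**2, (x, 0, l)))
print(sp.simplify(N))   # sqrt(30)/l**(5/2), matching the (30/l^5)^(1/2) factor
```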

Since $\psi^{*} \psi$ is the probability density, it must be single-valued. It would be embarrassing if our theory gave two different values for the probability of finding a particle at a certain point. If we demand that $\psi$ be single-valued, then surely $\psi^{*} \psi$ will be single-valued. It is possible to have $\psi$ multivalued [for example, $\psi(q)=-1,+1, i$ ] and still have $\psi^{*} \psi$ single-valued. We will, however, demand single-valuedness for $\psi$.

In addition to demanding that $\psi$ be continuous, we usually also require that all the partial derivatives $\partial \psi / \partial x, \partial \psi / \partial y$, and so on, be continuous. (See Fig. 3.4.) Referring back to Section 2.2, however, we note that for the particle in a box, $d \psi / d x$ is discontinuous at the walls of the box; $\psi$ and $d \psi / d x$ are zero everywhere outside the box; but from Eq. (2.23) we see that $d \psi / d x$ does not become zero at the walls. The discontinuity in $\psi^{\prime}$ is due to the infinite jump in potential energy at the walls of the box. For a box with walls of finite height, $\psi^{\prime}$ is continuous at the walls (Section 2.4).

In line with the requirement of quadratic integrability, it is sometimes stated that the wave function must be finite everywhere, including infinity. However, this is usually a much stronger requirement than quadratic integrability. In fact, it turns out that some of the relativistic wave functions for the hydrogen atom are infinite at the origin but are quadratically integrable. Occasionally, one encounters nonrelativistic wave functions that are infinite at the origin [L. D. Landau and E. M. Lifshitz, Quantum Mechanics, 3rd ed. (1977), Section 35]. Thus the fundamental requirement is quadratic integrability, rather than finiteness.

We require that the eigenfunctions of any operator representing a physical quantity meet the above requirements. A function meeting these requirements is said to be well-behaved.


The Harmonic Oscillator

Click the keywords to view related YouTube videos

Harmonic Oscillator: A system in which a particle experiences a restoring force proportional to its displacement from equilibrium. It is a fundamental model for understanding molecular vibrations

Schrödinger Equation: A key equation in quantum mechanics that describes how the quantum state of a physical system changes over time. In this chapter, it is solved for the harmonic oscillator

Power-Series Method: A mathematical technique used to solve differential equations by expressing the solution as an infinite sum of terms

Force Constant (k): A parameter that measures the stiffness of the bond in a molecule. It is the proportionality constant in the force equation \( F = -kx \)

Vibration Frequency (ν): The frequency at which a molecule vibrates. For a harmonic oscillator, it is given by \( \nu = \frac{1}{2\pi} \sqrt{\frac{k}{m}} \)

Zero-Point Energy: The lowest possible energy that a quantum mechanical system may have. For a harmonic oscillator, it is \( \frac{1}{2} h \nu \)

Eigenvalues and Eigenfunctions: Solutions to the Schrödinger equation that describe the allowed energy levels and corresponding wave functions of a quantum system

Recursion Relation: A relation that defines each term of a sequence as a function of preceding terms. It is used in the power-series method to solve the Schrödinger equation

Hermite Polynomials: A set of orthogonal polynomials that arise in the solution of the Schrödinger equation for the harmonic oscillator

Classically Forbidden Region: Regions where the potential energy exceeds the total energy of the system, making it impossible for a classical particle to be found there

Reduced Mass (m): A hypothetical mass used in the analysis of two-body problems, defined as \( m = \frac{m_1 m_2}{m_1 + m_2} \)

Anharmonicity: The deviation of a system from the ideal harmonic oscillator model, leading to non-equally spaced energy levels

Boltzmann Distribution Law: A statistical law that describes the distribution of particles among various energy states in thermal equilibrium

Wavenumber (ν̅): The number of wavelengths per unit distance, often used in spectroscopy. It is defined as \( \bar{\nu} = \frac{1}{\lambda} = \frac{\nu}{c} \)

So far we have considered only cases where the potential energy $V(x)$ is a constant. This makes the Schrödinger equation a second-order linear homogeneous differential equation with constant coefficients, which we know how to solve. For cases in which $V$ varies with $x$, a useful approach is to try a power-series solution of the Schrödinger equation.

To illustrate the method, consider the differential equation

\(
\begin{equation}
y^{\prime \prime}(x)+c^{2} y(x)=0 \tag{4.1}
\end{equation}
\)

where $c^{2}>0$. Of course, this differential equation has constant coefficients, but we can solve it with the power-series method if we want. Let us first find the solution by using the auxiliary equation, which is $s^{2}+c^{2}=0$. We find $s= \pm i c$. Recalling the work in Section 2.2 [Eqs. (2.10) and (4.1) are the same], we get trigonometric solutions when the roots of the auxiliary equation are pure imaginary:

\(
\begin{equation}
y=A \cos c x+B \sin c x \tag{4.2}
\end{equation}
\)

where $A$ and $B$ are the constants of integration. A different form of (4.2) is

\(
\begin{equation}
y=D \sin (c x+e) \tag{4.3}
\end{equation}
\)

where $D$ and $e$ are arbitrary constants. Using the formula for the sine of the sum of two angles, we can show that (4.3) is equivalent to (4.2).

Now let us solve (4.1) using the power-series method. We start by assuming that the solution can be expanded in a Taylor series (see Prob. 4.1) about $x=0$; that is, we assume that

\(
\begin{equation}
y(x)=\sum_{n=0}^{\infty} a_{n} x^{n}=a_{0}+a_{1} x+a_{2} x^{2}+a_{3} x^{3}+\cdots \tag{4.4}
\end{equation}
\)

where the $a$ 's are constant coefficients to be determined so as to satisfy (4.1). Differentiating (4.4), we have

\(
\begin{equation}
y^{\prime}(x)=a_{1}+2 a_{2} x+3 a_{3} x^{2}+\cdots=\sum_{n=1}^{\infty} n a_{n} x^{n-1} \tag{4.5}
\end{equation}
\)

where we assumed that term-by-term differentiation is valid for the series. (This is not always true for infinite series.) For $y^{\prime \prime}$, we have

\(
\begin{equation}
y^{\prime \prime}(x)=2 a_{2}+3(2) a_{3} x+\cdots=\sum_{n=2}^{\infty} n(n-1) a_{n} x^{n-2} \tag{4.6}
\end{equation}
\)

Substituting (4.4) and (4.6) into (4.1), we get

\(
\begin{equation}
\sum_{n=2}^{\infty} n(n-1) a_{n} x^{n-2}+\sum_{n=0}^{\infty} c^{2} a_{n} x^{n}=0 \tag{4.7}
\end{equation}
\)

We want to combine the two sums in (4.7). Provided certain conditions are met, we can add two infinite series term by term to get their sum:

\(
\begin{equation}
\sum_{j=0}^{\infty} b_{j} x^{j}+\sum_{j=0}^{\infty} c_{j} x^{j}=\sum_{j=0}^{\infty}\left(b_{j}+c_{j}\right) x^{j} \tag{4.8}
\end{equation}
\)

To apply (4.8) to the two sums in (4.7), we want the limits in each sum to be the same and the powers of $x$ to be the same. We therefore change the summation index in the first sum in (4.7), defining $k$ as $k \equiv n-2$. The limits $n=2$ to $\infty$ correspond to $k=0$ to $\infty$ and use of $n=k+2$ gives
\(
\begin{equation}
\sum_{n=2}^{\infty} n(n-1) a_{n} x^{n-2}=\sum_{k=0}^{\infty}(k+2)(k+1) a_{k+2} x^{k}=\sum_{n=0}^{\infty}(n+2)(n+1) a_{n+2} x^{n} \tag{4.9}
\end{equation}
\)
The last equality in (4.9) is valid because the summation index is a dummy variable; it makes no difference what letter we use to denote this variable. For example, the sums $\sum_{i=1}^{3} c_{i} x^{i}$ and $\sum_{m=1}^{3} c_{m} x^{m}$ are equal because only the dummy variables in the two sums differ. This equality is easy to see if we write out the sums:

\(
\sum_{i=1}^{3} c_{i} x^{i}=c_{1} x+c_{2} x^{2}+c_{3} x^{3} \quad \text { and } \quad \sum_{m=1}^{3} c_{m} x^{m}=c_{1} x+c_{2} x^{2}+c_{3} x^{3}
\)

In the last equality in (4.9), we simply changed the symbol denoting the summation index from $k$ to $n$.

The integration variable in a definite integral is also a dummy variable, since the value of a definite integral is unaffected by what letter we use for this variable:

\(
\begin{equation}
\int_{a}^{b} f(x) d x=\int_{a}^{b} f(t) d t \tag{4.10}
\end{equation}
\)

Using (4.9) in (4.7), we find, after applying (4.8), that

\(
\begin{equation}
\sum_{n=0}^{\infty}\left[(n+2)(n+1) a_{n+2}+c^{2} a_{n}\right] x^{n}=0 \tag{4.11}
\end{equation}
\)

If (4.11) is to be true for all values of $x$, then the coefficient of each power of $x$ must vanish. To see this, consider the equation

\(
\begin{equation}
\sum_{j=0}^{\infty} b_{j} x^{j}=0 \tag{4.12}
\end{equation}
\)

Putting $x=0$ in (4.12) shows that $b_{0}=0$. Taking the first derivative of (4.12) with respect to $x$ and then putting $x=0$ shows that $b_{1}=0$. Taking the $n$th derivative and putting $x=0$ gives $b_{n}=0$. Thus, from (4.11), we have

\(
\begin{gather}
(n+2)(n+1) a_{n+2}+c^{2} a_{n}=0 \tag{4.13}\\
a_{n+2}=-\frac{c^{2}}{(n+1)(n+2)} a_{n} \tag{4.14}
\end{gather}
\)

Equation (4.14) is a recursion relation. If we know the value of $a_{0}$, we can use (4.14) to find $a_{2}, a_{4}, a_{6}, \ldots$ If we know $a_{1}$, we can find $a_{3}, a_{5}, a_{7}, \ldots$ Since there is no restriction on the values of $a_{0}$ and $a_{1}$, they are arbitrary constants, which we denote by $A$ and $B c$:

\(
\begin{equation}
a_{0}=A, \quad a_{1}=B c \tag{4.15}
\end{equation}
\)

Using (4.14), we find for the coefficients

\(
\begin{align}
a_{0}=A, \quad a_{2} & =-\frac{c^{2} A}{1 \cdot 2}, \quad a_{4}=\frac{c^{4} A}{4 \cdot 3 \cdot 2 \cdot 1}, \quad a_{6}=-\frac{c^{6} A}{6!}, \ldots \\
a_{2 k} & =(-1)^{k} \frac{c^{2 k} A}{(2 k)!}, \quad k=0,1,2,3, \ldots \tag{4.16}\\
a_{1}=B c, \quad a_{3} & =-\frac{c^{3} B}{2 \cdot 3}, \quad a_{5}=\frac{c^{5} B}{5 \cdot 4 \cdot 3 \cdot 2}, \quad a_{7}=-\frac{c^{7} B}{7!}, \ldots \\
a_{2 k+1} & =(-1)^{k} \frac{c^{2 k+1} B}{(2 k+1)!}, \quad k=0,1,2, \ldots \tag{4.17}
\end{align}
\)

From (4.4), (4.16), and (4.17), we have

\(
\begin{align}
& y=\sum_{n=0}^{\infty} a_{n} x^{n}=\sum_{n=0,2,4, \ldots}^{\infty} a_{n} x^{n}+\sum_{n=1,3,5, \ldots}^{\infty} a_{n} x^{n} \tag{4.18}\\
& y=A \sum_{k=0}^{\infty}(-1)^{k} \frac{c^{2 k} x^{2 k}}{(2 k)!}+B \sum_{k=0}^{\infty}(-1)^{k} \frac{c^{2 k+1} x^{2 k+1}}{(2 k+1)!} \tag{4.19}
\end{align}
\)

The two series in (4.19) are the Taylor series for $\cos c x$ and $\sin c x$ (Prob. 4.2). Hence, in agreement with (4.2), we have $y=A \cos c x+B \sin c x$.
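To make the recursion concrete, here is a minimal numerical check (an illustration added to these notes, not part of the derivation): it builds the partial sum of (4.4) from the recursion relation (4.14), starting from $a_{0}=A$ and $a_{1}=B c$, and compares the result with $A \cos c x+B \sin c x$. The values chosen for $A$, $B$, $c$, and $x$ are arbitrary.

#include <iostream>
#include <cmath>
using namespace std;

int main() {
    const double A = 1.0, B = 2.0, c = 1.5, x = 0.8;   // arbitrary test values
    double aEven = A;         // a_0
    double aOdd = B * c;      // a_1
    double sum = aEven + aOdd * x;
    double xpow = 1.0;        // holds x^n for the current even n
    for (int n = 0; n <= 20; n += 2) {
        // Recursion (4.14): a_{n+2} = -c^2 a_n / [(n+1)(n+2)]
        aEven *= -c * c / ((n + 1.0) * (n + 2.0));     // even-index coefficients
        aOdd  *= -c * c / ((n + 2.0) * (n + 3.0));     // odd-index coefficients
        xpow *= x * x;
        sum += aEven * xpow + aOdd * xpow * x;
    }
    cout << "partial sum of the series: " << sum << endl;
    cout << "A cos(cx) + B sin(cx):     " << A * cos(c * x) + B * sin(c * x) << endl;
}

The two printed numbers agree essentially to machine precision, since the factorials in (4.16) and (4.17) make the series converge very rapidly for moderate $|c x|$.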


In this section we will increase our quantum-mechanical repertoire by solving the Schrödinger equation for the one-dimensional harmonic oscillator. This system is important as a model for molecular vibrations.

Classical-Mechanical Treatment

Before looking at the wave mechanics of the harmonic oscillator, we review the classical treatment. We have a single particle of mass $m$ attracted toward the origin by a force proportional to the particle's displacement from the origin:

\(
\begin{equation}
F_{x}=-k x \tag{4.20}
\end{equation}
\)

The proportionality constant $k$ is called the force constant. $F_{x}$ is the $x$ component of the force on the particle. This is also the total force in this one-dimensional problem. Equation (4.20) is obeyed by a particle attached to a spring, provided the spring is not stretched greatly from its equilibrium position.

Newton's second law, $F=m a$, gives

\(
\begin{equation}
-k x=m \frac{d^{2} x}{d t^{2}} \tag{4.21}
\end{equation}
\)

where $t$ is the time. Equation (4.21) is the same as Eq. (4.1) with $c^{2}=k / m$; hence the solution is [Eq. (4.3) with $c=(k / m)^{1 / 2}$ ]

\(
\begin{equation}
x=A \sin (2 \pi \nu t+b) \tag{4.22}
\end{equation}
\)

where $A$ (the amplitude of the vibration) and $b$ are the integration constants, and the vibration frequency $\nu$ is

\(
\begin{equation}
\nu=\frac{1}{2 \pi}\left(\frac{k}{m}\right)^{1 / 2} \tag{4.23}
\end{equation}
\)

Since the sine function has maximum and minimum values of 1 and -1 , respectively, $x$ in (4.22) oscillates between $A$ and $-A$. The sine function repeats itself every $2 \pi$ radians, and the time needed for one complete oscillation (called the period) is the time it takes for the argument of the sine function to increase by $2 \pi$. At time $t+1 / \nu$, the argument of the sine function is $2 \pi \nu(t+1 / \nu)+b=2 \pi \nu t+2 \pi+b$, which is $2 \pi$ greater than the argument at time $t$, so the period is $1 / \nu$. The reciprocal of the period is the number of vibrations per unit time (the vibrational frequency), and so the frequency is $\nu$.

Now consider the energy. The potential energy $V$ is related to the components of force in the three-dimensional case by

\(
\begin{equation}
F_{x}=-\frac{\partial V}{\partial x}, \quad F_{y}=-\frac{\partial V}{\partial y}, \quad F_{z}=-\frac{\partial V}{\partial z} \tag{4.24}
\end{equation}
\)

Equation (4.24) is the definition of potential energy. Since this is a one-dimensional problem, we have [Eq. (1.12)]

\(
\begin{equation}
F_{x}=-\frac{d V}{d x}=-k x \tag{4.25}
\end{equation}
\)

Integration of (4.25) gives $V=\int k x d x=\frac{1}{2} k x^{2}+C$, where $C$ is a constant. The potential energy always has an arbitrary additive constant. Choosing $C=0$, we have [Eq. (4.23)]

\(
\begin{gather}
V=\frac{1}{2} k x^{2} \tag{4.26}\\
V=2 \pi^{2} \nu^{2} m x^{2} \tag{4.27}
\end{gather}
\)

The graph of $V(x)$ is a parabola (Fig. 4.5). The kinetic energy $T$ is

\(
\begin{equation}
T=\frac{1}{2} m(d x / d t)^{2} \tag{4.28}
\end{equation}
\)

and can be found by differentiating (4.22) with respect to $t$. Adding $T$ and $V$, one finds for the total energy (Prob. 4.4)

\(
\begin{equation}
E=T+V=\frac{1}{2} k A^{2}=2 \pi^{2} \nu^{2} m A^{2} \tag{4.29}
\end{equation}
\)

where the identity $\sin ^{2} \theta+\cos ^{2} \theta=1$ was used.
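The intermediate steps (the content of Prob. 4.4) are worth recording. Differentiating (4.22) and using (4.28), (4.26), and $k=4 \pi^{2} \nu^{2} m$ [from (4.23)] gives

\(
T=2 \pi^{2} \nu^{2} m A^{2} \cos ^{2}(2 \pi \nu t+b), \quad V=2 \pi^{2} \nu^{2} m A^{2} \sin ^{2}(2 \pi \nu t+b)
\)

and adding these two expressions gives (4.29).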

According to (4.22), the classical harmonic oscillator vibrates back and forth between $x=A$ and $x=-A$. These two points are the turning points for the motion. The particle has zero speed at these points, and the speed increases to a maximum at $x=0$, where the potential energy is zero and the energy is all kinetic energy. The classical harmonic oscillator spends more time in each of the regions near $x=A$ and $x=-A$ (where it is moving the slowest) than it does in the region near $x=0$. Problem 4.18 works out the probability density for finding the classical harmonic oscillator at various locations. (Interestingly, this probability density becomes infinite at the turning points.)
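For reference, the result of Prob. 4.18 can be stated compactly. The fraction of time spent in $d x$ is $d x /|v|$ divided by half the period, and from (4.22) the speed is $|v|=2 \pi \nu\left(A^{2}-x^{2}\right)^{1 / 2}$, so the classical probability density is

\(
P(x)=\frac{1}{\pi\left(A^{2}-x^{2}\right)^{1 / 2}}, \quad-A<x<A
\)

which is smallest at $x=0$ and becomes infinite at the turning points $x= \pm A$.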

Quantum-Mechanical Treatment

The harmonic-oscillator Hamiltonian operator is [Eqs. (3.27) and (4.27)]

\(
\begin{equation}
\hat{H}=\hat{T}+\hat{V}=-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+2 \pi^{2} \nu^{2} m x^{2}=-\frac{\hbar^{2}}{2 m}\left(\frac{d^{2}}{d x^{2}}-\alpha^{2} x^{2}\right) \tag{4.30}
\end{equation}
\)

where, to save time in writing, $\alpha$ was defined as

\(
\begin{equation}
\alpha \equiv 2 \pi \nu m / \hbar \tag{4.31}
\end{equation}
\)

The Schrödinger equation $\hat{H} \psi=E \psi$ reads, after multiplication by $2 m / \hbar^{2}$,

\(
\begin{equation}
\frac{d^{2} \psi}{d x^{2}}+\left(2 m E \hbar^{-2}-\alpha^{2} x^{2}\right) \psi=0 \tag{4.32}
\end{equation}
\)

We might now attempt a power-series solution of (4.32). If we try a power series for $\psi$ of the form (4.4), however, we will find that it leads to a three-term recursion relation, which is harder to deal with than a two-term recursion relation like Eq. (4.14). We therefore modify the form of (4.32) so as to get a two-term recursion relation when we try a series solution. A substitution that will achieve this purpose is (see Prob. 4.22) $f(x) \equiv e^{\alpha x^{2} / 2} \psi(x)$. Thus

\(
\begin{equation}
\psi=e^{-\alpha x^{2} / 2} f(x) \tag{4.33}
\end{equation}
\)

This equation is simply the definition of a new function $f(x)$ that replaces $\psi(x)$ as the unknown function to be solved for. (We can make any substitution we please in a differential equation.) Differentiating (4.33) twice, we have

\(
\begin{equation}
\psi^{\prime \prime}=e^{-\alpha x^{2} / 2}\left(f^{\prime \prime}-2 \alpha x f^{\prime}-\alpha f+\alpha^{2} x^{2} f\right) \tag{4.34}
\end{equation}
\)

Substituting (4.33) and (4.34) into (4.32), we find

\(
\begin{equation}
f^{\prime \prime}(x)-2 \alpha x f^{\prime}(x)+\left(2 m E \hbar^{-2}-\alpha\right) f(x)=0 \tag{4.35}
\end{equation}
\)

Now we try a series solution for $f(x)$ :

\(
\begin{equation}
f(x)=\sum_{n=0}^{\infty} c_{n} x^{n} \tag{4.36}
\end{equation}
\)

Assuming the validity of term-by-term differentiation of (4.36), we get

\(
\begin{equation}
f^{\prime}(x)=\sum_{n=1}^{\infty} n c_{n} x^{n-1}=\sum_{n=0}^{\infty} n c_{n} x^{n-1} \tag{4.37}
\end{equation}
\)

[The first term in the second sum in (4.37) is zero.] Also,

\(
f^{\prime \prime}(x)=\sum_{n=2}^{\infty} n(n-1) c_{n} x^{n-2}=\sum_{j=0}^{\infty}(j+2)(j+1) c_{j+2} x^{j}=\sum_{n=0}^{\infty}(n+2)(n+1) c_{n+2} x^{n}
\)

where we made the substitution $j=n-2$ and then changed the summation index from $j$ to $n$. [See Eq. (4.9).] Substitution into (4.35) gives

\(
\begin{align}
& \sum_{n=0}^{\infty}(n+2)(n+1) c_{n+2} x^{n}-2 \alpha \sum_{n=0}^{\infty} n c_{n} x^{n}+\left(2 m E \hbar^{-2}-\alpha\right) \sum_{n=0}^{\infty} c_{n} x^{n}=0 \\
& \sum_{n=0}^{\infty}\left[(n+2)(n+1) c_{n+2}-2 \alpha n c_{n}+\left(2 m E \hbar^{-2}-\alpha\right) c_{n}\right] x^{n}=0 \tag{4.38}
\end{align}
\)

Setting the coefficient of $x^{n}$ equal to zero [for the same reason as in Eq. (4.11)], we have

\(
\begin{equation}
c_{n+2}=\frac{\alpha+2 \alpha n-2 m E \hbar^{-2}}{(n+1)(n+2)} c_{n} \tag{4.39}
\end{equation}
\)

which is the desired two-term recursion relation. Equation (4.39) has the same form as (4.14), in that knowing $c_{n}$ we can calculate $c_{n+2}$. We thus have two arbitrary constants: $c_{0}$ and $c_{1}$. If we set $c_{1}$ equal to zero, then we will have as a solution a power series containing only even powers of $x$, multiplied by the exponential factor:

\(
\begin{equation}
\psi=e^{-\alpha x^{2} / 2} f(x)=e^{-\alpha x^{2} / 2} \sum_{n=0,2,4, \ldots}^{\infty} c_{n} x^{n}=e^{-\alpha x^{2} / 2} \sum_{l=0}^{\infty} c_{2 l} x^{2 l} \tag{4.40}
\end{equation}
\)

If we set $c_{0}$ equal to zero, we get another independent solution:

\(
\begin{equation}
\psi=e^{-\alpha x^{2} / 2} \sum_{n=1,3, \ldots}^{\infty} c_{n} x^{n}=e^{-\alpha x^{2} / 2} \sum_{l=0}^{\infty} c_{2 l+1} x^{2 l+1} \tag{4.41}
\end{equation}
\)

The general solution of the Schrödinger equation is a linear combination of these two independent solutions [recall Eq. (2.4)]:

\(
\begin{equation}
\psi=A e^{-\alpha x^{2} / 2} \sum_{l=0}^{\infty} c_{2 l+1} x^{2 l+1}+B e^{-\alpha x^{2} / 2} \sum_{l=0}^{\infty} c_{2 l} x^{2 l} \tag{4.42}
\end{equation}
\)

where $A$ and $B$ are arbitrary constants.
We now must see if the boundary conditions on the wave function lead to any restrictions on the solution. To see how the two infinite series behave for large $x$, we examine the ratio of successive coefficients in each series. The ratio of the coefficient of $x^{2 l+2}$ to that of $x^{2 l}$ in the second series is [set $n=2 l$ in (4.39)]

\(
\frac{c_{2 l+2}}{c_{2 l}}=\frac{\alpha+4 \alpha l-2 m E \hbar^{-2}}{(2 l+1)(2 l+2)}
\)

Assuming that for large values of $x$ the later terms in the series are the dominant ones, we look at this ratio for large values of $l$ :

\(
\begin{equation}
\frac{c_{2 l+2}}{c_{2 l}} \sim \frac{4 \alpha l}{(2 l)(2 l)}=\frac{\alpha}{l} \quad \text { for } l \text { large } \tag{4.43}
\end{equation}
\)

Setting $n=2 l+1$ in (4.39), we find that for large $l$ the ratio of successive coefficients in the first series is also $\alpha / l$. Now consider the power-series expansion for the function $e^{\alpha x^{2}}$. Using (Prob. 4.3)

\(
\begin{equation}
e^{z}=\sum_{n=0}^{\infty} \frac{z^{n}}{n!}=1+z+\frac{z^{2}}{2!}+\cdots \tag{4.44}
\end{equation}
\)

FIGURE 4.1 Lowest five energy levels for the one-dimensional harmonic oscillator.
we get

\(
e^{\alpha x^{2}}=1+\alpha x^{2}+\cdots+\frac{\alpha^{l} x^{2 l}}{l!}+\frac{\alpha^{l+1} x^{2 l+2}}{(l+1)!}+\cdots
\)

The ratio of the coefficients of $x^{2 l+2}$ and $x^{2 l}$ in this series is

\(
\frac{\alpha^{l+1}}{(l+1)!} \div \frac{\alpha^{l}}{l!}=\frac{\alpha}{l+1} \sim \frac{\alpha}{l} \quad \text { for large } l
\)

Thus the ratio of successive coefficients in each of the infinite series in the solution (4.42) is the same as in the series for $e^{\alpha x^{2}}$ for large $l$. We conclude that, for large $x$, each series behaves as $e^{\alpha x^{2}}$. [This is not a rigorous proof. A proper mathematical derivation is given in H. A. Buchdahl, Am. J. Phys., 42, 47 (1974); see also M. Bowen and J. Coster, Am. J. Phys., 48, 307 (1980).]

If each series behaves as $e^{\alpha x^{2}}$, then (4.42) shows that $\psi$ will behave as $e^{\alpha x^{2} / 2}$ for large $x$. The wave function will become infinite as $x$ goes to infinity and will not be quadratically integrable. If we could somehow break off the series after a finite number of terms, then the factor $e^{-\alpha x^{2} / 2}$ would ensure that $\psi$ went to zero as $x$ became infinite. (Using l'Hôpital's rule, it is easy to show that $x^{p} e^{-\alpha x^{2} / 2}$ goes to zero as $x \rightarrow \infty$, where $p$ is any finite power.) To have one of the series break off after a finite number of terms, the coefficient of $c_{n}$ in the recursion relation (4.39) must become zero for some value of $n$, say for $n=v$. This makes $c_{v+2}, c_{v+4}, \ldots$ all equal to zero, and one of the series in (4.42) will have a finite number of terms. In the recursion relation (4.39), there is one quantity whose value is not yet fixed, but can be adjusted to make the coefficient of $c_{v}$ vanish. This quantity is the energy $E$. Setting the coefficient of $c_{v}$ equal to zero in (4.39) and using (4.31) for $\alpha$, we get

\(
\begin{gather}
\alpha+2 \alpha v-2 m E \hbar^{-2}=0 \\
2 m E \hbar^{-2}=(2 v+1) 2 \pi \nu m \hbar^{-1} \\
E=\left(v+\frac{1}{2}\right) h \nu, \quad v=0,1,2, \ldots \tag{4.45}
\end{gather}
\)

The harmonic-oscillator stationary-state energy levels (4.45) are equally spaced (Fig. 4.1). Do not confuse the quantum number $v$ (vee) with the vibrational frequency $\nu$ (nu).

Substitution of (4.45) into the recursion relation (4.39) gives

\(
\begin{equation}
c_{n+2}=\frac{2 \alpha(n-v)}{(n+1)(n+2)} c_{n} \tag{4.46}
\end{equation}
\)

By quantizing the energy according to (4.45), we have made one of the series break off after a finite number of terms. To get rid of the other infinite series in (4.42), we must set the arbitrary constant that multiplies it equal to zero. This leaves us with a wave function that is $e^{-\alpha x^{2} / 2}$ times a finite power series containing only even or only odd powers of $x$, depending on whether $v$ is even or odd, respectively. The highest power of $x$ in this power series is $x^{v}$, since we chose $E$ to make $c_{v+2}, c_{v+4}, \ldots$ all vanish. The wave functions (4.42) are thus

\(
\begin{equation}
\psi_{v}= \begin{cases}e^{-\alpha x^{2} / 2}\left(c_{0}+c_{2} x^{2}+\cdots+c_{v} x^{v}\right) & \text { for } v \text { even } \\ e^{-\alpha x^{2} / 2}\left(c_{1} x+c_{3} x^{3}+\cdots+c_{v} x^{v}\right) & \text { for } v \text { odd }\end{cases} \tag{4.47}
\end{equation}
\)

where the arbitrary constants $A$ and $B$ in (4.42) can be absorbed into $c_{1}$ and $c_{0}$, respectively, and can therefore be omitted. The coefficients after $c_{0}$ and $c_{1}$ are found from the recursion relation (4.46). Since the quantum number $v$ occurs in the recursion relation, we get a different set of coefficients $c_{i}$ for each different $v$. For example, $c_{2}$ in $\psi_{4}$ differs from $c_{2}$ in $\psi_{2}$.

As with the particle in a box, the requirement that the wave functions be well-behaved forces us to quantize the energy. For values of $E$ that differ from (4.45), $\psi$ is not quadratically integrable. For example, Fig. 4.2 plots $\psi$ of Eq. (4.40) for the values $E / h \nu=0.499,0.500$, and 0.501, where the recursion relation (4.39) is used to calculate the coefficients $c_{n}$ (see also Prob. 4.23). Figure 4.3 gives an enlarged view of these curves in the region near $\alpha^{1 / 2} x=3$.
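The behavior shown in Figs. 4.2 and 4.3 is easy to reproduce numerically. The following sketch (an illustration added to these notes, with an arbitrarily chosen evaluation point and series cutoff) sums the even series of (4.40) in the reduced variable $x_{r}=\alpha^{1 / 2} x$, where the recursion (4.39) takes the form $d_{n+2}=\left(1+2 n-2 E_{r}\right) d_{n} /[(n+1)(n+2)]$ with $E_{r}=E / h \nu$.

#include <iostream>
#include <cmath>
using namespace std;

// Even-series solution (4.40) at reduced coordinate xr, for trial energy Er.
double psiEven(double Er, double xr, int nmax) {
    double d = 1.0;       // d_0; the overall scale is arbitrary
    double sum = d;
    double xpow = 1.0;
    for (int n = 0; n <= nmax; n += 2) {
        // reduced form of the recursion relation (4.39)
        d *= (1.0 + 2.0 * n - 2.0 * Er) / ((n + 1.0) * (n + 2.0));
        xpow *= xr * xr;
        sum += d * xpow;
    }
    return exp(-xr * xr / 2.0) * sum;
}

int main() {
    for (double Er : {0.499, 0.500, 0.501})
        cout << "Er = " << Er << "   psi(xr = 4) = " << psiEven(Er, 4.0, 400) << "\n";
}

For $E_{r}=0.500$ every coefficient after $d_{0}$ vanishes and the printed value is just $e^{-x_{r}^{2} / 2}$, while for $E_{r}=0.499$ and $0.501$ the surviving series makes $\psi$ much larger in magnitude (positive and negative, respectively), mirroring the divergence in Fig. 4.3.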

The harmonic-oscillator ground-state energy is nonzero. This energy, $\frac{1}{2} h \nu$, is called the zero-point energy. This would be the vibrational energy of a harmonic oscillator in a collection of harmonic oscillators at a temperature of absolute zero. The zero-point energy can be understood from the uncertainty principle. If the lowest state had an energy of zero, both its potential and kinetic energies (which are nonnegative) would have to be zero. Zero kinetic energy would mean that the momentum was exactly zero, and $\Delta p_{x}$ would be zero. Zero potential energy would mean that the particle was always located at the origin, and $\Delta x$ would be zero. But we cannot have both $\Delta x$ and $\Delta p_{x}$ equal to zero. Hence the need for a nonzero ground-state energy. Similar ideas apply for the particle in a box. The definition of the zero-point energy (ZPE) is $E_{\mathrm{ZPE}}=E_{\mathrm{gs}}-V_{\min }$, where $E_{\mathrm{gs}}$ and $V_{\min }$ are the ground-state energy and the minimum value of the potential-energy function.

FIGURE 4.2 Plots of the harmonic-oscillator Schrödinger-equation solution containing only even powers of $x$ for $E=0.499 h \nu, E=0.500 h \nu$, and $E=0.501 h \nu$. In the region around $x=0$ the three curves nearly coincide. For $\left|\alpha^{1 / 2} x\right|>3$ the $E=0.500 h \nu$ curve nearly coincides with the $x$ axis.

FIGURE 4.3 Enlargement of Fig. 4.2 in the region near $\alpha^{1 / 2} x=3$.

Even and Odd Functions

Before considering the wave functions in detail, we define even and odd functions. If $f(x)$ satisfies

\(
\begin{equation}
f(-x)=f(x) \tag{4.48}
\end{equation}
\)

then $f$ is an even function of $x$. Thus $x^{2}$ and $e^{-b x^{2}}$ are both even functions of $x$ since $(-x)^{2}=x^{2}$ and $e^{-b(-x)^{2}}=e^{-b x^{2}}$. The graph of an even function is symmetric about the $y$ axis (for example, see Fig. 4.4a). Therefore

\(
\begin{equation}
\int_{-a}^{+a} f(x) d x=2 \int_{0}^{a} f(x) d x \quad \text { for } f(x) \text { even } \tag{4.49}
\end{equation}
\)

If $g(x)$ satisfies

\(
\begin{equation}
g(-x)=-g(x) \tag{4.50}
\end{equation}
\)

then $g$ is an odd function of $x$. Examples are $x, 1 / x$, and $x^{3} e^{x^{2}}$. Setting $x=0$ in (4.50), we see that an odd function must be zero at $x=0$, provided $g(0)$ is defined and single-valued. The graph of an odd function has the general appearance of Fig. 4.4b. Because positive contributions on one side of the $y$ axis are canceled by corresponding negative contributions on the other side, we have

\(
\begin{equation}
\int_{-a}^{+a} g(x) d x=0 \quad \text { for } g(x) \text { odd } \tag{4.51}
\end{equation}
\)

It is easy to show that the product of two even functions or of two odd functions is an even function, while the product of an even and an odd function is an odd function.
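These rules follow directly from the definitions (4.48) and (4.50). For example, if $f$ and $g$ are both odd,

\(
f(-x) g(-x)=[-f(x)][-g(x)]=f(x) g(x)
\)

so the product $f g$ is even; the even-even and even-odd cases are verified the same way.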

The Harmonic-Oscillator Wave Functions

The exponential factor $e^{-\alpha x^{2} / 2}$ in (4.47) is an even function of $x$. If $v$ is an even number, the polynomial factor contains only even powers of $x$, which makes $\psi_{v}$ an even function. If $v$ is odd, the polynomial factor contains only odd powers of $x$, and $\psi_{v}$, being the product of an even function and an odd function, is an odd function. Each harmonic-oscillator stationary state $\psi$ is either an even or odd function according to whether the quantum number $v$ is even or odd. In Section 7.5, we shall see that, when the potential energy $V$ is an even function, the wave functions of nondegenerate levels must be either even or odd functions.

We now find the explicit forms of the wave functions of the lowest three levels. For the $v=0$ ground state, Eq. (4.47) gives

\(
\begin{equation}
\psi_{0}=c_{0} e^{-\alpha x^{2} / 2} \tag{4.52}
\end{equation}
\)

where the subscript on $\psi$ gives the value of $v$. We fix $c_{0}$ by normalization:

\(
1=\int_{-\infty}^{\infty}\left|c_{0}\right|^{2} e^{-\alpha x^{2}} d x=2\left|c_{0}\right|^{2} \int_{0}^{\infty} e^{-\alpha x^{2}} d x
\)

where Eq. (4.49) has been used. Using the integral (A.9) in the Appendix, we find $\left|c_{0}\right|=(\alpha / \pi)^{1 / 4}$. Therefore,

\(
\begin{equation}
\psi_{0}=(\alpha / \pi)^{1 / 4} e^{-\alpha x^{2} / 2} \tag{4.53}
\end{equation}
\)


if we choose the phase of the normalization constant to be zero. The wave function (4.53) is a Gaussian function (Fig. 4.4a).
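As a check on this normalization, the integral (A.9) used above is the standard Gaussian integral

\(
\int_{0}^{\infty} e^{-\alpha x^{2}} d x=\frac{1}{2}\left(\frac{\pi}{\alpha}\right)^{1 / 2}
\)

so $2\left|c_{0}\right|^{2} \cdot \frac{1}{2}(\pi / \alpha)^{1 / 2}=1$ indeed gives $\left|c_{0}\right|=(\alpha / \pi)^{1 / 4}$.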

For the $v=1$ state, Eq. (4.47) gives

\(
\begin{equation}
\psi_{1}=c_{1} x e^{-\alpha x^{2} / 2} \tag{4.54}
\end{equation}
\)

After normalization using the integral in Eq. (A.10), we have

\(
\begin{equation}
\psi_{1}=\left(4 \alpha^{3} / \pi\right)^{1 / 4} x e^{-\alpha x^{2} / 2} \tag{4.55}
\end{equation}
\)

Figure 4.4b shows $\psi_{1}$.
For $v=2$, Eq. (4.47) gives

\(
\psi_{2}=\left(c_{0}+c_{2} x^{2}\right) e^{-\alpha x^{2} / 2}
\)

The recursion relation (4.46) with $v=2$ gives

\(
c_{2}=\frac{2 \alpha(-2)}{1 \cdot 2} c_{0}=-2 \alpha c_{0}
\)

Therefore

\(
\begin{equation}
\psi_{2}=c_{0}\left(1-2 \alpha x^{2}\right) e^{-\alpha x^{2} / 2} \tag{4.56}
\end{equation}
\)

Evaluating $c_{0}$ by normalization, we find (Prob. 4.10)

\(
\begin{equation}
\psi_{2}=(\alpha / 4 \pi)^{1 / 4}\left(2 \alpha x^{2}-1\right) e^{-\alpha x^{2} / 2} \tag{4.57}
\end{equation}
\)

Note that $c_{0}$ in $\psi_{2}$ is not the same as $c_{0}$ in $\psi_{0}$.
The number of nodes in the wave function equals the quantum number $v$. It can be proved (see Messiah, pages 109-110) that for the bound stationary states of a one-dimensional problem, the number of nodes interior to the boundary points is zero for the ground-state $\psi$ and increases by one for each successive excited state. The boundary points for the harmonic oscillator are $\pm \infty$. Moreover, one can show that a one-dimensional wave function must change sign as it goes through a node (see Prob. 4.45).

FIGURE 4.4 Harmonic-oscillator wave functions. The same scale is used for all graphs. The points marked on the $x$ axes are for $\alpha^{1 / 2} x= \pm 2$.

FIGURE 4.5 The classically allowed $(-a \leq x \leq a)$ and forbidden ($x<-a$ and $x>a$) regions for the harmonic oscillator.

The polynomial factors in the harmonic-oscillator wave functions are well known in mathematics and are called Hermite polynomials, after a French mathematician. (See Prob. 4.21.)

According to the quantum-mechanical solution, there is some probability of finding the particle at any point on the $x$ axis (except at the nodes). Classically, $E=T+V$ and the kinetic energy $T$ cannot be negative: $T \geq 0$. Therefore, $E-V=T \geq 0$ and $V \leq E$. The potential energy $V$ is a function of position, and a classical particle is confined to the region of space where $V \leq E$; that is, where the potential energy does not exceed the total energy. In Fig. 4.5, the horizontal line labeled $E$ gives the energy of a harmonic oscillator, and the parabolic curve gives the potential energy $\frac{1}{2} k x^{2}$. For the regions $x<-a$ and $x>a$, we have $V>E$, and these regions are classically forbidden. The classically allowed region $-a \leq x \leq a$ in Fig. 4.5 is where $V \leq E$.

In quantum mechanics, the stationary-state wave functions are not eigenfunctions of $\hat{T}$ or $\hat{V}$, and we cannot assign definite values to $T$ or $V$ for a stationary state. Instead of the classical equations $E=T+V$ and $T \geq 0$, we have in quantum mechanics that $E=\langle T\rangle+\langle V\rangle$ (Prob. 6.35) and $\langle T\rangle \geq 0$ (Prob. 7.7), so $\langle V\rangle \leq E$ in quantum mechanics, but we cannot write $V \leq E$, and a particle has some probability to be found in classically forbidden regions where $V>E$.

It might seem that, by saying the particle can be found outside the classically allowed region, we are allowing it to have negative kinetic energy. Actually, there is no paradox in the quantum-mechanical view. To verify that the particle is in the classically forbidden region, we must measure its position. This measurement changes the state of the system (Sections 1.3 and 1.4). The interaction of the oscillator with the measuring apparatus transfers enough energy to the oscillator for it to be in the classically forbidden region. An accurate measurement of $x$ introduces a large uncertainty in the momentum and hence in the kinetic energy. Penetration of classically forbidden regions was previously discussed in Sections 2.4 and 2.5.

A harmonic-oscillator stationary state has $E=\left(v+\frac{1}{2}\right) h \nu$ and $V=\frac{1}{2} k x^{2}=2 \pi^{2} \nu^{2} m x^{2}$, so the classically allowed region where $V \leq E$ is where $2 \pi^{2} \nu^{2} m x^{2} \leq\left(v+\frac{1}{2}\right) h \nu$, which gives $x^{2} \leq\left(v+\frac{1}{2}\right) h / 2 \pi^{2} \nu m=(2 v+1) / \alpha$, where $\alpha \equiv 2 \pi \nu m / \hbar$ [Eq. (4.31)]. Therefore, the classically allowed region for the harmonic oscillator is where $-(2 v+1)^{1 / 2} \leq$ $\alpha^{1 / 2} x \leq(2 v+1)^{1 / 2}$.
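For instance, setting $v=0$ and $v=1$ gives turning points at

\(
\alpha^{1 / 2} x= \pm 1 \quad(v=0), \qquad \alpha^{1 / 2} x= \pm \sqrt{3} \quad(v=1)
\)

so the classically allowed region widens as $v$ increases.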

Note from Fig. 4.4 that $\psi$ oscillates in the classically allowed region and decreases exponentially to zero in the classically forbidden region. We previously saw this behavior for the particle in a rectangular well (Section 2.4).

Figure 4.4 shows that, as we go to higher-energy states of the harmonic oscillator, $\psi$ and $|\psi|^{2}$ tend to have maxima farther and farther from the origin. Since $V=\frac{1}{2} k x^{2}$ increases as we go farther from the origin, the average potential energy $\langle V\rangle=\int_{-\infty}^{\infty}|\psi|^{2} V d x$ increases as the quantum number increases. The average kinetic energy is given by $\langle T\rangle=-\left(\hbar^{2} / 2 m\right) \int_{-\infty}^{\infty} \psi^{*} \psi^{\prime \prime} d x$. Integration by parts gives (Prob. 7.7b) $\langle T\rangle=\left(\hbar^{2} / 2 m\right) \int_{-\infty}^{\infty}|d \psi / d x|^{2} d x$. The higher number of nodes in states with a higher quantum number produces a faster rate of change of $\psi$, so $\langle T\rangle$ increases as the quantum number increases.

A classical harmonic oscillator is most likely to be found in the regions near the turning points of the motion, where the oscillator is moving the slowest and $V$ is large. In contrast, for the ground state of a quantum harmonic oscillator, the most probable region is the region around the origin. For high oscillator quantum numbers, one finds that the outer peaks of $|\psi|^{2}$ are larger than the peaks near the origin, and the most probable regions become the regions near the classical turning points, where $V$ is large (see Prob. 4.18). This is an example of the correspondence principle (Section 2.2).

Some online simulations of the quantum harmonic oscillator are available at www.phy.davidson.edu/StuHome/cabellf/energy.html (shows energy levels and wave functions and shows how the wave function diverges when the energy is changed from an allowed value); www.falstad.com/qm1d/ (choose harmonic oscillator from the drop-down menu at the top; double click on one of the small circles at the bottom to show a stationary state; shows energies, wave functions, probability densities; $m$ and $k$ can be varied); demonstrations.wolfram.com/HarmonicOscillatorEigenfunctions (shows $\left|\psi_{v}\right|^{2}$).


We shall see in Section 13.1 that to an excellent approximation one can treat separately the motions of the electrons and the motions of the nuclei of a molecule. (This is due to the much heavier mass of the nuclei.) One first imagines the nuclei to be held stationary and solves a Schrödinger equation for the electronic energy $U$. ( $U$ also includes the energy of nuclear repulsion.) For a diatomic (two-atom) molecule, the electronic energy $U$ depends on the distance $R$ between the nuclei, $U=U(R)$, and the $U$ versus $R$ curve has the typical appearance of Fig. 13.1.

After finding $U(R)$, one solves a Schrödinger equation for nuclear motion, using $U(R)$ as the potential energy for nuclear motion. For a diatomic molecule, the nuclear Schrödinger equation is a two-particle equation. We shall see in Section 6.3 that, when the potential energy of a two-particle system depends only on the distance between the particles, the energy of the system is the sum of (a) the kinetic energy of translational motion of the entire system through space and (b) the energy of internal motion of the particles relative to each other. The classical expression for the two-particle internal-motion energy turns out to be the sum of the potential energy of interaction between the particles and the kinetic energy of a hypothetical particle whose mass is $m_{1} m_{2} /\left(m_{1}+m_{2}\right)$ (where $m_{1}$ and $m_{2}$ are the masses of the two particles) and whose coordinates are the coordinates of one particle relative to the other. The quantity $m_{1} m_{2} /\left(m_{1}+m_{2}\right)$ is called the reduced mass $\mu$.

The internal motion of a diatomic molecule consists of vibration, corresponding to a change in the distance $R$ between the two nuclei, and rotation, corresponding to a change in the spatial orientation of the line joining the nuclei. To a good approximation, one can usually treat the vibrational and rotational motions separately. The rotational energy levels are found in Section 6.4. Here we consider the vibrational levels.

The Schrödinger equation for the vibration of a diatomic molecule has a kinetic-energy operator for the hypothetical particle of mass $\mu=m_{1} m_{2} /\left(m_{1}+m_{2}\right)$ and a potential-energy term given by $U(R)$. If we place the origin to coincide with the minimum point of the $U$ curve in Fig. 13.1 and take the zero of potential energy at the energy of this minimum point, then the lower portion of the $U(R)$ curve will nearly coincide with

FIGURE 4.6 Potential energy for vibration of a diatomic molecule (solid curve) and for a harmonic oscillator (dashed curve). Also shown are the bound-state vibrational energy levels for the diatomic molecule. In contrast to the harmonic oscillator, a diatomic molecule has only a finite number of bound vibrational levels.

the potential-energy curve of a harmonic oscillator with the appropriate force constant $k$ (see Fig. 4.6 and Prob. 4.28). The minimum in the $U(R)$ curve occurs at the equilibrium distance $R_{e}$ between the nuclei. In Fig. 4.6, $x$ is the deviation of the internuclear distance from its equilibrium value: $x \equiv R-R_{e}$.

The harmonic-oscillator force constant $k$ in Eq. (4.26) is obtained as $k=d^{2} V / d x^{2}$, and the harmonic-oscillator curve essentially coincides with the $U(R)$ curve at $R=R_{e}$, so the molecular force constant is $k=\left.d^{2} U / d R^{2}\right|_{R=R_{e}}$ (see also Prob. 4.28). Differences in nuclear mass have virtually no effect on the electronic-energy curve $U(R)$, so different isotopic species of the same molecule have essentially the same force constant $k$.

We expect, therefore, that a reasonable approximation to the vibrational energy levels $E_{\text {vib }}$ of a diatomic molecule would be the harmonic-oscillator vibrational energy levels; Eqs. (4.45) and (4.23) give

\(
\begin{equation}
E_{\mathrm{vib}} \approx\left(v+\frac{1}{2}\right) h \nu_{e}, \quad v=0,1,2, \ldots \tag{4.58}
\end{equation}
\)

\(
\begin{equation}
\nu_{e}=\frac{1}{2 \pi}\left(\frac{k}{\mu}\right)^{1 / 2}, \quad \mu=\frac{m_{1} m_{2}}{m_{1}+m_{2}}, \quad k=\left.\frac{d^{2} U}{d R^{2}}\right|_{R=R_{e}} \tag{4.59}
\end{equation}
\)

$\nu_{e}$ is called the equilibrium (or harmonic) vibrational frequency. This approximation is best for the lower vibrational levels. As $v$ increases, the nuclei spend more time in regions far from their equilibrium separation. For such regions the potential energy deviates substantially from that of a harmonic oscillator and the harmonic-oscillator approximation is poor. Instead of being equally spaced, one finds that the vibrational levels of a diatomic molecule come closer and closer together as $v$ increases (Fig. 4.6). Eventually, the vibrational energy is large enough to dissociate the diatomic molecule into atoms that are not bound to each other. Unlike the harmonic oscillator, a diatomic molecule has only a finite number of bound-state vibrational levels. A more accurate expression for the molecular vibrational energy that allows for the anharmonicity of the vibration is

\(
\begin{equation}
E_{\mathrm{vib}}=\left(v+\frac{1}{2}\right) h \nu_{e}-\left(v+\frac{1}{2}\right)^{2} h \nu_{e} x_{e} \tag{4.60}
\end{equation}
\)

where the anharmonicity constant $\nu_{e} x_{e}$ is positive in nearly all cases.

Using the time-dependent Schrödinger equation, one finds (Section 9.9) that the most probable vibrational transitions when a diatomic molecule is exposed to electromagnetic radiation are those where $v$ changes by $\pm 1$. Furthermore, for absorption or emission of electromagnetic radiation to occur, the vibration must change the molecule's dipole moment. Hence homonuclear diatomics (such as $\mathrm{H}_{2}$ or $\mathrm{N}_{2}$) cannot undergo transitions between vibrational levels by absorption or emission of radiation. (Such transitions can occur during intermolecular collisions.) The relation $E_{\text {upper }}-E_{\text {lower }}=h \nu$, the approximate equation (4.58), and the selection rule $\Delta v=1$ for absorption of radiation show that a heteronuclear diatomic molecule whose vibrational frequency is $\nu_{e}$ will most strongly absorb light of frequency $\nu_{\text {light }}$ given approximately by
\(
\begin{equation}
\nu_{\text {light }}=\left(E_{2}-E_{1}\right) / h \approx\left[\left(v_{2}+\frac{1}{2}\right) h \nu_{e}-\left(v_{1}+\frac{1}{2}\right) h \nu_{e}\right] / h=\left(v_{2}-v_{1}\right) \nu_{e}=\nu_{e} \tag{4.61}
\end{equation}
\)
The values of $k$ and $\mu$ in (4.59) for diatomic molecules are such that $\nu_{\text {light }}$ usually falls in the infrared region of the spectrum. Transitions with $\Delta v=2,3, \ldots$ also occur, but these (called overtones) are much weaker than the $\Delta v=1$ absorption.

Use of the more accurate equation (4.60) gives (Prob. 4.27)

\(
\begin{equation}
\nu_{\text {light }}=\nu_{e}-2 \nu_{e} x_{e}\left(v_{1}+1\right) \tag{4.62}
\end{equation}
\)

where $v_{1}$ is the quantum number of the lower level and $\Delta v=1$.
The relative population of two molecular energy levels in a system in thermal equilibrium is given by the Boltzmann distribution law (see any physical chemistry text) as

\(
\begin{equation}
\frac{N_{i}}{N_{j}}=\frac{g_{i}}{g_{j}} e^{-\left(E_{i}-E_{j}\right) / k T} \tag{4.63}
\end{equation}
\)

where energy levels $i$ and $j$ have energies $E_{i}$ and $E_{j}$ and degeneracies $g_{i}$ and $g_{j}$ and are populated by $N_{i}$ and $N_{j}$ molecules, and where $k$ is Boltzmann's constant and $T$ the absolute temperature. For a nondegenerate level, $g_{i}=1$.

The magnitude of $\nu=(1 / 2 \pi)(k / \mu)^{1 / 2}$ is such that for light diatomics (for example, $\mathrm{H}_{2}, \mathrm{HCl}, \mathrm{CO}$) only the $v=0$ vibrational level is significantly populated at room temperature. For heavy diatomics (for example, $\mathrm{I}_{2}$), there is significant room-temperature population of one or more excited vibrational levels.
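As a rough illustration of this point, the sketch below evaluates the Boltzmann ratio $N_{1} / N_{0}=e^{-h \nu / k T}$ [Eq. (4.63) with nondegenerate levels and $E_{1}-E_{0}=h \nu$] at 298 K. The vibrational wavenumbers used, 2143 cm$^{-1}$ for CO (from the example below) and about 214 cm$^{-1}$ for $\mathrm{I}_{2}$ (an assumed typical literature value), are for illustration only.

#include <iostream>
#include <cmath>
using namespace std;

int main() {
    const double h = 6.626e-34;    // Planck constant, J s
    const double c = 2.9979e10;    // speed of light in cm/s, so h*c*wavenumber is in J
    const double kB = 1.381e-23;   // Boltzmann constant, J/K
    const double T = 298.0;        // temperature, K
    const double wn[] = {2143.0, 214.0};     // approximate wavenumbers, cm^-1
    const char* name[] = {"CO", "I2"};
    for (int i = 0; i < 2; i++) {
        double ratio = exp(-h * c * wn[i] / (kB * T));   // N1/N0 from Eq. (4.63)
        cout << name[i] << ": N1/N0 = " << ratio << "\n";
    }
}

The ratio comes out of order $10^{-5}$ for CO but roughly 0.3 to 0.4 for $\mathrm{I}_{2}$, which is why excited vibrational levels of heavy diatomics are significantly populated at room temperature.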

The vibrational absorption spectrum of a polar diatomic molecule consists of a $v=0 \rightarrow 1$ band, much weaker overtone bands $(v=0 \rightarrow 2,0 \rightarrow 3, \ldots)$, and, if $v>0$ levels are significantly populated, hot bands such as $v=1 \rightarrow 2,2 \rightarrow 3$. Each band corresponding to a particular vibrational transition consists of several closely spaced lines. Each such line corresponds to a different change in rotational state simultaneous with the change in vibrational state. Each line is the result of a vibration-rotation transition.

The SI unit for spectroscopic frequencies is the hertz (Hz), defined by $1 \mathrm{~Hz} \equiv 1 \mathrm{~s}^{-1}$. Multiples such as the megahertz (MHz) equal to $10^{6} \mathrm{~Hz}$ and the gigahertz (GHz) equal to $10^{9} \mathrm{~Hz}$ are often used. Infrared (IR) absorption lines are usually specified by giving their wavenumber $\widetilde{\nu}$ defined as

\(
\begin{equation}
\widetilde{\nu} \equiv 1 / \lambda=\nu / c \tag{4.64}
\end{equation}
\)

where $\lambda$ is the wavelength in vacuum.
In the harmonic-oscillator approximation, the quantum-mechanical vibrational energy levels of a polyatomic molecule turn out to be $E_{\text {vib }}=\sum_{i}\left(v_{i}+\frac{1}{2}\right) h \nu_{i}$, where the $\nu_{i}$ 's are the frequencies of the normal modes of vibration of the molecule and $v_{i}$ is the vibrational quantum number of the $i$ th normal mode. Each $v_{i}$ takes on the values $0,1,2, \ldots$ independently of the values of the other vibrational quantum numbers. A linear molecule with $n$ atoms has $3 n-5$ normal modes; a nonlinear molecule has $3 n-6$ normal modes. For example, linear $\mathrm{CO}_{2}$ ($n=3$) has four normal modes, while nonlinear $\mathrm{H}_{2} \mathrm{O}$ ($n=3$) has three.

To calculate the reduced mass $\mu$ in (4.59), one needs the masses of isotopic species. Some relative isotopic masses are listed in Table A. 3 in the Appendix.

EXAMPLE

The strongest infrared band of ${ }^{12} \mathrm{C}^{16} \mathrm{O}$ occurs at $\widetilde{\nu}=2143 \mathrm{~cm}^{-1}$. Find the force constant of ${ }^{12} \mathrm{C}^{16} \mathrm{O}$. State any approximation made.
The strongest infrared band corresponds to the $v=0 \rightarrow 1$ transition. We approximate the molecular vibration as that of a harmonic oscillator. From (4.61), the equilibrium molecular vibrational frequency is approximately

\(
\nu_{e} \approx \nu_{\text {light }}=\widetilde{\nu} c=\left(2143 \mathrm{~cm}^{-1}\right)\left(2.9979 \times 10^{10} \mathrm{~cm} / \mathrm{s}\right)=6.424 \times 10^{13} \mathrm{~s}^{-1}
\)

To relate $k$ to $\nu_{e}$ in (4.59), we need the reduced mass $\mu=m_{1} m_{2} /\left(m_{1}+m_{2}\right)$. One mole of ${ }^{12} \mathrm{C}$ has a mass of 12 g and contains Avogadro's number of atoms. Hence the mass of one atom of ${ }^{12} \mathrm{C}$ is $(12 \mathrm{~g}) /\left(6.02214 \times 10^{23}\right)$. The reduced mass and force constant are

\(
\begin{gathered}
\mu=\frac{12(15.9949) \mathrm{g}}{27.9949} \frac{1}{6.02214 \times 10^{23}}=1.1385 \times 10^{-23} \mathrm{~g} \\
k=4 \pi^{2} \nu_{e}^{2} \mu=4 \pi^{2}\left(6.424 \times 10^{13} \mathrm{~s}^{-1}\right)^{2}\left(1.1385 \times 10^{-26} \mathrm{~kg}\right)=1855 \mathrm{~N} / \mathrm{m}
\end{gathered}
\)

EXERCISE (a) Find the approximate zero-point energy of ${ }^{12} \mathrm{C}^{16} \mathrm{O}$.
(Answer: $2.1 \times 10^{-20} \mathrm{~J}$.) (b) Estimate $\nu_{e}$ of ${ }^{13} \mathrm{C}^{16} \mathrm{O}$. (Answer: $6.28 \times 10^{13} \mathrm{~s}^{-1}$.)
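The arithmetic in this example is easy to script. The following sketch reproduces the numbers above; the isotopic masses for ${ }^{12} \mathrm{C}$ and ${ }^{16} \mathrm{O}$ are those used in the example, while the ${ }^{13} \mathrm{C}$ mass of 13.0034 used for part (b) of the exercise is an assumed literature value.

#include <iostream>
#include <cmath>
using namespace std;

int main() {
    const double PI = 3.14159265358979;
    const double NA = 6.02214e23;           // Avogadro's number, mol^-1
    const double h = 6.626e-34;             // Planck constant, J s
    const double nu = 2143.0 * 2.9979e10;   // nu_e = wavenumber * c, s^-1

    // Reduced mass of 12C16O in kg (atomic masses in g/mol, hence the /1000)
    double mu = (12.0 * 15.9949) / (12.0 + 15.9949) / NA / 1000.0;
    double k = 4.0 * PI * PI * nu * nu * mu;          // Eq. (4.59)
    cout << "k = " << k << " N/m\n";                  // about 1855 N/m
    cout << "ZPE = " << 0.5 * h * nu << " J\n";       // about 2.1e-20 J

    // (b) isotope shift: same k, different reduced mass
    double mu13 = (13.0034 * 15.9949) / (13.0034 + 15.9949) / NA / 1000.0;
    cout << "nu_e(13C16O) = " << nu * sqrt(mu / mu13) << " s^-1\n";  // about 6.28e13
}

Because $k$ is essentially the same for both isotopic species, the frequency shift comes entirely from the factor $(\mu_{12} / \mu_{13})^{1 / 2}$.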


The Numerov Method

We solved the Schrödinger equation exactly for the particle in a box and the harmonic oscillator. For many potential-energy functions $V(x)$, the one-particle, one-dimensional Schrödinger equation cannot be solved exactly. This section presents a numerical method (the Numerov method) for computer solution of the one-particle, one-dimensional Schrödinger equation that allows one to get accurate bound-state eigenvalues and eigenfunctions for an arbitrary $V(x)$.

To solve the Schrödinger equation numerically, we deal with a portion of the $x$ axis that includes the classically allowed region and that extends somewhat into the classically forbidden region at each end of the classically allowed region. We divide this portion of the $x$ axis into small intervals, each of length $s$ (Fig. 4.7). The points $x_{0}$ and $x_{\max }$ are the endpoints of this portion, and $x_{n}$ is the endpoint of the $n$th interval. Let $\psi_{n-1}, \psi_{n}$, and $\psi_{n+1}$ denote the values of $\psi$ at the points $x_{n}-s, x_{n}$, and $x_{n}+s$, respectively (these are the endpoints of adjacent intervals):

\(
\begin{equation}
\psi_{n-1} \equiv \psi\left(x_{n}-s\right), \quad \psi_{n} \equiv \psi\left(x_{n}\right), \quad \psi_{n+1} \equiv \psi\left(x_{n}+s\right) \tag{4.65}
\end{equation}
\)

Don't be confused by the notation. The subscripts $n-1, n$, and $n+1$ do not label different states but rather indicate values of one particular wave function $\psi$ at points on the $x$ axis separated by the interval $s$. The $n$ subscript means $\psi$ is evaluated at the point $x_{n}$. We write the Schrödinger equation $-\left(\hbar^{2} / 2 m\right) \psi^{\prime \prime}+V \psi=E \psi$ as

\(
\begin{equation}
\psi^{\prime \prime}=G \psi, \quad \text { where } G \equiv m \hbar^{-2}[2 V(x)-2 E] \tag{4.66}
\end{equation}
\)

By expanding $\psi\left(x{n}+s\right)$ and $\psi\left(x{n}-s\right)$ in Taylor series involving powers of $s$, adding these two expansions to eliminate odd powers of $s$, using the Schrödinger equation to express $\psi^{\prime \prime}$ and $\psi^{(\mathrm{iv})}$ in terms of $\psi$, and neglecting terms in $s^{6}$ and higher powers of $s$ (an approximation that will be accurate if $s$ is small), one finds that (Prob. 4.43)

\(
\begin{equation}
\psi_{n+1} \approx \frac{2 \psi_{n}-\psi_{n-1}+5 G_{n} \psi_{n} s^{2} / 6+G_{n-1} \psi_{n-1} s^{2} / 12}{1-G_{n+1} s^{2} / 12} \tag{4.67}
\end{equation}
\)

where $G_{n} \equiv G\left(x_{n}\right) \equiv m \hbar^{-2}\left[2 V\left(x_{n}\right)-2 E\right]$ [Eqs. (4.66) and (4.65)]. Equation (4.67) allows us to calculate $\psi_{n+1}$, the value of $\psi$ at point $x_{n}+s$, if we know $\psi_{n}$ and $\psi_{n-1}$, the values of $\psi$ at the preceding two points $x_{n}$ and $x_{n}-s$.

How do we use (4.67) to solve the Schrödinger equation? We first guess a value $E_{\text {guess }}$ for an energy eigenvalue. We start at a point $x_{0}$ well into the left-hand classically forbidden region (Fig. 4.7), where $\psi$ will be very small, and we approximate $\psi$ as zero at this point: $\psi_{0} \equiv \psi\left(x_{0}\right)=0$. Also, we pick a point $x_{\max }$ well into the right-hand classically forbidden region, where $\psi$ will be very small and we shall demand that $\psi\left(x_{\max }\right)=0$. We pick a small value for the interval $s$ between successive points, and we take $\psi$ at $x_{0}+s$ as some small number, say, $0.0001: \psi_{1} \equiv \psi\left(x_{1}\right) \equiv \psi\left(x_{0}+s\right)=0.0001$. The value of $\psi_{1}$ will not make any difference in the eigenvalues found. If 0.001 were used instead of 0.0001 for $\psi_{1}$, Eq. (4.67) shows that this would simply multiply all values of $\psi$ at subsequent points by 10 (Prob. 4.41). This would not affect the eigenvalues [see the example after Eq. (3.14)]. The wave function can be normalized after each eigenvalue is found.

Having chosen values for $\psi_{0}$ and $\psi_{1}$, we then use (4.67) with $n=1$ to calculate $\psi_{2} \equiv \psi\left(x_{2}\right) \equiv \psi\left(x_{1}+s\right)$, where the $G$ values are calculated using $E_{\text {guess }}$. Next, (4.67) with $n=2$ gives $\psi_{3}$; and so on. We continue until we reach $x_{\max }$. If $E_{\text {guess }}$ is not equal to or very close to an eigenvalue, $\psi$ will not be quadratically integrable and $\left|\psi\left(x_{\max }\right)\right|$ will be very large. If $\psi\left(x_{\max }\right)$ is not found to be close to zero, we start again at $x_{0}$ and repeat the process using a new $E_{\text {guess }}$. The process is repeated until we find an $E_{\text {guess }}$ that makes $\psi\left(x_{\max }\right)$ very close to zero. $E_{\text {guess }}$ is then essentially equal to an eigenvalue. The systematic way to locate the eigenvalues is to count the nodes in the $\psi$ produced by $E_{\text {guess }}$. Recall (Section 4.2) that in a one-dimensional problem, the number of interior nodes is 0 for the ground state, 1 for the first excited state, and so on. Let $E_{1}, E_{2}, E_{3}, \ldots$ denote the energies of the ground state, the first excited state, the second excited state, and so on, respectively. If $\psi_{\text {guess }}$ contains no nodes between $x_{0}$ and $x_{\max }$, then $E_{\text {guess }}$ is less than or equal to $E_{1}$; if

FIGURE 4.7 $V$ versus $x$ for a one-particle, one-dimensional system.

FIGURE 4.8 The number of nodes in a Numerov-method solution as a function of the energy $E_{\text {guess }}$.

$\psi_{\text {guess }}$ contains one interior node, then $E_{\text {guess }}$ is between $E_{1}$ and $E_{2}$ (Fig. 4.8). Examples are given later.

Dimensionless Variables

The Numerov method requires that we guess values of $E$. What should be the order of magnitude of our guesses: $10^{-20} \mathrm{~J}, 10^{-15} \mathrm{~J}, \ldots$ ? To answer this question, we reformulate the Schrödinger equation using dimensionless variables, taking the harmonic oscillator as the example.

The harmonic oscillator has $V=\frac{1}{2} k x^{2}$, and the harmonic-oscillator Schrödinger equation contains the three constants $k, m$, and $\hbar$. We seek to find a dimensionless reduced energy $E_{r}$ and a dimensionless reduced $x$ coordinate $x_{r}$ that are defined by

\(
\begin{equation}
E_{r} \equiv E / A, \quad x_{r} \equiv x / B \tag{4.68}
\end{equation}
\)

where the constant $A$ is a combination of $k, m$, and $\hbar$ that has dimensions of energy, and $B$ is a combination with dimensions of length. The dimensions of energy are mass $\times$ length $^{2} \times$ time $^{-2}$, which we write as

\(
\begin{equation}
[E]=\mathrm{ML}^{2} \mathrm{~T}^{-2} \tag{4.69}
\end{equation}
\)

where the brackets around $E$ denote its dimensions, and M, L, and T stand for the dimensions mass, length, and time, respectively. The equation $V=\frac{1}{2} k x^{2}$ shows that $k$ has dimensions of energy $\times$ length $^{-2}$, and (4.69) gives $[k]=\mathrm{MT}^{-2}$. The dimensions of $\hbar$ are energy $\times$ time. Thus

\(
\begin{equation}
[m]=\mathrm{M}, \quad[k]=\mathrm{MT}^{-2}, \quad[\hbar]=\mathrm{ML}^{2} \mathrm{~T}^{-1} \tag{4.70}
\end{equation}
\)

The dimensions of $A$ and $B$ in (4.68) are energy and length, respectively, so

\(
\begin{equation}
[A]=\mathrm{ML}^{2} \mathrm{~T}^{-2}, \quad[B]=\mathrm{L} \tag{4.71}
\end{equation}
\)

Let $A=m^{a} k^{b} \hbar^{c}$, where $a, b$, and $c$ are powers that are determined by the requirement that the dimensions of $A$ must be $\mathrm{ML}^{2} \mathrm{~T}^{-2}$. We have

\(
\begin{equation}
[A]=\left[m^{a} k^{b} \hbar^{c}\right]=\mathrm{M}^{a}\left(\mathrm{MT}^{-2}\right)^{b}\left(\mathrm{ML}^{2} \mathrm{~T}^{-1}\right)^{c}=\mathrm{M}^{a+b+c} \mathrm{~L}^{2 c} \mathrm{~T}^{-2 b-c} \tag{4.72}
\end{equation}
\)

Equating the exponents of each of $\mathrm{M}, \mathrm{L}$, and T in (4.71) and (4.72), we have

\(
a+b+c=1, \quad 2 c=2, \quad-2 b-c=-2
\)

Solving these equations, we get $c=1, b=\frac{1}{2}, a=-\frac{1}{2}$. Therefore,

\(
\begin{equation}
A=m^{-1 / 2} k^{1 / 2} \hbar \tag{4.73}
\end{equation}
\)

Let $B=m^{d} k^{e} \hbar^{f}$. The same dimensional-analysis procedure that gave (4.73) gives (Prob. 4.44)

\(
\begin{equation}
B=m^{-1 / 4} k^{-1 / 4} \hbar^{1 / 2} \tag{4.74}
\end{equation}
\)
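The steps (the content of Prob. 4.44) parallel those for $A$. Setting $[B]=\left[m^{d} k^{e} \hbar^{f}\right]=\mathrm{M}^{d+e+f} \mathrm{~L}^{2 f} \mathrm{~T}^{-2 e-f}$ equal to $\mathrm{L}$ and equating the exponents of $\mathrm{M}$, $\mathrm{L}$, and $\mathrm{T}$ gives

\(
d+e+f=0, \quad 2 f=1, \quad-2 e-f=0
\)

whose solution is $d=-\frac{1}{4}, e=-\frac{1}{4}, f=\frac{1}{2}$, in agreement with (4.74).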

From (4.68), (4.73), and (4.74), the reduced variables for the harmonic oscillator are

\(
\begin{equation}
E_{r}=E / m^{-1 / 2} k^{1 / 2} \hbar, \quad x_{r}=x / m^{-1 / 4} k^{-1 / 4} \hbar^{1 / 2} \tag{4.75}
\end{equation}
\)

Using $k^{1 / 2}=2 \pi \nu m^{1 / 2}$ [Eq. (4.23)] to eliminate $k$ from (4.75) and recalling the definition $\alpha \equiv 2 \pi \nu m / \hbar$ [Eq. (4.31)], we have the alternative expressions

\(
\begin{equation}
E_{r}=E / h \nu, \quad x_{r}=\alpha^{1 / 2} x \tag{4.76}
\end{equation}
\)

Similar to the equation $E_{r} \equiv E / A$ [Eq. (4.68)], we define the reduced potential-energy function $V_{r}$ as

\(
\begin{equation}
V_{r} \equiv V / A \tag{4.77}
\end{equation}
\)

Since $|\psi(x)|^{2} d x$ is a probability, and probabilities are dimensionless, the normalized $\psi(x)$ must have the dimensions of length ${ }^{-1 / 2}$. We therefore define a reduced normalized wave function $\psi_{r}$ that is dimensionless. From (4.71), $B$ has dimensions of length, so $B^{-1 / 2}$ has units of length ${ }^{-1 / 2}$. Therefore,

\(
\begin{equation}
\psi_{r}=\psi / B^{-1 / 2} \tag{4.78}
\end{equation}
\)

$\psi_{r}$ satisfies $\int_{-\infty}^{\infty}\left|\psi_{r}\right|^{2} d x_{r}=1$; this follows from (4.68), (4.78), and $\int_{-\infty}^{\infty}|\psi|^{2} d x=1$.
We now rewrite the Schrödinger equation in terms of the reduced variables $x_{r}, \psi_{r}, V_{r}$, and $E_{r}$. We have

\(
\begin{align}
\frac{d^{2} \psi}{d x^{2}}=\frac{d^{2}}{d x^{2}} B^{-1 / 2} \psi{r}=B^{-1 / 2} \frac{d}{d x} \frac{d \psi{r}}{d x} & =B^{-1 / 2} \frac{d}{d x} \frac{d \psi{r}}{d x{r}} \frac{d x{r}}{d x}=B^{-1 / 2} \frac{d\left(d \psi{r} / d x{r}\right)}{d x{r}} \frac{d x{r}}{d x} \frac{d x{r}}{d x} \
\frac{d^{2} \psi}{d x^{2}} & =B^{-5 / 2} \frac{d^{2} \psi{r}}{d x{r}^{2}} \tag{4.79}
\end{align}
\)

since $d x_{r} / d x=B^{-1}$ [Eq. (4.68)]. Substitution of (4.68), (4.77), and (4.79) into the Schrödinger equation $-\left(\hbar^{2} / 2 m\right)\left(d^{2} \psi / d x^{2}\right)+V \psi=E \psi$ gives

\(
\begin{align}
-\frac{\hbar^{2}}{2 m} B^{-5 / 2} \frac{d^{2} \psi_{r}}{d x_{r}^{2}}+A V_{r} B^{-1 / 2} \psi_{r} & =A E_{r} B^{-1 / 2} \psi_{r} \\
-\frac{\hbar^{2}}{2 m} \frac{1}{A B^{2}} \frac{d^{2} \psi_{r}}{d x_{r}^{2}}+V_{r} \psi_{r} & =E_{r} \psi_{r} \tag{4.80}
\end{align}
\)

From (4.73) and (4.74), we get $A B^{2}=\hbar^{2} / m$, so $\hbar^{2} / m A B^{2}=1$ for the harmonic oscillator.

More generally, let $V$ contain a single parameter $c$ that is not dimensionless. For example, we might have $V=c x^{4}$ or $V=c x^{2}\left(1+0.05 m^{1 / 2} c^{1 / 2} \hbar^{-1} x^{2}\right)$. (Note that $m^{1 / 2} c^{1 / 2} \hbar^{-1} x^{2}$ is dimensionless, as it must be, since 1 is dimensionless.) The quantity $A B^{2}$ in (4.80) must have the form $A B^{2}=\hbar^{r} m^{s} c^{t}$, where $r, s$, and $t$ are certain powers. Since the term $V_{r} \psi_{r}$ in (4.80) is dimensionless, the first term is dimensionless. Therefore, $\hbar^{2} / m A B^{2}$ is dimensionless and $A B^{2}$ has the same dimensions as $\hbar^{2} / m$; so $r=2, s=-1$, and $t=0$. With $A B^{2}=\hbar^{2} / m$, Eq. (4.80) gives as the dimensionless Schrödinger equation

\(
\begin{gather}
\frac{d^{2} \psi_{r}}{d x_{r}^{2}}=\left(2 V_{r}-2 E_{r}\right) \psi_{r} \tag{4.81}\\
\psi_{r}^{\prime \prime}=G_{r} \psi_{r}, \quad \text { where } \quad G_{r} \equiv 2 V_{r}-2 E_{r} \tag{4.82}
\end{gather}
\)

For the harmonic oscillator, $V_{r} \equiv V / A=\frac{1}{2} k x^{2} / m^{-1 / 2} k^{1 / 2} \hbar=\frac{1}{2} x_{r}^{2}$ [Eqs. (4.73) and (4.75)]:

\(
\begin{equation}
V_{r}=\frac{1}{2} x_{r}^{2} \tag{4.83}
\end{equation}
\)

Having reduced the harmonic-oscillator Schrödinger equation to the form (4.81) involving only dimensionless quantities, we can expect that the lowest energy eigenvalues will be of the order of magnitude 1.

The reduced harmonic-oscillator Schrödinger equation (4.82) has the same form as (4.66), so we can use the Numerov formula (4.67) with $\psi, G$, and $s$ replaced by $\psi_{r}, G_{r}$, and $s_{r}$, respectively, where, similar to (4.68), $s_{r} \equiv s / B$.

Once numerical values of the reduced energy $E_{r}$ have been found, the energies $E$ are found from (4.75) or (4.76).

Choice of $x_{r, 0}, x_{r, \max }$, and $s_{r}$

We now need to choose initial and final values of $x_{r}$ and the value of the interval $s_{r}$ between adjacent points. Suppose we want to find all the harmonic-oscillator eigenvalues and eigenfunctions with $E_{r} \leq 5$. We start the solution in the left-hand classically forbidden region, so we first locate the classically forbidden regions for $E_{r}=5$. The boundaries between the classically allowed and forbidden regions are where $E_{r}=V_{r}$. From (4.83), $V_{r}=\frac{1}{2} x_{r}^{2}$. Thus $E_{r}=V_{r}$ becomes $5=\frac{1}{2} x_{r}^{2}$, and the classically allowed region for $E_{r}=5$ is from $x_{r}=-(10)^{1 / 2}=-3.16$ to +3.16. For $E_{r}<5$, the classically allowed region is smaller. We want to start the solution at a point well into the left-hand classically forbidden region, where $\psi$ is very small, and we want to end the solution well into the right-hand classically forbidden region. The left-hand classically forbidden region ends at $x_{r}=-3.16$ for $E_{r}=5$, and a reasonable choice is to start at $x_{r}=-5$. [Starting too far into the classically forbidden region can sometimes lead to trouble (see the following), so some trial-and-error might be needed in picking the starting point.] Since $V$ is symmetrical, we shall end the solution at $x_{r}=5$.

For reasonable accuracy, one usually needs a minimum of 100 points, so we shall take $s_{r}=0.1$ to give us 100 points. As is evident from the derivation of the Numerov method, $s_{r}$ must be small. A reasonable rule might be to have $s_{r}$ no greater than 0.1.

If, as is often true, $V \rightarrow \infty$ as $x \rightarrow \pm \infty$, then starting too far into the classically forbidden region can make the denominator $1-G_{n+1} s^{2} / 12$ in the Numerov formula (4.67) negative. We have $G_{r}=2 V_{r}-2 E_{r}$, and if we start at a point $x_{0}$ where $V_{r}$ is extremely large, $G_{r}$ at that point might be large enough to make the Numerov denominator negative. The method will then fail to work. We are taking $\psi_{0}$ as zero and $\psi_{1}$ as a positive number. The Numerov formula (4.67) shows that if the denominator is negative, then $\psi_{2}$ will be negative, and this will produce a spurious node in $\psi$ between $x_{1}$ and $x_{2}$. To avoid this problem, we can decrease either the step size $s_{r}$ or $x_{r, \max }-x_{r, 0}$ (see Prob. 4.46).

Computer Program for the Numerov Method

Table 4.1 contains a C++ computer program that applies the Numerov method to the harmonic-oscillator Schrödinger equation. The fifth and sixth lines declare variables as either integer or double precision. m is the number of intervals between $x_{r, 0}$ and $x_{r, \text{max}}$ and equals $\left(x_{r, \text{max}}-x_{r, 0}\right) / s$. cout and cin provide for output to the computer screen and input to the variables of the program. Note the colon at the end of line 13 and the semicolons after most other lines. The three lines beginning with g[0]=, g[1]=, and g[i+1]= contain two times the potential-energy function. These lines must be modified if the problem is not the harmonic oscillator. If there is a node between two successive values of $x_{r}$, then the $\psi_{r}$ values at these two points will have opposite signs (see Prob. 4.45) and the statement nn=nn+1 will increase the nodes counter nn by 1.

You can download a free C++ compiler and integrated development environment; see the Wikipedia article, List of Compilers. Even simpler, you can enter and run C++ programs online without downloading anything. The website ideone.com provides this service (after you register) for C++ and other languages. However, at this site, you must enter your complete input into the input area before running the program, since the website does not accept input once the program begins; each input number is separated from its neighbors by a space.

TABLE 4.1 C++ Program for Numerov Solution of the One-Dimensional Schrödinger Equation

#include <iostream>
#include <math.h>
using namespace std;
int main() {
    int m, i, nn;
    double x[1000], g[1000], p[1000], E, s, ss;
    cout << "Enter initial xr ";
    cin >> x[0];
    cout << "Enter the increment sr ";
    cin >> s;
    cout << "Enter the number of intervals m ";
    cin >> m;
label1:
    cout << "Enter the reduced energy Er (enter 1e10 to quit) ";
    cin >> E;
    if (E > 1e9) {
        cout << "Quitting";
        return 0;
    }
    nn = 0;                      // node counter
    p[0] = 0;                    // psi at xr,0, deep in the forbidden region
    p[1] = 0.0001;               // small arbitrary value at the next point
    x[1] = x[0] + s;
    g[0] = x[0]*x[0] - 2*E;      // Gr = 2*Vr - 2*Er with Vr = xr*xr/2
    g[1] = x[1]*x[1] - 2*E;
    ss = s*s/12;
    for (i = 1; i <= m - 1; i = i + 1) {
        x[i+1] = x[i] + s;
        g[i+1] = x[i+1]*x[i+1] - 2*E;
        p[i+1] = (-p[i-1] + 2*p[i] + 10*g[i]*p[i]*ss + g[i-1]*p[i-1]*ss)/(1 - g[i+1]*ss);
        if (p[i+1]*p[i] < 0)     // sign change means an interior node
            nn = nn + 1;
    }
    cout << " Er = " << E << " Nodes = " << nn << " Psir(xm) = " << p[m] << endl;
    goto label1;
}

For example, suppose we want the harmonic-oscillator ground-state energy. The program of Table 4.1, with $s_{r}=0.1$, $x_{r, 0}=-5$, and $\mathrm{m}=100$, gives the following results. The guess $E_{r}=0$ gives a wave function with zero nodes, $\mathrm{nn}=0$, telling us (Fig. 4.8) that the ground-state energy $E_{r, 1}$ is above 0. (Also, the wave function at the rightmost point is found to be $9.94 \times 10^{6}$, very far from 0.) If we now guess 0.9 for $E_{r}$, we get a function with one node, so (Fig. 4.8) 0.9 is between $E_{r, 1}$ and $E_{r, 2}$. Hence the ground-state $E_{r}$ is between 0 and 0.9. Averaging these, we try 0.45. This value gives a function with no nodes, and so 0.45 is below $E_{r, 1}$. Averaging 0.45 and 0.9, we get 0.675, which is found to give one node and so is too high. Averaging 0.675 and 0.45, we try 0.5625, which gives one node and is too high. We next try 0.50625, and so on. The program's results show that as we get closer to the true $E_{r, 1}$, $\psi_{r}(5)$ comes closer to zero.
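
This interval-halving search is easy to automate. The sketch below (an illustration, not the program of Table 4.1; the function name nodes and the tolerance are our own choices) repeats the Table 4.1 recurrence for a trial $E_{r}$, counts interior nodes, and bisects between a node-free energy and a one-node energy:

#include <iostream>
using namespace std;

// Count interior nodes of the Numerov solution for a trial reduced
// energy Er (harmonic oscillator, xr from -5 to 5, sr = 0.1).
int nodes(double Er) {
    const double s = 0.1, ss = s*s/12.0;
    const int m = 100;
    double x0 = -5.0, p0 = 0.0, p1 = 0.0001;
    double g0 = x0*x0 - 2.0*Er;
    double g1 = (x0 + s)*(x0 + s) - 2.0*Er;
    int nn = 0;
    for (int i = 1; i <= m - 1; i++) {
        double x2 = x0 + (i + 1)*s;
        double g2 = x2*x2 - 2.0*Er;
        double p2 = (-p0 + 2.0*p1 + 10.0*g1*p1*ss + g0*p0*ss)/(1.0 - g2*ss);
        if (p2*p1 < 0.0) nn++;
        p0 = p1; p1 = p2; g0 = g1; g1 = g2;
    }
    return nn;
}

int main() {
    double Elow = 0.0, Ehigh = 0.9;   // 0 nodes at 0.0, 1 node at 0.9 (see text)
    while (Ehigh - Elow > 1e-7) {
        double Emid = 0.5*(Elow + Ehigh);
        if (nodes(Emid) == 0) Elow = Emid; else Ehigh = Emid;
    }
    cout << "Ground-state Er ~ " << 0.5*(Elow + Ehigh) << endl;  // ~0.4999996
    return 0;
}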

Use of a Spreadsheet to Solve the One-Dimensional Schrödinger Equation

An alternative to a Numerov-method computer program is a spreadsheet.
The following directions for the Excel 2010 spreadsheet apply the Numerov method to solve the harmonic-oscillator Schrödinger equation. (Other versions of Excel can be used with modified directions.)

The columns in the spreadsheet are labeled $\mathrm{A}, \mathrm{B}, \mathrm{C}, \ldots$ and the rows are labeled 1 , $2,3, \ldots$ (see Fig. 4.9 later in this section). A cell's location is specified by its row and column. For example, the cell at the upper left is cell A1. To enter something into a cell, you first select that cell, either by moving the mouse pointer over the desired cell and then clicking the (left) mouse button, or by using the arrow keys to move from the currently selected cell (which has a heavy outline) to the desired cell. After a cell has been selected, type the entry for that cell and press Enter or one of the four arrow keys.

To begin, enter a title in cell A1. Then enter $\mathrm{Er}=$ in cell A3. We shall enter our guesses for $E_{r}$ in cell B3. We shall look first for the ground-state (lowest) eigenvalue, pretending that we don't know the answer. The minimum value of $V(x)$ for the harmonic oscillator is zero, so $E_{r}$ cannot be negative. We shall take zero as our initial guess for $E_{r}$, so enter 0 in cell B3. Enter $\mathrm{Sr}=$ in cell C3.

Enter 0.1 (the $s_{r}$ value chosen earlier) in cell D3. Enter xr in cell A5, Gr in cell B5, and psir in C5. (These entries are labels for the data columns that we shall construct.) Enter -5 (the starting value for $x_{r}$) in cell A7. Enter =A7+\$D\$3 in cell A8. The equal sign at the beginning of the entry tells the spreadsheet that a formula is being entered. This formula tells the spreadsheet to add the numbers in cells A7 and D3. The reason for the \$ signs in \$D\$3 will be explained shortly. When you type a formula, you will see it displayed in the formula bar above the spreadsheet. When you press Enter, the value -4.9 is displayed in cell A8. This is the sum of cells A7 and D3. (If you see a different value in A8 or get an error message, you probably mistyped the formula. To correct the formula, select cell A8 and click in the formula displayed in the formula bar to make the correction.)

Select cell A8 and then click the Home tab and in the Clipboard group click the Copy icon. This copies the contents of cell A8 to a storage area called the Clipboard. Then in the Editing group click the Find \& Select icon and choose Go To. In the Go To box, enter A9:A107 under Reference: and click OK. This selects cells A9 through A107. Then in the Clipboard group click the Paste icon. This pastes the cell A8 formula (which was stored on the Clipboard) into each of cells A9 through A107. To see how this works, click on cell A9. You will see the formula =A8+\$D\$3 in the formula bar. Note that when the cell A8 formula =A7+\$D\$3 was copied one cell down to cell A9, the A7 in the formula was changed to A8. However, the \$ signs prevented cell D3 in the formula from being changed when it was copied. In a spreadsheet formula, a cell address without \$ signs is called a relative reference, whereas a cell address with \$ signs is called an absolute reference. When a relative reference is copied to the next row in a column, the row number is increased by one; when it is copied two rows below the original row, the row number is increased by two; and so on. Click in some of the other cells in column A to see their formulas. The net result of this copy-and-paste procedure is to fill the column-A cells with numbers from -5 to 5 in increments of 0.1. (Spreadsheets have faster ways to accomplish this than using Copy and Paste.)

We next fill in the $G_{r}$ column. From Eqs. (4.82) and (4.83), $G_{r}=x_{r}^{2}-2 E_{r}$ for the harmonic oscillator. We therefore enter =A7^2-2*\$B\$3 in cell B7 (which will contain the value of $G_{r}$ at $x_{r}=-5$). Cell A7 contains the $x_{r}=-5$ value, the ^ symbol denotes exponentiation, and the * denotes multiplication. Cell B3 contains the $E_{r}$ value. Next, the rest of the $G_{r}$ values are calculated. Select cell B7. Then click Copy in the Clipboard group. Now select cells B8 through B107 by using Find \& Select as before. Then click Paste in the Clipboard group. This fills the cells with the appropriate $G_{r}$ values. (Click on cell B8 or B9 and see how its formula compares with that of B7.)

We now go to the $\psi_{r}$ values. Enter 0 in cell C7. Cell C7 contains the value of $\psi_{r}$ at $x_{r}=-5$. Since this point is well into the classically forbidden region, $\psi_{r}$ will be very small here, and we can approximate it as zero. Cell C8 contains the value of $\psi_{r}$ at $x_{r}=-4.9$. This value will be very small, and we can enter any small number in C8 without affecting the eigenvalues. Enter 1E-4 in cell C8 (where E denotes the power of 10). Now that we have values in cells C7 and C8 for $\psi_{r}$ at the first two points $-5.0$ and $-4.9$ [points $x_{n-1}$ and $x_{n}$ in (4.67)], we use (4.67) to calculate $\psi_{r}$ at $x_{r}=-4.8$ (point $x_{n+1}$). Therefore, enter the Eq. (4.67) formula

=(2*C8-C7+5*B8*C8*$D$3^2/6+B7*C7*$D$3^2/12)/(1-B9*$D$3^2/12)

in cell C9. After the Enter key is pressed, the value 0.000224 for $\psi_{r}$ at $x_{r}=-4.8$ appears in cell C9. Select cell C9. Click Copy. Select C10 through C107. Click Paste. Cells C10 through C107 will now be filled with their appropriate $\psi_{r}$ values. As a further check that you entered the C9 formula correctly, verify that cell C10 contains 0.000401.

Since the number of nodes tells us which eigenvalues our energy guess is between (Fig. 4.8), we want to count and display the number of nodes in $\psi_{r}$. To do this, enter into cell D9 the formula =IF(C9*C8<0,1,0). This formula enters the number 1 into D9 if C9 times C8 is negative and enters 0 into D9 if C9 times C8 is not negative. If there is a node between the $x_{r}$ values in A8 and A9, then the $\psi_{r}$ values in C8 and C9 will have opposite signs, and the value 1 will be entered into D9. Use Copy and Paste to copy the cell D9 formula into cells D10 through D107. Enter nodes = into cell E2. Enter =SUM(D9:D107) into F2. This formula gives the sum of the values in D9 through D107 and thus gives the number of interior nodes in $\psi_{r}$.

Next, we graph $\psi_{r}$ versus $x_{r}$. Select cells A7 through A107 and C7 through C107 by clicking Find \& Select, clicking Go To, typing A7:A107,C7:C107 in the Reference box, and clicking OK. Then click the Insert tab and in the Chart group click the Scatter icon and then click the chart showing smoothed lines with markers (Scatter with Smooth Lines and Markers). On the chart that appears, right-click Series1 and choose Delete. Right-click the horizontal grid line at $4 \times 10^{6}$ and choose Delete. The spreadsheet will look like Fig. 4.9.

Since $x_{r}=5$ is well into the right-hand classically forbidden region, $\psi_{r}$ should be very close to zero at this point. However, the graph shows that for our choice $E_{r}=0$, the wave function $\psi_{r}$ is very large at $x_{r}=5$. Therefore, $E_{r}=0$ does not give a well-behaved $\psi_{r}$, and we must try a different $E_{r}$. Cell F2 has a zero, so this $\psi_{r}$ has no nodes. Therefore (Fig. 4.8), the guess $E_{r}=0$ is less than the true ground-state energy. Let us try $E_{r}=2$. Select cell B3 and enter 2 into it. After you press Enter, the spreadsheet will recalculate every column B and column C cell whose value depends on $E_{r}$ (all column B and C cells except C7 and C8 change) and will then replot the graph. The graph for $E_{r}=2$ goes to a very large positive $\psi_{r}$ value at $x_{r}=5$. Also, cell F2 tells us that $\psi_{r}$ for $E_{r}=2$ contains two nodes. These are not readily visible on the graph, but the column-C data show that $\psi_{r}$ changes sign between the $x_{r}$ values $-0.4$ and $-0.3$ and between 1.2 and 1.3. We are

FIGURE 4.9 Spreadsheet for Numerov-method solution of the harmonic oscillator.

[Spreadsheet screenshot: cell A1 holds the title (Numerov method for the Schrödinger equation, V $=0.5 k x^{2}$); cells B3 and D3 hold $E_{r}=0$ and $s_{r}=0.1$; columns A, B, and C list the xr, Gr, and psir values starting at $-5$, 25, 0; the chart of $\psi_{r}$ versus $x_{r}$ appears at the right.]

looking for the ground-state eigenfunction, which does not contain a node, or rather, in our approximation, will contain nodes at $-5$ and 5. The presence of the two interior nodes shows (Fig. 4.8) that the value 2 for $E_{r}$ is not only too high for the ground state, but is higher than $E_{r}$ for the first excited state (whose wave function contains one interior node). We therefore need to try a lower value of $E_{r}$.

Before doing so, let us change the graph scale so as to make the nodes more visible. Double click on the $y$ axis of the graph. In the Format Axis dialog box that appears, click Axis Options at the left and click Fixed next to Minimum and Maximum. Replace the original numbers in the Minimum and Maximum boxes with -10 and 10, respectively. Then click Close. The graph will be redrawn with -10 and 10 as the minimum and maximum $y$-axis values, making the two nodes easily visible.

We now change $E_{r}$ to a smaller value, say 1.2. Enter 1.2 in cell B3. We get a $\psi_{r}$ that goes to a large negative value on the right and that has only one node. The presence of one node tells us that we are now below the energy of the second-lowest state and above the ground-state energy (see Fig. 4.8). We have bracketed the ground-state energy to lie between 0 and 1.2. Let us average these two numbers and try 0.6 as the energy. When we enter 0.6 into cell B3, we get a function with one node, so we are still above the ground-state energy. Since the maximum on the graph is now off scale, it's a good idea to change the graph scale and reset the $y$ maximum and minimum values to 25 and -25.

We have found the lowest eigenvalue to be between 0 and 0.6. Averaging these values, we enter 0.3 into B3. This gives a function that has no nodes, so 0.3 is below the ground-state reduced energy, and $E_{r}$ is between 0.3 and 0.6. Averaging these, we enter 0.45 into B3. We get a function that has no nodes, so we are still below the correct eigenvalue. However, if we rescale the $y$ axis suitably (taking, for example, -15 and 30 as the minimum and maximum values), we see a function that for values of $x_{r}$ less than 0.2 closely resembles what we expect for the ground state, so we are getting warm. The eigenvalue is now known to be between 0.45 and 0.60. Averaging these, we enter 0.525 into B3. We get a function with one node, so we are too high. Averaging 0.45 and 0.525, we try 0.4875, which we find to be too low.

Continuing in this manner, we get the successive values 0.50625 (too high), 0.496875 (too low), ..., 0.4999995943548 (low), 0.4999996301176 (high). Thus we have found 0.4999996 as the lowest eigenvalue. Since $E_{r}=E / h \nu$, we should have gotten 0.5. The slight error is due to the approximations of the Numerov method.

Suppose we want the second-lowest eigenvalue. We previously found this eigenvalue to be below 2.0, so it lies between 0.5 and 2.0. Averaging these numbers, we enter 1.25 into B3. We get a function that has the desired one node but that goes to a large negative value at the right, rather than returning to the $x$ axis at the right. Therefore, 1.25 is too low (Fig. 4.8). Averaging 1.25 and 2.0, we next try 1.625. This gives a function with two nodes, so we are too high. Continuing, we get the successive values 1.4375 (low), 1.53125 (high), 1.484375 (low), 1.5078125 (high), and so on.

To test the accuracy of the eigenvalues found, we can repeat the calculation with $s_{r}$ half as large and see if the new eigenvalues differ significantly from those found with the larger $s_{r}$. Also, we can start further into the classically forbidden region.

Finding eigenvalues as we have done by trial and error is instructive and fun the first few times, but if you have a lot of eigenvalues to find, you can use a faster method. Most spreadsheets have a built-in program that can adjust the value in one cell so as to yield a desired value in a second cell. This tool is called the Solver in Excel. Click the Data tab. In the Analysis group (at the right), see if there is an icon for Solver. (If the Solver icon is missing, click the File tab and click Options at the left. In the Excel Options box, click Add-Ins at the left. In the Manage box select Excel Add-Ins at the bottom and click the Go button. In the Add-Ins box check Solver Add-In and click OK. After a while, the Solver icon will appear in the Analysis group.) To see how the Solver works, enter 0 into cell B3. Click the Solver icon. The Solver Parameters box opens. The $\psi_{r}$ value at $x_{r}=5$ is in cell C107, and we want to make this value zero. Therefore, in the Set Objective box of the Solver, enter \$C\$107. Next to To: click on Value of and enter 0. We want to adjust the energy so as to satisfy the boundary condition at $x_{r}=5$, so click in the By Changing Variable Cells box and enter \$B\$3. Select the Solver Method as GRG Nonlinear. Then click on Solve. When Solver declares it has found a solution, select Accept Solver solution to close the Solver box. The solution found by Excel has 0.500002 in cell B3 (the formula bar will show the full value if you select cell B3). Cell C107 will have the value -6.38. Since this value is not close to the desired value of 0, click the Solver icon and then click Solve, thereby re-running Solver starting from the 0.500002 value. This time Solver gives 0.4999996089, similar to the value found by hand, and C107 has the value $-3.66 \times 10^{-9}$. (The Solver uses either the quasi-Newton or the conjugate-gradient method, both of which are discussed in Section 15.10.)

To find higher eigenvalues using the Solver, start with a value in B3 that is well above the previously found eigenvalue. If the program converges to the previous eigenvalue, start with a still-higher value in B3. You can check which eigenvalue you have found by counting the nodes in the wave function. If the Solver fails to find the desired eigenvalue, use trial and error to find an approximate value and then use the Solver starting from the approximate eigenvalue.

The wave function we have found is unnormalized. The normalization constant is given by Eq. (3.93) as $N=\left[\int_{-\infty}^{\infty}\left|\psi_{r}\right|^{2} d x_{r}\right]^{-1 / 2}$. We have $\int_{-\infty}^{\infty}\left|\psi_{r}\right|^{2} d x_{r} \approx \int_{-5}^{5} \psi_{r}^{2} d x_{r} \approx \sum_{i=1}^{100} \psi_{r, i}^{2} s_{r}$, where the $\psi_{r, i}$ values are in column C. Enter npsir in E5. In cell D109, enter the formula =SUMSQ(C8:C107)*\$D\$3. The SUMSQ function adds the squares of a series of numbers. Enter =C7/\$D\$109^0.5 in E7. Copy and paste E7 into E8 through E107. Column E will contain the normalized $\psi_{r}$ values if $E_{\text{guess}}$ is equal to an eigenvalue.
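
The same normalization step is a few lines in C++. This sketch (our illustration, paralleling the SUMSQ formula; only the first three of the unnormalized values are shown) approximates the integral of $\psi_{r}^{2}$ by a sum of squares times $s_{r}$ and then divides each $\psi_{r}$ by the square root:

#include <cmath>
#include <iostream>
#include <vector>
using namespace std;

int main() {
    double sr = 0.1;
    // In practice psi would hold all 100 values from the Numerov run.
    vector<double> psi = {0.0001, 0.000224, 0.000401};
    double sum = 0.0;
    for (double p : psi) sum += p*p;     // like SUMSQ(C8:C107)
    double norm = sqrt(sum*sr);          // square root of the integral estimate
    for (double& p : psi) p /= norm;     // like column E: normalized psi_r
    cout << "First normalized value: " << psi[0] << endl;
    return 0;
}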

Excel is widely used to do statistical and scientific calculations, but studies of Excel 2007 and its predecessors found many significant errors [B. D. McCullough and D. A. Heiser, Comput. Statist. Data Anal., 52, 4570 (2008)]. For example, McCullough and Heiser state: "It has been noted that Excel Solver has a marked tendency to stop at a point that is not a solution and declare that it has found a solution." Therefore one should
always verify the correctness of the Solver's solution. A study of Excel 2010 found that many of the errors present in earlier versions of Excel were fixed, but some problems remain [G. Mélard, "On the Accuracy of Statistical Procedures in Excel 2010," available at homepages.ulb.ac.be/~gmelard/rech/gmelard_csda24.pdf]. Mélard noted that "The conclusion is that Microsoft did not make an attempt to fix all the errors in Excel, and this point needs to be made strongly." Mélard found that the Solver yielded results with zero significant-figure accuracy in a substantial fraction of test cases.

Use of Mathcad to Solve the One-Dimensional Schrödinger Equation

Several programs classified as computer algebra systems do a wide variety of mathematical procedures, including symbolic integration and differentiation, numerical integration, algebraic manipulations, solving systems of equations, graphing, and matrix computations. Examples of such computer algebra systems are Maple, Mathematica, MATLAB, Mathcad, and LiveMath Maker. The Numerov procedure can be performed using these programs. One nice feature of Mathcad is its ability to produce animations ("movies"). With Mathcad one can create a movie showing how $\psi_{r}$ changes as $E_{r}$ goes through an eigenvalue.

Summary of Numerov-Method Steps

Problems 4.30-4.38 apply the Numerov method to several other one-dimensional problems. In solving these problems, you need to (a) find combinations of the constants in the problem that will give a dimensionless reduced energy and length [Eq. (4.68)]; (b) convert the Schrödinger equation to dimensionless form and find what $G_{r}\left(x_{r}\right)$ in (4.82) is; (c) decide on a maximum $E_{r}$ value, below which you will find all eigenvalues; (d) locate the boundaries between the classically allowed and forbidden regions for this maximum $E_{r}$ value and choose $x_{r, 0}$ and $x_{r, \text{max}}$ values in the classically forbidden regions (for the particle in a box with infinitely high walls, use $x_{0}=0$ and $x_{\max}=l$); and (e) decide on a value for the interval $s_{r}$.
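
As an illustration of steps (a) and (b) (our own worked case, for the quartic potential $V=c x^{4}$ mentioned at the start of this discussion), dimensional analysis along the lines used for $A B^{2}=\hbar^{2} / m$ gives

\(
B=\left(\hbar^{2} / m c\right)^{1 / 6}, \quad A=\frac{\hbar^{2}}{m B^{2}}=\left(\frac{\hbar^{4} c}{m^{2}}\right)^{1 / 3}, \quad V_{r}=\frac{c B^{4} x_{r}^{4}}{A}=x_{r}^{4}, \quad G_{r}=2 x_{r}^{4}-2 E_{r}
\)

so in the Table 4.1 program only the three lines containing the potential would need to change.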


Angular Momentum

Keywords: (click the terms to know more)

Angular Momentum: A measure of the amount of rotation a particle has, taking into account its mass, shape, and speed. In quantum mechanics, it is quantized and has discrete values

Eigenfunction: A function that is associated with a particular operator in quantum mechanics, where the operator acting on the function yields the function multiplied by a constant (the eigenvalue)

Eigenvalue: The constant value obtained when an operator acts on its eigenfunction. It represents measurable quantities in quantum mechanics

Commutator: An operation used to determine whether two operators can be simultaneously measured. It is defined as [A, B] = AB - BA

Simultaneous Eigenfunctions: Functions that are eigenfunctions of two or more operators at the same time, indicating that the corresponding physical quantities can be simultaneously measured

Uncertainty Principle: A fundamental concept in quantum mechanics stating that certain pairs of physical properties, like position and momentum, cannot be simultaneously measured with arbitrary precision

Standard Deviation: A measure of the spread or dispersion of a set of values. In quantum mechanics, it represents the uncertainty in a measured property

Orbital Angular Momentum: The component of angular momentum associated with the motion of a particle through space, distinct from spin angular momentum

Spin Angular Momentum: An intrinsic form of angular momentum carried by particles, independent of their motion through space

Spherical Coordinates: A coordinate system used to describe the position of a point in space using three values: the radial distance, the polar angle, and the azimuthal angle

Ladder Operators: Operators used to raise or lower the eigenvalue of another operator, often used in the context of angular momentum

Normalization: The process of adjusting the magnitude of a function so that its total probability density equals one

Degeneracy: The condition where two or more eigenfunctions correspond to the same eigenvalue, indicating multiple states with the same energy

Associated Legendre Functions: Special functions used in the solution of angular momentum problems in quantum mechanics

Spherical Harmonics: Functions that describe the angular part of the wavefunction in spherical coordinates, often used in the context of angular momentum

In this chapter we discuss angular momentum, and in the next chapter we show that for the stationary states of the hydrogen atom the magnitude of the electron's angular momentum is constant. As a preliminary, we consider what criterion we can use to decide which properties of a system can be simultaneously assigned definite values.

In Section 3.3 we postulated that if the state function $\Psi$ is an eigenfunction of the operator $\hat{A}$ with eigenvalue $s$, then a measurement of the physical property $A$ is certain to give the result $s$. If $\Psi$ is simultaneously an eigenfunction of the two operators $\hat{A}$ and $\hat{B}$, that is, if $\hat{A} \Psi=s \Psi$ and $\hat{B} \Psi=t \Psi$, then we can simultaneously assign definite values to the physical quantities $A$ and $B$. When will it be possible for $\Psi$ to be simultaneously an eigenfunction of two different operators? In Chapter 7, we shall prove the following two theorems. First, a necessary condition for the existence of a complete set of simultaneous eigenfunctions of two operators is that the operators commute with each other. (The word complete is used here in a certain technical sense, which we won't worry about until Chapter 7.) Conversely, if $\hat{A}$ and $\hat{B}$ are two commuting operators that correspond to physical quantities, then there exists a complete set of functions that are eigenfunctions of both $\hat{A}$ and $\hat{B}$. Thus, if $[\hat{A}, \hat{B}]=0$, then $\Psi$ can be an eigenfunction of both $\hat{A}$ and $\hat{B}$.
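
A simple illustration (our example, not the text's): for a one-dimensional free particle, $V=0$, so $\hat{p}_{x}$ commutes with $\hat{H}=\hat{p}_{x}^{2} / 2 m$, and $\psi=e^{i k x}$ is a simultaneous eigenfunction of both operators:

\(
\hat{p}_{x} e^{i k x}=\hbar k e^{i k x}, \quad \hat{H} e^{i k x}=\frac{\hbar^{2} k^{2}}{2 m} e^{i k x}
\)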

Recall that the commutator of $\hat{A}$ and $\hat{B}$ is $[\hat{A}, \hat{B}] \equiv \hat{A} \hat{B}-\hat{B} \hat{A}$ [Eq. (3.7)]. The following identities are helpful in evaluating commutators. These identities are proved by writing out the commutators in detail (Prob. 5.2):

\(
\begin{gather}
{[\hat{A}, \hat{B}]=-[\hat{B}, \hat{A}]} \tag{5.1}\\
{\left[\hat{A}, \hat{A}^{n}\right]=0, \quad n=1,2,3, \ldots} \tag{5.2}\\
{[k \hat{A}, \hat{B}]=[\hat{A}, k \hat{B}]=k[\hat{A}, \hat{B}]} \tag{5.3}\\
{[\hat{A}, \hat{B}+\hat{C}]=[\hat{A}, \hat{B}]+[\hat{A}, \hat{C}], \quad[\hat{A}+\hat{B}, \hat{C}]=[\hat{A}, \hat{C}]+[\hat{B}, \hat{C}]} \tag{5.4}\\
{[\hat{A}, \hat{B} \hat{C}]=[\hat{A}, \hat{B}] \hat{C}+\hat{B}[\hat{A}, \hat{C}], \quad[\hat{A} \hat{B}, \hat{C}]=[\hat{A}, \hat{C}] \hat{B}+\hat{A}[\hat{B}, \hat{C}]} \tag{5.5}
\end{gather}
\)

where $k$ is a constant and the operators are assumed to be linear.

EXAMPLE

Starting from $[\partial / \partial x, x]=1$ [Eq. (3.8)], use the commutator identities (5.1)-(5.5) to find (a) $\left[\hat{x}, \hat{p}_{x}\right]$; (b) $\left[\hat{x}, \hat{p}_{x}^{2}\right]$; and (c) $[\hat{x}, \hat{H}]$ for a one-particle, three-dimensional system.
(a) Use of (5.3), (5.1), and $[\partial / \partial x, x]=1$ gives

\(
\begin{align}
& {\left[\hat{x}, \hat{p}_{x}\right]=\left[x, \frac{\hbar}{i} \frac{\partial}{\partial x}\right]=\frac{\hbar}{i}\left[x, \frac{\partial}{\partial x}\right]=-\frac{\hbar}{i}\left[\frac{\partial}{\partial x}, x\right]=-\frac{\hbar}{i}} \\
& {\left[\hat{x}, \hat{p}_{x}\right]=i \hbar} \tag{5.6}
\end{align}
\)

(b) Use of (5.5) and (5.6) gives

\(
\begin{align}
& {\left[\hat{x}, \hat{p}_{x}^{2}\right]=\left[\hat{x}, \hat{p}_{x}\right] \hat{p}_{x}+\hat{p}_{x}\left[\hat{x}, \hat{p}_{x}\right]=i \hbar \cdot \frac{\hbar}{i} \frac{\partial}{\partial x}+\frac{\hbar}{i} \frac{\partial}{\partial x} \cdot i \hbar} \\
& {\left[\hat{x}, \hat{p}_{x}^{2}\right]=2 \hbar^{2} \frac{\partial}{\partial x}} \tag{5.7}
\end{align}
\)

(c) Use of (5.4), (5.3), and (5.7) gives

\(
\begin{align}
{[\hat{x}, \hat{H}]} & =[\hat{x}, \hat{T}+\hat{V}]=[\hat{x}, \hat{T}]+[\hat{x}, \hat{V}(x, y, z)]=[\hat{x}, \hat{T}] \\
& =\left[\hat{x},(1 / 2 m)\left(\hat{p}_{x}^{2}+\hat{p}_{y}^{2}+\hat{p}_{z}^{2}\right)\right] \\
& =(1 / 2 m)\left[\hat{x}, \hat{p}_{x}^{2}\right]+(1 / 2 m)\left[\hat{x}, \hat{p}_{y}^{2}\right]+(1 / 2 m)\left[\hat{x}, \hat{p}_{z}^{2}\right] \\
& =\frac{1}{2 m} 2 \hbar^{2} \frac{\partial}{\partial x}+0+0 \\
{[\hat{x}, \hat{H}]} & =\frac{\hbar^{2}}{m} \frac{\partial}{\partial x}=\frac{i \hbar}{m} \hat{p}_{x} \tag{5.8}
\end{align}
\)

EXERCISE Show that for a one-particle, three-dimensional system,

\(
\begin{equation}
\left[\hat{p}_{x}, \hat{H}\right]=-i \hbar \partial V(x, y, z) / \partial x \tag{5.9}
\end{equation}
\)
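
Commutator results such as (5.6) can also be spot-checked numerically. The following minimal sketch (our illustration; the test function $f=e^{-x^{2}}$, the point $x=0.7$, and $\hbar=1$ are arbitrary choices) applies $\hat{x} \hat{p}_{x}-\hat{p}_{x} \hat{x}$ to $f$ using central finite differences and recovers $i \hbar f$:

#include <cmath>
#include <complex>
#include <iostream>
using namespace std;

int main() {
    const complex<double> I(0.0, 1.0);
    const double hbar = 1.0, x = 0.7, h = 1e-5;
    auto f  = [](double t) { return exp(-t*t); };   // arbitrary test function
    auto xf = [&](double t) { return t*f(t); };
    // px g = (hbar/i) dg/dx, by central difference
    complex<double> p_f  = (hbar/I)*( f(x+h) -  f(x-h))/(2*h);
    complex<double> p_xf = (hbar/I)*(xf(x+h) - xf(x-h))/(2*h);
    complex<double> comm = x*p_f - p_xf;             // [x, px] applied to f
    cout << "[x,px]f / f = " << comm/f(x) << endl;   // ~ (0,1), i.e., i*hbar
    return 0;
}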

These commutators have important physical consequences. Since $\left[\hat{x}, \hat{p}_{x}\right] \neq 0$, we cannot expect the state function to be simultaneously an eigenfunction of $\hat{x}$ and of $\hat{p}_{x}$. Hence we cannot simultaneously assign definite values to $x$ and $p_{x}$, in agreement with the uncertainty principle. Since $\hat{x}$ and $\hat{H}$ do not commute, we cannot expect to assign definite values to the energy and the $x$ coordinate at the same time. A stationary state (which has a definite energy) shows a spread of possible values for $x$, the probabilities for observing various values of $x$ being given by the Born postulate.

For a state function $\Psi$ that is not an eigenfunction of $\hat{A}$, we get various possible outcomes when we measure $A$ in identical systems. We want some measure of the spread or dispersion in the set of observed values $A_{i}$. If $\langle A\rangle$ is the average of these values, then the deviation of each measurement from the average is $A_{i}-\langle A\rangle$. If we averaged all the deviations, we would get zero, since positive and negative deviations would cancel. Hence to make all deviations positive, we square them. The average of the squares of the deviations is called the variance of $A$, symbolized in statistics by $\sigma^{2}(A)$ and in quantum mechanics by $(\Delta A)^{2}$:

\(
\begin{equation}
(\Delta A)^{2} \equiv \sigma^{2}(A) \equiv\left\langle(A-\langle A\rangle)^{2}\right\rangle=\int \Psi^{*}(\hat{A}-\langle A\rangle)^{2} \Psi d \tau \tag{5.10}
\end{equation}
\)

where the average-value expression (3.88) was used. The definition (5.10) is equivalent to (Prob. 5.7)

\(
\begin{equation}
(\Delta A)^{2}=\left\langle A^{2}\right\rangle-\langle A\rangle^{2} \tag{5.11}
\end{equation}
\)
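
The equivalence is quick to verify (this is the content of Prob. 5.7): expand the square and use the fact that $\langle A\rangle$ is a constant that can be taken outside the average:

\(
\left\langle(A-\langle A\rangle)^{2}\right\rangle=\left\langle A^{2}-2 A\langle A\rangle+\langle A\rangle^{2}\right\rangle=\left\langle A^{2}\right\rangle-2\langle A\rangle\langle A\rangle+\langle A\rangle^{2}=\left\langle A^{2}\right\rangle-\langle A\rangle^{2}
\)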

The positive square root of the variance is called the standard deviation, $\sigma(A)$ or $\Delta A$. The standard deviation is the most commonly used measure of spread, and we shall take it as the measure of the "uncertainty" in the property $A$.

Robertson in 1929 proved that the product of the standard deviations of two properties of a quantum-mechanical system whose state function is $\Psi$ must satisfy the inequality

\(
\begin{equation}
\sigma(A) \sigma(B) \equiv \Delta A \Delta B \geq \frac{1}{2}\left|\int \Psi^{*}[\hat{A}, \hat{B}] \Psi d \tau\right| \tag{5.12}
\end{equation}
\)

The proof of (5.12), which follows from the postulates of quantum mechanics, is outlined in Prob. 7.60. If $\hat{A}$ and $\hat{B}$ commute, then the integral in (5.12) is zero, and $\Delta A$ and $\Delta B$ may both be zero, in agreement with the previous discussion.

As an example of (5.12), we find, using (5.6), $\left|z_{1} z_{2}\right|=\left|z_{1}\right|\left|z_{2}\right|$ [Eq. (1.34)], and normalization:

\(
\begin{gather}
\Delta x \Delta p_{x} \geq \frac{1}{2}\left|\int \Psi^{*}\left[\hat{x}, \hat{p}_{x}\right] \Psi d \tau\right|=\frac{1}{2}\left|\int \Psi^{*} i \hbar \Psi d \tau\right|=\frac{1}{2} \hbar|i|\left|\int \Psi^{*} \Psi d \tau\right| \\
\sigma(x) \sigma\left(p_{x}\right) \equiv \Delta x \Delta p_{x} \geq \frac{1}{2} \hbar \tag{5.13}
\end{gather}
\)

Equation (5.13) is usually considered to be the quantitative statement of the Heisenberg uncertainty principle (Section 1.3). However, the meaning of the standard deviations in Eqs. (5.12) and (5.13) is rather different from the meaning of the uncertainties in Section 1.3. To find $\Delta x$ in (5.13) we take a very large number of systems, each of which has the same state function $\Psi$, and we perform one measurement of $x$ in each system. From these measured values, symbolized by $x_{i}$, we calculate $\langle x\rangle$ and the squares of the deviations $\left(x_{i}-\langle x\rangle\right)^{2}$. We average the squares of the deviations to get the variance and take the square root to get the standard deviation $\sigma(x) \equiv \Delta x$. Then we take many systems, each of which is in the same state $\Psi$ as used to get $\Delta x$, and we do a single measurement of $p_{x}$ in each system, calculating $\Delta p_{x}$ from these measurements. Thus, the statistical quantities $\Delta x$ and $\Delta p_{x}$ in (5.13) are not errors of individual measurements and are not found from simultaneous measurements of $x$ and $p_{x}$ (see Ballentine, pp. 225-226).

Let $\varepsilon(x)$ be the typical error in a single measurement of $x$ and let $\eta\left(p_{x}\right)$ be the typical disturbance in $p_{x}$ caused by the measurement of $x$. In 1927, Heisenberg analyzed specific thought experiments that perform position measurements and concluded that the product $\varepsilon(x) \eta\left(p_{x}\right)$ was of the order of magnitude of $h$ or larger. Heisenberg did not give a precise definition of these quantities. Ozawa rewrote Heisenberg's relation as

\(
\varepsilon(x) \eta\left(p_{x}\right) \geq \frac{1}{2} \hbar
\)

where $\varepsilon(x)$ is defined as the root-mean-square deviation of measured $x$ values from the theoretical value and $\eta\left(p_{x}\right)$ is defined as the root-mean-square deviation of the change in $p_{x}$ produced by the measurement of $x$. More generally, for any two properties, Ozawa wrote the Heisenberg uncertainty principle as

\(
\begin{equation}
\varepsilon(A) \eta(B) \geq \frac{1}{2}\left|\int \Psi^{*}[\hat{A}, \hat{B}] \Psi d \tau\right| \tag{5.14}
\end{equation}
\)

Ozawa presented arguments that, in certain circumstances, the Heisenberg inequality (5.14) can be violated. Ozawa derived the following relation to replace (5.14) [M. Ozawa, Phys. Rev. A, 67, 042105 (2003); available at arxiv.org/abs/quant-ph/0207121]:

\(
\varepsilon(A) \eta(B)+\varepsilon(A) \sigma(B)+\sigma(A) \eta(B) \geq \frac{1}{2}\left|\int \Psi^{*}[\hat{A}, \hat{B}] \Psi d \tau\right|
\)

where $\sigma(A)$ and $\sigma(B)$ are found from (5.10). In 2012, an experiment that measured components of neutron spin found that the Heisenberg error-disturbance inequality (5.14) was not obeyed for the spin components but that the Ozawa inequality was obeyed [J. Erhart et al., Nature Physics, 8, 185 (2012); arxiv.org/abs/1201.1833].

Another inequality is the Heisenberg uncertainty relation for simultaneous measurement of two properties $A$ and $B$ by an apparatus that measures both $A$ and $B$ :

\(
\varepsilon(A) \varepsilon(B) \geq \frac{1}{2}\left|\int \Psi^{*}[\hat{A}, \hat{B}] \Psi d \tau\right|
\)

where $\varepsilon(A)$ and $\varepsilon(B)$ are the experimental errors in the measured $A$ and $B$ values. This relation has been proven to hold, provided a certain plausible assumption (believed to hold for all currently available measuring devices) is valid (see references 6-12 in the above-cited Ozawa paper).

EXAMPLE

Equations (3.91), (3.92), (3.39), the equation following (3.89), and Prob. 3.48 give for the ground state of the particle in a three-dimensional box

\(
\langle x\rangle=\frac{a}{2}, \quad\left\langle x^{2}\right\rangle=\left(\frac{1}{3}-\frac{1}{2 \pi^{2}}\right) a^{2}, \quad\left\langle p_{x}\right\rangle=0, \quad\left\langle p_{x}^{2}\right\rangle=\frac{h^{2}}{4 a^{2}}
\)

Use these results to check that the uncertainty principle (5.13) is obeyed.
We have

\(
\begin{gathered}
(\Delta x)^{2}=\left\langle x^{2}\right\rangle-\langle x\rangle^{2}=\left(\frac{1}{3}-\frac{1}{2 \pi^{2}}\right) a^{2}-\frac{a^{2}}{4}=\frac{\pi^{2}-6}{12 \pi^{2}} a^{2} \\
\Delta x=\left(\frac{\pi^{2}-6}{12}\right)^{1 / 2} \frac{a}{\pi} \\
\left(\Delta p_{x}\right)^{2}=\left\langle p_{x}^{2}\right\rangle-\left\langle p_{x}\right\rangle^{2}=\frac{h^{2}}{4 a^{2}}, \quad \Delta p_{x}=\frac{h}{2 a} \\
\Delta x \Delta p_{x}=\left(\frac{\pi^{2}-6}{12}\right)^{1 / 2} \frac{h}{2 \pi}=0.568 \hbar>\frac{1}{2} \hbar
\end{gathered}
\)

There is also an uncertainty relation involving energy and time:

\(
\begin{equation}
\Delta E \Delta t \geq \frac{1}{2} \hbar \tag{5.15}
\end{equation}
\)

Some texts state that (5.15) is derived from (5.12) by taking $i \hbar \partial / \partial t$ as the energy operator and multiplication by $t$ as the time operator. However, the energy operator is the Hamiltonian $\hat{H}$ and not $i \hbar \partial / \partial t$. Moreover, time is not an observable but is a parameter in quantum mechanics. Hence there is no quantum-mechanical time operator. (The noun observable in quantum mechanics means a physically measurable property of a system.) Equation (5.15) must be derived by a special treatment, which we omit.
(See Ballentine, Section 12.3.) The derivation of (5.15) shows that $\Delta t$ is to be interpreted as the lifetime of the state whose energy is uncertain by $\Delta E$. It is often stated that $\Delta t$ in (5.15) is the duration of the energy measurement. However, Aharonov and Bohm have shown that "energy can be measured reproducibly in an arbitrarily short time" [Y. Aharonov and D. Bohm, Phys. Rev., 122, 1649 (1961); 134, B 1417 (1964); see also S. Massar and S. Popescu, Phys. Rev. A, 71, 042106 (2005); P. Busch, The Time-Energy Uncertainty Relation, arxiv.org/abs/quant-ph/0105049].

Now consider the possibility of simultaneously assigning definite values to three physical quantities: $A, B$, and $C$. Suppose

\(
\begin{equation}
[\hat{A}, \hat{B}]=0 \quad \text { and } \quad[\hat{A}, \hat{C}]=0 \tag{5.16}
\end{equation}
\)

Is this enough to ensure that there exist simultaneous eigenfunctions of all three operators? Since $[\hat{A}, \hat{B}]=0$, we can construct a common set of eigenfunctions for $\hat{A}$ and $\hat{B}$. Since $[\hat{A}, \hat{C}]=0$, we can construct a common set of eigenfunctions for $\hat{A}$ and $\hat{C}$. If these two sets of eigenfunctions are the same, then we will have a common set of eigenfunctions for all three operators. Hence we ask: Is the set of eigenfunctions of the linear operator $\hat{A}$ uniquely determined (apart from arbitrary multiplicative constants)? The answer is, in general, no. If there is more than one independent eigenfunction corresponding to an eigenvalue of $\hat{A}$ (that is, degeneracy), then any linear combination of the eigenfunctions of the degenerate eigenvalue is an eigenfunction of $\hat{A}$ (Section 3.6). It might well be that the proper linear combinations needed to give eigenfunctions of $\hat{B}$ would differ from the linear combinations that give eigenfunctions of $\hat{C}$. It turns out that, to have a common complete set of eigenfunctions of all three operators, we require that $[\hat{B}, \hat{C}]=0$ in addition to (5.16). To have a complete set of functions that are simultaneous eigenfunctions of several operators, each operator must commute with every other operator.


In the next section we shall solve the eigenvalue problem for angular momentum, which is a vector property. We therefore first review vectors.

Physical properties (for example, mass, length, energy) that are completely specified by their magnitude are called scalars. Physical properties (for example, force, velocity, momentum) that require specification of both magnitude and direction are called vectors. A vector is represented by a directed line segment whose length and direction give the magnitude and direction of the property.

The sum of two vectors $\mathbf{A}$ and $\mathbf{B}$ is defined by the following procedure: Slide the first vector so that its tail touches the head of the second vector, keeping the direction of the first vector fixed. Then draw a new vector from the tail of the second vector to the head of the first vector. See Fig. 5.1. The product of a vector and a scalar, $c \mathbf{A}$, is defined as a vector of length $|c|$ times the length of $\mathbf{A}$ with the same direction as $\mathbf{A}$ if $c$ is positive, or the opposite direction to $\mathbf{A}$ if $c$ is negative.

FIGURE 5.1 Addition of two vectors.

(a)

(b) $\mathbf{C}=\mathbf{A}+\mathbf{B}=\mathbf{B}+\mathbf{A}$

FIGURE 5.2 Unit vectors $\mathbf{i}, \mathbf{j}$,
$\mathbf{k}$, and components of $\mathbf{A}$.

To obtain an algebraic (as well as geometric) way of representing vectors, we set up Cartesian coordinates in space. We draw a vector of unit length directed along the positive $x$ axis and call it $\mathbf{i}$. (No connection with $i=\sqrt{-1}$.) Unit vectors in the positive $y$ and $z$ directions are called $\mathbf{j}$ and $\mathbf{k}$ (Fig. 5.2). To represent any vector $\mathbf{A}$ in terms of the three unit vectors, we first slide $\mathbf{A}$ so that its tail is at the origin, preserving its direction during this process. We then find the projections of $\mathbf{A}$ on the $x$, $y$, and $z$ axes: $A_{x}$, $A_{y}$, and $A_{z}$. From the definition of vector addition, it follows that (Fig. 5.2)

\(
\begin{equation}
\mathbf{A}=A_{x} \mathbf{i}+A_{y} \mathbf{j}+A_{z} \mathbf{k} \tag{5.17}
\end{equation}
\)

We can specify $\mathbf{A}$ by specifying its three components: $\left(A_{x}, A_{y}, A_{z}\right)$. A vector in three-dimensional space can therefore be defined as an ordered set of three numbers.

Two vectors $\mathbf{A}$ and $\mathbf{B}$ are equal if and only if all their corresponding components are equal: $A_{x}=B_{x}, A_{y}=B_{y}, A_{z}=B_{z}$. Therefore a vector equation is equivalent to three scalar equations.

To add two vectors, we add corresponding components:

\(
\mathbf{A}+\mathbf{B}=A_{x} \mathbf{i}+A_{y} \mathbf{j}+A_{z} \mathbf{k}+B_{x} \mathbf{i}+B_{y} \mathbf{j}+B_{z} \mathbf{k}
\)

\(
\begin{equation}
\mathbf{A}+\mathbf{B}=\left(A_{x}+B_{x}\right) \mathbf{i}+\left(A_{y}+B_{y}\right) \mathbf{j}+\left(A_{z}+B_{z}\right) \mathbf{k} \tag{5.18}
\end{equation}
\)

Also, if $c$ is a scalar, then

\(
\begin{equation}
c \mathbf{A}=c A_{x} \mathbf{i}+c A_{y} \mathbf{j}+c A_{z} \mathbf{k} \tag{5.19}
\end{equation}
\)

The magnitude of a vector $\mathbf{A}$ is its length and is denoted by $A$ or $|\mathbf{A}|$. The magnitude $A$ is a scalar.

The dot product or scalar product $\mathbf{A} \cdot \mathbf{B}$ of two vectors is defined by

\(
\begin{equation}
\mathbf{A} \cdot \mathbf{B}=|\mathbf{A}||\mathbf{B}| \cos \theta=\mathbf{B} \cdot \mathbf{A} \tag{5.20}
\end{equation}
\)

where $\theta$ is the angle between the vectors. The dot product, being the product of three scalars, is a scalar. Note that $|\mathbf{A}| \cos \theta$ is the projection of $\mathbf{A}$ on $\mathbf{B}$. From the definition of vector addition, it follows that the projection of the vector $\mathbf{A}+\mathbf{B}$ on some vector $\mathbf{C}$ is the sum of the projections of $\mathbf{A}$ and of $\mathbf{B}$ on $\mathbf{C}$. Therefore

\(
\begin{equation}
(\mathbf{A}+\mathbf{B}) \cdot \mathbf{C}=\mathbf{A} \cdot \mathbf{C}+\mathbf{B} \cdot \mathbf{C} \tag{5.21}
\end{equation}
\)

Since the three unit vectors $\mathbf{i}, \mathbf{j}$, and $\mathbf{k}$ are each of unit length and are mutually perpendicular, we have

\(
\begin{equation}
\mathbf{i} \cdot \mathbf{i}=\mathbf{j} \cdot \mathbf{j}=\mathbf{k} \cdot \mathbf{k}=\cos 0=1, \quad \mathbf{i} \cdot \mathbf{j}=\mathbf{j} \cdot \mathbf{k}=\mathbf{k} \cdot \mathbf{i}=\cos (\pi / 2)=0 \tag{5.22}
\end{equation}
\)

Using (5.22) and the distributive law (5.21), we have

\(
\begin{gather}
\mathbf{A} \cdot \mathbf{B}=\left(A_{x} \mathbf{i}+A_{y} \mathbf{j}+A_{z} \mathbf{k}\right) \cdot\left(B_{x} \mathbf{i}+B_{y} \mathbf{j}+B_{z} \mathbf{k}\right) \\
\mathbf{A} \cdot \mathbf{B}=A_{x} B_{x}+A_{y} B_{y}+A_{z} B_{z} \tag{5.23}
\end{gather}
\)

where six of the nine terms in the dot product are zero.
Equation (5.20) gives

\(
\begin{equation}
\mathbf{A} \cdot \mathbf{A}=|\mathbf{A}|^{2} \tag{5.24}
\end{equation}
\)

Using (5.23), we therefore have

\(
\begin{equation}
|\mathbf{A}|=\left(A_{x}^{2}+A_{y}^{2}+A_{z}^{2}\right)^{1 / 2} \tag{5.25}
\end{equation}
\)

For three-dimensional vectors, there is another type of product. The cross product or vector product $\mathbf{A} \times \mathbf{B}$ is a vector whose magnitude is

\(
\begin{equation}
|\mathbf{A} \times \mathbf{B}|=|\mathbf{A}||\mathbf{B}| \sin \theta \tag{5.26}
\end{equation}
\)

whose line segment is perpendicular to the plane defined by $\mathbf{A}$ and $\mathbf{B}$, and whose direction is such that $\mathbf{A}, \mathbf{B}$, and $\mathbf{A} \times \mathbf{B}$ form a right-handed system (just as the $x, y$, and $z$ axes form a right-handed system). See Fig. 5.3. From the definition it follows that

\(
\mathbf{B} \times \mathbf{A}=-\mathbf{A} \times \mathbf{B}
\)

Also, it can be shown that (Taylor and Mann, Section 10.2)

\(
\begin{equation}
\mathbf{A} \times(\mathbf{B}+\mathbf{C})=\mathbf{A} \times \mathbf{B}+\mathbf{A} \times \mathbf{C} \tag{5.27}
\end{equation}
\)

For the three unit vectors, we have

\(
\begin{gathered}
\mathbf{i} \times \mathbf{i}=\mathbf{j} \times \mathbf{j}=\mathbf{k} \times \mathbf{k}=\mathbf{0} \\
\mathbf{i} \times \mathbf{j}=\mathbf{k}, \quad \mathbf{j} \times \mathbf{i}=-\mathbf{k}, \quad \mathbf{j} \times \mathbf{k}=\mathbf{i}, \quad \mathbf{k} \times \mathbf{j}=-\mathbf{i}, \quad \mathbf{k} \times \mathbf{i}=\mathbf{j}, \quad \mathbf{i} \times \mathbf{k}=-\mathbf{j}
\end{gathered}
\)

Using these equations and the distributive property (5.27), we find

\(
\begin{gathered}
\mathbf{A} \times \mathbf{B}=\left(A_{x} \mathbf{i}+A_{y} \mathbf{j}+A_{z} \mathbf{k}\right) \times\left(B_{x} \mathbf{i}+B_{y} \mathbf{j}+B_{z} \mathbf{k}\right) \\
\mathbf{A} \times \mathbf{B}=\left(A_{y} B_{z}-A_{z} B_{y}\right) \mathbf{i}+\left(A_{z} B_{x}-A_{x} B_{z}\right) \mathbf{j}+\left(A_{x} B_{y}-A_{y} B_{x}\right) \mathbf{k}
\end{gathered}
\)

FIGURE 5.3 Cross product of two vectors.

As a memory aid, we can express the cross product as a determinant (see Section 8.3):

\(
\mathbf{A} \times \mathbf{B}=\left|\begin{array}{ccc}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
A_{x} & A_{y} & A_{z} \\
B_{x} & B_{y} & B_{z}
\end{array}\right|=\mathbf{i}\left|\begin{array}{cc}
A_{y} & A_{z} \\
B_{y} & B_{z}
\end{array}\right|-\mathbf{j}\left|\begin{array}{cc}
A_{x} & A_{z} \\
B_{x} & B_{z}
\end{array}\right|+\mathbf{k}\left|\begin{array}{cc}
A_{x} & A_{y} \\
B_{x} & B_{y}
\end{array}\right| \tag{5.28}
\)
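
The component formulas (5.18), (5.23), (5.25), and (5.28) translate directly into code. Here is a minimal C++ sketch (our illustration; the struct and function names are arbitrary) that implements them and confirms, for example, that $\mathbf{i} \times \mathbf{j}=\mathbf{k}$:

#include <cmath>
#include <iostream>
using namespace std;

struct Vec3 { double x, y, z; };

Vec3 add(Vec3 a, Vec3 b) { return {a.x+b.x, a.y+b.y, a.z+b.z}; }   // Eq. (5.18)
double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; } // Eq. (5.23)
Vec3 cross(Vec3 a, Vec3 b) {                                       // Eq. (5.28)
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
double norm(Vec3 a) { return sqrt(dot(a, a)); }                    // Eq. (5.25)

int main() {
    Vec3 i{1, 0, 0}, j{0, 1, 0};
    Vec3 k = cross(i, j);                          // i x j = k = (0, 0, 1)
    cout << k.x << " " << k.y << " " << k.z << endl;
    cout << dot(i, j) << endl;                     // 0: i and j are orthogonal
    cout << norm(add(i, j)) << endl;               // sqrt(2) = 1.414...
    return 0;
}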

We define the vector operator del as

\(
\begin{equation}
\nabla \equiv \mathbf{i} \frac{\partial}{\partial x}+\mathbf{j} \frac{\partial}{\partial y}+\mathbf{k} \frac{\partial}{\partial z} \tag{5.29}
\end{equation}
\)

From Eq. (3.23), the operator for the linear-momentum vector is $\hat{\mathbf{p}}=-i \hbar \nabla$.
The gradient of a function $g(x, y, z)$ is defined as the result of operating on $g$ with del:

\(
\begin{equation}
\operatorname{grad} g(x, y, z) \equiv \nabla g(x, y, z) \equiv \mathbf{i} \frac{\partial g}{\partial x}+\mathbf{j} \frac{\partial g}{\partial y}+\mathbf{k} \frac{\partial g}{\partial z} \tag{5.30}
\end{equation}
\)

The gradient of a scalar function is a vector function. The vector $\nabla g(x, y, z)$ represents the spatial rate of change of the function $g$ : The $x$ component of $\nabla g$ is the rate of change of $g$ with respect to $x$, and so on. It can be shown that the vector $\nabla g$ points in the direction in which the rate of change of $g$ is greatest. From Eq. (4.24), the relation between force and potential energy is

\(
\begin{equation}
\mathbf{F}=-\nabla V(x, y, z)=-\mathbf{i} \frac{\partial V}{\partial x}-\mathbf{j} \frac{\partial V}{\partial y}-\mathbf{k} \frac{\partial V}{\partial z} \tag{5.31}
\end{equation}
\)
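
For example (a standard case, worked here for illustration), the isotropic three-dimensional harmonic potential gives a restoring force directed toward the origin:

\(
V=\frac{1}{2} k\left(x^{2}+y^{2}+z^{2}\right), \quad \mathbf{F}=-\nabla V=-k x \mathbf{i}-k y \mathbf{j}-k z \mathbf{k}=-k \mathbf{r}
\)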

Suppose that the components of the vector $\mathbf{A}$ are each functions of some parameter $t$: $A_{x}=A_{x}(t)$, $A_{y}=A_{y}(t)$, $A_{z}=A_{z}(t)$. The derivative of $\mathbf{A}$ with respect to $t$ is defined as

\(
\begin{equation}
\frac{d \mathbf{A}}{d t}=\mathbf{i} \frac{d A_{x}}{d t}+\mathbf{j} \frac{d A_{y}}{d t}+\mathbf{k} \frac{d A_{z}}{d t} \tag{5.32}
\end{equation}
\)

Vector notation is a convenient way to represent the variables of a function. The wave function of a two-particle system can be written as $\psi\left(x_{1}, y_{1}, z_{1}, x_{2}, y_{2}, z_{2}\right)$. If $\mathbf{r}_{1}$ is the vector from the origin to particle 1, then $\mathbf{r}_{1}$ has components $x_{1}, y_{1}, z_{1}$, and specification of $\mathbf{r}_{1}$ is equivalent to specification of the three coordinates $x_{1}, y_{1}, z_{1}$. The same is true for the vector $\mathbf{r}_{2}$ from the origin to particle 2. Therefore, we can write the wave function as $\psi\left(\mathbf{r}_{1}, \mathbf{r}_{2}\right)$. Vector notation can be used in integrals. For example, the integral over all space in Eq. (3.57) is often written as $\int \cdots \int\left|\Psi\left(\mathbf{r}_{1}, \ldots, \mathbf{r}_{n}, t\right)\right|^{2} d \mathbf{r}_{1} \cdots d \mathbf{r}_{n}$.

Vectors in $n$-Dimensional Space

The definition of a vector can be generalized to more than three dimensions. A vector $\mathbf{A}$ in three-dimensional space can be defined by its magnitude $|\mathbf{A}|$ and its direction, or it can be defined by its three components $\left(A_{x}, A_{y}, A_{z}\right)$ in a Cartesian coordinate system. Therefore, we can define a three-dimensional vector as a set of three real numbers $\left(A_{x}, A_{y}, A_{z}\right)$ in a particular order. A vector $\mathbf{B}$ in an $n$-dimensional real vector "space" (sometimes called a hyperspace) is defined as an ordered set of $n$ real numbers $\left(B_{1}, B_{2}, \ldots, B_{n}\right)$, where $B_{1}, B_{2}, \ldots, B_{n}$ are the components of $\mathbf{B}$. Don't be concerned that you can't visualize vectors in an $n$-dimensional space.

The variables of a function are often denoted using $n$-dimensional vector notation. For example, instead of writing the wave function of a two-particle system as $\psi\left(\mathbf{r}_{1}, \mathbf{r}_{2}\right)$, we can define a six-dimensional vector $\mathbf{q}$ whose components are $q_{1}=x_{1}, q_{2}=y_{1}, q_{3}=z_{1}, q_{4}=x_{2}, q_{5}=y_{2}, q_{6}=z_{2}$ and write the wave function as $\psi(\mathbf{q})$. For an $n$-particle system, we can define $\mathbf{q}$ to have $3 n$ components and write the wave function as $\psi(\mathbf{q})$ and the normalization integral over all space as $\int|\psi(\mathbf{q})|^{2} d \mathbf{q}$.

The theory of searching for the equilibrium geometry of a molecule uses $n$-dimensional vectors (Section 15.10). The rest of Section 5.2 is relevant to Section 15.10 and need not be read until you study Section 15.10.

Two $n$-dimensional vectors are equal if all their corresponding components are equal; $\mathbf{B}=\mathbf{C}$ if and only if $B_{1}=C_{1}, B_{2}=C_{2}, \ldots, B_{n}=C_{n}$. Therefore, in $n$-dimensional space, a vector equation is equivalent to $n$ scalar equations. The sum of two $n$-dimensional vectors $\mathbf{B}$ and $\mathbf{D}$ is defined as the vector $\left(B_{1}+D_{1}, B_{2}+D_{2}, \ldots, B_{n}+D_{n}\right)$. The difference is defined similarly. The vector $k \mathbf{B}$ is defined as the vector $\left(k B_{1}, k B_{2}, \ldots, k B_{n}\right)$, where $k$ is a scalar. In three-dimensional space, the vectors $k \mathbf{A}$, where $k>0$, all lie in the same direction as $\mathbf{A}$. Likewise, in $n$-dimensional space the vectors $k \mathbf{B}$ with $k>0$ are said to lie in the same direction. Just as the numbers $\left(A_{x}, A_{y}, A_{z}\right)$ define a point in three-dimensional space, the numbers $\left(B_{1}, B_{2}, \ldots, B_{n}\right)$ define a point in $n$-dimensional space.

The length (or magnitude or Euclidean norm) $|\mathbf{B}|$ (sometimes denoted $\|\mathbf{B}\|$) of an $n$-dimensional real vector is defined as

\(
|\mathbf{B}| \equiv(\mathbf{B} \cdot \mathbf{B})^{1 / 2}=\left(B_{1}^{2}+B_{2}^{2}+\cdots+B_{n}^{2}\right)^{1 / 2}
\)

A vector whose length is 1 is said to be normalized.
The inner product (or scalar product) $\mathbf{B} \cdot \mathbf{G}$ of two real $n$-dimensional vectors $\mathbf{B}$ and $\mathbf{G}$ is defined as the scalar

\(
\mathbf{B} \cdot \mathbf{G} \equiv B_{1} G_{1}+B_{2} G_{2}+\cdots+B_{n} G_{n}
\)

If $\mathbf{B} \cdot \mathbf{G}=0$, the vectors $\mathbf{B}$ and $\mathbf{G}$ are said to be orthogonal. The cosine of the angle $\theta$ between two $n$-dimensional vectors $\mathbf{B}$ and $\mathbf{C}$ is defined by analogy to (5.20) as $\cos \theta \equiv \mathbf{B} \cdot \mathbf{C} /(|\mathbf{B}||\mathbf{C}|)$. One can show that this definition makes $\cos \theta$ lie in the range $-1$ to 1.

In three-dimensional space, the unit vectors $\mathbf{i}=(1,0,0)$, $\mathbf{j}=(0,1,0)$, $\mathbf{k}=(0,0,1)$ are mutually perpendicular. Also, any vector can be written as a linear combination of these three vectors [Eq. (5.17)]. In an $n$-dimensional real vector space, the unit vectors $\mathbf{e}_{1} \equiv(1,0,0, \ldots, 0), \mathbf{e}_{2} \equiv(0,1,0, \ldots, 0), \ldots, \mathbf{e}_{n} \equiv(0,0,0, \ldots, 1)$ are mutually orthogonal. Since the $n$-dimensional vector $\mathbf{B}$ equals $B_{1} \mathbf{e}_{1}+B_{2} \mathbf{e}_{2}+\cdots+B_{n} \mathbf{e}_{n}$, any $n$-dimensional real vector can be written as a linear combination of the $n$ unit vectors $\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{n}$. This set of $n$ vectors is therefore said to be a basis for the $n$-dimensional real vector space. Since the vectors $\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{n}$ are orthogonal and normalized, they are an orthonormal basis for the real vector space. The scalar product $\mathbf{B} \cdot \mathbf{e}_{i}$ gives the component of $\mathbf{B}$ in the direction of the basis vector $\mathbf{e}_{i}$. A vector space has many possible basis sets. Any set of $n$ linearly independent real vectors can serve as a basis for the $n$-dimensional real vector space.

A three-dimensional vector can be specified by its three components or by its length and its direction. The direction can be specified by giving the three angles that the vector makes with the positive halves of the $x$, $y$, and $z$ axes. These angles are the direction angles of the vector and lie in the range 0 to $180^{\circ}$. However, the direction angle with the $z$ axis is fixed once the other two direction angles have been given, so only two direction angles are independent. Thus a three-dimensional vector can be specified by its length and two direction angles. Similarly, in $n$-dimensional space, the direction angles between a vector and each unit vector $\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{n}$ can be found from the above formula for the cosine of the angle between two vectors. An $n$-dimensional vector can thus be specified by its length and $n-1$ direction angles.

The gradient of a function of three variables is defined by (5.30). The gradient $\nabla f$ of a function $f\left(q_{1}, q_{2}, \ldots, q_{n}\right)$ of $n$ variables is defined as the $n$-dimensional vector whose components are the first partial derivatives of $f$:

\(
\nabla f \equiv\left(\partial f / \partial q_{1}\right) \mathbf{e}_{1}+\left(\partial f / \partial q_{2}\right) \mathbf{e}_{2}+\cdots+\left(\partial f / \partial q_{n}\right) \mathbf{e}_{n}
\)
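
These $n$-dimensional definitions are as easy to compute with as the three-dimensional ones. A minimal C++ sketch (our illustration; the sample vectors are arbitrary) for the inner product, Euclidean norm, and $\cos \theta$:

#include <cmath>
#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

// Inner product B.G = B1*G1 + ... + Bn*Gn and Euclidean norm |B|.
double dot(const vector<double>& b, const vector<double>& g) {
    return inner_product(b.begin(), b.end(), g.begin(), 0.0);
}
double norm(const vector<double>& b) { return sqrt(dot(b, b)); }

int main() {
    vector<double> B = {1, 2, 2, 0}, C = {2, 0, 0, 0};    // n = 4
    cout << "|B| = " << norm(B) << endl;                  // 3
    double cosTheta = dot(B, C)/(norm(B)*norm(C));        // analog of (5.20)
    cout << "cos(theta) = " << cosTheta << endl;          // 2/(3*2) = 1/3
    return 0;
}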

We have considered real, $n$-dimensional vector spaces. Dirac's formulation of quantum mechanics uses a complex, infinite-dimensional vector space, discussion of which is omitted.


In Section 3.3 we found the eigenfunctions and eigenvalues for the linear-momentum operator $\hat{p}_{x}$. In this section we consider the same problem for the angular momentum of a particle. Angular momentum plays a key role in the quantum mechanics of atomic structure. We begin by reviewing the classical mechanics of angular momentum.

Classical Mechanics of One-Particle Angular Momentum

Consider a moving particle of mass $m$. We set up a Cartesian coordinate system that is fixed in space. Let $\mathbf{r}$ be the vector from the origin to the instantaneous position of the particle. We have

\(
\begin{equation}
\mathbf{r}=\mathbf{i} x+\mathbf{j} y+\mathbf{k} z \tag{5.33}
\end{equation}
\)

where $x, y$, and $z$ are the particle's coordinates at a given instant. These coordinates are functions of time. Defining the velocity vector $\mathbf{v}$ as the time derivative of the position vector, we have [Eq. (5.32)]

\(
\begin{gather}
\mathbf{v} \equiv \frac{d \mathbf{r}}{d t}=\mathbf{i} \frac{d x}{d t}+\mathbf{j} \frac{d y}{d t}+\mathbf{k} \frac{d z}{d t} \tag{5.34}\\
v_{x}=d x / d t, \quad v_{y}=d y / d t, \quad v_{z}=d z / d t
\end{gather}
\)

We define the particle's linear momentum vector $\mathbf{p}$ by

\(
\begin{gather}
\mathbf{p} \equiv m \mathbf{v} \tag{5.35}\\
p_{x}=m v_{x}, \quad p_{y}=m v_{y}, \quad p_{z}=m v_{z} \tag{5.36}
\end{gather}
\)

The particle's angular momentum $\mathbf{L}$ with respect to the coordinate origin is defined in classical mechanics as

\(
\begin{gather}
\mathbf{L} \equiv \mathbf{r} \times \mathbf{p} \tag{5.37}\\
\mathbf{L}=\left|\begin{array}{ccc}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
x & y & z \\
p_{x} & p_{y} & p_{z}
\end{array}\right| \tag{5.38}\\
L_{x}=y p_{z}-z p_{y}, \quad L_{y}=z p_{x}-x p_{z}, \quad L_{z}=x p_{y}-y p_{x} \tag{5.39}
\end{gather}
\)

where (5.28) was used. $L_{x}, L_{y}$, and $L_{z}$ are the components of $\mathbf{L}$ along the $x, y$, and $z$ axes. The angular-momentum vector $\mathbf{L}$ is perpendicular to the plane defined by the particle's position vector $\mathbf{r}$ and its velocity $\mathbf{v}$ (Fig. 5.4).

FIGURE 5.4 $\mathbf{L} \equiv \mathbf{r} \times \mathbf{p}$.

The torque $\tau$ acting on a particle is defined as the cross product of $\mathbf{r}$ and the force $\mathbf{F}$ acting on the particle: $\tau \equiv \mathbf{r} \times \mathbf{F}$. One can show that $\tau=d \mathbf{L} / d t$. When no torque acts on a particle, the rate of change of its angular momentum is zero; that is, its angular momentum is constant (or conserved). For a planet orbiting the sun, the gravitational force is radially directed. Since the cross product of two parallel vectors is zero, there is no torque on the planet and its angular momentum is conserved.
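
These classical relations are easy to check symbolically. The sympy sketch below (a Python illustration added to these notes, not part of the original text) reproduces the component formulas (5.39) from the cross product and confirms that a radially directed force, written here as $\mathbf{F}=g \mathbf{r}$ for an arbitrary scalar $g$, exerts zero torque:

import sympy as sp

x, y, z, px, py, pz, g = sp.symbols('x y z p_x p_y p_z g')
r = sp.Matrix([x, y, z])
p = sp.Matrix([px, py, pz])

L = r.cross(p)          # components match Eq. (5.39)
print(L.T)              # [y*p_z - z*p_y, z*p_x - x*p_z, x*p_y - y*p_x]

F = g * r               # a central force is directed along r
print(r.cross(F).T)     # [0, 0, 0]: zero torque, so L is conserved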

One-Particle Orbital-Angular-Momentum Operators

Now let us turn to the quantum-mechanical treatment. In quantum mechanics, there are two kinds of angular momenta. Orbital angular momentum results from the motion of a particle through space, and is the analog of the classical-mechanical quantity $\mathbf{L}$. Spin angular momentum (Chapter 10) is an intrinsic property of many microscopic particles and has no classical-mechanical analog. We are now considering only orbital angular momentum. We get the quantum-mechanical operators for the components of orbital angular momentum of a particle by replacing the coordinates and momenta in the classical equations (5.39) by their corresponding operators [Eqs. (3.21)-(3.23)]. We find

\(
\begin{align}
& \hat{L}_{x}=-i \hbar\left(y \frac{\partial}{\partial z}-z \frac{\partial}{\partial y}\right) \tag{5.40}\\
& \hat{L}_{y}=-i \hbar\left(z \frac{\partial}{\partial x}-x \frac{\partial}{\partial z}\right) \tag{5.41}\\
& \hat{L}_{z}=-i \hbar\left(x \frac{\partial}{\partial y}-y \frac{\partial}{\partial x}\right) \tag{5.42}
\end{align}
\)

(Since $\hat{y} \hat{p}_{z}=\hat{p}_{z} \hat{y}$, and so on, we do not run into any problems of noncommutativity in constructing these operators.) Using

\(
\begin{equation}
\hat{L}^{2}=|\hat{\mathbf{L}}|^{2}=\hat{\mathbf{L}} \cdot \hat{\mathbf{L}}=\hat{L}_{x}^{2}+\hat{L}_{y}^{2}+\hat{L}_{z}^{2} \tag{5.43}
\end{equation}
\)

we can construct the operator for the square of the angular-momentum magnitude from the operators in (5.40)-(5.42).

Since the commutation relations determine which physical quantities can be simultaneously assigned definite values, we investigate these relations for angular momentum. Operating on some function $f(x, y, z)$ with $\hat{L}_{y}$, we have

\(
\hat{L}_{y} f=-i \hbar\left(z \frac{\partial f}{\partial x}-x \frac{\partial f}{\partial z}\right)
\)

Operating on this last equation with $\hat{L}_{x}$, we get

\(
\begin{equation}
\hat{L}_{x} \hat{L}_{y} f=-\hbar^{2}\left(y \frac{\partial f}{\partial x}+y z \frac{\partial^{2} f}{\partial z \partial x}-y x \frac{\partial^{2} f}{\partial z^{2}}-z^{2} \frac{\partial^{2} f}{\partial y \partial x}+z x \frac{\partial^{2} f}{\partial y \partial z}\right) \tag{5.44}
\end{equation}
\)

Similarly,

\(
\begin{gather}
\hat{L}_{x} f=-i \hbar\left(y \frac{\partial f}{\partial z}-z \frac{\partial f}{\partial y}\right) \\
\hat{L}_{y} \hat{L}_{x} f=-\hbar^{2}\left(z y \frac{\partial^{2} f}{\partial x \partial z}-z^{2} \frac{\partial^{2} f}{\partial x \partial y}-x y \frac{\partial^{2} f}{\partial z^{2}}+x \frac{\partial f}{\partial y}+x z \frac{\partial^{2} f}{\partial z \partial y}\right) \tag{5.45}
\end{gather}
\)

Subtracting (5.45) from (5.44), we have

\(
\begin{align}
\hat{L}_{x} \hat{L}_{y} f-\hat{L}_{y} \hat{L}_{x} f & =-\hbar^{2}\left(y \frac{\partial f}{\partial x}-x \frac{\partial f}{\partial y}\right) \\
\left[\hat{L}_{x}, \hat{L}_{y}\right] & =i \hbar \hat{L}_{z} \tag{5.46}
\end{align}
\)

where we used relations such as

\(
\begin{equation}
\frac{\partial^{2} f}{\partial z \partial x}=\frac{\partial^{2} f}{\partial x \partial z} \tag{5.47}
\end{equation}
\)

which are true for well-behaved functions. We could use the same procedure to find $\left[\hat{L}_{y}, \hat{L}_{z}\right]$ and $\left[\hat{L}_{z}, \hat{L}_{x}\right]$, but we can save time by noting a certain kind of symmetry in (5.40)-(5.42). By a cyclic permutation of $x, y$, and $z$, we mean replacing $x$ by $y$, replacing $y$ by $z$, and replacing $z$ by $x$. If we carry out a cyclic permutation in $\hat{L}_{x}$, we get $\hat{L}_{y}$; a cyclic permutation in $\hat{L}_{y}$ gives $\hat{L}_{z}$; and $\hat{L}_{z}$ is transformed into $\hat{L}_{x}$ by a cyclic permutation. Hence, by carrying out two successive cyclic permutations on (5.46), we get

\(
\begin{equation}
\left[\hat{L}_{y}, \hat{L}_{z}\right]=i \hbar \hat{L}_{x}, \quad\left[\hat{L}_{z}, \hat{L}_{x}\right]=i \hbar \hat{L}_{y} \tag{5.48}
\end{equation}
\)
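
The commutator algebra above can be verified mechanically. The sympy sketch below (an addition to these notes) applies the operators (5.40)-(5.42) to an arbitrary smooth test function and confirms Eq. (5.46):

import sympy as sp

x, y, z, hbar = sp.symbols('x y z hbar')
f = sp.Function('f')(x, y, z)        # arbitrary well-behaved test function

Lx = lambda u: -sp.I*hbar*(y*sp.diff(u, z) - z*sp.diff(u, y))
Ly = lambda u: -sp.I*hbar*(z*sp.diff(u, x) - x*sp.diff(u, z))
Lz = lambda u: -sp.I*hbar*(x*sp.diff(u, y) - y*sp.diff(u, x))

# [Lx, Ly] f - i*hbar*Lz f should vanish identically, Eq. (5.46)
print(sp.simplify(Lx(Ly(f)) - Ly(Lx(f)) - sp.I*hbar*Lz(f)))   # 0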

We next evaluate the commutators of $\hat{L}^{2}$ with each of its components, using commutator identities of Section 5.1.

\(
\begin{align}
\left[\hat{L}^{2}, \hat{L}_{x}\right] & =\left[\hat{L}_{x}^{2}+\hat{L}_{y}^{2}+\hat{L}_{z}^{2}, \hat{L}_{x}\right] \\
& =\left[\hat{L}_{x}^{2}, \hat{L}_{x}\right]+\left[\hat{L}_{y}^{2}, \hat{L}_{x}\right]+\left[\hat{L}_{z}^{2}, \hat{L}_{x}\right] \\
& =\left[\hat{L}_{y}^{2}, \hat{L}_{x}\right]+\left[\hat{L}_{z}^{2}, \hat{L}_{x}\right] \\
& =\left[\hat{L}_{y}, \hat{L}_{x}\right] \hat{L}_{y}+\hat{L}_{y}\left[\hat{L}_{y}, \hat{L}_{x}\right]+\left[\hat{L}_{z}, \hat{L}_{x}\right] \hat{L}_{z}+\hat{L}_{z}\left[\hat{L}_{z}, \hat{L}_{x}\right] \\
& =-i \hbar \hat{L}_{z} \hat{L}_{y}-i \hbar \hat{L}_{y} \hat{L}_{z}+i \hbar \hat{L}_{y} \hat{L}_{z}+i \hbar \hat{L}_{z} \hat{L}_{y} \\
\left[\hat{L}^{2}, \hat{L}_{x}\right] & =0 \tag{5.49}
\end{align}
\)

Since a cyclic permutation of $x, y$, and $z$ leaves $\hat{L}^{2}=\hat{L}_{x}^{2}+\hat{L}_{y}^{2}+\hat{L}_{z}^{2}$ unchanged, if we carry out two such permutations on (5.49), we get

\(
\begin{equation}
\left[\hat{L}^{2}, \hat{L}_{y}\right]=0, \quad\left[\hat{L}^{2}, \hat{L}_{z}\right]=0 \tag{5.50}
\end{equation}
\)

To which of the quantities $L^{2}, L_{x}, L_{y}, L_{z}$ can we assign definite values simultaneously? Because $\hat{L}^{2}$ commutes with each of its components, we can specify an exact value for $L^{2}$ and any one component. However, no two components of $\hat{\mathbf{L}}$ commute with each other, so we cannot specify more than one component simultaneously. (There is one exception to this statement, which will be discussed shortly.) It is traditional to take $L_{z}$ as the component of angular momentum that will be specified along with $L^{2}$. Note that in specifying $L^{2}=|\mathbf{L}|^{2}$ we are not specifying the vector $\mathbf{L}$, only its magnitude. A complete specification of $\mathbf{L}$ requires simultaneous specification of each of its three components, which we usually cannot do. In classical mechanics when angular momentum is conserved, each of its three components has a definite value. In quantum mechanics when angular momentum is conserved, only its magnitude and one of its components are specifiable.

FIGURE 5.5 Spherical coordinates.

We could now try to find the eigenvalues and common eigenfunctions of $\hat{L}^{2}$ and $\hat{L}_{z}$ by using the forms for these operators in Cartesian coordinates. However, we would find that the partial differential equations obtained would not be separable. Therefore we transform these operators to spherical coordinates (Fig. 5.5). The coordinate $r$ is the distance from the origin to the point $(x, y, z)$. The angle $\theta$ is the angle the vector $\mathbf{r}$ makes with the positive $z$ axis. The angle that the projection of $\mathbf{r}$ in the $x y$ plane makes with the positive $x$ axis is $\phi$. (Mathematics texts often interchange $\theta$ and $\phi$.) A little trigonometry gives

\(
\begin{align}
x=r \sin \theta \cos \phi, \quad y & =r \sin \theta \sin \phi, \quad z=r \cos \theta \tag{5.51}\\
r^{2}=x^{2}+y^{2}+z^{2}, \quad \cos \theta & =\frac{z}{\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}}, \quad \tan \phi=\frac{y}{x} \tag{5.52}
\end{align}
\)

To transform the angular-momentum operators to spherical coordinates, we must transform $\partial / \partial x, \partial / \partial y$, and $\partial / \partial z$ into these coordinates. [This transformation may be skimmed if desired. Begin reading again after Eq. (5.64).]

To perform this transformation, we use the chain rule. Suppose we have a function of $r, \theta$, and $\phi: f(r, \theta, \phi)$. If we change the independent variables by substituting

\(
r=r(x, y, z), \quad \theta=\theta(x, y, z), \quad \phi=\phi(x, y, z)
\)

into $f$, we transform it into a function of $x, y$, and $z$ :

\(
f[r(x, y, z), \theta(x, y, z), \phi(x, y, z)]=g(x, y, z)
\)

For example, suppose that $f(r, \theta, \phi)=3 r \cos \theta+2 \tan ^{2} \phi$. Using (5.52), we have $g(x, y, z)=3 z+2 y^{2} x^{-2}$.

The chain rule tells us how the partial derivatives of $g(x, y, z)$ are related to those of $f(r, \theta, \phi)$. In fact,

\(
\begin{align}
\left(\frac{\partial g}{\partial x}\right)_{y, z} & =\left(\frac{\partial f}{\partial r}\right)_{\theta, \phi}\left(\frac{\partial r}{\partial x}\right)_{y, z}+\left(\frac{\partial f}{\partial \theta}\right)_{r, \phi}\left(\frac{\partial \theta}{\partial x}\right)_{y, z}+\left(\frac{\partial f}{\partial \phi}\right)_{r, \theta}\left(\frac{\partial \phi}{\partial x}\right)_{y, z} \tag{5.53}\\
\left(\frac{\partial g}{\partial y}\right)_{x, z} & =\left(\frac{\partial f}{\partial r}\right)_{\theta, \phi}\left(\frac{\partial r}{\partial y}\right)_{x, z}+\left(\frac{\partial f}{\partial \theta}\right)_{r, \phi}\left(\frac{\partial \theta}{\partial y}\right)_{x, z}+\left(\frac{\partial f}{\partial \phi}\right)_{r, \theta}\left(\frac{\partial \phi}{\partial y}\right)_{x, z} \tag{5.54}
\end{align}
\)

\(
\begin{equation}
\left(\frac{\partial g}{\partial z}\right)_{x, y}=\left(\frac{\partial f}{\partial r}\right)_{\theta, \phi}\left(\frac{\partial r}{\partial z}\right)_{x, y}+\left(\frac{\partial f}{\partial \theta}\right)_{r, \phi}\left(\frac{\partial \theta}{\partial z}\right)_{x, y}+\left(\frac{\partial f}{\partial \phi}\right)_{r, \theta}\left(\frac{\partial \phi}{\partial z}\right)_{x, y} \tag{5.55}
\end{equation}
\)

To convert these equations to operator equations, we delete $f$ and $g$ to give

\(
\begin{equation}
\frac{\partial}{\partial x}=\left(\frac{\partial r}{\partial x}\right)_{y, z} \frac{\partial}{\partial r}+\left(\frac{\partial \theta}{\partial x}\right)_{y, z} \frac{\partial}{\partial \theta}+\left(\frac{\partial \phi}{\partial x}\right)_{y, z} \frac{\partial}{\partial \phi} \tag{5.56}
\end{equation}
\)

with similar equations for $\partial / \partial y$ and $\partial / \partial z$. The task now is to evaluate the partial derivatives such as $(\partial r / \partial x)_{y, z}$. Taking the partial derivative of the first equation in (5.52) with respect to $x$ at constant $y$ and $z$, we have

\(
\begin{align}
2 r\left(\frac{\partial r}{\partial x}\right)_{y, z} & =2 x=2 r \sin \theta \cos \phi \\
\left(\frac{\partial r}{\partial x}\right)_{y, z} & =\sin \theta \cos \phi \tag{5.57}
\end{align}
\)

Differentiating $r^{2}=x^{2}+y^{2}+z^{2}$ with respect to $y$ and with respect to $z$, we find

\(
\begin{equation}
\left(\frac{\partial r}{\partial y}\right)_{x, z}=\sin \theta \sin \phi, \quad\left(\frac{\partial r}{\partial z}\right)_{x, y}=\cos \theta \tag{5.58}
\end{equation}
\)

From the second equation in (5.52), we find

\(
\begin{align}
& -\sin \theta\left(\frac{\partial \theta}{\partial x}\right)_{y, z}=-\frac{x z}{r^{3}} \\
& \left(\frac{\partial \theta}{\partial x}\right)_{y, z}=\frac{\cos \theta \cos \phi}{r} \tag{5.59}
\end{align}
\)

Also,

\(
\begin{equation}
\left(\frac{\partial \theta}{\partial y}\right)_{x, z}=\frac{\cos \theta \sin \phi}{r}, \quad\left(\frac{\partial \theta}{\partial z}\right)_{x, y}=-\frac{\sin \theta}{r} \tag{5.60}
\end{equation}
\)

From $\tan \phi=y / x$, we find

\(
\begin{equation}
\left(\frac{\partial \phi}{\partial x}\right)_{y, z}=-\frac{\sin \phi}{r \sin \theta}, \quad\left(\frac{\partial \phi}{\partial y}\right)_{x, z}=\frac{\cos \phi}{r \sin \theta}, \quad\left(\frac{\partial \phi}{\partial z}\right)_{x, y}=0 \tag{5.61}
\end{equation}
\)

Substituting (5.57), (5.59), and (5.61) into (5.56), we find

\(
\begin{equation}
\frac{\partial}{\partial x}=\sin \theta \cos \phi \frac{\partial}{\partial r}+\frac{\cos \theta \cos \phi}{r} \frac{\partial}{\partial \theta}-\frac{\sin \phi}{r \sin \theta} \frac{\partial}{\partial \phi} \tag{5.62}
\end{equation}
\)

Similarly,

\(
\begin{gather}
\frac{\partial}{\partial y}=\sin \theta \sin \phi \frac{\partial}{\partial r}+\frac{\cos \theta \sin \phi}{r} \frac{\partial}{\partial \theta}+\frac{\cos \phi}{r \sin \theta} \frac{\partial}{\partial \phi} \tag{5.63}\\
\frac{\partial}{\partial z}=\cos \theta \frac{\partial}{\partial r}-\frac{\sin \theta}{r} \frac{\partial}{\partial \theta} \tag{5.64}
\end{gather}
\)
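
As a check on (5.62) (a sympy sketch added to these notes; the test function is the one from the worked example above, so $g=3 z+2 y^{2} / x^{2}$), differentiating the Cartesian form $g$ directly agrees with applying the spherical-coordinate form of $\partial / \partial x$ to $f$:

import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
theta = sp.acos(z / r)
phi = sp.atan2(y, x)

# f(r, theta, phi) = 3 r cos(theta) + 2 tan(phi)**2, as in the text's example
rs, ts, ps = sp.symbols('rs ts ps', positive=True)
F = 3*rs*sp.cos(ts) + 2*sp.tan(ps)**2

# the spherical form of d/dx, Eq. (5.62), applied to f
rhs = (sp.sin(ts)*sp.cos(ps)*sp.diff(F, rs)
       + sp.cos(ts)*sp.cos(ps)/rs*sp.diff(F, ts)
       - sp.sin(ps)/(rs*sp.sin(ts))*sp.diff(F, ps))
rhs = rhs.subs({rs: r, ts: theta, ps: phi})

lhs = sp.diff(3*z + 2*y**2/x**2, x)   # direct differentiation of g(x, y, z)
print(sp.simplify(lhs - rhs))         # 0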

At last, we are ready to express the angular-momentum components in spherical coordinates. Substituting (5.51), (5.63), and (5.64) into (5.40), we have

\(
\begin{align}
\hat{L}_{x}= & -i \hbar\left[r \sin \theta \sin \phi\left(\cos \theta \frac{\partial}{\partial r}-\frac{\sin \theta}{r} \frac{\partial}{\partial \theta}\right)\right. \\
& \left.-r \cos \theta\left(\sin \theta \sin \phi \frac{\partial}{\partial r}+\frac{\cos \theta \sin \phi}{r} \frac{\partial}{\partial \theta}+\frac{\cos \phi}{r \sin \theta} \frac{\partial}{\partial \phi}\right)\right] \\
\hat{L}_{x}= & i \hbar\left(\sin \phi \frac{\partial}{\partial \theta}+\cot \theta \cos \phi \frac{\partial}{\partial \phi}\right) \tag{5.65}
\end{align}
\)

Also, we find

\(
\begin{gather}
\hat{L}_{y}=-i \hbar\left(\cos \phi \frac{\partial}{\partial \theta}-\cot \theta \sin \phi \frac{\partial}{\partial \phi}\right) \tag{5.66}\\
\hat{L}_{z}=-i \hbar \frac{\partial}{\partial \phi} \tag{5.67}
\end{gather}
\)

By squaring each of $\hat{L}_{x}, \hat{L}_{y}$, and $\hat{L}_{z}$ and then adding the results, we can construct $\hat{L}^{2}=\hat{L}_{x}^{2}+\hat{L}_{y}^{2}+\hat{L}_{z}^{2}$ [Eq. (5.43)]. The result is (Prob. 5.17)

\(
\begin{equation}
\hat{L}^{2}=-\hbar^{2}\left(\frac{\partial^{2}}{\partial \theta^{2}}+\cot \theta \frac{\partial}{\partial \theta}+\frac{1}{\sin ^{2} \theta} \frac{\partial^{2}}{\partial \phi^{2}}\right) \tag{5.68}
\end{equation}
\)

Although the angular-momentum operators depend on all three Cartesian coordinates, $x, y$, and $z$, they involve only the two spherical coordinates $\theta$ and $\phi$.

One-Particle Orbital-Angular-Momentum Eigenfunctions and Eigenvalues

We now find the common eigenfunctions of $\hat{L}^{2}$ and $\hat{L}_{z}$, which we denote by $Y$. Since these operators involve only $\theta$ and $\phi$, $Y$ is a function of these two coordinates: $Y=Y(\theta, \phi)$. (Of course, since the operators are linear, we can multiply $Y$ by an arbitrary function of $r$ and still have an eigenfunction of $\hat{L}^{2}$ and $\hat{L}_{z}$.) We must solve

\(
\begin{align}
& \hat{L}_{z} Y(\theta, \phi)=b Y(\theta, \phi) \tag{5.69}\\
& \hat{L}^{2} Y(\theta, \phi)=c Y(\theta, \phi) \tag{5.70}
\end{align}
\)

where $b$ and $c$ are the eigenvalues of $\hat{L}_{z}$ and $\hat{L}^{2}$.
Using the $\hat{L}_{z}$ operator, we have

\(
\begin{equation}
-i \hbar \frac{\partial}{\partial \phi} Y(\theta, \phi)=b Y(\theta, \phi) \tag{5.71}
\end{equation}
\)

Since the operator in (5.71) does not involve $\theta$, we try a separation of variables, writing

\(
\begin{equation}
Y(\theta, \phi)=S(\theta) T(\phi) \tag{5.72}
\end{equation}
\)

Equation (5.71) becomes

\(
\begin{align}
-i \hbar \frac{\partial}{\partial \phi}[S(\theta) T(\phi)] & =b S(\theta) T(\phi) \\
-i \hbar S(\theta) \frac{d T(\phi)}{d \phi} & =b S(\theta) T(\phi) \\
\frac{d T(\phi)}{T(\phi)} & =\frac{i b}{\hbar} d \phi \\
T(\phi) & =A e^{i b \phi / \hbar} \tag{5.73}
\end{align}
\)

where $A$ is an arbitrary constant.

Is $T$ suitable as an eigenfunction? The answer is no, since it is not, in general, a single-valued function. If we add $2 \pi$ to $\phi$, we will still be at the same point in space, and hence we want no change in $T$ when this is done. For $T$ to be single-valued, we have the restriction

\(
\begin{align}
T(\phi+2 \pi) & =T(\phi) \\
A e^{i b \phi / \hbar} e^{i b 2 \pi / \hbar} & =A e^{i b \phi / \hbar} \\
e^{i b 2 \pi / \hbar} & =1 \tag{5.74}
\end{align}
\)

To satisfy $e^{i \alpha}=\cos \alpha+i \sin \alpha=1$, we must have $\alpha=2 \pi m$, where

\(
m=0, \pm 1, \pm 2, \ldots
\)

Therefore, (5.74) gives

\(
\begin{gather}
2 \pi b / \hbar=2 \pi m \\
b=m \hbar, \quad m=\ldots,-2,-1,0,1,2, \ldots \tag{5.75}
\end{gather}
\)

and (5.73) becomes

\(
\begin{equation}
T(\phi)=A e^{i m \phi}, \quad m=0, \pm 1, \pm 2, \ldots \tag{5.76}
\end{equation}
\)

The eigenvalues for the $z$ component of angular momentum are quantized.
We fix $A$ by normalizing $T$. First let us consider normalizing some function $F$ of $r, \theta$, and $\phi$. The ranges of the independent variables are (see Fig. 5.5)

\(
\begin{equation}
0 \leq r \leq \infty, \quad 0 \leq \theta \leq \pi, \quad 0 \leq \phi \leq 2 \pi \tag{5.77}
\end{equation}
\)

The infinitesimal volume element in spherical coordinates is (Taylor and Mann, Section 13.9)

\(
\begin{equation}
d \tau=r^{2} \sin \theta d r d \theta d \phi \tag{5.78}
\end{equation}
\)

The quantity (5.78) is the volume of an infinitesimal region of space for which the spherical coordinates lie in the ranges $r$ to $r+d r, \theta$ to $\theta+d \theta$, and $\phi$ to $\phi+d \phi$. The normalization condition for $F$ in spherical coordinates is therefore

\(
\begin{equation}
\int_{0}^{\infty}\left[\int_{0}^{\pi}\left[\int_{0}^{2 \pi}|F(r, \theta, \phi)|^{2} d \phi\right] \sin \theta d \theta\right] r^{2} d r=1 \tag{5.79}
\end{equation}
\)

If $F$ happens to have the form

\(
F(r, \theta, \phi)=R(r) S(\theta) T(\phi)
\)

then use of the integral identity (3.74) gives for (5.79)

\(
\int_{0}^{\infty}|R(r)|^{2} r^{2} d r \int_{0}^{\pi}|S(\theta)|^{2} \sin \theta d \theta \int_{0}^{2 \pi}|T(\phi)|^{2} d \phi=1
\)

and it is convenient to normalize each factor of $F$ separately:

\(
\begin{equation}
\int_{0}^{\infty}|R|^{2} r^{2} d r=1, \quad \int_{0}^{\pi}|S|^{2} \sin \theta d \theta=1, \quad \int_{0}^{2 \pi}|T|^{2} d \phi=1 \tag{5.80}
\end{equation}
\)

Therefore,

\(
\begin{gathered}
\int_{0}^{2 \pi}\left(A e^{i m \phi}\right)^{*} A e^{i m \phi} d \phi=1=|A|^{2} \int_{0}^{2 \pi} d \phi \\
|A|=(2 \pi)^{-1 / 2}
\end{gathered}
\)

\(
\begin{equation}
T(\phi)=\frac{1}{\sqrt{2 \pi}} e^{i m \phi}, \quad m=0, \pm 1, \pm 2, \ldots \tag{5.81}
\end{equation}
\)

We now solve $\hat{L}^{2} Y=c Y$ [Eq. (5.70)] for the eigenvalues $c$ of $\hat{L}^{2}$. Using (5.68) for $\hat{L}^{2}$, (5.72) for $Y$, and (5.81), we have

\(
\begin{gather}
-\hbar^{2}\left(\frac{\partial^{2}}{\partial \theta^{2}}+\cot \theta \frac{\partial}{\partial \theta}+\frac{1}{\sin ^{2} \theta} \frac{\partial^{2}}{\partial \phi^{2}}\right)\left(S(\theta) \frac{1}{\sqrt{2 \pi}} e^{i m \phi}\right)=c S(\theta) \frac{1}{\sqrt{2 \pi}} e^{i m \phi} \\
\frac{d^{2} S}{d \theta^{2}}+\cot \theta \frac{d S}{d \theta}-\frac{m^{2}}{\sin ^{2} \theta} S=-\frac{c}{\hbar^{2}} S \tag{5.82}
\end{gather}
\)

To solve (5.82), we carry out some tedious manipulations, which may be skimmed if desired. Begin reading again at Eq. (5.91). First, for convenience, we change the independent variable by making the substitution

\(
\begin{equation}
w=\cos \theta \tag{5.83}
\end{equation}
\)

This transforms $S$ into some new function of $w$ :

\(
\begin{equation}
S(\theta)=G(w) \tag{5.84}
\end{equation}
\)

The chain rule gives

\(
\begin{equation}
\frac{d S}{d \theta}=\frac{d G}{d w} \frac{d w}{d \theta}=-\sin \theta \frac{d G}{d w}=-\left(1-w^{2}\right)^{1 / 2} \frac{d G}{d w} \tag{5.85}
\end{equation}
\)

Similarly, we find (Prob. 5.25)

\(
\begin{equation}
\frac{d^{2} S}{d \theta^{2}}=\left(1-w^{2}\right) \frac{d^{2} G}{d w^{2}}-w \frac{d G}{d w} \tag{5.86}
\end{equation}
\)

Using (5.86), (5.85), and $\cot \theta=\cos \theta / \sin \theta=w /\left(1-w^{2}\right)^{1 / 2}$, we find that (5.82) becomes

\(
\begin{equation}
\left(1-w^{2}\right) \frac{d^{2} G}{d w^{2}}-2 w \frac{d G}{d w}+\left[\frac{c}{\hbar^{2}}-\frac{m^{2}}{1-w^{2}}\right] G(w)=0 \tag{5.87}
\end{equation}
\)

The range of $w$ is $-1 \leq w \leq 1$.
To get a two-term recursion relation when we try a power-series solution, we make the following change of dependent variable:

\(
\begin{equation}
G(w)=\left(1-w^{2}\right)^{|m| / 2} H(w) \tag{5.88}
\end{equation}
\)

Differentiating (5.88), we evaluate $G^{\prime}$ and $G^{\prime \prime}$, and (5.87) becomes, after we divide by $\left(1-w^{2}\right)^{|m| / 2}$,

\(
\begin{equation}
\left(1-w^{2}\right) H^{\prime \prime}-2(|m|+1) w H^{\prime}+\left[c \hbar^{-2}-|m|(|m|+1)\right] H=0 \tag{5.89}
\end{equation}
\)

We now try a power series for $H$ :

\(
\begin{equation}
H(w)=\sum_{j=0}^{\infty} a_{j} w^{j} \tag{5.90}
\end{equation}
\)

Differentiating [compare Eqs. (4.36)-(4.38)], we have

\(
\begin{aligned}
& H^{\prime}(w)=\sum_{j=0}^{\infty} j a_{j} w^{j-1} \\
& H^{\prime \prime}(w)=\sum_{j=0}^{\infty} j(j-1) a_{j} w^{j-2}=\sum_{j=0}^{\infty}(j+2)(j+1) a_{j+2} w^{j}
\end{aligned}
\)

Substitution of these power series into (5.89) yields, after combining sums,

\(
\sum_{j=0}^{\infty}\left[(j+2)(j+1) a_{j+2}+\left(-j^{2}-j-2|m| j+\frac{c}{\hbar^{2}}-|m|^{2}-|m|\right) a_{j}\right] w^{j}=0
\)

Setting the coefficient of $w^{j}$ equal to zero, we get the recursion relation

\(
\begin{equation}
a_{j+2}=\frac{(j+|m|)(j+|m|+1)-c / \hbar^{2}}{(j+1)(j+2)} a_{j} \tag{5.91}
\end{equation}
\)

Just as in the harmonic-oscillator case, the general solution of (5.89) is an arbitrary linear combination of a series of even powers (whose coefficients are determined by $a_{0}$) and a series of odd powers (whose coefficients are determined by $a_{1}$). It can be shown that the infinite series defined by the recursion relation (5.91) does not give well-behaved eigenfunctions. [Many texts point out that the infinite series diverges at $w= \pm 1$. However, this is not sufficient cause to reject the infinite series, since the eigenfunctions might be quadratically integrable, even though infinite at two points. For a careful discussion, see M. Whippman, Am. J. Phys., 34, 656 (1966).] Hence, as in the harmonic-oscillator case, we must cause one of the series to break off, its last term being $a_{k} w^{k}$. We eliminate the other series by setting $a_{0}$ or $a_{1}$ equal to zero, depending on whether $k$ is odd or even.

Setting the coefficient of $a_{k}$ in (5.91) equal to zero, we have

\(
\begin{equation}
c=\hbar^{2}(k+|m|)(k+|m|+1), \quad k=0,1,2, \ldots \tag{5.92}
\end{equation}
\)

Since $|m|$ takes on the values $0,1,2, \ldots$, the quantity $k+|m|$ takes on the values $0,1,2, \ldots$ We therefore define the quantum number $l$ as

\(
\begin{equation}
l \equiv k+|m| \tag{5.93}
\end{equation}
\)

and the eigenvalues for the square of the magnitude of angular momentum are

\(
\begin{equation}
c=l(l+1) \hbar^{2}, \quad l=0,1,2, \ldots \tag{5.94}
\end{equation}
\)

The magnitude of the orbital angular momentum of a particle is

\(
\begin{equation}
|\mathbf{L}|=[l(l+1)]^{1 / 2} \hbar \tag{5.95}
\end{equation}
\)

From (5.93), it follows that $|m| \leq l$. The possible values for $m$ are thus

\(
\begin{equation}
m=-l,-l+1,-l+2, \ldots,-1,0,1, \ldots, l-2, l-1, l \tag{5.96}
\end{equation}
\)

Let us examine the angular-momentum eigenfunctions. From (5.83), (5.84), (5.88), (5.90), and (5.93), the theta factor in the eigenfunctions is

\(
\begin{equation}
S_{l, m}(\theta)=\sin ^{|m|} \theta \sum_{\substack{j=1,3, \ldots \\ \text { or } j=0,2, \ldots}}^{l-|m|} a_{j} \cos ^{j} \theta \tag{5.97}
\end{equation}
\)

where the sum is over even or odd values of $j$, depending on whether $l-|m|$ is even or odd. The coefficients $a_{j}$ satisfy the recursion relation (5.91), which, using (5.94), becomes

\(
\begin{equation}
a_{j+2}=\frac{(j+|m|)(j+|m|+1)-l(l+1)}{(j+1)(j+2)} a_{j} \tag{5.98}
\end{equation}
\)

The $\hat{L}^{2}$ and $\hat{L}_{z}$ eigenfunctions are given by Eqs. (5.72) and (5.81) as

\(
\begin{equation}
Y_{l}^{m}(\theta, \phi)=S_{l, m}(\theta) T(\phi)=\frac{1}{\sqrt{2 \pi}} S_{l, m}(\theta) e^{i m \phi} \tag{5.99}
\end{equation}
\)

The $m$ in $Y_{l}^{m}$ is a label, and not an exponent.
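
Before working the example, here is a symbolic spot check (sympy; an addition to these notes) that a spherical harmonic satisfies both eigenvalue equations, using the operator forms (5.67) and (5.68). Note that sympy's built-in Ynm includes a sign convention (the Condon-Shortley phase) that does not affect the eigenvalues:

import sympy as sp

theta, phi, hbar = sp.symbols('theta phi hbar')
l, m = 2, 1
Y = sp.Ynm(l, m, theta, phi).expand(func=True)   # explicit form of Y_2^1

Lz_Y = -sp.I*hbar*sp.diff(Y, phi)                # Eq. (5.67)
L2_Y = -hbar**2*(sp.diff(Y, theta, 2)            # Eq. (5.68)
                 + sp.cot(theta)*sp.diff(Y, theta)
                 + sp.diff(Y, phi, 2)/sp.sin(theta)**2)

print(sp.simplify(Lz_Y - m*hbar*Y))              # 0, i.e. Lz Y = m hbar Y
print(sp.simplify(L2_Y - l*(l+1)*hbar**2*Y))     # 0, i.e. L^2 Y = l(l+1) hbar^2 Y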

EXAMPLE

Find $Y_{l}^{m}(\theta, \phi)$ and the $\hat{L}^{2}$ and $\hat{L}_{z}$ eigenvalues for (a) $l=0$; (b) $l=1$.
(a) For $l=0$, Eq. (5.96) gives $m=0$, and (5.97) becomes

\(
\begin{equation}
S_{0,0}(\theta)=a_{0} \tag{5.100}
\end{equation}
\)

The normalization condition (5.80) gives

\(
\begin{aligned}
\int_{0}^{\pi}\left|a_{0}\right|^{2} \sin \theta d \theta=1 & =2\left|a_{0}\right|^{2} \\
\left|a_{0}\right| & =2^{-1 / 2}
\end{aligned}
\)

Equation (5.99) gives

\(
\begin{equation}
Y_{0}^{0}(\theta, \phi)=\frac{1}{\sqrt{4 \pi}} \tag{5.101}
\end{equation}
\)

[Obviously, (5.101) is an eigenfunction of $\hat{L}^{2}, \hat{L}_{x}, \hat{L}_{y}$, and $\hat{L}_{z}$, Eqs. (5.65)-(5.68).] For $l=0$, there is no angular dependence in the eigenfunction; we say that the eigenfunctions are spherically symmetric for $l=0$.

For $l=0$ and $m=0$, Eqs. (5.69), (5.70), (5.75), and (5.94) give the $\hat{L}^{2}$ eigenvalue as $c=0$ and the $\hat{L}_{z}$ eigenvalue as $b=0$.
(b) For $l=1$, the possible values for $m$ in (5.96) are $-1, 0$, and $1$. For $|m|=1$, (5.97) gives

\(
\begin{equation}
S_{1, \pm 1}(\theta)=a_{0} \sin \theta \tag{5.102}
\end{equation}
\)

$a_{0}$ in (5.102) is not necessarily the same as $a_{0}$ in (5.100). Normalization gives

\(
\begin{gathered}
1=\left|a_{0}\right|^{2} \int_{0}^{\pi} \sin ^{2} \theta \sin \theta d \theta=\left|a_{0}\right|^{2} \int_{-1}^{1}\left(1-w^{2}\right) d w \\
\left|a_{0}\right|=\sqrt{3} / 2
\end{gathered}
\)

where the substitution $w=\cos \theta$ was made. Thus $S_{1, \pm 1}=\left(3^{1 / 2} / 2\right) \sin \theta$ and (5.99) gives

\(
\begin{equation}
Y_{1}^{1}=(3 / 8 \pi)^{1 / 2} \sin \theta e^{i \phi}, \quad Y_{1}^{-1}=(3 / 8 \pi)^{1 / 2} \sin \theta e^{-i \phi} \tag{5.103}
\end{equation}
\)

For $l=1$ and $m=0$, we find (see the following exercise) $S_{1,0}=(3 / 2)^{1 / 2} \cos \theta$ and $Y_{1}^{0}=(3 / 4 \pi)^{1 / 2} \cos \theta$.

For $l=1$, (5.94) gives the $\hat{L}^{2}$ eigenvalue as $2 \hbar^{2}$; for $m=-1,0$, and 1 , (5.75) gives the $\hat{L}_{z}$ eigenvalues as $-\hbar, 0$, and $\hbar$, respectively.

EXERCISE Verify the expressions for $S_{1,0}$ and $Y_{1}^{0}$.

The functions $S_{l, m}(\theta)$ are well known in mathematics and are associated Legendre functions multiplied by a normalization constant. The associated Legendre functions are defined in Prob. 5.34. Table 5.1 gives the $S_{l, m}(\theta)$ functions for $l \leq 3$.

The angular-momentum eigenfunctions $Y_{l}^{m}$ in (5.99) are called spherical harmonics (or surface harmonics).

In summary, the one-particle orbital angular-momentum eigenfunctions and eigenvalues are [Eqs. (5.69), (5.70), (5.75), and (5.94)]

TABLE $5.1 \quad \boldsymbol{S}_{\boldsymbol{l}, \boldsymbol{m}}(\boldsymbol{\theta})$

\(
\begin{array}{ll}
l=0: & S_{0,0}=\frac{1}{2} \sqrt{2} \\
l=1: & S_{1,0}=\frac{1}{2} \sqrt{6} \cos \theta \\
& S_{1, \pm 1}=\frac{1}{2} \sqrt{3} \sin \theta \\
l=2: & S_{2,0}=\frac{1}{4} \sqrt{10}\left(3 \cos ^{2} \theta-1\right) \\
& S_{2, \pm 1}=\frac{1}{2} \sqrt{15} \sin \theta \cos \theta \\
& S_{2, \pm 2}=\frac{1}{4} \sqrt{15} \sin ^{2} \theta \\
l=3: & S_{3,0}=\frac{3}{4} \sqrt{14}\left(\frac{5}{3} \cos ^{3} \theta-\cos \theta\right) \\
& S_{3, \pm 1}=\frac{1}{8} \sqrt{42} \sin \theta\left(5 \cos ^{2} \theta-1\right) \\
& S_{3, \pm 2}=\frac{1}{4} \sqrt{105} \sin ^{2} \theta \cos \theta \\
& S_{3, \pm 3}=\frac{1}{8} \sqrt{70} \sin ^{3} \theta
\end{array}
\)

\(
\begin{equation}
\hat{L}^{2} Y_{l}^{m}(\theta, \phi)=l(l+1) \hbar^{2} Y_{l}^{m}(\theta, \phi), \quad l=0,1,2, \ldots \tag{5.104}
\end{equation}
\)

\(
\begin{equation}
\hat{L}_{z} Y_{l}^{m}(\theta, \phi)=m \hbar Y_{l}^{m}(\theta, \phi), \quad m=-l,-l+1, \ldots, l-1, l \tag{5.105}
\end{equation}
\)

where the eigenfunctions are given by (5.99). Often the symbol $m_{l}$ is used instead of $m$ for the $L_{z}$ quantum number. We shall later see that the spherical harmonics are orthogonal functions [Eq. (7.27)].
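
As a quick consistency check on Table 5.1 (sympy; an addition to these notes), the tabulated $S_{l, m}$ satisfy the normalization condition (5.80):

import sympy as sp

theta = sp.symbols('theta')
S_10 = sp.sqrt(6)/2 * sp.cos(theta)                 # S_{1,0} from Table 5.1
S_20 = sp.sqrt(10)/4 * (3*sp.cos(theta)**2 - 1)     # S_{2,0} from Table 5.1

for S in (S_10, S_20):
    # integral of |S|^2 sin(theta) over [0, pi] should equal 1, Eq. (5.80)
    print(sp.integrate(S**2 * sp.sin(theta), (theta, 0, sp.pi)))   # 1, then 1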

Since $l \geq|m|$, the magnitude $[l(l+1)]^{1 / 2} \hbar$ of the orbital angular momentum $\mathbf{L}$ is greater than the magnitude $|m| \hbar$ of its $z$ component $L_{z}$, except for $l=0$. If it were possible to have the angular-momentum magnitude equal to its $z$ component, this would mean that the $x$ and $y$ components were zero, and we would have specified all three components of $\mathbf{L}$. However, since the components of angular momentum do not commute with each other, we cannot do this. The one exception is when $l$ is zero. In this case, $|\mathbf{L}|^{2}=L_{x}^{2}+L_{y}^{2}+L_{z}^{2}$ has zero for its eigenvalue, and it must be true that all three components $L_{x}, L_{y}$, and $L_{z}$ have zero eigenvalues. From Eq. (5.12), the uncertainties in angular-momentum components satisfy

\(
\begin{equation}
\Delta L_{x} \Delta L_{y} \geq \frac{1}{2}\left|\int \Psi^{*}\left[\hat{L}_{x}, \hat{L}_{y}\right] \Psi d \tau\right|=\frac{\hbar}{2}\left|\int \Psi^{*} \hat{L}_{z} \Psi d \tau\right| \tag{5.106}
\end{equation}
\)

and two similar equations obtained by cyclic permutation. When the eigenvalues of $\hat{L}_{z}, \hat{L}_{x}$, and $\hat{L}_{y}$ are zero, that is, when $\hat{L}_{x} \Psi=0$, $\hat{L}_{y} \Psi=0$, and $\hat{L}_{z} \Psi=0$, the right-hand sides of (5.106) and the two similar equations are zero, and having $\Delta L_{x}=\Delta L_{y}=\Delta L_{z}=0$ is permitted. But what about the statement in Section 5.1 that to have simultaneous eigenfunctions of two operators the operators must commute? The answer is that this theorem refers to the possibility of having a complete set of eigenfunctions of one operator be eigenfunctions of the other operator. Thus, even though $\hat{L}_{x}$ and $\hat{L}_{z}$ do not commute, it is possible to have some of the eigenfunctions of $\hat{L}_{z}$ (those with $l=0=m$) be eigenfunctions of $\hat{L}_{x}$. However, it is impossible to have all the $\hat{L}_{z}$ eigenfunctions also be eigenfunctions of $\hat{L}_{x}$.

FIGURE 5.6 Orientation of L.

FIGURE 5.7 Orientations of $\mathbf{L}$ with respect to the $z$ axis for $l=1$.

Since we cannot specify $L_{x}$ and $L_{y}$, the vector $\mathbf{L}$ can lie anywhere on the surface of a cone whose axis is the $z$ axis, whose altitude is $m \hbar$, and whose slant height is $\sqrt{l(l+1)} \hbar$ (Fig. 5.6). The possible orientations of $\mathbf{L}$ with respect to the $z$ axis for the case $l=1$ are shown in Fig. 5.7. For each eigenvalue of $\hat{L}^{2}$, there are $2 l+1$ different eigenfunctions $Y_{l}^{m}$, corresponding to the $2 l+1$ values of $m$. We say that the $\hat{L}^{2}$ eigenvalues are $(2 l+1)$-fold degenerate. The term degeneracy is applicable to the eigenvalues of any operator, not just the Hamiltonian.

Of course, there is nothing special about the $z$ axis. All directions of space are equivalent. If we had chosen to specify $L^{2}$ and $L_{x}$ (rather than $L_{z}$), we would have gotten the same eigenvalues for $L_{x}$ as we found for $L_{z}$. However, it is easier to solve the $\hat{L}_{z}$ eigenvalue equation because $\hat{L}_{z}$ has a simple form in spherical coordinates, which involve the angle of rotation $\phi$ about the $z$ axis.


We found the eigenvalues of $\hat{L}^{2}$ and $\hat{L}_{z}$ by expressing these orbital angular-momentum operators as differential operators and solving the resulting differential equations. We now show that these eigenvalues can be found using only the operator commutation relations. The work in this section applies to any operators that satisfy the angular-momentum commutation relations. In particular, it applies to spin angular momentum (Chapter 10) as well as orbital angular momentum.

We used the letter $L$ for orbital angular momentum. Here we will use the letter $M$ to indicate that we are dealing with any kind of angular momentum. We have three linear operators $\hat{M}_{x}, \hat{M}_{y}$, and $\hat{M}_{z}$, and all we know about them is that they obey the commutation relations [similar to (5.46) and (5.48)]

\(
\begin{equation}
\left[\hat{M}_{x}, \hat{M}_{y}\right]=i \hbar \hat{M}_{z}, \quad\left[\hat{M}_{y}, \hat{M}_{z}\right]=i \hbar \hat{M}_{x}, \quad\left[\hat{M}_{z}, \hat{M}_{x}\right]=i \hbar \hat{M}_{y} \tag{5.107}
\end{equation}
\)

We define the operator $\hat{M}^{2}$ as

\(
\begin{equation}
\hat{M}^{2}=\hat{M}_{x}^{2}+\hat{M}_{y}^{2}+\hat{M}_{z}^{2} \tag{5.108}
\end{equation}
\)

Our problem is to find the eigenvalues of $\hat{M}^{2}$ and $\hat{M}_{z}$.
We begin by evaluating the commutators of $\hat{M}^{2}$ with its components, using Eqs. (5.107) and (5.108). The work is identical with that used to derive Eqs. (5.49) and (5.50), and we have

\(
\begin{equation}
\left[\hat{M}^{2}, \hat{M}_{x}\right]=\left[\hat{M}^{2}, \hat{M}_{y}\right]=\left[\hat{M}^{2}, \hat{M}_{z}\right]=0 \tag{5.109}
\end{equation}
\)

Hence we can have simultaneous eigenfunctions of $\hat{M}^{2}$ and $\hat{M}_{z}$.
Next we define two new operators, the raising operator $\hat{M}_{+}$ and the lowering operator $\hat{M}_{-}$:

\(
\begin{align}
& \hat{M}_{+} \equiv \hat{M}_{x}+i \hat{M}_{y} \tag{5.110}\\
& \hat{M}_{-} \equiv \hat{M}_{x}-i \hat{M}_{y} \tag{5.111}
\end{align}
\)

These are examples of ladder operators. The reason for the terminology will become clear shortly. We have

\(
\begin{gather}
\hat{M}_{+} \hat{M}_{-}=\left(\hat{M}_{x}+i \hat{M}_{y}\right)\left(\hat{M}_{x}-i \hat{M}_{y}\right)=\hat{M}_{x}\left(\hat{M}_{x}-i \hat{M}_{y}\right)+i \hat{M}_{y}\left(\hat{M}_{x}-i \hat{M}_{y}\right) \\
=\hat{M}_{x}^{2}-i \hat{M}_{x} \hat{M}_{y}+i \hat{M}_{y} \hat{M}_{x}+\hat{M}_{y}^{2}=\hat{M}^{2}-\hat{M}_{z}^{2}+i\left[\hat{M}_{y}, \hat{M}_{x}\right] \\
\hat{M}_{+} \hat{M}_{-}=\hat{M}^{2}-\hat{M}_{z}^{2}+\hbar \hat{M}_{z} \tag{5.112}
\end{gather}
\)

Similarly, we find

\(
\begin{equation}
\hat{M}_{-} \hat{M}_{+}=\hat{M}^{2}-\hat{M}_{z}^{2}-\hbar \hat{M}_{z} \tag{5.113}
\end{equation}
\)

For the commutators of these operators with $\hat{M}_{z}$, we have

\(
\begin{gather}
\left[\hat{M}_{+}, \hat{M}_{z}\right]=\left[\hat{M}_{x}+i \hat{M}_{y}, \hat{M}_{z}\right]=\left[\hat{M}_{x}, \hat{M}_{z}\right]+i\left[\hat{M}_{y}, \hat{M}_{z}\right]=-i \hbar \hat{M}_{y}-\hbar \hat{M}_{x} \\
\left[\hat{M}_{+}, \hat{M}_{z}\right]=-\hbar \hat{M}_{+} \\
\hat{M}_{+} \hat{M}_{z}=\hat{M}_{z} \hat{M}_{+}-\hbar \hat{M}_{+} \tag{5.114}
\end{gather}
\)

where (5.107) was used. Similarly, we find

\(
\begin{equation}
\hat{M}_{-} \hat{M}_{z}=\hat{M}_{z} \hat{M}_{-}+\hbar \hat{M}_{-} \tag{5.115}
\end{equation}
\)
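
The operator identities (5.112)-(5.115) can be made concrete with explicit matrices. The numpy sketch below (an addition to these notes, with $\hbar$ set to 1) uses the standard $3 \times 3$ angular-momentum matrices for $j=1$ and confirms each identity:

import numpy as np

s = np.sqrt(2.0)
Mz = np.diag([1.0, 0.0, -1.0])                     # eigenvalues m = 1, 0, -1
Mp = np.array([[0, s, 0], [0, 0, s], [0, 0, 0]])   # raising operator M+
Mm = Mp.T                                          # lowering operator M-
Mx = (Mp + Mm) / 2
My = (Mp - Mm) / (2j)
M2 = Mx @ Mx + My @ My + Mz @ Mz                   # = j(j+1) I = 2 I for j = 1

print(np.allclose(Mp @ Mm, M2 - Mz @ Mz + Mz))     # (5.112): True
print(np.allclose(Mm @ Mp, M2 - Mz @ Mz - Mz))     # (5.113): True
print(np.allclose(Mp @ Mz, Mz @ Mp - Mp))          # (5.114): True
print(np.allclose(Mm @ Mz, Mz @ Mm + Mm))          # (5.115): True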

Using $Y$ for the common eigenfunctions of $\hat{M}^{2}$ and $\hat{M}_{z}$, we have

\(
\begin{align}
& \hat{M}^{2} Y=c Y \tag{5.116}\\
& \hat{M}_{z} Y=b Y \tag{5.117}
\end{align}
\)

where $c$ and $b$ are the eigenvalues. Operating on Eq. (5.117) with $\hat{M}_{+}$, we get

\(
\hat{M}_{+} \hat{M}_{z} Y=\hat{M}_{+} b Y
\)

Using Eq. (5.114) and the fact that $\hat{M}_{+}$ is linear, we have

\(
\begin{gather}
\left(\hat{M}_{z} \hat{M}_{+}-\hbar \hat{M}_{+}\right) Y=b \hat{M}_{+} Y \\
\hat{M}_{z}\left(\hat{M}_{+} Y\right)=(b+\hbar)\left(\hat{M}_{+} Y\right) \tag{5.118}
\end{gather}
\)

This last equation says that the function $\hat{M}_{+} Y$ is an eigenfunction of $\hat{M}_{z}$ with eigenvalue $b+\hbar$. In other words, operating on the eigenfunction $Y$ with the raising operator $\hat{M}_{+}$ converts $Y$ into another eigenfunction of $\hat{M}_{z}$ with eigenvalue $\hbar$ higher than the eigenvalue of $Y$. If we now apply the raising operator to (5.118) and use (5.114) again, we find similarly

\(
\hat{M}_{z}\left(\hat{M}_{+}^{2} Y\right)=(b+2 \hbar)\left(\hat{M}_{+}^{2} Y\right)
\)

Repeated application of the raising operator gives

\(
\begin{equation}
\hat{M}_{z}\left(\hat{M}_{+}^{k} Y\right)=(b+k \hbar)\left(\hat{M}_{+}^{k} Y\right), \quad k=0,1,2, \ldots \tag{5.119}
\end{equation}
\)

If we operate on (5.117) with the lowering operator and apply (5.115), we find in the same manner

\(
\begin{gather}
\hat{M}_{z}\left(\hat{M}_{-} Y\right)=(b-\hbar)\left(\hat{M}_{-} Y\right) \tag{5.120}\\
\hat{M}_{z}\left(\hat{M}_{-}^{k} Y\right)=(b-k \hbar)\left(\hat{M}_{-}^{k} Y\right) \tag{5.121}
\end{gather}
\)

Thus by using the raising and lowering operators on the eigenfunction with the eigenvalue $b$, we generate a ladder of eigenvalues, the difference from step to step being $\hbar$ :

\(
\ldots \quad b-2 \hbar, \quad b-\hbar, \quad b, \quad b+\hbar, \quad b+2 \hbar, \quad \ldots
\)

The functions $\hat{M}_{\pm}^{k} Y$ are eigenfunctions of $\hat{M}_{z}$ with eigenvalues $b \pm k \hbar$ [Eqs. (5.119) and (5.121)]. We now show that these functions are also eigenfunctions of $\hat{M}^{2}$, all with the same eigenvalue $c$ :

\(
\begin{gather}
\hat{M}_{z} \hat{M}_{\pm}^{k} Y=(b \pm k \hbar) \hat{M}_{\pm}^{k} Y \tag{5.122}\\
\hat{M}^{2} \hat{M}_{\pm}^{k} Y=c \hat{M}_{\pm}^{k} Y, \quad k=0,1,2, \ldots \tag{5.123}
\end{gather}
\)

To prove (5.123), we first show that $\hat{M}^{2}$ commutes with $\hat{M}_{+}$ and $\hat{M}_{-}$:

\(
\left[\hat{M}^{2}, \hat{M}_{\pm}\right]=\left[\hat{M}^{2}, \hat{M}_{x} \pm i \hat{M}_{y}\right]=\left[\hat{M}^{2}, \hat{M}_{x}\right] \pm i\left[\hat{M}^{2}, \hat{M}_{y}\right]=0 \pm 0=0
\)

We also have

\(
\left[\hat{M}^{2}, \hat{M}_{\pm}^{2}\right]=\left[\hat{M}^{2}, \hat{M}_{\pm}\right] \hat{M}_{\pm}+\hat{M}_{\pm}\left[\hat{M}^{2}, \hat{M}_{\pm}\right]=0+0=0
\)

and it follows by induction that

\(
\begin{equation}
\left[\hat{M}^{2}, \hat{M}_{\pm}^{k}\right]=0 \quad \text { or } \quad \hat{M}^{2} \hat{M}_{\pm}^{k}=\hat{M}_{\pm}^{k} \hat{M}^{2}, \quad k=0,1,2, \ldots \tag{5.124}
\end{equation}
\)

If we operate on (5.116) with $\hat{M}_{ \pm}^{k}$ and use (5.124), we get

\(
\begin{align}
\hat{M}_{\pm}^{k} \hat{M}^{2} Y & =\hat{M}_{\pm}^{k} c Y \\
\hat{M}^{2}\left(\hat{M}_{\pm}^{k} Y\right) & =c\left(\hat{M}_{\pm}^{k} Y\right) \tag{5.125}
\end{align}
\)

which is what we wanted to prove.
Next we show that the set of eigenvalues of $\hat{M}_{z}$ generated using the ladder operators must be bounded. For the particular eigenfunction $Y$ with $\hat{M}_{z}$ eigenvalue $b$, we have

\(
\hat{M}_{z} Y=b Y
\)

and for the set of eigenfunctions and eigenvalues generated by the ladder operators, we have

\(
\begin{equation}
\hat{M}_{z} Y_{k}=b_{k} Y_{k} \tag{5.126}
\end{equation}
\)

where

\(
\begin{gather}
Y_{k}=\hat{M}_{\pm}^{k} Y \tag{5.127}\\
b_{k}=b \pm k \hbar \tag{5.128}
\end{gather}
\)

(Application of $\hat{M}_{+}$ or $\hat{M}_{-}$ destroys the normalization of $Y$, so $Y_{k}$ is not normalized. For the normalization constant, see Prob. 10.27.)

Operating on (5.126) with $\hat{M}_{z}$, we have

\(
\begin{align}
\hat{M}_{z}^{2} Y_{k} & =b_{k} \hat{M}_{z} Y_{k} \\
\hat{M}_{z}^{2} Y_{k} & =b_{k}^{2} Y_{k} \tag{5.129}
\end{align}
\)

Now subtract (5.129) from (5.123), and use (5.127) and (5.108):

\(
\begin{align}
\hat{M}^{2} Y_{k}-\hat{M}_{z}^{2} Y_{k} & =c Y_{k}-b_{k}^{2} Y_{k} \\
\left(\hat{M}_{x}^{2}+\hat{M}_{y}^{2}\right) Y_{k} & =\left(c-b_{k}^{2}\right) Y_{k} \tag{5.130}
\end{align}
\)

The operator $\hat{M}_{x}^{2}+\hat{M}_{y}^{2}$ corresponds to a nonnegative physical quantity and hence has nonnegative eigenvalues. (This is proved in Prob. 7.11.) Therefore, (5.130) implies that $c-b_{k}^{2} \geq 0$ and $c^{1 / 2} \geq\left|b_{k}\right|$. Thus

\(
\begin{equation}
c^{1 / 2} \geq b_{k} \geq-c^{1 / 2}, \quad k=0, \pm 1, \pm 2, \ldots \tag{5.131}
\end{equation}
\)

Since $c$ remains constant as $k$ varies, (5.131) shows that the set of eigenvalues $b_{k}$ is bounded above and below. Let $b_{\max }$ and $b_{\min }$ denote the maximum and minimum values of $b_{k}$. $Y_{\max }$ and $Y_{\min }$ are the corresponding eigenfunctions:

\(
\begin{align}
\hat{M}_{z} Y_{\max } & =b_{\max } Y_{\max } \tag{5.132}\\
\hat{M}_{z} Y_{\min } & =b_{\min } Y_{\min } \tag{5.133}
\end{align}
\)

Now operate on (5.132) with the raising operator and use (5.114):

\(
\begin{gather}
\hat{M}_{+} \hat{M}_{z} Y_{\max }=b_{\max } \hat{M}_{+} Y_{\max } \\
\hat{M}_{z}\left(\hat{M}_{+} Y_{\max }\right)=\left(b_{\max }+\hbar\right)\left(\hat{M}_{+} Y_{\max }\right) \tag{5.134}
\end{gather}
\)

This last equation seems to contradict the statement that $b_{\max }$ is the largest eigenvalue of $\hat{M}_{z}$, since it says that $\hat{M}_{+} Y_{\max }$ is an eigenfunction of $\hat{M}_{z}$ with eigenvalue $b_{\max }+\hbar$. The only way out of this contradiction is to have $\hat{M}_{+} Y_{\max }$ vanish. (We always reject zero as an eigenfunction on physical grounds.) Thus

\(
\begin{equation}
\hat{M}_{+} Y_{\max }=0 \tag{5.135}
\end{equation}
\)

Operating on (5.135) with the lowering operator and using (5.113), (5.132), and (5.116), we have

\(
\begin{gather}
0=\hat{M}_{-} \hat{M}_{+} Y_{\max }=\left(\hat{M}^{2}-\hat{M}_{z}^{2}-\hbar \hat{M}_{z}\right) Y_{\max }=\left(c-b_{\max }^{2}-\hbar b_{\max }\right) Y_{\max } \\
c-b_{\max }^{2}-\hbar b_{\max }=0 \\
c=b_{\max }^{2}+\hbar b_{\max } \tag{5.136}
\end{gather}
\)

A similar argument shows that

\(
\begin{equation}
\hat{M}_{-} Y_{\min }=0 \tag{5.137}
\end{equation}
\)

and by applying the raising operator to this equation and using (5.112), we find

\(
c=b_{\min }^{2}-\hbar b_{\min }
\)

Subtracting this last equation from (5.136), we have

\(
b_{\max }^{2}+\hbar b_{\max }+\left(\hbar b_{\min }-b_{\min }^{2}\right)=0
\)

This is a quadratic equation in the unknown $b_{\max }$, and using the usual formula (it still works in quantum mechanics), we find

\(
b_{\max }=-b_{\min }, \quad b_{\max }=b_{\min }-\hbar
\)

The second root is rejected, since it says that $b_{\max }$ is less than $b_{\min }$. So

\(
\begin{equation}
b_{\min }=-b_{\max } \tag{5.138}
\end{equation}
\)

Moreover, (5.128) says that $b_{\max }$ and $b_{\min }$ differ by an integral multiple of $\hbar$ :

\(
\begin{equation}
b_{\max }-b_{\min }=n \hbar, \quad n=0,1,2, \ldots \tag{5.139}
\end{equation}
\)

Substituting (5.138) in (5.139), we have for the $\hat{M}_{z}$ eigenvalues

\(
\begin{gather}
b_{\max }=\frac{1}{2} n \hbar \\
b_{\max }=j \hbar, \quad j=0, \frac{1}{2}, 1, \frac{3}{2}, 2, \ldots \tag{5.140}\\
b_{\min }=-j \hbar \\
b=-j \hbar,(-j+1) \hbar,(-j+2) \hbar, \ldots,(j-2) \hbar,(j-1) \hbar, j \hbar \tag{5.141}
\end{gather}
\)

and from (5.136) we find as the $\hat{M}^{2}$ eigenvalues

\(
\begin{equation}
c=j(j+1) \hbar^{2}, \quad j=0, \frac{1}{2}, 1, \frac{3}{2}, \ldots \tag{5.142}
\end{equation}
\)

Thus

\(
\begin{gather}
\hat{M}^{2} Y=j(j+1) \hbar^{2} Y, \quad j=0, \frac{1}{2}, 1, \frac{3}{2}, 2, \ldots \tag{5.143}\\
\hat{M}_{z} Y=m_{j} \hbar Y, \quad m_{j}=-j,-j+1, \ldots, j-1, j \tag{5.144}
\end{gather}
\)

We have found the eigenvalues of $\hat{M}^{2}$ and $\hat{M}_{z}$ using just the commutation relations. However, comparison of (5.143) and (5.144) with (5.104) and (5.105) shows that in addition to integral values for the angular-momentum quantum number $(l=0,1,2, \ldots)$ we now also have the possibility for half-integral values $\left(j=0, \frac{1}{2}, 1, \frac{3}{2}, \ldots\right)$. This perhaps suggests that there might be another kind of angular momentum besides orbital angular momentum. In Chapter 10 we shall see that spin angular momentum can have half-integral, as well as integral, quantum numbers. For orbital angular momentum, the boundary condition of single-valuedness of the $T(\phi)$ eigenfunctions [see the equation following (5.73)] eliminates the half-integral values of the angular-momentum quantum numbers. [Not everyone accepts single-valuedness as a valid boundary condition on wave functions, and many other reasons have been given for rejecting half-integral orbital-angular-momentum quantum numbers; see C. G. Gray, Am. J. Phys., 37, 559 (1969); M. L. Whippman, Am. J. Phys., 34, 656 (1966).]
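
The half-integral case is realized concretely by the spin-1/2 matrices. In the numpy sketch below (an addition to these notes, with $\hbar$ set to 1), $\hat{M}_{x}, \hat{M}_{y}, \hat{M}_{z}$ are one-half the Pauli matrices; they obey (5.107), give $M^{2}=j(j+1)=3/4$ with $j=\frac{1}{2}$, and have eigenvalues $m_{j}=\pm \frac{1}{2}$:

import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
Mx, My, Mz = sx/2, sy/2, sz/2

print(np.allclose(Mx @ My - My @ Mx, 1j * Mz))     # [Mx, My] = i Mz: True
M2 = Mx @ Mx + My @ My + Mz @ Mz
print(np.allclose(M2, 0.75 * np.eye(2)))           # j(j+1) = 3/4: True
print(np.linalg.eigvalsh(Mz))                      # [-0.5, 0.5] = m_j values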

The ladder-operator method can be used to solve other eigenvalue problems; see Prob. 5.36.


The Hydrogen Atom

Click the keywords to know more about it  

Central Force: A force that is directed along the line connecting the center of the force and the particle, and whose magnitude depends only on the distance between the particle and the center

Hamiltonian Operator: An operator used in quantum mechanics to describe the total energy of a system. It includes both kinetic and potential energy terms

Laplacian Operator: A differential operator that describes the rate at which a function changes at a point relative to the average rate of change at nearby points. In spherical coordinates, it is used to transform the Schrödinger equation

Orbital Angular Momentum: A measure of the amount of rotation a particle has in a quantum system. It is quantized and described by the quantum number $l$

Spherical Harmonics: Functions that describe the angular part of the wave function in a spherically symmetric potential. They are used to solve the Schrödinger equation in spherical coordinates

Radial Distribution Function: A function that describes the probability of finding an electron at a certain distance from the nucleus. It is derived from the radial part of the wave function

Reduced Mass: The effective mass of a two-particle system, used to simplify the Schrödinger equation for systems like the hydrogen atom

Bohr Radius: The average distance between the nucleus and the electron in the ground state of the hydrogen atom, approximately 0.529 Å

Coulomb's Law: A law describing the force between two charged particles. In the context of the hydrogen atom, it describes the attractive force between the electron and the nucleus

Schrödinger Equation: A fundamental equation in quantum mechanics that describes how the quantum state of a physical system changes over time

Eigenfunctions and Eigenvalues: Solutions to the Schrödinger equation. Eigenfunctions describe the possible states of the system, and eigenvalues describe the corresponding energy levels

Quantum Numbers: Numbers that describe the properties of electrons in atoms. They include the principal quantum number $n$, the orbital angular momentum quantum number $l$, and the magnetic quantum number $m$

Degeneracy: The condition where multiple quantum states have the same energy level. In the hydrogen atom, energy levels are degenerate with respect to the quantum numbers $l$ and $m$

Continuum States: States where the electron is not bound to the nucleus and has a positive energy. These states correspond to ionized atoms

Bound States: States where the electron is bound to the nucleus and has a negative energy. These states correspond to discrete energy levels

Zeeman Effect: The splitting of atomic spectral lines in the presence of an external magnetic field, due to the interaction between the magnetic field and the magnetic dipole moment of the electron

Before studying the hydrogen atom, we shall consider the more general problem of a single particle moving under a central force. The results of this section will apply to any central-force problem. Examples are the hydrogen atom (Section 6.5) and the isotropic three-dimensional harmonic oscillator (Prob. 6.3).

A central force is one derived from a potential-energy function that is spherically symmetric, which means that it is a function only of the distance of the particle from the origin: $V=V(r)$. The relation between force and potential energy is given by (5.31) as

\(
\begin{equation}
\mathbf{F}=-\nabla V(x, y, z)=-\mathbf{i}(\partial V / \partial x)-\mathbf{j}(\partial V / \partial y)-\mathbf{k}(\partial V / \partial z) \tag{6.1}
\end{equation}
\)

The partial derivatives in (6.1) can be found by the chain rule [Eqs. (5.53)-(5.55)]. Since $V$ in this case is a function of $r$ only, we have $(\partial V / \partial \theta)_{r, \phi}=0$ and $(\partial V / \partial \phi)_{r, \theta}=0$. Therefore,

\(
\begin{gather}
\left(\frac{\partial V}{\partial x}\right)_{y, z}=\frac{d V}{d r}\left(\frac{\partial r}{\partial x}\right)_{y, z}=\frac{x}{r} \frac{d V}{d r} \tag{6.2}\\
\left(\frac{\partial V}{\partial y}\right)_{x, z}=\frac{y}{r} \frac{d V}{d r}, \quad\left(\frac{\partial V}{\partial z}\right)_{x, y}=\frac{z}{r} \frac{d V}{d r} \tag{6.3}
\end{gather}
\)

where Eqs. (5.57) and (5.58) have been used. Equation (6.1) becomes

\(
\begin{equation}
\mathbf{F}=-\frac{1}{r} \frac{d V}{d r}(x \mathbf{i}+y \mathbf{j}+z \mathbf{k})=-\frac{d V(r)}{d r} \frac{\mathbf{r}}{r} \tag{6.4}
\end{equation}
\)

where (5.33) for $\mathbf{r}$ was used. The quantity $\mathbf{r} / r$ in (6.4) is a unit vector in the radial direction. A central force is radially directed.

Now we consider the quantum mechanics of a single particle subject to a central force. The Hamiltonian operator is

\(
\begin{equation}
\hat{H}=\hat{T}+\hat{V}=-\left(\hbar^{2} / 2 m\right) \nabla^{2}+V(r) \tag{6.5}
\end{equation}
\)

where $\nabla^{2} \equiv \partial^{2} / \partial x^{2}+\partial^{2} / \partial y^{2}+\partial^{2} / \partial z^{2}$ [Eq. (3.46)]. Since $V$ is spherically symmetric, we shall work in spherical coordinates. Hence we want to transform the Laplacian operator to these coordinates. We already have the forms of the operators $\partial / \partial x, \partial / \partial y$, and $\partial / \partial z$ in these coordinates [Eqs. (5.62)-(5.64)], and by squaring each of these operators and adding the results, we get the Laplacian. This calculation is left as an exercise. The result is (Prob. 6.4)

\(
\begin{equation}
\nabla^{2}=\frac{\partial^{2}}{\partial r^{2}}+\frac{2}{r} \frac{\partial}{\partial r}+\frac{1}{r^{2}} \frac{\partial^{2}}{\partial \theta^{2}}+\frac{1}{r^{2}} \cot \theta \frac{\partial}{\partial \theta}+\frac{1}{r^{2} \sin ^{2} \theta} \frac{\partial^{2}}{\partial \phi^{2}} \tag{6.6}
\end{equation}
\)
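
A symbolic spot check of (6.6) (sympy; an addition to these notes, using the assumed test function $f=r^{2} \cos \theta=z r$): the spherical form of the Laplacian agrees with the Cartesian form.

import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

f_cart = z * r                                  # the test function in x, y, z
lap_cart = sum(sp.diff(f_cart, v, 2) for v in (x, y, z))

rs, ts = sp.symbols('rs ts', positive=True)
f_sph = rs**2 * sp.cos(ts)                      # the same function in r, theta
lap_sph = (sp.diff(f_sph, rs, 2) + 2/rs*sp.diff(f_sph, rs)
           + sp.diff(f_sph, ts, 2)/rs**2 + sp.cot(ts)*sp.diff(f_sph, ts)/rs**2)
lap_sph = lap_sph.subs({rs: r, ts: sp.acos(z/r)})

print(sp.simplify(lap_cart - lap_sph))          # 0 (both equal 4z/r)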

Looking back to (5.68), which gives the operator $\hat{L}^{2}$ for the square of the magnitude of the orbital angular momentum of a single particle, we see that

\(
\begin{equation}
\nabla^{2}=\frac{\partial^{2}}{\partial r^{2}}+\frac{2}{r} \frac{\partial}{\partial r}-\frac{1}{r^{2} \hbar^{2}} \hat{L}^{2} \tag{6.7}
\end{equation}
\)

The Hamiltonian (6.5) becomes

\(
\begin{equation}
\hat{H}=-\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2}}{\partial r^{2}}+\frac{2}{r} \frac{\partial}{\partial r}\right)+\frac{1}{2 m r^{2}} \hat{L}^{2}+V(r) \tag{6.8}
\end{equation}
\)

In classical mechanics a particle subject to a central force has its angular momentum conserved (Section 5.3). In quantum mechanics we might ask whether we can have states with definite values for both the energy and the angular momentum. To have the set of eigenfunctions of $\hat{H}$ also be eigenfunctions of $\hat{L}^{2}$, the commutator $\left[\hat{H}, \hat{L}^{2}\right]$ must vanish. We have

\(
\begin{gather}
\left[\hat{H}, \hat{L}^{2}\right]=\left[\hat{T}, \hat{L}^{2}\right]+\left[\hat{V}, \hat{L}^{2}\right] \\
\left[\hat{T}, \hat{L}^{2}\right]=\left[-\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2}}{\partial r^{2}}+\frac{2}{r} \frac{\partial}{\partial r}\right)+\frac{1}{2 m r^{2}} \hat{L}^{2}, \hat{L}^{2}\right] \\
\left[\hat{T}, \hat{L}^{2}\right]=-\frac{\hbar^{2}}{2 m}\left[\frac{\partial^{2}}{\partial r^{2}}+\frac{2}{r} \frac{\partial}{\partial r}, \hat{L}^{2}\right]+\frac{1}{2 m}\left[\frac{1}{r^{2}} \hat{L}^{2}, \hat{L}^{2}\right] \tag{6.9}
\end{gather}
\)

Recall that $\hat{L}^{2}$ involves only $\theta$ and $\phi$ and not $r$ [Eq. (5.68)]. Hence it commutes with every operator that involves only $r$. [To reach this conclusion, we must use relations like (5.47) with $x$ and $z$ replaced by $r$ and $\theta$.] Thus the first commutator in (6.9) is zero. Moreover, since any operator commutes with itself, the second commutator in (6.9) is zero. Therefore, $\left[\hat{T}, \hat{L}^{2}\right]=0$. Also, since $\hat{L}^{2}$ does not involve $r$, and $V$ is a function of $r$ only, we have $\left[\hat{V}, \hat{L}^{2}\right]=0$. Therefore,

\(
\begin{equation}
\left[\hat{H}, \hat{L}^{2}\right]=0 \quad \text { if } V=V(r) \tag{6.10}
\end{equation}
\)

$\hat{H}$ commutes with $\hat{L}^{2}$ when the potential-energy function is independent of $\theta$ and $\phi$.
Now consider the operator $\hat{L}_{z}=-i \hbar \partial / \partial \phi$ [Eq. (5.67)]. Since $\hat{L}_{z}$ does not involve $r$ and since it commutes with $\hat{L}^{2}$ [Eq. (5.50)], it follows that $\hat{L}_{z}$ commutes with the Hamiltonian (6.8):

\(
\begin{equation}
\left[\hat{H}, \hat{L}_{z}\right]=0 \quad \text { if } V=V(r) \tag{6.11}
\end{equation}
\)

We can therefore have a set of simultaneous eigenfunctions of $\hat{H}, \hat{L}^{2}$, and $\hat{L}_{z}$ for the central-force problem. Let $\psi$ denote these common eigenfunctions:

\(
\begin{gather}
\hat{H} \psi=E \psi \tag{6.12}\\
\hat{L}^{2} \psi=l(l+1) \hbar^{2} \psi, \quad l=0,1,2, \ldots \tag{6.13}\\
\hat{L}_{z} \psi=m \hbar \psi, \quad m=-l,-l+1, \ldots, l \tag{6.14}
\end{gather}
\)

where Eqs. (5.104) and (5.105) were used.

Using (6.8) and (6.13), we have for the Schrödinger equation (6.12)

\(
\begin{gather}
-\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2} \psi}{\partial r^{2}}+\frac{2}{r} \frac{\partial \psi}{\partial r}\right)+\frac{1}{2 m r^{2}} \hat{L}^{2} \psi+V(r) \psi=E \psi \\
-\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2} \psi}{\partial r^{2}}+\frac{2}{r} \frac{\partial \psi}{\partial r}\right)+\frac{l(l+1) \hbar^{2}}{2 m r^{2}} \psi+V(r) \psi=E \psi \tag{6.15}
\end{gather}
\)

The eigenfunctions of $\hat{L}^{2}$ are the spherical harmonics $Y_{l}^{m}(\theta, \phi)$, and since $\hat{L}^{2}$ does not involve $r$, we can multiply $Y_{l}^{m}$ by an arbitrary function of $r$ and still have eigenfunctions of $\hat{L}^{2}$ and $\hat{L}_{z}$. Therefore,

\(
\begin{equation}
\psi=R(r) Y_{l}^{m}(\theta, \phi) \tag{6.16}
\end{equation}
\)

Using (6.16) in (6.15), we then divide both sides by $Y_{l}^{m}$ to get an ordinary differential equation for the unknown function $R(r)$ :

\(
\begin{equation}
-\frac{\hbar^{2}}{2 m}\left(R^{\prime \prime}+\frac{2}{r} R^{\prime}\right)+\frac{l(l+1) \hbar^{2}}{2 m r^{2}} R+V(r) R=E R(r) \tag{6.17}
\end{equation}
\)

We have shown that, for any one-particle problem with a spherically symmetric potential-energy function $V(r)$, the stationary-state wave functions are $\psi=R(r) Y_{l}^{m}(\theta, \phi)$, where the radial factor $R(r)$ satisfies (6.17). By using a specific form for $V(r)$ in (6.17), we can solve it for a particular problem.


Up to this point, we have solved only one-particle quantum-mechanical problems. The hydrogen atom is a two-particle system, and as a preliminary to dealing with the H atom, we first consider a simpler case, that of two noninteracting particles.

Suppose that a system is composed of the noninteracting particles 1 and 2. Let $q_{1}$ and $q_{2}$ symbolize the coordinates $\left(x_{1}, y_{1}, z_{1}\right)$ and $\left(x_{2}, y_{2}, z_{2}\right)$ of particles 1 and 2. Because the particles exert no forces on each other, the classical-mechanical energy of the system is the sum of the energies of the two particles: $E=E_{1}+E_{2}=T_{1}+V_{1}+T_{2}+V_{2}$, and the classical Hamiltonian is the sum of Hamiltonians for each particle: $H=H_{1}+H_{2}$. Therefore, the Hamiltonian operator is

\(
\hat{H}=\hat{H}_{1}+\hat{H}_{2}
\)

where $\hat{H}_{1}$ involves only the coordinates $q_{1}$ and the momentum operators $\hat{p}_{1}$ that correspond to $q_{1}$. The Schrödinger equation for the system is

\(
\begin{equation}
\left(\hat{H}_{1}+\hat{H}_{2}\right) \psi\left(q_{1}, q_{2}\right)=E \psi\left(q_{1}, q_{2}\right) \tag{6.18}
\end{equation}
\)

We try a solution of (6.18) by separation of variables, setting

\(
\begin{equation}
\psi\left(q_{1}, q_{2}\right)=G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right) \tag{6.19}
\end{equation}
\)

We have

\(
\begin{equation}
\hat{H}_{1} G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right)+\hat{H}_{2} G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right)=E G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right) \tag{6.20}
\end{equation}
\)

Since $\hat{H}_{1}$ involves only the coordinate and momentum operators of particle 1, we have $\hat{H}_{1}\left[G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right)\right]=G_{2}\left(q_{2}\right) \hat{H}_{1} G_{1}\left(q_{1}\right)$, since, as far as $\hat{H}_{1}$ is concerned, $G_{2}$ is a constant. Using this equation and a similar equation for $\hat{H}_{2}$, we find that (6.20) becomes

\(
\begin{equation}
G_{2}\left(q_{2}\right) \hat{H}_{1} G_{1}\left(q_{1}\right)+G_{1}\left(q_{1}\right) \hat{H}_{2} G_{2}\left(q_{2}\right)=E G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right) \tag{6.21}
\end{equation}
\)

Dividing this equation by $G_{1} G_{2}$, we have

\(
\begin{equation}
\frac{\hat{H}_{1} G_{1}\left(q_{1}\right)}{G_{1}\left(q_{1}\right)}+\frac{\hat{H}_{2} G_{2}\left(q_{2}\right)}{G_{2}\left(q_{2}\right)}=E \tag{6.22}
\end{equation}
\)

Now, by the same arguments used in connection with Eq. (3.65), we conclude that each term on the left in (6.22) must be a constant. Using $E_{1}$ and $E_{2}$ to denote these constants, we have

\(
\begin{gather}
\frac{\hat{H}_{1} G_{1}\left(q_{1}\right)}{G_{1}\left(q_{1}\right)}=E_{1}, \quad \frac{\hat{H}_{2} G_{2}\left(q_{2}\right)}{G_{2}\left(q_{2}\right)}=E_{2} \
E=E_{1}+E_{2} \tag{6.23}
\end{gather}
\)

Thus, when the system is composed of two noninteracting particles, we can reduce the two-particle problem to two separate one-particle problems by solving

\(
\begin{equation}
\hat{H}_{1} G_{1}\left(q_{1}\right)=E_{1} G_{1}\left(q_{1}\right), \quad \hat{H}_{2} G_{2}\left(q_{2}\right)=E_{2} G_{2}\left(q_{2}\right) \tag{6.24}
\end{equation}
\)

which are separate Schrödinger equations for each particle.
Generalizing this result to $n$ noninteracting particles, we have

\(
\begin{gather}
\hat{H}=\hat{H}_{1}+\hat{H}_{2}+\cdots+\hat{H}_{n} \
\psi\left(q_{1}, q_{2}, \ldots, q_{n}\right)=G_{1}\left(q_{1}\right) G_{2}\left(q_{2}\right) \cdots G_{n}\left(q_{n}\right) \tag{6.25}\
E=E_{1}+E_{2}+\cdots+E_{n} \tag{6.26}\
\hat{H}_{i} G_{i}=E_{i} G_{i}, \quad i=1,2, \ldots, n \tag{6.27}
\end{gather}
\)

For a system of noninteracting particles, the energy is the sum of the individual energies of each particle and the wave function is the product of wave functions for each particle. The wave function $G_{i}$ of particle $i$ is found by solving a Schrödinger equation for particle $i$ using the Hamiltonian $\hat{H}_{i}$.

These results also apply to a single particle whose Hamiltonian is the sum of separate terms for each coordinate:

\(
\hat{H}=\hat{H}_{x}\left(\hat{x}, \hat{p}_{x}\right)+\hat{H}_{y}\left(\hat{y}, \hat{p}_{y}\right)+\hat{H}_{z}\left(\hat{z}, \hat{p}_{z}\right)
\)

In this case, we conclude that the wave functions and energies are

\(
\begin{gathered}
\psi(x, y, z)=F(x) G(y) K(z), \quad E=E_{x}+E_{y}+E_{z} \
\hat{H}_{x} F(x)=E_{x} F(x), \quad \hat{H}_{y} G(y)=E_{y} G(y), \quad \hat{H}_{z} K(z)=E_{z} K(z)
\end{gathered}
\)

Examples include the particle in a three-dimensional box (Section 3.5), the three-dimensional free particle (Prob. 3.42), and the three-dimensional harmonic oscillator (Prob. 4.20).
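As a quick numerical illustration of this separability, here is a minimal sketch (Python, using scipy.constants; the 1 nm box edge and the quantum numbers are illustrative choices of ours, not values from the text) that builds the three-dimensional-box energy as a sum of three one-dimensional box energies:

```python
# A minimal sketch of the separation theorem: for the particle in a
# three-dimensional box, H = Hx + Hy + Hz, so E = Ex + Ey + Ez with each
# term a one-dimensional particle-in-a-box energy E_n = n^2 h^2/(8 m L^2).
import scipy.constants as const

def box_1d_energy(n, length, mass=const.m_e):
    """One-dimensional particle-in-a-box energy level."""
    return n**2 * const.h**2 / (8 * mass * length**2)

def box_3d_energy(nx, ny, nz, lx, ly, lz, mass=const.m_e):
    """Separable 3-D box energy: the sum of three 1-D energies."""
    return (box_1d_energy(nx, lx, mass) + box_1d_energy(ny, ly, mass)
            + box_1d_energy(nz, lz, mass))

L = 1.0e-9  # box edge of 1 nm, an assumed illustrative value
E = box_3d_energy(1, 1, 1, L, L, L)
print(f"E(1,1,1) = {E:.4e} J = {E / const.eV:.4f} eV")
```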


The hydrogen atom contains two particles, the proton and the electron. For a system of two particles 1 and 2 with coordinates $\left(x_{1}, y_{1}, z_{1}\right)$ and $\left(x_{2}, y_{2}, z_{2}\right)$, the potential energy of interaction between the particles is usually a function of only the relative coordinates $x_{2}-x_{1}, y_{2}-y_{1}$, and $z_{2}-z_{1}$ of the particles. In this case the two-particle problem can be simplified to two separate one-particle problems, as we now prove.

FIGURE 6.1 A two-particle system with center of mass at $C$.

Consider the classical-mechanical treatment of two interacting particles of masses $m_{1}$ and $m_{2}$. We specify their positions by the radius vectors $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$ drawn from the origin of a Cartesian coordinate system (Fig. 6.1). Particles 1 and 2 have coordinates $\left(x_{1}, y_{1}, z_{1}\right)$ and $\left(x_{2}, y_{2}, z_{2}\right)$. We draw the vector $\mathbf{r}=\mathbf{r}_{2}-\mathbf{r}_{1}$ from particle 1 to 2 and denote the components of $\mathbf{r}$ by $x, y$, and $z$:

\(
\begin{equation}
x=x_{2}-x_{1}, \quad y=y_{2}-y_{1}, \quad z=z_{2}-z_{1} \tag{6.28}
\end{equation}
\)

The coordinates $x, y$, and $z$ are called the relative or internal coordinates.
We now draw the vector $\mathbf{R}$ from the origin to the system's center of mass, point $C$, and denote the coordinates of $C$ by $X, Y$, and $Z$ :

\(
\begin{equation}
\mathbf{R}=\mathbf{i} X+\mathbf{j} Y+\mathbf{k} Z \tag{6.29}
\end{equation}
\)

The definition of the center of mass of this two-particle system gives

\(
\begin{equation}
X=\frac{m_{1} x_{1}+m_{2} x_{2}}{m_{1}+m_{2}}, \quad Y=\frac{m_{1} y_{1}+m_{2} y_{2}}{m_{1}+m_{2}}, \quad Z=\frac{m_{1} z_{1}+m_{2} z_{2}}{m_{1}+m_{2}} \tag{6.30}
\end{equation}
\)

These three equations are equivalent to the vector equation

\(
\begin{equation}
\mathbf{R}=\frac{m_{1} \mathbf{r}_{1}+m_{2} \mathbf{r}_{2}}{m_{1}+m_{2}} \tag{6.31}
\end{equation}
\)

We also have

\(
\begin{equation}
\mathbf{r}=\mathbf{r}_{2}-\mathbf{r}_{1} \tag{6.32}
\end{equation}
\)

We regard (6.31) and (6.32) as simultaneous equations in the two unknowns $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$ and solve for them to get

\(
\begin{equation}
\mathbf{r}_{1}=\mathbf{R}-\frac{m_{2}}{m_{1}+m_{2}} \mathbf{r}, \quad \mathbf{r}_{2}=\mathbf{R}+\frac{m_{1}}{m_{1}+m_{2}} \mathbf{r} \tag{6.33}
\end{equation}
\)

Equations (6.31) and (6.32) represent a transformation of coordinates from $x_{1}, y_{1}, z_{1}, x_{2}, y_{2}, z_{2}$ to $X, Y, Z, x, y, z$. Consider what happens to the Hamiltonian under this transformation. Let an overhead dot indicate differentiation with respect to time. The velocity of particle 1 is [Eq. (5.34)] $\mathbf{v}_{1}=d \mathbf{r}_{1} / d t=\dot{\mathbf{r}}_{1}$. The kinetic energy is the sum of the kinetic energies of the two particles:

\(
\begin{equation}
T=\frac{1}{2} m_{1}\left|\dot{\mathbf{r}}_{1}\right|^{2}+\frac{1}{2} m_{2}\left|\dot{\mathbf{r}}_{2}\right|^{2} \tag{6.34}
\end{equation}
\)

Introducing the time derivatives of Eqs. (6.33) into (6.34), we have

\(
\begin{aligned}
T= & \frac{1}{2} m_{1}\left(\dot{\mathbf{R}}-\frac{m_{2}}{m_{1}+m_{2}} \dot{\mathbf{r}}\right) \cdot\left(\dot{\mathbf{R}}-\frac{m_{2}}{m_{1}+m_{2}} \dot{\mathbf{r}}\right) \
& +\frac{1}{2} m_{2}\left(\dot{\mathbf{R}}+\frac{m_{1}}{m_{1}+m_{2}} \dot{\mathbf{r}}\right) \cdot\left(\dot{\mathbf{R}}+\frac{m_{1}}{m_{1}+m_{2}} \dot{\mathbf{r}}\right)
\end{aligned}
\)

where $|\mathbf{A}|^{2}=\mathbf{A} \cdot \mathbf{A}$ [Eq. (5.24)] has been used. Using the distributive law for the dot products, we find, after simplifying,

\(
\begin{equation}
T=\frac{1}{2}\left(m_{1}+m_{2}\right)|\dot{\mathbf{R}}|^{2}+\frac{1}{2} \frac{m_{1} m_{2}}{m_{1}+m_{2}}|\dot{\mathbf{r}}|^{2} \tag{6.35}
\end{equation}
\)

Let $M$ be the total mass of the system:

\(
\begin{equation}
M \equiv m_{1}+m_{2} \tag{6.36}
\end{equation}
\)

We define the reduced mass $\mu$ of the two-particle system as

\(
\begin{equation}
\mu \equiv \frac{m_{1} m_{2}}{m_{1}+m_{2}} \tag{6.37}
\end{equation}
\)

Then

\(
\begin{equation}
T=\frac{1}{2} M|\dot{\mathbf{R}}|^{2}+\frac{1}{2} \mu|\dot{\mathbf{r}}|^{2} \tag{6.38}
\end{equation}
\)

The first term in (6.38) is the kinetic energy due to translational motion of the whole system of mass $M$. Translational motion is motion in which each particle undergoes the same displacement. The quantity $\frac{1}{2} M|\dot{\mathbf{R}}|^{2}$ would be the kinetic energy of a hypothetical particle of mass $M$ located at the center of mass. The second term in (6.38) is the kinetic energy of internal (relative) motion of the two particles. This internal motion is of two types. The distance $r$ between the two particles can change (vibration), and the direction of the $\mathbf{r}$ vector can change (rotation). Note that $|\dot{\mathbf{r}}|=|d \mathbf{r} / d t| \neq d|\mathbf{r}| / d t$.

Corresponding to the original coordinates $x_{1}, y_{1}, z_{1}, x_{2}, y_{2}, z_{2}$, we have six linear momenta:

\(
\begin{equation}
p_{x_{1}}=m_{1} \dot{x}_{1}, \quad \ldots, \quad p_{z_{2}}=m_{2} \dot{z}_{2} \tag{6.39}
\end{equation}
\)

Comparing Eqs. (6.34) and (6.38), we define the six linear momenta for the new coordinates $X, Y, Z, x, y, z$ as

\(
\begin{array}{rlrl}
p_{X} & \equiv M \dot{X}, & p_{Y} \equiv M \dot{Y}, & p_{Z} \equiv M \dot{Z} \
p_{x} \equiv \mu \dot{x}, & p_{y} \equiv \mu \dot{y}, & p_{z} \equiv \mu \dot{z}
\end{array}
\)

We define two new momentum vectors as

\(
\mathbf{p}_{M} \equiv \mathbf{i} M \dot{X}+\mathbf{j} M \dot{Y}+\mathbf{k} M \dot{Z} \quad \text { and } \quad \mathbf{p}_{\mu} \equiv \mathbf{i} \mu \dot{x}+\mathbf{j} \mu \dot{y}+\mathbf{k} \mu \dot{z}
\)

Introducing these momenta into (6.38), we have

\(
\begin{equation}
T=\frac{\left|\mathbf{p}_{M}\right|^{2}}{2 M}+\frac{\left|\mathbf{p}_{\mu}\right|^{2}}{2 \mu} \tag{6.40}
\end{equation}
\)

Now consider the potential energy. We make the restriction that $V$ is a function only of the relative coordinates $x, y$, and $z$ of the two particles:

\(
\begin{equation}
V=V(x, y, z) \tag{6.41}
\end{equation}
\)

An example of (6.41) is two charged particles interacting according to Coulomb's law [see Eq. (3.53)]. With this restriction on $V$, the Hamiltonian function is

\(
\begin{equation}
H=\frac{p_{M}^{2}}{2 M}+\left[\frac{p_{\mu}^{2}}{2 \mu}+V(x, y, z)\right] \tag{6.42}
\end{equation}
\)

Now suppose we had a system composed of a particle of mass $M$ subject to no forces and a particle of mass $\mu$ subject to the potential-energy function $V(x, y, z)$. Further suppose that there was no interaction between these particles. If $(X, Y, Z)$ are the coordinates of the particle of mass $M$, and $(x, y, z)$ are the coordinates of the particle of mass $\mu$, what is the Hamiltonian of this hypothetical system? Clearly, it is identical with (6.42).

The Hamiltonian (6.42) can be viewed as the sum of the Hamiltonians $p_{M}^{2} / 2 M$ and $p_{\mu}^{2} / 2 \mu+V(x, y, z)$ of two hypothetical noninteracting particles with masses $M$ and $\mu$. Therefore, the results of Section 6.2 show that the system's quantum-mechanical energy is the sum of energies of the two hypothetical particles [Eq. (6.23)]: $E=E_{M}+E_{\mu}$. From Eqs. (6.24) and (6.42), the translational energy $E_{M}$ is found by solving the Schrödinger equation $\left(\hat{p}_{M}^{2} / 2 M\right) \psi_{M}=E_{M} \psi_{M}$. This is the Schrödinger equation for a free particle of mass $M$, so its possible eigenvalues are all nonnegative numbers: $E_{M} \geq 0$ [Eq. (2.31)]. From (6.24) and (6.42), the energy $E_{\mu}$ is found by solving the Schrödinger equation

\(
\begin{equation}
\left[\frac{\hat{p}_{\mu}^{2}}{2 \mu}+V(x, y, z)\right] \psi_{\mu}(x, y, z)=E_{\mu} \psi_{\mu}(x, y, z) \tag{6.43}
\end{equation}
\)

We have thus separated the problem of two particles interacting according to a potential-energy function $V(x, y, z)$ that depends on only the relative coordinates $x, y, z$ into two separate one-particle problems: (1) the translational motion of the entire system of mass $M$, which simply adds a nonnegative constant energy $E_{M}$ to the system's energy, and (2) the relative or internal motion, which is dealt with by solving the Schrödinger equation (6.43) for a hypothetical particle of mass $\mu$ whose coordinates are the relative coordinates $x, y, z$ and that moves subject to the potential energy $V(x, y, z)$.

For example, for the hydrogen atom, which is composed of an electron (e) and a proton (p), the atom's total energy is $E=E_{M}+E_{\mu}$, where $E_{M}$ is the translational energy of motion through space of the entire atom of mass $M=m_{e}+m_{p}$, and where $E_{\mu}$ is found by solving (6.43) with $\mu=m_{e} m_{p} /\left(m_{e}+m_{p}\right)$ and $V$ being the Coulomb's law potential energy of interaction of the electron and proton; see Section 6.5.
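A quick numerical check of this separation for hydrogen (a sketch in Python using scipy.constants; the library choice and the printed digits are ours, not from the text): the translational problem carries the total mass $M$, while the internal problem uses the reduced mass, which comes out only slightly smaller than $m_{e}$.

```python
# Total and reduced masses for the hydrogen atom: M governs the free
# translational motion, mu the internal (relative) motion of Eq. (6.43).
import scipy.constants as const

m_e, m_p = const.m_e, const.m_p
M = m_e + m_p                   # total mass, Eq. (6.36)
mu = m_e * m_p / (m_e + m_p)    # reduced mass, Eq. (6.37)
print(f"M  = {M:.6e} kg")
print(f"mu = {mu:.6e} kg = {mu / m_e:.7f} m_e")  # about 0.9994557 m_e
```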


Before solving the Schrödinger equation for the hydrogen atom, we will first deal with the two-particle rigid rotor. This is a two-particle system with the particles held at a fixed distance from each other by a rigid massless rod of length $d$. For this problem, the vector $\mathbf{r}$ in Fig. 6.1 has the constant magnitude $|\mathbf{r}|=d$. Therefore (see Section 6.3), the kinetic energy of internal motion is wholly rotational energy. The energy of the rotor is entirely kinetic, and

\(
\begin{equation}
V=0 \tag{6.44}
\end{equation}
\)

Equation (6.44) is a special case of Eq. (6.41), and we may therefore use the results of the last section to separate off the translational motion of the system as a whole. We will concern ourselves only with the rotational energy. The Hamiltonian operator for the rotation is given by the terms in brackets in (6.43) as

\(
\begin{equation}
\hat{H}=\frac{\hat{p}_{\mu}^{2}}{2 \mu}=-\frac{\hbar^{2}}{2 \mu} \nabla^{2}, \quad \mu=\frac{m_{1} m_{2}}{m_{1}+m_{2}} \tag{6.45}
\end{equation}
\)

where $m_{1}$ and $m_{2}$ are the masses of the two particles. The coordinates of the fictitious particle of mass $\mu$ are the relative coordinates of $m_{1}$ and $m_{2}$ [Eq. (6.28)].

Instead of the relative Cartesian coordinates $x, y, z$, it will prove more fruitful to use the relative spherical coordinates $r, \theta, \phi$. The $r$ coordinate is equal to the magnitude of
the $\mathbf{r}$ vector in Fig. 6.1, and since $m_{1}$ and $m_{2}$ are constrained to remain a fixed distance apart, we have $r=d$. Thus the problem is equivalent to a particle of mass $\mu$ constrained to move on the surface of a sphere of radius $d$. Because the radial coordinate is constant, the wave function will be a function of $\theta$ and $\phi$ only. Therefore the first two terms of the Laplacian operator in (6.8) will give zero when operating on the wave function and may be omitted. Looking at things in a slightly different way, we note that the operators in (6.8) that involve $r$ derivatives correspond to the kinetic energy of radial motion, and since there is no radial motion, the $r$ derivatives are omitted from $\hat{H}$.

Since $V=0$ is a special case of $V=V(r)$, the results of Section 6.1 tell us that the eigenfunctions are given by (6.16) with the $r$ factor omitted:

\(
\begin{equation}
\psi=Y_{J}^{m}(\theta, \phi) \tag{6.46}
\end{equation}
\)

where $J$ rather than $l$ is used for the rotational angular-momentum quantum number.
The Hamiltonian operator is given by Eq. (6.8) with the $r$ derivatives omitted and $V(r)=0$. Thus

\(
\hat{H}=\left(2 \mu d^{2}\right)^{-1} \hat{L}^{2}
\)

Use of (6.13) gives

\(
\begin{align}
\hat{H} \psi & =E \psi \
\left(2 \mu d^{2}\right)^{-1} \hat{L}^{2} Y_{J}^{m}(\theta, \phi) & =E Y_{J}^{m}(\theta, \phi) \
\left(2 \mu d^{2}\right)^{-1} J(J+1) \hbar^{2} Y_{J}^{m}(\theta, \phi) & =E Y_{J}^{m}(\theta, \phi) \
E=\frac{J(J+1) \hbar^{2}}{2 \mu d^{2}}, \quad J & =0,1,2, \ldots \tag{6.47}
\end{align}
\)

The moment of inertia $I$ of a system of $n$ particles about some particular axis in space is defined as

\(
\begin{equation}
I \equiv \sum_{i=1}^{n} m_{i} \rho_{i}^{2} \tag{6.48}
\end{equation}
\)

where $m_{i}$ is the mass of the $i$ th particle and $\rho_{i}$ is the perpendicular distance from this particle to the axis. The value of $I$ depends on the choice of axis. For the two-particle rigid rotor, we choose our axis to be a line that passes through the center of mass and is perpendicular to the line joining $m_{1}$ and $m_{2}$ (Fig. 6.2). If we place the rotor so that the center of mass, point $C$, lies at the origin of a Cartesian coordinate system and the line joining $m_{1}$ and $m_{2}$ lies on the $x$ axis, then $C$ will have the coordinates $(0,0,0)$, $m_{1}$ will have the coordinates $\left(-\rho_{1}, 0,0\right)$, and $m_{2}$ will have the coordinates $\left(\rho_{2}, 0,0\right)$. Using these coordinates in (6.30), we find

\(
\begin{equation}
m_{1} \rho_{1}=m_{2} \rho_{2} \tag{6.49}
\end{equation}
\)

FIGURE 6.2 Axis (dashed line) for calculating the moment of inertia of a two-particle rigid rotor. $C$ is the center of mass.

The moment of inertia of the rotor about the axis we have chosen is

\(
\begin{equation}
I=m_{1} \rho_{1}^{2}+m_{2} \rho_{2}^{2} \tag{6.50}
\end{equation}
\)

Using (6.49), we transform Eq. (6.50) to (see Prob. 6.14)

\(
\begin{equation}
I=\mu d^{2} \tag{6.51}
\end{equation}
\)

where $\mu \equiv m_{1} m_{2} /\left(m_{1}+m_{2}\right)$ is the reduced mass of the system and $d \equiv \rho_{1}+\rho_{2}$ is the distance between $m_{1}$ and $m_{2}$. The allowed energy levels (6.47) of the two-particle rigid rotor are

\(
\begin{equation}
E=\frac{J(J+1) \hbar^{2}}{2 I}, \quad J=0,1,2, \ldots \tag{6.52}
\end{equation}
\)

The lowest level is $E=0$, so there is no zero-point rotational energy. Having zero rotational energy and therefore zero angular momentum for the rotor does not violate the uncertainty principle; recall the discussion following Eq. (5.105). Note that $E$ increases as $J^{2}+J$, so the spacing between adjacent rotational levels increases as $J$ increases.

Are the rotor energy levels (6.52) degenerate? The energy depends on $J$ only, but the wave function (6.46) depends on $J$ and $m$, where $m \hbar$ is the $z$ component of the rotor's angular momentum. For each value of $J$, there are $2 J+1$ values of $m$, ranging from $-J$ to $J$. Hence the levels are $(2 J+1)$-fold degenerate. The states of a degenerate level have different orientations of the angular-momentum vector of the rotor about a space-fixed axis.
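The level pattern and degeneracies are easy to tabulate numerically. Below is a small sketch (Python with scipy.constants; the roughly ${}^{1}\mathrm{H}^{35}\mathrm{Cl}$ masses and the 1.27 Å bond length are illustrative values we have assumed, not data from the text):

```python
# Two-particle rigid-rotor levels E_J = J(J+1) hbar^2/(2I), Eq. (6.52),
# with I = mu*d^2 [Eq. (6.51)]; each level is (2J+1)-fold degenerate.
import scipy.constants as const

u = const.atomic_mass                        # 1 atomic mass unit in kg
m1, m2, d = 1.008 * u, 34.969 * u, 1.27e-10  # rough 1H35Cl values (assumed)
mu = m1 * m2 / (m1 + m2)
I = mu * d**2

for J in range(4):
    E = J * (J + 1) * const.hbar**2 / (2 * I)
    print(f"J = {J}: E = {E:.3e} J, degeneracy 2J+1 = {2 * J + 1}")
```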

The angles $\theta$ and $\phi$ in the wave function (6.46) are relative coordinates of the two point masses. If we set up a Cartesian coordinate system with the origin at the rotor's center of mass, $\theta$ and $\phi$ will be as shown in Fig. 6.3. This coordinate system undergoes the same translational motion as the rotor's center of mass but does not rotate in space.

The rotational angular momentum $\left[J(J+1) \hbar^{2}\right]^{1 / 2}$ is the angular momentum of the two particles with respect to an origin at the system's center of mass $C$.

The rotational levels of a diatomic molecule can be well approximated by the two-particle rigid-rotor energies (6.52). It is found (Levine, Molecular Spectroscopy, Section 4.4) that when a diatomic molecule absorbs or emits radiation, the allowed pure-rotational transitions are given by the selection rule

\(
\begin{equation}
\Delta J= \pm 1 \tag{6.53}
\end{equation}
\)

FIGURE 6.3 Coordinate system for the two-particle rigid rotor.

In addition, a molecule must have a nonzero dipole moment in order to show a pure-rotational spectrum. A pure-rotational transition is one where only the rotational quantum number changes. [Vibration-rotation transitions (Section 4.3) involve changes in both vibrational and rotational quantum numbers.] The spacing between adjacent low-lying rotational levels is significantly less than that between adjacent vibrational levels, and the pure-rotational spectrum falls in the microwave (or the far-infrared) region. The frequencies of the pure-rotational spectral lines of a diatomic molecule are then (approximately)

\(
\begin{gather}
\nu=\frac{E_{J+1}-E_{J}}{h}=\frac{[(J+1)(J+2)-J(J+1)] h}{8 \pi^{2} I}=2(J+1) B \tag{6.54}\
B \equiv h / 8 \pi^{2} I, \quad J=0,1,2, \ldots \tag{6.55}
\end{gather}
\)

$B$ is called the rotational constant of the molecule.
The spacings between the diatomic rotational levels (6.52) for low and moderate values of $J$ are generally less than or of the same order of magnitude as $k T$ at room temperature, so the Boltzmann distribution law (4.63) shows that many rotational levels are significantly populated at room temperature. Absorption of radiation by diatomic molecules having $J=0$ (the $J=0 \rightarrow 1$ transition) gives a line at the frequency $2 B$; absorption by molecules having $J=1$ (the $J=1 \rightarrow 2$ transition) gives a line at $4 B$; absorption by $J=2$ molecules gives a line at $6 B$; and so on. See Fig. 6.4.

Measurement of the rotational absorption frequencies allows $B$ to be found. From $B$, we get the molecule's moment of inertia $I$, and from $I$ we get the bond distance $d$. The value of $d$ found is an average over the $v=0$ vibrational motion. Because of the asymmetry of the potential-energy curve in Figs. 4.6 and 13.1, $d$ is very slightly longer than the equilibrium bond length $R_{e}$ in Fig. 13.1.

As noted in Section 4.3, isotopic species such as ${ }^{1} \mathrm{H}^{35} \mathrm{Cl}$ and ${ }^{1} \mathrm{H}^{37} \mathrm{Cl}$ have virtually the same electronic energy curve $U(R)$ and so have virtually the same equilibrium bond distance. However, the different isotopic masses produce different moments of inertia and hence different rotational absorption frequencies.

Because molecules are not rigid, the rotational energy levels for diatomic molecules differ slightly from rigid-rotor levels. From (6.52) and (6.55), the two-particle rigid-rotor levels are $E_{\mathrm{rot}}=B h J(J+1)$. Because of the anharmonicity of molecular vibration (Fig. 4.6), the average internuclear distance increases with increasing vibrational quantum number $v$, so as $v$ increases, the moment of inertia $I$ increases and the rotational constant $B$ decreases. To allow for the dependence of $B$ on $v$, one replaces $B$ in $E_{\mathrm{rot}}$ by $B_{v}$. The mean rotational constant $B_{v}$ for vibrational level $v$ is $B_{v}=B_{e}-\alpha_{e}(v+1 / 2)$, where $B_{e}$ is calculated using the equilibrium internuclear separation $R_{e}$ at the bottom of the potential-energy curve in Fig. 4.6, and the vibration-rotation coupling constant $\alpha_{e}$ is a positive constant (different for different molecules) that is much smaller than $B_{e}$. Also, as the rotational energy increases, there is a very slight increase in average internuclear distance (a phenomenon called centrifugal distortion). This adds the term $-h D J^{2}(J+1)^{2}$ to $E_{\mathrm{rot}}$, where the centrifugal-distortion constant $D$ is an extremely small positive constant, different for different molecules. For example, for ${ }^{12} \mathrm{C}^{16} \mathrm{O}$, $B_{0}=57636 \mathrm{MHz}$, $\alpha_{e}=540 \mathrm{MHz}$, and $D=0.18 \mathrm{MHz}$. As noted in Section 4.3, for lighter diatomic molecules, nearly all the molecules are in the ground $v=0$ vibrational level at room temperature, and the observed rotational constant is $B_{0}$.

FIGURE 6.4 Two-particle rigid-rotor absorption transitions.
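With the ${}^{12}\mathrm{C}^{16}\mathrm{O}$ constants just quoted, the corrected levels give line frequencies $\nu(J \rightarrow J+1)=2 B_{v}(J+1)-4 D(J+1)^{3}$; this expression follows from subtracting successive values of $E_{\mathrm{rot}}=B_{v} h J(J+1)-h D J^{2}(J+1)^{2}$. A small sketch of the arithmetic (Python; the script itself is our addition):

```python
# Pure-rotational lines of 12C16O from the constants quoted above:
# B0 = 57636 MHz, D = 0.18 MHz.  Subtracting successive corrected levels
# E_rot = B_v h J(J+1) - h D J^2 (J+1)^2 gives
# nu(J -> J+1) = 2 B_v (J+1) - 4 D (J+1)^3.
B0, D = 57636.0, 0.18  # MHz

def line_mhz(J):
    """Frequency of the v=0, J -> J+1 absorption line, in MHz."""
    return 2 * B0 * (J + 1) - 4 * D * (J + 1)**3

for J in range(3):
    print(f"J = {J} -> {J + 1}: {line_mhz(J):.1f} MHz "
          f"(rigid rotor: {2 * B0 * (J + 1):.1f} MHz)")
```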

For more discussion of nuclear motion in diatomic molecules, see Section 13.2. For the rotational energies of polyatomic molecules, see Townes and Schawlow, chaps. 2-4.

EXAMPLE

The lowest-frequency pure-rotational absorption line of ${ }^{12} \mathrm{C}^{32} \mathrm{S}$ occurs at 48991.0 MHz. Find the bond distance in ${ }^{12} \mathrm{C}^{32} \mathrm{S}$.

The lowest-frequency rotational absorption is the $J=0 \rightarrow 1$ line. Equations (1.4), (6.52), and (6.51) give

\(
h \nu=E_{\text {upper }}-E_{\text {lower }}=\frac{1(2) \hbar^{2}}{2 \mu d^{2}}-\frac{0(1) \hbar^{2}}{2 \mu d^{2}}
\)

which gives $d=\left(h / 4 \pi^{2} \nu \mu\right)^{1 / 2}$. Table A.3 in the Appendix gives

\(
\mu=\frac{m_{1} m_{2}}{m_{1}+m_{2}}=\frac{12(31.97207)}{(12+31.97207)} \frac{1}{6.02214 \times 10^{23}} \mathrm{~g}=1.44885 \times 10^{-23} \mathrm{~g}
\)

The SI unit of mass is the kilogram, and

\(
\begin{aligned}
d=\frac{1}{2 \pi}\left(\frac{h}{\nu_{0 \rightarrow 1} \mu}\right)^{1 / 2} & =\frac{1}{2 \pi}\left[\frac{6.62607 \times 10^{-34} \mathrm{~J} \mathrm{~s}}{\left(48991.0 \times 10^{6} \mathrm{~s}^{-1}\right)\left(1.44885 \times 10^{-26} \mathrm{~kg}\right)}\right]^{1 / 2} \
& =1.5377 \times 10^{-10} \mathrm{~m}=1.5377 \AA
\end{aligned}
\)

EXERCISE The $J=1$ to $J=2$ pure-rotational transition of ${ }^{12} \mathrm{C}^{16} \mathrm{O}$ occurs at 230.538 GHz. ($1 \mathrm{GHz}=10^{9} \mathrm{~Hz}$.) Find the bond distance in this molecule. (Answer: $1.1309 \times 10^{-10} \mathrm{~m}$.)
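Both the example and the exercise can be checked numerically. The sketch below (Python, scipy.constants; the isotopic masses are standard table values we have inserted) solves $h \nu=[(J+1)(J+2)-J(J+1)] \hbar^{2} / 2 \mu d^{2}$ for $d$:

```python
# Bond distances from J -> J+1 pure-rotational frequencies.
# h*nu = [(J+1)(J+2) - J(J+1)] * hbar^2 / (2 mu d^2), solved for d.
import math
import scipy.constants as const

u = const.atomic_mass  # kg per atomic mass unit

def bond_distance(nu_hz, m1_u, m2_u, J_lower):
    """Bond distance d (m) from the J_lower -> J_lower+1 line frequency."""
    mu = (m1_u * m2_u / (m1_u + m2_u)) * u
    gap = (J_lower + 1) * (J_lower + 2) - J_lower * (J_lower + 1)
    return math.sqrt(gap * const.hbar**2 / (2 * mu * const.h * nu_hz))

# 12C32S, J = 0 -> 1 at 48991.0 MHz: expect 1.5377e-10 m
print(f"CS: d = {bond_distance(48991.0e6, 12.0, 31.97207, 0):.4e} m")
# 12C16O, J = 1 -> 2 at 230.538 GHz: expect 1.1309e-10 m
print(f"CO: d = {bond_distance(230.538e9, 12.0, 15.99491, 1):.4e} m")
```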


The hydrogen atom consists of a proton and an electron. If $e$ symbolizes the charge on the proton $\left(e=+1.6 \times 10^{-19} \mathrm{C}\right)$, then the electron's charge is $-e$.

A few scientists have speculated that the proton and electron charges might not be exactly equal in magnitude. Experiments show that the magnitudes of the electron and proton charges are equal to within one part in $10^{21}$. See G. Bressi et al., Phys. Rev. A, 83, 052101 (2011) (available online at arxiv.org/abs/1102.2766).

We shall assume the electron and proton to be point masses whose interaction is given by Coulomb's law. In discussing atoms and molecules, we shall usually be considering isolated systems, ignoring interatomic and intermolecular interactions.

Instead of treating just the hydrogen atom, we consider a slightly more general problem: the hydrogenlike atom, which consists of one electron and a nucleus of charge $Z e$. For $Z=1$, we have the hydrogen atom; for $Z=2$, the $\mathrm{He}^{+}$ion; for $Z=3$, the $\mathrm{Li}^{2+}$ ion; and so on. The hydrogenlike atom is the most important system in quantum chemistry. An exact solution of the Schrödinger equation for atoms with more than one electron cannot be obtained because of the interelectronic repulsions. If, as a crude first approximation, we ignore these repulsions, then the electrons can be treated independently. (See Section 6.2.) The atomic wave function will be approximated by a product of one-electron functions, which will be hydrogenlike wave functions. A one-electron wave function (whether or not it is hydrogenlike) is called an orbital. (More precisely, an orbital is a one-electron spatial wave function, where the word spatial means that the wave function depends on the electron's three spatial coordinates $x, y$, and $z$ or $r, \theta$, and $\phi$. We shall see in Chapter 10 that the existence of electron spin adds a fourth coordinate to a one-electron wave function, giving what is called a spin-orbital.) An orbital for an electron in an atom is called an atomic orbital. We shall use atomic orbitals to construct approximate wave functions for atoms with many electrons (Chapter 11). Orbitals are also used to construct approximate wave functions for molecules.

For the hydrogenlike atom, let $(x, y, z)$ be the coordinates of the electron relative to the nucleus, and let $\mathbf{r}=\mathbf{i} x+\mathbf{j} y+\mathbf{k} z$. The Coulomb's law force on the electron in the hydrogenlike atom is [see Eq. (1.37)]

\(
\begin{equation}
\mathbf{F}=-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r^{2}} \frac{\mathbf{r}}{r} \tag{6.56}
\end{equation}
\)

where $\mathbf{r} / r$ is a unit vector in the $\mathbf{r}$ direction. The minus sign indicates an attractive force.
The possibility of small deviations from Coulomb's law has been considered. Experiments have shown that if the Coulomb's-law force is written as being proportional to $r^{-2+s}$, then $|s|<10^{-16}$. A deviation from Coulomb's law can be shown to imply a nonzero photon rest mass. No evidence exists for a nonzero photon rest mass, and data indicate that any such mass must be less than $10^{-51} \mathrm{~g}$; A. S. Goldhaber and M. M. Nieto, Rev. Mod. Phys., 82, 939 (2010) (arxiv.org/abs/0809.1003); G. Spavieri et al., Eur. Phys. J. D, 61, 531 (2011) (link.springer.com/content/pdf/10.1140/epjd/ e2011-10508-7).

The force in (6.56) is central, and comparison with Eq. (6.4) gives $d V(r) / d r=$ $Z e^{2} / 4 \pi \varepsilon_{0} r^{2}$. Integration gives

\(
\begin{equation}
V=\frac{Z e^{2}}{4 \pi \varepsilon_{0}} \int \frac{1}{r^{2}} d r=-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r} \tag{6.57}
\end{equation}
\)

where the integration constant has been taken as 0 to make $V=0$ at infinite separation between the charges. For any two charges $Q_{1}$ and $Q_{2}$ separated by distance $r_{12}$, Eq. (6.57) becomes

\(
\begin{equation}
V=\frac{Q_{1} Q_{2}}{4 \pi \varepsilon_{0} r_{12}} \tag{6.58}
\end{equation}
\)

Since the potential energy of this two-particle system depends only on the relative coordinates of the particles, we can apply the results of Section 6.3 to reduce the problem to two one-particle problems. The translational motion of the atom as a whole simply adds some constant to the total energy, and we shall not concern ourselves with it.

FIGURE 6.5 Relative spherical coordinates.

To deal with the internal motion of the system, we introduce a fictitious particle of mass

\(
\begin{equation}
\mu=\frac{m_{e} m_{N}}{m_{e}+m_{N}} \tag{6.59}
\end{equation}
\)

where $m_{e}$ and $m_{N}$ are the electronic and nuclear masses. The particle of reduced mass $\mu$ moves subject to the potential-energy function (6.57), and its coordinates $(r, \theta, \phi)$ are the spherical coordinates of one particle relative to the other (Fig. 6.5).

The Hamiltonian for the internal motion is [Eq. (6.43)]

\(
\begin{equation}
\hat{H}=-\frac{\hbar^{2}}{2 \mu} \nabla^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r} \tag{6.60}
\end{equation}
\)

Since $V$ is a function of the $r$ coordinate only, we have a one-particle central-force problem, and we may apply the results of Section 6.1. Using Eqs. (6.16) and (6.17), we have for the wave function

\(
\begin{equation}
\psi(r, \theta, \phi)=R(r) Y_{l}^{m}(\theta, \phi), \quad l=0,1,2, \ldots, \quad|m| \leq l \tag{6.61}
\end{equation}
\)

where $Y_{l}^{m}$ is a spherical harmonic, and the radial function $R(r)$ satisfies

\(
\begin{equation}
-\frac{\hbar^{2}}{2 \mu}\left(R^{\prime \prime}+\frac{2}{r} R^{\prime}\right)+\frac{l(l+1) \hbar^{2}}{2 \mu r^{2}} R-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r} R=E R(r) \tag{6.62}
\end{equation}
\)

To save time in writing, we define the constant $a$ as

\(
\begin{equation}
a \equiv \frac{4 \pi \varepsilon_{0} \hbar^{2}}{\mu e^{2}} \tag{6.63}
\end{equation}
\)

and (6.62) becomes

\(
\begin{equation}
R^{\prime \prime}+\frac{2}{r} R^{\prime}+\left[\frac{8 \pi \varepsilon_{0} E}{a e^{2}}+\frac{2 Z}{a r}-\frac{l(l+1)}{r^{2}}\right] R=0 \tag{6.64}
\end{equation}
\)

Solution of the Radial Equation

We could now try a power-series solution of (6.64), but we would get a three-term rather than a two-term recursion relation. We therefore seek a substitution that will lead to a two-term recursion relation. It turns out that the proper substitution can be found by examining the behavior of the solution for large values of $r$. For large $r$, (6.64) becomes

\(
\begin{equation}
R^{\prime \prime}+\frac{8 \pi \varepsilon_{0} E}{a e^{2}} R=0, \quad r \text { large } \tag{6.65}
\end{equation}
\)

which may be solved using the auxiliary equation (2.7). The solutions are

\(
\begin{equation}
\exp \left[ \pm\left(-8 \pi \varepsilon_{0} E / a e^{2}\right)^{1 / 2} r\right] \tag{6.66}
\end{equation}
\)

Suppose that $E$ is positive. The quantity under the square-root sign in (6.66) is negative, and the factor multiplying $r$ is imaginary:

\(
\begin{equation}
R(r) \sim e^{ \pm i \sqrt{2 \mu E} r / \hbar}, \quad E \geq 0 \tag{6.67}
\end{equation}
\)

where (6.63) was used. The symbol $\sim$ in (6.67) indicates that we are giving the behavior of $R(r)$ for large values of $r$; this is called the asymptotic behavior of the function. Note the resemblance of (6.67) to Eq. (2.30), the free-particle wave function. Equation (6.67) does not give the complete radial factor in the wave function for positive energies. Further study (Bethe and Salpeter, pages 21-24) shows that the radial function for $E \geq 0$ remains finite for all values of $r$, no matter what the value of $E$. Thus, just as for the free particle, all nonnegative energies of the hydrogen atom are allowed. Physically, these eigenfunctions correspond to states in which the electron is not bound to the nucleus; that is, the atom is ionized. (A classical-mechanical analogy is a comet moving in a hyperbolic orbit about the sun. The comet is not bound and makes but one visit to the solar system.) Since we get continuous rather than discrete allowed values for $E \geq 0$, the positive-energy eigenfunctions are called continuum eigenfunctions. The angular part of a continuum wave function is a spherical harmonic. Like the free-particle wave functions, the continuum eigenfunctions are not normalizable in the usual sense.

We now consider the bound states of the hydrogen atom, with $E<0$. (For a bound state, $\psi \rightarrow 0$ as $r \rightarrow \infty$.) In this case, the quantity in parentheses in (6.66) is positive. Since we want the wave functions to remain finite as $r$ goes to infinity, we prefer the minus sign in (6.66), and in order to get a two-term recursion relation, we make the substitution

\(
\begin{gather}
R(r)=e^{-C r} K(r) \tag{6.68}\
C \equiv\left(-\frac{8 \pi \varepsilon_{0} E}{a e^{2}}\right)^{1 / 2} \tag{6.69}
\end{gather}
\)

where $e$ in (6.68) stands for the base of natural logarithms, and not the proton charge. Use of the substitution (6.68) will guarantee nothing about the behavior of the wave function for large $r$. The differential equation we obtain from this substitution will still have two linearly independent solutions. We can make any substitution we please in a differential equation; in fact, we could make the substitution $R(r)=e^{+C r} J(r)$ and still wind up with the correct eigenfunctions and eigenvalues. The relation between $J$ and $K$ would naturally be $J(r)=e^{-2 C r} K(r)$.

Proceeding with (6.68), we evaluate $R^{\prime}$ and $R^{\prime \prime}$, substitute into (6.64), multiply by $r^{2} e^{C r}$, and use (6.69) to get the following differential equation for $K(r)$ :

\(
\begin{equation}
r^{2} K^{\prime \prime}+\left(2 r-2 C r^{2}\right) K^{\prime}+\left[\left(2 Z a^{-1}-2 C\right) r-l(l+1)\right] K=0 \tag{6.70}
\end{equation}
\)

We could now substitute a power series of the form

\(
\begin{equation}
K=\sum_{k=0}^{\infty} c_{k} r^{k} \tag{6.71}
\end{equation}
\)

for $K$. If we did, we would find that, in general, the first few coefficients in (6.71) are zero. If $c_{s}$ is the first nonzero coefficient, (6.71) can be written as

\(
\begin{equation}
K=\sum_{k=s}^{\infty} c_{k} r^{k}, \quad c_{s} \neq 0 \tag{6.72}
\end{equation}
\)

Letting $j \equiv k-s$, and then defining $b_{j}$ as $b_{j} \equiv c_{j+s}$, we have

\(
\begin{equation}
K=\sum_{j=0}^{\infty} c_{j+s} r^{j+s}=r^{s} \sum_{j=0}^{\infty} b_{j} r^{j}, \quad b_{0} \neq 0 \tag{6.73}
\end{equation}
\)

(Although the various substitutions we are making might seem arbitrary, they are standard procedure in solving differential equations by power series.) The integer $s$ is evaluated by substitution into the differential equation. Equation (6.73) is

\(
\begin{gather}
K(r)=r^{s} M(r) \tag{6.74}\
M(r)=\sum_{j=0}^{\infty} b_{j} r^{j}, \quad b_{0} \neq 0 \tag{6.75}
\end{gather}
\)

Evaluating $K^{\prime}$ and $K^{\prime \prime}$ from (6.74) and substituting into (6.70), we get

\(
\begin{equation}
r^{2} M^{\prime \prime}+\left[(2 s+2) r-2 C r^{2}\right] M^{\prime}+\left[s^{2}+s+\left(2 Z a^{-1}-2 C-2 C s\right) r-l(l+1)\right] M=0 \tag{6.76}
\end{equation}
\)

To find $s$, we look at (6.76) for $r=0$. From (6.75), we have

\(
\begin{equation}
M(0)=b_{0}, \quad M^{\prime}(0)=b_{1}, \quad M^{\prime \prime}(0)=2 b_{2} \tag{6.77}
\end{equation}
\)

Using (6.77) in (6.76), we find for $r=0$

\(
\begin{equation}
b_{0}\left(s^{2}+s-l^{2}-l\right)=0 \tag{6.78}
\end{equation}
\)

Since $b_{0}$ is not zero, the terms in parentheses must vanish: $s^{2}+s-l^{2}-l=0$. This is a quadratic equation in the unknown $s$, with the roots

\(
\begin{equation}
s=l, \quad s=-l-1 \tag{6.79}
\end{equation}
\)

These roots correspond to the two linearly independent solutions of the differential equation. Let us examine them from the standpoint of proper behavior of the wave function. From Eqs. (6.68), (6.74), and (6.75), we have

\(
\begin{equation}
R(r)=e^{-C r} r^{s} \sum_{j=0}^{\infty} b_{j} r^{j} \tag{6.80}
\end{equation}
\)

Since $e^{-C r}=1-C r+\ldots$, the function $R(r)$ behaves for small $r$ as $b_{0} r^{s}$. For the root $s=l, R(r)$ behaves properly at the origin. However, for $s=-l-1, R(r)$ is proportional to

\(
\begin{equation}
\frac{1}{r^{l+1}} \tag{6.81}
\end{equation}
\)

for small $r$. Since $l=0,1,2, \ldots$, the root $s=-l-1$ makes the radial factor in the wave function infinite at the origin. Many texts take this as sufficient reason for rejecting this root. However, this is not a good argument, since for the relativistic hydrogen atom, the $l=0$ eigenfunctions are infinite at $r=0$. Let us therefore look at (6.81) from the standpoint of quadratic integrability, since we certainly require the bound-state eigenfunctions to be normalizable.

The normalization integral [Eq. (5.80)] for the radial functions that behave like (6.81) looks like

\(
\begin{equation}
\int_{0}|R|^{2} r^{2} d r \approx \int_{0} \frac{1}{r^{2 l}} d r \tag{6.82}
\end{equation}
\)

for small $r$. The behavior of the integral at the lower limit of integration is

\(
\begin{equation}
\left.\frac{1}{r^{2 l-1}}\right|_{r=0} \tag{6.83}
\end{equation}
\)

For $l=1,2,3, \ldots$, (6.83) is infinite, and the normalization integral is infinite. Hence we must reject the root $s=-l-1$ for $l \geq 1$. However, for $l=0$, (6.83) is finite, and there is no trouble with quadratic integrability. Thus there is a quadratically integrable solution to the radial equation that behaves as $r^{-1}$ for small $r$.

Further study of this solution shows that it corresponds to an energy value that the experimental hydrogen-atom spectrum shows does not exist. Thus the $r^{-1}$ solution must be rejected, but there is some dispute over the reason for doing so. One view is that the $1 / r$ solution satisfies the Schrödinger equation everywhere in space except at the origin and hence must be rejected [Dirac, page 156; B. H. Armstrong and E. A. Power, Am. J. Phys., 31, 262 (1963)]. A second view is that the $1 / r$ solution must be rejected because the Hamiltonian operator is not Hermitian with respect to it (Merzbacher, Section 10.5). (In Chapter 7 we shall define Hermitian operators and show that quantum-mechanical operators are required to be Hermitian.) Further discussion is given in A. A. Khelashvili and T. P. Nadareishvili, Am. J. Phys., 79, 668 (2011) (see arxiv.org/abs/1102.1185) and in Y. C. Cantelaube, arxiv.org/abs/1203.0551.

Taking the first root in (6.79), we have for the radial factor (6.80)

\(
\begin{equation}
R(r)=e^{-C r} r^{l} M(r) \tag{6.84}
\end{equation}
\)

With $s=l$, Eq. (6.76) becomes

\(
\begin{equation}
r M^{\prime \prime}+(2 l+2-2 C r) M^{\prime}+\left(2 Z a^{-1}-2 C-2 C l\right) M=0 \tag{6.85}
\end{equation}
\)

From (6.75), we have

\(
\begin{gather}
M(r)=\sum_{j=0}^{\infty} b_{j} r^{j} \tag{6.86}\
M^{\prime}=\sum_{j=0}^{\infty} j b_{j} r^{j-1}=\sum_{j=1}^{\infty} j b_{j} r^{j-1}=\sum_{k=0}^{\infty}(k+1) b_{k+1} r^{k}=\sum_{j=0}^{\infty}(j+1) b_{j+1} r^{j} \
M^{\prime \prime}=\sum_{j=0}^{\infty} j(j-1) b_{j} r^{j-2}=\sum_{j=1}^{\infty} j(j-1) b_{j} r^{j-2}=\sum_{k=0}^{\infty}(k+1) k b_{k+1} r^{k-1} \
M^{\prime \prime}=\sum_{j=0}^{\infty}(j+1) j b_{j+1} r^{j-1} \tag{6.87}
\end{gather}
\)

Substituting these expressions in (6.85) and combining sums, we get

\(
\sum_{j=0}^{\infty}\left[j(j+1) b_{j+1}+2(l+1)(j+1) b_{j+1}+\left(\frac{2 Z}{a}-2 C-2 C l-2 C j\right) b_{j}\right] r^{j}=0
\)

Setting the coefficient of $r^{j}$ equal to zero, we get the recursion relation

\(
\begin{equation}
b_{j+1}=\frac{2 C+2 C l+2 C j-2 Z a^{-1}}{j(j+1)+2(l+1)(j+1)} b_{j} \tag{6.88}
\end{equation}
\)

We now must examine the behavior of the infinite series (6.86) for large $r$. The result of the same procedure used to examine the harmonic-oscillator power series in (4.42) suggests that for large $r$ the infinite series (6.86) behaves like $e^{2 C r}$. (See Prob. 6.20.) For large $r$, the radial function (6.84) behaves like

\(
\begin{equation}
R(r) \sim e^{-C r} r^{l} e^{2 C r}=r^{l} e^{C r} \tag{6.89}
\end{equation}
\)

Therefore, $R(r)$ will become infinite as $r$ goes to infinity and will not be quadratically integrable. The only way to avoid this "infinity catastrophe" (as in the harmonic-oscillator case) is to have the series terminate after a finite number of terms, in which case the $e^{-C r}$ factor will ensure that the wave function goes to zero as $r$ goes to infinity. Let the last term in the series be $b_{k} r^{k}$. Then, to have $b_{k+1}, b_{k+2}, \ldots$ all vanish, the fraction multiplying $b_{j}$ in the recursion relation (6.88) must vanish when $j=k$. We have

\(
\begin{equation}
2 C(k+l+1)=2 Z a^{-1}, \quad k=0,1,2, \ldots \tag{6.90}
\end{equation}
\)

$k$ and $l$ are integers, and we now define a new integer $n$ by

\(
\begin{equation}
n \equiv k+l+1, \quad n=1,2,3, \ldots \tag{6.91}
\end{equation}
\)

From (6.91) the quantum number $l$ must satisfy

\(
\begin{equation}
l \leq n-1 \tag{6.92}
\end{equation}
\)

Hence $l$ ranges from 0 to $n-1$.

Energy Levels

Use of (6.91) in (6.90) gives

\(
\begin{equation}
C n=Z a^{-1} \tag{6.93}
\end{equation}
\)

Substituting $C \equiv\left(-8 \pi \varepsilon_{0} E / a e^{2}\right)^{1 / 2}$ [Eq. (6.69)] into (6.93) and solving for $E$, we get

\(
\begin{equation}
E=-\frac{Z^{2}}{n^{2}} \frac{e^{2}}{8 \pi \varepsilon_{0} a}=-\frac{Z^{2} \mu e^{4}}{8 \varepsilon_{0}^{2} n^{2} h^{2}} \tag{6.94}
\end{equation}
\)

where $a \equiv 4 \pi \varepsilon_{0} \hbar^{2} / \mu e^{2}$ [Eq. (6.63)]. These are the bound-state energy levels of the hydrogenlike atom, and they are discrete. Figure 6.6 shows the potential-energy curve [Eq. (6.57)] and some of the allowed energy levels for the hydrogen atom $(Z=1)$. The crosshatching indicates that all positive energies are allowed.

It turns out that all changes in $n$ are allowed in light absorption and emission. The wavenumbers [Eq. (4.64)] of H-atom spectral lines are then

\(
\begin{equation}
\widetilde{\nu} \equiv \frac{1}{\lambda}=\frac{\nu}{c}=\frac{E_{2}-E_{1}}{h c}=\frac{e^{2}}{8 \pi \varepsilon_{0} a h c}\left(\frac{1}{n_{1}^{2}}-\frac{1}{n_{2}^{2}}\right) \equiv R_{\mathrm{H}}\left(\frac{1}{n_{1}^{2}}-\frac{1}{n_{2}^{2}}\right) \tag{6.95}
\end{equation}
\)

where $R_{\mathrm{H}}=109677.6 \mathrm{~cm}^{-1}$ is the Rydberg constant for hydrogen.
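For instance, Eq. (6.95) gives the first Lyman and Balmer lines directly; here is a short check (Python; the script is our own illustration, using only the $R_{\mathrm{H}}$ value quoted above):

```python
# Wavenumbers of H-atom lines from Eq. (6.95), nu~ = R_H (1/n1^2 - 1/n2^2),
# with the Rydberg constant R_H = 109677.6 cm^-1 quoted above.
R_H = 109677.6  # cm^-1

def wavenumber(n1, n2):
    """Wavenumber (cm^-1) of the n2 -> n1 transition (n2 > n1)."""
    return R_H * (1.0 / n1**2 - 1.0 / n2**2)

for n1, n2 in [(1, 2), (2, 3)]:  # first Lyman and first Balmer lines
    nu = wavenumber(n1, n2)
    print(f"n = {n2} -> {n1}: {nu:8.1f} cm^-1, "
          f"lambda = {1e8 / nu:7.1f} Angstrom")
```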

Degeneracy

Are the hydrogen-atom energy levels degenerate? For the bound states, the energy (6.94) depends only on $n$. However, the wave function (6.61) depends on all three quantum numbers $n, l$, and $m$, whose allowed values are [Eqs. (6.91), (6.92), (5.104), and (5.105)]

FIGURE 6.6 Energy levels of the hydrogen atom.

\(
\begin{gather}
n=1,2,3, \ldots \tag{6.96}\
l=0,1,2, \ldots, n-1 \tag{6.97}\
m=-l,-l+1, \ldots, 0, \ldots, l-1, l \tag{6.98}
\end{gather}
\)

Hydrogen-atom states with different values of $l$ or $m$, but the same value of $n$, have the same energy. The energy levels are degenerate, except for $n=1$, where $l$ and $m$ must both be 0. For a given value of $n$, we can have $n$ different values of $l$. For each of these values of $l$, we can have $2 l+1$ values of $m$. The degree of degeneracy of an H-atom bound-state level is found to equal $n^{2}$ (spin considerations being omitted); see Prob. 6.16. For the continuum levels, it turns out that for a given energy there is no restriction on the maximum value of $l$; hence these levels are infinity-fold degenerate.
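The $n^{2}$ count follows from summing $2 l+1$ over $l=0, \ldots, n-1$; a trivial enumeration (Python, our own check) confirms it:

```python
# Degeneracy check: for each n, l runs from 0 to n-1 and each l
# contributes 2l+1 values of m, so the total is n^2.
for n in range(1, 6):
    g = sum(2 * l + 1 for l in range(n))
    print(f"n = {n}: sum of (2l+1) over l = {g}, n^2 = {n**2}")
```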

The radial equation for the hydrogen atom can also be solved by the use of ladder operators (also known as factorization); see Z. W. Salsburg, Am. J. Phys., 33, 36 (1965).


The Radial Factor

Using (6.93), we have for the recursion relation (6.88)

\(
\begin{equation}
b_{j+1}=\frac{2 Z}{n a} \frac{j+l+1-n}{(j+1)(j+2 l+2)} b_{j} \tag{6.99}
\end{equation}
\)

The discussion preceding Eq. (6.91) shows that the highest power of $r$ in the polynomial $M(r)=\sum_{j} b_{j} r^{j}$ [Eq. (6.86)] is $k=n-l-1$. Hence use of $C=Z / n a$ [Eq. (6.93)] in $R(r)=e^{-C r} r^{l} M(r)$ [Eq. (6.84)] gives the radial factor in the hydrogen-atom $\psi$ as

\(
\begin{equation}
R_{n l}(r)=r^{l} e^{-Z r / n a} \sum_{j=0}^{n-l-1} b_{j} r^{j} \tag{6.100}
\end{equation}
\)

where $a \equiv 4 \pi \varepsilon_{0} \hbar^{2} / \mu e^{2}$ [Eq. (6.63)]. The complete hydrogenlike bound-state wave functions are [Eq. (6.61)]

\(
\begin{equation}
\psi_{n l m}=R_{n l}(r) Y_{l}^{m}(\theta, \phi)=R_{n l}(r) S_{l m}(\theta) \frac{1}{\sqrt{2 \pi}} e^{i m \phi} \tag{6.101}
\end{equation}
\)

where the first few theta functions are given in Table 5.1.
How many nodes does $R(r)$ have? The radial function is zero at $r=\infty$, at $r=0$ for $l \neq 0$, and at values of $r$ that make $M(r)$ vanish. $M(r)$ is a polynomial of degree $n-l-1$, and it can be shown that the roots of $M(r)=0$ are all real and positive. Thus, aside from the origin and infinity, there are $n-l-1$ nodes in $R(r)$. The nodes of the spherical harmonics are discussed in Prob. 6.41.

Ground-State Wave Function and Energy

For the ground state of the hydrogenlike atom, we have $n=1, l=0$, and $m=0$. The radial factor (6.100) is

\(
\begin{equation}
R_{10}(r)=b_{0} e^{-Z r / a} \tag{6.102}
\end{equation}
\)

The constant $b_{0}$ is determined by normalization [Eq. (5.80)]:

\(
\left|b_{0}\right|^{2} \int_{0}^{\infty} e^{-2 Z r / a} r^{2} d r=1
\)

Using the Appendix integral (A.8), we find

\(
\begin{equation}
R_{10}(r)=2\left(\frac{Z}{a}\right)^{3 / 2} e^{-Z r / a} \tag{6.103}
\end{equation}
\)

Multiplying by $Y_{0}^{0}=1 /(4 \pi)^{1 / 2}$, we have as the ground-state wave function

\(
\begin{equation}
\psi_{100}=\frac{1}{\pi^{1 / 2}}\left(\frac{Z}{a}\right)^{3 / 2} e^{-Z r / a} \tag{6.104}
\end{equation}
\)

The hydrogen-atom energies and wave functions involve the reduced mass, given by (6.59) as

\(
\begin{equation}
\mu_{\mathrm{H}}=\frac{m_{e} m_{p}}{m_{e}+m_{p}}=\frac{m_{e}}{1+m_{e} / m_{p}}=\frac{m_{e}}{1+0.000544617}=0.9994557 m_{e} \tag{6.105}
\end{equation}
\)

where $m_{p}$ is the proton mass and $m_{e} / m_{p}$ was found from Table A.1. The reduced mass is very close to the electron mass. Because of this, some texts use the electron mass instead of the reduced mass in the H-atom Schrödinger equation. This corresponds to assuming that the proton mass is infinite compared with the electron mass in (6.105) and that all the internal motion is motion of the electron. The error introduced by using the electron mass for the reduced mass is about 1 part in 2000 for the hydrogen atom. For heavier atoms, the error introduced by assuming an infinitely heavy nucleus is even less than this. Also, for many-electron atoms, the form of the correction for nuclear motion is quite complicated. For these reasons we shall assume in the future an infinitely heavy nucleus and simply use the electron mass in writing the Schrödinger equation for atoms.

If we replace the reduced mass of the hydrogen atom by the electron mass, the quantity $a$ defined by (6.63) becomes

\(
\begin{equation}
a_{0}=\frac{4 \pi \varepsilon_{0} \hbar^{2}}{m_{e} e^{2}}=0.529177 \AA \tag{6.106}
\end{equation}
\)

where the subscript zero indicates use of the electron mass instead of the reduced mass. $a_{0}$ is called the Bohr radius, since it was the radius of the circle in which the electron moved in the ground state of the hydrogen atom in the Bohr theory. Of course, since the ground-state wave function (6.104) is nonzero for all finite values of $r$, there is some probability of finding the electron at any distance from the nucleus. The electron is certainly not confined to a circle.

A convenient unit for electronic energies is the electronvolt (eV), defined as the kinetic energy acquired by an electron accelerated through a potential difference of 1 volt (V). Potential difference is defined as energy per unit charge. Since $e=1.6021766 \times 10^{-19} \mathrm{C}$ and $1 \mathrm{VC}=1 \mathrm{~J}$, we have

\(
\begin{equation}
1 \mathrm{eV}=1.6021766 \times 10^{-19} \mathrm{~J} \tag{6.107}
\end{equation}
\)

EXAMPLE

Calculate the ground-state energy of the hydrogen atom using SI units and convert the result to electronvolts.

The H atom ground-state energy is given by (6.94) with $n=1$ and $Z=1$ as $E=-\mu e^{4} / 8 h^{2} \varepsilon_{0}^{2}$. Use of (6.105) for $\mu$ gives

\(
\begin{aligned}
E & =-\frac{0.9994557\left(9.109383 \times 10^{-31} \mathrm{~kg}\right)\left(1.6021766 \times 10^{-19} \mathrm{C}\right)^{4}}{8\left(6.626070 \times 10^{-34} \mathrm{~J} \mathrm{~s}\right)^{2}\left(8.8541878 \times 10^{-12} \mathrm{C}^{2} \mathrm{~N}^{-1} \mathrm{~m}^{-2}\right)^{2}} \frac{Z^{2}}{n^{2}} \
E & =-\left(2.178686 \times 10^{-18} \mathrm{~J}\right)\left(Z^{2} / n^{2}\right)\left[(1 \mathrm{eV}) /\left(1.6021766 \times 10^{-19} \mathrm{~J}\right)\right]
\end{aligned}
\)

\(
\begin{equation}
E=-(13.598 \mathrm{eV})\left(Z^{2} / n^{2}\right)=-13.598 \mathrm{eV} \tag{6.108}
\end{equation}
\)

a number worth remembering. The minimum energy needed to ionize a ground-state hydrogen atom is 13.598 eV.
EXERCISE Find the $n=2$ energy of $\mathrm{Li}^{2+}$ in eV; do the minimum amount of calculation needed. (Answer: $-30.60 \mathrm{eV}$.)
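The arithmetic is easy to reproduce (a Python sketch using scipy.constants; using the hydrogen reduced mass even for $\mathrm{Li}^{2+}$ is a simplification of ours, consistent with the exercise's $Z^{2} / n^{2}$ scaling):

```python
# Reproduce Eq. (6.108): E = -(Z^2/n^2) * mu e^4 / (8 eps0^2 h^2), in eV.
import scipy.constants as const

mu = 0.9994557 * const.m_e  # hydrogen reduced mass, Eq. (6.105)

def energy_ev(Z, n):
    """Hydrogenlike bound-state energy, Eq. (6.94), converted to eV."""
    E = -(Z**2 / n**2) * mu * const.e**4 / (8 * const.epsilon_0**2
                                            * const.h**2)
    return E / const.e  # joules -> eV

print(f"H,    n=1: {energy_ev(1, 1):8.3f} eV")  # -13.598 eV
print(f"Li2+, n=2: {energy_ev(3, 2):8.2f} eV")  # -30.60 eV
```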

EXAMPLE

Find $\langle T\rangle$ for the hydrogen-atom ground state.
Equations (3.89) for $\langle T\rangle$ and (6.7) for $\nabla^{2} \psi$ give

\(
\begin{gathered}
\langle T\rangle=\int \psi^{*} \hat{T} \psi d \tau=-\frac{\hbar^{2}}{2 \mu} \int \psi^{*} \nabla^{2} \psi d \tau \
\nabla^{2} \psi=\frac{\partial^{2} \psi}{\partial r^{2}}+\frac{2}{r} \frac{\partial \psi}{\partial r}-\frac{1}{r^{2} \hbar^{2}} \hat{L}^{2} \psi=\frac{\partial^{2} \psi}{\partial r^{2}}+\frac{2}{r} \frac{\partial \psi}{\partial r}
\end{gathered}
\)

since $\hat{L}^{2} \psi=l(l+1) \hbar^{2} \psi$ and $l=0$ for an $s$ state. From (6.104) with $Z=1$, we have $\psi=\pi^{-1 / 2} a^{-3 / 2} e^{-r / a}$, so $\partial \psi / \partial r=-\pi^{-1 / 2} a^{-5 / 2} e^{-r / a}$ and $\partial^{2} \psi / \partial r^{2}=\pi^{-1 / 2} a^{-7 / 2} e^{-r / a}$.
Using $d \tau=r^{2} \sin \theta d r d \theta d \phi$ [Eq. (5.78)], we have

\(
\begin{gathered}
\langle T\rangle=-\frac{\hbar^{2}}{2 \mu} \frac{1}{\pi a^{4}} \int_{0}^{2 \pi} \int_{0}^{\pi} \int_{0}^{\infty}\left(\frac{1}{a} e^{-2 r / a}-\frac{2}{r} e^{-2 r / a}\right) r^{2} \sin \theta d r d \theta d \phi \
=-\frac{\hbar^{2}}{2 \mu \pi a^{4}} \int_{0}^{2 \pi} d \phi \int_{0}^{\pi} \sin \theta d \theta \int_{0}^{\infty}\left(\frac{r^{2}}{a} e^{-2 r / a}-2 r e^{-2 r / a}\right) d r=\frac{\hbar^{2}}{2 \mu a^{2}}=\frac{e^{2}}{8 \pi \varepsilon_{0} a}
\end{gathered}
\)

where Appendix integral A.8 and $a=4 \pi \varepsilon_{0} \hbar^{2} / \mu e^{2}$ were used. From (6.94), $e^{2} / 8 \pi \varepsilon_{0} a$ is minus the ground-state H-atom energy, and (6.108) gives $\langle T\rangle=13.598 \mathrm{eV}$. (See also Sec. 14.4.)

EXERCISE Find $\langle T\rangle$ for the hydrogen-atom $2 p_{0}$ state using (6.113).
(Answer: $e^{2} / 32 \pi \varepsilon_{0} a=(13.598 \mathrm{eV}) / 4=3.40 \mathrm{eV}$.)
Let us examine a significant property of the ground-state wave function (6.104). We have $r=\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}$. For points on the $x$ axis, where $y=0$ and $z=0$, we have $r=\left(x^{2}\right)^{1 / 2}=|x|$, and

\(
\begin{equation}
\psi_{100}(x, 0,0)=\pi^{-1 / 2}(Z / a)^{3 / 2} e^{-Z|x| / a} \tag{6.109}
\end{equation}
\)

Figure 6.7 shows how (6.109) varies along the $x$ axis. Although $\psi_{100}$ is continuous at the origin, the slope of the tangent to the curve is positive at the left of the origin but negative at its right. Thus $\partial \psi / \partial x$ is discontinuous at the origin. We say that the wave function has a cusp at the origin. The cusp is present because the potential energy $V=-Z e^{2} / 4 \pi \varepsilon_{0} r$ becomes infinite at the origin. Recall the discontinuous slope of the particle-in-a-box wave functions at the walls of the box.

FIGURE 6.7 Cusp in the hydrogen-atom ground-state wave function.

We denoted the hydrogen-atom bound-state wave functions by three subscripts that give the values of $n, l$, and $m$. In an alternative notation, the value of $l$ is indicated by a letter:

\(
\begin{array}{cccccccccc}
\text { Letter } & s & p & d & f & g & h & i & k & \ldots \
l & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & \ldots
\end{array}
\)

The letters $s, p, d, f$ are of spectroscopic origin, standing for sharp, principal, diffuse, and fundamental. After these we go alphabetically, except that $j$ is omitted. Preceding the code letter for $l$, we write the value of $n$. Thus the ground-state wave function $\psi_{100}$ is called $\psi_{1 s}$ or, more simply, $1 s$.

Wave Functions for $\boldsymbol{n}=\mathbf{2}$

For $n=2$, we have the states $\psi_{200}, \psi_{21-1}, \psi_{210}$, and $\psi_{211}$. We denote $\psi_{200}$ as $\psi_{2 s}$ or simply as $2 s$. To distinguish the three $2 p$ functions, we use a subscript giving the $m$ value and denote them as $2 p_{1}, 2 p_{0}$, and $2 p_{-1}$. The radial factor in the wave function depends on $n$ and $l$, but not on $m$, as can be seen from (6.100). Each of the three $2 p$ wave functions thus has the same radial factor. The $2 s$ and $2 p$ radial factors may be found in the usual way from (6.100) and (6.99), followed by normalization. The results are given in Table 6.1. Note that the exponential factor in the $n=2$ radial functions is not the same as in the $R_{1 s}$ function. The complete wave function is found by multiplying the radial factor by the appropriate spherical harmonic. Using (6.101), Table 6.1, and Table 5.1, we have

\(
\begin{align}
2 s & =\frac{1}{\pi^{1 / 2}}\left(\frac{Z}{2 a}\right)^{3 / 2}\left(1-\frac{Z r}{2 a}\right) e^{-Z r / 2 a} \tag{6.111}\
2 p_{-1} & =\frac{1}{8 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \sin \theta e^{-i \phi} \tag{6.112}
\end{align}
\)

TABLE 6.1 Radial Factors in the Hydrogenlike-Atom Wave Functions

\(
\begin{aligned}
R_{1 s} & =2\left(\frac{Z}{a}\right)^{3 / 2} e^{-Z r / a} \
R_{2 s} & =\frac{1}{\sqrt{2}}\left(\frac{Z}{a}\right)^{3 / 2}\left(1-\frac{Z r}{2 a}\right) e^{-Z r / 2 a} \
R_{2 p} & =\frac{1}{2 \sqrt{6}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \
R_{3 s} & =\frac{2}{3 \sqrt{3}}\left(\frac{Z}{a}\right)^{3 / 2}\left(1-\frac{2 Z r}{3 a}+\frac{2 Z^{2} r^{2}}{27 a^{2}}\right) e^{-Z r / 3 a} \
R_{3 p} & =\frac{8}{27 \sqrt{6}}\left(\frac{Z}{a}\right)^{3 / 2}\left(\frac{Z r}{a}-\frac{Z^{2} r^{2}}{6 a^{2}}\right) e^{-Z r / 3 a} \
R_{3 d} & =\frac{4}{81 \sqrt{30}}\left(\frac{Z}{a}\right)^{7 / 2} r^{2} e^{-Z r / 3 a}
\end{aligned}
\)

\(
\begin{align}
& 2 p_{0}=\frac{1}{\pi^{1 / 2}}\left(\frac{Z}{2 a}\right)^{5 / 2} r e^{-Z r / 2 a} \cos \theta \tag{6.113}\
& 2 p_{1}=\frac{1}{8 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \sin \theta e^{i \phi} \tag{6.114}
\end{align}
\)

Table 6.1 lists some of the normalized radial factors in the hydrogenlike wave functions. Figure 6.8 graphs some of the radial functions. The $r^{l}$ factor makes the radial functions zero at $r=0$, except for $s$ states.

The Radial Distribution Function

The probability of finding the electron in the region of space where its coordinates lie in the ranges $r$ to $r+d r, \theta$ to $\theta+d \theta$, and $\phi$ to $\phi+d \phi$ is [Eq. (5.78)]

\(
\begin{equation}
|\psi|^{2} d \tau=\left[R_{n l}(r)\right]^{2}\left|Y_{l}^{m}(\theta, \phi)\right|^{2} r^{2} \sin \theta d r d \theta d \phi \tag{6.115}
\end{equation}
\)

FIGURE 6.8 Graphs of the radial factor $R_{n l}(r)$ in the hydrogen-atom ($Z=1$) wave functions. The same scale is used in all graphs. (In some texts, these functions are not properly drawn to scale.)

We now ask: What is the probability of the electron having its radial coordinate between $r$ and $r+d r$ with no restriction on the values of $\theta$ and $\phi$? We are asking for the probability of finding the electron in a thin spherical shell centered at the origin, of inner radius $r$ and outer radius $r+d r$. We must thus add up the infinitesimal probabilities (6.115) for all possible values of $\theta$ and $\phi$, keeping $r$ fixed. This amounts to integrating (6.115) over $\theta$ and $\phi$. Hence the probability of finding the electron between $r$ and $r+d r$ is

\(
\begin{equation}
\left[R_{n l}(r)\right]^{2} r^{2} d r \int_{0}^{2 \pi} \int_{0}^{\pi}\left|Y_{l}^{m}(\theta, \phi)\right|^{2} \sin \theta d \theta d \phi=\left[R_{n l}(r)\right]^{2} r^{2} d r \tag{6.116}
\end{equation}
\)

since the spherical harmonics are normalized:

\(
\begin{equation}
\int_{0}^{2 \pi} \int_{0}^{\pi}\left|Y_{l}^{m}(\theta, \phi)\right|^{2} \sin \theta d \theta d \phi=1 \tag{6.117}
\end{equation}
\)

as can be seen from (5.72) and (5.80). The function $R^{2}(r) r^{2}$, which determines the probability of finding the electron at a distance $r$ from the nucleus, is called the radial distribution function; see Fig. 6.9.

For the $1 s$ ground state of H , the probability density $|\psi|^{2}$ is from Eq. (6.104) equal to $e^{-2 r / a}$ times a constant, and so $\left|\psi{1 s}\right|^{2}$ is a maximum at $r=0$ (see Fig. 6.14). However, the radial distribution function $\left[R{1 s}(r)\right]^{2} r^{2}$ is zero at the origin and is a maximum at $r=a$ (Fig. 6.9). These two facts are not contradictory. The probability density $|\psi|^{2}$ is proportional to the probability of finding the electron in an infinitesimal box of volume $d x d y d z$, and this probability is a maximum at the nucleus. The radial distribution function is proportional to the probability of finding the electron in a thin spherical shell of inner and outer radii $r$ and $r+d r$, and this probability is a maximum at $r=a$. Since $\psi{1 s}$ depends only on $r$, the $1 s$ probability density is essentially constant in the thin spherical shell. If we imagine the thin shell divided up into a huge number of infinitesimal boxes each of volume $d x d y d z$, we can sum up the probabilities $\left|\psi{1 s}\right|^{2} d x d y d z$ of being in

FIGURE 6.9 Plots of the radial distribution function $\left[R_{n l}(r)\right]^{2} r^{2}$ for the hydrogen atom.

each tiny box in the thin shell to get the probability of finding the electron in the thin shell as being $\left|\psi_{1 s}\right|^{2} V_{\text {shell }}$. The volume $V_{\text {shell }}$ of the thin shell is

\(
\frac{4}{3} \pi(r+d r)^{3}-\frac{4}{3} \pi r^{3}=4 \pi r^{2} d r
\)

where terms in $(d r)^{2}$ and $(d r)^{3}$ are negligible compared with the $d r$ term. Therefore the probability of being in the thin shell is

\(
\left|\psi_{1 s}\right|^{2} V_{\text {shell }}=R_{1 s}^{2}\left(Y_{0}^{0}\right)^{2} 4 \pi r^{2} d r=R_{1 s}^{2}\left[(4 \pi)^{-1 / 2}\right]^{2} 4 \pi r^{2} d r=R_{1 s}^{2} r^{2} d r
\)

in agreement with (6.116). The $1 s$ radial distribution function is zero at $r=0$ because the volume $4 \pi r^{2} d r$ of the thin spherical shell becomes zero as $r$ goes to zero. As $r$ increases from zero, the probability density $\left|\psi_{1 s}\right|^{2}$ decreases and the volume $4 \pi r^{2} d r$ of the thin shell increases. Their product $\left|\psi_{1 s}\right|^{2} 4 \pi r^{2} d r$ is a maximum at $r=a$.
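The location of this maximum follows from one line of calculus: dropping constants and setting the derivative of $r^{2} e^{-2 r / a}$ equal to zero,

\(
\frac{d}{d r}\left(r^{2} e^{-2 r / a}\right)=\left(2 r-\frac{2 r^{2}}{a}\right) e^{-2 r / a}=0
\)

which gives $r=a$, since $r=0$ is a minimum.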

EXAMPLE

Find the probability that the electron in the ground-state H atom is less than a distance $a$ from the nucleus.

We want the probability that the radial coordinate lies between 0 and $a$. This is found by taking the infinitesimal probability (6.116) of being between $r$ and $r+d r$ and summing it over the range from 0 to $a$. This sum of infinitesimal quantities is the definite integral

\(
\begin{aligned}
\int_{0}^{a} R_{n l}^{2} r^{2} d r & =\frac{4}{a^{3}} \int_{0}^{a} e^{-2 r / a} r^{2} d r=\left.\frac{4}{a^{3}} e^{-2 r / a}\left(-\frac{r^{2} a}{2}-\frac{2 r a^{2}}{4}-\frac{2 a^{3}}{8}\right)\right|_{0} ^{a} \\
& =4\left[e^{-2}(-5 / 4)-(-1 / 4)\right]=0.323
\end{aligned}
\)

where $R_{10}$ was taken from Table 6.1 and the Appendix integral A.7 was used.
EXERCISE Find the probability that the electron in a $2 p_{1} \mathrm{H}$ atom is less than a distance $a$ from the nucleus. Use a table of integrals or the website integrals.wolfram.com.
(Answer: 0.00366.)
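Both numbers are easy to confirm by direct numerical integration of (6.116). The sketch below (Python with scipy; a minimal check, taking $Z=1$, measuring lengths in units of $a$, and copying the normalized radial factors $R_{10}$ and $R_{21}$ from Table 6.1) reproduces 0.323 and 0.00366:

```python
import numpy as np
from scipy.integrate import quad

a = 1.0  # measure lengths in units of the Bohr radius; Z = 1

def R_1s(r):
    # normalized radial factor R_10 for Z = 1
    return 2.0 * a**-1.5 * np.exp(-r / a)

def R_2p(r):
    # normalized radial factor R_21 for Z = 1
    return (1.0 / (2.0 * np.sqrt(6.0))) * a**-1.5 * (r / a) * np.exp(-r / (2.0 * a))

for label, R in (("1s", R_1s), ("2p", R_2p)):
    prob, _ = quad(lambda r: R(r)**2 * r**2, 0.0, a)  # Eq. (6.116) summed from 0 to a
    print(label, round(prob, 5))  # 1s: 0.32332, 2p: 0.00366
```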

Real Hydrogenlike Functions

The factor $e^{i m \phi}$ makes the spherical harmonics complex, except when $m=0$. Instead of working with complex wave functions such as (6.112) and (6.114), chemists often use real hydrogenlike wave functions formed by taking linear combinations of the complex functions. The justification for this procedure is given by the theorem of Section 3.6: Any linear combination of eigenfunctions of a degenerate energy level is an eigenfunction of the Hamiltonian with the same eigenvalue. Since the energy of the hydrogen atom does not depend on $m$, the $2 p_{1}$ and $2 p_{-1}$ states belong to a degenerate energy level. Any linear combination of them is an eigenfunction of the Hamiltonian with the same energy eigenvalue.

One way to combine these two functions to obtain a real function is

\(
\begin{equation}
2 p_{x} \equiv \frac{1}{\sqrt{2}}\left(2 p_{-1}+2 p_{1}\right)=\frac{1}{4 \sqrt{2 \pi}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \sin \theta \cos \phi \tag{6.118}
\end{equation}
\)

where we used (6.112), (6.114), and $e^{ \pm i \phi}=\cos \phi \pm i \sin \phi$. The $1 / \sqrt{2}$ factor normalizes $2 p_{x}$ :

\(
\begin{aligned}
\int\left|2 p_{x}\right|^{2} d \tau & =\frac{1}{2}\left(\int\left|2 p_{-1}\right|^{2} d \tau+\int\left|2 p_{1}\right|^{2} d \tau+\int\left(2 p_{-1}\right)^{*} 2 p_{1} d \tau+\int\left(2 p_{1}\right)^{*} 2 p_{-1} d \tau\right) \\
& =\frac{1}{2}(1+1+0+0)=1
\end{aligned}
\)

Here we used the fact that $2 p_{1}$ and $2 p_{-1}$ are normalized and are orthogonal to each other, since

\(
\int_{0}^{2 \pi}\left(e^{-i \phi}\right)^{*} e^{i \phi} d \phi=\int_{0}^{2 \pi} e^{2 i \phi} d \phi=0
\)

The designation $2 p_{x}$ for (6.118) becomes clearer if we note that (5.51) gives

\(
\begin{equation}
2 p_{x}=\frac{1}{4 \sqrt{2 \pi}}\left(\frac{Z}{a}\right)^{5 / 2} x e^{-Z r / 2 a} \tag{6.119}
\end{equation}
\)

A second way of combining the functions is

\(
\begin{gather}
2 p_{y} \equiv \frac{1}{i \sqrt{2}}\left(2 p_{1}-2 p_{-1}\right)=\frac{1}{4 \sqrt{2 \pi}}\left(\frac{Z}{a}\right)^{5 / 2} r \sin \theta \sin \phi e^{-Z r / 2 a} \tag{6.120}\\
2 p_{y}=\frac{1}{4 \sqrt{2 \pi}}\left(\frac{Z}{a}\right)^{5 / 2} y e^{-Z r / 2 a} \tag{6.121}
\end{gather}
\)

The function $2 p_{0}$ is real and is often denoted by

\(
\begin{equation}
2 p_{0}=2 p_{z}=\frac{1}{\sqrt{\pi}}\left(\frac{Z}{2 a}\right)^{5 / 2} z e^{-Z r / 2 a} \tag{6.122}
\end{equation}
\)

where capital $Z$ stands for the number of protons in the nucleus, and small $z$ is the $z$ coordinate of the electron. The functions $2 p_{x}, 2 p_{y}$, and $2 p_{z}$ are mutually orthogonal (Prob. 6.42). Note that $2 p_{z}$ is zero in the $x y$ plane, positive above this plane, and negative below it.

The functions $2 p_{-1}$ and $2 p_{1}$ are eigenfunctions of $\hat{L}^{2}$ with the same eigenvalue: $2 \hbar^{2}$. The reasoning of Section 3.6 shows that the linear combinations (6.118) and (6.120) are also eigenfunctions of $\hat{L}^{2}$ with eigenvalue $2 \hbar^{2}$. However, $2 p_{-1}$ and $2 p_{1}$ are eigenfunctions of $\hat{L}_{z}$ with different eigenvalues: $-\hbar$ and $+\hbar$. Therefore, $2 p_{x}$ and $2 p_{y}$ are not eigenfunctions of $\hat{L}_{z}$.

We can extend this procedure to construct real wave functions for higher states. Since $m$ ranges from $-l$ to $+l$, for each complex function containing the factor $e^{-i|m| \phi}$ there is a function with the same value of $n$ and $l$ but having the factor $e^{+i|m| \phi}$. Addition and subtraction of these functions gives two real functions, one with the factor $\cos (|m| \phi)$, the other with the factor $\sin (|m| \phi)$. Table 6.2 lists these real wave functions for the hydrogenlike atom. The subscripts on these functions come from similar considerations as for the $2 p_{x}, 2 p_{y}$, and $2 p_{z}$ functions. For example, the $3 d_{x y}$ function is proportional to $x y$ (Prob. 6.37).

The real hydrogenlike functions are derived from the complex functions by replacing $e^{i m \phi} /(2 \pi)^{1 / 2}$ with $\pi^{-1 / 2} \sin (|m| \phi)$ or $\pi^{-1 / 2} \cos (|m| \phi)$ for $m \neq 0$; for $m=0$ the $\phi$ factor is $1 /(2 \pi)^{1 / 2}$ for both real and complex functions.

In dealing with molecules, the real hydrogenlike orbitals are more useful than the complex ones. For example, we shall see in Section 15.5 that the real atomic orbitals $2 p{x}, 2 p{y}$, and $2 p{z}$ of the oxygen atom have the proper symmetry to be used in constructing a wave function for the $\mathrm{H}{2} \mathrm{O}$ molecule, whereas the complex $2 p$ orbitals do not.

TABLE 6.2 Real Hydrogenlike Wave Functions

$1 s=\frac{1}{\pi^{1 / 2}}\left(\frac{Z}{a}\right)^{3 / 2} e^{-Z r / a}$
$2 s=\frac{1}{4(2 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{3 / 2}\left(2-\frac{Z r}{a}\right) e^{-Z r / 2 a}$
$2 p_{z}=\frac{1}{4(2 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \cos \theta$
$2 p_{x}=\frac{1}{4(2 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \sin \theta \cos \phi$
$2 p_{y}=\frac{1}{4(2 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2} r e^{-Z r / 2 a} \sin \theta \sin \phi$
$3 s=\frac{1}{81(3 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{3 / 2}\left(27-18 \frac{Z r}{a}+2 \frac{Z^{2} r^{2}}{a^{2}}\right) e^{-Z r / 3 a}$
$3 p_{z}=\frac{2^{1 / 2}}{81 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2}\left(6-\frac{Z r}{a}\right) r e^{-Z r / 3 a} \cos \theta$
$3 p_{x}=\frac{2^{1 / 2}}{81 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2}\left(6-\frac{Z r}{a}\right) r e^{-Z r / 3 a} \sin \theta \cos \phi$
$3 p_{y}=\frac{2^{1 / 2}}{81 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{5 / 2}\left(6-\frac{Z r}{a}\right) r e^{-Z r / 3 a} \sin \theta \sin \phi$
$3 d_{z^{2}}=\frac{1}{81(6 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{7 / 2} r^{2} e^{-Z r / 3 a}\left(3 \cos ^{2} \theta-1\right)$
$3 d_{x z}=\frac{2^{1 / 2}}{81 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{7 / 2} r^{2} e^{-Z r / 3 a} \sin \theta \cos \theta \cos \phi$
$3 d_{y z}=\frac{2^{1 / 2}}{81 \pi^{1 / 2}}\left(\frac{Z}{a}\right)^{7 / 2} r^{2} e^{-Z r / 3 a} \sin \theta \cos \theta \sin \phi$
$3 d_{x^{2}-y^{2}}=\frac{1}{81(2 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{7 / 2} r^{2} e^{-Z r / 3 a} \sin ^{2} \theta \cos 2 \phi$
$3 d_{x y}=\frac{1}{81(2 \pi)^{1 / 2}}\left(\frac{Z}{a}\right)^{7 / 2} r^{2} e^{-Z r / 3 a} \sin ^{2} \theta \sin 2 \phi$


The hydrogenlike wave functions are one-electron spatial wave functions and so are hydrogenlike orbitals (Section 6.5). These functions have been derived for a one-electron atom, and we cannot expect to use them to get a truly accurate representation of the wave function of a many-electron atom. The use of the orbital concept to approximate many-electron atomic wave functions is discussed in Chapter 11. For now we restrict ourselves to one-electron atoms.

There are two fundamentally different ways of depicting orbitals. One way is to draw graphs of the functions; a second way is to draw contour surfaces of constant probability density.

First consider drawing graphs. To graph the variation of $\psi$ as a function of the three independent variables $r, \theta$, and $\phi$, we need four dimensions. The three-dimensional nature of our world prevents us from drawing such a graph. Instead, we draw graphs of the

FIGURE 6.10 Polar graphs of the $\theta$ factors in the $s$ and $p_{z}$ hydrogen-atom wave functions.

factors in $\psi$. Graphing $R(r)$ versus $r$, we get the curves of Fig. 6.8, which contain no information on the angular variation of $\psi$.

Now consider graphs of $S(\theta)$. We have (Table 5.1)

\(
S_{0,0}=1 / \sqrt{2}, \quad S_{1,0}=\frac{1}{2} \sqrt{6} \cos \theta
\)

We can graph these functions using two-dimensional Cartesian coordinates, plotting $S$ on the vertical axis and $\theta$ on the horizontal axis. $S_{0,0}$ gives a horizontal straight line, and $S_{1,0}$ gives a cosine curve. More commonly, $S$ is graphed using plane polar coordinates. The variable $\theta$ is the angle with the positive $z$ axis, and $S(\theta)$ is the distance from the origin to the point on the graph. For $S_{0,0}$, we get a circle; for $S_{1,0}$ we obtain two tangent circles (Fig. 6.10). The negative sign on the lower circle of the graph of $S_{1,0}$ indicates that $S_{1,0}$ is negative for $\frac{1}{2} \pi<\theta \leq \pi$. Strictly speaking, in graphing $\cos \theta$ we only get the upper circle, which is traced out twice; to get two tangent circles, we must graph $|\cos \theta|$.

Instead of graphing the angular factors separately, we can draw a single graph that plots $|S(\theta) T(\phi)|$ as a function of $\theta$ and $\phi$. We will use spherical coordinates, and the distance from the origin to a point on the graph will be $|S(\theta) T(\phi)|$. For an $s$ state, $S T$ is independent of the angles, and we get a sphere of radius $1 /(4 \pi)^{1 / 2}$ as the graph. For a $p_{z}$ state, $S T=\frac{1}{2}(3 / \pi)^{1 / 2} \cos \theta$, and the graph of $|S T|$ consists of two spheres with centers on the $z$ axis and tangent at the origin (Fig. 6.11). No doubt Fig. 6.11 is familiar. Some texts say this gives the shape of a $p_{z}$ orbital, which is wrong. Figure 6.11 is simply a graph of the angular factor in a $p_{z}$ wave function. Graphs of the $p_{x}$ and $p_{y}$ angular factors give tangent spheres lying on the $x$ and $y$ axes, respectively. If we graph $S^{2} T^{2}$ in spherical coordinates, we get surfaces with the familiar figure-eight cross sections; to repeat, these are graphs, not orbital shapes.

FIGURE 6.11 Graph of $\left|Y_{1}^{0}(\theta, \phi)\right|$, the angular factor in a $p_{z}$ wave function.

Now consider drawing contour surfaces of constant probability density. We shall draw surfaces in space, on each of which the value of $|\psi|^{2}$, the probability density, is constant. Naturally, if $|\psi|^{2}$ is constant on a given surface, $|\psi|$ is also constant on that surface. The contour surfaces for $|\psi|^{2}$ and for $|\psi|$ are identical.

For an $s$ orbital, $\psi$ depends only on $r$, and a contour surface is a surface of constant $r$, that is, a sphere centered at the origin. To pin down the size of an orbital, we take a contour surface within which the probability of finding the electron is, say, $95 \%$; thus we want $\int_{V}|\psi|^{2} d \tau=0.95$, where $V$ is the volume enclosed by the orbital contour surface.

Let us obtain the cross section of the $2 p_{y}$ hydrogenlike orbital in the $y z$ plane. In this plane, $\phi=\pi / 2$ (Fig. 6.5), and $\sin \phi=1$; hence Table 6.2 gives for this orbital in the $y z$ plane

\(
\begin{equation}
\left|2 p_{y}\right|=k^{5 / 2} \pi^{-1 / 2} r e^{-k r}|\sin \theta| \tag{6.123}
\end{equation}
\)

where $k=Z / 2 a$. To find the orbital cross section, we use plane polar coordinates to plot (6.123) for a fixed value of $\left|2 p_{y}\right|$; $r$ is the distance from the origin, and $\theta$ is the angle with the $z$ axis. The result for a typical contour (Prob. 6.44) is shown in Fig. 6.12. Since $y e^{-k r}=y \exp \left[-k\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}\right]$, we see that the $2 p_{y}$ orbital is a function of $y$ and $\left(x^{2}+z^{2}\right)$. Hence, on a circle centered on the $y$ axis and parallel to the $x z$ plane, $2 p_{y}$ is constant. Thus a three-dimensional contour surface may be developed by rotating the cross section in Fig. 6.12 about the $y$ axis, giving a pair of distorted ellipsoids. The shape of a real $2 p$ orbital is two separated, distorted ellipsoids, and not two tangent spheres.
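The cross section in Fig. 6.12 can be reproduced by evaluating (6.123) on a grid in the $y z$ plane and drawing a single contour. A minimal sketch (Python with numpy and matplotlib; $Z=1$ so $k=1/2$ in units of $1/a$, and the contour value is an arbitrary fraction of the maximum):

```python
import numpy as np
import matplotlib.pyplot as plt

k = 0.5  # k = Z/2a with Z = 1 and lengths measured in units of a
y, z = np.meshgrid(np.linspace(-12, 12, 400), np.linspace(-12, 12, 400))
r = np.hypot(y, z)               # distance from the nucleus within the yz plane
# in the yz plane, r|sin(theta)| = |y|, so Eq. (6.123) becomes:
psi = k**2.5 * np.pi**-0.5 * np.abs(y) * np.exp(-k * r)

plt.contour(y, z, psi, levels=[0.1 * psi.max()])  # one arbitrarily chosen contour value
plt.gca().set_aspect("equal")
plt.xlabel("y/a"); plt.ylabel("z/a")
plt.show()
```

Rotating the resulting two-lobed curve about the $y$ axis generates the pair of distorted ellipsoids described above.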

Now consider the shape of the two complex orbitals $2 p_{ \pm 1}$. We have

\(
\begin{gather}
2 p_{ \pm 1}=k^{5 / 2} \pi^{-1 / 2} r e^{-k r} \sin \theta e^{ \pm i \phi} \\
\left|2 p_{ \pm 1}\right|=k^{5 / 2} \pi^{-1 / 2} e^{-k r} r|\sin \theta| \tag{6.124}
\end{gather}
\)

and these two orbitals have the same shape. Since the right sides of (6.124) and (6.123) are identical, we conclude that Fig. 6.12 also gives the cross section of the $2 p_{ \pm 1}$ orbitals in the $y z$ plane. Since [Eq. (5.51)]

\(
e^{-k r} r|\sin \theta|=\exp \left[-k\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}\right]\left(x^{2}+y^{2}\right)^{1 / 2}
\)

we see that $2 p_{ \pm 1}$ is a function of $z$ and $x^{2}+y^{2}$; so we get the three-dimensional orbital shape by rotating Fig. 6.12 about the $z$ axis. This gives a doughnut-shaped surface.

Some hydrogenlike orbital surfaces are shown in Fig. 6.13. The $2 s$ orbital has a spherical node, which is not visible; the $3 s$ orbital has two such nodes. The $3 p_{z}$ orbital has a spherical node (indicated by a dashed line) and a nodal plane (the $x y$ plane).

FIGURE 6.12 Contour of a $2 p_{y}$ orbital.

FIGURE 6.13 Shapes of some hydrogen-atom orbitals.

The $3 d_{z^{2}}$ orbital has two nodal cones. The $3 d_{x^{2}-y^{2}}$ orbital has two nodal planes. Note that the view shown is not the same for the various orbitals. The relative signs of the wave functions are indicated. The other three real $3 d$ orbitals in Table 6.2 have the same shape as the $3 d_{x^{2}-y^{2}}$ orbital but have different orientations. The $3 d_{x y}$ orbital has its lobes lying between the $x$ and $y$ axes and is obtained by rotating the $3 d_{x^{2}-y^{2}}$ orbital by $45^{\circ}$ about the $z$ axis. The $3 d_{y z}$ and $3 d_{x z}$ orbitals have their lobes between the $y$ and $z$ axes and between the $x$ and $z$ axes, respectively. (Online three-dimensional views of the real hydrogenlike orbitals are at www.falstad.com/qmatom; these can be rotated using a mouse.)

FIGURE 6.14 Probability densities for some hydrogen-atom states. [For accurate stereo plots, see D. T. Cromer, J. Chem. Educ., 45, 626 (1968).]

Figure 6.14 represents the probability density in the $y z$ plane for various orbitals. The number of dots in a given region is proportional to the value of $|\psi|^{2}$ in that region. Rotation of these diagrams about the vertical $(z)$ axis gives the three-dimensional probability density. The $2 s$ orbital has a constant for its angular factor and hence has no angular nodes; for this orbital, $n-l-1=1$, indicating one radial node. The sphere on which $\psi_{2 s}=0$ is evident in Fig. 6.14.

Schrödinger's original interpretation of $|\psi|^{2}$ was that the electron is "smeared out" into a charge cloud. If we consider an electron passing from one medium to another, we find that $|\psi|^{2}$ is nonzero in both mediums. According to the charge-cloud interpretation, this would mean that part of the electron was reflected and part transmitted. However, experimentally one never detects a fraction of an electron; electrons behave as indivisible entities. This difficulty is removed by the Born interpretation, according to which the values of $|\psi|^{2}$ in the two mediums give the probabilities for reflection and transmission. The orbital shapes we have drawn give the regions of space in which the total probability of finding the electron is $95 \%$.


In 1896, Zeeman observed that application of an external magnetic field caused a splitting of atomic spectral lines. We shall consider this Zeeman effect for the hydrogen atom. We begin by reviewing magnetism.

Magnetic fields arise from moving electric charges. A charge $Q$ with velocity $\mathbf{v}$ gives rise to a magnetic field $\mathbf{B}$ at point $P$ in space, such that

\(
\begin{equation}
\mathbf{B}=\frac{\mu_{0}}{4 \pi} \frac{Q \mathbf{v} \times \mathbf{r}}{r^{3}} \tag{6.125}
\end{equation}
\)

where $\mathbf{r}$ is the vector from $Q$ to point P and where $\mu_{0}$ (called the permeability of vacuum or the magnetic constant) is defined as $4 \pi \times 10^{-7} \mathrm{~N} \mathrm{C}^{-2} \mathrm{~s}^{2}$. [Equation (6.125) is valid only for a nonaccelerated charge moving with a speed much less than the speed of light.] The vector $\mathbf{B}$ is called the magnetic induction or magnetic flux density. (It was formerly believed that the vector $\mathbf{H}$ was the fundamental magnetic field vector, so $\mathbf{H}$ was called the magnetic field strength. It is now known that $\mathbf{B}$ is the fundamental magnetic vector.) Equation (6.125) is in SI units with $Q$ in coulombs and $\mathbf{B}$ in teslas (T), where $1 \mathrm{~T}=1 \mathrm{NC}^{-1} \mathrm{~m}^{-1} \mathrm{~s}$.

Two electric charges $+Q$ and $-Q$ separated by a small distance $b$ constitute an electric dipole. The electric dipole moment is defined as a vector from $-Q$ to $+Q$ with magnitude $Q b$. For a small planar loop of electric current, it turns out that the magnetic field generated by the moving charges of the current is given by the same mathematical
expression as that giving the electric field due to an electric dipole, except that the electric dipole moment is replaced by the magnetic dipole moment $\mathbf{m} ; \mathbf{m}$ is a vector of magnitude $I A$, where $I$ is the current flowing in a loop of area $A$. The direction of $\mathbf{m}$ is perpendicular to the plane of the current loop.

Consider the magnetic (dipole) moment associated with a charge $Q$ moving in a circle of radius $r$ with speed $v$. The current is the charge flow per unit time. The circumference of the circle is $2 \pi r$, and the time for one revolution is $2 \pi r / v$. Hence $I=Q v / 2 \pi r$. The magnitude of $\mathbf{m}$ is

\(
\begin{equation}
|\mathbf{m}|=I A=(Q v / 2 \pi r) \pi r^{2}=Q v r / 2=Q r p / 2 m \tag{6.126}
\end{equation}
\)

where $m$ is the mass of the charged particle and $p$ is its linear momentum. Since the radius vector $\mathbf{r}$ is perpendicular to $\mathbf{p}$, we have

\(
\begin{equation}
\mathbf{m}_{L}=\frac{Q \mathbf{r} \times \mathbf{p}}{2 m}=\frac{Q}{2 m} \mathbf{L} \tag{6.127}
\end{equation}
\)

where the definition of orbital angular momentum $\mathbf{L}$ was used and the subscript on $\mathbf{m}$ indicates that it arises from the orbital motion of the particle. Although we derived (6.127) for the special case of circular motion, its validity is general. For an electron, $Q=-e$, and the magnetic moment due to its orbital motion is

\(
\begin{equation}
\mathbf{m}_{L}=-\frac{e}{2 m_{e}} \mathbf{L} \tag{6.128}
\end{equation}
\)

The magnitude of $\mathbf{L}$ is given by (5.95), and the magnitude of the orbital magnetic moment of an electron with orbital-angular-momentum quantum number $l$ is

\(
\begin{equation}
\left|\mathbf{m}_{L}\right|=\frac{e \hbar}{2 m_{e}}[l(l+1)]^{1 / 2}=\mu_{\mathrm{B}}[l(l+1)]^{1 / 2} \tag{6.129}
\end{equation}
\)

The constant $e \hbar / 2 m_{e}$ is called the Bohr magneton $\mu_{\mathrm{B}}$ :

\(
\begin{equation}
\mu_{\mathrm{B}} \equiv e \hbar / 2 m_{e}=9.2740 \times 10^{-24} \mathrm{~J} / \mathrm{T} \tag{6.130}
\end{equation}
\)

Now consider applying an external magnetic field to the hydrogen atom. The energy of interaction between a magnetic dipole $\mathbf{m}$ and an external magnetic field $\mathbf{B}$ can be shown to be

\(
\begin{equation}
E_{B}=-\mathbf{m} \cdot \mathbf{B} \tag{6.131}
\end{equation}
\)

Using Eq. (6.128), we have

\(
\begin{equation}
E_{B}=\frac{e}{2 m_{e}} \mathbf{L} \cdot \mathbf{B} \tag{6.132}
\end{equation}
\)

We take the $z$ axis along the direction of the applied field: $\mathbf{B}=B \mathbf{k}$, where $\mathbf{k}$ is a unit vector in the $z$ direction. We have

\(
E_{B}=\frac{e}{2 m_{e}} B\left(L_{x} \mathbf{i}+L_{y} \mathbf{j}+L_{z} \mathbf{k}\right) \cdot \mathbf{k}=\frac{e}{2 m_{e}} B L_{z}=\frac{\mu_{\mathrm{B}}}{\hbar} B L_{z}
\)

where $L_{z}$ is the $z$ component of orbital angular momentum. We now replace $L_{z}$ by the operator $\hat{L}_{z}$ to give the following additional term in the Hamiltonian operator, resulting from the external magnetic field:

\(
\begin{equation}
\hat{H}_{B}=\mu_{\mathrm{B}} B \hbar^{-1} \hat{L}_{z} \tag{6.133}
\end{equation}
\)

The Schrödinger equation for the hydrogen atom in a magnetic field is

\(
\begin{equation}
\left(\hat{H}+\hat{H}_{B}\right) \psi=E \psi \tag{6.134}
\end{equation}
\)

where $\hat{H}$ is the hydrogen-atom Hamiltonian in the absence of an external field. We readily verify that the solutions of Eq. (6.134) are the complex hydrogenlike wave functions (6.61):

\(
\begin{equation}
\left(\hat{H}+\hat{H}_{B}\right) R(r) Y_{l}^{m}(\theta, \phi)=\hat{H} R Y_{l}^{m}+\mu_{\mathrm{B}} \hbar^{-1} B \hat{L}_{z} R Y_{l}^{m}=\left(-\frac{Z^{2}}{n^{2}} \frac{e^{2}}{8 \pi \varepsilon_{0} a}+\mu_{\mathrm{B}} B m\right) R Y_{l}^{m} \tag{6.135}
\end{equation}
\)

where Eqs. (6.94) and (5.105) were used. Thus there is an additional term $\mu_{\mathrm{B}} B m$ in the energy, and the external magnetic field removes the $m$ degeneracy. For obvious reasons, $m$ is often called the magnetic quantum number. Actually, the observed energy shifts do not match the predictions of Eq. (6.135) because of the existence of electron spin magnetic moment (Chapter 10 and Section 11.7).
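The scale of the predicted splitting is easy to estimate. A short calculation (Python; the field strength $B=1 \mathrm{~T}$ is an assumed value, chosen only for illustration):

```python
mu_B = 9.2740e-24   # Bohr magneton, J/T [Eq. (6.130)]
e = 1.6022e-19      # elementary charge, C (to convert J to eV)
B = 1.0             # assumed applied field, tesla

dE = mu_B * B       # spacing between adjacent m levels, from Eq. (6.135)
print(dE, "J")      # about 9.3e-24 J
print(dE / e, "eV") # about 5.8e-5 eV, minute compared with hydrogen-level spacings of several eV
```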

In Chapter 5 we found that in quantum mechanics $\mathbf{L}$ lies on the surface of a cone. A classical-mechanical treatment of the motion of $\mathbf{L}$ in an applied magnetic field shows that the field exerts a torque on $\mathbf{m}_{L}$, causing $\mathbf{L}$ to revolve about the direction of $\mathbf{B}$ at a constant frequency given by $\left|\mathbf{m}_{L}\right| B / 2 \pi|\mathbf{L}|$, while maintaining a constant angle with $\mathbf{B}$. This gyroscopic motion is called precession. In quantum mechanics, a complete specification of $\mathbf{L}$ is impossible. However, one finds that $\langle\mathbf{L}\rangle$ precesses about the field direction (Dicke and Wittke, Section 12-3).


For a one-particle central-force problem, the wave function is given by (6.16) as $\psi=R(r) Y_{l}^{m}(\theta, \phi)$ and the radial factor $R(r)$ is found by solving the radial equation (6.17). The Numerov method of Section 4.4 applies to differential equations of the form $\psi^{\prime \prime}=G(x) \psi(x)$ [Eq. (4.66)], so we need to eliminate the first derivative $R^{\prime}$ in (6.17). Let us define $F(r)$ by $F(r) \equiv r R(r)$, so

\(
\begin{equation}
R(r)=r^{-1} F(r) \tag{6.136}
\end{equation}
\)

Then $R^{\prime}=-r^{-2} F+r^{-1} F^{\prime}$ and $R^{\prime \prime}=2 r^{-3} F-2 r^{-2} F^{\prime}+r^{-1} F^{\prime \prime}$. Substitution in (6.17) transforms the radial equation to

\(
\begin{gather}
-\frac{\hbar^{2}}{2 m} F^{\prime \prime}(r)+\left[V(r)+\frac{l(l+1) \hbar^{2}}{2 m r^{2}}\right] F(r)=E F(r) \tag{6.137}\\
F^{\prime \prime}(r)=G(r) F(r), \quad \text { where } G(r) \equiv \frac{m}{\hbar^{2}}(2 V-2 E)+\frac{l(l+1)}{r^{2}} \tag{6.138}
\end{gather}
\)

which has the form needed for the Numerov method. In solving (6.137) numerically, one deals separately with each value of $l$. Equation (6.137) resembles the one-dimensional Schrödinger equation $-\left(\hbar^{2} / 2 m\right) \psi^{\prime \prime}(x)+V(x) \psi(x)=E \psi(x)$, except that $r$ (whose range is 0 to $\infty$ ) replaces $x$ (whose range is $-\infty$ to $\infty$ ), $F(r) \equiv r R(r)$ replaces $\psi$, and $V(r)+l(l+1) \hbar^{2} / 2 m r^{2}$ replaces $V(x)$. We can expect that for each value of $l$, the lowest-energy solution will have 0 interior nodes (that is, nodes with $0<r<\infty$ ), the next lowest will have 1 interior node, and so on.

Recall from the discussion after (6.81) that if $R(r)$ behaves as $1 / r^{b}$ near the origin, then if $b>1, R(r)$ is not quadratically integrable; also, the value $b=1$ is not allowed, as noted after (6.83). Hence $F(r) \equiv r R(r)$ must be zero at $r=0$.

For $l \neq 0, G(r)$ in (6.138) is infinite at $r=0$, which upsets most computers. To avoid this problem, one starts the solution at an extremely small value of $r$ (for example, $10^{-15}$ for the dimensionless $r_{r}$ ) and approximates $F(r)$ as zero at this point.

As an example, we shall use the Numerov method to solve for the lowest bound-state H-atom energies. Here, $V=-e^{2} / 4 \pi \varepsilon_{0} r=-e^{\prime 2} / r$, where $e^{\prime} \equiv e /\left(4 \pi \varepsilon_{0}\right)^{1 / 2}$. The radial equation (6.62) contains the three constants $e^{\prime}, \mu$, and $\hbar$; here $e^{\prime}$ has SI units of $\mathrm{m} \mathrm{N}^{1 / 2}$ (see Table A.1 of the Appendix) and hence has the dimensions $\left[e^{\prime}\right]=\mathrm{L}^{3 / 2} \mathrm{M}^{1 / 2} \mathrm{~T}^{-1}$. Following the procedure used to derive Eq. (4.73), we find the H-atom reduced energy and reduced radial coordinate to be (Prob. 6.47)

\(
\begin{equation}
E_{r}=E / \mu e^{\prime 4} \hbar^{-2}, \quad r_{r}=r / B=r / \hbar^{2} \mu^{-1} e^{\prime-2} \tag{6.139}
\end{equation}
\)

Use of (6.139) and (4.76) and (4.77) with $\psi$ replaced by $F$ and $B=\hbar^{2} \mu^{-1} e^{\prime-2}$ transforms (6.137) for the H atom to (Prob. 6.47)

\(
\begin{equation}
F_{r}^{\prime \prime}=G_{r} F_{r}, \quad \text { where } G_{r}=l(l+1) / r_{r}^{2}-2 / r_{r}-2 E_{r} \tag{6.140}
\end{equation}
\)

and where $F_{r}=F / B^{-1 / 2}$.
The bound-state H-atom energies are all less than zero. Suppose we want to find the H-atom bound-state eigenvalues with $E_{r} \leq-0.04$. Equating this energy to $V_{r}$, we have (Prob. 6.47) $-0.04=-1 / r_{r}$ and the classically allowed region for this energy value extends from $r_{r}=0$ to $r_{r}=25$. Going two units into the classically forbidden region, we take $r_{r, \max }=27$ and require that $F_{r}(27)=0$. We shall take $s_{r}=0.1$, giving 270 points from 0 to 27 (more precisely, from $10^{-15}$ to $27+10^{-15}$).
$G_{r}$ in (6.140) contains the parameter $l$, so the program of Table 4.1 has to be modified to input the value of $l$. When setting up a spreadsheet, enter the $l$ value in some cell and refer to this cell when you type the formula for cell B7 (Fig. 4.9) that defines $G_{r}$. Start column A at $r_{r}=1 \times 10^{-15}$. Column C of the spreadsheet will contain $F_{r}$ values instead of $\psi_{r}$ values, and $F_{r}$ will differ negligibly from zero at $r_{r}=1 \times 10^{-15}$, and will be taken as zero at this point.

With these choices, we find (Prob. 6.48a) the lowest three H-atom eigenvalues for $l=0$ to be $E_{r}=-0.4970$, $-0.1246$, and $-0.05499$; the lowest two $l=1$ eigenvalues found are $-0.1250$ and $-0.05526$. The true values [Eqs. (6.94) and (6.139)] are $-0.5000$, $-0.1250$, and $-0.05555$. The mediocre accuracy can be attributed mainly to the rapid variation of $G(r)$ near $r=0$. If $s_{r}$ is taken as 0.025 instead of 0.1 (giving 1080 points), the $l=0$ eigenvalues are improved to $-0.4998$, $-0.12497$, and $-0.05510$. See also Prob. 6.48b.
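For readers who prefer a program to a spreadsheet, a minimal sketch of the same calculation is given below (Python with numpy; the grid, step size, and $r_{r,\max}=27$ follow the text, while the function names, the small starting value at the second grid point, and the bisection brackets are our own choices):

```python
import numpy as np

def F_at_rmax(Er, l, rmax=27.0, sr=0.025):
    """Integrate F'' = G F outward by the Numerov method; return F(rmax).

    An eigenvalue E_r is signaled by F(rmax) passing through zero.
    """
    r = np.arange(1e-15, rmax, sr)                # start just off r = 0, as in the text
    G = l * (l + 1) / r**2 - 2.0 / r - 2.0 * Er   # Eq. (6.140)
    F = np.zeros_like(r)
    F[1] = 1e-6                                   # F(0) = 0; second value is arbitrary and small
    c = sr**2 / 12.0
    for i in range(1, len(r) - 1):                # Numerov recursion
        F[i + 1] = (2.0 * F[i] * (1.0 + 5.0 * c * G[i])
                    - F[i - 1] * (1.0 - c * G[i - 1])) / (1.0 - c * G[i + 1])
    return F[-1]

def eigenvalue(l, Elo, Ehi, tol=1e-8):
    """Bisect on E_r between two energies where F(rmax) differs in sign."""
    flo = F_at_rmax(Elo, l)
    while Ehi - Elo > tol:
        Emid = 0.5 * (Elo + Ehi)
        fmid = F_at_rmax(Emid, l)
        if flo * fmid <= 0.0:
            Ehi = Emid
        else:
            Elo, flo = Emid, fmid
    return 0.5 * (Elo + Ehi)

print(eigenvalue(0, -0.6, -0.3))    # ground state; about -0.4998 at this step size
print(eigenvalue(0, -0.2, -0.08))   # second l = 0 level; about -0.125
```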


Theorems of Quantum Mechanics

Click the terms below to know more about them.

Schrödinger Equation: A fundamental equation in quantum mechanics that describes how the quantum state of a physical system changes over time.

Hamiltonian: An operator corresponding to the total energy of the system, including both kinetic and potential energies.

Variation Method: An approximation method used to find the ground state energy of a quantum system by minimizing the energy expectation value.

Perturbation Theory: A method used to find an approximate solution to a problem by starting from the exact solution of a related, simpler problem and adding corrections.

Bracket Notation: A notation introduced by Dirac, used to represent the inner product of two functions in quantum mechanics.

Matrix Element: The integral of an operator sandwiched between two functions, representing the transition amplitude between states.

Hermitian Operator: An operator that is equal to its own conjugate transpose, ensuring that its eigenvalues are real numbers.

Eigenvalues: The possible outcomes of a measurement of a physical quantity, represented by the operator.

Eigenfunctions: The functions that correspond to the eigenvalues of an operator, representing the state of the system.

Orthogonality: A property of functions where their inner product is zero, indicating that they are independent.

Completeness: A property of a set of functions where any well-behaved function can be expanded as a linear combination of the set.

Parity Operator: An operator that replaces each Cartesian coordinate with its negative, used to determine the symmetry properties of wave functions.

Commuting Operators: Operators that can be applied in any order without changing the result, indicating that the corresponding physical quantities can be simultaneously measured.

Uncertainty Principle: A principle stating that certain pairs of physical properties, like position and momentum, cannot be simultaneously measured with arbitrary precision.

Superposition: The principle that a quantum system can exist in multiple states at once, and the overall state is a combination of these states.

Reduction of the Wave Function: The process by which a quantum system's state becomes definite upon measurement, also known as wave function collapse.

Dirac Delta Function: A function that is zero everywhere except at a single point, where it is infinitely high and integrates to one, used to represent point particles.

Heaviside Step Function: A function that is zero for negative arguments and one for positive arguments, used to represent sudden changes.

Time-Dependent Schrödinger Equation: An equation describing how the quantum state of a system evolves over time.

Stationary State: A quantum state with all observables independent of time, typically an eigenstate of the Hamiltonian.

The Schrödinger equation for the one-electron atom (Chapter 6) is exactly solvable. However, because of the interelectronic-repulsion terms in the Hamiltonian, the Schrödinger equation for many-electron atoms and molecules cannot be solved exactly. Hence we must seek approximate methods of solution. The two main approximation methods, the variation method and perturbation theory, will be presented in Chapters 8 and 9. To derive these methods, we must develop further the theory of quantum mechanics, which is what is done in this chapter.

Before starting, we introduce some notation. The definite integral over all space of an operator sandwiched between two functions occurs often, and various abbreviations are used:

\(
\begin{equation}
\int f_{m}^{*} \hat{A} f_{n} d \tau \equiv\left\langle f_{m}\right| \hat{A}\left|f_{n}\right\rangle \equiv\langle m| \hat{A}|n\rangle \tag{7.1}
\end{equation}
\)

where $f{m}$ and $f{n}$ are two functions. If it is clear what functions are meant, we can use just the indexes, as indicated in (7.1). The above notation, introduced by Dirac, is called bracket notation. Another notation is

\(
\begin{equation}
\int f_{m}^{*} \hat{A} f_{n} d \tau \equiv A_{m n} \tag{7.2}
\end{equation}
\)

The notations $A_{m n}$ and $\langle m| \hat{A}|n\rangle$ imply that we use the complex conjugate of the function whose letter appears first. The definite integral $\langle m| \hat{A}|n\rangle$ is called a matrix element of the operator $\hat{A}$. Matrices are rectangular arrays of numbers and obey certain rules of combination (see Section 7.10).

For the definite integral over all space between two functions, we write

\(
\begin{equation}
\int f_{m}^{*} f_{n} d \tau \equiv\left\langle f_{m} \mid f_{n}\right\rangle \equiv\left(f_{m}, f_{n}\right) \equiv\langle m \mid n\rangle \tag{7.3}
\end{equation}
\)

Note that

\(
\langle f| \hat{B}|g\rangle=\langle f \mid \hat{B} g\rangle
\)

where $f$ and $g$ are functions. Since $\left(\int f_{m}^{*} f_{n} d \tau\right)^{*}=\int f_{n}^{*} f_{m} d \tau$, we have the identity

\(
\begin{equation}
\langle m \mid n\rangle^{*}=\langle n \mid m\rangle \tag{7.4}
\end{equation}
\)

Since the complex conjugate of $f_{m}$ in (7.1) is taken, it follows that

\(
\begin{equation}
\langle c f| \hat{B}|g\rangle=c^{*}\langle f| \hat{B}|g\rangle \quad \text { and } \quad\langle f| \hat{B}|c g\rangle=c\langle f| \hat{B}|g\rangle \tag{7.5}
\end{equation}
\)

where $\hat{B}$ is a linear operator and $c$ is a constant.


The quantum-mechanical operators that represent physical quantities are linear (Section 3.1). These operators must meet an additional requirement, which we now discuss.

Definition of Hermitian Operators

Let $\hat{A}$ be the linear operator representing the physical property $A$. The average value of $A$ is [Eq. (3.88)]

\(
\langle A\rangle=\int \Psi^{*} \hat{A} \Psi d \tau
\)

where $\Psi$ is the state function of the system. Since the average value of a physical quantity must be a real number, we demand that

\(
\begin{gather}
\langle A\rangle=\langle A\rangle^{*} \\
\int \Psi^{*} \hat{A} \Psi d \tau=\left[\int \Psi^{*} \hat{A} \Psi d \tau\right]^{*}=\int\left(\Psi^{*}\right)^{*}(\hat{A} \Psi)^{*} d \tau \\
\int \Psi^{*} \hat{A} \Psi d \tau=\int \Psi(\hat{A} \Psi)^{*} d \tau \tag{7.6}
\end{gather}
\)

Equation (7.6) must hold for every function $\Psi$ that can represent a possible state of the system; that is, it must hold for all well-behaved functions $\Psi$. A linear operator that satisfies (7.6) for all well-behaved functions is called a Hermitian operator (after the mathematician Charles Hermite).

Many texts define a Hermitian operator as a linear operator that satisfies

\(
\begin{equation}
\int f^{*} \hat{A} g d \tau=\int g(\hat{A} f)^{*} d \tau \tag{7.7}
\end{equation}
\)

for all well-behaved functions $f$ and $g$. Note especially that on the left side of (7.7) $\hat{A}$ operates on $g$, but on the right side $\hat{A}$ operates on $f$. For the special case $f=g$, (7.7) reduces to (7.6). Equation (7.7) is apparently a more stringent requirement than (7.6), but we shall prove that (7.7) is a consequence of (7.6). Therefore the two definitions of a Hermitian operator are equivalent.

We begin the proof by setting $\Psi=f+c g$ in (7.6), where $c$ is an arbitrary constant. This gives

\(
\begin{aligned}
& \int(f+c g)^{*} \hat{A}(f+c g) d \tau=\int(f+c g)[\hat{A}(f+c g)]^{*} d \tau \\
& \int\left(f^{*}+c^{*} g^{*}\right) \hat{A} f d \tau+\int\left(f^{*}+c^{*} g^{*}\right) \hat{A} c g d \tau \\
& =\int(f+c g)(\hat{A} f)^{*} d \tau+\int(f+c g)(\hat{A} c g)^{*} d \tau \\
& \int f^{*} \hat{A} f d \tau+c^{*} \int g^{*} \hat{A} f d \tau+c \int f^{*} \hat{A} g d \tau+c^{*} c \int g^{*} \hat{A} g d \tau \\
& =\int f(\hat{A} f)^{*} d \tau+c \int g(\hat{A} f)^{*} d \tau+c^{*} \int f(\hat{A} g)^{*} d \tau+c c^{*} \int g(\hat{A} g)^{*} d \tau
\end{aligned}
\)

By virtue of (7.6), the first terms on each side of this last equation are equal to each other; likewise, the last terms on each side are equal. Therefore

\(
\begin{equation}
c^{*} \int g^{*} \hat{A} f d \tau+c \int f^{*} \hat{A} g d \tau=c \int g(\hat{A} f)^{*} d \tau+c^{*} \int f(\hat{A} g)^{*} d \tau \tag{7.8}
\end{equation}
\)

Setting $c=1$ in (7.8), we have

\(
\begin{equation}
\int g^{*} \hat{A} f d \tau+\int f^{*} \hat{A} g d \tau=\int g(\hat{A} f)^{*} d \tau+\int f(\hat{A} g)^{*} d \tau \tag{7.9}
\end{equation}
\)

Setting $c=i$ in (7.8), we have, after dividing by $i$,

\(
\begin{equation}
-\int g^{*} \hat{A} f d \tau+\int f^{*} \hat{A} g d \tau=\int g(\hat{A} f)^{*} d \tau-\int f(\hat{A} g)^{*} d \tau \tag{7.10}
\end{equation}
\)

We now add (7.9) and (7.10) to get (7.7). This completes the proof.
Therefore, a Hermitian operator $\hat{A}$ is a linear operator that satisfies

\(
\begin{equation}
\int f_{m}^{*} \hat{A} f_{n} d \tau=\int f_{n}\left(\hat{A} f_{m}\right)^{*} d \tau \tag{7.11}
\end{equation}
\)

where $f_{m}$ and $f_{n}$ are arbitrary well-behaved functions and the integrals are definite integrals over all space. Using the bracket and matrix-element notations, we write

\(
\begin{align}
\left\langle f_{m}\right| \hat{A}\left|f_{n}\right\rangle & =\left\langle f_{n}\right| \hat{A}\left|f_{m}\right\rangle^{*} \tag{7.12}\\
\langle m| \hat{A}|n\rangle & =\langle n| \hat{A}|m\rangle^{*} \tag{7.13}\\
A_{m n} & =\left(A_{n m}\right)^{*} \tag{7.14}
\end{align}
\)

The two sides of (7.12) differ by having the functions interchanged and the complex conjugate taken.

Examples of Hermitian Operators

Let us show that some of the operators we have been using are indeed Hermitian. For simplicity, we shall work in one dimension. To prove that an operator is Hermitian, it suffices to show that it satisfies (7.6) for all well-behaved functions. However, we shall make things a bit harder by proving that (7.11) is satisfied.

First consider the one-particle, one-dimensional potential-energy operator. The right side of (7.11) is

\(
\begin{equation}
\int_{-\infty}^{\infty} f_{n}(x)\left[V(x) f_{m}(x)\right]^{*} d x \tag{7.15}
\end{equation}
\)

We have $V^{*}=V$, since the potential energy is a real function. Moreover, the order of the factors in (7.15) does not matter. Therefore,

\(
\int_{-\infty}^{\infty} f_{n}\left(V f_{m}\right)^{*} d x=\int_{-\infty}^{\infty} f_{n} V^{*} f_{m}^{*} d x=\int_{-\infty}^{\infty} f_{m}^{*} V f_{n} d x
\)

which proves that $V$ is Hermitian.
The operator for the $x$ component of linear momentum is $\hat{p}_{x}=-i \hbar d / d x$ [Eq. (3.23)]. For this operator, the left side of (7.11) is

\(
-i \hbar \int_{-\infty}^{\infty} f_{m}^{*}(x) \frac{d f_{n}(x)}{d x} d x
\)

Now we use the formula for integration by parts:

\(
\begin{equation}
\int_{a}^{b} u(x) \frac{d v(x)}{d x} d x=\left.u(x) v(x)\right|_{a} ^{b}-\int_{a}^{b} v(x) \frac{d u(x)}{d x} d x \tag{7.16}
\end{equation}
\)

Let

\(
u(x) \equiv-i \hbar f_{m}^{*}(x), \quad v(x) \equiv f_{n}(x)
\)

Then

\(
\begin{equation}
-i \hbar \int_{-\infty}^{\infty} f_{m}^{*} \frac{d f_{n}}{d x} d x=-\left.i \hbar f_{m}^{*} f_{n}\right|_{-\infty} ^{\infty}+i \hbar \int_{-\infty}^{\infty} f_{n}(x) \frac{d f_{m}^{*}(x)}{d x} d x \tag{7.17}
\end{equation}
\)

Because $f_{m}$ and $f_{n}$ are well-behaved functions, they vanish at $x= \pm \infty$. (If they didn't vanish at infinity, they wouldn't be quadratically integrable.) Therefore, (7.17) becomes

\(
\int_{-\infty}^{\infty} f_{m}^{*}\left(-i \hbar \frac{d f_{n}}{d x}\right) d x=\int_{-\infty}^{\infty} f_{n}\left(-i \hbar \frac{d f_{m}}{d x}\right)^{*} d x
\)

which is the same as (7.11) and proves that $\hat{p}_{x}$ is Hermitian. The proof that the kinetic-energy operator is Hermitian is left to the reader. The sum of two Hermitian operators can be shown to be Hermitian. Hence the Hamiltonian operator $\hat{H}=\hat{T}+\hat{V}$ is Hermitian.
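Hermiticity can also be spot-checked numerically for particular well-behaved functions. A minimal sketch (Python with numpy; the two Gaussian-damped test functions and the grid are arbitrary choices, and the derivative in $\hat{p}_{x}$ is approximated by finite differences):

```python
import numpy as np

hbar = 1.0                                 # work in atomic units
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# two arbitrary quadratically integrable test functions
f = (x + 1j * x**2) * np.exp(-x**2 / 2.0)
g = np.exp(-(x - 1.0)**2 / 2.0)

def p_x(psi):
    # p_x = -i * hbar * d/dx, with the derivative taken by central differences
    return -1j * hbar * np.gradient(psi, dx)

lhs = np.sum(np.conj(f) * p_x(g)) * dx     # left side of Eq. (7.11)
rhs = np.sum(g * np.conj(p_x(f))) * dx     # right side of Eq. (7.11)
print(lhs, rhs)                            # the two agree to grid accuracy
```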

Theorems about Hermitian Operators

We now prove some important theorems about the eigenvalues and eigenfunctions of Hermitian operators.

Since the eigenvalues of the operator $\hat{A}$ corresponding to the physical quantity $A$ are the possible results of a measurement of $A$ (Section 3.3), these eigenvalues should all be real numbers. We now prove that the eigenvalues of a Hermitian operator are real numbers.

We are given that $\hat{A}$ is Hermitian. Translating these words into an equation, we have [Eq. (7.11)]

\(
\begin{equation}
\int f_{m}^{*} \hat{A} f_{n} d \tau=\int f_{n}\left(\hat{A} f_{m}\right)^{*} d \tau \tag{7.18}
\end{equation}
\)

for all well-behaved functions $f_{m}$ and $f_{n}$. We want to prove that every eigenvalue of $\hat{A}$ is a real number. Translating this into equations, we want to show that $a_{i}=a_{i}^{*}$, where the eigenvalues $a_{i}$ satisfy $\hat{A} g_{i}=a_{i} g_{i}$; the functions $g_{i}$ are the eigenfunctions.

To introduce the eigenvalues $a_{i}$ into (7.18), we write (7.18) for the special case where $f_{m}=g_{i}$ and $f_{n}=g_{i}$ :

\(
\int g_{i}^{*} \hat{A} g_{i} d \tau=\int g_{i}\left(\hat{A} g_{i}\right)^{*} d \tau
\)

Use of $\hat{A} g_{i}=a_{i} g_{i}$ gives

\(
\begin{gather}
a_{i} \int g_{i}^{*} g_{i} d \tau=\int g_{i}\left(a_{i} g_{i}\right)^{*} d \tau=a_{i}^{*} \int g_{i} g_{i}^{*} d \tau \\
\left(a_{i}-a_{i}^{*}\right) \int\left|g_{i}\right|^{2} d \tau=0 \tag{7.19}
\end{gather}
\)

Since the integrand $\left|g_{i}\right|^{2}$ is never negative, the only way the integral in (7.19) could be zero would be if $g_{i}$ were zero for all values of the coordinates. However, we always reject $g_{i}=0$ as an eigenfunction on physical grounds. Hence the integral in (7.19) cannot be zero. Therefore, $\left(a_{i}-a_{i}^{*}\right)=0$, and $a_{i}=a_{i}^{*}$. We have proved:

THEOREM 1. The eigenvalues of a Hermitian operator are real numbers.
To help become familiar with bracket notation, we shall repeat the proof of Theorem 1 using bracket notation. We begin by setting $m=i$ and $n=i$ in (7.13) to get $\langle i| \hat{A}|i\rangle=\langle i| \hat{A}|i\rangle^{*}$. Choosing the function with index $i$ to be an eigenfunction of $\hat{A}$ and using the eigenvalue equation $\hat{A} g_{i}=a_{i} g_{i}$, we have $\langle i| a_{i}|i\rangle=\langle i| a_{i}|i\rangle^{*}$. Therefore $a_{i}\langle i \mid i\rangle=a_{i}^{*}\langle i \mid i\rangle^{*}=a_{i}^{*}\langle i \mid i\rangle$ and $\left(a_{i}-a_{i}^{*}\right)\langle i \mid i\rangle=0$. So $a_{i}=a_{i}^{*}$, where (7.4) with $m=n$ was used.

We showed that two different particle-in-a-box energy eigenfunctions $\psi_{i}$ and $\psi_{j}$ are orthogonal, meaning that $\int_{-\infty}^{\infty} \psi_{i}^{*} \psi_{j} d x=0$ for $i \neq j$ [Eq. (2.26)]. Two functions $f_{1}$ and $f_{2}$ of the same set of coordinates are said to be orthogonal if

\(
\begin{equation}
\int f_{1}^{*} f_{2} d \tau=0 \tag{7.20}
\end{equation}
\)

where the integral is a definite integral over the full range of the coordinates. We now prove the general theorem that the eigenfunctions of a Hermitian operator are, or can be chosen to be, mutually orthogonal. Given that

\(
\begin{equation}
\hat{B} F=s F, \quad \hat{B} G=t G \tag{7.21}
\end{equation}
\)

where $F$ and $G$ are two linearly independent eigenfunctions of the Hermitian operator $\hat{B}$, we want to prove that

\(
\int F^{*} G d \tau \equiv\langle F \mid G\rangle=0
\)

We begin with Eq. (7.12), which expresses the Hermitian nature of $\hat{B}$ :

\(
\langle F| \hat{B}|G\rangle=\langle G| \hat{B}|F\rangle^{*}
\)

Using (7.21), we have

\(
\begin{aligned}
\langle F| t|G\rangle & =\langle G| s|F\rangle^{*} \\
t\langle F \mid G\rangle & =s^{*}\langle G \mid F\rangle^{*}
\end{aligned}
\)

Since eigenvalues of Hermitian operators are real (Theorem 1), we have $s^{*}=s$. Use of $\langle G \mid F\rangle^{*}=\langle F \mid G\rangle$ [Eq. (7.4)] gives

\(
\begin{gathered}
t\langle F \mid G\rangle=s\langle F \mid G\rangle \\
(t-s)\langle F \mid G\rangle=0
\end{gathered}
\)

If $s \neq t$, then

\(
\begin{equation}
\langle F \mid G\rangle=0 \tag{7.22}
\end{equation}
\)

We have proved that two eigenfunctions of a Hermitian operator that correspond to different eigenvalues are orthogonal. The question now is: Can we have two independent eigenfunctions that have the same eigenvalue? The answer is yes. In the case of degeneracy, we have the same eigenvalue for more than one independent eigenfunction. Therefore, we can only be certain that two independent eigenfunctions of a Hermitian operator are orthogonal to each other if they do not correspond to a degenerate eigenvalue. We now show that in the case of degeneracy we may construct eigenfunctions that will be orthogonal to one another. We shall use the theorem proved in Section 3.6, that any linear combination of eigenfunctions corresponding to a degenerate eigenvalue is an eigenfunction with the same eigenvalue. Let us therefore suppose that $F$ and $G$ are independent eigenfunctions that have the same eigenvalue:

\(
\hat{B} F=s F, \quad \hat{B} G=s G
\)

We take linear combinations of $F$ and $G$ to form two new eigenfunctions $g_{1}$ and $g_{2}$ that will be orthogonal to each other. We choose

\(
g_{1} \equiv F, \quad g_{2} \equiv G+c F
\)

where the constant $c$ will be chosen to ensure orthogonality. We want

\(
\begin{gathered}
\int g_{1}^{*} g_{2} d \tau=0 \\
\int F^{*}(G+c F) d \tau=\int F^{*} G d \tau+c \int F^{*} F d \tau=0
\end{gathered}
\)

Hence choosing

\(
\begin{equation}
c=-\int F^{*} G d \tau / \int F^{*} F d \tau \tag{7.23}
\end{equation}
\)

we have two orthogonal eigenfunctions $g_{1}$ and $g_{2}$ corresponding to the degenerate eigenvalue. This procedure (called Schmidt or Gram-Schmidt orthogonalization) can be extended to the case of $n$-fold degeneracy to give $n$ linearly independent orthogonal eigenfunctions corresponding to the degenerate eigenvalue.
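The same recipe is easy to state as code. The sketch below (Python with numpy; a grid sum stands in for $\int f^{*} g\, d\tau$, and the two overlapping Gaussians are arbitrary stand-ins for degenerate eigenfunctions $F$ and $G$):

```python
import numpy as np

def gram_schmidt(funcs, dx):
    """Orthonormalize a list of discretized functions.

    The inner product <f|g> is approximated by sum(conj(f) * g) * dx.
    """
    ortho = []
    for f in funcs:
        g = f.astype(complex)
        for e in ortho:
            g = g - np.sum(np.conj(e) * g) * dx * e          # subtract projection; cf. Eq. (7.23)
        g = g / np.sqrt((np.sum(np.conj(g) * g) * dx).real)  # normalize; Eq. (7.25)
        ortho.append(g)
    return ortho

x = np.linspace(-8.0, 8.0, 2001)
dx = x[1] - x[0]
F = np.exp(-x**2 / 2.0)
G = np.exp(-(x - 0.5)**2 / 2.0)        # linearly independent of F but not orthogonal to it
g1, g2 = gram_schmidt([F, G], dx)
print(np.sum(np.conj(g1) * g2) * dx)   # ~0: the new pair is orthogonal
```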

Thus, although there is no guarantee that the eigenfunctions of a degenerate eigenvalue are orthogonal, we can always choose them to be orthogonal, if we desire, by using the Schmidt (or some other) orthogonalization method. In fact, unless stated otherwise, we shall always assume that we have chosen the eigenfunctions to be orthogonal:

\(
\begin{equation}
\int g_{i}^{*} g_{k} d \tau=0, \quad i \neq k \tag{7.24}
\end{equation}
\)

where $g_{i}$ and $g_{k}$ are independent eigenfunctions of a Hermitian operator. We have proved:
THEOREM 2. Two eigenfunctions of a Hermitian operator $\hat{B}$ that correspond to different eigenvalues are orthogonal. Eigenfunctions of $\hat{B}$ that belong to a degenerate eigenvalue can always be chosen to be orthogonal.

An eigenfunction can usually be multiplied by a constant to normalize it, and we shall assume, unless stated otherwise, that all eigenfunctions are normalized:

\(
\begin{equation}
\int g_{i}^{*} g_{i} d \tau=1 \tag{7.25}
\end{equation}
\)

The exception is where the eigenvalues form a continuum, rather than a discrete set of values. In this case, the eigenfunctions are not quadratically integrable. Examples are the linear-momentum eigenfunctions, the free-particle energy eigenfunctions, and the hydrogen-atom continuum energy eigenfunctions.

Using the Kronecker delta, defined by $\delta_{i k} \equiv 1$ if $i=k$ and $\delta_{i k} \equiv 0$ if $i \neq k$ [Eq. (2.28)], we can combine (7.24) and (7.25) into one equation:

\(
\begin{equation}
\int g_{i}^{*} g_{k} d \tau=\langle i \mid k\rangle=\delta_{i k} \tag{7.26}
\end{equation}
\)

where $g_{i}$ and $g_{k}$ are eigenfunctions of some Hermitian operator.
As an example, consider the spherical harmonics. We shall prove that

\(
\begin{equation}
\int_{0}^{2 \pi} \int_{0}^{\pi}\left[Y_{l}^{m}(\theta, \phi)\right]^{*} Y_{l^{\prime}}^{m^{\prime}}(\theta, \phi) \sin \theta d \theta d \phi=\delta_{l, l^{\prime}} \delta_{m, m^{\prime}} \tag{7.27}
\end{equation}
\)

where the $\sin \theta$ factor comes from the volume element in spherical coordinates, (5.78). The spherical harmonics are eigenfunctions of the Hermitian operator $\hat{L}^{2}$ [Eq. (5.104)]. Since eigenfunctions of a Hermitian operator belonging to different eigenvalues are orthogonal, we conclude that the integral in (7.27) is zero unless $l=l^{\prime}$. Similarly, since the $Y_{l}^{m}$ functions are eigenfunctions of $\hat{L}_{z}$ [Eq. (5.105)], we conclude that the integral in (7.27) is zero unless $m=m^{\prime}$. Also, the multiplicative constant in $Y_{l}^{m}$ [Eq. (5.147) of Prob. 5.34] has been chosen so that the spherical harmonics are normalized [Eq. (6.117)]. Therefore (7.27) is valid.
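Equation (7.27) can also be spot-checked numerically. A minimal sketch (Python with scipy; note that scipy.special.sph_harm lists the azimuthal angle first and calls it theta, the reverse of the convention used here, and the grid sum is only an approximation to the double integral):

```python
import numpy as np
from scipy.special import sph_harm

def overlap(l, m, lp, mp, n=400):
    """Grid approximation to the integral in Eq. (7.27)."""
    theta = np.linspace(0.0, np.pi, n)                          # polar angle
    phi = np.linspace(0.0, 2.0 * np.pi, 2 * n, endpoint=False)  # azimuthal angle
    T, P = np.meshgrid(theta, phi, indexing="ij")
    Y1 = sph_harm(m, l, P, T)    # scipy argument order: (m, l, azimuthal, polar)
    Y2 = sph_harm(mp, lp, P, T)
    dt, dp = theta[1] - theta[0], phi[1] - phi[0]
    return np.sum(np.conj(Y1) * Y2 * np.sin(T)) * dt * dp

print(abs(overlap(1, 0, 1, 0)))    # ~1: normalized
print(abs(overlap(1, 0, 2, 0)))    # ~0: different l
print(abs(overlap(2, 1, 2, -1)))   # ~0: different m
```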

The integral $\langle f| \hat{B}|g\rangle$ can be simplified if either $f$ or $g$ is an eigenfunction of the Hermitian operator $\hat{B}$. If $\hat{B} g=c g$, where $c$ is a constant, then

\(
\langle f| \hat{B}|g\rangle=\langle f \mid \hat{B} g\rangle=\langle f \mid c g\rangle=c\langle f \mid g\rangle
\)

If $\hat{B} f=k f$, where $k$ is a constant, then use of the Hermitian property of $\hat{B}$ gives

\(
\langle f| \hat{B}|g\rangle=\langle g| \hat{B}|f\rangle^{*}=\langle g \mid \hat{B} f\rangle^{*}=\langle g \mid k f\rangle^{*}=k^{*}\langle g \mid f\rangle^{*}=k\langle f \mid g\rangle
\)

since the eigenvalue $k$ is real. The relation $\langle f| \hat{B}|g\rangle=k\langle f \mid g\rangle$ shows that the Hermitian operator $\hat{B}$ can act to the left in $\langle f| \hat{B}|g\rangle$.

A proof of the uncertainty principle is outlined in Prob. 7.60.


In the previous section, we proved the orthogonality of the eigenfunctions of a Hermitian operator. We now discuss another important property of these functions; this property allows us to expand an arbitrary well-behaved function in terms of these eigenfunctions.

We have used the Taylor-series expansion (Prob. 4.1) of a function as a linear combination of the nonnegative integral powers of $(x-a)$. Can we expand a function as a linear combination of some other set of functions besides $1,(x-a),(x-a)^{2}, \ldots$ ? The answer is yes, as was first shown by Fourier in 1807. A Fourier series is an expansion of a function as a linear combination of an infinite number of sine and cosine functions. We shall not go into detail about Fourier series, but shall simply look at one example.

Expansion of a Function Using Particle-in-a-Box Wave Functions

Let us consider expanding a function in terms of the particle-in-a-box stationary-state wave functions, which are [Eq. (2.23)]

\(
\begin{equation}
\psi_{n}=\left(\frac{2}{l}\right)^{1 / 2} \sin \left(\frac{n \pi x}{l}\right), \quad n=1,2,3, \ldots \tag{7.28}
\end{equation}
\)

for $x$ between 0 and $l$. What are our chances for representing an arbitrary function $f(x)$ in the interval $0 \leq x \leq l$ by a series of the form

\(
\begin{equation}
f(x)=\sum_{n=1}^{\infty} a_{n} \psi_{n}=\left(\frac{2}{l}\right)^{1 / 2} \sum_{n=1}^{\infty} a_{n} \sin \left(\frac{n \pi x}{l}\right), \quad 0 \leq x \leq l \tag{7.29}
\end{equation}
\)

where the $a_{n}$ 's are constants? Substitution of $x=0$ and $x=l$ in (7.29) gives the restrictions that $f(0)=0$ and $f(l)=0$. In other words, $f(x)$ must satisfy the same boundary conditions as the $\psi_{n}$ functions. We shall also assume that $f(x)$ is finite, single-valued, and continuous, but not necessarily differentiable. With these assumptions it can be shown that the expansion (7.29) is valid. We shall not prove (7.29) but will simply illustrate its use to represent a function.

Before we can apply (7.29) to a specific $f(x)$, we must derive an expression for the expansion coefficients $a_{n}$. We start by multiplying (7.29) by $\psi_{m}^{*}$ :

\(
\begin{equation}
\psi_{m}^{*} f(x)=\sum_{n=1}^{\infty} a_{n} \psi_{m}^{*} \psi_{n}=\left(\frac{2}{l}\right) \sum_{n=1}^{\infty} a_{n} \sin \left(\frac{n \pi x}{l}\right) \sin \left(\frac{m \pi x}{l}\right) \tag{7.30}
\end{equation}
\)

Now we integrate this equation from 0 to $l$. Assuming the validity of interchanging the integration and the infinite summation, we have

\(
\int_{0}^{l} \psi_{m}^{*} f(x) d x=\sum_{n=1}^{\infty} a_{n} \int_{0}^{l} \psi_{m}^{*} \psi_{n} d x=\sum_{n=1}^{\infty} a_{n}\left(\frac{2}{l}\right) \int_{0}^{l} \sin \left(\frac{n \pi x}{l}\right) \sin \left(\frac{m \pi x}{l}\right) d x
\)

We proved the orthonormality of the particle-in-a-box wave functions [Eq. (2.27)]. Therefore, the last equation becomes

\(
\begin{equation}
\int_{0}^{l} \psi_{m}^{*} f(x) d x=\sum_{n=1}^{\infty} a_{n} \delta_{m n} \tag{7.31}
\end{equation}
\)

The type of sum in (7.31) occurs often. Writing it in detail, we have

\(
\begin{gather}
\sum_{n=1}^{\infty} a_{n} \delta_{m n}=a_{1} \delta_{m, 1}+a_{2} \delta_{m, 2}+\cdots+a_{m} \delta_{m, m}+a_{m+1} \delta_{m, m+1}+\cdots \\
=0+0+\cdots+a_{m}+0+\cdots \\
\sum_{n=1}^{\infty} a_{n} \delta_{m n}=a_{m} \tag{7.32}
\end{gather}
\)

Thus, since $\delta_{m n}$ is zero except when the summation index $n$ is equal to $m$, all terms but one vanish, and (7.31) becomes

\(
\begin{equation}
a_{m}=\int_{0}^{l} \psi_{m}^{*} f(x) d x \tag{7.33}
\end{equation}
\)

which is the desired expression for the expansion coefficients.

FIGURE 7.1 Function to be expanded in terms of particle-in-a-box functions.

Changing $m$ to $n$ in (7.33) and substituting it into (7.29), we have

\(
\begin{equation}
f(x)=\sum_{n=1}^{\infty}\left[\int_{0}^{l} \psi_{n}^{*} f(x) d x\right] \psi_{n}(x) \tag{7.34}
\end{equation}
\)

This is the desired expression for the expansion of an arbitrary well-behaved function $f(x)(0 \leq x \leq l)$ as a linear combination of the particle-in-a-box wave functions $\psi_{n}$. Note that the definite integral $\int_{0}^{l} \psi_{n}^{*} f(x) d x$ is a number and not a function of $x$.

We now use (7.29) to represent a specific function, the function of Fig. 7.1, which is defined by

\(
\begin{array}{ll}
f(x)=x & \text { for } 0 \leq x \leq \frac{1}{2} l \\
f(x)=l-x & \text { for } \frac{1}{2} l \leq x \leq l \tag{7.35}
\end{array}
\)

To find the expansion coefficients $a_{n}$, we substitute (7.28) and (7.35) into (7.33):

\(
\begin{aligned}
& a_{n}=\int_{0}^{l} \psi_{n}^{*} f(x) d x=\left(\frac{2}{l}\right)^{1 / 2} \int_{0}^{l} \sin \left(\frac{n \pi x}{l}\right) f(x) d x \\
& a_{n}=\left(\frac{2}{l}\right)^{1 / 2} \int_{0}^{l / 2} x \sin \left(\frac{n \pi x}{l}\right) d x+\left(\frac{2}{l}\right)^{1 / 2} \int_{l / 2}^{l}(l-x) \sin \left(\frac{n \pi x}{l}\right) d x
\end{aligned}
\)

Using the Appendix integral (A.1), we find

\(
\begin{equation}
a_{n}=\frac{(2 l)^{3 / 2}}{n^{2} \pi^{2}} \sin \left(\frac{n \pi}{2}\right) \tag{7.36}
\end{equation}
\)

Using (7.36) in the expansion (7.29), we have [since $\sin (n \pi / 2)$ equals zero for $n$ even and equals +1 or -1 for $n$ odd]

\(
\begin{align}
& f(x)=\frac{4 l}{\pi^{2}}\left[\sin \left(\frac{\pi x}{l}\right)-\frac{1}{3^{2}} \sin \left(\frac{3 \pi x}{l}\right)+\frac{1}{5^{2}} \sin \left(\frac{5 \pi x}{l}\right)-\cdots\right] \\
& f(x)=\frac{4 l}{\pi^{2}} \sum_{n=1}^{\infty}(-1)^{n+1} \frac{1}{(2 n-1)^{2}} \sin \left[(2 n-1) \frac{\pi x}{l}\right] \tag{7.37}
\end{align}
\)

where $f(x)$ is given by (7.35). Let us check (7.37) at $x=\frac{1}{2} l$. We have

\(
\begin{equation}
f\left(\frac{l}{2}\right)=\frac{4 l}{\pi^{2}}\left(1+\frac{1}{3^{2}}+\frac{1}{5^{2}}+\frac{1}{7^{2}}+\cdots\right) \tag{7.38}
\end{equation}
\)

FIGURE 7.2 Plots of (a) the error and (b) the percent error in the expansion of the function of Fig. 7.1 in terms of particle-in-a-box wave functions when 1 and 5 terms are taken in the expansion.

Tabulating the right side of (7.38) as a function of the number of terms we take in the infinite series, we get:

Number of terms: 1, 2, 3, 4, 5, 20, 100
Right side of (7.38): $0.405 l$, $0.450 l$, $0.467 l$, $0.475 l$, $0.480 l$, $0.495 l$, $0.499 l$

If we take an infinite number of terms, the series should sum to $\frac{1}{2} l$, which is the value of $f\left(\frac{1}{2} l\right)$. Assuming the validity of the series, we have the interesting result that the infinite sum in parentheses in (7.38) equals $\pi^{2} / 8$. Figure 7.2 plots $f(x)-\sum_{n=1}^{k} a_{n} \psi_{n}$ [where $f, a_{n}$, and $\psi_{n}$ are given by (7.35), (7.36), and (7.28)] for $k$ values of 1 and 5. As $k$, the number of terms in the expansion, increases, the series comes closer to $f(x)$, and the difference between $f$ and the series goes to zero.
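The tabulated partial sums are easy to regenerate; a few lines of Python summing the series in (7.38):

```python
import math

s, n = 0.0, 0
for kmax in (1, 2, 3, 4, 5, 20, 100):
    while n < kmax:
        n += 1
        s += 1.0 / (2 * n - 1)**2                 # next term of the series in (7.38)
    print(kmax, round(4.0 * s / math.pi**2, 3))   # right side of (7.38), in units of l
```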

Expansion of a Function in Terms of Eigenfunctions

We have seen an example of the expansion of a function in terms of a set of functions, the particle-in-a-box energy eigenfunctions. Many different sets of functions can be used to expand an arbitrary function. A set of functions $g_{1}, g_{2}, \ldots, g_{i}, \ldots$ is said to be a complete set if every well-behaved function $f$ that obeys the same boundary conditions as the $g_{i}$ functions can be expanded as a linear combination of the $g_{i}$ 's according to

\(
\begin{equation}
f=\sum_{i} a_{i} g_{i} \tag{7.39}
\end{equation}
\)

where the $a_{i}$ 's are constants. Of course, it is understood that $f$ and the $g_{i}$ 's are all functions of the same set of variables. The limits have been omitted from the sum in (7.39). It is understood that this sum goes over all members of the complete set. By virtue of theorems of Fourier analysis (which we have not proved), the particle-in-a-box energy eigenfunctions can be shown to be a complete set.

We now postulate that the set of eigenfunctions of every Hermitian operator that represents a physical quantity is a complete set. (Completeness of the eigenfunctions can be proved in many cases, but must be postulated in the general case.) Thus, every well-behaved function that satisfies the same boundary conditions as the set of eigenfunctions can be expanded according to (7.39). Equation (7.29) is an example of (7.39).

The harmonic-oscillator wave functions are given by a Hermite polynomial $H_{v}$ times an exponential factor [Eq. (4.86) of Prob. 4.21c]. By virtue of the expansion postulate, any well-behaved function $f(x)$ can be expanded as a linear combination of harmonic-oscillator energy eigenfunctions:

\(
f(x)=\sum_{n=0}^{\infty} a_{n}\left(2^{n} n!\right)^{-1 / 2}(\alpha / \pi)^{1 / 4} H_{n}\left(\alpha^{1 / 2} x\right) e^{-\alpha x^{2} / 2}
\)
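
As an illustration of this expansion, the sketch below computes the first few coefficients by numerical quadrature; the choices $\alpha=1$ and the Gaussian trial function are assumptions made here for convenience, not taken from the text:

```python
# A sketch of the expansion postulate in the harmonic-oscillator basis.
# alpha = 1 and the Gaussian trial function f are illustrative choices.
import math
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_hermite

alpha = 1.0

def ho_eigenfunction(n, x):
    """(2^n n!)^(-1/2) (alpha/pi)^(1/4) H_n(alpha^(1/2) x) exp(-alpha x^2/2)."""
    norm = (2.0**n * math.factorial(n))**-0.5 * (alpha / math.pi)**0.25
    return norm * eval_hermite(n, math.sqrt(alpha) * x) * np.exp(-alpha * x**2 / 2)

f = lambda x: np.exp(-x**2)        # well-behaved trial function to expand

# a_n is the overlap integral of phi_n with f, as in Eq. (7.40)
a = [quad(lambda x: ho_eigenfunction(n, x) * f(x), -np.inf, np.inf)[0]
     for n in range(8)]

x0 = 0.7
partial = sum(a[n] * ho_eigenfunction(n, x0) for n in range(8))
print(partial, f(x0))              # the 8-term partial sum is already close
```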

How about using the hydrogen-atom bound-state wave functions to expand an arbitrary function $f(r, \theta, \phi)$ ? The answer is that these functions do not form a complete set, and we cannot expand $f$ using them. To have a complete set, we must use all the eigenfunctions of a particular Hermitian operator. In addition to the bound-state eigenfunctions of the hydrogen-atom Hamiltonian, we have the continuum eigenfunctions, corresponding to ionized states. If the continuum eigenfunctions are included along with the bound-state eigenfunctions, then we have a complete set. (For the particle in a box and the harmonic oscillator, there are no continuum functions.) Equation (7.39) implies an integration over the continuum eigenfunctions, if there are any. Thus, if $\psi_{n l m}(r, \theta, \phi)$ is a bound-state wave function of the hydrogen atom and $\psi_{E l m}(r, \theta, \phi)$ is a continuum eigenfunction, then (7.39) becomes

\(
f(r, \theta, \phi)=\sum_{n=1}^{\infty} \sum_{l=0}^{n-1} \sum_{m=-l}^{l} a_{n l m} \psi_{n l m}(r, \theta, \phi)+\sum_{l=0}^{\infty} \sum_{m=-l}^{l} \int_{0}^{\infty} a_{l m}(E) \psi_{E l m}(r, \theta, \phi) d E
\)

As another example, consider the eigenfunctions of $\hat{p}_{x}$ [Eq. (3.36)]:

\(
g_{k}=e^{i k x / \hbar}, \quad-\infty<k<\infty
\)

Here the eigenvalues are all continuous, and the eigenfunction expansion (7.39) of an arbitrary function $f$ becomes

\(
f(x)=\int_{-\infty}^{\infty} a(k) e^{i k x / \hbar} d k
\)

The reader with a good mathematical background may recognize this integral as very nearly the Fourier transform of $a(k)$.

Let us evaluate the expansion coefficients in $f=\sum_{i} a_{i} g_{i}$ [Eq. (7.39)], where the $g_{i}$ functions are the complete set of eigenfunctions of a Hermitian operator. The procedure is the same as that used to derive (7.33). We multiply $f=\sum_{i} a_{i} g_{i}$ by $g_{k}^{*}$ and integrate over all space:

\(
\begin{gather}
g_{k}^{*} f=\sum_{i} a_{i} g_{k}^{*} g_{i} \\
\int g_{k}^{*} f d \tau=\sum_{i} a_{i} \int g_{k}^{*} g_{i} d \tau=\sum_{i} a_{i} \delta_{i k}=a_{k} \\
a_{k}=\int g_{k}^{*} f d \tau \tag{7.40}
\end{gather}
\)

where we used the orthonormality of the eigenfunctions of a Hermitian operator: $\int g_{k}^{*} g_{i} d \tau=\delta_{i k}$ [Eq. (7.26)]. The procedure that led to (7.40) will be used often and is worth remembering. Substitution of (7.40) for $a_{i}$ in $f=\sum_{i} a_{i} g_{i}$ gives

\(
\begin{equation}
f=\sum_{i}\left[\int g_{i}^{*} f d \tau\right] g_{i}=\sum_{i}\left\langle g_{i} \mid f\right\rangle g_{i} \tag{7.41}
\end{equation}
\)

EXAMPLE

Let $F(x)=x(l-x)$ for $0 \leq x \leq l$ and $F(x)=0$ elsewhere. Expand $F$ in terms of the particle-in-a-box energy eigenfunctions $\psi_{n}=(2 / l)^{1 / 2} \sin (n \pi x / l)$ for $0 \leq x \leq l$.

We begin by noting that $F(0)=0$ and $F(l)=0$, so $F$ obeys the same boundary conditions as the $\psi_{n}$ 's and can be expanded using the $\psi_{n}$ 's. The expansion is $F=\sum_{n=1}^{\infty} a_{n} \psi_{n}$, where $a_{n}=\int \psi_{n}^{*} F d \tau$ [Eqs. (7.39) and (7.40)]. Thus

\(
a_{n}=\int \psi_{n}^{*} F d \tau=\left(\frac{2}{l}\right)^{1 / 2} \int_{0}^{l}\left(\sin \frac{n \pi x}{l}\right) x(l-x) d x=\frac{2^{3 / 2} l^{5 / 2}}{n^{3} \pi^{3}}\left[1-(-1)^{n}\right]
\)

where details of the integral evaluation are left as a problem (Prob. 7.18). The expansion $F=\sum_{n=1}^{\infty} a_{n} \psi_{n}$ is

\(
x(l-x)=\frac{4 l^{2}}{\pi^{3}} \sum_{n=1}^{\infty} \frac{1-(-1)^{n}}{n^{3}} \sin \frac{n \pi x}{l}, \quad \text { for } 0 \leq x \leq l
\)

EXERCISE Let $G(x)=1$ for $0 \leq x \leq l$ and $G(x)=0$ elsewhere. Expand $G$ in terms of the particle-in-a-box energy eigenfunctions. Since $G$ is not zero at 0 and at $l$, the expansion will not represent $G$ at these points but will represent $G$ elsewhere. Use the first 7 nonzero terms of the expansion to calculate $G$ at $x=\frac{1}{4} l$. Repeat this using the first 70 nonzero terms (use a programmable calculator). (Answers: 1.0044, 1.0064.)
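
For readers without a programmable calculator at hand, the exercise can be checked with a short Python sketch (again with $l$ set to 1):

```python
# A check of the exercise with l = 1: the nonzero coefficients occur for odd
# n, and each term of the expansion of G(x) = 1 is (4 / n pi) sin(n pi x / l).
import math

def G_approx(x, k, l=1.0):
    """Partial sum over the first k nonzero (odd-n) terms of the expansion."""
    return sum(4 / (n * math.pi) * math.sin(n * math.pi * x / l)
               for n in range(1, 2 * k, 2))

print(G_approx(0.25, 7))     # first 7 nonzero terms:  about 1.0044
print(G_approx(0.25, 70))    # first 70 nonzero terms: about 1.0064
```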

A useful theorem is the following:
THEOREM 3. Let the functions $g_{1}, g_{2}, \ldots$ be the complete set of eigenfunctions of the Hermitian operator $\hat{A}$, and let the function $F$ be an eigenfunction of $\hat{A}$ with eigenvalue $k$ (that is, $\hat{A} F=k F$). Then if $F$ is expanded as $F=\sum_{i} a_{i} g_{i}$, the only nonzero coefficients $a_{i}$ are those for which $g_{i}$ has the eigenvalue $k$. (Because of degeneracy, several $g_{i}$ 's may have the same eigenvalue $k$.)

Thus in the expansion of $F$, we include only those eigenfunctions that have the same eigenvalue as $F$. The proof of Theorem 3 follows at once from $a_{k}=\int g_{k}^{*} F d \tau$ [Eq. (7.40)]; if $F$ and $g_{k}$ correspond to different eigenvalues of the Hermitian operator $\hat{A}$, they will be orthogonal [Eq. (7.22)] and $a_{k}$ will vanish.

We shall occasionally use a notation (called ket notation) in which the function $f$ is denoted by the symbol $|f\rangle$. At this level the notation may seem to add little, but in advanced formulations of quantum mechanics it takes on a special significance. In ket notation, Eq. (7.41) reads

\(
\begin{equation}
|f\rangle=\sum_{i}\left|g_{i}\right\rangle\left\langle g_{i} \mid f\right\rangle=\sum_{i}|i\rangle\langle i \mid f\rangle \tag{7.42}
\end{equation}
\)

Ket notation is conveniently used to specify eigenfunctions by listing their eigenvalues. For example, the hydrogen-atom wave function with quantum numbers $n, l, m$ is denoted by $\psi_{n l m}=|n l m\rangle$.

The contents of Sections 7.2 and 7.3 can be summarized by the statement that the eigenfunctions of a Hermitian operator form a complete, orthonormal set, and the eigenvalues are real.


7.4 Eigenfunctions of Commuting Operators

If the state function $\Psi$ is simultaneously an eigenfunction of the two operators $\hat{A}$ and $\hat{B}$ with eigenvalues $a_{j}$ and $b_{j}$, respectively, then a measurement of the physical property $A$ will yield the result $a_{j}$ and a measurement of $B$ will yield $b_{j}$. Hence the two properties $A$ and $B$ have definite values when $\Psi$ is simultaneously an eigenfunction of $\hat{A}$ and $\hat{B}$.

In Section 5.1, some statements were made about simultaneous eigenfunctions of two operators. We now prove these statements.

First, we show that if there exists a common complete set of eigenfunctions for two linear operators then these operators commute. Let $\hat{A}$ and $\hat{B}$ denote two linear operators that have a common complete set of eigenfunctions $g_{1}, g_{2}, \ldots$ :

\(
\begin{equation}
\hat{A} g_{i}=a_{i} g_{i}, \quad \hat{B} g_{i}=b_{i} g_{i} \tag{7.43}
\end{equation}
\)

where $a_{i}$ and $b_{i}$ are the eigenvalues. We must prove that

\(
\begin{equation}
[\hat{A}, \hat{B}]=\hat{0} \tag{7.44}
\end{equation}
\)

Equation (7.44) is an operator equation. For two operators to be equal, the results of operating with either of them on an arbitrary well-behaved function $f$ must be the same. Hence we must show that

\(
(\hat{A} \hat{B}-\hat{B} \hat{A}) f=\hat{0} f=0
\)

where $f$ is an arbitrary function. We begin the proof by expanding $f$ (assuming that it obeys the proper boundary conditions) in terms of the complete set of eigenfunctions $g_{i}$ :

\(
f=\sum_{i} c_{i} g_{i}
\)

Operating on each side of this last equation with $\hat{A} \hat{B}-\hat{B} \hat{A}$, we have

\(
(\hat{A} \hat{B}-\hat{B} \hat{A}) f=(\hat{A} \hat{B}-\hat{B} \hat{A}) \sum_{i} c_{i} g_{i}
\)

Since the products $\hat{A} \hat{B}$ and $\hat{B} \hat{A}$ are linear operators (Prob. 3.16), we have

\(
(\hat{A} \hat{B}-\hat{B} \hat{A}) f=\sum_{i} c_{i}(\hat{A} \hat{B}-\hat{B} \hat{A}) g_{i}=\sum_{i} c_{i}\left[\hat{A}\left(\hat{B} g_{i}\right)-\hat{B}\left(\hat{A} g_{i}\right)\right]
\)

where the definitions of the sum and the product of operators were used. Use of the eigenvalue equations (7.43) gives

\(
(\hat{A} \hat{B}-\hat{B} \hat{A}) f=\sum_{i} c_{i}\left[\hat{A}\left(b_{i} g_{i}\right)-\hat{B}\left(a_{i} g_{i}\right)\right]=\sum_{i} c_{i}\left(b_{i} a_{i} g_{i}-a_{i} b_{i} g_{i}\right)=0
\)

This completes the proof of:
THEOREM 4. If the linear operators $\hat{A}$ and $\hat{B}$ have a common complete set of eigenfunctions, then $\hat{A}$ and $\hat{B}$ commute.

It is sometimes erroneously stated that if a common eigenfunction of $\hat{A}$ and $\hat{B}$ exists, then they commute. An example that shows this statement to be false is the fact that the
spherical harmonic $Y_{0}^{0}$ is an eigenfunction of both $\hat{L}_{z}$ and $\hat{L}_{x}$ even though these two operators do not commute (Section 5.3). It is instructive to examine the so-called proof that is given for this erroneous statement. Let $g$ be the common eigenfunction: $\hat{A} g=a g$ and $\hat{B} g=b g$. We have

\(
\begin{gather}
\hat{A} \hat{B} g=\hat{A} b g=a b g \quad \text { and } \quad \hat{B} \hat{A} g=\hat{B} a g=b a g=a b g \\
\hat{A} \hat{B} g=\hat{B} \hat{A} g \tag{7.45}
\end{gather}
\)

The "proof" is completed by canceling $g$ from each side of (7.45) to get

\(
\begin{equation}
\hat{A} \hat{B}=\hat{B} \hat{A}(?) \tag{7.46}
\end{equation}
\)

It is in going from (7.45) to (7.46) that the error occurs. Just because the two operators $\hat{A} \hat{B}$ and $\hat{B} \hat{A}$ give the same result when acting on the single function $g$ is no reason to conclude that $\hat{A} \hat{B}=\hat{B} \hat{A}$. (For example, $d / d x$ and $d^{2} / d x^{2}$ give the same result when operating on $e^{x}$, but $d / d x$ is certainly not equal to $d^{2} / d x^{2}$.) The two operators must give the same result when acting on every well-behaved function before we can conclude that they are equal. Thus, even though $\hat{A}$ and $\hat{B}$ do not commute, one or more common eigenfunctions of $\hat{A}$ and $\hat{B}$ might exist. However, we cannot have a common complete set of eigenfunctions of two noncommuting operators, as we proved earlier in this section.
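
The same point can be made with finite matrices. In the sketch below, the $3 \times 3$ matrices $A$ and $B$ are illustrative stand-ins (not the actual $\hat{L}_{z}$ and $\hat{L}_{x}$): they fail to commute, yet they share the single eigenvector $(1,0,0)$, just as $\hat{L}_{z}$ and $\hat{L}_{x}$ share $Y_{0}^{0}$:

```python
# Hypothetical 3x3 matrices (not the actual L_z and L_x): A and B do not
# commute, yet they share the single eigenvector (1, 0, 0), mirroring the
# way Y_0^0 is a common eigenfunction of two noncommuting operators.
import numpy as np

A = np.array([[0, 0, 0],
              [0, 0, 1],
              [0, 1, 0]], dtype=complex)     # a Pauli-x-like block
B = np.array([[0, 0, 0],
              [0, 0, -1j],
              [0, 1j, 0]], dtype=complex)    # a Pauli-y-like block

v = np.array([1, 0, 0], dtype=complex)       # common eigenvector, eigenvalue 0
print(np.allclose(A @ v, 0 * v), np.allclose(B @ v, 0 * v))  # True True
print(np.allclose(A @ B, B @ A))             # False: [A, B] is not zero
```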

We have shown that, if there exists a common complete set of eigenfunctions of the linear operators $\hat{A}$ and $\hat{B}$, then they commute. We now prove the following:

THEOREM 5. If the Hermitian operators $\hat{A}$ and $\hat{B}$ commute, we can select a common complete set of eigenfunctions for them.

The proof is as follows. Let the functions $g_{i}$ and the numbers $a_{i}$ be the eigenfunctions and eigenvalues of $\hat{A}$ :

\(
\hat{A} g_{i}=a_{i} g_{i}
\)

Operating on both sides of this equation with $\hat{B}$, we have

\(
\hat{B} \hat{A} g_{i}=\hat{B}\left(a_{i} g_{i}\right)
\)

Since $\hat{A}$ and $\hat{B}$ commute and since $\hat{B}$ is linear, we have

\(
\begin{equation}
\hat{A}\left(\hat{B} g_{i}\right)=a_{i}\left(\hat{B} g_{i}\right) \tag{7.47}
\end{equation}
\)

This equation states that the function $\hat{B} g_{i}$ is an eigenfunction of the operator $\hat{A}$ with the same eigenvalue $a_{i}$ as the eigenfunction $g_{i}$. Suppose the eigenvalues of $\hat{A}$ are nondegenerate, so that for any given eigenvalue $a_{i}$ one and only one linearly independent eigenfunction exists. If this is so, then the two eigenfunctions $g_{i}$ and $\hat{B} g_{i}$, which correspond to the same eigenvalue $a_{i}$, must be linearly dependent; that is, one function must be simply a multiple of the other:

\(
\begin{equation}
\hat{B} g_{i}=k_{i} g_{i} \tag{7.48}
\end{equation}
\)

where $k_{i}$ is a constant. This equation states that the functions $g_{i}$ are eigenfunctions of $\hat{B}$, which is what we wanted to prove. In Section 7.3, we postulated that the eigenfunctions of any operator that represents a physical quantity form a complete set. Hence the $g_{i}$ 's form a complete set.

We have just proved the desired theorem for the nondegenerate case, but what about the degenerate case? Let the eigenvalue $a_{i}$ be $n$-fold degenerate. We know from Eq. (7.47) that $\hat{B} g_{i}$ is an eigenfunction of $\hat{A}$ with eigenvalue $a_{i}$. Hence, Theorem 3 of Section 7.3 tells us that, if the function $\hat{B} g_{i}$ is expanded in terms of the complete set of eigenfunctions of $\hat{A}$,
then all the expansion coefficients will be zero except those for which the $\hat{A}$ eigenfunction has the eigenvalue $a_{i}$. In other words, $\hat{B} g_{i}$ must be a linear combination of the $n$ linearly independent $\hat{A}$ eigenfunctions that correspond to the eigenvalue $a_{i}$ :

\(
\begin{equation}
\hat{B} g_{i}=\sum_{k=1}^{n} c_{k} g_{k}, \quad \text { where } \quad \hat{A} g_{k}=a_{i} g_{k} \quad \text { for } k=1 \text { to } n \tag{7.49}
\end{equation}
\)

where $g_{1}, \ldots, g_{n}$ denote those $\hat{A}$ eigenfunctions that have the degenerate eigenvalue $a_{i}$. Equation (7.49) shows that $g_{i}$ is not necessarily an eigenfunction of $\hat{B}$. However, by taking suitable linear combinations of the $n$ linearly independent $\hat{A}$ eigenfunctions corresponding to the degenerate eigenvalue $a_{i}$, one can construct a new set of $n$ linearly independent eigenfunctions of $\hat{A}$ that will also be eigenfunctions of $\hat{B}$. Proof of this statement is given in Merzbacher, Section 8.5.

Thus, when $\hat{A}$ and $\hat{B}$ commute, it is always possible to select a common complete set of eigenfunctions for them. For example, consider the hydrogen atom, where the operators $\hat{L}_{z}$ and $\hat{H}$ were shown to commute. If we desired, we could take the phi factor in the eigenfunctions of $\hat{H}$ as $\sin m \phi$ and $\cos m \phi$ (Section 6.6). If we did this, we would not have eigenfunctions of $\hat{L}_{z}$, except for $m=0$. However, the linear combinations

\(
R(r) S(\theta)(\cos m \phi+i \sin m \phi)=R S e^{i m \phi}, \quad m=-l, \ldots, l
\)

give us eigenfunctions of $\hat{L}_{z}$ that are still eigenfunctions of $\hat{H}$ by virtue of the theorem in Section 3.6.

Extension of the above proofs to the case of more than two operators shows that for a set of Hermitian operators $\hat{A}, \hat{B}, \hat{C}, \ldots$ there exists a common complete set of eigenfunctions if and only if every operator commutes with every other operator.

A useful theorem that is related to Theorem 5 is:
THEOREM 6. If $g_{m}$ and $g_{n}$ are eigenfunctions of the Hermitian operator $\hat{A}$ with different eigenvalues (that is, if $\hat{A} g_{m}=a_{m} g_{m}$ and $\hat{A} g_{n}=a_{n} g_{n}$ with $a_{m} \neq a_{n}$), and if the linear operator $\hat{B}$ commutes with $\hat{A}$, then

\(
\begin{equation}
\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle=0 \quad \text { for } a_{n} \neq a_{m} \tag{7.50}
\end{equation}
\)

To prove (7.50), we start with

\(
\begin{equation}
\left\langle g_{n}\right| \hat{A} \hat{B}\left|g_{m}\right\rangle=\left\langle g_{n}\right| \hat{B} \hat{A}\left|g_{m}\right\rangle \tag{7.51}
\end{equation}
\)

Use of the Hermitian property of $\hat{A}$ gives the left side of (7.51) as

\(
\begin{aligned}
\left\langle g_{n}\right| \hat{A} \hat{B}\left|g_{m}\right\rangle=\left\langle g_{n}\right| \hat{A}\left|\hat{B} g_{m}\right\rangle=\left\langle\hat{B} g_{m}\right| \hat{A}\left|g_{n}\right\rangle^{*} & =\left\langle\hat{B} g_{m}\right| a_{n}\left|g_{n}\right\rangle^{*} \\
& =a_{n}^{*}\left\langle g_{n} \mid \hat{B} g_{m}\right\rangle=a_{n}\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle
\end{aligned}
\)

The right side of (7.51) is

\(
\left\langle g_{n}\right| \hat{B} \hat{A}\left|g_{m}\right\rangle=\left\langle g_{n}\right| \hat{B}\left|\hat{A} g_{m}\right\rangle=a_{m}\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle
\)

Equating the final expressions for the left and right sides of (7.51), we get

\(
\begin{gathered}
a_{n}\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle=a_{m}\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle \\
\left(a_{n}-a_{m}\right)\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle=0
\end{gathered}
\)

Since $a_{m} \neq a_{n}$, we have $\left\langle g_{n}\right| \hat{B}\left|g_{m}\right\rangle=0$, which completes the proof.


7.5 Parity

Certain quantum-mechanical operators have no classical analog. An example is the parity operator. Recall that the harmonic-oscillator wave functions are either even or odd. We shall show how this property is related to the parity operator.

The parity operator $\hat{\Pi}$ is defined in terms of its effect on an arbitrary function $f$ :

\(
\begin{equation}
\hat{\Pi} f(x, y, z)=f(-x,-y,-z) \tag{7.52}
\end{equation}
\)

The parity operator replaces each Cartesian coordinate with its negative. For example, $\hat{\Pi}\left(x^{2}-z e^{a y}\right)=x^{2}+z e^{-a y}$.

As with any quantum-mechanical operator, we are interested in the eigenvalues $c_{i}$ and the eigenfunctions $g_{i}$ of the parity operator:

\(
\begin{equation}
\hat{\Pi} g_{i}=c_{i} g_{i} \tag{7.53}
\end{equation}
\)

The key to the problem is to calculate the square of $\hat{\Pi}$ :

\(
\hat{\Pi}^{2} f(x, y, z)=\hat{\Pi}[\hat{\Pi} f(x, y, z)]=\hat{\Pi}[f(-x,-y,-z)]=f(x, y, z)
\)

Since $f$ is arbitrary, we conclude that $\hat{\Pi}^{2}$ equals the unit operator:

\(
\begin{equation}
\hat{\Pi}^{2}=\hat{1} \tag{7.54}
\end{equation}
\)

We now operate on (7.53) with $\hat{\Pi}$ to get $\hat{\Pi} \hat{\Pi} g_{i}=\hat{\Pi} c_{i} g_{i}$. Since $\hat{\Pi}$ is linear (Prob. 7.26), we have $\hat{\Pi}^{2} g_{i}=c_{i} \hat{\Pi} g_{i}$, which becomes

\(
\begin{equation}
\hat{\Pi}^{2} g_{i}=c_{i}^{2} g_{i} \tag{7.55}
\end{equation}
\)

where the eigenvalue equation (7.53) was used. Since $\hat{\Pi}^{2}$ is the unit operator, the left side of (7.55) is simply $g_{i}$, and

\(
g_{i}=c_{i}^{2} g_{i}
\)

The function $g_{i}$ cannot be zero everywhere (zero is always rejected as an eigenfunction on physical grounds). We can therefore divide by $g_{i}$ to get $c_{i}^{2}=1$ and

\(
\begin{equation}
c_{i}= \pm 1 \tag{7.56}
\end{equation}
\)

The eigenvalues of $\hat{\Pi}$ are +1 and -1 . Note that this derivation applies to any operator whose square is the unit operator.

What are the eigenfunctions $g_{i}$ ? The eigenvalue equation (7.53) reads

\(
\begin{gathered}
\hat{\Pi} g_{i}(x, y, z)= \pm g_{i}(x, y, z) \\
g_{i}(-x,-y,-z)= \pm g_{i}(x, y, z)
\end{gathered}
\)

If the eigenvalue is +1 , then $g_{i}(-x,-y,-z)=g_{i}(x, y, z)$ and $g_{i}$ is an even function. If the eigenvalue is -1 , then $g_{i}$ is odd: $g_{i}(-x,-y,-z)=-g_{i}(x, y, z)$. Hence, the eigenfunctions of the parity operator $\hat{\Pi}$ are all possible well-behaved even and odd functions.

When the parity operator commutes with the Hamiltonian operator $\hat{H}$, we can select a common set of eigenfunctions for these operators, as proved in Section 7.4. The eigenfunctions of $\hat{H}$ are the stationary-state wave functions $\psi_{i}$. Hence when

\(
\begin{equation}
[\hat{\Pi}, \hat{H}]=0 \tag{7.57}
\end{equation}
\)

the wave functions $\psi_{i}$ can be chosen to be eigenfunctions of $\hat{\Pi}$. We just proved that the eigenfunctions of $\hat{\Pi}$ are either even or odd. Hence, when (7.57) holds, each wave function can be chosen to be either even or odd. Let us find out when the parity and Hamiltonian operators commute.

We have, for a one-particle system,

\(
[\hat{H}, \hat{\Pi}]=[\hat{T}, \hat{\Pi}]+[\hat{V}, \hat{\Pi}]=-\frac{\hbar^{2}}{2 m}\left(\left[\frac{\partial^{2}}{\partial x^{2}}, \hat{\Pi}\right]+\left[\frac{\partial^{2}}{\partial y^{2}}, \hat{\Pi}\right]+\left[\frac{\partial^{2}}{\partial z^{2}}, \hat{\Pi}\right]\right)+[\hat{V}, \hat{\Pi}]
\)

Since

\(
\begin{aligned}
\hat{\Pi}\left[\frac{\partial^{2}}{\partial x^{2}} f(x, y, z)\right]=\frac{\partial}{\partial(-x)} \frac{\partial}{\partial(-x)} f(-x,-y,-z) & =\frac{\partial^{2}}{\partial x^{2}} f(-x,-y,-z) \\
& =\frac{\partial^{2}}{\partial x^{2}} \hat{\Pi} f(x, y, z)
\end{aligned}
\)

where $f$ is any function, we conclude that

\(
\left[\frac{\partial^{2}}{\partial x^{2}}, \hat{\Pi}\right]=0
\)

Similar equations hold for the $y$ and $z$ coordinates, and $[\hat{H}, \hat{\Pi}]$ becomes

\(
\begin{equation}
[\hat{H}, \hat{\Pi}]=[\hat{V}, \hat{\Pi}] \tag{7.58}
\end{equation}
\)

Now

\(
\begin{equation}
\hat{\Pi}[V(x, y, z) f(x, y, z)]=V(-x,-y,-z) f(-x,-y,-z) \tag{7.59}
\end{equation}
\)

If the potential energy is an even function, that is, if $V(-x,-y,-z)=V(x, y, z)$, then (7.59) becomes

\(
\hat{\Pi}[V(x, y, z) f(x, y, z)]=V(x, y, z) f(-x,-y,-z)=V(x, y, z) \hat{\Pi} f(x, y, z)
\)

so $[\hat{V}, \hat{\Pi}]=0$. Hence, when the potential energy is an even function, the parity operator commutes with the Hamiltonian:

\(
\begin{equation}
[\hat{H}, \hat{\Pi}]=0 \quad \text { if } V \text { is even } \tag{7.60}
\end{equation}
\)

These results are easily extended to the $n$-particle case. For an $n$-particle system, the parity operator is defined by

\(
\begin{equation}
\hat{\Pi} f\left(x_{1}, y_{1}, z_{1}, \ldots, x_{n}, y_{n}, z_{n}\right)=f\left(-x_{1},-y_{1},-z_{1}, \ldots,-x_{n},-y_{n},-z_{n}\right) \tag{7.61}
\end{equation}
\)

It is easy to see that (7.57) holds when

\(
\begin{equation}
V\left(x_{1}, y_{1}, z_{1}, \ldots, x_{n}, y_{n}, z_{n}\right)=V\left(-x_{1},-y_{1},-z_{1}, \ldots,-x_{n},-y_{n},-z_{n}\right) \tag{7.62}
\end{equation}
\)

If $V$ satisfies this equation, $V$ is said to be an even function of the $3 n$ coordinates. In summary, we have:

THEOREM 7. When the potential energy $V$ is an even function, we can choose the stationary-state wave functions so that each $\psi_{i}$ is either an even function or an odd function.

A function that is either even or odd is said to be of definite parity.
If the energy levels are all nondegenerate (as is usually true in one-dimensional problems), then only one independent wave function corresponds to each energy eigenvalue and there is no element of choice (apart from an arbitrary multiplicative constant) in the wave functions. Thus, for the nondegenerate case, the stationary-state wave functions must be of definite parity when $V$ is an even function. For example, the one-dimensional harmonic oscillator has $V=\frac{1}{2} k x^{2}$, which is an even function, and the wave functions have definite parity.
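
Theorem 7 and the commutator result (7.60) are easy to verify numerically. The sketch below builds a finite-difference Hamiltonian for the harmonic oscillator on a symmetric grid (in assumed units with $\hbar=m=k=1$), represents $\hat{\Pi}$ by the grid-reversal matrix, and checks both the commutator and the definite parity of the lowest eigenvectors:

```python
# A numerical sketch of (7.60) and Theorem 7 for V = x^2/2 (even), in assumed
# units hbar = m = k = 1: the finite-difference Hamiltonian commutes with the
# grid-reversal (parity) matrix, and the nondegenerate eigenvectors come out
# with definite parity.
import numpy as np

N, L = 401, 10.0
x = np.linspace(-L / 2, L / 2, N)
h = x[1] - x[0]

T = (-0.5 / h**2) * (np.diag(np.ones(N - 1), 1) - 2 * np.eye(N)
                     + np.diag(np.ones(N - 1), -1))   # central-difference KE
H = T + np.diag(0.5 * x**2)                           # even potential energy

P = np.fliplr(np.eye(N))                              # parity: x -> -x
print(np.allclose(H @ P, P @ H))                      # True: [H, Pi] = 0

E, psi = np.linalg.eigh(H)
print(np.allclose(P @ psi[:, 0], psi[:, 0], atol=1e-6))    # ground state even
print(np.allclose(P @ psi[:, 1], -psi[:, 1], atol=1e-6))   # first excited odd
```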

The hydrogen-atom potential-energy function is even, and the hydrogenlike orbitals can be chosen to have definite parity (Probs. 7.22 and 7.29).

For the degenerate case, we have an element of choice in the wave functions, since an arbitrary linear combination of the functions corresponding to the degenerate level is an eigenfunction of $\hat{H}$. For a degenerate energy level, by taking appropriate linear combinations we can choose wave functions that are of definite parity, but there is no necessity that they be of definite parity.

Parity aids in evaluating integrals. We showed that $\int_{-\infty}^{\infty} f(x) d x=0$ when $f(x)$ is an odd function [Eq. (4.51)]. Let us extend this result to the $3 n$-dimensional case. An odd function of $3 n$ variables satisfies

\(
\begin{equation}
g\left(-x_{1},-y_{1},-z_{1}, \ldots,-x_{n},-y_{n},-z_{n}\right)=-g\left(x_{1}, y_{1}, z_{1}, \ldots, x_{n}, y_{n}, z_{n}\right) \tag{7.63}
\end{equation}
\)

If $g$ is an odd function of the $3 n$ variables, then

\(
\begin{equation}
\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g\left(x_{1}, \ldots, z_{n}\right) d x_{1} \cdots d z_{n}=0 \tag{7.64}
\end{equation}
\)

where the integration is over the $3 n$ coordinates. This equation holds because the contribution to the integral from the value of $g$ at $\left(x_{1}, y_{1}, z_{1}, \ldots, x_{n}, y_{n}, z_{n}\right)$ is canceled by the contribution from $\left(-x_{1},-y_{1},-z_{1}, \ldots,-x_{n},-y_{n},-z_{n}\right)$. Equation (7.64) also holds when the integrand is an odd function of some (but not necessarily all) of the variables. See Prob. 7.30.


7.6 Measurement and the Superposition of States

Quantum mechanics can be regarded as a scheme for calculating the probabilities of the various possible outcomes of a measurement. For example, if we know the state function $\Psi(x, t)$, then the probability that a measurement at time $t$ of the particle's position yields a value between $x$ and $x+d x$ is given by $|\Psi(x, t)|^{2} d x$. We now consider measurement of the general property $B$. Our aim is to find out how to use $\Psi$ to calculate the probabilities for each possible result of a measurement of $B$. The results of this section, which tell us what information is contained in the state function $\Psi$, lie at the heart of quantum mechanics.

We shall deal with an $n$-particle system and use $q$ to symbolize the $3 n$ coordinates. We have postulated that the eigenvalues $b_{i}$ of the operator $\hat{B}$ are the only possible results of a measurement of the property $B$. Using $g_{i}$ for the eigenfunctions of $\hat{B}$, we have

\(
\begin{equation}
\hat{B} g_{i}(q)=b_{i} g_{i}(q) \tag{7.65}
\end{equation}
\)

We postulated in Section 7.3 that the eigenfunctions of any Hermitian operator that represents a physically observable property form a complete set. Since the $g_{i}$ 's form a complete set, we can expand the state function $\Psi$ as

\(
\begin{equation}
\Psi(q, t)=\sum_{i} c_{i}(t) g_{i}(q) \tag{7.66}
\end{equation}
\)

To allow for the change of $\Psi$ with time, the expansion coefficients $c_{i}$ vary with time.
Since $|\Psi|^{2}$ is a probability density, we require that

\(
\begin{equation}
\int \Psi^{*} \Psi d \tau=1 \tag{7.67}
\end{equation}
\)

Substituting (7.66) into the normalization condition and using (1.33) and (1.32), we get

\(
\begin{equation}
1=\int \sum_{i} c_{i}^{*} g_{i}^{*} \sum_{i} c_{i} g_{i} d \tau=\int \sum_{i} c_{i}^{*} g_{i}^{*} \sum_{k} c_{k} g_{k} d \tau=\int \sum_{i} \sum_{k} c_{i}^{*} c_{k} g_{i}^{*} g_{k} d \tau \tag{7.68}
\end{equation}
\)

Since the summation indexes in the two sums in (7.68) need not have the same value, different symbols must be used for these two dummy indexes. For example, consider the following product of two sums:

\(
\sum_{i=1}^{2} s_{i} \sum_{i=1}^{2} t_{i}=\left(s_{1}+s_{2}\right)\left(t_{1}+t_{2}\right)=s_{1} t_{1}+s_{1} t_{2}+s_{2} t_{1}+s_{2} t_{2}
\)

If we carelessly write

\(
\sum_{i=1}^{2} s_{i} \sum_{i=1}^{2} t_{i} \stackrel{\text { (wrong) }}{=} \sum_{i=1}^{2} \sum_{i=1}^{2} s_{i} t_{i}=\sum_{i=1}^{2}\left(s_{1} t_{1}+s_{2} t_{2}\right)=2\left(s_{1} t_{1}+s_{2} t_{2}\right)
\)

we get the wrong answer. The correct way to write the product is

\(
\sum_{i=1}^{2} s_{i} \sum_{i=1}^{2} t_{i}=\sum_{i=1}^{2} s_{i} \sum_{k=1}^{2} t_{k}=\sum_{i=1}^{2} \sum_{k=1}^{2} s_{i} t_{k}=\sum_{i=1}^{2}\left(s_{i} t_{1}+s_{i} t_{2}\right)=s_{1} t_{1}+s_{1} t_{2}+s_{2} t_{1}+s_{2} t_{2}
\)

which gives the right answer.
Assuming the validity of interchanging the infinite summation and the integration in (7.68), we have

\(
\sum_{i} \sum_{k} c_{i}^{*} c_{k} \int g_{i}^{*} g_{k} d \tau=1
\)

Since $\hat{B}$ is Hermitian, its eigenfunctions $g_{i}$ are orthonormal [Eq. (7.26)]; hence

\(
\begin{align}
\sum_{i} \sum_{k} c_{i}^{*} c_{k} \delta_{i k} & =1 \\
\sum_{i}\left|c_{i}\right|^{2} & =1 \tag{7.69}
\end{align}
\)

We shall point out the significance of (7.69) shortly.
Recall the postulate (Section 3.7) that, if $\Psi$ is the normalized state function of a system, then the average value of the property $B$ is

\(
\langle B\rangle=\int \Psi^{*}(q, t) \hat{B} \Psi(q, t) d \tau
\)

Using the expansion (7.66) in the average-value expression, we have

\(
\langle B\rangle=\int \sum_{i} c_{i}^{*} g_{i}^{*} \hat{B} \sum_{k} c_{k} g_{k} d \tau=\sum_{i} \sum_{k} c_{i}^{*} c_{k} \int g_{i}^{*} \hat{B} g_{k} d \tau
\)

where the linearity of $\hat{B}$ was used. Use of $\hat{B} g_{k}=b_{k} g_{k}$ [Eq. (7.65)] gives

\(
\begin{gather}
\langle B\rangle=\sum_{i} \sum_{k} c_{i}^{*} c_{k} b_{k} \int g_{i}^{*} g_{k} d \tau=\sum_{i} \sum_{k} c_{i}^{*} c_{k} b_{k} \delta_{i k} \\
\langle B\rangle=\sum_{i}\left|c_{i}\right|^{2} b_{i} \tag{7.70}
\end{gather}
\)

How do we interpret (7.70)? We postulated in Section 3.3 that the eigenvalues of an operator are the only possible numbers we can get when we measure the property that the operator represents. In any measurement of $B$, we get one of the values $b_{i}$ (assuming there is no experimental error). Now recall Eq. (3.81):

\(
\begin{equation}
\langle B\rangle=\sum_{b_{i}} P\left(b_{i}\right) b_{i} \tag{7.71}
\end{equation}
\)

where $P\left(b_{i}\right)$ is the probability of getting $b_{i}$ in a measurement of $B$. The sum in (7.71) goes over the different eigenvalues $b_{i}$, whereas the sum in (7.70) goes over the different eigenfunctions $g_{i}$, since the expansion (7.66) is over the $g_{i}$ 's. If there is only one independent eigenfunction for each eigenvalue, then a sum over eigenfunctions is the same as a sum over eigenvalues, and comparison of (7.71) and (7.70) shows that, when there is no degeneracy in the $\hat{B}$ eigenvalues, $\left|c_{i}\right|^{2}$ is the probability of getting the value $b_{i}$ in a measurement of the property $B$. Note that the $\left|c_{i}\right|^{2}$ values sum to 1, as probabilities should [Eq. (7.69)]. Suppose the eigenvalue $b_{i}$ is degenerate. From (7.71), $P\left(b_{i}\right)$ is given by the quantity that multiplies $b_{i}$. With degeneracy, more than one term in (7.70) contains $b_{i}$, so the probability $P\left(b_{i}\right)$ of getting $b_{i}$ in a measurement is found by adding the $\left|c_{i}\right|^{2}$ values for those eigenfunctions that have the same eigenvalue $b_{i}$. We have proved the following:

THEOREM 8. If $b_{m}$ is a nondegenerate eigenvalue of the operator $\hat{B}$ and $g_{m}$ is the corresponding normalized eigenfunction $\left(\hat{B} g_{m}=b_{m} g_{m}\right)$, then, when the property $B$ is measured in a quantum-mechanical system whose state function at the time of the measurement is $\Psi$, the probability of getting the result $b_{m}$ is given by $\left|c_{m}\right|^{2}$, where $c_{m}$ is the coefficient of $g_{m}$ in the expansion $\Psi=\sum_{i} c_{i} g_{i}$. If the eigenvalue $b_{m}$ is degenerate, the probability of obtaining $b_{m}$ when $B$ is measured is found by adding the $\left|c_{i}\right|^{2}$ values for those eigenfunctions whose eigenvalue is $b_{m}$.

When can the result of a measurement of $B$ be predicted with certainty? We can do this if all the coefficients in the expansion $\Psi=\sum_{i} c_{i} g_{i}$ are zero, except one: $c_{i}=0$ for all $i \neq k$ and $c_{k} \neq 0$. For this case, Eq. (7.69) gives $\left|c_{k}\right|^{2}=1$ and we are certain to find the result $b_{k}$. In this case, the state function $\Psi=\sum_{i} c_{i} g_{i}$ is given by $\Psi=g_{k}$. When $\Psi$ is an eigenfunction of $\hat{B}$ with eigenvalue $b_{k}$, we are certain to get the value $b_{k}$ when we measure $B$.

We can thus view the expansion $\Psi=\sum_{i} c_{i} g_{i}$ [Eq. (7.66)] as expressing the general state $\Psi$ as a superposition of the eigenstates $g_{i}$ of the operator $\hat{B}$. Each eigenstate $g_{i}$ corresponds to the value $b_{i}$ for the property $B$. The degree to which any eigenfunction $g_{i}$ occurs in the expansion of $\Psi$, as measured by $\left|c_{i}\right|^{2}$, determines the probability of getting the value $b_{i}$ in a measurement of $B$.

How do we calculate the expansion coefficients $c_{i}$ so that we can get the probabilities $\left|c_{i}\right|^{2}$ ? We multiply $\Psi=\sum_{i} c_{i} g_{i}$ by $g_{j}^{*}$, integrate over all space, and use the orthonormality of the eigenfunctions of the Hermitian operator $\hat{B}$ to get

\(
\begin{gather}
\int g_{j}^{*} \Psi d \tau=\sum_{i} c_{i} \int g_{j}^{*} g_{i} d \tau=\sum_{i} c_{i} \delta_{i j} \\
c_{j}=\int g_{j}^{*} \Psi d \tau=\left\langle g_{j} \mid \Psi\right\rangle \tag{7.72}
\end{gather}
\)

The probability of finding the nondegenerate eigenvalue $b_{j}$ in a measurement of $B$ is

\(
\begin{equation}
\left|c_{j}\right|^{2}=\left|\int g_{j}^{*} \Psi d \tau\right|^{2}=\left|\left\langle g_{j} \mid \Psi\right\rangle\right|^{2} \tag{7.73}
\end{equation}
\)

where $\hat{B} g_{j}=b_{j} g_{j}$. The quantity $\left\langle g_{j} \mid \Psi\right\rangle$ is called a probability amplitude.
Thus, if we know the state of the system as determined by the state function $\Psi$, we can use (7.73) to predict the probabilities of the various possible outcomes of a measurement of any property $B$. Determination of the eigenfunctions $g_{j}$ and eigenvalues $b_{j}$ of $\hat{B}$ is a mathematical problem.

To determine experimentally the probability of finding $b_{j}$ when $B$ is measured, we take a very large number $n$ of identical, noninteracting systems, each in the same state $\Psi$, and measure the property $B$ in each system. If $n_{j}$ of the measurements yield $b_{j}$, then $P\left(b_{j}\right)=n_{j} / n=\left|\left\langle g_{j} \mid \Psi\right\rangle\right|^{2}$.
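
This measurement prescription is easy to simulate. The sketch below uses an illustrative two-term superposition with equal coefficients (as in the $2 p_{x}$ example that follows), draws one measurement outcome per system, and compares the observed frequencies $n_{j} / n$ with $\left|c_{j}\right|^{2}$:

```python
# Simulated measurements on n identical systems: an illustrative two-term
# superposition with coefficients 2^(-1/2); eigenvalues are +1 and -1 in
# units of hbar. The observed frequencies n_j / n approach |c_j|^2.
import numpy as np

rng = np.random.default_rng(0)
c = np.array([2**-0.5, 2**-0.5])       # expansion coefficients of Psi
b = np.array([1.0, -1.0])              # eigenvalues b_j (units of hbar)
p = np.abs(c)**2                       # probabilities from Theorem 8

n = 100_000
outcomes = rng.choice(b, size=n, p=p)  # one measurement of B per system
for b_j, p_j in zip(b, p):
    print(b_j, (outcomes == b_j).mean(), p_j)   # frequency vs |c_j|^2
```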

We can restate the first part of Theorem 8 as follows:
THEOREM 9. If the property $B$ is measured in a quantum-mechanical system whose state function at the time of the measurement is $\Psi$, then the probability of observing the nondegenerate $\hat{B}$ eigenvalue $b_{j}$ is $\left|\left\langle g_{j} \mid \Psi\right\rangle\right|^{2}$, where $g_{j}$ is the normalized eigenfunction corresponding to the eigenvalue $b_{j}$.

The integral $\left\langle g_{j} \mid \Psi\right\rangle=\int g_{j}^{*} \Psi d \tau$ will have a substantial absolute value if the normalized functions $g_{j}$ and $\Psi$ resemble each other closely and so have similar magnitudes in each region of space. If $g_{j}$ and $\Psi$ do not resemble each other, then in regions where $g_{j}$ is large $\Psi$ will be small (and vice versa), so the product $g_{j}^{*} \Psi$ will always be small and the absolute value of the integral $\int g_{j}^{*} \Psi d \tau$ will be small; the probability $\left|\left\langle g_{j} \mid \Psi\right\rangle\right|^{2}$ of getting $b_{j}$ will then be small.

EXAMPLE

Suppose that we measure $L_{z}$ of the electron in a hydrogen atom whose state at the time the measurement begins is the $2 p_{x}$ state. Give the possible outcomes of the measurement and give the probability of each outcome. Use these probabilities to calculate $\left\langle L_{z}\right\rangle$ for the $2 p_{x}$ state.

From (6.118),

\(
\Psi=2 p_{x}=2^{-1 / 2}\left(2 p_{1}\right)+2^{-1 / 2}\left(2 p_{-1}\right)
\)

This equation is the expansion of $\Psi$ as a linear combination of $\hat{L}_{z}$ eigenfunctions. The only nonzero coefficients are for $2 p_{1}$ and $2 p_{-1}$, which are eigenfunctions of $\hat{L}_{z}$ with eigenvalues $\hbar$ and $-\hbar$, respectively. (Recall that the 1 and -1 subscripts give the $m$ quantum number and that the $\hat{L}_{z}$ eigenvalues are $m \hbar$.) Using Theorem 8, we take the squares of the absolute values of the coefficients in the expansion of $\Psi$ to get the probabilities. Hence the probability for getting $\hbar$ when $L_{z}$ is measured is $\left|2^{-1 / 2}\right|^{2}=0.5$, and the probability for getting $-\hbar$ is $\left|2^{-1 / 2}\right|^{2}=0.5$. To find $\left\langle L_{z}\right\rangle$ from the probabilities, we use Eq. (7.71):

\(
\left\langle L_{z}\right\rangle=\sum_{L_{z, i}} P\left(L_{z, i}\right) L_{z, i}=0.5 \hbar+0.5(-\hbar)=0
\)

where $P\left(L_{z, i}\right)$ is the probability of observing the value $L_{z, i}$, and all probabilities except two are zero. Of course, $\left\langle L_{z}\right\rangle$ can also be calculated from the average-value equation (3.88) as $\left\langle 2 p_{x}\right| \hat{L}_{z}\left|2 p_{x}\right\rangle$.

EXERCISE Write down a hydrogen-atom wave function for which the probability of getting the $n=2$ energy if $E$ is measured is 1 , the probability of getting $2 \hbar^{2}$ if $L^{2}$ is measured is 1 , and there are equal probabilities for getting $-\hbar, 0$, and $\hbar$ if $L_{z}$ is measured. Is there only one possible answer? (Partial answer: No.)

EXAMPLE

Suppose that the energy $E$ is measured for a particle in a box of length $l$ and that at the time of the measurement the particle is in the nonstationary state $\Psi=30^{1 / 2} l^{-5 / 2} x(l-x)$ for $0 \leq x \leq l$. Give the possible outcomes of the measurement and give the probability of each possible outcome.

The possible outcomes are given by postulate (c) of Section 3.9 as the eigenvalues of the energy operator $\hat{H}$. The eigenvalues of the particle-in-a-box Hamiltonian are $E=n^{2} h^{2} / 8 m l^{2}(n=1,2,3, \ldots)$, and these are nondegenerate. The probabilities are found by expanding $\Psi$ in terms of the eigenfunctions $\psi_{n}$ of $\hat{H}$: $\Psi=\sum_{n=1}^{\infty} c_{n} \psi_{n}$, where $\psi_{n}=(2 / l)^{1 / 2} \sin (n \pi x / l)$. In the example after Eq. (7.41), the function $x(l-x)$ was expanded in terms of the particle-in-a-box energy eigenfunctions. The state function $30^{1 / 2} l^{-5 / 2} x(l-x)$ equals $x(l-x)$ multiplied by the normalization constant $30^{1 / 2} l^{-5 / 2}$. Hence the expansion coefficients $c_{n}$ are found by multiplying the $a_{n}$ coefficients in the earlier example by $30^{1 / 2} l^{-5 / 2}$ to get

\(
c_{n}=\frac{(240)^{1 / 2}}{n^{3} \pi^{3}}\left[1-(-1)^{n}\right]
\)

The probability $P\left(E_{n}\right)$ of observing the value $E_{n}=n^{2} h^{2} / 8 m l^{2}$ equals $\left|c_{n}\right|^{2}$ :

\(
\begin{equation}
P\left(E_{n}\right)=\frac{240}{n^{6} \pi^{6}}\left[1-(-1)^{n}\right]^{2} \tag{7.74}
\end{equation}
\)

The first few probabilities are

| $n$ | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| $E_{n}$ | $h^{2} / 8 m l^{2}$ | $4 h^{2} / 8 m l^{2}$ | $9 h^{2} / 8 m l^{2}$ | $16 h^{2} / 8 m l^{2}$ | $25 h^{2} / 8 m l^{2}$ |
| $P\left(E_{n}\right)$ | 0.998550 | 0 | 0.001370 | 0 | 0.000064 |

The very high probability of finding the $n=1$ energy is related to the fact that the parabolic state function $30^{1 / 2} l^{-5 / 2} x(l-x)$ closely resembles the $n=1$ particle-in-a-box wave function $(2 / l)^{1 / 2} \sin (\pi x / l)$ (Fig. 7.3). The zero probabilities for $n=2,4,6, \ldots$

FIGURE 7.3 Plots of $\Psi=(30)^{1 / 2} l^{-5 / 2} x(l-x)$ and the $n=1$ particle-in-a-box wave function.

are due to the fact that, if the origin is put at the center of the box, the state function $\Psi=30^{1 / 2} l^{-5 / 2} x(l-x)$ is an even function, whereas the $n=2,4,6, \ldots$ functions are odd functions (Fig. 2.3) and so cannot contribute to the expansion of $\Psi$. The integral $\left\langle g_{n} \mid \Psi\right\rangle$ vanishes when the integrand is an odd function.
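
The numbers in the table above can be reproduced either from the closed form (7.74) or by computing the overlap integrals $c_{n}=\left\langle\psi_{n} \mid \Psi\right\rangle$ directly by quadrature; the following sketch (with $l$ set to 1) does both:

```python
# A check of the table above (l = 1): the closed form (7.74) against the
# overlap integrals c_n = <psi_n | Psi> computed by numerical quadrature.
import numpy as np
from scipy.integrate import quad

l = 1.0
Psi = lambda x: 30**0.5 * l**-2.5 * x * (l - x)            # normalized state
psi = lambda n, x: (2 / l)**0.5 * np.sin(n * np.pi * x / l)

for n in range(1, 6):
    c_n = quad(lambda x: psi(n, x) * Psi(x), 0, l)[0]
    P_closed = 240 / (n**6 * np.pi**6) * (1 - (-1)**n)**2  # Eq. (7.74)
    print(n, round(c_n**2, 6), round(P_closed, 6))         # columns agree
```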

If the property $B$ has a continuous range of eigenvalues (for example, position; Section 7.7), the summation in the expansion (7.66) of $\Psi$ is replaced by an integration over the values of $b$ :

\(
\begin{equation}
\Psi=\int c_{b} g_{b}(q) d b \tag{7.75}
\end{equation}
\)

and $\left|\left\langle g_{b}(q) \mid \Psi\right\rangle\right|^{2}$ is interpreted as a probability density; that is, the probability of finding a value of $B$ between $b$ and $b+d b$ for a system in the state $\Psi$ is

\(
\begin{equation}
\left|\left\langle g_{b}(q) \mid \Psi(q, t)\right\rangle\right|^{2} d b \tag{7.76}
\end{equation}
\)


7.7 Position Eigenfunctions

We derived the eigenfunctions of the linear-momentum and angular-momentum operators. We now ask: What are the eigenfunctions of the position operator $\hat{x}$ ?

The operator $\hat{x}$ is multiplication by $x$. Denoting the position eigenfunctions by $g_{a}(x)$, we write

\(
\begin{equation}
x g_{a}(x)=a g_{a}(x) \tag{7.77}
\end{equation}
\)

where $a$ symbolizes the possible eigenvalues. It follows that

\(
\begin{equation}
(x-a) g_{a}(x)=0 \tag{7.78}
\end{equation}
\)

We conclude from (7.78) that

\(
\begin{equation}
g_{a}(x)=0 \quad \text { for } x \neq a \tag{7.79}
\end{equation}
\)

Moreover, since an eigenfunction that is zero everywhere is unacceptable, we have

\(
\begin{equation}
g_{a}(x) \neq 0 \quad \text { for } x=a \tag{7.80}
\end{equation}
\)

These conclusions make sense. If the state function is an eigenfunction of $\hat{x}$ with eigenvalue $a, \Psi=g_{a}(x)$, we know (Section 7.6) that a measurement of $x$ is certain to give the value $a$. This can be true only if the probability density $|\Psi|^{2}$ is zero for $x \neq a$, in agreement with (7.79).

Before considering further properties of $g_{a}(x)$, we define the Heaviside step function $H(x)$ by (see Fig. 7.4)

\(
\begin{array}{ll}
H(x)=1 & \text { for } x>0 \\
H(x)=0 & \text { for } x<0 \\
H(x)=\frac{1}{2} & \text { for } x=0 \tag{7.81}
\end{array}
\)

We next define the Dirac delta function $\delta(x)$ as the derivative of the Heaviside step function:

\(
\begin{equation}
\delta(x) \equiv \frac{d H(x)}{d x} \tag{7.82}
\end{equation}
\)

From (7.81) and (7.82), we have at once (see also Fig. 7.4)

\(
\begin{equation}
\delta(x)=0 \quad \text { for } x \neq 0 \tag{7.83}
\end{equation}
\)

FIGURE 7.4 The Heaviside step function.

Since $H(x)$ makes a sudden jump at $x=0$, its derivative is infinite at the origin:

\(
\begin{equation}
\delta(x)=\infty \quad \text { for } x=0 \tag{7.84}
\end{equation}
\)

We can generalize these equations slightly by setting $x=t-a$ and then changing the symbol $t$ to $x$. Equations (7.81) to (7.84) become

\(
\begin{gather}
H(x-a)=1, \quad x>a \tag{7.85}\\
H(x-a)=0, \quad x<a \tag{7.86}\\
H(x-a)=\frac{1}{2}, \quad x=a \tag{7.87}\\
\delta(x-a)=d H(x-a) / d x \tag{7.88}\\
\delta(x-a)=0, \quad x \neq a \quad \text { and } \quad \delta(x-a)=\infty, \quad x=a \tag{7.89}
\end{gather}
\)

Now consider the following integral:

\(
\int_{-\infty}^{\infty} f(x) \delta(x-a) d x
\)

We evaluate it using integration by parts:

\(
\begin{gathered}
\int u d v=u v-\int v d u \\
u=f(x), \quad d v=\delta(x-a) d x
\end{gathered}
\)

Using (7.88), we have

\(
\begin{gather}
d u=f^{\prime}(x) d x, \quad v=H(x-a) \\
\int_{-\infty}^{\infty} f(x) \delta(x-a) d x=\left.f(x) H(x-a)\right|_{-\infty} ^{\infty}-\int_{-\infty}^{\infty} H(x-a) f^{\prime}(x) d x \\
\int_{-\infty}^{\infty} f(x) \delta(x-a) d x=f(\infty)-\int_{-\infty}^{\infty} H(x-a) f^{\prime}(x) d x \tag{7.90}
\end{gather}
\)

where (7.85) and (7.86) were used. Since $H(x-a)$ vanishes for $x<a$, (7.90) becomes

\(
\begin{align}
\int_{-\infty}^{\infty} f(x) \delta(x-a) d x & =f(\infty)-\int_{a}^{\infty} H(x-a) f^{\prime}(x) d x \\
& =f(\infty)-\int_{a}^{\infty} f^{\prime}(x) d x=f(\infty)-\left.f(x)\right|_{a} ^{\infty} \\
\int_{-\infty}^{\infty} f(x) \delta(x-a) d x & =f(a) \tag{7.91}
\end{align}
\)

Comparing (7.91) with the equation $\sum_{k} c_{k} \delta_{i k}=c_{i}$, we see that the Dirac delta function plays the same role in an integral that the Kronecker delta plays in a sum. The special case of (7.91) with $f(x)=1$ is

\(
\int_{-\infty}^{\infty} \delta(x-a) d x=1
\)

The properties (7.89) of the Dirac delta function agree with the properties (7.79) and (7.80) of the position eigenfunctions $g_{a}(x)$. We therefore tentatively set

\(
\begin{equation}
g_{a}(x)=\delta(x-a) \tag{7.92}
\end{equation}
\)

To verify (7.92), we now show it to be in accord with the Born postulate that $|\Psi(a, t)|^{2} d a$ is the probability of observing a value of $x$ between $a$ and $a+d a$. According to (7.76), this probability is given by

\(
\begin{equation}
\left|\left\langle g_{a}(x) \mid \Psi(x, t)\right\rangle\right|^{2} d a=\left|\int_{-\infty}^{\infty} g_{a}^{*}(x) \Psi(x, t) d x\right|^{2} d a \tag{7.93}
\end{equation}
\)

Using (7.92) and then (7.91), we have for (7.93)

\(
\left|\int_{-\infty}^{\infty} \delta(x-a) \Psi(x, t) d x\right|^{2} d a=|\Psi(a, t)|^{2} d a
\)

which completes the proof.
Since the quantity $a$ in $\delta(x-a)$ can have any real value, the eigenvalues of $\hat{x}$ form a continuum: $-\infty<a<\infty$. As usual for continuum eigenfunctions, $\delta(x-a)$ is not quadratically integrable (Prob. 7.43).

Summarizing, the eigenfunctions and eigenvalues of position are

\(
\begin{equation}
\hat{x} \delta(x-a)=a \delta(x-a) \tag{7.94}
\end{equation}
\)

where $a$ is any real number.
The delta function is badly behaved, and consequently the manipulations we performed are lacking in rigor and would make a mathematician shudder. However, one can formulate things rigorously by considering the delta function to be the limiting case of a function that becomes successively more peaked at the origin (Fig. 7.5). The delta function is not really a function but is what mathematicians call a distribution (see en.wikipedia.org/wiki/Dirac_delta_function).

FIGURE 7.5 Functions that approximate $\delta(x)$ with successively increasing accuracy. The area under each curve is 1 .
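
This limiting picture can be made quantitative: replace $\delta(x-a)$ by a normalized Gaussian of width $\sigma$ and watch the sifting integral (7.91) approach $f(a)$ as $\sigma$ shrinks. The sketch below uses an assumed test function $f(x)=\cos x$ and $a=1.5$:

```python
# The sifting property (7.91) with delta(x - a) replaced by a normalized
# Gaussian of width sigma; assumed test function f(x) = cos x and a = 1.5.
import numpy as np
from scipy.integrate import quad

a = 1.5
f = lambda x: np.cos(x)

for sigma in (1.0, 0.3, 0.1, 0.03):
    delta = lambda x, s=sigma: np.exp(-(x - a)**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
    val = quad(lambda x: f(x) * delta(x), a - 20 * sigma, a + 20 * sigma)[0]
    print(sigma, val)                  # approaches f(a) = cos 1.5 = 0.0707...
```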


7.8 The Postulates of Quantum Mechanics

This section summarizes the postulates of quantum mechanics introduced in previous chapters.

POSTULATE 1. The state of a system is described by a function $\Psi$ of the coordinates of the particles and the time. This function, called the state function or wave function, contains all the information that can be determined about the system. We further postulate that $\Psi$ is single-valued, continuous, and quadratically integrable. For continuum states, the quadratic integrability requirement is omitted.

The designation "wave function" for $\Psi$ is perhaps not the best choice. A physical wave moving in three-dimensional space is a function of the three spatial coordinates and the time. However, for an $n$-particle system, the function $\Psi$ is a function of $3 n$ spatial coordinates and the time. Hence, for a many-particle system, we cannot interpret $\Psi$ as any sort of physical wave. The state function is best thought of as a function from which we can calculate various properties of the system. The nature of the information that $\Psi$ contains is the subject of Postulate 5 and its consequences.

POSTULATE 2. To every physically observable property there corresponds a linear Hermitian operator. To find this operator, write down the classical-mechanical expression for the observable in terms of Cartesian coordinates and corresponding linear-momentum components, and then replace each coordinate $x$ by the operator $x \cdot$ (multiplication by $x$) and each momentum component $p_{x}$ by the operator $-i \hbar \partial / \partial x$.

We saw in Section 7.2 that the restriction to Hermitian operators arises from the requirement that average values of physical quantities be real numbers. The requirement of linearity is closely connected to the superposition of states discussed in Section 7.6. In our derivation of (7.70) for the average value of a property $B$ for a state that was expanded as a superposition of the eigenfunctions of $\hat{B}$, the linearity of $\hat{B}$ played a key role.

When the classical quantity contains a product of a Cartesian coordinate and its conjugate momentum, we run into the problem of noncommutativity in constructing the correct quantum-mechanical operator. Several different rules have been proposed to handle this case. See J. R. Shewell, Am. J. Phys., 27, 16 (1959); E. H. Kerner and W. G. Sutcliffe, J. Math. Phys., 11, 391 (1970); A. de Souza Dutra, J. Phys. A: Math. Gen., 39, 203 (2006) (arxiv.org/abs/0705.3247).

The process of finding quantum-mechanical operators in non-Cartesian coordinates is complicated. See K. Simon, Am. J. Phys., 33, 60 (1965); G. R. Gruber, Found. Phys., 1, 227 (1971).

POSTULATE 3. The only possible values that can result from measurements of the physically observable property $B$ are the eigenvalues $b_{i}$ in the equation $\hat{B} g_{i}=b_{i} g_{i}$, where $\hat{B}$ is the operator corresponding to the property $B$. The eigenfunctions $g_{i}$ are required to be well behaved.

Our main concern is with the energy levels of atoms and molecules. These are given by the eigenvalues of the energy operator, the Hamiltonian $\hat{H}$. The eigenvalue equation for $\hat{H}, \hat{H} \psi=E \psi$, is the time-independent Schrödinger equation. However, finding the possible values of any property involves solving an eigenvalue equation.

POSTULATE 4. If $\hat{B}$ is a linear Hermitian operator that represents a physically observable property, then the eigenfunctions $g_{i}$ of $\hat{B}$ form a complete set.

There are Hermitian operators whose eigenfunctions do not form a complete set (see Griffiths, pp. 99, 106; Messiah, p. 188; Ballentine, Sec. 1.3). The completeness requirement is essential to developing the theory of quantum mechanics, so it is necessary to postulate that all Hermitian operators that correspond to observable properties have a complete set of eigenfunctions. Postulate 4 allows us to expand the wave function for any state as a superposition of the orthonormal eigenfunctions of any quantum-mechanical operator:

\(
\begin{equation}
\Psi=\sum_{i} c_{i} g_{i}=\sum_{i}\left|g_{i}\right\rangle\left\langle g_{i} \mid \Psi\right\rangle \tag{7.95}
\end{equation}
\)

POSTULATE 5. If $\Psi(q, t)$ is the normalized state function of a system at time $t$, then the average value of a physical observable $B$ at time $t$ is

\(
\begin{equation}
\langle B\rangle=\int \Psi^{*} \hat{B} \Psi d \tau \tag{7.96}
\end{equation}
\)

The definition of the quantum-mechanical average value is given in Section 3.7 and should not be confused with the time average used in classical mechanics.

From Postulates 4 and 5, we showed in Section 7.6 that the probability of observing the nondegenerate eigenvalue $b_{i}$ in a measurement of $B$ is $P\left(b_{i}\right)=\left|\int g_{i}^{*} \Psi d \tau\right|^{2}=\left|\left\langle g_{i} \mid \Psi\right\rangle\right|^{2}$, where $\hat{B} g_{i}=b_{i} g_{i}$. If $\Psi$ happens to be one of the eigenfunctions of $\hat{B}$, that is, if $\Psi=g_{k}$, then $P\left(b_{i}\right)$ becomes $P\left(b_{i}\right)=\left|\int g_{i}^{*} g_{k} d \tau\right|^{2}=\left|\delta_{i k}\right|^{2}=\delta_{i k}$, where the orthonormality of the eigenfunctions of the Hermitian operator $\hat{B}$ was used. We are certain to observe the value $b_{k}$ when $\Psi=g_{k}$.

POSTULATE 6. The time development of the state of an undisturbed quantum-mechanical system is given by the Schrödinger time-dependent equation

\(
\begin{equation}
-\frac{\hbar}{i} \frac{\partial \Psi}{\partial t}=\hat{H} \Psi \tag{7.97}
\end{equation}
\)

where $\hat{H}$ is the Hamiltonian (that is, energy) operator of the system.

The time-dependent Schrödinger equation is a first-order differential equation in the time, so that, just as in classical mechanics, the present state of an undisturbed system
determines the future state. However, unlike knowledge of the state in classical mechanics, knowledge of the state in quantum mechanics involves knowledge of only the probabilities for various possible outcomes of a measurement. Thus, suppose we have several identical noninteracting systems, each having the same state function $\Psi\left(t_{0}\right)$ at time $t_{0}$. If we leave each system undisturbed, then the state function for each system will change in accord with (7.97). Since each system has the same Hamiltonian, each system will have the same state function $\Psi\left(t_{1}\right)$ at any future time $t_{1}$. However, suppose that at time $t_{2}$ we measure property $B$ in each system. Although each system has the same state function $\Psi\left(t_{2}\right)$ at the instant the measurement begins, we will not get the same result for each system. Rather, we will get a spread of possible values $b_{i}$, where $b_{i}$ are the eigenvalues of $\hat{B}$. The relative number of times we get each $b_{i}$ can be calculated from the quantities $\left|c_{i}\right|^{2}$, where $\Psi\left(t_{2}\right)=\sum_{i} c_{i} g_{i}$ with the $g_{i}$ 's being the eigenfunctions of $\hat{B}$.

If the Hamiltonian is independent of time, we have the possibility of states of definite energy $E$. For such states the state function must satisfy

\(
\begin{equation}
\hat{H} \Psi=E \Psi \tag{7.98}
\end{equation}
\)

and the time-dependent Schrödinger equation becomes

\(
-\frac{\hbar}{i} \frac{\partial \Psi}{\partial t}=E \Psi
\)

which integrates to $\Psi=A e^{-i E t / \hbar}$, where $A$, the integration "constant," is independent of time. The function $\Psi$ depends on the coordinates and the time, so $A$ is some function of the coordinates, which we designate as $\psi(q)$. We have

\(
\begin{equation}
\Psi(q, t)=e^{-i E t / \hbar} \psi(q) \tag{7.99}
\end{equation}
\)

for a state of constant energy. The function $\psi(q)$ satisfies the time-independent Schrödinger equation

\(
\hat{H} \psi(q)=E \psi(q)
\)

which follows from (7.98) and (7.99). The factor $e^{-i E t / \hbar}$ simply indicates a change in the phase of the wave function $\Psi(q, t)$ with time and has no direct physical significance. Hence we generally refer to $\psi(q)$ as "the wave function." The Hamiltonian operator plays a unique role in quantum mechanics in that it occurs in the fundamental dynamical equation, the time-dependent Schrödinger equation. The eigenstates of $\hat{H}$ (known as stationary states) have the special property that the probability density $|\Psi|^{2}$ is independent of time.

The time-dependent Schrödinger equation (7.97) is $(i \hbar \partial / \partial t-\hat{H}) \Psi=0$. Because the operator $i \hbar \partial / \partial t-\hat{H}$ is linear, any linear combination of solutions of the time-dependent Schrödinger equation (7.97) is a solution of (7.97). For example, if the Hamiltonian $\hat{H}$ is independent of time, then there exist stationary-state solutions $\Psi_{n}=e^{-i E_{n} t / \hbar} \psi_{n}(q)$ [Eq. (7.99)] of the time-dependent Schrödinger equation. Any linear combination

\(
\begin{equation}
\Psi=\sum_{n} c_{n} \Psi_{n}=\sum_{n} c_{n} e^{-i E_{n} t / \hbar} \psi_{n}(q) \tag{7.100}
\end{equation}
\)

where the $c_{n}$ 's are time-independent constants, is a solution of the time-dependent Schrödinger equation, although it is not an eigenfunction of $\hat{H}$. Because of the completeness of the eigenfunctions $\psi_{n}$, any state function can be written in the form (7.100) if $\hat{H}$ is independent of time. (See also Section 9.8.) The state function (7.100) represents a state that does not have a definite energy. Rather, when we measure the energy, the probability of getting $E_{n}$ is $\left|c_{n} e^{-i E_{n} t / \hbar}\right|^{2}=\left|c_{n}\right|^{2}$.

To find the constants $c_{n}$ in (7.100), we write (7.100) at time $t_{0}$ as $\Psi\left(q, t_{0}\right)=\sum_{n} c_{n} e^{-i E_{n} t_{0} / \hbar} \psi_{n}(q)$. Multiplication by $\psi_{j}^{*}(q)$ followed by integration over all space gives

\(
\left\langle\psi_{j}(q) \mid \Psi\left(q, t_{0}\right)\right\rangle=\sum_{n} c_{n} e^{-i E_{n} t_{0} / \hbar}\left\langle\psi_{j} \mid \psi_{n}\right\rangle=\sum_{n} c_{n} e^{-i E_{n} t_{0} / \hbar} \delta_{j n}=c_{j} e^{-i E_{j} t_{0} / \hbar}
\)

so $c_{j}=\left\langle\psi_{j} \mid \Psi\left(q, t_{0}\right)\right\rangle e^{i E_{j} t_{0} / \hbar}$ and (7.100) becomes

\(
\begin{equation}
\Psi(q, t)=\sum_{j}\left\langle\psi_{j}(q) \mid \Psi\left(q, t_{0}\right)\right\rangle e^{-i E_{j}\left(t-t_{0}\right) / \hbar} \psi_{j}(q) \quad \text { if } \hat{H} \text { ind. of } t \tag{7.101}
\end{equation}
\)

where $\hat{H} \psi_{j}=E_{j} \psi_{j}$ and the $\psi_{j}$ 's have been chosen to be orthonormal. Equation (7.101) tells us how to find $\Psi$ at time $t$ from $\Psi$ at an initial time $t_{0}$ and is the general solution of the time-dependent Schrödinger equation when $\hat{H}$ is independent of $t$. [Equation (7.101) can also be derived directly from the time-dependent Schrödinger equation; see Prob. 7.46.]

EXAMPLE

A particle in a one-dimensional box of length $l$ has a time-independent Hamiltonian and has the state function $\Psi=2^{-1 / 2} \psi_{1}+2^{-1 / 2} \psi_{2}$ at time $t=0$, where $\psi_{1}$ and $\psi_{2}$ are particle-in-a-box time-independent energy eigenfunctions [Eq. (2.23)] with $n=1$ and $n=2$, respectively. (a) Find the probability density as a function of time. (b) Show that $|\Psi|^{2}$ oscillates with a period $T=8 m l^{2} / 3 h$. (c) Use a spreadsheet or Mathcad to plot $l|\Psi|^{2}$ versus $x / l$ at each of the times $j T / 8$, where $j=0,1,2, \ldots, 8$.
(a) Since $\hat{H}$ is independent of time, $\Psi$ at any future time will be given by (7.100) with $c_{1}=2^{-1 / 2}, c_{2}=2^{-1 / 2}$, and all other $c$ 's equal to zero. Therefore,

\(
\begin{aligned}
\Psi & =\frac{1}{\sqrt{2}} e^{-i E_{1} t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \sin \frac{\pi x}{l}+\frac{1}{\sqrt{2}} e^{-i E_{2} t / \hbar}\left(\frac{2}{l}\right)^{1 / 2} \sin \frac{2 \pi x}{l} \\
& =\frac{1}{\sqrt{2}} e^{-i E_{1} t / \hbar} \psi_{1}+\frac{1}{\sqrt{2}} e^{-i E_{2} t / \hbar} \psi_{2}
\end{aligned}
\)

We find for the probability density (Prob. 7.47)

\(
\begin{equation}
\Psi^{*} \Psi=\frac{1}{2} \psi_{1}^{2}+\frac{1}{2} \psi_{2}^{2}+\psi_{1} \psi_{2} \cos \left[\left(E_{2}-E_{1}\right) t / \hbar\right] \tag{7.102}
\end{equation}
\)

(b) The time-dependent part of $|\Psi|^{2}$ is the cosine factor in (7.102). The period $T$ is the time it takes the argument of the cosine to increase by $2 \pi$, so $\left(E_{2}-E_{1}\right) T / \hbar=2 \pi$ and $T=2 \pi \hbar /\left(E_{2}-E_{1}\right)=8 m l^{2} / 3 h$, since $E_{n}=n^{2} h^{2} / 8 m l^{2}$. (c) Using (7.102), the expressions for $\psi_{1}$ and $\psi_{2}$, and $T=2 \pi \hbar /\left(E_{2}-E_{1}\right)$, we have
$l|\Psi|^{2}=\sin ^{2}\left(\pi x_{r}\right)+\sin ^{2}\left(2 \pi x_{r}\right)+2 \sin \left(\pi x_{r}\right) \sin \left(2 \pi x_{r}\right) \cos (2 \pi t / T)$
where $x_{r} \equiv x / l$. With $t=j T / 8$, the graphs are easily plotted for each $j$ value. The plots show that the probability-density maximum oscillates between the left and right sides of the box. Using Mathcad, one can produce a movie of $|\Psi|^{2}$ as time changes (Prob. 7.47). (An online resource that allows one to follow $|\Psi|^{2}$ as a function of time for systems such as the particle in a box or the harmonic oscillator for any chosen initial mixture of stationary states is at www.falstad.com/qm1d; one chooses the mixture by clicking on the small circles at the bottom and dragging on each rotating arrow within a circle.)
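The plots in part (c) are just as easy to generate programmatically as with a spreadsheet. The following short Python sketch (our own illustration, using numpy and matplotlib; the function name prob_density is our choice, not from the text) plots $l|\Psi|^{2}$ of the equation above versus $x_{r}$ at the times $j T / 8$:

import numpy as np
import matplotlib.pyplot as plt

def prob_density(x_r, t_over_T):
    # l|Psi|^2 from Eq. (7.102) in reduced variables x_r = x/l and t/T
    return (np.sin(np.pi * x_r) ** 2
            + np.sin(2 * np.pi * x_r) ** 2
            + 2 * np.sin(np.pi * x_r) * np.sin(2 * np.pi * x_r)
            * np.cos(2 * np.pi * t_over_T))

x_r = np.linspace(0.0, 1.0, 400)
for j in range(9):                      # t = jT/8, j = 0, 1, ..., 8
    plt.plot(x_r, prob_density(x_r, j / 8), label=f"t = {j}T/8")
plt.xlabel("x/l")
plt.ylabel("l |Psi|^2")
plt.legend(fontsize="small")
plt.show()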

Equation (7.100) with the $c_{n}$ 's being constant is the general solution of the time-dependent Schrödinger equation when $\hat{H}$ is independent of time. For a system acted on by an external time-dependent force, the Hamiltonian contains a time-dependent part: $\hat{H}=\hat{H}^{0}+\hat{H}^{\prime}(t)$, where $\hat{H}^{0}$ is the Hamiltonian of the system in the absence of the external force and $\hat{H}^{\prime}(t)$ is the time-dependent potential energy of interaction of the system with the external force. In this case, we can use the stationary-state time-independent eigenfunctions of $\hat{H}^{0}$ to expand $\Psi$ in an equation like (7.100), except that now the $c_{n}$ 's depend on $t$. An example is an atom or molecule exposed to the time-dependent electric field of electromagnetic radiation (light); see Section 9.8.

What determines whether a system is in a stationary state such as (7.99) or a nonstationary state such as (7.100)? The answer is that the history of the system determines its present state. For example, if we take a system that is in a stationary state and expose it to radiation, the time-dependent Schrödinger equation shows that the radiation causes the state to change to a nonstationary state; see Section 9.8.

You might be wondering about the absence from the list of postulates of the Born postulate that $|\Psi(x, t)|^{2} d x$ is the probability of finding the particle between $x$ and $x+d x$. This postulate is a consequence of Postulate 5, as we now show. Equation (3.81) is $\langle B\rangle=\sum_{b} P_{b} b$, where $P_{b}$ is the probability of observing the value $b$ in a measurement of the property $B$ that takes on discrete values. The corresponding equation for the continuous variable $x$ is $\langle x\rangle=\int_{-\infty}^{\infty} P(x) x d x$, where $P(x)$ is the probability density for observing various values of $x$. According to Postulate 5, we have $\langle x\rangle=\int_{-\infty}^{\infty} \Psi^{*} \hat{x} \Psi d x=\int_{-\infty}^{\infty}|\Psi|^{2} x d x$. Comparison of these two expressions for $\langle x\rangle$ shows that $|\Psi|^{2}$ is the probability density $P(x)$.

Chapter 10 gives two further quantum-mechanical postulates that deal with spin and the spin-statistics theorem.


In quantum mechanics, the state function of a system changes in two ways. [See E. P. Wigner, Am. J. Phys., 31, 6 (1963).] First, there is the continuous, causal change with time given by the time-dependent Schrödinger equation (7.97). Second, there is the sudden, discontinuous, probabilistic change that occurs when a measurement is made on the system. This kind of change cannot be predicted with certainty, since the result of a measurement cannot be predicted with certainty; only the probabilities (7.73) are predictable. The sudden change in $\Psi$ caused by a measurement is called the reduction (or collapse) of the wave function. A measurement of the property $B$ that yields the result $b_{k}$ changes the state function to $g_{k}$, the eigenfunction of $\hat{B}$ whose eigenvalue is $b_{k}$. (If $b_{k}$ is degenerate, $\Psi$ is changed to a linear combination of the eigenfunctions corresponding to $b_{k}$.) The probability of finding the nondegenerate eigenvalue $b_{k}$ is given by Eq. (7.73) and Theorem 9 as $\left|\left\langle g_{k} \mid \Psi\right\rangle\right|^{2}$, so the quantity $\left|\left\langle g_{k} \mid \Psi\right\rangle\right|^{2}$ is the probability the system will make a transition from the state $\Psi$ to the state $g_{k}$ when $B$ is measured.

Consider an example. Suppose that at time $t$ we measure a particle's position. Let $\Psi\left(x, t_{-}\right)$ be the state function of the particle the instant before the measurement is made (Fig. 7.6a). We further suppose that the result of the measurement is that the particle is found to be in the small region of space

\(
\begin{equation}
a<x<a+d a \tag{7.104}
\end{equation}
\)

We ask: What is the state function $\Psi\left(x, t_{+}\right)$ the instant after the measurement? To answer this question, suppose we were to make a second measurement of position at time $t_{+}$.

FIGURE 7.6 Reduction of the wave function caused by a measurement of position.

Since $t_{+}$ differs from the time $t$ of the first measurement by an infinitesimal amount, we must still find that the particle is confined to the region (7.104). If the particle moved a finite distance in an infinitesimal amount of time, it would have infinite velocity, which is unacceptable. Since $\left|\Psi\left(x, t_{+}\right)\right|^{2}$ is the probability density for finding various values of $x$, we conclude that $\Psi\left(x, t_{+}\right)$ must be zero outside the region (7.104) and must look something like Fig. 7.6b. Thus the position measurement at time $t$ has reduced $\Psi$ from a function that is spread out over all space to one that is localized in the region (7.104). The change from $\Psi\left(x, t_{-}\right)$ to $\Psi\left(x, t_{+}\right)$ is a probabilistic change.

The measurement process is one of the most controversial areas in quantum mechanics. Just how and at what stage in the measurement process reduction occurs is unclear. Some physicists take the reduction of $\Psi$ as an additional quantum-mechanical postulate, while others claim it is a theorem derivable from the other postulates. Some physicists reject the idea of reduction [see M. Jammer, The Philosophy of Quantum Mechanics, Wiley, 1974, Section 11.4; L. E. Ballentine, Am. J. Phys., 55, 785 (1987)]. Ballentine advocates Einstein's statistical-ensemble interpretation of quantum mechanics, in which the wave function does not describe the state of a single system (as in the orthodox interpretation) but gives a statistical description of a collection of a large number of systems each prepared in the same way (an ensemble). In this interpretation, the need for reduction of the wave function does not occur. [See L. E. Ballentine, Am. J. Phys., 40, 1763 (1972); Rev. Mod. Phys., 42, 358 (1970).] There are many serious problems with the statistical-ensemble interpretation [see Whitaker, pp. 213-217; D. Home and M. A. B. Whitaker, Phys. Rep., 210, 223 (1992); Prob. 10.4], and this interpretation has been largely rejected.
"For the majority of physicists the problem of finding a consistent and plausible quantum theory of measurement is still unsolved. . . The immense diversity of opinion . . . concerning quantum measurements . . . [is] a reflection of the fundamental disagreement as to the interpretation of quantum mechanics as a whole" (M. Jammer, The Philosophy of Quantum Mechanics, pp. 519, 521).

The probabilistic nature of quantum mechanics has disturbed many physicists, including Einstein, de Broglie, and Schrödinger. These physicists and others have suggested that quantum mechanics may not furnish a complete description of physical reality. Rather, the probabilistic laws of quantum mechanics might be simply a reflection of deterministic laws that operate at a subquantum-mechanical level and that involve "hidden variables." An analogy given by the physicist Bohm is the Brownian motion of a dust particle in air. The particle undergoes random fluctuations of position, and its motion is not completely determined by its position and velocity. Of course, Brownian motion is a result of collisions with the gas molecules and is determined by variables existing on the level of
molecular motion. Analogously, the motions of electrons might be determined by hidden variables existing on a subquantum-mechanical level. The orthodox interpretation (often called the Copenhagen interpretation) of quantum mechanics, which was developed by Heisenberg and Bohr, denies the existence of hidden variables and asserts that the laws of quantum mechanics provide a complete description of physical reality. (Hidden-variables theories are discussed in F. J. Belinfante, A Survey of Hidden-Variables Theories, Pergamon, 1973.)

In 1964, J. S. Bell proved that, in certain experiments involving measurements on two widely separated particles that originally were in the same region of space, any possible local hidden-variable theory must make predictions that differ from those that quantum mechanics makes (see Ballentine, Chapter 20). In a local theory, two systems very far from each other act independently of each other. The results of such experiments agree with quantum-mechanical predictions, thus providing very strong evidence against all deterministic, local hidden-variable theories but do not rule out nonlocal hidden-variable theories. These experiments are described in A. Aspect, in The Wave-Particle Dualism, S. Diner et al. (eds.), Reidel, 1984, pp. 377-390; A. Shimony, Scientific American, Jan. 1988, p. 46; A. Zeilinger, Rev. Mod. Phys., 71, S288 (1999).

Further analysis by Bell and others shows that the results of these experiments and the predictions of quantum mechanics are incompatible with a view of the world in which both realism and locality hold. Realism (also called objectivity) is the doctrine that external reality exists and has definite properties independent of whether or not we observe this reality. Locality excludes instantaneous action-at-a-distance and asserts that any influence from one system to another must travel at a speed that does not exceed the speed of light. Clauser and Shimony stated that quantum mechanics leads to the "philosophically startling" conclusion that we must either "totally abandon the realistic philosophy of most working scientists, or dramatically revise our concept of space-time" to permit "some kind of action-at-a-distance" [J. F. Clauser and A. Shimony, Rep. Prog. Phys., 41, 1881 (1978); see also B. d’Espagnat, Scientific American, Nov. 1979, p. 158; A. Aspect, Nature, 446, 866 (2007); S. Gröblacher et al., Nature, 446, 871 (2007)].

Quantum theory predicts and experiments confirm that when measurements are made on two particles that once interacted but now are separated by an unlimited distance the results obtained in the measurement on one particle depend on the results obtained from the measurement on the second particle and also depend on which property of the second particle is measured. (Such particles are said to be entangled. For more on entanglement, see en.wikipedia.org/wiki/Quantum_entanglement; chaps. 7-10 of J. Baggott, Beyond Measure, Oxford, 2004; L. Gilder, The Age of Entanglement, Vintage, 2009; Part II of A. Whitaker, The New Quantum Age, Oxford, 2012.) Such instantaneous "spooky actions at a distance" (Einstein's phrase) have led one physicist to remark that "quantum mechanics is magic" (D. Greenberger, quoted in N. D. Mermin, Physics Today, April 1985, p. 38).

The relation between quantum mechanics and the mind has been the subject of much speculation. Wigner argued that the reduction of the wave function occurs when the result of a measurement enters the consciousness of an observer and thus "the being with consciousness must have a different role in quantum mechanics than the inanimate measuring device." He believed it likely that conscious beings obey different laws of nature than inanimate objects and proposed that scientists look for unusual effects of consciousness acting on matter. [E. P. Wigner, "Remarks on the Mind-Body Question," in The Scientist Speculates, I. J. Good, ed., Capricorn, 1965, p. 284; Proc. Amer. Phil. Soc., 113, 95 (1969); Found. Phys., 1, 35 (1970).]

In 1952, David Bohm (following a suggestion made by de Broglie in 1927 that the wave function might act as a pilot wave guiding the motion of the particle) devised a nonlocal deterministic hidden-variable theory that predicts the same experimental results as quantum mechanics [D. Bohm, Phys. Rev., 85, 166, 180 (1952)]. In Bohm's theory, a
particle at any instant of time possesses both a definite position and a definite momentum (although these quantities are not observable), and it travels on a definite path. The particle also possesses a wave function $\Psi$ whose time development obeys the time-dependent Schrödinger equation. In Bohm's theory, the wave function is a real physical entity that determines the motion of the particle. If we are given a particle at a particular position with a particular wave function at a particular time $t$, Bohm's theory postulates a certain equation that allows us to calculate the velocity of the particle at that time from its wave function and position; knowing the position and velocity at $t$, we can find the position at time $t+d t$ and can use the time-dependent Schrödinger equation to find the wave function at $t+d t$; then we calculate the velocity at $t+d t$ from the position and wave function at $t+d t$; and so on. Hence the path can be calculated from the initial position and wave function (assuming we know the potential energy). In Bohm's theory, the particle's position turns out to obey an equation like Newton's second law $m d^{2} x / d t^{2}=-\partial V / \partial x$ [Eqs. (1.8) and (1.12)], except that the potential energy $V$ is replaced by $V+Q$, where $Q$ is a quantum potential that is calculated in a certain way from the wave function. In Bohm's theory, collapse of the wave function does not occur. Rather, the interaction of the system with the measuring apparatus follows the equations of Bohm's theory, but this interaction leads to the system evolving after the measurement in the manner that would occur if the wave function had been collapsed.
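As a concrete (and purely illustrative) sketch of this trajectory calculation, the Python fragment below applies the guidance equation $v=(\hbar / m) \operatorname{Im}[(\partial \Psi / \partial x) / \Psi]$ to the analytically known two-state box wave function from the example earlier in this chapter, using simple Euler time steps; the reduced units $\hbar=m=l=1$ and all names here are our own assumptions, not Bohm's notation:

import numpy as np

E1, E2 = np.pi**2 / 2, 2 * np.pi**2      # box energies n^2 pi^2/2 with hbar = m = l = 1

def psi(x, t):
    # Psi = (psi_1 e^{-i E1 t} + psi_2 e^{-i E2 t}) / sqrt(2)
    return (np.sqrt(2) * np.sin(np.pi * x) * np.exp(-1j * E1 * t)
            + np.sqrt(2) * np.sin(2 * np.pi * x) * np.exp(-1j * E2 * t)) / np.sqrt(2)

def velocity(x, t, h=1e-6):
    # Bohm guidance equation: v = Im[(dPsi/dx)/Psi] (finite-difference derivative)
    dpsi_dx = (psi(x + h, t) - psi(x - h, t)) / (2 * h)
    return np.imag(dpsi_dx / psi(x, t))

x, dt = 0.3, 1e-4                        # initial position and Euler time step
for step in range(10000):                # integrate the trajectory up to t = 1
    x += velocity(x, step * dt) * dt
print("position at t = 1:", x)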

Bohm's work was largely ignored for many years, but interest in his theory has increased. For more on Bohm's theory, see Whitaker, Chapter 7; D. Bohm and B. J. Hiley, The Undivided Universe, Routledge, 1992; D. Z. Albert, Scientific American, May 1994, p. 58; S. Goldstein, "Bohmian Mechanics," plato.stanford.edu/entries/qm-bohm; en.wikipedia.org/wiki/De_Broglie-Bohm_theory. One of the main characters in Rebecca Goldstein's novel Properties of Light (Houghton Mifflin, 2000) is modeled in part on David Bohm.

Some physicists argue that the wave function represents merely our knowledge about the state of the system (this epistemic interpretation is used in the Copenhagen viewpoint), whereas others argue that the wave function corresponds directly to an element of physical reality (the ontic interpretation). A paper published in 2012 used certain mild assumptions to prove a result that the authors argued strongly favored the ontic interpretation; M. F. Pusey, J. Barrett, and T. Rudolph, Nature Phys., 8, 476 (2012) (arxiv.org/abs/1111.3328). For discussion of this result (called the PBR theorem), see www.aps.org/units/gqi/newsletters/upload/vol6num3.pdf; mattleifer.info/2011/11/20/can-the-quantum-state-be-interpreted-statistically.

Although the experimental predictions of quantum mechanics are not in dispute, its conceptual interpretation is still the subject of heated debate. Excellent bibliographies with commentary on this subject are B. S. DeWitt and R. N. Graham, Am. J. Phys., 39, 724 (1971); L. E. Ballentine, Am. J. Phys., 55, 785 (1987). See also B. d'Espagnat, Conceptual Foundations of Quantum Mechanics, 2nd ed., Benjamin, 1976; M. Jammer, The Philosophy of Quantum Mechanics, Wiley, 1974; Whitaker, Chapter 8; P. Yam, Scientific American, June 1997, p. 124. An online bibliography by A. Cabello on the foundations of quantum mechanics lists 12 different interpretations of quantum mechanics (arxiv.org/abs/quant-ph/0012089); a Wikipedia article lists 14 interpretations of quantum mechanics (en.wikipedia.org/wiki/Interpretations_of_quantum_mechanics).


Matrix algebra is a key mathematical tool in doing modern-day quantum-mechanical calculations on molecules. Matrices also furnish a convenient way to formulate much of the theory of quantum mechanics. Matrix methods will be used in some later chapters, but this book is written so that the material on matrices can be omitted if time does not allow this material to be covered.

A matrix is a rectangular array of numbers. The numbers that compose a matrix are called the matrix elements. Let the matrix A have $m$ rows and $n$ columns, and let $a_{i j}$ $(i=1,2, \ldots, m$ and $j=1,2, \ldots, n)$ denote the element in row $i$ and column $j$. Then

\(
\mathbf{A}=\left(\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1 n} \\
a_{21} & a_{22} & \cdots & a_{2 n} \\
\cdot & \cdot & \cdots & \cdot \\
a_{m 1} & a_{m 2} & \cdots & a_{m n}
\end{array}\right)
\)

$\mathbf{A}$ is said to be an $m$ by $n$ matrix. Do not confuse $\mathbf{A}$ with a determinant (Section 8.3); a matrix need not be square and is not equal to a single number.

A row matrix (also called a row vector) is a matrix having only one row. A column matrix or column vector has only one column.

Two matrices $\mathbf{R}$ and $\mathbf{S}$ are equal if they have the same number of rows, the same number of columns, and equal corresponding elements. If $\mathbf{R}=\mathbf{S}$, then $r_{j k}=s_{j k}$ for $j=1, \ldots, m$ and $k=1, \ldots, n$, where $m$ and $n$ are the dimensions of $\mathbf{R}$ and $\mathbf{S}$. A matrix equation is thus equivalent to $m n$ scalar equations.

The sum of two matrices $\mathbf{A}$ and $\mathbf{B}$ is defined as the matrix formed by adding corresponding elements of $\mathbf{A}$ and $\mathbf{B}$; the sum is defined only if $\mathbf{A}$ and $\mathbf{B}$ have the same dimensions. If $\mathbf{P}=\mathbf{A}+\mathbf{B}$, then we have the $m n$ scalar equations $p_{j k}=a_{j k}+b_{j k}$ for $j=1, \ldots, m$ and $k=1, \ldots, n$.

\(
\begin{equation}
\text { If } \mathbf{P}=\mathbf{A}+\mathbf{B}, \text { then } p_{j k}=a_{j k}+b_{j k} \tag{7.105}
\end{equation}
\)

The product of the scalar $c$ and the matrix $\mathbf{A}$ is defined as the matrix formed by multiplying every element of $\mathbf{A}$ by $c$.

\(
\begin{equation}
\text { If } \mathbf{D}=c \mathbf{A}, \quad \text { then } \quad d_{j k}=c a_{j k} \tag{7.106}
\end{equation}
\)

If $\mathbf{A}$ is an $m$ by $n$ matrix and $\mathbf{B}$ is an $n$ by $p$ matrix, the matrix product $\mathbf{R}=\mathbf{A B}$ is defined to be the $m$ by $p$ matrix whose elements are

\(
\begin{equation}
r_{j k} \equiv a_{j 1} b_{1 k}+a_{j 2} b_{2 k}+\cdots+a_{j n} b_{n k}=\sum_{i=1}^{n} a_{j i} b_{i k} \tag{7.107}
\end{equation}
\)

To calculate $r_{j k}$ we take row $j$ of $\mathbf{A}$ (this row's elements are $a_{j 1}, a_{j 2}, \ldots, a_{j n}$), multiply each element of this row by the corresponding element in column $k$ of $\mathbf{B}$ (this column's elements are $b_{1 k}, b_{2 k}, \ldots, b_{n k}$), and add the $n$ products. For example, suppose

\(
\mathbf{A}=\left(\begin{array}{rrr}
-1 & 3 & \frac{1}{2} \\
0 & 4 & 1
\end{array}\right) \quad \text { and } \quad \mathbf{B}=\left(\begin{array}{rrr}
1 & 0 & -2 \\
2 & 5 & 6 \\
-8 & 3 & 10
\end{array}\right)
\)

The number of columns of $\mathbf{A}$ equals the number of rows of $\mathbf{B}$, so the matrix product $\mathbf{A B}$ is defined. $\mathbf{A B}$ is the product of the 2 by 3 matrix $\mathbf{A}$ and the 3 by 3 matrix $\mathbf{B}$, so $\mathbf{R} \equiv \mathbf{A B}$ is a 2 by 3 matrix. The element $r_{21}$ is found from the second row of $\mathbf{A}$ and the first column of $\mathbf{B}$ as follows: $r_{21}=0(1)+4(2)+1(-8)=0$. Calculation of the remaining elements gives

\(
\mathbf{R}=\left(\begin{array}{rrr}
1 & 16 \frac{1}{2} & 25 \\
0 & 23 & 34
\end{array}\right)
\)
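This hand calculation is easily checked numerically; here is a minimal Python sketch using numpy (our own addition, not part of the text):

import numpy as np

A = np.array([[-1.0, 3.0, 0.5],
              [ 0.0, 4.0, 1.0]])         # 2 by 3
B = np.array([[ 1.0, 0.0, -2.0],
              [ 2.0, 5.0,  6.0],
              [-8.0, 3.0, 10.0]])        # 3 by 3

R = A @ B                                # row-into-column rule of Eq. (7.107)
print(R)                                 # [[ 1.  16.5 25. ], [ 0.  23.  34. ]]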

Matrix multiplication is not commutative; the products $\mathbf{A B}$ and $\mathbf{B A}$ need not be equal. (In the preceding example, the product $\mathbf{B A}$ happens to be undefined.) Matrix multiplication can be shown to be associative, meaning that $\mathbf{A}(\mathbf{B C})=(\mathbf{A B}) \mathbf{C}$ and can be shown to be distributive, meaning that $\mathbf{A}(\mathbf{B}+\mathbf{C})=\mathbf{A B}+\mathbf{A C}$ and $(\mathbf{B}+\mathbf{C}) \mathbf{D}=\mathbf{B D}+\mathbf{C D}$.

A matrix with equal numbers of rows and columns is a square matrix. The order of a square matrix equals the number of rows.

If $\mathbf{A}$ is a square matrix, its square, cube, $\ldots$ are defined by $\mathbf{A}^{2} \equiv \mathbf{A A}$, $\mathbf{A}^{3} \equiv \mathbf{A A A}, \ldots$

The elements $a_{11}, a_{22}, \ldots, a_{n n}$ of a square matrix of order $n$ lie on its principal diagonal. A diagonal matrix is a square matrix having zero as the value of each element not on the principal diagonal.

The trace of a square matrix is the sum of the elements on the principal diagonal. If $\mathbf{A}$ is a square matrix of order $n$, its trace is $\operatorname{Tr} \mathbf{A}=\sum_{i=1}^{n} a_{i i}$.

A diagonal matrix whose diagonal elements are each equal to 1 is called a unit matrix or an identity matrix. The $(j, k)$ th element of a unit matrix is the Kronecker delta $\delta_{j k}$; $(\mathbf{I})_{j k}=\delta_{j k}$, where $\mathbf{I}$ is a unit matrix. For example, the unit matrix of order 3 is

\(
\left(\begin{array}{lll}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array}\right)
\)

Let $\mathbf{B}$ be a square matrix of the same order as a unit matrix $\mathbf{I}$. The $(j, k)$ th element of the product $\mathbf{I B}$ is given by (7.107) as $(\mathbf{I B})_{j k}=\sum_{i}(\mathbf{I})_{j i} b_{i k}=\sum_{i} \delta_{j i} b_{i k}=b_{j k}$. Since the $(j, k)$ th elements of $\mathbf{I B}$ and $\mathbf{B}$ are equal for all $j$ and $k$, we have $\mathbf{I B}=\mathbf{B}$. Similarly, we find $\mathbf{B I}=\mathbf{B}$. Multiplication by a unit matrix has no effect.

A matrix all of whose elements are zero is called a zero matrix, symbolized by $\mathbf{0}$. A nonzero matrix has at least one element not equal to zero. These definitions apply to row vectors and column vectors.

Most matrices in quantum chemistry are either square matrices or row or column matrices.

Matrices and Quantum Mechanics

In Section 7.1, the integral $\int f_{m}^{*} \hat{A} f_{n} d \tau$ was called a matrix element of $\hat{A}$. We now justify this name by showing that such integrals obey the rules of matrix algebra.

Let the functions $f_{1}, f_{2}, \ldots$ be a complete, orthonormal set and let the symbol $\left\{f_{i}\right\}$ denote this complete set. The numbers $A_{m n} \equiv\left\langle f_{m}\right| \hat{A}\left|f_{n}\right\rangle \equiv \int f_{m}^{*} \hat{A} f_{n} d \tau$ are called matrix elements of the linear operator $\hat{A}$ in the basis $\left\{f_{i}\right\}$. The square matrix

\(
\begin{equation}
\mathbf{A}=\left(\begin{array}{cccc}
A_{11} & A_{12} & A_{13} & \cdots \\
A_{21} & A_{22} & A_{23} & \cdots \\
A_{31} & A_{32} & A_{33} & \cdots \\
\cdot & \cdot & \cdot & \cdots
\end{array}\right) \tag{7.108}
\end{equation}
\)

is called the matrix representative of the linear operator $\hat{A}$ in the $\left\{f_{i}\right\}$ basis. Since $\left\{f_{i}\right\}$ usually consists of an infinite number of functions, $\mathbf{A}$ is an infinite-order matrix.

Consider the addition of matrix-element integrals. Suppose $\hat{C}=\hat{A}+\hat{B}$. A typical matrix element of $\hat{C}$ in the $\left\{f_{i}\right\}$ basis is

\(
\begin{aligned}
C_{m n} & =\left\langle f_{m}\right| \hat{C}\left|f_{n}\right\rangle=\left\langle f_{m}\right| \hat{A}+\hat{B}\left|f_{n}\right\rangle=\int f_{m}^{*}(\hat{A}+\hat{B}) f_{n} d \tau \\
& =\int f_{m}^{*} \hat{A} f_{n} d \tau+\int f_{m}^{*} \hat{B} f_{n} d \tau=A_{m n}+B_{m n}
\end{aligned}
\)

Thus, if $\hat{C}=\hat{A}+\hat{B}$, then $C_{m n}=A_{m n}+B_{m n}$, which is the rule (7.105) for matrix addition. Hence, if $\hat{C}=\hat{A}+\hat{B}$, then $\mathbf{C}=\mathbf{A}+\mathbf{B}$, where $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are the matrix representatives of the operators $\hat{A}$, $\hat{B}$, $\hat{C}$.

Similarly, if $\hat{P}=c \hat{S}$, where $c$ is a constant, then we find (Prob. 7.52) $P_{j k}=c S_{j k}$, which is the rule for multiplication of a matrix by a scalar.

Finally, suppose that $\hat{R}=\hat{S} \hat{T}$. We have

\(
\begin{equation}
R_{m n}=\int f_{m}^{*} \hat{R} f_{n} d \tau=\int f_{m}^{*} \hat{S} \hat{T} f_{n} d \tau \tag{7.109}
\end{equation}
\)

The function $\hat{T} f_{n}$ can be expanded in terms of the complete orthonormal set $\left\{f_{i}\right\}$ as [Eq. (7.41)]:

\(
\hat{T} f_{n}=\sum_{i} c_{i} f_{i}=\sum_{i}\left\langle f_{i} \mid \hat{T} f_{n}\right\rangle f_{i}=\sum_{i}\left\langle f_{i}\right| \hat{T}\left|f_{n}\right\rangle f_{i}=\sum_{i} T_{i n} f_{i}
\)

and $R_{m n}$ becomes

\(
\begin{equation}
R_{m n}=\int f_{m}^{*} \hat{S} \sum_{i} T_{i n} f_{i} d \tau=\sum_{i} \int f_{m}^{*} \hat{S} f_{i} d \tau T_{i n}=\sum_{i} S_{m i} T_{i n} \tag{7.110}
\end{equation}
\)

The equation $R_{m n}=\sum_{i} S_{m i} T_{i n}$ is the rule (7.107) for matrix multiplication. Hence, if $\hat{R}=\hat{S} \hat{T}$, then $\mathbf{R}=\mathbf{S T}$.

We have proved that the matrix representatives of linear operators in a complete orthonormal basis set obey the same equations that the operators obey. Combining Eqs. (7.109) and (7.110), we have the useful sum rule

\(
\begin{equation}
\sum_{i}\langle m| \hat{S}|i\rangle\langle i| \hat{T}|n\rangle=\langle m| \hat{S} \hat{T}|n\rangle \tag{7.111}
\end{equation}
\)

Suppose the basis set $\left\{f_{i}\right\}$ is chosen to be the complete, orthonormal set of eigenfunctions $g_{i}$ of $\hat{A}$, where $\hat{A} g_{i}=a_{i} g_{i}$. Then the matrix element $A_{m n}$ is

\(
A_{m n}=\left\langle g_{m}\right| \hat{A}\left|g_{n}\right\rangle=\left\langle g_{m} \mid \hat{A} g_{n}\right\rangle=\left\langle g_{m} \mid a_{n} g_{n}\right\rangle=a_{n}\left\langle g_{m} \mid g_{n}\right\rangle=a_{n} \delta_{m n}
\)

The matrix that represents $\hat{A}$ in the basis of orthonormal $\hat{A}$ eigenfunctions is thus a diagonal matrix whose diagonal elements are the eigenvalues of $\hat{A}$. Conversely, one can prove (Prob. 7.53) that, when the matrix representative of $\hat{A}$ using a complete orthonormal set is a diagonal matrix, then the basis functions are the eigenfunctions of $\hat{A}$ and the diagonal matrix elements are the eigenvalues of $\hat{A}$.

We have used the complete, orthonormal basis $\left\{f_{i}\right\}$ to represent the operator $\hat{A}$ by the matrix $\mathbf{A}$ of (7.108). The basis $\left\{f_{i}\right\}$ can also be used to represent an arbitrary function $u$, as follows. We expand $u$ in terms of the complete set $\left\{f_{i}\right\}$, according to $u=\sum_{i} u_{i} f_{i}$, where the expansion coefficients $u_{i}$ are numbers (not functions) given by Eq. (7.40) as $u_{i}=\left\langle f_{i} \mid u\right\rangle$. The set of expansion coefficients $u_{1}, u_{2}, \ldots$ is formed into a column matrix (column vector), which we call $\mathbf{u}$, and $\mathbf{u}$ is said to be the representative of the function $u$ in the $\left\{f_{i}\right\}$ basis. If $\hat{A} u=w$, where $w$ is another function, then we can show (Prob. 7.54) that $\mathbf{A u}=\mathbf{w}$, where $\mathbf{A}$, $\mathbf{u}$, and $\mathbf{w}$ are the matrix representatives of $\hat{A}$, $u$, and $w$ in the $\left\{f_{i}\right\}$ basis. Thus, the effect of the linear operator $\hat{A}$ on an arbitrary function $u$ can be found if the matrix representative $\mathbf{A}$ of $\hat{A}$ is known. Hence, knowing the matrix representative $\mathbf{A}$ is equivalent to knowing what the operator $\hat{A}$ is.
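To make the notion of a matrix representative concrete, the following Python sketch (our own illustration; reduced units $\hbar=m=l=1$ are assumed) computes the matrix of the particle-in-a-box Hamiltonian in the basis of the first four box eigenfunctions by numerical quadrature. Because the basis functions are eigenfunctions of $\hat{H}$, the matrix comes out diagonal, with the eigenvalues $n^{2} \pi^{2} / 2$ on the diagonal, as shown above:

import numpy as np

N = 4                                    # size of the truncated basis
x = np.linspace(0.0, 1.0, 2001)          # box of length l = 1
dx = x[1] - x[0]

def f(n, x):
    # particle-in-a-box eigenfunction (hbar = m = l = 1)
    return np.sqrt(2.0) * np.sin(n * np.pi * x)

def Hf(n, x):
    # H f_n = -(1/2) f_n'' = (n^2 pi^2 / 2) f_n inside the box
    return 0.5 * (n * np.pi) ** 2 * f(n, x)

# A_mn = <f_m|H|f_n>, approximated by a simple quadrature sum
A = np.array([[np.sum(f(m, x) * Hf(n, x)) * dx for n in range(1, N + 1)]
              for m in range(1, N + 1)])
print(np.round(A, 4))                    # diagonal matrix with entries n^2 pi^2 / 2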


The Variation Method

Keywords:

Variation Method: An approximation technique used to estimate the ground-state energy of a system without solving the Schrödinger equation

Variation Theorem: A theorem stating that for any normalized, well-behaved function that satisfies the boundary conditions of a system, the expectation value of the Hamiltonian is an upper bound to the ground-state energy

Hamiltonian Operator ($\hat{H}$): The operator corresponding to the total energy of the system, including both kinetic and potential energies

Ground-State Energy ($E_1$): The lowest energy eigenvalue of the Hamiltonian operator for a given system

Trial Variation Function ($\phi$): A well-behaved function used in the variation method to approximate the ground-state energy

Variational Integral: The integral of the product of the trial variation function, the Hamiltonian operator, and the trial variation function, divided by the integral of the square of the trial variation function

Normalization Constant ($N$): A constant used to ensure that the trial variation function is normalized

Eigenfunctions ($\psi_k$): The stationary-state wave functions that are solutions to the Schrödinger equation for a given Hamiltonian

Eigenvalues ($E_k$): The energy values corresponding to the eigenfunctions of the Hamiltonian operator

Orthonormal Set: A set of functions that are both orthogonal and normalized

Kronecker Delta ($\delta_{jk}$): A function that is 1 if the indices are equal and 0 otherwise

Parabolic Function: A function of the form $\phi=x(l-x)$ for a particle in a one-dimensional box

Harmonic Oscillator: A system in which the potential energy is proportional to the square of the displacement from equilibrium

Gaussian Elimination: A method for solving systems of linear equations by transforming the system's matrix into an upper triangular form

Gauss–Jordan Elimination: An extension of Gaussian elimination that reduces the matrix to reduced row echelon form

Linear Variation Function: A linear combination of linearly independent functions used in the variation method

Overlap Integral ($S_{jk}$): The integral of the product of two basis functions

Secular Equation: An algebraic equation derived from the variation method that determines the approximate energies of the system

Matrix Diagonalization: The process of finding the eigenvalues and eigenvectors of a matrix

Hermitian Matrix: A matrix that is equal to its conjugate transpose

Orthogonal Matrix: A matrix whose inverse is equal to its transpose

Unitary Matrix: A matrix whose inverse is equal to its conjugate transpose

Eigenvector: A non-zero vector that changes by only a scalar factor when a linear transformation is applied

Characteristic Equation: An equation that determines the eigenvalues of a matrix

Symmetric Matrix: A matrix that is equal to its transpose

Diagonal Matrix: A matrix in which the entries outside the main diagonal are all zero

Tridiagonal Matrix: A matrix that has non-zero elements only on the main diagonal and the diagonals immediately above and below it

QR Method: An algorithm for finding the eigenvalues and eigenvectors of a matrix

Cyclic Jacobi Method: An iterative method for diagonalizing a symmetric matrix

Gaussian Variational Function: A trial function of the form $e^{-c x^{2}}$ used in the variation method

Block-Diagonal Form: A matrix form where the matrix is divided into smaller square matrices along the diagonal

Normalization Condition: The condition that the integral of the square of the trial variation function is equal to 1

Expectation Value: The average value of a physical quantity in a given quantum state

Schmidt Orthogonalization: A method for orthogonalizing a set of functions

Symmetric Orthogonalization: A method for orthogonalizing a set of functions using the overlap matrix

Rayleigh-Ritz Theorem: A theorem that provides an upper bound to the ground-state energy using the variation method

Numerov Method: A numerical method for solving differential equations

Particle-in-a-Box: A model system in quantum mechanics where a particle is confined to a one-dimensional box with infinite potential walls

Quartic Oscillator: A system in which the potential energy is proportional to the fourth power of the displacement from equilibrium

Double-Well Potential: A potential energy function with two minima separated by a barrier

Harmonic Oscillator Basis Functions: The eigenfunctions of the harmonic oscillator Hamiltonian

Particle-in-a-Box Basis Functions: The eigenfunctions of the particle-in-a-box Hamiltonian

Radial Equation: The part of the Schrödinger equation that depends only on the radial coordinate in spherical coordinates

Hydrogen Atom: A model system in quantum mechanics consisting of a single electron orbiting a proton

Eigenvalue Problem: The problem of finding the eigenvalues and eigenvectors of a matrix or operator

Matrix Algebra: The branch of mathematics that deals with matrices and their operations

Linear Transformation: A transformation that preserves the operations of addition and scalar multiplication

Unit Matrix: A square matrix with ones on the main diagonal and zeros elsewhere

Inverse Matrix: A matrix that, when multiplied by the original matrix, yields the unit matrix

Characteristic Polynomial: The polynomial obtained from the characteristic equation of a matrix

To deal with the time-independent Schrödinger equation for systems (such as atoms or molecules) that contain interacting particles, we must use approximation methods. This chapter discusses the variation method, which allows us to approximate the ground-state energy of a system without solving the Schrödinger equation. The variation method is based on the following theorem:

THE VARIATION THEOREM

Given a system whose Hamiltonian operator $\hat{H}$ is time independent and whose lowest-energy eigenvalue is $E_{1}$, if $\phi$ is any normalized, well-behaved function of the coordinates of the system's particles that satisfies the boundary conditions of the problem, then

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau \geq E_{1}, \quad \phi \text { normalized } \tag{8.1}
\end{equation}
\)

The variation theorem allows us to calculate an upper bound for the system's ground-state energy.

To prove (8.1), we expand $\phi$ in terms of the complete, orthonormal set of eigenfunctions of $\hat{H}$, the stationary-state eigenfunctions $\psi_{k}$ :

\(
\begin{equation}
\phi=\sum_{k} a_{k} \psi_{k} \tag{8.2}
\end{equation}
\)

where

\(
\begin{equation}
\hat{H} \psi_{k}=E_{k} \psi_{k} \tag{8.3}
\end{equation}
\)

Note that the expansion (8.2) requires that $\phi$ obey the same boundary conditions as the $\psi_{k}$ 's. Substitution of (8.2) into the left side of (8.1) gives

\(
\int \phi^{*} \hat{H} \phi d \tau=\int \sum_{k} a_{k}^{*} \psi_{k}^{*} \hat{H} \sum_{j} a_{j} \psi_{j} d \tau=\int \sum_{k} a_{k}^{*} \psi_{k}^{*} \sum_{j} a_{j} \hat{H} \psi_{j} d \tau
\)

Using the eigenvalue equation (8.3) and assuming the validity of interchanging the integration and the infinite summations, we get

\(
\begin{aligned}
\int \phi^{*} \hat{H} \phi d \tau & =\int \sum_{k} a_{k}^{*} \psi_{k}^{*} \sum_{j} a_{j} E_{j} \psi_{j} d \tau=\sum_{k} \sum_{j} a_{k}^{*} a_{j} E_{j} \int \psi_{k}^{*} \psi_{j} d \tau \\
& =\sum_{k} \sum_{j} a_{k}^{*} a_{j} E_{j} \delta_{k j}
\end{aligned}
\)

where the orthonormality of the eigenfunctions $\psi_{k}$ was used. We perform the sum over $j$, and, as usual, the Kronecker delta makes all terms zero except the one with $j=k$, giving

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau=\sum_{k} a_{k}^{*} a_{k} E_{k}=\sum_{k}\left|a_{k}\right|^{2} E_{k} \tag{8.4}
\end{equation}
\)

Since $E_{1}$ is the lowest-energy eigenvalue of $\hat{H}$, we have $E_{k} \geq E_{1}$. Since $\left|a_{k}\right|^{2}$ is never negative, we can multiply the inequality $E_{k} \geq E_{1}$ by $\left|a_{k}\right|^{2}$ without changing the direction of the inequality sign to get $\left|a_{k}\right|^{2} E_{k} \geq\left|a_{k}\right|^{2} E_{1}$. Therefore, $\sum_{k}\left|a_{k}\right|^{2} E_{k} \geq \sum_{k}\left|a_{k}\right|^{2} E_{1}$, and use of (8.4) gives

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau=\sum_{k}\left|a_{k}\right|^{2} E_{k} \geq \sum_{k}\left|a_{k}\right|^{2} E_{1}=E_{1} \sum_{k}\left|a_{k}\right|^{2} \tag{8.5}
\end{equation}
\)

Because $\phi$ is normalized, we have $\int \phi^{*} \phi d \tau=1$. Substitution of the expansion (8.2) into the normalization condition gives

\(
\begin{gather}
1=\int \phi^{*} \phi d \tau=\int \sum_{k} a_{k}^{*} \psi_{k}^{*} \sum_{j} a_{j} \psi_{j} d \tau=\sum_{k} \sum_{j} a_{k}^{*} a_{j} \int \psi_{k}^{*} \psi_{j} d \tau=\sum_{k} \sum_{j} a_{k}^{*} a_{j} \delta_{k j} \\
1=\sum_{k}\left|a_{k}\right|^{2} \tag{8.6}
\end{gather}
\)

[Note that in deriving Eqs. (8.4) and (8.6) we essentially repeated the derivations of Eqs. (7.70) and (7.69), respectively.]

Use of (8.6) in (8.5) gives the variation theorem (8.1):

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau \geq E_{1}, \quad \phi \text { normalized } \tag{8.7}
\end{equation}
\)

Suppose we have a function $\phi$ that is not normalized. To apply the variation theorem, we multiply $\phi$ by a normalization constant $N$ so that $N \phi$ is normalized. Replacing $\phi$ by $N \phi$ in (8.7), we have

\(
\begin{equation}
|N|^{2} \int \phi^{*} \hat{H} \phi d \tau \geq E_{1} \tag{8.8}
\end{equation}
\)

$N$ is determined by $\int(N \phi)^{*} N \phi d \tau=|N|^{2} \int \phi^{*} \phi d \tau=1$; so $|N|^{2}=1 / \int \phi^{*} \phi d \tau$ and (8.8) becomes

\(
\begin{equation}
\frac{\int \phi^{*} \hat{H} \phi d \tau}{\int \phi^{*} \phi d \tau} \geq E_{1} \tag{8.9}
\end{equation}
\)

where $\phi$ is any well-behaved function (not necessarily normalized) that satisfies the boundary conditions of the problem.

The function $\phi$ is called a trial variation function, and the integral in (8.1) [or the ratio of integrals in (8.9)] is called the variational integral. To arrive at a good approximation to the ground-state energy $E_{1}$, we try many trial variation functions and look for the one that gives the lowest value of the variational integral. From (8.1), the lower the value of the variational integral, the better the approximation we have to $E_{1}$. One way to disprove quantum mechanics would be to find a trial variation function that made the variational integral less than $E_{1}$ for some system where $E_{1}$ is known.

Let $\psi_{1}$ be the true ground-state wave function:

\(
\begin{equation}
\hat{H} \psi_{1}=E_{1} \psi_{1} \tag{8.10}
\end{equation}
\)

If we happened to be lucky enough to hit upon a variation function that was equal to $\psi_{1}$, then, using (8.10) in (8.1), we see that the variational integral will be equal to $E_{1}$. Thus the ground-state wave function gives the minimum value of the variational integral. We therefore expect that the lower the value of the variational integral, the closer the trial variation function will approach the true ground-state wave function. However, it turns out that the variational integral approaches $E_{1}$ a lot faster than the trial variation function approaches $\psi_{1}$, and it is possible to get a rather good approximation to $E_{1}$ using a rather poor $\phi$.

In practice, one usually puts several parameters into the trial function $\phi$ and then varies these parameters so as to minimize the variational integral. Successful use of the variation method depends on the ability to make a good choice for the trial function.

Let us look at some examples of the variation method. Although the real utility of the method is for problems to which we do not know the true solutions, we will consider problems that are exactly solvable so that we can judge the accuracy of our results.

EXAMPLE

Devise a trial variation function for the particle in a one-dimensional box of length $l$.
The wave function is zero outside the box and the boundary conditions require that $\psi=0$ at $x=0$ and at $x=l$. The variation function $\phi$ must meet these boundary conditions of being zero at the ends of the box. As noted after Eq. (4.57), the ground-state $\psi$ has no nodes interior to the boundary points, so it is desirable that $\phi$ have no interior nodes. A simple function that has these properties is the parabolic function

\(
\begin{equation}
\phi=x(l-x) \quad \text { for } 0 \leq x \leq l \tag{8.11}
\end{equation}
\)

and $\phi=0$ outside the box. Since we have not normalized $\phi$, we use Eq. (8.9). Inside the box the Hamiltonian is $-\left(\hbar^{2} / 2 m\right) d^{2} / d x^{2}$. For the numerator and denominator of (8.9), we have

\(
\begin{gather}
\int \phi^{*} \hat{H} \phi d \tau=-\frac{\hbar^{2}}{2 m} \int_{0}^{l}\left(l x-x^{2}\right) \frac{d^{2}}{d x^{2}}\left(l x-x^{2}\right) d x=\frac{\hbar^{2}}{m} \int_{0}^{l}\left(l x-x^{2}\right) d x=\frac{\hbar^{2} l^{3}}{6 m} \tag{8.12}\\
\int \phi^{*} \phi d \tau=\int_{0}^{l} x^{2}(l-x)^{2} d x=\frac{l^{5}}{30} \tag{8.13}
\end{gather}
\)

Substituting in the variation theorem (8.9), we get

\(
E_{1} \leq \frac{5 h^{2}}{4 \pi^{2} m l^{2}}=0.1266515 \frac{h^{2}}{m l^{2}}
\)

From Eq. (2.20), $E_{1}=h^{2} / 8 m l^{2}=0.125 h^{2} / m l^{2}$, and the energy error is $1.3 \%$.
Since $\int|\phi|^{2} d \tau=l^{5} / 30$, the normalized form of (8.11) is $\left(30 / l^{5}\right)^{1 / 2} x(l-x)$.
Figure 7.3 shows that this function rather closely resembles the true ground-state particle-in-a-box wave function.
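This example is easily verified with a computer-algebra sketch. The short Python fragment below (our own, using sympy, in reduced units $\hbar=m=1$) evaluates the variational integral (8.9) for $\phi=x(l-x)$ and compares it with the exact ground-state energy:

import sympy as sp

x, l = sp.symbols('x l', positive=True)
phi = x * (l - x)                        # parabolic trial function, Eq. (8.11)

# hbar = m = 1; inside the box H phi = -(1/2) phi''
num = -sp.Rational(1, 2) * sp.integrate(phi * sp.diff(phi, x, 2), (x, 0, l))
den = sp.integrate(phi**2, (x, 0, l))
W = sp.simplify(num / den)               # variational integral, Eq. (8.9)

E1 = sp.pi**2 / (2 * l**2)               # exact ground-state energy, hbar = m = 1
print(W)                                 # 5/l**2
print(sp.N(W / E1))                      # about 1.0132, i.e. a 1.3% overestimate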

EXERCISE A one-particle, one-dimensional system has the potential energy function $V=V_{0}$ for $0 \leq x \leq l$ and $V=\infty$ elsewhere (where $V_{0}$ is a constant). (a) Use the variation function $\phi=\sin (\pi x / l)$ for $0 \leq x \leq l$ and $\phi=0$ elsewhere to estimate the ground-state energy of this system. (b) Explain why the result of (a) is the exact ground-state energy. Hint: See one of the Chapter 4 problems. (Answer: (a) $V_{0}+h^{2} / 8 m l^{2}$.)

The preceding example did not have a parameter in the trial function. The next example does.

EXAMPLE

For the one-dimensional harmonic oscillator, devise a variation function with a parameter and find the optimum value of the parameter. Estimate the ground-state energy.
The variation function $\phi$ must be quadratically integrable and so must go to zero as $x$ goes to $\pm \infty$. The function $e^{-x}$ has the proper behavior at $+\infty$ but becomes infinite at $-\infty$. The function $e^{-x^{2}}$ has the proper behavior at $\pm \infty$. However, it is dimensionally unsatisfactory, since the power to which we raise $e$ must be dimensionless. This can be seen from the Taylor series $e^{z}=1+z+z^{2} / 2!+\cdots$ [Eq. (4.44)]. Since all the terms in this series must have the same dimensions, $z$ must have the same dimensions as 1 ; that is, $z$ in $e^{z}$ must be dimensionless. Hence we modify $e^{-x^{2}}$ to $e^{-c x^{2}}$, where $c$ has units of length ${ }^{-2}$. We shall take $c$ as a variational parameter. The true ground-state $\psi$ must have no nodes. Also, since $V=\frac{1}{2} k x^{2}$ is an even function, the ground-state $\psi$ must have definite parity and must be an even function, since an odd function has a node at the origin. The trial function $e^{-c x^{2}}$ has the desired properties of having no nodes and of being an even function.
Use of (4.30) for $\hat{H}$ and Appendix integrals gives (Prob. 8.3)

\(
\begin{aligned}
\int \phi^{*} \hat{H} \phi d \tau & =-\frac{\hbar^{2}}{2 m} \int_{-\infty}^{\infty} e^{-c x^{2}} \frac{d^{2} e^{-c x^{2}}}{d x^{2}} d x+2 \pi^{2} \nu^{2} m \int_{-\infty}^{\infty} x^{2} e^{-2 c x^{2}} d x \\
& =\frac{\hbar^{2}}{m}\left(\frac{\pi c}{8}\right)^{1 / 2}+\nu^{2} m\left(\frac{\pi^{5}}{8}\right)^{1 / 2} c^{-3 / 2} \\
\int \phi^{*} \phi d \tau & =\int_{-\infty}^{\infty} e^{-2 c x^{2}} d x=2 \int_{0}^{\infty} e^{-2 c x^{2}} d x=\left(\frac{\pi}{2 c}\right)^{1 / 2}
\end{aligned}
\)

The variational integral $W$ is

\(
\begin{equation}
W \equiv \frac{\int \phi^{*} \hat{H} \phi d \tau}{\int \phi^{*} \phi d \tau}=\frac{\hbar^{2} c}{2 m}+\frac{\pi^{2} \nu^{2} m}{2 c} \tag{8.14}
\end{equation}
\)

We now vary $c$ to minimize the variational integral (8.14). A necessary condition that $W$ be minimized is that

\(
\begin{gather}
\frac{d W}{d c}=0=\frac{\hbar^{2}}{2 m}-\frac{\pi^{2} \nu^{2} m}{2 c^{2}} \\
c= \pm \pi \nu m / \hbar \tag{8.15}
\end{gather}
\)

The negative root $c=-\pi \nu m / \hbar$ is rejected, since it would make $\phi=e^{-c x^{2}}$ not quadratically integrable. Substitution of $c=\pi \nu m / \hbar$ into (8.14) gives $W=\frac{1}{2} h \nu$. This is the exact ground-state harmonic-oscillator energy. With $c=\pi \nu m / \hbar$ the variation function $\phi$ is the same (except for being unnormalized) as the harmonic-oscillator ground-state wave function (4.53) and (4.31).
For the normalized harmonic-oscillator variation function $\phi=(2 c / \pi)^{1 / 4} e^{-c x^{2}}$, a large value of $c$ makes $\phi$ fall off very rapidly from its maximum value at $x=0$. This makes the probability density large only near $x=0$. The potential energy $V=\frac{1}{2} k x^{2}$ is low near $x=0$, so a large $c$ means a low $\langle V\rangle=\langle\phi| V|\phi\rangle$. [Note also that $\langle V\rangle$ equals the second term on the right side of (8.14).] However, because a large $c$ makes $\phi$ fall off very rapidly from its maximum, it makes $|d \phi / d x|$ large in the region near
$x=0$. From Prob. 7.7b, a large $|d \phi / d x|$ means a large value of $\langle T\rangle$ [which equals the first term on the right side of (8.14)]. The optimum value of $c$ minimizes the sum $\langle T\rangle+\langle V\rangle=W$. In atoms and molecules, the true wave function is a compromise between the tendency to minimize $\langle V\rangle$ by confining the electrons to regions of low $V$ (near the nuclei) and the tendency to minimize $\langle T\rangle$ by allowing the electron probability density to spread out over a large region.
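The minimization just described can also be carried out numerically; a brief Python sketch (our own illustration, in reduced units $\hbar=m=\nu=1$, where the exact answer $\frac{1}{2} h \nu$ equals $\pi$) is:

import numpy as np
from scipy.optimize import minimize_scalar

def W(c):
    # variational integral, Eq. (8.14), with hbar = m = nu = 1
    return c / 2 + np.pi**2 / (2 * c)

res = minimize_scalar(W, bounds=(0.1, 20.0), method='bounded')
print(res.x)                             # optimum c, approximately pi = pi nu m / hbar
print(res.fun)                           # minimum W, approximately pi = h nu / 2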
EXERCISE Consider a one-particle, one-dimensional system with $V=0$ for $-\frac{1}{2} l \leq x \leq \frac{1}{2} l$ and $V=b \hbar^{2} / m l^{2}$ elsewhere (where $b$ is a positive constant) (Fig. 2.5 with $V_{0}=b \hbar^{2} / m l^{2}$ and the origin shifted). (a) For the variation function $\phi=(x-c)(x+c)=x^{2}-c^{2}$ for $-c \leq x \leq c$ and $\phi=0$ elsewhere, where the variational parameter $c$ satisfies $c>\frac{1}{2} l$, one finds that the variational integral $W$ is given by

\(
W=\frac{\hbar^{2}}{m l^{2}}\left[\frac{5 l^{2}}{4 c^{2}}+b\left(1-\frac{15 l}{16 c}+\frac{5 l^{3}}{32 c^{3}}-\frac{3 l^{5}}{256 c^{5}}\right)\right]
\)

Sketch $\phi$ and $V$ on the same plot. Find the equation satisfied by the value of $c$ that minimizes $W$. (b) Find the optimum $c$ and $W$ for $V_{0}=20 \hbar^{2} / m l^{2}$ and compare with the true ground-state energy $2.814 \hbar^{2} / m l^{2}$ (Prob. 4.31c). (Hint: You may want to use the Solver in a spreadsheet or a programmable calculator to find $c / l$.) (Answer: (a) $48 t^{4}-24 t^{2}-128 t^{3} / b+3=0$, where $t \equiv c / l$. (b) $c=0.6715 l$, $W=3.454 \hbar^{2} / m l^{2}$.)
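Instead of a spreadsheet Solver, part (b) of this exercise can be done with a few lines of Python; the sketch below (our own, using scipy's brentq root finder) solves the quartic of part (a) for $b=20$ and evaluates $W$ in units of $\hbar^{2} / m l^{2}$:

from scipy.optimize import brentq

b = 20.0                                 # V0 = 20 hbar^2/(m l^2)

def quartic(t):
    # minimization condition from part (a), with t = c/l
    return 48 * t**4 - 24 * t**2 - 128 * t**3 / b + 3

def W(t):
    # variational integral in units of hbar^2/(m l^2)
    return (5 / (4 * t**2)
            + b * (1 - 15 / (16 * t) + 5 / (32 * t**3) - 3 / (256 * t**5)))

t_opt = brentq(quartic, 0.55, 0.9)       # root satisfying c > l/2
print(t_opt, W(t_opt))                   # about 0.6715 and 3.454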


The variation method as presented in the last section gives information about only the ground-state energy and wave function. We now discuss extension of the variation method to excited states. (See also Section 8.5.)

Consider how we might extend the variation method to estimate the energy of the first excited state. We number the stationary states of the system $1,2,3, \ldots$ in order of increasing energy:

\(
E_{1} \leq E_{2} \leq E_{3} \leq \cdots
\)

We showed that for a normalized variational function $\phi$ [Eqs. (8.4) and (8.6)]

\(
\int \phi^{*} \hat{H} \phi d \tau=\sum_{k=1}^{\infty}\left|a_{k}\right|^{2} E_{k} \quad \text { and } \quad \int \phi^{*} \phi d \tau=\sum_{k=1}^{\infty}\left|a_{k}\right|^{2}=1
\)

where the $a_{k}$ 's are the expansion coefficients in $\phi=\sum_{k} a_{k} \psi_{k}$ [Eq. (8.2)]. We have $a_{k}=\left\langle\psi_{k} \mid \phi\right\rangle$ [Eq. (7.40)]. Let us restrict ourselves to normalized functions $\phi$ that are orthogonal to the true ground-state wave function $\psi_{1}$. Then we have $a_{1}=\left\langle\psi_{1} \mid \phi\right\rangle=0$ and

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau=\sum_{k=2}^{\infty}\left|a_{k}\right|^{2} E_{k} \quad \text { and } \quad \int \phi^{*} \phi d \tau=\sum_{k=2}^{\infty}\left|a_{k}\right|^{2}=1 \tag{8.16}
\end{equation}
\)

For $k \geq 2$, we have $E_{k} \geq E_{2}$ and $\left|a_{k}\right|^{2} E_{k} \geq\left|a_{k}\right|^{2} E_{2}$. Hence

\(
\begin{equation}
\sum_{k=2}^{\infty}\left|a_{k}\right|^{2} E_{k} \geq \sum_{k=2}^{\infty}\left|a_{k}\right|^{2} E_{2}=E_{2} \sum_{k=2}^{\infty}\left|a_{k}\right|^{2}=E_{2} \tag{8.17}
\end{equation}
\)

Combining (8.16) and (8.17), we have the desired result:

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau \geq E_{2} \quad \text { if } \quad \int \psi_{1}^{*} \phi d \tau=0 \quad \text { and } \quad \int \phi^{*} \phi d \tau=1 \tag{8.18}
\end{equation}
\)

The inequality (8.18) allows us to get an upper bound to the energy $E_{2}$ of the first excited state. However, the restriction $\left\langle\psi_{1} \mid \phi\right\rangle=0$ makes this method troublesome to apply.

For certain systems, it is possible to be sure that $\left\langle\psi_{1} \mid \phi\right\rangle=0$ even though we do not know the true ground-state wave function. An example is a one-dimensional problem for which $V$ is an even function of $x$. In this case the ground-state wave function is always an even function, while the first excited-state wave function is odd. (All the wave functions must be of definite parity. The ground-state wave function is nodeless, and, since an odd function vanishes at the origin, the ground-state wave function must be even. The first excited-state wave function has one node and must be odd.) Therefore, for odd trial functions, it must be true that $\left\langle\psi_{1} \mid \phi\right\rangle=0$; the even function $\psi_{1}$ times the odd function $\phi$ gives an odd integrand whose integral from $-\infty$ to $\infty$ is zero.

Another example is a particle moving in a central field (Section 6.1). The form of the potential energy might be such that we could not solve for the radial factor $R(r)$ in the eigenfunction. However, the angular factor in $\psi$ is a spherical harmonic [Eq. (6.16)], and spherical harmonics with different values of $l$ are orthogonal. Thus we can get an upper bound to the energy of the lowest state with any given angular momentum $l$ by using the factor $Y_{l}^{m}$ in the trial function. This result depends on the extension of (8.18) to higher excited states:

\(
\begin{equation}
\frac{\int \phi^{*} \hat{H} \phi d \tau}{\int \phi^{*} \phi d \tau} \geq E_{k+1} \quad \text { if } \quad \int \psi_{1}^{*} \phi d \tau=\int \psi_{2}^{*} \phi d \tau=\cdots=\int \psi_{k}^{*} \phi d \tau=0 \tag{8.19}
\end{equation}
\)
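As a small illustration of (8.18) (our own, not from the text), the Python sketch below estimates $E_{2}$ for the particle in a box with the trial function $\phi=x(l-x)\left(\frac{1}{2} l-x\right)$, which is odd about the box center and hence orthogonal to the symmetric ground state; in reduced units $\hbar=m=l=1$ the exact value is $E_{2}=2 \pi^{2} \approx 19.74$:

import numpy as np

x = np.linspace(0.0, 1.0, 200001)        # box of length l = 1
dx = x[1] - x[0]
phi = x * (1 - x) * (0.5 - x)            # odd about x = 1/2, so <psi_1|phi> = 0
d2phi = 6 * x - 3                        # exact second derivative of phi

num = np.sum(phi * (-0.5) * d2phi) * dx  # <phi|H|phi> with hbar = m = 1
den = np.sum(phi**2) * dx
print(num / den)                         # 21.0, an upper bound to E2 = 19.74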


Section 8.5 discusses a kind of variation function that gives rise to an equation involving a determinant. Therefore, we now discuss determinants.

A determinant is a square array of $n^{2}$ quantities (called elements); the value of the determinant is calculated from its elements in a manner to be given shortly. The number $n$ is the order of the determinant. Using $a_{i j}$ to represent a typical element, we write the $n$ th-order determinant as

\(
\operatorname{det}\left(a_{i j}\right)=\left|\begin{array}{ccccc}
a_{11} & a_{12} & a_{13} & \cdots & a_{1 n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2 n} \\
\cdot & \cdot & \cdot & \cdots & \cdot \\
\cdot & \cdot & \cdot & \cdots & \cdot \\
\cdot & \cdot & \cdot & \cdots & \cdot \\
a_{n 1} & a_{n 2} & a_{n 3} & \cdots & a_{n n}
\end{array}\right| \tag{8.20}
\)

The vertical lines in (8.20) have nothing to do with absolute value. Before considering how the value of the $n$ th-order determinant is defined, we consider determinants of first, second, and third orders.

A first-order determinant has one element, and its value is simply the value of that element. Thus

\(
\begin{equation}
\left|a_{11}\right|=a_{11} \tag{8.21}
\end{equation}
\)

where the vertical lines indicate a determinant and not an absolute value.
A second-order determinant has four elements, and its value is defined by

\(
\left|\begin{array}{ll}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{array}\right|=a_{11} a_{22}-a_{12} a_{21} \tag{8.22}
\)

The value of a third-order determinant is defined by

\(
\left|\begin{array}{lll}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{array}\right|=a_{11}\left|\begin{array}{ll}
a_{22} & a_{23} \\
a_{32} & a_{33}
\end{array}\right|-a_{12}\left|\begin{array}{ll}
a_{21} & a_{23} \\
a_{31} & a_{33}
\end{array}\right|+a_{13}\left|\begin{array}{ll}
a_{21} & a_{22} \\
a_{31} & a_{32}
\end{array}\right| \tag{8.23}
\)

\(
\begin{align}
= & a_{11} a_{22} a_{33}-a_{11} a_{32} a_{23}-a_{12} a_{21} a_{33}+a_{12} a_{31} a_{23} \\
& +a_{13} a_{21} a_{32}-a_{13} a_{31} a_{22} \tag{8.24}
\end{align}
\)

A third-order determinant is evaluated by writing down the elements of the top row with alternating plus and minus signs and then multiplying each element by a certain second-order determinant; the second-order determinant that multiplies a given element is found by crossing out the row and column of the third-order determinant in which that element appears. The $(n-1)$th-order determinant obtained by striking out the $i$th row and the $j$th column of the $n$th-order determinant is called the minor of the element $a_{i j}$. We define the cofactor of $a_{i j}$ as the minor of $a_{i j}$ times the factor $(-1)^{i+j}$. Thus (8.23) states that a third-order determinant is evaluated by multiplying each element of the top row by its cofactor and adding up the three products. [Note that (8.22) conforms to this evaluation by means of cofactors, since the cofactor of $a_{11}$ in (8.22) is $a_{22}$, and the cofactor of $a_{12}$ is $-a_{21}$.] A numerical example is

\(
\begin{aligned}
\left|\begin{array}{rrr}
5 & 10 & 2 \\
0.1 & 3 & 1 \\
0 & 4 & 4
\end{array}\right| & =5\left|\begin{array}{ll}
3 & 1 \\
4 & 4
\end{array}\right|-10\left|\begin{array}{rr}
0.1 & 1 \\
0 & 4
\end{array}\right|+2\left|\begin{array}{rr}
0.1 & 3 \\
0 & 4
\end{array}\right| \\
& =5(8)-10(0.4)+2(0.4)=36.8
\end{aligned}
\)

Denoting the minor of $a_{i j}$ by $M_{i j}$ and the cofactor of $a_{i j}$ by $C_{i j}$, we have

\(
\begin{equation}
C_{i j}=(-1)^{i+j} M_{i j} \tag{8.25}
\end{equation}
\)

The expansion (8.23) of the third-order determinant can be written as

\(
\operatorname{det}\left(a_{i j}\right)=\left|\begin{array}{lll}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{array}\right|=a_{11} C_{11}+a_{12} C_{12}+a_{13} C_{13} \tag{8.26}
\)

A third-order determinant can be expanded using the elements of any row and the corresponding cofactors. For example, using the second row to expand the third-order determinant, we have

\(
\begin{gather}
\operatorname{det}\left(a_{i j}\right)=a_{21} C_{21}+a_{22} C_{22}+a_{23} C_{23} \tag{8.27}\\
\operatorname{det}\left(a_{i j}\right)=-a_{21}\left|\begin{array}{ll}
a_{12} & a_{13} \\
a_{32} & a_{33}
\end{array}\right|+a_{22}\left|\begin{array}{ll}
a_{11} & a_{13} \\
a_{31} & a_{33}
\end{array}\right|-a_{23}\left|\begin{array}{ll}
a_{11} & a_{12} \\
a_{31} & a_{32}
\end{array}\right| \tag{8.28}
\end{gather}
\)

and expansion of the second-order determinants shows that (8.28) is equal to (8.24). We may also use the elements of any column and the corresponding cofactors to expand the determinant, as can be readily verified. Thus for the third-order determinant, we can write

\(
\begin{array}{ll}
\operatorname{det}\left(a_{i j}\right)=a_{k 1} C_{k 1}+a_{k 2} C_{k 2}+a_{k 3} C_{k 3}=\sum_{l=1}^{3} a_{k l} C_{k l}, & k=1 \text { or } 2 \text { or } 3 \\
\operatorname{det}\left(a_{i j}\right)=a_{1 k} C_{1 k}+a_{2 k} C_{2 k}+a_{3 k} C_{3 k}=\sum_{l=1}^{3} a_{l k} C_{l k}, & k=1 \text { or } 2 \text { or } 3
\end{array}
\)

The first expansion uses one of the rows; the second uses one of the columns.

We define determinants of higher order by an analogous row (or column) expansion. For an $n$ th-order determinant,

\(
\begin{equation}
\operatorname{det}\left(a_{i j}\right)=\sum_{l=1}^{n} a_{k l} C_{k l}=\sum_{l=1}^{n} a_{l k} C_{l k}, \quad k=1 \text { or } 2 \text { or } \ldots \text { or } n \tag{8.29}
\end{equation}
\)
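The row expansion (8.29) translates directly into a short recursive routine. The Python sketch below (our own illustration, always expanding along the first row) is meant only to exhibit the definition; its $n!$ cost makes it impractical for large $n$:

def det(a):
    # determinant by cofactor expansion along the first row, Eq. (8.29)
    n = len(a)
    if n == 1:
        return a[0][0]                                     # first-order case, Eq. (8.21)
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]   # strike row 1 and column j+1
        total += (-1) ** j * a[0][j] * det(minor)          # cofactor sign (-1)^(1+(j+1))
    return total

print(det([[5, 10, 2], [0.1, 3, 1], [0, 4, 4]]))           # 36.8, as in the example above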

Some theorems on determinants are as follows (for proofs, see Sokolnikoff and Redheffer, pp. 702-707):
I. If every element of a row (or column) of a determinant is zero, the value of the determinant is zero.
II. Interchanging any two rows (or columns) multiplies the value of a determinant by $-1$.
III. If any two rows (or columns) of a determinant are identical, the determinant has the value zero.
IV. Multiplication of each element of any one row (or any one column) by some constant $k$ multiplies the value of the determinant by $k$.
V. Addition to each element of one row of the same constant multiple of the corresponding element of another row leaves the value of the determinant unchanged. This theorem also applies to the addition of a multiple of one column to another column.
VI. The interchange of all corresponding rows and columns leaves the value of the determinant unchanged. (This interchange means that column one becomes row one, column two becomes row two, etc.)

EXAMPLE

Use Theorem V to evaluate

\(
B=\left|\begin{array}{llll}
1 & 2 & 3 & 4 \\
4 & 1 & 2 & 3 \\
3 & 4 & 1 & 2 \\
2 & 3 & 4 & 1
\end{array}\right| \tag{8.30}
\)

Addition of -2 times the elements of row one to the corresponding elements of row four changes row four to $2+(-2) 1=0,3+(-2)(2)=-1,4+(-2) 3=-2$, and $1+(-2) 4=-7$. Then, addition of -3 times row one to row three and -4 times row one to row two gives

\(
B=\left|\begin{array}{rrrr}
1 & 2 & 3 & 4 \tag{8.31}\\
0 & -7 & -10 & -13 \\
0 & -2 & -8 & -10 \\
0 & -1 & -2 & -7
\end{array}\right|=1\left|\begin{array}{rrr}
-7 & -10 & -13 \\
-2 & -8 & -10 \\
-1 & -2 & -7
\end{array}\right|
\)

where we expanded $B$ in terms of elements of the first column. Subtracting twice row three from row two and seven times row three from row one, we have

\(
B=\left|\begin{array}{rrr}
0 & 4 & 36 \tag{8.32}\\
0 & -4 & 4 \\
-1 & -2 & -7
\end{array}\right|=(-1)\left|\begin{array}{rr}
4 & 36 \\
-4 & 4
\end{array}\right|=-(16+144)=-160
\)

The diagonal of a determinant that runs from the top left to the lower right is the principal diagonal. A diagonal determinant is a determinant all of whose elements are zero except those on the principal diagonal. For a diagonal determinant,

\(
\begin{align}
\left|\begin{array}{ccccc}
a_{11} & 0 & 0 & \cdots & 0 \\
0 & a_{22} & 0 & \cdots & 0 \\
0 & 0 & a_{33} & \cdots & 0 \\
\cdot & \cdot & \cdot & \cdots & \cdot \\
0 & 0 & 0 & \cdots & a_{n n}
\end{array}\right| & =a_{11}\left|\begin{array}{cccc}
a_{22} & 0 & \cdots & 0 \\
0 & a_{33} & \cdots & 0 \\
\cdot & \cdot & \cdots & \cdot \\
0 & 0 & \cdots & a_{n n}
\end{array}\right|=a_{11} a_{22}\left|\begin{array}{cccc}
a_{33} & 0 & \cdots & 0 \\
0 & a_{44} & \cdots & 0 \\
\cdot & \cdot & \cdots & \cdot \\
0 & 0 & \cdots & a_{n n}
\end{array}\right| \\
& =\cdots=a_{11} a_{22} a_{33} \ldots a_{n n} \tag{8.33}
\end{align}
\)

A diagonal determinant is equal to the product of its diagonal elements.
A determinant whose only nonzero elements occur in square blocks centered about the principal diagonal is in block-diagonal form. If we regard each square block as a determinant, then a block-diagonal determinant is equal to the product of the blocks. For example,

\(
\left|\begin{array}{llll}
a & b & 0 & 0 \tag{8.34}\\
c & d & 0 & 0 \\
0 & 0 & e & f \\
0 & 0 & g & h
\end{array}\right|=\left|\begin{array}{ll}
a & b \\
c & d
\end{array}\right| \times\left|\begin{array}{ll}
e & f \\
g & h
\end{array}\right|
\)

Here the two square blocks sit on the principal diagonal. Equation (8.34) is readily proved by expanding the left side in terms of elements of the top row and expanding several subsequent determinants using their top rows (Prob. 8.21).


To deal with the kind of variation function discussed in the next section, we need to know about simultaneous linear equations.

Consider the following system of $n$ linear equations in $n$ unknowns:

\(
\begin{align}
& a_{11} x_{1}+a_{12} x_{2}+\cdots+a_{1 n} x_{n}=b_{1} \\
& a_{21} x_{1}+a_{22} x_{2}+\cdots+a_{2 n} x_{n}=b_{2} \\
& \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \tag{8.35}\\
& a_{n 1} x_{1}+a_{n 2} x_{2}+\cdots+a_{n n} x_{n}=b_{n}
\end{align}
\)

where the $a$'s and $b$'s are known constants and $x_{1}, x_{2}, \ldots, x_{n}$ are the unknowns. If at least one of the $b$'s is not zero, we have a system of inhomogeneous linear equations. Such a system can be solved by Cramer's rule. (For a proof of Cramer's rule, see Sokolnikoff and Redheffer, p. 708.) Let $\operatorname{det}\left(a_{i j}\right)$ be the determinant of the coefficients of the unknowns in (8.35). Cramer's rule states that $x_{k}(k=1,2, \ldots, n)$ is given by

\(
x_{k}=\frac{\left|\begin{array}{cccccccc}
a_{11} & a_{12} & \ldots & a_{1, k-1} & b_{1} & a_{1, k+1} & \ldots & a_{1 n} \tag{8.36}\\
a_{21} & a_{22} & \ldots & a_{2, k-1} & b_{2} & a_{2, k+1} & \ldots & a_{2 n} \\
\cdot & \cdot & \ldots & \cdot & \cdot & \cdot & \ldots & \cdot \\
a_{n 1} & a_{n 2} & \ldots & a_{n, k-1} & b_{n} & a_{n, k+1} & \ldots & a_{n n}
\end{array}\right|}{\operatorname{det}\left(a_{i j}\right)}, \quad k=1,2, \ldots, n
\)

where $\operatorname{det}\left(a_{i j}\right)$ is given by (8.20) and the numerator is the determinant obtained by replacing the $k$th column of $\operatorname{det}\left(a_{i j}\right)$ with the elements $b_{1}, b_{2}, \ldots, b_{n}$. Although Cramer's rule is
of theoretical significance, it should not be used for numerical calculations, since successive elimination of unknowns is much more efficient.

A widely used successive-elimination procedure is Gaussian elimination, which proceeds as follows: Divide the first equation in (8.35) by the coefficient $a_{11}$ of $x_{1}$, thereby making the coefficient of $x_{1}$ equal to 1 in this equation. Then subtract $a_{21}$ times the first equation from the second equation, subtract $a_{31}$ times the first equation from the third equation, $\ldots$, and subtract $a_{n 1}$ times the first equation from the $n$th equation. This eliminates $x_{1}$ from all equations but the first. Now divide the second equation by the coefficient of $x_{2}$; then subtract appropriate multiples of the second equation from the 3rd, 4th, $\ldots$, $n$th equations, so as to eliminate $x_{2}$ from all equations but the first and second. Continue in this manner. Ultimately, equation $n$ will contain only $x_{n}$, equation $n-1$ only $x_{n-1}$ and $x_{n}$, and so on. The value of $x_{n}$ found from equation $n$ is substituted into equation $n-1$ to give $x_{n-1}$; the values of $x_{n}$ and $x_{n-1}$ are substituted into equation $n-2$ to give $x_{n-2}$; and so on. If at any stage a coefficient we want to divide by happens to be zero, the equation with the zero coefficient is exchanged with a later equation that has a nonzero coefficient in the desired position. (The Gaussian elimination procedure also gives an efficient way to evaluate a determinant; see Prob. 8.25.)
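
As a concrete illustration, here is a minimal NumPy sketch of this procedure (our addition, not part of the text). For robustness it always exchanges in the later row with the largest available pivot, a slight strengthening of the zero-pivot exchange described above:

```python
# Gaussian elimination with row exchange, followed by back-substitution.
import numpy as np

def gaussian_elimination(a, b):
    """Solve the n x n system a x = b by elimination and back-substitution."""
    a = a.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for i in range(n):
        # Exchange with the later row having the largest pivot candidate.
        p = i + int(np.argmax(np.abs(a[i:, i])))
        if p != i:
            a[[i, p]] = a[[p, i]]
            b[[i, p]] = b[[p, i]]
        piv = a[i, i]
        a[i] /= piv                  # make the coefficient of x_i equal to 1
        b[i] /= piv
        for j in range(i + 1, n):    # eliminate x_i from all later equations
            f = a[j, i]
            a[j] -= f * a[i]
            b[j] -= f * b[i]
    x = np.zeros(n)                  # back-substitution, last equation first
    for i in range(n - 1, -1, -1):
        x[i] = b[i] - a[i, i + 1:] @ x[i + 1:]
    return x

A = np.array([[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])
print(gaussian_elimination(A, b))    # [ 2.  3. -1.]
print(np.linalg.solve(A, b))         # agrees
```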

A related method is Gauss-Jordan elimination, which proceeds the same way as Gaussian elimination, except that instead of eliminating $x_{2}$ from equations $3,4, \ldots, n$, we eliminate $x_{2}$ from equations $1,3,4, \ldots, n$, by subtracting appropriate multiples of the second equation from equations $1,3,4, \ldots, n$; instead of eliminating $x_{3}$ from equations $4,5, \ldots, n$, we eliminate $x_{3}$ from equations $1,2,4,5, \ldots, n$; and so on. At the end of Gauss-Jordan elimination, equation 1 contains only $x_{1}$, equation 2 contains only $x_{2}, \ldots$, equation $n$ contains only $x_{n}$. Gauss-Jordan elimination requires more computation than Gaussian elimination.

If all the $b$ 's in (8.35) are zero, we have a system of linear homogeneous equations:

\(
\begin{align}
& a_{11} x_{1}+a_{12} x_{2}+\cdots+a_{1 n} x_{n}=0 \\
& a_{21} x_{1}+a_{22} x_{2}+\cdots+a_{2 n} x_{n}=0 \\
& \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \tag{8.37}\\
& a_{n 1} x_{1}+a_{n 2} x_{2}+\cdots+a_{n n} x_{n}=0
\end{align}
\)

One obvious solution of (8.37) is $x_{1}=x_{2}=\cdots=x_{n}=0$, which is called the trivial solution. If the determinant of the coefficients in (8.37) is not equal to zero, $\operatorname{det}\left(a_{i j}\right) \neq 0$, then we can use Cramer's rule (8.36) to solve for the unknowns, and we find $x_{k}=0, k=1,2, \ldots, n$, since the determinant in the numerator of (8.36) has a column all of whose elements are zero. Thus, when $\operatorname{det}\left(a_{i j}\right) \neq 0$, the only solution is the trivial solution, which is of no interest. For there to be a nontrivial solution of a system of $n$ linear homogeneous equations in $n$ unknowns, the determinant of the coefficients must be zero. Also, this condition can be shown to be sufficient to ensure the existence of a nontrivial solution. We thus have the extremely important theorem:

A system of $n$ linear homogeneous equations in $n$ unknowns has a nontrivial solution if and only if the determinant of the coefficients is zero.

Suppose that $\operatorname{det}\left(a_{i j}\right)=0$, so that (8.37) has a nontrivial solution. How do we find it? With $\operatorname{det}\left(a_{i j}\right)=0$, Cramer's rule (8.36) gives $x_{k}=0 / 0, k=1, \ldots, n$, which is indeterminate. Thus Cramer's rule is of no immediate help. We also observe that, if $x_{1}=d_{1}, x_{2}=d_{2}, \ldots$, $x_{n}=d_{n}$ is a solution of (8.37), then so is $x_{1}=c d_{1}, x_{2}=c d_{2}, \ldots, x_{n}=c d_{n}$, where $c$ is an arbitrary constant. This is easily seen, since

\(
a_{11} c d_{1}+a_{12} c d_{2}+\cdots+a_{1 n} c d_{n}=c\left(a_{11} d_{1}+a_{12} d_{2}+\cdots+a_{1 n} d_{n}\right)=c \cdot 0=0
\)

and so on. Therefore, the solution to the linear homogeneous system of equations will contain an arbitrary constant, and we cannot determine a unique value for each unknown. To solve (8.37), we therefore assign an arbitrary value to any one of the unknowns, say $x_{n}$; we set $x_{n}=c$, where $c$ is an arbitrary constant. Having assigned a value to $x_{n}$, we transfer the last term in each of the equations of (8.37) to the right side to get

\(
\begin{gather}
a_{11} x_{1}+a_{12} x_{2}+\cdots+a_{1, n-1} x_{n-1}=-a_{1, n} c \\
a_{21} x_{1}+a_{22} x_{2}+\cdots+a_{2, n-1} x_{n-1}=-a_{2, n} c \tag{8.38}\\
\cdot \cdot \cdot \\
a_{n-1,1} x_{1}+a_{n-1,2} x_{2}+\cdots+a_{n-1, n-1} x_{n-1}=-a_{n-1, n} c \\
a_{n 1} x_{1}+a_{n 2} x_{2}+\cdots+a_{n, n-1} x_{n-1}=-a_{n n} c
\end{gather}
\)

We now have $n$ equations in $n-1$ unknowns, which is one more equation than we need. We therefore discard any one of the equations of (8.38), say the last one. This gives a system of $n-1$ linear inhomogeneous equations in $n-1$ unknowns. We could then apply Cramer's rule (8.36) to solve for $x_{1}, x_{2}, \ldots, x_{n-1}$. Since the constants on the right side of the equations in (8.38) all contain the factor $c$, Theorem IV in Section 8.3 shows that all the unknowns contain this arbitrary constant as a factor. The form of the solution is therefore

\(
\begin{equation}
x_{1}=c e_{1}, \quad x_{2}=c e_{2}, \quad \ldots, \quad x_{n-1}=c e_{n-1}, \quad x_{n}=c \tag{8.39}
\end{equation}
\)

where $e_{1}, \ldots, e_{n-1}$ are numbers and $c$ is an arbitrary constant.

EXAMPLE

Solve

\(
\begin{aligned}
& 3 x_{1}+4 x_{2}+x_{3}=0 \\
& x_{1}+3 x_{2}-2 x_{3}=0 \\
& x_{1}-2 x_{2}+5 x_{3}=0
\end{aligned}
\)

This is a set of linear homogeneous equations, and we begin by evaluating the determinant of the coefficients. We find (see the Exercise)

\(
\left|\begin{array}{rrr}
3 & 4 & 1 \
1 & 3 & -2 \
1 & -2 & 5
\end{array}\right|=0
\)

Therefore, a nontrivial solution exists. We set $x_{3}$ equal to an arbitrary constant $c$ ($x_{3}=c$) and discard the third equation to give

\(
\begin{aligned}
3 x_{1}+4 x_{2} & =-c \\
x_{1}+3 x_{2} & =2 c
\end{aligned}
\)

Subtracting 3 times the second equation from the first, we get $-5 x_{2}=-7 c$, so $x_{2}=\frac{7}{5} c$. Substitution into $x_{1}+3 x_{2}=2 c$ gives $x_{1}+\frac{21}{5} c=2 c$, so $x_{1}=-\frac{11}{5} c$. Hence the general solution is $x_{1}=-\frac{11}{5} c, x_{2}=\frac{7}{5} c, x_{3}=c$. For those allergic to fractions, we define a new arbitrary constant $s$ as $s \equiv \frac{1}{5} c$ and write $x_{1}=-11 s, x_{2}=7 s, x_{3}=5 s$.

EXERCISE (a) Verify that the coefficient determinant in this example is zero. (b) Verify that the third equation in this example can be obtained by adding a certain constant times the second equation to the first equation.

The procedure just outlined fails if the determinant of the inhomogeneous system of $n-1$ equations in $n-1$ unknowns [(8.38) with the last equation omitted] happens to be zero. Cramer's rule then has a zero in the denominator and is of no use. We could try to get around this difficulty by initially assigning the arbitrary value to another of the unknowns rather than to $x_{n}$. We could also try discarding some other equation of (8.38), rather than the last one. What we are looking for is a nonvanishing determinant of order $n-1$ formed from the determinant of the coefficients of the system (8.37) by striking out a row and a column. If such a determinant exists, then by the procedure given, with the right choice of the equation to be discarded and the right choice of the unknown to be assigned an arbitrary value, we can solve the system and will get solutions of the form (8.39). If no such determinant exists, we must assign arbitrary values to two of the unknowns and attempt to proceed from there. Thus the solution to (8.37) might contain two (or even more) arbitrary constants.

An efficient way to solve a system of linear homogeneous equations is to do Gauss-Jordan elimination on the equations. If only the trivial solution exists, the final set of equations obtained will be $x_{1}=0, x_{2}=0, \ldots, x_{n}=0$. If a nontrivial solution exists, at least one equation will be reduced to the form $0=0$; if $m$ equations of the form $0=0$ are obtained, we assign arbitrary constants to $m$ of the unknowns and express the remaining unknowns in terms of these $m$ unknowns.

EXAMPLE

Use Gauss-Jordan elimination to solve the set of equations in the preceding example.
In doing Gaussian or Gauss-Jordan elimination on a set of $n$ inhomogeneous or homogeneous equations, we can eliminate needless writing by omitting the variables $x_{1}, \ldots, x_{n}$ and writing down only the $n$-row, $(n+1)$-column array of coefficients and constant terms (including any zero coefficients); we then produce the next array by operating on the numbers of each row as if that row were the equation it represents.
To eliminate one set of divisions, we interchange the first and second equations so that we start with $a_{11}=1$. Detaching the coefficients and proceeding with GaussJordan elimination, we have

\(
\begin{array}{rrrr}
1 & 3 & -2 & 0 \\
3 & 4 & 1 & 0 \\
1 & -2 & 5 & 0
\end{array} \rightarrow \begin{array}{rrrr}
1 & 3 & -2 & 0 \\
0 & -5 & 7 & 0 \\
0 & -5 & 7 & 0
\end{array} \rightarrow \begin{array}{rrrr}
1 & 3 & -2 & 0 \\
0 & 1 & -\frac{7}{5} & 0 \\
0 & -5 & 7 & 0
\end{array} \rightarrow \begin{array}{rrrr}
1 & 0 & \frac{11}{5} & 0 \\
0 & 1 & -\frac{7}{5} & 0 \\
0 & 0 & 0 & 0
\end{array}
\)

The first array is the original set of equations with the first and second equations interchanged. To eliminate $x_{1}$ from the second and third equations, we subtract 3 times row one from row two and 1 times row one from row three, thereby producing the second array. Division of row two by -5 produces the third array. To eliminate $x_{2}$ from the first and third equations, we subtract 3 times row two from row one and -5 times row two from row three, thereby producing the fourth array. Because the fourth array has the $x_{3}$ coefficient in row three equal to zero, we cannot use row three to eliminate $x_{3}$ from rows one and two (as would be the last step in the Gauss-Jordan algorithm). Discarding the last equation, which reads $0=0$, we assign $x_{3}=k$, where $k$ is an arbitrary constant. The first and second equations in the last array read $x_{1}+\frac{11}{5} x_{3}=0$ and $x_{2}-\frac{7}{5} x_{3}=0$, or $x_{1}=-\frac{11}{5} x_{3}, x_{2}=\frac{7}{5} x_{3}$. The general solution is $x_{1}=-\frac{11}{5} k, x_{2}=\frac{7}{5} k, x_{3}=k$.
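
For readers who want to check this example by machine, SymPy's reduced-row-echelon form and nullspace carry out exactly the Gauss-Jordan reduction shown above (a quick check, not part of the text):

```python
# Verify the Gauss-Jordan example with SymPy.
import sympy as sp

A = sp.Matrix([[3, 4, 1], [1, 3, -2], [1, -2, 5]])
print(A.det())        # 0, so a nontrivial solution exists
print(A.rref()[0])    # rows [1, 0, 11/5], [0, 1, -7/5], [0, 0, 0]
print(A.nullspace())  # [Matrix([-11/5, 7/5, 1])], i.e. x = c(-11/5, 7/5, 1)
```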


A special kind of variation function widely used in the study of molecules is the linear variation function. A linear variation function is a linear combination of $n$ linearly independent functions $f_{1}, f_{2}, \ldots, f_{n}$:

\(
\begin{equation}
\phi=c_{1} f_{1}+c_{2} f_{2}+\cdots+c_{n} f_{n}=\sum_{j=1}^{n} c_{j} f_{j} \tag{8.40}
\end{equation}
\)

where $\phi$ is the trial variation function and the coefficients $c_{j}$ are parameters to be determined by minimizing the variational integral. The functions $f_{j}$ (which are called basis functions) must satisfy the boundary conditions of the problem. We shall restrict ourselves to real $\phi$ so that the $c_{j}$'s and $f_{j}$'s are all real. In (8.40), the functions $f_{j}$ are known functions.

We now apply the variation theorem (8.9). For the real linear variation function, we have

\(
\begin{equation}
\int \phi^{*} \phi d \tau=\int \sum_{j=1}^{n} c_{j} f_{j} \sum_{k=1}^{n} c_{k} f_{k} d \tau=\sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} \int f_{j} f_{k} d \tau \equiv \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k} \tag{8.41}
\end{equation}
\)

where we defined the overlap integral $S_{j k}$ as

\(
\begin{equation}
S_{j k} \equiv \int f_{j}^{*} f_{k} d \tau \tag{8.42}
\end{equation}
\)

Note that $S_{j k}$ is not necessarily equal to $\delta_{j k}$, since there is no reason to suppose that the functions $f_{j}$ are mutually orthogonal. They are not necessarily the eigenfunctions of any operator. The numerator in (8.9) is

\(
\begin{aligned}
\int \phi^{*} \hat{H} \phi d \tau & =\int \sum_{j=1}^{n} c_{j} f_{j} \hat{H} \sum_{k=1}^{n} c_{k} f_{k} d \tau \\
& =\sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} \int f_{j} \hat{H} f_{k} d \tau \equiv \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} H_{j k}
\end{aligned}
\)

where we defined $H_{j k}$ as

\(
\begin{equation}
H_{j k} \equiv \int f_{j}^{*} \hat{H} f_{k} d \tau \tag{8.43}
\end{equation}
\)

The variational integral $W$ is

\(
\begin{align}
W \equiv \frac{\int \phi^{*} \hat{H} \phi d \tau}{\int \phi^{*} \phi d \tau} & =\frac{\sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} H_{j k}}{\sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k}} \tag{8.44}\\
W \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k} & =\sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} H_{j k} \tag{8.45}
\end{align}
\)

We now minimize $W$ so as to approach as closely as we can to $E_{1}$ $\left(W \geq E_{1}\right)$. The variational integral $W$ is a function of the $n$ independent variables $c_{1}, c_{2}, \ldots, c_{n}$:

\(
W=W\left(c_{1}, c_{2}, \ldots, c_{n}\right)
\)

A necessary condition for a minimum in a function $W$ of several variables is that its partial derivatives with respect to each of the variables must be zero at the minimum:

\(
\begin{equation}
\frac{\partial W}{\partial c_{i}}=0, \quad i=1,2, \ldots, n \tag{8.46}
\end{equation}
\)

We now differentiate (8.45) partially with respect to each $c_{i}$ to obtain $n$ equations:

\(
\begin{equation}
\frac{\partial W}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k}+W \frac{\partial}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k}=\frac{\partial}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} H_{j k}, \quad i=1,2, \ldots, n \tag{8.47}
\end{equation}
\)

Now

\(
\frac{\partial}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k}=\sum_{j=1}^{n} \sum_{k=1}^{n}\left[\frac{\partial}{\partial c_{i}}\left(c_{j} c_{k}\right)\right] S_{j k}=\sum_{j=1}^{n} \sum_{k=1}^{n}\left(c_{k} \frac{\partial c_{j}}{\partial c_{i}}+c_{j} \frac{\partial c_{k}}{\partial c_{i}}\right) S_{j k}
\)

The $c_{j}$ 's are independent variables, and therefore

\(
\begin{gather}
\frac{\partial c_{j}}{\partial c_{i}}=0 \quad \text { if } j \neq i, \quad \frac{\partial c_{j}}{\partial c_{i}}=1 \quad \text { if } j=i \\
\frac{\partial c_{j}}{\partial c_{i}}=\delta_{i j} \tag{8.48}
\end{gather}
\)

We then have

\(
\frac{\partial}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k}=\sum_{k=1}^{n} \sum_{j=1}^{n} c_{k} \delta_{i j} S_{j k}+\sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} \delta_{i k} S_{j k}=\sum_{k=1}^{n} c_{k} S_{i k}+\sum_{j=1}^{n} c_{j} S_{j i}
\)

where we evaluated one of the sums in each double summation using Eq. (7.32). Use of (7.4) gives

\(
\begin{equation}
S_{j i}=S_{i j}^{*}=S_{i j} \tag{8.49}
\end{equation}
\)

where the last equality follows because we are dealing with real functions. Hence,

\(
\begin{equation}
\frac{\partial}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} S_{j k}=\sum_{k=1}^{n} c_{k} S_{i k}+\sum_{j=1}^{n} c_{j} S_{i j}=\sum_{k=1}^{n} c_{k} S_{i k}+\sum_{k=1}^{n} c_{k} S_{i k}=2 \sum_{k=1}^{n} c_{k} S_{i k} \tag{8.50}
\end{equation}
\)

where the fact that $j$ is a dummy variable was used.
By replacing $S_{j k}$ by $H_{j k}$ in each of these manipulations, we get

\(
\begin{equation}
\frac{\partial}{\partial c_{i}} \sum_{j=1}^{n} \sum_{k=1}^{n} c_{j} c_{k} H_{j k}=2 \sum_{k=1}^{n} c_{k} H_{i k} \tag{8.51}
\end{equation}
\)

This result depends on the fact that

\(
\begin{equation}
H_{j i}=H_{i j}^{*}=H_{i j} \tag{8.52}
\end{equation}
\)

which is true because $\hat{H}$ is Hermitian, and $f_{i}, f_{j}$, and $\hat{H}$ are real.
Substitution of Eqs. (8.46), (8.50), and (8.51) into (8.47) gives

\(
\begin{array}{ll}
2 W \sum_{k=1}^{n} c_{k} S_{i k}=2 \sum_{k=1}^{n} c_{k} H_{i k}, & i=1,2, \ldots, n \\
\sum_{k=1}^{n}\left[\left(H_{i k}-S_{i k} W\right) c_{k}\right]=0, & i=1,2, \ldots, n \tag{8.53}
\end{array}
\)

Equation (8.53) is a set of $n$ simultaneous, linear, homogeneous equations in the $n$ unknowns $c_{1}, c_{2}, \ldots, c_{n}$ [the coefficients in the linear variation function (8.40)]. For example, for $n=2$, (8.53) gives

\(
\begin{align}
& \left(H_{11}-S_{11} W\right) c_{1}+\left(H_{12}-S_{12} W\right) c_{2}=0 \\
& \left(H_{21}-S_{21} W\right) c_{1}+\left(H_{22}-S_{22} W\right) c_{2}=0 \tag{8.54}
\end{align}
\)

For the general case of $n$ functions $f_{1}, \ldots, f_{n}$, (8.53) is

\(
\begin{align}
& \left(H_{11}-S_{11} W\right) c_{1}+\left(H_{12}-S_{12} W\right) c_{2}+\cdots+\left(H_{1 n}-S_{1 n} W\right) c_{n}=0 \\
& \left(H_{21}-S_{21} W\right) c_{1}+\left(H_{22}-S_{22} W\right) c_{2}+\cdots+\left(H_{2 n}-S_{2 n} W\right) c_{n}=0 \tag{8.55}\\
& \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \\
& \left(H_{n 1}-S_{n 1} W\right) c_{1}+\left(H_{n 2}-S_{n 2} W\right) c_{2}+\cdots+\left(H_{n n}-S_{n n} W\right) c_{n}=0
\end{align}
\)

From the theorem of Section 8.4, for there to be a solution to the linear homogeneous equations (8.55) besides the trivial solution $0=c_{1}=c_{2}=\cdots=c_{n}$ (which would make the variation function $\phi$ zero), the determinant of the coefficients must vanish. For $n=2$ we have

\(
\left|\begin{array}{ll}
H_{11}-S_{11} W & H_{12}-S_{12} W \tag{8.56}\\
H_{21}-S_{21} W & H_{22}-S_{22} W
\end{array}\right|=0
\)

and for the general case

\(
\begin{align}
& \operatorname{det}\left(H_{i j}-S_{i j} W\right)=0 \tag{8.57}\\
& \left|\begin{array}{cccc}
H_{11}-S_{11} W & H_{12}-S_{12} W & \cdots & H_{1 n}-S_{1 n} W \\
H_{21}-S_{21} W & H_{22}-S_{22} W & \cdots & H_{2 n}-S_{2 n} W \\
\cdot & \cdot & \cdots & \cdot \\
\cdot & \cdot & \cdots & \cdot \\
\cdot & \cdot & \cdots & \cdot \\
H_{n 1}-S_{n 1} W & H_{n 2}-S_{n 2} W & \cdots & H_{n n}-S_{n n} W
\end{array}\right|=0 \tag{8.58}
\end{align}
\)

Expansion of the determinant in (8.58) gives an algebraic equation of degree $n$ in the unknown $W$. This algebraic equation has $n$ roots, which can be shown to be real. Arranging these roots in order of increasing value, we denote them as

\(
\begin{equation}
W_{1} \leq W_{2} \leq \cdots \leq W_{n} \tag{8.59}
\end{equation}
\)

If we number the bound states of the system in order of increasing energy, we have

\(
\begin{equation}
E_{1} \leq E_{2} \leq \cdots \leq E_{n} \leq E_{n+1} \leq \cdots \tag{8.60}
\end{equation}
\)

where the $E$'s denote the true energies of various states. From the variation theorem, we know that $E_{1} \leq W_{1}$. Moreover, it can be proved that [J. K. L. MacDonald, Phys. Rev., 43, 830 (1933); R. H. Young, Int. J. Quantum Chem., 6, 596 (1972); see Prob. 8.40]

\(
\begin{equation}
E_{1} \leq W_{1}, \quad E_{2} \leq W_{2}, \quad E_{3} \leq W_{3}, \ldots, \quad E_{n} \leq W_{n} \tag{8.61}
\end{equation}
\)

Thus, the linear variation method provides upper bounds to the energies of the lowest $n$ bound states of the system. We use the roots $W_{1}, W_{2}, \ldots, W_{n}$ as approximations to the energies of the lowest states. If approximations to the energies of more states are wanted, we add more functions $f_{k}$ to the trial function $\phi$. The addition of more functions $f_{k}$ can be shown to increase (or cause no change in) the accuracy of the previously calculated energies. If the functions $f_{k}$ in $\phi=\sum_{k} c_{k} f_{k}$ form a complete set, then we will obtain the
true wave functions of the system. Unfortunately, we usually need an infinite number of functions to have a complete set.

Quantum chemists may use dozens, hundreds, thousands, or even millions of terms in linear variation functions so as to get accurate results for molecules. Obviously, a computer is essential for this work. The most efficient way to solve (8.58) (which is called the secular equation) and the associated linear equations (8.55) is by matrix methods (Section 8.6).
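
In matrix language, the set (8.55) reads $\mathbf{H c}=W \mathbf{S c}$, a generalized eigenvalue problem that standard libraries solve directly. A minimal sketch (our addition; the $2 \times 2$ $\mathbf{H}$ and $\mathbf{S}$ below are hypothetical placeholders, not from the text) using scipy.linalg.eigh:

```python
# Solve the secular problem H c = W S c for a non-orthonormal basis.
import numpy as np
from scipy.linalg import eigh

H = np.array([[1.0, 0.2], [0.2, 2.0]])   # hypothetical Hamiltonian matrix
S = np.array([[1.0, 0.1], [0.1, 1.0]])   # hypothetical overlap matrix

W, C = eigh(H, S)   # roots W_1 <= W_2 and coefficient columns c^(1), c^(2)
print(W)            # upper bounds to the two lowest energies
print(C)            # columns normalized so that C.T @ S @ C = I
```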

To obtain an approximation to the wave function of the ground state, we take the lowest root $W_{1}$ of the secular equation and substitute it in the set of equations (8.55); we then solve this set of equations for the coefficients $c_{1}^{(1)}, c_{2}^{(1)}, \ldots, c_{n}^{(1)}$, where the superscript ${ }^{(1)}$ was added to indicate that these coefficients correspond to $W_{1}$. [As noted in the previous section, we can determine only the ratios of the coefficients. We solve for $c_{2}^{(1)}, \ldots, c_{n}^{(1)}$ in terms of $c_{1}^{(1)}$, and then determine $c_{1}^{(1)}$ by normalization.] Having found the $c_{k}^{(1)}$'s, we take $\phi_{1}=\sum_{k} c_{k}^{(1)} f_{k}$ as an approximate ground-state wave function. Use of higher roots of (8.58) in (8.55) gives approximations to excited-state wave functions. These approximate wave functions can be shown to be orthogonal (Prob. 8.40).

Solution of (8.58) and (8.55) is simplified by having as many of the integrals equal to zero as possible. We can make some of the off-diagonal $H_{i j}$'s vanish by choosing the functions $f_{k}$ as eigenfunctions of some operator $\hat{A}$ that commutes with $\hat{H}$. If $f_{i}$ and $f_{j}$ correspond to different eigenvalues of $\hat{A}$, then $H_{i j}$ vanishes (Theorem 6 of Section 7.4). If the functions $f_{k}$ are orthonormal, the off-diagonal $S_{i j}$'s vanish $\left(S_{i j}=\delta_{i j}\right)$. If the initially chosen $f_{k}$'s are not orthogonal, we can use the Schmidt (or some other) procedure to find $n$ linear combinations of these $f_{k}$'s that are orthogonal and then use the orthogonalized functions.

Equations (8.55) and (8.58) are also valid when the restriction that the variation function be real is removed (Prob. 8.39).

EXAMPLE

Add functions to the function $x(l-x)$ of the first example of Section 8.1 to form a linear variation function for the particle in a one-dimensional box of length $l$ and find approximate energies and wave functions for the lowest four states.
In the trial function $\phi=\sum_{k=1}^{n} c_{k} f_{k}$, we take $f_{1}=x(l-x)$. Since we want approximations to the lowest four states, $n$ must be at least 4. There are an infinite number of possible well-behaved functions that could be used for $f_{2}, f_{3}$, and $f_{4}$. The function $x^{2}(l-x)^{2}$ obeys the boundary conditions of vanishing at $x=0$ and $x=l$ and leads to simple integrals, so we take $f_{2}=x^{2}(l-x)^{2}$.
If the origin is placed at the center of the box, the potential energy (Fig. 2.1) is an even function, and, as noted in Section 8.2, the wave functions alternate between being even and odd functions (see also Fig. 2.3). (Throughout this example, the terms even and odd will refer to having the origin at the box's center.) The functions $f_{1}=x(l-x)$ and $f_{2}=x^{2}(l-x)^{2}$ are both even functions (see Prob. 8.34). If we were to take $\phi=c_{1} x(l-x)+c_{2} x^{2}(l-x)^{2}$, we would end up with upper bounds to the energies of the lowest two states with even wave functions (the $n=1$ and $n=3$ states) and would get approximate wave functions for these two states. Since we also want to approximate the $n=2$ and $n=4$ states, we shall add in two functions that are odd. An odd function must vanish at the origin [as noted after Eq. (4.50)], so we need functions that vanish at the box midpoint $x=\frac{1}{2} l$, as well as at $x=0$ and $l$. A simple function with these properties is $f_{3}=x(l-x)\left(\frac{1}{2} l-x\right)$. To get $f_{4}$, we shall multiply $f_{2}$ by $\left(\frac{1}{2} l-x\right)$. Thus we take $\phi=\sum_{k=1}^{4} c_{k} f_{k}$, with

\(
\begin{equation}
f_{1}=x(l-x), \quad f_{2}=x^{2}(l-x)^{2}, \quad f_{3}=x(l-x)\left(\frac{1}{2} l-x\right), \quad f_{4}=x^{2}(l-x)^{2}\left(\frac{1}{2} l-x\right) \tag{8.62}
\end{equation}
\)

Note that $f_{1}, f_{2}, f_{3}$, and $f_{4}$ are linearly independent, as assumed in (8.40).

Because $f_{1}$ and $f_{2}$ are even, while $f_{3}$ and $f_{4}$ are odd, many integrals will vanish.
Thus

\(
\begin{equation}
S_{13}=S_{31}=0, \quad S_{14}=S_{41}=0, \quad S_{23}=S_{32}=0, \quad S_{24}=S_{42}=0 \tag{8.63}
\end{equation}
\)

because the integrand in each of these overlap integrals is an odd function with respect to the origin at the box center. The functions $f_{1}, f_{2}, f_{3}, f_{4}$ are eigenfunctions of the parity operator $\hat{\Pi}$ (Section 7.5), with the even functions $f_{1}$ and $f_{2}$ having parity eigenvalue +1 and $f_{3}$ and $f_{4}$ having eigenvalue -1. The operator $\hat{\Pi}$ commutes with $\hat{H}$ (since $V$ is an even function), so by Theorem 6 of Section 7.4, $H_{i j}$ vanishes if $f_{i}$ is an odd function and $f_{j}$ is even, or vice versa. Thus

\(
\begin{equation}
H_{13}=H_{31}=0, \quad H_{14}=H_{41}=0, \quad H_{23}=H_{32}=0, \quad H_{24}=H_{42}=0 \tag{8.64}
\end{equation}
\)

From (8.63) and (8.64), the $n=4$ secular equation (8.58) becomes

\(
\left|\begin{array}{cccc}
H_{11}-S_{11} W & H_{12}-S_{12} W & 0 & 0 \tag{8.65}\\
H_{21}-S_{21} W & H_{22}-S_{22} W & 0 & 0 \\
0 & 0 & H_{33}-S_{33} W & H_{34}-S_{34} W \\
0 & 0 & H_{43}-S_{43} W & H_{44}-S_{44} W
\end{array}\right|=0
\)

The secular determinant is in block-diagonal form and so is equal to the product of its blocks [Eq. (8.34)]:

\(
\left|\begin{array}{ll}
H_{11}-S_{11} W & H_{12}-S_{12} W \\
H_{21}-S_{21} W & H_{22}-S_{22} W
\end{array}\right| \times\left|\begin{array}{ll}
H_{33}-S_{33} W & H_{34}-S_{34} W \\
H_{43}-S_{43} W & H_{44}-S_{44} W
\end{array}\right|=0
\)

The four roots of this equation are found from the equations

\(
\begin{align}
& \left|\begin{array}{ll}
H_{11}-S_{11} W & H_{12}-S_{12} W \\
H_{21}-S_{21} W & H_{22}-S_{22} W
\end{array}\right|=0 \tag{8.66}\\
& \left|\begin{array}{ll}
H_{33}-S_{33} W & H_{34}-S_{34} W \\
H_{43}-S_{43} W & H_{44}-S_{44} W
\end{array}\right|=0 \tag{8.67}
\end{align}
\)

Let the roots of (8.66) (which are approximations to the $n=1$ and $n=3$ energies) be $W_{1}$ and $W_{3}$ and let the roots of (8.67) be $W_{2}$ and $W_{4}$. After solving the secular equation for the $W$'s, we substitute them one at a time into the set of equations (8.55) to find the coefficients $c_{k}$ in the variation function. From the secular equation (8.65), the set of equations (8.55) with the root $W_{1}$ is

\(
\begin{align}
\left.\begin{array}{l}
\left(H_{11}-S_{11} W_{1}\right) c_{1}^{(1)}+\left(H_{12}-S_{12} W_{1}\right) c_{2}^{(1)}=0 \\
\left(H_{21}-S_{21} W_{1}\right) c_{1}^{(1)}+\left(H_{22}-S_{22} W_{1}\right) c_{2}^{(1)}=0
\end{array}\right\} \tag{8.68a}\\
\left.\begin{array}{l}
\left(H_{33}-S_{33} W_{1}\right) c_{3}^{(1)}+\left(H_{34}-S_{34} W_{1}\right) c_{4}^{(1)}=0 \\
\left(H_{43}-S_{43} W_{1}\right) c_{3}^{(1)}+\left(H_{44}-S_{44} W_{1}\right) c_{4}^{(1)}=0
\end{array}\right\} \tag{8.68b}
\end{align}
\)

Because $W_{1}$ is a root of (8.66), the set of equations (8.68a) has the determinant of its coefficients [which is the determinant in (8.66)] equal to zero. Hence (8.68a) has a nontrivial solution for $c_{1}^{(1)}$ and $c_{2}^{(1)}$. However, $W_{1}$ is not a root of (8.67), so the determinant of the coefficients of the set of equations (8.68b) is nonzero. Hence, (8.68b) has only the trivial solution $c_{3}^{(1)}=c_{4}^{(1)}=0$. The trial function $\phi_{1}$ corresponding to the root $W_{1}$ thus has the form $\phi_{1}=\sum_{k=1}^{4} c_{k}^{(1)} f_{k}=c_{1}^{(1)} f_{1}+c_{2}^{(1)} f_{2}$. The same reasoning
shows that $\phi_{3}$ is a linear combination of $f_{1}$ and $f_{2}$, while $\phi_{2}$ and $\phi_{4}$ are each linear combinations of $f_{3}$ and $f_{4}$:

\(
\begin{array}{ll}
\phi_{1}=c_{1}^{(1)} f_{1}+c_{2}^{(1)} f_{2}, & \phi_{3}=c_{1}^{(3)} f_{1}+c_{2}^{(3)} f_{2} \tag{8.69}\\
\phi_{2}=c_{3}^{(2)} f_{3}+c_{4}^{(2)} f_{4}, & \phi_{4}=c_{3}^{(4)} f_{3}+c_{4}^{(4)} f_{4}
\end{array}
\)

The even wave functions $\psi_{1}$ and $\psi_{3}$ are approximated by linear combinations of the even functions $f_{1}$ and $f_{2}$; the odd functions $\psi_{2}$ and $\psi_{4}$ are approximated by linear combinations of the odd functions $f_{3}$ and $f_{4}$.

When the secular equation is in block-diagonal form, it factors into two or more smaller secular equations, and the set of simultaneous equations (8.55) breaks up into two or more smaller sets of equations.
We now must evaluate the $H_{i j}$ and $S_{i j}$ integrals so as to solve (8.66) and (8.67) for $W_{1}, W_{2}, W_{3}$, and $W_{4}$. We have

\(
\begin{aligned}
H_{11}=\left\langle f_{1}\right| \hat{H}\left|f_{1}\right\rangle & =\int_{0}^{l} x(l-x)\left(\frac{-\hbar^{2}}{2 m}\right) \frac{d^{2}}{d x^{2}}[x(l-x)] d x=\frac{\hbar^{2} l^{3}}{6 m} \\
S_{11} & =\left\langle f_{1} \mid f_{1}\right\rangle=\int_{0}^{l} x^{2}(l-x)^{2} d x=\frac{l^{5}}{30}
\end{aligned}
\)

where Eqs. (8.12) and (8.13) were used. Evaluation of the remaining integrals using (8.62), (8.49), and (8.52) gives (Prob. 8.35)

\(
\begin{gathered}
H_{12}=H_{21}=\left\langle f_{2}\right| \hat{H}\left|f_{1}\right\rangle=\hbar^{2} l^{5} / 30 m, \quad H_{22}=\hbar^{2} l^{7} / 105 m \\
H_{33}=\hbar^{2} l^{5} / 40 m, \quad H_{44}=\hbar^{2} l^{9} / 1260 m, \quad H_{34}=H_{43}=\hbar^{2} l^{7} / 280 m \\
S_{12}=S_{21}=\left\langle f_{1} \mid f_{2}\right\rangle=l^{7} / 140, \quad S_{22}=l^{9} / 630 \\
S_{33}=l^{7} / 840, \quad S_{44}=l^{11} / 27720, \quad S_{34}=S_{43}=l^{9} / 5040
\end{gathered}
\)

Equation (8.66) becomes

\(
\left|\begin{array}{ll}
\frac{\hbar^{2} l^{3}}{6 m}-\frac{l^{5}}{30} W & \frac{\hbar^{2} l^{5}}{30 m}-\frac{l^{7}}{140} W \tag{8.70}\\
\frac{\hbar^{2} l^{5}}{30 m}-\frac{l^{7}}{140} W & \frac{\hbar^{2} l^{7}}{105 m}-\frac{l^{9}}{630} W
\end{array}\right|=0
\)

Using Theorem IV of Section 8.3, we eliminate the fractions by multiplying row one of the determinant by $420 m / l^{3}$, row two by $1260 m / l^{5}$, and the right side of (8.70) by both factors, to get

\(
\begin{gather}
\left|\begin{array}{cc}
70 \hbar^{2}-14 m l^{2} W & 14 \hbar^{2} l^{2}-3 m l^{4} W \\
42 \hbar^{2}-9 m l^{2} W & 12 \hbar^{2} l^{2}-2 m l^{4} W
\end{array}\right|=0 \\
m^{2} l^{4} W^{2}-56 m l^{2} \hbar^{2} W+252 \hbar^{4}=0 \\
W=\left(\hbar^{2} / m l^{2}\right)(28 \pm \sqrt{532})=0.1250018 h^{2} / m l^{2}, \quad 1.293495 h^{2} / m l^{2} \tag{8.71}
\end{gather}
\)

Substitution of the integrals into (8.67) leads to the roots (Prob. 8.36)

\(
\begin{equation}
W=\left(\hbar^{2} / m l^{2}\right)(60 \pm \sqrt{1620})=0.5002930 h^{2} / m l^{2}, \quad 2.5393425 h^{2} / m l^{2} \tag{8.72}
\end{equation}
\)

The approximate values $\left(m l^{2} / h^{2}\right) W=0.1250018,0.5002930,1.293495$, and 2.5393425 may be compared with the exact values [Eq. (2.20)] $\left(\mathrm{ml}^{2} / h^{2}\right) E=0.125,0.5,1.125$, and 2 for the four lowest states. The percent errors are $0.0014 \%, 0.059 \%, 15.0 \%$, and $27.0 \%$ for $n=1,2,3$, and 4 , respectively. We did great for $n=1$ and 2 ; lousy for $n=3$ and 4 .

We now find the approximate wave functions corresponding to these $W$ 's. Substitution of $W_{1}=0.1250018 h^{2} / \mathrm{ml}^{2}$ into the set of equations (8.68a) corresponding to (8.71) gives (after division by $h^{2}$ )

\(
\begin{align}
0.023095 c_{1}^{(1)}-0.020381 c_{2}^{(1)} l^{2} & =0 \\
-0.061144 c_{1}^{(1)}+0.053960 c_{2}^{(1)} l^{2} & =0 \tag{8.73}
\end{align}
\)

where, for example, the first coefficient is found from

\(
70 \hbar^{2}-14 m l^{2} W=70 h^{2} / 4 \pi^{2}-14(0.1250018) h^{2}=0.023095 h^{2}
\)

To solve the homogeneous equations (8.73), we follow the procedure given near the end of Section 8.4. We discard the second equation of (8.73), transfer the $c_{2}^{(1)}$ term to the right side, and solve for the coefficient ratio; we get

\(
c_{1}^{(1)}=k, \quad c_{2}^{(1)}=1.133 k / l^{2}
\)

where $k$ is a constant. We find $k$ from the normalization condition:

\(
\begin{aligned}
\left\langle\phi_{1} \mid \phi_{1}\right\rangle & =1=\left\langle k f_{1}+1.133 k f_{2} / l^{2} \mid k f_{1}+1.133 k f_{2} / l^{2}\right\rangle \\
& =k^{2}\left(\left\langle f_{1} \mid f_{1}\right\rangle+2.266\left\langle f_{1} \mid f_{2}\right\rangle / l^{2}+1.284\left\langle f_{2} \mid f_{2}\right\rangle / l^{4}\right) \\
& =k^{2}\left(S_{11}+2.266 S_{12} / l^{2}+1.284 S_{22} / l^{4}\right)=0.05156 k^{2} l^{5}
\end{aligned}
\)

where the previously found values of the overlap integrals were used. Therefore $k=4.404 / l^{5 / 2}$ and

\(
\begin{align}
& \phi_{1}=c_{1}^{(1)} f_{1}+c_{2}^{(1)} f_{2}=4.404 f_{1} / l^{5 / 2}+4.990 f_{2} / l^{9 / 2} \\
& \phi_{1}=l^{-1 / 2}\left[4.404(x / l)(1-x / l)+4.990(x / l)^{2}(1-x / l)^{2}\right] \tag{8.74}
\end{align}
\)

where (8.62) was used.
Using $W_{2}, W_{3}$, and $W_{4}$ in turn in (8.55), we find the following normalized linear variation functions (Prob. 8.38), where $X \equiv x / l$:

\(
\begin{align}
& \phi_{2}=l^{-1 / 2}\left[16.78 X(1-X)\left(\frac{1}{2}-X\right)+71.85 X^{2}(1-X)^{2}\left(\frac{1}{2}-X\right)\right] \\
& \phi_{3}=l^{-1 / 2}\left[28.65 X(1-X)-132.7 X^{2}(1-X)^{2}\right] \tag{8.75}\\
& \phi_{4}=l^{-1 / 2}\left[98.99 X(1-X)\left(\frac{1}{2}-X\right)-572.3 X^{2}(1-X)^{2}\left(\frac{1}{2}-X\right)\right]
\end{align}
\)
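
This whole example can be checked by machine by feeding the $H_{i j}$ and $S_{i j}$ integrals above into a generalized eigenvalue solver (a verification sketch, not part of the text; units are chosen so that $\hbar=m=l=1$):

```python
# Numerical check of the particle-in-a-box linear variation example.
import numpy as np
from scipy.linalg import eigh

H_even = np.array([[1/6,   1/30  ], [1/30,   1/105  ]])   # f1, f2 block
S_even = np.array([[1/30,  1/140 ], [1/140,  1/630  ]])
H_odd  = np.array([[1/40,  1/280 ], [1/280,  1/1260 ]])   # f3, f4 block
S_odd  = np.array([[1/840, 1/5040], [1/5040, 1/27720]])

W13, _ = eigh(H_even, S_even)    # approximations to the n = 1 and n = 3 states
W24, _ = eigh(H_odd,  S_odd)     # approximations to the n = 2 and n = 4 states

# Convert from hbar^2/(m l^2) to h^2/(m l^2) by dividing by 4 pi^2.
print(W13 / (4 * np.pi**2))      # [0.1250018, 1.2934945]
print(W24 / (4 * np.pi**2))      # [0.5002930, 2.5393425]
```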


Matrix algebra (Section 7.10) was developed during the period 1855-1858 by the British mathematician Arthur Cayley as a shorthand way of dealing with simultaneous linear equations and linear transformations from one set of variables to another. [Cayley had to support himself by working as a lawyer for 14 years until he obtained a professorship in 1860. While working as a lawyer, he published hundreds of papers in mathematics and regularly discussed mathematics with his fellow lawyer John Joseph Sylvester. Sylvester coined the term matrix for rectangular arrays (after the Latin word for "womb") and obtained a mathematics professorship in 1855.] Matrix algebra remained unknown to most physicists for many years, and when Heisenberg discovered the matrix-mechanics version of quantum mechanics in 1925, he did not realize that the entities he was dealing with were matrices. When he showed his work to Born, Born, who was well-trained in mathematics, recognized that Heisenberg was using matrices. Nowadays, matrix algebra is an essential tool in quantum chemistry computations and is widely used in most branches of physics.

The set of linear inhomogeneous equations (8.35) can be written as the matrix equation

\(
\begin{gather}
\left(\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1 n} \\
a_{21} & a_{22} & \cdots & a_{2 n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n 1} & a_{n 2} & \cdots & a_{n n}
\end{array}\right)\left(\begin{array}{c}
x_{1} \\
x_{2} \\
\vdots \\
x_{n}
\end{array}\right)=\left(\begin{array}{c}
b_{1} \\
b_{2} \\
\vdots \\
b_{n}
\end{array}\right) \tag{8.76}\\
\mathbf{A x}=\mathbf{b} \tag{8.77}
\end{gather}
\)

where $\mathbf{A}$ is the coefficient matrix and $\mathbf{x}$ and $\mathbf{b}$ are column matrices. The equivalence of (8.35) and (8.76) is readily verified using the matrix-multiplication rule (7.107).

The determinant of a square matrix $\mathbf{A}$ is the determinant whose elements are the same as the elements of $\mathbf{A}$. If $\operatorname{det} \mathbf{A} \neq 0$, the matrix $\mathbf{A}$ is said to be nonsingular.

The inverse of a square matrix $\mathbf{A}$ of order $n$ is the square matrix whose product with $\mathbf{A}$ is the unit matrix of order $n$. Denoting the inverse by $\mathbf{A}^{-1}$, we have

\(
\begin{equation}
\mathbf{A A}^{-1}=\mathbf{A}^{-1} \mathbf{A}=\mathbf{I} \tag{8.78}
\end{equation}
\)

where $\mathbf{I}$ is the unit matrix. One can prove that $\mathbf{A}^{-1}$ exists if and only if $\operatorname{det} \mathbf{A} \neq 0$. (For efficient methods of computing $\mathbf{A}^{-1}$, see Press et al., Section 2.3; Shoup, Section 3.3; Prob. 8.51. Many spreadsheets have a built-in capability to find the inverse of a matrix.)

If det $\mathbf{A} \neq 0$ for the coefficient matrix $\mathbf{A}$ in (8.76), then we can multiply each side of (8.77) by $\mathbf{A}^{-1}$ on the left to get $\mathbf{A}^{-1}(\mathbf{A x})=\mathbf{A}^{-1} \mathbf{b}$. Since matrix multiplication is associative (Section 7.10), we have $\mathbf{A}^{-1}(\mathbf{A x})=\left(\mathbf{A}^{-1} \mathbf{A}\right) \mathbf{x}=\mathbf{I} \mathbf{x}=\mathbf{x}$. Thus, left multiplication of (8.77) by $\mathbf{A}^{-1}$ gives $\mathbf{x}=\mathbf{A}^{-1} \mathbf{b}$ as the solution for the unknowns in a set of linear inhomogeneous equations.
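
A two-line NumPy illustration of $\mathbf{x}=\mathbf{A}^{-1} \mathbf{b}$ (ours, not from the text, on a hypothetical $2 \times 2$ system); in practice np.linalg.solve, which uses elimination, is preferred to forming the inverse explicitly:

```python
# Solve A x = b via the inverse and via elimination.
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(np.linalg.inv(A) @ b)     # x = A^{-1} b  ->  [0.8 1.4]
print(np.linalg.solve(A, b))    # same result, computed by elimination
```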

The linear variation method is widely used to find approximate molecular wave functions, and matrix algebra gives the most computationally efficient method to solve the equations of the linear variation method. If the functions $f_{1}, \ldots, f_{n}$ in the linear variation function $\phi=\sum_{k=1}^{n} c_{k} f_{k}$ are made to be orthonormal, then $S_{i j}=\int f_{i}^{*} f_{j} d \tau=\delta_{i j}$, and the homogeneous set of equations (8.55) for the coefficients $c_{k}$ that minimize the variational integral becomes

\(
\begin{gather}
H_{11} c_{1}+H_{12} c_{2}+\cdots+H_{1 n} c_{n}=W c_{1} \\
H_{21} c_{1}+H_{22} c_{2}+\cdots+H_{2 n} c_{n}=W c_{2} \\
\cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \\
H_{n 1} c_{1}+H_{n 2} c_{2}+\cdots+H_{n n} c_{n}=W c_{n} \tag{8.79a}\\
\left(\begin{array}{cccc}
H_{11} & H_{12} & \cdots & H_{1 n} \\
H_{21} & H_{22} & \cdots & H_{2 n} \\
\vdots & \vdots & \ddots & \vdots \\
H_{n 1} & H_{n 2} & \cdots & H_{n n}
\end{array}\right)\left(\begin{array}{c}
c_{1} \\
c_{2} \\
\vdots \\
c_{n}
\end{array}\right)=W\left(\begin{array}{c}
c_{1} \\
c_{2} \\
\vdots \\
c_{n}
\end{array}\right) \tag{8.79b}\\
\mathbf{H} \mathbf{c}=W \mathbf{c} \tag{8.79c}
\end{gather}
\)

where $\mathbf{H}$ is the square matrix whose elements are $H_{i j}=\left\langle f_{i}\right| \hat{H}\left|f_{j}\right\rangle$ and $\mathbf{c}$ is the column vector of coefficients $c_{1}, \ldots, c_{n}$. In (8.79c), $\mathbf{H}$ is a known matrix and $\mathbf{c}$ and $W$ are unknowns to be solved for.

If

\(
\begin{equation}
\mathbf{A c}=\lambda \mathbf{c} \tag{8.80}
\end{equation}
\)

where $\mathbf{A}$ is a square matrix, $\mathbf{c}$ is a column vector with at least one nonzero element, and $\lambda$ is a scalar, then $\mathbf{c}$ is said to be an eigenvector (or characteristic vector) of $\mathbf{A}$ and $\lambda$ is an eigenvalue (or characteristic value) of $\mathbf{A}$.

Comparison of (8.80) with (8.79c) shows that solving the linear variation problem with $S_{i j}=\delta_{i j}$ amounts to finding the eigenvalues and eigenvectors of the matrix $\mathbf{H}$. The matrix eigenvalue equation $\mathbf{H c}=W \mathbf{c}$ is equivalent to the set of homogeneous equations (8.55), which has a nontrivial solution for the $c$'s if and only if $\operatorname{det}\left(H_{i j}-\delta_{i j} W\right)=0$ [Eq. (8.57) with $S_{i j}=\delta_{i j}$]. For a general square matrix $\mathbf{A}$ of order $n$, the corresponding equation satisfied by the eigenvalues is

\(
\begin{equation}
\operatorname{det}\left(A_{i j}-\delta_{i j} \lambda\right)=0 \tag{8.81}
\end{equation}
\)

Equation (8.81) is called the characteristic equation of matrix $\mathbf{A}$. When the $n$ th-order determinant in (8.81) is expanded, it gives a polynomial in $\lambda$ (called the characteristic polynomial) whose highest power is $\lambda^{n}$. The characteristic polynomial has $n$ roots for $\lambda$ (some of which may be equal to each other and some of which may be imaginary), so a square matrix of order $n$ has $n$ eigenvalues. (The set of eigenvalues of a square matrix $\mathbf{A}$ is sometimes called the spectrum of $\mathbf{A}$.)

The matrix equation (8.79c) for $\mathbf{H}$ corresponds to (8.80) for $\mathbf{A}$. The elements of the eigenvectors of $\mathbf{A}$ satisfy the following set of equations that corresponds to (8.79a):

\(
\begin{align}
& A_{11} c_{1}+A_{12} c_{2}+\cdots+A_{1 n} c_{n}=\lambda c_{1} \\
& \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \cdot \tag{8.82}\\
& A_{n 1} c_{1}+A_{n 2} c_{2}+\cdots+A_{n n} c_{n}=\lambda c_{n}
\end{align}
\)

For each different eigenvalue, we have a different set of equations (8.82) and a different set of numbers $c_{1}, c_{2}, \ldots, c_{n}$, giving a different eigenvector.

If all the eigenvalues of a matrix are different, one can show that solving (8.82) leads to $n$ linearly independent eigenvectors (see Strang, Section 5.2), where linear independence means that no eigenvector can be written as a linear combination of the other eigenvectors. If some eigenvalues are equal, then the matrix may have fewer than $n$ linearly independent eigenvectors. The matrices that occur in quantum mechanics are usually Hermitian (this term is defined later in this section), and a Hermitian matrix of order $n$ always has $n$ linearly independent eigenvectors even if some of its eigenvalues are equal (see Strang, Section 5.6 for the proof).

If $\mathbf{A}$ is a diagonal matrix ( $a_{i j}=0$ for $i \neq j$ ), then the determinant in (8.81) is diagonal. A diagonal determinant equals the product of its diagonal elements [Eq. (8.33)], so the characteristic equation for a diagonal matrix is

\(
\left(a_{11}-\lambda\right)\left(a_{22}-\lambda\right) \cdots\left(a_{n n}-\lambda\right)=0
\)

The roots of this equation are $\lambda_{1}=a_{11}, \lambda_{2}=a_{22}, \ldots, \lambda_{n}=a_{n n}$. The eigenvalues of a diagonal matrix are equal to its diagonal elements. (For the eigenvectors, see Prob. 8.46.)

If $\mathbf{c}$ is an eigenvector of $\mathbf{A}$, then clearly $\mathbf{d} \equiv k \mathbf{c}$ is also an eigenvector of $\mathbf{A}$, where $k$ is any constant. If $k$ is chosen so that

\(
\begin{equation}
\sum_{i=1}^{n}\left|d_{i}\right|^{2}=1 \tag{8.83}
\end{equation}
\)

then the column vector $\mathbf{d}$ is said to be normalized. Two column vectors $\mathbf{b}$ and $\mathbf{c}$ that each have $n$ elements are said to be orthogonal if

\(
\begin{equation}
\sum_{i=1}^{n} b_{i}^{*} c_{i}=0 \tag{8.84}
\end{equation}
\)

Let us denote the $n$ eigenvalues and the corresponding eigenvectors of $\mathbf{H}$ in the variation-method equations (8.79) by $W_{1}, W_{2}, \ldots, W_{n}$ and $\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(n)}$, so that

\(
\begin{equation}
\mathbf{H c}^{(i)}=W_{i} \mathbf{c}^{(i)} \quad \text { for } i=1,2, \ldots, n \tag{8.85}
\end{equation}
\)

where $\mathbf{c}^{(i)}$ is a column vector whose elements are $c_{1}^{(i)}, \ldots, c_{n}^{(i)}$ and the basis functions $f_{i}$ are orthonormal. Furthermore, let $\mathbf{C}$ be the square matrix whose columns are the eigenvectors of $\mathbf{H}$, and let $\mathbf{W}$ be the diagonal matrix whose diagonal elements are the eigenvalues of $\mathbf{H}$:

\(
\mathbf{C}=\left(\begin{array}{cccc}
c_{1}^{(1)} & c_{1}^{(2)} & \cdots & c_{1}^{(n)} \tag{8.86}\\
c_{2}^{(1)} & c_{2}^{(2)} & \cdots & c_{2}^{(n)} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n}^{(1)} & c_{n}^{(2)} & \cdots & c_{n}^{(n)}
\end{array}\right), \quad \mathbf{W}=\left(\begin{array}{cccc}
W_{1} & 0 & \cdots & 0 \\
0 & W_{2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & W_{n}
\end{array}\right)
\)

The set of $n$ eigenvalue equations (8.85) can be written as the single equation:

\(
\begin{equation}
\mathbf{H C}=\mathbf{C W} \tag{8.87}
\end{equation}
\)

To verify the matrix equation (8.87), we show that each element $(\mathbf{H C})_{i j}$ of the matrix $\mathbf{H C}$ equals the corresponding element $(\mathbf{C W})_{i j}$ of $\mathbf{C W}$. The matrix-multiplication rule (7.107) gives $(\mathbf{H C})_{i j}=\sum_{k} H_{i k}(\mathbf{C})_{k j}=\sum_{k} H_{i k} c_{k}^{(j)}$. Consider the eigenvalue equation $\mathbf{H} \mathbf{c}^{(j)}=W_{j} \mathbf{c}^{(j)}$ [Eq. (8.85)]. $\mathbf{H} \mathbf{c}^{(j)}$ and $W_{j} \mathbf{c}^{(j)}$ are column matrices. Using (7.107) to equate the elements in row $i$ of each of these column matrices, we have $\sum_{k} H_{i k} c_{k}^{(j)}=W_{j} c_{i}^{(j)}$. Then

\(
(\mathbf{H C})_{i j}=\sum_{k} H_{i k} c_{k}^{(j)}=W_{j} c_{i}^{(j)}, \quad(\mathbf{C W})_{i j}=\sum_{k}(\mathbf{C})_{i k}(\mathbf{W})_{k j}=\sum_{k} c_{i}^{(k)} \delta_{k j} W_{k}=c_{i}^{(j)} W_{j}
\)

Hence $(\mathbf{H C})_{i j}=(\mathbf{C W})_{i j}$ and (8.87) is proved.
Provided $\mathbf{C}$ has an inverse (see below), we can multiply each side of (8.87) by $\mathbf{C}^{-1}$ on the left to get $\mathbf{C}^{-1} \mathbf{H C}=\mathbf{C}^{-1}(\mathbf{C W})$. [Since matrix multiplication is not commutative, when we multiply each side of $\mathbf{H C}=\mathbf{C W}$ by $\mathbf{C}^{-1}$, we must put the factor $\mathbf{C}^{-1}$ on the left of $\mathbf{H C}$ and on the left of $\mathbf{C W}$ (or on the right of $\mathbf{H C}$ and the right of $\mathbf{C W}$ ).] We have $\mathbf{C}^{-1} \mathbf{H C}=\mathbf{C}^{-1}(\mathbf{C W})=\left(\mathbf{C}^{-1} \mathbf{C}\right) \mathbf{W}=\mathbf{I} \mathbf{W}=\mathbf{W}:$

\(
\begin{equation}
\mathbf{C}^{-1} \mathbf{H C}=\mathbf{W} \tag{8.88}
\end{equation}
\)

To simplify (8.88), we must learn more about matrices.
A square matrix $\mathbf{B}$ is a symmetric matrix if all its elements satisfy $b_{m n}=b_{n m}$. The elements of a symmetric matrix are symmetric about the principal diagonal; for example, $b_{12}=b_{21}$. A square matrix $\mathbf{D}$ is a Hermitian matrix if all its elements satisfy $d_{m n}=d_{n m}^{*}$. For example, if

\(
\mathbf{M}=\left(\begin{array}{ccc}
2 & 5 & 0 \tag{8.89}\\
5 & i & 2 i \\
0 & 2 i & 4
\end{array}\right), \quad \mathbf{N}=\left(\begin{array}{ccc}
6 & 1+2 i & 8 \\
1-2 i & -1 & -i \\
8 & i & 0
\end{array}\right)
\)

then $\mathbf{M}$ is symmetric and $\mathbf{N}$ is Hermitian. (Note that the diagonal elements of a Hermitian matrix must be real; $d_{m m}=d_{m m}^{*}$.) A real matrix is one whose elements are all real numbers. A real Hermitian matrix is a symmetric matrix.

The transpose $\mathbf{A}^{\mathrm{T}}$ (often written $\widetilde{\mathbf{A}}$) of the matrix $\mathbf{A}$ is the matrix formed by interchanging rows and columns of $\mathbf{A}$ so that column 1 becomes row one, column 2 becomes row two, and so on. The elements $a_{m n}^{\mathrm{T}}$ of $\mathbf{A}^{\mathrm{T}}$ are related to the elements of $\mathbf{A}$ by $a_{m n}^{\mathrm{T}}=a_{n m}$. For a square matrix, the transpose is found by reflecting the elements about the principal diagonal. A symmetric matrix is equal to its transpose. Thus, for the matrix $\mathbf{M}$ in (8.89), we have $\mathbf{M}^{\mathrm{T}}=\mathbf{M}$.

The complex conjugate $\mathbf{A}^{*}$ of $\mathbf{A}$ is the matrix formed by taking the complex conjugate of each element of $\mathbf{A}$. The conjugate transpose $\mathbf{A}^{\dagger}$ (read as A dagger) of the matrix $\mathbf{A}$ is formed by taking the transpose of $\mathbf{A}^{*}$; thus $\mathbf{A}^{\dagger}=\left(\mathbf{A}^{*}\right)^{\mathrm{T}}$ and

\(
\begin{equation}
a_{m n}^{\dagger}=\left(a_{n m}\right)^{*} \tag{8.90}
\end{equation}
\)

(Physicists call $\mathbf{A}^{\dagger}$ the adjoint of $\mathbf{A}$, a name that has been used by mathematicians to denote an entirely different matrix.) An example is

\(
\mathbf{B}=\left(\begin{array}{cc}
2 & 3+i \\
0 & 4 i
\end{array}\right), \quad \mathbf{B}^{\mathrm{T}}=\left(\begin{array}{cc}
2 & 0 \\
3+i & 4 i
\end{array}\right), \quad \mathbf{B}^{\dagger}=\left(\begin{array}{cc}
2 & 0 \\
3-i & -4 i
\end{array}\right)
\)

For a Hermitian matrix $\mathbf{A}$, Eq. (8.90) gives $a_{m n}^{\dagger}=\left(a_{n m}\right)^{*}=a_{m n}$, so $\left(\mathbf{A}^{\dagger}\right)_{m n}=(\mathbf{A})_{m n}$ and $\mathbf{A}^{\dagger}=\mathbf{A}$. A Hermitian matrix is equal to its conjugate transpose. (Physicists often use the term self-adjoint for a Hermitian matrix.)
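
These operations map directly onto NumPy (a short demo, not from the text), using the matrices $\mathbf{B}$ and $\mathbf{N}$ above:

```python
# Transpose, conjugate transpose, and a Hermitian check in NumPy.
import numpy as np

B = np.array([[2, 3 + 1j], [0, 4j]])
print(B.T)            # the transpose B^T
print(B.conj().T)     # the conjugate transpose B^dagger

N = np.array([[6, 1 + 2j, 8], [1 - 2j, -1, -1j], [8, 1j, 0]])
print(np.allclose(N, N.conj().T))   # True: N equals its conjugate transpose
```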

An orthogonal matrix is a square matrix whose inverse is equal to its transpose:

\(
\begin{equation}
\mathbf{A}^{-1}=\mathbf{A}^{\mathrm{T}} \quad \text { if } \mathbf{A} \text { is orthogonal } \tag{8.91}
\end{equation}
\)

A unitary matrix is one whose inverse is equal to its conjugate transpose:

\(
\begin{equation}
\mathbf{U}^{-1}=\mathbf{U}^{\dagger} \quad \text { if } \mathbf{U} \text { is unitary } \tag{8.92}
\end{equation}
\)

From the definition (8.92), we have $\mathbf{U}^{\dagger} \mathbf{U}=\mathbf{I}$ if $\mathbf{U}$ is unitary. By equating $\left(\mathbf{U}^{\dagger} \mathbf{U}\right)_{m n}$ to $(\mathbf{I})_{m n}$, we find (Prob. 8.43)

\(
\begin{equation}
\sum_{k} u_{k m}^{*} u_{k n}=\delta_{m n} \tag{8.93}
\end{equation}
\)

for columns $m$ and $n$ of a unitary matrix. Thus the columns of a unitary matrix (viewed as column vectors) are orthogonal and normalized (orthonormal), as defined by (8.84) and (8.83). Conversely, if (8.93) is true for all columns, then $\mathbf{U}$ is a unitary matrix. If $\mathbf{U}$ is unitary and real, then $\mathbf{U}^{\dagger}=\mathbf{U}^{\mathrm{T}}$, and $\mathbf{U}$ is an orthogonal matrix.

One can prove that two eigenvectors of a Hermitian matrix $\mathbf{H}$ that correspond to different eigenvalues are orthogonal (see Strang, Section 5.5). For eigenvectors of $\mathbf{H}$ that correspond to the same eigenvalue, one can take linear combinations of them that will be orthogonal eigenvectors of $\mathbf{H}$. Moreover, the elements of an eigenvector can be multiplied by a constant to normalize the eigenvector. Hence, the eigenvectors of a Hermitian matrix can be chosen to be orthonormal. If the eigenvectors are chosen to be orthonormal, then the eigenvector matrix $\mathbf{C}$ in (8.86) is a unitary matrix, and $\mathbf{C}^{-1}=\mathbf{C}^{\dagger}$; Eq. (8.88) then becomes

\(
\begin{equation}
\mathbf{C}^{\dagger} \mathbf{H C}=\mathbf{W} \quad \text { if } \mathbf{H} \text { is Hermitian } \tag{8.94}
\end{equation}
\)

For the common case that $\mathbf{H}$ is real as well as Hermitian (that is, $\mathbf{H}$ is real and symmetric), the $c$ 's in (8.79a) are real (since $W$ and the $H_{i j}$ 's are real) and $\mathbf{C}$ is real as well as unitary; that is, $\mathbf{C}$ is orthogonal, with $\mathbf{C}^{-1}=\mathbf{C}^{\mathrm{T}}$; Eq. (8.94) becomes

\(
\begin{equation}
\mathbf{C}^{\mathrm{T}} \mathbf{H C}=\mathbf{W} \quad \text { if } \mathbf{H} \text { is real and symmetric } \tag{8.95}
\end{equation}
\)

The eigenvalues of a Hermitian matrix can be proven to be real numbers (Strang, Section 5.5).

EXAMPLE

Find the eigenvalues and normalized eigenvectors of the Hermitian matrix

\(
\mathbf{A}=\left(\begin{array}{cc}
3 & 2 i \
-2 i & 0
\end{array}\right)
\)

by solving algebraic equations. Then verify that $\mathbf{C}^{\dagger} \mathbf{A C}$ is diagonal, where $\mathbf{C}$ is the eigenvector matrix.
The characteristic equation (8.81) for the eigenvalues $\lambda$ is $\operatorname{det}\left(a_{i j}-\delta_{i j} \lambda\right)=0$, which becomes

\(
\begin{aligned}
& \left|\begin{array}{cc}
3-\lambda & 2 i \\
-2 i & -\lambda
\end{array}\right|=0 \\
& \lambda^{2}-3 \lambda-4=0 \\
& \lambda_{1}=4, \quad \lambda_{2}=-1
\end{aligned}
\)

A useful theorem in checking eigenvalue calculations is the following (Strang, Exercise 5.1.9): The sum of the diagonal elements of a square matrix $\mathbf{A}$ of order $n$ is equal to the sum of the eigenvalues $\lambda_{i}$ of $\mathbf{A}$; that is, $\sum_{i=1}^{n} a_{i i}=\sum_{i=1}^{n} \lambda_{i}$. In this example, $\sum_{i} a_{i i}=3+0=3$, which equals the sum $4-1=3$ of the eigenvalues.
For the root $\lambda_{1}=4$, the set of simultaneous equations (8.82) is

\(
\begin{array}{r}
\left(3-\lambda_{1}\right) c_{1}^{(1)}+2 i c_{2}^{(1)}=0 \\
-2 i c_{1}^{(1)}-\lambda_{1} c_{2}^{(1)}=0
\end{array}
\)

or

\(
\begin{aligned}
& -c_{1}^{(1)}+2 i c_{2}^{(1)}=0 \\
& -2 i c_{1}^{(1)}-4 c_{2}^{(1)}=0
\end{aligned}
\)

Discarding either one of these equations, we find

\(
c_{1}^{(1)}=2 i c_{2}^{(1)}
\)

Normalization gives

\(
\begin{gathered}
1=\left|c_{1}^{(1)}\right|^{2}+\left|c_{2}^{(1)}\right|^{2}=4\left|c_{2}^{(1)}\right|^{2}+\left|c_{2}^{(1)}\right|^{2} \\
\left|c_{2}^{(1)}\right|=1 / \sqrt{5}, \quad c_{2}^{(1)}=1 / \sqrt{5} \\
c_{1}^{(1)}=2 i c_{2}^{(1)}=2 i / \sqrt{5}
\end{gathered}
\)

where the phase of $c_{2}^{(1)}$ was chosen to be zero.
Similarly, we find for $\lambda_{2}=-1$ (Prob. 8.49)

\(
c_{1}^{(2)}=-i / \sqrt{5}, \quad c_{2}^{(2)}=2 / \sqrt{5}
\)

The normalized eigenvectors are then

\(
\mathbf{c}^{(1)}=\binom{2 i / \sqrt{5}}{1 / \sqrt{5}}, \quad \mathbf{c}^{(2)}=\binom{-i / \sqrt{5}}{2 / \sqrt{5}}
\)

Because the eigenvalues $\lambda_{1}$ and $\lambda_{2}$ of the Hermitian matrix $\mathbf{A}$ differ, $\mathbf{c}^{(1)}$ and $\mathbf{c}^{(2)}$ are orthogonal (as you should verify). Also, $\mathbf{c}^{(1)}$ and $\mathbf{c}^{(2)}$ are normalized. Therefore, $\mathbf{C}$ is unitary and $\mathbf{C}^{-1}=\mathbf{C}^{\dagger}$. Forming $\mathbf{C}$ and its conjugate transpose, we have

\(
\begin{aligned}
\mathbf{C}^{-1} \mathbf{A} \mathbf{C}=\mathbf{C}^{\dagger} \mathbf{A C} & =\left(\begin{array}{rr}
-2 i / \sqrt{5} & 1 / \sqrt{5} \\
i / \sqrt{5} & 2 / \sqrt{5}
\end{array}\right)\left(\begin{array}{rr}
3 & 2 i \\
-2 i & 0
\end{array}\right)\left(\begin{array}{rr}
2 i / \sqrt{5} & -i / \sqrt{5} \\
1 / \sqrt{5} & 2 / \sqrt{5}
\end{array}\right) \\
& =\left(\begin{array}{rr}
-2 i / \sqrt{5} & 1 / \sqrt{5} \\
i / \sqrt{5} & 2 / \sqrt{5}
\end{array}\right)\left(\begin{array}{rr}
8 i / \sqrt{5} & i / \sqrt{5} \\
4 / \sqrt{5} & -2 / \sqrt{5}
\end{array}\right)=\left(\begin{array}{rr}
4 & 0 \\
0 & -1
\end{array}\right)
\end{aligned}
\)

which is the diagonal matrix of eigenvalues.
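
The same example takes a few lines with NumPy's eigh, which diagonalizes Hermitian matrices (a check, not part of the text; note that eigh returns the eigenvalues in increasing order, and the eigenvector phases may differ from the hand calculation):

```python
# Check the 2x2 Hermitian eigenvalue example numerically.
import numpy as np

A = np.array([[3, 2j], [-2j, 0]])
lam, C = np.linalg.eigh(A)
print(lam)                                       # [-1.  4.], real as expected
print(np.round(C.conj().T @ A @ C, 10))          # the diagonal matrix diag(-1, 4)
print(np.isclose(np.trace(A).real, lam.sum()))   # trace equals eigenvalue sum
```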

We have shown that if $\mathbf{H}$ is a real symmetric matrix with eigenvalues $W_{i}$ and orthonormal eigenvectors $\mathbf{c}^{(i)}$ (that is, if $\mathbf{H} \mathbf{c}^{(i)}=W_{i} \mathbf{c}^{(i)}$ for $i=1,2, \ldots, n$), then $\mathbf{C}^{\mathrm{T}} \mathbf{H C}=\mathbf{W}$ [Eq. (8.95)], where $\mathbf{C}$ is the real orthogonal matrix whose columns are the eigenvectors $\mathbf{c}^{(i)}$ and $\mathbf{W}$ is the diagonal matrix of eigenvalues $W_{i}$. The converse of this theorem is readily proved; that is, if $\mathbf{H}$ is a real symmetric matrix, $\mathbf{B}$ is a real orthogonal matrix, and $\mathbf{B}^{\mathrm{T}} \mathbf{H B}$ equals a diagonal matrix $\boldsymbol{\Lambda}$, then the columns of $\mathbf{B}$ are the eigenvectors of $\mathbf{H}$ and the diagonal elements of $\boldsymbol{\Lambda}$ are the eigenvalues of $\mathbf{H}$.

To find the eigenvalues and eigenvectors of a Hermitian matrix of order $n$, we can use either of the following procedures: (1) Solve the characteristic equation $\operatorname{det}\left(H_{i j}-\delta_{i j} W\right)=0$ [Eq. (8.81)] for the eigenvalues $W_{1}, \ldots, W_{n}$. Then substitute each $W_{k}$ into the set of algebraic equations (8.79a) and solve for the elements $c_{1}^{(k)}, \ldots, c_{n}^{(k)}$ of the $k$th eigenvector. (2) Search for a unitary matrix $\mathbf{C}$ such that $\mathbf{C}^{\dagger} \mathbf{H C}$ is a diagonal matrix. The diagonal elements of $\mathbf{C}^{\dagger} \mathbf{H C}$ are the eigenvalues of $\mathbf{H}$, and the columns of $\mathbf{C}$ are the orthonormal eigenvectors of $\mathbf{H}$. For the large matrices that occur in quantum chemistry, procedure (2) (called matrix diagonalization) is computationally much faster than (1).

One reason that expanding the characteristic determinant and solving the characteristic equation is not a good way to find the eigenvalues of large matrices is that, for large matrices, a very small change in a coefficient in the characteristic polynomial may produce a large change in the eigenvalues (see Prob. 8.54). Hence we might have to calculate the coefficients in the characteristic polynomial to hundreds or thousands of decimal places in order to get eigenvalues accurate to a few decimal places. Although it is true that for certain matrices a tiny change in the value of an element of that matrix might produce large changes in the eigenvalues, one can prove that for Hermitian matrices, a small change in a matrix element always produces only small changes in the eigenvalues. Hence method (2) of the preceding paragraph is the correct way to get accurate eigenvalues.

A systematic way to diagonalize a real symmetric matrix $\mathbf{H}$ is as follows. Construct an orthogonal matrix $\mathbf{O}_{1}$ such that the matrix $\mathbf{H}_{1} \equiv \mathbf{O}_{1}^{\mathrm{T}} \mathbf{H} \mathbf{O}_{1}$ has zero in place of the off-diagonal elements $H_{12}$ and $H_{21}$ of $\mathbf{H}$. (Because $\mathbf{H}$ is symmetric, we have $H_{12}=H_{21}$. Also, the transformed matrices $\mathbf{H}_{1}, \mathbf{H}_{2}, \ldots$ are symmetric.) Then construct an orthogonal matrix $\mathbf{O}_{2}$ such that $\mathbf{H}_{2} \equiv \mathbf{O}_{2}^{\mathrm{T}} \mathbf{H}_{1} \mathbf{O}_{2}=\mathbf{O}_{2}^{\mathrm{T}} \mathbf{O}_{1}^{\mathrm{T}} \mathbf{H} \mathbf{O}_{1} \mathbf{O}_{2}$ has zeros in place of the elements $\left(\mathbf{H}_{1}\right)_{13}$ and $\left(\mathbf{H}_{1}\right)_{31}$ of $\mathbf{H}_{1}$; and so on. Unfortunately, when a given pair of off-diagonal elements is made zero in a step, some off-diagonal elements made zero in a previous step are likely to become nonzero, so one has to go back and recycle through the off-diagonal elements over and over again. Generally an infinite number of steps are required to make all off-diagonal elements equal to zero. In practice, one skips a step if the absolute value of the off-diagonal elements to be zeroed in that step is less than some tiny number, and one stops the procedure when the absolute values of all off-diagonal elements are less than some tiny number. The eigenvalues are then the diagonal elements of the transformed matrix $\cdots \mathbf{O}_{3}^{\mathrm{T}} \mathbf{O}_{2}^{\mathrm{T}} \mathbf{O}_{1}^{\mathrm{T}} \mathbf{H} \mathbf{O}_{1} \mathbf{O}_{2} \mathbf{O}_{3} \cdots$, and the eigenvector matrix is the product $\mathbf{O}_{1} \mathbf{O}_{2} \mathbf{O}_{3} \cdots$. This method (the cyclic Jacobi method) is not very efficient for large matrices when run on a serial computer but is efficient on a parallel computer.
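A minimal NumPy sketch of the cyclic Jacobi procedure just described (the test matrix, tolerance, and sweep limit are illustrative choices, not from the text):

import numpy as np

def jacobi_eigh(H, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi diagonalization of a real symmetric matrix."""
    A = np.array(H, dtype=float)         # work on a copy
    n = A.shape[0]
    V = np.eye(n)                        # accumulates the product O1 O2 O3 ...
    for _ in range(max_sweeps):
        off = A - np.diag(np.diag(A))
        if np.max(np.abs(off)) < tol:    # all off-diagonal elements tiny: stop
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol:   # skip a step if already ~zero
                    continue
                # rotation angle that zeroes A[p, q] (it may be revived later)
                phi = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(phi), np.sin(phi)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                A = J.T @ A @ J          # orthogonal similarity transformation
                V = V @ J
    return np.diag(A), V

H = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 4.0]])
w, O = jacobi_eigh(H)
print(np.allclose(O.T @ H @ O, np.diag(w), atol=1e-8))   # True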

Approaches more efficient than the Jacobi method for diagonalizing real, symmetric matrices begin by carrying out a series of orthogonal transformations to reduce the original matrix $\mathbf{H}$ to a symmetric tridiagonal matrix $\mathbf{T}$. A tridiagonal matrix is one whose elements are all zero except for those on the principal diagonal (elements $t_{i i}$) and those on the diagonals immediately above and immediately below the principal diagonal (elements $t_{i-1, i}$ and $t_{i+1, i}$, respectively). The relation between $\mathbf{T}$ and $\mathbf{H}$ is $\mathbf{T}=\mathbf{O}^{\mathrm{T}} \mathbf{H O}$, where $\mathbf{O}$ is a real orthogonal matrix that is the product of the orthogonal matrices used in the individual steps of going from $\mathbf{H}$ to $\mathbf{T}$. Two efficient methods of transforming $\mathbf{H}$ to tridiagonal form are due to Givens and to Householder. An efficient method to find the eigenvalues of a symmetric tridiagonal matrix is the QR method. Here, $\mathbf{T}$ is expressed as the product of an orthogonal matrix $\mathbf{Q}$ and an upper triangular matrix $\mathbf{R}$ (one whose elements below the principal diagonal are all zero). A series of iterative steps yields matrices converging to a diagonal matrix whose diagonal elements are the eigenvalues of $\mathbf{T}$, which equal the eigenvalues of $\mathbf{H}$ (Prob. 8.55). With certain refinements, the QR method is a very efficient way to find eigenvalues and eigenvectors (see Strang, Sections 5.3 and 7.3 for details).
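A bare-bones sketch of the QR idea (deliberately omitting the tridiagonalization step, shifts, and deflation that production codes use): since $\mathbf{T}_{k}=\mathbf{Q}_{k} \mathbf{R}_{k}$ implies $\mathbf{T}_{k+1}=\mathbf{R}_{k} \mathbf{Q}_{k}=\mathbf{Q}_{k}^{\mathrm{T}} \mathbf{T}_{k} \mathbf{Q}_{k}$, each iterate is orthogonally similar to $\mathbf{T}$ and so has the same eigenvalues.

import numpy as np

T = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])      # a small symmetric tridiagonal matrix

for _ in range(200):                 # unshifted QR iteration: slow but simple
    Q, R = np.linalg.qr(T)
    T = R @ Q                        # similarity transform; eigenvalues unchanged

print(np.round(np.diag(T), 6))       # the off-diagonal elements have decayed to ~0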

Details of matrix diagonalization procedures and computer programs are given in Press et al., Chapter 11; Acton, Chapters 8 and 13; Shoup, Chapter 4.

A major compilation of procedures and computer programs for scientific and engineering calculations is Press et al.; the text of older editions of this book is available free on the Internet at www.nr.com. For comments on older editions of this book, see amath.colorado.edu/computing/Fortran/numrec.html.

Programs for mathematical and scientific calculations can be found at www.netlib.org and at gams.nist.gov. Downloadable free personal-computer mathematical software and demonstration software for such commercial programs as Mathcad and Maple can be found at archives.math.utk.edu.

The procedure for using matrix algebra to solve the linear variation equations when nonorthonormal basis functions are used is outlined in Prob. 8.57.
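For readers who want to experiment, SciPy's generalized symmetric eigensolver handles the nonorthonormal case, $\mathbf{H} \mathbf{c}^{(k)}=W_{k} \mathbf{S} \mathbf{c}^{(k)}$ with $\mathbf{S}$ the overlap matrix, in one call. The numbers below are made up for illustration:

import numpy as np
from scipy.linalg import eigh

# Illustrative 2x2 Hamiltonian and overlap matrices in a nonorthonormal basis
H = np.array([[-1.00, -0.50],
              [-0.50, -0.75]])
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])

W, C = eigh(H, S)                            # solves H c = W S c
print(W)                                     # the variational energies
print(C[:, 0])                               # coefficients of the lowest function
print(np.allclose(C.T @ S @ C, np.eye(2)))   # eigenvectors are S-orthonormal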

The Excel spreadsheet can be used to find eigenvalues and eigenvectors; see Prob. 8.53.
Computer algebra systems such as Maple, Mathematica, and Mathcad and some electronic calculators have built-in commands to easily find eigenvalues and eigenvectors.

The methods for finding matrix eigenvalues and eigenvectors discussed in this section are useful for matrices of order up to $10^{3}$. Special methods are used to find the lowest few eigenvalues and corresponding eigenvectors of matrices of order up to $10^{9}$ that occur in certain quantum-chemistry calculations (see Section 16.2).
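As a small illustration of such methods, sparse iterative (Lanczos-type) solvers return only the lowest few eigenpairs and never form a dense matrix. In the SciPy sketch below, the random sparse tridiagonal matrix is merely a stand-in for a large Hamiltonian matrix:

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 10_000
rng = np.random.default_rng(0)
main = rng.normal(size=n)                 # diagonal elements
off = 0.1 * rng.normal(size=n - 1)        # sub- and superdiagonal elements
H = sp.diags([off, main, off], offsets=[-1, 0, 1], format='csr')

# Lowest 3 eigenvalues ('SA' = smallest algebraic) by Lanczos-type iteration
w, v = eigsh(H, k=3, which='SA')
print(w)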

As noted after Eq. (8.80), for the linear variation function $\sum_{i=1}^{n} c_{i} f_{i}$ with orthonormal basis functions $f_{i}$, the eigenvalues of the matrix $\mathbf{H}$ formed from the matrix elements $\left\langle f_{i}\right| \hat{H}\left|f_{k}\right\rangle$ are the roots of the secular equation, and the eigenvector corresponding to the eigenvalue $W_{m}$ gives the coefficients in the variation function that corresponds to $W_{m}$. Problems 8.60 to 8.65 apply the linear variation method to problems such as the double well and the harmonic oscillator using particle-in-a-box wave functions as basis functions and using a computer algebra program such as Mathcad to find the eigenvalues and eigenvectors of the $\mathbf{H}$ matrix.
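A sketch of such a calculation in Python rather than Mathcad: the harmonic oscillator ($\hbar=m=\omega=1$, exact energies $v+\frac{1}{2}$) expanded in particle-in-a-box basis functions. The box length and basis size below are illustrative choices.

import numpy as np
from scipy.integrate import quad

l, n_basis = 12.0, 20                 # box on 0 <= x <= l; oscillator centered at l/2

def f(i, x):                          # normalized box function, i = 1, 2, ...
    return np.sqrt(2.0 / l) * np.sin(i * np.pi * x / l)

H = np.zeros((n_basis, n_basis))
for i in range(1, n_basis + 1):
    for j in range(1, n_basis + 1):
        # potential-energy matrix element <f_i| (1/2)(x - l/2)^2 |f_j>
        V, _ = quad(lambda x: f(i, x) * 0.5 * (x - l / 2) ** 2 * f(j, x), 0, l)
        H[i - 1, j - 1] = V
        if i == j:                    # kinetic energy is diagonal in this basis
            H[i - 1, j - 1] += i ** 2 * np.pi ** 2 / (2 * l ** 2)

W, C = np.linalg.eigh(H)
print(W[:4])                          # approaches 0.5, 1.5, 2.5, 3.5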

We have discussed matrix diagonalization in the context of the linear variation method. However, finding the eigenvalues $a_{k}$ and eigenfunctions $g_{k}$ of any Hermitian operator $\hat{A}$ (where $\hat{A} g_{k}=a_{k} g_{k}$) can be formulated as a matrix-diagonalization problem. If we choose a complete, orthonormal basis set $\left\{f_{i}\right\}$ and expand the eigenfunctions as $g_{k}=\sum_{i} c_{i}^{(k)} f_{i}$, then (Prob. 8.59) the eigenvalues of the matrix $\mathbf{A}$ whose elements are $a_{i j}=\left\langle f_{i}\right| \hat{A}\left|f_{j}\right\rangle$ are the eigenvalues of the operator $\hat{A}$, and the elements $c_{i}^{(k)}$ of the eigenvectors $\mathbf{c}^{(k)}$ of $\mathbf{A}$ give the coefficients in the expansions of the eigenfunctions $g_{k}$.

The material of this section further emphasizes the correspondence between linear operators and matrices and the correspondence between functions and column vectors (Section 7.10).

The PageRanks of Web pages indexed by Google are calculated by Google by constructing a certain matrix (called the Google matrix) and finding its eigenvector that corresponds to the eigenvalue 1. The order of the Google matrix equals the number of indexed pages and was between $10^{10.5}$ and $10^{11}$ in 2012. The components of this eigenvector are the PageRanks. For details, see www.ams.org/samplings/featurecolumn/fcarc-pagerank.
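A toy version of the calculation, with a made-up four-page link pattern and the usual damping construction (the real Google matrix is far larger and its details more elaborate): repeated multiplication by the matrix (power iteration) converges to the eigenvalue-1 eigenvector, whose components are the PageRanks.

import numpy as np

links = np.array([[0, 0, 1, 0],       # links[i, j] = 1 if page j links to page i
                  [1, 0, 0, 1],
                  [1, 1, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
S = links / links.sum(axis=0)         # column-stochastic link matrix
d = 0.85                              # damping factor
G = d * S + (1 - d) / 4 * np.ones((4, 4))

r = np.full(4, 0.25)                  # start from a uniform rank vector
for _ in range(100):
    r = G @ r                         # power iteration
print(r)                              # the PageRanks of the four pages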


Perturbation Theory

Click the keywords below to read more about it.

Perturbation Theory: A quantum-mechanical approximation method used to find an approximate solution to a problem by starting from the exact solution of a related, simpler problem

Hamiltonian Operator: An operator corresponding to the total energy of the system, including both kinetic and potential energies

Schrödinger Equation: A fundamental equation in quantum mechanics that describes how the quantum state of a physical system changes over time

Eigenfunctions and Eigenvalues: Solutions to the Schrödinger equation where eigenfunctions represent the possible states of the system and eigenvalues represent the corresponding energy levels

Unperturbed System: A system whose Hamiltonian is exactly solvable and serves as the starting point for perturbation theory

Perturbed System: A system whose Hamiltonian is slightly different from the unperturbed system, making it more complex to solve

Perturbation: The difference between the Hamiltonians of the perturbed and unperturbed systems

Nondegenerate Perturbation Theory: Perturbation theory applied to energy levels that are not degenerate (i.e., each energy level corresponds to a unique state)

Degenerate Perturbation Theory: Perturbation theory applied to energy levels that are degenerate (i.e., multiple states share the same energy level)

First-Order Energy Correction: The initial correction to the energy of a system due to perturbation

Second-Order Energy Correction: The subsequent correction to the energy of a system, taking into account the first-order correction

Intermediate Normalization: A simplification method where the perturbed wave function is required to satisfy a specific normalization condition

Configuration Interaction: The mixing of different configurations in the wave function due to perturbation

Variation-Perturbation Method: A method that combines variational principles and perturbation theory to estimate higher-order energy corrections

Coulomb Integral: An integral representing the electrostatic energy of repulsion between two charge distributions

Exchange Integral: An integral representing the interaction between two electrons when their positions are exchanged

Transition Dipole Moment: A measure of the probability of a transition between two states due to interaction with electromagnetic radiation

Selection Rules: Rules that determine the allowed transitions between quantum states based on the change in quantum numbers

Time-Dependent Perturbation Theory: Perturbation theory applied to systems exposed to time-dependent perturbations, such as electromagnetic radiation

Stimulated Emission: The process by which an atom or molecule emits a photon when exposed to radiation, causing a transition to a lower energy state

Spontaneous Emission: The process by which an atom or molecule emits a photon without external stimulation, causing a transition to a lower energy state

Absorption: The process by which an atom or molecule absorbs a photon, causing a transition to a higher energy state

Secular Equation: An algebraic equation used to find the energy corrections in degenerate perturbation theory

Hermitian Operator: An operator whose eigenvalues are real and whose eigenfunctions form a complete orthonormal set

Orthonormality: The property of eigenfunctions being orthogonal and normalized

This chapter discusses the second major quantum-mechanical approximation method, perturbation theory.

Suppose we have a system with a time-independent Hamiltonian operator $\hat{H}$ and we are unable to solve the Schrödinger equation

\(
\begin{equation}
\hat{H} \psi_{n}=E_{n} \psi_{n} \tag{9.1}
\end{equation}
\)

for the eigenfunctions and eigenvalues of the bound stationary states. Suppose also that the Hamiltonian $\hat{H}$ is only slightly different from the Hamiltonian $\hat{H}^{0}$ of a system whose Schrödinger equation

\(
\begin{equation}
\hat{H}^{0} \psi_{n}^{(0)}=E_{n}^{(0)} \psi_{n}^{(0)} \tag{9.2}
\end{equation}
\)

we can solve. An example is the one-dimensional anharmonic oscillator with

\(
\begin{equation}
\hat{H}=-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+\frac{1}{2} k x^{2}+c x^{3}+d x^{4} \tag{9.3}
\end{equation}
\)

The Hamiltonian (9.3) is closely related to the Hamiltonian

\(
\begin{equation}
\hat{H}^{0}=-\frac{\hbar^{2}}{2 m} \frac{d^{2}}{d x^{2}}+\frac{1}{2} k x^{2} \tag{9.4}
\end{equation}
\)

of the harmonic oscillator. If the constants $c$ and $d$ in (9.3) are small, we expect the eigenfunctions and eigenvalues of the anharmonic oscillator to be closely related to those of the harmonic oscillator.

We shall call the system with Hamiltonian $\hat{H}^{0}$ the unperturbed system. The system with Hamiltonian $\hat{H}$ is the perturbed system. The difference between the two Hamiltonians is the perturbation $\hat{H}^{\prime}$ :

\(
\begin{equation}
\hat{H}^{\prime} \equiv \hat{H}-\hat{H}^{0} \tag{9.5}
\end{equation}
\)

\(
\begin{equation}
\hat{H}=\hat{H}^{0}+\hat{H}^{\prime} \tag{9.6}
\end{equation}
\)

(The prime does not denote differentiation.) For the anharmonic oscillator with Hamiltonian (9.3), the perturbation on the related harmonic oscillator is $\hat{H}^{\prime}=c x^{3}+d x^{4}$.

In $\hat{H}^{0} \psi_{n}^{(0)}=E_{n}^{(0)} \psi_{n}^{(0)}$ [Eq. (9.2)], $E_{n}^{(0)}$ and $\psi_{n}^{(0)}$ are called the unperturbed energy and unperturbed wave function of state $n$. For $\hat{H}^{0}$ equal to the harmonic-oscillator Hamiltonian (9.4), $E_{n}^{(0)}$ is $\left(n+\frac{1}{2}\right) h \nu$ [Eq. (4.45)], where $n$ is a nonnegative integer. ($n$ is used instead of $v$ for consistency with the perturbation-theory notation.) Note that the superscript ${ }^{(0)}$ does not mean the ground state. Perturbation theory can be applied to any state. The subscript $n$ labels the state we are dealing with. The superscript ${ }^{(0)}$ denotes the unperturbed system.

Our task is to relate the unknown eigenvalues and eigenfunctions of the perturbed system to the known eigenvalues and eigenfunctions of the unperturbed system. To aid in doing so, we shall imagine that the perturbation is applied gradually, giving a continuous change from the unperturbed to the perturbed system. Mathematically, this corresponds to inserting a parameter $\lambda$ into the Hamiltonian, so that

\(
\begin{equation}
\hat{H}=\hat{H}^{0}+\lambda \hat{H}^{\prime} \tag{9.7}
\end{equation}
\)

When $\lambda$ is zero, we have the unperturbed system. As $\lambda$ increases, the perturbation grows larger, and at $\lambda=1$ the perturbation is fully "turned on." We inserted $\lambda$ to help relate the perturbed and unperturbed eigenfunctions, and ultimately we shall set $\lambda=1$, thereby eliminating it.

Sections 9.1 to 9.7 deal with time-independent Hamiltonians and stationary states. Section 9.8 deals with time-dependent perturbations.


Nondegenerate Perturbation Theory

The perturbation treatments of degenerate and nondegenerate energy levels differ. This section examines the effect of a perturbation on a nondegenerate level. If some of the energy levels of the unperturbed system are degenerate while others are nondegenerate, the treatment in this section will apply to the nondegenerate levels only.

Let $\psi_{n}^{(0)}$ be the wave function of some particular unperturbed nondegenerate level with energy $E_{n}^{(0)}$. Let $\psi_{n}$ be the perturbed wave function into which $\psi_{n}^{(0)}$ is converted when the perturbation is applied. From (9.1) and (9.7), the Schrödinger equation for the perturbed state is

\(
\begin{equation}
\hat{H} \psi_{n}=\left(\hat{H}^{0}+\lambda \hat{H}^{\prime}\right) \psi_{n}=E_{n} \psi_{n} \tag{9.8}
\end{equation}
\)

Since the Hamiltonian in (9.8) depends on the parameter $\lambda$, both the eigenfunction $\psi_{n}$ and the eigenvalue $E_{n}$ depend on $\lambda$:

\(
\psi_{n}=\psi_{n}(\lambda, q) \quad \text { and } \quad E_{n}=E_{n}(\lambda)
\)

where $q$ denotes the system's coordinates. We now expand $\psi_{n}$ and $E_{n}$ as Taylor series (Prob. 4.1) in powers of $\lambda$:

\(
\begin{align}
& \psi_{n}=\left.\psi_{n}\right|_{\lambda=0}+\left.\frac{\partial \psi_{n}}{\partial \lambda}\right|_{\lambda=0} \lambda+\left.\frac{\partial^{2} \psi_{n}}{\partial \lambda^{2}}\right|_{\lambda=0} \frac{\lambda^{2}}{2!}+\cdots \tag{9.9}\\
& E_{n}=\left.E_{n}\right|_{\lambda=0}+\left.\frac{d E_{n}}{d \lambda}\right|_{\lambda=0} \lambda+\left.\frac{d^{2} E_{n}}{d \lambda^{2}}\right|_{\lambda=0} \frac{\lambda^{2}}{2!}+\cdots \tag{9.10}
\end{align}
\)

By hypothesis, when $\lambda$ goes to zero, $\psi_{n}$ and $E_{n}$ go to $\psi_{n}^{(0)}$ and $E_{n}^{(0)}$:

\(
\begin{equation}
\left.\psi_{n}\right|_{\lambda=0}=\psi_{n}^{(0)} \quad \text { and } \quad\left.E_{n}\right|_{\lambda=0}=E_{n}^{(0)} \tag{9.11}
\end{equation}
\)

We introduce the following abbreviations:

\(
\begin{equation}
\left.\psi_{n}^{(k)} \equiv \frac{1}{k!} \frac{\partial^{k} \psi_{n}}{\partial \lambda^{k}}\right|_{\lambda=0},\left.\quad E_{n}^{(k)} \equiv \frac{1}{k!} \frac{d^{k} E_{n}}{d \lambda^{k}}\right|_{\lambda=0}, \quad k=1,2, \ldots \tag{9.12}
\end{equation}
\)

Equations (9.9) and (9.10) become

\(
\begin{align}
& \psi_{n}=\psi_{n}^{(0)}+\lambda \psi_{n}^{(1)}+\lambda^{2} \psi_{n}^{(2)}+\cdots+\lambda^{k} \psi_{n}^{(k)}+\cdots \tag{9.13}\\
& E_{n}=E_{n}^{(0)}+\lambda E_{n}^{(1)}+\lambda^{2} E_{n}^{(2)}+\cdots+\lambda^{k} E_{n}^{(k)}+\cdots \tag{9.14}
\end{align}
\)

For $k=1,2,3, \ldots$, we call $\psi_{n}^{(k)}$ and $E_{n}^{(k)}$ the $\boldsymbol{k}$th-order corrections to the wave function and energy. We shall assume that the series (9.13) and (9.14) converge for $\lambda=1$, and we hope that for a small perturbation, taking just the first few terms of the series will give a good approximation to the true energy and wave function. (Quite often, perturbation-theory series do not converge, but even so, the first few terms of a nonconvergent series can often give a useful approximation.)

We shall take $\psi_{n}^{(0)}$ to be normalized: $\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(0)}\right\rangle=1$. Instead of taking $\psi_{n}$ as normalized, we shall require that $\psi_{n}$ satisfy

\(
\begin{equation}
\left\langle\psi_{n}^{(0)} \mid \psi_{n}\right\rangle=1 \tag{9.15}
\end{equation}
\)

If $\psi_{n}$ does not satisfy this equation, then multiplication of $\psi_{n}$ by the constant $1 /\left\langle\psi_{n}^{(0)} \mid \psi_{n}\right\rangle$ gives a perturbed wave function with the desired property. The condition (9.15), called intermediate normalization, simplifies the derivation. Note that multiplication of $\psi_{n}$ by a constant does not change the energy in the Schrödinger equation $\hat{H} \psi_{n}=E_{n} \psi_{n}$, so use of intermediate normalization does not affect the results for the energy corrections. If desired, at the end of the calculation, the intermediate-normalized $\psi_{n}$ can be multiplied by a constant to normalize it in the usual sense.

Substitution of (9.13) into $1=\left\langle\psi_{n}^{(0)} \mid \psi_{n}\right\rangle$ [Eq. (9.15)] gives

\(
1=\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(0)}\right\rangle+\lambda\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle+\lambda^{2}\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(2)}\right\rangle+\cdots
\)

Since this equation is true for all values of $\lambda$ in the range 0 to 1, the coefficients of like powers of $\lambda$ on each side of the equation must be equal, as proved after Eq. (4.11). Equating the $\lambda^{0}$ coefficients, we have $1=\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(0)}\right\rangle$, which is satisfied since $\psi_{n}^{(0)}$ is normalized. Equating the coefficients of $\lambda^{1}$, of $\lambda^{2}$, and so on, we have

\(
\begin{equation}
\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=0, \quad\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(2)}\right\rangle=0, \quad \text { etc. } \tag{9.16}
\end{equation}
\)

The corrections to the wave function are orthogonal to $\psi_{n}^{(0)}$ when intermediate normalization is used.

Substituting (9.13) and (9.14) into the Schrödinger equation (9.8), we have

\(
\begin{aligned}
\left(\hat{H}^{0}+\lambda \hat{H}^{\prime}\right)\left(\psi_{n}^{(0)}+\right. & \left.\lambda \psi_{n}^{(1)}+\lambda^{2} \psi_{n}^{(2)}+\cdots\right) \\
& =\left(E_{n}^{(0)}+\lambda E_{n}^{(1)}+\lambda^{2} E_{n}^{(2)}+\cdots\right)\left(\psi_{n}^{(0)}+\lambda \psi_{n}^{(1)}+\lambda^{2} \psi_{n}^{(2)}+\cdots\right)
\end{aligned}
\)

Collecting like powers of $\lambda$, we have

\(
\begin{align}
& \hat{H}^{0} \psi_{n}^{(0)}+\lambda\left(\hat{H}^{\prime} \psi_{n}^{(0)}+\hat{H}^{0} \psi_{n}^{(1)}\right)+\lambda^{2}\left(\hat{H}^{0} \psi_{n}^{(2)}+\hat{H}^{\prime} \psi_{n}^{(1)}\right)+\cdots \\
& \quad=E_{n}^{(0)} \psi_{n}^{(0)}+\lambda\left(E_{n}^{(1)} \psi_{n}^{(0)}+E_{n}^{(0)} \psi_{n}^{(1)}\right)+\lambda^{2}\left(E_{n}^{(2)} \psi_{n}^{(0)}+E_{n}^{(1)} \psi_{n}^{(1)}+E_{n}^{(0)} \psi_{n}^{(2)}\right)+\cdots \tag{9.17}
\end{align}
\)

Now (assuming suitable convergence) for the two series on each side of (9.17) to be equal to each other for all values of $\lambda$, the coefficients of like powers of $\lambda$ in the two series must be equal.

Equating the coefficients of the $\lambda^{0}$ terms, we have $\hat{H}^{0} \psi_{n}^{(0)}=E_{n}^{(0)} \psi_{n}^{(0)}$, which is the Schrödinger equation for the unperturbed problem, Eq. (9.2), and gives us no new information. Equating the coefficients of the $\lambda^{1}$ terms, we have

\(
\begin{gather}
\hat{H}^{\prime} \psi_{n}^{(0)}+\hat{H}^{0} \psi_{n}^{(1)}=E_{n}^{(1)} \psi_{n}^{(0)}+E_{n}^{(0)} \psi_{n}^{(1)} \\
\hat{H}^{0} \psi_{n}^{(1)}-E_{n}^{(0)} \psi_{n}^{(1)}=E_{n}^{(1)} \psi_{n}^{(0)}-\hat{H}^{\prime} \psi_{n}^{(0)} \tag{9.18}
\end{gather}
\)

The First-Order Energy Correction

To find $E_{n}^{(1)}$ we multiply (9.18) by $\psi_{m}^{(0) *}$ and integrate over all space, which gives

\(
\begin{equation}
\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(1)}\right\rangle-E_{n}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=E_{n}^{(1)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(0)}\right\rangle-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle \tag{9.19}
\end{equation}
\)

where bracket notation [Eqs. (7.1) and (7.3)] is used. $\hat{H}^{0}$ is Hermitian, and use of the Hermitian property (7.12) gives for the first term on the left side of (9.19)

\(
\begin{align}
\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(1)}\right\rangle & =\left\langle\psi_{n}^{(1)}\right| \hat{H}^{0}\left|\psi_{m}^{(0)}\right\rangle^{*}=\left\langle\psi_{n}^{(1)} \mid \hat{H}^{0} \psi_{m}^{(0)}\right\rangle^{*} \\
& =\left\langle\psi_{n}^{(1)} \mid E_{m}^{(0)} \psi_{m}^{(0)}\right\rangle^{*}=E_{m}^{(0) *}\left\langle\psi_{n}^{(1)} \mid \psi_{m}^{(0)}\right\rangle^{*}=E_{m}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle \tag{9.20}
\end{align}
\)

where we used the unperturbed Schrödinger equation $\hat{H}^{0} \psi_{m}^{(0)}=E_{m}^{(0)} \psi_{m}^{(0)}$, the fact that $E_{m}^{(0)}$ is real, and (7.4). Substitution of (9.20) into (9.19) and use of the orthonormality equation $\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(0)}\right\rangle=\delta_{m n}$ for the unperturbed eigenfunctions gives

\(
\begin{equation}
\left(E_{m}^{(0)}-E_{n}^{(0)}\right)\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=E_{n}^{(1)} \delta_{m n}-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle \tag{9.21}
\end{equation}
\)

If $m=n$, the left side of (9.21) equals zero, and (9.21) becomes

\(
\begin{equation}
E_{n}^{(1)}=\left\langle\psi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle=\int \psi_{n}^{(0) *} \hat{H}^{\prime} \psi_{n}^{(0)} d \tau \tag{9.22}
\end{equation}
\)

The first-order correction to the energy is found by averaging the perturbation $\hat{H}^{\prime}$ over the appropriate unperturbed wave function.

Setting $\lambda=1$ in (9.14), we have

\(
\begin{equation}
E_{n} \approx E_{n}^{(0)}+E_{n}^{(1)}=E_{n}^{(0)}+\int \psi_{n}^{(0) *} \hat{H}^{\prime} \psi_{n}^{(0)} d \tau \tag{9.23}
\end{equation}
\)

EXAMPLE

For the anharmonic oscillator with Hamiltonian (9.3), evaluate $E^{(1)}$ for the ground state if the unperturbed system is taken as the harmonic oscillator.
The perturbation is given by Eqs. (9.3) to (9.5) as

\(
\hat{H}^{\prime}=\hat{H}-\hat{H}^{0}=c x^{3}+d x^{4}
\)

and the first-order energy correction for the state with quantum number $v$ is given by (9.22) as $E_{v}^{(1)}=\left\langle\psi_{v}^{(0)}\right| c x^{3}+d x^{4}\left|\psi_{v}^{(0)}\right\rangle$, where $\psi_{v}^{(0)}$ is the harmonic-oscillator wave function for state $v$. For the $v=0$ ground state, use of $\psi_{0}^{(0)}=(\alpha / \pi)^{1 / 4} e^{-\alpha x^{2} / 2}$ [Eq. (4.53)] gives

\(
E_{0}^{(1)}=\left\langle\psi_{0}^{(0)}\right| c x^{3}+d x^{4}\left|\psi_{0}^{(0)}\right\rangle=\left(\frac{\alpha}{\pi}\right)^{1 / 2} \int_{-\infty}^{\infty} e^{-\alpha x^{2}}\left(c x^{3}+d x^{4}\right) d x
\)

The integral from $-\infty$ to $\infty$ of the odd function $c x^{3} e^{-\alpha x^{2}}$ is zero. Use of the Appendix integral (A.10) with $n=2$ and (4.31) for $\alpha$ gives

\(
E_{0}^{(1)}=2 d\left(\frac{\alpha}{\pi}\right)^{1 / 2} \int_{0}^{\infty} e^{-\alpha x^{2}} x^{4} d x=\frac{3 d}{4 \alpha^{2}}=\frac{3 d h^{2}}{64 \pi^{4} \nu^{2} m^{2}}
\)

The unperturbed ground-state energy is $E_{0}^{(0)}=\frac{1}{2} h \nu$ and $E_{0}^{(0)}+E_{0}^{(1)}=\frac{1}{2} h \nu+3 d h^{2} / 64 \pi^{4} \nu^{2} m^{2}$.
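The result $E_{0}^{(1)}=3 d / 4 \alpha^{2}$ is easily verified by numerical quadrature; in the sketch below the values of $\alpha$, $c$, and $d$ are arbitrary illustrative choices.

import numpy as np
from scipy.integrate import quad

alpha, c, d = 1.3, 0.2, 0.1          # arbitrary illustrative values

# <psi_0 | c x^3 + d x^4 | psi_0> over the harmonic-oscillator ground state
integrand = lambda x: (np.sqrt(alpha / np.pi) * np.exp(-alpha * x**2)
                       * (c * x**3 + d * x**4))
E1, _ = quad(integrand, -np.inf, np.inf)

print(E1, 3 * d / (4 * alpha**2))    # the two agree; the odd x^3 term integrates to zero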

EXERCISE Consider a one-particle, one-dimensional system with $V=\infty$ for $x<0$ and for $x>l$, and $V=c x$ for $0 \leq x \leq l$, where $c$ is a constant. (a) Sketch $V$ for $c>0$. (b) Treat the system as a perturbed particle in a box and find $E^{(1)}$ for the state with quantum number $n$. Then use Eq. (3.88) to state why the answer you got is to be expected. (Partial Answer: (b) $\frac{1}{2} c l$.)

The First-Order Wave-Function Correction
For $m \neq n$, Eq. (9.21) is

\(
\begin{equation}
\left(E_{m}^{(0)}-E_{n}^{(0)}\right)\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle, \quad m \neq n \tag{9.24}
\end{equation}
\)

To find $\psi_{n}^{(1)}$, we expand it in terms of the complete, orthonormal set of unperturbed eigenfunctions $\psi_{m}^{(0)}$ of the Hermitian operator $\hat{H}^{0}$:

\(
\begin{equation}
\psi_{n}^{(1)}=\sum_{m} a_{m} \psi_{m}^{(0)}, \quad \text { where } \quad a_{m}=\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle \tag{9.25}
\end{equation}
\)

where Eq. (7.41) was used for the expansion coefficients $a_{m}$. Use of $a_{m}=\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle$ in (9.24) gives

\(
\left(E_{m}^{(0)}-E_{n}^{(0)}\right) a_{m}=-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle, \quad m \neq n
\)

By hypothesis, the level $E_{n}^{(0)}$ is nondegenerate. Therefore $E_{m}^{(0)} \neq E_{n}^{(0)}$ for $m \neq n$, and we may divide by $\left(E_{m}^{(0)}-E_{n}^{(0)}\right)$ to get

\(
\begin{equation}
a_{m}=\frac{\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle}{E_{n}^{(0)}-E_{m}^{(0)}}, \quad m \neq n \tag{9.26}
\end{equation}
\)

The coefficients $a_{m}$ in the expansion (9.25) of $\psi_{n}^{(1)}$ are given by (9.26) except for $a_{n}$, the coefficient of $\psi_{n}^{(0)}$. From the second equation in (9.25), $a_{n}=\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle$. Recall that the choice of intermediate normalization for $\psi_{n}$ makes $\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=0$ [Eq. (9.16)]. Hence $a_{n}=\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=0$, and Eqs. (9.25) and (9.26) give the first-order correction to the wave function as

\(
\begin{equation}
\psi_{n}^{(1)}=\sum_{m \neq n} \frac{\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle}{E_{n}^{(0)}-E_{m}^{(0)}} \psi_{m}^{(0)} \tag{9.27}
\end{equation}
\)

The symbol $\sum_{m \neq n}$ means we sum over all the unperturbed states except state $n$.
Setting $\lambda=1$ in (9.13) and using just the first-order wave-function correction, we have as the approximation to the perturbed wave function

\(
\begin{equation}
\psi_{n} \approx \psi_{n}^{(0)}+\sum_{m \neq n} \frac{\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle}{E_{n}^{(0)}-E_{m}^{(0)}} \psi_{m}^{(0)} \tag{9.28}
\end{equation}
\)

For $\psi_{n}^{(2)}$ and the normalization of $\psi$, see Kemble, Chapter XI.

The Second-Order Energy Correction

Equating the coefficients of the $\lambda^{2}$ terms in (9.17), we get

\(
\begin{equation}
\hat{H}^{0} \psi_{n}^{(2)}-E_{n}^{(0)} \psi_{n}^{(2)}=E_{n}^{(2)} \psi_{n}^{(0)}+E_{n}^{(1)} \psi_{n}^{(1)}-\hat{H}^{\prime} \psi_{n}^{(1)} \tag{9.29}
\end{equation}
\)

Multiplication by $\psi_{m}^{(0) *}$ followed by integration over all space gives

\(
\begin{align}
& \left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(2)}\right\rangle-E_{n}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(2)}\right\rangle \\
& \quad=E_{n}^{(2)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(0)}\right\rangle+E_{n}^{(1)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(1)}\right\rangle \tag{9.30}
\end{align}
\)

The integral $\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(2)}\right\rangle$ in this equation is the same as the integral in (9.20), except that $\psi_{n}^{(1)}$ is replaced by $\psi_{n}^{(2)}$. Replacement of $\psi_{n}^{(1)}$ by $\psi_{n}^{(2)}$ in (9.20) gives

\(
\begin{equation}
\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(2)}\right\rangle=E_{m}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(2)}\right\rangle \tag{9.31}
\end{equation}
\)

Use of (9.31) and orthonormality of the unperturbed functions in (9.30) gives

\(
\begin{equation}
\left(E_{m}^{(0)}-E_{n}^{(0)}\right)\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(2)}\right\rangle=E_{n}^{(2)} \delta_{m n}+E_{n}^{(1)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(1)}\right\rangle \tag{9.32}
\end{equation}
\)

For $m=n$, the left side of (9.32) is zero and we get

\(
\begin{align}
& E_{n}^{(2)}=-E_{n}^{(1)}\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle+\left\langle\psi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(1)}\right\rangle \\
& E_{n}^{(2)}=\left\langle\psi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(1)}\right\rangle \tag{9.33}
\end{align}
\)

since $\left\langle\psi_{n}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=0$ [Eq. (9.16)]. Note from (9.33) that to find the second-order correction to the energy, we have to know only the first-order correction to the wave function. In fact, it can be shown that knowledge of $\psi_{n}^{(1)}$ suffices to determine $E_{n}^{(3)}$ also.

In general, it can be shown that if we know the corrections to the wave function through order $k$, then we can compute the corrections to the energy through order $2 k+1$ (see Bates, Vol. I, p. 184).

Substitution of (9.27) for $\psi_{n}^{(1)}$ into (9.33) gives

\(
\begin{equation}
E_{n}^{(2)}=\sum_{m \neq n} \frac{\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle}{E_{n}^{(0)}-E_{m}^{(0)}}\left\langle\psi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{m}^{(0)}\right\rangle \tag{9.34}
\end{equation}
\)

since the expansion coefficients $a_{m}$ [Eq. (9.26)] are constants that can be taken outside the integral. Since $\hat{H}^{\prime}$ is Hermitian, we have

\(
\begin{aligned}
\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle\left\langle\psi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{m}^{(0)}\right\rangle & =\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle^{*} \\
& =\left|\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle\right|^{2}
\end{aligned}
\)

and (9.34) becomes

\(
\begin{equation}
E_{n}^{(2)}=\sum_{m \neq n} \frac{\left|\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle\right|^{2}}{E_{n}^{(0)}-E_{m}^{(0)}} \tag{9.35}
\end{equation}
\)

which is the desired expression for $E_{n}^{(2)}$ in terms of the unperturbed wave functions and energies.

Inclusion of $E_{n}^{(2)}$ in (9.14) with $\lambda=1$ gives the approximate energy of the perturbed state as

\(
\begin{equation}
E_{n} \approx E_{n}^{(0)}+H_{n n}^{\prime}+\sum_{m \neq n} \frac{\left|H_{m n}^{\prime}\right|^{2}}{E_{n}^{(0)}-E_{m}^{(0)}} \tag{9.36}
\end{equation}
\)

where the integrals are over the unperturbed normalized wave functions.
For formulas for higher-order energy corrections, see Bates, Volume I, pages 181-185. The form of perturbation theory developed in this section is called Rayleigh-Schrödinger perturbation theory; other approaches exist.

Discussion

Equation (9.28) shows that the effect of the perturbation on the wave function $\psi_{n}^{(0)}$ is to "mix in" contributions from other states $\psi_{m}^{(0)}, m \neq n$. Because of the factor $1 /\left(E_{n}^{(0)}-E_{m}^{(0)}\right)$, the most important contributions (aside from $\psi_{n}^{(0)}$) to the perturbed wave function come from states nearest in energy to state $n$.

To evaluate the first-order correction to the energy, we must evaluate only the single integral $H_{n n}^{\prime}$, whereas to evaluate the second-order energy correction, we must evaluate the matrix elements of $\hat{H}^{\prime}$ between the $n$th state and all other states $m$, and then perform the infinite sum in (9.35). In many cases the second-order energy correction cannot be evaluated exactly. It is even harder to deal with third-order and higher-order energy corrections.
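As a concrete illustration, take the perturbed particle in a box of the exercise above ($\hbar=m=l=1$, $\hat{H}^{\prime}=c x$) and approximate (9.36) by truncating the infinite sum; $c$ and the cutoff $m_{\max}$ below are arbitrary choices.

import numpy as np
from scipy.integrate import quad

c, n, m_max = 1.0, 1, 100

E0 = lambda k: k**2 * np.pi**2 / 2.0             # unperturbed box energies
f = lambda k, x: np.sqrt(2.0) * np.sin(k * np.pi * x)

def Hp(m, k):                                    # matrix element <m| c x |k>
    val, _ = quad(lambda x: f(m, x) * c * x * f(k, x), 0.0, 1.0)
    return val

E2 = sum(Hp(m, n)**2 / (E0(n) - E0(m))           # truncated sum over states
         for m in range(1, m_max + 1) if m != n)

# Eq. (9.36): E_n ~ E_n^(0) + H'_nn + E_n^(2); here H'_11 = c/2
print(E0(n) + Hp(n, n) + E2)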

The sums in (9.28) and (9.36) are sums over different states rather than sums over different energy values. If some of the energy levels (other than the $n$th) are degenerate, we must include a term in the sums for each linearly independent wave function corresponding to the degenerate levels.

We have a sum over states in (9.28) and (9.36) because we require a complete set of functions for the expansion (9.25), and therefore we must include all linearly independent wave functions in the sum. If the unperturbed problem has continuum wave functions (for example, the hydrogen atom), we must also include an integration over the continuum functions, if we are to have a complete set. If $\psi_{E}^{(0)}$ denotes an unperturbed continuum wave function of energy $E^{(0)}$, then (9.27) and (9.35) become

\(
\begin{gather}
\psi_{n}^{(1)}=\sum_{m \neq n} \frac{H_{m n}^{\prime}}{E_{n}^{(0)}-E_{m}^{(0)}} \psi_{m}^{(0)}+\int \frac{H_{E, n}^{\prime}}{E_{n}^{(0)}-E^{(0)}} \psi_{E}^{(0)} d E^{(0)} \\
E_{n}^{(2)}=\sum_{m \neq n} \frac{\left|H_{m n}^{\prime}\right|^{2}}{E_{n}^{(0)}-E_{m}^{(0)}}+\int \frac{\left|H_{E, n}^{\prime}\right|^{2}}{E_{n}^{(0)}-E^{(0)}} d E^{(0)} \tag{9.37}
\end{gather}
\)

where $H_{E, n}^{\prime} \equiv\left\langle\psi_{E}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{n}^{(0)}\right\rangle$. The integrals in these equations are over the range of continuum-state energies (for example, from zero to infinity for the hydrogen atom). The existence of continuum states in the unperturbed problem makes evaluation of $E_{n}^{(2)}$ even harder.

Comparison of the Variation and Perturbation Methods

The perturbation method applies to all the bound states of a system. Although the variation theorem stated in Section 8.1 applies only to the lowest state of a given symmetry, we can use the linear variation method to treat excited bound states.

Perturbation calculations are often hard to do because of the need to evaluate the infinite sums over discrete states and integrals over continuum states that occur in the second-order and higher-order energy corrections.

In the perturbation method, one can calculate the energy much more accurately (to order $2 k+1$ ) than the wave function (to order $k$ ). The same situation holds in the variation method, where one can get a rather good energy with a rather inaccurate wave function. If one calculates properties other than the energy, the results will generally not be as reliable as the calculated energy.

The Variation-Perturbation Method

The variation-perturbation method allows one to accurately estimate $E^{(2)}$ and higher-order perturbation-theory energy corrections for the ground state of a system without evaluating the infinite sum in (9.36). The method is based on the inequality

\(
\begin{equation}
\langle u| \hat{H}^{0}-E_{g}^{(0)}|u\rangle+\langle u| \hat{H}^{\prime}-E_{g}^{(1)}\left|\psi_{g}^{(0)}\right\rangle+\left\langle\psi_{g}^{(0)}\right| \hat{H}^{\prime}-E_{g}^{(1)}|u\rangle \geq E_{g}^{(2)} \tag{9.38}
\end{equation}
\)

where $u$ is any well-behaved function that satisfies the boundary conditions and where the subscript $g$ refers to the ground state. For the proof of (9.38), see Hameka, Section 7-9. By taking $u$ to be a trial function with parameters that we vary to minimize the left side of (9.38), we can estimate $E_{g}^{(2)}$. The function $u$ turns out to be an approximation to $\psi_{g}^{(1)}$, the first-order correction to the ground-state wave function, and $u$ can then be used to estimate $E_{g}^{(3)}$ also. Similar variational integrals can be used to find higher-order corrections to the ground-state energy and wave function.


The helium atom has two electrons and a nucleus of charge $+2 e$. We shall consider the nucleus to be at rest (Section 6.6) and place the origin of the coordinate system at the nucleus. The coordinates of electrons 1 and 2 are $\left(x_{1}, y_{1}, z_{1}\right)$ and $\left(x_{2}, y_{2}, z_{2}\right)$; see Fig. 9.1.

If we take the nuclear charge to be $+Z e$ instead of $+2 e$, we can treat heliumlike ions such as $\mathrm{H}^{-}, \mathrm{Li}^{+}$, and $\mathrm{Be}^{2+}$. The Hamiltonian operator is

\(
\begin{equation}
\hat{H}=-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{1}}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{2}}+\frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} \tag{9.39}
\end{equation}
\)

where $m_{e}$ is the mass of the electron, $r_{1}$ and $r_{2}$ are the distances of electrons 1 and 2 from the nucleus, and $r_{12}$ is the distance from electron 1 to electron 2. The first two terms are the operators for the electrons' kinetic energy [Eq. (3.48)]. The third and fourth terms are the potential energies of attraction between the electrons and the nucleus. The final term is the potential energy of interelectronic repulsion [Eq. (6.58)]. Note that the potential energy of a system of interacting particles cannot be written as the sum of potential energies of the individual particles. The potential energy is a property of the system as a whole.

The Schrödinger equation involves six independent variables, three coordinates for each electron. In spherical coordinates, $\psi=\psi\left(r_{1}, \theta_{1}, \phi_{1}, r_{2}, \theta_{2}, \phi_{2}\right)$.

The operator $\nabla_{1}^{2}$ is given by Eq. (6.6) with $r_{1}, \theta_{1}, \phi_{1}$ replacing $r, \theta, \phi$. The variable $r_{12}$ is $r_{12}=\left[\left(x_{1}-x_{2}\right)^{2}+\left(y_{1}-y_{2}\right)^{2}+\left(z_{1}-z_{2}\right)^{2}\right]^{1 / 2}$, and by using the relations between Cartesian and spherical coordinates, we can express $r_{12}$ in terms of $r_{1}, \theta_{1}, \phi_{1}, r_{2}, \theta_{2}, \phi_{2}$.

Because of the $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ term, the Schrödinger equation for helium cannot be separated in any coordinate system, and we must use approximation methods. The perturbation method separates the Hamiltonian (9.39) into two parts, $\hat{H}^{0}$ and $\hat{H}^{\prime}$, where $\hat{H}^{0}$ is the Hamiltonian of an exactly solvable problem. If we choose

\(
\begin{gather}
\hat{H}^{0}=-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{1}}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{2}} \tag{9.40}\\
\hat{H}^{\prime}=\frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} \tag{9.41}
\end{gather}
\)

then $\hat{H}^{0}$ is the sum of two hydrogenlike Hamiltonians, one for each electron:

\(
\begin{gather}
\hat{H}^{0}=\hat{H}_{1}^{0}+\hat{H}_{2}^{0} \tag{9.42}\\
\hat{H}_{1}^{0} \equiv-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{1}}, \quad \hat{H}_{2}^{0} \equiv-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{2}} \tag{9.43}
\end{gather}
\)

The unperturbed system is a helium atom in which the two electrons exert no forces on each other. Although such a system does not exist, this does not prevent us from applying perturbation theory to this system.

Since the unperturbed Hamiltonian (9.42) is the sum of the Hamiltonians for two independent particles, we can use the separation-of-variables results of Eqs. (6.18) to (6.24) to conclude that the unperturbed wave functions have the form

\(
\begin{equation}
\psi^{(0)}\left(r_{1}, \theta_{1}, \phi_{1}, r_{2}, \theta_{2}, \phi_{2}\right)=F_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right) F_{2}\left(r_{2}, \theta_{2}, \phi_{2}\right) \tag{9.44}
\end{equation}
\)

and the unperturbed energies are

\(
\begin{gather}
E^{(0)}=E_{1}+E_{2} \tag{9.45}\\
\hat{H}_{1}^{0} F_{1}=E_{1} F_{1}, \quad \hat{H}_{2}^{0} F_{2}=E_{2} F_{2} \tag{9.46}
\end{gather}
\)

Since $\hat{H}_{1}^{0}$ and $\hat{H}_{2}^{0}$ are hydrogenlike Hamiltonians, the solutions of (9.46) are the hydrogenlike eigenfunctions and eigenvalues. From Eq. (6.94), we have

\(
\begin{gather}
E_{1}=-\frac{Z^{2}}{n_{1}^{2}} \frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}}, \quad E_{2}=-\frac{Z^{2}}{n_{2}^{2}} \frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}} \tag{9.47}\\
E^{(0)}=-Z^{2}\left(\frac{1}{n_{1}^{2}}+\frac{1}{n_{2}^{2}}\right) \frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}}, \quad n_{1}=1,2,3, \ldots, \quad n_{2}=1,2,3, \ldots \tag{9.48}
\end{gather}
\)

where $a_{0}$ is the Bohr radius. Equation (9.48) gives the zeroth-order energies of states with both electrons bound to the nucleus. The He atom also has continuum states.

The lowest level has $n_{1}=1, n_{2}=1$, and its zeroth-order wave function is [Eq. (6.104)]

\(
\begin{equation}
\psi_{1 s^{2}}^{(0)}=\frac{1}{\pi^{1 / 2}}\left(\frac{Z}{a_{0}}\right)^{3 / 2} e^{-Z r_{1} / a_{0}} \cdot \frac{1}{\pi^{1 / 2}}\left(\frac{Z}{a_{0}}\right)^{3 / 2} e^{-Z r_{2} / a_{0}}=1 s(1) 1 s(2) \tag{9.49}
\end{equation}
\)

where $1 s(1) 1 s(2)$ denotes the product of hydrogenlike $1 s$ functions for electrons 1 and 2, and where the subscript indicates that both electrons are in hydrogenlike $1 s$ orbitals. (Note that the procedure of assigning electrons to orbitals and writing the atomic wave function as the product of one-electron orbital functions is an approximation.) The energy of this unperturbed ground state is

\(
\begin{equation}
E_{1 s^{2}}^{(0)}=-Z^{2}(2) \frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}} \tag{9.50}
\end{equation}
\)

The quantity $-e^{2} / 8 \pi \varepsilon_{0} a_{0}$ is the ground-state energy of the hydrogen atom (taking the nucleus to be infinitely heavy) and equals $-13.606 \mathrm{eV}$ [Eqs. (6.105)-(6.108)]. If the electron mass $m_{e}$ in $a_{0}$ is replaced by the reduced mass for ${ }^{4} \mathrm{He}$, $-e^{2} / 8 \pi \varepsilon_{0} a_{0}$ is changed to $-13.604 \mathrm{eV}$, and we shall use this number to (partly) correct for the nuclear motion in He. For helium, $Z=2$ and (9.50) gives $-8(13.604 \mathrm{eV})=-108.83 \mathrm{eV}$:

\(
\begin{equation}
E_{1 s^{2}}^{(0)}=-108.83 \mathrm{eV} \tag{9.51}
\end{equation}
\)

How does this zeroth-order energy compare with the true helium ground-state energy? The experimental first ionization energy of He is 24.587 eV. The second ionization energy of He is easily calculated theoretically, since it is the ionization energy of the hydrogenlike ion $\mathrm{He}^{+}$ and is equal to $2^{2}(13.604 \mathrm{eV})=54.416 \mathrm{eV}$. If we choose the zero of energy as the completely ionized atom [this choice is implicit in (9.39)], then the ground-state energy of the helium atom is $-(24.587+54.416) \mathrm{eV}=-79.00 \mathrm{eV}$. The zeroth-order energy (9.51) is in error by $38 \%$. We should have expected such a large error, since the perturbation term $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ is not small.

The next step is to evaluate the first-order perturbation correction to the energy. The unperturbed ground state is nondegenerate, and use of (9.22) and (9.49) gives

\(
\begin{align}
E^{(1)} & =\left\langle\psi^{(0)}\right| \hat{H}^{\prime}\left|\psi^{(0)}\right\rangle \\
E^{(1)} & =\frac{Z^{6} e^{2}}{\left(4 \pi \varepsilon_{0}\right) \pi^{2} a_{0}^{6}} \int_{0}^{2 \pi} \int_{0}^{2 \pi} \int_{0}^{\pi} \int_{0}^{\pi} \int_{0}^{\infty} \int_{0}^{\infty} e^{-2 Z r_{1} / a_{0}} e^{-2 Z r_{2} / a_{0}} \frac{1}{r_{12}} r_{1}^{2} \sin \theta_{1} \\
& \quad \times r_{2}^{2} \sin \theta_{2} \, d r_{1} d r_{2} d \theta_{1} d \theta_{2} d \phi_{1} d \phi_{2} \tag{9.52}
\end{align}
\)

The volume element for this two-electron problem contains the coordinates of both electrons; $d \tau=d \tau_{1} d \tau_{2}$. The integral in (9.52) can be evaluated by using an expansion of $1 / r_{12}$ in terms of spherical harmonics, as outlined in Prob. 9.14. One finds

\(
\begin{equation}
E^{(1)}=\frac{5 Z}{8}\left(\frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}}\right) \tag{9.53}
\end{equation}
\)

Recalling that $\frac{1}{2} e^{2} / 4 \pi \varepsilon_{0} a_{0}$ equals 13.604 eV when the ${ }^{4} \mathrm{He}$ reduced mass is used, and putting $Z=2$, we find for the first-order perturbation energy correction for the helium ground state:

\(
E^{(1)}=\frac{10}{4}(13.604 \mathrm{eV})=34.01 \mathrm{eV}
\)

Our approximation to the energy is now

\(
\begin{equation}
E^{(0)}+E^{(1)}=-108.83 \mathrm{eV}+34.01 \mathrm{eV}=-74.82 \mathrm{eV} \tag{9.54}
\end{equation}
\)

which, compared with the experimental value of $-79.00 \mathrm{eV}$, is in error by $5.3 \%$.
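Equation (9.53) is also easy to spot-check by Monte Carlo integration (atomic units): sample each electron's position from the $1s$ probability density and average $1 / r_{12}$. The radial density $r^{2} e^{-2 Z r}$ is a gamma distribution, which NumPy can sample directly; the sample size below is arbitrary.

import numpy as np

Z, nsamp = 2, 400_000
rng = np.random.default_rng(1)

def sample_1s(n):
    r = rng.gamma(3.0, 1.0 / (2.0 * Z), size=n)      # radial density ~ r^2 e^(-2Zr)
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)    # uniform random directions
    return r[:, None] * v

r1, r2 = sample_1s(nsamp), sample_1s(nsamp)
E1 = np.mean(1.0 / np.linalg.norm(r1 - r2, axis=1))  # <1/r12> in hartrees

print(E1, 5 * Z / 8)            # both ~1.25 hartree
print(E1 * 2 * 13.604, 'eV')    # ~34.0 eV, as found above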
To evaluate the first-order correction to the wave function and higher-order corrections to the energy requires evaluating the matrix elements of $1 / r_{12}$ between the ground unperturbed state and all excited states (including the continuum) and performing the appropriate summations and integrations. No one has yet figured out how to evaluate directly all the contributions to $\psi^{(1)}$. Note that the effect of $\psi^{(1)}$ is to mix into the wave function contributions from other configurations besides $1 s^{2}$. We call this configuration interaction. The largest contribution to the true ground-state wave function of helium comes from the $1 s^{2}$ configuration, which is the unperturbed (zeroth-order) wave function.
$E^{(2)}$ for the helium ground state has been calculated using the variation-perturbation method, Eq. (9.38). Scherr and Knight used 100-term trial functions to get extremely accurate approximations to the wave-function corrections through sixth order and thus to the energy corrections through thirteenth order [C. W. Scherr and R. E. Knight, Rev. Mod. Phys., 35, 436 (1963)]. For calculations of the energy corrections through order 401, see J. D. Baker et al., Phys. Rev. A, 41, 1247 (1990). The second-order correction $E^{(2)}$ turns out to be $-4.29 \mathrm{eV}$, and $E^{(3)}$ is $+0.12 \mathrm{eV}$. Through third order, we have for the ground-state energy

\(
E \approx-108.83 \mathrm{eV}+34.01 \mathrm{eV}-4.29 \mathrm{eV}+0.12 \mathrm{eV}=-78.99 \mathrm{eV}
\)

which is close to the experimental value $-79.00 \mathrm{eV}$. Including corrections through thirteenth order, Scherr and Knight obtained a ground-state helium energy of $-2.90372433\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$, which is close to the value $-2.90372438\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$ obtained from the purely variational calculations described in the next section.

The perturbation-theory series expansion for the He-atom energy can be proved to converge [R. Ahlrichs, Phys. Rev. A, 5, 605 (1972)].

An exact wave function and energy cannot be found for the two-electron ground-state He atom, but remarkably, there exists a two-electron problem for which the exact ground-state solution of the Schrödinger equation has been found. This is a hypothetical atom (called the Hooke's-law atom or harmonium) with Hamiltonian operator

\(
\hat{H}=-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}+\frac{1}{2} k\left(r_{1}^{2}+r_{2}^{2}\right)+\frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}}
\)

where $r_{1}$ and $r_{2}$ are the distances of the electrons from the origin. For certain values of the force constant $k$, exact ground-state wave functions and energies have been found. See en.wikipedia.org/wiki/Hooke's_atom.


In the last section, we wrote the helium-atom Hamiltonian as $\hat{H}=\hat{H}^{0}+\hat{H}^{\prime}$, where the ground-state eigenfunction $\psi_{g}^{(0)}$ of $\hat{H}^{0}$ is (9.49). What happens if we use the zeroth-order perturbation-theory ground-state wave function $\psi_{g}^{(0)}$ as the variation function $\phi$ in the variational integral? The variational integral $\langle\phi| \hat{H}|\phi\rangle=\langle\phi \mid \hat{H} \phi\rangle$ then becomes

\(
\begin{align}
\langle\phi| \hat{H}|\phi\rangle & =\left\langle\psi_{g}^{(0)} \mid\left(\hat{H}^{0}+\hat{H}^{\prime}\right) \psi_{g}^{(0)}\right\rangle=\left\langle\psi_{g}^{(0)} \mid \hat{H}^{0} \psi_{g}^{(0)}+\hat{H}^{\prime} \psi_{g}^{(0)}\right\rangle \\
& =\left\langle\psi_{g}^{(0)} \mid E_{g}^{(0)} \psi_{g}^{(0)}\right\rangle+\left\langle\psi_{g}^{(0)} \mid \hat{H}^{\prime} \psi_{g}^{(0)}\right\rangle=E_{g}^{(0)}+E_{g}^{(1)} \tag{9.55}
\end{align}
\)

since $\hat{H}^{0} \psi_{g}^{(0)}=E_{g}^{(0)} \psi_{g}^{(0)}$, $\left\langle\psi_{g}^{(0)} \mid \psi_{g}^{(0)}\right\rangle=1$, and $E_{g}^{(1)}=\left\langle\psi_{g}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{g}^{(0)}\right\rangle$ [Eq. (9.22)]. Use of $\psi_{g}^{(0)}$ as the variation function gives the same energy result as in first-order perturbation theory.

Now consider variation functions for the helium-atom ground state. If we used $\psi_{g}^{(0)}$ [Eq. (9.49)] as the trial function, we would get the first-order perturbation result, $-74.82 \mathrm{eV}$. To improve on this result, we introduce a variational parameter into (9.49). We try the normalized function

\(
\begin{equation}
\phi=\frac{1}{\pi}\left(\frac{\zeta}{a_{0}}\right)^{3} e^{-\zeta r_{1} / a_{0}} e^{-\zeta r_{2} / a_{0}} \tag{9.56}
\end{equation}
\)

which is obtained from (9.49) by replacing the true atomic number $Z$ by a variational parameter $\zeta$ (zeta). $\zeta$ has a simple physical interpretation. Since one electron tends to screen the other from the nucleus, each electron is subject to an effective nuclear charge somewhat less than the full nuclear charge $Z$. If one electron fully shielded the other from the nucleus, we would have an effective nuclear charge of $Z-1$. Since both electrons are in the same orbital, they will be only partly effective in shielding each other. We thus expect $\zeta$ to lie between $Z-1$ and $Z$.

We now evaluate the variational integral. To expedite things, we rewrite the helium Hamiltonian (9.39) as

\(
\begin{align}
\hat{H}= & {\left[-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{\zeta e^{2}}{4 \pi \varepsilon_{0} r_{1}}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}-\frac{\zeta e^{2}}{4 \pi \varepsilon_{0} r_{2}}\right]+(\zeta-Z) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{1}}+(\zeta-Z) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{2}} } \\
& +\frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} \tag{9.57}
\end{align}
\)

where the terms involving zeta were added and subtracted. The terms in brackets in (9.57) are the sum of two hydrogenlike Hamiltonians for nuclear charge $\zeta$. Moreover, the trial function (9.56) is the product of two hydrogenlike $1 s$ functions for nuclear charge $\zeta$. Therefore, when these terms operate on $\phi$, we have an eigenvalue equation, the eigenvalue being the sum of two hydrogenlike $1 s$ energies for nuclear charge $\zeta$ :

\(
\begin{equation}
\left[-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{\zeta e^{2}}{4 \pi \varepsilon_{0} r_{1}}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}-\frac{\zeta e^{2}}{4 \pi \varepsilon_{0} r_{2}}\right] \phi=-\zeta^{2}(2) \frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}} \phi \tag{9.58}
\end{equation}
\)

Using (9.57) and (9.58), we have

\(
\begin{align}
\int \phi^{*} \hat{H} \phi d \tau= & -\zeta^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}} \int \phi^{*} \phi d \tau+\frac{(\zeta-Z) e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\phi^{*} \phi}{r_{1}} d \tau \\
& +\frac{(\zeta-Z) e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\phi^{*} \phi}{r_{2}} d \tau+\frac{e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\phi^{*} \phi}{r_{12}} d \tau \tag{9.59}
\end{align}
\)

Let $f_{1}$ be a normalized $1 s$ hydrogenlike orbital for nuclear charge $\zeta$, occupied by electron 1. Let $f_{2}$ be the same function for electron 2:

\(
\begin{equation}
f_{1}=\frac{1}{\pi^{1 / 2}}\left(\frac{\zeta}{a_{0}}\right)^{3 / 2} e^{-\zeta r_{1} / a_{0}}, \quad f_{2}=\frac{1}{\pi^{1 / 2}}\left(\frac{\zeta}{a_{0}}\right)^{3 / 2} e^{-\zeta r_{2} / a_{0}} \tag{9.60}
\end{equation}
\)

Noting that $\phi=f_{1} f_{2}$, we now evaluate the integrals in (9.59):

\(
\begin{gathered}
\int \phi^{*} \phi d \tau=\iint f_{1}^{*} f_{2}^{*} f_{1} f_{2} d \tau_{1} d \tau_{2}=\int f_{1}^{*} f_{1} d \tau_{1} \int f_{2}^{*} f_{2} d \tau_{2}=1 \\
\int \frac{\phi^{*} \phi}{r_{1}} d \tau=\int \frac{f_{1}^{*} f_{1}}{r_{1}} d \tau_{1} \int f_{2}^{*} f_{2} d \tau_{2}=\int \frac{f_{1}^{*} f_{1}}{r_{1}} d \tau_{1} \\
=\frac{1}{\pi} \frac{\zeta^{3}}{a_{0}^{3}} \int_{0}^{\infty} e^{-2 \zeta r_{1} / a_{0}} \frac{r_{1}^{2}}{r_{1}} d r_{1} \int_{0}^{\pi} \sin \theta_{1} d \theta_{1} \int_{0}^{2 \pi} d \phi_{1}=\frac{\zeta}{a_{0}}
\end{gathered}
\)

where the Appendix integral (A.8) was used. Also,

\(
\int \frac{\phi^{*} \phi}{r_{2}} d \tau=\int \frac{f_{2}^{*} f_{2}}{r_{2}} d \tau_{2}=\int \frac{f_{1}^{*} f_{1}}{r_{1}} d \tau_{1}=\frac{\zeta}{a_{0}}
\)

since it doesn't matter whether the label 1 or 2 is used on the dummy variables in the definite integral. Finally, we must evaluate $\left(e^{2} / 4 \pi \varepsilon_{0}\right) \int\left(\phi^{*} \phi / r_{12}\right) d \tau$. This is the same as the integral (9.52) that occurred in the perturbation treatment, except that $Z$ is replaced by $\zeta$. Hence, from (9.53)

\(
\frac{e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\phi^{*} \phi}{r_{12}} d \tau=\frac{5 \zeta e^{2}}{32 \pi \varepsilon_{0} a_{0}}
\)

The variational integral (9.59) thus has the value

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau=\left(\zeta^{2}-2 Z \zeta+\frac{5}{8} \zeta\right) \frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}} \tag{9.61}
\end{equation}
\)

As a check, if we set $\zeta=Z$ in (9.61), we get the first-order perturbation-theory result, (9.50) plus (9.53).

We now vary $\zeta$ to minimize the variational integral:

\(
\begin{gathered}
\frac{\partial}{\partial \zeta} \int \phi^{*} \hat{H} \phi d \tau=\left(2 \zeta-2 Z+\frac{5}{8}\right) \frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}}=0 \\
\zeta=Z-\frac{5}{16} \tag{9.62}
\end{gathered}
\)

As anticipated, the effective nuclear charge lies between $Z$ and $Z-1$. Using (9.62) and (9.61), we get

\(
\begin{equation}
\int \phi^{*} \hat{H} \phi d \tau=\left(-Z^{2}+\frac{5}{8} Z-\frac{25}{256}\right) \frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}}=-\left(Z-\frac{5}{16}\right)^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}} \tag{9.63}
\end{equation}
\)

Putting $Z=2$, we get as our approximation to the helium ground-state energy $-(27 / 16)^{2}\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)=-(729 / 256) 2(13.604 \mathrm{eV})=-77.48 \mathrm{eV}$, as compared with the true value of $-79.00 \mathrm{eV}$. Use of $\zeta$ instead of $Z$ has reduced the error from $5.3 \%$ to $1.9 \%$. In accord with the variation theorem, the true ground-state energy is less than the variational integral.
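The minimization (9.62) and the resulting energy (9.63) can be reproduced symbolically, for example with SymPy (the last line uses the reduced-mass value $e^{2} / 4 \pi \varepsilon_{0} a_{0}=2 \times 13.604 \mathrm{eV}$ from the text):

import sympy as sp

zeta, Z = sp.symbols('zeta Z', positive=True)
W = zeta**2 - 2*Z*zeta + sp.Rational(5, 8)*zeta   # Eq. (9.61) in units of e^2/(4 pi eps0 a0)

zeta_min = sp.solve(sp.diff(W, zeta), zeta)[0]
print(zeta_min)                                   # Z - 5/16, Eq. (9.62)

W_min = sp.simplify(W.subs(zeta, zeta_min))
print(W_min)                                      # equals -(Z - 5/16)**2, Eq. (9.63)
print(float(W_min.subs(Z, 2)) * 2 * 13.604)       # ~ -77.5 eV for helium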

How can we improve our variational result? We might try a function that had the general form of (9.56), that is, a product of two functions, one for each electron:

\(
\begin{equation}
\phi=u(1) u(2) \tag{9.64}
\end{equation}
\)

However, we could try a variety of functions $u$ in (9.64), instead of the single exponential used in (9.56). A systematic procedure for finding the function $u$ that gives the lowest value of the variational integral will be discussed in Section 11.1. This procedure shows that for the best possible choice of $u$ in (9.64) the variational integral equals $-77.86 \mathrm{eV}$, which is still in error by $1.5 \%$. We might ask why (9.64) does not give the true ground-state energy, no matter what form we try for $u$. The answer is that, when we write the trial function as the product of separate functions for each electron, we are making an approximation. Because of the $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ term in the Hamiltonian, the Schrödinger equation for helium is not separable, and the true ground-state wave function cannot be written as the product of separate functions for each electron. To reach the true ground-state energy, we must go beyond a function of the form (9.64).

The Bohr model gave the correct energies for the hydrogen atom but failed when applied to helium. Hence, in the early days of quantum mechanics, it was important to show that the new theory could give an accurate treatment of helium. The pioneering work on the helium ground state was done by Hylleraas in the years 1928-1930. To allow for the effect of one electron on the motion of the other, Hylleraas used variational functions that contained the interelectronic distance $r_{12}$. One function he used is

\(
\begin{equation}
\phi=N\left[e^{-\zeta r_{1} / a_{0}} e^{-\zeta r_{2} / a_{0}}\left(1+b r_{12}\right)\right] \tag{9.65}
\end{equation}
\)

where $N$ is the normalization constant and $\zeta$ and $b$ are variational parameters. Since

\(
\begin{equation}
r_{12}=\left[\left(x_{2}-x_{1}\right)^{2}+\left(y_{2}-y_{1}\right)^{2}+\left(z_{2}-z_{1}\right)^{2}\right]^{1 / 2} \tag{9.66}
\end{equation}
\)

the function (9.65) goes beyond the simple product form (9.64). Minimization of the variational integral with respect to the parameters gives $\zeta=1.849$, $b=0.364 / a_{0}$, and a ground-state energy of $-78.7 \mathrm{eV}$, in error by 0.3 eV. The $1+b r_{12}$ term makes the wave function larger for large values of $r_{12}$. This is as it should be, because the repulsion between the electrons makes it energetically more favorable for the electrons to avoid each other. Using a more complicated six-term trial function containing $r_{12}$, Hylleraas obtained an energy only 0.01 eV above the true ground-state energy.

Hylleraas's work has been extended by others. Using a 1078-term variational function, Pekeris found a ground-state energy of $-2.903724375\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$ [C. L. Pekeris, Phys. Rev., 115, 1216 (1959)]. With relativistic and nuclear-motion corrections added, this gave for $E_{i}$, the ionization energy of helium, $E_{i} / h c=198310.69 \mathrm{~cm}^{-1}$, compared with the experimental value $198310.67 \mathrm{~cm}^{-1}$. Using an improved variational function, Frankowski and Pekeris bettered Pekeris's result by obtaining the energy $-2.90372437703\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$, a result believed to be within $10^{-11}\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$ of the true nonrelativistic, infinite-nuclear-mass ground-state energy [K. Frankowski and C. L. Pekeris, Phys. Rev., 146, 46 (1966)]. Drake and Yan used linear variational functions containing $r_{12}$ to calculate the ground-state energy and many excited-state energies of He that are thought to be accurate to 1 part in $10^{14}$ or better [G. W. F. Drake and Z-C. Yan, Chem. Phys. Lett., 229, 486 (1994); Phys. Rev. A, 46, 2378 (1992)]. These workers similarly calculated Li variational energies for the ground state and two excited states with 1 part in $10^{9}$ accuracy or better [Z-C. Yan and G. W. F. Drake, Phys. Rev. A, 52, 3711 (1995)]. Adding in relativistic and nuclear-motion corrections, Drake and Yan found good agreement between theoretically calculated and experimental spectroscopic transition frequencies of He and Li. By doing a series of variational calculations with increasing numbers of terms in the variation function and extrapolating to the limit of an infinite number of terms, Drake and co-workers found the following ground-state, nonrelativistic, infinite-nuclear-mass He energy: $-2.90372437703411959831\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$, which is believed accurate to 21 significant figures [G. W. F. Drake et al., Phys. Rev. A, 65, 054501 (2002)].


We now consider the perturbation treatment of an energy level whose degree of degeneracy is $d$. We have $d$ linearly independent unperturbed wave functions corresponding to the degenerate level. We shall use the labels $1,2, \ldots, d$ for the states of the degenerate level, without implying that these are necessarily the lowest-lying states. The unperturbed Schrödinger equation is

\(
\begin{equation}
\hat{H}^{0} \psi_{n}^{(0)}=E_{n}^{(0)} \psi_{n}^{(0)} \tag{9.67}
\end{equation}
\)

with

\(
\begin{equation}
E_{1}^{(0)}=E_{2}^{(0)}=\cdots=E_{d}^{(0)} \tag{9.68}
\end{equation}
\)

The perturbed problem is

\(
\begin{gather}
\hat{H} \psi_{n}=E_{n} \psi_{n} \tag{9.69}\\
\hat{H}=\hat{H}^{0}+\lambda \hat{H}^{\prime} \tag{9.70}
\end{gather}
\)

As $\lambda$ goes to zero, the eigenvalues in (9.69) go to the eigenvalues in (9.67); we have $\lim _{\lambda \rightarrow 0} E_{n}=E_{n}^{(0)}$. Figure 9.2 shows this for a hypothetical system with six states and a threefold-degenerate unperturbed level. Note that the perturbation splits the degenerate energy level. In some cases the perturbation may have no effect on the degeneracy or may only partly remove the degeneracy.

As $\lambda \rightarrow 0$, the eigenfunctions satisfying (9.69) approach eigenfunctions satisfying (9.67). Does this mean that $\lim _{\lambda \rightarrow 0} \psi_{n}=\psi_{n}^{(0)}$? Not necessarily. If $E_{n}^{(0)}$ is nondegenerate, there is a unique normalized eigenfunction $\psi_{n}^{(0)}$ of $\hat{H}^{0}$ with eigenvalue $E_{n}^{(0)}$, and we can be sure that $\lim _{\lambda \rightarrow 0} \psi_{n}=\psi_{n}^{(0)}$. However, if $E_{n}^{(0)}$ is the eigenvalue of the $d$-fold degenerate level, then (Section 3.6) any linear combination

\(
\begin{equation}
c_{1} \psi_{1}^{(0)}+c_{2} \psi_{2}^{(0)}+\cdots+c_{d} \psi_{d}^{(0)} \tag{9.71}
\end{equation}
\)

is a solution of (9.67) with eigenvalue (9.68). The set of linearly independent normalized functions $\psi_{1}^{(0)}, \psi_{2}^{(0)}, \ldots, \psi_{d}^{(0)}$, which we use as eigenfunctions corresponding to the states

FIGURE 9.2 Effect of a perturbation on energy levels.
of the degenerate level, is not unique. Using (9.71), we can construct an infinite number of sets of $d$ linearly independent normalized eigenfunctions for the degenerate level. As far as the unperturbed problem is concerned, one such set is as good as another. For example, for the three degenerate $2 p$ states of the hydrogen atom, we can use the $2 p_{1}, 2 p_{0}$, and $2 p_{-1}$ functions; the $2 p_{x}, 2 p_{y}$, and $2 p_{z}$ functions; or some other set of three linearly independent functions constructed as linear combinations of the members of one of the preceding sets. For the perturbed eigenfunctions that correspond to the $d$-fold degenerate unperturbed level, all we can say is that as $\lambda$ approaches zero they each approach a linear combination of unperturbed eigenfunctions:

\(
\begin{equation}
\lim _{\lambda \rightarrow 0} \psi_{n}=\sum_{i=1}^{d} c_{i} \psi_{i}^{(0)}, \quad 1 \leq n \leq d \tag{9.72}
\end{equation}
\)

Our first task is thus to determine the correct zeroth-order wave functions (9.72) for the perturbation $\hat{H}^{\prime}$. Calling these correct zeroth-order functions $\phi_{n}^{(0)}$, we have

\(
\begin{equation}
\phi_{n}^{(0)}=\lim _{\lambda \rightarrow 0} \psi_{n}=\sum_{i=1}^{d} c_{i} \psi_{i}^{(0)}, \quad 1 \leq n \leq d \tag{9.73}
\end{equation}
\)

Each different function $\phi_{n}^{(0)}$ has a different set of coefficients in (9.73). The correct set of zeroth-order functions depends on what the perturbation $\hat{H}^{\prime}$ is.

The treatment of the $d$-fold degenerate level proceeds like the nondegenerate treatment of Section 9.2, except that instead of $\psi_{n}^{(0)}$ we use $\phi_{n}^{(0)}$. Instead of Eqs. (9.13) and (9.14), we have

\(
\begin{align}
\psi_{n} & =\phi_{n}^{(0)}+\lambda \psi_{n}^{(1)}+\lambda^{2} \psi_{n}^{(2)}+\cdots, \quad n=1,2, \ldots, d \tag{9.74}\\
E_{n} & =E_{d}^{(0)}+\lambda E_{n}^{(1)}+\lambda^{2} E_{n}^{(2)}+\cdots, \quad n=1,2, \ldots, d \tag{9.75}
\end{align}
\)

where (9.68) was used. Substitution into $\hat{H} \psi_{n}=E_{n} \psi_{n}$ gives

\(
\begin{aligned}
\left(\hat{H}^{0}+\lambda \hat{H}^{\prime}\right)\left(\phi_{n}^{(0)}+\right. & \left.\lambda \psi_{n}^{(1)}+\lambda^{2} \psi_{n}^{(2)}+\cdots\right) \\
& =\left(E_{d}^{(0)}+\lambda E_{n}^{(1)}+\lambda^{2} E_{n}^{(2)}+\cdots\right)\left(\phi_{n}^{(0)}+\lambda \psi_{n}^{(1)}+\lambda^{2} \psi_{n}^{(2)}+\cdots\right)
\end{aligned}
\)

Equating the coefficients of $\lambda^{0}$ in this equation, we get $\hat{H}^{0} \phi_{n}^{(0)}=E_{d}^{(0)} \phi_{n}^{(0)}$. By the theorem of Section 3.6, each linear combination $\phi_{n}^{(0)}(n=1,2, \ldots, d)$ is an eigenfunction of $\hat{H}^{0}$ with eigenvalue $E_{d}^{(0)}$, and this equation gives no new information.

Equating the coefficients of the $\lambda^{1}$ terms, we get

\(
\begin{gather}
\hat{H}^{0} \psi_{n}^{(1)}+\hat{H}^{\prime} \phi_{n}^{(0)}=E_{d}^{(0)} \psi_{n}^{(1)}+E_{n}^{(1)} \phi_{n}^{(0)} \\
\hat{H}^{0} \psi_{n}^{(1)}-E_{d}^{(0)} \psi_{n}^{(1)}=E_{n}^{(1)} \phi_{n}^{(0)}-\hat{H}^{\prime} \phi_{n}^{(0)}, \quad n=1,2, \ldots, d \tag{9.76}
\end{gather}
\)

We now multiply $(9.76)$ by $\psi_{m}^{(0) *}$ and integrate over all space, where $m$ is one of the states corresponding to the $d$-fold degenerate unperturbed level under consideration; that is, $1 \leq m \leq d$. We get

\(
\begin{array}{r}
\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(1)}\right\rangle-E_{d}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle=E_{n}^{(1)}\left\langle\psi_{m}^{(0)} \mid \phi_{n}^{(0)}\right\rangle-\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\phi_{n}^{(0)}\right\rangle, \\
1 \leq m \leq d \tag{9.77}
\end{array}
\)

From Eq. (9.20), we have $\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(1)}\right\rangle=E_{m}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle$. From (9.68), $E_{m}^{(0)}=E_{d}^{(0)}$ for $1 \leq m \leq d$, so $\left\langle\psi_{m}^{(0)}\right| \hat{H}^{0}\left|\psi_{n}^{(1)}\right\rangle=E_{d}^{(0)}\left\langle\psi_{m}^{(0)} \mid \psi_{n}^{(1)}\right\rangle$, and the left side of (9.77) equals zero. Equation (9.77) becomes

\(
\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\phi_{n}^{(0)}\right\rangle-E_{n}^{(1)}\left\langle\psi_{m}^{(0)} \mid \phi_{n}^{(0)}\right\rangle=0, \quad m=1,2, \ldots, d
\)

Substitution of the linear combination (9.73) for $\phi_{n}^{(0)}$ gives

\(
\begin{equation}
\sum_{i=1}^{d} c_{i}\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{i}^{(0)}\right\rangle-E_{n}^{(1)} \sum_{i=1}^{d} c_{i}\left\langle\psi_{m}^{(0)} \mid \psi_{i}^{(0)}\right\rangle=0 \tag{9.78}
\end{equation}
\)

The zeroth-order wave functions $\psi_{i}^{(0)}(i=1,2, \ldots, d)$ of the degenerate level can always be chosen to be orthonormal, and we shall assume this has been done:

\(
\begin{equation}
\left\langle\psi_{m}^{(0)} \mid \psi_{i}^{(0)}\right\rangle=\delta_{m i} \tag{9.79}
\end{equation}
\)

for $m$ and $i$ in the range 1 to $d$. Equation (9.78) becomes

\(
\begin{equation}
\sum_{i=1}^{d}\left[\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{i}^{(0)}\right\rangle-E_{n}^{(1)} \delta_{m i}\right] c_{i}=0, \quad m=1,2, \ldots, d \tag{9.80}
\end{equation}
\)

This is a set of $d$ linear, homogeneous equations in the $d$ unknowns $c_{1}, c_{2}, \ldots, c_{d}$, which are the coefficients in the correct zeroth-order wave function $\phi_{n}^{(0)}$ in (9.73). Writing out (9.80), we have

\(
\begin{gather}
\left(H_{11}^{\prime}-E_{n}^{(1)}\right) c_{1}+H_{12}^{\prime} c_{2}+\cdots+H_{1 d}^{\prime} c_{d}=0 \\
H_{21}^{\prime} c_{1}+\left(H_{22}^{\prime}-E_{n}^{(1)}\right) c_{2}+\cdots+H_{2 d}^{\prime} c_{d}=0 \\
\vdots \tag{9.81}\\
H_{d 1}^{\prime} c_{1}+H_{d 2}^{\prime} c_{2}+\cdots+\left(H_{d d}^{\prime}-E_{n}^{(1)}\right) c_{d}=0 \\
H_{m i}^{\prime} \equiv\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{i}^{(0)}\right\rangle
\end{gather}
\)

For this set of linear homogeneous equations to have a nontrivial solution, the determinant of the coefficients must vanish (Section 8.4):

\(
\begin{gather}
\operatorname{det}\left[\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{i}^{(0)}\right\rangle-E_{n}^{(1)} \delta_{m i}\right]=0 \tag{9.82}\\
\left|\begin{array}{cccc}
H_{11}^{\prime}-E_{n}^{(1)} & H_{12}^{\prime} & \cdots & H_{1 d}^{\prime} \\
H_{21}^{\prime} & H_{22}^{\prime}-E_{n}^{(1)} & \cdots & H_{2 d}^{\prime} \\
\vdots & \vdots & \ddots & \vdots \\
H_{d 1}^{\prime} & H_{d 2}^{\prime} & \cdots & H_{d d}^{\prime}-E_{n}^{(1)}
\end{array}\right|=0 \tag{9.83}
\end{gather}
\)

The secular equation (9.83) is an algebraic equation of degree $d$ in $E_{n}^{(1)}$. It has $d$ roots, $E_{1}^{(1)}, E_{2}^{(1)}, \ldots, E_{d}^{(1)}$, which are the first-order corrections to the energy of the $d$-fold degenerate unperturbed level. If the roots are all different, then the first-order perturbation correction has split the $d$-fold degenerate unperturbed level into $d$ different perturbed levels of energies (correct through first order):

\(
E_{d}^{(0)}+E_{1}^{(1)}, \quad E_{d}^{(0)}+E_{2}^{(1)}, \quad \ldots, \quad E_{d}^{(0)}+E_{d}^{(1)}
\)

If two or more roots of the secular equation are equal, the degeneracy is not completely removed in first order. In the rest of this section, we shall assume that all the roots of (9.83) are different.

Having found the $d$ first-order energy corrections, we go back to the set of equations (9.81) to find the unknowns $c_{i}$, which determine the correct zeroth-order wave functions. To find the correct zeroth-order function

\(
\begin{equation}
\phi_{n}^{(0)}=c_{1} \psi_{1}^{(0)}+c_{2} \psi_{2}^{(0)}+\cdots+c_{d} \psi_{d}^{(0)} \tag{9.84}
\end{equation}
\)

corresponding to the root $E_{n}^{(1)}$, we solve (9.81) for $c_{2}, c_{3}, \ldots, c_{d}$ in terms of $c_{1}$ and then find $c_{1}$ by normalization. Use of (9.79) in $\left\langle\phi_{n}^{(0)} \mid \phi_{n}^{(0)}\right\rangle=1$ gives (Prob. 9.21)

\(
\begin{equation}
\sum_{k=1}^{d}\left|c_{k}\right|^{2}=1 \tag{9.85}
\end{equation}
\)

For each root $E_{n}^{(1)}, n=1,2, \ldots, d$, we have a different set of coefficients $c_{1}, c_{2}, \ldots, c_{d}$, giving a different correct zeroth-order wave function.

In the next section, we shall show that

\(
\begin{equation}
E_{n}^{(1)}=\left\langle\phi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\phi_{n}^{(0)}\right\rangle, \quad n=1,2, \ldots, d \tag{9.86}
\end{equation}
\)

which is similar to the nondegenerate-case formula (9.22), except that the correct zeroth-order functions have to be used.

Using procedures similar to those for the nondegenerate case, one can now find the first-order corrections to the correct zeroth-order wave functions and the second-order energy corrections. For the results, see Bates, Volume I, pages 197-198; Hameka, pages 230-231.

As an example, consider the effect of a perturbation $\hat{H}^{\prime}$ on the lowest degenerate energy level of a particle in a cubic box. We have three states corresponding to this level: $\psi_{211}^{(0)}, \psi_{121}^{(0)}$, and $\psi_{112}^{(0)}$. These unperturbed wave functions are orthonormal, and the secular equation (9.83) is

\(
\left|\begin{array}{ccc}
\langle 211| \hat{H}^{\prime}|211\rangle-E_{n}^{(1)} & \langle 211| \hat{H}^{\prime}|121\rangle & \langle 211| \hat{H}^{\prime}|112\rangle \\
\langle 121| \hat{H}^{\prime}|211\rangle & \langle 121| \hat{H}^{\prime}|121\rangle-E_{n}^{(1)} & \langle 121| \hat{H}^{\prime}|112\rangle \\
\langle 112| \hat{H}^{\prime}|211\rangle & \langle 112| \hat{H}^{\prime}|121\rangle & \langle 112| \hat{H}^{\prime}|112\rangle-E_{n}^{(1)}
\end{array}\right|=0 \tag{9.87}
\)

Solving this equation, we find the first-order energy corrections: $E_{1}^{(1)}, E_{2}^{(1)}, E_{3}^{(1)}$. The triply degenerate unperturbed level is split into three levels of energies (through first order): $\left(6 h^{2} / 8 m a^{2}\right)+E_{1}^{(1)},\left(6 h^{2} / 8 m a^{2}\right)+E_{2}^{(1)},\left(6 h^{2} / 8 m a^{2}\right)+E_{3}^{(1)}$. Using each of the roots $E_{1}^{(1)}, E_{2}^{(1)}, E_{3}^{(1)}$, we get a different set of simultaneous equations (9.81). Solving each set, we find three sets of coefficients, which determine the three correct zeroth-order wave functions.

If you are familiar with matrix algebra, note that solving (9.83) and (9.81) amounts to finding the eigenvalues and eigenvectors of the matrix whose elements are $\left\langle\psi_{m}^{(0)}\right| \hat{H}^{\prime}\left|\psi_{i}^{(0)}\right\rangle$.
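
A minimal sketch of this remark (the $3 \times 3$ matrix below is hypothetical; only its Hermitian character matters):

```python
import numpy as np

# Degenerate perturbation theory as a matrix eigenproblem: diagonalize the
# d x d matrix H'_{mi} = <psi_m(0)|H'|psi_i(0)> built within the degenerate level.
Hprime = np.array([[2.0, 1.0, 0.0],
                   [1.0, 2.0, 0.0],
                   [0.0, 0.0, 3.0]])   # illustrative numbers only

E1, C = np.linalg.eigh(Hprime)   # eigenvalues = roots E_n^(1) of Eq. (9.83)

print(E1)   # first-order energy corrections
print(C)    # column n holds the coefficients c_i of phi_n^(0), Eq. (9.73)
```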


The secular equation (9.83) is easier to solve if some of the off-diagonal elements of the secular determinant are zero. In the most favorable case, all the off-diagonal elements are zero, and

\(
\begin{gather}
\left|\begin{array}{cccc}
H_{11}^{\prime}-E_{n}^{(1)} & 0 & \cdots & 0 \\
0 & H_{22}^{\prime}-E_{n}^{(1)} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & H_{d d}^{\prime}-E_{n}^{(1)}
\end{array}\right|=0 \tag{9.88}\\
\left(H_{11}^{\prime}-E_{n}^{(1)}\right)\left(H_{22}^{\prime}-E_{n}^{(1)}\right) \cdots\left(H_{d d}^{\prime}-E_{n}^{(1)}\right)=0 \\
E_{1}^{(1)}=H_{11}^{\prime}, \quad E_{2}^{(1)}=H_{22}^{\prime}, \quad \ldots, \quad E_{d}^{(1)}=H_{d d}^{\prime} \tag{9.89}
\end{gather}
\)

Now we want to find the correct zeroth-order wave functions. We shall assume that the roots (9.89) are all different. For the root $E_{n}^{(1)}=H_{11}^{\prime}$, the system of equations (9.81) is

\(
\begin{aligned}
& 0=0 \\
& \left(H_{22}^{\prime}-H_{11}^{\prime}\right) c_{2}=0 \\
& \vdots \\
& \left(H_{d d}^{\prime}-H_{11}^{\prime}\right) c_{d}=0
\end{aligned}
\)

Since we are assuming unequal roots, the quantities $H_{22}^{\prime}-H_{11}^{\prime}, \ldots, H_{d d}^{\prime}-H_{11}^{\prime}$ are all nonzero. Therefore, $c_{2}=0, c_{3}=0, \ldots, c_{d}=0$. The normalization condition (9.85) gives $c_{1}=1$. The correct zeroth-order wave function corresponding to the first-order perturbation energy correction $H_{11}^{\prime}$ is then [Eq. (9.73)] $\phi_{1}^{(0)}=\psi_{1}^{(0)}$. For the root $H_{22}^{\prime}$, the same reasoning gives $\phi_{2}^{(0)}=\psi_{2}^{(0)}$. Using each of the remaining roots, we find similarly: $\phi_{3}^{(0)}=\psi_{3}^{(0)}, \ldots, \phi_{d}^{(0)}=\psi_{d}^{(0)}$.

When the secular determinant is in diagonal form, the initially assumed wave functions $\psi_{1}^{(0)}, \psi_{2}^{(0)}, \ldots, \psi_{d}^{(0)}$ are the correct zeroth-order wave functions for the perturbation $\hat{H}^{\prime}$.

The converse is also true. If the initially assumed functions are the correct zeroth-order functions, then the secular determinant is in diagonal form. This is seen as follows. From $\phi_{1}^{(0)}=\psi_{1}^{(0)}$ we know that the coefficients in the expansion $\phi_{1}^{(0)}=\sum_{i=1}^{d} c_{i} \psi_{i}^{(0)}$ are $c_{1}=1, c_{2}=c_{3}=\cdots=0$, so for $n=1$ the set of simultaneous equations (9.81) becomes

\(
H_{11}^{\prime}-E_{1}^{(1)}=0, \quad H_{21}^{\prime}=0, \quad \ldots, \quad H_{d 1}^{\prime}=0
\)

Applying the same reasoning to the remaining functions $\phi_{n}^{(0)}$, we conclude that $H_{m i}^{\prime}=0$ for $i \neq m$. Hence, use of the correct zeroth-order functions makes the secular determinant diagonal. Note also that the first-order corrections to the energy can be found by averaging the perturbation over the correct zeroth-order wave functions:

\(
\begin{equation}
E_{n}^{(1)}=H_{n n}^{\prime}=\left\langle\phi_{n}^{(0)}\right| \hat{H}^{\prime}\left|\phi_{n}^{(0)}\right\rangle \tag{9.90}
\end{equation}
\)

a result mentioned in Eq. (9.86).
Often, instead of being in diagonal form, the secular determinant is in block-diagonal form. For example, we might have

\(
\left|\begin{array}{cccc}
H{11}^{\prime}-E{n}^{(1)} & H{12}^{\prime} & 0 & 0 \tag{9.91}\
H{21}^{\prime} & H{22}^{\prime}-E{n}^{(1)} & 0 & 0 \
0 & 0 & H{33}^{\prime}-E{n}^{(1)} & H{34}^{\prime} \
0 & 0 & H{43}^{\prime} & H{44}^{\prime}-E{n}^{(1)}
\end{array}\right|=0
\)

The secular determinant in (9.91) has the same form as the secular determinant in the linear-variation secular equation (8.65) with $S_{i j}=\delta_{i j}$. By the same reasoning used to show that two of the variation functions are linear combinations of $f_{1}$ and $f_{2}$ and two are linear combinations of $f_{3}$ and $f_{4}$ [Eq. (8.69)], it follows that two of the correct zeroth-order wave functions are linear combinations of $\psi_{1}^{(0)}$ and $\psi_{2}^{(0)}$ and two are linear combinations of $\psi_{3}^{(0)}$ and $\psi_{4}^{(0)}$:

\(
\begin{array}{ll}
\phi_{1}^{(0)}=c_{1} \psi_{1}^{(0)}+c_{2} \psi_{2}^{(0)}, & \phi_{2}^{(0)}=c_{1}^{\prime} \psi_{1}^{(0)}+c_{2}^{\prime} \psi_{2}^{(0)} \\
\phi_{3}^{(0)}=c_{3} \psi_{3}^{(0)}+c_{4} \psi_{4}^{(0)}, & \phi_{4}^{(0)}=c_{3}^{\prime} \psi_{3}^{(0)}+c_{4}^{\prime} \psi_{4}^{(0)}
\end{array}
\)

where primes were used to distinguish different coefficients.
When the secular determinant of degenerate perturbation theory is in block-diagonal form, the secular equation breaks up into two or more smaller secular equations, and the set of simultaneous equations (9.81) for the coefficients $c_{i}$ breaks up into two or more smaller sets of simultaneous equations.

Conversely, if we have, say, a fourfold-degenerate unperturbed level, and we happen to know that $\phi_{1}^{(0)}$ and $\phi_{2}^{(0)}$ are each linear combinations of $\psi_{1}^{(0)}$ and $\psi_{2}^{(0)}$ only, while $\phi_{3}^{(0)}$ and $\phi_{4}^{(0)}$ are each linear combinations of $\psi_{3}^{(0)}$ and $\psi_{4}^{(0)}$ only, we deal with two second-order secular determinants rather than a fourth-order secular determinant.

How can we choose the right zeroth-order wave functions in advance and thereby simplify the secular equation? Suppose there is an operator $\hat{A}$ that commutes with both $\hat{H}^{0}$ and $\hat{H}^{\prime}$. Then we can choose the unperturbed functions to be eigenfunctions of $\hat{A}$. Because $\hat{A}$ commutes with $\hat{H}^{\prime}$, this choice of unperturbed functions will make the integrals $H_{i j}^{\prime}$ vanish if $\psi_{i}^{(0)}$ and $\psi_{j}^{(0)}$ belong to different eigenvalues of $\hat{A}$ [see Eq. (7.50)]. Thus, if the eigenvalues of $\hat{A}$ for $\psi_{1}^{(0)}, \psi_{2}^{(0)}, \ldots, \psi_{d}^{(0)}$ are all different, the secular determinant will be in diagonal form, and we will have the right zeroth-order wave functions. If some of the eigenvalues of $\hat{A}$ are the same, we get block-diagonal rather than diagonal form. In general, the correct zeroth-order functions will be linear combinations of those unperturbed functions that have the same eigenvalue of $\hat{A}$. (This is to be expected since $\hat{A}$ commutes with $\hat{H}=\hat{H}^{0}+\hat{H}^{\prime}$, so the perturbed eigenfunctions of $\hat{H}$ can be chosen to be eigenfunctions of $\hat{A}$.) For an example, see Prob. 9.23.


Section 9.3 applied perturbation theory to the ground state of the helium atom. We now treat the lowest excited states of He. The unperturbed energies are given by (9.48). The lowest unperturbed excited states have $n_{1}=1, n_{2}=2$ or $n_{1}=2, n_{2}=1$, and substitution in (9.48) gives
\(
\begin{equation}
E^{(0)}=-\frac{5 Z^{2}}{8}\left(\frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}}\right)=-5(13.606 \mathrm{eV})=-68.03 \mathrm{eV} \tag{9.92}
\end{equation}
\)
Recall that the $n=2$ level of a hydrogenlike atom is fourfold degenerate, the $2 s$ and three $2 p$ states all having the same energy. The first excited unperturbed energy level of He is eightfold degenerate. The eight unperturbed wave functions are [Eq. (9.44)]

\(
\begin{align}
& \psi_{1}^{(0)}=1 s(1) 2 s(2), \quad \psi_{2}^{(0)}=2 s(1) 1 s(2), \quad \psi_{3}^{(0)}=1 s(1) 2 p_{x}(2), \quad \psi_{4}^{(0)}=2 p_{x}(1) 1 s(2) \\
& \psi_{5}^{(0)}=1 s(1) 2 p_{y}(2), \quad \psi_{6}^{(0)}=2 p_{y}(1) 1 s(2), \quad \psi_{7}^{(0)}=1 s(1) 2 p_{z}(2), \quad \psi_{8}^{(0)}=2 p_{z}(1) 1 s(2) \tag{9.93}
\end{align}
\)

where $1 s(1) 2 s(2)$ signifies the product of a hydrogenlike $1 s$ function for electron 1 and a hydrogenlike $2 s$ function for electron 2. The explicit form of $\psi_{8}^{(0)}$, for example, is (Table 6.2)

\(
\psi_{8}^{(0)}=\frac{1}{4(2 \pi)^{1 / 2}}\left(\frac{Z}{a_{0}}\right)^{5 / 2} r_{1} e^{-Z r_{1} / 2 a_{0}} \cos \theta_{1} \cdot \frac{1}{\pi^{1 / 2}}\left(\frac{Z}{a_{0}}\right)^{3 / 2} e^{-Z r_{2} / a_{0}}
\)

We chose to use the real $2 p$ hydrogenlike orbitals, rather than the complex ones.
Since the unperturbed level is degenerate, we must solve a secular equation. The secular equation (9.83) assumes that the functions $\psi_{1}^{(0)}, \psi_{2}^{(0)}, \ldots, \psi_{8}^{(0)}$ are orthonormal. This condition is met. For example,

\(
\begin{aligned}
\int \psi_{1}^{(0) *} \psi_{1}^{(0)} d \tau & =\iint 1 s(1) 2 s(2) 1 s(1) 2 s(2) d \tau_{1} d \tau_{2} \\
& =\int|1 s(1)|^{2} d \tau_{1} \int|2 s(2)|^{2} d \tau_{2}=1 \cdot 1=1 \\
\int \psi_{3}^{(0) *} \psi_{5}^{(0)} d \tau & =\int|1 s(1)|^{2} d \tau_{1} \int 2 p_{x}(2)^{*} 2 p_{y}(2) d \tau_{2}=1 \cdot 0=0
\end{aligned}
\)

where the orthonormality of the hydrogenlike orbitals has been used.

The secular determinant contains $8^{2}=64$ elements. The operator $\hat{H}^{\prime}$ is Hermitian, and $H_{i j}^{\prime}=\left(H_{j i}^{\prime}\right)^{*}$. Also, since $\hat{H}^{\prime}$ and $\psi_{1}^{(0)}, \ldots, \psi_{8}^{(0)}$ are all real, we have $\left(H_{j i}^{\prime}\right)^{*}=H_{j i}^{\prime}$, so $H_{i j}^{\prime}=H_{j i}^{\prime}$. The secular determinant is symmetric about the principal diagonal. This cuts the labor of evaluating integrals almost in half.

By using parity considerations, we can show that most of the integrals $H_{i j}^{\prime}$ are zero. First consider $H_{13}^{\prime}$:
$H_{13}^{\prime}=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} 1 s(1) 2 s(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 1 s(1) 2 p_{x}(2) d x_{1} d y_{1} d z_{1} d x_{2} d y_{2} d z_{2}$
An $s$ hydrogenlike function depends only on $r=\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}$ and is therefore an even function. The $2 p_{x}(2)$ function is an odd function of $x_{2}$ [Eq. (6.119)]. $r_{12}$ is given by (9.66), and if we invert all six coordinates, $r_{12}$ is unchanged:

\(
r_{12} \rightarrow\left[\left(-x_{1}+x_{2}\right)^{2}+\left(-y_{1}+y_{2}\right)^{2}+\left(-z_{1}+z_{2}\right)^{2}\right]^{1 / 2}=r_{12}
\)

Hence, on inverting all six coordinates, the integrand of $H_{13}^{\prime}$ goes into minus itself. Therefore [Eq. (7.64)], $H_{13}^{\prime}=0$. The same reasoning gives $H_{14}^{\prime}=H_{15}^{\prime}=H_{16}^{\prime}=H_{17}^{\prime}=H_{18}^{\prime}=0$ and $H_{23}^{\prime}=H_{24}^{\prime}=H_{25}^{\prime}=H_{26}^{\prime}=H_{27}^{\prime}=H_{28}^{\prime}=0$. Now consider $H_{35}^{\prime}$:

\(
H_{35}^{\prime}=\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} 1 s(1) 2 p_{x}(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 1 s(1) 2 p_{y}(2) d x_{1} \cdots d z_{2}
\)

Suppose we invert the $x$ coordinates: $x_{1} \rightarrow-x_{1}$ and $x_{2} \rightarrow-x_{2}$. This transformation will leave $r_{12}$ unchanged. The functions $1 s(1)$ and $2 p_{y}(2)$ will be unaffected. However, $2 p_{x}(2)$ will go over to minus itself, so the net effect will be to change the integrand of $H_{35}^{\prime}$ into minus itself. Hence (Prob. 7.30), $H_{35}^{\prime}=0$. Likewise, $H_{36}^{\prime}=H_{37}^{\prime}=H_{38}^{\prime}=0$ and $H_{45}^{\prime}=H_{46}^{\prime}=H_{47}^{\prime}=H_{48}^{\prime}=0$. By considering the transformation $y_{1} \rightarrow-y_{1}, y_{2} \rightarrow-y_{2}$, we see that $H_{57}^{\prime}=H_{58}^{\prime}=H_{67}^{\prime}=H_{68}^{\prime}=0$. The secular equation is thus

\(
\left|\begin{array}{cccccccc}
H_{11}^{\prime}-E^{(1)} & H_{12}^{\prime} & 0 & 0 & 0 & 0 & 0 & 0 \\
H_{12}^{\prime} & H_{22}^{\prime}-E^{(1)} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & H_{33}^{\prime}-E^{(1)} & H_{34}^{\prime} & 0 & 0 & 0 & 0 \\
0 & 0 & H_{34}^{\prime} & H_{44}^{\prime}-E^{(1)} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & H_{55}^{\prime}-E^{(1)} & H_{56}^{\prime} & 0 & 0 \\
0 & 0 & 0 & 0 & H_{56}^{\prime} & H_{66}^{\prime}-E^{(1)} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & H_{77}^{\prime}-E^{(1)} & H_{78}^{\prime} \\
0 & 0 & 0 & 0 & 0 & 0 & H_{78}^{\prime} & H_{88}^{\prime}-E^{(1)}
\end{array}\right|=0
\)

The secular determinant is in block-diagonal form and factors into four determinants, each of second order. We conclude that the correct zeroth-order functions have the form

\(
\begin{array}{ll}
\phi_{1}^{(0)}=c_{1} \psi_{1}^{(0)}+c_{2} \psi_{2}^{(0)}, & \phi_{2}^{(0)}=\bar{c}_{1} \psi_{1}^{(0)}+\bar{c}_{2} \psi_{2}^{(0)} \\
\phi_{3}^{(0)}=c_{3} \psi_{3}^{(0)}+c_{4} \psi_{4}^{(0)}, & \phi_{4}^{(0)}=\bar{c}_{3} \psi_{3}^{(0)}+\bar{c}_{4} \psi_{4}^{(0)} \\
\phi_{5}^{(0)}=c_{5} \psi_{5}^{(0)}+c_{6} \psi_{6}^{(0)}, & \phi_{6}^{(0)}=\bar{c}_{5} \psi_{5}^{(0)}+\bar{c}_{6} \psi_{6}^{(0)} \\
\phi_{7}^{(0)}=c_{7} \psi_{7}^{(0)}+c_{8} \psi_{8}^{(0)}, & \phi_{8}^{(0)}=\bar{c}_{7} \psi_{7}^{(0)}+\bar{c}_{8} \psi_{8}^{(0)} \tag{9.94}
\end{array}
\)

where the unbarred coefficients correspond to one root of each second-order determinant and the barred coefficients correspond to the second root.

The first determinant is

\(
\left|\begin{array}{cc}
H_{11}^{\prime}-E^{(1)} & H_{12}^{\prime} \\
H_{12}^{\prime} & H_{22}^{\prime}-E^{(1)}
\end{array}\right|=0 \tag{9.95}
\)

We have

\(
\begin{aligned}
H_{11}^{\prime} & =\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} 1 s(1) 2 s(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 1 s(1) 2 s(2) d x_{1} \cdots d z_{2} \\
H_{11}^{\prime} & =\iint[1 s(1)]^{2}[2 s(2)]^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} d \tau_{1} d \tau_{2} \\
H_{22}^{\prime} & =\iint[1 s(2)]^{2}[2 s(1)]^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} d \tau_{1} d \tau_{2}
\end{aligned}
\)

The integration variables are dummy variables and may be given any symbols whatever. Let us relabel the integration variables in $H_{22}^{\prime}$ as follows: We interchange $x_{1}$ and $x_{2}$, interchange $y_{1}$ and $y_{2}$, and interchange $z_{1}$ and $z_{2}$. This relabeling leaves $r_{12}$ [Eq. (9.66)] unchanged, so

\(
\begin{equation}
H_{22}^{\prime}=\iint[1 s(1)]^{2}[2 s(2)]^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} d \tau_{2} d \tau_{1}=H_{11}^{\prime} \tag{9.96}
\end{equation}
\)

The same argument shows that $H_{33}^{\prime}=H_{44}^{\prime}, H_{55}^{\prime}=H_{66}^{\prime}$, and $H_{77}^{\prime}=H_{88}^{\prime}$.
We denote $H_{11}^{\prime}$ by the symbol $J_{1 s 2 s}$:

\(
\begin{equation}
H_{11}^{\prime}=J_{1 s 2 s}=\iint[1 s(1)]^{2}[2 s(2)]^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} d \tau_{1} d \tau_{2} \tag{9.97}
\end{equation}
\)

This is an example of a Coulomb integral, the name arising because $J_{1 s 2 s}$ is equal to the electrostatic energy of repulsion between an electron with probability density function $[1 s]^{2}$ and an electron with probability density function $[2 s]^{2}$.
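
As a numerical aside (not part of the original treatment), $J_{1 s 2 s}$ can be checked directly: for spherically symmetric densities the angular average of $1 / r_{12}$ is $1 / r_{>}$, which reduces (9.97) to a double radial integral. A sketch in atomic units ($e^{2} / 4 \pi \varepsilon_{0}=1$, $a_{0}=1$), using the hydrogenlike $1 s$ and $2 s$ densities:

```python
import numpy as np

Z = 2
r = np.linspace(1e-6, 30.0, 4000)                  # radial grid, units of a0

rho_1s = (Z**3 / np.pi) * np.exp(-2 * Z * r)                       # |1s|^2
rho_2s = (Z**3 / (32 * np.pi)) * (2 - Z * r)**2 * np.exp(-Z * r)   # |2s|^2

f1 = 4 * np.pi * r**2 * rho_1s     # radial probability density, electron 1
f2 = 4 * np.pi * r**2 * rho_2s     # radial probability density, electron 2

r_greater = np.maximum.outer(r, r)                 # r_> on the (r1, r2) grid
J = np.trapz(np.trapz(np.outer(f1, f2) / r_greater, r, axis=1), r)

print(J)              # about 0.4198 hartree = 11.42 eV
print(17 * Z / 81)    # the analytic value quoted in Eq. (9.111) below
```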

The integral $H_{12}^{\prime}$ is denoted by $K_{1 s 2 s}$:

\(
\begin{equation}
H_{12}^{\prime}=K_{1 s 2 s}=\iint 1 s(1) 2 s(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 2 s(1) 1 s(2) d \tau_{1} d \tau_{2} \tag{9.98}
\end{equation}
\)

This is an exchange integral: The functions on the left and right of $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ differ from each other by an exchange of electrons 1 and 2.

The general definitions of the Coulomb integral $J_{m n}$ and the exchange integral $K_{m n}$ are

\(
\begin{align}
J_{m n} & \equiv\left\langle f_{m}(1) f_{n}(2)\right| e^{2} / 4 \pi \varepsilon_{0} r_{12}\left|f_{m}(1) f_{n}(2)\right\rangle \tag{9.99}\\
K_{m n} & \equiv\left\langle f_{m}(1) f_{n}(2)\right| e^{2} / 4 \pi \varepsilon_{0} r_{12}\left|f_{n}(1) f_{m}(2)\right\rangle \tag{9.100}
\end{align}
\)

where the integrals go over the full range of the spatial coordinates of electrons 1 and 2, and $f_{m}$ and $f_{n}$ are spatial orbitals.

Substitution of (9.96) to (9.98) into (9.95) gives

\(
\begin{align}
& \left|\begin{array}{cc}
J_{1 s 2 s}-E^{(1)} & K_{1 s 2 s} \\
K_{1 s 2 s} & J_{1 s 2 s}-E^{(1)}
\end{array}\right|=0 \tag{9.101}\\
& \left(J_{1 s 2 s}-E^{(1)}\right)^{2}=\left(K_{1 s 2 s}\right)^{2} \\
& E_{1}^{(1)}=J_{1 s 2 s}-K_{1 s 2 s}, \quad E_{2}^{(1)}=J_{1 s 2 s}+K_{1 s 2 s} \tag{9.102}
\end{align}
\)

We now find the coefficients of the correct zeroth-order wave functions that correspond to these two roots. Use of $E_{1}^{(1)}$ in (9.81) gives

\(
\begin{aligned}
& K_{1 s 2 s} c_{1}+K_{1 s 2 s} c_{2}=0 \\
& K_{1 s 2 s} c_{1}+K_{1 s 2 s} c_{2}=0
\end{aligned}
\)

Hence $c_{2}=-c_{1}$. Normalization gives

\(
\begin{gathered}
\left\langle\phi_{1}^{(0)} \mid \phi_{1}^{(0)}\right\rangle=\left\langle c_{1} \psi_{1}^{(0)}-c_{1} \psi_{2}^{(0)} \mid c_{1} \psi_{1}^{(0)}-c_{1} \psi_{2}^{(0)}\right\rangle=\left|c_{1}\right|^{2}+\left|c_{1}\right|^{2}=1 \\
c_{1}=2^{-1 / 2}
\end{gathered}
\)

where the orthonormality of $\psi_{1}^{(0)}$ and $\psi_{2}^{(0)}$ was used. The zeroth-order wave function corresponding to $E_{1}^{(1)}$ is then

\(
\begin{equation}
\phi_{1}^{(0)}=2^{-1 / 2}\left(\psi_{1}^{(0)}-\psi_{2}^{(0)}\right)=2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] \tag{9.103}
\end{equation}
\)

Similarly, one finds the function corresponding to $E_{2}^{(1)}$ to be

\(
\begin{equation}
\phi_{2}^{(0)}=2^{-1 / 2}\left(\psi_{1}^{(0)}+\psi_{2}^{(0)}\right)=2^{-1 / 2}[1 s(1) 2 s(2)+2 s(1) 1 s(2)] \tag{9.104}
\end{equation}
\)
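
The algebra in (9.101) to (9.104) is just the diagonalization of a symmetric $2 \times 2$ matrix, which can be confirmed symbolically (a sketch; $J$ and $K$ here stand for $J_{1 s 2 s}$ and $K_{1 s 2 s}$):

```python
import sympy as sp

J, K = sp.symbols('J K', real=True, positive=True)
M = sp.Matrix([[J, K], [K, J]])     # the matrix behind the determinant (9.101)

for eigenvalue, multiplicity, vectors in M.eigenvects():
    print(eigenvalue, [list(v) for v in vectors])
# J - K with eigenvector proportional to (1, -1):  phi_1^(0), Eq. (9.103)
# J + K with eigenvector proportional to (1,  1):  phi_2^(0), Eq. (9.104)
```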

We have three more second-order determinants to deal with:

\(
\begin{gather}
\left|\begin{array}{cc}
H_{33}^{\prime}-E^{(1)} & H_{34}^{\prime} \\
H_{34}^{\prime} & H_{33}^{\prime}-E^{(1)}
\end{array}\right|=0 \tag{9.105}\\
\left|\begin{array}{cc}
H_{55}^{\prime}-E^{(1)} & H_{56}^{\prime} \\
H_{56}^{\prime} & H_{55}^{\prime}-E^{(1)}
\end{array}\right|=0 \tag{9.106}\\
\left|\begin{array}{cc}
H_{77}^{\prime}-E^{(1)} & H_{78}^{\prime} \\
H_{78}^{\prime} & H_{77}^{\prime}-E^{(1)}
\end{array}\right|=0 \tag{9.107}
\end{gather}
\)

Consider $H_{33}^{\prime}$ and $H_{55}^{\prime}$:

\(
\begin{aligned}
& H_{33}^{\prime}=\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} 1 s(1) 2 p_{x}(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 1 s(1) 2 p_{x}(2) d x_{1} \cdots d z_{2} \\
& H_{55}^{\prime}=\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} 1 s(1) 2 p_{y}(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 1 s(1) 2 p_{y}(2) d x_{1} \cdots d z_{2}
\end{aligned}
\)

These two integrals are equal; the only difference between them involves replacement of $2 p_{x}(2)$ by $2 p_{y}(2)$, and these two orbitals differ only in their orientation in space. More formally, if we relabel the dummy integration variables in $H_{33}^{\prime}$ according to the scheme $x_{2} \rightarrow y_{2}, y_{2} \rightarrow x_{2}, x_{1} \rightarrow y_{1}, y_{1} \rightarrow x_{1}$, then $r_{12}$ is unaffected and $H_{33}^{\prime}$ is transformed to $H_{55}^{\prime}$. Similar reasoning shows $H_{77}^{\prime}=H_{33}^{\prime}$. Introducing the symbol $J_{1 s 2 p}$ for these Coulomb integrals, we have

\(
H_{33}^{\prime}=H_{55}^{\prime}=H_{77}^{\prime}=J_{1 s 2 p}=\iint 1 s(1) 2 p_{z}(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 1 s(1) 2 p_{z}(2) d \tau_{1} d \tau_{2}
\)

Also, the exchange integrals involving the $2 p$ orbitals are equal:

\(
H_{34}^{\prime}=H_{56}^{\prime}=H_{78}^{\prime}=K_{1 s 2 p}=\iint 1 s(1) 2 p_{z}(2) \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} 2 p_{z}(1) 1 s(2) d \tau_{1} d \tau_{2}
\)

The three determinants $(9.105)$ to $(9.107)$ are thus identical and have the form

\(
\left|\begin{array}{cc}
J_{1 s 2 p}-E^{(1)} & K_{1 s 2 p} \\
K_{1 s 2 p} & J_{1 s 2 p}-E^{(1)}
\end{array}\right|=0
\)

The determinant is similar to (9.101), and by analogy with (9.102)-(9.104), we get

\(
\begin{gather}
E_{3}^{(1)}=E_{5}^{(1)}=E_{7}^{(1)}=J_{1 s 2 p}-K_{1 s 2 p} \tag{9.108}\\
E_{4}^{(1)}=E_{6}^{(1)}=E_{8}^{(1)}=J_{1 s 2 p}+K_{1 s 2 p} \tag{9.109}\\
\phi_{3}^{(0)}=2^{-1 / 2}\left[1 s(1) 2 p_{x}(2)-1 s(2) 2 p_{x}(1)\right] \\
\phi_{4}^{(0)}=2^{-1 / 2}\left[1 s(1) 2 p_{x}(2)+1 s(2) 2 p_{x}(1)\right] \\
\phi_{5}^{(0)}=2^{-1 / 2}\left[1 s(1) 2 p_{y}(2)-1 s(2) 2 p_{y}(1)\right] \tag{9.110}\\
\phi_{6}^{(0)}=2^{-1 / 2}\left[1 s(1) 2 p_{y}(2)+1 s(2) 2 p_{y}(1)\right] \\
\phi_{7}^{(0)}=2^{-1 / 2}\left[1 s(1) 2 p_{z}(2)-1 s(2) 2 p_{z}(1)\right] \\
\phi_{8}^{(0)}=2^{-1 / 2}\left[1 s(1) 2 p_{z}(2)+1 s(2) 2 p_{z}(1)\right]
\end{gather}
\)

The electrostatic repulsion $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ between the electrons has partly removed the degeneracy. The hypothetical eightfold-degenerate unperturbed level has been split into two nondegenerate levels associated with the configuration $1 s 2 s$ and two triply degenerate levels associated with the configuration $1 s 2 p$. It might be thought that higher-order energy corrections would further resolve the degeneracy. Actually, application of an external magnetic field is required to completely remove the degeneracy. Because the $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ perturbation has not completely removed the degeneracy, any normalized linear combinations of $\phi_{3}^{(0)}$, $\phi_{5}^{(0)}$, and $\phi_{7}^{(0)}$ and of $\phi_{4}^{(0)}, \phi_{6}^{(0)}$, and $\phi_{8}^{(0)}$ can serve as correct zeroth-order wave functions.

To evaluate the Coulomb and exchange integrals in $E^{(1)}$ in (9.102) and (9.108), one uses the $1 / r_{12}$ expansion given in Prob. 9.14. The results are

\(
\begin{gather}
J_{1 s 2 s}=\left(\frac{17}{81}\right) \frac{Z e^{2}}{4 \pi \varepsilon_{0} a_{0}}=11.42 \mathrm{eV}, \quad J_{1 s 2 p}=\left(\frac{59}{243}\right) \frac{Z e^{2}}{4 \pi \varepsilon_{0} a_{0}}=13.21 \mathrm{eV} \\
K_{1 s 2 s}=\left(\frac{16}{729}\right) \frac{Z e^{2}}{4 \pi \varepsilon_{0} a_{0}}=1.19 \mathrm{eV}, \quad K_{1 s 2 p}=\left(\frac{112}{6561}\right) \frac{Z e^{2}}{4 \pi \varepsilon_{0} a_{0}}=0.93 \mathrm{eV} \tag{9.111}
\end{gather}
\)

where we used $Z=2$ and $e^{2} / 8 \pi \varepsilon_{0} a_{0}=13.606 \mathrm{eV}$. Recalling that $E^{(0)}=-68.03 \mathrm{eV}$ [Eq. (9.92)], we get (Fig. 9.3)

FIGURE 9.3 The first excited levels of the helium atom.

\(
\begin{gathered}
E^{(0)}+E_{1}^{(1)}=E^{(0)}+J_{1 s 2 s}-K_{1 s 2 s}=-57.8 \mathrm{eV} \\
E^{(0)}+E_{2}^{(1)}=E^{(0)}+J_{1 s 2 s}+K_{1 s 2 s}=-55.4 \mathrm{eV} \\
E^{(0)}+E_{3}^{(1)}=E^{(0)}+J_{1 s 2 p}-K_{1 s 2 p}=-55.7_{5} \mathrm{eV} \\
E^{(0)}+E_{4}^{(1)}=E^{(0)}+J_{1 s 2 p}+K_{1 s 2 p}=-53.9 \mathrm{eV}
\end{gathered}
\)
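
The arithmetic behind these four numbers is easily verified (a sketch using the values from Eqs. (9.92) and (9.111)):

```python
E0 = -68.03                       # eV, Eq. (9.92)
J_1s2s, K_1s2s = 11.42, 1.19      # eV, Eq. (9.111)
J_1s2p, K_1s2p = 13.21, 0.93      # eV, Eq. (9.111)

print(E0 + J_1s2s - K_1s2s)   # -57.8  eV
print(E0 + J_1s2s + K_1s2s)   # -55.4  eV
print(E0 + J_1s2p - K_1s2p)   # -55.75 eV
print(E0 + J_1s2p + K_1s2p)   # -53.9  eV
```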

The first-order energy corrections seem to indicate that the lower of the two levels of the $1 s 2 p$ configuration lies below the higher of the two levels of the $1 s 2 s$ configuration. Study of the helium spectrum reveals that this is not so. The error is due to neglect of the higher-order perturbation-energy corrections.

Using the variation-perturbation method (Section 9.2), Knight and Scherr calculated the second- and third-order corrections $E^{(2)}$ and $E^{(3)}$ for these four excited levels. [R. E. Knight and C. W. Scherr, Rev. Mod. Phys., 35, 431 (1963); for energy corrections through 17th order, see F. C. Sanders and C. W. Scherr, Phys. Rev., 181, 84 (1969).] Figure 9.4 shows their results (which are within 0.1 eV of the experimental energies). Figure 9.4 shows that Fig. 9.3 is quite inaccurate. Since the perturbation $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ is not really very small, a perturbation treatment that includes only the $E^{(1)}$ correction does not give accurate results.

The first-order correction to the wave function, $\psi^{(1)}$, will include contributions from other configurations (configuration interaction). When we say that a level belongs to the configuration $1 s 2 s$, we are indicating the configuration that makes the largest contribution to the true wave function.

We started with the eight degenerate zeroth-order functions (9.93). These functions have three kinds of degeneracy. There is the degeneracy between hydrogenlike functions with the same $n$, but different $l$; the $2 s$ and the $2 p$ functions have the same energy. There is the degeneracy between hydrogenlike functions with the same $n$ and $l$, but different $m$; the $2 p_{1}, 2 p_{0}$, and $2 p_{-1}$ functions have the same energy. (For convenience we used the real functions $2 p_{x, y, z}$, but we could have started with the functions $2 p_{1,0,-1}$.) Finally, there is the degeneracy between functions that differ only in the interchange of the two electrons between the orbitals; the functions $\psi_{1}^{(0)}=1 s(1) 2 s(2)$ and $\psi_{2}^{(0)}=1 s(2) 2 s(1)$ have the same energy. This last kind of degeneracy is called exchange degeneracy. When the interelectronic repulsion $e^{2} / 4 \pi \varepsilon_{0} r_{12}$ was introduced as a perturbation, the exchange degeneracy and the degeneracy associated with the quantum number $l$ were removed. The degeneracy associated with $m$ remained, however; each $1 s 2 p$ helium level is triply degenerate, and we could just as well have used the $2 p_{1}, 2 p_{0}$, and $2 p_{-1}$ orbitals instead of the real orbitals in constructing the correct zeroth-order wave functions. Let us consider the reasons for the removal of the $l$ degeneracy and the exchange degeneracy.

The interelectronic repulsion in helium makes the $2 s$ orbital energy less than the $2 p$ energy. Figures 6.9 and 6.8 show that a $2 s$ electron has a greater probability than a $2 p$ electron of being closer to the nucleus than the $1 s$ electron(s). A $2 s$ electron will not be as effectively shielded from the nucleus by the $1 s$ electrons and will therefore have a lower energy than a $2 p$ electron. [According to Eq. (6.94), the greater the nuclear charge, the lower the energy.] Mathematically, the difference between the $1 s 2 s$ and the $1 s 2 p$ energies results from the Coulomb integral $J_{1 s 2 s}$ being smaller than $J_{1 s 2 p}$. These Coulomb integrals represent the electrostatic repulsion between the appropriate charge distributions. When the $2 s$ electron penetrates the charge distribution of the $1 s$ electron, it feels a repulsion from only the unpenetrated part of the $1 s$ charge distribution. Hence the $1 s-2 s$ electrostatic repulsion is less than the $1 s-2 p$ repulsion, and the $1 s 2 s$ levels lie below the $1 s 2 p$ levels. The interelectronic repulsion in many-electron atoms lifts the $l$ degeneracy, and the orbital energies for the same value of $n$ increase with increasing $l$.

FIGURE 9.4 $E^{(0)}+E^{(1)}+E^{(2)}+E^{(3)}$ for the first excited levels of helium: $-57.8$ and $-58.1$ eV for the two $1 s 2 p$ levels; $-58.4$ eV for $2^{-1 / 2}[1 s(1) 2 s(2)+2 s(1) 1 s(2)]$ and $-59.2$ eV for $2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)]$, the correct zeroth-order wave functions for the two $1 s 2 s$ levels.

Now consider the removal of the exchange degeneracy. The functions (9.93) with which we began the perturbation treatment have each electron assigned to a definite orbital. For example, the function $\psi_{1}^{(0)}=1 s(1) 2 s(2)$ has electron 1 in the $1 s$ orbital and electron 2 in the $2 s$ orbital. For $\psi_{2}^{(0)}$ the opposite is true. The secular determinant was not diagonal, so the initial functions were not the correct zeroth-order wave functions. The correct zeroth-order functions do not assign each electron to a definite orbital. Thus the first two correct zeroth-order functions are

\(
\phi_{1}^{(0)}=2^{-1 / 2}[1 s(1) 2 s(2)-1 s(2) 2 s(1)], \quad \phi_{2}^{(0)}=2^{-1 / 2}[1 s(1) 2 s(2)+1 s(2) 2 s(1)]
\)

We cannot say which orbital electron 1 is in for either $\phi_{1}^{(0)}$ or $\phi_{2}^{(0)}$. This property of the wave functions of systems containing more than one electron results from the indistinguishability of identical particles in quantum mechanics and will be discussed further in Chapter 10. Since the functions $\phi_{1}^{(0)}$ and $\phi_{2}^{(0)}$ have different energies, the exchange degeneracy is removed when the correct zeroth-order functions are used.


In spectroscopy, we start with a system in some stationary state, expose it to electromagnetic radiation (light), and then observe whether the system has made a transition to another stationary state. The radiation produces a time-dependent potential-energy term in the Hamiltonian, so we must use the time-dependent Schrödinger equation. The most convenient approach here is an approximate one called time-dependent perturbation theory.

Let the system (atom or molecule) have the time-independent Hamiltonian $\hat{H}^{0}$ in the absence of the radiation (or other time-dependent perturbation), and let $\hat{H}^{\prime}(t)$ be the time-dependent perturbation. The time-independent Schrödinger equation for the unperturbed problem is

\(
\begin{equation}
\hat{H}^{0} \psi_{k}^{0}=E_{k}^{0} \psi_{k}^{0} \tag{9.112}
\end{equation}
\)

where $E_{k}^{0}$ and $\psi_{k}^{0}$ are the stationary-state energies and wave functions. The time-dependent Schrödinger equation (7.97) in the presence of the radiation is

\(
\begin{equation}
-\frac{\hbar}{i} \frac{\partial \Psi}{\partial t}=\left(\hat{H}^{0}+\hat{H}^{\prime}\right) \Psi \tag{9.113}
\end{equation}
\)

where the state function $\Psi$ depends on the spatial and spin coordinates (symbolized by $q$ ) and on the time: $\Psi=\Psi(q, t)$. (See Chapter 10 for a discussion of spin coordinates.)

First suppose that $\hat{H}^{\prime}(t)$ is absent. The unperturbed time-dependent Schrödinger equation is

\(
\begin{equation}
-(\hbar / i) \partial \Psi^{0} / \partial t=\hat{H}^{0} \Psi^{0} \tag{9.114}
\end{equation}
\)

The system's possible stationary-state state functions are given by (7.99) as $\Psi_{k}^{0}=\exp \left(-i E_{k}^{0} t / \hbar\right) \psi_{k}^{0}$, where the $\psi_{k}^{0}$ functions are the eigenfunctions of $\hat{H}^{0}$ [Eq. (9.112)]. Each $\Psi_{k}^{0}$ is a solution of (9.114). Moreover, the linear combination

\(
\begin{equation}
\Psi^{0}=\sum_{k} c_{k} \Psi_{k}^{0}=\sum_{k} c_{k} \exp \left(-i E_{k}^{0} t / \hbar\right) \psi_{k}^{0} \tag{9.115}
\end{equation}
\)

with the $c_{k}$'s being arbitrary time-independent constants, is a solution of the time-dependent Schrödinger equation (9.114), as proved in the discussion leading to Eq. (7.100). The functions $\Psi_{k}^{0}$ form a complete set (since they are the eigenfunctions of the Hermitian operator $\hat{H}^{0}$), so any solution of (9.114) can be expressed in the form (9.115). Hence (9.115) is the general solution of the time-dependent Schrödinger equation (9.114), where $\hat{H}^{0}$ is independent of time.

Now suppose that $\hat{H}^{\prime}(t)$ is present. The function (9.115) is no longer a solution of the time-dependent Schrödinger equation. However, because the unperturbed functions $\Psi_{k}^{0}$ form a complete set, the true state function $\Psi$ can at any instant of time be expanded as a linear combination of the $\Psi_{k}^{0}$ functions according to $\Psi=\sum_{k} b_{k} \Psi_{k}^{0}$. Because $\hat{H}$ is time-dependent, $\Psi$ will change with time and the expansion coefficients $b_{k}$ will change with time. Therefore,

\(
\begin{equation}
\Psi=\sum_{k} b_{k}(t) \exp \left(-i E_{k}^{0} t / \hbar\right) \psi_{k}^{0} \tag{9.116}
\end{equation}
\)

In the limit $\hat{H}^{\prime}(t) \rightarrow 0$, the expansion (9.116) reduces to (9.115).
Substitution of (9.116) into the time-dependent Schrödinger equation (9.113) and use of (9.112) gives

\(
\begin{aligned}
-\frac{\hbar}{i} \sum_{k} \frac{d b_{k}}{d t} \exp \left(-i E_{k}^{0} t / \hbar\right) \psi_{k}^{0}+ & \sum_{k} E_{k}^{0} b_{k} \exp \left(-i E_{k}^{0} t / \hbar\right) \psi_{k}^{0} \\
& =\sum_{k} b_{k} \exp \left(-i E_{k}^{0} t / \hbar\right) E_{k}^{0} \psi_{k}^{0}+\sum_{k} b_{k} \exp \left(-i E_{k}^{0} t / \hbar\right) \hat{H}^{\prime} \psi_{k}^{0} \\
-\frac{\hbar}{i} \sum_{k} \frac{d b_{k}}{d t} & \exp \left(-i E_{k}^{0} t / \hbar\right) \psi_{k}^{0}=\sum_{k} b_{k} \exp \left(-i E_{k}^{0} t / \hbar\right) \hat{H}^{\prime} \psi_{k}^{0}
\end{aligned}
\)

We now multiply by $\psi_{m}^{0 *}$ and integrate over the spatial and spin coordinates. Using the orthonormality equation $\left\langle\psi_{m}^{0} \mid \psi_{k}^{0}\right\rangle=\delta_{m k}$, we get

\(
-\frac{\hbar}{i} \sum_{k} \frac{d b_{k}}{d t} \exp \left(-i E_{k}^{0} t / \hbar\right) \delta_{m k}=\sum_{k} b_{k} \exp \left(-i E_{k}^{0} t / \hbar\right)\left\langle\psi_{m}^{0}\right| \hat{H}^{\prime}\left|\psi_{k}^{0}\right\rangle
\)

Because of the $\delta_{m k}$ factor, all terms but one in the sum on the left are zero, and the left side equals $-(\hbar / i)\left(d b_{m} / d t\right) \exp \left(-i E_{m}^{0} t / \hbar\right)$. We get

\(
\begin{equation}
\frac{d b_{m}}{d t}=-\frac{i}{\hbar} \sum_{k} b_{k} \exp \left[i\left(E_{m}^{0}-E_{k}^{0}\right) t / \hbar\right]\left\langle\psi_{m}^{0}\right| \hat{H}^{\prime}\left|\psi_{k}^{0}\right\rangle \tag{9.117}
\end{equation}
\)

Let us suppose that the perturbation $\hat{H}^{\prime}(t)$ was applied at time $t=0$ and that before the perturbation was applied the system was in stationary state $n$ with energy $E_{n}^{0}$. The state function at $t=0$ is therefore $\Psi=\exp \left(-i E_{n}^{0} t / \hbar\right) \psi_{n}^{0}$ [Eq. (7.99)], and the $t=0$ values of the expansion coefficients in (9.116) are thus $b_{n}(0)=1$ and $b_{k}(0)=0$ for $k \neq n$:

\(
\begin{equation}
b_{k}(0)=\delta_{k n} \tag{9.118}
\end{equation}
\)

We shall assume that the perturbation $\hat{H}^{\prime}$ is small and acts for only a short time. Under these conditions, the change in the expansion coefficients $b_{k}$ from their initial values at the time the perturbation is applied will be small. To a good approximation, we can replace the expansion coefficients on the right side of (9.117) by their initial values (9.118). This gives

\(
\begin{equation}
\frac{d b_{m}}{d t} \approx-\frac{i}{\hbar} \exp \left[i\left(E_{m}^{0}-E_{n}^{0}\right) t / \hbar\right]\left\langle\psi_{m}^{0}\right| \hat{H}^{\prime}\left|\psi_{n}^{0}\right\rangle \tag{9.119}
\end{equation}
\)

Let the perturbation $\hat{H}^{\prime}$ act from $t=0$ to $t=t^{\prime}$. Integrating from $t=0$ to $t^{\prime}$ and using (9.118), we get

\(
\begin{equation}
b_{m}\left(t^{\prime}\right) \approx \delta_{m n}-\frac{i}{\hbar} \int_{0}^{t^{\prime}} \exp \left[i\left(E_{m}^{0}-E_{n}^{0}\right) t / \hbar\right]\left\langle\psi_{m}^{0}\right| \hat{H}^{\prime}\left|\psi_{n}^{0}\right\rangle d t \tag{9.120}
\end{equation}
\)

Use of the approximate result (9.120) for the expansion coefficients in (9.116) gives the desired approximation to the state function at time $t^{\prime}$ for the case that the time-dependent perturbation $\hat{H}^{\prime}$ is applied at $t=0$ to a system in stationary state $n$. [As with time-independent perturbation theory, one can go to higher-order approximations (see Fong, pp. 234-244).]

For times after $t^{\prime}$, the perturbation has ceased to act, and $\hat{H}^{\prime}=0$. Equation (9.117) gives $d b_{m} / d t=0$ for $t>t^{\prime}$, so $b_{m}=b_{m}\left(t^{\prime}\right)$ for $t \geq t^{\prime}$. Therefore, for times after exposure to the perturbation, the state function $\Psi$ is [Eq. (9.116)]

\(
\begin{equation}
\Psi=\sum_{m} b_{m}\left(t^{\prime}\right) \exp \left(-i E_{m}^{0} t / \hbar\right) \psi_{m}^{0} \quad \text { for } t \geq t^{\prime} \tag{9.121}
\end{equation}
\)

where $b_{m}\left(t^{\prime}\right)$ is given by (9.120). In (9.121), $\Psi$ is a superposition of the eigenfunctions $\psi_{m}^{0}$ of the energy operator $\hat{H}^{0}$, the expansion coefficients being $b_{m} \exp \left(-i E_{m}^{0} t / \hbar\right)$. [Compare (9.121) and (7.66).] The work of Section 7.6 tells us that a measurement of the system's energy at a time after $t^{\prime}$ will give one of the eigenvalues $E_{m}^{0}$ of the energy operator $\hat{H}^{0}$, and the probability of getting $E_{m}^{0}$ equals the square of the absolute value of the expansion coefficient that multiplies $\psi_{m}^{0}$; that is, it equals $\left|b_{m}\left(t^{\prime}\right) \exp \left(-i E_{m}^{0} t / \hbar\right)\right|^{2}=\left|b_{m}\left(t^{\prime}\right)\right|^{2}$.

The time-dependent perturbation changes the system's state function from $\exp \left(-i E_{n}^{0} t / \hbar\right) \psi_{n}^{0}$ to the superposition (9.121). Measurement of the energy then changes $\Psi$ to one of the energy eigenfunctions $\exp \left(-i E_{m}^{0} t / \hbar\right) \psi_{m}^{0}$ (reduction of the wave function, Section 7.9). The net result is a transition from stationary state $n$ to stationary state $m$, the probability of such a transition being $\left|b_{m}\left(t^{\prime}\right)\right|^{2}$.
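
To make Eq. (9.120) concrete, here is a minimal numeric sketch (not from the text) for a hypothetical two-level system with a constant matrix element $\left\langle\psi_{m}^{0}\right| \hat{H}^{\prime}\left|\psi_{n}^{0}\right\rangle$ switched on during $0 \leq t \leq t^{\prime}$; all numbers are illustrative, in units with $\hbar=1$:

```python
import numpy as np

hbar = 1.0
H_mn = 0.05        # <psi_m|H'|psi_n>, assumed small and time-independent
omega_mn = 2.0     # (E_m^0 - E_n^0)/hbar
t_prime = 5.0

t = np.linspace(0.0, t_prime, 2001)
b_m = -(1j / hbar) * np.trapz(H_mn * np.exp(1j * omega_mn * t), t)  # Eq. (9.120), m != n

P = abs(b_m)**2    # transition probability |b_m(t')|^2
P_exact = (2 * H_mn / (hbar * omega_mn))**2 * np.sin(omega_mn * t_prime / 2)**2
print(P, P_exact)  # the numerical and closed-form values agree (about 0.0023)
```

The smallness of $P$ here is consistent with the assumption that the coefficients change little from their initial values.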


We now consider the interaction of an atom or molecule with electromagnetic radiation. A proper quantum-mechanical approach would treat both the atom and the radiation quantum mechanically, but we shall simplify things by using the classical picture of the light as an electromagnetic wave of oscillating electric and magnetic fields.

A detailed investigation, which we omit, shows that usually the interaction between the radiation's magnetic field and the atom's charges is much weaker than the interaction between the radiation's electric field and the charges, so we shall consider only the latter interaction. (In NMR spectroscopy the important interaction is between the magnetic dipole moments of the nuclei and the radiation's magnetic field. We shall not consider this case.)

Let the electric field $\mathscr{E}$ of the electromagnetic wave point in the $x$ direction only. (This is plane-polarized radiation.) The electric field is defined as the force per unit charge, so the force on charge $Q_{i}$ is $F=Q_{i} \mathscr{E}_{x}=-d V / d x$, where (4.24) was used. Integration gives the potential energy of interaction between the radiation's electric field and the charge as $V=-Q_{i} \mathscr{E}_{x} x$, where the arbitrary integration constant was taken as zero. For a system of several charges, $V=-\sum_{i} Q_{i} x_{i} \mathscr{E}_{x}$. This is the time-dependent perturbation $\hat{H}^{\prime}(t)$. The space and time dependence of the electric field of an electromagnetic wave traveling in the $z$ direction with wavelength $\lambda$ and frequency $\nu$ is given by (see a first-year physics text) $\mathscr{E}_{x}=\mathscr{E}_{0} \sin (2 \pi \nu t-2 \pi z / \lambda)$, where $\mathscr{E}_{0}$ is the maximum value of $\mathscr{E}_{x}$ (the amplitude). Therefore,

\(
\hat{H}^{\prime}(t)=-\mathscr{E}_{0} \sum_{i} Q_{i} x_{i} \sin \left(2 \pi \nu t-2 \pi z_{i} / \lambda\right)
\)

where the sum goes over all the electrons and nuclei of the atom or molecule.

Defining $\omega$ and $\omega_{m n}$ as

\(
\begin{equation}
\omega \equiv 2 \pi \nu, \quad \omega_{m n} \equiv\left(E_{m}^{0}-E_{n}^{0}\right) / \hbar \tag{9.122}
\end{equation}
\)

and substituting $\hat{H}^{\prime}(t)$ into (9.120), we get the coefficients in the expansion (9.116) of the state function $\Psi$ as

\(
b_{m} \approx \delta_{m n}+\frac{i \mathscr{E}_{0}}{\hbar} \int_{0}^{t^{\prime}} \exp \left(i \omega_{m n} t\right)\left\langle\psi_{m}^{0}\right| \sum_{i} Q_{i} x_{i} \sin \left(\omega t-2 \pi z_{i} / \lambda\right)\left|\psi_{n}^{0}\right\rangle d t
\)

The integral $\left\langle\psi_{m}^{0}\right| \sum_{i} \cdots\left|\psi_{n}^{0}\right\rangle$ in this equation is over all space, but significant contributions to its magnitude come only from regions where $\psi_{m}^{0}$ and $\psi_{n}^{0}$ are of significant magnitude. In regions well outside the atom or molecule, $\psi_{m}^{0}$ and $\psi_{n}^{0}$ are vanishingly small, and such regions can be ignored. Let the coordinate origin be chosen within the atom or molecule. Since regions well outside the atom can be ignored, the coordinate $z_{i}$ can be considered to have a maximum magnitude of the order of one nm. For ultraviolet light, the wavelength $\lambda$ is on the order of $10^{2} \mathrm{~nm}$. For visible, infrared, microwave, and radiofrequency radiation, $\lambda$ is even larger. Hence $2 \pi z_{i} / \lambda$ is very small and can be neglected, and this leaves $\sum_{i} Q_{i} x_{i} \sin \omega t$ in the integral.

Use of the identity (Prob. 1.29) $\sin \omega t=\left(e^{i \omega t}-e^{-i \omega t}\right) / 2 i$ gives

\(
b_{m}\left(t^{\prime}\right) \approx \delta_{m n}+\frac{\mathscr{E}_{0}}{2 \hbar}\left\langle\psi_{m}^{0}\right| \sum_{i} Q_{i} x_{i}\left|\psi_{n}^{0}\right\rangle \int_{0}^{t^{\prime}}\left[e^{i\left(\omega_{m n}+\omega\right) t}-e^{i\left(\omega_{m n}-\omega\right) t}\right] d t
\)

Using $\int_{0}^{t^{\prime}} e^{a t} d t=a^{-1}\left(e^{a t^{\prime}}-1\right)$, we get

\(
\begin{equation}
b_{m}\left(t^{\prime}\right) \approx \delta_{m n}+\frac{\mathscr{E}_{0}}{2 \hbar i}\left\langle\psi_{m}^{0}\right| \sum_{i} Q_{i} x_{i}\left|\psi_{n}^{0}\right\rangle\left[\frac{e^{i\left(\omega_{m n}+\omega\right) t^{\prime}}-1}{\omega_{m n}+\omega}-\frac{e^{i\left(\omega_{m n}-\omega\right) t^{\prime}}-1}{\omega_{m n}-\omega}\right] \tag{9.123}
\end{equation}
\)

For $m \neq n$, the $\delta_{m n}$ term equals zero.
As noted at the end of Section 9.8, $\left|b_{m}\left(t^{\prime}\right)\right|^{2}$ gives the probability of a transition to state $m$ from state $n$. There are two cases where this probability becomes of significant magnitude. If $\omega_{m n}=\omega$, the denominator of the second fraction in brackets is zero and this fraction's absolute value is large (but not infinite; see Prob. 9.27). If $\omega_{m n}=-\omega$, the first fraction has a zero denominator and a large absolute value.
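
A short sketch (illustrative numbers only) of this resonance behavior: the squared modulus of the second bracketed fraction in (9.123) equals $4 \sin ^{2}\left[\left(\omega_{m n}-\omega\right) t^{\prime} / 2\right] /\left(\omega_{m n}-\omega\right)^{2}$, which peaks sharply, but finitely, at $\omega=\omega_{m n}$:

```python
import numpy as np

omega_mn = 10.0
t_prime = 50.0
omega = np.linspace(8.0, 12.0, 2001)
x = omega_mn - omega

# np.sinc(y) = sin(pi y)/(pi y), so this equals 4 sin^2(x t'/2)/x^2 and is
# finite at x = 0:
factor = (t_prime * np.sinc(x * t_prime / (2 * np.pi)))**2

print(omega[np.argmax(factor)])   # peak at omega = omega_mn = 10
print(factor.max())               # peak value t'^2 = 2500: large, not infinite
```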

For $\omega_{m n}=\omega$, Eq. (9.122) gives $E_{m}^{0}-E_{n}^{0}=h \nu$. Exposure of the atom to radiation of frequency $\nu$ has produced a transition from stationary state $n$ to stationary state $m$, where (since $\nu$ is positive) $E_{m}^{0}>E_{n}^{0}$. We might suppose that the energy for this transition came from the system's absorption of a photon of energy $h \nu$. This supposition is confirmed by a fully quantum-mechanical treatment (called quantum field theory) in which the radiation is treated quantum mechanically rather than classically. We have absorption of radiation with a consequent increase in the system's energy.

For $\omega_{m n}=-\omega$, we get $E_{n}^{0}-E_{m}^{0}=h \nu$. Exposure to radiation of frequency $\nu$ has induced a transition from stationary state $n$ to stationary state $m$, where (since $\nu$ is positive) $E_{n}^{0}>E_{m}^{0}$. The system has gone to a lower energy level, and a quantum-field-theory treatment shows that a photon of energy $h \nu$ is emitted in this process. This is stimulated emission of radiation. Stimulated emission occurs in lasers.

A defect of our treatment is that it does not predict spontaneous emission, the emission of a photon by a system not exposed to radiation, the system falling to a lower energy level in the process. Quantum field theory does predict spontaneous emission.

Note from (9.123) that the probability of absorption is proportional to $\left|\left\langle\psi_{m}^{0}\right| \sum_{i} Q_{i} x_{i}\left|\psi_{n}^{0}\right\rangle\right|^{2}$. The quantity $\sum_{i} Q_{i} x_{i}$ is the $x$ component of the system's electric-dipole-moment operator $\hat{\boldsymbol{\mu}}$ (see Section 14.2 for details), which is [Eqs. (14.14) and (14.15)]
$\hat{\boldsymbol{\mu}}=\mathbf{i} \sum_{i} Q_{i} x_{i}+\mathbf{j} \sum_{i} Q_{i} y_{i}+\mathbf{k} \sum_{i} Q_{i} z_{i}=\mathbf{i} \hat{\mu}_{x}+\mathbf{j} \hat{\mu}_{y}+\mathbf{k} \hat{\mu}_{z}$, where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are unit vectors along the axes and $\hat{\mu}_{x}, \hat{\mu}_{y}, \hat{\mu}_{z}$ are the components of $\hat{\boldsymbol{\mu}}$. We assumed polarized radiation with an electric field in the $x$ direction only. If the radiation has electric-field components in the $y$ and $z$ directions also, then the probability of absorption will be proportional to

\(
\left|\left\langle\psi_{m}^{0}\right| \hat{\mu}_{x}\left|\psi_{n}^{0}\right\rangle\right|^{2}+\left|\left\langle\psi_{m}^{0}\right| \hat{\mu}_{y}\left|\psi_{n}^{0}\right\rangle\right|^{2}+\left|\left\langle\psi_{m}^{0}\right| \hat{\mu}_{z}\left|\psi_{n}^{0}\right\rangle\right|^{2}=\left|\left\langle\psi_{m}^{0}\right| \hat{\boldsymbol{\mu}}\left|\psi_{n}^{0}\right\rangle\right|^{2}
\)

where Eq. (5.25) was used. The integral $\left\langle\psi_{m}^{0}\right| \hat{\boldsymbol{\mu}}\left|\psi_{n}^{0}\right\rangle=\boldsymbol{\mu}_{m n}$ is the transition (dipole) moment.

When $\boldsymbol{\mu}_{m n}=0$, the transition between states $m$ and $n$ with absorption or emission of radiation is said to be forbidden. Allowed transitions have $\boldsymbol{\mu}_{m n} \neq 0$. Because of approximations made in the derivation of (9.123), forbidden transitions may have some small probability of occurring.

Consider, for example, the particle in a one-dimensional box (Section 2.2). The transition dipole moment is $\left\langle\psi_{m}^{0}\right| Q x\left|\psi_{n}^{0}\right\rangle$, where $Q$ is the particle's charge and $x$ is its coordinate and where $\psi_{m}^{0}=(2 / l)^{1 / 2} \sin (m \pi x / l)$ and $\psi_{n}^{0}=(2 / l)^{1 / 2} \sin (n \pi x / l)$. Evaluation of this integral (Prob. 9.28) shows it is nonzero only when $m-n= \pm 1, \pm 3, \pm 5, \ldots$ and is zero when $m-n=0, \pm 2, \pm 4, \ldots$ The selection rule for a charged particle in a one-dimensional box is that the quantum number must change by an odd integer when radiation is absorbed or emitted.
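
This selection rule is easy to check symbolically. The following sketch (using the sympy library; not part of the original text, and with the charge $Q$ factored out) evaluates $\left\langle\psi_{m}^{0}\right| x\left|\psi_{n}^{0}\right\rangle$ for a few pairs of box states:

```python
import sympy as sp

x, l = sp.symbols('x l', positive=True)

def transition_moment(m, n):
    """<psi_m | x | psi_n> for a particle in a box of length l (charge Q factored out)."""
    psi_m = sp.sqrt(2/l) * sp.sin(m*sp.pi*x/l)
    psi_n = sp.sqrt(2/l) * sp.sin(n*sp.pi*x/l)
    return sp.simplify(sp.integrate(psi_m * x * psi_n, (x, 0, l)))

for m, n in [(2, 1), (3, 1), (4, 1), (3, 2), (5, 2)]:
    print(f"m={m}, n={n}:", transition_moment(m, n))
# Nonzero only for odd m - n (e.g. -16*l/(9*pi**2) for m=2, n=1); zero for even m - n.
```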

Evaluation of the transition moment for the harmonic oscillator and for the two-particle rigid rotor gives the selection rules $\Delta v= \pm 1$ and $\Delta J= \pm 1$ stated in Sections 4.3 and 6.4.

The quantity $\left|b_{m}\right|^{2}$ in (9.123) is sharply peaked at $\omega=\omega_{m n}$ and $\omega=-\omega_{m n}$, but there is a nonzero probability that a transition will occur when $\omega$ is not precisely equal to $\left|\omega_{m n}\right|$, that is, when $h \nu$ is not precisely equal to $\left|E_{m}^{0}-E_{n}^{0}\right|$. This fact is related to the energy-time uncertainty relation (5.15). States with a finite lifetime have an uncertainty in their energy.
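
The sharp peaking can be seen numerically. Near resonance the dominant factor in (9.123) is $\left|\left(e^{i\left(\omega_{m n}-\omega\right) t^{\prime}}-1\right) /\left(\omega_{m n}-\omega\right)\right|^{2}=4 \sin ^{2}\left[\left(\omega_{m n}-\omega\right) t^{\prime} / 2\right] /\left(\omega_{m n}-\omega\right)^{2}$, which tends to the finite value $t^{\prime 2}$ as $\omega \rightarrow \omega_{m n}$ (cf. Prob. 9.27). A short Python sketch (illustrative values only, not from the text) tabulates this factor:

```python
import numpy as np

# Near-resonance factor of |b_m|^2 from Eq. (9.123):
#   |(exp(i(w_mn - w)t') - 1)/(w_mn - w)|^2 = 4 sin^2((w_mn - w)t'/2) / (w_mn - w)^2
w_mn, t_prime = 1.0, 200.0          # arbitrary illustrative values
for w in (0.90, 0.99, 1.00, 1.01, 1.10):
    d = w_mn - w
    factor = t_prime**2 if d == 0 else 4*np.sin(d*t_prime/2)**2 / d**2
    print(f"w = {w:.2f}:  factor = {factor:12.1f}")   # large only near w = w_mn
```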

Radiation is not the only time-dependent perturbation that produces transitions between states. When an atom or molecule comes close to another atom or molecule, it suffers a time-dependent perturbation that can change its state. Selection rules derived for radiative transitions need not apply to collision processes, since $\hat{H}^{\prime}(t)$ differs for the two processes.


Electron Spin and the Spin-Statistics Theorem

Click the keywords below to know more about it.

Electron Spin: A fundamental property of electrons that gives rise to their intrinsic angular momentum. It is not a classical effect and cannot be visualized as a physical rotation

Spin Angular Momentum: The intrinsic angular momentum of a particle, distinct from orbital angular momentum. It is represented by the operators $\hat{S}_{x}$, $\hat{S}_{y}$, $\hat{S}_{z}$, and $\hat{S}^{2}$

Spin Quantum Number ($s$): A quantum number that describes the intrinsic spin of a particle. For electrons, $s=\frac{1}{2}$

Spin Eigenfunctions: Functions that describe the spin state of a particle. For electrons, the eigenfunctions are denoted by $\alpha$ and $\beta$, corresponding to spin up and spin down states

Spin-Statistics Theorem: A fundamental principle in quantum mechanics stating that particles with half-integer spin (fermions) must have antisymmetric wave functions, while particles with integer spin (bosons) must have symmetric wave functions

Fermions: Particles with half-integer spin that obey the Pauli exclusion principle, meaning no two fermions can occupy the same quantum state

Bosons: Particles with integer spin that do not obey the Pauli exclusion principle and can occupy the same quantum state

Pauli Exclusion Principle: A principle stating that no two electrons can occupy the same spin-orbital in an atom, a consequence of the antisymmetry requirement for fermions

Spin Magnetic Moment: The magnetic moment associated with the spin of a particle. For electrons, it is proportional to the spin angular momentum

Nuclear Magnetic Resonance (NMR): A spectroscopic technique that observes transitions between nuclear spin energy levels in an applied magnetic field

Spin-Spin Coupling: An interaction between the spins of adjacent nuclei that affects the magnetic field experienced by each nucleus, leading to splitting of NMR lines

Slater Determinant: A mathematical expression used to construct antisymmetric wave functions for a system of electrons, ensuring that the wave function changes sign upon interchange of any two electrons

Ladder Operators: Operators used in quantum mechanics to raise or lower the eigenvalue of the spin angular momentum component $\hat{S}_{z}$. They are denoted by $\hat{S}_{+}$ and $\hat{S}_{-}$

All chemists are familiar with the yellow color imparted to a flame by sodium atoms. The strongest yellow line (the D line) in the sodium spectrum is actually two closely spaced lines. The sodium D line arises from a transition from the excited configuration $1 s^{2} 2 s^{2} 2 p^{6} 3 p$ to the ground state. The doublet nature of this and other lines in the Na spectrum indicates a doubling of the expected number of states available to the valence electron.

To explain this fine structure of atomic spectra, Uhlenbeck and Goudsmit proposed in 1925 that the electron has an intrinsic (built-in) angular momentum in addition to the orbital angular momentum due to its motion about the nucleus. If we picture the electron as a sphere of charge spinning about one of its diameters, we can see how such an intrinsic angular momentum can arise. Hence we have the term spin angular momentum or, more simply, spin. However, electron "spin" is not a classical effect, and the picture of an electron rotating about an axis has no physical reality. The intrinsic angular momentum is real, but no easily visualizable model can properly explain its origin. We cannot hope to understand microscopic particles based on models taken from our experience in the macroscopic world. Other elementary particles besides the electron have spin angular momentum.

In 1928, Dirac developed the relativistic quantum mechanics of an electron, and in his treatment electron spin arises naturally.

In the nonrelativistic quantum mechanics to which we are confining ourselves, electron spin must be introduced as an additional hypothesis. We have learned that each physical property has its corresponding linear Hermitian operator in quantum mechanics. For such properties as orbital angular momentum, we can construct the quantum-mechanical operator from the classical expression by replacing $p_{x}, p_{y}, p_{z}$ by the appropriate operators. The inherent spin angular momentum of a microscopic particle has no analog in classical mechanics, so we cannot use this method to construct operators for spin. For our purposes, we shall simply use symbols for the spin operators, without giving an explicit form for them.

Analogous to the orbital angular-momentum operators $\hat{L}^{2}, \hat{L}_{x}, \hat{L}_{y}, \hat{L}_{z}$, we have the spin angular-momentum operators $\hat{S}^{2}, \hat{S}_{x}, \hat{S}_{y}, \hat{S}_{z}$, which are postulated to be linear and Hermitian. $\hat{S}^{2}$ is the operator for the square of the magnitude of the total spin angular
momentum of a particle. $\hat{S}_{z}$ is the operator for the $z$ component of the particle's spin angular momentum. We have

\(
\begin{equation}
\hat{S}^{2}=\hat{S}_{x}^{2}+\hat{S}_{y}^{2}+\hat{S}_{z}^{2} \tag{10.1}
\end{equation}
\)

We postulate that the spin angular-momentum operators obey the same commutation relations as the orbital angular-momentum operators. Analogous to $\left[\hat{L}_{x}, \hat{L}_{y}\right]=i \hbar \hat{L}_{z},\left[\hat{L}_{y}, \hat{L}_{z}\right]=i \hbar \hat{L}_{x},\left[\hat{L}_{z}, \hat{L}_{x}\right]=i \hbar \hat{L}_{y}$ [Eqs. (5.46) and (5.48)], we have

\(
\begin{equation}
\left[\hat{S}_{x}, \hat{S}_{y}\right]=i \hbar \hat{S}_{z}, \quad\left[\hat{S}_{y}, \hat{S}_{z}\right]=i \hbar \hat{S}_{x}, \quad\left[\hat{S}_{z}, \hat{S}_{x}\right]=i \hbar \hat{S}_{y} \tag{10.2}
\end{equation}
\)

From (10.1) and (10.2), it follows, by the same operator algebra used to obtain (5.49) and (5.50), that

\(
\begin{equation}
\left[\hat{S}^{2}, \hat{S}_{x}\right]=\left[\hat{S}^{2}, \hat{S}_{y}\right]=\left[\hat{S}^{2}, \hat{S}_{z}\right]=0 \tag{10.3}
\end{equation}
\)

Since Eqs. (10.1) and (10.2) are of the form of Eqs. (5.107) and (5.108), it follows from the work of Section 5.4 (which depended only on the commutation relations and not on the specific forms of the operators) that the eigenvalues of $\hat{S}^{2}$ are [Eq. (5.142)]

\(
\begin{equation}
s(s+1) \hbar^{2}, \quad s=0, \frac{1}{2}, 1, \frac{3}{2}, \ldots \tag{10.4}
\end{equation}
\)

and the eigenvalues of $\hat{S}_{z}$ are [Eq. (5.141)]

\(
\begin{equation}
m_{s} \hbar, \quad m_{s}=-s,-s+1, \ldots, s-1, s \tag{10.5}
\end{equation}
\)

The quantum number $s$ is called the spin of the particle. Although nothing in Section 5.4 restricts electrons to a single value for $s$, experiment shows that all electrons do have a single value for $s$, namely, $s=\frac{1}{2}$. Protons and neutrons also have $s=\frac{1}{2}$. Pions have $s=0$. Photons have $s=1$. However, Eq. (10.5) does not hold for photons. Photons travel at speed $c$ in vacuum. Because of their relativistic nature, it turns out that photons can have either $m_{s}=+1$ or $m_{s}=-1$, but not $m_{s}=0$ (see Merzbacher, Chapter 22). These two $m_{s}$ values correspond to left circularly polarized and right circularly polarized light.

With $s=\frac{1}{2}$, the magnitude of the total spin angular momentum of an electron is given by the square root of (10.4) as

\(
\begin{equation}
\left[\frac{1}{2}\left(\frac{3}{2}\right) \hbar^{2}\right]^{1 / 2}=\frac{1}{2} \sqrt{3} \hbar \tag{10.6}
\end{equation}
\)

For $s=\frac{1}{2}$, Eq. (10.5) gives the possible eigenvalues of $\hat{S}_{z}$ of an electron as $+\frac{1}{2} \hbar$ and $-\frac{1}{2} \hbar$. The electron spin eigenfunctions that correspond to these $\hat{S}_{z}$ eigenvalues are denoted by $\alpha$ and $\beta$:

\(
\begin{align}
& \hat{S}_{z} \alpha=+\frac{1}{2} \hbar \alpha \tag{10.7}\\
& \hat{S}_{z} \beta=-\frac{1}{2} \hbar \beta \tag{10.8}
\end{align}
\)

Since $\hat{S}_{z}$ commutes with $\hat{S}^{2}$, we can take the eigenfunctions of $\hat{S}_{z}$ to be eigenfunctions of $\hat{S}^{2}$ also, with the eigenvalue given by (10.4) with $s=\frac{1}{2}$:

\(
\begin{equation}
\hat{S}^{2} \alpha=\frac{3}{4} \hbar^{2} \alpha, \quad \hat{S}^{2} \beta=\frac{3}{4} \hbar^{2} \beta \tag{10.9}
\end{equation}
\)

$\hat{S}_{z}$ does not commute with $\hat{S}_{x}$ or $\hat{S}_{y}$, so $\alpha$ and $\beta$ are not eigenfunctions of these operators. The terms spin up and spin down refer to $m_{s}=+\frac{1}{2}$ and $m_{s}=-\frac{1}{2}$, respectively. See Fig. 10.1. We shall later show that the two possibilities for the quantum number $m_{s}$ give the doubling of lines in the spectra of the alkali metals.
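
Although the text deliberately gives no explicit form for the spin operators, for $s=\frac{1}{2}$ a standard concrete representation exists (not introduced in this chapter): $\hat{S}_{q}=(\hbar / 2) \sigma_{q}$ with the Pauli matrices $\sigma_{q}$, and $\alpha, \beta$ as the column vectors $(1,0)^{\mathrm{T}},(0,1)^{\mathrm{T}}$. The following Python sketch, assuming this representation, verifies Eqs. (10.1), (10.2), and (10.7)-(10.9):

```python
import numpy as np

hbar = 1.0  # work in units of hbar

# Pauli matrices; S_q = (hbar/2) sigma_q is a standard s = 1/2 representation
sx = hbar/2 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = hbar/2 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = hbar/2 * np.array([[1, 0], [0, -1]], dtype=complex)

def comm(a, b):
    return a @ b - b @ a

# Commutation relations, Eq. (10.2)
assert np.allclose(comm(sx, sy), 1j*hbar*sz)
assert np.allclose(comm(sy, sz), 1j*hbar*sx)
assert np.allclose(comm(sz, sx), 1j*hbar*sy)

# S^2 = Sx^2 + Sy^2 + Sz^2 = s(s+1) hbar^2 * identity with s = 1/2, Eqs. (10.1), (10.9)
s2 = sx@sx + sy@sy + sz@sz
assert np.allclose(s2, 0.75*hbar**2*np.eye(2))

# alpha and beta are eigenvectors of Sz with eigenvalues +hbar/2 and -hbar/2
alpha = np.array([1, 0], dtype=complex)
beta = np.array([0, 1], dtype=complex)
assert np.allclose(sz @ alpha, +hbar/2 * alpha)   # Eq. (10.7)
assert np.allclose(sz @ beta, -hbar/2 * beta)     # Eq. (10.8)
print("All s = 1/2 spin-operator identities verified.")
```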

FIGURE 10.1 Possible orientations of the electron spin vector with respect to the $z$ axis. In each case, $\mathbf{S}$ lies on the surface of a cone whose axis is the $z$ axis.

The wave functions we have dealt with previously are functions of the spatial coordinates of the particle: $\psi=\psi(x, y, z)$. We might ask: What is the variable for the spin eigenfunctions $\alpha$ and $\beta$ ? Sometimes one talks of a spin coordinate $\omega$, without really specifying what this coordinate is. Most often, one takes the spin quantum number $m_{s}$ as being the variable on which the spin eigenfunctions depend. This procedure is quite unusual as compared with the spatial wave functions; but because we have only two possible electronic spin eigenfunctions and eigenvalues, this is a convenient choice. We have

\(
\begin{equation}
\alpha=\alpha\left(m_{s}\right), \quad \beta=\beta\left(m_{s}\right) \tag{10.10}
\end{equation}
\)

As usual, we want the eigenfunctions to be normalized. The three variables of a one-particle spatial wave function range continuously from $-\infty$ to $+\infty$, so normalization means

\(
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}|\psi(x, y, z)|^{2} d x d y d z=1
\)

The variable $m_{s}$ of the electronic spin eigenfunctions takes on only the two discrete values $+\frac{1}{2}$ and $-\frac{1}{2}$. Normalization of the one-particle spin eigenfunctions therefore means

\(
\begin{equation}
\sum_{m_{s}=-1 / 2}^{1 / 2}\left|\alpha\left(m_{s}\right)\right|^{2}=1, \quad \sum_{m_{s}=-1 / 2}^{1 / 2}\left|\beta\left(m_{s}\right)\right|^{2}=1 \tag{10.11}
\end{equation}
\)

Since the eigenfunctions $\alpha$ and $\beta$ correspond to different eigenvalues of the Hermitian operator $\hat{S}_{z}$, they are orthogonal:

\(
\begin{equation}
\sum_{m_{s}=-1 / 2}^{1 / 2} \alpha^{*}\left(m_{s}\right) \beta\left(m_{s}\right)=0 \tag{10.12}
\end{equation}
\)

Taking $\alpha\left(m_{s}\right)=\delta_{m_{s}, 1 / 2}$ and $\beta\left(m_{s}\right)=\delta_{m_{s},-1 / 2}$, where $\delta_{j k}$ is the Kronecker delta function, we can satisfy (10.11) and (10.12).

When we consider the complete wave function for an electron including both space and spin variables, we shall normalize it according to

\(
\begin{equation}
\sum_{m_{s}=-1 / 2}^{1 / 2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}\left|\psi\left(x, y, z, m_{s}\right)\right|^{2} d x d y d z=1 \tag{10.13}
\end{equation}
\)

The notation

\(
\int\left|\psi\left(x, y, z, m_{s}\right)\right|^{2} d \tau
\)

will denote summation over the spin variable and integration over the full range of the spatial variables, as in (10.13). The symbol $\int d v$ will denote integration over the full range of the system's spatial variables.

An electron is currently considered to be a pointlike elementary particle with no substructure. High-energy electron-positron collision experiments show no evidence for a nonzero electron size and put an upper limit of $3 \times 10^{-19} \mathrm{~m}$ on the radius of an electron [D. Bourilkov, Phys. Rev. D, 62, 076005 (2000); arxiv.org/abs/hep-ph/0002172]. Protons and neutrons are made of quarks, and so are not elementary particles. The proton rms charge radius is $0.88 \times 10^{-15} \mathrm{~m}$.


The wave function specifying the state of an electron depends not only on the coordinates $x, y$, and $z$ but also on the spin state of the electron. What effect does this have on the wave functions and energy levels of the hydrogen atom?

To a very good approximation, the Hamiltonian operator for a system of electrons does not involve the spin variables but is a function only of spatial coordinates and derivatives with respect to spatial coordinates. As a result, we can separate the stationary-state wave function of a single electron into a product of space and spin parts:

\(
\psi(x, y, z) g\left(m_{s}\right)
\)

where $g\left(m_{s}\right)$ is either one of the functions $\alpha$ or $\beta$, depending on whether $m_{s}=\frac{1}{2}$ or $-\frac{1}{2}$. [More generally, $g\left(m_{s}\right)$ might be a linear combination of $\alpha$ and $\beta$: $g\left(m_{s}\right)=c_{1} \alpha+c_{2} \beta$.] Since the Hamiltonian operator has no effect on the spin function, we have

\(
\hat{H}\left[\psi(x, y, z) g\left(m_{s}\right)\right]=g\left(m_{s}\right) \hat{H} \psi(x, y, z)=E\left[\psi(x, y, z) g\left(m_{s}\right)\right]
\)

and we get the same energies as previously found without taking spin into account. The only difference spin makes is to double the possible number of states. Instead of the state $\psi(x, y, z)$, we have the two possible states $\psi(x, y, z) \alpha$ and $\psi(x, y, z) \beta$. When we take spin into account, the degeneracy of the hydrogen-atom energy levels is $2 n^{2}$ rather than $n^{2}$.


Suppose we have a system of several identical particles. In classical mechanics the identity of the particles leads to no special consequences. For example, consider identical billiard balls rolling on a billiard table. We can follow the motion of any individual ball, say by taking a motion picture of the system. We can say that ball number one is moving along a certain path, ball two is on another definite path, and so on, the paths being determined by Newton's laws of motion. Thus, although the balls are identical, we can distinguish among them by specifying the path each takes. The identity of the balls has no special effect on their motions.

In quantum mechanics the uncertainty principle tells us that we cannot follow the exact path taken by a microscopic "particle." If the microscopic particles of the system all have different masses or charges or spins, we can use one of these properties to distinguish the particles from one another. But if they are all identical, then the one way we had in classical mechanics of distinguishing them, namely by specifying their paths, is lost in quantum mechanics because of the uncertainty principle. Therefore, the wave function of a system of interacting identical particles must not distinguish among the particles. For example, in the perturbation treatment of the helium-atom excited states in Chapter 9, we saw that the function $1 s(1) 2 s(2)$, which says that electron 1 is in the $1 s$ orbital and electron 2 is in the $2 s$ orbital, was not a correct zeroth-order wave function.

Rather, we had to use the functions $2^{-1 / 2}[1 s(1) 2 s(2) \pm 1 s(2) 2 s(1)]$, which do not specify which electron is in which orbital. (If the identical particles are well separated from one another so that their wave functions do not overlap, they may be regarded as distinguishable.)

We now derive the restrictions on the wave function due to the requirement of indistinguishability of identical particles in quantum mechanics. The wave function of a system of $n$ identical microscopic particles depends on the space and spin variables of the particles. For particle 1, these variables are $x_{1}, y_{1}, z_{1}, m_{s 1}$. Let $q_{1}$ stand for all four of these variables. Thus $\psi=\psi\left(q_{1}, q_{2}, \ldots, q_{n}\right)$.

We define the exchange or permutation operator $\hat{P}_{12}$ as the operator that interchanges all the coordinates of particles 1 and 2 :

\(
\begin{equation}
\hat{P}_{12} f\left(q_{1}, q_{2}, q_{3}, \ldots, q_{n}\right)=f\left(q_{2}, q_{1}, q_{3}, \ldots, q_{n}\right) \tag{10.14}
\end{equation}
\)

For example, the effect of $\hat{P}_{12}$ on the function that has electron 1 in a $1 s$ orbital with spin up and electron 2 in a $3 s$ orbital with spin down is

\(
\begin{equation}
\hat{P}_{12}[1 s(1) \alpha(1) 3 s(2) \beta(2)]=1 s(2) \alpha(2) 3 s(1) \beta(1) \tag{10.15}
\end{equation}
\)

What are the eigenvalues of $\hat{P}_{12}$? Applying $\hat{P}_{12}$ twice has no net effect:

\(
\hat{P}_{12} \hat{P}_{12} f\left(q_{1}, q_{2}, \ldots, q_{n}\right)=\hat{P}_{12} f\left(q_{2}, q_{1}, \ldots, q_{n}\right)=f\left(q_{1}, q_{2}, \ldots, q_{n}\right)
\)

Therefore, $\hat{P}_{12}^{2}=\hat{1}$. Let $w_{i}$ and $c_{i}$ denote the eigenfunctions and eigenvalues of $\hat{P}_{12}$. We have $\hat{P}_{12} w_{i}=c_{i} w_{i}$. Application of $\hat{P}_{12}$ to this equation gives $\hat{P}_{12}^{2} w_{i}=c_{i} \hat{P}_{12} w_{i}$. Substitution of $\hat{P}_{12}^{2}=\hat{1}$ and $\hat{P}_{12} w_{i}=c_{i} w_{i}$ in $\hat{P}_{12}^{2} w_{i}=c_{i} \hat{P}_{12} w_{i}$ gives $w_{i}=c_{i}^{2} w_{i}$. Since zero is not allowed as an eigenfunction, we can divide by $w_{i}$ to get $1=c_{i}^{2}$ and $c_{i}= \pm 1$. The eigenvalues of $\hat{P}_{12}$ (and of any linear operator whose square is the unit operator) are +1 and -1.

If $w_{+}$ is an eigenfunction of $\hat{P}_{12}$ with eigenvalue +1, then

\(
\begin{gathered}
\hat{P}_{12} w_{+}\left(q_{1}, q_{2}, \ldots, q_{n}\right)=(+1) w_{+}\left(q_{1}, q_{2}, \ldots, q_{n}\right) \\
w_{+}\left(q_{2}, q_{1}, \ldots, q_{n}\right)=w_{+}\left(q_{1}, q_{2}, \ldots, q_{n}\right) \tag{10.16}
\end{gathered}
\)

A function such as $w_{+}$ that has the property (10.16) of being unchanged when particles 1 and 2 are interchanged is said to be symmetric with respect to interchange of particles 1 and 2. For eigenvalue -1, we have

\(
\begin{equation}
w_{-}\left(q_{2}, q_{1}, \ldots, q_{n}\right)=-w_{-}\left(q_{1}, q_{2}, \ldots, q_{n}\right) \tag{10.17}
\end{equation}
\)

The function $w_{-}$ in (10.17) is antisymmetric with respect to interchange of particles 1 and 2, meaning that this interchange multiplies $w_{-}$ by -1. There is no necessity for an arbitrary function $f\left(q_{1}, q_{2}, \ldots, q_{n}\right)$ to be either symmetric or antisymmetric with respect to interchange of 1 and 2.
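
A numerical illustration may help here. The following Python sketch (not from the text) models a two-particle function on a grid, where $\hat{P}_{12}$ amounts to a transpose, and checks that $f \pm \hat{P}_{12} f$ are eigenfunctions of $\hat{P}_{12}$ with eigenvalues $\pm 1$:

```python
import numpy as np

# Model a two-particle "wave function" on a grid: psi[i, j] ~ psi(q1 = q[i], q2 = q[j]).
q = np.linspace(-1, 1, 50)
Q1, Q2 = np.meshgrid(q, q, indexing='ij')

def P12(psi):
    """(P12 psi)(q1, q2) = psi(q2, q1): swapping arguments transposes the grid array."""
    return psi.T

f = np.exp(-(Q1 - 0.3)**2) * Q2**3           # an arbitrary, unsymmetrized function
w_plus  = f + P12(f)                          # symmetric combination
w_minus = f - P12(f)                          # antisymmetric combination

assert np.allclose(P12(P12(f)), f)            # P12^2 = 1
assert np.allclose(P12(w_plus),  +w_plus)     # Eq. (10.16), eigenvalue +1
assert np.allclose(P12(w_minus), -w_minus)    # Eq. (10.17), eigenvalue -1
```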

Do not confuse the property of being symmetric or antisymmetric with respect to particle interchange with the property of being even or odd with respect to inversion in space. The function $x_{1}+x_{2}$ is symmetric with respect to interchange of 1 and 2 and is an odd function of $x_{1}$ and $x_{2}$. The function $x_{1}^{2}+x_{2}^{2}$ is symmetric with respect to interchange of 1 and 2 and is an even function of $x_{1}$ and $x_{2}$.

The operator $\hat{P}_{i k}$ is defined by

\(
\begin{equation}
\hat{P}_{i k} f\left(q_{1}, \ldots, q_{i}, \ldots, q_{k}, \ldots, q_{n}\right)=f\left(q_{1}, \ldots, q_{k}, \ldots, q_{i}, \ldots, q_{n}\right) \tag{10.18}
\end{equation}
\)

The eigenvalues of $\hat{P}_{i k}$ are, like those of $\hat{P}_{12}$, +1 and -1.

We now consider the wave function of a system of $n$ identical microscopic particles. Since the particles are indistinguishable, the way we label them cannot affect the state of the system. Thus the two wave functions

\(
\psi\left(q_{1}, \ldots, q_{i}, \ldots, q_{k}, \ldots, q_{n}\right) \quad \text { and } \quad \psi\left(q_{1}, \ldots, q_{k}, \ldots, q_{i}, \ldots, q_{n}\right)
\)

must correspond to the same state of the system. Two wave functions that correspond to the same state can differ at most by a multiplicative constant. Hence

\(
\begin{aligned}
\psi\left(q_{1}, \ldots, q_{k}, \ldots, q_{i}, \ldots, q_{n}\right) & =c \psi\left(q_{1}, \ldots, q_{i}, \ldots, q_{k}, \ldots, q_{n}\right) \\
\hat{P}_{i k} \psi\left(q_{1}, \ldots, q_{i}, \ldots, q_{k}, \ldots, q_{n}\right) & =c \psi\left(q_{1}, \ldots, q_{i}, \ldots, q_{k}, \ldots, q_{n}\right)
\end{aligned}
\)

The last equation states that $\psi$ is an eigenfunction of $\hat{P}_{i k}$. But we know that the only possible eigenvalues of $\hat{P}_{i k}$ are 1 and -1. We conclude that the wave function for a system of $n$ identical particles must be symmetric or antisymmetric with respect to interchange of any two of the identical particles, $i$ and $k$. Since the $n$ particles are all identical, we could not have the wave function symmetric with respect to some interchanges and antisymmetric with respect to other interchanges. Thus the wave function of $n$ identical particles must be either symmetric with respect to every possible interchange or antisymmetric with respect to every possible interchange of two particles. (The argument just given is not rigorous. The statement that the wave function of a system of identical particles must be either completely symmetric or completely antisymmetric with respect to interchange of two particles is called the symmetrization postulate.)

We have seen that there are two possible cases for the wave function of a system of identical particles, the symmetric and the antisymmetric cases. Experimental evidence (such as the periodic table of the elements to be discussed later) shows that for electrons only the antisymmetric case occurs. Thus we have an additional postulate of quantum mechanics, which states that the wave function of a system of electrons must be antisymmetric with respect to interchange of any two electrons.

In 1926, Dirac concluded (based on theoretical work and experimental data) that electrons require antisymmetric wave functions and photons require symmetric wave functions. However, Dirac and other physicists erroneously believed in 1926 that all material particles required antisymmetric wave functions. In 1930, experimental data indicated that $\alpha$ particles (which have $s=0$ ) require symmetric wave functions; physicists eventually realized that what determines whether a system of identical particles requires symmetric or antisymmetric wave functions is the spin of the particle. Particles with half-integral spin ( $s=\frac{1}{2}, \frac{3}{2}$, and so on) require antisymmetric wave functions, while particles with integral spin ( $s=0,1$, and so on) require symmetric wave functions. In 1940, the physicist Wolfgang Pauli used relativistic quantum field theory to prove this result. Particles requiring antisymmetric wave functions, such as electrons, are called fermions (after E. Fermi), whereas particles requiring symmetric wave functions, such as pions, are called bosons (after S. N. Bose). In nonrelativistic quantum mechanics, we must postulate that the wave function of a system of identical particles must be antisymmetric with respect to interchange of any two particles if the particles have half-integral spin and must be symmetric with respect to interchange if the particles have integral spin. This statement is called the spin-statistics theorem (since the statistical mechanics of a system of bosons differs from that of a system of fermions).

Many proofs of varying validity have been offered for the spin-statistics theorem; see I. Duck and E. C. G. Sudarshan, Pauli and the Spin-Statistics Theorem, World Scientific, 1997; Am. J. Phys., 66, 284 (1998); Sudarshan and Duck, Pramana-J. Phys., 61, 645 (2003) (available at www.ias.ac.in/pramana/v61/p645/fulltext.pdf). Several experiments have confirmed the validity of the spin-statistics theorem to extremely high accuracy; see G. M. Tino, Fortschr. Phys., 48, 537 (2000) (available at arxiv.org/abs/quant-ph/9907028).

The spin-statistics theorem has an important consequence for a system of identical fermions. The antisymmetry requirement means that

\(
\begin{equation}
\psi\left(q_{1}, q_{2}, q_{3}, \ldots, q_{n}\right)=-\psi\left(q_{2}, q_{1}, q_{3}, \ldots, q_{n}\right) \tag{10.19}
\end{equation}
\)

Consider the value of $\psi$ when electrons 1 and 2 have the same coordinates, that is, when $x_{1}=x_{2}, y_{1}=y_{2}, z_{1}=z_{2}$, and $m_{s 1}=m_{s 2}$. Putting $q_{2}=q_{1}$ in (10.19), we have

\(
\begin{align}
\psi\left(q_{1}, q_{1}, q_{3}, \ldots, q_{n}\right) & =-\psi\left(q_{1}, q_{1}, q_{3}, \ldots, q_{n}\right) \\
2 \psi & =0 \\
\psi\left(q_{1}, q_{1}, q_{3}, \ldots, q_{n}\right) & =0 \tag{10.20}
\end{align}
\)

Thus, two electrons with the same spin have zero probability of being found at the same point in three-dimensional space. (By "the same spin," we mean the same value of $m_{s}$ ). Since $\psi$ is a continuous function, Eq. (10.20) means that the probability of finding two electrons with the same spin close to each other in space is quite small. Thus the antisymmetry requirement forces electrons of like spin to keep apart from one another. To describe this, one often speaks of a Pauli repulsion between such electrons. This "repulsion" is not a real physical force, but a reflection of the fact that the electronic wave function must be antisymmetric with respect to exchange.

The requirement for symmetric or antisymmetric wave functions also applies to a system containing two or more identical composite particles. Consider, for example, an ${ }^{16} \mathrm{O}_{2}$ molecule. The ${ }^{16} \mathrm{O}$ nucleus has 8 protons and 8 neutrons. Each proton and each neutron has $s=\frac{1}{2}$ and is a fermion. Therefore, interchange of the two ${ }^{16} \mathrm{O}$ nuclei interchanges 16 fermions and must multiply the molecular wave function by $(-1)^{16}=1$. Thus the ${ }^{16} \mathrm{O}_{2}$ molecular wave function must be symmetric with respect to interchange of the nuclear coordinates. The requirement for symmetry or antisymmetry with respect to interchange of identical nuclei affects the degeneracy of molecular wave functions and leads to the symmetry number in the rotational partition function [see McQuarrie (2000), pp. 104-105].

For interchange of two identical composite particles containing $m$ identical bosons and $n$ identical fermions, the wave function is multiplied by $(+1)^{m}(-1)^{n}=(-1)^{n}$.

A composite particle is thus a fermion if it contains an odd number of fermions and is a boson otherwise.

When the variational principle (Section 8.1) is used to get approximate electronic wave functions of atoms and molecules, the requirement that the trial variation function be well-behaved includes the requirement that it be antisymmetric.


We now reconsider the helium atom from the standpoint of electron spin and the antisymmetry requirement. In the perturbation treatment of helium in Section 9.3, we found the zeroth-order wave function for the ground state to be $1 s(1) 1 s(2)$. To take spin into account, we must multiply this spatial function by a spin eigenfunction. We therefore consider the possible spin eigenfunctions for two electrons. We shall use the notation $\alpha(1) \alpha(2)$ to indicate a state where electron 1 has spin up and electron 2 has spin up; $\alpha(1)$ stands for
$\alpha\left(m_{s 1}\right)$. Since each electron has two possible spin states, we have at first sight the four possible spin functions:

\(
\alpha(1) \alpha(2), \quad \beta(1) \beta(2), \quad \alpha(1) \beta(2), \quad \alpha(2) \beta(1)
\)

There is nothing wrong with the first two functions, but the third and fourth functions violate the principle of indistinguishability of identical particles. For example, the third function says that electron 1 has spin up and electron 2 has spin down, which does distinguish between electrons 1 and 2. More formally, if we apply $\hat{P}_{12}$ to these functions, we find that the first two functions are symmetric with respect to interchange of the two electrons, but the third and fourth functions are neither symmetric nor antisymmetric and so are unacceptable.

What now? Recall that we ran into essentially the same situation in treating the helium excited states (Section 9.7), where we started with the functions $1 s(1) 2 s(2)$ and $2 s(1) 1 s(2)$. We found that these two functions, which distinguish between electrons 1 and 2 , are not the correct zeroth-order functions and that the correct zeroth-order functions are $2^{-1 / 2}[1 s(1) 2 s(2) \pm 2 s(1) 1 s(2)]$. This result suggests pretty strongly that instead of $\alpha(1) \beta(2)$ and $\beta(1) \alpha(2)$, we use

\(
\begin{equation}
2^{-1 / 2}[\alpha(1) \beta(2) \pm \beta(1) \alpha(2)] \tag{10.21}
\end{equation}
\)

These functions are the normalized linear combinations of $\alpha(1) \beta(2)$ and $\beta(1) \alpha(2)$ that are eigenfunctions of $\hat{P}_{12}$, that is, are symmetric or antisymmetric. When electrons 1 and 2 are interchanged, $2^{-1 / 2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)]$ becomes $2^{-1 / 2}[\alpha(2) \beta(1)+\beta(2) \alpha(1)]$, which is the same as the original function. In contrast, $2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)]$ becomes $2^{-1 / 2}[\alpha(2) \beta(1)-\beta(2) \alpha(1)]$, which is -1 times the original function. To show that the functions (10.21) are normalized, we have

\(
\begin{aligned}
& \sum_{m_{s 1}} \sum_{m_{s 2}} \frac{1}{\sqrt{2}}[\alpha(1) \beta(2) \pm \beta(1) \alpha(2)]^{*} \frac{1}{\sqrt{2}}[\alpha(1) \beta(2) \pm \beta(1) \alpha(2)] \\
& =\frac{1}{2} \sum_{m_{s 1}}|\alpha(1)|^{2} \sum_{m_{s 2}}|\beta(2)|^{2} \pm \frac{1}{2} \sum_{m_{s 1}} \alpha^{*}(1) \beta(1) \sum_{m_{s 2}} \beta^{*}(2) \alpha(2) \\
& \quad \pm \frac{1}{2} \sum_{m_{s 1}} \beta^{*}(1) \alpha(1) \sum_{m_{s 2}} \alpha^{*}(2) \beta(2)+\frac{1}{2} \sum_{m_{s 1}}|\beta(1)|^{2} \sum_{m_{s 2}}|\alpha(2)|^{2}=1
\end{aligned}
\)

where we used the orthonormality relations (10.11) and (10.12).
Therefore, the four normalized two-electron spin eigenfunctions with the correct exchange properties are

\(
\text { symmetric: } \quad\left\{\begin{array}{l}
\alpha(1) \alpha(2) \tag{10.22}\\
\beta(1) \beta(2) \tag{10.23}\\
{[\alpha(1) \beta(2)+\beta(1) \alpha(2)] / \sqrt{2}} \tag{10.24}
\end{array}\right.
\)

\(
\begin{equation}
\text { antisymmetric: } \quad[\alpha(1) \beta(2)-\beta(1) \alpha(2)] / \sqrt{2} \tag{10.25}
\end{equation}
\)
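
These exchange properties can be checked mechanically by representing $\alpha$ and $\beta$ as two-component vectors and products such as $\alpha(1) \beta(2)$ as Kronecker products; the exchange of electrons 1 and 2 is then a $4 \times 4$ permutation matrix. A Python sketch (illustrative only, not part of the text):

```python
import numpy as np

alpha = np.array([1.0, 0.0])   # spin up
beta  = np.array([0.0, 1.0])   # spin down

def two_spin(f, g):
    """f(1)g(2) as a vector in the four-dimensional two-electron spin space."""
    return np.kron(f, g)

# Operator that exchanges electrons 1 and 2: |i, j> -> |j, i>
SWAP = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        SWAP[2*j + i, 2*i + j] = 1.0

sym = [two_spin(alpha, alpha),                                        # (10.22)
       two_spin(beta, beta),                                          # (10.23)
       (two_spin(alpha, beta) + two_spin(beta, alpha))/np.sqrt(2)]    # (10.24)
anti = (two_spin(alpha, beta) - two_spin(beta, alpha))/np.sqrt(2)     # (10.25)

for s in sym:
    assert np.allclose(SWAP @ s, s) and np.isclose(s @ s, 1.0)        # symmetric, normalized
assert np.allclose(SWAP @ anti, -anti) and np.isclose(anti @ anti, 1.0)

# alpha(1)beta(2) by itself is neither symmetric nor antisymmetric:
ab = two_spin(alpha, beta)
print(np.allclose(SWAP @ ab, ab), np.allclose(SWAP @ ab, -ab))        # False False
```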

We now include spin in the He zeroth-order ground-state wave function. The function $1 s(1) 1 s(2)$ is symmetric with respect to exchange. The overall electronic wave function including spin must be antisymmetric. Hence we must multiply the symmetric space function $1 s(1) 1 s(2)$ by an antisymmetric spin function. There is only one antisymmetric twoelectron spin function, so the ground-state zeroth-order wave function for the helium atom including spin is

\(
\begin{equation}
\psi^{(0)}=1 s(1) 1 s(2) \cdot 2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] \tag{10.26}
\end{equation}
\)

$\psi^{(0)}$ is an eigenfunction of $\hat{P}_{12}$ with eigenvalue -1 .

To a very good approximation, the Hamiltonian does not contain spin terms, so the energy is unaffected by inclusion of the spin factor in the ground-state wave function. Also, the ground state of helium is still nondegenerate when spin is considered.

To further demonstrate that the spin factor does not affect the energy, we shall assume we are doing a variational calculation for the He ground state using the trial function $\phi=f\left(r_{1}, r_{2}, r_{12}\right) 2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)]$, where $f$ is a normalized function symmetric in the coordinates of the two electrons. The variational integral is

\(
\begin{aligned}
\int \phi^{*} \hat{H} \phi d \tau=\sum_{m_{s 1}} \sum_{m_{s 2}} \iint f^{*}\left(r_{1},\right. & \left.r_{2}, r_{12}\right) \frac{1}{\sqrt{2}}[\alpha(1) \beta(2)-\beta(1) \alpha(2)]^{*} \\
& \times \hat{H} f\left(r_{1}, r_{2}, r_{12}\right) \frac{1}{\sqrt{2}}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] d v_{1} d v_{2}
\end{aligned}
\)

Since $\hat{H}$ has no effect on the spin functions, the variational integral becomes

\(
\iint f^{*} \hat{H} f d v_{1} d v_{2} \sum_{m_{s 1}} \sum_{m_{s 2}} \frac{1}{2}|\alpha(1) \beta(2)-\beta(1) \alpha(2)|^{2}
\)

Since the spin function (10.25) is normalized, the variational integral reduces to $\iint f^{*} \hat{H} f d v_{1} d v_{2}$, which is the expression we used before we introduced spin.

Now consider the excited states of helium. We found the lowest excited state to have the zeroth-order spatial wave function $2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)]$ [Eq. (9.103)]. Since this spatial function is antisymmetric, we must multiply it by a symmetric spin function. We can use any one of the three symmetric two-electron spin functions, so instead of the nondegenerate level previously found, we have a triply degenerate level with the three zeroth-order wave functions

\(
\begin{gather}
2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] \alpha(1) \alpha(2) \tag{10.27}\\
2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] \beta(1) \beta(2) \tag{10.28}\\
2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] 2^{-1 / 2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)] \tag{10.29}
\end{gather}
\)

For the next excited state, the requirement of antisymmetry of the overall wave function leads to the zeroth-order wave function

\(
\begin{equation}
2^{-1 / 2}[1 s(1) 2 s(2)+2 s(1) 1 s(2)] 2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] \tag{10.30}
\end{equation}
\)

The same considerations apply for the $1 s 2 p$ states.


So far, we have not seen any very spectacular consequences of electron spin and the antisymmetry requirement. In the hydrogen and helium atoms, the spin factors in the wave functions and the antisymmetry requirement simply affect the degeneracy of the levels but do not (except for very small effects to be considered later) affect the previously obtained energies. For lithium, the story is quite different.

Suppose we take the interelectronic repulsions in the Li atom as a perturbation on the remaining terms in the Hamiltonian. By the same steps used in the treatment of helium, the unperturbed wave functions are products of three hydrogenlike functions. For the ground state,

\(
\begin{equation}
\psi^{(0)}=1 s(1) 1 s(2) 1 s(3) \tag{10.31}
\end{equation}
\)

and the zeroth-order (unperturbed) energy is [Eq. (9.48) and the paragraph after (9.50)]

\(
E^{(0)}=-\left(\frac{1}{1^{2}}+\frac{1}{1^{2}}+\frac{1}{1^{2}}\right)\left(\frac{Z^{2} e^{2}}{8 \pi \varepsilon_{0} a_{0}}\right)=-27\left(\frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}}\right)=-27(13.606 \mathrm{eV})=-367.4 \mathrm{eV}
\)

The first-order energy correction is $E^{(1)}=\left\langle\psi^{(0)}\right| \hat{H}^{\prime}\left|\psi^{(0)}\right\rangle$. The perturbation $\hat{H}^{\prime}$ consists of the interelectronic repulsions, so

\(
\begin{aligned}
& E^{(1)}=\int|1 s(1)|^{2}|1 s(2)|^{2}|1 s(3)|^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} d v+\int|1 s(1)|^{2}|1 s(2)|^{2}|1 s(3)|^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{23}} d v \\
& \quad+\int|1 s(1)|^{2}|1 s(2)|^{2}|1 s(3)|^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{13}} d v
\end{aligned}
\)

The way we label the dummy integration variables in these definite integrals cannot affect their value. If we interchange the labels 1 and 3 on the variables in the second integral, it is converted to the first integral. Hence these two integrals are equal. Interchange of the labels 2 and 3 in the third integral shows it to be equal to the first integral also. Therefore

\(
E^{(1)}=3 \iint|1 s(1)|^{2}|1 s(2)|^{2} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} d v_{1} d v_{2} \int|1 s(3)|^{2} d v_{3}
\)

The integral over electron 3 gives 1 (normalization). The integral over electrons 1 and 2 was evaluated in the perturbation treatment of helium, and [Eqs. (9.52) and (9.53)]

\(
\begin{gathered}
E^{(1)}=3\left(\frac{5 Z}{4}\right)\left(\frac{e^{2}}{8 \pi \varepsilon_{0} a_{0}}\right)=\frac{45}{4}(13.606 \mathrm{eV})=153.1 \mathrm{eV} \\
E^{(0)}+E^{(1)}=-214.3 \mathrm{eV}
\end{gathered}
\)

Since we can use the zeroth-order perturbation wave function as a trial variation function (recall the discussion at the beginning of Section 9.4), $E^{(0)}+E^{(1)}$ must be, according to the variation principle, equal to or greater than the true ground-state energy. The experimental value of the lithium ground-state energy is found by adding up the first, second, and third ionization energies, which gives [C. E. Moore, "Ionization Potentials and Ionization Limits," publication NSRDS-NBS 34 of the National Bureau of Standards (1970); available at www.nist.gov/data/nsrds/NSRDS-NBS34.pdf]

\(
-(5.39+75.64+122.45) \mathrm{eV}=-203.5 \mathrm{eV}
\)

We thus have $E^{(0)}+E^{(1)}$ less than the true ground-state energy, in violation of the variation principle. Moreover, the supposed configuration $1 s^{3}$ for the Li ground state disagrees with the low value of the first ionization potential and with all chemical evidence. If we continued in this manner, we would have a $1 s^{Z}$ ground-state configuration for the element of atomic number $Z$. We would not get the well-known periodic behavior of the elements.
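
The arithmetic behind this contradiction is quickly checked (a sketch, using $e^{2} / 8 \pi \varepsilon_{0} a_{0}=13.606$ eV):

```python
# Quick numerical check of the (incorrect) 1s^3 perturbation result for Li.
eV = 13.606                          # e^2/(8 pi eps0 a0) in eV
E0 = -27 * eV                        # = -367.4 eV
E1 = 3 * (5*3/4) * eV                # = (45/4)(13.606 eV) = 153.1 eV
E_true = -(5.39 + 75.64 + 122.45)    # minus the sum of Li ionization energies = -203.5 eV
print(E0 + E1, E_true)               # -214.3 eV < -203.5 eV: variation principle violated
```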

Of course, our error is failure to consider spin and the antisymmetry requirement. The hypothetical zeroth-order wave function $1 s(1) 1 s(2) 1 s(3)$ is symmetric with respect to interchange of any two electrons. If we are to have an antisymmetric $\psi^{(0)}$, we must multiply this symmetric space function by an antisymmetric spin function. It is easy to construct completely symmetric spin functions for three electrons, such as $\alpha(1) \alpha(2) \alpha(3)$. However, it is impossible to construct a completely antisymmetric spin function for three electrons.

Let us consider how we can systematically construct an antisymmetric function for three electrons. We shall use $f, g$, and $h$ to stand for three functions of electronic coordinates, without specifying whether we are considering space coordinates or spin coordinates or both. We start with the function

\(
\begin{equation}
f(1) g(2) h(3) \tag{10.32}
\end{equation}
\)

which is certainly not antisymmetric. The antisymmetric function we desire must be converted into its negative by each of the permutation operators $\hat{P}_{12}, \hat{P}_{13}$, and $\hat{P}_{23}$. Applying each of these operators in turn to $f(1) g(2) h(3)$, we get

\(
\begin{equation}
f(2) g(1) h(3), \quad f(3) g(2) h(1), \quad f(1) g(3) h(2) \tag{10.33}
\end{equation}
\)

We might try to construct the antisymmetric functions as a linear combination of the four functions (10.32) and (10.33), but this attempt would fail. Application of $\hat{P}_{12}$ to the last two functions in (10.33) gives

\(
\begin{equation}
f(3) g(1) h(2) \text { and } f(2) g(3) h(1) \tag{10.34}
\end{equation}
\)

which are not included in (10.32) or (10.33). We must therefore include all six functions (10.32) to (10.34) in the desired antisymmetric linear combination. These six functions are the six $(3 \cdot 2 \cdot 1)$ possible permutations of the three electrons among the three functions $f, g$, and $h$. If $f(1) g(2) h(3)$ is a solution of the Schrödinger equation with eigenvalue $E$, then, because of the identity of the particles, each of the functions (10.32) to (10.34) is also a solution with the same eigenvalue $E$ (exchange degeneracy), and any linear combination of these functions is an eigenfunction with eigenvalue $E$.

The antisymmetric linear combination will have the form

\(
\begin{align}
c_{1} f(1) g(2) h(3)+c_{2} f(2) g(1) h(3) & +c_{3} f(3) g(2) h(1)+c_{4} f(1) g(3) h(2) \\
& +c_{5} f(3) g(1) h(2)+c_{6} f(2) g(3) h(1) \tag{10.35}
\end{align}
\)

Since $f(2) g(1) h(3)=\hat{P}_{12} f(1) g(2) h(3)$, in order to have (10.35) be an eigenfunction of $\hat{P}_{12}$ with eigenvalue -1, we must have $c_{2}=-c_{1}$. Likewise, $f(3) g(2) h(1)=\hat{P}_{13} f(1) g(2) h(3)$ and $f(1) g(3) h(2)=\hat{P}_{23} f(1) g(2) h(3)$, so $c_{3}=-c_{1}$ and $c_{4}=-c_{1}$. Since $f(3) g(1) h(2)=\hat{P}_{12} f(3) g(2) h(1)$, we must have $c_{5}=-c_{3}=c_{1}$. Similarly, we find $c_{6}=c_{1}$. We thus arrive at the linear combination

\(
\begin{align}
c_{1}[f(1) g(2) h(3)-f(2) g(1) h(3) & -f(3) g(2) h(1)-f(1) g(3) h(2) \\
& +f(3) g(1) h(2)+f(2) g(3) h(1)] \tag{10.36}
\end{align}
\)

which is easily verified to be antisymmetric with respect to $1-2,1-3$, and $2-3$ interchange. [Taking all signs as plus in (10.36), we would get a completely symmetric function.]

Let us assume $f, g$, and $h$ to be orthonormal and choose $c_{1}$ so that (10.36) is normalized. Multiplying (10.36) by its complex conjugate, we get many terms, but because of the assumed orthogonality the integrals of all products involving two different terms of (10.36) vanish. For example,

\(
\begin{aligned}
& \int[f(1) g(2) h(3)]^{*} f(2) g(1) h(3) d \tau \\
&=\int f^{*}(1) g(1) d \tau_{1} \int g^{*}(2) f(2) d \tau_{2} \int h^{*}(3) h(3) d \tau_{3}=0 \cdot 0 \cdot 1=0
\end{aligned}
\)

Integrals involving the product of a term of (10.36) with its own complex conjugate are equal to 1 , because $f, g$, and $h$ are normalized. Therefore,

\(
\begin{aligned}
1=\int|(10.36)|^{2} d \tau & =\left|c_{1}\right|^{2}(1+1+1+1+1+1) \\
c_{1} & =1 / \sqrt{6}
\end{aligned}
\)

We could work with (10.36) as it stands, but its properties are most easily found if we recognize it as the expansion [Eq. (8.24)] of the following third-order determinant:

\(
\frac{1}{\sqrt{6}}\left|\begin{array}{lll}
f(1) & g(1) & h(1) \\
f(2) & g(2) & h(2) \\
f(3) & g(3) & h(3)
\end{array}\right| \tag{10.37}
\)

(See also Prob. 8.22.) The antisymmetry property holds for (10.37) because interchange of two electrons amounts to interchanging two rows of the determinant, which multiplies it by -1 .
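
The equivalence between the signed sum (10.36) and the determinant (10.37) can be verified numerically by filling a $3 \times 3$ matrix with arbitrary values of $f, g, h$ for each electron (an illustrative Python sketch, not from the text; the array `F` is a stand-in for any such values):

```python
import numpy as np
from itertools import permutations

def parity(p):
    """Sign (+1 or -1) of a permutation given as a tuple of 0..n-1."""
    p, sign = list(p), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

rng = np.random.default_rng(0)
F = rng.normal(size=(3, 3))      # F[e, k]: value of orbital k (f, g, h) for electron e

# Signed sum over all 3! electron-to-orbital assignments, Eq. (10.36) with c1 = 1/sqrt(6)
total = sum(parity(p) * np.prod([F[p[k], k] for k in range(3)])
            for p in permutations(range(3))) / np.sqrt(6)

assert np.isclose(total, np.linalg.det(F) / np.sqrt(6))   # equals determinant (10.37)
print("Signed permutation sum matches the 3x3 determinant.")
```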

We now use (10.37) to prove that it is impossible to construct an antisymmetric spin function for three electrons. The functions $f, g$, and $h$ may each be either $\alpha$ or $\beta$. If we take $f=\alpha, g=\beta, h=\alpha$, then (10.37) becomes

\(
\frac{1}{\sqrt{6}}\left|\begin{array}{lll}
\alpha(1) & \beta(1) & \alpha(1) \\
\alpha(2) & \beta(2) & \alpha(2) \\
\alpha(3) & \beta(3) & \alpha(3)
\end{array}\right| \tag{10.38}
\)

Although (10.38) is antisymmetric, we must reject it because it is equal to zero. The first and third columns of the determinant are identical, so (Section 8.3) the determinant vanishes. No matter how we choose $f, g$, and $h$, at least two columns of the determinant must be equal, so we cannot construct a nonzero antisymmetric three-electron spin function.
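
A short check (illustrative Python, using the representation $\alpha\left(m_{s}\right)=\delta_{m_{s}, 1 / 2}$, $\beta\left(m_{s}\right)=\delta_{m_{s},-1 / 2}$ given earlier) confirms that the determinant (10.38) vanishes for every choice of the three spin variables:

```python
import numpy as np
from itertools import product

alpha = {0.5: 1.0, -0.5: 0.0}   # alpha(m_s) = delta_{m_s, 1/2}
beta  = {0.5: 0.0, -0.5: 1.0}   # beta(m_s)  = delta_{m_s, -1/2}

f, g, h = alpha, beta, alpha    # any choice of three must repeat one of the two functions

# Eq. (10.38): the 3x3 spin determinant is zero for all values of (m_s1, m_s2, m_s3)
for ms in product([0.5, -0.5], repeat=3):
    M = np.array([[f[m], g[m], h[m]] for m in ms])
    assert np.isclose(np.linalg.det(M), 0.0)
print("The would-be antisymmetric three-electron spin function is identically zero.")
```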

We now use (10.37) to construct the zeroth-order ground-state wave function for lithium, including both space and spin variables. The functions $f, g$, and $h$ will now involve both space and spin variables. We choose

\(
\begin{equation}
f(1)=1 s(1) \alpha(1) \tag{10.39}
\end{equation}
\)

We call a function like (10.39) a spin-orbital. A spin-orbital is the product of a one-electron spatial orbital and a one-electron spin function.

If we were to take $g(1)=1 s(1) \alpha(1)$, this would make the first and second columns of (10.37) identical, and the wave function would vanish. This is a particular case of the Pauli exclusion principle: No two electrons can occupy the same spin-orbital. Another way of stating this is to say that no two electrons in an atom can have the same values for all their quantum numbers. The Pauli exclusion principle is a consequence of the more general antisymmetry requirement for the wave function of a system of identical spin- $\frac{1}{2}$ particles and is less satisfying than the antisymmetry statement, since the exclusion principle is based on approximate (zeroth-order) wave functions.

We therefore take $g(1)=1 s(1) \beta(1)$, which puts two electrons with opposite spin in the $1 s$ orbital. For the spin-orbital $h$, we cannot use either $1 s(1) \alpha(1)$ or $1 s(1) \beta(1)$, since these choices make the determinant vanish. We take $h(1)=2 s(1) \alpha(1)$, which gives the familiar Li ground-state configuration $1 s^{2} 2 s$ and the zeroth-order wave function

\(
\psi^{(0)}=\frac{1}{\sqrt{6}}\left|\begin{array}{lll}
1 s(1) \alpha(1) & 1 s(1) \beta(1) & 2 s(1) \alpha(1) \\
1 s(2) \alpha(2) & 1 s(2) \beta(2) & 2 s(2) \alpha(2) \\
1 s(3) \alpha(3) & 1 s(3) \beta(3) & 2 s(3) \alpha(3)
\end{array}\right| \tag{10.40}
\)

Note especially that (10.40) is not simply a product of space and spin parts (as we found for H and He ), but is a linear combination of terms, each of which is a product of space and spin parts.

Since we could just as well have taken $h(1)=2 s(1) \beta(1)$, the ground state of lithium is, like hydrogen, doubly degenerate, corresponding to the two possible orientations of the spin of the $2 s$ electron. We might use orbital diagrams to indicate this. Each spatial orbital such as $1 s$ or $2 p_{0}$ can hold two electrons of opposite spin. A spin-orbital such as $2 s \alpha$ can hold one electron.

Although the $1 s^{2} 2 p$ configuration will have the same unperturbed energy $E^{(0)}$ as the $1 s^{2} 2 s$ configuration, when we take electron repulsion into account by calculating $E^{(1)}$ and
higher corrections, we find that the $1 s^{2} 2 s$ configuration lies lower for the same reason as in helium.

Consider some points about the Pauli exclusion principle, which we restate as follows: In a system of identical fermions, no two particles can occupy the same state. If we have a system of $n$ interacting particles (for example, an atom), there is a single wave function (involving $4 n$ variables) for the entire system. Because of the interactions between the particles, the wave function cannot be written as the product of wave functions of the individual particles. Hence, strictly speaking, we cannot talk of the states of individual particles, only the state of the whole system. If, however, the interactions between the particles are not too large, then as an initial approximation we can neglect them and write the zeroth-order wave function of the system as a product of wave functions of the individual particles. In this zeroth-order wave function, no two fermions can have the same wave function (state).

Since bosons require a wave function symmetric with respect to interchange, there is no restriction on the number of bosons in a given state.

In 1925, Einstein showed that in an ideal gas of noninteracting bosons, there is a very low temperature $T_{c}$ (called the condensation temperature) above which the fraction $f$ of bosons in the ground state is negligible but below which $f$ becomes appreciable and goes to 1 as the absolute temperature $T$ goes to 0. The equation for $f$ for noninteracting bosons in a cubic box is $f=1-\left(T / T_{c}\right)^{3 / 2}$ for $T<T_{c}$ [McQuarrie (2000), Section 10-4]. The phenomenon of a significant fraction of bosons falling into the ground state is called Bose-Einstein condensation. Bose-Einstein condensation is important in determining the properties of superfluid liquid ${ }^{4} \mathrm{He}$ (whose atoms are bosons), but the interatomic interactions in the liquid make theoretical analysis difficult.
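
For concreteness, the condensate-fraction formula is trivial to tabulate (a sketch; temperatures in units of $T_{c}$):

```python
# Condensate fraction f = 1 - (T/Tc)^(3/2) for noninteracting bosons in a cubic box.
def condensate_fraction(T, Tc):
    return 1.0 - (T / Tc) ** 1.5 if T < Tc else 0.0

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"T/Tc = {t:.2f}:  f = {condensate_fraction(t, 1.0):.3f}")
```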

In 1995, physicists succeeded in producing Bose-Einstein condensation in a gas [Physics Today, August 1995, p. 17; C. E. Wieman, Am. J. Phys., 64, 847 (1996)]. They used a gas of ${ }_{37}^{87} \mathrm{Rb}$ atoms. An ${ }^{87} \mathrm{Rb}$ atom has 87 nucleons and 37 electrons. With an even number (124) of fermions, ${ }^{87} \mathrm{Rb}$ is a boson. With a combination of laser light, an applied inhomogeneous magnetic field, and applied radiofrequency radiation, a sample of $10^{4}$ ${ }^{87} \mathrm{Rb}$ atoms was cooled to $10^{-7} \mathrm{~K}$, thereby condensing a substantial fraction of the atoms into the ground state. The radiofrequency radiation was then used to remove most of the atoms in excited states, leaving a condensate of 2000 atoms, nearly all of which were in the ground state. Each Rb atom in this experiment was subject to a potential-energy function $V(x, y, z)$ produced by the interaction of the atom's total spin magnetic moment with the applied magnetic field (Sections 6.8 and 10.9). The inhomogeneous applied magnetic field was such that the potential energy $V$ was that of a three-dimensional harmonic oscillator (Prob. 4.20) plus a constant. The Rb atoms in the Bose-Einstein condensate are in the ground state of this harmonic-oscillator potential.


Slater pointed out in 1929 that a determinant of the form (10.40) satisfies the antisymmetry requirement for a many-electron atom. A determinant like (10.40) is called a Slater determinant. All the elements in a given column of a Slater determinant involve the same spin-orbital, whereas elements in the same row all involve the same electron. (Since interchanging rows and columns does not affect the value of a determinant, we could write the Slater determinant in another, equivalent form.)

Consider how the zeroth-order helium wave functions that we found previously can be written as Slater determinants. For the ground-state configuration $1 s^{2}$, we have the spin-orbitals $1 s \alpha$ and $1 s \beta$, giving the Slater determinant

\(
\frac{1}{\sqrt{2}}\left|\begin{array}{ll}
1 s(1) \alpha(1) & 1 s(1) \beta(1) \\
1 s(2) \alpha(2) & 1 s(2) \beta(2)
\end{array}\right|=1 s(1) 1 s(2) \frac{1}{\sqrt{2}}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] \tag{10.41}
\)

which agrees with (10.26). For the states corresponding to the excited configuration $1 s 2 s$, we have the possible spin-orbitals $1 s \alpha, 1 s \beta, 2 s \alpha, 2 s \beta$, which give the four Slater determinants

\(
\begin{array}{ll}
D_{1}=\frac{1}{\sqrt{2}}\left|\begin{array}{ll}
1 s(1) \alpha(1) & 2 s(1) \alpha(1) \\
1 s(2) \alpha(2) & 2 s(2) \alpha(2)
\end{array}\right| & D_{2}=\frac{1}{\sqrt{2}}\left|\begin{array}{ll}
1 s(1) \alpha(1) & 2 s(1) \beta(1) \\
1 s(2) \alpha(2) & 2 s(2) \beta(2)
\end{array}\right| \\
D_{3}=\frac{1}{\sqrt{2}}\left|\begin{array}{ll}
1 s(1) \beta(1) & 2 s(1) \alpha(1) \\
1 s(2) \beta(2) & 2 s(2) \alpha(2)
\end{array}\right| & D_{4}=\frac{1}{\sqrt{2}}\left|\begin{array}{ll}
1 s(1) \beta(1) & 2 s(1) \beta(1) \\
1 s(2) \beta(2) & 2 s(2) \beta(2)
\end{array}\right|
\end{array}
\)

Comparison with (10.27) to (10.30) shows that the $1 s 2 s$ zeroth-order wave functions are related to these four Slater determinants as follows:

\(
\begin{gather}
2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] \alpha(1) \alpha(2)=D_{1} \tag{10.42}\\
2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] \beta(1) \beta(2)=D_{4} \tag{10.43}\\
2^{-1 / 2}[1 s(1) 2 s(2)-2 s(1) 1 s(2)] 2^{-1 / 2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)]=2^{-1 / 2}\left(D_{2}+D_{3}\right) \tag{10.44}\\
2^{-1 / 2}[1 s(1) 2 s(2)+2 s(1) 1 s(2)] 2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)]=2^{-1 / 2}\left(D_{2}-D_{3}\right) \tag{10.45}
\end{gather}
\)

(To get a zeroth-order function that is an eigenfunction of the spin and orbital angular-momentum operators, we sometimes have to take a linear combination of the Slater determinants of a configuration; see Chapter 11.)

Next, consider some notations used for Slater determinants. Instead of writing $\alpha$ and $\beta$ for spin functions, one often puts a bar over the spatial function to indicate the spin function $\beta$, and a spatial function without a bar implies the spin factor $\alpha$. With this notation, (10.40) is written as

\(
\psi^{(0)}=\frac{1}{\sqrt{6}}\left|\begin{array}{lll}
1 s(1) & \overline{1 s}(1) & 2 s(1) \\
1 s(2) & \overline{1 s}(2) & 2 s(2) \\
1 s(3) & \overline{1 s}(3) & 2 s(3)
\end{array}\right| \tag{10.46}
\)

Given the spin-orbitals occupied by the electrons, we can readily construct the Slater determinant. Therefore, a shorthand notation for Slater determinants that simply specifies the spin-orbitals is often used. In this notation, (10.46) is written as

\(
\begin{equation}
\psi^{(0)}=|1 s \overline{1 s} 2 s| \tag{10.47}
\end{equation}
\)

where the vertical lines indicate formation of the determinant and multiplication by $1 / \sqrt{6}$.
We showed that the factor $1 / \sqrt{6}$ normalizes a third-order Slater determinant constructed of orthonormal functions. The expansion of an $n$th-order determinant has $n!$ terms (Prob. 8.20). For an $n$th-order Slater determinant of orthonormal spin-orbitals, the same reasoning used in the third-order case shows that the normalization constant is $1 / \sqrt{n!}$. We always include a factor $1 / \sqrt{n!}$ in defining a Slater determinant of order $n$.


Let us carry out a perturbation treatment of the ground state of the lithium atom. Defining $e^{\prime}$ by $e^{\prime 2} \equiv e^{2} / 4 \pi \varepsilon_{0}$, we take
$\hat{H}^{0}=-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{3}^{2}-\frac{Z e^{\prime 2}}{r_{1}}-\frac{Z e^{\prime 2}}{r_{2}}-\frac{Z e^{\prime 2}}{r_{3}}, \quad \hat{H}^{\prime}=\frac{e^{\prime 2}}{r_{12}}+\frac{e^{\prime 2}}{r_{23}}+\frac{e^{\prime 2}}{r_{13}}$

We found in Section 10.5 that to satisfy the antisymmetry requirement, the ground-state configuration must be $1 s^{2} 2 s$. The correct zeroth-order wave function is (10.40):

\(
\begin{align}
\psi^{(0)}= & 6^{-1 / 2}[1 s(1) 1 s(2) 2 s(3) \alpha(1) \beta(2) \alpha(3)-1 s(1) 2 s(2) 1 s(3) \alpha(1) \alpha(2) \beta(3) \\
& -1 s(1) 1 s(2) 2 s(3) \beta(1) \alpha(2) \alpha(3)+1 s(1) 2 s(2) 1 s(3) \beta(1) \alpha(2) \alpha(3) \tag{10.48}\\
& +2 s(1) 1 s(2) 1 s(3) \alpha(1) \alpha(2) \beta(3)-2 s(1) 1 s(2) 1 s(3) \alpha(1) \beta(2) \alpha(3)]
\end{align}
\)

What is $E^{(0)}$? Each term in $\psi^{(0)}$ contains the product of two $1 s$ hydrogenlike functions and one $2 s$ hydrogenlike function, multiplied by a spin factor. $\hat{H}^{0}$ is the sum of three hydrogenlike Hamiltonians, one for each electron, and does not involve spin. Thus $\psi^{(0)}$ is a linear combination of terms, each of which is an eigenfunction of $\hat{H}^{0}$ with eigenvalue $E_{1 s}^{(0)}+E_{1 s}^{(0)}+E_{2 s}^{(0)}$, where these are hydrogenlike energies. Hence $\psi^{(0)}$ is an eigenfunction of $\hat{H}^{0}$ with eigenvalue $E_{1 s}^{(0)}+E_{1 s}^{(0)}+E_{2 s}^{(0)}$. Therefore [Eq. (6.94)],

\(
\begin{equation}
E^{(0)}=-\left(\frac{1}{1^{2}}+\frac{1}{1^{2}}+\frac{1}{2^{2}}\right)\left(\frac{Z^{2} e^{\prime 2}}{2 a_{0}}\right)=-\frac{81}{4}(13.606 \mathrm{eV})=-275.5 \mathrm{eV} \tag{10.49}
\end{equation}
\)
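As a quick arithmetic check on (10.49), the short Python snippet below (an addition to these notes, not part of the original treatment) reproduces the zeroth-order energy from the hydrogenlike levels $E_{n}^{(0)}=-(Z^{2}/n^{2})(13.606\ \mathrm{eV})$:

```python
# Zeroth-order energy of the Li ground state, Eq. (10.49).
# Each hydrogenlike orbital contributes -(Z^2/n^2) * 13.606 eV [Eq. (6.94)].
Z = 3                        # nuclear charge of lithium
eV_unit = 13.606             # e'^2 / 2a_0 in eV

n_values = [1, 1, 2]         # two 1s electrons and one 2s electron
E0 = -sum(Z**2 / n**2 for n in n_values) * eV_unit
print(f"E(0) = {E0:.1f} eV")   # -275.5 eV
```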

The evaluation of $E^{(1)}=\left\langle\psi^{(0)}\right| \hat{H}^{\prime}\left|\psi^{(0)}\right\rangle$ is outlined in Prob. 10.15. One finds

\(
\begin{align}
E^{(1)}=2 \iint 1 s^{2}(1) 2 s^{2}(2) \frac{e^{\prime 2}}{r_{12}} d v_{1} d v_{2} & +\iint 1 s^{2}(1) 1 s^{2}(2) \frac{e^{\prime 2}}{r_{12}} d v_{1} d v_{2} \\
& -\iint 1 s(1) 2 s(2) 1 s(2) 2 s(1) \frac{e^{\prime 2}}{r_{12}} d v_{1} d v_{2} \tag{10.50}
\end{align}
\)

These integrals are Coulomb and exchange integrals:

\(
\begin{equation}
E^{(1)}=2 J_{1 s 2 s}+J_{1 s 1 s}-K_{1 s 2 s} \tag{10.51}
\end{equation}
\)

We have [Eqs. (9.52), (9.53), and (9.111)]

\(
\begin{gathered}
J_{1 s 1 s}=\frac{5}{8} \frac{Z e^{\prime 2}}{a_{0}}, \quad J_{1 s 2 s}=\frac{17}{81} \frac{Z e^{\prime 2}}{a_{0}}, \quad K_{1 s 2 s}=\frac{16}{729} \frac{Z e^{\prime 2}}{a_{0}} \\
E^{(1)}=\frac{5965}{972}\left(\frac{e^{\prime 2}}{2 a_{0}}\right)=83.5 \mathrm{eV}
\end{gathered}
\)

The energy through first order is $-275.5 \mathrm{eV}+83.5 \mathrm{eV}=-192.0 \mathrm{eV}$, as compared with the true ground-state energy of lithium, $-203.5$ eV. To improve on this result, we could calculate higher-order wave-function and energy corrections. This would mix into the wave function contributions from Slater determinants involving configurations besides $1 s^{2} 2 s$ (configuration interaction).
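The first-order arithmetic is easy to verify; the added snippet below uses the Coulomb and exchange integrals just quoted (13.606 eV and 27.211 eV are the usual values of $e^{\prime 2}/2a_{0}$ and $e^{\prime 2}/a_{0}$):

```python
# First-order perturbation energy for Li (Z = 3), Eqs. (10.49)-(10.51).
Z = 3
au = 27.211                  # e'^2 / a_0 in eV

J_1s1s = (5/8)    * Z * au
J_1s2s = (17/81)  * Z * au
K_1s2s = (16/729) * Z * au

E1 = 2*J_1s2s + J_1s1s - K_1s2s          # Eq. (10.51)
E0 = -(81/4) * 13.606                    # Eq. (10.49)
print(f"E(1) = {E1:.1f} eV")             # about 83.5 eV
print(f"E(0)+E(1) = {E0 + E1:.1f} eV")   # -192.0 eV vs. true -203.5 eV
```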


The zeroth-order perturbation wave function (10.40) uses the full nuclear charge $(Z=3)$ for both the $1 s$ and $2 s$ orbitals of lithium. We expect that the $2 s$ electron, which is partially shielded from the nucleus by the two $1 s$ electrons, will see an effective nuclear charge that is much less than 3. Even the $1 s$ electrons partially shield each other (recall the treatment of the helium ground state). This reasoning suggests the introduction of two variational parameters $b_{1}$ and $b_{2}$ into (10.40).

Instead of using the $Z=3$ $1 s$ function in Table 6.2, we take

\(
\begin{equation}
f \equiv \frac{1}{\pi^{1 / 2}}\left(\frac{b_{1}}{a_{0}}\right)^{3 / 2} e^{-b_{1} r / a_{0}} \tag{10.52}
\end{equation}
\)

where $b_{1}$ is a variational parameter representing an effective nuclear charge for the $1 s$ electrons. Instead of the $Z=3$ $2 s$ function in Table 6.2, we use

\(
\begin{equation}
g=\frac{1}{4(2 \pi)^{1 / 2}}\left(\frac{b_{2}}{a_{0}}\right)^{3 / 2}\left(2-\frac{b_{2} r}{a_{0}}\right) e^{-b_{2} r / 2 a_{0}} \tag{10.53}
\end{equation}
\)

Our trial variation function is then

\(
\phi=\frac{1}{\sqrt{6}}\left|\begin{array}{lll}
f(1) \alpha(1) & f(1) \beta(1) & g(1) \alpha(1) \tag{10.54}\\
f(2) \alpha(2) & f(2) \beta(2) & g(2) \alpha(2) \\
f(3) \alpha(3) & f(3) \beta(3) & g(3) \alpha(3)
\end{array}\right|
\)

The use of different charges $b_{1}$ and $b_{2}$ for the $1 s$ and $2 s$ orbitals destroys their orthogonality, so (10.54) is not normalized. The best values of the variational parameters are found by setting $\partial W / \partial b_{1}=0$ and $\partial W / \partial b_{2}=0$, where the variational integral $W$ is given by the left side of Eq. (8.9). The results are [E. B. Wilson, Jr., J. Chem. Phys., 1, 210 (1933)] $b_{1}=2.686$, $b_{2}=1.776$, and $W=-201.2$ eV. $W$ is much closer to the true value $-203.5$ eV than the result $-192.0$ eV found in the last section. The value of $b_{2}$ shows substantial, but not complete, screening of the $2 s$ electron by the $1 s$ electrons.

We might try other forms for the orbitals besides (10.52) and (10.53) to improve the trial function. However, no matter what orbital functions we try, if we restrict ourselves to a trial function of the form of (10.54), we can never reach the true ground-state energy. To do this, we can introduce $r_{12}$, $r_{23}$, and $r_{13}$ into the trial function or use a linear combination of several Slater determinants corresponding to various configurations (configuration interaction).


Recall that the orbital angular momentum $\mathbf{L}$ of an electron has a magnetic moment $-\left(e / 2 m_{e}\right) \mathbf{L}$ associated with it [Eq. (6.128)], where $-e$ is the electron charge. It is natural to suppose that there is also a magnetic moment $\mathbf{m}_{S}$ associated with the electronic spin angular momentum $\mathbf{S}$. We might guess that $\mathbf{m}_{S}$ would be $-\left(e / 2 m_{e}\right)$ times $\mathbf{S}$. Spin is a relativistic phenomenon, however, and we cannot expect $\mathbf{m}_{S}$ to be related to $\mathbf{S}$ in exactly the same way that $\mathbf{m}_{L}$ is related to $\mathbf{L}$. In fact, Dirac's relativistic treatment of the electron gave the result that (in SI units)

\(
\begin{equation}
\mathbf{m}_{S}=-g_{e} \frac{e}{2 m_{e}} \mathbf{S} \tag{10.55}
\end{equation}
\)

where Dirac's treatment gave $g_{e}=2$ for the electron $g$ factor $g_{e}$. Theoretical and experimental work subsequent to Dirac's treatment has shown that $g_{e}$ is slightly greater than 2; $g_{e}=2(1+\alpha / 2 \pi+\cdots)=2.0023$, where the dots indicate terms involving higher powers of $\alpha$ and where the fine-structure constant $\alpha$ is

\(
\begin{equation}
\alpha \equiv \frac{e^{2}}{4 \pi \varepsilon_{0} \hbar c}=0.00729735257 \tag{10.56}
\end{equation}
\)

$g_{e}$ has been measured with extraordinary accuracy [D. Hanneke et al., Phys. Rev. A, 83, 052122 (2011); available at arxiv.org/abs/1009.4831]: $g_{e}=2.002319304361$. [Some workers prefer to omit the minus sign in (10.55) and use the convention that $g_{e}$ is negative; see www.stanford.edu/group/Zarelab/publinks/zarepub642.pdf.]
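The leading term of the series for $g_{e}$ is easy to evaluate numerically (an added check, using only the value of $\alpha$ from (10.56); the higher powers of $\alpha$ are dropped):

```python
# Electron g factor to first order in alpha: g_e = 2(1 + alpha/(2*pi) + ...).
import math

alpha = 0.00729735257              # fine-structure constant, Eq. (10.56)
g_e = 2 * (1 + alpha / (2 * math.pi))
print(f"g_e ~ {g_e:.5f}")          # 2.00232, vs. measured 2.002319304361
```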

The magnitude of the spin magnetic moment of an electron is (in SI units)

\(
\begin{equation}
\left|\mathbf{m}_{S}\right|=g_{e} \frac{e}{2 m_{e}}|\mathbf{S}|=g_{e} \sqrt{\frac{3}{4}} \frac{e \hbar}{2 m_{e}} \tag{10.57}
\end{equation}
\)

The ferromagnetism of iron is due to the electron's magnetic moment.

The two possible orientations of an electron's spin and its associated spin magnetic moment with respect to an axis produce two energy levels in an externally applied magnetic field. In electron-spin-resonance (ESR) spectroscopy, one observes transitions between these two levels. ESR spectroscopy is applicable to species such as free radicals and transition-metal ions that have one or more unpaired electron spins and hence have a nonzero total electron spin and spin magnetic moment.
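As a rough numerical illustration of the two-level picture (an addition to the text): for $m_{s}=\pm \frac{1}{2}$ the splitting is $h \nu=g_{e} \mu_{B} B$, where $\mu_{B}=e\hbar/2m_{e}$ is the Bohr magneton. The field value 0.34 T below is an assumed, typical X-band figure, not a number from the text:

```python
# ESR resonance frequency of a free electron spin: h*nu = g_e * mu_B * B.
# B = 0.34 T is an assumed, typical X-band field.
g_e  = 2.0023
mu_B = 9.2740e-24      # Bohr magneton, J/T
h    = 6.6261e-34      # Planck constant, J s
B    = 0.34            # applied magnetic field, T

nu = g_e * mu_B * B / h
print(f"nu = {nu/1e9:.1f} GHz")   # about 9.5 GHz (the microwave X band)
```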

NMR Spectroscopy

Many atomic nuclei have a nonzero spin angular momentum $\mathbf{I}$. Similar to (10.4) and (10.5), the magnitude of $\mathbf{I}$ is $[I(I+1)]^{1 / 2} \hbar$, where the nuclear-spin quantum number $I$ can be $0, \frac{1}{2}, 1$, and so on, and the $z$ component of $\mathbf{I}$ has the possible values $M_{I} \hbar$, where $M_{I}=-I,-I+1, \ldots, I$. Some $I$ values are: 0 for every nucleus with an even number of protons and an even number of neutrons (for example, ${ }_{8}^{16} \mathrm{O}$ and ${ }_{6}^{12} \mathrm{C}$); $\frac{1}{2}$ for ${ }_{1}^{1} \mathrm{H}$, ${ }_{6}^{13} \mathrm{C}$, ${ }_{9}^{19} \mathrm{~F}$, and ${ }_{15}^{31} \mathrm{P}$; 1 for ${ }_{1}^{2} \mathrm{H}$ and ${ }_{7}^{14} \mathrm{~N}$; $\frac{3}{2}$ for ${ }_{5}^{11} \mathrm{~B}$, ${ }_{11}^{23} \mathrm{Na}$, and ${ }_{17}^{35} \mathrm{Cl}$. If $I \neq 0$, the nucleus has a spin magnetic moment $\mathbf{m}_{I}$ given by an equation similar to (10.55):

\(
\begin{equation}
\mathbf{m}_{I}=g_{N}\left(e / 2 m_{p}\right) \mathbf{I} \equiv \gamma \mathbf{I} \tag{10.58}
\end{equation}
\)

where $m_{p}$ is the proton mass and the nuclear $g$ factor $g_{N}$ has a value characteristic of the nucleus. The quantity $\gamma$, called the magnetogyric ratio of the nucleus, is defined by $\gamma \equiv \mathbf{m}_{I} / \mathbf{I}=g_{N} e / 2 m_{p}$. Values of $I$, $g_{N}$, and $\gamma$ for some nuclei are

\(
\begin{array}{lcccccc}
\text { nucleus } & { }^{1} \mathrm{H} & { }^{12} \mathrm{C} & { }^{13} \mathrm{C} & { }^{15} \mathrm{~N} & { }^{19} \mathrm{~F} & { }^{31} \mathrm{P} \\
I & 1 / 2 & 0 & 1 / 2 & 1 / 2 & 1 / 2 & 1 / 2 \\
g_{N} & 5.58569 & - & 1.40482 & -0.56638 & 5.25773 & 2.2632 \\
\gamma /(\mathrm{MHz} / \mathrm{T}) & 267.522 & - & 67.283 & -27.126 & 251.815 & 108.39
\end{array}
\)

In nuclear-magnetic-resonance (NMR) spectroscopy, one observes transitions between nuclear-spin energy levels in an applied magnetic field. The sample (most commonly a dilute solution of the compound being studied) is placed between the poles of a strong magnet. The energy of interaction of an isolated nuclear-spin magnetic moment $\mathbf{m}_{I}$ with an external magnetic field $\mathbf{B}$ is given by Eq. (6.131) as $E=-\mathbf{m}_{I} \cdot \mathbf{B}$. Using (10.58) for $\mathbf{m}_{I}$ and taking the $z$ axis as coinciding with the direction of $\mathbf{B}$, we have

\(
E=-\mathbf{m}_{I} \cdot \mathbf{B}=-\gamma\left(I_{x} \mathbf{i}+I_{y} \mathbf{j}+I_{z} \mathbf{k}\right) \cdot(B \mathbf{k})=-\gamma B I_{z}
\)

We convert this classical expression for the energy into a Hamiltonian operator by replacing the classical quantity $I_{z}$ by the operator $\hat{I}_{z}$. Thus, $\hat{H}=-\gamma B \hat{I}_{z}$. Let $\left|M_{I}\right\rangle$ denote the function that is simultaneously an eigenfunction of the operators $\hat{I}^{2}$ (for the square of the magnitude of the nuclear-spin angular momentum) and $\hat{I}_{z}$. We have

\(
\begin{equation}
\hat{H}\left|M_{I}\right\rangle=-\gamma B \hat{I}_{z}\left|M_{I}\right\rangle=-\gamma B M_{I} \hbar\left|M_{I}\right\rangle \tag{10.59}
\end{equation}
\)

Therefore, (10.59) gives the energy levels of the isolated nuclear spin in the applied magnetic field as

\(
E=-\gamma \hbar B M_{I}, \quad M_{I}=-I,-I+1, \ldots, I
\)

In NMR spectroscopy, the sample is exposed to electromagnetic radiation that induces transitions between nuclear-spin energy levels. The selection rule turns out to be $\Delta M_{I}= \pm 1$. The NMR transition frequency $\nu$ is found as follows:

\(
\begin{align}
& h \nu=|\Delta E|=|\gamma| \hbar B\left|\Delta M_{I}\right|=|\gamma| \hbar B \\
& \nu=(|\gamma| / 2 \pi) B=\left|g_{N}\right|\left(e / 4 \pi m_{p}\right) B \tag{10.60}
\end{align}
\)

The value of $\gamma$ differs greatly for different nuclei, and in any one experiment one studies the NMR spectrum of one kind of nucleus. The most commonly studied nucleus is ${ }^{1} \mathrm{H}$, the proton. The second most studied nucleus is ${ }^{13} \mathrm{C}$, which occurs in about $1 \%$ natural abundance in carbon.
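For a feel for the numbers in (10.60), the added snippet below computes the proton resonance frequency; the 11.74 T field is an assumed value, chosen because it corresponds to a common "500 MHz" instrument:

```python
# Proton NMR frequency from Eq. (10.60): nu = (|gamma|/2*pi) * B.
import math

gamma_H = 267.522e6    # 1H magnetogyric ratio, 10^6 rad s^-1 T^-1 (table above)
B = 11.74              # assumed field in tesla

nu = gamma_H * B / (2 * math.pi)
print(f"nu = {nu/1e6:.0f} MHz")   # about 500 MHz
```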

Equation (10.60) is for a nucleus isolated except for the presence of the external magnetic field $\mathbf{B}$. For a nucleus present in a molecule, we also have to consider the contribution of the molecular electrons to the magnetic field felt by each nucleus. In most ground-state molecules, the electron spins are all paired and there is no electronic orbital angular momentum. With no electronic spin or orbital angular momentum, the electrons do not contribute to the magnetic field experienced by each nucleus. However, the external applied field $\mathbf{B}$ perturbs the molecular electronic wave function, thereby producing an electronic contribution to the magnetic field at each nucleus. This electronic contribution is proportional to the magnitude of the external field $B$ and is usually in the opposite direction to $\mathbf{B}$. Therefore, the magnetic field experienced by nucleus $i$ in a molecule is $B-\sigma_{i} B=\left(1-\sigma_{i}\right) B$, where the proportionality constant $\sigma_{i}$ is called the screening constant or shielding constant for nucleus $i$ and is much less than 1. For a nucleus in a molecule, Eq. (10.60) becomes

\(
\begin{equation}
\nu_{i}=(|\gamma| / 2 \pi)\left(1-\sigma_{i}\right) B \tag{10.61}
\end{equation}
\)

The value of $\sigma_{i}$ is the same for nuclei that are in the same electronic environment in the molecule. For example, in $\mathrm{CH}_{3} \mathrm{CH}_{2} \mathrm{OH}$, the three $\mathrm{CH}_{3}$ protons have the same $\sigma_{i}$, and the two $\mathrm{CH}_{2}$ protons have the same $\sigma_{i}$. (A Newman projection of ethanol, which has a staggered conformation, shows that two of the three $\mathrm{CH}_{3}$ hydrogens are closer to the OH group than is the third $\mathrm{CH}_{3}$ hydrogen, but the low barrier to internal rotation in ethanol allows the three methyl hydrogens to be rapidly interchanged at room temperature, thereby making the electronic environment the same for these three hydrogens.)

We might thus expect the ${ }^{1} \mathrm{H}$ NMR spectrum of ethanol to show three peaks: one for the $\mathrm{CH}_{3}$ protons, one for the $\mathrm{CH}_{2}$ protons, and one for the OH proton, with the relative intensities of these peaks being 3:2:1. However, there is an additional effect, called spin-spin coupling, in which the nuclear spins of the protons on one carbon affect the magnetic field experienced by the protons on an adjacent carbon. Different possible orientations of the proton spins on one carbon produce different magnetic fields at the protons of the adjacent carbon, thereby splitting the NMR transition of the protons at the adjacent carbon. For example, the two $\mathrm{CH}_{2}$ proton nuclei have the following four possible nuclear-spin orientations:

\(
\begin{equation}
\uparrow \uparrow \qquad \uparrow \downarrow \qquad \downarrow \uparrow \qquad \downarrow \downarrow \tag{10.62}
\end{equation}
\)

where the up and down arrows represent $M_{I}=\frac{1}{2}$ and $M_{I}=-\frac{1}{2}$, respectively. [Actually, because of the indistinguishability of identical particles, the middle two spin states in (10.62) must be replaced by linear combinations of these two states, to give nuclear-spin states that are analogous to the electron spin functions (10.22) to (10.25).] The middle two spin states in (10.62) have the same effect on the magnetic field felt by the $\mathrm{CH}_{3}$ protons, so the four spin states in (10.62) produce three different magnetic fields at the $\mathrm{CH}_{3}$ protons, thereby splitting the $\mathrm{CH}_{3}$ proton NMR absorption line into three closely spaced lines of relative intensities $1: 2: 1$, corresponding to the number of $\mathrm{CH}_{2}$ proton spin states that produce each magnetic field. One might expect the $\mathrm{CH}_{2}$ protons to also split the OH proton NMR line. However, even a trace of water present in the ethanol will catalyze a rapid exchange of the OH proton between different ethanol molecules, thereby eliminating the splitting of the OH line.

A similar analysis of the possible $\mathrm{CH}_{3}$ proton orientations (Prob. 10.22) shows that the $\mathrm{CH}_{3}$ protons split the ethanol $\mathrm{CH}_{2}$ proton NMR line into four lines of relative intensities 1:3:3:1. The general rule is that a group of $n$ equivalent protons on an atom splits the NMR line of protons on an adjacent atom into $n+1$ lines. The spin-spin splitting (which is transmitted through the chemical bonds) is too weak to affect the NMR lines of protons separated by more than three bonds from the protons doing the splitting. In the proton NMR spectrum of $\mathrm{CH}_{3} \mathrm{CH}_{2} \mathrm{C}(\mathrm{O}) \mathrm{H}$, the $\mathrm{CH}_{3}$ protons split the $\mathrm{CH}_{2}$ proton line into 4 lines, and each of these is split into two lines by the $\mathrm{C}(\mathrm{O}) \mathrm{H}$ proton, so the $\mathrm{CH}_{2}$ NMR line is split into 8 lines. (The intermolecular proton exchange in ethanol prevents the OH proton from splitting the $\mathrm{CH}_{2}$ proton NMR absorption.)
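The $n+1$ multiplet components have binomial (Pascal's-triangle) relative intensities, since each intensity counts the number of neighbor spin states producing a given field. A few added lines make the rule concrete:

```python
# First-order multiplet rule: n equivalent spin-1/2 neighbors split a line
# into n+1 components with binomial relative intensities.
from math import comb

def multiplet(n):
    """Relative intensities produced by n equivalent I = 1/2 nuclei."""
    return [comb(n, k) for k in range(n + 1)]

print(multiplet(2))   # [1, 2, 1]     CH2 protons splitting the CH3 line
print(multiplet(3))   # [1, 3, 3, 1]  CH3 protons splitting the CH2 line
```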

Nuclei with $I=0$ (for example, ${ }^{12} \mathrm{C}$, ${ }^{16} \mathrm{O}$) don't split proton NMR peaks. It turns out that nuclei with $I>\frac{1}{2}$ (for example, ${ }^{35} \mathrm{Cl}$, ${ }^{37} \mathrm{Cl}$, ${ }^{14} \mathrm{~N}$) generally don't split proton NMR peaks. The ${ }^{19} \mathrm{~F}$ nucleus has $I=\frac{1}{2}$ and does split proton NMR peaks. Also, a quantum-mechanical analysis shows that the spin-spin interactions between equivalent protons don't affect the NMR spectrum.

The treatment just given (called a first-order analysis) is actually an approximation that is valid provided that the spin-spin splittings are much smaller than all the NMR frequency differences between chemically nonequivalent nuclei. In very large molecules, there will likely be chemically nonequivalent nuclei that are in only slightly different electronic environments, so the NMR frequency differences between them will be quite small and the first-order analysis will not hold. By increasing the strength of the applied magnetic field, one increases the NMR frequency differences between chemically nonequivalent nuclei, thereby tending to make the spectrum first-order, which is easier to analyze. Also, the signal strength increases as the field is increased. Therefore, one uses as high a field as is feasible. Current NMR research spectrometers have fields that correspond to proton NMR frequencies in the range 300 to 1000 MHz.

NMR spectroscopy is the premier structural research tool in organic chemistry, and special NMR techniques allow the structures of small proteins to be determined as well.


The spin angular-momentum operators obey the general angular-momentum commutation relations of Section 5.4, and it is often helpful to use spin-angular-momentum ladder operators.

From (5.110) and (5.111), the raising and lowering operators for spin angular momentum are

\(
\begin{equation}
\hat{S}_{+}=\hat{S}_{x}+i \hat{S}_{y} \text { and } \hat{S}_{-}=\hat{S}_{x}-i \hat{S}_{y} \tag{10.63}
\end{equation}
\)

Equations (5.112) and (5.113) give

\(
\begin{align}
& \hat{S}_{+} \hat{S}_{-}=\hat{S}^{2}-\hat{S}_{z}^{2}+\hbar \hat{S}_{z} \tag{10.64}\\
& \hat{S}_{-} \hat{S}_{+}=\hat{S}^{2}-\hat{S}_{z}^{2}-\hbar \hat{S}_{z} \tag{10.65}
\end{align}
\)

The spin functions $\alpha$ and $\beta$ are eigenfunctions of $\hat{S}_{z}$ with eigenvalues $+\frac{1}{2} \hbar$ and $-\frac{1}{2} \hbar$, respectively. Since $\hat{S}_{+}$ is the raising operator, the function $\hat{S}_{+} \beta$ is an eigenfunction of $\hat{S}_{z}$ with eigenvalue $+\frac{1}{2} \hbar$. The most general eigenfunction of $\hat{S}_{z}$ with this eigenvalue is an arbitrary constant times $\alpha$. Hence

\(
\begin{equation}
\hat{S}_{+} \beta=c \alpha \tag{10.66}
\end{equation}
\)

where $c$ is some constant. To find $c$, we use normalization [Eq. (10.11)]:

\(
\begin{gather}
1=\sum_{m_{s}}\left[\alpha\left(m_{s}\right)\right]^{*} \alpha\left(m_{s}\right)=\sum\left(\hat{S}_{+} \beta / c\right)^{*}\left(\hat{S}_{+} \beta / c\right) \\
|c|^{2}=\sum\left(\hat{S}_{+} \beta\right)^{*} \hat{S}_{+} \beta=\sum\left(\hat{S}_{+} \beta\right)^{*}\left(\hat{S}_{x}+i \hat{S}_{y}\right) \beta \\
|c|^{2}=\sum\left(\hat{S}_{+} \beta\right)^{*} \hat{S}_{x} \beta+i \sum\left(\hat{S}_{+} \beta\right)^{*} \hat{S}_{y} \beta \tag{10.67}
\end{gather}
\)

We now use the Hermitian property of $\hat{S}_{x}$ and $\hat{S}_{y}$. For an operator $\hat{A}$ that acts on functions of the continuous variable $x$, the Hermitian property is

\(
\int_{-\infty}^{\infty} f^{*}(x) \hat{A} g(x) d x=\int_{-\infty}^{\infty} g(x)[\hat{A} f(x)]^{*} d x
\)

For an operator such as $\hat{S}_{x}$ that acts on functions of the variable $m_{s}$, which takes on discrete values, the Hermitian property is

\(
\begin{equation}
\sum_{m_{s}} f^{*}\left(m_{s}\right) \hat{S}_{x} g\left(m_{s}\right)=\sum_{m_{s}} g\left(m_{s}\right)\left[\hat{S}_{x} f\left(m_{s}\right)\right]^{*} \tag{10.68}
\end{equation}
\)

Taking $f=\hat{S}_{+} \beta$ and $g=\beta$, we can write (10.67) as

\(
c^{*} c=\sum \beta\left(\hat{S}_{x} \hat{S}_{+} \beta\right)^{*}+i \sum \beta\left(\hat{S}_{y} \hat{S}_{+} \beta\right)^{*}
\)

Taking the complex conjugate of this equation and using (10.63) and (10.65), we have

\(
\begin{aligned}
& c c^{*}=\sum \beta^{*} \hat{S}_{x} \hat{S}_{+} \beta-i \sum \beta^{*} \hat{S}_{y} \hat{S}_{+} \beta \\
& |c|^{2}=\sum \beta^{*}\left(\hat{S}_{x}-i \hat{S}_{y}\right) \hat{S}_{+} \beta=\sum \beta^{*} \hat{S}_{-} \hat{S}_{+} \beta \\
& |c|^{2}=\sum \beta^{*}\left(\hat{S}^{2}-\hat{S}_{z}^{2}-\hbar \hat{S}_{z}\right) \beta \\
& |c|^{2}=\sum \beta^{*}\left(\frac{3}{4} \hbar^{2}-\frac{1}{4} \hbar^{2}+\frac{1}{2} \hbar^{2}\right) \beta=\hbar^{2} \sum \beta^{*} \beta=\hbar^{2} \\
& \quad|c|=\hbar
\end{aligned}
\)

Choosing the phase of $c$ as zero, we have $c=\hbar$, and (10.66) reads

\(
\begin{equation}
\hat{S}_{+} \beta=\hbar \alpha \tag{10.69}
\end{equation}
\)

A similar calculation gives

\(
\begin{equation}
\hat{S}_{-} \alpha=\hbar \beta \tag{10.70}
\end{equation}
\)

Since $\alpha$ is the eigenfunction with the highest possible value of $m_{s}$, the operator $\hat{S}_{+}$ acting on $\alpha$ must annihilate it [Eq. (5.135)]:

\(
\hat{S}_{+} \alpha=0
\)

Likewise,

\(
\hat{S}_{-} \beta=0
\)

From these last four equations, we get

\(
\begin{equation}
\left(\hat{S}_{+}+\hat{S}_{-}\right) \beta=\hbar \alpha, \quad\left(\hat{S}_{+}-\hat{S}_{-}\right) \beta=\hbar \alpha \tag{10.71}
\end{equation}
\)

Use of (10.63) in (10.71) gives

\(
\begin{equation}
\hat{S}_{x} \beta=\frac{1}{2} \hbar \alpha, \quad \hat{S}_{y} \beta=-\frac{1}{2} i \hbar \alpha \tag{10.72}
\end{equation}
\)

Similarly, we find

\(
\begin{equation}
\hat{S}_{x} \alpha=\frac{1}{2} \hbar \beta, \quad \hat{S}_{y} \alpha=\frac{1}{2} i \hbar \beta \tag{10.73}
\end{equation}
\)

Matrix representatives of the spin operators are considered in Prob. 10.28.
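Although Prob. 10.28 is not worked here, the standard $2 \times 2$ matrix representatives (assumed here, with $\hbar=1$ and the basis $\alpha=(1,0)$, $\beta=(0,1)$) let one verify (10.69)-(10.73) directly:

```python
# Matrix check of the spin-1/2 ladder-operator results, with hbar = 1.
import numpy as np

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])
Splus, Sminus = Sx + 1j * Sy, Sx - 1j * Sy       # Eq. (10.63)

alpha = np.array([1, 0], dtype=complex)
beta  = np.array([0, 1], dtype=complex)

print(np.allclose(Splus @ beta, alpha))          # S+ beta = hbar*alpha, (10.69)
print(np.allclose(Sminus @ alpha, beta))         # S- alpha = hbar*beta, (10.70)
print(np.allclose(Splus @ alpha, 0))             # S+ alpha = 0
print(np.allclose(Sx @ beta, 0.5 * alpha))       # Eq. (10.72)
print(np.allclose(Sy @ beta, -0.5j * alpha))     # Eq. (10.72)
```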


Many-Electron Atoms

Click the keywords below to know more about them.

Hartree–Fock Self-Consistent-Field Method: A computational approach used to find approximate wave functions for many-electron atoms. It involves iteratively solving the Schrödinger equation by assuming each electron moves in the field created by the nucleus and a hypothetical charge cloud formed by the other electrons

Hamiltonian Operator: An operator representing the total energy of a system, including kinetic and potential energy. For an n-electron atom, it includes terms for electron-nucleus attractions, electron-electron repulsions, and the kinetic energy of the electrons

Central-Field Approximation: An approximation where the effective potential acting on an electron in an atom is assumed to depend only on the distance from the nucleus, simplifying calculations by averaging over the angular coordinates

Slater Determinant: A mathematical expression used to describe the wave function of a multi-electron system, ensuring the antisymmetry required by the Pauli exclusion principle. It is a determinant of spin-orbitals

Spin-Orbit Interaction: A relativistic effect where the electron's spin interacts with its orbital motion, leading to energy-level splitting. It is proportional to the dot product of the electron's spin and orbital angular momenta

Coulomb Integral: An integral representing the electrostatic interaction energy between two electrons in different orbitals. It is part of the Hartree–Fock energy calculation

Exchange Integral: An integral representing the exchange interaction energy between two electrons with the same spin in different orbitals. It arises from the antisymmetry of the wave function

Configuration Interaction (CI): A method to improve the accuracy of wave functions by considering contributions from multiple electron configurations. It involves expressing the wave function as a linear combination of configuration state functions

Electron Correlation: The interaction between electrons that is not accounted for in the Hartree–Fock method. It includes the instantaneous interactions that cause electrons to avoid each other, leading to a more accurate description of the system

Term Symbol: A notation used to describe the quantum state of an atom, including its total electronic spin and orbital angular momentum. It is written as $^{2S+1}L_J$, where $2S+1$ is the spin multiplicity, $L$ is the total orbital angular momentum, and $J$ is the total angular momentum

For the hydrogen atom, the exact wave function is known. For helium and lithium, very accurate wave functions have been calculated by including interelectronic distances in the variation functions. For atoms of higher atomic number, one way to find an accurate wave function is to first find an approximate wave function using the Hartree-Fock procedure, which we shall outline in this section. The Hartree-Fock method is the basis for the use of atomic and molecular orbitals in many-electron systems.

The Hamiltonian operator for an $n$-electron atom is

\( \begin{equation} \hat{H}=-\frac{\hbar^{2}}{2 m_{e}} \sum_{i=1}^{n} \nabla_{i}^{2}-\sum_{i=1}^{n} \frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{i}}+\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{i j}} \tag{11.1} \end{equation} \)

where an infinitely heavy point nucleus was assumed (Section 6.6). The first sum in (11.1) contains the kinetic-energy operators for the $n$ electrons. The second sum is the potential energy (6.58) for the attractions between the electrons and the nucleus of charge $Z e$. For a neutral atom, $Z=n$. The last sum is the potential energy of the interelectronic repulsions. The restriction $j>i$ avoids counting each interelectronic repulsion twice and avoids terms like $e^{2} / 4 \pi \varepsilon_{0} r_{i i}$. The Hamiltonian (11.1) is incomplete, because it omits spin-orbit and other interactions. The omitted terms are small (except for atoms with high $Z$) and will be considered in Sections 11.6 and 11.7.

The Hartree SCF Method

Because of the interelectronic repulsion terms $e^{2} / 4 \pi \varepsilon_{0} r_{i j}$, the Schrödinger equation for an atom is not separable. Recalling the perturbation treatment of helium (Section 9.3), we can obtain a zeroth-order wave function by neglecting these repulsions. The Schrödinger equation would then separate into $n$ one-electron hydrogenlike equations. The zeroth-order wave function would be a product of $n$ hydrogenlike (one-electron) orbitals:

\( \begin{equation} \psi^{(0)}=f_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right) f_{2}\left(r_{2}, \theta_{2}, \phi_{2}\right) \cdots f_{n}\left(r_{n}, \theta_{n}, \phi_{n}\right) \tag{11.2} \end{equation} \)

where the hydrogenlike orbitals are

\( \begin{equation} f=R_{n l}(r) Y_{l}^{m}(\theta, \phi) \tag{11.3} \end{equation} \)

For the ground state of the atom, we would feed two electrons with opposite spin into each of the lowest orbitals, in accord with the Pauli exclusion principle, giving the ground-state configuration. Although the approximate wave function (11.2) is qualitatively useful, it is gravely lacking in quantitative accuracy. For one thing, all the orbitals use the full nuclear charge $Z$. Recalling our variational treatments of helium and lithium, we know we can get a better approximation by using different effective atomic numbers for the different orbitals to account for screening of electrons. The use of effective atomic numbers gives considerable improvement, but we are still far from having an accurate wave function. The next step is to use a variation function that has the same form as (11.2) but is not restricted to hydrogenlike or any other particular form of orbitals. Thus we take

\( \begin{equation} \phi=g_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right) g_{2}\left(r_{2}, \theta_{2}, \phi_{2}\right) \cdots g_{n}\left(r_{n}, \theta_{n}, \phi_{n}\right) \tag{11.4} \end{equation} \)

and we look for the functions $g_{1}, g_{2}, \ldots, g_{n}$ that minimize the variational integral $\langle\phi| \hat{H}|\phi\rangle /\langle\phi \mid \phi\rangle$. Our task is harder than in previous variational calculations, where we guessed a trial function that included some parameters and then varied the parameters. In (11.4) we must vary the functions $g_{i}$. [After we have found the best possible functions $g_{i}$, Eq. (11.4) will still be only an approximate wave function. The many-electron Schrödinger equation is not separable, so the true wave function cannot be written as the product of $n$ one-electron functions.]

To simplify matters somewhat, we approximate the best possible atomic orbitals with orbitals that are the product of a radial factor and a spherical harmonic:

\( \begin{equation} g_{i}=h_{i}\left(r_{i}\right) Y_{l_{i}}^{m_{i}}\left(\theta_{i}, \phi_{i}\right) \tag{11.5} \end{equation} \)

This approximation is generally made in atomic calculations. The procedure for finding the functions $g_{i}$ was introduced by Hartree in 1928 and is called the Hartree self-consistent-field (SCF) method. Hartree arrived at the SCF procedure by intuitive physical arguments. The proof that Hartree's procedure gives the best possible variation function of the form (11.4) was given by Slater and by Fock in 1930. [For the proof and a review of the SCF method, see S. M. Blinder, Am. J. Phys., 33, 431 (1965).]

Hartree's procedure is as follows. We first guess a product wave function

\( \begin{equation} \phi_{0}=s_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right) s_{2}\left(r_{2}, \theta_{2}, \phi_{2}\right) \cdots s_{n}\left(r_{n}, \theta_{n}, \phi_{n}\right) \tag{11.6} \end{equation} \)

where each $s_{i}$ is a normalized function of $r$ multiplied by a spherical harmonic. A reasonable guess for $\phi_{0}$ would be a product of hydrogenlike orbitals with effective atomic numbers. For the function (11.6), the probability density of electron $i$ is $\left|s_{i}\right|^{2}$. We now focus attention on electron 1 and regard electrons $2,3, \ldots, n$ as being smeared out to form a fixed distribution of electric charge through which electron 1 moves. We are thus averaging out the instantaneous interactions between electron 1 and the other electrons. The potential energy of interaction between point charges $Q_{1}$ and $Q_{2}$ is $V_{12}=Q_{1} Q_{2} / 4 \pi \varepsilon_{0} r_{12}$ [Eq. (6.58)]. We now take $Q_{2}$ and smear it out into a continuous charge distribution such that $\rho_{2}$ is the charge density, the charge per unit volume. The infinitesimal charge in the infinitesimal volume $d v_{2}$ is $\rho_{2} d v_{2}$, and summing up the interactions between $Q_{1}$ and the infinitesimal elements of charge, we have

\( V_{12}=\frac{Q_{1}}{4 \pi \varepsilon_{0}} \int \frac{\rho_{2}}{r_{12}} d v_{2} \)

For electron 2 (with charge $-e$), the charge density of the hypothetical charge cloud is $\rho_{2}=-e\left|s_{2}\right|^{2}$, and for electron 1, $Q_{1}=-e$. Hence

\( V_{12}=\frac{e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\left|s_{2}\right|^{2}}{r_{12}} d v_{2} \)

Adding in the interactions with the other electrons, we have

\( V_{12}+V_{13}+\cdots+V_{1 n}=\sum_{j=2}^{n} \frac{e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\left|s_{j}\right|^{2}}{r_{1 j}} d v_{j} \)

The potential energy of interaction between electron 1 and the other electrons and the nucleus is then

\( \begin{equation} V_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right)=\sum_{j=2}^{n} \frac{e^{2}}{4 \pi \varepsilon_{0}} \int \frac{\left|s_{j}\right|^{2}}{r_{1 j}} d v_{j}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{1}} \tag{11.7} \end{equation} \)

We now make a further approximation beyond assuming the wave function to be a product of one-electron orbitals. We assume that the effective potential acting on an electron in an atom can be adequately approximated by a function of $r$ only. This central-field approximation can be shown to be generally accurate. We therefore average $V_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right)$ over the angles to arrive at a potential energy that depends only on $r_{1}$:

\( \begin{equation} V_{1}\left(r_{1}\right)=\frac{\int_{0}^{2 \pi} \int_{0}^{\pi} V_{1}\left(r_{1}, \theta_{1}, \phi_{1}\right) \sin \theta_{1} d \theta_{1} d \phi_{1}}{\int_{0}^{2 \pi} \int_{0}^{\pi} \sin \theta d \theta d \phi} \tag{11.8} \end{equation} \)

We now use $V_{1}\left(r_{1}\right)$ as the potential energy in a one-electron Schrödinger equation,

\( \begin{equation} \left[-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}+V_{1}\left(r_{1}\right)\right] t_{1}(1)=\varepsilon_{1} t_{1}(1) \tag{11.9} \end{equation} \)

and solve for $t_{1}(1)$, which will be an improved orbital for electron 1. In (11.9), $\varepsilon_{1}$ is the energy of the orbital of electron 1 at this stage of the approximation. Since the potential energy in (11.9) is spherically symmetric, the angular factor in $t_{1}(1)$ is a spherical harmonic involving quantum numbers $l_{1}$ and $m_{1}$ (Section 6.1). The radial factor $R_{1}(1)$ in $t_{1}$ is the solution of a one-dimensional Schrödinger equation of the form (6.17). We get a set of solutions $R_{1}(1)$, where the number of nodes $k$ interior to the boundary points ($r=0$ and $\infty$) starts at zero for the lowest energy and increases by 1 for each higher energy (Section 4.2). We now define the quantum number $n$ as $n \equiv l+1+k$, where $k=0,1,2, \ldots$. We thus have $1 s$, $2 s$, $2 p$, and so on, orbitals (with orbital energy $\varepsilon$ increasing with $n$) just as in hydrogenlike atoms, and the number of interior radial nodes $(n-l-1)$ is the same as in hydrogenlike atoms (Section 6.6). However, since $V_{1}\left(r_{1}\right)$ is not a simple Coulomb potential, the radial factor $R_{1}\left(r_{1}\right)$ is not a hydrogenlike function. Of the set of solutions $R_{1}\left(r_{1}\right)$, we take the one that corresponds to the orbital we are improving. For example, if electron 1 is a $1 s$ electron in the beryllium $1 s^{2} 2 s^{2}$ configuration, then $V_{1}\left(r_{1}\right)$ is calculated from the guessed orbitals of one $1 s$ electron and two $2 s$ electrons, and we use the radial solution of (11.9) with $k=0$ to find an improved $1 s$ orbital.

We now go to electron 2 and regard it as moving in a charge cloud of density

\( -e\left[\left|t_{1}(1)\right|^{2}+\left|s_{3}(3)\right|^{2}+\left|s_{4}(4)\right|^{2}+\cdots+\left|s_{n}(n)\right|^{2}\right] \)

due to the other electrons. We calculate an effective potential energy $V_{2}\left(r_{2}\right)$ and solve a one-electron Schrödinger equation for electron 2 to get an improved orbital $t_{2}(2)$. We continue this process until we have a set of improved orbitals for all $n$ electrons. Then we go back to electron 1 and repeat the process. We continue to calculate improved orbitals until there is no further change from one iteration to the next. The final set of orbitals gives the Hartree self-consistent-field wave function.
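The iterative structure of Hartree's procedure can be made concrete with a small numerical toy (an addition to these notes, not from the text). The sketch below does a Hartree SCF calculation for the helium ground state in atomic units: the radial equation for $u(r)=r R(r)$ is discretized by finite differences, the second electron is smeared into the charge cloud $|u|^{2}$, and the one-electron equation is re-solved until the orbital energy stops changing. The grid size, box length, and tolerance are arbitrary choices, and the crude quadrature is adequate only for illustration:

```python
# Minimal Hartree SCF toy for the helium ground state (atomic units).
# u(r) = r*R(r) obeys -u''/2 + [-Z/r + V_H(r)] u = eps*u, where V_H is the
# averaged repulsion from the other electron's smeared-out cloud |u|^2.
import numpy as np
from scipy.linalg import eigh_tridiagonal

Z, N, rmax = 2, 4000, 20.0
h = rmax / N
r = h * np.arange(1, N)                  # interior grid points

def lowest_orbital(V):
    """Lowest eigenpair of the finite-difference radial Hamiltonian."""
    diag = 1.0 / h**2 + V
    off = -0.5 / h**2 * np.ones(N - 2)
    eps, vec = eigh_tridiagonal(diag, off, select='i', select_range=(0, 0))
    u = vec[:, 0]
    u /= np.sqrt(np.sum(u**2) * h)       # normalize: integral of u^2 dr = 1
    return eps[0], u

def hartree_potential(u):
    """V_H(r): charge of the cloud inside r, over r, plus the outer tail."""
    rho = u**2 * h
    return np.cumsum(rho) / r + np.cumsum((rho / r)[::-1])[::-1]

eps, u = lowest_orbital(-Z / r)          # start from the bare-nucleus orbital
for iteration in range(50):
    eps_new, u = lowest_orbital(-Z / r + hartree_potential(u))
    if abs(eps_new - eps) < 1e-6:        # self-consistency reached
        break
    eps = eps_new
print(f"orbital energy: {eps_new:.3f} hartree")  # about -0.92 at convergence
```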

How do we get the energy of the atom in the SCF approximation? It seems natural to take the sum of the orbital energies of the electrons, $\varepsilon_{1}+\varepsilon_{2}+\cdots+\varepsilon_{n}$, but this is wrong. In calculating the orbital energy $\varepsilon_{1}$, we iteratively solved the one-electron Schrödinger equation (11.9). The potential energy in (11.9) includes, in an average way, the energy of the repulsions between electrons 1 and 2, 1 and 3, $\ldots$, 1 and $n$. When we solve for $\varepsilon_{2}$, we solve a one-electron Schrödinger equation whose potential energy includes repulsions between electrons 2 and 1, 2 and 3, $\ldots$, 2 and $n$. If we take $\sum_{i} \varepsilon_{i}$, we will count each interelectronic repulsion twice. To correctly obtain the total energy $E$ of the atom, we must take

\( \begin{align} E & =\sum_{i=1}^{n} \varepsilon_{i}-\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \iint \frac{e^{2}\left|g_{i}(i)\right|^{2}\left|g_{j}(j)\right|^{2}}{4 \pi \varepsilon_{0} r_{i j}} d v_{i} d v_{j} \tag{11.10}\\ E & =\sum_{i} \varepsilon_{i}-\sum_{i} \sum_{j>i} J_{i j} \end{align} \)

where the average repulsions of the electrons in the Hartree orbitals of (11.4) were subtracted from the sum of the orbital energies, and where the notation $J_{i j}$ was used for Coulomb integrals [Eq. (9.99)].
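The bookkeeping in (11.10) is easy to mistake; a tiny illustration with made-up numbers (purely hypothetical, not taken from any real atom) shows each pair repulsion being subtracted exactly once:

```python
# Eq. (11.10): the atom's energy is not the sum of orbital energies, because
# each eps_i already contains the average repulsions of electron i with all
# the others; each Coulomb integral J_ij must therefore be subtracted once.
eps = [-4.0, -4.0, -0.5]                       # hypothetical orbital energies
J = {(0, 1): 1.2, (0, 2): 0.3, (1, 2): 0.3}    # hypothetical Coulomb integrals

E = sum(eps) - sum(J.values())                 # one subtraction per pair i < j
print(E)                                       # -10.3
```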

The set of orbitals belonging to a given principal quantum number $n$ constitutes a shell. The $n=1,2,3, \ldots$ shells are the $K, L, M, \ldots$ shells, respectively. The orbitals belonging to a given $n$ and a given $l$ constitute a subshell. Consider the sum of the Hartree probability densities for the electrons in a filled subshell. Using (11.5), we have

\( \begin{equation} 2 \sum_{m=-l}^{l}\left|h_{n, l}(r)\right|^{2}\left|Y_{l}^{m}(\theta, \phi)\right|^{2}=2\left|h_{n, l}(r)\right|^{2} \sum_{m=-l}^{l}\left|Y_{l}^{m}(\theta, \phi)\right|^{2} \tag{11.11} \end{equation} \)

where the factor 2 comes from the pair of electrons in each orbital. The spherical-harmonic addition theorem (Merzbacher, Section 9.7) shows that the sum on the right side of (11.11) equals $(2 l+1) / 4 \pi$. Hence the sum of the probability densities is $[(2 l+1) / 2 \pi]\left|h_{n, l}(r)\right|^{2}$, which is independent of the angles. A closed subshell gives a spherically symmetric probability density, a result called Unsöld's theorem. For a half-filled subshell, the factor 2 is omitted from (11.11), and here also we get a spherically symmetric probability density.
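Unsöld's theorem is easy to verify numerically (an added check using SciPy's spherical harmonics; note that scipy.special.sph_harm takes the azimuthal angle before the polar angle):

```python
# Check of Unsold's theorem: sum over m of |Y_l^m|^2 = (2l+1)/(4*pi),
# independent of direction (tested here at one arbitrary direction).
import numpy as np
from scipy.special import sph_harm

l, azim, polar = 2, 0.4, 1.1
total = sum(abs(sph_harm(m, l, azim, polar))**2 for m in range(-l, l + 1))
print(total, (2 * l + 1) / (4 * np.pi))   # both about 0.39789
```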

The Hartree-Fock SCF Method

The alert reader may have realized that there is something fundamentally wrong with the Hartree product wave function (11.4). Although we have paid some attention to spin and the Pauli exclusion principle by putting no more than two electrons in each spatial orbital, any approximation to the true wave function should include spin explicitly and should be antisymmetric to interchange of electrons (Chapter 10). Hence, instead of the spatial orbitals, we must use spin-orbitals and must take an antisymmetric linear combination of products of spin-orbitals. This was pointed out by Fock (and by Slater) in 1930, and an SCF calculation that uses antisymmetrized spin-orbitals is called a Hartree-Fock calculation. We have seen that a Slater determinant of spin-orbitals provides the proper antisymmetry. For example, to carry out a Hartree-Fock calculation for the lithium ground state, we start with the function (10.54), where $f$ and $g$ are guesses for the $1 s$ and $2 s$ orbitals. We then carry out the SCF iterative process until we get no further improvement in $f$ and $g$. This gives the lithium ground-state Hartree-Fock wave function.

The differential equations for finding the Hartree-Fock orbitals have the same general form as (11.9):

\( \begin{equation} \hat{F} u_{i}=\varepsilon_{i} u_{i}, \quad i=1,2, \ldots, n \tag{11.12} \end{equation} \)

where $u_{i}$ is the $i$th spin-orbital, the operator $\hat{F}$, called the Fock (or Hartree-Fock) operator, is the effective Hartree-Fock Hamiltonian, and the eigenvalue $\varepsilon_{i}$ is the orbital energy of spin-orbital $i$. However, the Hartree-Fock operator $\hat{F}$ has extra terms as compared with the effective Hartree Hamiltonian given by the bracketed terms in (11.9). The Hartree-Fock expression for the total energy of the atom involves exchange integrals $K_{i j}$ in addition to the Coulomb integrals that occur in the Hartree expression (11.10). See Section 14.3. [Actually, Eq. (11.12) applies only when the Hartree-Fock wave function can be written as a single Slater determinant, as it can for closed-subshell atoms and atoms with only one electron outside closed subshells. When the Hartree-Fock wave function contains more than one Slater determinant, the Hartree-Fock equations are more complicated than (11.12).]

The orbital energy $\varepsilon_{i}$ in the Hartree-Fock equations (11.12) can be shown to be a good approximation to the negative of the energy needed to ionize a closed-subshell atom by removing an electron from spin-orbital $i$ (Koopmans' theorem; Section 15.5).

Originally, Hartree-Fock atomic calculations were done by using numerical methods to solve the Hartree-Fock differential equations (11.12), and the resulting orbitals were given as tables of the radial functions for various values of $r$. [The Numerov method (Sections 4.4 and 6.9) can be used to solve the radial Hartree-Fock equations for the radial factors in the Hartree-Fock orbitals; the angular factors are spherical harmonics. See D. R. Hartree, The Calculation of Atomic Structures, Wiley, 1957; C. Froese Fischer, The Hartree-Fock Method for Atoms, Wiley, 1977.]

In 1951, Roothaan proposed representing the Hartree-Fock orbitals as linear combinations of a complete set of known functions, called basis functions. Thus for lithium we would write the Hartree-Fock $1 s$ and $2 s$ spatial orbitals as

\( \begin{equation} f=\sum_{i} b_{i} \chi_{i}, \quad g=\sum_{i} c_{i} \chi_{i} \tag{11.13} \end{equation} \)

where the $\chi_{i}$ functions are some complete set of functions, and where the $b_{i}$'s and $c_{i}$'s are expansion coefficients that are found by the SCF iterative procedure. Since the $\chi_{i}$ (chi $i$) functions form a complete set, these expansions are valid. The Roothaan expansion procedure allows one to find the Hartree-Fock wave function using matrix algebra (see Section 14.3 for details). The Roothaan procedure is readily implemented on a computer and is often used to find atomic Hartree-Fock wave functions and nearly always used to find molecular Hartree-Fock wave functions.

A commonly used set of basis functions for atomic Hartree-Fock calculations is the set of Slater-type orbitals (STOs) whose normalized form is

\( \begin{equation} \frac{\left(2 \zeta / a_{0}\right)^{n+1 / 2}}{[(2 n)!]^{1 / 2}} r^{n-1} e^{-\zeta r / a_{0}} Y_{l}^{m}(\theta, \phi) \tag{11.14} \end{equation} \)

The set of all such functions with $n, l$, and $m$ being integers obeying (6.96)-(6.98) but with $\zeta$ having all possible positive values forms a complete set. The parameter $\zeta$ is called the orbital exponent. To get a truly accurate representation of the Hartree-Fock orbitals, we would have to include an infinite number of Slater orbitals in the expansions. In practice, one can get very accurate results by using only a few judiciously chosen Slater orbitals. (Another possibility is to use Gaussian-type basis functions; see Section 15.4.)
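As a check on the normalization constant in (11.14), the radial factor of an STO should satisfy $\int_{0}^{\infty} R^{2} r^{2} d r=1$ (the angular normalization is carried by the spherical harmonic). A short added verification, in atomic units with two arbitrarily chosen $(n, \zeta)$ pairs:

```python
# Radial normalization of a Slater-type orbital, Eq. (11.14), with a0 = 1:
# R(r) = N r^(n-1) exp(-zeta*r), N = (2*zeta)^(n+1/2) / sqrt((2n)!).
from math import factorial, sqrt, exp
from scipy.integrate import quad

def sto_radial(r, n, zeta):
    N = (2 * zeta) ** (n + 0.5) / sqrt(factorial(2 * n))
    return N * r ** (n - 1) * exp(-zeta * r)

for n, zeta in [(1, 1.6875), (2, 0.65)]:   # illustrative exponents
    val, _ = quad(lambda r: sto_radial(r, n, zeta) ** 2 * r ** 2, 0, 50)
    print(n, zeta, round(val, 6))          # 1.0 in each case
```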

Clementi and Roetti did Hartree-Fock calculations for the ground state and some excited states of the first 54 elements of the periodic table [E. Clementi and C. Roetti, At. Data Nucl. Data Tables, 14, 177 (1974); Bunge and co-workers have recalculated these wave functions; C. F. Bunge et al., At. Data Nucl. Data Tables, 53, 113 (1993); Phys. Rev. A, 46, 3691 (1992); these atomic wave functions can be found at www.ccl.net/cca/data/atomic-RHF-wavefunctions/index.shtml]. For example, consider the Hartree-Fock ground-state wave function of helium, which has the form [see Eq. (10.41)]

\( f(1) f(2) \cdot 2^{-1 / 2}[\alpha(1) \beta(2)-\alpha(2) \beta(1)] \)

Clementi and Roetti expressed the $1 s$ orbital function $f$ as the following combination of five $1 s$ Slater-type orbitals:

\( f=\pi^{-1 / 2} \sum_{i=1}^{5} c_{i}\left(\frac{\zeta_{i}}{a_{0}}\right)^{3 / 2} e^{-\zeta_{i} r / a_{0}} \)

where the expansion coefficients $c_{i}$ are $c_{1}=0.76838$, $c_{2}=0.22346$, $c_{3}=0.04082$, $c_{4}=-0.00994$, $c_{5}=0.00230$ and where the orbital exponents $\zeta_{i}$ are $\zeta_{1}=1.41714$, $\zeta_{2}=2.37682$, $\zeta_{3}=4.39628$, $\zeta_{4}=6.52699$, $\zeta_{5}=7.94252$. [The largest term in the expansion has an orbital exponent that is similar to the orbital exponent (9.62) for the simple trial function (9.56).] The Hartree-Fock energy is $-77.9$ eV, as compared with the true nonrelativistic energy, $-79.0$ eV. The $1 s$ orbital energy corresponding to $f$ was found to be $-25.0$ eV, as compared with the experimental helium ionization energy of 24.6 eV.
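Since the five coefficients and exponents are given in full, the orbital itself is easy to rebuild. The added snippet below tabulates Clementi and Roetti's $f(r)$ and confirms numerically that it is normalized:

```python
# Clementi-Roetti helium 1s orbital (atomic units, a0 = 1), with a
# numerical normalization check: integral of 4*pi*f^2*r^2 dr should be 1.
import numpy as np
from scipy.integrate import quad

c    = [0.76838, 0.22346, 0.04082, -0.00994, 0.00230]
zeta = [1.41714, 2.37682, 4.39628, 6.52699, 7.94252]

def f(r):
    return np.pi ** -0.5 * sum(ci * zi ** 1.5 * np.exp(-zi * r)
                               for ci, zi in zip(c, zeta))

norm, _ = quad(lambda r: 4 * np.pi * f(r) ** 2 * r ** 2, 0, 40)
print(round(norm, 4))      # very close to 1: the expansion is normalized
```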

For the lithium ground state, Clementi and Roetti used a basis set consisting of two $1 s$ STOs (with different orbital exponents) and four $2 s$ STOs (with different orbital exponents). The lithium $1 s$ and $2 s$ Hartree-Fock orbitals were each expressed as a linear combination of all six of these basis functions. The Hartree-Fock energy is $-202.3$ eV, as compared with the true energy $-203.5$ eV.

Electron densities calculated from Hartree-Fock wave functions are quite accurate. Figure 11.1 compares the radial distribution function of argon (found by integrating the electron density over the angles $\theta$ and $\phi$ and multiplying the result by $r^{2}$ ) calculated by the Hartree-Fock method with the experimental radial distribution function found by electron diffraction. (Recall from Section 6.6 that the radial distribution function is proportional to the probability of finding an electron in a thin spherical shell at a distance $r$ from the nucleus.) Note the electronic shell structure in Fig. 11.1. The high nuclear charge in ${ }_{18} \mathrm{Ar}$ makes the average distance of the $1 s$ electrons from the nucleus far less than in H or He . Thus there is only a moderate increase in atomic size as we go down a given group in the periodic table. Calculations show that the radius of a sphere containing $98 \%$ of the Hartree-Fock electron probability density gives an atomic radius in good agreement with the empirically determined van der Waals radius. [See C. W. Kammeyer and D. R. Whitman, J. Chem. Phys., 56, 4419 (1972).]

Although the radial distribution function of an atom shows the shell structure, the electron probability density integrated over the angles and plotted versus $r$ does not oscillate. Rather, for ground-state atoms this probability density is a maximum at the nucleus (because of the $s$ electrons) and continually decreases as $r$ increases. Similarly, in molecules the maxima in electron probability density usually occur at the nuclei; see, for example, Fig. 13.7. [For further discussion, see H. Weinstein, P. Politzer, and S. Srebnik, Theor. Chim. Acta, 38, 159 (1975).]

FIGURE 11.1 Radial distribution function in Ar as a function of $r$. The broken line is the result of a Hartree-Fock calculation. The solid line is the result of electron-diffraction data. [Reprinted figure with permission from L.S. Bartell and L. O. Brockway, Physical Review Series II, Vol 90, 833, 1953. Copyright 1953 by the American Physical Society.]

Accurate representation of a many-electron atomic orbital (AO) requires a linear combination of several Slater-type orbitals. For rough calculations, it is convenient to have simple approximations for AOs. We might use hydrogenlike orbitals with effective nuclear charges, but Slater suggested an even simpler method: to approximate an AO by a single function of the form (11.14) with the orbital exponent $\zeta$ taken as

\( \begin{equation} \zeta=(Z-s) / n \tag{11.15} \end{equation} \)

where $Z$ is the atomic number, $n$ is the orbital's principal quantum number, and $s$ is a screening constant calculated by a set of rules (see Prob. 15.62). A Slater orbital replaces the polynomial in $r$ in a hydrogenlike orbital with a single power of $r$. Hence a single Slater orbital does not have the proper number of radial nodes and does not represent well the inner part of an orbital.
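As a small illustration of (11.15): for a helium $1 s$ electron, Slater's rules (which the text defers to Prob. 15.62) assign a screening contribution of 0.30 from the other $1 s$ electron; that value is assumed here rather than derived:

```python
# Slater orbital exponent, Eq. (11.15): zeta = (Z - s) / n.
# The screening constant s = 0.30 for a He 1s electron is assumed from
# Slater's rules (Prob. 15.62), not derived in the text above.
def slater_zeta(Z, n, s):
    return (Z - s) / n

print(slater_zeta(2, 1, 0.30))   # 1.70, close to the variational 27/16 = 1.6875
```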

A lot of computation is required to perform a Hartree-Fock SCF calculation for a many-electron atom. Hartree did several SCF calculations in the 1930s, when electronic computers were not in existence. Fortunately, Hartree's father, a retired engineer, enjoyed numerical computation as a hobby and helped his son. Nowadays computers have replaced Hartree's father.

The orbital concept and the Pauli exclusion principle allow us to understand the periodic table of the elements. An orbital is a one-electron spatial wave function. We have used orbitals to obtain approximate wave functions for many-electron atoms, writing the wave function as a Slater determinant of one-electron spin-orbitals. In the crudest approximation, we neglect all interelectronic repulsions and obtain hydrogenlike orbitals. The best possible orbitals are the Hartree-Fock SCF functions. We build up the periodic table by feeding electrons into these orbitals, each of which can hold a pair of electrons with opposite spin.

Latter [R. Latter, Phys. Rev., 99, 510 (1955)] calculated approximate orbital energies for the atoms of the periodic table by replacing the complicated expression for the Hartree-Fock potential energy in the Hartree-Fock radial equations by a much simpler function obtained from the Thomas-Fermi-Dirac method, which uses ideas of statistical mechanics to get approximations to the effective potential-energy function for an electron and the electron-density function in an atom (Bethe and Jackiw, Chapter 5). Figure 11.2 shows Latter's resulting orbital energies for neutral ground-state atoms. These AO energies are in pretty good agreement with both Hartree-Fock and experimentally found orbital energies (see J. C. Slater, Quantum Theory of Matter, 2nd ed., McGraw-Hill, 1968, pp. 146, 147, 325, 326).

Orbital energies change with changing atomic number $Z$. As $Z$ increases, the orbital energies decrease because of the increased attraction between the nucleus and the electrons. This decrease is most rapid for the inner orbitals, which are less well-shielded from the nucleus.

For $Z>1$, orbitals with the same value of $n$ but different $l$ have different energies. For example, for the $n=3$ orbital energies, we have $\varepsilon_{3 s}<\varepsilon_{3 p}<\varepsilon_{3 d}$ for $Z>1$. The splitting of these levels, which are degenerate in the hydrogen atom, arises from the interelectronic repulsions. (Recall the perturbation treatment of helium in Section 9.7.) In the limit $Z \rightarrow \infty$, orbitals with the same value of $n$ are again degenerate, because the interelectronic repulsions become insignificant in comparison with the electron-nucleus attractions.

The relative positions of certain orbitals change with changing $Z$. Thus in hydrogen the $3 d$ orbital lies below the $4 s$ orbital, but for $Z$ in the range from 7 through 20 the $4 s$ is below the $3 d$. For large values of $Z$, the $3 d$ is again lower. At $Z=19$, the $4 s$ is lower; hence ${ }_{19} \mathrm{~K}$ has the ground-state configuration $1 s^{2} 2 s^{2} 2 p^{6} 3 s^{2} 3 p^{6} 4 s$. Recall that $s$ orbitals are more penetrating than $p$ or $d$ orbitals; this allows the $4 s$ orbital to lie below the $3 d$ orbital for some values of $Z$. Note the sudden drop in the $3 d$ energy, which starts at $Z=21$, when filling of the $3 d$ orbital begins. The electrons of the $3 d$ orbital do not shield each other very well; hence the sudden drop in $3 d$ energy. Similar drops occur for other orbitals.

FIGURE 11.2 Atomic-orbital energies as a function of atomic number for neutral atoms, as calculated by Latter. [Reprinted figure with permission from R. Latter, redrawn by M. Kasha, Physical Review Series II, 90, 510, 1955. Copyright 1955 by the American Physical Society.] Note the logarithmic scales. $E_{H}$ is the ground-state hydrogen-atom energy, $-13.6$ eV.

To help explain the observed electron configurations of the transition elements and their ions, Vanquickenborne and co-workers calculated Hartree-Fock $3 d$ and $4 s$ orbital energies for atoms and ions for $Z=1$ to $Z=29$ [L. G. Vanquickenborne et al., Inorg. Chem., 28, 1805 (1989); J. Chem. Educ., 71, 469 (1994)].

One complication is that a given electron configuration may give rise to many states. [For example, recall the several states of the He $1 s 2 s$ and $1 s 2 p$ configurations (Sections 9.7 and 10.4).] To avoid this complication, Vanquickenborne and co-workers calculated Hartree-Fock orbitals and orbital energies by minimizing the average energy $E_{\text {av }}$ of the states of a given electron configuration, instead of by minimizing the energy of each individual state of the configuration. The average orbitals obtained differ only slightly from the true Hartree-Fock orbitals for a given state of the configuration.

For each of the atoms ${ }_{1} \mathrm{H}$ to ${ }_{19} \mathrm{~K}$, Vanquickenborne and co-workers calculated the $3 d$ average orbital energy $\varepsilon_{3 d}$ for the electron configuration in which one electron is removed from the highest-occupied orbital of the ground-state electron configuration and put in the $3 d$ orbital; they calculated $\varepsilon_{4 s}$ for these atoms in a similar manner. In agreement with Fig. 11.2, they found $\varepsilon_{3 d}<\varepsilon_{4 s}$ for atomic numbers $Z<6$ and $\varepsilon_{4 s}<\varepsilon_{3 d}$ for $Z=7$ to 19 for neutral atoms.

For discussion of the transition elements with $Z$ from 21 to 29, Fig. 11.2 is inadequate because it gives only a single value of $\varepsilon_{3 d}$ for each element, whereas $\varepsilon_{3 d}$ (and $\varepsilon_{4 s}$) for a given atom depend on which orbitals are occupied. This is because the electric field experienced by an electron depends on which orbitals are occupied. Vanquickenborne and co-workers calculated $\varepsilon_{3 d}$ and $\varepsilon_{4 s}$ for each of the valence-electron configurations $3 d^{n} 4 s^{2}$, $3 d^{n+1} 4 s^{1}$, and $3 d^{n+2} 4 s^{0}$ and found $\varepsilon_{3 d}<\varepsilon_{4 s}$ in each of these configurations of the neutral atoms and the +1 and +2 ions of the transition elements ${ }_{21} \mathrm{Sc}$ through ${ }_{29} \mathrm{Cu}$ (which is the order shown in Fig. 11.2).

Since $3 d$ lies below $4 s$ for $Z$ above 20, one might wonder why the ground-state configuration of, say, ${ }_{21} \mathrm{Sc}$ is $3 d^{1} 4 s^{2}$, rather than $3 d^{3}$. Although $\varepsilon_{3 d}<\varepsilon_{4 s}$ for each of these configurations, this does not mean that the $3 d^{3}$ configuration has the lower sum of orbital energies. When an electron is moved from $4 s$ into $3 d$, $\varepsilon_{4 s}$ and $\varepsilon_{3 d}$ are increased. An orbital energy is found by solving a one-electron Hartree-Fock equation that contains potential-energy terms for the average repulsions between the electron in orbital $i$ and the other electrons in the atom, so $\varepsilon_{i}$ depends on the values of these repulsions and hence on which orbitals are occupied. For the first series of transition elements, the $4 s$ orbital is much larger than the $3 d$ orbital. For example, Vanquickenborne and co-workers found the following $\langle r\rangle$ values in Sc: $\langle r\rangle_{3 d}=0.89 \AA$ and $\langle r\rangle_{4 s}=2.09 \AA$ for $3 d^{1} 4 s^{2}$; $\langle r\rangle_{3 d}=1.11 \AA$ and $\langle r\rangle_{4 s}=2.29 \AA$ for $3 d^{2} 4 s^{1}$. Because of this size difference, repulsions involving $4 s$ electrons are substantially less than repulsions involving $3 d$ electrons, and we have $(4 s, 4 s)<(4 s, 3 d)<(3 d, 3 d)$, where $(4 s, 3 d)$ denotes the average repulsion between an electron distributed over the $3 d$ orbitals and an electron in a $4 s$ orbital. (These repulsions are expressed in terms of Coulomb and exchange integrals.) When an electron is moved from $4 s$ into $3 d$, the increase in interelectronic repulsion that is a consequence of the preceding inequalities raises the orbital energies $\varepsilon_{3 d}$ and $\varepsilon_{4 s}$. For example, for ${ }_{21} \mathrm{Sc}$, the $3 d^{1} 4 s^{2}$ configuration has $\varepsilon_{3 d}=-9.35 \mathrm{eV}$ and $\varepsilon_{4 s}=-5.72 \mathrm{eV}$, whereas the $3 d^{2} 4 s^{1}$ configuration has $\varepsilon_{3 d}=-5.23 \mathrm{eV}$ and $\varepsilon_{4 s}=-5.06 \mathrm{eV}$. For the $3 d^{1} 4 s^{2}$ configuration, the sum of valence-electron orbital energies is $-9.35 \mathrm{eV}+2(-5.72 \mathrm{eV})=-20.79 \mathrm{eV}$, whereas for the $3 d^{2} 4 s^{1}$ configuration, this sum is $2(-5.23 \mathrm{eV})-5.06 \mathrm{eV}=-15.52 \mathrm{eV}$. Thus, despite the fact that $\varepsilon_{3 d}<\varepsilon_{4 s}$ for each configuration, transfer of an electron from $4 s$ to $3 d$ raises the sum of valence-electron orbital energies in Sc. [As we saw in Eq. (11.10) for the Hartree method and will see in Section 14.3 for the Hartree-Fock method, the Hartree and Hartree-Fock expressions for the energy of an atom contain terms in addition to the sum of orbital energies, so we must look at more than the sum of orbital energies to see which configuration is most stable.]

For the +2 ions of the transition metals, the reduction in screening makes the valence $3 d$ and $4 s$ electrons feel a larger effective nuclear charge $Z_{\text {eff }}$ than in the neutral atoms. By analogy to the H-atom equation $E=-\left(Z^{2} / n^{2}\right)\left(e^{2} / 8 \pi \varepsilon_{0} a\right)$ [Eq. (6.94)], the orbital energies $\varepsilon_{3 d}$ and $\varepsilon_{4 s}$ are each roughly proportional to $Z_{\text {eff }}^{2}$, and so is the energy difference $\varepsilon_{4 s}-\varepsilon_{3 d}$. The difference $\varepsilon_{4 s}-\varepsilon_{3 d}$ is thus much larger in the transition-metal ions than in the neutral atoms; the increase in valence-electron repulsion is no longer enough to make the $4 s$ to $3 d$ transfer energetically unfavorable; and the +2 ions have ground-state configurations with no $4 s$ electrons.

For further discussion of electron configurations, see W. H. E. Schwarz, J. Chem. Educ., 87, 444 (2010); Schwarz and R. L. Rich, ibid., 87, 435.

Figure 11.2 shows that the separation between $n s$ and $n p$ orbitals is much less than that between $n p$ and $n d$ orbitals, giving the familiar $n s^{2} n p^{6}$ stable octet.

The orbital concept is the basis for most qualitative discussions of the chemistry of atoms and molecules. The use of orbitals, however, is an approximation. To reach the true wave function, we must go beyond a Slater determinant of spin-orbitals.


Energies calculated by the Hartree-Fock method are typically in error by about $\frac{1}{2} \%$ for light atoms. On an absolute basis this is not much, but for the chemist it is too large. For example, the total energy of the carbon atom is about -1000 eV , and $\frac{1}{2} \%$ of this is 5 eV . Chemical single-bond energies run about 5 eV . Calculating a bond energy by taking the difference between Hartree-Fock molecular and atomic energies, which are in error by several electronvolts for light atoms, is an unreliable procedure. We must seek a way to improve Hartree-Fock wave functions and energies. (Our discussion will apply to molecules as well as atoms.)

A Hartree-Fock SCF wave function takes into account the interactions between electrons only in an average way. Actually, we must consider the instantaneous interactions between electrons. Since electrons repel each other, they tend to keep out of each other's way. For example, in helium, if one electron is close to the nucleus at a given instant, it is energetically more favorable for the other electron to be far from the nucleus at that instant. One sometimes speaks of a Coulomb hole surrounding each electron in an atom. This is a region in which the probability of finding another electron is small. The motions of electrons are correlated with each other, and we speak of electron correlation. We must find a way to introduce the instantaneous electron correlation into the wave function.

Actually, a Hartree-Fock wave function does have some instantaneous electron correlation. A Hartree-Fock function satisfies the antisymmetry requirement. Therefore [Eq. (10.20)], it vanishes when two electrons with the same spin have the same spatial coordinates. For a Hartree-Fock function, there is little probability of finding electrons of the same spin in the same region of space, so a Hartree-Fock function has some correlation of the motions of electrons with the same spin. This makes the Hartree-Fock energy lower than the Hartree energy. One sometimes refers to a Fermi hole around each electron in a Hartree-Fock wave function, thereby indicating a region in which the probability of finding another electron with the same spin is small.

The correlation energy $E_{\text{corr}}$ is the difference between the exact nonrelativistic energy $E_{\text{nonrel}}$ and the (nonrelativistic) Hartree-Fock energy $E_{\mathrm{HF}}$:

\(
\begin{equation}
E_{\mathrm{corr}} \equiv E_{\mathrm{nonrel}}-E_{\mathrm{HF}} \tag{11.16}
\end{equation}
\)

where $E_{\text{nonrel}}$ and $E_{\mathrm{HF}}$ should both either include corrections for nuclear motion or omit these corrections. For the He atom, the (nonrelativistic) Hartree-Fock energy uncorrected for nuclear motion is $-2.86168\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$ [E. Clementi and C. Roetti, At. Data Nucl. Data Tables, 14, 177 (1974)] and variational calculations (Section 9.4) give the exact nonrelativistic energy uncorrected for nuclear motion as $-2.90372\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$. Therefore, $E_{\text{corr,He}}=-2.90372\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)+2.86168\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)=-0.04204\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)=-1.14 \mathrm{eV}$. For atoms and molecules where $E_{\text{nonrel}}$ cannot be accurately calculated, one combines the experimental energy with estimates for relativistic and nuclear-motion corrections to get $E_{\text{nonrel}}$. For neutral ground-state atoms, $\left|E_{\text{corr}}\right|$ has been found to increase roughly linearly with the number $n$ of electrons:

\(
E_{\text{corr}} \approx -0.0170\, n^{1.31}\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)=-0.0170\, n^{1.31}(27.2 \mathrm{eV})
\)

[E. Clementi and G. Corongiu, Int. J. Quantum Chem., 62, 571 (1997)]. The percentage $\left(E_{\text{corr}} / E_{\text{nonrel}}\right) \times 100 \%$ decreases with increasing atomic number. Some values are $0.6 \%$ for Li, $0.4 \%$ for C, $0.2 \%$ for Na, and $0.1 \%$ for K.
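As a quick numerical check of this empirical formula, the short Python sketch below (an illustration, not part of the original text) evaluates it for helium ($n=2$) and converts to electronvolts; the result is close to the accurate value $-1.14$ eV found above.

```python
# Evaluate the Clementi-Corongiu estimate E_corr ~ -0.0170 n^1.31 hartree.
HARTREE_TO_EV = 27.2  # e^2/(4 pi eps0 a0) expressed in eV, as in the text

def corr_energy_estimate(n_electrons):
    """Empirical correlation energy (hartrees) of a neutral ground-state atom."""
    return -0.0170 * n_electrons ** 1.31

e_corr = corr_energy_estimate(2)  # helium, n = 2
print(f"E_corr(He) ~ {e_corr:.5f} hartree = {e_corr * HARTREE_TO_EV:.2f} eV")
# prints roughly -0.04216 hartree = -1.15 eV, close to the -1.14 eV above
```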

We have already indicated two of the ways in which we may provide for instantaneous electron correlation. One method is to introduce the interelectronic distances $r_{i j}$ into the wave function (Section 9.4).

Another method is configuration interaction. We found (Sections 9.3 and 10.4) the zeroth-order wave function for the helium-atom $1 s^{2}$ ground state to be $1 s(1) 1 s(2)[\alpha(1) \beta(2)-\beta(1) \alpha(2)] / \sqrt{2}$. We remarked that first- and higher-order corrections to the wave function will mix in contributions from excited configurations, producing configuration interaction (CI), also called configuration mixing (CM).

The most common way to do a configuration-interaction calculation on an atom or molecule uses the variation method. One starts by choosing a basis set of one-electron functions $\chi_{i}$. In principle, this basis set should be complete. In practice, one is limited to a basis set of finite size. One hopes that a good choice of basis functions will give a good approximation to a complete set. For atomic calculations, STOs [Eq. (11.14)] are often chosen as the basis functions.

The SCF atomic (or molecular) orbitals $\phi_{i}$ are written as linear combinations of the basis-set members [see (11.13)], and the Hartree-Fock equations (11.12) are solved to give the coefficients in these linear combinations. The number of atomic (or molecular) orbitals obtained equals the number of basis functions used. The lowest-energy orbitals are the occupied orbitals for the ground state. The remaining unoccupied orbitals are called virtual orbitals.

Using the set of occupied and virtual spin-orbitals, one can form antisymmetric many-electron functions that have different orbital occupancies. For example, for helium, one can form functions that correspond to the electron configurations $1 s^{2}, 1 s 2 s, 1 s 2 p, 2 s^{2}, 2 s 2 p, 2 p^{2}, 1 s 3 s$, and so on. Moreover, more than one function can correspond to a given electron configuration. Recall the functions (10.27) to (10.30) corresponding to the helium $1 s 2 s$ configuration. Each such many-electron function $\Phi_{i}$ is a Slater determinant or a linear combination of a few Slater determinants. Use of more than one Slater determinant is required for certain open-shell functions such as (10.44) and (10.45). Each $\Phi_{i}$ is called a configuration state function or a configuration function or simply a "configuration." (This last name is unfortunate, since it leads to confusion between an electron configuration such as $1 s^{2}$ and a configuration function such as $|1 s \overline{1 s}|$.)

As we saw in perturbation theory, the true atomic (or molecular) wave function $\psi$ contains contributions from configurations other than the one that makes the main contribution to $\psi$, so we express $\psi$ as a linear combination of the configuration functions $\Phi_{i}$ :

\(
\begin{equation}
\psi=\sum_{i} c_{i} \Phi_{i} \tag{11.17}
\end{equation}
\)

We then regard (11.17) as a linear variation function (Section 8.5). Variation of the coefficients $c_{i}$ to minimize the variational integral leads to the equation

\(
\begin{equation}
\operatorname{det}\left(H_{i j}-E S_{i j}\right)=0 \tag{11.18}
\end{equation}
\)

where $H_{i j} \equiv\left\langle\Phi_{i}\right| \hat{H}\left|\Phi_{j}\right\rangle$ and $S_{i j} \equiv\left\langle\Phi_{i} \mid \Phi_{j}\right\rangle$. Commonly, the $\Phi_{i}$ functions are orthonormal, but if they are not orthogonal, they can be made so by the Schmidt method. [Only configuration functions whose angular-momentum eigenvalues are the same as those of the state $\psi$ will contribute to the expansion (11.17); see Section 11.5.]

Because the many-electron configuration functions $\Phi_{i}$ are ultimately based on a one-electron basis set that is a complete set, the set of all possible configuration functions is a complete set for the many-electron problem: Any antisymmetric many-electron function (including the exact wave function) can be expressed as a linear combination of the $\Phi_{i}$ functions. (For a proof of this, see Szabo and Ostlund, Section 2.2.7.) Therefore, if one starts with a complete one-electron basis set and includes all possible configuration functions, a CI calculation will give the exact atomic (or molecular) wave function $\psi$ for the state under consideration. In practice, one is limited to a finite, incomplete basis set, rather than an infinite, complete basis set. Moreover, even with a modest-size basis set, the number of possible configuration functions is extremely large, and one usually does not include all possible configuration functions. Part of the "art" of the CI method is choosing those configurations that will contribute the most.

Because it generally takes very many configuration functions to give a truly accurate wave function, configuration-interaction calculations for systems with more than a few electrons are time-consuming, even on supercomputers. Other methods for allowing for electron correlation are discussed in Chapter 16.

In summary, to do a CI calculation, we choose a one-electron basis set $\chi_{i}$, iteratively solve the Hartree-Fock equations (11.12) to determine one-electron atomic (or molecular) orbitals $\phi_{i}$ as linear combinations of the basis set, form many-electron configuration functions $\Phi_{i}$ using the orbitals $\phi_{i}$, express the wave function $\psi$ as a linear combination of these configuration functions, solve (11.18) for the energy, and solve the associated simultaneous linear equations for the coefficients $c_{i}$ in (11.17). [In practice, (11.18) and its associated simultaneous equations are solved by matrix methods; see Section 8.6.]
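A minimal sketch of the final linear-variation step, assuming NumPy/SciPy and using made-up $2 \times 2$ matrices for $H_{ij}$ and $S_{ij}$ (in a real calculation the work lies in computing these matrix elements):

```python
# Solve det(H - E S) = 0, Eq. (11.18), as a generalized eigenvalue problem.
import numpy as np
from scipy.linalg import eigh

# Illustrative 2x2 CI matrices over two configuration functions Phi_1, Phi_2
H = np.array([[-2.862, 0.030],    # H_ij = <Phi_i|H|Phi_j> (hartrees, made up)
              [ 0.030, -1.500]])
S = np.eye(2)                     # S_ij = <Phi_i|Phi_j>; orthonormal here

E, c = eigh(H, S)                 # solves H c = E S c
print("CI energies:", E)          # lowest root is the ground-state estimate
print("coefficients c_i of the ground state:", c[:, 0])
```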

As an example, consider the ground state of beryllium. The Hartree-Fock SCF method would find the best forms for the $1 s$ and $2 s$ orbitals in the Slater determinant $|1 s \overline{1 s} 2 s \overline{2 s}|$ and use this for the ground-state wave function. [We are using the notation of Eq. (10.47).] Going beyond the Hartree-Fock method, we would include contributions from excited configuration functions (for example, $|1 s \overline{1 s} 3 s \overline{3 s}|$) in a linear variation function for the ground state. Bunge did a CI calculation for the beryllium ground state using a linear combination of 650 configuration functions [C. F. Bunge, Phys. Rev. A, 14, 1965 (1976)]. The Hartree-Fock energy is $-14.5730\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$, Bunge's CI result is $-14.6669\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$, and the exact nonrelativistic energy is $-14.6674\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$. Bunge was able to obtain $99.5 \%$ of the correlation energy.

A detailed CI calculation for the He atom is shown in Section 16.2.


For a many-electron atom, the operators for individual angular momenta of the electrons do not commute with the Hamiltonian operator, but their sum does. Hence we want to learn how to add angular momenta.

Suppose we have a system with two angular-momentum vectors $\mathbf{M}_{1}$ and $\mathbf{M}_{2}$. They might be the orbital angular-momentum vectors of two electrons in an atom, or they might be the spin angular-momentum vectors of two electrons, or one might be the spin and the other the orbital angular momentum of a single electron. The eigenvalues of $\hat{M}_{1}^{2}, \hat{M}_{2}^{2}, \hat{M}_{1 z}$, and $\hat{M}_{2 z}$ are $j_{1}\left(j_{1}+1\right) \hbar^{2}, j_{2}\left(j_{2}+1\right) \hbar^{2}, m_{1} \hbar$, and $m_{2} \hbar$, where the quantum numbers obey the usual restrictions. The components of $\hat{\mathbf{M}}_{1}$ and $\hat{\mathbf{M}}_{2}$ obey the angular-momentum commutation relations [Eqs. (5.46), (5.48), and (5.107)]

\(
\begin{equation}
\left[\hat{M}_{1 x}, \hat{M}_{1 y}\right]=i \hbar \hat{M}_{1 z}, \text{ etc. } \quad \left[\hat{M}_{2 x}, \hat{M}_{2 y}\right]=i \hbar \hat{M}_{2 z}, \text{ etc. } \tag{11.19}
\end{equation}
\)

We define the total angular momentum $\mathbf{M}$ of the system as the vector sum

\(
\begin{equation}
\mathbf{M}=\mathbf{M}_{1}+\mathbf{M}_{2} \tag{11.20}
\end{equation}
\)

$\mathbf{M}$ is a vector with three components:

\(
\begin{equation}
\mathbf{M}=M_{x} \mathbf{i}+M_{y} \mathbf{j}+M_{z} \mathbf{k} \tag{11.21}
\end{equation}
\)

The vector equation (11.20) gives the three scalar equations

\(
\begin{equation}
M_{x}=M_{1 x}+M_{2 x}, \quad M_{y}=M_{1 y}+M_{2 y}, \quad M_{z}=M_{1 z}+M_{2 z} \tag{11.22}
\end{equation}
\)

For the operator $\hat{M}^{2}$, we have

\(
\begin{align}
& \hat{M}^{2}=\hat{\mathbf{M}} \cdot \hat{\mathbf{M}}=\hat{M}_{x}^{2}+\hat{M}_{y}^{2}+\hat{M}_{z}^{2} \tag{11.23}\\
& \hat{M}^{2}=\left(\hat{\mathbf{M}}_{1}+\hat{\mathbf{M}}_{2}\right) \cdot\left(\hat{\mathbf{M}}_{1}+\hat{\mathbf{M}}_{2}\right) \\
& \hat{M}^{2}=\hat{M}_{1}^{2}+\hat{M}_{2}^{2}+\hat{\mathbf{M}}_{1} \cdot \hat{\mathbf{M}}_{2}+\hat{\mathbf{M}}_{2} \cdot \hat{\mathbf{M}}_{1} \tag{11.24}
\end{align}
\)

If $\hat{\mathbf{M}}_{1}$ and $\hat{\mathbf{M}}_{2}$ refer to different electrons, they will commute with each other, since each will affect only functions of the coordinates of one electron and not the other. Even if $\hat{\mathbf{M}}_{1}$ and $\hat{\mathbf{M}}_{2}$ are the orbital and spin angular momenta of the same electron, they will commute, as one will affect only functions of the spatial coordinates while the other will affect functions of the spin coordinates. Thus (11.24) becomes

\(
\begin{gather}
\hat{M}^{2}=\hat{M}_{1}^{2}+\hat{M}_{2}^{2}+2 \hat{\mathbf{M}}_{1} \cdot \hat{\mathbf{M}}_{2} \tag{11.25}\\
\hat{M}^{2}=\hat{M}_{1}^{2}+\hat{M}_{2}^{2}+2\left(\hat{M}_{1 x} \hat{M}_{2 x}+\hat{M}_{1 y} \hat{M}_{2 y}+\hat{M}_{1 z} \hat{M}_{2 z}\right) \tag{11.26}
\end{gather}
\)

We now show that the components of the total angular momentum obey the usual angular-momentum commutation relations. We have [Eq. (5.4)]

\(
\begin{aligned}
\left[\hat{M}_{x}, \hat{M}_{y}\right] & =\left[\hat{M}_{1 x}+\hat{M}_{2 x}, \hat{M}_{1 y}+\hat{M}_{2 y}\right] \\
& =\left[\hat{M}_{1 x}, \hat{M}_{1 y}+\hat{M}_{2 y}\right]+\left[\hat{M}_{2 x}, \hat{M}_{1 y}+\hat{M}_{2 y}\right] \\
& =\left[\hat{M}_{1 x}, \hat{M}_{1 y}\right]+\left[\hat{M}_{1 x}, \hat{M}_{2 y}\right]+\left[\hat{M}_{2 x}, \hat{M}_{1 y}\right]+\left[\hat{M}_{2 x}, \hat{M}_{2 y}\right]
\end{aligned}
\)

Since all components of $\hat{\mathbf{M}}_{1}$ commute with all components of $\hat{\mathbf{M}}_{2}$, we have

\(
\begin{align}
& \left[\hat{M}_{x}, \hat{M}_{y}\right]=\left[\hat{M}_{1 x}, \hat{M}_{1 y}\right]+\left[\hat{M}_{2 x}, \hat{M}_{2 y}\right]=i \hbar \hat{M}_{1 z}+i \hbar \hat{M}_{2 z} \tag{11.27}\\
& \left[\hat{M}_{x}, \hat{M}_{y}\right]=i \hbar \hat{M}_{z}
\end{align}
\)

Cyclic permutation of $x, y$, and $z$ gives

\(
\begin{equation}
\left[\hat{M}_{y}, \hat{M}_{z}\right]=i \hbar \hat{M}_{x}, \quad\left[\hat{M}_{z}, \hat{M}_{x}\right]=i \hbar \hat{M}_{y} \tag{11.28}
\end{equation}
\)

The same commutator algebra used to derive (5.109) gives

\(
\begin{equation}
\left[\hat{M}^{2}, \hat{M}_{x}\right]=\left[\hat{M}^{2}, \hat{M}_{y}\right]=\left[\hat{M}^{2}, \hat{M}_{z}\right]=0 \tag{11.29}
\end{equation}
\)

Thus we can simultaneously quantize $M^{2}$ and one of its components, say $M_{z}$. Since the components of the total angular momentum obey the angular-momentum commutation relations, the work of Section 5.4 shows that the eigenvalues of $\hat{M}^{2}$ are

\(
\begin{equation}
J(J+1) \hbar^{2}, \quad J=0, \frac{1}{2}, 1, \frac{3}{2}, 2, \ldots \tag{11.30}
\end{equation}
\)

and the eigenvalues of $\hat{M}_{z}$ are

\(
\begin{equation}
M_{J} \hbar, \quad M_{J}=-J,-J+1, \ldots, J-1, J \tag{11.31}
\end{equation}
\)

We want to find out how the total-angular-momentum quantum numbers $J$ and $M_{J}$ are related to the quantum numbers $j_{1}, j_{2}, m_{1}, m_{2}$ of the two angular momenta we are adding in (11.20). We also want the eigenfunctions of $\hat{M}^{2}$ and $\hat{M}_{z}$. These eigenfunctions are characterized by the quantum numbers $J$ and $M_{J}$, and, using ket notation (Section 7.3), we write them as $\left|J M_{J}\right\rangle$. Similarly, let $\left|j_{1} m_{1}\right\rangle$ denote the eigenfunctions of $\hat{M}_{1}^{2}$ and $\hat{M}_{1 z}$ and $\left|j_{2} m_{2}\right\rangle$ denote the eigenfunctions of $\hat{M}_{2}^{2}$ and $\hat{M}_{2 z}$. Now it is readily shown (Prob. 11.10) that

\(
\begin{equation}
\left[\hat{M}_{x}, \hat{M}_{1}^{2}\right]=\left[\hat{M}_{y}, \hat{M}_{1}^{2}\right]=\left[\hat{M}_{z}, \hat{M}_{1}^{2}\right]=\left[\hat{M}^{2}, \hat{M}_{1}^{2}\right]=0 \tag{11.32}
\end{equation}
\)

with similar equations with $\hat{M}_{2}^{2}$ replacing $\hat{M}_{1}^{2}$. Hence we can have simultaneous eigenfunctions of all four operators $\hat{M}_{1}^{2}, \hat{M}_{2}^{2}, \hat{M}^{2}, \hat{M}_{z}$, and the eigenfunctions $\left|J M_{J}\right\rangle$ can be more fully written as $\left|j_{1} j_{2} J M_{J}\right\rangle$. However, one finds that $\hat{M}^{2}$ does not commute with $\hat{M}_{1 z}$ or $\hat{M}_{2 z}$ (Prob. 11.12), so the eigenfunctions $\left|j_{1} j_{2} J M_{J}\right\rangle$ are not necessarily eigenfunctions of $\hat{M}_{1 z}$ or $\hat{M}_{2 z}$.

If we take the complete set of functions $\left|j_{1} m_{1}\right\rangle$ for particle 1 and the complete set $\left|j_{2} m_{2}\right\rangle$ for particle 2 and form all possible products of the form $\left|j_{1} m_{1}\right\rangle\left|j_{2} m_{2}\right\rangle$, we will have a complete set of functions for the two particles. Each unknown eigenfunction $\left|j_{1} j_{2} J M_{J}\right\rangle$ can then be expanded using this complete set:

\(
\begin{equation}
\left|j_{1} j_{2} J M_{J}\right\rangle=\sum C\left(j_{1} j_{2} J M_{J} ; m_{1} m_{2}\right)\left|j_{1} m_{1}\right\rangle\left|j_{2} m_{2}\right\rangle \tag{11.33}
\end{equation}
\)

where the expansion coefficients are the $C\left(j_{1} \cdots m_{2}\right)$'s. The functions $\left|j_{1} j_{2} J M_{J}\right\rangle$ are eigenfunctions of the commuting operators $\hat{M}_{1}^{2}, \hat{M}_{2}^{2}, \hat{M}^{2}$, and $\hat{M}_{z}$ with the following eigenvalues:

| $\hat{M}_{1}^{2}$ | $\hat{M}_{2}^{2}$ | $\hat{M}^{2}$ | $\hat{M}_{z}$ |
| :---: | :---: | :---: | :---: |
| $j_{1}\left(j_{1}+1\right) \hbar^{2}$ | $j_{2}\left(j_{2}+1\right) \hbar^{2}$ | $J(J+1) \hbar^{2}$ | $M_{J} \hbar$ |

The functions $\left|j_{1} m_{1}\right\rangle\left|j_{2} m_{2}\right\rangle$ are eigenfunctions of the commuting operators $\hat{M}_{1}^{2}, \hat{M}_{1 z}$, $\hat{M}_{2}^{2}, \hat{M}_{2 z}$ with the following eigenvalues:

| $\hat{M}_{1}^{2}$ | $\hat{M}_{1 z}$ | $\hat{M}_{2}^{2}$ | $\hat{M}_{2 z}$ |
| :---: | :---: | :---: | :---: |
| $j_{1}\left(j_{1}+1\right) \hbar^{2}$ | $m_{1} \hbar$ | $j_{2}\left(j_{2}+1\right) \hbar^{2}$ | $m_{2} \hbar$ |

Since the function $\left|j_{1} j_{2} J M_{J}\right\rangle$ being expanded in (11.33) is an eigenfunction of $\hat{M}_{1}^{2}$ with eigenvalue $j_{1}\left(j_{1}+1\right) \hbar^{2}$, we include in the sum only terms that have the same $j_{1}$ value as in the function $\left|j_{1} j_{2} J M_{J}\right\rangle$. (See Theorem 3 at the end of Section 7.3.) Likewise, only terms with the same $j_{2}$ value as in $\left|j_{1} j_{2} J M_{J}\right\rangle$ are included in the sum. Hence the sum goes over only the $m_{1}$ and $m_{2}$ values. Also, using $\hat{M}_{z}=\hat{M}_{1 z}+\hat{M}_{2 z}$, we can prove (Prob. 11.11) that the coefficient $C$ vanishes unless

\(
\begin{equation}
m_{1}+m_{2}=M_{J} \tag{11.34}
\end{equation}
\)

To find the total-angular-momentum eigenfunctions, one must evaluate the coefficients in (11.33). These are called Clebsch-Gordan or Wigner or vector addition coefficients. For their evaluation, see Merzbacher, Section 16.6.

Thus each total-angular-momentum eigenfunction $\left|j_{1} j_{2} J M_{J}\right\rangle$ is a linear combination of those product functions $\left|j_{1} m_{1}\right\rangle\left|j_{2} m_{2}\right\rangle$ whose $m$ values satisfy $m_{1}+m_{2}=M_{J}$.

We now find the possible values of the total-angular-momentum quantum number $J$ that arise from the addition of angular momenta with individual quantum numbers $j_{1}$ and $j_{2}$.

Before discussing the general case, we consider the case with $j_{1}=1$, $j_{2}=2$. The possible values of $m_{1}$ are $-1, 0, 1$, and the possible values of $m_{2}$ are $-2, -1, 0, 1, 2$. If we describe the system by the quantum numbers $j_{1}, j_{2}, m_{1}, m_{2}$, then the total number of possible states is fifteen, corresponding to three possibilities for $m_{1}$ and five for $m_{2}$. Instead, we can describe the system using the quantum numbers $j_{1}, j_{2}, J, M_{J}$, and we must have the same number of states in this description. Let us tabulate the fifteen possible values of $M_{J}$ using (11.34):

|  | $m_{1}=-1$ | $m_{1}=0$ | $m_{1}=1$ |
| :---: | :---: | :---: | :---: |
| $m_{2}=-2$ | $-3$ | $-2$ | $-1$ |
| $m_{2}=-1$ | $-2$ | $-1$ | $0$ |
| $m_{2}=0$ | $-1$ | $0$ | $1$ |
| $m_{2}=1$ | $0$ | $1$ | $2$ |
| $m_{2}=2$ | $1$ | $2$ | $3$ |

where each $M_{J}$ value in the table is the sum of the $m_{1}$ and $m_{2}$ values at the top and side. The number of times each value of $M_{J}$ occurs is

| value of $M_{J}$ | 3 | 2 | 1 | 0 | $-1$ | $-2$ | $-3$ |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| number of occurrences | 1 | 2 | 3 | 3 | 3 | 2 | 1 |

The highest value of $M_{J}$ is $+3$. Since $M_{J}$ ranges from $-J$ to $+J$, the highest value of $J$ must be 3. Corresponding to $J=3$, there are seven values of $M_{J}$ ranging from $-3$ to $+3$. Eliminating these seven values, we are left with

| value of $M_{J}$ | 2 | 1 | 0 | $-1$ | $-2$ |
| :--- | :---: | :---: | :---: | :---: | :---: |
| number of occurrences | 1 | 2 | 2 | 2 | 1 |

The highest remaining value, $M_{J}=2$, must correspond to $J=2$. For $J=2$, we have five values of $M_{J}$, which when eliminated leave

| value of $M_{J}$ | 1 | 0 | $-1$ |
| :--- | :---: | :---: | :---: |
| number of occurrences | 1 | 1 | 1 |

These remaining values of $M_{J}$ clearly correspond to $J=1$. Thus, for the individual angular-momentum quantum numbers $j_{1}=1$, $j_{2}=2$, the possible values of the total-angular-momentum quantum number $J$ are 3, 2, and 1.
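The counting argument just used is easy to automate. The Python sketch below (an illustration; the function name is ours) tabulates the $M_{J}$ values for $j_{1}=1$, $j_{2}=2$ and repeatedly strips off the multiplet belonging to the largest remaining $M_{J}$:

```python
# Recover the allowed J values by the M_J-counting argument of the text.
from collections import Counter
from fractions import Fraction

def peel_J_values(j1, j2):
    j1, j2 = Fraction(j1), Fraction(j2)
    m1_vals = [j1 - k for k in range(int(2 * j1) + 1)]   # m1 = j1, ..., -j1
    m2_vals = [j2 - k for k in range(int(2 * j2) + 1)]
    counts = Counter(m1 + m2 for m1 in m1_vals for m2 in m2_vals)
    J_values = []
    while counts:
        J = max(counts)                    # largest surviving M_J fixes a J
        J_values.append(J)
        for k in range(int(2 * J) + 1):    # delete M_J = J, J-1, ..., -J
            M = J - k
            counts[M] -= 1
            if counts[M] == 0:
                del counts[M]
    return J_values

print(peel_J_values(1, 2))   # [3, 2, 1] (as Fractions), matching the text
```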

Now consider the general case. There are $2 j_{1}+1$ values of $m_{1}$ (ranging from $-j_{1}$ to $+j_{1}$) and $2 j_{2}+1$ values of $m_{2}$. Hence there are $\left(2 j_{1}+1\right)\left(2 j_{2}+1\right)$ possible states $\left|j_{1} m_{1}\right\rangle\left|j_{2} m_{2}\right\rangle$ with fixed $j_{1}$ and $j_{2}$ values. The highest possible values of $m_{1}$ and $m_{2}$ are $j_{1}$ and $j_{2}$, respectively. Therefore, the maximum possible value of $M_{J}=m_{1}+m_{2}$ is $j_{1}+j_{2}$ [Eq. (11.34)]. Since $M_{J}$ ranges from $-J$ to $+J$, the maximum possible value of $J$ must also be $j_{1}+j_{2}$:

\(
\begin{equation}
J_{\max}=j_{1}+j_{2} \tag{11.35}
\end{equation}
\)

The second-highest value of $M_{J}$ is $j_{1}+j_{2}-1$, which arises in two ways: $m_{1}=j_{1}-1$, $m_{2}=j_{2}$ and $m_{1}=j_{1}$, $m_{2}=j_{2}-1$. Linear combinations of these two states must give one state with $J=j_{1}+j_{2}$, $M_{J}=j_{1}+j_{2}-1$ and one state with $J=j_{1}+j_{2}-1$, $M_{J}=j_{1}+j_{2}-1$. Continuing in this manner, we find that the possible values of $J$ are

\(
j_{1}+j_{2}, \quad j_{1}+j_{2}-1, \quad j_{1}+j_{2}-2, \quad \ldots, \quad J_{\min}
\)

where $J_{\text {min }}$ is the lowest possible value of $J$.

We determine $J_{\min}$ by the requirement that the total number of states be $\left(2 j_{1}+1\right)\left(2 j_{2}+1\right)$. For a particular value of $J$, there are $2 J+1$ values of $M_{J}$, and so $2 J+1$ states correspond to each value of $J$. The total number of states $\left|j_{1} j_{2} J M_{J}\right\rangle$ for fixed $j_{1}$ and $j_{2}$ is found by summing the number of states $2 J+1$ for each $J$ from $J_{\min}$ to $J_{\max}$:

\(
\begin{equation}
\text{number of states}=\sum_{J=J_{\min}}^{J_{\max}}(2 J+1) \tag{11.36}
\end{equation}
\)

This sum goes from $J_{\min}$ to $J_{\max}$. Let us now take the lower limit of the sum to be $J=0$ instead of $J_{\min}$. This change adds to the sum terms with $J$ values of $0, 1, 2, \ldots, J_{\min}-1$. To compensate, we must subtract the corresponding sum that goes from $J=0$ to $J=J_{\min}-1$. Therefore, (11.36) becomes

\(
\text{number of states}=\sum_{J=0}^{J_{\max}}(2 J+1)-\sum_{J=0}^{J_{\min}-1}(2 J+1)
\)

Problem 6.16 gives $\sum_{l=0}^{n-1}(2 l+1)=n^{2}$. Replacing $n-1$ with $b$, we get $\sum_{J=0}^{b}(2 J+1)=(b+1)^{2}$. Therefore,

\(
\text{number of states}=\left(J_{\max}+1\right)^{2}-J_{\min}^{2}=J_{\max}^{2}+2 J_{\max}+1-J_{\min}^{2}
\)

Replacing $J_{\max}$ by $j_{1}+j_{2}$ [Eq. (11.35)] and equating the number of states to $\left(2 j_{1}+1\right)\left(2 j_{2}+1\right)=4 j_{1} j_{2}+2 j_{1}+2 j_{2}+1$, we have

\(
\begin{gather}
\left(j_{1}+j_{2}\right)^{2}+2\left(j_{1}+j_{2}\right)+1-J_{\min}^{2}=4 j_{1} j_{2}+2 j_{1}+2 j_{2}+1 \\
J_{\min}^{2}=j_{1}^{2}-2 j_{1} j_{2}+j_{2}^{2}=\left(j_{1}-j_{2}\right)^{2} \\
J_{\min}= \pm\left(j_{1}-j_{2}\right) \tag{11.37}
\end{gather}
\)

If $j_{1}=j_{2}$, then $J_{\min}=0$. If $j_{1} \neq j_{2}$, then one of the values in (11.37) is negative and must be rejected [Eq. (11.30)]. Thus

\(
\begin{equation}
J_{\min}=\left|j_{2}-j_{1}\right| \tag{11.38}
\end{equation}
\)

To summarize, we have shown that the addition of two angular momenta characterized by quantum numbers $j_{1}$ and $j_{2}$ results in a total angular momentum whose quantum number $J$ has the possible values

\(
\begin{equation}
J=j_{1}+j_{2},\ j_{1}+j_{2}-1,\ \ldots,\ \left|j_{1}-j_{2}\right| \tag{11.39}
\end{equation}
\)

EXAMPLE

Find the possible values of the total-angular-momentum quantum number resulting from the addition of angular momenta with quantum numbers $j_{1}=2$ and $j_{2}=3$.
The maximum and minimum $J$ values are given by (11.39) as $j_{1}+j_{2}=2+3=5$ and $\left|j_{1}-j_{2}\right|=|2-3|=1$. The possible $J$ values (11.39) are therefore $J=5, 4, 3, 2, 1$.

EXERCISE Find the possible values of the total-angular-momentum quantum number resulting from the addition of angular momenta with quantum numbers $j_{1}=3$ and $j_{2}=\frac{3}{2}$. (Answer: $J=\frac{9}{2}, \frac{7}{2}, \frac{5}{2}, \frac{3}{2}$.)
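The rule (11.39) is also easy to encode directly; the following helper (ours, not from the text) returns the full list of allowed $J$ values and reproduces the example and the exercise:

```python
# J runs from j1 + j2 down to |j1 - j2| in steps of 1, Eq. (11.39).
from fractions import Fraction

def possible_J(j1, j2):
    j1, j2 = Fraction(j1), Fraction(j2)
    J, out = j1 + j2, []
    while J >= abs(j1 - j2):
        out.append(J)
        J -= 1
    return out

print(possible_J(2, 3))               # 5, 4, 3, 2, 1
print(possible_J(3, Fraction(3, 2)))  # 9/2, 7/2, 5/2, 3/2
```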

EXAMPLE

Find the possible $J$ values when angular momenta with quantum numbers $j_{1}=1$, $j_{2}=2$, and $j_{3}=3$ are added.

To add more than two angular momenta, we apply (11.39) repeatedly. Addition of $j_{1}=1$ and $j_{2}=2$ gives the possible quantum numbers 3, 2, and 1. Addition of $j_{3}$ to each of these values gives the following possibilities for the total-angular-momentum quantum number:

\(
\begin{equation}
6,5,4,3,2,1,0 ; \quad 5,4,3,2,1 ; \quad 4,3,2 \tag{11.40}
\end{equation}
\)

We have one set of states with total-angular-momentum quantum number 6, two sets of states with $J=5$, three sets with $J=4$, and so on.

EXERCISE Find the possible $J$ values when angular momenta with quantum numbers $j_{1}=1$, $j_{2}=1$, and $j_{3}=1$ are added. (Answer: $J=3, 2, 2, 1, 1, 1, 0$.)
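Repeated application of (11.39), as in this example, can be expressed with the helper possible_J from the sketch above; the multiplicities noted after (11.40) then appear automatically:

```python
# Couple several angular momenta by applying possible_J pairwise.
def add_momenta(*js):
    totals = [js[0]]
    for j in js[1:]:
        totals = [J for t in totals for J in possible_J(t, j)]
    return sorted(totals, reverse=True)

print(add_momenta(1, 2, 3))
# -> 6, 5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 0: one J = 6, two J = 5, ...
```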


Total Electronic Orbital and Spin Angular Momenta

The total electronic orbital angular momentum of an $n$-electron atom is defined as the vector sum of the orbital angular momenta of the individual electrons:

\(
\begin{equation}
\mathbf{L}=\sum_{i=1}^{n} \mathbf{L}_{i} \tag{11.41}
\end{equation}
\)

Although the individual orbital-angular-momentum operators $\hat{\mathbf{L}}_{i}$ do not commute with the atomic Hamiltonian (11.1), one can show (Bethe and Jackiw, pp. 102-103) that $\hat{\mathbf{L}}$ does commute with the atomic Hamiltonian [provided spin-orbit interaction (Section 11.6) is neglected]. We can therefore characterize an atomic state by a quantum number $L$, where $L(L+1) \hbar^{2}$ is the square of the magnitude of the total electronic orbital angular momentum. The electronic wave function $\psi$ of an atom satisfies $\hat{L}^{2} \psi=L(L+1) \hbar^{2} \psi$. The total-electronic-orbital-angular-momentum quantum number $L$ of an atom is specified by a code letter, as follows:

\(
\begin{equation}
\begin{array}{l|ccccccccc}
L & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
\text{letter} & S & P & D & F & G & H & I & K & L
\end{array} \tag{11.42}
\end{equation}
\)

The total orbital angular momentum is designated by a capital letter, while lowercase letters are used for orbital angular momenta of individual electrons.

EXAMPLE

Find the possible values of the quantum number $L$ for states of the carbon atom that arise from the electron configuration $1 s^{2} 2 s^{2} 2 p 3 d$.

The $s$ electrons have zero orbital angular momentum and contribute nothing to the total orbital angular momentum. The $2 p$ electron has $l=1$ and the $3 d$ electron has $l=2$. From the angular-momentum addition rule (11.39), the total-orbital-angular-momentum quantum number ranges from $1+2=3$ to $|1-2|=1$; the possible values of $L$ are $L=3, 2, 1$. The configuration $1 s^{2} 2 s^{2} 2 p 3 d$ gives rise to $P$, $D$, and $F$ states. [The Hartree-Fock central-field approximation has each electron moving in a central-field potential, $V=V(r)$. Hence, within this approximation, the individual electronic orbital angular momenta are constant, giving rise to a wave function composed of a single configuration that specifies the individual orbital angular momenta. When we go beyond the SCF central-field approximation, we mix in other configurations, so we no longer specify precisely the individual orbital angular momenta. Even so, we can still use the rule (11.39) for finding the possible values of the total orbital angular momentum.]

EXERCISE Find the possible values of $L$ for states that arise from the electron configuration $1 s^{2} 2 s^{2} 2 p^{6} 3 s^{2} 3 p 4 p$. (Answer: 2, 1, 0.)

The total electronic spin angular momentum $\mathbf{S}$ of an atom is defined as the vector sum of the spins of the individual electrons:

\(
\begin{equation}
\mathbf{S}=\sum_{i=1}^{n} \mathbf{S}_{i} \tag{11.43}
\end{equation}
\)

The atomic Hamiltonian $\hat{H}$ of (11.1) (which omits spin-orbit interaction) does not involve spin and therefore commutes with the total-spin operators $\hat{S}^{2}$ and $\hat{S}_{z}$. The fact that $\hat{S}^{2}$ commutes with $\hat{H}$ is not enough to show that the atomic wave functions $\psi$ are eigenfunctions of $\hat{S}^{2}$. The antisymmetry requirement means that each $\psi$ must be an eigenfunction of the exchange operator $\hat{P}_{i k}$ with eigenvalue $-1$ (Section 10.3). Hence $\hat{S}^{2}$ must also commute with $\hat{P}_{i k}$ if we are to have simultaneous eigenfunctions of $\hat{H}$, $\hat{S}^{2}$, and $\hat{P}_{i k}$. Problem 11.19 shows that $\left[\hat{S}^{2}, \hat{P}_{i k}\right]=0$, so the atomic wave functions are eigenfunctions of $\hat{S}^{2}$. We have $\hat{S}^{2} \psi=S(S+1) \hbar^{2} \psi$, and each atomic state can be characterized by a total-electronic-spin quantum number $S$.

EXAMPLE

Find the possible values of the quantum number $S$ for states that arise from the electron configuration $1 s^{2} 2 s^{2} 2 p 3 d$.

Consider first the two $1 s$ electrons. To satisfy the exclusion principle, one of these electrons must have $m_{s}=+\frac{1}{2}$ while the other has $m_{s}=-\frac{1}{2}$. If $M_{S}$ is the quantum number that specifies the $z$ component of the total spin of the $1 s$ electrons, then the only possible value of $M_{S}$ is $\frac{1}{2}-\frac{1}{2}=0$ [Eq. (11.34)]. This single value of $M_{S}$ clearly means that the total spin of the two $1 s$ electrons is zero. Thus, although in general when we add the spins $s_{1}=\frac{1}{2}$ and $s_{2}=\frac{1}{2}$ of two electrons according to the rule (11.39), we get the two possibilities $S=0$ and $S=1$, the restriction imposed by the Pauli principle leaves $S=0$ as the only possibility in this case. Likewise, the spins of the $2 s$ electrons add up to zero. The exclusion principle does not restrict the $m_{s}$ values of the $2 p$ and $3 d$ electrons. Application of the rule (11.39) to the spins $s_{1}=\frac{1}{2}$ and $s_{2}=\frac{1}{2}$ of the $2 p$ and $3 d$ electrons gives $S=0$ and $S=1$. These are the possible values of the total spin quantum number, since the $1 s$ and $2 s$ electrons do not contribute to $S$.

EXERCISE Find the possible values of $S$ for states that arise from the electron configuration $1 s^{2} 2 s^{2} 2 p^{6} 3 s^{2} 3 p 4 p$. (Answer: 0 and 1.)

Atomic Terms

A given electron configuration gives rise in general to several different atomic states, some having the same energy and others having different energies, depending on whether the interelectronic repulsions are the same or different for the states. For example, the $1 s 2 s$ configuration of helium gives rise to four states: The three states with zeroth-order wave functions (10.27) to (10.29) all have the same energy; the single state (10.30) has a different energy. The $1 s 2 p$ electron configuration gives rise to twelve states: The nine states obtained by replacing $2 s$ in (10.27) to (10.29) by $2 p_{x}$, $2 p_{y}$, or $2 p_{z}$ have the same energy; the three states obtained by replacing $2 s$ in (10.30) by $2 p_{x}$, $2 p_{y}$, or $2 p_{z}$ have the same energy, which differs from the energy of the other nine states.

Thus the atomic states that arise from a given electron configuration can be grouped into sets of states that have the same energy. One can show that states that arise from the same electron configuration and that have the same energy (with spin-orbit interaction neglected) will have the same value of $L$ and the same value of $S$ (see Kemble, Section 63a). A set of equal-energy atomic states that arise from the same electron configuration and that have the same $L$ value and the same $S$ value constitutes an atomic term. For a fixed $L$ value, the quantum number $M_{L}$ (where $M_{L} \hbar$ is the $z$ component of the total electronic orbital angular momentum) takes on $2 L+1$ values ranging from $-L$ to $+L$. For a fixed $S$ value, $M_{S}$ takes on $2 S+1$ values. The atomic energy does not depend on $M_{L}$ or $M_{S}$, and each term consists of $(2 L+1)(2 S+1)$ atomic states of equal energy. The degeneracy of an atomic term is $(2 L+1)(2 S+1)$ (spin-orbit interaction neglected).

Each term of an atom is designated by a term symbol formed by writing the numerical value of the quantity $2 S+1$ as a left superscript on the code letter (11.42) that gives the $L$ value. For example, a term that has $L=2$ and $S=1$ has the term symbol ${ }^{3} D$, since $2 S+1=3$.

EXAMPLE

Find the terms arising from each of the following electron configurations: (a) $1 s 2 p$;
(b) $1 s^{2} 2 s^{2} 2 p 3 d$. Give the degeneracy of each term.
(a) The $1 s$ electron has quantum number $l=0$ and the $2 p$ electron has $l=1$. The addition rule (11.39) gives $L=1$ as the only possibility. The code letter for $L=1$ is $P$. Each electron has $s=\frac{1}{2}$, and (11.39) gives $S=1, 0$ as the possible $S$ values. The possible values of $2 S+1$ are 3 and 1. The possible terms are thus ${ }^{3} P$ and ${ }^{1} P$. The ${ }^{3} P$ term has quantum numbers $L=1$ and $S=1$, and its degeneracy is $(2 L+1)(2 S+1)=3(3)=9$. The ${ }^{1} P$ term has $L=1$ and $S=0$, and its degeneracy is $(2 L+1)(2 S+1)=3(1)=3$. [The nine states of the ${ }^{3} P$ term are obtained by replacing $2 s$ in (10.27) to (10.29) by $2 p_{x}$, $2 p_{y}$, or $2 p_{z}$. The three states of the ${ }^{1} P$ term are obtained by replacing $2 s$ in (10.30) by $2 p$ functions.]
(b) In the two previous examples in this section, we found that the configuration $1 s^{2} 2 s^{2} 2 p 3 d$ has the possible $L$ values $L=3,2,1$ and has $S=1,0$. The code letters for these $L$ values are $F, D, P$, and the terms are

\(
\begin{equation}
{ }^{1} P, \quad{ }^{3} P, \quad{ }^{1} D, \quad{ }^{3} D, \quad{ }^{1} F, \quad{ }^{3} F \tag{11.44}
\end{equation}
\)

The degeneracies are found as in (a) and are 3, 9, 5, 15, 7, and 21, respectively.

Derivation of Atomic Terms

We now examine how to systematically derive the terms that arise from a given electron configuration.

First consider configurations that contain only completely filled subshells. In such configurations, for each electron with $m_{s}=+\frac{1}{2}$ there is an electron with $m_{s}=-\frac{1}{2}$. Let the quantum number specifying the $z$ component of the total electronic spin angular momentum be $M_{S}$. The only possible value for $M_{S}$ is zero ($M_{S}=\sum_{i} m_{s i}=0$). Hence $S$ must be zero. For each electron in a closed subshell with magnetic quantum number $m$, there is an electron with magnetic quantum number $-m$. For example, for a $2 p^{6}$ configuration we have two electrons with $m=+1$, two with $m=-1$, and two with $m=0$. Denoting the quantum number specifying the $z$ component of the total electronic orbital angular momentum by $M_{L}$, we have $M_{L}=\sum_{i} m_{i}=0$. We conclude that $L$ must be zero. In summary, a configuration of closed subshells gives rise to only one term: ${ }^{1} S$. For configurations consisting of closed subshells and open subshells, the closed subshells make no contribution to $L$ or $S$ and may be ignored in finding the terms.

We now consider two electrons in different subshells; such electrons are called nonequivalent. Nonequivalent electrons have different values of $n$ or $l$ or both, and we need not worry about any restrictions imposed by the exclusion principle when we derive the terms. We simply find the possible values of $L$ from $l_{1}$ and $l_{2}$ according to (11.39); combining $s_{1}$ and $s_{2}$ gives $S=0, 1$. We previously worked out the $p d$ case, which gives the terms in (11.44). If we have more than two nonequivalent electrons, we combine the individual $l$'s to find the values of $L$, and we combine the individual $s$'s to find the values of $S$. For example, consider a $p d f$ configuration. The possible values of $L$ are given by (11.40). Combining three spin angular momenta, each of which is $\frac{1}{2}$, gives $S=\frac{3}{2}, \frac{1}{2}, \frac{1}{2}$. Each of the three possibilities in (11.40) with $L=3$ may be combined with each of the two possibilities for $S=\frac{1}{2}$, giving six ${ }^{2} F$ terms. Continuing in this manner (a count of this kind is automated in the sketch below), we find that the following terms arise from a $p d f$ configuration: ${ }^{2} S(2),{ }^{2} P(4),{ }^{2} D(6),{ }^{2} F(6),{ }^{2} G(6),{ }^{2} H(4),{ }^{2} I(2)$, ${ }^{4} S,{ }^{4} P(2),{ }^{4} D(3),{ }^{4} F(3),{ }^{4} G(3),{ }^{4} H(2),{ }^{4} I$, where the number of times each type of term occurs is in parentheses.
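A sketch of this bookkeeping for nonequivalent electrons, reusing possible_J and add_momenta from the sketches above (the term-string format is ours):

```python
# Terms of a configuration of nonequivalent electrons: couple all l's to get
# the L values and all the s = 1/2 spins to get the S values, then pair them.
from collections import Counter
from fractions import Fraction

LETTERS = "SPDFGHIK"   # code letters for L = 0, 1, 2, ...

def nonequivalent_terms(*ls):
    L_values = add_momenta(*ls)                             # orbital part
    S_values = add_momenta(*([Fraction(1, 2)] * len(ls)))   # spin part
    return Counter(f"{int(2 * S + 1)}{LETTERS[int(L)]}"
                   for L in L_values for S in S_values)

print(nonequivalent_terms(1, 2, 3))   # pdf configuration
# e.g. '2F': 6, '4F': 3, '2S': 2, '4I': 1, ... as listed in the text
```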

Now consider two electrons in the same subshell (equivalent electrons). Equivalent electrons have the same value of $n$ and the same value of $l$, and the situation is complicated by the necessity to avoid giving two electrons the same four quantum numbers. Hence not all the terms derived for nonequivalent electrons are possible. As an example, consider the terms arising from two equivalent $p$ electrons, an $n p^{2}$ configuration. (The carbon ground-state configuration is $1 s^{2} 2 s^{2} 2 p^{2}$.) The possible values of $m$ and $m_{s}$ for the two electrons are listed in Table 11.1, which also gives $M_{L}$ and $M_{S}$.

Note that certain combinations are missing from this table. For example, $m_{1}=1$, $m_{s 1}=\frac{1}{2}$, $m_{2}=1$, $m_{s 2}=\frac{1}{2}$ is missing, since it violates the exclusion principle. Another missing combination is $m_{1}=1$, $m_{s 1}=-\frac{1}{2}$, $m_{2}=1$, $m_{s 2}=\frac{1}{2}$. This combination differs from $m_{1}=1$, $m_{s 1}=\frac{1}{2}$, $m_{2}=1$, $m_{s 2}=-\frac{1}{2}$ (row 1) solely by interchange of electrons 1 and 2. Each row in Table 11.1 stands for a Slater determinant, which when expanded

TABLE 11.1 Quantum Numbers for Two Equivalent $\boldsymbol{p}$ Electrons

| $m_{1}$ | $m_{s 1}$ | $m_{2}$ | $m_{s 2}$ | $M_{L}=m_{1}+m_{2}$ | $M_{S}=m_{s 1}+m_{s 2}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | $\frac{1}{2}$ | 1 | $-\frac{1}{2}$ | 2 | 0 |
| 1 | $\frac{1}{2}$ | 0 | $\frac{1}{2}$ | 1 | 1 |
| 1 | $\frac{1}{2}$ | 0 | $-\frac{1}{2}$ | 1 | 0 |
| 1 | $-\frac{1}{2}$ | 0 | $\frac{1}{2}$ | 1 | 0 |
| 1 | $-\frac{1}{2}$ | 0 | $-\frac{1}{2}$ | 1 | $-1$ |
| 1 | $\frac{1}{2}$ | $-1$ | $\frac{1}{2}$ | 0 | 1 |
| 1 | $\frac{1}{2}$ | $-1$ | $-\frac{1}{2}$ | 0 | 0 |
| 1 | $-\frac{1}{2}$ | $-1$ | $\frac{1}{2}$ | 0 | 0 |
| 1 | $-\frac{1}{2}$ | $-1$ | $-\frac{1}{2}$ | 0 | $-1$ |
| 0 | $\frac{1}{2}$ | 0 | $-\frac{1}{2}$ | 0 | 0 |
| 0 | $\frac{1}{2}$ | $-1$ | $\frac{1}{2}$ | $-1$ | 1 |
| 0 | $\frac{1}{2}$ | $-1$ | $-\frac{1}{2}$ | $-1$ | 0 |
| 0 | $-\frac{1}{2}$ | $-1$ | $\frac{1}{2}$ | $-1$ | 0 |
| 0 | $-\frac{1}{2}$ | $-1$ | $-\frac{1}{2}$ | $-1$ | $-1$ |
| $-1$ | $\frac{1}{2}$ | $-1$ | $-\frac{1}{2}$ | $-2$ | 0 |

contains terms for all possible electron interchanges among the spin-orbitals. Two rows that differ from each other solely by interchange of two electrons correspond to the same Slater determinant, and we include only one of them in the table.

The highest value of $M_{L}$ in Table 11.1 is 2, which must correspond to a term with $L=2$, a $D$ term. The $M_{L}=2$ value occurs in conjunction with $M_{S}=0$, indicating that $S=0$ for the $D$ term. Thus we have a ${ }^{1} D$ term corresponding to the five states

\(
\begin{equation}
\begin{array}{lrrrrr}
M_{L}= & 2 & 1 & 0 & -1 & -2 \\
M_{S}= & 0 & 0 & 0 & 0 & 0
\end{array} \tag{11.45}
\end{equation}
\)

The highest value of $M_{S}$ in Table 11.1 is 1, indicating a term with $S=1$. $M_{S}=1$ occurs in conjunction with $M_{L}=1, 0, -1$, which indicates a $P$ term. Hence we have a ${ }^{3} P$ term corresponding to the nine states

\(
\begin{equation}
\begin{array}{lrrrrrrrrr}
M_{L}= & 1 & 1 & 1 & 0 & 0 & 0 & -1 & -1 & -1 \\
M_{S}= & 1 & 0 & -1 & 1 & 0 & -1 & 1 & 0 & -1
\end{array} \tag{11.46}
\end{equation}
\)

Elimination of the states of (11.45) and (11.46) from Table 11.1 leaves only a single state, which has $M_{L}=0$, $M_{S}=0$, corresponding to a ${ }^{1} S$ term. Thus a $p^{2}$ configuration gives rise to the terms ${ }^{1} S,{ }^{3} P,{ }^{1} D$. (In contrast, two nonequivalent $p$ electrons give rise to six terms: ${ }^{1} S,{ }^{3} S,{ }^{1} P,{ }^{3} P,{ }^{1} D,{ }^{3} D$.)

Table 11.2a lists the terms arising from various configurations of equivalent electrons. These results may be derived in the same way that we found the $p^{2}$ terms, but this procedure can become quite involved. To derive the terms of the $f^{7}$ configuration would require a table with 3432 rows. More efficient methods exist [R. F. Curl and J. E. Kilpatrick, Am. J. Phys., 28, 357 (1960); K. E. Hyde, J. Chem. Educ., 52, 87 (1975)].
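For small cases, the Table 11.1 procedure itself is straightforward to automate. The sketch below (ours; a brute-force version of the counting method, not one of the more efficient published algorithms cited above) enumerates the Slater determinants of $N$ equivalent electrons with given $l$ and peels off terms from the $(M_L, M_S)$ counts:

```python
# Derive the terms of l^N (equivalent electrons) from Slater determinants.
from collections import Counter
from itertools import combinations

def equivalent_terms(l, n_elec):
    # spin-orbitals (m, 2*ms); combinations() enforces the Pauli principle
    spin_orbitals = [(m, ms2) for m in range(-l, l + 1) for ms2 in (1, -1)]
    counts = Counter()
    for occ in combinations(spin_orbitals, n_elec):
        ML = sum(m for m, _ in occ)
        MS2 = sum(ms2 for _, ms2 in occ)          # 2*M_S, kept integral
        counts[(ML, MS2)] += 1
    terms = []
    while counts:
        L = max(ML for ML, _ in counts)           # largest surviving M_L
        S2 = max(MS2 for ML, MS2 in counts if ML == L)
        terms.append(f"{S2 + 1}{'SPDFGHIK'[L]}")  # 2S+1 equals S2 + 1
        for ML in range(-L, L + 1):               # strip the whole term
            for MS2 in range(-S2, S2 + 1, 2):
                counts[(ML, MS2)] -= 1
                if counts[(ML, MS2)] == 0:
                    del counts[(ML, MS2)]
    return terms

print(equivalent_terms(1, 2))   # p^2 -> ['1D', '3P', '1S']
print(equivalent_terms(2, 2))   # d^2 -> ['1G', '3F', '1D', '3P', '1S']
```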

TABLE 11.2 Terms Arising from Various Electron Configurations

| Configuration | Terms |
| :--- | :--- |
| (a) Equivalent electrons | |
| $s^{2}$; $p^{6}$; $d^{10}$ | ${ }^{1} S$ |
| $p$; $p^{5}$ | ${ }^{2} P$ |
| $p^{2}$; $p^{4}$ | ${ }^{3} P,{ }^{1} D,{ }^{1} S$ |
| $p^{3}$ | ${ }^{4} S,{ }^{2} D,{ }^{2} P$ |
| $d$; $d^{9}$ | ${ }^{2} D$ |
| $d^{2}$; $d^{8}$ | ${ }^{3} F,{ }^{3} P,{ }^{1} G,{ }^{1} D,{ }^{1} S$ |
| $d^{3}$; $d^{7}$ | ${ }^{4} F,{ }^{4} P,{ }^{2} H,{ }^{2} G,{ }^{2} F,{ }^{2} D(2),{ }^{2} P$ |
| $d^{4}$; $d^{6}$ | ${ }^{5} D,{ }^{3} H,{ }^{3} G,{ }^{3} F(2),{ }^{3} D,{ }^{3} P(2),{ }^{1} I,{ }^{1} G(2),{ }^{1} F,{ }^{1} D(2),{ }^{1} S(2)$ |
| $d^{5}$ | ${ }^{6} S,{ }^{4} G,{ }^{4} F,{ }^{4} D,{ }^{4} P,{ }^{2} I,{ }^{2} H,{ }^{2} G(2),{ }^{2} F(2),{ }^{2} D(3),{ }^{2} P,{ }^{2} S$ |

(b) Nonequivalent electrons

| Configuration | Terms |
| :--- | :--- |
| $s s$ | ${ }^{1} S,{ }^{3} S$ |
| $s p$ | ${ }^{1} P,{ }^{3} P$ |
| $s d$ | ${ }^{1} D,{ }^{3} D$ |
| $p p$ | ${ }^{3} D,{ }^{1} D,{ }^{3} P,{ }^{1} P,{ }^{3} S,{ }^{1} S$ |

Note from Table 11.2a that the terms arising from a subshell containing $N$ electrons are the same as the terms for a subshell that is $N$ electrons short of being full. For example, the terms for $p^{2}$ and $p^{4}$ are the same. We can divide the electrons of a closed subshell into two groups and find the terms for each group. Because a closed subshell gives only a ${ }^{1} S$ term, the terms for each of these two groups must be the same. Table 11.2b gives the terms arising from some nonequivalent electron configurations.

To deal with a configuration containing both equivalent and nonequivalent electrons, we first find separately the terms from the nonequivalent electrons and the terms from the equivalent electrons. We then take all possible combinations of the $L$ and $S$ values of these two sets of terms. For example, consider an $s p^{3}$ configuration. From the $s$ electron, we get a ${ }^{2} S$ term. From the three equivalent $p$ electrons, we get the terms ${ }^{2} P,{ }^{2} D$, and ${ }^{4} S$ (Table 11.2a). Combining the $L$ and $S$ values of these terms, we have as the terms of an $s p^{3}$ configuration

\(
\begin{equation}
{ }^{3} P,{ }^{1} P,{ }^{3} D,{ }^{1} D,{ }^{5} S,{ }^{3} S \tag{11.47}
\end{equation}
\)

Hund's Rule

To decide which one of the terms arising from a given electron configuration is lowest in energy, we use the empirical Hund's rule: For terms arising from the same electron configuration, the term with the largest value of $S$ lies lowest. If there is more than one term with the largest $S$, then the term with the largest $S$ and the largest $L$ lies lowest.

EXAMPLE

Use Table 11.2 to predict the lowest term of (a) the carbon ground-state configuration $1 s^{2} 2 s^{2} 2 p^{2}$; (b) the configuration $1 s^{2} 2 s^{2} 2 p^{6} 3 s^{2} 3 p^{6} 3 d^{2} 4 s^{2}$.
(a) Table 11.2a gives the terms arising from a $p^{2}$ configuration as ${ }^{3} P,{ }^{1} D$, and ${ }^{1} S$. The term with the largest $S$ will have the largest value of the left superscript $2 S+1$. Hund's rule predicts ${ }^{3} P$ as the lowest term. (b) Table 11.2a gives the $d^{2}$ terms as ${ }^{3} F,{ }^{3} P,{ }^{1} G,{ }^{1} D,{ }^{1} S$. Of these terms, ${ }^{3} F$ and ${ }^{3} P$ have the highest $S$. ${ }^{3} F$ has $L=3$; ${ }^{3} P$ has $L=1$. Therefore, ${ }^{3} F$ is predicted to be lowest.
EXERCISE Predict the lowest term of the $1 s^{2} 2 s 2 p^{3}$ configuration using (11.47). (Answer: ${ }^{5} S$.)

Hund's rule works very well for the ground-state configuration, but occasionally fails for an excited configuration (Prob. 11.30).

Hund's rule gives only the lowest term of a configuration and should not be used to decide the order of the remaining terms. For example, for the $1 s^{2} 2 s 2 p^{3}$ configuration of carbon, the observed order of the terms is

\(
{ }^{5} S<{ }^{3} D<{ }^{3} P<{ }^{1} D<{ }^{3} S<{ }^{1} P
\)

The ${ }^{3} S$ term lies above the ${ }^{1} D$ term, even though ${ }^{3} S$ has the higher spin $S$.
It is not necessary to consult Table 11.2a to find the lowest term of a partly filled subshell configuration. We simply put the electrons in the orbitals so as to give the greatest number of parallel spins. Thus, for a $d^{3}$ configuration, we have

\(
\begin{equation}
m: \frac{\uparrow}{+2} \frac{\uparrow}{+1} \frac{\uparrow}{0} \overline{-1} \overline{-2} \tag{11.48}
\end{equation}
\)

The lowest term thus has three parallel spins, so $S=\frac{3}{2}$, giving $2 S+1=4$. The maximum value of $M_{L}$ is 3 , corresponding to $L=3$, an $F$ term. Hund's rule thus predicts ${ }^{4} F$ as the lowest term of a $d^{3}$ configuration.

The traditional explanation of Hund's rule is as follows: Electrons with the same spin tend to keep out of each other's way (recall the idea of Fermi holes), thereby minimizing the Coulombic repulsion between them. The term that has the greatest number of parallel spins (that is, the greatest value of $S$ ) will therefore be lowest in energy. For example, the ${ }^{3} S$ term of the helium $1 s 2 s$ configuration has an antisymmetric spatial function that vanishes when the spatial coordinates of the two electrons are equal. Hence the ${ }^{3} S$ term is lower than the ${ }^{1} S$ term.

This traditional explanation turns out to be wrong in most cases. It is true that the probability that the two electrons are very close together is smaller for the helium ${ }^{3} S$ $1 s 2 s$ term than for the ${ }^{1} S$ $1 s 2 s$ term. However, calculations with accurate wave functions show that the probability that the two electrons are very far apart is also less for the ${ }^{3} S$ term. The net result is that the average distance between the two electrons is slightly less for the ${ }^{3} S$ term than for the ${ }^{1} S$ term, and the interelectronic repulsion is slightly greater for the ${ }^{3} S$ term. The calculations show that the ${ }^{3} S$ term lies below the ${ }^{1} S$ term because of a substantially greater electron-nucleus attraction in the ${ }^{3} S$ term as compared with the ${ }^{1} S$ term. Similar results are found for terms of the atoms beryllium and carbon. [See J. Katriel and R. Pauncz, Adv. Quantum Chem., 10, 143 (1977).] The following explanation of these results has been proposed [I. Shim and J. P. Dahl, Theor. Chim. Acta, 48, 165 (1978)]: The Pauli "repulsion" between electrons of like spin makes the average angle between the radius vectors of the two electrons larger for the ${ }^{3} S$ term than for the ${ }^{1} S$ term. This reduces the electronic screening of the nucleus and allows the electrons to get closer to the nucleus in the ${ }^{3} S$ term, making the electron-nucleus attraction greater for the ${ }^{3} S$ term. [See also R. J. Boyd, Nature, 310, 480 (1984); T. Oyamada et al., J. Chem. Phys., 133, 164113 (2010).]

Eigenvalues of Two-Electron Spin Functions

The helium atom $1 s 2 s$ configuration gives rise to the term ${ }^{3} S$ with degeneracy $(2 L+1)(2 S+1)=1(3)=3$ and to the term ${ }^{1} S$ with degeneracy $1(1)=1$. The three helium zeroth-order wave functions (10.27) to (10.29) must correspond to the triply degenerate ${ }^{3} S$ term, and the single function (10.30) must correspond to the ${ }^{1} S$ term. Since $S=1$ and $M_{S}=1, 0, -1$ for the ${ }^{3} S$ term, the three spin functions in (10.27) to (10.29) should be eigenfunctions of $\hat{S}^{2}$ with eigenvalue $S(S+1) \hbar^{2}=2 \hbar^{2}$ and eigenfunctions of $\hat{S}_{z}$ with eigenvalues $M_{S} \hbar=\hbar, 0$, and $-\hbar$. The spin function in (10.30) should be an eigenfunction of $\hat{S}^{2}$ and $\hat{S}_{z}$ with eigenvalue zero in each case, since $S=0$ and $M_{S}=0$ here. We now verify these assertions.

From Eq. (11.43), the total-electron-spin operator is the sum of the spin operators for each electron:

\(
\begin{equation}
\hat{\mathbf{S}}=\hat{\mathbf{S}}_{1}+\hat{\mathbf{S}}_{2} \tag{11.49}
\end{equation}
\)

Taking the $z$ components of (11.49), we have

\(
\begin{align}
\hat{S}_{z} &= \hat{S}_{1 z}+\hat{S}_{2 z} \tag{11.50}\\
\hat{S}_{z}\, \alpha(1) \alpha(2) &= \hat{S}_{1 z}\, \alpha(1) \alpha(2)+\hat{S}_{2 z}\, \alpha(1) \alpha(2) \\
&= \alpha(2) \hat{S}_{1 z}\, \alpha(1)+\alpha(1) \hat{S}_{2 z}\, \alpha(2) \\
&= \tfrac{1}{2} \hbar\, \alpha(1) \alpha(2)+\tfrac{1}{2} \hbar\, \alpha(1) \alpha(2) \\
\hat{S}_{z}\, \alpha(1) \alpha(2) &= \hbar\, \alpha(1) \alpha(2) \tag{11.51}
\end{align}
\)

where Eq. (10.7) has been used. Similarly, we find

\(
\begin{equation}
\hat{S}_{z} \beta(1) \beta(2)=-\hbar \beta(1) \beta(2) \tag{11.52}
\end{equation}
\)

\(
\begin{align}
& \hat{S}_{z}[\alpha(1) \beta(2)+\beta(1) \alpha(2)]=0 \tag{11.53}\\
& \hat{S}_{z}[\alpha(1) \beta(2)-\beta(1) \alpha(2)]=0 \tag{11.54}
\end{align}
\)

Consider now $\hat{S}^{2}$. We have [Eq. (11.26)]

\(
\begin{array}{r}
\hat{S}^{2}=\left(\hat{\mathbf{S}}_{1}+\hat{\mathbf{S}}_{2}\right) \cdot\left(\hat{\mathbf{S}}_{1}+\hat{\mathbf{S}}_{2}\right)=\hat{S}_{1}^{2}+\hat{S}_{2}^{2}+2\left(\hat{S}_{1 x} \hat{S}_{2 x}+\hat{S}_{1 y} \hat{S}_{2 y}+\hat{S}_{1 z} \hat{S}_{2 z}\right) \tag{11.55}\\
\hat{S}^{2} \alpha(1) \alpha(2)=\alpha(2) \hat{S}_{1}^{2} \alpha(1)+\alpha(1) \hat{S}_{2}^{2} \alpha(2)+2 \hat{S}_{1 x} \alpha(1) \hat{S}_{2 x} \alpha(2) \\
+2 \hat{S}_{1 y} \alpha(1) \hat{S}_{2 y} \alpha(2)+2 \hat{S}_{1 z} \alpha(1) \hat{S}_{2 z} \alpha(2)
\end{array}
\)

Using Eqs. (10.7) to (10.9) and (10.72) and (10.73), we find

\(
\begin{equation}
\hat{S}^{2} \alpha(1) \alpha(2)=2 \hbar^{2} \alpha(1) \alpha(2) \tag{11.56}
\end{equation}
\)

Hence $\alpha(1) \alpha(2)$ is an eigenfunction of $\hat{S}^{2}$ corresponding to $S=1$. Similarly, we find

\(
\begin{aligned}
\hat{S}^{2} \beta(1) \beta(2) & =2 \hbar^{2} \beta(1) \beta(2) \\
\hat{S}^{2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)] & =2 \hbar^{2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)] \\
\hat{S}^{2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] & =0
\end{aligned}
\)

Thus the spin eigenfunctions in (10.27) to (10.30) correspond to the following values for the total spin quantum numbers:

\(
\begin{align}
\text{triplet:}\quad & \alpha(1) \alpha(2) & S&=1, & M_{S}&=1 \tag{11.57}\\
& 2^{-1/2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)] & S&=1, & M_{S}&=0 \tag{11.58}\\
& \beta(1) \beta(2) & S&=1, & M_{S}&=-1 \tag{11.59}\\
\text{singlet:}\quad & 2^{-1/2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] & S&=0, & M_{S}&=0 \tag{11.60}
\end{align}
\)

[In the notation of Section 11.4, we are dealing with the addition of two angular momenta with quantum numbers $j_{1}=\frac{1}{2}$ and $j_{2}=\frac{1}{2}$ to give eigenfunctions with total-angular-momentum quantum numbers $J=1$ and $J=0$. The coefficients in (11.57) to (11.60) correspond to the coefficients $C$ in (11.33) and are examples of Clebsch-Gordan coefficients.]
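These eigenvalue statements can also be checked numerically. A small sketch (ours, not from the text) represents the two-electron spin space in the basis $\{\alpha\alpha, \alpha\beta, \beta\alpha, \beta\beta\}$ using Kronecker products, with $\hbar=1$:

```python
# Verify S^2 eigenvalues of the functions (11.57)-(11.60) numerically.
import numpy as np

sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]])
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

# S^2 = S1^2 + S2^2 + 2 S1.S2, Eq. (11.55), with S1 = s x I and S2 = I x s
S_sq = sum(np.kron(s, I2) @ np.kron(s, I2) +
           np.kron(I2, s) @ np.kron(I2, s) +
           2 * np.kron(s, I2) @ np.kron(I2, s)
           for s in (sx, sy, sz))

aa        = np.array([1, 0, 0, 0], dtype=complex)           # alpha(1)alpha(2)
triplet_0 = np.array([0, 1, 1, 0], dtype=complex) / 2**0.5  # (11.58)
singlet   = np.array([0, 1, -1, 0], dtype=complex) / 2**0.5 # (11.60)

print(np.allclose(S_sq @ aa, 2 * aa))                # True: S(S+1) = 2
print(np.allclose(S_sq @ triplet_0, 2 * triplet_0))  # True: S = 1
print(np.allclose(S_sq @ singlet, 0 * singlet))      # True: S = 0
```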

Figure 11.3 shows the vector addition of $\mathbf{S}_{1}$ and $\mathbf{S}_{2}$ to form $\mathbf{S}$. It might seem surprising that the spin function (11.58), which has the $z$ components of the spins of the two electrons pointing in opposite directions, could have total spin quantum number $S=1$. Figure 11.3 shows how this is possible.

Atomic Wave Functions

In Section 10.6, we showed that two of the four zeroth-order wave functions of the $1 s 2 s$ helium configuration could be written as single Slater determinants, but the other two functions had to be expressed as linear combinations of two Slater determinants. Since $\hat{L}^{2}$ and $\hat{S}^{2}$ commute with the Hamiltonian (11.1) and with the exchange operator $\hat{P}_{i k}$, the zeroth-order functions should be eigenfunctions of $\hat{L}^{2}$ and $\hat{S}^{2}$. The Slater determinants $D_{2}$ and $D_{3}$ of Section 10.6 are not eigenfunctions of these operators and so are not suitable zeroth-order functions. We have just shown that the linear combinations (10.44) and (10.45) are eigenfunctions of $\hat{S}^{2}$, and they can also be shown to be eigenfunctions of $\hat{L}^{2}$.

For a configuration of closed subshells (for example, the helium ground state), we can write only a single Slater determinant. This determinant is an eigenfunction of $\hat{L}^{2}$ and $\hat{S}^{2}$ and is the correct zeroth-order function for the nondegenerate ${ }^{1} S$ term. A configuration with one electron outside closed subshells (for example, the boron ground configuration) gives rise to only one term. The Slater determinants for such a configuration differ from one another only in the $m$ and $m_{s}$ values of this electron and are the correct zeroth-order functions for the states of the term. When all the electrons in singly occupied orbitals have the same spin (either all $\alpha$ or all $\beta$), the correct zeroth-order function is a single Slater determinant [for example, see (10.42)]. When this is not true, one has to take a linear combination of a few Slater determinants to obtain the correct zeroth-order functions, which are eigenfunctions of $\hat{L}^{2}$ and $\hat{S}^{2}$. The correct linear combinations can be found by solving the secular equation of degenerate perturbation theory or by operator techniques. Tabulations of the correct combinations for various configurations are available (Slater, Atomic Structure, Vol. II). Hartree-Fock calculations of atomic term energies use these linear combinations and find the best possible orbital functions for the Slater determinants.

Each wave function of an atomic term is an eigenfunction of $\hat{L}^{2}$, $\hat{S}^{2}$, $\hat{L}_{z}$, and $\hat{S}_{z}$. Therefore, when one does a configuration-interaction calculation, only configuration functions that have the same $\hat{L}^{2}$, $\hat{S}^{2}$, $\hat{L}_{z}$, and $\hat{S}_{z}$ eigenvalues as the state under consideration are included in the expansion (11.17). For example, the helium $1 s^{2}$ ground term is ${ }^{1} S$, which has $L=0$ and $S=0$. The electron configuration $1 s 2 p$ produces ${ }^{1} P$ and ${ }^{3} P$ terms only and so gives rise to states with $L=1$ only. No configuration functions arising from the $1 s 2 p$ configuration can occur in the CI wave function (11.17) for the He ground state.

Parity of Atomic States

Consider the atomic Hamiltonian (11.1). We showed in Section 7.5 that the parity operator $\hat{\Pi}$ commutes with the kinetic-energy operator. The quantity $1/r_{i}$ in (11.1) is $r_{i}^{-1}=\left(x_{i}^{2}+y_{i}^{2}+z_{i}^{2}\right)^{-1/2}$. Replacement of each coordinate by its negative leaves $1/r_{i}$ unchanged. Also

\(
r_{i j}^{-1}=\left[\left(x_{i}-x_{j}\right)^{2}+\left(y_{i}-y_{j}\right)^{2}+\left(z_{i}-z_{j}\right)^{2}\right]^{-1/2}
\)

and inversion has no effect on $1 / r_{i j}$. Thus $\hat{\Pi}$ commutes with the atomic Hamiltonian, and we can choose atomic wave functions to have definite parity.

FIGURE 11.3 Vector addition of the spins of two electrons. For $\alpha(1) \alpha(2)$ and $\beta(1) \beta(2)$, the projections of $\mathbf{S}_{1}$ and $\mathbf{S}_{2}$ in the $x y$ plane make an angle of $90^{\circ}$ with each other (Prob. 11.18c).

For a one-electron atom, the spatial wave function is $\psi=R(r) Y_{l}^{m}(\theta, \phi)$. The radial function is unchanged on inversion, and the parity is determined by the angular factor. In Prob. 7.29 we showed that $Y_{l}^{m}$ is an even function when $l$ is even and an odd function when $l$ is odd. Thus the states of one-electron atoms have even or odd parity according to whether $l$ is even or odd.

Now consider an $n$-electron atom. In the Hartree-Fock central-field approximation, we write the wave function as a Slater determinant (or linear combination of Slater determinants) of spin-orbitals. The wave function is the sum of terms, the spatial factor in each term having the form

\(
R_{1}\left(r_{1}\right) \cdots R_{n}\left(r_{n}\right) Y_{l_{1}}^{m_{1}}\left(\theta_{1}, \phi_{1}\right) \cdots Y_{l_{n}}^{m_{n}}\left(\theta_{n}, \phi_{n}\right)
\)

The parity of this product is determined by the spherical-harmonic factors. We see that the product is an even or odd function according to whether $l_{1}+l_{2}+\cdots+l_{n}$ is an even or odd number. Therefore, the parity of an atomic state is found by adding the $l$ values of the electrons in the electron configuration that gives rise to the state: If $\sum_{i} l_{i}$ is an even number, then $\psi$ is an even function; if $\sum_{i} l_{i}$ is odd, $\psi$ is odd. For example, the configuration $1 s^{2} 2 s 2 p^{3}$ has $\sum_{i} l_{i}=0+0+0+1+1+1=3$, and all states arising from this electron configuration have odd parity. (Our argument was based on the SCF approximation to $\psi$, but the conclusions are valid for the true $\psi$.)
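This parity rule reduces to one line of code; a trivial sketch (ours):

```python
# Parity of an atomic state: even if the sum of the electrons' l values is
# even, odd otherwise.
def parity(l_values):
    return "even" if sum(l_values) % 2 == 0 else "odd"

print(parity([0, 0, 0, 1, 1, 1]))   # 1s^2 2s 2p^3 -> 'odd', as in the text
```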

Total Electronic Angular Momentum and Atomic Levels

The total electronic angular momentum $\mathbf{J}$ of an atom is the vector sum of the total electronic orbital and spin angular momenta:

\(
\begin{equation}
\mathbf{J}=\mathbf{L}+\mathbf{S} \tag{11.61}
\end{equation}
\)

The operator $\hat{\mathbf{J}}$ for the total electronic angular momentum commutes with the atomic Hamiltonian, and we may characterize an atomic state by a quantum number $J$, which has the possible values [Eq. (11.39)]

\(
\begin{equation}
L+S, L+S-1, \ldots,|L-S| \tag{11.62}
\end{equation}
\)

We have $\hat{J}^{2} \psi=J(J+1) \hbar^{2} \psi$.
For the atomic Hamiltonian (11.1), all states that belong to the same term have the same energy. However, when spin-orbit interaction (Section 11.6) is included in $\hat{H}$, one finds that the value of the quantum number $J$ affects the energy slightly. Hence states that belong to the same term but that have different values of $J$ will have slightly different energies. The set of states that belong to the same term and that have the same value of $J$ constitutes an atomic level. The energies of different levels belonging to the same term are slightly different. (See Fig. 11.6 in Section 11.7.) To denote the level, one adds the $J$ value as a right subscript on the term symbol. Each level is $(2 J+1)$-fold degenerate, corresponding to the $2 J+1$ values of $M_{J}$, where $M_{J} \hbar$ is the $z$ component of the total electronic angular momentum $\mathbf{J}$. Each level consists of $2 J+1$ states of equal energy.

EXAMPLE

Find the levels of a ${ }^{3} P$ term and give the degeneracy of each level.
For a ${ }^{3} P$ term, $2 S+1=3$ and $S=1$; also, from (11.42), $L=1$. With $L=1$ and $S=1$, (11.62) gives $J=2,1,0$. The levels are ${ }^{3} P_{2}$, ${ }^{3} P_{1}$, and ${ }^{3} P_{0}$.
The ${ }^{3} P_{2}$ level has $J=2$ and has $2 J+1=5$ values of $M_{J}$, namely $-2,-1,0,1$, and 2. The ${ }^{3} P_{2}$ level is 5-fold degenerate. The ${ }^{3} P_{1}$ level is $2(1)+1=3$-fold degenerate. The ${ }^{3} P_{0}$ level has $2 J+1=1$ and is nondegenerate.

The total number of states for the three levels ${ }^{3} P_{2}$, ${ }^{3} P_{1}$, and ${ }^{3} P_{0}$ is $5+3+1=9$. When spin-orbit interaction was neglected, we had a ${ }^{3} P$ term that consisted of $(2 L+1)(2 S+1)=3(3)=9$ equal-energy states. With spin-orbit interaction, the 9-fold-degenerate term splits into three closely spaced levels: ${ }^{3} P_{2}$ with five states, ${ }^{3} P_{1}$ with three states, and ${ }^{3} P_{0}$ with one state. The total number of states is the same for the term ${ }^{3} P$ as for the three levels arising from this term.
EXERCISE Find the levels of a ${ }^{2} D$ term and give the level degeneracies. (Answer: ${ }^{2} D_{5 / 2}$, ${ }^{2} D_{3 / 2}$; 6, 4.)
The quantity $2 S+1$ is called the electron-spin multiplicity (or the multiplicity) of the term. If $L \geq S$, the possible values of $J$ in (11.62) range from $L+S$ to $L-S$ and are $2 S+1$ in number, so in this case the spin multiplicity equals the number of levels that arise from a given term. For $L<S$, the values of $J$ range from $S+L$ to $S-L$ and are $2 L+1$ in number. In this case, the spin multiplicity is greater than the number of levels. For example, if $L=0$ and $S=1$ (a ${ }^{3} S$ term), the spin multiplicity is 3, but there is only one possible value for $J$, namely, $J=1$. For $2 S+1=1,2,3,4,5,6, \ldots$, the words singlet, doublet, triplet, quartet, quintet, sextet, $\ldots$, are used to designate the spin multiplicity. The level symbol ${ }^{3} P_{1}$ is read as "triplet $P$ one."
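
The level bookkeeping in the example above is easy to automate. Here is a minimal Python sketch (the helper name `levels` is a hypothetical choice) that lists the $J$ values of (11.62) together with the $2 J+1$ degeneracy of each level; the degeneracies always sum to $(2 L+1)(2 S+1)$.

```python
# Levels arising from a term with quantum numbers L and S, per Eq. (11.62).
# Fractions are used so that half-integral J values print exactly.
from fractions import Fraction

def levels(L, S):
    """Return a list of (J, 2J+1) pairs, from J = L+S down to J = |L-S|."""
    L, S = Fraction(L), Fraction(S)
    out, J = [], L + S
    while J >= abs(L - S):
        out.append((J, int(2 * J + 1)))
        J -= 1
    return out

print(levels(1, 1))                 # 3P term: J = 2, 1, 0 with degeneracies 5, 3, 1
print(levels(2, Fraction(1, 2)))    # 2D term: J = 5/2, 3/2 with degeneracies 6, 4
```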

For light atoms, the spin-orbit interaction is very small and the separation between levels of a term is very small. Note from Fig. 11.6 in Section 11.7 that the separations between the ${ }^{3} P_{0}$, ${ }^{3} P_{1}$, and ${ }^{3} P_{2}$ levels of the helium $1 s 2 p\;{ }^{3} P$ term are far, far less than the separation between the ${ }^{1} P$ and ${ }^{3} P$ terms of the $1 s 2 p$ configuration.

Terms and Levels of Hydrogen and Helium

The hydrogen atom has one electron. Hence $L=l$ and $S=s=\frac{1}{2}$. The possible values of $J$ are $L+\frac{1}{2}$ and $L-\frac{1}{2}$, except for $L=0$, where $J=\frac{1}{2}$ is the only possibility. Each electron configuration gives rise to only one term, which is composed of one level if $L=0$ and two levels if $L \neq 0$. The ground-state configuration $1 s$ gives the term ${ }^{2} S$, which is composed of the single level ${ }^{2} S_{1 / 2}$; the level is twofold degenerate $\left(M_{J}=-\frac{1}{2}, \frac{1}{2}\right)$. The $2 s$ configuration also gives a ${ }^{2} S_{1 / 2}$ level. The $2 p$ configuration gives rise to the levels ${ }^{2} P_{3 / 2}$ and ${ }^{2} P_{1 / 2}$; the ${ }^{2} P_{3 / 2}$ level is fourfold degenerate, and the ${ }^{2} P_{1 / 2}$ level is twofold degenerate. There are $2+4+2=8$ states with $n=2$, in agreement with our previous work.

The helium ground-state configuration $1 s^{2}$ is a closed subshell and gives rise to the single level ${ }^{1} S_{0}$, which is nondegenerate $\left(M_{J}=0\right)$. The $1 s 2 s$ excited configuration gives rise to the two terms ${ }^{1} S$ and ${ }^{3} S$, each of which has one level. The ${ }^{1} S_{0}$ level is nondegenerate; the ${ }^{3} S_{1}$ level is threefold degenerate. The $1 s 2 p$ configuration gives rise to the terms ${ }^{1} P$ and ${ }^{3} P$. The levels of ${ }^{3} P$ are ${ }^{3} P_{2}$, ${ }^{3} P_{1}$, and ${ }^{3} P_{0}$; ${ }^{1} P$ has the single level ${ }^{1} P_{1}$.

Tables of Atomic Energy Levels

The spectroscopically determined energy levels for atoms with atomic number less than 90 are given in the tables of C. E. Moore and others: C. E. Moore, Atomic Energy Levels, National Bureau of Standards Circular 467, vols. I, II, and III, 1949, 1952, and 1958, Washington D.C.; these have been reprinted as Natl. Bur. Stand. Publ. NSRDS-NBS 35, 1971; W. C. Martin et al., Atomic Energy Levels-The Rare-Earth Elements, Natl. Bur. Stand. Publ. NSRDS-NBS 60, Washington, D.C., 1978. The atomic-energy-level data in Moore's tables and subsequent revisions are available online at the NIST Atomic Spectra Database at www.nist.gov/pml/data/asd.cfm.

These tables also list the levels of many atomic ions. Spectroscopists use the symbol I to indicate a neutral atom, the symbol II to indicate a singly ionized atom, and so on. To view the energy level data for the $\mathrm{C}^{2+}$ ion in the NIST online database, click on Levels and then enter C III after Spectrum.

FIGURE 11.4 Some term energies of the carbon atom (energies in eV).

The tables take the zero level of energy at the lowest energy level of the atom and list the level energies $E_{i}$ as $E_{i} / h c$ in $\mathrm{cm}^{-1}$, where $h$ and $c$ are Planck's constant and the speed of light. The difference in $E / h c$ values for two levels gives the wavenumber [Eq. (4.64)] of the spectral transition between the levels (provided the transition is allowed). An energy $E$ of 1 eV corresponds to $E / h c=8065.544 \mathrm{~cm}^{-1}$ (Prob. 11.29).
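
The quoted conversion factor is easy to verify from the defined SI values of $h$, $c$, and the elementary charge; a minimal check:

```python
# Verify that E = 1 eV corresponds to E/hc ~ 8065.544 cm^-1.
h = 6.62607015e-34    # Planck constant, J s
c = 2.99792458e10     # speed of light in cm/s, so E/hc comes out in cm^-1
e = 1.602176634e-19   # 1 eV in joules

print(e / (h * c))    # ~8065.54 cm^-1, matching the value in the text
```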

Figure 11.4 shows some of the term energies of the carbon atom. The separations between levels of each term are too small to be visible in this figure.


Spin-Orbit Interaction

The atomic Hamiltonian (11.1) does not involve electron spin. In reality, the existence of spin adds an additional term, usually small, to the Hamiltonian. This term, called the spin-orbit interaction, splits an atomic term into levels. Spin-orbit interaction is a relativistic effect and is properly derived using Dirac's relativistic treatment of the electron. This section gives a qualitative discussion of the origin of spin-orbit interaction.

If we imagine ourselves riding on an electron in an atom, from our viewpoint, the nucleus is moving around the electron (as the sun appears to move around the earth). This apparent motion of the nucleus produces a magnetic field that interacts with the intrinsic (spin) magnetic moment of the electron, giving the spin-orbit interaction term in the Hamiltonian. The interaction energy of a magnetic moment $\mathbf{m}$ with a magnetic field $\mathbf{B}$ is given by (6.131) as $-\mathbf{m} \cdot \mathbf{B}$. The electron's spin magnetic moment $\mathbf{m}_{S}$ is proportional to its spin $\mathbf{S}$ [Eq. (10.57)], and the magnetic field arising from the apparent nuclear motion is proportional to the electron's orbital angular momentum $\mathbf{L}$. Therefore, the spin-orbit interaction is proportional to $\mathbf{L} \cdot \mathbf{S}$. The dot product of $\mathbf{L}$ and $\mathbf{S}$ depends on the relative orientation of these two vectors. The total electronic angular momentum $\mathbf{J}=\mathbf{L}+\mathbf{S}$ also depends on the relative orientation of $\mathbf{L}$ and $\mathbf{S}$, and so the spin-orbit interaction energy depends on $J$ [Eq. (11.67)].

When a proper relativistic derivation of the spin-orbit-interaction term $\hat{H}_{\text {S.O. }}$ in the atomic Hamiltonian is carried out, one finds that for a one-electron atom (see Bethe and Jackiw, Chapters 8 and 23)

\(
\begin{equation}
\hat{H}_{\text {S.O. }}=\frac{1}{2 m_{e}^{2} c^{2}} \frac{1}{r} \frac{d V}{d r} \hat{\mathbf{L}} \cdot \hat{\mathbf{S}} \tag{11.63}
\end{equation}
\)

where $V$ is the potential energy experienced by the electron in the atom and $c$ is the speed of light. One way to calculate $\hat{H}_{\text {S.O. }}$ for a many-electron atom is first to neglect $\hat{H}_{\text {S.O. }}$ and do an SCF calculation (Section 11.1) using the central-field approximation to get an effective potential energy $V_{i}\left(r_{i}\right)$ for each electron $i$ in the field of the nucleus and the other electrons viewed as charge clouds [Eqs. (11.7) and (11.8)]. One then sums (11.63) over the electrons to get

\(
\begin{equation}
\hat{H}_{\text {S.O. }} \approx \frac{1}{2 m_{e}^{2} c^{2}} \sum_{i} \frac{1}{r_{i}} \frac{d V_{i}\left(r_{i}\right)}{d r_{i}} \hat{\mathbf{L}}_{i} \cdot \hat{\mathbf{S}}_{i}=\sum_{i} \xi_{i}\left(r_{i}\right) \hat{\mathbf{L}}_{i} \cdot \hat{\mathbf{S}}_{i} \tag{11.64}
\end{equation}
\)

where the definition of $\xi_{i}\left(r_{i}\right)$ is obvious and $\hat{\mathbf{L}}_{i}$ and $\hat{\mathbf{S}}_{i}$ are the operators for orbital and spin angular momenta of electron $i$.

Calculating the spin-orbit interaction energy $E_{\text {S.O. }}$ by finding the eigenfunctions and eigenvalues of the operator $\hat{H}_{(11.1)}+\hat{H}_{\text {S.O. }}$, where $\hat{H}_{(11.1)}$ is the Hamiltonian of Eq. (11.1), is difficult. One therefore usually estimates $E_{\text {S.O. }}$ by using perturbation theory. Except for heavy atoms, the effect of $\hat{H}_{\text {S.O. }}$ is small compared with the effect of $\hat{H}_{(11.1)}$, and first-order perturbation theory can be used to estimate $E_{\text {S.O. }}$.

Equation (9.22) gives $E_{\text {S.O. }} \approx\langle\psi| \hat{H}_{\text {S.O. }}|\psi\rangle$, where $\psi$ is an eigenfunction of $\hat{H}_{(11.1)}$. For a one-electron atom,

\(
\begin{equation}
E_{\text {S.O. }} \approx\langle\psi| \xi(r) \hat{\mathbf{L}} \cdot \hat{\mathbf{S}}|\psi\rangle \tag{11.65}
\end{equation}
\)

We have

\(
\begin{aligned}
\mathbf{J} \cdot \mathbf{J} & =(\mathbf{L}+\mathbf{S}) \cdot(\mathbf{L}+\mathbf{S})=L^{2}+S^{2}+2 \mathbf{L} \cdot \mathbf{S} \\
\mathbf{L} \cdot \mathbf{S} & =\frac{1}{2}\left(J^{2}-L^{2}-S^{2}\right) \\
(\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}) \psi=\frac{1}{2}\left(\hat{J}^{2}-\hat{L}^{2}-\hat{S}^{2}\right) \psi & =\frac{1}{2}[J(J+1)-L(L+1)-S(S+1)] \hbar^{2} \psi
\end{aligned}
\)

since the unperturbed $\psi$ is an eigenfunction of $\hat{L}^{2}, \hat{S}^{2}$, and $\hat{J}^{2}$. Therefore,

\(
\begin{equation}
E_{\text {S.O. }} \approx \frac{1}{2}\langle\xi\rangle \hbar^{2}[J(J+1)-L(L+1)-S(S+1)] \tag{11.66}
\end{equation}
\)

For a many-electron atom, it can be shown (Bethe and Jackiw, p. 164) that the spin-orbit interaction energy is

\(
\begin{equation}
E_{\text {S.O. }} \approx \frac{1}{2} A \hbar^{2}[J(J+1)-L(L+1)-S(S+1)] \tag{11.67}
\end{equation}
\)

where $A$ is a constant for a given term; that is, $A$ depends on $L$ and $S$ but not on $J$. Equation (11.67) shows that when we include the spin-orbit interaction, the energy of an atomic state depends on its total electronic angular momentum $J$. Thus each atomic term is split into levels, each level having a different value of $J$. For example, the $1 s^{2} 2 s^{2} 2 p^{6} 3 p$ configuration of sodium has the single term ${ }^{2} P$, which is composed of the two levels ${ }^{2} P_{3 / 2}$ and ${ }^{2} P_{1 / 2}$. The splitting of these levels gives the observed fine structure of the sodium D line (Fig. 11.5). The levels of a given term are said to form its multiplet structure.
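
Equation (11.67) makes the splitting pattern within a term easy to tabulate. The sketch below evaluates $E_{\text {S.O. }}$ in units of $A \hbar^{2}$ for the ${ }^{3} P$ term; note that the spacing between adjacent levels comes out proportional to the larger $J$ value (the Landé interval rule).

```python
# Spin-orbit energies within a term, Eq. (11.67), in units of A*hbar^2.
from fractions import Fraction

def so_energy(J, L, S):
    """E_S.O./(A hbar^2) = (1/2)[J(J+1) - L(L+1) - S(S+1)]."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    return (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2

for J in (2, 1, 0):                    # the 3P term has L = 1, S = 1
    print(J, so_energy(J, 1, 1))       # 2 -> 1, 1 -> -1, 0 -> -2
# Intervals: E(J=2) - E(J=1) = 2 and E(J=1) - E(J=0) = 1, in the ratio 2:1.
```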

What about the order of levels within a given term? Since $L$ and $S$ are the same for such levels, their relative energies are determined, according to Eq. (11.67), by $A J(J+1)$. If $A$ is positive, the level with the lowest value of $J$ lies lowest, and the multiplet is said to be regular. If $A$ is negative, the level with the highest value of $J$ lies lowest, and the multiplet is said to be inverted. The following rule usually applies to a configuration with only one partly filled subshell: If this subshell is less than half filled, the multiplet is regular; if this subshell is more than half filled, the multiplet is inverted. (A few exceptions exist.) For the half-filled case, see Prob. 11.28.

FIGURE 11.5 Fine structure of the sodium $D$ line.

EXAMPLE

Find the ground level of the oxygen atom.
The ground electron configuration is $1 s^{2} 2 s^{2} 2 p^{4}$. Table 11.2 gives ${ }^{1} S$, ${ }^{1} D$, and ${ }^{3} P$ as the terms of this configuration. By Hund's rule, ${ }^{3} P$ is the lowest term. [Alternatively, a diagram like (11.48) could be used to conclude that ${ }^{3} P$ is the lowest term.] The ${ }^{3} P$ term has $L=1$ and $S=1$, so the possible $J$ values are 2, 1, and 0. The levels of ${ }^{3} P$ are ${ }^{3} P_{2}$, ${ }^{3} P_{1}$, and ${ }^{3} P_{0}$. The $2 p$ subshell is more than half filled, so the rule just given predicts the multiplet is inverted, and the ${ }^{3} P_{2}$ level lies lowest. This is the ground level of O.

EXERCISE Find the ground level of the Cl atom. (Answer: ${ }^{2} P_{3 / 2}$.)
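
The regular/inverted rule can be phrased as a small decision function. The sketch below encodes only that rule; the $L$ and $S$ of the lowest term (found from Hund's rule) and the occupancy and capacity of the partly filled subshell must be supplied. The function name is a hypothetical choice, and the exactly-half-filled case is left aside, as in the text.

```python
# Ground level from the rule above: lowest J for a less-than-half-filled
# subshell (regular multiplet), highest J for a more-than-half-filled one
# (inverted multiplet). Not a general term-symbol engine.
from fractions import Fraction

def ground_J(L, S, n_elec, capacity):
    """capacity = 2(2l+1) for the partly filled subshell."""
    L, S = Fraction(L), Fraction(S)
    if 2 * n_elec < capacity:
        return abs(L - S)    # regular: lowest J lies lowest
    return L + S             # inverted: highest J lies lowest

print(ground_J(1, 1, 4, 6))               # O, 2p^4, 3P term -> 2, i.e. 3P2
print(ground_J(1, Fraction(1, 2), 5, 6))  # Cl, 3p^5, 2P term -> 3/2, i.e. 2P3/2
```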


The Atomic Hamiltonian

The Hamiltonian operator of an atom can be divided into three parts:

\(
\begin{equation}
\hat{H}=\hat{H}^{0}+\hat{H}_{\text {rep }}+\hat{H}_{\text {S.O. }} \tag{11.68}
\end{equation}
\)

where $\hat{H}^{0}$ is the sum of hydrogenlike Hamiltonians,

\(
\begin{equation}
\hat{H}^{0}=\sum_{i=1}^{n}\left(-\frac{\hbar^{2}}{2 m_{e}} \nabla_{i}^{2}-\frac{Z e^{2}}{4 \pi \varepsilon_{0} r_{i}}\right) \tag{11.69}
\end{equation}
\)

$\hat{H}_{\text {rep }}$ consists of the interelectronic repulsions,

\(
\begin{equation}
\hat{H}_{\text {rep }}=\sum_{i} \sum_{j>i} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{i j}} \tag{11.70}
\end{equation}
\)

and $\hat{H}_{\text {S.O. }}$ is the spin-orbit interaction (11.64):

\(
\begin{equation}
\hat{H}_{\text {S.O. }}=\sum_{i=1}^{n} \xi_{i} \hat{\mathbf{L}}_{i} \cdot \hat{\mathbf{S}}_{i} \tag{11.71}
\end{equation}
\)

If we consider just $\hat{H}^{0}$, all atomic states corresponding to the same electronic configuration are degenerate. Adding in $\hat{H}_{\text {rep }}$, we lift the degeneracy between states with different $L$ or $S$ or both, thus splitting each configuration into terms. Next, we add in $\hat{H}_{\text {S.O. }}$, which splits each term into levels. Each level is composed of states with the same value of $J$ and is $(2 J+1)$-fold degenerate, corresponding to the possible values of $M_{J}$.

We can remove the degeneracy of each level by applying an external magnetic field (the Zeeman effect). If $\mathbf{B}$ is the applied field, we have the additional term in the Hamiltonian [Eq. (6.131)]

\(
\begin{equation}
\hat{H}_{B}=-\hat{\mathbf{m}} \cdot \mathbf{B}=-\left(\hat{\mathbf{m}}_{L}+\hat{\mathbf{m}}_{S}\right) \cdot \mathbf{B} \tag{11.72}
\end{equation}
\)

where both the orbital and spin magnetic moments have been included. Using $\mathbf{m}_{L}=-\left(e / 2 m_{e}\right) \mathbf{L}$, $\mathbf{m}_{S}=-\left(e / m_{e}\right) \mathbf{S}$, and $\mu_{\mathrm{B}} \equiv e \hbar / 2 m_{e}$ [Eqs. (6.128), (10.55) with $g_{e} \approx 2$, and (6.130)], we have

\(
\begin{equation}
\hat{H}_{B}=\mu_{\mathrm{B}} \hbar^{-1}(\hat{\mathbf{L}}+2 \hat{\mathbf{S}}) \cdot \mathbf{B}=\mu_{\mathrm{B}} \hbar^{-1}(\hat{\mathbf{J}}+\hat{\mathbf{S}}) \cdot \mathbf{B}=\mu_{\mathrm{B}} B \hbar^{-1}\left(\hat{J}_{z}+\hat{S}_{z}\right) \tag{11.73}
\end{equation}
\)

where $\mu_{\mathrm{B}}$ is the Bohr magneton and the $z$ axis is taken along the direction of the field. If the external field is reasonably weak, its effect will be less than that of the spin-orbit interaction, and the effect of the field can be calculated by use of first-order perturbation theory. Since $\hat{J}_{z} \psi=M_{J} \hbar \psi$, the energy of interaction with the applied field is

\(
E_{B}=\langle\psi| \hat{H}_{B}|\psi\rangle=\mu_{\mathrm{B}} B M_{J}+\mu_{\mathrm{B}} B \hbar^{-1}\left\langle S_{z}\right\rangle
\)

Evaluation of $\left\langle S_{z}\right\rangle$ (Bethe and Jackiw, p. 169) gives as the final weak-field result:

\(
\begin{equation}
E_{B}=\mu_{\mathrm{B}} g B M_{J} \tag{11.74}
\end{equation}
\)

where $g$ (the Landé $g$ factor) is given by

\(
\begin{equation}
g=1+\frac{[J(J+1)-L(L+1)+S(S+1)]}{2 J(J+1)} \tag{11.75}
\end{equation}
\)

Thus the external field splits each level into $2 J+1$ states, each state having a different value of $M_{J}$.
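
Equations (11.74) and (11.75) are straightforward to evaluate numerically. A minimal sketch; the ${ }^{2} P_{3 / 2}$ level ($L=1$, $S=\frac{1}{2}$, $J=\frac{3}{2}$) gives $g=4 / 3$ and four equally spaced Zeeman states:

```python
# Lande g factor, Eq. (11.75), and weak-field Zeeman energies, Eq. (11.74).
from fractions import Fraction

MU_B = 9.2740100783e-24   # Bohr magneton, J/T

def lande_g(J, L, S):
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    return 1 + (J * (J + 1) - L * (L + 1) + S * (S + 1)) / (2 * J * (J + 1))

def zeeman_energies(J, L, S, B):
    """E_B = mu_B * g * B * M_J (in joules) for M_J = -J, ..., +J."""
    g = lande_g(J, L, S)
    M = -Fraction(J)
    out = []
    while M <= J:
        out.append(float(MU_B * B * g * M))
        M += 1
    return out

print(lande_g(Fraction(3, 2), 1, Fraction(1, 2)))               # 4/3
print(zeeman_energies(Fraction(3, 2), 1, Fraction(1, 2), 1.0))  # 4 states at B = 1 T
```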

Figure 11.6 shows what happens when we consider successive interactions in an atom, using the $1 s 2 p$ configuration of helium as the example.

FIGURE 11.6 Effect of inclusion of successive terms in the atomic Hamiltonian for the $1 s 2 p$ helium configuration. $\hat{H}_{B}$ is not part of the atomic Hamiltonian but is due to an applied magnetic field.

We have based the discussion on a scheme in which we first added the individual electronic orbital angular momenta to form a total-orbital-angular-momentum vector and did the same for the spins: $\mathbf{L}=\sum_{i} \mathbf{L}_{i}$ and $\mathbf{S}=\sum_{i} \mathbf{S}_{i}$. We then combined $\mathbf{L}$ and $\mathbf{S}$ to get $\mathbf{J}$. This scheme is called Russell-Saunders coupling (or $L-S$ coupling) and is appropriate where the spin-orbit interaction energy is small compared with the interelectronic repulsion energy. The operators $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$ commute with $\hat{H}^{0}+\hat{H}_{\text {rep }}$, but when $\hat{H}_{\text {S.O. }}$ is included in the Hamiltonian, $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$ no longer commute with $\hat{H}$. ($\hat{\mathbf{J}}$ does commute with $\hat{H}^{0}+\hat{H}_{\text {rep }}+\hat{H}_{\text {S.O. }}$.) If the spin-orbit interaction is small, then $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$ "almost" commute with $\hat{H}$, and $L-S$ coupling is valid.

As the atomic number increases, the average speed $v$ of the electrons increases. As $v / c$ increases, relativistic effects such as the spin-orbit interaction increase. For atoms with very high atomic number, the spin-orbit interaction exceeds the interelectronic repulsion energy, and we can no longer consider $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$ to commute with $\hat{H}$; the operator $\hat{\mathbf{J}}$, however, still commutes with $\hat{H}$. In this case we first add $\hat{H}_{\text {S.O. }}$ to $\hat{H}^{0}$ and then consider $\hat{H}_{\text {rep }}$. This corresponds to first combining the spin and orbital angular momenta of each electron to give a total angular momentum $\mathbf{j}_{i}$ for each electron: $\mathbf{j}_{i}=\mathbf{L}_{i}+\mathbf{S}_{i}$. We then add the $\mathbf{j}_{i}$'s to get the total electronic angular momentum: $\mathbf{J}=\sum_{i} \mathbf{j}_{i}$. This scheme is called $j-j$ coupling. For most heavy atoms, the situation is intermediate between $j-j$ and $L-S$ coupling, and computations are difficult.

Several other effects should be included in the atomic Hamiltonian. The finite size of the nucleus and the effect of nuclear motion slightly change the energy (Bethe and Salpeter, pp. 102, 166, 351). There is a small relativistic term due to the interaction between the spin magnetic moments of the electrons (spin-spin interaction). We should also take into account the relativistic change of electronic mass with velocity. This effect is significant for inner-shell electrons of heavy atoms, where average electronic speeds are not negligible in comparison with the speed of light.

If the nucleus has a nonzero spin, the nuclear spin magnetic moment interacts with the electronic spin and orbital magnetic moments, giving rise to atomic hyperfine structure. The nuclear spin angular momentum $\mathbf{I}$ adds vectorially to the total electronic angular momentum $\mathbf{J}$ to give the total angular momentum $\mathbf{F}$ of the atom: $\mathbf{F}=\mathbf{I}+\mathbf{J}$. For example, consider the ground state of the hydrogen atom. The spin of a proton is $\frac{1}{2}$, so $I=\frac{1}{2}$; also, $J=\frac{1}{2}$. Hence the quantum number $F$ can be 0 or 1, corresponding to the proton and electron spins being antiparallel or parallel. The transition $F=1 \rightarrow 0$ gives a line at 1420 MHz, the 21-cm line emitted by hydrogen atoms in outer space. In 1951, Ewen and Purcell stuck a horn-shaped antenna out the window of a Harvard physics laboratory and detected this line. The frequency of the hyperfine splitting in the ground state of hydrogen is one of the most accurately measured physical constants: $1420.405751768 \pm 0.000000002$ MHz [S. G. Karshenboim, Can. J. Phys., 78, 639 (2000); arxiv.org/abs/physics/0008051].
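
As a quick consistency check, the wavelength corresponding to the quoted frequency is indeed about 21 cm:

```python
c = 2.99792458e8           # speed of light, m/s
nu = 1420.405751768e6      # hyperfine frequency from the text, Hz
print(c / nu)              # ~0.211 m, the 21-cm line
```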


The Condon-Slater Rules

In the Hartree-Fock approximation, the wave function of an atom (or molecule) is a Slater determinant or a linear combination of a few Slater determinants [for example, Eq. (10.44)]. A configuration-interaction wave function such as (11.17) is a linear combination of many Slater determinants. To evaluate the energy and other properties of atoms and molecules using Hartree-Fock or configuration-interaction wave functions, we must be able to evaluate integrals of the form $\left\langle D^{\prime}\right| \hat{B}|D\rangle$, where $D$ and $D^{\prime}$ are Slater determinants of orthonormal spin-orbitals and $\hat{B}$ is an operator.

Each spin-orbital $u_{i}$ is a product of a spatial orbital $\theta_{i}$ and a spin function $\sigma_{i}$, where $\sigma_{i}$ is either $\alpha$ or $\beta$. We have $u_{i}=\theta_{i} \sigma_{i}$ and $\left\langle u_{i}(1) \mid u_{j}(1)\right\rangle=\delta_{i j}$, where $\left\langle u_{i}(1) \mid u_{j}(1)\right\rangle$ involves a sum over the spin coordinate of electron 1 and an integration over its spatial coordinates. If $u_{i}$ and $u_{j}$ have different spin functions, then (10.12) ensures the orthogonality of $u_{i}$ and $u_{j}$. If $u_{i}$ and $u_{j}$ have the same spin function, their orthogonality is due to the orthogonality of the spatial orbitals $\theta_{i}$ and $\theta_{j}$.

For an $n$-electron system, $D$ is

\(
D=\frac{1}{\sqrt{n!}}\left|\begin{array}{cccc}
u_{1}(1) & u_{2}(1) & \ldots & u_{n}(1) \\
u_{1}(2) & u_{2}(2) & \ldots & u_{n}(2) \\
\vdots & \vdots & \ddots & \vdots \\
u_{1}(n) & u_{2}(n) & \ldots & u_{n}(n)
\end{array}\right| \tag{11.76}
\)

An example with $n=3$ is Eq. (10.40). $D^{\prime}$ has the same form as $D$ except that $u_{1}, u_{2}, \ldots, u_{n}$ are replaced by $u_{1}^{\prime}, u_{2}^{\prime}, \ldots, u_{n}^{\prime}$.

We shall assume that the columns of $D$ and $D^{\prime}$ are arranged so as to have as many as possible of their left-hand columns match. For example, if we were working with the Slater determinants $\left|1 s \overline{1 s} 2 s 3 p_{0}\right|$ and $\left|1 s \overline{1 s} 3 p_{0} 4 s\right|$, we would interchange the third and fourth columns of the first determinant (thereby multiplying it by $-1$) and let $D=\left|1 s \overline{1 s} 3 p_{0} 2 s\right|$ and $D^{\prime}=\left|1 s \overline{1 s} 3 p_{0} 4 s\right|$.

The operator $\hat{B}$ typically has the form

\(
\begin{equation}
\hat{B}=\sum_{i=1}^{n} \hat{f}_{i}+\sum_{i=1}^{n-1} \sum_{j>i} \hat{g}_{i j} \tag{11.77}
\end{equation}
\)

where the one-electron operator $\hat{f}_{i}$ involves only coordinate and momentum operators of electron $i$ and the two-electron operator $\hat{g}_{i j}$ involves electrons $i$ and $j$. For example, if $\hat{B}$ is the atomic Hamiltonian operator (11.1), then $\hat{f}_{i}=-\left(\hbar^{2} / 2 m_{e}\right) \nabla_{i}^{2}-Z e^{2} / 4 \pi \varepsilon_{0} r_{i}$ and $\hat{g}_{i j}=e^{2} / 4 \pi \varepsilon_{0} r_{i j}$.

Condon and Slater showed that the $n$-electron integral $\left\langle D^{\prime}\right| \hat{B}|D\rangle$ can be reduced to sums of certain one- and two-electron integrals. The derivation of these Condon-Slater formulas uses the determinant expression of Prob. 8.22 together with the orthonormality of the spin-orbitals. (See Parr, pp. 23-27 for the derivation.) Table 11.3 gives the Condon-Slater formulas.

In Table 11.3, each matrix element of $\hat{g}_{12}$ involves summation over the spin coordinates of electrons 1 and 2 and integration over the full range of the spatial coordinates of electrons 1 and 2. Each matrix element of $\hat{f}_{1}$ involves summation over the spin coordinate of electron 1 and integration over its spatial coordinates. The variables in the sums and definite integrals are dummy variables.

TABLE 11.3 The Condon-Slater Rules

Case 1. $D$ and $D^{\prime}$ differ by no spin-orbitals:

\(
\left\langle D^{\prime}\right| \sum_{i=1}^{n} \hat{f}_{i}|D\rangle=\sum_{i=1}^{n}\left\langle u_{i}(1)\right| \hat{f}_{1}\left|u_{i}(1)\right\rangle
\)

\(
\left\langle D^{\prime}\right| \sum_{i=1}^{n-1} \sum_{j>i} \hat{g}_{i j}|D\rangle=\sum_{i=1}^{n-1} \sum_{j>i}\left[\left\langle u_{i}(1) u_{j}(2)\right| \hat{g}_{12}\left|u_{i}(1) u_{j}(2)\right\rangle-\left\langle u_{i}(1) u_{j}(2)\right| \hat{g}_{12}\left|u_{j}(1) u_{i}(2)\right\rangle\right]
\)

Case 2. $D$ and $D^{\prime}$ differ by one spin-orbital, $u_{n}^{\prime} \neq u_{n}$:

\(
\left\langle D^{\prime}\right| \sum_{i=1}^{n} \hat{f}_{i}|D\rangle=\left\langle u_{n}^{\prime}(1)\right| \hat{f}_{1}\left|u_{n}(1)\right\rangle
\)

\(
\left\langle D^{\prime}\right| \sum_{i=1}^{n-1} \sum_{j>i} \hat{g}_{i j}|D\rangle=\sum_{j=1}^{n-1}\left[\left\langle u_{n}^{\prime}(1) u_{j}(2)\right| \hat{g}_{12}\left|u_{n}(1) u_{j}(2)\right\rangle-\left\langle u_{n}^{\prime}(1) u_{j}(2)\right| \hat{g}_{12}\left|u_{j}(1) u_{n}(2)\right\rangle\right]
\)

Case 3. $D$ and $D^{\prime}$ differ by two spin-orbitals, $u_{n}^{\prime} \neq u_{n}$ and $u_{n-1}^{\prime} \neq u_{n-1}$:

\(
\left\langle D^{\prime}\right| \sum_{i=1}^{n} \hat{f}_{i}|D\rangle=0
\)

\(
\left\langle D^{\prime}\right| \sum_{i=1}^{n-1} \sum_{j>i} \hat{g}_{i j}|D\rangle=\left\langle u_{n}^{\prime}(1) u_{n-1}^{\prime}(2)\right| \hat{g}_{12}\left|u_{n}(1) u_{n-1}(2)\right\rangle-\left\langle u_{n}^{\prime}(1) u_{n-1}^{\prime}(2)\right| \hat{g}_{12}\left|u_{n-1}(1) u_{n}(2)\right\rangle
\)

Case 4. $D$ and $D^{\prime}$ differ by three or more spin-orbitals: both matrix elements vanish.

If the operators $\hat{f}_{i}$ and $\hat{g}_{i j}$ do not involve spin, the expressions in Table 11.3 can be further simplified. We have $u_{i}=\theta_{i} \sigma_{i}$ and

\(
\begin{aligned}
\left\langle u_{i}(1)\right| \hat{f}_{1}\left|u_{i}(1)\right\rangle & =\int \theta_{i}^{*}(1) \hat{f}_{1} \theta_{i}(1) d v_{1} \sum_{m_{s 1}} \sigma_{i}^{*}(1) \sigma_{i}(1) \\
& =\int \theta_{i}^{*}(1) \hat{f}_{1} \theta_{i}(1) d v_{1}=\left\langle\theta_{i}(1)\right| \hat{f}_{1}\left|\theta_{i}(1)\right\rangle
\end{aligned}
\)

since $\sigma_{i}$ is normalized. Using this result and the orthonormality of $\sigma_{i}$ and $\sigma_{j}$, we get for the case $D=D^{\prime}$ (Prob. 11.37)

\(
\begin{equation}
\langle D| \sum_{i=1}^{n} \hat{f}_{i}|D\rangle=\sum_{i=1}^{n}\left\langle\theta_{i}(1)\right| \hat{f}_{1}\left|\theta_{i}(1)\right\rangle \tag{11.78}
\end{equation}
\)

\(
\begin{equation}
\langle D| \sum_{i=1}^{n-1} \sum_{j>i} \hat{g}_{i j}|D\rangle=\sum_{i=1}^{n-1} \sum_{j>i}\left[\left\langle\theta_{i}(1) \theta_{j}(2)\right| \hat{g}_{12}\left|\theta_{i}(1) \theta_{j}(2)\right\rangle-\delta_{m_{s, i} m_{s, j}}\left\langle\theta_{i}(1) \theta_{j}(2)\right| \hat{g}_{12}\left|\theta_{j}(1) \theta_{i}(2)\right\rangle\right] \tag{11.79}
\end{equation}
\)

where $\delta_{m_{s, i} m_{s, j}}$ is 0 or 1 according to whether $m_{s, i} \neq m_{s, j}$ or $m_{s, i}=m_{s, j}$. Similar equations hold for the other integrals.

Let us apply these equations to evaluate $\langle D| \hat{H}|D\rangle$, where $\hat{H}$ is the Hamiltonian of an $n$-electron atom with spin-orbit interaction neglected and $D$ is a Slater determinant of $n$ spin-orbitals. We have $\hat{H}=\sum_{i} \hat{f}_{i}+\sum_{i} \sum_{j>i} \hat{g}_{i j}$, where $\hat{f}_{i}=-\left(\hbar^{2} / 2 m_{e}\right) \nabla_{i}^{2}-Z e^{2} / 4 \pi \varepsilon_{0} r_{i}$ and $\hat{g}_{i j}=e^{2} / 4 \pi \varepsilon_{0} r_{i j}$. Introducing the Coulomb and exchange integrals (9.99) and (9.100) and using (11.78) and (11.79), we have

\(
\begin{gather}
\langle D| \hat{H}|D\rangle=\sum_{i=1}^{n}\left\langle\theta_{i}(1)\right| \hat{f}_{1}\left|\theta_{i}(1)\right\rangle+\sum_{i=1}^{n-1} \sum_{j>i}\left(J_{i j}-\delta_{m_{s, i} m_{s, j}} K_{i j}\right) \tag{11.80}\\
J_{i j}=\left\langle\theta_{i}(1) \theta_{j}(2)\right| e^{2} / 4 \pi \varepsilon_{0} r_{12}\left|\theta_{i}(1) \theta_{j}(2)\right\rangle, \quad K_{i j}=\left\langle\theta_{i}(1) \theta_{j}(2)\right| e^{2} / 4 \pi \varepsilon_{0} r_{12}\left|\theta_{j}(1) \theta_{i}(2)\right\rangle \tag{11.81}\\
\hat{f}_{1}=-\left(\hbar^{2} / 2 m_{e}\right) \nabla_{1}^{2}-Z e^{2} / 4 \pi \varepsilon_{0} r_{1} \tag{11.82}
\end{gather}
\)

The Kronecker delta in (11.80) results from the orthonormality of the one-electron spin functions.

As an example, consider Li. The SCF approximation to the ground-state $\psi$ is the Slater determinant $D=|1 s \overline{1 s} 2 s|$. The spin-orbitals are $u_{1}=1 s \alpha$, $u_{2}=1 s \beta$, and $u_{3}=2 s \alpha$. The spatial orbitals are $\theta_{1}=1 s$, $\theta_{2}=1 s$, and $\theta_{3}=2 s$. We have $J_{12}=J_{1 s 1 s}$ and $J_{13}=J_{23}=J_{1 s 2 s}$. Since $m_{s 1} \neq m_{s 2}$ and $m_{s 2} \neq m_{s 3}$, the only exchange integral that appears in the energy expression is $K_{13}=K_{1 s 2 s}$. We get exchange integrals only between spin-orbitals with the same spin. Equation (11.80) gives the SCF energy as

\(
E=\langle D| \hat{H}|D\rangle=2\langle 1 s(1)| \hat{f}_{1}|1 s(1)\rangle+\langle 2 s(1)| \hat{f}_{1}|2 s(1)\rangle+J_{1 s 1 s}+2 J_{1 s 2 s}-K_{1 s 2 s}
\)

The terms involving $\hat{f}_{1}$ are hydrogenlike energies, and their sum equals $E^{(0)}$ in Eq. (10.49). The remaining terms equal $E^{(1)}$ in Eq. (10.51). As noted at the beginning of Section 9.4, $E^{(0)}+E^{(1)}$ equals the variational integral $\langle D| \hat{H}|D\rangle$, so the Condon-Slater rules have been checked in this case.

For an atom with closed subshells only (for example, ground-state Be with a $1 s^{2} 2 s^{2}$ configuration), the $n$ electrons reside in $n / 2$ different spatial orbitals, so $\theta_{1}=\theta_{2}$, $\theta_{3}=\theta_{4}$, and so on. Let $\phi_{1} \equiv \theta_{1}=\theta_{2}, \phi_{2} \equiv \theta_{3}=\theta_{4}, \ldots, \phi_{n / 2} \equiv \theta_{n-1}=\theta_{n}$. If one rewrites (11.80) using the $\phi$'s instead of the $\theta$'s, one finds (Prob. 11.38) for the SCF energy of the ${ }^{1} S$ term produced by a closed-subshell configuration

\(
\begin{equation}
E=\langle D| \hat{H}|D\rangle=2 \sum_{i=1}^{n / 2}\left\langle\phi_{i}(1)\right| \hat{f}_{1}\left|\phi_{i}(1)\right\rangle+\sum_{j=1}^{n / 2} \sum_{i=1}^{n / 2}\left(2 J_{i j}-K_{i j}\right) \tag{11.83}
\end{equation}
\)

where $\hat{f}_{1}$ is given by (11.82) and where $J_{i j}$ and $K_{i j}$ have the forms in (11.81) but with $\theta_{i}$ and $\theta_{j}$ replaced by $\phi_{i}$ and $\phi_{j}$. Each sum in (11.83) goes over all the $n / 2$ different spatial orbitals.

For example, consider the $1 s^{2} 2 s^{2}$ electron configuration. We have $n=4$ and the two different spatial orbitals are $1 s$ and $2 s$. The double sum in Eq. (11.83) is equal to $2 J_{1 s 1 s}-K_{1 s 1 s}+2 J_{1 s 2 s}-K_{1 s 2 s}+2 J_{2 s 1 s}-K_{2 s 1 s}+2 J_{2 s 2 s}-K_{2 s 2 s}$. From the definition (11.81), it follows that $J_{i i}=K_{i i}$. The labels 1 and 2 in (11.81) are dummy variables, and interchanging them can have no effect on the integrals. Interchanging 1 and 2 in $J_{i j}$ converts it to $J_{j i}$; therefore, $J_{i j}=J_{j i}$. The same reasoning gives $K_{i j}=K_{j i}$. Thus

\(
\begin{equation}
J_{i i}=K_{i i}, \quad J_{i j}=J_{j i}, \quad K_{i j}=K_{j i} \tag{11.84}
\end{equation}
\)

Use of (11.84) gives the Coulomb- and exchange-integrals expression for the $1 s^{2} 2 s^{2}$ configuration as $J_{1 s 1 s}+J_{2 s 2 s}+4 J_{1 s 2 s}-2 K_{1 s 2 s}$. Between the two electrons in the $1 s$ orbital, there is only one Coulombic interaction, and we get the term $J_{1 s 1 s}$. Each $1 s$ electron interacts with two $2 s$ electrons, for a total of four $1 s-2 s$ interactions, and we get the term $4 J_{1 s 2 s}$. As noted earlier, exchange integrals occur only between spin-orbitals of the same spin. There is an exchange integral between the $1 s \alpha$ and $2 s \alpha$ spin-orbitals and an exchange integral between the $1 s \beta$ and $2 s \beta$ spin-orbitals, which gives the $-2 K_{1 s 2 s}$ term.
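
This counting generalizes to any closed-subshell configuration: expand the double sum in (11.83) and collect coefficients using (11.84). A minimal sketch (orbital names are plain strings, and the helper name is a hypothetical choice):

```python
# Collect the coefficients of the J and K integrals in the double sum
# sum_i sum_j (2 J_ij - K_ij) of Eq. (11.83), using J_ii = K_ii and
# the symmetry J_ij = J_ji, K_ij = K_ji of Eq. (11.84).
from collections import Counter

def closed_shell_JK(orbitals):
    coeff = Counter()
    for a in orbitals:
        for b in orbitals:
            pair = tuple(sorted((a, b)))
            if a == b:
                coeff[("J", pair)] += 1      # 2 J_ii - K_ii = J_ii
            else:
                coeff[("J", pair)] += 2
                coeff[("K", pair)] -= 1
    return coeff

# 1s^2 2s^2: recovers J_1s1s + J_2s2s + 4 J_1s2s - 2 K_1s2s.
print(closed_shell_JK(["1s", "2s"]))
```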

The magnitude of the exchange integrals is generally much less than the magnitude of the Coulomb integrals [for example, see Eq. (9.111)].


Molecular Symmetry

Click the keywords to know more about it.

Symmetry Elements: Geometrical entities (point, line, or plane) with respect to which a symmetry operation is carried out. Examples include axes of rotation, planes of reflection, and centers of inversion

Symmetry Operations: Transformations of a body such that the final position is physically indistinguishable from the initial position, and the distances between all pairs of points in the body are preserved. Examples include rotations, reflections, and inversions

n-Fold Axis of Symmetry (C_n): An axis about which a molecule can be rotated by 360/n degrees to give a configuration that is physically indistinguishable from the original position. The order of the axis is denoted by n

Plane of Symmetry (σ): A plane that divides a molecule into two mirror-image halves. Reflection through this plane gives a configuration that is physically indistinguishable from the original one

Center of Symmetry (i): A point in a molecule such that inversion through this point gives a configuration that is physically indistinguishable from the original one

Rotation-Reflection Axis of Symmetry (S_n): An axis about which a molecule can be rotated by 360/n degrees followed by reflection in a plane perpendicular to the axis to give a configuration that is physically indistinguishable from the original one

Dipole Moment: A measure of the separation of positive and negative charges in a molecule. Symmetry considerations can determine whether a molecule has a dipole moment and along which axis it lies

Optical Activity: The ability of certain molecules to rotate the plane of polarization of plane-polarized light. Molecules that are not superimposable on their mirror images may be optically active

Symmetry Point Groups: Classifications of molecules based on their symmetry elements. Examples include groups with no C_n axis, groups with a single C_n axis, and groups with multiple C_n axes

Group Theory: A mathematical framework used to describe the symmetry operations of molecules. It involves the study of groups, which are sets of elements with a rule for combining them that satisfies certain requirements

Qualitative information about molecular wave functions and properties can often be obtained from the symmetry of the molecule. By the symmetry of a molecule, we mean the symmetry of the framework formed by the nuclei held fixed in their equilibrium positions. (Our starting point for molecular quantum mechanics will be the Born-Oppenheimer approximation, which regards the nuclei as fixed when solving for the electronic wave function; see Section 13.1.) The symmetry of a molecule can differ in different electronic states. For example, HCN is linear in its ground electronic state, but nonlinear in certain excited states. Unless otherwise specified, we shall be considering the symmetry of the ground electronic state.

Symmetry Elements and Operations

A symmetry operation is a transformation of a body such that the final position is physically indistinguishable from the initial position, and the distances between all pairs of points in the body are preserved. For example, consider the trigonal-planar molecule $\mathrm{BF}_{3}$ (Fig. 12.1a), where for convenience we have numbered the fluorine nuclei. If we rotate the molecule counterclockwise by $120^{\circ}$ about an axis through the boron nucleus and perpendicular to the plane of the molecule, the new position will be as in Fig. 12.1b. Since in reality the fluorine nuclei are physically indistinguishable from one another, we have carried out a symmetry operation. The axis about which we rotated the molecule is an example of a symmetry element. Symmetry elements and symmetry operations are related but different things, which are often confused. A symmetry element is a geometrical entity (point, line, or plane) with respect to which a symmetry operation is carried out.

We say that a body has an $\boldsymbol{n}$-fold axis of symmetry (also called an $n$-fold proper axis or an $n$-fold rotation axis) if rotation about this axis by $360 / n$ degrees (where $n$ is an integer) gives a configuration physically indistinguishable from the original position; $n$ is called the order of the axis. For example, $\mathrm{BF}_{3}$ has a threefold axis of symmetry perpendicular to the molecular plane. The symbol for an $n$-fold rotation axis is $C_{n}$. The threefold axis in $\mathrm{BF}_{3}$ is a $C_{3}$ axis. To denote the operation of counterclockwise rotation by $(360 / n)^{\circ}$, we use the symbol $\hat{C}_{n}$. The "hat" distinguishes symmetry operations from symmetry elements. $\mathrm{BF}_{3}$ has three more rotation axes; each $\mathrm{B}-\mathrm{F}$ bond is a twofold symmetry axis (Fig. 12.2).

FIGURE 12.1 (a) The $\mathrm{BF}_{3}$ molecule. (b) $\mathrm{BF}_{3}$ after a $120^{\circ}$ rotation about the axis through B and perpendicular to the molecular plane.

A second kind of symmetry element is a plane of symmetry. A molecule has a plane of symmetry if reflection of all the nuclei through that plane gives a configuration physically indistinguishable from the original one. The symbol for a symmetry plane is $\sigma$ (lowercase sigma). (Spiegel is the German word for mirror.) The symbol for the operation of reflection is $\hat{\sigma}$. $\mathrm{BF}_{3}$ has four symmetry planes. The plane of the molecule is a symmetry plane, since nuclei lying in a reflection plane do not move when a reflection is carried out. The plane passing through the B and $\mathrm{F}_{1}$ nuclei and perpendicular to the plane of the molecule is a symmetry plane, since reflection in this plane merely interchanges $\mathrm{F}_{2}$ and $\mathrm{F}_{3}$. It might be thought that this reflection is the same symmetry operation as rotation by $180^{\circ}$ about the $C_{2}$ axis passing through B and $\mathrm{F}_{1}$, which also interchanges $\mathrm{F}_{2}$ and $\mathrm{F}_{3}$. This is not so. The reflection carries points lying above the plane of the molecule into points that also lie above the molecular plane, whereas the $\hat{C}_{2}$ rotation carries points lying above the molecular plane into points below the molecular plane. Two symmetry operations are equal only when they represent the same transformation of three-dimensional space. The remaining two symmetry planes in $\mathrm{BF}_{3}$ pass through $\mathrm{B}-\mathrm{F}_{2}$ and $\mathrm{B}-\mathrm{F}_{3}$ and are perpendicular to the molecular plane.

The third kind of symmetry element is a center of symmetry, symbolized by $i$ (no connection with $\sqrt{-1}$ ). A molecule has a center of symmetry if the operation of inverting all the nuclei through the center gives a configuration indistinguishable from the original one. If we set up a Cartesian coordinate system, the operation of inversion through the origin (symbolized by $\hat{i}$ ) carries a nucleus originally at $(x, y, z)$ to $(-x,-y,-z)$. Does $\mathrm{BF}_{3}$ have a center of symmetry? With the origin at the boron nucleus, inversion gives the result shown in Fig. 12.3. Since we get a configuration that is physically distinguishable from

FIGURE 12.4 Effect of inversion in $\mathrm{SF}_{6}$.

the original one, $\mathrm{BF}_{3}$ does not have a center of symmetry. For $\mathrm{SF}_{6}$, inversion through the sulfur nucleus is shown in Fig. 12.4, and it is clear that $\mathrm{SF}_{6}$ has a center of symmetry. (An operation such as $\hat{i}$ or $\hat{C}_{n}$ may or may not be a symmetry operation. Thus, $\hat{i}$ is a symmetry operation in $\mathrm{SF}_{6}$ but not in $\mathrm{BF}_{3}$.)

The fourth and final kind of symmetry element is an $\boldsymbol{n}$-fold rotation-reflection axis of symmetry (also called an improper axis or an alternating axis), symbolized by $S_{n}$. A body has an $S_{n}$ axis if rotation by $(360 / n)^{\circ}$ ($n$ integral) about the axis, followed by reflection in a plane perpendicular to the axis, carries the body into a position physically indistinguishable from the original one. Clearly, if a body has a $C_{n}$ axis and also has a plane of symmetry perpendicular to this axis, then the $C_{n}$ axis is also an $S_{n}$ axis. Thus the $C_{3}$ axis in $\mathrm{BF}_{3}$ is also an $S_{3}$ axis. It is possible to have an $S_{n}$ axis that is not a $C_{n}$ axis. An example is $\mathrm{CH}_{4}$. In Fig. 12.5 we have first carried out a $90^{\circ}$ proper rotation $\left(\hat{C}_{4}\right)$ about what we assert is an $S_{4}$ axis. As can be seen, this operation does not result in an equivalent configuration. When we follow the $\hat{C}_{4}$ operation by reflection in the plane perpendicular to the axis and passing through the carbon atom, we do get a configuration indistinguishable from the one existing before we performed the rotation and reflection. Hence $\mathrm{CH}_{4}$ has an $S_{4}$ axis. The $S_{4}$ axis is not a $C_{4}$ axis, although it is a $C_{2}$ axis. There are two other $S_{4}$ axes in methane, each perpendicular to a pair of faces of the cube in which the tetrahedral molecule is inscribed.

The operation of counterclockwise rotation by $(360 / n)^{\circ}$ about an axis, followed by reflection in a plane perpendicular to the axis, is denoted by $\hat{S}_{n}$. An $\hat{S}_{1}$ operation is a $360^{\circ}$ rotation about an axis, followed by a reflection in a plane perpendicular to the axis. Since a $360^{\circ}$ rotation restores the body to its original position, an $\hat{S}_{1}$ operation is the same as reflection in a plane; $\hat{S}_{1}=\hat{\sigma}$. A plane of symmetry has an $S_{1}$ axis perpendicular to it.

Consider now the $\hat{S}_{2}$ operation. Let the $S_{2}$ axis be the $z$ axis (Fig. 12.6). Rotation by $180^{\circ}$ about the $S_{2}$ axis changes the $x$ and $y$ coordinates of a point to $-x$ and $-y$, respectively, and leaves the $z$ coordinate unaffected. Reflection in the $x y$ plane then converts the $z$ coordinate to $-z$. The net effect of the $\hat{S}_{2}$ operation is to bring a point originally at $(x, y, z)$ to $(-x,-y,-z)$, which amounts to an inversion through the origin: $\hat{S}_{2}=\hat{i}$. Any axis passing through a center of symmetry is an $S_{2}$ axis. Reflection in a plane and inversion are special cases of the $\hat{S}_{n}$ operation.
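
The identity $\hat{S}_{2}=\hat{i}$ can be confirmed with the $3 \times 3$ matrices that represent these operations (matrix representatives are introduced in the Matrices and Symmetry Operations subsection below). A minimal sketch:

```python
# Check that a 180-deg rotation about z followed by reflection in the
# xy plane carries (x, y, z) to (-x, -y, -z), i.e., S2 = i.
import numpy as np

C2z = np.array([[-1, 0, 0],
                [0, -1, 0],
                [0,  0, 1]])       # rotation by 180 deg about z
sigma_xy = np.array([[1, 0,  0],
                     [0, 1,  0],
                     [0, 0, -1]])  # reflection in the xy plane

S2 = sigma_xy @ C2z                # rotate first, then reflect
print(np.array_equal(S2, -np.identity(3)))   # True: S2 is the inversion
```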

FIGURE 12.5 An $S_{4}$ axis in $\mathrm{CH}_{4}$.

FIGURE 12.6 The $\hat{S}_{2}$ operation.

The $\hat{S}_{n}$ operation may seem an arbitrary kind of operation, but it must be included as one of the kinds of symmetry operations. For example, the transformation from the first to the third $\mathrm{CH}_{4}$ configuration in Fig. 12.5 certainly meets the definition of a symmetry operation, but it is neither a proper rotation nor a reflection nor an inversion.

Performing a symmetry operation on a molecule gives a nuclear configuration that is physically indistinguishable from the original one. Hence the center of mass must have the same position in space before and after a symmetry operation. For the operation $\hat{C}_{n}$, the only points that do not move are those on the $C_{n}$ axis. Therefore, a $C_{n}$ symmetry axis must pass through the molecular center of mass. Similarly, a center of symmetry must coincide with the center of mass; a plane of symmetry and an $S_{n}$ axis of symmetry must pass through the center of mass. The center of mass is the common intersection of all the molecular symmetry elements.

In discussing the symmetry of a molecule, we often place it in a Cartesian coordinate system with the molecular center of mass at the origin. The rotational axis of highest order is made the $z$ axis. A plane of symmetry containing this axis is designated $\sigma_{v}$ (for vertical); a plane of symmetry perpendicular to this axis is designated $\sigma_{h}$ (for horizontal).

Products of Symmetry Operations

Symmetry operations are operators that transform three-dimensional space, and (as with any operators) we define the product of two such operators as meaning successive application of the operators, the operator on the right of the product being applied first. Clearly, the product of any two symmetry operations of a molecule must be a symmetry operation.

As an example, consider $\mathrm{BF}_{3}$. The product of the $\hat{C}_{3}$ operator with itself, $\hat{C}_{3} \hat{C}_{3}=\hat{C}_{3}^{2}$, rotates the molecule $240^{\circ}$ counterclockwise. If we take $\hat{C}_{3} \hat{C}_{3} \hat{C}_{3}=\hat{C}_{3}^{3}$, we have a $360^{\circ}$ rotation, which restores the molecule to its original position. We define the identity operation $\hat{E}$ as the operation that does nothing to a body. We have $\hat{C}_{3}^{3}=\hat{E}$. (The symbol comes from the German word Einheit, meaning unity.)

Now consider a molecule with a sixfold axis of symmetry, for example, $\mathrm{C}_{6} \mathrm{H}_{6}$. The operation $\hat{C}_{6}$ is a $60^{\circ}$ rotation, and $\hat{C}_{6}^{2}$ is a $120^{\circ}$ rotation; hence $\hat{C}_{6}^{2}=\hat{C}_{3}$. Also, $\hat{C}_{6}^{3}=\hat{C}_{2}$. Therefore, a $C_{6}$ symmetry axis is also a $C_{3}$ and a $C_{2}$ axis.

Since two successive reflections in the same plane bring all nuclei back to their original positions, we have $\hat{\sigma}^{2}=\hat{E}$. Also, $\hat{i}^{2}=\hat{E}$. More generally, $\hat{\sigma}^{n}=\hat{E}, \hat{i}^{n}=\hat{E}$ for even $n$, while $\hat{\sigma}^{n}=\hat{\sigma}, \hat{i}^{n}=\hat{i}$ for odd $n$.

Do symmetry operators always commute? Consider $\mathrm{SF}_{6}$. We examine the products of a $\hat{C}_{4}$ rotation about the $z$ axis and a $\hat{C}_{2}$ rotation about the $x$ axis. Figure 12.7 shows that $\hat{C}_{4}(z) \hat{C}_{2}(x) \neq \hat{C}_{2}(x) \hat{C}_{4}(z)$. Thus symmetry operations do not always commute. Note that we describe symmetry operations with respect to a fixed coordinate system that does not move with the molecule when we perform a symmetry operation. Thus the $C_{2}(x)$ axis does not move when we perform the $\hat{C}_{4}(z)$ operation.
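
The same conclusion can be checked with the matrix representatives of the two rotations (see the Matrices and Symmetry Operations subsection below): multiplying the matrices in the two orders gives different results.

```python
# C4(z) and C2(x) do not commute, as Fig. 12.7 shows geometrically.
import numpy as np

C4z = np.array([[0, -1, 0],
                [1,  0, 0],
                [0,  0, 1]])    # 90-deg counterclockwise rotation about z
C2x = np.array([[1,  0,  0],
                [0, -1,  0],
                [0,  0, -1]])   # 180-deg rotation about x

print(np.array_equal(C4z @ C2x, C2x @ C4z))   # False: the products differ
```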

FIGURE 12.7 Products of two symmetry operations in $\mathrm{SF}_{6}$.
Top: $\hat{C}_{2}(x) \hat{C}_{4}(z)$.
Bottom: $\hat{C}_{4}(z) \hat{C}_{2}(x)$.

Symmetry and Dipole Moments

As an application of symmetry, we consider molecular dipole moments. Since a symmetry operation produces a configuration that is physically indistinguishable from the original one, the direction of the dipole-moment vector must remain unchanged after a symmetry operation. (This is a nonrigorous, unsophisticated argument.) Hence, if we have a $C_{n}$ axis of symmetry, the dipole moment must lie along this axis. If we have two or more noncoincident symmetry axes, the molecule cannot have a dipole moment, since the dipole moment cannot lie on two different axes. $\mathrm{CH}_{4}$, which has four noncoincident $C_{3}$ axes, has no dipole moment. If there is a plane of symmetry, the dipole moment must lie in this plane. If there are several symmetry planes, the dipole moment must lie along the line of intersection of these planes. In $\mathrm{H}_{2} \mathrm{O}$ the dipole moment lies on the $C_{2}$ axis, which is also the intersection of the two symmetry planes (Fig. 12.8). A molecule with a center of symmetry cannot have a dipole moment, since inversion reverses the direction of a vector. A monatomic molecule has a center of symmetry. Hence atoms do not have dipole moments. (There is one exception to this statement; see Prob. 14.4.) Thus we can use symmetry to discover whether a molecule has a dipole moment. In many cases symmetry also tells us along what line the dipole moment lies.

FIGURE 12.8 The symmetry elements of $\mathrm{H}_{2} \mathrm{O}$.

Symmetry and Optical Activity

Certain molecules rotate the plane of polarization of plane-polarized light that is passed through them. Experimental evidence and a quantum-mechanical treatment (Kauzmann, pp. 703-713) show that the optical rotatory powers of two molecules that are mirror images of each other are equal in magnitude but opposite in sign. Hence, if a molecule is its own mirror image, it is optically inactive: $\alpha=-\alpha$, $2 \alpha=0$, $\alpha=0$, where $\alpha$ is the optical rotatory power. If a molecule is not superimposable on its mirror image, it may be optically active. If the conformation of the mirror image differs from that of the original molecule only by rotation about a bond with a low rotational barrier, then the molecule will not be optically active.

What is the connection between symmetry and optical activity? Consider the $\hat{S}_{n}$ operation. It consists of a rotation $\left(\hat{C}_{n}\right)$ and a reflection $(\hat{\sigma})$. The reflection part of the $\hat{S}_{n}$ operation converts the molecule to its mirror image, and if the $\hat{S}_{n}$ operation is a symmetry operation for the molecule, then the $\hat{C}_{n}$ rotation will superimpose the molecule and its mirror image:

\(
\text { molecule } \xrightarrow{\hat{C}_{n}} \text { rotated molecule } \xrightarrow{\hat{\sigma}} \text { rotated mirror image }
\)

We conclude that a molecule with an $S_{n}$ axis is optically inactive. If the molecule has no $S_{n}$ axis, it may be optically active.

Since $\hat{S}_{1}=\hat{\sigma}$ and $\hat{S}_{2}=\hat{i}$, a molecule with either a plane or a center of symmetry is optically inactive. However, an $S_{n}$ axis of any order rules out optical activity.

A molecule can have a symmetry element and still be optically active. If a $C_{n}$ axis is present and there is no $S_{n}$ axis, the molecule can be optically active.

Symmetry Operations and Quantum Mechanics

What is the relation between the symmetry operations of a molecule and quantum mechanics? To classify the states of a quantum-mechanical system, we consider those operators that commute with the Hamiltonian operator and with each other. For example, we classified the states of many-electron atoms using the quantum numbers $L, S, J$, and $M_{J}$, which correspond to the operators $\hat{L}^{2}, \hat{S}^{2}, \hat{J}^{2}$, and $\hat{J}_{z}$, all of which commute with one another and with the Hamiltonian (omitting spin-orbit interaction). The symmetry operations discussed in this chapter act on points in three-dimensional space, transforming each point to a corresponding point. All the quantum-mechanical operators we have discussed act on functions, transforming each function to a corresponding function. Corresponding to each symmetry operation $\hat{R}$, we define an operator $\hat{O}_{R}$ that acts on functions in the following manner. Let $\hat{R}$ bring a point originally at $(x, y, z)$ to the location $\left(x^{\prime}, y^{\prime}, z^{\prime}\right)$:

\(
\begin{equation}
\hat{R}(x, y, z) \rightarrow\left(x^{\prime}, y^{\prime}, z^{\prime}\right) \tag{12.1}
\end{equation}
\)

The operator $\hat{O}_{R}$ is defined so that the function $\hat{O}_{R} f$ has the same value at $\left(x^{\prime}, y^{\prime}, z^{\prime}\right)$ that the function $f$ has at $(x, y, z)$:

\(
\begin{equation}
\hat{O}_{R} f\left(x^{\prime}, y^{\prime}, z^{\prime}\right)=f(x, y, z) \tag{12.2}
\end{equation}
\)

For example, let $\hat{R}$ be a counterclockwise $90^{\circ}$ rotation about the $z$ axis: $\hat{R}=\hat{C}_{4}(z)$; and let $f$ be a $2 p_{x}$ hydrogen orbital: $f=2 p_{x}=N x e^{-k\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}}$. The shape of the $2 p_{x}$ orbital is two distorted ellipsoids of revolution about the $x$ axis (Section 6.7). Let these ellipsoids be "centered" about the points $(a, 0,0)$ and $(-a, 0,0)$, where $a>0$ and $2 p_{x}>0$ on the right ellipsoid. The operator $\hat{C}_{4}(z)$ has the following effect (Fig. 12.9):

\(
\begin{equation}
\hat{C}_{4}(z)(x, y, z) \rightarrow(-y, x, z) \tag{12.3}
\end{equation}
\)

For example, the point originally at $(a, 0,0)$ is moved to $(0, a, 0)$, while the point at $(-a, 0,0)$ is moved to $(0,-a, 0)$. From (12.2), the function $\hat{O}_{C_{4}(z)} 2 p_{x}$ must have

FIGURE 12.9 The effect of a $\hat{C}_{4}(z)$ rotation is to move the point at $(x, y)$ to $\left(x^{\prime}, y^{\prime}\right)$. Use of trigonometry shows that $x^{\prime}=-y$ and $y^{\prime}=x$.

FIGURE 12.10 Effect of $\hat{O}_{C_{4}(z)}$ on a $p_{x}$ orbital.

its contours centered about $(0, a, 0)$ and $(0,-a, 0)$, respectively. We conclude that (Fig. 12.10)

\(
\begin{equation}
\hat{O}_{C_{4}(z)} 2 p_{x}=2 p_{y} \tag{12.4}
\end{equation}
\)
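
The definition (12.2) can be checked numerically: since $\hat{O}_{R} f$ takes at $R(x, y, z)$ the value that $f$ takes at $(x, y, z)$, we have $(\hat{O}_{R} f)(\mathbf{r})=f\left(R^{-1} \mathbf{r}\right)$. The sketch below verifies (12.4) at an arbitrary point; the constants $N$ and $k$ are arbitrary here.

```python
# Numerical check of Eq. (12.4): O_{C4(z)} applied to 2p_x equals 2p_y.
import numpy as np

N, k = 1.0, 1.0                       # orbital constants; values are irrelevant

def p_x(r):
    return N * r[0] * np.exp(-k * np.linalg.norm(r))

def p_y(r):
    return N * r[1] * np.exp(-k * np.linalg.norm(r))

C4z = np.array([[0, -1, 0],
                [1,  0, 0],
                [0,  0, 1]])          # the rotation (12.3)

def O_C4z(f):
    """(O_R f)(r) = f(R^-1 r), from the definition (12.2)."""
    Rinv = np.linalg.inv(C4z)
    return lambda r: f(Rinv @ r)

r = np.array([0.3, -1.2, 0.7])        # an arbitrary test point
print(np.isclose(O_C4z(p_x)(r), p_y(r)))   # True
```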

For the inversion operation, we have

\(
\begin{equation}
\hat{i}(x, y, z) \rightarrow(-x,-y,-z) \tag{12.5}
\end{equation}
\)

and (12.2) reads

\(
\hat{O}_{i} f(-x,-y,-z)=f(x, y, z)
\)

We now rename the variables as follows: $\bar{x}=-x, \bar{y}=-y, \bar{z}=-z$. Hence

\(
\hat{O}_{i} f(\bar{x}, \bar{y}, \bar{z})=f(-\bar{x},-\bar{y},-\bar{z})
\)

The point $(\bar{x}, \bar{y}, \bar{z})$ is a general point in space, and we can drop the bars to get

\(
\hat{O}_{i} f(x, y, z)=f(-x,-y,-z)
\)

We conclude that $\hat{O}_{i}$ is the parity operator (Section 7.5): $\hat{O}_{i}=\hat{\Pi}$.
The wave function of an $n$-particle system is a function of $4 n$ variables, and we extend the definition (12.2) of $\hat{O}_{R}$ to read

\(
\hat{O}_{R} f\left(x_{1}^{\prime}, y_{1}^{\prime}, z_{1}^{\prime}, m_{s 1}, \ldots, x_{n}^{\prime}, y_{n}^{\prime}, z_{n}^{\prime}, m_{s n}\right)=f\left(x_{1}, y_{1}, z_{1}, m_{s 1}, \ldots, x_{n}, y_{n}, z_{n}, m_{s n}\right)
\)

Note that $\hat{O}_{R}$ does not affect the spin coordinates. Thus, in looking at the parity of atomic states in Section 11.5, we looked at the spatial factors in each term of the expansion of the Slater determinant and omitted consideration of the spin factors, since these are unaffected by $\hat{\Pi}$.

When a system is characterized by the symmetry operations $\hat{R}_{1}, \hat{R}_{2}, \ldots$, then the corresponding operators $\hat{O}_{R_{1}}, \hat{O}_{R_{2}}, \ldots$ commute with the Hamiltonian. (For a proof, see Schonland, Sections 7.1-7.3.) For example, if the nuclear framework of a molecule has a center of symmetry, then the parity operator $\hat{\Pi}$ commutes with the Hamiltonian for the electronic motion. We can then choose the electronic states (wave functions) as even or odd, according to the eigenvalue of $\hat{\Pi}$. Of course, not all the symmetry operations may commute among themselves (Fig. 12.7). Hence the wave functions cannot in general be chosen as eigenfunctions of all the symmetry operators $\hat{O}_{R}$. (Further discussion on the relation between symmetry operators and molecular wave functions is given in Section 15.2.)

There is a close connection between symmetry and the constants of the motion (these are properties whose operators commute with the Hamiltonian $\hat{H}$ ). For a system whose Hamiltonian is invariant (that is, doesn't change) under any translation of spatial coordinates, the linear-momentum operator $\hat{p}$ will commute with $\hat{H}$ and $p$ can be assigned a definite value in a stationary state. An example is the free particle. For a system with $\hat{H}$ invariant under any rotation of coordinates, the operators for the angular-momentum components commute with $\hat{H}$, and the total angular momentum and one of its components are specifiable. An example is an atom. A linear molecule has axial symmetry, rather than the spherical symmetry of an atom; here only the axial component of angular momentum can be specified (Chapter 13).

Matrices and Symmetry Operations

The symmetry operation $\hat{R}$ moves the point originally at $x, y, z$ to the new location $x^{\prime}, y^{\prime}, z^{\prime}$, where each of $x^{\prime}, y^{\prime}, z^{\prime}$ is a linear combination of $x, y, z$ (for proof of this see Schonland, pp. 52-53):

\(
\begin{aligned}
& x^{\prime}=r_{11} x+r_{12} y+r_{13} z \\
& y^{\prime}=r_{21} x+r_{22} y+r_{23} z \\
& z^{\prime}=r_{31} x+r_{32} y+r_{33} z
\end{aligned} \quad \text { or } \quad\left(\begin{array}{l}
x^{\prime} \\
y^{\prime} \\
z^{\prime}
\end{array}\right)=\left(\begin{array}{lll}
r_{11} & r_{12} & r_{13} \\
r_{21} & r_{22} & r_{23} \\
r_{31} & r_{32} & r_{33}
\end{array}\right)\left(\begin{array}{l}
x \\
y \\
z
\end{array}\right)
\)

where $r_{11}, r_{12}, \ldots, r_{33}$ are constants whose values depend on the nature of $\hat{R}$. One says that the symmetry operation $\hat{R}$ is represented by the matrix $\mathbf{R}$ whose elements are $r_{11}, r_{12}, \ldots, r_{33}$. The set of functions $x, y, z$, whose transformations are described by $\mathbf{R}$, is said to be the basis for this representation.

For example, from (12.3) and (12.5), for the $\hat{C}_{4}(z)$ operation, we have $x^{\prime}=-y, y^{\prime}=x, z^{\prime}=z$; for $\hat{i}$, we have $x^{\prime}=-x, y^{\prime}=-y, z^{\prime}=-z$. The matrices representing $\hat{C}_{4}(z)$ and $\hat{i}$ in the $x, y, z$ basis are

\(
\mathbf{C}_{4}(z)=\left(\begin{array}{rrr}
0 & -1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1
\end{array}\right), \quad \mathbf{i}=\left(\begin{array}{rrr}
-1 & 0 & 0 \\
0 & -1 & 0 \\
0 & 0 & -1
\end{array}\right)
\)

If the product $\hat{R} \hat{S}$ of two symmetry operations is $\hat{T}$, then the matrices representing these operations in the $x, y, z$ basis multiply in the same way; that is, if $\hat{R} \hat{S}=\hat{T}$, then $\mathbf{R S}=\mathbf{T}$. (For proof, see Schonland, pp. 56-57.)
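As a quick numerical check (ours, using numpy), the two matrices above multiply just as the operations do: applying the $\mathbf{C}_{4}(z)$ matrix twice gives the matrix of $\hat{C}_{2}(z)$, and $\mathbf{i}$, being $-\mathbf{I}$, commutes with every representative.

```python
import numpy as np

# The C4(z) and i matrices from the text; matrix products mirror
# operation products (RS = T whenever the operations satisfy R^ S^ = T^).
C4z = np.array([[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 1.0]])
inv = -np.eye(3)                           # the inversion matrix i

print(C4z @ C4z)                           # diag(-1, -1, 1): the C2(z) matrix
print(np.allclose(C4z @ inv, inv @ C4z))   # True: i commutes with C4(z)
```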


We now consider the possible combinations of symmetry elements. We cannot have arbitrary combinations of symmetry elements in a molecule. For example, suppose a molecule has one and only one $C_{3}$ axis. Any symmetry operation must send this axis into itself. The molecule cannot, therefore, have a plane of symmetry at an arbitrary angle to the $C_{3}$ axis; any plane of symmetry must either contain this axis or be perpendicular to it. (In $\mathrm{BF}_{3}$ there are three $\sigma_{v}$ planes and one $\sigma_{h}$ plane.) The only possibility for a $C_{n}$ axis noncoincident with the $C_{3}$ axis is a $C_{2}$ axis perpendicular to the $C_{3}$ axis. The corresponding $\hat{C}_{2}$ operation will send the $C_{3}$ axis into itself. Since $\hat{C}_{3}$ and $\hat{C}_{3}^{2}$ are symmetry operations, if we have one $C_{2}$ axis perpendicular to the $C_{3}$ axis, we must have a total of three such axes (as in $\mathrm{BF}_{3}$).

The set of all the symmetry operations of a molecule forms a mathematical group. A group is a set of entities (called the elements or members of the group) and a rule for combining any two members of the group to form the product of these members, such that certain requirements are met. Let $A, B, C, D, \ldots$ (assumed to be all different from one another) be the members of the group and let $B C$ denote the product of $B$ and $C$. The product $B C$ need not be the same as $C B$. The requirements that must be met to have a group are as follows: (1) The product of any two elements (including the product of an element with itself) must be a member of the group (the closure requirement). (2) There is a single element $I$ of the group, called the identity element, such that $K I=K$ and $I K=K$ for every element $K$ of the group. (3) Every element $K$ of the group has an inverse (symbolized by $K^{-1}$) that is a member of the group and that satisfies $K K^{-1}=I$ and $K^{-1} K=I$, where $I$ is the identity element. (4) Group multiplication is associative, meaning that $(B D) G=B(D G)$ always holds for elements of the group.

The number of elements in a group is called the order of the group. A group for which $B C=C B$ for every pair of group elements is commutative or Abelian.

An example of a group is the set of all integers (positive, negative, and zero) with the rule of combination being ordinary addition. Closure is satisfied since the sum of two integers is an integer. The identity element is 0. The inverse of the integer $n$ is the integer $-n$. Addition is associative. This group is of infinite order and is Abelian.

The set of all symmetry operations of a three-dimensional body, with the rule of combination of $\hat{R}$ and $\hat{S}$ being successive performance of $\hat{R}$ and $\hat{S}$, forms a group. Closure is satisfied because the product of any two symmetry operations must be a symmetry operation. The identity element of the group is the identity operation $\hat{E}$, which does nothing. Associativity is satisfied [Eq. (3.6)]. The inverse of a symmetry operation $\hat{R}$ is the symmetry operation that undoes the effect of $\hat{R}$. For example, the inverse of the inversion operation $\hat{i}$ is $\hat{i}$ itself, since $\hat{i} \hat{i}=\hat{E}$. The inverse of a $\hat{C}_{3}$ $120^{\circ}$ counterclockwise rotation is a $120^{\circ}$ clockwise rotation, which is the same as a $240^{\circ}$ counterclockwise rotation: $\hat{C}_{3} \hat{C}_{3}^{2}=\hat{E}$ and $\hat{C}_{3}^{-1}=\hat{C}_{3}^{2}$. Note that it is the symmetry operations of a molecule (and not the symmetry elements) that are the members (elements) of the group. We will make some use of group theory in Section 15.2, but a full development of group theory and its applications is omitted (see Cotton or Schonland).
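The group requirements can be verified by brute force for a small case. The sketch below (our own construction) represents the four operations $\hat{E}$, $\hat{C}_{2}(z)$, $\hat{\sigma}(xz)$, $\hat{\sigma}(yz)$ of an $\mathrm{H}_{2}\mathrm{O}$-like framework by diagonal matrices and checks closure and inverses; the helper `member` is our own naming.

```python
import numpy as np

# Matrix representatives of E, C2(z), sigma(xz), sigma(yz); assumptions ours.
E   = np.eye(3)
C2z = np.diag([-1.0, -1.0, 1.0])
sxz = np.diag([ 1.0, -1.0, 1.0])     # reflection in the xz plane: y -> -y
syz = np.diag([-1.0,  1.0, 1.0])     # reflection in the yz plane: x -> -x
group = [E, C2z, sxz, syz]

def member(M):
    """Index of matrix M in the group list, or None if absent."""
    return next((k for k, G in enumerate(group) if np.allclose(M, G)), None)

# Closure: every pairwise product is again a member of the set.
print(all(member(A @ B) is not None for A in group for B in group))  # True
# Inverses: here each operation undoes itself, so each squared gives E.
print(all(member(A @ A) == 0 for A in group))                        # True
```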

For any symmetry operation of a molecule, the point that is the center of mass remains fixed. Hence the symmetry groups of isolated molecules are called point groups. For a crystal of infinite extent, we can have symmetry operations (for example, translations) that leave no point fixed, giving rise to space groups. Consideration of space groups is omitted.

Every molecule belongs to one of the symmetry point groups that we now list. For convenience the point groups have been classified into four divisions. Script letters denote point groups.

\(
\text { I. Groups with no } C_{n} \text { axis: } \mathscr{C}_{1}, \mathscr{C}_{s}, \mathscr{C}_{i}
\)

$\mathscr{C}_{1}$: If a molecule has no symmetry elements at all, it belongs to this group. The only symmetry operation is $\hat{E}$ (which is a $\hat{C}_{1}$ rotation). CHFClBr belongs to point group $\mathscr{C}_{1}$.
$\mathscr{C}_{s}$: A molecule whose only symmetry element is a plane of symmetry belongs to this group. The symmetry operations are $\hat{E}$ and $\hat{\sigma}$. An example is HOCl (Fig. 12.11).
$\mathscr{C}_{i}$: A molecule whose only symmetry element is a center of symmetry belongs to this group. The symmetry operations are $\hat{i}$ and $\hat{E}$.

FIGURE 12.11 Molecules with no $C_{n}$ axis.

FIGURE 12.12 Molecules with a single $C_{n}$ axis: examples belonging to $\mathscr{C}_{2}$ ($\mathrm{H}_{2} \mathrm{O}_{2}$; the $\mathrm{O}-\mathrm{O}$ bond is perpendicular to the plane of the paper), $\mathscr{C}_{2 h}$, and $\mathscr{C}_{3 h}$.

II. Groups with a single $C_{n}$ axis: $\mathscr{C}_{n}, \mathscr{C}_{n h}, \mathscr{C}_{n v}, \mathscr{S}_{2 n}$
$\mathscr{C}_{n}, n=2,3,4, \ldots$: A molecule whose only symmetry element is a $C_{n}$ axis belongs to this group. The symmetry operations are $\hat{C}_{n}, \hat{C}_{n}^{2}, \ldots, \hat{C}_{n}^{n-1}, \hat{E}$. A molecule belonging to $\mathscr{C}_{2}$ is shown in Fig. 12.12.
$\mathscr{C}_{n h}, n=2,3,4, \ldots$: If we add a plane of symmetry perpendicular to the $C_{n}$ axis, we have a molecule belonging to this group. Since $\hat{\sigma}_{h} \hat{C}_{n}=\hat{S}_{n}$, the $C_{n}$ axis is also an $S_{n}$ axis. If $n$ is even, the $C_{n}$ axis is also a $C_{2}$ axis, and the molecule has the symmetry operation $\hat{\sigma}_{h} \hat{C}_{2}=\hat{S}_{2}=\hat{i}$. Thus, for $n$ even, a molecule belonging to $\mathscr{C}_{n h}$ has a center of symmetry. (The group $\mathscr{C}_{1 h}$ is the group $\mathscr{C}_{s}$ discussed previously.) Examples of molecules belonging to groups $\mathscr{C}_{2 h}$ and $\mathscr{C}_{3 h}$ are shown in Fig. 12.12.
$\mathscr{C}_{n v}, n=2,3,4, \ldots$: A molecule in this group has a $C_{n}$ axis and $n$ vertical symmetry planes passing through the $C_{n}$ axis. (Group $\mathscr{C}_{1 v}$ is the group $\mathscr{C}_{s}$.) $\mathrm{H}_{2} \mathrm{O}$ with a $C_{2}$ axis and two vertical symmetry planes belongs to $\mathscr{C}_{2 v}$. $\mathrm{NH}_{3}$ belongs to $\mathscr{C}_{3 v}$. (See Fig. 12.12.)
$\mathscr{S}_{n}, n=4,6,8, \ldots$: $\mathscr{S}_{n}$ is the group of symmetry operations associated with an $S_{n}$ axis. First consider the case of odd $n$. We have $\hat{S}_{n}=\hat{\sigma}_{h} \hat{C}_{n}$. The operation $\hat{C}_{n}$ affects the $x$ and $y$ coordinates only, while the $\hat{\sigma}_{h}$ operation affects the $z$ coordinate only. Hence these operations commute, and we have

\(
\hat{S}_{n}^{n}=\left(\hat{\sigma}_{h} \hat{C}_{n}\right)^{n}=\hat{\sigma}_{h} \hat{C}_{n} \hat{\sigma}_{h} \hat{C}_{n} \cdots \hat{\sigma}_{h} \hat{C}_{n}=\hat{\sigma}_{h}^{n} \hat{C}_{n}^{n}
\)

Now $\hat{C}_{n}^{n}=\hat{E}$, and, for odd $n$, $\hat{\sigma}_{h}^{n}=\hat{\sigma}_{h}$. Hence the symmetry operation $\hat{S}_{n}^{n}$ equals $\hat{\sigma}_{h}$ for odd $n$, and the group $\mathscr{S}_{n}$ has a horizontal symmetry plane if $n$ is odd. Also,

\(
\hat{S}_{n}^{n+1}=\hat{S}_{n}^{n} \hat{S}_{n}=\hat{\sigma}_{h} \hat{S}_{n}=\hat{\sigma}_{h} \hat{\sigma}_{h} \hat{C}_{n}=\hat{C}_{n}, \quad n \text { odd }
\)

so the molecule has a $C_{n}$ axis if $n$ is odd. We conclude that $\mathscr{S}_{n}$ is identical to the group $\mathscr{C}_{n h}$ if $n$ is odd. Now consider even values of $n$. Since $\hat{S}_{2}=\hat{i}$, the group $\mathscr{S}_{2}$ is identical to $\mathscr{C}_{i}$. Thus it is only for $n=4,6,8, \ldots$ that we get new groups. The $S_{2 n}$ axis is also a $C_{n}$ axis: $\hat{S}_{2 n}^{2}=\hat{\sigma}_{h}^{2} \hat{C}_{2 n}^{2}=\hat{E} \hat{C}_{n}=\hat{C}_{n}$.
III. Groups with one $C_{n}$ axis and $n$ $C_{2}$ axes: $\mathscr{D}_{n}, \mathscr{D}_{n h}, \mathscr{D}_{n d}$
$\mathscr{D}_{n}, n=2,3,4, \ldots$: A molecule with a $C_{n}$ axis and $n$ $C_{2}$ axes perpendicular to the $C_{n}$ axis (and no symmetry planes) belongs to $\mathscr{D}_{n}$. The angle between adjacent $C_{2}$ axes is $\pi / n$ radians. The group $\mathscr{D}_{2}$ has three mutually perpendicular $C_{2}$ axes, and the symmetry operations are $\hat{E}, \hat{C}_{2}(x), \hat{C}_{2}(y), \hat{C}_{2}(z)$.

FIGURE 12.13 Molecules with a $C_{n}$ axis and $n$ $C_{2}$ axes.

$\mathscr{D}_{n h}, n=2,3,4, \ldots$: This is the group of a molecule with a $C_{n}$ axis, $n$ $C_{2}$ axes, and a $\sigma_{h}$ symmetry plane perpendicular to the $C_{n}$ axis. As in $\mathscr{C}_{n h}$, the $C_{n}$ axis is also an $S_{n}$ axis. If $n$ is even, the $C_{n}$ axis is a $C_{2}$ and an $S_{2}$ axis, and the molecule has a center of symmetry. Molecules in $\mathscr{D}_{n h}$ also have $n$ vertical planes of symmetry, each such plane passing through the $C_{n}$ axis and a $C_{2}$ axis. (For the proof, see Prob. 12.29.) $\mathrm{BF}_{3}$ belongs to $\mathscr{D}_{3 h}$; $\mathrm{PtCl}_{4}^{2-}$ belongs to $\mathscr{D}_{4 h}$; benzene belongs to $\mathscr{D}_{6 h}$ (Fig. 12.13).
$\mathscr{D}_{n d}, n=2,3,4, \ldots$: A molecule with a $C_{n}$ axis, $n$ $C_{2}$ axes, and $n$ vertical planes of symmetry, which pass through the $C_{n}$ axis and bisect the angles between adjacent $C_{2}$ axes, belongs to this group. The $n$ vertical planes are called diagonal planes and are symbolized by $\sigma_{d}$. The $C_{n}$ axis can be shown to be an $S_{2 n}$ axis. The staggered conformation of ethane is an example of group $\mathscr{D}_{3 d}$ (Fig. 12.13). [The symmetry of molecules with internal rotation (for example, ethane) actually requires special consideration; see H. C. Longuet-Higgins, Mol. Phys., 6, 445 (1963).]

\(
\text { IV. Groups with more than one } C_{n} \text { axis, } n>2: \mathscr{T}_{d}, \mathscr{T}, \mathscr{T}_{h}, \mathscr{O}_{h}, \mathscr{O}, \mathscr{I}_{h}, \mathscr{I}, \mathscr{K}_{h}
\)

These groups are related to the symmetries of the Platonic solids, solids bounded by congruent regular polygons and having congruent polyhedral angles. There are five such solids: The tetrahedron has four triangular faces, the cube has six square faces, the octahedron has eight triangular faces, the pentagonal dodecahedron has twelve pentagonal faces, and the icosahedron has twenty triangular faces.
$\mathscr{T}_{d}$: The symmetry operations of a regular tetrahedron constitute this group. The prime example is $\mathrm{CH}_{4}$. The symmetry elements of methane are four $C_{3}$ axes (each $\mathrm{C}-\mathrm{H}$ bond), three $S_{4}$ axes, which are also $C_{2}$ axes (Fig. 12.5), and six symmetry planes, each such plane containing two $\mathrm{C}-\mathrm{H}$ bonds. (The number of combinations of 4 things taken 2 at a time is $4!/ 2!2!=6$.)
$\mathscr{O}_{h}$: The symmetry operations of a cube or a regular octahedron constitute this group. The cube and octahedron are said to be dual to each other; if we connect the midpoints of adjacent faces of a cube, we get an octahedron, and vice versa. Hence the

FIGURE 12.14 Molecules with more than one $C_{n}$ axis, $n>2$. (For $\mathrm{B}_{12} \mathrm{H}_{12}^{2-}$, the hydrogen atoms have been omitted for clarity.)
cube and octahedron have the same symmetry elements and operations. A cube has six faces, eight vertices, and twelve edges. Its symmetry elements are as follows: a center of symmetry, three $C_{4}$ axes passing through the centers of opposite faces of the cube (these are also $S_{4}$ and $C_{2}$ axes), four $C_{3}$ axes passing through opposite corners of the cube (these are also $S_{6}$ axes), six $C_{2}$ axes connecting the midpoints of pairs of opposite edges, three planes of symmetry parallel to pairs of opposite faces, and six planes of symmetry passing through pairs of opposite edges. Octahedral molecules such as $\mathrm{SF}_{6}$ belong to $\mathscr{O}_{h}$.
$\mathscr{I}_{h}$: The symmetry operations of a regular pentagonal dodecahedron or icosahedron (which are dual to each other) constitute this group. The $\mathrm{B}_{12} \mathrm{H}_{12}^{2-}$ ion belongs to group $\mathscr{I}_{h}$. The twelve boron atoms lie at the vertices of a regular icosahedron (Fig. 12.14). The soccer-ball-shaped molecule $\mathrm{C}_{60}$ (buckminsterfullerene) belongs to $\mathscr{I}_{h}$. Its shape is a truncated icosahedron formed by slicing off each of the 12 vertices of a regular icosahedron (Fig. 12.14), thereby generating a figure with 12 pentagonal faces (5 faces meet at each vertex of the original icosahedron), 20 hexagonal faces (formed from the 20 triangular faces of the original icosahedron), and $12 \times 5=60$ vertices (5 new vertices are formed when one of the original vertices is sliced off).
$\mathscr{K}_{h}$: This is the group of symmetry operations of a sphere. (Kugel is the German word for sphere.) An atom belongs to this group.

For completeness, we mention the remaining groups related to the Platonic solids; these groups are chemically unimportant. The groups $\mathscr{T}$, $\mathscr{O}$, and $\mathscr{I}$ are the groups of the proper symmetry rotations of a tetrahedron, cube, and icosahedron, respectively. These groups do not have the symmetry reflections and improper rotations of these solids or the inversion operation of the cube and icosahedron. The group $\mathscr{T}_{h}$ contains the symmetry rotations of a tetrahedron, the inversion operation, and certain reflections and improper rotations.

What groups do linear molecules belong to? A rotation by any angle about the internuclear axis of a linear molecule is a symmetry operation. A regular polygon of $n$ sides has a $C_{n}$ axis, and taking the limit as $n \rightarrow \infty$ we get a circle, which has a $C_{\infty}$ axis. The internuclear axis of a linear molecule is a $C_{\infty}$ axis. Any plane containing this axis is a symmetry plane. If the linear molecule does not have a center of symmetry (for example, $\mathrm{CO}$, $\mathrm{HCN}$), it belongs to the group $\mathscr{C}_{\infty v}$. If the linear molecule has a center of symmetry (for example, $\mathrm{H}_{2}$, $\mathrm{C}_{2} \mathrm{H}_{2}$), then it also has a $\sigma_{h}$ symmetry plane and an infinite number of $C_{2}$ axes perpendicular to the molecular axis. Hence it belongs to $\mathscr{D}_{\infty h}$.

How do we find what point group a molecule belongs to? One way is to find all the symmetry elements and then compare with the above list of groups. A more systematic procedure is given in Fig. 12.15 [J. B. Calvert, Am. J. Phys., 31, 659 (1963)]. This procedure is based on the four divisions of point groups.

FIGURE 12.15 How to determine the point group of a molecule.

*If there are three mutually $\perp$ $C_{2}$ axes, check each axis for two $\sigma_{v}$ planes.

We begin by checking whether or not the molecule is linear. Linear molecules are classified in $\mathscr{D}_{\infty h}$ or $\mathscr{C}_{\infty v}$ according to whether or not there is a center of symmetry. If the molecule is nonlinear, we look for two or more rotational axes of threefold or higher order. If these are present, the molecule is classified in one of the groups related to the symmetry of the regular polyhedra (division IV). If these axes are not present, we look for any $C_{n}$ axis at all. If there is no $C_{n}$ axis, the molecule belongs to one of the groups $\mathscr{C}_{s}$, $\mathscr{C}_{i}$, $\mathscr{C}_{1}$ (division I). If there is at least one $C_{n}$ axis, we pick the $C_{n}$ axis of highest order as the main symmetry axis before proceeding to the next step. (If there are three mutually perpendicular $C_{2}$ axes, we may pick any one of these axes as the main axis.) We next check for $n$ $C_{2}$ axes at right angles to the main $C_{n}$ axis. If these are present, we have one of the division III groups. If these are absent, we have one of the division II groups. If we find the $n$ $C_{2}$ axes, we look for a symmetry plane perpendicular to the main $C_{n}$ axis. If it is present, the group is $\mathscr{D}_{n h}$. If it is absent, we check for $n$ planes of symmetry containing the main $C_{n}$ axis (if the molecule has three mutually perpendicular $C_{2}$ axes, we must try each axis as the main axis in looking for the two $\sigma_{v}$ planes; the three $C_{2}$ axes are equivalent in the groups $\mathscr{D}_{n h}$ and $\mathscr{D}_{n}$, but not in $\mathscr{D}_{n d}$). If we find $n$ $\sigma_{v}$ planes, the group is $\mathscr{D}_{n d}$; otherwise it is $\mathscr{D}_{n}$. If the molecule does not have $n$ $C_{2}$ axes perpendicular to the main $C_{n}$ axis, we classify it in one of the groups $\mathscr{C}_{n h}$, $\mathscr{C}_{n v}$, $\mathscr{S}_{2 n}$, or $\mathscr{C}_{n}$ by looking first for a $\sigma_{h}$ plane, then for $n$ $\sigma_{v}$ planes, and, finally, if these are absent, checking whether or not the $C_{n}$ axis is an $S_{2 n}$ axis. The procedure of Fig. 12.15 does not locate all symmetry elements. After classifying a molecule, check that all the required symmetry elements are indeed present. Although the above procedure might seem involved, it is really quite simple and is easily memorized.
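The flowchart lends itself to a short function. The following sketch is our own rendering of the Fig. 12.15 logic; the boolean and integer inputs (assumed to be supplied after inspecting the molecule) and all names are ours, not the text's.

```python
def point_group(linear, center, n_high_axes, has_Cn, n,
                nC2_perp, sigma_h, n_sigma_v, S2n):
    """Classify a molecule following the Fig. 12.15 decision tree (a sketch)."""
    if linear:
        return "Dinfh" if center else "Cinfv"
    if n_high_axes >= 2:               # two or more C_n axes with n > 2
        return "division IV (Td, Oh, Ih, ...)"
    if not has_Cn:                     # division I
        if n_sigma_v:
            return "Cs"
        return "Ci" if center else "C1"
    if nC2_perp:                       # division III: n C2 axes _|_ main axis
        if sigma_h:
            return f"D{n}h"
        return f"D{n}d" if n_sigma_v else f"D{n}"
    if sigma_h:                        # division II
        return f"C{n}h"
    if n_sigma_v:
        return f"C{n}v"
    return f"S{2*n}" if S2n else f"C{n}"

print(point_group(False, False, 0, True, 2, False, False, 2, False))  # H2O: C2v
print(point_group(False, False, 0, True, 3, True,  True,  3, False))  # BF3: D3h
```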

The most common error students make in classifying a molecule is to miss the $n$ $C_{2}$ axes perpendicular to the $C_{n}$ axis of a molecule belonging to $\mathscr{D}_{n d}$. For example, it is easy to see that the $\mathrm{C}=\mathrm{C}=\mathrm{C}$ axis of allene is a $C_{2}$ axis, but the other two $C_{2}$ axes (Fig. 12.16) are often overlooked. Molecules with two equal halves "staggered" with respect to each other generally belong to $\mathscr{D}_{n d}$. Models may be helpful for those with visualization difficulties.


Electronic Structures of Diatomic Molecules

Keywords: (click the terms to know more)

Born–Oppenheimer Approximation: An approximation in molecular quantum mechanics that treats the nuclei as fixed while the electrons move, simplifying the molecular Hamiltonian operator

Molecular Hamiltonian: The operator representing the total energy of a molecule, including the kinetic energy of nuclei and electrons and the potential energy from the repulsions and attractions among nuclei and electrons

Schrödinger Equation: A fundamental equation of quantum mechanics used to find the wave functions and energies of a molecule

Electronic Schrödinger Equation: The Schrödinger equation for electronic motion, obtained by omitting the nuclear kinetic-energy terms

Purely Electronic Hamiltonian: The Hamiltonian operator for electronic motion, excluding the nuclear kinetic-energy terms

Internuclear Repulsion: The potential-energy term representing the repulsion between the nuclei in a molecule

Equilibrium Internuclear Distance: The internuclear separation at which the potential energy of a molecule is minimized

Equilibrium Dissociation Energy: The difference between the potential energy at infinite internuclear separation and at the equilibrium internuclear distance

Zero-Point Energy: The lowest possible energy that a quantum-mechanical system may have, present even at absolute zero temperature

Harmonic Oscillator Approximation: An approximation that treats the vibration of a diatomic molecule as a harmonic oscillator, useful for calculating vibrational energy levels

Vibrational Anharmonicity: The deviation of a molecule's vibrational energy levels from those predicted by the harmonic-oscillator approximation

Centrifugal Distortion: The distortion of a molecule due to rotational motion, which affects its rotational energy levels

Morse Function: A potential-energy function used to approximate the potential energy of a diatomic molecule, accounting for anharmonicity

Atomic Units: A system of units used in quantum chemistry in which fundamental constants such as the electron mass, the proton charge, and $\hbar$ are set equal to 1

Hartree: The atomic unit of energy; the ground-state energy of the hydrogen atom (neglecting nuclear motion) is $-\frac{1}{2}$ hartree

Bohr Radius: The atomic unit of length; the most probable distance between the electron and the nucleus in the ground-state hydrogen atom

Hydrogen Molecule Ion (H₂⁺): The simplest diatomic molecule, consisting of two protons and one electron, used as a model for studying molecular electronic structure

Confocal Elliptic Coordinates: A coordinate system used to separate and solve the Schrödinger equation for the hydrogen molecule ion

LCAO-MO (Linear Combination of Atomic Orbitals - Molecular Orbital): A method for constructing molecular orbitals by combining atomic orbitals

Bonding and Antibonding Orbitals: Molecular orbitals formed from combinations of atomic orbitals; occupation of a bonding orbital lowers the molecular energy, while occupation of an antibonding orbital raises it

We now begin the study of molecular quantum mechanics. If we assume the nuclei and electrons to be point masses and neglect spin-orbit and other relativistic interactions (Sections 11.6 and 11.7), then the molecular Hamiltonian operator is

\(
\begin{equation}
\hat{H}=-\frac{\hbar^{2}}{2} \sum_{\alpha} \frac{1}{m_{\alpha}} \nabla_{\alpha}^{2}-\frac{\hbar^{2}}{2 m_{e}} \sum_{i} \nabla_{i}^{2}+\sum_{\alpha} \sum_{\beta>\alpha} \frac{Z_{\alpha} Z_{\beta} e^{2}}{4 \pi \varepsilon_{0} r_{\alpha \beta}}-\sum_{\alpha} \sum_{i} \frac{Z_{\alpha} e^{2}}{4 \pi \varepsilon_{0} r_{i \alpha}}+\sum_{j} \sum_{i>j} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{i j}} \tag{13.1}
\end{equation}
\)

where $\alpha$ and $\beta$ refer to nuclei and $i$ and $j$ refer to electrons. The first term in (13.1) is the operator for the kinetic energy of the nuclei. The second term is the operator for the kinetic energy of the electrons. The third term is the potential energy of the repulsions between the nuclei, $r_{\alpha \beta}$ being the distance between nuclei $\alpha$ and $\beta$ with atomic numbers $Z_{\alpha}$ and $Z_{\beta}$. The fourth term is the potential energy of the attractions between the electrons and the nuclei, $r_{i \alpha}$ being the distance between electron $i$ and nucleus $\alpha$. The last term is the potential energy of the repulsions between the electrons, $r_{i j}$ being the distance between electrons $i$ and $j$. The zero level of potential energy for (13.1) corresponds to having all the charges (electrons and nuclei) infinitely far from one another.

As an example, consider $\mathrm{H}_{2}$. Let $\alpha$ and $\beta$ be the two protons, 1 and 2 be the two electrons, and $m_{p}$ be the proton mass. The $\mathrm{H}_{2}$ molecular Hamiltonian operator is

\(
\begin{align}
\hat{H}= & -\frac{\hbar^{2}}{2 m_{p}} \nabla_{\alpha}^{2}-\frac{\hbar^{2}}{2 m_{p}} \nabla_{\beta}^{2}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{1}^{2}-\frac{\hbar^{2}}{2 m_{e}} \nabla_{2}^{2} \\
& +\frac{e^{2}}{4 \pi \varepsilon_{0} r_{\alpha \beta}}-\frac{e^{2}}{4 \pi \varepsilon_{0} r_{1 \alpha}}-\frac{e^{2}}{4 \pi \varepsilon_{0} r_{1 \beta}}-\frac{e^{2}}{4 \pi \varepsilon_{0} r_{2 \alpha}}-\frac{e^{2}}{4 \pi \varepsilon_{0} r_{2 \beta}}+\frac{e^{2}}{4 \pi \varepsilon_{0} r_{12}} \tag{13.2}
\end{align}
\)

The wave functions and energies of a molecule are found from the Schrödinger equation:

\(
\begin{equation}
\hat{H} \psi\left(q_{i}, q_{\alpha}\right)=E \psi\left(q_{i}, q_{\alpha}\right) \tag{13.3}
\end{equation}
\)

where $q_{i}$ and $q_{\alpha}$ symbolize the electronic and nuclear coordinates, respectively. The molecular Hamiltonian (13.1) is formidable enough to terrify any quantum chemist. Fortunately, a very accurate, simplifying approximation exists. Since nuclei are much heavier than electrons ($m_{\alpha} \gg m_{e}$), the electrons move much faster than the nuclei. Hence, to a good
approximation as far as the electrons are concerned, we can regard the nuclei as fixed while the electrons carry out their motions. Speaking classically, during the time of a cycle of electronic motion, the change in nuclear configuration is negligible. Thus, considering the nuclei as fixed, we omit the nuclear kinetic-energy terms from (13.1) to obtain the Schrödinger equation for electronic motion:

\(
\begin{equation}
\left(\hat{H}_{\mathrm{el}}+V_{N N}\right) \psi_{\mathrm{el}}=U \psi_{\mathrm{el}} \tag{13.4}
\end{equation}
\)

where the purely electronic Hamiltonian $\hat{H}_{\text {el }}$ is

\(
\begin{equation}
\hat{H}_{\mathrm{el}}=-\frac{\hbar^{2}}{2 m_{e}} \sum_{i} \nabla_{i}^{2}-\sum_{\alpha} \sum_{i} \frac{Z_{\alpha} e^{2}}{4 \pi \varepsilon_{0} r_{i \alpha}}+\sum_{j} \sum_{i>j} \frac{e^{2}}{4 \pi \varepsilon_{0} r_{i j}} \tag{13.5}
\end{equation}
\)

The electronic Hamiltonian including nuclear repulsion is $\hat{H}_{\mathrm{el}}+V_{N N}$. The nuclear-repulsion term $V_{N N}$ is

\(
\begin{equation}
V_{N N}=\sum_{\alpha} \sum_{\beta>\alpha} \frac{Z_{\alpha} Z_{\beta} e^{2}}{4 \pi \varepsilon_{0} r_{\alpha \beta}} \tag{13.6}
\end{equation}
\)

The energy $U$ in (13.4) is the electronic energy including internuclear repulsion. The internuclear distances $r_{\alpha \beta}$ in (13.4) are not variables, but are each fixed at some constant value. Of course, there are an infinite number of possible nuclear configurations, and for each of these we may solve the electronic Schrödinger equation (13.4) to get a set of electronic wave functions and corresponding electronic energies. Each member of the set corresponds to a different molecular electronic state. The electronic wave functions and energies thus depend parametrically on the nuclear coordinates:

\(
\psi_{\mathrm{el}}=\psi_{\mathrm{el}, n}\left(q_{i} ; q_{\alpha}\right) \quad \text { and } \quad U=U_{n}\left(q_{\alpha}\right)
\)

where $n$ symbolizes the electronic quantum numbers.
The variables in the electronic Schrödinger equation (13.4) are the electronic coordinates. The quantity $V_{N N}$ is independent of these coordinates and is a constant for a given nuclear configuration. Now it is easily proved (Prob. 4.52) that the omission of a constant term $C$ from the Hamiltonian does not affect the wave functions and simply decreases each energy eigenvalue by $C$. Hence, if $V_{N N}$ is omitted from (13.4), we get

\(
\begin{equation}
\hat{H}_{\mathrm{el}} \psi_{\mathrm{el}}=E_{\mathrm{el}} \psi_{\mathrm{el}} \tag{13.7}
\end{equation}
\)

where the purely electronic energy $E_{\mathrm{el}}\left(q_{\alpha}\right)$ (which depends parametrically on the nuclear coordinates $q_{\alpha}$) is related to the electronic energy including internuclear repulsion by

\(
\begin{equation}
U=E_{\mathrm{el}}+V_{N N} \tag{13.8}
\end{equation}
\)

We can therefore omit the internuclear repulsion from the electronic Schrödinger equation. After finding $E_{\mathrm{el}}$ for a particular configuration of the nuclei by solving (13.7), we calculate $U$ using (13.8), where the constant $V_{N N}$ is easily calculated from (13.6) using the assumed nuclear locations.

For $\mathrm{H}_{2}$, with the two protons at a fixed distance $r_{\alpha \beta}=R$, the purely electronic Hamiltonian is given by (13.2) with the first, second, and fifth terms omitted. The nuclear repulsion $V_{N N}$ equals $e^{2} / 4 \pi \varepsilon_{0} R$. The purely electronic Hamiltonian involves the six electronic coordinates $x_{1}, y_{1}, z_{1}, x_{2}, y_{2}, z_{2}$ as variables and involves the nuclear coordinates as parameters.

The electronic Schrödinger equation (13.4) can be dealt with by approximate methods to be discussed later. If we plot the electronic energy including nuclear repulsion for a bound state of a diatomic molecule against the internuclear distance $R$, we find a

FIGURE 13.1 Electronic energy including internuclear repulsion as a function of the internuclear distance $R$ for a diatomic-molecule bound electronic state.

curve like the one shown in Fig. 13.1. At $R=0$, the internuclear repulsion causes $U$ to go to infinity. The internuclear separation at the minimum in this curve is called the equilibrium internuclear distance $R_{e}$. The difference between the limiting value of $U$ at infinite internuclear separation and its value at $R_{e}$ is called the equilibrium dissociation energy (or the dissociation energy from the potential-energy minimum) $D_{e}$:

\(
\begin{equation}
D_{e} \equiv U(\infty)-U\left(R_{e}\right) \tag{13.9}
\end{equation}
\)

When nuclear motion is considered (Section 13.2), one finds that the equilibrium dissociation energy $D_{e}$ differs from the molecular ground-vibrational-state dissociation energy $D_{0}$. The lowest state of nuclear motion has zero rotational energy [as shown by Eq. (6.47)] but has a nonzero vibrational energy, the zero-point energy. If we use the harmonic-oscillator approximation for the vibration of a diatomic molecule (Section 4.3), then this zero-point energy is $\frac{1}{2} h \nu$. This zero-point energy raises the energy of the ground state of nuclear motion $\frac{1}{2} h \nu$ above the minimum in the $U(R)$ curve, so $D_{0}$ is less than $D_{e}$ and $D_{0} \approx D_{e}-\frac{1}{2} h \nu$. Different electronic states of the same molecule have different $U(R)$ curves (Figs. 13.5 and 13.19) and different values of $R_{e}$, $D_{e}$, $D_{0}$, and $\nu$.

Consider an ideal gas composed of diatomic molecules AB. In the limit of absolute zero temperature, all the AB molecules are in their ground states of electronic and nuclear motion, so $D_{0} N_{\mathrm{A}}$ (where $N_{\mathrm{A}}$ is the Avogadro constant and $D_{0}$ is for the ground electronic state of AB) is the change in the thermodynamic internal energy $U$ and enthalpy $H$ for dissociation of 1 mole of ideal-gas diatomic molecules: $N_{\mathrm{A}} D_{0}=\Delta U_{0}^{\circ}=\Delta H_{0}^{\circ}$ for $\mathrm{AB}(\mathrm{g}) \rightarrow \mathrm{A}(\mathrm{g})+\mathrm{B}(\mathrm{g})$.
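As a numerical illustration (a sketch, using the ground-state $\mathrm{H}_{2}$ constants quoted in the Morse-function example later in this chapter), the harmonic estimate $D_{0} \approx D_{e}-\frac{1}{2} h \nu_{e}$ and the corresponding molar dissociation energy work out as follows:

```python
# Harmonic estimate of D0 for ground-state H2; De/hc = 38297 cm^-1 and
# nu_e/c = 4403.2 cm^-1 are the values quoted in the example later on.
De, nu_e = 38297.0, 4403.2             # both divided by hc, i.e. in cm^-1
D0 = De - 0.5*nu_e                     # cm^-1; anharmonicity shifts this a bit
NA, hc = 6.02214e23, 1.986446e-23      # mol^-1, J cm
print(D0)                              # ~36095 cm^-1
print(D0*hc*NA/1000.0)                 # ~432 kJ/mol: Delta U0 for AB(g) -> A + B
```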

For some diatomic-molecule electronic states, solution of the electronic Schrödinger equation gives a $U(R)$ curve with no minimum. Such states are not bound and the molecule will dissociate. Examples include some of the states in Fig. 13.5.

Assuming that we have solved the electronic Schrödinger equation, we next consider nuclear motions. According to our picture, the electrons move much faster than the nuclei. When the nuclei change their configuration slightly, say from $q_{\alpha}^{\prime}$ to $q_{\alpha}^{\prime \prime}$, the electrons immediately adjust to the change, with the electronic wave function changing from $\psi_{\mathrm{el}}\left(q_{i} ; q_{\alpha}^{\prime}\right)$ to $\psi_{\mathrm{el}}\left(q_{i} ; q_{\alpha}^{\prime \prime}\right)$ and the electronic energy changing from $U\left(q_{\alpha}^{\prime}\right)$ to $U\left(q_{\alpha}^{\prime \prime}\right)$. Thus, as the nuclei move, the electronic energy varies smoothly as a function of the parameters defining the nuclear configuration, and $U\left(q_{\alpha}\right)$ becomes, in effect, the potential energy for the nuclear motion. The electrons act like a spring connecting the nuclei. As the internuclear distance
changes, the energy stored in the spring changes. Hence the Schrödinger equation for nuclear motion is

\(
\begin{gather}
\hat{H}_{N} \psi_{N}=E \psi_{N} \tag{13.10}\\
\hat{H}_{N}=-\frac{\hbar^{2}}{2} \sum_{\alpha} \frac{1}{m_{\alpha}} \nabla_{\alpha}^{2}+U\left(q_{\alpha}\right) \tag{13.11}
\end{gather}
\)

The variables in the nuclear Schrödinger equation are the nuclear coordinates, symbolized by $q_{\alpha}$. The energy eigenvalue $E$ in (13.10) is the total energy of the molecule, since the Hamiltonian (13.11) includes operators for both nuclear energy and electronic energy. $E$ is simply a number and does not depend on any coordinates. Note that for each electronic state of a molecule we must solve a different nuclear Schrödinger equation, since $U$ differs from state to state. In this chapter we shall concentrate on the electronic Schrödinger equation (13.4).

In Section 13.2, we shall show that the total energy $E$ for an electronic state of a diatomic molecule is approximately the sum of electronic, vibrational, rotational, and translational energies, $E \approx E_{\text {elec }}+E_{\text {vib }}+E_{\text {rot }}+E_{\text {tr }}$, where the constant $E_{\text {elec }}$ [not to be confused with $E_{\mathrm{el}}$ in (13.7)] is given by $E_{\text {elec }}=U\left(R_{e}\right)$.

The approximation of separating electronic and nuclear motions is called the Born-Oppenheimer approximation and is basic to quantum chemistry. [The American physicist J. Robert Oppenheimer (1904-1967) was a graduate student of Born in 1927. During World War II, Oppenheimer directed the Los Alamos laboratory that developed the atomic bomb.] Born and Oppenheimer's mathematical treatment indicated that the true molecular wave function is adequately approximated as

\(
\begin{equation}
\psi\left(q_{i}, q_{\alpha}\right)=\psi_{\mathrm{el}}\left(q_{i} ; q_{\alpha}\right) \psi_{N}\left(q_{\alpha}\right) \tag{13.12}
\end{equation}
\)

if $\left(m_{e} / m_{\alpha}\right)^{1 / 4} \ll 1$. The Born-Oppenheimer approximation introduces little error for the ground electronic states of diatomic molecules. Corrections for excited electronic states are larger than for the ground state, but still are usually small as compared with the errors introduced by the approximations used to solve the electronic Schrödinger equation of a many-electron molecule. Hence we shall not worry about corrections to the Born-Oppenheimer approximation. For further discussion of the Born-Oppenheimer approximation, see J. Goodisman, Diatomic Interaction Potential Theory, Academic Press, 1973, Volume 1, Chapter 1.
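The expansion parameter is largest, and the approximation therefore worst, for hydrogen nuclei; a one-line check (ours):

```python
# Born-Oppenheimer parameter (me/m_alpha)^(1/4) for the worst case, m_alpha = m_p
m_e, m_p = 9.1094e-31, 1.6726e-27      # kg
print((m_e/m_p)**0.25)                 # ~0.15; smaller for heavier nuclei
```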

Born and Oppenheimer's 1927 paper justifying the Born-Oppenheimer approximation is seriously lacking in rigor. Subsequent work has better justified the Born-Oppenheimer approximation, but significant questions still remain; "the problem of the coupling of nuclear and electronic motions is, at the moment, without a sensible solution and ... is an area where much future work can and must be done" [B. T. Sutcliffe, J. Chem. Soc. Faraday Trans., 89, 2321 (1993); see also B. T. Sutcliffe and R. G. Woolley, Phys. Chem. Chem. Phys., 7, 3664 (2005), and Sutcliffe and Woolley, arxiv.org/abs/1206.4239].


Most of this chapter deals with the electronic Schrödinger equation for diatomic molecules, but this section examines nuclear motion in a bound electronic state of a diatomic molecule. From (13.10) and (13.11), the Schrödinger equation for nuclear motion in a diatomic-molecule bound electronic state is

\(
\begin{equation}
\left[-\frac{\hbar^{2}}{2 m_{\alpha}} \nabla_{\alpha}^{2}-\frac{\hbar^{2}}{2 m_{\beta}} \nabla_{\beta}^{2}+U(R)\right] \psi_{N}=E \psi_{N} \tag{13.13}
\end{equation}
\)

where $\alpha$ and $\beta$ are the nuclei, and the nuclear-motion wave function $\psi_{N}$ is a function of the nuclear coordinates $x_{\alpha}, y_{\alpha}, z_{\alpha}, x_{\beta}, y_{\beta}, z_{\beta}$.

The potential energy $U(R)$ is a function of only the relative coordinates of the two nuclei, and the work of Section 6.3 shows that the two-particle Schrödinger equation (13.13) can be reduced to two separate one-particle Schrödinger equations-one for translational energy of the entire molecule and one for internal motion of the nuclei relative to each other. We have

\(
\begin{equation}
\psi_{N}=\psi_{N, \mathrm{tr}} \psi_{N, \text { int }} \text { and } E=E_{\mathrm{tr}}+E_{\mathrm{int}} \tag{13.14}
\end{equation}
\)

The translational energy levels can be taken as the energy levels (3.72) of a particle in a three-dimensional box whose dimensions are those of the container holding the gas of diatomic molecules.

The Schrödinger equation for $\psi_{N, \text { int }}$ is [Eq. (6.43)]

\(
\begin{equation}
\left[-\frac{\hbar^{2}}{2 \mu} \nabla^{2}+U(R)\right] \psi_{N, \mathrm{int}}=E_{\mathrm{int}} \psi_{N, \mathrm{int}}, \quad \mu \equiv m_{\alpha} m_{\beta} /\left(m_{\alpha}+m_{\beta}\right) \tag{13.15}
\end{equation}
\)

where $\psi_{N, \text { int }}$ is a function of the coordinates of one nucleus relative to the other. The best coordinates to use are the spherical coordinates of one nucleus relative to the other (Fig. 6.5 with $m_{N}$ and $m_{e}$ replaced by $m_{\alpha}$ and $m_{\beta}$). The radius $r$ in relative spherical coordinates is the internuclear distance $R$, and we shall denote the relative angular coordinates by $\theta_{N}$ and $\phi_{N}$. Since the potential energy in (13.15) depends on $R$ only, this is a central-force problem, and the work of Section 6.1 shows that

\(
\begin{equation}
\psi_{N, \text { int }}=P(R) Y_{J}^{M}\left(\theta_{N}, \phi_{N}\right), \quad J=0,1,2, \ldots, \quad M=-J, \ldots, J \tag{13.16}
\end{equation}
\)

where the $Y_{J}^{M}$ functions are the spherical harmonic functions with quantum numbers $J$ and $M$.

From (6.17), the radial function $P(R)$ is found by solving

\(
\begin{equation}
-\frac{\hbar^{2}}{2 \mu}\left[P^{\prime \prime}(R)+\frac{2}{R} P^{\prime}(R)\right]+\frac{J(J+1) \hbar^{2}}{2 \mu R^{2}} P(R)+U(R) P(R)=E_{\mathrm{int}} P(R) \tag{13.17}
\end{equation}
\)

This differential equation is simplified by defining $F(R)$ as

\(
\begin{equation}
F(R) \equiv R P(R) \tag{13.18}
\end{equation}
\)

Substitution of $P=F / R$ into (13.17) gives [Eq. (6.137)]

\(
\begin{equation}
-\frac{\hbar^{2}}{2 \mu} F^{\prime \prime}(R)+\left[U(R)+\frac{J(J+1) \hbar^{2}}{2 \mu R^{2}}\right] F(R)=E_{\mathrm{int}} F(R) \tag{13.19}
\end{equation}
\)

which is a one-dimensional Schrödinger equation with the effective potential energy $U(R)+J(J+1) \hbar^{2} / 2 \mu R^{2}$.

The most fundamental way to solve (13.19) is as follows: (a) Solve the electronic Schrödinger equation (13.7) at several values of $R$ to obtain $E_{\mathrm{el}}$ of the particular molecular electronic state you are interested in; (b) add $Z_{\alpha} Z_{\beta} e^{2} / 4 \pi \varepsilon_{0} R$ to each $E_{\mathrm{el}}$ value to obtain $U$ at these $R$ values; (c) devise a mathematical function $U(R)$ whose parameters are adjusted to give a good fit to the calculated $U$ values; (d) insert the function $U(R)$ found in (c) into the nuclear-motion radial Schrödinger equation (13.19) and solve (13.19) by numerical methods.

A commonly used fitting procedure for step (c) is the method of cubic splines, for which computer programs exist (see Press et al., Chapter 3; Shoup, Chapter 6).
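A minimal sketch of step (c) using SciPy's cubic-spline interpolator; the $U(R)$ points below are illustrative placeholders (roughly $\mathrm{H}_{2}$-like, in hartrees and bohrs), not actual computed values:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Step (c) sketch: fit a smooth U(R) through pointwise ab initio energies.
# These numbers are illustrative placeholders, not real calculated data.
R = np.array([1.0, 1.2, 1.4, 1.6, 2.0, 3.0, 5.0])                       # bohr
U = np.array([-1.124, -1.165, -1.174, -1.169, -1.138, -1.057, -1.006])  # hartree
spline = CubicSpline(R, U)
print(spline(1.5))       # interpolated U between the computed points
print(spline(1.4, 1))    # dU/dR near the minimum; should be close to zero
```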

As for step (d), numerical solution of the one-dimensional Schrödinger equation (13.19) is done using either the Cooley-Numerov method [see J. Tellinghuisen, J. Chem. Educ., 66, 51 (1989)], which is a modification of the Numerov method (Sections 4.4 and 6.9), or the finite-element method [see D. J. Searles and E. I. von Nagy-Felsobuki, Am. J. Phys., 56, 444 (1988)].

The solutions $F(R)$ of the radial equation (13.19) for a given $J$ are characterized by a quantum number $v$, where $v$ is the number of nodes in $F(R)$; $v=0,1,2, \ldots$. The energy levels $E_{\mathrm{int}}$ [which are found from the condition that $P(R)=F(R) / R$ be quadratically integrable] depend on the quantum number $J$, which occurs in (13.19), and depend on $v$, which characterizes $F(R)$; $E_{\mathrm{int}}=E_{v, J}$. The angular factor $Y_{J}^{M}\left(\theta_{N}, \phi_{N}\right)$ in (13.16) is a function of the angular coordinates. Changes in $\theta_{N}$ and $\phi_{N}$ with $R$ held fixed correspond to changes in the spatial orientation of the diatomic molecule, which is rotational motion. The quantum numbers $J$ and $M$ are rotational quantum numbers. Note that $Y_{J}^{M}$ is the wave function of a rigid two-particle rotor [Eq. (6.46)]. A change in the $R$ coordinate is a change in the internuclear distance, which is a vibrational motion, and the quantum number $v$, which characterizes $F(R)$, is a vibrational quantum number.

Since accurate solution of the electronic Schrödinger equation [step (a)] is hard, one often uses simpler, less accurate procedures than that of steps (a) to (d). The simplest approach is to expand $U(R)$ in a Taylor series about $R_{e}$ (Prob. 4.1):

\(
\begin{align}
U(R)= & U\left(R_{e}\right)+U^{\prime}\left(R_{e}\right)\left(R-R_{e}\right)+\frac{1}{2} U^{\prime \prime}\left(R_{e}\right)\left(R-R_{e}\right)^{2} \\
& +\frac{1}{6} U^{\prime \prime \prime}\left(R_{e}\right)\left(R-R_{e}\right)^{3}+\cdots \tag{13.20}
\end{align}
\)

At the equilibrium internuclear distance $R_{e}$, the slope of the $U(R)$ curve is zero (Fig. 13.1), so $U^{\prime}\left(R_{e}\right)=0$. We can anticipate that the molecule will vibrate about the equilibrium distance $R_{e}$. For $R$ close to $R_{e}$, $\left(R-R_{e}\right)^{3}$ and higher powers will be small, and we shall neglect these terms. Defining the equilibrium force constant $k_{e}$ as $k_{e} \equiv U^{\prime \prime}\left(R_{e}\right)$, we have

\(
\begin{gather}
U(R) \approx U\left(R_{e}\right)+\frac{1}{2} k_{e}\left(R-R_{e}\right)^{2}=U\left(R_{e}\right)+\frac{1}{2} k_{e} x^{2} \tag{13.21}\\
k_{e} \equiv U^{\prime \prime}\left(R_{e}\right) \quad \text { and } \quad x \equiv R-R_{e}
\end{gather}
\)

We have approximated $U(R)$ by a parabola [Fig. 4.6 with $V \equiv U(R)-U\left(R_{e}\right)$ and $x \equiv R-R_{e}$].

With the change of independent variable $x \equiv R-R_{e}$, (13.19) becomes

\(
\begin{gather}
-\frac{\hbar^{2}}{2 \mu} S^{\prime \prime}(x)+\left[U\left(R_{e}\right)+\frac{1}{2} k_{e} x^{2}+\frac{J(J+1) \hbar^{2}}{2 \mu\left(x+R_{e}\right)^{2}}\right] S(x) \approx E_{\mathrm{int}} S(x) \tag{13.22}\\
\text { where } \quad S(x) \equiv F(R) \tag{13.23}
\end{gather}
\)

Expanding $1 /\left(x+R_{e}\right)^{2}$ in a Taylor series, we have (Prob. 13.7)

\(
\begin{equation}
\frac{1}{\left(x+R_{e}\right)^{2}}=\frac{1}{R_{e}^{2}\left(1+x / R_{e}\right)^{2}}=\frac{1}{R_{e}^{2}}\left(1-2 \frac{x}{R_{e}}+3 \frac{x^{2}}{R_{e}^{2}}-\cdots\right) \approx \frac{1}{R_{e}^{2}} \tag{13.24}
\end{equation}
\)

We are assuming that $R-R_{e}=x$ is small compared with $R_{e}$, so all terms after the 1 have been neglected in (13.24). Substitution of (13.24) into (13.22) and rearrangement gives

\(
\begin{equation}
-\frac{\hbar^{2}}{2 \mu} S^{\prime \prime}(x)+\frac{1}{2} k_{e} x^{2} S(x) \approx\left[E_{\mathrm{int}}-U\left(R_{e}\right)-\frac{J(J+1) \hbar^{2}}{2 \mu R_{e}^{2}}\right] S(x) \tag{13.25}
\end{equation}
\)

Equation (13.25) is the same as the Schrödinger equation for a one-dimensional harmonic oscillator with coordinate $x$, mass $\mu$, potential energy $\frac{1}{2} k_{e} x^{2}$, and energy eigenvalues $E_{\mathrm{int}}-U\left(R_{e}\right)-J(J+1) \hbar^{2} / 2 \mu R_{e}^{2}$. [The boundary conditions for (13.25) and (4.32) are not the same, but this difference is unimportant and can be ignored (Levine, Molecular Spectroscopy, p. 147).] We can therefore set the terms in brackets in (13.25) equal to the harmonic-oscillator eigenvalues, and we have

\(
\begin{gather}
E_{\mathrm{int}}-U\left(R_{e}\right)-J(J+1) \hbar^{2} / 2 \mu R_{e}^{2} \approx\left(v+\frac{1}{2}\right) h \nu_{e} \\
E_{\mathrm{int}} \approx U\left(R_{e}\right)+\left(v+\frac{1}{2}\right) h \nu_{e}+J(J+1) \hbar^{2} / 2 \mu R_{e}^{2} \tag{13.26}\\
\nu_{e}=\left(k_{e} / \mu\right)^{1 / 2} / 2 \pi, \quad v=0,1,2, \ldots \tag{13.27}
\end{gather}
\)

where (4.23) was used for $\nu_{e}$, the equilibrium (or harmonic) vibrational frequency. The molecular internal energy $E_{\mathrm{int}}$ is approximately the sum of the electronic energy $U\left(R_{e}\right) \equiv E_{\text {elec }}$ (which differs for different electronic states of the same molecule), the vibrational energy $\left(v+\frac{1}{2}\right) h \nu_{e}$, and the rotational energy $J(J+1) \hbar^{2} / 2 \mu R_{e}^{2}$. The approximations (13.21) and (13.24) correspond to a harmonic-oscillator, rigid-rotor approximation. From (13.26) and (13.14), the molecular energy $E=E_{\mathrm{tr}}+E_{\mathrm{int}}$ is approximately the sum of translational, rotational, vibrational, and electronic energies:

\(
E \approx E_{\mathrm{tr}}+E_{\mathrm{rot}}+E_{\mathrm{vib}}+E_{\mathrm{elec}}
\)

From (13.14), (13.16), (13.18), and (13.23), the nuclear-motion wave function is

\(
\begin{equation}
\psi_{N} \approx \psi_{N, \mathrm{tr}} S_{v}\left(R-R_{e}\right) R^{-1} Y_{J}^{M}\left(\theta_{N}, \phi_{N}\right) \tag{13.28}
\end{equation}
\)

where $S_{v}\left(R-R_{e}\right)$ is a harmonic-oscillator eigenfunction with quantum number $v$.
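To get a feel for the magnitudes in (13.26), here is a short calculation (ours) of the rigid-rotor and harmonic constants for ground-state $\mathrm{H}_{2}$, using the $\nu_{e}$ and $R_{e}$ values quoted in the example below:

```python
import numpy as np

# Rotational and vibrational constants of ground-state 1H2 from nu_e/c =
# 4403.2 cm^-1 and Re = 0.741 Angstrom (values quoted in the example below).
hbar, c = 1.054572e-34, 2.99792458e10    # J s, cm/s
h = 2.0*np.pi*hbar
mu = 1.6735e-27/2.0                      # reduced mass of 1H2, kg
Re = 0.741e-10                           # m
B = hbar**2/(2.0*mu*Re**2)               # rotational energy scale, J
print(B/(h*c))                           # ~60.9 cm^-1, much less than nu_e/c
nu_e = 4403.2*c                          # equilibrium vibrational frequency, s^-1
print(mu*(2.0*np.pi*nu_e)**2)            # k_e ~ 575 N/m from (13.27)
```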
The approximation (13.26) gives rather poor agreement with experimentally observed vibration-rotation energy levels of diatomic molecules. The accuracy can be improved by the addition of the first- and second-order perturbation-theory energy corrections due to the terms neglected in (13.21) and (13.24). When this is done (see Levine, Molecular Spectroscopy, Section 4.2), the energy contains additional terms corresponding to vibrational anharmonicity [Eq. (4.60)], vibration-rotation interaction, and rotational centrifugal distortion of the molecule (Section 6.4), where vibrational anharmonicity is the largest of these corrections and centrifugal distortion is the smallest.

EXAMPLE

An approximate representation of the potential-energy function of a diatomic molecule is the Morse function

\(
U(R)=U\left(R_{e}\right)+D_{e}\left[1-e^{-a\left(R-R_{e}\right)}\right]^{2}
\)

Use of $U^{\prime \prime}\left(R_{e}\right)=k_{e}$ [Eq. (4.59)] and (13.27) gives (see Prob. 4.29; the Morse functions in Prob. 4.29 and in this example differ because of different choices for the zero of energy)

\(
a=\left(k_{e} / 2 D_{e}\right)^{1 / 2}=2 \pi \nu_{e}\left(\mu / 2 D_{e}\right)^{1 / 2}
\)

Use the Morse function and the Numerov method (Section 4.4) to (a) find the lowest six vibrational energy levels of the ${ }^{1} \mathrm{H}_{2}$ molecule in its ground electronic state, which has $D_{e} / h c=38297 \mathrm{~cm}^{-1}$, $\nu_{e} / c=4403.2 \mathrm{~cm}^{-1}$, and $R_{e}=0.741 \AA$, where $h$ and $c$ are Planck's constant and the speed of light; (b) find $\langle R\rangle$ for each of these vibrational states.
(a) The vibrational energy levels correspond to states with the rotational quantum number $J=0$. Making the change of variables $x \equiv R-R_{e}$ and $S(x) \equiv F(R)$ [Eq. (13.23)] and substituting the Morse function into the nuclear-motion Schrödinger equation (13.19), we get for $J=0$

\(
-\left(\hbar^{2} / 2 \mu\right) S^{\prime \prime}(x)+D_{e}\left(1-e^{-a x}\right)^{2} S(x)=\left[E_{\mathrm{int}}-U\left(R_{e}\right)\right] S(x)=E_{\mathrm{vib}} S(x)
\)

since for $J=0$, $E_{\mathrm{int}}=E_{\text {elec }}+E_{\mathrm{vib}}=U\left(R_{e}\right)+E_{\mathrm{vib}}$ [Eq. (13.26)]. As usual in the Numerov method, we switch to the dimensionless reduced variables $E_{\mathrm{vib}, r} \equiv E_{\mathrm{vib}} / A$ and $x_{r} \equiv x / B$, where $A$ and $B$ are products of powers of the constants $\hbar$, $\mu$, and $a$. The procedure of Section 4.4 gives (Prob. 13.8a) $A=\hbar^{2} a^{2} / \mu$ and $B=a^{-1}$, so

\(
x_{r} \equiv x / B=a x, \quad E_{\mathrm{vib}, r} \equiv E_{\mathrm{vib}} / A=\mu E_{\mathrm{vib}} / \hbar^{2} a^{2}=\left(2 D_{e} / h^{2} \nu_{e}^{2}\right) E_{\mathrm{vib}}
\)

where we used $a=2 \pi \nu_{e}\left(\mu / 2 D_{e}\right)^{1 / 2}$. Substitution of

\(
\begin{gathered}
x=x_{r} / a, \quad E_{\mathrm{vib}}=\hbar^{2} a^{2} E_{\mathrm{vib}, r} / \mu, \quad D_{e, r}=D_{e} /\left(\hbar^{2} a^{2} / \mu\right), \quad S(x)=S_{r}\left(x_{r}\right) B^{-1 / 2} \\
S^{\prime \prime}=B^{-5 / 2} S_{r}^{\prime \prime}=B^{-1 / 2} B^{-2} S_{r}^{\prime \prime}=B^{-1 / 2} a^{2} S_{r}^{\prime \prime}
\end{gathered}
\)

[Eqs. (4.78) and (4.79)] into the differential equation for $S(x)$ gives

\(
S_{r}^{\prime \prime}\left(x_{r}\right)=\left[2 D_{e, r}\left(1-e^{-x_{r}}\right)^{2}-2 E_{\mathrm{vib}, r}\right] S_{r}\left(x_{r}\right) \equiv G_{r} S_{r}\left(x_{r}\right)
\)

This last equation has the form of (4.82) with $G_{r} \equiv 2 D_{e, r}\left(1-e^{-x_{r}}\right)^{2}-2 E_{\mathrm{vib}, r}$, so we are now ready to apply the Numerov procedure of Section 4.4. For the $\mathrm{H}_{2}$ ground electronic state, we find (Prob. 13.8b)

\(
\begin{gathered}
A=h^{2} \nu_{e}^{2} / 2 D_{e}=h^{2} c^{2}\left(4403.2 \mathrm{~cm}^{-1}\right)^{2} / 2 h c\left(38297 \mathrm{~cm}^{-1}\right)=\left(253.12_{9} \mathrm{~cm}^{-1}\right) h c \\
B=0.51412 \AA, \quad D_{e, r}=D_{e} / A=151.29_{4}
\end{gathered}
\)

We want to start and end the Numerov procedure in the classically forbidden regions. If we used the harmonic-oscillator approximation for the vibrational levels, the energies of the first six vibrational levels would be $\left(v+\frac{1}{2}\right) h \nu_{e}$, $v=0,1, \ldots, 5$. The reduced energy of the sixth harmonic-oscillator vibrational level would be $5.5 h \nu_{e} / A=5.5 h \nu_{e} /\left(253 \mathrm{~cm}^{-1}\right) h c=5.5\left(4403 \mathrm{~cm}^{-1}\right) /\left(253 \mathrm{~cm}^{-1}\right)=95.7$. Because of anharmonicity (Section 4.3), the sixth vibrational level will actually occur below 95.7, so we are safe in using 95.7 to find the limits of the classically allowed region. We have $D_{e, r}\left(1-e^{-x_{r}}\right)^{2}=95.7$, and with $D_{e, r}=151.29$, we find $x_{r}=-0.58$ and $x_{r}=1.58$ as the limits of the classically allowed region for a reduced energy of 95.7. Extending the range by 1.2 at each end, we would start the Numerov procedure at $x_{r}=-1.8$ and end at $x_{r}=2.8$. However, $x_{r}=\left(R-R_{e}\right) / B=(R-0.741 \AA) /(0.514 \AA)$ and the minimum possible internuclear distance $R$ is 0, so the minimum possible value of $x_{r}$ is $-1.44$. We therefore start at $x_{r}=-1.44$ and end at 2.8. If we take an interval of $s_{r}=0.04$, we will have about 106 points, which is adequate, but we will try for higher accuracy by taking $s_{r}=0.02$ to give about 212 points. With these choices, we set up the Numerov spreadsheet in the usual manner (or use Mathcad or the computer program of Table 4.1). We find (Prob. 13.9) the following lowest six $E_{\mathrm{vib}, r}$ values: 8.572525, 24.967566, 40.362582, 54.757570, 68.152531, 80.547472. Using $E_{\mathrm{vib}, r} \equiv E_{\mathrm{vib}} / A$, we find the lowest levels to be $E_{\mathrm{vib}} / h c=2169.95$, 6320.01, 10216.94, 13860.73, 17251.38, $20388.90 \mathrm{~cm}^{-1}$. Note the reduced spacings between levels as the vibrational quantum number increases. (For comparison, the harmonic-oscillator approximation gives the following values: 2201.6, 6604.8, 11008.0, 15411.2, 19814.4, $24217.6 \mathrm{~cm}^{-1}$.)

It happens that the Schrödinger equation for the Morse function can be solved analytically (essentially exactly), and the analytic solution (Prob. 13.11) gives the following lowest eigenvalues: 2169.96, 6320.03, 10216.97, 13860.78, 17251.47, $20389.02 \mathrm{~cm}^{-1}$. Agreement between the Numerov Morse-function values and the analytic Morse-function values is very good. The experimentally observed lowest vibrational levels of $\mathrm{H}_{2}$ are 2170.08, 6331.22, 10257.19, 13952.43, 17420.44, $20662.00 \mathrm{~cm}^{-1}$. The deviations of the Morse-function values from the experimental values indicate that the Morse function is not a very accurate representation of the ground-state $\mathrm{H}_{2}$ $U(R)$ function.

FIGURE 13.2 The $v=5$ Morse vibrational wave function for $\mathrm{H}_{2}$ as found by the Numerov method.

(b) We have $\left\langle x_{r}\right\rangle \approx \int_{-1.44}^{2.8} x_{r}\left|S_{r}\right|^{2} d x_{r} \approx \sum_{x_{r}=-1.44}^{2.8} x_{r}\left|S_{r}\right|^{2} s_{r}$, where $s_{r}=0.02$ is the interval spacing (not to be confused with the vibrational wave function $S_{r}$), and where the vibrational wave function $S_{r}$ must be normalized. (See also Prob. 13.12.) We normalize $S_{r}$ as described in Section 4.4, and then create a column of $x_{r}\left|S_{r}\right|^{2} s_{r}$ values. Next we sum these values to find the following results for the six lowest vibrational states (Prob. 13.9b): $\left\langle x_{r}\right\rangle=0.0440, 0.1365, 0.2360, 0.3435, 0.4605, 0.5884$. Using $x_{r}=\left(R-R_{e}\right) / B$, we find the following values: $\langle R\rangle=0.763, 0.811, 0.862, 0.918, 0.978, 1.044 \AA$. (To get accurate $\left\langle x_{r}\right\rangle$ values, $E_{\mathrm{vib}, r}$ must be found to many more decimal places than given in (a), enough places to make the wave function close to zero at 2.8. If the spreadsheet does not allow you to enter enough decimal places to do this for $v=0$, you can take the right-hand limit as 2.5 instead of 2.8.) Because of vibrational anharmonicity, the molecule gets larger as the vibrational quantum number increases. This effect is rather large for light atoms such as hydrogen. The $v=5$ Numerov-Morse vibrational wave function (Fig. 13.2) shows marked asymmetry about the origin ($x_{r}=0$, which corresponds to $R=R_{e}$). For a spectacular example of the effect of anharmonicity on bond length, see the discussion of $\mathrm{He}_{2}$ (the world's largest diatomic molecule) near the end of Section 13.7.
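The part (a) calculation is easy to reproduce without a spreadsheet. The following is a minimal Python sketch (our own implementation, not the program of Table 4.1): it integrates $S_{r}^{\prime \prime}=G_{r} S_{r}$ with the Numerov recurrence from $x_{r}=-1.44$ to 2.8 in steps of 0.02 and locates each eigenvalue by bisecting on the sign of the solution at the right endpoint.

```python
import numpy as np

# Numerov shooting for the reduced Morse equation of part (a):
# S'' = G S, G = 2 D_er (1 - exp(-x_r))^2 - 2 E_r, D_er = 151.294 for H2.
D_er = 151.294
x = np.arange(-1.44, 2.80 + 1e-9, 0.02)    # the grid used in the text
s2 = 0.02**2

def S_end(E_r):
    """Integrate from the left; at an eigenvalue the value of S at the
    right edge passes through zero, so its sign flips there."""
    G = 2.0*D_er*(1.0 - np.exp(-x))**2 - 2.0*E_r
    f = 1.0 - s2*G/12.0
    S = np.zeros_like(x)
    S[1] = 1.0e-4                          # tiny start in the forbidden region
    for n in range(1, len(x) - 1):
        S[n+1] = ((12.0 - 10.0*f[n])*S[n] - f[n-1]*S[n-1]) / f[n+1]
    return S[-1]

# Scan E_r in unit steps (the levels are several units apart), bracket each
# sign change of S_end, then refine by bisection.
E_grid = np.arange(1.0, 95.0, 1.0)
ends = [S_end(E) for E in E_grid]
levels = []
for Ea, Eb, va, vb in zip(E_grid, E_grid[1:], ends, ends[1:]):
    if va*vb < 0.0:
        for _ in range(50):
            Em = 0.5*(Ea + Eb)
            if S_end(Ea)*S_end(Em) < 0.0:
                Eb = Em
            else:
                Ea = Em
        levels.append(0.5*(Ea + Eb))

A = 253.129                                # cm^-1 (times hc), from the text
print([round(E*A, 2) for E in levels[:6]])
# approximately [2169.95, 6320.01, 10216.94, 13860.73, 17251.38, 20388.90]
```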


Most quantum chemists report the results of their calculations using atomic units.
The hydrogen-atom Hamiltonian operator (assuming infinite nuclear mass) in SI units is $-\left(\hbar^{2} / 2 m_{e}\right) \nabla^{2}-e^{2} / 4 \pi \varepsilon_{0} r$. The system of atomic units is defined as follows. The units of mass, charge, and angular momentum are defined as the electron's mass $m_{e}$, the proton's charge $e$, and $\hbar$, respectively (rather than the kilogram, the coulomb, and the $\mathrm{kg} \mathrm{m}^{2} / \mathrm{s}$); the unit of permittivity is $4 \pi \varepsilon_{0}$, rather than the $\mathrm{C}^{2} \mathrm{~N}^{-1} \mathrm{~m}^{-2}$. (The atomic unit of mass used in quantum chemistry should not be confused with the quantity 1 amu, which is one-twelfth the mass of a ${ }^{12} \mathrm{C}$ atom.) When we switch to atomic units, $\hbar$, $m_{e}$, $e$, and $4 \pi \varepsilon_{0}$ each have a numerical value of 1. Hence, to change a formula from SI units to atomic units, we simply set each of these quantities equal to 1. Thus, in atomic units, the H-atom Hamiltonian is $-\frac{1}{2} \nabla^{2}-1 / r$, where $r$ is now measured in atomic units of length rather than in meters. The ground-state energy of the hydrogen atom is given by (6.94) as $-\frac{1}{2}\left(e^{2} / 4 \pi \varepsilon_{0} a_{0}\right)$. Since [Eq. (6.106)] $a_{0}=4 \pi \varepsilon_{0} \hbar^{2} / m_{e} e^{2}$, the numerical value of $a_{0}$ (the Bohr radius) in atomic units is 1, and the ground-state energy of the hydrogen atom has the numerical value (neglecting nuclear motion) $-\frac{1}{2}$ in atomic units.

The atomic unit of energy, $e^{2} / 4 \pi \varepsilon_{0} a_{0}$, is called the hartree (symbol $E_{\mathrm{h}}$):

\(
\begin{equation}
1 \text { hartree } \equiv E_{\mathrm{h}} \equiv \frac{e^{2}}{4 \pi \varepsilon_{0} a_{0}}=\frac{m_{e} e^{4}}{\left(4 \pi \varepsilon_{0}\right)^{2} \hbar^{2}}=27.211385 \mathrm{eV}=4.359744 \times 10^{-18} \mathrm{~J} \tag{13.29}
\end{equation}
\)

The ground-state energy of the hydrogen atom is $-\frac{1}{2}$ hartree if nuclear motion is neglected. The atomic unit of length is called the bohr:

\(
\begin{equation}
1 \text { bohr } \equiv a_{0} \equiv 4 \pi \varepsilon_{0} \hbar^{2} / m_{e} e^{2}=0.52917721 \AA \tag{13.30}
\end{equation}
\)

To find the atomic unit of any other quantity (for example, time) one combines $\hbar$, $m_{e}$, $e$, and $4 \pi \varepsilon_{0}$ so as to produce a quantity having the desired dimensions. One finds (Prob. 13.14) the atomic unit of time to be $\hbar / E_{\mathrm{h}}=2.4188843 \times 10^{-17} \mathrm{~s}$ and the atomic unit of electric dipole moment to be $e a_{0}=8.478353 \times 10^{-30} \mathrm{C} \mathrm{m}$.
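A quick numerical check (ours) of these values directly from the SI constants:

```python
import numpy as np

# Atomic-unit values recomputed from SI constants (CODATA values assumed).
hbar = 1.054571817e-34       # J s
m_e  = 9.1093837015e-31      # kg
e    = 1.602176634e-19       # C
eps0 = 8.8541878128e-12      # C^2 N^-1 m^-2

a0 = 4.0*np.pi*eps0*hbar**2/(m_e*e**2)   # bohr, in meters
Eh = e**2/(4.0*np.pi*eps0*a0)            # hartree, in joules
print(a0*1e10)               # ~0.52917721 Angstrom
print(Eh, Eh/e)              # ~4.359744e-18 J, ~27.211385 eV
print(hbar/Eh)               # atomic unit of time, ~2.4188843e-17 s
print(e*a0)                  # atomic unit of dipole moment, ~8.478353e-30 C m
```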

Atomic units will be used in Chapters 13 to 17.
A more rigorous way to define atomic units is as follows. Starting with the H-atom electronic Schrödinger equation in SI units, we define (as in Section 4.4) the dimensionless reduced variables $E_{r} \equiv E / A$ and $r_{r} \equiv r / B$, where $A$ and $B$ are products of powers of the Schrödinger-equation constants $\hbar$, $m_{e}$, $e$, and $4 \pi \varepsilon_{0}$ such that $A$ and $B$ have dimensions of energy and length, respectively. The procedure of Section 4.4 shows that (Prob. 13.13) $A=m_{e} e^{4} /\left(4 \pi \varepsilon_{0}\right)^{2} \hbar^{2}=e^{2} / 4 \pi \varepsilon_{0} a_{0} \equiv 1$ hartree and $B=4 \pi \varepsilon_{0} \hbar^{2} / m_{e} e^{2}=a_{0} \equiv 1$ bohr. For this three-dimensional problem, the H-atom wave function has dimensions of $\mathrm{L}^{-3 / 2}$, so the reduced dimensionless $\psi_{r}$ is defined as $\psi_{r} \equiv \psi B^{3 / 2}$. Also,

\(
\frac{\partial^{2} \psi_{r}}{\partial r_{r}^{2}}=B^{3 / 2} \frac{\partial^{2} \psi}{\partial r^{2}}\left(\frac{\partial r}{\partial r_{r}}\right)^{2}=B^{3 / 2} \frac{\partial^{2} \psi}{\partial r^{2}} B^{2}=B^{3 / 2} \frac{\partial^{2} \psi}{\partial r^{2}} a_{0}^{2}
\)

Introducing the reduced quantities into the Schrödinger equation, we find (Prob. 13.13) that the reduced H-atom Schrödinger equation is $-\frac{1}{2} \nabla_{r}^{2} \psi_{r}-\left(1 / r_{r}\right) \psi_{r}=E_{r} \psi_{r}$, where $\nabla_{r}^{2}$ is given by Eq. (6.6) with $r$ replaced by $r_{r}$. In practice, people do not bother to include the $r$ subscripts and instead write $-\frac{1}{2} \nabla^{2} \psi-(1 / r) \psi=E \psi$.


We now begin the study of the electronic energies of molecules. We shall use the Born-Oppenheimer approximation, keeping the nuclei fixed while we solve, as best we can, the Schrödinger equation for the motion of the electrons. We shall usually be considering an isolated molecule, ignoring intermolecular interactions. Our results will be most applicable to molecules in the gas phase at low pressure. For inclusion of solvent effects, see Sections 15.17 and 17.6.

We start with diatomic molecules, the simplest of which is $\mathrm{H}_{2}^{+}$, the hydrogen molecule ion, consisting of two protons and one electron. Just as the one-electron H atom serves as a starting point in the discussion of many-electron atoms, the one-electron $\mathrm{H}_{2}^{+}$ ion furnishes many ideas useful for discussing many-electron diatomic molecules. The electronic Schrödinger equation for $\mathrm{H}_{2}^{+}$ is separable, and we can get exact solutions for the eigenfunctions and eigenvalues.

Figure 13.3 shows $\mathrm{H}_{2}^{+}$. The nuclei are at $a$ and $b$; $R$ is the internuclear distance; $r_{a}$ and $r_{b}$ are the distances from the electron to nuclei $a$ and $b$. Since the nuclei are fixed, we have a one-particle problem whose purely electronic Hamiltonian is [Eq. (13.5)]

\(
\begin{equation}
\hat{H}_{\mathrm{el}}=-\frac{\hbar^{2}}{2 m_{e}} \nabla^{2}-\frac{e^{2}}{4 \pi \varepsilon_{0} r_{a}}-\frac{e^{2}}{4 \pi \varepsilon_{0} r_{b}} \tag{13.31}
\end{equation}
\)

FIGURE 13.3 Interparticle distances in $\mathrm{H}_{2}^{+}$.

The first term is the electronic kinetic-energy operator; the second and third terms are the attractions between the electron and the nuclei. In atomic units the purely electronic Hamiltonian for $\mathrm{H}_{2}^{+}$ is

\(
\begin{equation}
\hat{H}_{\mathrm{el}}=-\frac{1}{2} \nabla^{2}-\frac{1}{r_{a}}-\frac{1}{r_{b}} \tag{13.32}
\end{equation}
\)

In Fig. 13.3 the coordinate origin is on the internuclear axis, midway between the nuclei, with the $z$ axis lying along the internuclear axis. The $\mathrm{H}_{2}^{+}$electronic Schrödinger equation is not separable in spherical coordinates. However, separation of variables is possible using confocal elliptic coordinates $\xi$, $\eta$, and $\phi$. The coordinate $\phi$ is the angle of rotation of the electron about the internuclear $(z)$ axis, the same as in spherical coordinates. The coordinates $\xi$ (xi) and $\eta$ (eta) are defined by

\(
\begin{equation}
\xi \equiv \frac{r_{a}+r_{b}}{R}, \quad \eta \equiv \frac{r_{a}-r_{b}}{R} \tag{13.33}
\end{equation}
\)

The ranges of these coordinates are

\(
\begin{equation}
0 \leq \phi \leq 2 \pi, \quad 1 \leq \xi \leq \infty, \quad-1 \leq \eta \leq 1 \tag{13.34}
\end{equation}
\)

We must put the Hamiltonian (13.32) into these coordinates. We have

\(
\begin{equation}
r_{a}=\frac{1}{2} R(\xi+\eta), \quad r_{b}=\frac{1}{2} R(\xi-\eta) \tag{13.35}
\end{equation}
\)
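The relations (13.33) and (13.35) are inverses of each other; a minimal numerical round-trip check (illustrative only, with arbitrarily chosen test values):

```python
# Round-trip check of Eqs. (13.33) and (13.35); any xi >= 1 and
# -1 <= eta <= 1 will do.
R = 2.0
xi, eta = 1.7, -0.3

r_a = 0.5 * R * (xi + eta)   # Eq. (13.35)
r_b = 0.5 * R * (xi - eta)

assert abs((r_a + r_b) / R - xi) < 1e-12    # recovers Eq. (13.33)
assert abs((r_a - r_b) / R - eta) < 1e-12
```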

We also need the expression for the Laplacian in confocal elliptic coordinates. One way to find this is to express $\xi, \eta$, and $\phi$ in terms of $x, y$, and $z$, the Cartesian coordinates of the electron, and then use the chain rule to find $\partial / \partial x, \partial / \partial y$, and $\partial / \partial z$ in terms of $\partial / \partial \xi, \partial / \partial \eta$, and $\partial / \partial \phi$. We then form $\nabla^{2} \equiv \partial^{2} / \partial x^{2}+\partial^{2} / \partial y^{2}+\partial^{2} / \partial z^{2}$. The derivation of $\nabla^{2}$ is omitted. (For a discussion, see Margenau and Murphy, Chapter 5.) Substitution of $\nabla^{2}$ and (13.35) into (13.32) gives $\hat{H}_{\mathrm{el}}$ of $\mathrm{H}_{2}^{+}$ in confocal elliptic coordinates. The result is omitted.

For the hydrogen atom, whose Hamiltonian has spherical symmetry, the electronic angular-momentum operators $\hat{L}^{2}$ and $\hat{L}_{z}$ both commute with $\hat{H}$. The $\mathrm{H}_{2}^{+}$ ion does not have spherical symmetry, and one finds that $\left[\hat{L}^{2}, \hat{H}_{\mathrm{el}}\right] \neq 0$ for $\mathrm{H}_{2}^{+}$. However, $\mathrm{H}_{2}^{+}$ does have axial symmetry, and one can show that $\hat{L}_{z}$ commutes with $\hat{H}_{\mathrm{el}}$ of $\mathrm{H}_{2}^{+}$. Therefore, the electronic wave functions can be chosen to be eigenfunctions of $\hat{L}_{z}$. The eigenfunctions of $\hat{L}_{z}$ are [Eq. (5.81)]

\(
\begin{equation}
\text { constant } \cdot(2 \pi)^{-1 / 2} e^{i m \phi}, \quad \text { where } m=0, \pm 1, \pm 2, \pm 3, \ldots \tag{13.36}
\end{equation}
\)

The $z$ component of electronic orbital angular momentum in $\mathrm{H}_{2}^{+}$ is $m \hbar$ or $m$ in atomic units. The total electronic orbital angular momentum is not a constant for $\mathrm{H}_{2}^{+}$.

The "constant" in (13.36) is a constant only as far as $\partial / \partial \phi$ is concerned, so the $\mathrm{H}{2}^{+}$ wave functions have the form $\psi{\mathrm{el}}=F(\xi, \eta)(2 \pi)^{-1 / 2} e^{\text {im } \phi}$. One now tries a separation of variables:

\(
\begin{equation}
\psi_{\mathrm{el}}=L(\xi) M(\eta)(2 \pi)^{-1 / 2} e^{i m \phi} \tag{13.37}
\end{equation}
\)

Substitution of (13.37) into $\hat{H}_{\mathrm{el}} \psi_{\mathrm{el}}=E_{\mathrm{el}} \psi_{\mathrm{el}}$ gives an equation in which the variables are separable. One gets two ordinary differential equations, one for $L(\xi)$ and one for $M(\eta)$. Solving these equations, one finds that the condition that $\psi_{\mathrm{el}}$ be well-behaved requires that, for each fixed value of $R$, only certain values of $E_{\mathrm{el}}$ are allowed. This gives a set of different electronic states. There is no algebraic formula for $E_{\mathrm{el}}$; it must be calculated numerically for each desired value of $R$ for each state. In addition to the quantum number $m$, the $\mathrm{H}_{2}^{+}$ electronic wave functions are characterized by the quantum numbers $n_{\xi}$ and $n_{\eta}$, which give the number of nodes in the $L(\xi)$ and $M(\eta)$ factors in $\psi_{\mathrm{el}}$.

For the ground electronic state, the quantum number $m$ is zero. At $R=\infty$, the $\mathrm{H}_{2}^{+}$ ground state is dissociated into a proton and a ground-state hydrogen atom; hence $E_{\mathrm{el}}(\infty)=-\frac{1}{2}$ hartree. At $R=0$, the two protons have come together to form the $\mathrm{He}^{+}$ ion with ground-state energy $-\frac{1}{2}(2)^{2}$ hartrees $=-2$ hartrees. Addition of the internuclear repulsion $1 / R$ (in atomic units) to $E_{\mathrm{el}}(R)$ gives the $U(R)$ potential-energy curve for nuclear motion. Plots of the ground-state $E_{\mathrm{el}}(R)$ and $U(R)$, as found from solution of the electronic Schrödinger equation, are shown in Fig. 13.4. At $R=\infty$ the internuclear repulsion is 0, and $U$ is $-\frac{1}{2}$ hartree.

The $U(R)$ curve is found to have a minimum at $R_{e}=1.9972$ bohrs $=1.057 \AA$, indicating that the $\mathrm{H}_{2}^{+}$ ground electronic state is a stable bound state. The calculated value of $E_{\mathrm{el}}$ at 1.9972 bohrs is -1.1033 hartrees. Addition of the internuclear repulsion $1 / R$ gives $U\left(R_{e}\right)=-0.6026$ hartree, compared with -0.5000 hartree at $R=\infty$. The ground-state binding energy is thus $D_{e}=0.1026$ hartree $=2.79 \mathrm{eV}$. This corresponds to $64.4 \mathrm{kcal} / \mathrm{mol}=269 \mathrm{~kJ} / \mathrm{mol}$. The binding energy is only $17 \%$ of the total energy at the equilibrium internuclear distance. Thus a small error in the total energy can correspond to a large error in the binding energy. For heavier molecules the situation is even worse, since chemical binding energies are of the same order of magnitude for most diatomic molecules, but the total electronic energy increases markedly for heavier molecules.
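The arithmetic behind these quoted numbers is easy to verify; a short sketch (not from the text; the conversion factors are the standard ones, 1 hartree = 27.211385 eV = 2625.50 kJ/mol):

```python
# Worked arithmetic for the binding energy of the H2+ ground state.
U_Re, U_inf = -0.6026, -0.5000   # hartrees
D_e = U_inf - U_Re               # well depth, hartrees

print(D_e)                # 0.1026 hartree
print(D_e * 27.211385)    # ~2.79 eV
print(D_e * 2625.50)      # ~269 kJ/mol (= 64.4 kcal/mol)
print(D_e / abs(U_Re))    # ~0.17, i.e. 17% of the total energy at R_e
```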

FIGURE 13.4 Electronic energy with ($U$) and without ($E_{\mathrm{el}}$) internuclear repulsion for the $\mathrm{H}_{2}^{+}$ ground electronic state.

FIGURE 13.5 $U(R)$ curves for several $\mathrm{H}_{2}^{+}$ electronic states. Dashed lines are used to help distinguish closely spaced states. Not visible on the scale of this diagram is a slight minimum in the curve of the first excited electronic state at 12.5 bohrs with a well depth of 0.00006 hartrees.

Note that the single electron in $\mathrm{H}_{2}^{+}$ is sufficient to give a stable bound state.
Figure 13.5 shows the $U(R)$ curves for the first several electronic energy levels of $\mathrm{H}_{2}^{+}$, as found by solving the electronic Schrödinger equation.

The angle $\phi$ occurs in $\hat{H}_{\mathrm{el}}$ of $\mathrm{H}_{2}^{+}$ only as $\partial^{2} / \partial \phi^{2}$. When $\psi_{\mathrm{el}}$ of (13.37) is substituted into $\hat{H}_{\mathrm{el}} \psi_{\mathrm{el}}=E_{\mathrm{el}} \psi_{\mathrm{el}}$, the $e^{i m \phi}$ factor cancels, and we are led to differential equations for $L(\xi)$ and $M(\eta)$ in which the $m$ quantum number occurs only as $m^{2}$. Since $E_{\mathrm{el}}$ is found from the $L(\xi)$ and $M(\eta)$ differential equations, $E_{\mathrm{el}}$ depends on $m^{2}$, and each electronic level with $m \neq 0$ is doubly degenerate, corresponding to states with quantum numbers $+|m|$ and $-|m|$. In the standard notation for diatomic molecules [F. A. Jenkins, J. Opt. Soc. Am., 43, 425 (1953)], the absolute value of $m$ is called $\lambda$:

\(
\begin{equation}
\lambda \equiv|m| \tag{13.38}
\end{equation}
\)

(Some texts define $\lambda$ as identical to $m$.) Similar to the $s, p, d, f, g$ notation for hydrogen-atom states, a letter code is used to specify $\lambda$, the absolute value (in atomic units) of the component along the molecular axis of the electron's orbital angular momentum:

| $\lambda$ | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| letter | $\sigma$ | $\pi$ | $\delta$ | $\phi$ | $\gamma$ |

Thus the lowest $\mathrm{H}_{2}^{+}$ electronic state is a $\sigma$ state.
Besides classifying the states of $\mathrm{H}_{2}^{+}$ according to $\lambda$, we can also classify them according to their parity (Section 7.5). From Fig. 13.12, inversion of the electron's coordinates through the origin $O$ changes $\phi$ to $\phi+\pi$, $r_{a}$ to $r_{b}$, and $r_{b}$ to $r_{a}$. This leaves the potential-energy part of the electronic Hamiltonian (13.31) unchanged. We previously showed the kinetic-energy operator to be invariant under inversion. Hence the parity operator commutes with the Hamiltonian (13.31), and the $\mathrm{H}_{2}^{+}$ electronic wave functions can be classified as either even or odd. For even electronic wave functions, we use the subscript $g$ (from the German word gerade, meaning even); for odd wave functions, we use $u$ (from ungerade).

The lowest $\sigma_{g}$ energy level in Fig. 13.5 is labeled $1 \sigma_{g}$, the next-lowest $\sigma_{g}$ level at small $R$ is labeled $2 \sigma_{g}$, and so on. The lowest $\sigma_{u}$ level is labeled $1 \sigma_{u}$, and so on. The alternative notation $\sigma_{g} 1 s$ indicates that this level dissociates to a $1 s$ hydrogen atom. The meaning of the star in $\sigma_{u}^{*} 1 s$ will be explained later.

For completeness, we must take spin into account by multiplying each spatial $\mathrm{H}_{2}^{+}$ electronic wave function by $\alpha$ or $\beta$, depending on whether the component of electron spin along the internuclear axis is $+\frac{1}{2}$ or $-\frac{1}{2}$ (in atomic units). Inclusion of spin doubles the degeneracy of all levels.

The $\mathrm{H}_{2}^{+}$ ground electronic state has $R_{e}=2.00$ bohrs $=2.00\left(4 \pi \varepsilon_{0} \hbar^{2} / m_{e} e^{2}\right)$. The negative muon (symbol $\mu^{-}$) is a short-lived (half-life $2 \times 10^{-6}$ s) elementary particle whose charge is the same as that of an electron but whose mass $m_{\mu}$ is 207 times $m_{e}$. When a beam of negative muons (produced when ions accelerated to high speed collide with ordinary matter) enters $\mathrm{H}_{2}$ gas, muomolecular ions that consist of two protons and one muon are formed. This species, symbolized by $(\mathrm{p} \mu \mathrm{p})^{+}$, is an $\mathrm{H}_{2}^{+}$ ion in which the electron has been replaced by a muon. Its $R_{e}$ is found by replacing $m_{e}$ with $m_{\mu}$ in $R_{e}$ :

\(
2.00\left(4 \pi \varepsilon_{0} \hbar^{2} / m_{\mu} e^{2}\right)=2.00\left(4 \pi \varepsilon_{0} \hbar^{2} / 207 m_{e} e^{2}\right)=(2.00 / 207) \text { bohr }=0.0051 \AA
\)

The two nuclei in this muomolecular ion are 207 times closer than in $\mathrm{H}_{2}^{+}$. The magnitude of the vibrational-wave-function factor $S_{v}\left(R-R_{e}\right)$ in (13.28) is small but not entirely negligible for $R-R_{e}=-0.0051 \AA$, so there is some probability for the nuclei in $(\mathrm{p} \mu \mathrm{p})^{+}$ to come in contact, and nuclear fusion might occur. The isotopic nuclei ${ }^{2} \mathrm{H}$ (deuterium, D) and ${ }^{3} \mathrm{H}$ (tritium, T) undergo fusion much more readily than protons, so instead of $\mathrm{H}_{2}$ gas, one uses a mixture of $\mathrm{D}_{2}$ and $\mathrm{T}_{2}$ gases. After fusion occurs, the muon is released and can then be recaptured to catalyze another fusion. Under the right conditions, one muon can catalyze 150 fusions on average before it decays. Unfortunately, at present, more energy is needed to produce the muon beam than is released by the fusion. (See en.wikipedia.org/wiki/Muon-catalyzed_fusion.)
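A quick check of the scaling in the displayed equation above (illustrative only):

```python
# Replacing m_e by m_mu = 207 m_e shrinks the effective "bohr" by 207,
# so R_e of (p mu p)+ is the H2+ value divided by 207.
bohr_in_angstrom = 0.52917721
print(2.00 / 207 * bohr_in_angstrom)   # ~0.0051 angstrom
```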

In the rest of this chapter, the subscript el will be dropped from the electronic wave function, Hamiltonian, and energy. It will be understood in Chapters 13 to 17 that $\psi$ means $\psi_{\mathrm{el}}$.


For a many-electron atom, the self-consistent-field (SCF) method is used to construct an approximate wave function as a Slater determinant of (one-electron) spin-orbitals. The one-electron spatial part of a spin-orbital is an atomic orbital (AO). We took each AO as a product of a spherical harmonic and a radial factor. As an initial approximation to the radial factors, we can use hydrogenlike radial functions with effective nuclear charges.

For many-electron molecules, which (unlike $\mathrm{H}_{2}^{+}$) cannot be solved exactly, we want to use many of the ideas of the SCF treatment of atoms. We shall write an approximate molecular electronic wave function as a Slater determinant of (one-electron) spin-orbitals. The one-electron spatial part of a molecular spin-orbital is a molecular orbital (MO).

Because of the Pauli principle, each MO can hold no more than two electrons, just as for AOs. What kind of functions do we use for the MOs? Ideally, the analytic form of each MO is found by an SCF calculation (Section 14.3). In this section, we seek simple approximations for the MOs that will enable us to gain some qualitative understanding of chemical bonding. Just as we took the angular part of each AO to be the same kind of function (a spherical harmonic) as in the one-electron hydrogenlike atom, we shall take the angular part of each diatomic MO to be $(2 \pi)^{-1 / 2} e^{i m \phi}$, as in $\mathrm{H}_{2}^{+}$. However, the $\xi$ and $\eta$ factors in the $\mathrm{H}_{2}^{+}$ wave functions are complicated functions not readily usable in MO calculations. We therefore seek simpler functions that will provide reasonably accurate approximations to the $\mathrm{H}_{2}^{+}$ wave functions and that can be used to construct molecular orbitals for many-electron diatomic molecules. With this discussion as motivation for looking at approximate solutions in a case where the Schrödinger equation is exactly solvable, we consider approximate treatments of $\mathrm{H}_{2}^{+}$.

We shall use the variation method, writing down some function containing several parameters and varying them to minimize the variational integral. This will give an approximation to the ground-state wave function and an upper bound to the ground-state energy. By use of the factor $e^{i m \phi}$ in the trial function, we can get an upper bound to the energy of the lowest $\mathrm{H}_{2}^{+}$level for any given value of $m$ (see Section 8.2). By using linear variation functions, we can get approximations for excited states.

The $\mathrm{H}_{2}^{+}$ground state has $m=0$, and the wave function depends only on $\xi$ and $\eta$. We could try any well-behaved function of these coordinates as a trial variation function. We shall, however, use a more systematic approach based on the idea of a molecule as being formed from the interaction of atoms.

Consider what the $\mathrm{H}_{2}^{+}$ wave function would look like for large values of the internuclear separation $R$. When the electron is near nucleus $a$, nucleus $b$ is so far away that we essentially have a hydrogen atom with origin at $a$. Thus, when $r_{a}$ is small, the ground-state $\mathrm{H}_{2}^{+}$ electronic wave function should resemble the ground-state hydrogen-atom wave function of Eq. (6.104). We have $Z=1$, and the Bohr radius $a_{0}$ has the numerical value 1 in atomic units; hence (6.104) becomes

\(
\begin{equation}
\pi^{-1 / 2} e^{-r_{a}} \tag{13.39}
\end{equation}
\)

Similarly, we conclude that when the electron is near nucleus $b$, the $\mathrm{H}_{2}^{+}$ground-state wave function will be approximated by

\(
\begin{equation}
\pi^{-1 / 2} e^{-r_{b}} \tag{13.40}
\end{equation}
\)

This suggests that we try as a variation function

\(
\begin{equation}
c_{1} \pi^{-1 / 2} e^{-r_{a}}+c_{2} \pi^{-1 / 2} e^{-r_{b}} \tag{13.41}
\end{equation}
\)

where $c_{1}$ and $c_{2}$ are variational parameters. When the electron is near nucleus $a$, the variable $r_{a}$ is small and $r_{b}$ is large, and the first term in (13.41) predominates, giving a function resembling (13.39). The function (13.41) is a linear variation function, and we are led to solve a secular equation, which has the form (8.56), where the subscripts 1 and 2 refer to the functions (13.39) and (13.40).

We can also approach the problem using perturbation theory. We take the unperturbed system as the $\mathrm{H}_{2}^{+}$ molecule with $R=\infty$. For $R=\infty$, the electron can be bound to nucleus $a$ with wave function (13.39), or it can be bound to nucleus $b$ with wave function (13.40). In either case the energy is $-\frac{1}{2}$ hartree, and we have a doubly degenerate unperturbed energy level. Bringing the nuclei in from infinity gives rise to a perturbation that splits the doubly degenerate unperturbed level into two levels. This is illustrated by the $U(R)$ curves for the two lowest $\mathrm{H}_{2}^{+}$ electronic states, which both dissociate to a ground-state hydrogen atom (see Fig. 13.5). The correct zeroth-order wave functions for the perturbed levels are linear
combinations of the form (13.41), and we are led to a secular equation of the form (8.56), with $W$ replaced by $E^{(0)}+E^{(1)}$ (see Prob. 9.20).

Before solving (8.56), let us improve the trial function (13.41). Consider the limiting behavior of the $\mathrm{H}_{2}^{+}$ground-state electronic wave function as $R$ goes to zero. In this limit we get the $\mathrm{He}^{+}$ion, which has the ground-state wave function [put $Z=2$ in (6.104)]

\(
\begin{equation}
2^{3 / 2} \pi^{-1 / 2} e^{-2 r} \tag{13.42}
\end{equation}
\)

From Fig. 13.3 we see that as $R$ goes to zero, both $r_{a}$ and $r_{b}$ go to $r$. Hence as $R$ goes to zero, the trial function (13.41) goes to $\left(c_{1}+c_{2}\right) \pi^{-1 / 2} e^{-r}$. Comparing with (13.42), we see that our trial function has the wrong limiting behavior at $R=0$; it should go to $e^{-2 r}$, not $e^{-r}$. We can fix things by multiplying $r_{a}$ and $r_{b}$ in the exponentials by a variational parameter $k$, which will be some function of $R$; $k=k(R)$. For the correct limiting behavior at $R=0$ and at $R=\infty$, we have $k(0)=2$ and $k(\infty)=1$ for the $\mathrm{H}_{2}^{+}$ ground electronic state. Physically, $k$ is some sort of effective nuclear charge, which increases as the nuclei come together. We thus take the trial function as

\(
\begin{equation}
\phi=c_{a} 1 s_{a}+c_{b} 1 s_{b} \tag{13.43}
\end{equation}
\)

where the $c$ 's are variational parameters and

\(
\begin{equation}
1 s_{a}=k^{3 / 2} \pi^{-1 / 2} e^{-k r_{a}}, \quad 1 s_{b}=k^{3 / 2} \pi^{-1 / 2} e^{-k r_{b}} \tag{13.44}
\end{equation}
\)

The factor $k^{3 / 2}$ normalizes $1 s_{a}$ and $1 s_{b}$ [see Eq. (6.104)]. The molecular-orbital function (13.43) is a linear combination of atomic orbitals, an LCAO-MO. The trial function (13.43) was first used by Finkelstein and Horowitz in 1928.

For the function (13.43), the secular equation (8.56) is

\(
\left|\begin{array}{ll}
H_{aa}-W S_{aa} & H_{ab}-W S_{ab} \\
H_{ba}-W S_{ba} & H_{bb}-W S_{bb}
\end{array}\right|=0 \tag{13.45}
\)

The integrals $H_{aa}$ and $H_{bb}$ are

\(
\begin{equation}
H_{aa}=\int 1 s_{a}^{*} \hat{H} 1 s_{a} d v, \quad H_{bb}=\int 1 s_{b}^{*} \hat{H} 1 s_{b} d v \tag{13.46}
\end{equation}
\)

where the $\mathrm{H}_{2}^{+}$ electronic Hamiltonian operator $\hat{H}$ is given by (13.32). We can relabel the variables in a definite integral without affecting its value. Changing $a$ to $b$ and $b$ to $a$ changes $1 s_{a}$ to $1 s_{b}$ but leaves $\hat{H}$ unaffected (this would not be true for a heteronuclear diatomic molecule). Hence $H_{aa}=H_{bb}$. We have

\(
\begin{equation}
H_{ab}=\int 1 s_{a}^{*} \hat{H} 1 s_{b} d v, \quad H_{ba}=\int 1 s_{b}^{*} \hat{H} 1 s_{a} d v \tag{13.47}
\end{equation}
\)

Since $\hat{H}$ is Hermitian and the functions in these integrals are real, we conclude that $H_{ab}=H_{ba}$. The integral $H_{ab}$ is called a resonance (or bond) integral. Since $1 s_{a}$ and $1 s_{b}$ are normalized and real, we have

\(
\begin{gather}
S_{aa}=\int 1 s_{a}^{*} 1 s_{a} d v=1=S_{bb} \\
S_{ab}=\int 1 s_{a}^{*} 1 s_{b} d v=S_{ba} \tag{13.48}
\end{gather}
\)

The overlap integral $S_{ab}$ lies between 0 and 1, and decreases as the distance between the two nuclei increases.

The secular equation (13.45) becomes

\(
\begin{gather}
\left|\begin{array}{cc}
H_{aa}-W & H_{ab}-S_{ab} W \\
H_{ab}-S_{ab} W & H_{aa}-W
\end{array}\right|=0 \tag{13.49}\\
H_{aa}-W= \pm\left(H_{ab}-S_{ab} W\right) \tag{13.50}\\
W_{1}=\frac{H_{aa}+H_{ab}}{1+S_{ab}}, \quad W_{2}=\frac{H_{aa}-H_{ab}}{1-S_{ab}} \tag{13.51}
\end{gather}
\)

These two roots are upper bounds to the energies of the ground and first excited electronic states of $\mathrm{H}_{2}^{+}$. We shall see that $H_{ab}$ is negative, so $W_{1}$ is the lower-energy root.

We now find the coefficients in (13.43) for each of the roots of the secular equation. From Eq. (8.54), we have

\(
\begin{equation}
\left(H_{aa}-W\right) c_{a}+\left(H_{ab}-S_{ab} W\right) c_{b}=0 \tag{13.52}
\end{equation}
\)

Substituting in $W_{1}$ from (13.51) [or using (13.50)], we get

\(
\begin{gather}
c_{a} / c_{b}=1 \tag{13.53}\\
\phi_{1}=c_{a}\left(1 s_{a}+1 s_{b}\right) \tag{13.54}
\end{gather}
\)

We fix $c_{a}$ by normalization:

\(
\begin{gather}
\left|c_{a}\right|^{2} \int\left(1 s_{a}^{2}+1 s_{b}^{2}+2 \cdot 1 s_{a} 1 s_{b}\right) d v=1 \tag{13.55}\\
\left|c_{a}\right|=\frac{1}{\left(2+2 S_{ab}\right)^{1 / 2}} \tag{13.56}
\end{gather}
\)

The normalized trial function corresponding to the energy $W_{1}$ is therefore

\(
\begin{equation}
\phi_{1}=\frac{1 s_{a}+1 s_{b}}{\sqrt{2}\left(1+S_{ab}\right)^{1 / 2}} \tag{13.57}
\end{equation}
\)

For the root $W_{2}$, we find $c_{b}=-c_{a}$ and

\(
\begin{equation}
\phi_{2}=\frac{1 s_{a}-1 s_{b}}{\sqrt{2}\left(1-S_{ab}\right)^{1 / 2}} \tag{13.58}
\end{equation}
\)

Equations (13.57) and (13.58) come as no surprise. Since the nuclei are identical, we expect $|\phi|^{2}$ to remain unchanged on interchanging $a$ and $b$; in other words, we expect no polarity in the bond.

We now consider evaluation of the integrals $H_{aa}$, $H_{ab}$, and $S_{ab}$. From (13.44) and (13.33), the integrand of $S_{ab}$ is $1 s_{a} 1 s_{b}=k^{3} \pi^{-1} e^{-k\left(r_{a}+r_{b}\right)}=k^{3} \pi^{-1} e^{-k R \xi}$. The volume element in confocal elliptic coordinates is (Eyring, Walter, and Kimball, Appendix III)

\(
\begin{equation}
d v=\frac{1}{8} R^{3}\left(\xi^{2}-\eta^{2}\right) d \xi d \eta d \phi \tag{13.59}
\end{equation}
\)

Substitution of these expressions for $1 s_{a} 1 s_{b}$ and $d v$ into $S_{ab}$ and use of the Appendix integral (A.11) to do the $\xi$ integral gives (Prob. 13.16)

\(
\begin{equation}
S_{a b}=e^{-k R}\left(1+k R+\frac{1}{3} k^{2} R^{2}\right) \tag{13.60}
\end{equation}
\)
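Equation (13.60) is easy to check numerically; the sketch below (not from the text) compares the closed form with direct integration of $1 s_{a} 1 s_{b}$ over the volume element (13.59). The value at $k=1.24$, $R=2.00$ is the $S_{ab} \approx 0.46$ used later in this section.

```python
# Check Eq. (13.60): closed form vs. direct integration of
# 1s_a 1s_b = (k^3/pi) e^{-kR xi} over d v = (R^3/8)(xi^2 - eta^2) dxi deta dphi;
# the phi integral contributes a factor of 2*pi.
import numpy as np
from scipy.integrate import dblquad

def S_ab_closed(k, R):
    kR = k * R
    return np.exp(-kR) * (1 + kR + kR**2 / 3)   # Eq. (13.60)

def S_ab_numeric(k, R):
    f = lambda eta, xi: (k**3 / np.pi) * np.exp(-k * R * xi) \
        * (R**3 / 8) * (xi**2 - eta**2) * 2 * np.pi
    val, _ = dblquad(f, 1, np.inf, lambda xi: -1, lambda xi: 1)
    return val

print(S_ab_closed(1.24, 2.00))    # ~0.46, the value quoted at R_e
print(S_ab_numeric(1.24, 2.00))   # agrees to quadrature accuracy
```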

The evaluation of $H_{aa}$ and $H_{ab}$ is considered in Prob. 13.17. The results are

\(
\begin{equation}
H_{a a}=\frac{1}{2} k^{2}-k-R^{-1}+e^{-2 k R}\left(k+R^{-1}\right) \tag{13.61}
\end{equation}
\)

\(
\begin{equation}
H_{ab}=-\frac{1}{2} k^{2} S_{ab}-k(2-k)(1+k R) e^{-k R} \tag{13.62}
\end{equation}
\)

where $\hat{H}$ is given by (13.32) and so omits the internuclear repulsion.
Substituting the values for the integrals into (13.51), we get
\(
\begin{equation}
W_{1,2}=-\frac{1}{2} k^{2}+\frac{k^{2}-k-R^{-1}+R^{-1}(1+k R) e^{-2 k R} \pm k(k-2)(1+k R) e^{-k R}}{1 \pm e^{-k R}\left(1+k R+k^{2} R^{2} / 3\right)} \tag{13.63}
\end{equation}
\)

where the upper signs are for $W_{1}$. Since $\hat{H}$ in (13.32) omits the internuclear repulsion $1 / R$, $W_{1}$ and $W_{2}$ are approximations to the purely electronic energy $E_{\mathrm{el}}$, and $1 / R$ must be added to $W_{1,2}$ to get $U_{1,2}(R)$ [Eq. (13.8)].

The final task is to vary the parameter $k$ at many fixed values of $R$ so as to minimize first $U_{1}(R)$ and then $U_{2}(R)$. This can be done numerically using a computer (Prob. 13.19) or analytically. The results are that, for the $1 s_{a}+1 s_{b}$ function (13.57), $k$ increases almost monotonically from 1 to 2 as $R$ decreases from $\infty$ to 0; for the $1 s_{a}-1 s_{b}$ function (13.58), $k$ decreases almost monotonically from 1 to 0.4 as $R$ decreases from $\infty$ to 0. Since $0.4 \leq k \leq 2$ and $S_{ab}>0$, Eq. (13.62) shows that the integral $H_{ab}$ is always negative. Therefore, $W_{1}$ in (13.51) corresponds to the ground electronic state $\sigma_{g} 1 s$ of $\mathrm{H}_{2}^{+}$. For the ground state, one finds $k\left(R_{e}\right)=1.24$.
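A minimal sketch of this numerical minimization, in the spirit of Prob. 13.19 (the code itself is not from the book): implement $W_{1}$ from Eqs. (13.60)-(13.62), add $1 / R$, and minimize over $k$ at fixed $R$.

```python
# Variational optimization of the orbital exponent k at fixed R.
# Energies in hartrees, distances in bohrs.
import numpy as np
from scipy.optimize import minimize_scalar

def U1(k, R):
    """W_1 + 1/R for the 1s_a + 1s_b function, from Eqs. (13.60)-(13.62)."""
    S = np.exp(-k * R) * (1 + k * R + (k * R) ** 2 / 3)
    Haa = 0.5 * k**2 - k - 1 / R + np.exp(-2 * k * R) * (k + 1 / R)
    Hab = -0.5 * k**2 * S - k * (2 - k) * (1 + k * R) * np.exp(-k * R)
    return (Haa + Hab) / (1 + S) + 1 / R

res = minimize_scalar(lambda k: U1(k, 2.00), bounds=(0.5, 2.0), method="bounded")
print(res.x)     # ~1.24, the optimal orbital exponent at R_e
print(res.fun)   # ~-0.5865 hartree = -15.96 eV
```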

We might ask why the variational parameter $k$ for the $\sigma_{u}^{*} 1 s$ state goes to 0.4, rather than to 2, as $R$ goes to zero. The answer is that this state of $\mathrm{H}_{2}^{+}$ does not go to the ground state ($1 s$) of $\mathrm{He}^{+}$ as $R$ goes to zero. The $\sigma_{u}^{*} 1 s$ state has odd parity and must correlate with an odd state of $\mathrm{He}^{+}$. The lowest odd states of $\mathrm{He}^{+}$ are the $2 p$ states (Section 11.5); since the $\sigma_{u}^{*} 1 s$ state has zero electronic orbital angular momentum along the internuclear ($z$) axis, this state must go to an atomic $2 p$ state with $m=0$, that is, to the $2 p_{0}=2 p_{z}$ state.

Having found $k(R)$ for each root, one calculates $W_{1}$ and $W_{2}$ from (13.63) and adds $1 / R$ to get the $U(R)$ curves. The calculated ground-state $U(R)$ curve has a minimum at 2.00 bohrs (Prob. 13.20), in agreement with the true $R_{e}$ value 2.00 bohrs, and has $U\left(R_{e}\right)=-15.96 \mathrm{eV}$, giving a predicted $D_{e}$ of 2.36 eV, as compared with the true value 2.79 eV. (If we omit varying $k$ but simply set it equal to 1, we get $R_{e}=2.49$ bohrs and $D_{e}=1.76 \mathrm{eV}$.)

Now consider the appearance of the trial functions for the $\sigma_{g} 1 s$ and $\sigma_{u}^{*} 1 s$ states at intermediate values of $R$. Figure 13.6 shows the values of the functions $\left(1 s_{a}\right)^{2}$ and $\left(1 s_{b}\right)^{2}$ at points on the internuclear axis (see also Fig. 6.7). For the $\sigma_{g} 1 s$ function $1 s_{a}+1 s_{b}$, we get a buildup of electronic probability density between the nuclei, as shown in Fig. 13.7. It is especially significant that the buildup of charge between the nuclei is greater than that obtained by simply taking the sum of the separate atomic charge densities. The probability

FIGURE 13.6 Atomic probability densities for $\mathrm{H}_{2}^{+}$. Note the cusps at the nuclei.

FIGURE 13.7 Probability density along the internuclear axis for the LCAO-MO function $N\left(1 s_{a}+1 s_{b}\right)$.

density for an electron in a $1 s_{a}$ atomic orbital is $\left(1 s_{a}\right)^{2}$. If we add the probability density for half an electron in a $1 s_{a}$ AO and half an electron in a $1 s_{b}$ AO, we get

\(
\begin{equation}
\frac{1}{2}\left(1 s_{a}^{2}+1 s_{b}^{2}\right) \tag{13.64}
\end{equation}
\)

However, in quantum mechanics, we do not add the separate atomic probability densities. Instead, we add the wave functions, as in (13.57). The $\mathrm{H}_{2}^{+}$ground-state probability density is then

\(
\begin{equation}
\phi_{1}^{2}=\frac{1}{2\left(1+S_{ab}\right)}\left[1 s_{a}^{2}+1 s_{b}^{2}+2\left(1 s_{a} 1 s_{b}\right)\right] \tag{13.65}
\end{equation}
\)

The difference between (13.65) and (13.64) is

\(
\begin{equation}
\phi_{1}^{2}-\frac{1}{2}\left(1 s_{a}^{2}+1 s_{b}^{2}\right)=\frac{1}{2\left(1+S_{ab}\right)}\left[2\left(1 s_{a} 1 s_{b}\right)-S_{ab}\left(1 s_{a}^{2}+1 s_{b}^{2}\right)\right] \tag{13.66}
\end{equation}
\)

Putting $R=2.00$ and $k=1.24$ in Eq. (13.60), we find that $S_{ab}=0.46$ at $R_{e}$. (It might be thought that because of the orthogonality of different AOs, the overlap integral $S_{ab}$ should be zero. However, the AOs $1 s_{a}$ and $1 s_{b}$ are eigenfunctions of different Hamiltonian operators, one for a hydrogen atom at $a$ and one for a hydrogen atom at $b$. Hence the orthogonality theorem does not apply.)

Consider now the relative magnitudes of the two terms in brackets in (13.66) for points on the molecular axis. To the left of nucleus $a$, the function $1 s_{b}$ is very small; to the right of nucleus $b$, the function $1 s_{a}$ is very small. Hence outside the region between the nuclei, the product $1 s_{a} 1 s_{b}$ is small, and the second term in brackets in (13.66) is dominant. This gives a subtraction of electronic charge density outside the internuclear region, as compared with the sum of the densities of the individual atoms. Now consider the region between the nuclei. At the midpoint of the internuclear axis (and anywhere on the plane perpendicular to the axis and bisecting it), we have $1 s_{a}=1 s_{b}$, and the bracketed terms in (13.66) become $2\left(1 s_{a}\right)^{2}-0.92\left(1 s_{a}\right)^{2} \approx 1 s_{a}^{2}$, which is positive. We thus get a buildup of charge probability density between the nuclei in the molecule, as compared with the sum of the densities of the individual atoms. This buildup of electronic charge between the nuclei allows the electron to feel the attractions of both nuclei at the same time, which lowers its potential energy. The greater the overlap in the internuclear region between the atomic orbitals forming the bond, the greater the charge buildup in this region.
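The midpoint estimate can be checked numerically (illustrative sketch, atomic units, with the $k$ and $R$ values of this section):

```python
# Evaluate the bracketed terms of (13.66) at the bond midpoint.
import numpy as np

k, R = 1.24, 2.00
S_ab = np.exp(-k * R) * (1 + k * R + (k * R) ** 2 / 3)   # ~0.46, Eq. (13.60)

one_s = k**1.5 * np.pi**-0.5 * np.exp(-k * R / 2)   # 1s_a = 1s_b at r = R/2

bracket = 2 * one_s**2 - S_ab * 2 * one_s**2        # 2(1s_a 1s_b) - S_ab(1s_a^2 + 1s_b^2)
print(bracket / one_s**2)                           # ~1.08: a net positive buildup
```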

The preceding discussion seems to attribute the bonding in $\mathrm{H}_{2}^{+}$ mainly to the lowering in the average electronic potential energy that results from having the shared electron interact with two nuclei instead of one. This, however, is an incomplete picture. Calculations on $\mathrm{H}_{2}^{+}$ by Feinberg and Ruedenberg show that the decrease in electronic potential energy due to the sharing is of the same order of magnitude as the nuclear repulsion energy $1 / R$ and hence is insufficient by itself to give binding. Two other effects also contribute to the bonding. The increase in atomic-orbital exponent ($k=1.24$ at $R_{e}$ versus 1.0 at $\infty$) causes charge to accumulate near the nuclei (as well as in the internuclear region), and this further lowers the electronic potential energy. Moreover, the buildup of charge in the internuclear region makes $\partial \psi / \partial z$ zero at the midpoint of the molecular axis and small in the region close to this point. Hence the $z$ component of the average electronic kinetic energy [which can be expressed as $\frac{1}{2} \int|\partial \psi / \partial z|^{2} d \tau$; Prob. 7.7b] is lowered as compared with the atomic $\left\langle T_{z}\right\rangle$. (However, the total average electronic kinetic energy is raised; see Section 14.5.) For details, see M. J. Feinberg and K. Ruedenberg, J. Chem. Phys., 54, 1495 (1971); M. P. Melrose et al., Theor. Chim. Acta, 88, 311 (1994); see also K. Ruedenberg and M. W. Schmidt, J. Phys. Chem. A, 113, 1954 (2009); J. Comput. Chem., 28, 391 (2007).

Bader, however, has criticized the views of Feinberg and Ruedenberg. Bader states (among other points) that $\mathrm{H}_{2}^{+}$ and $\mathrm{H}_{2}$ are atypical and that, in contrast to the increase of charge density in the immediate vicinity of the nuclei in $\mathrm{H}_{2}$ and $\mathrm{H}_{2}^{+}$, molecule formation that involves atoms other than H is usually accompanied by a substantial reduction in charge density in the immediate vicinity of the nuclei. See R. F. W. Bader in The Force Concept in Chemistry, B. M. Deb, ed., Van Nostrand Reinhold, 1981, pp. 65-67, 71, 95-100, 113-115. Further study is needed before the origin of the covalent bond can be considered a settled question.

The $\sigma_{u}^{*} 1 s$ trial function $1 s_{a}-1 s_{b}$ is proportional to $e^{-k r_{a}}-e^{-k r_{b}}$. On the plane perpendicular to the internuclear axis and midway between the nuclei, we have $r_{a}=r_{b}$, so this plane is a nodal plane for the $\sigma_{u}^{*} 1 s$ function. We do not get a buildup of charge between the nuclei for this state, and the $U(R)$ curve has no minimum. We say that the $\sigma_{g} 1 s$ orbital is bonding and the $\sigma_{u}^{*} 1 s$ orbital is antibonding. (See Fig. 13.8.)

Reflection of the electron's coordinates in the $\sigma_{h}$ symmetry plane perpendicular to the molecular axis and midway between the nuclei converts $r_{a}$ to $r_{b}$ and $r_{b}$ to $r_{a}$ and leaves $\phi$ unchanged [Eq. (13.79)]. The operator $\hat{O}_{\sigma_{h}}$ (Section 12.1) commutes with the electronic Hamiltonian (13.32) and with the parity (inversion) operator. Hence we can choose the $\mathrm{H}_{2}^{+}$ wave functions to be eigenfunctions of this reflection operator as well as of the parity operator. Since the square of this reflection operator is the unit operator, its eigenvalues must be +1 and -1 (Section 7.5). States of $\mathrm{H}_{2}^{+}$ for which the wave function changes sign upon reflection in this plane (eigenvalue -1) are indicated by a star as a superscript to the letter that specifies $\lambda$. States whose wave functions are unchanged on reflection in this plane are left unstarred. Since orbitals with eigenvalue -1 for this reflection have a nodal plane between the nuclei, starred orbitals are antibonding.

Instead of using graphs, we can make contour diagrams of the orbitals (Section 6.7); see Fig. 13.9.

FIGURE 13.8 Probability density along the internuclear axis for the LCAO-MO function $N^{\prime}\left(1 s_{a}-1 s_{b}\right)$; the plotted quantity is $\left[1 s_{a}(0,0, z)-1 s_{b}(0,0, z)\right]^{2} / 2\left(1-S_{ab}\right)$.

FIGURE 13.9 Contours of constant $|\psi|$ for the $\sigma_{g} 1 s$ and $\sigma_{u}^{*} 1 s$ MOs. The three-dimensional contour surfaces are generated by rotating these figures about the $z$ axis. Note the resemblance of the antibonding-MO contours to those of a $2 p_{z}$ AO.

Sometimes the binding in $\mathrm{H}_{2}^{+}$ is attributed to the resonance integral $H_{ab}$, since in the approximate treatment we have given, it provides most of the binding energy. This viewpoint is misleading. In the exact treatment of Section 13.4, there arose no such resonance integral. The resonance integral simply arises out of the nature of the LCAO approximation we used.

In summary, we have formed the two $\mathrm{H}_{2}^{+}$ MOs (13.57) and (13.58), one bonding and one antibonding, from the AOs $1 s_{a}$ and $1 s_{b}$. The MO energies are given by Eq. (13.51) as

\(
\begin{equation}
W_{1,2}=H_{aa} \pm \frac{H_{ab}-H_{aa} S_{ab}}{1 \pm S_{ab}} \tag{13.67}
\end{equation}
\)

where $H_{aa}=\left\langle 1 s_{a}\right| \hat{H}\left|1 s_{a}\right\rangle$, with $\hat{H}$ being the purely electronic Hamiltonian of $\mathrm{H}_{2}^{+}$. The integral $H_{aa}$ would be the molecule's purely electronic energy if the electron's wave function in the molecule were $1 s_{a}$. In a sense, $H_{aa}$ is the energy of the $1 s_{a}$ orbital in the molecule. In the limit $R=\infty$, $H_{aa}$ becomes the $1 s$ AO energy in the H atom. In the molecule, $H_{aa}$ is substantially lower than the electronic energy of an H atom because the electron is attracted to both nuclei. A diagram of MO formation from AOs is given in Fig. 13.22. To get $U(R)$, the electronic energy including nuclear repulsion, we must add $1 / R$ to (13.67).

Problem 13.21 outlines the use of Mathcad to create an animation showing how contour plots of the $\mathrm{H}_{2}^{+}$ LCAO MOs $\phi_{1}$ and $\phi_{2}$ change as $R$ changes.

We have described the lowest two $\mathrm{H}_{2}^{+}$ electronic states according to the state of the hydrogen atom obtained on dissociation. This is a separated-atoms description. Alternatively, we can use the state of the atom formed as the internuclear distance goes to zero. This is a united-atom description. We saw that for the two lowest electronic states of $\mathrm{H}_{2}^{+}$ the united-atom states are the $1 s$ and $2 p_{0}$ states of $\mathrm{He}^{+}$. The united-atom designation is put on the left of the symbol for $\lambda$. The $\sigma_{g} 1 s$ state thus has the united-atom designation $1 s \sigma_{g}$. The $\sigma_{u}^{*} 1 s$ state has the united-atom designation $2 p \sigma_{u}^{*}$. It is not necessary to write this state as $2 p_{0} \sigma_{u}^{*}$, because the fact that it is a $\sigma$ state tells us that it correlates with the united-atom $2 p_{0}$ state. For the united-atom description, the subscripts $g$ and $u$ are not needed, since molecular states correlating with $s, d, g, \ldots$ atomic states must be $g$, while states correlating with $p, f, h, \ldots$ atomic states must be $u$. From the separated-atoms states, we cannot tell whether the molecular wave function is $g$ or $u$. Thus from the $1 s$ separated-atoms state we formed both a $g$ and a $u$ function for $\mathrm{H}_{2}^{+}$.

Before constructing approximate molecular orbitals for other $\mathrm{H}_{2}^{+}$ states, we consider how the trial function (13.57) can be improved. From the viewpoint of perturbation theory, (13.57) is the correct zeroth-order wave function. We know that the perturbation of molecule formation will mix in other hydrogen-atom states besides $1 s$. Dickinson in 1933 used a trial function with some $2 p_{0}$ character mixed in (since the ground state of $\mathrm{H}_{2}^{+}$ is a $\sigma$ state, it would be wrong to mix in $2 p_{ \pm 1}$ functions); he took

\(
\begin{equation}
\phi=\left[1 s_{a}+c\left(2 p_{0}\right)_{a}\right]+\left[1 s_{b}+c\left(2 p_{0}\right)_{b}\right] \tag{13.68}
\end{equation}
\)

where $c$ is a variational parameter and where (Table 6.2)

\(
1 s_{a}=k^{3 / 2} \pi^{-1 / 2} e^{-k r_{a}}, \quad\left(2 p_{0}\right)_{a}=\left(2 p_{z}\right)_{a}=\frac{\beta^{5 / 2}}{4(2 \pi)^{1 / 2}} r_{a} e^{-\beta r_{a} / 2} \cos \theta_{a}
\)

with $k$ and $\beta$ being two other variational parameters. We have similar expressions for $1 s_{b}$ and $\left(2 p_{0}\right)_{b}$. The angles $\theta_{a}$ and $\theta_{b}$ refer to two sets of spherical coordinates, one set at each nucleus; see Fig. 13.10. The definitions of $\theta_{a}$ and $\theta_{b}$ correspond to using a right-handed coordinate system on atom $a$ and a left-handed system on atom $b$. The coefficient $c$ goes to zero as $R$ goes to either zero or infinity.
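As a check on the normalization factor in the $\left(2 p_{0}\right)_{a}$ expression above, one can integrate its square over all space; a short sketch (not from the text; the angular integrals are done analytically in the comments):

```python
# Normalization check: integral of |2p_0|^2 over all space should be 1.
# The theta integral of cos^2(theta) sin(theta) is 2/3; phi gives 2*pi.
import numpy as np
from scipy.integrate import quad

beta = 2.965                    # the optimal value quoted below, at R_e
N2 = beta**5 / (32 * np.pi)     # square of beta^(5/2) / (4 sqrt(2 pi))

radial, _ = quad(lambda r: r**4 * np.exp(-beta * r), 0, np.inf)
print(N2 * radial * (2 / 3) * (2 * np.pi))   # 1.0 to quadrature accuracy
```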

The mixing together of two or more AOs on the same atom is called hybridization. The function $1 s+c 2 p_{0}$ is a hybridized atomic orbital. Since the $2 p_{0}$ function is positive in one lobe and negative in the other, the inclusion of $2 p_{0}$ produces additional charge buildup between the nuclei, giving a greater binding energy. The hybridization allows for the polarization of the $1 s_{a}$ and $1 s_{b}$ atomic orbitals that occurs on molecule formation. The function (13.68) gives a $U(R)$ curve with a minimum at 2.01 bohrs. At this distance, the parameters have the values $k=1.246$, $\beta=2.965$, and $c=0.138$ [F. Weinhold, J. Chem. Phys., 54, 530 (1971)]. The calculated $D_{e}$ is 2.73 eV, close to the true value 2.79 eV.

The quantum mechanics and spectroscopy of $\mathrm{H}_{2}^{+}$are reviewed in C. A. Leach and R. E. Moss, Annu. Rev. Phys. Chem., 46, 55 (1995).

One final point. The approximate wave functions in this chapter are written in atomic units. When rewriting these functions in ordinary units, we must remember that wave functions are not dimensionless. A one-particle wave function $\psi$ has units of length ${ }^{-3 / 2}$ (Section 3.5). The AOs $1 s_{a}$ and $1 s_{b}$ that occur in the functions (13.57) and (13.58) are given by (13.44) in atomic units. In ordinary units, $1 s_{a}=\left(k / a_{0}\right)^{3 / 2} \pi^{-1 / 2} e^{-k r_{a} / a_{0}}$.


In the preceding section, we used the approximate functions (13.57) and (13.58) for the two lowest $\mathrm{H}_{2}^{+}$ electronic states. Now we construct approximate functions for further excited states so as to build up a supply of $\mathrm{H}_{2}^{+}$-like molecular orbitals. We shall then use these MOs to discuss many-electron diatomic molecules qualitatively, just as we used hydrogenlike AOs to discuss many-electron atoms.

To get approximations to higher $\mathrm{H}_{2}^{+}$ MOs, we can use the linear-variation-function method. We saw that it was natural to take variation functions for $\mathrm{H}_{2}^{+}$ as linear combinations of hydrogenlike atomic-orbital functions, giving LCAO-MOs. To get approximate MOs for higher states, we add in more AOs to the linear combination. Thus, to get approximate wave functions for the six lowest $\mathrm{H}_{2}^{+}$ $\sigma$ states, we use a linear combination of the three lowest $m=0$ hydrogenlike functions on each atom:

\(
\phi=c_{1} 1 s_{a}+c_{2} 2 s_{a}+c_{3}\left(2 p_{0}\right)_{a}+c_{4} 1 s_{b}+c_{5} 2 s_{b}+c_{6}\left(2 p_{0}\right)_{b}
\)

As found in the preceding section for the function (13.43), the symmetry of the homonuclear diatomic molecule makes the coefficients of the atom- $b$ orbitals equal to $\pm 1$ times the corresponding atom- $a$ orbital coefficients:

\(
\begin{equation}
\phi=\left[c_{1} 1 s_{a}+c_{2} 2 s_{a}+c_{3}\left(2 p_{0}\right)_{a}\right] \pm\left[c_{1} 1 s_{b}+c_{2} 2 s_{b}+c_{3}\left(2 p_{0}\right)_{b}\right] \tag{13.69}
\end{equation}
\)

where the upper sign goes with the even $(g)$ states.

Consider the relative magnitudes of the coefficients in (13.69). For the two electronic states that dissociate into a $1 s$ hydrogen atom, we expect that $c_{1}$ will be considerably greater than $c_{2}$ or $c_{3}$, since $c_{2}$ and $c_{3}$ vanish in the limit of $R$ going to infinity. Thus the Dickinson function (13.68) has the $2 p_{0}$ coefficient equal to one-seventh the $1 s$ coefficient at $R_{e}$. (This function does not include a $2 s$ term, but if it did, we would find its coefficient to be small compared with the $1 s$ coefficient.) As a first approximation, we therefore set $c_{2}$ and $c_{3}$ equal to zero, taking

\(
\begin{equation}
\phi=c_{1}\left(1 s_{a} \pm 1 s_{b}\right) \tag{13.70}
\end{equation}
\)

as an approximation for the wave functions of these two states (as we already have done). From the viewpoint of perturbation theory, if we take the separated atoms as the unperturbed problem, the functions (13.70) are the correct zeroth-order wave functions.

The same argument for the two states that dissociate to a $2 s$ hydrogen atom gives as approximate wave functions for them

\(
\begin{equation}
\phi=c_{2}\left(2 s_{a} \pm 2 s_{b}\right) \tag{13.71}
\end{equation}
\)

since $c_{1}$ and $c_{3}$ will be small for these states. The functions (13.71) are only an approximation to what we would find if we carried out the linear variation treatment. To find rigorous upper bounds to the energies of these two $\mathrm{H}_{2}^{+}$ states, we must use the trial function (13.69) and solve the appropriate secular equation (8.58) (or use matrix algebra, Section 8.6).

In general, we have two $\mathrm{H}_{2}^{+}$ states correlating with each separated-atoms state, and rough approximations to the wave functions of these two states will be the LCAO functions $f_{a}+f_{b}$ and $f_{a}-f_{b}$, where $f$ is a hydrogenlike wave function. The functions (13.70) give the $\sigma_{g} 1 s$ and $\sigma_{u}^{*} 1 s$ states. Similarly, the functions (13.71) give the $\sigma_{g} 2 s$ and $\sigma_{u}^{*} 2 s$ molecular orbitals. The outer contour lines for these orbitals are like those for the corresponding MOs made from $1 s$ AOs. However, since the $2 s$ AO has a nodal sphere while the $1 s$ AO does not, each of these MOs has one more nodal surface than the corresponding $\sigma_{g} 1 s$ or $\sigma_{u}^{*} 1 s$ MO.

Next we have the combinations

\(
\begin{equation}
\left(2 p_{0}\right)_{a} \pm\left(2 p_{0}\right)_{b}=\left(2 p_{z}\right)_{a} \pm\left(2 p_{z}\right)_{b} \tag{13.72}
\end{equation}
\)

giving the $\sigma_{g} 2 p$ and $\sigma_{u}^{*} 2 p$ MOs (Fig. 13.11). These are $\sigma$ MOs even though they correlate with $2 p$ separated AOs, since they have $m=0$.

The preceding discussion is oversimplified. For the hydrogen atom, the $2 s$ and $2 p$ AOs are degenerate, and so we can expect the correct zeroth-order functions for the $\sigma_{g} 2 s$, $\sigma_{u}^{*} 2 s$, $\sigma_{g} 2 p$, and $\sigma_{u}^{*} 2 p$ MOs of $\mathrm{H}_{2}^{+}$ to each be mixtures of $2 s$ and $2 p$ AOs rather than containing only $2 s$ or $2 p$ character. [In the $R \rightarrow \infty$ limit, $\mathrm{H}_{2}^{+}$ consists of an H atom perturbed by the essentially uniform electric field of a far-distant proton. Problem 9.23

FIGURE 13.11 Formation of $\sigma_{g} 2 p$ and $\sigma_{u}^{*} 2 p$ MOs from $2 p_{z}$ AOs. The dashed lines indicate nodal surfaces. The signs on the contours give the sign of the wave function. The contours are symmetric about the $z$ axis. (Because of substantial $2 s$-$2 p$ hybridization, these contours are not accurate representations of true MO shapes. For accurate contours, see the reference for Fig. 13.20.)

showed that the correct zeroth-order functions for the $n=2$ levels of an H atom in a uniform electric field in the $z$ direction are $2^{-1 / 2}\left(2 s+2 p_{0}\right)$, $2^{-1 / 2}\left(2 s-2 p_{0}\right)$, $2 p_{1}$, and $2 p_{-1}$. Thus, for $\mathrm{H}_{2}^{+}$, $2 s$ and $2 p_{0}$ in Eqs. (13.71) and (13.72) should be replaced by $2 s+2 p_{0}$ and $2 s-2 p_{0}$.] For molecules that dissociate into many-electron atoms, the separated-atoms $2 s$ and $2 p$ AOs are not degenerate but do lie close together in energy. Hence the first-order corrections to the wave functions will mix substantial $2 s$ character into the $\sigma 2 p$ MOs and substantial $2 p$ character into the $\sigma 2 s$ MOs. Thus the designation of an MO as $\sigma 2 s$ or $\sigma 2 p$ should not be taken too literally. For $\mathrm{H}_{2}^{+}$ and $\mathrm{H}_{2}$, the united-atom designations of the MOs are preferable to the separated-atoms designations, but we shall use mostly the latter.

For the other two $2 p$ atomic orbitals, we can use either the $2 p_{+1}$ and $2 p_{-1}$ complex functions or the $2 p_{x}$ and $2 p_{y}$ real functions. If we want MOs that are eigenfunctions of $\hat{L}_{z}$, we will choose the complex $p$ orbitals, giving the MOs

\(
\begin{align}
& \left(2 p_{+1}\right)_{a}+\left(2 p_{+1}\right)_{b} \tag{13.73}\\
& \left(2 p_{+1}\right)_{a}-\left(2 p_{+1}\right)_{b} \tag{13.74}\\
& \left(2 p_{-1}\right)_{a}+\left(2 p_{-1}\right)_{b} \tag{13.75}\\
& \left(2 p_{-1}\right)_{a}-\left(2 p_{-1}\right)_{b} \tag{13.76}
\end{align}
\)

From Eq. (6.114) we have, since $\phi_{a}=\phi_{b}=\phi$,

\(
\begin{equation}
\left(2 p_{+1}\right)_{a}+\left(2 p_{+1}\right)_{b}=\frac{1}{8} \pi^{-1 / 2}\left(r_{a} e^{-r_{a} / 2} \sin \theta_{a}+r_{b} e^{-r_{b} / 2} \sin \theta_{b}\right) e^{i \phi} \tag{13.77}
\end{equation}
\)
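Because all the $\phi$ dependence of (13.77) sits in the factor $e^{i \phi}$, it is an eigenfunction of $\hat{L}_{z}=-i \partial / \partial \phi$ (atomic units) with eigenvalue $+1$; a short symbolic check (illustrative, with a placeholder for the $\xi$, $\eta$-dependent factor):

```python
# Symbolic check that (13.77) is an L_z eigenfunction with m = +1.
import sympy as sp

phi, q = sp.symbols("phi q", real=True)
F = sp.Function("F")                 # placeholder for the non-phi factor
mo = F(q) * sp.exp(sp.I * phi)

print(sp.simplify((-sp.I * sp.diff(mo, phi)) / mo))   # prints 1
```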

Since $\lambda=|m|=1$, this is a $\pi$ orbital. The inversion operation amounts to the coordinate transformation (Fig. 13.12)

\(
\begin{equation}
r_{a} \rightarrow r_{b}, \quad r_{b} \rightarrow r_{a}, \quad \phi \rightarrow \phi+\pi \tag{13.78}
\end{equation}
\)

We have $e^{i(\phi+\pi)}=(\cos \pi+i \sin \pi) e^{i \phi}=-e^{i \phi}$. From Fig. 13.12 we see that inversion converts $\theta_{a}$ to $\theta_{b}$ and vice versa. Thus inversion converts (13.77) to its negative, meaning it is a $u$ orbital. Reflection in the plane perpendicular to the axis and midway between the nuclei causes the following transformations (Prob. 13.24):

\(
\begin{equation}
r_{a} \rightarrow r_{b}, \quad r_{b} \rightarrow r_{a}, \quad \phi \rightarrow \phi, \quad \theta_{a} \rightarrow \theta_{b}, \quad \theta_{b} \rightarrow \theta_{a} \tag{13.79}
\end{equation}
\)

This leaves (13.77) unchanged, so we have an unstarred (bonding) orbital. The designation of (13.77) is then $\pi_{u} 2 p_{+1}$.

FIGURE 13.12 The effect of inversion of the electron's coordinates in $\mathrm{H}_{2}^{+}$. We have $r_{a}^{\prime}=r_{b}$, $r_{b}^{\prime}=r_{a}$, and $\phi^{\prime}=\phi+\pi$.

FIGURE 13.13 Cross section of the $\pi_{u} 2 p_{+1}$ (or $\pi_{u} 2 p_{-1}$) molecular orbital. To obtain the three-dimensional contour surface, rotate the figure about the $z$ axis. The $z$ axis is a nodal line for this MO (as it is for the $2 p_{+1}$ AO).

The function (13.77) is complex. Taking its absolute value, we can plot the orbital contours of constant probability density (Section 6.7). Since $\left|e^{i \phi}\right|=1$, the probability density is independent of $\phi$, giving a density that is symmetric about the $z$ (internuclear) axis. Figure 13.13 shows a cross section of this orbital in a plane containing the nuclei. The three-dimensional shape is found by rotating this figure about the $z$ axis, creating a sort of fat doughnut.

The MO (13.75) differs from (13.77) only in having $e^{i \phi}$ replaced by $e^{-i \phi}$ and is designated $\pi_{u} 2 p_{-1}$. The coordinate $\phi$ enters the $\mathrm{H}_{2}^{+}$ Hamiltonian only as $\partial^{2} / \partial \phi^{2}$. Since $\partial^{2} / \partial \phi^{2}$ multiplies $e^{i \phi}$ and $e^{-i \phi}$ by the same factor $-m^{2}=-1$, the states (13.73) and (13.75) have the same energy. Recall (Section 13.4) that the $\lambda=1$ energy levels are doubly degenerate, corresponding to $m= \pm 1$. Since $\left|e^{i \phi}\right|=\left|e^{-i \phi}\right|$, the $\pi_{u} 2 p_{+1}$ and $\pi_{u} 2 p_{-1}$ MOs have the same shapes, just as the $2 p_{+1}$ and $2 p_{-1}$ AOs have the same shapes.

The functions (13.74) and (13.76) give the $\pi_{g}^{*} 2 p_{+1}$ and $\pi_{g}^{*} 2 p_{-1}$ MOs. These functions do not give charge buildup between the nuclei; see Fig. 13.14.

Now consider the more familiar alternative of using the $2 p_{x}$ and $2 p_{y}$ AOs to make the MOs. The linear combination

\(
\begin{equation}
\left(2 p_{x}\right)_{a}+\left(2 p_{x}\right)_{b} \tag{13.80}
\end{equation}
\)

gives the $\pi_{u} 2 p_{x}$ MO (Fig. 13.15). This MO is not symmetrical about the internuclear axis but builds up probability density in two lobes, one above and one below the $y z$ plane, which is a nodal plane for this function. The wave function has opposite signs on each side of this plane. The linear combination

\(
\begin{equation}
\left(2 p_{x}\right)_{a}-\left(2 p_{x}\right)_{b} \tag{13.81}
\end{equation}
\)

gives the $\pi_{g}^{*} 2 p_{x}$ MO (Fig. 13.15). Since the $2 p_{y}$ functions differ from the $2 p_{x}$ functions solely by a rotation of $90^{\circ}$ about the internuclear axis, they give MOs differing from those of Fig. 13.15 by a $90^{\circ}$ rotation about the $z$ axis. The linear combinations

\(
\begin{align}
& \left(2 p{y}\right){a}+\left(2 p{y}\right){b} \tag{13.82}\
& \left(2 p{y}\right){a}-\left(2 p{y}\right){b} \tag{13.83}
\end{align}
\)

give the $\pi_{u} 2 p_{y}$ and $\pi_{g}^{*} 2 p_{y}$ molecular orbitals. The MOs (13.80) and (13.82) have the same energy. The MOs (13.81) and (13.83) have the same energy. (Note that the $\pi_{g} 2 p$ MOs are antibonding, while the $\pi_{u} 2 p$ MOs are bonding.)

FIGURE 13.14 Cross section of the $\pi_{g}^{*} 2 p_{+1}$ (or $\pi_{g}^{*} 2 p_{-1}$) MO. To obtain the three-dimensional contour surface, rotate the figure about the $z$ axis. The $z$ axis and the $x y$ plane are nodes.

Just as the $2 p_{x}$ and $2 p_{y}$ AOs are linear combinations of the $2 p_{+1}$ and $2 p_{-1}$ AOs [Eqs. (6.118) and (6.120)], the $\pi_{u} 2 p_{x}$ and $\pi_{u} 2 p_{y}$ MOs are linear combinations of the $\pi_{u} 2 p_{+1}$ and $\pi_{u} 2 p_{-1}$ MOs. We can use any linear combination of the eigenfunctions of a degenerate energy level and still have an energy eigenfunction. Just as the $2 p_{+1}$ and $2 p_{-1}$ AOs are eigenfunctions of $\hat{L}_{z}$ and the $2 p_{x}$ and $2 p_{y}$ AOs are not, the $\pi_{u} 2 p_{+1}$ and $\pi_{u} 2 p_{-1}$ MOs are eigenfunctions of $\hat{L}_{z}$ and the $\pi_{u} 2 p_{x}$ and $\pi_{u} 2 p_{y}$ MOs are not. For the $\mathrm{H}_{2}^{+}$ $\pi_{u} 2 p$ energy level, we can use the pair of real MOs (13.80) and (13.82), or the pair of complex MOs (13.73) and (13.75), or any two linearly independent linear combinations of these functions.

We have shown the correlation of the $\mathrm{H}_{2}^{+}$ MOs with the separated-atoms AOs. We can also show how they correlate with the united-atom AOs. As $R$ goes to zero, the $\sigma_{u}^{*} 1 s$ MO (Fig. 13.9) increasingly resembles the $2 p_{z}$ AO, with which it correlates. Similarly, the $\pi_{u} 2 p$ MOs correlate with $p$ united-atom states, while the $\pi_{g}^{*} 2 p$ MOs correlate with $d$ united-atom states.

An online simulation of $\mathrm{H}_{2}^{+} \mathrm{MOs}$ is at www.falstad.com/qmmo; you can vary the internuclear distance.


We now use the $\mathrm{H}_{2}^{+}$ MOs developed in the last section to discuss many-electron homonuclear diatomic molecules. (Homonuclear means the two nuclei are the same; heteronuclear means they are different.) If we ignore the interelectronic repulsions, the zeroth-order wave function is a Slater determinant of $\mathrm{H}_{2}^{+}$-like one-electron spin-orbitals. We approximate the spatial part of the $\mathrm{H}_{2}^{+}$ spin-orbitals by the LCAO-MOs of the last section. Treatments that go beyond this crude first approximation will be discussed later.

The sizes and energies of the MOs vary with varying internuclear distance for each molecule and vary as we go from one molecule to another. Thus we saw how the orbital exponent $k$ in the $\mathrm{H}_{2}^{+}$ trial function (13.54) varied with $R$. As we go to molecules with higher nuclear charge, the parameter $k$ for the $\sigma_{g} 1 s$ MO will increase, giving a more compact MO. We want to consider the order of the MO energies. Because of the variation of these energies with $R$ and variations from molecule to molecule, numerous crossings occur, just

FIGURE 13.15 Formation of the $\pi_{u} 2 p_{x}$ and $\pi_{g}^{*} 2 p_{x}$ MOs. Since $\phi=0$ in the $x z$ plane, the cross sections of these MOs in the $x z$ plane are the same as for the corresponding $\pi_{u} 2 p_{+1}$ and $\pi_{g}^{*} 2 p_{-1}$ MOs. However, the $\pi 2 p_{x}$ MOs are not symmetrical about the $z$ axis. Rather, they consist of blobs of probability density above and below the nodal $y z$ plane.

TABLE 13.1 Molecular-Orbital Nomenclature for Homonuclear Diatomic Molecules

| Separated-Atoms Description | United-Atom Description | Numbering by Symmetry |
| --- | --- | --- |
| $\sigma_{g} 1 s$ | $1 s \sigma_{g}$ | $1 \sigma_{g}$ |
| $\sigma_{u}^{*} 1 s$ | $2 p \sigma_{u}^{*}$ | $1 \sigma_{u}$ |
| $\sigma_{g} 2 s$ | $2 s \sigma_{g}$ | $2 \sigma_{g}$ |
| $\sigma_{u}^{*} 2 s$ | $3 p \sigma_{u}^{*}$ | $2 \sigma_{u}$ |
| $\pi_{u} 2 p$ | $2 p \pi_{u}$ | $1 \pi_{u}$ |
| $\sigma_{g} 2 p$ | $3 s \sigma_{g}$ | $3 \sigma_{g}$ |
| $\pi_{g}^{*} 2 p$ | $3 d \pi_{g}^{*}$ | $1 \pi_{g}$ |
| $\sigma_{u}^{*} 2 p$ | $4 p \sigma_{u}^{*}$ | $3 \sigma_{u}$ |

as for atomic-orbital energies (Fig. 11.2). Hence we cannot give a definitive order. However, the following is the order in which the MOs fill as we go across the periodic table:
$\sigma{g} 1 s<\sigma{u}^{} 1 s<\sigma{g} 2 s<\sigma{u}^{} 2 s<\pi{u} 2 p{x}=\pi{u} 2 p{y}<\sigma{g} 2 p<\pi{g}^{} 2 p{x}=\pi{g}^{} 2 p{y}<\sigma{u}^{*} 2 p$
Each bonding orbital fills before the corresponding antibonding orbital. The $\pi{u} 2 p$ orbitals are close in energy to the $\sigma{g} 2 p$ orbital, and it was formerly believed that the $\sigma_{g} 2 p \mathrm{MO}$ filled first.

Besides the separated-atoms designation, there are other ways of referring to these MOs; see Table 13.1. The second column of this table gives the united-atom designations. The nomenclature of the third column uses $1 \sigma_{g}$ for the lowest $\sigma_{g}$ MO, $2 \sigma_{g}$ for the second-lowest $\sigma_{g}$ MO, and so on.

Figure 13.16 shows how these MOs correlate with the separated-atoms and united-atom AOs. Because of the variation of MO energies from molecule to molecule, this

FIGURE 13.16 Correlation diagram for homonuclear diatomic MOs. (This diagram does not hold for $\mathrm{H}_{2}^{+}$.) The dashed vertical line corresponds to the order in which the MOs fill.

diagram is not quantitative. (The word correlation is being used here to mean a correspondence; this is a different meaning than in the term electron correlation.)

Recall (Prob. 7.29 and Fig. 6.13) that $s, d, g, \ldots$ united-atom AOs are even functions and therefore correlate with gerade $(g)$ MOs, whereas $p, f, h, \ldots$ AOs are odd functions and correlate with ungerade (u) MOs.

A useful principle in drawing orbital correlation diagrams is the noncrossing rule, which states that for MO correlation diagrams of many-electron diatomic molecules, the energies of MOs with the same symmetry cannot cross. For diatomic MOs the word symmetry refers to whether the orbital is $g$ or $u$ and whether it is $\sigma, \pi, \delta, \ldots$ For example, two $\sigma_{g}$ MOs cannot cross on a correlation diagram. From the noncrossing rule, we conclude that the lowest MO of a given symmetry type must correlate with the lowest united-atom AO of that symmetry, and similarly for higher orbitals. [A similar noncrossing rule holds for potential-energy curves $U(R)$ for different electronic states of a many-electron diatomic molecule.] The proof of the noncrossing rule is a bit subtle; see C. A. Mead, J. Chem. Phys., 70, 2276 (1979) for a thorough discussion.

Just as we discussed atoms by filling in the AOs, giving rise to atomic configurations such as $1 s^{2} 2 s^{2}$, we shall discuss homonuclear diatomic molecules by filling in the MOs, giving rise to molecular electronic configurations such as $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)^{2}$. (Recall that with a single atomic configuration there is associated a hierarchy of terms, levels, and states; the same is true for a molecular configuration; see Section 13.8.)

Figure 13.17 shows the homonuclear diatomic MOs formed from the $1 s, 2 s$, and $2 p$ AOs.

For $\mathrm{H}_{2}^{+}$ we have the ground-state configuration $\sigma_{g} 1 s$, which gives a one-electron bond. For excited states the electron is in one of the higher MOs.

For $\mathrm{H}_{2}$ we put the two electrons in the $\sigma_{g} 1 s$ MO with opposite spins, giving the ground-state configuration $\left(\sigma_{g} 1 s\right)^{2}$. The two bonding electrons give a single bond. The ground-state dissociation energy $D_{e}$ is 4.75 eV.

Now consider $\mathrm{He}_{2}$. Two electrons go in the $\sigma_{g} 1 s$ MO, thereby filling it. The other two go in the next MO, $\sigma_{u}^{*} 1 s$. The ground-state configuration is $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)^{2}$. With

FIGURE 13.17 Homonuclear diatomic MOs formed from $1 s, 2 s$, and $2 p$ AOs.

two bonding and two antibonding electrons, we expect no net bonding, in agreement with the fact that the ground electronic state of $\mathrm{He}_{2}$ shows no substantial minimum in the potential-energy curve. However, if an electron is excited from the antibonding $\sigma_{u}^{*} 1 s$ MO to a higher MO that is bonding, the molecule will have three bonding electrons and only one antibonding electron. We therefore expect that $\mathrm{He}_{2}$ has bound excited electronic states, with a significant minimum in the $U(R)$ curve of each such state. Indeed, about two dozen such bound excited states of $\mathrm{He}_{2}$ have been spectroscopically observed in gas discharge tubes. Of course, such excited states decay to the ground electronic state, and then the molecule dissociates.

The repulsion of two $1 s^{2}$ helium atoms can be ascribed mainly to the Pauli repulsion between electrons with parallel spins (Section 10.3). Each helium atom has a pair of electrons with opposite spin, and each pair tends to exclude the other pair from occupying the same region of space.

Removal of an antibonding electron from $\mathrm{He}_{2}$ gives the $\mathrm{He}_{2}^{+}$ ion, with ground-state configuration $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)$ and one net bonding electron. Ground-state properties of this molecule are quite close to those for $\mathrm{H}_{2}^{+}$; see Table 13.2 later in this section.
$\mathrm{Li}_{2}$ has the ground-state configuration $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)^{2}\left(\sigma_{g} 2 s\right)^{2}$ with two net bonding electrons, leading to the description of the molecule as containing an $\mathrm{Li}-\mathrm{Li}$ single bond. Experimentally, $\mathrm{Li}_{2}$ is a stable species. In $\mathrm{Li}_{2}$ the orbital exponent of the $1 s$ AOs is considerably greater than in $\mathrm{H}_{2}^{+}$ or $\mathrm{H}_{2}$, because of the increase in the nuclear charges from 1 to 3. This shrinks the $1 s_{a}$ and $1 s_{b}$ AOs in closer to the corresponding nuclei. There is thus only very slight overlap between these two AOs, and the integrals $S_{a b}$ and $H_{a b}$ are very small for these AOs. As a result, the energies of the $\sigma_{g} 1 s$ and $\sigma_{u}^{*} 1 s$ MOs in $\mathrm{Li}_{2}$ are nearly equal to each other and to the energy of a Li $1 s$ AO. (For very small $R$, the $1 s_{a}$ and $1 s_{b}$ AOs do overlap appreciably and their energies then differ considerably.) The $\mathrm{Li}_{2}$ ground-state configuration is often written as $K K\left(\sigma_{g} 2 s\right)^{2}$ to indicate the negligible change in inner-shell orbital energies on molecule formation, which is in accord with the chemist's usual idea of bonding involving only the valence electrons. The orbital exponent of the $2 s$ AOs in $\mathrm{Li}_{2}$ is not much greater than 1, because these electrons are screened from the nucleus by the $1 s$ electrons.

The $\mathrm{Be}_{2}$ ground-state configuration $K K\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}$ has no net bonding electrons.
The $\mathrm{B}_{2}$ ground-state configuration $K K\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}\left(\pi_{u} 2 p\right)^{2}$ has two net bonding electrons, indicating a stable ground state, as is found experimentally. The bonding electrons are $\pi$ electrons, which is at variance with the notion that single bonds are always $\sigma$ bonds. We have two degenerate $\pi_{u} 2 p$ MOs. Recall that when we had an atomic configuration such as $1 s^{2} 2 s^{2} 2 p^{2}$ we obtained several terms, which because of interelectronic repulsions had different energies. We saw that the term with the highest total spin was generally the lowest (Hund's rule). With the molecular configuration of $\mathrm{B}_{2}$ given above, we also have a number of terms. Since the lower $(\sigma)$ MOs are all filled, their electrons must be paired and contribute nothing to the total spin. If the two $\pi_{u} 2 p$ electrons are both in the same MO (for example, both in $\pi_{u} 2 p_{+1}$), their spins must be paired (antiparallel), giving a total molecular electronic spin of zero. If, however, we have one electron in the $\pi_{u} 2 p_{+1}$ MO and the other in the $\pi_{u} 2 p_{-1}$ MO, their spins can be parallel, giving a net spin of 1; by Hund's rule, this term will be lowest, and the ground term of $\mathrm{B}_{2}$ will have spin multiplicity $2 S+1=3$. Investigation of the electron-spin-resonance spectrum of $\mathrm{B}_{2}$ trapped in solid neon at low temperature showed that the $\mathrm{B}_{2}$ ground term is a triplet with $S=1$ [L. B. Knight et al., J. Am. Chem. Soc., 109, 3521 (1987)].

The $\mathrm{C}_{2}$ ground-state configuration $K K\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}\left(\pi_{u} 2 p\right)^{4}$ with four net bonding electrons gives a stable ground state with a double bond. As mentioned, the $\pi_{u} 2 p$ and $\sigma_{g} 2 p$ MOs have nearly the same energy in many molecules. The triplet term of the $\mathrm{C}_{2}$ $K K\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}\left(\pi_{u} 2 p\right)^{3}\left(\sigma_{g} 2 p\right)$ configuration lies only 0.09 eV above the ground $\left(\pi_{u} 2 p\right)^{4}$ singlet term.

The $\mathrm{N}_{2}$ ground-state configuration $K K\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}\left(\pi_{u} 2 p\right)^{4}\left(\sigma_{g} 2 p\right)^{2}$ with six net bonding electrons gives a triple bond, in accord with the Lewis structure $: \mathrm{N} \equiv \mathrm{N}:$.

The $\mathrm{O}_{2}$ ground-state configuration is

\(
K K\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}\left(\sigma_{g} 2 p\right)^{2}\left(\pi_{u} 2 p\right)^{4}\left(\pi_{g}^{*} 2 p\right)^{2}
\)

Spectroscopic evidence shows that in $\mathrm{O}_{2}$ (and in $\mathrm{F}_{2}$) the $\sigma_{g} 2 p$ MO is lower in energy than the $\pi_{u} 2 p$ MO. The four net bonding electrons give a double bond. The $\pi_{g}^{*} 2 p_{x}$ and $\pi_{g}^{*} 2 p_{y}$ MOs have the same energy, and by putting one electron in each with parallel spins, we get a triplet term. By Hund's rule this is the ground term. This explanation of the paramagnetism of $\mathrm{O}_{2}$ was one of the early triumphs of MO theory.

For $\mathrm{F}_{2}$ the $\ldots\left(\pi_{g}^{*} 2 p\right)^{4}$ ground-state configuration gives a single bond.
For $\mathrm{Ne}_{2}$ the $\ldots\left(\pi_{g}^{*} 2 p\right)^{4}\left(\sigma_{u}^{*} 2 p\right)^{2}$ configuration gives no net bonding electrons and no chemical bond.

We can go on to describe homonuclear diatomic molecules formed from atoms of the next period. Thus the lowest electron configuration of $\mathrm{Na}_{2}$ is $K K L L\left(\sigma_{g} 3 s\right)^{2}$. However, there are some differences as compared with the corresponding molecules of the preceding period. For $\mathrm{Al}_{2}$, the ground term is the triplet term of the $\ldots\left(\sigma_{g} 3 p\right)\left(\pi_{u} 3 p\right)$ configuration, which lies a mere 0.02 eV below the triplet term of the $\ldots\left(\pi_{u} 3 p\right)^{2}$ configuration [C. W. Bauschlicher et al., J. Chem. Phys., 86, 7007 (1987)]. For $\mathrm{Si}_{2}$, the ground term is the triplet term of the $\ldots\left(\sigma_{g} 3 p\right)^{2}\left(\pi_{u} 3 p\right)^{2}$ configuration, which lies 0.08 eV below the triplet term of the $\ldots\left(\sigma_{g} 3 p\right)\left(\pi_{u} 3 p\right)^{3}$ configuration [T. N. Kitsopoulos et al., J. Chem. Phys., 95, 1441 (1991)].

Table 13.2 lists $D_{e}, R_{e}$, and $\widetilde{\nu}_{e} \equiv \nu_{e} / c$ for the ground electronic states of some homonuclear diatomic molecules, where $\nu_{e}$ is the harmonic vibrational frequency (13.27). (In

TABLE 13.2 Properties of Homonuclear Diatomic Molecules in Their Ground Electronic States

Molecule | Ground Term | Bond Order | $D_{e} / \mathrm{eV}$ | $R_{e} / \AA$ | $\widetilde{\nu}_{e} / \mathrm{cm}^{-1}$
$\mathrm{H}_{2}^{+}$ | ${ }^{2} \Sigma_{g}^{+}$ | $\frac{1}{2}$ | | |

the research literature, $\widetilde{\nu}_{e}$ is written as $\omega_{e}$.) The table also lists the bond order, which is one-half the difference between the number of bonding and antibonding electrons. [For a survey of various methods to calculate bond orders, see J. J. Jules and J. R. Lombardi, THEOCHEM, 664-665, 255 (2003); see also J. F. Gonthier et al., Chem. Soc. Rev., 41, 4671 (2012).] As the bond order increases, $D_{e}$ and $\nu_{e}$ tend to increase and $R_{e}$ decreases. (The high $\nu_{e}$ of $\mathrm{H}_{2}$ is due to its small reduced mass $\mu$.) The term symbols in this table are explained in the next section.
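Since the bond order is just half the difference between bonding and antibonding electron counts, a two-line sketch (the function name is ours) reproduces the values implied by the configurations given earlier:

```python
def bond_order(n_bonding, n_antibonding):
    """One-half the difference between bonding and antibonding electrons."""
    return (n_bonding - n_antibonding) / 2

# N2: 10 bonding, 4 antibonding electrons -> 3.0 (triple bond)
print(bond_order(10, 4))
# O2: 10 bonding, 6 antibonding electrons -> 2.0 (double bond)
print(bond_order(10, 6))
```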

Bonding MOs produce charge buildup between the nuclei, whereas antibonding MOs produce charge depletion between the nuclei. Hence removal of an electron from a bonding MO usually decreases $D_{e}$, whereas removal of an electron from an antibonding MO increases $D_{e}$. (Note that as $R$ decreases in Fig. 13.16, the energies of bonding MOs decrease, while the energies of antibonding MOs increase.) For example, the highest filled MO in $\mathrm{N}_{2}$ is bonding, and Table 13.2 shows that in going from the ground state of $\mathrm{N}_{2}$ to that of $\mathrm{N}_{2}^{+}$ the dissociation energy decreases (and the bond length increases). In contrast, the highest filled MO of $\mathrm{O}_{2}$ is antibonding, and in going from $\mathrm{O}_{2}$ to $\mathrm{O}_{2}^{+}$ the dissociation energy increases (and $R_{e}$ decreases). The designation of bonding or antibonding is not relevant to the effect of the electrons on the total energy of the molecule. Energy is always required to ionize a stable molecule, no matter which electron is removed. Hence both bonding and antibonding electrons in a stable molecule decrease the total molecular energy.

If the interaction between two ground-state He atoms were strictly repulsive (as predicted by MO theory), the atoms in He gas would not attract one another at all and the gas would never liquefy. Of course, helium gas can be liquefied. Configuration-interaction calculations and direct experimental evidence from scattering experiments show that as two He atoms approach each other there is an initial weak attraction, the potential energy reaching a minimum of 0.00095 eV below the separated-atoms energy at $2.97 \AA$. At distances less than $2.97 \AA$, the force becomes increasingly repulsive because of overlap of the electron probability densities. The initial attraction (called a London or dispersion force) results from instantaneous correlation between the motions of the electrons in one atom and the motions of the electrons in the second atom. Therefore, a calculation that includes electron correlation is needed to deal with dispersion attractions.

The general term for all kinds of intermolecular forces is van der Waals forces. Except for highly polar molecules, the dispersion force is the largest contributor to intermolecular attractions. The dispersion force increases as the molecular size increases, so boiling points tend to increase as the molecular weight increases.

The slight minimum in the $U(R)$ curve at relatively large intermolecular separation produced by the dispersion force can be deep enough to allow the existence at low temperatures of molecules bound by the dispersion interaction. Such species are called van der Waals molecules. For example, argon gas at 100 K has a small concentration of $\mathrm{Ar}_{2}$ van der Waals molecules. $\mathrm{Ar}_{2}$ has $D_{e}=0.012 \mathrm{eV}$, $R_{e}=3.77 \AA$, and seven bound vibrational levels $(v=0, \ldots, 6)$.

For the ground electronic state of $\mathrm{He}_{2}$ [corresponding to the electron configuration $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)^{2}$], the zero-point vibrational energy is very slightly less than the dissociation energy $D_{e}$ associated with the dispersion attraction, so the $v=0, J=0$ level is the only bound level. Because of the extremely weak binding, $\mathrm{He}_{2}$ exists in significant amounts only at very low temperatures. $\mathrm{He}_{2}$ was detected mass spectrometrically in a beam of helium gas cooled to $10^{-3} \mathrm{~K}$ by expansion [F. Luo et al., J. Chem. Phys., 98, 3564 (1993); 100, 4023 (1994)]. Accurate theoretical calculations on $\mathrm{He}_{2}$ give $D_{e}=0.000948 \mathrm{eV}$, $D_{0}=0.00000014 \mathrm{eV}$, $R_{e}=2.97 \AA$ and give the average internuclear distance in $\mathrm{He}_{2}$ as $\langle R\rangle \approx 47 \AA$ [M. Przybytek et al., Phys. Rev. Lett., 104, 183003 (2010); M. Jeziorska et al., J. Chem. Phys., 127, 124303 (2007)]. $\langle R\rangle$ is huge because the $v=0$ level lies so close to the dissociation limit.

Examples of diatomic van der Waals molecules and their $R_{e}$ and $D_{e}$ values include $\mathrm{Ne}_{2}$, $3.1 \AA$, $0.0036 \mathrm{eV}$; HeNe, $3.2 \AA$, $0.0012 \mathrm{eV}$; $\mathrm{Ca}_{2}$, $4.28 \AA$, $0.13 \mathrm{eV}$; $\mathrm{Mg}_{2}$, $3.89 \AA$, $0.053 \mathrm{eV}$. Observed polyatomic van der Waals molecules include $\left(\mathrm{O}_{2}\right)_{2}$, $\mathrm{H}_{2}-\mathrm{N}_{2}$, $\mathrm{Ar}-\mathrm{HCl}$, and $\left(\mathrm{Cl}_{2}\right)_{2}$. For van der Waals bonding, $R_{e}$ is significantly greater and $D_{e}$ is very substantially less than the values for chemically bound molecules. The $\mathrm{Be}_{2}$ bond length of $2.45 \AA$ is much shorter than is typical for van der Waals molecules; the closeness of the $2 p$ orbitals to the $2 s$ orbitals in Be allows substantial $2 s-2 p$ hybridization in $\mathrm{Be}_{2}$ and perhaps gives some amount of covalent character in addition to the dispersion attraction. For more on van der Waals molecules, see Chem. Rev., 88, 813-988 (1988); 94, 1721-2160 (1994); 100, 3861-4264 (2000).


We now consider the terms arising from a given electron configuration of a diatomic molecule.
For atoms, each set of degenerate atomic orbitals constitutes an atomic subshell. For example, the $2 p_{+1}, 2 p_{0}$, and $2 p_{-1}$ AOs constitute the $2 p$ subshell. An atomic electronic configuration is defined by giving the number of electrons in each subshell; for example, $1 s^{2} 2 s^{2} 2 p^{4}$. For molecules, each set of degenerate molecular orbitals constitutes a molecular shell. For example, the $\pi_{u} 2 p_{+1}$ and $\pi_{u} 2 p_{-1}$ MOs constitute the $\pi_{u} 2 p$ shell. Each diatomic $\sigma$ shell consists of one MO, while each $\pi, \delta, \phi, \ldots$ shell consists of two MOs; diatomic $\sigma$ shells are filled with two electrons, while non-$\sigma$ shells hold up to four electrons. We define a molecular electronic configuration by giving the number of electrons in each shell, for example, $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)^{2}\left(\sigma_{g} 2 s\right)^{2}\left(\sigma_{u}^{*} 2 s\right)^{2}\left(\pi_{u} 2 p\right)^{3}$.

For $\mathrm{H}_{2}^{+}$, the operator $\hat{L}_{z}$ commutes with $\hat{H}$. For a many-electron diatomic molecule, one finds that the operator for the axial component of the total electronic orbital angular momentum commutes with $\hat{H}$. The component of electronic orbital angular momentum along the molecular axis has the possible values $M_{L} \hbar$, where $M_{L}=0, \pm 1, \pm 2, \ldots$ To calculate $M_{L}$, we simply add algebraically the $m$'s of the individual electrons. Analogous to the symbol $\lambda$ for a one-electron molecule, $\Lambda$ is defined as

\(
\begin{equation}
\Lambda \equiv\left|M_{L}\right| \tag{13.84}
\end{equation}
\)

(Some people define $\Lambda$ as equal to $M_{L}$.) The following code specifies the value of $\Lambda$ :

$\Lambda$ | 0 | 1 | 2 | 3 | 4
letter | $\Sigma$ | $\Pi$ | $\Delta$ | $\Phi$ | $\Gamma$

For $\Lambda \neq 0$, there are two possible values of $M_{L}$, namely, $+\Lambda$ and $-\Lambda$. As in $\mathrm{H}_{2}^{+}$, the electronic energy depends on $M_{L}^{2}$, so there is a double degeneracy associated with the two values of $M_{L}$. Note that lowercase letters refer to individual electrons, while capital letters refer to the whole molecule.
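Because $\Lambda$ is obtained by adding the individual $m$'s algebraically and taking the absolute value, the code letter follows mechanically; a minimal sketch (the dictionary and helper are ours):

```python
# Lambda = |M_L|, with M_L the algebraic sum of the individual m values.
LAMBDA_LETTER = {0: "Sigma", 1: "Pi", 2: "Delta", 3: "Phi", 4: "Gamma"}

def term_letter(m_values):
    """Return (Lambda, code letter) for a list of individual-electron m's."""
    lam = abs(sum(m_values))
    return lam, LAMBDA_LETTER[lam]

print(term_letter([+1, -1]))   # (0, 'Sigma')
print(term_letter([+1, +2]))   # (3, 'Phi'), one pi and one delta electron
```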

Just as in atoms, the individual electron spins add vectorially to give a total electronic spin $\mathbf{S}$, whose magnitude has the possible values $[S(S+1)]^{1 / 2} \hbar$, with $S=0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$. The component of $\mathbf{S}$ along an axis has the possible values $M_{S} \hbar$, where $M_{S}=S, S-1, \ldots,-S$. As in atoms, the quantity $2 S+1$ is called the spin multiplicity and is written as a left superscript to the code letter for $\Lambda$. Diatomic electronic states that arise from the same electron configuration and that have the same value for $\Lambda$ and the same value for $S$ are said to belong to the same electronic term. We now consider how the terms belonging to a given electron configuration are derived. (We are assuming Russell-Saunders coupling, which holds for molecules composed of atoms of not-too-high atomic number.)

A filled diatomic molecule shell consists of one or two filled molecular orbitals. The Pauli principle requires that, for two electrons in the same molecular orbital, one have $m_{s}=+\frac{1}{2}$ and the other have $m_{s}=-\frac{1}{2}$. Hence the quantum number $M_{S}$, which is the algebraic sum of the individual $m_{s}$ values, must be zero for a filled-shell molecular configuration. Therefore, we must have $S=0$ for a configuration containing only filled molecular shells. A filled $\sigma$ shell has two electrons with $m=0$, so $M_{L}$ is zero. A filled $\pi$ shell has two electrons with $m=+1$ and two electrons with $m=-1$, so $M_{L}$ (which is the algebraic sum of the $m$'s) is zero. The same situation holds for filled $\delta, \phi, \ldots$ shells. Thus a closed-shell molecular configuration has both $S$ and $\Lambda$ equal to zero and gives rise to only a ${ }^{1} \Sigma$ term. An example is the ground electronic configuration of $\mathrm{H}_{2}$. (Recall that a filled-subshell atomic configuration gives only a ${ }^{1} S$ term.) In deriving molecular terms, we need consider only electrons outside filled shells.

A single $\sigma$ electron has $s=\frac{1}{2}$, so $S$ must be $\frac{1}{2}$, and we get a ${ }^{2} \Sigma$ term. An example is the ground electronic configuration of $\mathrm{H}_{2}^{+}$. A single $\pi$ electron gives a ${ }^{2} \Pi$ term, and so on.

Now consider more than one electron. Electrons that are in different molecular shells are called nonequivalent. For such electrons we do not have to worry about giving two of them the same set of quantum numbers, and the terms are easily derived. Consider two nonequivalent $\sigma$ electrons, a $\sigma \sigma$ configuration. Since both $m$ 's are zero, we have $M_{L}=0$. Each $s$ is $\frac{1}{2}$, so $S$ can be 1 or 0 . We thus have the terms ${ }^{1} \Sigma$ and ${ }^{3} \Sigma$. Similarly, a $\sigma \pi$ configuration gives ${ }^{1} \Pi$ and ${ }^{3} \Pi$ terms.

For a $\pi \delta$ configuration, we have singlet and triplet terms. The $\pi$ electron can have $m= \pm 1$, and the $\delta$ electron can have $m= \pm 2$. The possible values for $M_{L}$ are thus $+3,-3,+1$, and $-1$. This gives $\Lambda=3$ or 1, and we have the terms ${ }^{1} \Pi,{ }^{3} \Pi,{ }^{1} \Phi,{ }^{3} \Phi$. (In atoms we add the vectors $\mathbf{L}_{i}$ to get the total $\mathbf{L}$; hence a $p d$ atomic configuration gives $P$, $D$, and $F$ terms. In diatomic molecules, however, we add the $z$ components of the orbital angular momenta. This is an algebraic rather than a vectorial addition, so a $\pi \delta$ molecular configuration gives $\Pi$ and $\Phi$ terms and no $\Delta$ terms.)

For a $\pi \pi$ configuration of two nonequivalent electrons, each electron has $m= \pm 1$, and we have the $M_{L}$ values $2,-2,0,0$. The values of $\Lambda$ are 2, 0, and 0; the terms are ${ }^{1} \Delta,{ }^{3} \Delta,{ }^{1} \Sigma,{ }^{3} \Sigma,{ }^{1} \Sigma$, and ${ }^{3} \Sigma$. The values $+2$ and $-2$ correspond to the two degenerate states of the same $\Delta$ term. However, $\Sigma$ terms are nondegenerate (apart from spin degeneracy), and the two values of $M_{L}$ that are zero indicate two different $\Sigma$ terms (which become four $\Sigma$ terms when we consider spin).
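The bookkeeping of the last two paragraphs (algebraic addition of the $m$'s; $+\Lambda$ and $-\Lambda$ pairing into one doubly degenerate term while each $M_{L}=0$ value is a separate $\Sigma$ term; $S=0$ or 1 for two nonequivalent electrons) can be automated. A sketch under those rules (all names are ours):

```python
from collections import Counter
from itertools import product

LETTER = {0: "Sigma", 1: "Pi", 2: "Delta", 3: "Phi", 4: "Gamma"}

def terms_two_nonequivalent(m1_vals, m2_vals):
    """Terms from two nonequivalent electrons with the given possible m's."""
    ML = Counter(m1 + m2 for m1, m2 in product(m1_vals, m2_vals))
    terms = []
    for lam in sorted({abs(m) for m in ML}, reverse=True):
        # ML[lam] counts the M_L = +lam (or M_L = 0) occurrences; for lam > 0
        # each pairs with an M_L = -lam partner into one term.
        for _ in range(ML[lam]):
            terms += [(1, LETTER[lam]), (3, LETTER[lam])]  # singlet and triplet
    return terms

# pi pi': 1Delta, 3Delta plus two 1Sigma, 3Sigma pairs
print(terms_two_nonequivalent([+1, -1], [+1, -1]))
# pi delta: 1Phi, 3Phi, 1Pi, 3Pi and no Delta terms
print(terms_two_nonequivalent([+1, -1], [+2, -2]))
```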

Consider the forms of the wave functions for the $\pi \pi$ terms. We shall call the two $\pi$ subshells $\pi$ and $\pi^{\prime}$ and shall use a subscript to indicate the $m$ value. For the $\Delta$ terms, both electrons have $m=+1$ or both have $m=-1$. For $M_{L}=+2$, we might write as the spatial factor in the wave function $\pi_{+1}(1) \pi_{+1}^{\prime}(2)$ or $\pi_{+1}(2) \pi_{+1}^{\prime}(1)$. However, these functions are neither symmetric nor antisymmetric with respect to exchange of the indistinguishable electrons and are unacceptable. Instead, we must take the linear combinations (we shall not bother with normalization constants)

\(
\begin{array}{cl}
{ }^{1} \Delta: & \pi_{+1}(1) \pi_{+1}^{\prime}(2)+\pi_{+1}(2) \pi_{+1}^{\prime}(1) \tag{13.85}\\
{ }^{3} \Delta: & \pi_{+1}(1) \pi_{+1}^{\prime}(2)-\pi_{+1}(2) \pi_{+1}^{\prime}(1) \tag{13.86}
\end{array}
\)

Similarly, with both electrons having $m=-1$, we have the spatial factors

\(
\begin{array}{ll}
{ }^{1} \Delta: & \pi_{-1}(1) \pi_{-1}^{\prime}(2)+\pi_{-1}(2) \pi_{-1}^{\prime}(1) \tag{13.87}\\
{ }^{3} \Delta: & \pi_{-1}(1) \pi_{-1}^{\prime}(2)-\pi_{-1}(2) \pi_{-1}^{\prime}(1) \tag{13.88}
\end{array}
\)

The functions (13.85) and (13.87) are symmetric with respect to exchange. They therefore go with the antisymmetric two-electron spin factor (11.60), which has $S=0$. Thus (13.85) and (13.87) are the spatial factors in the wave functions for the two states of the doubly degenerate ${ }^{1} \Delta$ term. The antisymmetric functions (13.86) and (13.88) must go with the symmetric two-electron spin functions (11.57), (11.58), and (11.59), giving the six states of the ${ }^{3} \Delta$ term. These states all have the same energy (if we neglect spin-orbit interaction).

Now consider the wave functions of the $\Sigma$ terms. These have one electron with $m=+1$ and one electron with $m=-1$. We start with the four functions

\(
\pi_{+1}(1) \pi_{-1}^{\prime}(2), \quad \pi_{+1}(2) \pi_{-1}^{\prime}(1), \quad \pi_{-1}(1) \pi_{+1}^{\prime}(2), \quad \pi_{-1}(2) \pi_{+1}^{\prime}(1)
\)

Combining them to get symmetric and antisymmetric functions, we have

\(
\begin{array}{cl}
{ }^{1} \Sigma^{+}: & \pi_{+1}(1) \pi_{-1}^{\prime}(2)+\pi_{+1}(2) \pi_{-1}^{\prime}(1)+\pi_{-1}(1) \pi_{+1}^{\prime}(2)+\pi_{-1}(2) \pi_{+1}^{\prime}(1) \\
{ }^{1} \Sigma^{-}: & \pi_{+1}(1) \pi_{-1}^{\prime}(2)+\pi_{+1}(2) \pi_{-1}^{\prime}(1)-\pi_{-1}(1) \pi_{+1}^{\prime}(2)-\pi_{-1}(2) \pi_{+1}^{\prime}(1) \\
{ }^{3} \Sigma^{+}: & \pi_{+1}(1) \pi_{-1}^{\prime}(2)-\pi_{+1}(2) \pi_{-1}^{\prime}(1)+\pi_{-1}(1) \pi_{+1}^{\prime}(2)-\pi_{-1}(2) \pi_{+1}^{\prime}(1) \\
{ }^{3} \Sigma^{-}: & \pi_{+1}(1) \pi_{-1}^{\prime}(2)-\pi_{+1}(2) \pi_{-1}^{\prime}(1)-\pi_{-1}(1) \pi_{+1}^{\prime}(2)+\pi_{-1}(2) \pi_{+1}^{\prime}(1) \tag{13.89}
\end{array}
\)

The first two functions in (13.89) are symmetric. They therefore go with the antisymmetric singlet spin function (11.60). Clearly, these two spatial functions have different energies. The last two functions in (13.89) are antisymmetric and hence are the spatial factors in the wave functions of the two ${ }^{3} \Sigma$ terms. The four functions in (13.89) are found to have eigenvalue +1 or -1 with respect to reflection of electronic coordinates in the $x z \sigma_{v}$ symmetry plane containing the molecular $(z)$ axis (Prob. 13.30). The superscripts + and - refer to this eigenvalue.

Examination of the $\Delta$ terms (13.85) to (13.88) shows that they are not eigenfunctions of the symmetry operator $\hat{O}_{\sigma_{v}}$ (Section 12.1). Since a twofold degeneracy (apart from spin degeneracy) is associated with these terms, there is no necessity that their wave functions be eigenfunctions of this operator. However, since $\hat{O}_{\sigma_{v}}$ commutes with the Hamiltonian, we can choose the eigenfunctions to be eigenfunctions of $\hat{O}_{\sigma_{v}}$. Thus we can combine the functions (13.85) and (13.87), which belong to a degenerate energy level, as follows:

\(
(13.85)+(13.87) \text { and }(13.85)-(13.87)
\)

These two linear combinations are eigenfunctions of $\hat{O}_{\sigma_{v}}$ with eigenvalues +1 and -1, and we could refer to them as ${ }^{1} \Delta^{+}$ and ${ }^{1} \Delta^{-}$ states. Since they have the same energy, there is no point in using the + and - superscripts. Thus the + and - designations are used only for $\Sigma$ terms. However, when one considers the interaction between the molecular rotational angular momentum and the electronic orbital angular momentum, there is a very slight splitting (called $\Lambda$-type doubling) of the two states of a ${ }^{1} \Delta$ term. It turns out that the correct zeroth-order wave functions for this perturbation are the linear combinations that are eigenfunctions of $\hat{O}_{\sigma_{v}}$, so in this case there is a point to distinguishing between $\Delta^{+}$ and $\Delta^{-}$ states. The linear combinations (13.85) $\pm$ (13.87), which are eigenfunctions of $\hat{O}_{\sigma_{v}}$, are not eigenfunctions of $\hat{L}_{z}$ but are superpositions of $\hat{L}_{z}$ eigenfunctions with eigenvalues +2 and -2.

We can distinguish + and - terms for one-electron configurations. The wave function of a single $\sigma$ electron has no phi factor and hence must correspond to a $\Sigma^{+}$ term. For a $\pi$ electron, the MOs that are eigenfunctions of $\hat{L}_{z}$ are the $\pi_{+1}$ and $\pi_{-1}$ functions (whose probability densities are each symmetric about the $z$ axis; Fig. 13.14). The $\pi_{+1}$ and $\pi_{-1}$ functions are not eigenfunctions of $\hat{O}_{\sigma_{v}}$, but the linear combinations $\pi_{+1}+\pi_{-1}=\pi_{x}$ and $\pi_{+1}-\pi_{-1}=\pi_{y}$ are. The $\pi_{x}$ and $\pi_{y}$ MOs (whose probability densities are not symmetric about the $z$ axis; Fig. 13.15) are the correct zeroth-order functions if the perturbation of the electronic wave functions due to molecular rotation is considered. The $\pi_{x}$ and $\pi_{y}$ MOs have eigenvalues +1 and -1, respectively, for reflection in the $x z$ plane, and eigenvalues -1 and +1, respectively, for reflection in the $y z$ plane. (The operators $\hat{L}_{z}$ and $\hat{O}_{\sigma_{v}}$ do not commute; Prob. 13.31. Hence we cannot have all the eigenfunctions of $\hat{H}$ being eigenfunctions of both these operators as well. However, since each of these operators commutes with the electronic Hamiltonian and since there is no element of choice in the wave function of a nondegenerate level, all the $\sigma$ MOs must be eigenfunctions of both $\hat{L}_{z}$ and $\hat{O}_{\sigma_{v}}$.)

Electrons in the same molecular shell are called equivalent. There are fewer terms for equivalent electrons than for the corresponding nonequivalent electron configuration, because of the Pauli principle. Thus, for a $\pi^{2}$ configuration of two equivalent $\pi$ electrons, four of the eight functions (13.85) to (13.89) vanish; the remaining functions give a ${ }^{1} \Delta$ term, a ${ }^{1} \Sigma^{+}$term, and a ${ }^{3} \Sigma^{-}$term. Alternatively, we can make a table similar to Table 11.1 and use it to derive the terms for equivalent electrons.
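The "table similar to Table 11.1" route mentioned here is easy to mechanize: list the Pauli-allowed microstates of the shell, tabulate $(M_{L}, M_{S})$, and peel terms off from the extremes. A toy sketch for $\pi^{2}$ follows (our implementation of the counting, not the text's worked table; the $\pm$ superscripts come from the reflection analysis above, which mere counting cannot supply):

```python
from collections import Counter
from itertools import combinations

LETTER = {0: "Sigma", 1: "Pi", 2: "Delta", 3: "Phi", 4: "Gamma"}

# Spin-orbitals of one pi shell: (m, m_s) with m = +/-1 and m_s = +/-1/2.
spin_orbitals = [(m, ms) for m in (1, -1) for ms in (0.5, -0.5)]

# Pauli principle: the two equivalent electrons occupy distinct spin-orbitals.
micro = Counter()
for (m1, ms1), (m2, ms2) in combinations(spin_orbitals, 2):
    micro[(m1 + m2, ms1 + ms2)] += 1

# Repeatedly take the microstate with largest |M_L|, then largest M_S; it heads
# a term with Lambda = |M_L| and S = M_S, which accounts for one microstate at
# each (M_L = +/-Lambda or 0, M_S = S, ..., -S) combination.
terms = []
while micro:
    ML, MS = max(micro, key=lambda k: (abs(k[0]), k[1]))
    lam, S = abs(ML), MS
    terms.append((int(2 * S + 1), LETTER[lam]))
    for ml in ((lam, -lam) if lam else (0,)):
        ms = S
        while ms >= -S:
            micro[(ml, ms)] -= 1
            if micro[(ml, ms)] == 0:
                del micro[(ml, ms)]
            ms -= 1

print(terms)   # [(1, 'Delta'), (3, 'Sigma'), (1, 'Sigma')]
```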

Table 13.3 lists terms arising from various electron configurations. A filled shell always gives the single term ${ }^{1} \Sigma^{+}$. A $\pi^{3}$ configuration gives the same result as a $\pi$ configuration.

For homonuclear diatomic molecules, a $g$ or $u$ right subscript is added to the term symbol to show the parity of the electronic states belonging to the term. Terms arising from an electron configuration that has an odd number of electrons in molecular orbitals of odd parity are odd $(u)$; all other terms are even $(g)$. This is the same rule as for atoms.
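This parity rule is a simple odd/even count; a minimal sketch (the configuration encoding is ours):

```python
def term_parity(config):
    """config: list of (parity, n_electrons) pairs, parity 'g' or 'u'.
    Terms are u when the number of electrons in u MOs is odd, else g."""
    n_u = sum(n for parity, n in config if parity == "u")
    return "u" if n_u % 2 else "g"

# (sigma_g 1s)^1 (sigma_u* 1s)^1: one electron in a u MO -> odd -> u terms
print(term_parity([("g", 1), ("u", 1)]))   # 'u'
# An open (pi_g* 2p)^2 shell: zero electrons in u MOs -> g terms
print(term_parity([("g", 2)]))             # 'g'
```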

The term symbols given in Table 13.2 are readily derived from the MO configurations. For example, $\mathrm{O}_{2}$ has a $\pi^{2}$ configuration, which gives the three terms ${ }^{1} \Sigma_{g}^{+},{ }^{3} \Sigma_{g}^{-}$, and ${ }^{1} \Delta_{g}$. Hund's rule tells us that ${ }^{3} \Sigma_{g}^{-}$ is the lowest term, as listed. The $v=0$ levels of the ${ }^{1} \Delta_{g}$ and ${ }^{1} \Sigma_{g}^{+}$ $\mathrm{O}_{2}$ terms lie 0.98 eV and 1.6 eV, respectively, above the $v=0$ level of the ground ${ }^{3} \Sigma_{g}^{-}$ term. Singlet $\mathrm{O}_{2}$ is a reaction intermediate in many organic, biochemical, and inorganic reactions. [See C. S. Foote et al., eds., Active Oxygen in Chemistry, Springer, 1995; J. S. Valentine et al., eds., Active Oxygen in Biochemistry, Springer, 1995; C. Schweitzer and R. Schmidt, Chem. Rev., 103, 1685 (2003).]

Most stable diatomic molecules have a ${ }^{1} \Sigma^{+}$ ground term (${ }^{1} \Sigma_{g}^{+}$ for homonuclear diatomics). Exceptions include $\mathrm{B}_{2}$, $\mathrm{Al}_{2}$, $\mathrm{Si}_{2}$, $\mathrm{O}_{2}$, and NO, which has a ${ }^{2} \Pi$ ground term.

Spectroscopists prefix the ground term of a molecule by the symbol $X$. Excited terms of the same spin multiplicity as the ground term are designated as $A, B, C, \ldots$, while excited terms of different spin multiplicity from the ground term are designated as $a, b, c, \ldots$ Exceptions are $\mathrm{C}_{2}$ and $\mathrm{N}_{2}$, where the ground terms are ${ }^{1} \Sigma_{g}^{+}$ but the letters $A, B, C, \ldots$ are used for excited triplet terms.

Just as for atoms, spin-orbit interaction can split a molecular term into closely spaced energy levels, giving a multiplet structure to the term. The projection of the total electronic

TABLE 13.3 Electronic Terms of Diatomic Molecules

Configuration | Terms
$\sigma \sigma$ | ${ }^{1} \Sigma^{+},{ }^{3} \Sigma^{+}$
$\sigma \pi ; \sigma \pi^{3}$ | ${ }^{1} \Pi,{ }^{3} \Pi$
$\pi \pi ; \pi \pi^{3}$ | ${ }^{1} \Sigma^{+},{ }^{3} \Sigma^{+},{ }^{1} \Sigma^{-},{ }^{3} \Sigma^{-},{ }^{1} \Delta,{ }^{3} \Delta$
$\pi \delta ; \pi^{3} \delta ; \pi \delta^{3}$ | ${ }^{1} \Pi,{ }^{3} \Pi,{ }^{1} \Phi,{ }^{3} \Phi$
$\sigma$ | ${ }^{2} \Sigma^{+}$
$\sigma^{2} ; \pi^{4} ; \delta^{4}$ | ${ }^{1} \Sigma^{+}$
$\pi ; \pi^{3}$ | ${ }^{2} \Pi$
$\pi^{2}$ | ${ }^{1} \Sigma^{+},{ }^{3} \Sigma^{-},{ }^{1} \Delta$
$\delta ; \delta^{3}$ | ${ }^{2} \Delta$
$\delta^{2}$ | ${ }^{1} \Sigma^{+},{ }^{3} \Sigma^{-},{ }^{1} \Gamma$

spin $\mathbf{S}$ on the molecular axis is $M_{S} \hbar$. In molecules the quantum number $M_{S}$ is called $\Sigma$ (not to be confused with the symbol meaning $\Lambda=0$):

\(
\Sigma=S, S-1, \ldots,-S
\)

The axial components of electronic orbital and spin angular momenta add, giving as the total axial component of electronic angular momentum $(\Lambda+\Sigma) \hbar$. (Recall that $\Lambda$ is the absolute value of $M_{L}$. We consider $\Sigma$ to be positive when it has the same direction as $\Lambda$, and negative when it has the direction opposite to $\Lambda$.) The possible values of $\Lambda+\Sigma$ are

\(
\Lambda+S, \quad \Lambda+S-1, \ldots, \quad \Lambda-S
\)

The value of $\Lambda+\Sigma$ is written as a right subscript to the term symbol to distinguish the energy levels of the term. Thus a ${ }^{3} \Delta$ term has $\Lambda=2$ and $S=1$ and gives rise to the levels ${ }^{3} \Delta_{3},{ }^{3} \Delta_{2}$, and ${ }^{3} \Delta_{1}$. In a sense, $\Lambda+\Sigma$ is the analog in molecules of the quantum number $J$ in atoms. However, $\Lambda+\Sigma$ is the quantum number of the $z$ component of total electronic angular momentum and therefore can take on negative values. Thus a ${ }^{4} \Pi$ term has the four levels ${ }^{4} \Pi_{5 / 2},{ }^{4} \Pi_{3 / 2},{ }^{4} \Pi_{1 / 2}$, and ${ }^{4} \Pi_{-1 / 2}$. The absolute value of $\Lambda+\Sigma$ is called $\Omega$:

\(
\begin{equation}
\Omega \equiv|\Lambda+\Sigma| \tag{13.90}
\end{equation}
\)
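Listing the multiplet levels is mechanical: $\Sigma$ runs from $S$ down to $-S$ and the level subscript is $\Lambda+\Sigma$. A short sketch (the helper name is ours) that reproduces the ${ }^{3} \Delta$ and ${ }^{4} \Pi$ examples above:

```python
from fractions import Fraction

def multiplet_levels(Lambda, S):
    """Return the Lambda + Sigma subscripts, with Sigma = S, S-1, ..., -S."""
    S = Fraction(S)
    levels, sigma = [], S
    while sigma >= -S:
        levels.append(Lambda + sigma)
        sigma -= 1
    return levels

print([str(x) for x in multiplet_levels(2, 1)])
# ['3', '2', '1']: the 3Delta_3, 3Delta_2, 3Delta_1 levels
print([str(x) for x in multiplet_levels(1, Fraction(3, 2))])
# ['5/2', '3/2', '1/2', '-1/2']: the four 4Pi levels
```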

The spin-orbit interaction energy in diatomic molecules can be shown to be well approximated by $A \Lambda \Sigma$, where $A$ depends on $\Lambda$ and on the internuclear distance $R$ but not on $\Sigma$. The spacing between levels of the multiplet is thus constant. When $A$ is positive, the level with the lowest value of $\Lambda+\Sigma$ lies lowest, and the multiplet is regular. When $A$ is negative, the multiplet is inverted. Note that for $\Lambda \neq 0$ the spin multiplicity $2 S+1$ always equals the number of multiplet components. This is not always true for atoms.

Each energy level of a multiplet with $\Lambda \neq 0$ is doubly degenerate, corresponding to the two values for $M_{L}$. Thus a ${ }^{3} \Delta$ term has six different wave functions [Eqs. (13.86), (13.88), (11.57) to (11.59)] and therefore six different molecular electronic states. Spin-orbit interaction splits the ${ }^{3} \Delta$ term into three levels, each doubly degenerate. The double degeneracy of the levels is removed by the $\Lambda$-type doubling mentioned previously.

For $\Sigma$ terms $(\Lambda=0)$, the spin-orbit interaction is very small (zero in the first approximation), and the quantum numbers $\Sigma$ and $\Omega$ are not defined.
A ${ }^{1} \Sigma$ term always corresponds to a single nondegenerate energy level.


The hydrogen molecule is the simplest molecule containing an electron-pair bond. The purely electronic Hamiltonian (13.5) for $\mathrm{H}_{2}$ is, in atomic units,

\(
\begin{equation}
\hat{H}=-\frac{1}{2} \nabla_{1}^{2}-\frac{1}{2} \nabla_{2}^{2}-\frac{1}{r_{a 1}}-\frac{1}{r_{a 2}}-\frac{1}{r_{b 1}}-\frac{1}{r_{b 2}}+\frac{1}{r_{12}} \tag{13.91}
\end{equation}
\)

where 1 and 2 are the electrons and $a$ and $b$ are the nuclei (Fig. 13.18). Just as in the helium atom, the $1 / r_{12}$ interelectronic-repulsion term prevents the Schrödinger equation from being separable. We therefore use approximation methods.

We start with the molecular-orbital approach. The ground-state electron configuration of $\mathrm{H}_{2}$ is $\left(\sigma_{g} 1 s\right)^{2}$, and we can write an approximate wave function as the Slater determinant

\(
\begin{align}
\frac{1}{\sqrt{2}}\left|\begin{array}{ll}
\sigma_{g} 1 s(1) \alpha(1) & \sigma_{g} 1 s(1) \beta(1) \\
\sigma_{g} 1 s(2) \alpha(2) & \sigma_{g} 1 s(2) \beta(2)
\end{array}\right| & =\sigma_{g} 1 s(1) \sigma_{g} 1 s(2) \cdot 2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] \\
& =f(1) f(2) \cdot 2^{-1 / 2}[\alpha(1) \beta(2)-\beta(1) \alpha(2)] \tag{13.92}
\end{align}
\)

FIGURE 13.18 Interparticle distances in $\mathrm{H}_{2}$.

which is similar to $(10.26)$ for the helium atom. To save time, we write $f$ instead of $\sigma_{g} 1 s$. As we saw in Section 10.4, omission of the spin factor does not affect the variational integral for a two-electron problem. Hence we want to choose $f$ so as to minimize

\(
\frac{\iint f^{*}(1) f^{*}(2) \hat{H} f(1) f(2) d v_{1} d v_{2}}{\iint|f(1)|^{2}|f(2)|^{2} d v_{1} d v_{2}}
\)

where the integration is over the spatial coordinates of the two electrons. Ideally, $f$ should be found by an SCF calculation. For simplicity we can use an $\mathrm{H}_{2}^{+}$-like MO. (The $\mathrm{H}_{2}$ Hamiltonian becomes the sum of two $\mathrm{H}_{2}^{+}$ Hamiltonians if we omit the $1 / r_{12}$ term.) We saw in Section 13.5 that the function [Eq. (13.57)]

\(
\frac{k^{3 / 2}}{(2 \pi)^{1 / 2}\left(1+S_{a b}\right)^{1 / 2}}\left(e^{-k r_{a}}+e^{-k r_{b}}\right)
\)

gives a good approximation to the ground-state $\mathrm{H}_{2}^{+}$ wave function. Hence we try as a variation function $\phi$ for $\mathrm{H}_{2}$ the product of two such LCAO functions, one for each electron:

\(
\begin{gather}
\phi=\frac{\zeta^{3}}{2 \pi\left(1+S_{a b}\right)}\left(e^{-\zeta r_{a 1}}+e^{-\zeta r_{b 1}}\right)\left(e^{-\zeta r_{a 2}}+e^{-\zeta r_{b 2}}\right) \tag{13.93}\\
\phi=\frac{1}{2\left(1+S_{a b}\right)}\left[1 s_{a}(1)+1 s_{b}(1)\right]\left[1 s_{a}(2)+1 s_{b}(2)\right] \tag{13.94}
\end{gather}
\)

where the effective nuclear charge $\zeta$ will differ from $k$ for $\mathrm{H}_{2}^{+}$. Since

\(
\hat{H}=\hat{H}_{1}^{0}+\hat{H}_{2}^{0}+1 / r_{12}
\)

where $\hat{H}_{1}^{0}$ and $\hat{H}_{2}^{0}$ are $\mathrm{H}_{2}^{+}$ Hamiltonians for each electron, we have

\(
\iint \phi^{*} \hat{H} \phi d v_{1} d v_{2}=2 W_{1}+\iint \frac{\phi^{2}}{r_{12}} d v_{1} d v_{2}
\)

where $W_{1}$ is given by (13.63) with $k$ replaced by $\zeta$. The evaluation of the $1 / r_{12}$ integral is complicated and is omitted [see Slater, Quantum Theory of Molecules and Solids, Volume 1, page 65, and Appendix 6]. Coulson performed the variational calculation in 1937, using (13.93). [For the literature references of the $\mathrm{H}_{2}$ calculations mentioned in this and later sections, see the bibliography in A. D. McLean et al., Rev. Mod. Phys., 32, 211 (1960).] Coulson found $R_{e}=0.732 \AA$, which is close to the true value $0.741 \AA$; the minimum in the calculated $U(R)$ curve gave $D_{e}=3.49 \mathrm{eV}$, as compared with the true value 4.75 eV (Table 13.2). (Of course, the percent error in the total electronic energy is much less than the percent error in $D_{e}$, but $D_{e}$ is the quantity of chemical interest.) The value of $\zeta$ at $0.732 \AA$ is 1.197, which is less than $k$ for $\mathrm{H}_{2}^{+}$. We attribute this to the screening of the nuclei from each electron by the other electron.
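As a numerical aside (not part of Coulson's calculation): for two $1 s$ Slater orbitals with the same exponent $\zeta$ on nuclei a distance $R$ apart (in bohr), the overlap integral has the standard closed form $S_{a b}=e^{-\zeta R}\left(1+\zeta R+\zeta^{2} R^{2} / 3\right)$, so the overlap at Coulson's optimum geometry is easy to check:

```python
import math

def overlap_1s(zeta, R):
    """Two-center overlap of two 1s STOs with equal exponents; R in bohr."""
    w = zeta * R
    return math.exp(-w) * (1 + w + w * w / 3)

R = 0.732 / 0.529177   # Coulson's R_e, converted from angstroms to bohr
print(overlap_1s(1.197, R))   # about 0.68: substantial overlap at R_e
```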

How can we improve on the above simple MO result? We can look for the best possible MO function $f$ in (13.92) to get the Hartree-Fock wave function for $\mathrm{H}_{2}$. This was done by Kolos and Roothaan [W. Kolos and C. C. J. Roothaan, Rev. Mod. Phys., 32, 219 (1960)]. They expanded $f$ in elliptic coordinates [Eq. (13.34)]. Since $m=0$ for the ground state, the $e^{i m \phi}$ factor in the SCF MO is equal to 1 and $f$ is a function of $\xi$ and $\eta$ only. The expansion used is

\(
f=e^{-\alpha \xi} \sum_{p, q} a_{p q} \xi^{p} \eta^{q}
\)

where $p$ and $q$ are integers and $\alpha$ and $a_{p q}$ are variational parameters. The Hartree-Fock results are $R_{e}=0.732 \AA$ and $D_{e}=3.64 \mathrm{eV}$, which is not much improvement over the value 3.49 eV given by the simple LCAO molecular orbital. The correlation energy for $\mathrm{H}_{2}$ is thus $-1.11$ eV, close to the value $-1.14$ eV for the two-electron helium atom (Section 11.3). To get a truly accurate binding energy, we must go beyond the SCF approximation of writing the wave function in the form $f(1) f(2)$. We can use the same methods we used for atoms: configuration interaction and introduction of $r_{12}$ into the trial function.

First, consider configuration interaction (CI). To reach the exact ground-state wave function, we include contributions from SCF (or other) functions for all the excited states with the same symmetry as the ground state. In the first approximation, only contributions from the lowest-lying excited states are included. The first excited configuration of $\mathrm{H}_{2}$ is $\left(\sigma_{g} 1 s\right)\left(\sigma_{u}^{*} 1 s\right)$, which gives the terms ${ }^{1} \Sigma_{u}^{+}$ and ${ }^{3} \Sigma_{u}^{+}$. (We have one $g$ and one $u$ electron, so the terms are of odd parity.) The ground-state configuration $\left(\sigma_{g} 1 s\right)^{2}$ is a ${ }^{1} \Sigma_{g}^{+}$ state. Hence we do not get any contribution from the $\left(\sigma_{g} 1 s\right)\left(\sigma_{u}^{*} 1 s\right)$ states, since they have different parity from the ground state. Next consider the configuration $\left(\sigma_{u}^{*} 1 s\right)^{2}$. This is a closed-shell configuration having the single state ${ }^{1} \Sigma_{g}^{+}$. This is of the right symmetry to contribute to the ground-state wave function. As a simple CI trial function, we can take a linear combination of the MO wave functions for the $\left(\sigma_{g} 1 s\right)^{2}$ and $\left(\sigma_{u}^{*} 1 s\right)^{2}$ configurations. To simplify things, we will use the LCAO-MOs as approximations to the MOs. Thus we take

\(
\begin{equation}
\phi=\sigma_{g} 1 s(1) \sigma_{g} 1 s(2)+c \sigma_{u}^{*} 1 s(1) \sigma_{u}^{*} 1 s(2) \tag{13.95}
\end{equation}
\)

where $\sigma_{g} 1 s$ and $\sigma_{u}^{*} 1 s$ are given by (13.57) and (13.58) with a variable orbital exponent and $c$ is a variational parameter. This calculation was performed by Weinbaum in 1933. The result is a bond length of $0.757 \AA$ and a dissociation energy of 4.03 eV, which is a considerable improvement over the Hartree-Fock result $D_{e}=3.64 \mathrm{eV}$. The orbital exponent has the optimum value 1.19. We can improve on this result by using a better form for the MOs of each configuration and by including more configuration functions. Hagstrom did a CI calculation in which the MOs were represented by expansions in elliptic coordinates. With 33 configuration functions, he found $D_{e}=4.71 \mathrm{eV}$, close to the true value 4.75 eV [S. Hagstrom and H. Shull, Rev. Mod. Phys., 35, 624 (1963)].

Now consider the use of $r_{12}$ in $\mathrm{H}_{2}$ trial functions. The first really accurate calculation of the hydrogen-molecule ground state was done by James and Coolidge in 1933. They used the trial function

\(
\exp \left[-\delta\left(\xi_{1}+\xi_{2}\right)\right] \sum c_{m n j k p}\left[\xi_{1}^{m} \xi_{2}^{n} \eta_{1}^{j} \eta_{2}^{k}+\xi_{1}^{n} \xi_{2}^{m} \eta_{1}^{k} \eta_{2}^{j}\right] r_{12}^{p}
\)

where the summation is over integral values of $m, n, j, k$, and $p$. The variational parameters are $\delta$ and the $c_{m n j k p}$ coefficients. The James and Coolidge function is symmetric with respect to interchange of electrons 1 and 2, as it should be, since we have an antisymmetric ground-state spin function. With 13 terms in the sum, James and Coolidge found $D_{e}=4.72 \mathrm{eV}$, only 0.03 eV in error. Their work has been extended by Kolos, Wolniewicz, and co-workers, who used as many as 279 terms in the sum. Since it is $D_{0}$ that is determined from the observed electronic spectrum, they used the Cooley-Numerov method (Section 13.2) to calculate the vibrational levels from their theoretical $U(R)$ curve and then calculated $D_{0}$. Including relativistic corrections and corrections to the Born-Oppenheimer approximation, they found $D_{0} / h c=36118.1 \mathrm{~cm}^{-1}$, in agreement with the spectroscopically determined value $36118.1 \mathrm{~cm}^{-1}$ [W. Kolos et al., J. Chem. Phys., 84, 3278 (1986); L. Wolniewicz, J. Chem. Phys., 99, 1851 (1993)]. An even more precise $D_{0}$ was calculated by workers who did high-precision calculations of relativistic corrections, corrections to the Born-Oppenheimer approximation, and quantum-electrodynamics corrections using a Born-Oppenheimer $U(R)$ found from wave functions having as many as 7000 terms to get $D_{0} / h c=36118.0695(10) \mathrm{~cm}^{-1}$, where the number in parentheses is the estimate of the uncertainty of the last digits [K. Piszczatowski et al., J. Chem. Theory Comput., 5, 3039 (2009); www.fuw.edu.pl/~krp/papers/D0.pdf]. The experimental value is $36118.0696(4) \mathrm{~cm}^{-1}$. (Quantum electrodynamics is the relativistic quantum-mechanical theory of the interaction of radiation and matter formulated by Feynman, Schwinger, and Tomonaga in the late 1940s and is an improvement on the quantum field theory of Dirac.)


The first quantum-mechanical treatment of the hydrogen molecule was by Heitler and London in 1927. Their ideas have been extended to give a general theory of chemical bonding, known as the valence-bond (VB) theory. The valence-bond method is more closely related to the chemist's idea of molecules as consisting of atoms held together by localized bonds than is the molecular-orbital method. The VB method views molecules as composed of atomic cores (nuclei plus inner-shell electrons) and bonding valence electrons. For $\mathrm{H}_{2}$, both electrons are valence electrons.

The first step in the Heitler-London treatment of the $\mathrm{H}_{2}$ ground state is to approximate the molecule as two ground-state hydrogen atoms. The wave function for two such noninteracting atoms is

\(
f_{1}=1 s_{a}(1) 1 s_{b}(2)
\)

where $a$ and $b$ refer to the nuclei and 1 and 2 refer to the electrons. Of course, the function

\(
f_{2}=1 s_{a}(2) 1 s_{b}(1)
\)

is also a valid wave function. This then suggests the trial variation function

\(
\begin{equation}
c_{1} f_{1}+c_{2} f_{2}=c_{1} 1 s_{a}(1) 1 s_{b}(2)+c_{2} 1 s_{a}(2) 1 s_{b}(1) \tag{13.96}
\end{equation}
\)

This linear variation function leads to the determinantal secular equation $\operatorname{det}\left(H_{i j}-S_{i j} W\right)=0$ [Eq. (8.57)], where $H_{11}=\left\langle f_{1}\right| \hat{H}\left|f_{1}\right\rangle$, $S_{11}=\left\langle f_{1} \mid f_{1}\right\rangle, \ldots$

We can also consider the problem using perturbation theory (as Heitler and London did). A ground-state hydrogen molecule dissociates to two neutral ground-state hydrogen atoms. We therefore take as the unperturbed problem two ground-state hydrogen atoms at infinite separation. One possible zeroth-order (unperturbed) wave function is $1 s_{a}(1) 1 s_{b}(2)$. However, electron 2 could just as well be bound to nucleus $a$, giving the unperturbed wave function $1 s_{a}(2) 1 s_{b}(1)$. These two unperturbed wave functions belong to a doubly degenerate energy level (exchange degeneracy). Under the perturbation of molecule formation, the
doubly degenerate level is split into two levels, and the correct zeroth-order wave functions are linear combinations of the two unperturbed wave functions:

\(
c_{1} 1 s_{a}(1) 1 s_{b}(2)+c_{2} 1 s_{a}(2) 1 s_{b}(1)
\)

This leads to a $2 \times 2$ secular determinant that is the same as (8.56), except that $W$ is replaced by $E^{(0)}+E^{(1)}$; see Prob. 9.20.

We now solve the secular equation. The Hamiltonian is Hermitian, all functions are real, and $f_{1}$ and $f_{2}$ are normalized. Therefore

\(
H_{12}=H_{21}, \quad S_{12}=S_{21}, \quad S_{11}=S_{22}=1
\)

Consider $H_{11}$ and $H_{22}$:

\(
\begin{aligned}
H_{11} & =\left\langle 1 s_{a}(1) 1 s_{b}(2)\right| \hat{H}\left|1 s_{a}(1) 1 s_{b}(2)\right\rangle \\
H_{22} & =\left\langle 1 s_{a}(2) 1 s_{b}(1)\right| \hat{H}\left|1 s_{a}(2) 1 s_{b}(1)\right\rangle
\end{aligned}
\)

Interchange of the coordinate labels 1 and 2 in $H_{22}$ converts $H_{22}$ to $H_{11}$, since this relabeling leaves $\hat{H}$ unchanged. Hence $H_{11}=H_{22}$. The secular equation $\operatorname{det}\left(H_{i j}-S_{i j} W\right)=0$ becomes

\(
\left|\begin{array}{cc}
H_{11}-W & H_{12}-W S_{12} \\
H_{12}-W S_{12} & H_{11}-W
\end{array}\right|=0 \tag{13.97}
\)

This equation has the same form as Eq. (13.49), and by analogy to Eqs. (13.51), (13.57), and (13.58) the approximate energies and wave functions are

\(
\begin{array}{cl}
W_{1}=\frac{H_{11}+H_{12}}{1+S_{12}}, & W_{2}=\frac{H_{11}-H_{12}}{1-S_{12}} \tag{13.98}\\
\phi_{1}=\frac{f_{1}+f_{2}}{\sqrt{2}\left(1+S_{12}\right)^{1 / 2}}, & \phi_{2}=\frac{f_{1}-f_{2}}{\sqrt{2}\left(1-S_{12}\right)^{1 / 2}} \tag{13.99}
\end{array}
\)
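Equation (13.98) is simply the closed-form solution of the $2 \times 2$ generalized eigenvalue problem $\operatorname{det}\left(H_{i j}-S_{i j} W\right)=0$, which can be checked numerically; the $H_{11}, H_{12}, S_{12}$ values below are arbitrary placeholders, not computed integrals:

```python
import numpy as np
from scipy.linalg import eigh

H11, H12, S12 = -1.10, -0.60, 0.50        # placeholder values, in hartrees
H = np.array([[H11, H12], [H12, H11]])
S = np.array([[1.0, S12], [S12, 1.0]])

W, C = eigh(H, S)                          # generalized eigenproblem H c = W S c
print(W)                                   # ascending: the bonding root first
print((H11 + H12) / (1 + S12), (H11 - H12) / (1 - S12))   # Eq. (13.98)
```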

The numerators of (13.99) are

\(
f_{1} \pm f_{2}=1 s_{a}(1) 1 s_{b}(2) \pm 1 s_{a}(2) 1 s_{b}(1)
\)

From our previous discussion, we know that the ground state of $\mathrm{H}_{2}$ is a ${ }^{1} \Sigma$ state with the antisymmetric spin factor (11.60) and a symmetric spatial factor. Hence $\phi_{1}$ must be the ground state. The Heitler-London ground-state wave function is

\(
\begin{equation}
\frac{1 s_{a}(1) 1 s_{b}(2)+1 s_{a}(2) 1 s_{b}(1)}{\sqrt{2}\left(1+S_{12}\right)^{1 / 2}} \frac{1}{\sqrt{2}}[\alpha(1) \beta(2)-\alpha(2) \beta(1)] \tag{13.100}
\end{equation}
\)

The Heitler-London wave functions for the three states of the lowest ${ }^{3} \Sigma$ term are

\(
\frac{1 s_{a}(1) 1 s_{b}(2)-1 s_{a}(2) 1 s_{b}(1)}{\sqrt{2}\left(1-S_{12}\right)^{1 / 2}}\left\{\begin{array}{l}
\alpha(1) \alpha(2) \\
2^{-1 / 2}[\alpha(1) \beta(2)+\beta(1) \alpha(2)] \\
\beta(1) \beta(2)
\end{array}\right. \tag{13.101}
\)

where $S_{12}$ is given in Prob. 13.33.
Now consider the ground-state energy expression. We write the molecular electronic Hamiltonian as the sum of two H-atom Hamiltonians plus perturbing terms:

\(
\begin{equation}
\hat{H}=\hat{H}_{a}(1)+\hat{H}_{b}(2)+\hat{H}^{\prime} \tag{13.102}
\end{equation}
\)

\(
\hat{H}_{a}(1)=-\frac{1}{2} \nabla_{1}^{2}-\frac{1}{r_{a 1}}, \quad \hat{H}_{b}(2)=-\frac{1}{2} \nabla_{2}^{2}-\frac{1}{r_{b 2}}, \quad \hat{H}^{\prime}=-\frac{1}{r_{b 1}}-\frac{1}{r_{a 2}}+\frac{1}{r_{12}}
\)

The Heitler-London calculation does not introduce an effective nuclear charge into the $1 s$ function. Hence $1 s_{a}(1)$ is an eigenfunction of $\hat{H}_{a}(1)$ with eigenvalue $-\frac{1}{2}$ hartree, the hydrogen-atom ground-state energy. Using this result, one finds the following expressions for the VB energies (Prob. 13.33):

\(
\begin{equation}
W_{1}=-1+\frac{Q+A}{1+S_{a b}^{2}}, \quad W_{2}=-1+\frac{Q-A}{1-S_{a b}^{2}} \tag{13.103}
\end{equation}
\)

where the Coulomb integral $Q$ and the exchange integral $A$ are defined by:

\(
\begin{align}
Q & \equiv\left\langle 1 s_{a}(1) 1 s_{b}(2)\right| \hat{H}^{\prime}\left|1 s_{a}(1) 1 s_{b}(2)\right\rangle \tag{13.104}\\
A & \equiv\left\langle 1 s_{a}(2) 1 s_{b}(1)\right| \hat{H}^{\prime}\left|1 s_{a}(1) 1 s_{b}(2)\right\rangle \tag{13.105}
\end{align}
\)

and the overlap integral $S_{a b}$ is defined by (13.48). The quantity $-1$ hartree in these expressions is the energy of two ground-state hydrogen atoms. To obtain the $U(R)$ potential-energy curves, we add the internuclear repulsion $1 / R$ to these expressions.

Many of the integrals needed to evaluate $W_{1}$ and $W_{2}$ have been evaluated in the treatment of $\mathrm{H}_{2}^{+}$ in Section 13.5. The only new integrals are those involving $1 / r_{12}$. The hardest one is the two-center, two-electron exchange integral:

\(
\iint 1 s_{a}(1) 1 s_{b}(2) \frac{1}{r_{12}} 1 s_{a}(2) 1 s_{b}(1) d v_{1} d v_{2}
\)

Two-center means that the integrand contains functions centered on two different nuclei, $a$ and $b$; two-electron means that the coordinates of two electrons occur in the integrand. This can be evaluated using an expansion for $1 / r_{12}$ in confocal elliptic coordinates, similar to the expansion in Prob. 9.14 in spherical coordinates. Details of the integral evaluations are given in Slater, Quantum Theory of Molecules and Solids, Volume 1, Appendix 6. The results of the Heitler-London treatment are $D_{e}=3.15 \mathrm{eV}$, $R_{e}=0.87 \AA$. The agreement with the experimental values $D_{e}=4.75 \mathrm{eV}$, $R_{e}=0.741 \AA$ is only fair. In this treatment, most of the binding energy is provided by the exchange integral $A$.

Consider some improvements on the Heitler-London function (13.100). One obvious step is the introduction of an orbital exponent $\zeta$ in the $1 s$ function. This was done by Wang in 1928. The optimum value of $\zeta$ is 1.166 at $R_{e}$, and $D_{e}$ and $R_{e}$ are improved to 3.78 eV and $0.744 \AA$. Recall that Dickinson in 1933 improved the Finkelstein-Horowitz $\mathrm{H}_{2}^{+}$ trial function by mixing in some $2 p_{z}$ character into the atomic orbitals (hybridization). In 1931 Rosen used this idea to improve the Heitler-London-Wang function. He took the trial function

\(
\phi=\phi_{a}(1) \phi_{b}(2)+\phi_{a}(2) \phi_{b}(1)
\)

where the atomic orbital $\phi_{a}$ is given by $\phi_{a}=e^{-\zeta r_{a}}\left(1+c z_{a}\right)$, with a similar expression for $\phi_{b}$. This allows for the polarization of the AOs on molecule formation. The result is a binding energy of 4.04 eV. Another improvement, the use of ionic structures, will be considered in the next section.


Let us compare the molecular-orbital and valence-bond treatments of the $\mathrm{H}_{2}$ ground state.
If $\phi_{a}$ symbolizes an atomic orbital centered on nucleus $a$, the spatial factor of the unnormalized LCAO-MO wave function for the $\mathrm{H}_{2}$ ground state is

\(
\begin{equation}
\left[\phi_{a}(1)+\phi_{b}(1)\right]\left[\phi_{a}(2)+\phi_{b}(2)\right] \tag{13.106}
\end{equation}
\)

In the simplest treatment, $\phi$ is a $1 s$ AO. The function (13.106) equals

\(
\begin{equation}
\phi_{a}(1) \phi_{a}(2)+\phi_{b}(1) \phi_{b}(2)+\phi_{a}(1) \phi_{b}(2)+\phi_{b}(1) \phi_{a}(2) \tag{13.107}
\end{equation}
\)

What is the physical significance of the terms? The last two terms have each electron in an atomic orbital centered on a different nucleus. These are covalent terms, corresponding to equal sharing of the electrons between the atoms. The first two terms have both electrons in AOs centered on the same nucleus. These are ionic terms, corresponding to the chemical structures

\(
\mathrm{H}^{-} \mathrm{H}^{+} \text {and } \mathrm{H}^{+} \mathrm{H}^{-}
\)

The covalent and ionic terms occur with equal weight, so this simple MO function gives a 50-50 chance as to whether the $\mathrm{H}_{2}$ ground state dissociates to two neutral hydrogen atoms or to a proton and a hydride ion. Actually, the $\mathrm{H}_{2}$ ground state dissociates to two neutral H atoms. Thus the simple MO function gives the wrong limiting value of the energy as $R$ goes to infinity.
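The expansion of (13.106) into (13.107), and the equal weighting of its covalent and ionic parts, can be verified symbolically; in this sketch the commuting sympy symbols a1, b2, etc. stand for the orbital factors $\phi_{a}(1)$, $\phi_{b}(2)$, and so on:

```python
import sympy as sp

a1, a2, b1, b2 = sp.symbols("a1 a2 b1 b2")

mo = sp.expand((a1 + b1) * (a2 + b2))
print(mo)                                  # the four terms of (13.107)

ionic = a1 * a2 + b1 * b2                  # both electrons on one nucleus
covalent = a1 * b2 + b1 * a2               # one electron on each nucleus
print(sp.simplify(mo - ionic - covalent))  # 0: covalent and ionic enter equally
```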

How can we remedy this? Since $\mathrm{H}_{2}$ is nonpolar, chemical intuition tells us that ionic terms should contribute substantially less to the wave function than covalent terms. The simplest procedure is to omit the ionic terms of the MO function (13.107). This gives

\(
\begin{equation}
\phi_{a}(1) \phi_{b}(2)+\phi_{b}(1) \phi_{a}(2) \tag{13.108}
\end{equation}
\)

We recognize (13.108) as the Heitler-London function (13.100).
Although interelectronic repulsion causes the electrons to avoid each other, there is some probability of finding both electrons near the same nucleus, corresponding to an ionic structure. Therefore, instead of simply dropping the ionic terms from (13.107), we might try

\(
\begin{equation}
\phi_{\mathrm{VB}, \mathrm{imp}}=\phi_{a}(1) \phi_{b}(2)+\phi_{b}(1) \phi_{a}(2)+\delta\left[\phi_{a}(1) \phi_{a}(2)+\phi_{b}(1) \phi_{b}(2)\right] \tag{13.109}
\end{equation}
\)

where $\delta(R)$ is a variational parameter and the subscript imp indicates an improved VB function. In the language of valence-bond theory, this trial function represents ionic-covalent resonance. Of course, the ground-state wave function of $\mathrm{H}_{2}$ does not undergo a time-dependent change back and forth from a covalent function corresponding to the structure $\mathrm{H}-\mathrm{H}$ to ionic functions. Rather (in the approximation we are considering), the wave function is a time-independent mixture of covalent and ionic functions. Since $\mathrm{H}_{2}$ dissociates to neutral atoms, we know that $\delta(\infty)=0$. A variational calculation done by Weinbaum in 1933 using $1 s$ AOs with an orbital exponent gave the result that at $R_{e}$ the parameter $\delta$ has the value 0.26; the orbital exponent was found to be 1.19, and the dissociation energy was calculated as 4.03 eV, a modest improvement over the Heitler-London-Wang value of 3.78 eV. With $\delta$ equal to zero in (13.109), we get the VB function (13.108). With $\delta$ equal to 1, we get the LCAO-MO function (13.107). The optimum value of $\delta$ turns out to be closer to zero than to 1, and, in fact, the Heitler-London-Wang VB function gives a better dissociation energy than the LCAO-MO function.

Let us compare the improved valence-bond trial function (13.109) with the simple LCAO-MO function improved by configuration interaction. The LCAO-MO CI trial function (13.95) has the (unnormalized) form
$\phi_{\mathrm{MO}, \text{imp}}=\left[\phi_{a}(1)+\phi_{b}(1)\right]\left[\phi_{a}(2)+\phi_{b}(2)\right]+\gamma\left[\phi_{a}(1)-\phi_{b}(1)\right]\left[\phi_{a}(2)-\phi_{b}(2)\right]$
Since we have not yet normalized this function, there is no harm in multiplying it by the constant $1 /(1-\gamma)$. Doing so and rearranging terms, we get

\(
\phi_{\mathrm{MO}, \mathrm{imp}}=\phi_{a}(1) \phi_{b}(2)+\phi_{b}(1) \phi_{a}(2)+\frac{1+\gamma}{1-\gamma}\left[\phi_{a}(1) \phi_{a}(2)+\phi_{b}(1) \phi_{b}(2)\right]
\)

There is also no harm done if we define a new constant $\delta$ as $\delta=(1+\gamma) /(1-\gamma)$. We see then that this improved MO function and the improved VB function (13.109) are identical. Weinbaum viewed his $\mathrm{H}_{2}$ calculation as a valence-bond calculation with inclusion of ionic terms. We have shown that we can just as well view the Weinbaum calculation as an MO calculation with configuration interaction. (This was the viewpoint adopted in Section 13.9.)
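The algebra in the last two paragraphs is easy to check symbolically. The following sketch (my own illustration, using the sympy library) treats $\phi_{a}(1)$, $\phi_{b}(1)$, $\phi_{a}(2)$, $\phi_{b}(2)$ as commuting symbols `a1, b1, a2, b2` and verifies that the CI function divided by $1-\gamma$ reproduces (13.109) with $\delta=(1+\gamma)/(1-\gamma)$:

```python
# A minimal symbolic check, assuming only that the orbital factors commute.
import sympy as sp

a1, b1, a2, b2, g = sp.symbols('a1 b1 a2 b2 gamma')

phi_mo_ci = (a1 + b1)*(a2 + b2) + g*(a1 - b1)*(a2 - b2)   # MO + CI trial function
delta = (1 + g)/(1 - g)
phi_vb_imp = a1*b2 + b1*a2 + delta*(a1*a2 + b1*b2)        # improved VB, Eq. (13.109)

# The difference simplifies to zero: the two functions agree up to the overall
# constant 1/(1 - gamma), which cannot affect a variational calculation.
print(sp.simplify(phi_mo_ci/(1 - g) - phi_vb_imp))        # -> 0
```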

The MO function (13.107) underestimates electron correlation, in that it says that structures with both electrons on the same atom are just as likely as structures with each electron on a different atom. The VB function (13.108) overestimates electron correlation, in that it has no contribution from structures with both electrons on the same atom. In MO theory, electron correlation can be introduced by configuration interaction. In VB theory, the overestimate of electron correlation can be reduced by ionic-covalent resonance. The simple VB method is more reliable at large $R$ than the simple MO method, since the latter predicts the wrong dissociation products.

To bring out further the differences between the MO and VB approaches, consider how each method divides the $\mathrm{H}_{2}$ electronic Hamiltonian into unperturbed and perturbation Hamiltonians. For the MO method, we write

\(
\hat{H}=\left[\left(-\frac{1}{2} \nabla_{1}^{2}-\frac{1}{r_{a 1}}-\frac{1}{r_{b 1}}\right)+\left(-\frac{1}{2} \nabla_{2}^{2}-\frac{1}{r_{a 2}}-\frac{1}{r_{b 2}}\right)\right]+\frac{1}{r_{12}}
\)

where the unperturbed Hamiltonian consists of the bracketed terms. In MO theory the unperturbed Hamiltonian for $\mathrm{H}_{2}$ is the sum of two $\mathrm{H}_{2}^{+}$ Hamiltonians, one for each electron. Accordingly, the zeroth-order MO wave function is a product of two $\mathrm{H}_{2}^{+}$-like wave functions, one for each electron. Since the $\mathrm{H}_{2}^{+}$ functions are complicated, we approximate the $\mathrm{H}_{2}^{+}$-like MOs as LCAOs. The effect of the $1 / r_{12}$ perturbation is taken into account in an average way through use of self-consistent-field molecular orbitals. To take instantaneous electron correlation into account, we can use configuration interaction.

For the valence-bond method, the terms in the Hamiltonian are grouped in either of two ways:

\(
\begin{aligned}
& \hat{H}=\left[\left(-\frac{1}{2} \nabla_{1}^{2}-\frac{1}{r_{a 1}}\right)+\left(-\frac{1}{2} \nabla_{2}^{2}-\frac{1}{r_{b 2}}\right)\right]-\frac{1}{r_{a 2}}-\frac{1}{r_{b 1}}+\frac{1}{r_{12}} \\
& \hat{H}=\left[\left(-\frac{1}{2} \nabla_{1}^{2}-\frac{1}{r_{b 1}}\right)+\left(-\frac{1}{2} \nabla_{2}^{2}-\frac{1}{r_{a 2}}\right)\right]-\frac{1}{r_{a 1}}-\frac{1}{r_{b 2}}+\frac{1}{r_{12}}
\end{aligned}
\)

The unperturbed system is two hydrogen atoms. We have two zeroth-order functions consisting of products of hydrogen-atom wave functions, and these belong to a degenerate level. The correct ground-state zeroth-order function is the linear combination (13.100).

The MO method is used far more often than the VB method, because it is computationally much simpler. The MO method was developed by Hund, Mulliken, and Lennard-Jones in the late 1920s. Originally, it was used largely for qualitative descriptions of molecules, but the electronic digital computer has made possible the calculation of accurate MO functions (Section 13.14). For a discussion of the relative merits of the MO and VB methods, see R. Hoffmann et al., Acc. Chem. Res., 36, 750 (2003).


The MO approximation puts the electrons of a molecule in molecular orbitals, which extend over the whole molecule. As an approximation to the molecular orbitals, we usually use linear combinations of atomic orbitals. The VB method puts the electrons of a molecule in
atomic orbitals and constructs the molecular wave function by allowing for "exchange" of the valence electron pairs between the atomic orbitals of the bonding atoms. We compared the two methods for $\mathrm{H}_{2}$. We now consider other homonuclear diatomic molecules.

We begin with the ground state of $\mathrm{He}_{2}$. Each separated helium atom has the ground-state configuration $1 s^{2}$. This closed-subshell configuration does not have any unpaired electrons to form valence bonds, and the VB wave function is simply the antisymmetrized product of the atomic-orbital functions. In the notation of Eq. (10.47), the $\mathrm{He}_{2}$ VB ground-state wave function is the Slater determinant

\(
\begin{equation}
\left|1 s_{a} \overline{1 s_{a}} 1 s_{b} \overline{1 s_{b}}\right| \tag{13.110}
\end{equation}
\)

The subscripts $a$ and $b$ refer to the two atoms, and the bar indicates spin function $\beta$. The $1 s$ function in this wave function is a helium-atom $1 s$ function, which ideally is an SCF atomic function but can be approximated by a hydrogenlike function with an effective nuclear charge. The VB wave function for $\mathrm{He}_{2}$ has each electron paired with another electron in an orbital on the same atom and so predicts no bonding.

In the MO approach, $\mathrm{He}_{2}$ has the ground-state configuration $\left(\sigma_{g} 1 s\right)^{2}\left(\sigma_{u}^{*} 1 s\right)^{2}$. With no net bonding electrons, no bonding is predicted, in agreement with the VB method. The MO approximation to the wave function is

\(
\begin{equation}
\left|\sigma_{g} 1 s \overline{\sigma_{g} 1 s} \sigma_{u}^{*} 1 s \overline{\sigma_{u}^{*} 1 s}\right| \tag{13.111}
\end{equation}
\)

The simplest way to approximate the (unnormalized) MOs is to take them as linear combinations of the helium-atom AOs: $\sigma_{g} 1 s=1 s_{a}+1 s_{b}$ and $\sigma_{u}^{*} 1 s=1 s_{a}-1 s_{b}$. With this approximation, (13.111) becomes

\(
\begin{equation}
\left|\left(1 s_{a}+1 s_{b}\right) \overline{\left(1 s_{a}+1 s_{b}\right)}\left(1 s_{a}-1 s_{b}\right) \overline{\left(1 s_{a}-1 s_{b}\right)}\right| \tag{13.112}
\end{equation}
\)

Using theorems about determinants, we can show (Prob. 13.34) that (13.112) is equal to

\(
\begin{equation}
4\left|1 s_{a} \overline{1 s_{a}} 1 s_{b} \overline{1 s_{b}}\right| \tag{13.113}
\end{equation}
\)

which is identical (after normalization) to the VB function (13.110). This result is easily generalized to the statement that the simple VB and simple LCAO-MO methods give the same approximate wave functions for diatomic molecules formed from separated atoms with completely filled atomic subshells. We could now substitute the trial function (13.110) into the variational integral and calculate the repulsive curve for the interaction of two ground-state He atoms.
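The factor of 4 in (13.113) can be checked numerically without any quantum chemistry (a sketch of my own, not from the text): model each spin-orbital as a vector of values on a tiny grid of combined space-spin points, so that a Slater determinant evaluated at fixed electron coordinates becomes an ordinary matrix determinant. Since each MO column is a linear combination of AO columns, the two determinants differ by the determinant of the coefficient matrix, which is 4.

```python
# Numerical sketch: det of the LCAO-MO Slater matrix (13.112) equals 4 times
# det of the VB Slater matrix (13.110), at arbitrary (random) electron positions.
import numpy as np

rng = np.random.default_rng(0)
sa, sb = rng.random(2), rng.random(2)          # 1s_a, 1s_b at 2 spatial grid points
alpha, beta = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def spin_orbital(space, spin):
    # values of space(r)*spin(ms) at the 4 combined space-spin points
    return np.kron(space, spin)

ao = [spin_orbital(sa, alpha), spin_orbital(sa, beta),
      spin_orbital(sb, alpha), spin_orbital(sb, beta)]
mo = [spin_orbital(sa + sb, alpha), spin_orbital(sa + sb, beta),
      spin_orbital(sa - sb, alpha), spin_orbital(sa - sb, beta)]

det_ao = np.linalg.det(np.column_stack(ao))    # VB determinant (13.110)
det_mo = np.linalg.det(np.column_stack(mo))    # MO determinant (13.112)
print(det_mo / det_ao)                         # -> 4.0, as in Eq. (13.113)
```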

Before going on to $\mathrm{Li}_{2}$, let us express the Heitler-London valence-bond functions for $\mathrm{H}_{2}$ as Slater determinants. The ground-state Heitler-London function (13.100) and Prob. 13.33a can be written as

\(
\begin{gather}
\frac{1}{2}\left(1+S_{a b}^{2}\right)^{-1 / 2}\left\{\left|\begin{array}{ll}
1 s_{a}(1) \alpha(1) & 1 s_{b}(1) \beta(1) \\
1 s_{a}(2) \alpha(2) & 1 s_{b}(2) \beta(2)
\end{array}\right|-\left|\begin{array}{ll}
1 s_{a}(1) \beta(1) & 1 s_{b}(1) \alpha(1) \\
1 s_{a}(2) \beta(2) & 1 s_{b}(2) \alpha(2)
\end{array}\right|\right\} \\
=\left(2+2 S_{a b}^{2}\right)^{-1 / 2}\left\{\left|1 s_{a} \overline{1 s_{b}}\right|-\left|\overline{1 s_{a}} 1 s_{b}\right|\right\} \tag{13.114}
\end{gather}
\)

In each Slater determinant, the electron on atom $a$ is paired with an electron of opposite spin on atom $b$, corresponding to the Lewis structure $\mathrm{H}-\mathrm{H}$. The Heitler-London functions (13.101) for the lowest $\mathrm{H}_{2}$ triplet state can also be written as Slater determinants. Omitting normalization constants, we write the Heitler-London $\mathrm{H}_{2}$ functions as

\(
\begin{align}
& \text { Singlet: } \quad\left|1 s_{a} \overline{1 s_{b}}\right|-\left|\overline{1 s_{a}} 1 s_{b}\right| \tag{13.115} \\
& \text { Triplet: } \quad\left\{\begin{array}{l}
\left|1 s_{a} 1 s_{b}\right| \\
\left|1 s_{a} \overline{1 s_{b}}\right|+\left|\overline{1 s_{a}} 1 s_{b}\right| \\
\left|\overline{1 s_{a}} \overline{1 s_{b}}\right|
\end{array}\right. \tag{13.116}
\end{align}
\)

Now consider $\mathrm{Li}_{2}$. The ground-state configuration of Li is $1 s^{2} 2 s$, and the Lewis structure of $\mathrm{Li}_{2}$ is $\mathrm{Li}-\mathrm{Li}$, with the two $2 s$ Li electrons paired and the $1 s$ electrons remaining in the inner shell of each atom. The part of the valence-bond wave function involving the $1 s$ electrons will be like the $\mathrm{He}_{2}$ function (13.110), while the part of the VB wave function involving the $2 s$ electrons (which form the bond) will be like the Heitler-London $\mathrm{H}_{2}$ function (13.115). Of course, because of the indistinguishability of the electrons, there is complete electronic democracy, and we must allow every electron to be in every orbital. Hence we write the ground-state VB function for $\mathrm{Li}_{2}$ using $6 \times 6$ Slater determinants:

\(
\begin{equation}
\left|1 s_{a} \overline{1 s_{a}} 1 s_{b} \overline{1 s_{b}} 2 s_{a} \overline{2 s_{b}}\right|-\left|1 s_{a} \overline{1 s_{a}} 1 s_{b} \overline{1 s_{b}} \overline{2 s_{a}} 2 s_{b}\right| \tag{13.117}
\end{equation}
\)

We have written down (13.117) simply by analogy to (13.110) and (13.115). For a fuller justification of it, we should show that it is an eigenfunction of the spin operators $\hat{S}^{2}$ and $\hat{S}_{z}$ with eigenvalue zero for each operator, which corresponds to a singlet state. This can be shown, but we omit doing so. To save space, (13.117) is sometimes written as

\(
\begin{equation}
\left|1 s_{a} \overline{1 s_{a}} 1 s_{b} \overline{1 s_{b}} \widehat{2 s_{a} 2 s_{b}}\right| \tag{13.118}
\end{equation}
\)

where the curved line indicates the pairing (bonding) of the $2 s_{a}$ and $2 s_{b}$ AOs.
The MO wave function for the $\mathrm{Li}_{2}$ ground state is

\(
\begin{equation}
\left|\sigma_{g} 1 s \overline{\sigma_{g} 1 s} \sigma_{u}^{*} 1 s \overline{\sigma_{u}^{*} 1 s} \sigma_{g} 2 s \overline{\sigma_{g} 2 s}\right| \tag{13.119}
\end{equation}
\)

If we approximate the two lowest MOs by $1 s_{a} \pm 1 s_{b}$, then the same procedure used in Prob. 13.34 to show that (13.111) is the same wave function as (13.110) shows that (13.119) is the same as

\(
\left|1 s_{a} \overline{1 s_{a}} 1 s_{b} \overline{1 s_{b}} \sigma_{g} 2 s \overline{\sigma_{g} 2 s}\right|
\)

Recall the notation $K K\left(\sigma_{g} 2 s\right)^{2}$ for the $\mathrm{Li}_{2}$ ground-state configuration.
Now consider the VB treatment of the $\mathrm{N}_{2}$ ground state. The lowest configuration of N is $1 s^{2} 2 s^{2} 2 p^{3}$. Hund's rule gives the ground level as ${ }^{4} S_{3 / 2}$, with one electron in each of the three $2 p$ AOs. We can thus pair the two $2 p_{x}$ electrons, the two $2 p_{y}$ electrons, and the two $2 p_{z}$ electrons to form a triple bond. The Lewis structure is $: \mathrm{N} \equiv \mathrm{N}:$. How is this Lewis structure translated into the VB wave function? In the VB method, opposite spins are given to orbitals bonded together. We have three such pairs of orbitals and two ways to give opposite spins to the electrons of each bonding pair of AOs. Hence there are $2^{3}=8$ possible Slater determinants that we can write. We begin with

\(
D_{1}=\left|1 s_{a} \overline{1 s_{a}} 2 s_{a} \overline{2 s_{a}} 1 s_{b} \overline{1 s_{b}} 2 s_{b} \overline{2 s_{b}} 2 p_{x a} \overline{2 p_{x b}} 2 p_{y a} \overline{2 p_{y b}} 2 p_{z a} \overline{2 p_{z b}}\right|
\)

In all eight determinants, the first eight columns will remain unchanged, and to save space we write $D_{1}$ as

\(
\begin{equation}
D_{1}=\left|\cdots 2 p_{x a} \overline{2 p_{x b}} 2 p_{y a} \overline{2 p_{y b}} 2 p_{z a} \overline{2 p_{z b}}\right| \tag{13.120}
\end{equation}
\)

Reversing the spins of the electrons in $2 p_{x a}$ and $2 p_{x b}$, we get

\(
\begin{equation}
D_{2}=\left|\cdots \overline{2 p_{x a}} 2 p_{x b} 2 p_{y a} \overline{2 p_{y b}} 2 p_{z a} \overline{2 p_{z b}}\right| \tag{13.121}
\end{equation}
\)

There are six other determinants formed by interchanges of spins within the three pairs of bonding orbitals, and the VB wave function is a linear combination of eight determinants (Prob. 13.35). The following rule (see Kauzmann, pages 421-422) gives a VB wave function that is an eigenfunction of $\hat{S}^{2}$ with eigenvalue 0 (as is desired for the ground state): The coefficient of each determinant is +1 or -1 according to whether the number of spin interchanges required to generate the determinant from $D_{1}$ is even or odd, respectively. Thus $D_{2}$ has coefficient -1. [Compare also (13.115).] Clearly, the single-determinant ground-state $\mathrm{N}_{2}$ MO function is easier to handle than the eight-determinant VB function.
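The sign rule is simple enough to spell out programmatically. In the hypothetical notation of the sketch below (my own; a trailing `~` marks a barred, $\beta$-spin factor), the eight determinants and their $\pm 1$ coefficients are generated from $D_{1}$ by counting spin interchanges:

```python
# Enumerate the 8 spin assignments of the three bonded 2p pairs; the coefficient
# is (-1)**(number of pairs whose spins are interchanged relative to D1).
from itertools import product

pairs = ('2px', '2py', '2pz')
for flips in product((0, 1), repeat=3):      # 0 = as in D1, 1 = spins interchanged
    sign = (-1) ** sum(flips)
    cols = [f"{p}a~ {p}b" if f else f"{p}a {p}b~" for p, f in zip(pairs, flips)]
    print(f"{sign:+d}  |... {'  '.join(cols)}|")
```

The first line printed is $+1$ for $D_{1}$ itself, and the three determinants with a single pair interchanged (such as $D_{2}$) carry $-1$, in agreement with the rule.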


We have concentrated mostly on the ground electronic states of diatomic molecules. In this section we consider some of the excited states of $\mathrm{H}_{2}$. Figure 13.19 gives the potential-energy curves for some of the $\mathrm{H}_{2}$ electronic energy levels.

The lowest MO configuration is $\left(1 \sigma_{g}\right)^{2}$, where the notation of the third column of Table 13.1 is used. This closed-shell configuration gives only a nondegenerate ${ }^{1} \Sigma_{g}^{+}$ level, designated $X^{1} \Sigma_{g}^{+}$. The LCAO-MO function is (13.93).

The next-lowest MO configuration is $\left(1 \sigma_{g}\right)\left(1 \sigma_{u}\right)$, which gives rise to the terms ${ }^{1} \Sigma_{u}^{+}$ and ${ }^{3} \Sigma_{u}^{+}$ (Table 13.3). Since there is no axial electronic orbital angular momentum, each of these terms corresponds to one level. Spectroscopists have named these electronic levels $B^{1} \Sigma_{u}^{+}$ and $b^{3} \Sigma_{u}^{+}$. By Hund's rule, the $b$ level lies below the $B$ level. The LCAO-MO functions for these levels are [see Eqs. (10.27)-(10.30)]

\(
\begin{array}{ll}
b^{3} \Sigma_{u}^{+}: & 2^{-1 / 2}\left[1 \sigma_{g}(1) 1 \sigma_{u}(2)-1 \sigma_{g}(2) 1 \sigma_{u}(1)\right]\left\{\begin{array}{l}
\alpha(1) \alpha(2) \\
2^{-1 / 2}[\alpha(1) \beta(2)+\alpha(2) \beta(1)] \\
\beta(1) \beta(2)
\end{array}\right. \\
B^{1} \Sigma_{u}^{+}: & 2^{-1 / 2}\left[1 \sigma_{g}(1) 1 \sigma_{u}(2)+1 \sigma_{g}(2) 1 \sigma_{u}(1)\right] 2^{-1 / 2}[\alpha(1) \beta(2)-\alpha(2) \beta(1)]
\end{array}
\)

where $1 \sigma_{g} \approx N\left(1 s_{a}+1 s_{b}\right)$ and $1 \sigma_{u} \approx N^{\prime}\left(1 s_{a}-1 s_{b}\right)$. The $b^{3} \Sigma_{u}^{+}$ level is triply degenerate. The $B^{1} \Sigma_{u}^{+}$ level is nondegenerate. The Heitler-London wave functions for the $b$ level are given by (13.101). Both these levels have one bonding and one antibonding electron, and we would expect the potential-energy curves for both levels to be repulsive. Actually,

the $B$ level has a minimum in its $U(R)$ curve. The stability of this state should caution us against drawing too hasty conclusions from very approximate wave functions.

We expect the next-lowest configuration to be $\left(1 \sigma_{g}\right)\left(2 \sigma_{g}\right)$, giving rise to ${ }^{1} \Sigma_{g}^{+}$ and ${ }^{3} \Sigma_{g}^{+}$ levels. These levels of $\mathrm{H}_{2}$ are designated $E^{1} \Sigma_{g}^{+}$ and $a^{3} \Sigma_{g}^{+}$. By Hund's rule, the triplet lies lower. The $E$ state has two substantial minima in its $U(R)$ curve and is often called the EF state because of the two minima.

Although the $2 \sigma_{u}$ MO fills before the two $1 \pi_{u}$ MOs in going across the periodic table, the $1 \pi_{u}$ MOs lie below the $2 \sigma_{u}$ MO in $\mathrm{H}_{2}$. The configuration $\left(1 \sigma_{g}\right)\left(1 \pi_{u}\right)$ gives rise to the terms ${ }^{1} \Pi_{u}$ and ${ }^{3} \Pi_{u}$, the triplet lying lower. These terms are designated $C^{1} \Pi_{u}$ and $c^{3} \Pi_{u}$. The $c$ term gives rise to the levels $c^{3} \Pi_{2 u}$, $c^{3} \Pi_{1 u}$, and $c^{3} \Pi_{0 u}$. These levels lie so close together that they are usually not resolved in spectroscopic work. The $C$ level shows a slight hump in its potential-energy curve at large $R$. Each level is twofold degenerate, which gives a total of eight electronic states arising from the $\left(1 \sigma_{g}\right)\left(1 \pi_{u}\right)$ configuration.


This section presents some examples of SCF MO wave functions for diatomic molecules.
The spatial orbitals $\phi_{i}$ in an MO wave function are each expressed as a linear combination of a set of one-electron basis functions $\chi_{s}$:

\(
\begin{equation}
\phi_{i}=\sum_{s} c_{s i} \chi_{s} \tag{13.122}
\end{equation}
\)

For SCF calculations on diatomic molecules, one can use Slater-type orbitals [Eq. (11.14)] centered on the various atoms of the molecule as the basis functions. (For an alternative choice, see Section 15.4.) The procedure used to find the coefficients $c_{s i}$ of the basis functions in each SCF MO is discussed in Section 14.3. To have a complete set of AO basis functions, an infinite number of Slater orbitals are needed, but the true molecular Hartree-Fock wave function can be closely approximated with a reasonably small number of carefully chosen Slater orbitals. A minimal basis set for a molecular SCF calculation consists of a single basis function for each inner-shell AO and each valence-shell AO of each atom. An extended basis set is a set that is larger than a minimal set. Minimal-basis-set SCF calculations are easier than extended-basis-set calculations, but the latter are much more accurate.

SCF wave functions using a minimal basis set were calculated by Ransil for several light diatomic molecules [B. J. Ransil, Rev. Mod. Phys., 32, 245 (1960)]. As an example, the SCF MOs for the ground state of $\mathrm{Li}_{2}$ [MO configuration $\left(1 \sigma_{g}\right)^{2}\left(1 \sigma_{u}\right)^{2}\left(2 \sigma_{g}\right)^{2}$] at $R=R_{e}$ are
\(
\begin{gather}
1 \sigma_{g}=0.706\left(1 s_{a}+1 s_{b}\right)+0.009\left(2 s_{\perp a}+2 s_{\perp b}\right)+0.0003\left(2 p \sigma_{a}+2 p \sigma_{b}\right) \\
1 \sigma_{u}=0.709\left(1 s_{a}-1 s_{b}\right)+0.021\left(2 s_{\perp a}-2 s_{\perp b}\right)+0.003\left(2 p \sigma_{a}-2 p \sigma_{b}\right) \\
2 \sigma_{g}=-0.059\left(1 s_{a}+1 s_{b}\right)+0.523\left(2 s_{\perp a}+2 s_{\perp b}\right)+0.114\left(2 p \sigma_{a}+2 p \sigma_{b}\right) \tag{13.123}
\end{gather}
\)
The AO functions in these equations are STOs, except for $2 s_{\perp}$. A Slater-type $2 s$ AO has no radial nodes and is not orthogonal to a $1 s$ STO. The Hartree-Fock $2 s$ AO has one radial node $(n-l-1=1)$ and is orthogonal to the $1 s$ AO. We can form an orthogonalized $2 s$ orbital with the proper number of nodes by taking the following normalized linear combination of $1 s$ and $2 s$ STOs of the same atom (Schmidt orthogonalization):

\(
\begin{equation}
2 s_{\perp}=\left(1-S^{2}\right)^{-1 / 2}(2 s-S \cdot 1 s) \tag{13.124}
\end{equation}
\)

where $S$ is the overlap integral $\langle 1 s \mid 2 s\rangle$. Ransil expressed the $\mathrm{Li}_{2}$ orbitals using the (nonorthogonal) $2 s$ STO, but since the orthogonalized $2 s_{\perp}$ function gives a better representation of the $2 s$ AO, the orbitals have been rewritten using $2 s_{\perp}$. This changes the $1 s$ and $2 s$ coefficients, but the actual orbital is, of course, unchanged; see Prob. 13.37. The notation $2 p \sigma$ for an AO indicates that the $p$ orbital points along the molecular $(z)$ axis; that is, a $2 p \sigma$ AO is a $2 p_{z}$ AO. (The $2 p_{x}$ and $2 p_{y}$ AOs are called $2 p \pi$ AOs.) The optimum orbital exponents for the orbitals in (13.123) are $\zeta_{1 s}=2.689$, $\zeta_{2 s}=0.634$, $\zeta_{2 p \sigma}=0.761$.
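As a concrete illustration of Eq. (13.124), the sketch below (my own, assuming the normalized Slater radial functions $R_{1 s}=2 \zeta^{3 / 2} e^{-\zeta r}$ and $R_{2 s}=(2 \zeta')^{5 / 2}(24)^{-1 / 2} r e^{-\zeta' r}$ and the Li exponents quoted above) builds $2 s_{\perp}$ on a radial grid and confirms that it is normalized and orthogonal to the $1 s$ STO:

```python
# Schmidt-orthogonalize a 2s STO against a 1s STO, Eq. (13.124), numerically.
import numpy as np

r = np.linspace(1e-6, 40.0, 200_000)     # radial grid, bohr
dr = r[1] - r[0]

z1s, z2s = 2.689, 0.634                  # Ransil's optimized Li exponents
R1s = 2 * z1s**1.5 * np.exp(-z1s * r)
R2s = (2 * z2s)**2.5 / np.sqrt(24.0) * r * np.exp(-z2s * r)

def overlap(f, g):                       # <f|g> = integral of f g r^2 dr
    return np.sum(f * g * r**2) * dr

S = overlap(R1s, R2s)
R2s_perp = (R2s - S * R1s) / np.sqrt(1 - S**2)     # Eq. (13.124)

print(f"S = <1s|2s>       = {S:.4f}")
print(f"<1s|2s_perp>      = {overlap(R1s, R2s_perp):.2e}")      # ~ 0
print(f"<2s_perp|2s_perp> = {overlap(R2s_perp, R2s_perp):.4f}") # ~ 1
```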

Our previous simple expressions for these MOs were

\(
\begin{aligned}
& 1 \sigma_{g}=\sigma_{g} 1 s=2^{-1 / 2}\left(1 s_{a}+1 s_{b}\right) \\
& 1 \sigma_{u}=\sigma_{u}^{*} 1 s=2^{-1 / 2}\left(1 s_{a}-1 s_{b}\right) \\
& 2 \sigma_{g}=\sigma_{g} 2 s=2^{-1 / 2}\left(2 s_{a}+2 s_{b}\right)
\end{aligned}
\)

Comparison of these with (13.123) shows the simple LCAO functions to be reasonable first approximations to the minimal-basis-set SCF MOs. The approximation is best for the $1 \sigma_{g}$ and $1 \sigma_{u}$ MOs, whereas the $2 \sigma_{g}$ MO has substantial $2 p \sigma$ AO contributions in addition to the $2 s$ AO contributions. For this reason the notation of the third column of Table 13.1 (Section 13.7) is preferable to the separated-atoms MO notation. The substantial amount of $2 s$-$2 p \sigma$ hybridization is to be expected, since the $2 s$ and $2 p$ AOs are close in energy [see Eq. (9.27)]. The hybridization allows for the polarization of the $2 s$ AOs in forming the molecule.

Let us compare the $3 \sigma_{g}$ MO of the $\mathrm{F}_{2}$ ground state at $R_{e}$ as calculated by Ransil using a minimal basis set with that calculated by Wahl using an extended basis set [A. C. Wahl, J. Chem. Phys., 41, 2600 (1964)]:

\(
\begin{gathered}
3 \sigma_{g, \text{min}}=0.038\left(1 s_{a}+1 s_{b}\right)-0.184\left(2 s_{a}+2 s_{b}\right)+0.648\left(2 p \sigma_{a}+2 p \sigma_{b}\right) \\
\zeta_{1 s}=8.65, \quad \zeta_{2 s}=2.58, \quad \zeta_{2 p \sigma}=2.49 \\
3 \sigma_{g, \mathrm{ext}}=0.048\left(1 s_{a}+1 s_{b}\right)+0.003\left(1 s_{a}^{\prime}+1 s_{b}^{\prime}\right)-0.257\left(2 s_{a}+2 s_{b}\right) \\
+0.582\left(2 p \sigma_{a}+2 p \sigma_{b}\right)+0.307\left(2 p \sigma_{a}^{\prime}+2 p \sigma_{b}^{\prime}\right)+0.085\left(2 p \sigma_{a}^{\prime \prime}+2 p \sigma_{b}^{\prime \prime}\right) \\
-0.056\left(3 s_{a}+3 s_{b}\right)+0.046\left(3 d \sigma_{a}+3 d \sigma_{b}\right)+0.014\left(4 f \sigma_{a}+4 f \sigma_{b}\right) \\
\zeta_{1 s}=8.27, \quad \zeta_{1 s^{\prime}}=13.17, \quad \zeta_{2 s}=2.26 \\
\zeta_{2 p \sigma}=1.85, \quad \zeta_{2 p \sigma^{\prime}}=3.27, \quad \zeta_{2 p \sigma^{\prime \prime}}=5.86 \\
\zeta_{3 s}=4.91, \quad \zeta_{3 d \sigma}=2.44, \quad \zeta_{4 f \sigma}=2.83
\end{gathered}
\)

Just as several STOs are needed to give an accurate representation of Hartree-Fock AOs (Section 11.1), one needs more than one STO of a given $n$ and $l$ in the linear combination of STOs that is to accurately represent the Hartree-Fock MO. The primed and double-primed AOs in the extended-basis-set function are STOs with different orbital exponents. The $3 d \sigma$ and $4 f \sigma$ AOs are AOs with quantum number $m=0$, that is, the $3 d_{0}$ and $4 f_{0}$ AOs. The total energies found are $-197.877 E_{\mathrm{h}}$ and $-198.768 E_{\mathrm{h}}$ for the minimal and extended calculations, respectively (where $E_{\mathrm{h}}$ is the hartree). Extrapolation of calculations using much larger basis sets than Wahl used gives the Hartree-Fock $\mathrm{F}_{2}$ energy at $R_{e}$ as $-198.773 E_{\mathrm{h}}$ [L. Bytautas et al., J. Chem. Phys., 127, 164317 (2007)]. The experimental energy of $\mathrm{F}_{2}$ at $R_{e}$ is $U\left(R_{e}\right)=-199.672 E_{\mathrm{h}}$. The correlation-energy definition (11.16) uses the nonrelativistic energy of the molecule. The relativistic contribution to the $\mathrm{F}_{2}$ energy has been calculated to be $-0.142 E_{\mathrm{h}}$, so the exact nonrelativistic $\mathrm{F}_{2}$ energy at $R_{e}$ is $-199.672 E_{\mathrm{h}}+0.142 E_{\mathrm{h}}=-199.530 E_{\mathrm{h}}$. Therefore, the correlation energy in $\mathrm{F}_{2}$ is $-199.530 E_{\mathrm{h}}+198.773 E_{\mathrm{h}}=-0.757 E_{\mathrm{h}}=-20.6 \mathrm{eV}$.

FIGURE 13.20 Hartree-Fock MO electron-density contours for the ground electronic state of $\mathrm{Li}_{2}$ as calculated by Wahl. [A. C. Wahl, Science, 151, 961 (1966); Scientific American, April 1970, p. 54; Atomic and Molecular Structure: 4 Wall Charts, McGraw-Hill, 1970.]

In discussing $\mathrm{H}_{2}^{+}$ and $\mathrm{H}_{2}$, we saw how hybridization (the mixing of different AOs of the same atom) improves molecular wave functions. There is a tendency to think of hybridization as occurring only for certain molecular geometries. The SCF calculations make clear that all MOs are hybridized to some extent. Thus any diatomic-molecule $\sigma$ MO is a linear combination of $1 s, 2 s, 2 p_{0}, 3 s, 3 p_{0}, 3 d_{0}, \ldots$ AOs of the separated atoms.

To aid in deciding which AOs contribute to a given diatomic MO, we use two rules. First, only $\sigma$-type AOs ($s, p \sigma, d \sigma, \ldots$) can contribute to a $\sigma$ MO; only $\pi$-type AOs ($p \pi, d \pi, \ldots$) can contribute to a $\pi$ MO; and so on. Second, only AOs of reasonably similar energy contribute substantially to a given MO. (For examples, see the minimal- and extended-basis-set MOs quoted above.)

Wahl plotted the contours of the near Hartree-Fock molecular orbitals of homonuclear diatomic molecules from $\mathrm{H}_{2}$ through $\mathrm{F}_{2}$. Figure 13.20 shows these plots for $\mathrm{Li}_{2}$.

Of course, Hartree-Fock wave functions are only approximations to the true wave functions. It is possible to prove that a Hartree-Fock wave function gives a very good approximation to the electron probability density $\rho(x, y, z)$ for nuclear configurations in the region of the equilibrium configuration. A molecular property that involves only one-electron operators can be expressed as an integral involving $\rho$; see Eq. (14.8). Consequently, such properties are accurately calculated using Hartree-Fock wave functions. An example is the molecular dipole moment [Eq. (14.21)]. For example, the LiH dipole moment calculated with a near Hartree-Fock $\psi$ is 6.00 D (debyes) [S. Green, J. Chem. Phys., 54, 827 (1971)], compared with the experimental value 5.83 D. (One debye $=3.33564 \times 10^{-30}$ C m.) For NaCl, the calculated and experimental dipole moments are 9.18 D and 9.02 D [R. L. Matcha, J. Chem. Phys., 48, 335 (1968)]. An error of about 0.2 D is typical in such calculations, but where the dipole moment is small, the percent error can be large. An extreme example is CO, for which the experimental moment is 0.11 D with the polarity $\mathrm{C}^{-} \mathrm{O}^{+}$, but the near-Hartree-Fock moment is 0.27 D with the wrong polarity $\mathrm{C}^{+} \mathrm{O}^{-}$. However, a configuration-interaction wave function gives 0.12 D with the correct polarity [S. Green, J. Chem. Phys., 54, 827 (1971)].

A major weakness of the Hartree-Fock method is its failure to give accurate molecular dissociation energies. For example, an extended-basis-set calculation [P. E. Cade et al., J. Chem. Phys., 44, 1973 (1966)] gives $D_{e}=5.3 \mathrm{eV}$ for $\mathrm{N}_{2}$, as compared with the true value 9.9 eV. (To calculate the Hartree-Fock $D_{e}$, the molecular energy at the minimum in the Hartree-Fock $U(R)$ curve is subtracted from the sum of the Hartree-Fock energies of the separated atoms.) A related defect of Hartree-Fock molecular wave functions is that the energy approaches the wrong limit as $R \rightarrow \infty$. Recall the MO discussion of $\mathrm{H}_{2}$.

13.15 MO Treatment of Heteronuclear Diatomic Molecules

The treatment of heteronuclear diatomic molecules is similar to that for homonuclear diatomic molecules. We first consider the MO description.

Suppose the two atoms have atomic numbers that differ only slightly; an example is CO. We could consider CO as being formed from the isoelectronic molecule $\mathrm{N}_{2}$ by a gradual transfer of charge from one nucleus to the other. During this hypothetical transfer, the original $\mathrm{N}_{2}$ MOs would slowly vary to give finally the CO MOs. We therefore expect the CO molecular orbitals to resemble somewhat those of $\mathrm{N}_{2}$. For a heteronuclear diatomic molecule such as CO, the symbols used for the MOs are similar to those for homonuclear diatomics. However, for a heteronuclear diatomic, the electronic Hamiltonian (13.5) is not invariant with respect to inversion of the electronic coordinates (that is, $\hat{H}_{\mathrm{el}}$ does not commute with $\hat{\Pi}$), and the $g, u$ property of the MOs disappears. The correlation between the $\mathrm{N}_{2}$ and CO shell designations is

\(
\begin{array}{lcccccccc}
\mathrm{N}_{2}: & 1 \sigma_{g} & 1 \sigma_{u} & 2 \sigma_{g} & 2 \sigma_{u} & 1 \pi_{u} & 3 \sigma_{g} & 1 \pi_{g} & 3 \sigma_{u} \\
\mathrm{CO}: & 1 \sigma & 2 \sigma & 3 \sigma & 4 \sigma & 1 \pi & 5 \sigma & 2 \pi & 6 \sigma
\end{array}
\)

MOs of the same symmetry are numbered in order of increasing energy. Because of the absence of the $g, u$ property, the numbers of corresponding homonuclear and heteronuclear MOs differ. Figure 13.21 is a sketch of a contour of the CO $1 \pi_{\pm 1}$ MOs as determined by an extended-basis-set SCF calculation [W. M. Huo, J. Chem. Phys., 43, 624 (1965)]. Note its resemblance to the contour of Fig. 13.13, which is for the $1 \pi_{u, \pm 1}$ MOs of a homonuclear diatomic molecule.

The ground-state configuration of CO is $1 \sigma^{2} 2 \sigma^{2} 3 \sigma^{2} 4 \sigma^{2} 1 \pi^{4} 5 \sigma^{2}$, as compared with the $\mathrm{N}_{2}$ configuration $\left(1 \sigma_{g}\right)^{2}\left(1 \sigma_{u}\right)^{2}\left(2 \sigma_{g}\right)^{2}\left(2 \sigma_{u}\right)^{2}\left(1 \pi_{u}\right)^{4}\left(3 \sigma_{g}\right)^{2}$.

As in homonuclear diatomics, the heteronuclear diatomic MOs are approximated as linear combinations of atomic orbitals. The coefficients are found by the procedure of Section 14.3. For example, a minimal-basis-set SCF calculation using Slater AOs (with nonoptimized exponents given by Slater's rules) gives for the CO $5 \sigma$, $1 \pi$, and $2 \pi$ MOs at $R=R_{e}$ [B. J. Ransil, Rev. Mod. Phys., 32, 245 (1960)]:

\(
\begin{gathered}
5 \sigma=0.027\left(1 s_{\mathrm{C}}\right)+0.011\left(1 s_{\mathrm{O}}\right)+0.739\left(2 s_{\perp \mathrm{C}}\right)+0.036\left(2 s_{\perp \mathrm{O}}\right) \\
-0.566\left(2 p \sigma_{\mathrm{C}}\right)-0.438\left(2 p \sigma_{\mathrm{O}}\right) \\
1 \pi=0.469\left(2 p \pi_{\mathrm{C}}\right)+0.771\left(2 p \pi_{\mathrm{O}}\right), \quad 2 \pi=0.922\left(2 p \pi_{\mathrm{C}}\right)-0.690\left(2 p \pi_{\mathrm{O}}\right)
\end{gathered}
\)

The expressions for the $\pi$ MOs are simpler than those for the $\sigma$ MOs because $s$ and $p \sigma$ AOs cannot contribute to $\pi$ MOs. For comparison, the corresponding MOs in $\mathrm{N}_{2}$ at $R=R_{e}$ are (Ransil, op. cit.):

\(
\begin{gathered}
3 \sigma_{g}=0.030\left(1 s_{a}+1 s_{b}\right)+0.395\left(2 s_{\perp a}+2 s_{\perp b}\right)-0.603\left(2 p \sigma_{a}+2 p \sigma_{b}\right) \\
1 \pi_{u}=0.624\left(2 p \pi_{a}+2 p \pi_{b}\right), \quad 1 \pi_{g}=0.835\left(2 p \pi_{a}-2 p \pi_{b}\right)
\end{gathered}
\)

The resemblance of CO and $\mathrm{N}_{2}$ MOs is apparent. The $1 \sigma$ MO in CO is found to be nearly the same as a $1 s$ oxygen-atom AO; the $2 \sigma$ MO in CO is essentially a carbon-atom $1 s$ AO.

In general, for a heteronuclear diatomic molecule AB where the valence AOs of each atom are of $s$ and $p$ type and where the valence AOs of A do not differ greatly in energy from the valence AOs of B, we can expect the Fig. 13.17 pattern of

\(
\sigma s<\sigma^{*} s<\pi p<\sigma p<\pi^{*} p<\sigma^{*} p
\)

valence-shell MOs formed from $s$ and $p$ valence-shell AOs to hold reasonably well. Figure 13.17 would be modified in that each valence AO of the more electronegative atom would lie below the corresponding valence AO of the other atom.

Figure 13.21 Cross section of a contour of the $1 \pi_{\pm 1}$ MOs in CO.

When the valence-shell AO energies of B lie very substantially below those of A, the $s$ and $p \sigma$ valence AOs of B lie below the $s$ valence-shell AO of A, and this affects which AOs contribute to each MO. Consider the molecule BF, for example. A minimal-basis-set calculation [Ransil, Rev. Mod. Phys., 32, 245 (1960)] gives the $1 \sigma$ MO as essentially $1 s_{\mathrm{F}}$ and the $2 \sigma$ MO as essentially $1 s_{\mathrm{B}}$. The $3 \sigma$ MO is predominantly $2 s_{\mathrm{F}}$, with small amounts of $2 s_{\mathrm{B}}$, $2 p \sigma_{\mathrm{B}}$, and $2 p \sigma_{\mathrm{F}}$. The $4 \sigma$ MO is predominantly $2 p \sigma_{\mathrm{F}}$, with significant amounts of $2 s_{\mathrm{B}}$ and $2 s_{\mathrm{F}}$ and a small amount of $2 p \sigma_{\mathrm{B}}$. This is quite different from $\mathrm{N}_{2}$, where the corresponding MO is formed predominantly from the $2 s$ AOs on each N. The $1 \pi$ MO is a bonding combination of $2 p \pi_{\mathrm{B}}$ and $2 p \pi_{\mathrm{F}}$. The $5 \sigma$ MO is predominantly $2 s_{\mathrm{B}}$, with a substantial contribution from $2 p \sigma_{\mathrm{B}}$ and a significant contribution from $2 p \sigma_{\mathrm{F}}$. This is unlike the corresponding MO in $\mathrm{N}_{2}$, where the largest contributions are from $2 p \sigma$ AOs on each atom. The $2 \pi$ MO is an antibonding combination of $2 p \pi_{\mathrm{B}}$ and $2 p \pi_{\mathrm{F}}$. The $6 \sigma$ MO has important contributions from $2 p \sigma_{\mathrm{B}}$, $2 s_{\mathrm{B}}$, $2 s_{\mathrm{F}}$, and $2 p \sigma_{\mathrm{F}}$.

We see from Fig. 11.2 that the $2 p_{\mathrm{F}}$ AO lies well below the $2 s_{\mathrm{B}}$ AO. This causes the $2 p \sigma_{\mathrm{F}}$ AO to contribute substantially to lower-lying MOs and the $2 s_{\mathrm{B}}$ AO to contribute substantially to higher-lying MOs, as compared with what happens in $\mathrm{N}_{2}$. (This effect occurs in CO, although to a lesser extent. Note the very substantial contribution of $2 s_{\mathrm{C}}$ to the $5 \sigma$ MO. Also, the $4 \sigma$ MO in CO has a very substantial contribution from $2 p \sigma_{\mathrm{O}}$.)

For a diatomic molecule AB where each atom has $s$ and $p$ valence-shell AOs (this excludes H and transition elements) and where the A and B valence AOs differ widely in energy, we may expect the pattern of valence MOs to be $\sigma<\sigma<\pi<\sigma<\pi<\sigma$, but it is not so easy to guess which AOs contribute to the various MOs or the bonding or antibonding character of the MOs. By feeding the valence electrons into these MOs, we can make a plausible guess as to the number of unpaired electrons and the ground term of the AB molecule (Prob. 13.38).

Diatomic hydrides are a special case, since H has only a $1 s$ valence AO. Consider HF as an example. The ground-state configurations of the atoms are $1 s$ for H and $1 s^{2} 2 s^{2} 2 p^{5}$ for F. We expect the filled $1 s$ and $2 s$ F subshells to take little part in the bonding. The four $2 p \pi$ fluorine electrons are nonbonding (there are no $\pi$ valence AOs on H). The hydrogen $1 s$ AO and the fluorine $2 p \sigma$ AO have the same symmetry $(\sigma)$ and have rather similar energies (Fig. 11.2), and a linear combination of these two AOs will form a $\sigma$ MO for the bonding electron pair:

\(
\phi=c_{1}\left(1 s_{\mathrm{H}}\right)+c_{2}\left(2 p \sigma_{\mathrm{F}}\right)
\)

where the contributions of $1 s_{\mathrm{F}}$ and $2 s_{\mathrm{F}}$ to this MO have been neglected. Since F is more electronegative than H, we expect that $c_{2}>c_{1}$. (In addition, the $1 s_{\mathrm{H}}$ and $2 p \sigma_{\mathrm{F}}$ AOs form an antibonding MO, which is unoccupied in the ground state.)

The picture of HF just given is only a crude qualitative approximation. A minimal-basis-set SCF calculation using Slater orbitals with optimized exponents gives as the MOs of HF [B. J. Ransil, Rev. Mod. Phys., 32, 245 (1960)]

\(
\begin{gathered}
1 \sigma=1.000\left(1 s_{\mathrm{F}}\right)+0.012\left(2 s_{\perp \mathrm{F}}\right)+0.002\left(2 p \sigma_{\mathrm{F}}\right)-0.003\left(1 s_{\mathrm{H}}\right) \\
2 \sigma=-0.018\left(1 s_{\mathrm{F}}\right)+0.914\left(2 s_{\perp \mathrm{F}}\right)+0.090\left(2 p \sigma_{\mathrm{F}}\right)+0.154\left(1 s_{\mathrm{H}}\right) \\
3 \sigma=-0.023\left(1 s_{\mathrm{F}}\right)-0.411\left(2 s_{\perp \mathrm{F}}\right)+0.711\left(2 p \sigma_{\mathrm{F}}\right)+0.516\left(1 s_{\mathrm{H}}\right) \\
1 \pi_{+1}=\left(2 p \pi_{+1}\right)_{\mathrm{F}}, \quad 1 \pi_{-1}=\left(2 p \pi_{-1}\right)_{\mathrm{F}}
\end{gathered}
\)

The ground-state MO configuration of HF is $1 \sigma^{2} 2 \sigma^{2} 3 \sigma^{2} 1 \pi^{4}$. The $1 \sigma$ MO is virtually identical with the $1 s$ fluorine AO. The $2 \sigma$ MO is pretty close to the $2 s$ fluorine AO. The $1 \pi$ MOs are required by symmetry to be the same as the corresponding fluorine $\pi$ AOs. The bonding $3 \sigma$ MO has its largest contribution from the $2 p \sigma$ fluorine and $1 s$ hydrogen AOs, as would be expected from the discussion of the preceding paragraph. However, the $2 s$ fluorine AO makes a substantial contribution to this MO. (Since a single $2 s$ function is only an approximation to the $2 s$ AO of F, we cannot use this calculation to say exactly how much $2 s$ AO character the $3 \sigma$ HF molecular orbital has.)

For qualitative discussion (but not quantitative work), it is useful to have simple approximations for heteronuclear diatomic MOs. In the crudest approximation, we can take each valence MO of a heteronuclear diatomic molecule as a linear combination of two AOs $\phi_{a}$ and $\phi_{b}$, one on each atom. (As the discussions of CO and BF show, this approximation is often quite inaccurate.) From the two AOs, we can form two MOs:

\(
c_{1} \phi_{a}+c_{2} \phi_{b} \quad \text { and } \quad c_{1}^{\prime} \phi_{a}+c_{2}^{\prime} \phi_{b}
\)

The lack of symmetry in the heteronuclear diatomic makes the coefficients unequal in magnitude. The coefficients are determined by solving the secular equation [see Eq. (13.45)]

\(
\begin{gather}
\left|\begin{array}{cc}
H_{a a}-W & H_{a b}-W S_{a b} \\
H_{a b}-W S_{a b} & H_{b b}-W
\end{array}\right|=0 \\
\left(H_{a a}-W\right)\left(H_{b b}-W\right)-\left(H_{a b}-W S_{a b}\right)^{2}=0 \tag{13.125}
\end{gather}
\)

where $\hat{H}$ is some sort of effective one-electron Hamiltonian. Suppose that $H_{a a}>H_{b b}$, and let $f(W)$ be defined as the left side of (13.125). The overlap integral $S_{a b}$ is less than 1 (except at $R=0$). [A rigorous proof of this follows from Eq. (3-114) in Margenau and Murphy.] The coefficient of $W^{2}$ in $f(W)$ is $\left(1-S_{a b}^{2}\right)>0$; therefore $f(W) \rightarrow+\infty$ as $W \rightarrow \pm \infty$. For $W=H_{a a}$ or $W=H_{b b}$, the first product in (13.125) vanishes, leaving $f(W)=-\left(H_{a b}-W S_{a b}\right)^{2}$. Hence $f\left(H_{a a}\right)<0$ and $f\left(H_{b b}\right)<0$. The roots of (13.125) occur where $f(W)$ equals 0. Hence, by continuity, one root must lie above $H_{a a}$ and the other below $H_{b b}$. Therefore, the orbital energy of one MO is less than both $H_{a a}$ and $H_{b b}$ (the energies of the two AOs in the molecule; Section 13.5), and the energy of the other MO is greater than both $H_{a a}$ and $H_{b b}$. One bonding and one antibonding MO are formed from the two AOs. Figure 13.22 shows the formation of bonding and antibonding MOs from two AOs, for the homonuclear and heteronuclear cases. These figures are gross oversimplifications, since a given MO has contributions from many AOs, not just two.
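The root-location argument can be made concrete with numbers. The sketch below (all integral values hypothetical, in hartrees) expands $f(W)$ into a quadratic, finds its two roots, and confirms that one lies below both $H_{a a}$ and $H_{b b}$ and the other above both:

```python
# Solve the 2x2 secular equation (13.125) for made-up integral values.
import numpy as np

Haa, Hbb = -0.3, -0.5        # hypothetical AO energies, Haa > Hbb
Hab, Sab = -0.35, 0.4        # hypothetical resonance and overlap integrals

# f(W) = (1 - Sab^2) W^2 - (Haa + Hbb - 2 Hab Sab) W + (Haa Hbb - Hab^2)
coeffs = [1 - Sab**2, -(Haa + Hbb - 2*Hab*Sab), Haa*Hbb - Hab**2]
W_anti, W_bond = sorted(np.roots(coeffs), reverse=True)

print(f"bonding MO:     W = {W_bond:.4f}  (below Hbb = {Hbb})")
print(f"antibonding MO: W = {W_anti:.4f}  (above Haa = {Haa})")
```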

The coefficients $c_{1}$ and $c_{2}$ in the bonding heteronuclear MO in Fig. 13.22 are both positive, so as to build up charge between the nuclei. For the antibonding heteronuclear MO, the coefficients of $\phi_{a}$ and $\phi_{b}$ have opposite signs, causing charge depletion between the nuclei.

FIGURE 13.22 Formation of bonding and antibonding MOs from AOs in the homonuclear and heteronuclear cases. (See Prob. 13.39.)

13.16 VB Treatment of Heteronuclear Diatomic Molecules

Consider the valence-bond ground-state wave function of HF. We expect a single bond to be formed by the pairing of the hydrogen $1 s$ electron and the unpaired fluorine $2 p \sigma$ electron. The Heitler-London function corresponding to this pairing is [Eq. (13.117)]

\(
\begin{equation}
\phi_{\mathrm{cov}}=\left|\cdots 1 s_{\mathrm{H}} \overline{2 p \sigma_{\mathrm{F}}}\right|-\left|\cdots \overline{1 s_{\mathrm{H}}} 2 p \sigma_{\mathrm{F}}\right| \tag{13.126}
\end{equation}
\)

where the dots stand for $1 s_{\mathrm{F}} \overline{1 s_{\mathrm{F}}} 2 s_{\mathrm{F}} \overline{2 s_{\mathrm{F}}} 2 p \pi_{x \mathrm{F}} \overline{2 p \pi_{x \mathrm{F}}} 2 p \pi_{y \mathrm{F}} \overline{2 p \pi_{y \mathrm{F}}}$. This function is essentially covalent, the electrons being shared by the two atoms. However, the high electronegativity of fluorine leads us to include a contribution from an ionic structure as well. An ionic valence-bond function has the form $\phi_{a}(1) \phi_{a}(2)$ [Eq. (13.109)]. Introduction of the required antisymmetric spin factor gives as the valence-bond function for an ionic structure in HF:

\(
\phi_{\mathrm{ion}}=\left|\cdots 2 p \sigma_{\mathrm{F}} \overline{2 p \sigma_{\mathrm{F}}}\right|
\)

The VB wave function is then written as

\(
\begin{equation}
\phi=c_{1} \phi_{\mathrm{cov}}+c_{2} \phi_{\mathrm{ion}} \tag{13.127}
\end{equation}
\)

The optimum values of $c_{1}$ and $c_{2}$ are found by the variation method. This leads to the usual secular equation. We have ionic-covalent "resonance," involving the structures $\mathrm{H}-\mathrm{F}$ and $\mathrm{H}^{+} \mathrm{F}^{-}$. The true molecular structure is intermediate between the covalent and ionic structures. A term $c_{3}\left|\cdots 1 s_{\mathrm{H}} \overline{1 s_{\mathrm{H}}}\right|$ corresponding to the ionic structure $\mathrm{H}^{-} \mathrm{F}^{+}$ could also be included in the wave function, but this should contribute only slightly for HF. For molecules that are less ionic, both ionic structures might well be included.

For a highly ionic molecule such as NaCl, we expect the VB function to have $c_{2} \gg c_{1}$. It might be thought that NaCl would dissociate to $\mathrm{Na}^{+}$ and $\mathrm{Cl}^{-}$ ions, but this is not true. The ionization energy of Na is 5.1 eV, while the electron affinity of Cl is only 3.6 eV. Hence, in the gas phase the neutral separated ground-state atoms $\mathrm{Na}+\mathrm{Cl}$ are more stable than the ground-state separated ions $\mathrm{Na}^{+}+\mathrm{Cl}^{-}$. (In aqueous solution the ions are more stable because of the hydration energy, which makes the separated ions more stable than even the diatomic NaCl molecule.) If the nuclei are slowly pulled apart, a gas-phase NaCl molecule will dissociate to neutral atoms. Therefore, as $R$ increases from $R_{e}$, the ratio $c_{2} / c_{1}$ in (13.127) must decrease, becoming zero at $R=\infty$. For intermediate values of $R$, the Coulombic attraction between the ions is greater than the 1.5 eV difference between the ionization potential and electron affinity, and the molecule is largely ionic. For very large $R$, the Coulombic attraction between the ions is less than 1.5 eV, and the molecule is largely covalent. However, if the nuclei in NaCl are pulled apart very rapidly, then the electrons will not have a chance to adjust their wave function from the ionic to the covalent wave function, and both bonding electrons will go with the chlorine nucleus, giving dissociation into ions.
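A back-of-the-envelope estimate (my own arithmetic, consistent with the 1.5-eV figure above) locates the crossover: in atomic units the Coulombic attraction between the ions is $1/R$ hartree, so the ionic and covalent descriptions trade places roughly where $1/R$ equals 1.5 eV.

```python
# Estimate the R at which the Coulomb attraction between Na+ and Cl- equals
# IP(Na) - EA(Cl); beyond this R the covalent description wins.
hartree_eV = 27.211
delta_eV = 5.1 - 3.6                    # IP(Na) - EA(Cl), from the text
R_cross = hartree_eV / delta_eV         # solve 1/R (hartree) = delta
print(f"R_cross ~ {R_cross:.1f} bohr ~ {R_cross*0.5292:.1f} angstrom")
# -> ~18 bohr (~9.6 angstrom), far outside Re, so near Re NaCl is largely ionic.
```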

Cesium has the lowest ionization energy, 3.9 eV . Chlorine has the highest electron affinity, 3.6 eV . Thus, even for CsCl and CsF , the separated ground-state neutral atoms are more stable than the separated ground-state ions. There are, however, cases of excited states of diatomic molecules that dissociate to ions.

13.17 The Valence-Electron Approximation

Suppose we want to treat $\mathrm{Cs}_{2}$, which has 110 electrons. In the MO method, we would start by writing down a $110 \times 110$ Slater determinant of molecular orbitals. We would then approximate the MOs by functions containing variational parameters and go on
to minimize the variational integral. Clearly, the large number of electrons makes this a formidable task. One way to simplify the problem is to divide the electrons into two groups: the 108 core electrons and the two $6 s$ valence electrons, which provide bonding. We then try to treat the valence electrons separately from the core, taking the molecular energy as the sum of core- and valence-electron energies. This approach, introduced in the 1930s, is called the valence-electron approximation.

The simplest approach is to regard the core electrons as point charges coinciding with the nucleus. For $\mathrm{Cs}_{2}$ this would give a Hamiltonian for the two valence electrons that is identical with the electronic Hamiltonian for $\mathrm{H}_{2}$. If we then go ahead and minimize the variational integral for the valence electrons in $\mathrm{Cs}_{2}$, with no restrictions on the valence-electron trial functions, we will clearly be in trouble. Such a procedure will cause the valence-electron MOs to "collapse" to the $\sigma_{g} 1 s$ MO, since the core electrons are considered absent. To avoid this collapse, one can impose the constraint that the variational functions used for the valence electrons be orthogonal to the orbitals of the core electrons. Of course, the task of keeping the valence orbitals orthogonal to the core orbitals means more work. A somewhat different approach is to drop the approximation of treating the core electrons as coinciding with the nucleus, and to treat them as a charge distribution that provides some sort of effective repulsive potential for the motion of the valence electrons. This leads to an effective Hamiltonian for the valence electrons, which is then used in the variational integral. The valence-electron approximation is widely used in approximate treatments of polyatomic molecules (Chapter 17).
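In the simplest view, the orthogonality constraint amounts to projecting the core components out of each trial valence orbital. A schematic sketch (plain Gram-Schmidt projection on vectors, not any specific published scheme):

```python
# Keep a trial valence orbital orthogonal to orthonormal core orbitals by
# subtracting its projection onto each of them.
import numpy as np

rng = np.random.default_rng(1)
core = np.linalg.qr(rng.standard_normal((50, 3)))[0]  # 3 orthonormal "core" vectors
v = rng.standard_normal(50)                           # trial valence orbital

v_orth = v - core @ (core.T @ v)      # remove all core components
v_orth /= np.linalg.norm(v_orth)      # renormalize

print(np.abs(core.T @ v_orth).max())  # ~ 1e-16: orthogonal to every core orbital
```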


Theorems of Molecular Quantum Mechanics

Click the keywords below to know more about them.

Electron Probability Density: This term refers to the likelihood of finding an electron in a specific region of space. It is derived from the wave function of a molecule and is crucial for understanding the spatial distribution of electrons in molecular systems

Dipole Moment: The dipole moment of a molecule is a measure of the separation of positive and negative charges within the molecule. It is calculated from the wave function and is important for understanding the molecule's interaction with electric fields

Hartree–Fock Method: This is a computational method used to determine the wave function and energy of a quantum many-body system in a stationary state. It approximates the wave function as a single Slater determinant of spin-orbitals

Virial Theorem: This theorem relates the average kinetic energy and potential energy of a system in a bound state. It is useful for understanding the stability and bonding of molecules

Hellmann–Feynman Theorem: This theorem states that the force on a nucleus in a molecule can be calculated as the sum of the electrostatic forces exerted by the other nuclei and the electron charge density. It simplifies the calculation of forces in molecular systems

Molecular Orbital (MO): A molecular orbital is a region in a molecule where there is a high probability of finding an electron. MOs are formed by the combination of atomic orbitals and are used to describe the electronic structure of molecules

Slater Determinant: This is a mathematical expression used to describe the wave function of a multi-electron system in a way that satisfies the Pauli exclusion principle. It ensures that the wave function changes sign when any two electrons are exchanged

Coulomb Integral: This integral represents the electrostatic interaction between electrons in different orbitals. It is a key component in the calculation of molecular energies using the Hartree–Fock method

Exchange Integral: This integral accounts for the exchange interaction between electrons due to their indistinguishability and the Pauli exclusion principle. It is also a key component in the Hartree–Fock method

Roothaan Equations: These are a set of linear equations derived from the Hartree–Fock method, used to determine the coefficients of the molecular orbitals in terms of a chosen basis set

How is the wave function of a many-electron molecule related to the electron probability density? We want to find the probability of finding an electron in the rectangular volume element located at point $(x, y, z)$ in space with edges $d x, d y, d z$. The electronic wave function $\psi$ is a function of the spatial and spin coordinates of the $n$ electrons. (For simplicity the parametric dependence on the nuclear configuration will not be explicitly indicated.) We know that

\(
\begin{equation}
\left|\psi\left(x_{1}, \ldots, z_{n}, m_{s 1}, \ldots, m_{s n}\right)\right|^{2} d x_{1} d y_{1} d z_{1} \cdots d x_{n} d y_{n} d z_{n} \tag{14.1}
\end{equation}
\)

is the probability of simultaneously finding electron 1 with spin $m_{s 1}$ in the volume element $d x_{1} d y_{1} d z_{1}$ at $\left(x_{1}, y_{1}, z_{1}\right)$, electron 2 with spin $m_{s 2}$ in the volume element $d x_{2} d y_{2} d z_{2}$ at $\left(x_{2}, y_{2}, z_{2}\right)$, and so on. Since we are not interested in what spin the electron we find at $(x, y, z)$ has, we sum the probability (14.1) over all possible spin states of all electrons to give the probability of simultaneously finding each electron in the appropriate volume element with no regard for spin:

\(
\begin{equation}
\sum_{m_{s 1}} \cdots \sum_{m_{s n}}|\psi|^{2} d x_{1} \cdots d z_{n} \tag{14.2}
\end{equation}
\)

Suppose we want the probability of finding electron 1 in the volume element $d x d y d z$ at $(x, y, z)$. For this probability we do not care where electrons 2 through $n$ are. We therefore add the probabilities for all possible locations for these electrons. This amounts to integrating (14.2) over the coordinates of electrons $2,3, \ldots, n$ :

\(
\begin{equation}
\left[\sum_{\text {all } m_{s}} \int \cdots \int\left|\psi\left(x, y, z, x_{2}, y_{2}, z_{2}, \ldots, x_{n}, y_{n}, z_{n}, m_{s 1}, \ldots, m_{s n}\right)\right|^{2} d x_{2} \cdots d z_{n}\right] d x d y d z \tag{14.3}
\end{equation}
\)

where there is a $(3 n-3)$-fold integration over $x_{2}$ through $z_{n}$.
Now suppose we ask for the probability of finding electron 2 in the volume element $d x d y d z$ at $(x, y, z)$. By analogy to (14.3), this is

\(
\begin{equation}
\left[\sum_{\text {all } m_{s}} \int \cdots \int\left|\psi\left(x_{1}, y_{1}, z_{1}, x, y, z, x_{3}, \ldots, z_{n}, m_{s 1}, \ldots, m_{s n}\right)\right|^{2} d x_{1} d y_{1} d z_{1} d x_{3} \cdots d z_{n}\right] d x d y d z \tag{14.4}
\end{equation}
\)

Of course, electrons do not come with labels, and this indistinguishability (Section 10.3) means that the probabilities (14.3) and (14.4) must be equal. This equality is readily proved. The wave function $\psi$ is antisymmetric with respect to electron exchange, so $|\psi|^{2}$ is unchanged by an electron exchange. Interchanging the spatial and spin coordinates of electrons 1 and 2 in $\psi$ in (14.4) and doing some relabeling of dummy variables, we see that (14.4) is equal to (14.3). Thus (14.3) gives the probability of finding any one particular electron in $d x d y d z$. Since the system has $n$ electrons, the probability of finding an electron in $d x d y d z$ is $n$ times (14.3). (In drawing this conclusion, we assume that the probability of finding more than one electron in the infinitesimal region $d x d y d z$ is negligible compared with the probability of finding one electron. This is certainly valid since the probability of finding two electrons will involve the product of six infinitesimal quantities as compared with the product of three infinitesimal quantities for the probability of finding one electron.)

Thus the probability density $\rho$ for finding an electron in the neighborhood of point $(x, y, z)$ is

\(
\begin{align}
\rho(x, y, z) & =n \sum_{\text {all } m_{s}} \int \cdots \int\left|\psi\left(x, y, z, x_{2}, \ldots, z_{n}, m_{s 1}, \ldots, m_{s n}\right)\right|^{2} d x_{2} \cdots d z_{n} \\
\rho(\mathbf{r}) & =n \sum_{\text {all } m_{s}} \int \cdots \int\left|\psi\left(\mathbf{r}, \mathbf{r}_{2}, \ldots, \mathbf{r}_{n}, m_{s 1}, \ldots, m_{s n}\right)\right|^{2} d \mathbf{r}_{2} \cdots d \mathbf{r}_{n} \tag{14.5}
\end{align}
\)

where the vector notation for spatial variables (Section 5.2) is used. The atomic units of $\rho$ are electrons/bohr$^{3}$.
$\rho$ is the electron probability density. The corresponding electronic charge density averaged over time is equal to $-e \rho(x, y, z)$, where $-e$ is the charge on an electron. In atomic units, the electronic charge density is $-\rho$. In addition, there are the positive charges of the nuclei. The term electronic charge density is commonly shortened to charge density.

A molecule's $\rho$ is an experimentally observable quantity that can be found from measured x-ray diffraction intensities of molecular crystals or electron-diffraction intensities of gases. See P. Coppens and M. B. Hall (eds.), Electron Distributions and the Chemical Bond, Plenum, 1982; D. A. Kohl and L. S. Bartell, J. Chem. Phys., 51, 2891, 2896 (1969); P. Coppens, J. Phys. Chem., 93, 7979 (1989); P. Coppens, Annu. Rev. Phys. Chem., 43, 663 (1992); C. Gatti and P. Macchi (eds.), Modern Charge-Density Analysis, Springer, 2012.

To illustrate (14.5), consider the electron density for the simple VB and MO ground-state $\mathrm{H}_{2}$ wave functions. The wave function is a product of a spatial factor and the spin function (11.60). (For more than two electrons, $\psi$ cannot be factored into a product of a spatial part and a spin part; see Chapter 10.) Summation of (11.60) over $m_{s 1}$ and $m_{s 2}$ gives one (Section 10.4). Thus (14.5) becomes for $\mathrm{H}_{2}$

\(
\rho(x, y, z)=2 \iiint\left|\phi\left(x, y, z, x_{2}, y_{2}, z_{2}\right)\right|^{2} d x_{2} d y_{2} d z_{2}
\)

where $\phi$ is the spatial factor. When $\phi$ is taken as the spatial factor in the VB function (13.100) and Prob. 13.33 or the MO function (13.94), we get (Prob. 14.1)

\(
\begin{equation}
\rho_{\mathrm{VB}}=\frac{1 s_{a}^{2}+1 s_{b}^{2}+2 S_{a b} 1 s_{a} 1 s_{b}}{1+S_{a b}^{2}}, \quad \rho_{\mathrm{MO}}=\frac{1 s_{a}^{2}+1 s_{b}^{2}+2\left(1 s_{a} 1 s_{b}\right)}{1+S_{a b}} \tag{14.6}
\end{equation}
\)

One finds (Prob. 14.2) that $\rho_{\mathrm{MO}}>\rho_{\mathrm{VB}}$ at the midpoint of the bond, so the MO function (which underestimates electron correlation) piles up more charge between the nuclei than the VB function.
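This midpoint comparison is easy to reproduce (a sketch of my own, using hydrogen $1 s$ AOs $e^{-r} / \sqrt{\pi}$ in atomic units and the standard $1 s$-$1 s$ overlap $S=e^{-R}\left(1+R+R^{2} / 3\right)$):

```python
# Evaluate Eq. (14.6) at the bond midpoint, where 1s_a = 1s_b = v.
import numpy as np

R = 1.4                                  # roughly Re of H2, in bohr
v = np.exp(-R / 2) / np.sqrt(np.pi)      # each 1s AO evaluated at the midpoint
S = np.exp(-R) * (1 + R + R**2 / 3)      # 1s-1s overlap integral

rho_vb = (2*v**2 + 2*S*v**2) / (1 + S**2)
rho_mo = (2*v**2 + 2*v**2) / (1 + S)

print(f"rho_MO = {rho_mo:.4f}, rho_VB = {rho_vb:.4f} electrons/bohr^3")
# rho_MO/rho_VB = 2(1 + S^2)/(1 + S)^2 > 1 whenever S < 1.
```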

The MO probability density in (14.6) is twice $\rho$ for the $\mathrm{H}_{2}^{+}$-like $1 s_{\mathrm{A}}+1 s_{\mathrm{B}}$ MO [Eq. (13.65)]. One can prove that, for a many-electron MO wave function, $\rho$ is found by multiplying the probability-density function of each MO by the number of electrons occupying it and summing the results:

\(
\begin{equation}
\rho(x, y, z)=\sum_{j} n_{j}\left|\phi_{j}\right|^{2} \tag{14.7}
\end{equation}
\)

where the sum is over the different orthogonal spatial MOs, and $n_{j}$ (whose possible values are 0, 1, or 2) is the number of electrons in the MO $\phi_{j}$. [We used (14.7) in Eq. (11.11).]
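A minimal grid illustration of Eq. (14.7) (my own construction, with one-dimensional Gaussians standing in for real MOs): the occupation-weighted sum of $\left|\phi_{j}\right|^{2}$ integrates to the total number of electrons.

```python
# Build rho from two doubly occupied, normalized "MOs" on a 1-D grid.
import numpy as np

x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]

phi1 = np.exp(-x**2)
phi1 /= np.sqrt(np.sum(phi1**2) * dx)         # normalize
phi2 = x * np.exp(-x**2)                      # orthogonal to phi1 by symmetry
phi2 /= np.sqrt(np.sum(phi2**2) * dx)

n = (2, 2)                                    # closed-shell occupation numbers
rho = n[0]*phi1**2 + n[1]*phi2**2             # Eq. (14.7)

print(np.sum(rho) * dx)                       # -> 4.0, the number of electrons
```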

Calculations of $\rho$ from high-quality wave functions show that for nearly all molecules, local maxima in $\rho$ occur only at the nuclei. One of the few exceptions is the ground electronic state of $\mathrm{Li}_{2}$, for which $\rho$ has a small local maximum at the bond midpoint [Bader, Section E2.1; G. I. Bersuker et al., J. Phys. Chem., 97, 9323 (1993)].

Let $B\left(\mathbf{r}_{i}\right)$ be a function of the spatial coordinates $x_{i}, y_{i}, z_{i}$ of electron $i$, where the notation of Section 5.2 is used. For an $n$-electron molecule, consider the average value

\(
\langle\psi| \sum_{i=1}^{n} B\left(\mathbf{r}_{i}\right)|\psi\rangle=\int \psi^{*} \sum_{i=1}^{n} B\left(\mathbf{r}_{i}\right) \psi d \tau=\sum_{i=1}^{n} \int|\psi|^{2} B\left(\mathbf{r}_{i}\right) d \tau
\)

where $\psi$ is the electronic wave function. Since the electrons are indistinguishable, each term in the sum $\sum_{i} \int|\psi|^{2} B d \tau$ must have the same value. Therefore we have $\langle\psi| \sum_{i=1}^{n} B\left(\mathbf{r}_{i}\right)|\psi\rangle=\int n|\psi|^{2} B\left(\mathbf{r}_{1}\right) d \tau$. Since $B\left(\mathbf{r}_{1}\right)$ depends only on $x_{1}, y_{1}, z_{1}$, before we integrate over $x_{1}, y_{1}, z_{1}$, we can integrate $n|\psi|^{2}$ over the spatial coordinates of electrons 2 to $n$ and sum over all the spin coordinates. From Eq. (14.5), this produces the electron probability density $\rho\left(\mathbf{r}_{1}\right)$. Therefore, $\langle\psi| \sum_{i=1}^{n} B\left(\mathbf{r}_{i}\right)|\psi\rangle=\int \rho\left(\mathbf{r}_{1}\right) B\left(\mathbf{r}_{1}\right) d \mathbf{r}_{1}$. The subscript 1 on the integration variables is not needed, and the final result is

\(
\begin{equation}
\int \psi^{*} \sum_{i=1}^{n} B\left(\mathbf{r}_{i}\right) \psi d \tau=\int \rho(\mathbf{r}) B(\mathbf{r}) d \mathbf{r} \tag{14.8}
\end{equation}
\)

where the integration is over the three spatial coordinates $x, y, z$. This result will be used later in this chapter and in Chapter 15.


We now show how to calculate molecular dipole moments from wave functions.
The classical expression for the electric dipole moment $\boldsymbol{\mu}_{\mathrm{cl}}$ of a set of discrete charges $Q_{i}$ is

\(
\begin{equation}
\boldsymbol{\mu}_{\mathrm{cl}}=\sum_{i} Q_{i} \mathbf{r}_{i} \tag{14.9}
\end{equation}
\)

where $\mathbf{r}_{i}$ is the position vector from the origin to the $i$ th charge [Eq. (5.33)]. The electric dipole moment is a vector whose $x$ component is

\(
\begin{equation}
\mu_{x, \mathrm{cl}}=\sum_{i} Q_{i} x_{i} \tag{14.10}
\end{equation}
\)

with similar expressions for the other components. For a continuous charge distribution with charge density $\rho_{Q}(x, y, z)$, $\boldsymbol{\mu}_{\mathrm{cl}}$ is found by summing over the infinitesimal elements of charge $d Q_{i}=\rho_{Q}(x, y, z) d x d y d z$:

\(
\begin{equation}
\boldsymbol{\mu}_{\mathrm{cl}}=\int \rho_{Q}(x, y, z) \mathbf{r} d x d y d z, \quad \text { where } \quad \mathbf{r}=x \mathbf{i}+y \mathbf{j}+z \mathbf{k} \tag{14.11}
\end{equation}
\)
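
Equation (14.9) is just a charge-weighted sum of position vectors, so it is straightforward to evaluate numerically. The following minimal Python sketch is our own illustration; the charge values and positions are arbitrary test data, not a model of any particular molecule:

```python
import numpy as np

# Two illustrative point charges (atomic units): +1 and -1 separated
# by 1 bohr along z. Arbitrary values chosen only to demonstrate
# Eq. (14.9); each component follows Eq. (14.10).
Q = np.array([+1.0, -1.0])                # charges Q_i
r = np.array([[0.0, 0.0, +0.5],
              [0.0, 0.0, -0.5]])          # position vectors r_i (bohr)

mu = (Q[:, None] * r).sum(axis=0)         # mu_cl = sum_i Q_i r_i
print(mu)                                 # [0. 0. 1.]  (1 a.u. along z)
print(np.linalg.norm(mu) * 2.5417)        # magnitude in debyes; 1 a.u. = 2.5417 D
```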

Now consider the quantum-mechanical definition of the electric dipole moment. Suppose we apply a uniform external electric field $\mathbf{E}$ to an atom or molecule and ask for the effect on the energy of the system. To form the Hamiltonian operator, we first need the classical expression for the energy. The electric field strength $\mathbf{E}$ is defined as $\mathbf{E} \equiv \mathbf{F} / Q$, where $\mathbf{F}$ is the force the field exerts on a charge $Q$. We take the $z$ direction as the direction of the applied field: $\mathbf{E}=\mathscr{E}_{z} \mathbf{k}$. The potential energy $V$ is [Eq. (4.24)]

\(
d V / d z=-F_{z}=-Q \mathscr{E}_{z} \quad \text { and } \quad V=-Q \mathscr{E}_{z} z
\)

This is the potential energy of a single charge in the field. For a system of charges,

\(
\begin{equation}
V=-\mathscr{E}_{z} \sum_{i} Q_{i} z_{i} \tag{14.12}
\end{equation}
\)

where $z_{i}$ is the $z$ coordinate of charge $Q_{i}$. The extension of (14.12) to the case where the electric field points in an arbitrary direction follows from (4.24) and is

\(
\begin{equation}
V=-\mathscr{E}_{x} \sum_{i} Q_{i} x_{i}-\mathscr{E}_{y} \sum_{i} Q_{i} y_{i}-\mathscr{E}_{z} \sum_{i} Q_{i} z_{i}=-\mathbf{E} \cdot \boldsymbol{\mu}_{\mathrm{cl}} \tag{14.13}
\end{equation}
\)

This is the classical-mechanical expression for the energy of an electric dipole in a uniform applied electric field.

To calculate the quantum-mechanical expression, we use perturbation theory. The perturbation operator $\hat{H}^{\prime}$ corresponding to (14.13) is $\hat{H}^{\prime}=-\mathbf{E} \cdot \hat{\boldsymbol{\mu}}$, where the electric dipole-moment operator $\hat{\boldsymbol{\mu}}$ is

\(
\begin{gather}
\hat{\boldsymbol{\mu}}=\sum_{i} Q_{i} \hat{\mathbf{r}}_{i}=\mathbf{i} \hat{\mu}_{x}+\mathbf{j} \hat{\mu}_{y}+\mathbf{k} \hat{\mu}_{z} \tag{14.14}\\
\hat{\mu}_{x}=\sum_{i} Q_{i} x_{i}, \quad \hat{\mu}_{y}=\sum_{i} Q_{i} y_{i}, \quad \hat{\mu}_{z}=\sum_{i} Q_{i} z_{i} \tag{14.15}
\end{gather}
\)

The first-order correction to the energy is [Eq. (9.22)]

\(
\begin{equation}
E^{(1)}=-\mathbf{E} \cdot \int \psi^{(0) *} \hat{\boldsymbol{\mu}} \psi^{(0)} d \tau \tag{14.16}
\end{equation}
\)

where $\psi^{(0)}$ is the unperturbed wave function. Comparison of (14.16) and (14.13) shows that the quantum-mechanical quantity that corresponds to $\boldsymbol{\mu}_{\mathrm{cl}}$ is the integral

\(
\begin{equation}
\boldsymbol{\mu}=\int \psi^{(0) *} \hat{\boldsymbol{\mu}} \psi^{(0)} d \tau \tag{14.17}
\end{equation}
\)

$\boldsymbol{\mu}$ in (14.17) is the quantum-mechanical electric dipole moment of the system.
An objection to taking (14.17) as the dipole moment is that we considered only the first-order energy correction. If we had included $E^{(2)}$ in (14.16), the comparison with (14.13) would not have given (14.17) as the dipole moment. Actually, (14.17) is the dipole
moment of the system in the absence of an applied electric field and is the permanent electric dipole moment. Application of the field distorts the wave function from $\psi^{(0)}$, giving rise to an induced electric dipole moment in addition to the permanent dipole moment. The induced dipole moment corresponds to the energy correction $E^{(2)}$. (For the details, see Merzbacher, Section 17.4.) The induced dipole moment $\boldsymbol{\mu}_{\text {ind }}$ is related to the applied electric field $\mathbf{E}$ by

\(
\begin{equation}
\boldsymbol{\mu}_{\text {ind }}=\alpha \mathbf{E} \tag{14.18}
\end{equation}
\)

where $\alpha$ is the polarizability of the atom or molecule. The greater the polarizability of molecule B, the greater the London dispersion force (Section 13.7) between two $B$ molecules.

The shift in the energy of a quantum-mechanical system caused by an applied electric field is called the Stark effect. The first-order (or linear) Stark effect is given by (14.16), and from (14.17) it vanishes for a system with no permanent electric dipole moment. The second-order (or quadratic) Stark effect is given by the energy correction $E^{(2)}$ and is proportional to the square of the applied field.

The electric dipole-moment operator (14.14) is an odd function of the coordinates. If the wave function in (14.17) is either even or odd, then the integrand in (14.17) is an odd function, and the integral over all space vanishes. We conclude that the permanent electric dipole moment $\boldsymbol{\mu}$ is zero for states of definite parity.

The permanent electric dipole moment of a molecule in electronic state $\psi_{\mathrm{el}}$ is

\(
\begin{equation}
\boldsymbol{\mu}=\int \psi_{\mathrm{el}}^{*} \hat{\boldsymbol{\mu}} \psi_{\mathrm{el}} d \tau_{\mathrm{el}} \tag{14.19}
\end{equation}
\)

The electronic wave functions of homonuclear diatomic molecules can be classified as $g$ or $u$, according to their parity. Hence, a homonuclear diatomic molecule has a zero permanent electric dipole moment, a result that is not too astonishing. The same holds true for any molecule with a center of symmetry. The electric dipole-moment operator for a molecule includes summation over both the electronic and nuclear charges:

\(
\begin{equation}
\hat{\boldsymbol{\mu}}=\sum_{i}\left(-e \mathbf{r}_{i}\right)+\sum_{\alpha} Z_{\alpha} e \mathbf{r}_{\alpha} \tag{14.20}
\end{equation}
\)

where $\mathbf{r}_{\alpha}$ is the vector from the origin to the nucleus of atomic number $Z_{\alpha}$, and $\mathbf{r}_{i}$ is the vector to electron $i$. Since both the dipole-moment operator (14.20) and the electronic wave function depend on the parameters defining the nuclear configuration, the molecular electronic dipole moment $\boldsymbol{\mu}$ depends on the nuclear configuration. To indicate this, the quantity (14.19) can be called the dipole-moment function of the molecule. In writing (14.19), we ignored the nuclear motion. When the dipole moment of a molecule is experimentally determined, what is measured is the quantity (14.19) averaged over the zero-point vibrations (assuming the temperature is not high enough for there to be appreciable population of higher vibrational levels). We might use $\boldsymbol{\mu}_{0}$ and $\boldsymbol{\mu}_{e}$ to indicate the dipole moment averaged over zero-point vibrations and the dipole moment at the equilibrium nuclear configuration, respectively.

Since the second sum in (14.20) is independent of the electronic coordinates, we have

\(
\begin{aligned}
\boldsymbol{\mu} & =\int \psi_{\mathrm{el}}^{*} \sum_{i}\left(-e \mathbf{r}_{i}\right) \psi_{\mathrm{el}} d \tau_{\mathrm{el}}+\sum_{\alpha} Z_{\alpha} e \mathbf{r}_{\alpha} \int \psi_{\mathrm{el}}^{*} \psi_{\mathrm{el}} d \tau_{\mathrm{el}} \\
& =-e \int\left|\psi_{\mathrm{el}}\right|^{2} \sum_{i} \mathbf{r}_{i} d \tau_{\mathrm{el}}+e \sum_{\alpha} Z_{\alpha} \mathbf{r}_{\alpha}
\end{aligned}
\)

Using (14.8), we have

\(
\begin{equation}
\boldsymbol{\mu}=-e \iiint \rho(x, y, z) \mathbf{r} d x d y d z+e \sum_{\alpha} Z_{\alpha} \mathbf{r}_{\alpha} \tag{14.21}
\end{equation}
\)

where $\rho$ is the electron probability density. Equation (14.21) is what would be obtained if we pretended that the electrons were smeared out into a continuous charge distribution whose charge density is given by $-e \rho(x, y, z)$ and we used the classical equation (14.11) to calculate $\boldsymbol{\mu}$.


A key development in quantum chemistry has been the computation of accurate self-consistent-field wave functions for many diatomic and polyatomic molecules. The principles of molecular SCF MO calculations are essentially the same as those for atomic SCF calculations (Section 11.1). We shall restrict ourselves to closed-shell configurations. For open shells, the formulas are more complicated.

The molecular Hartree-Fock wave function is written as an antisymmetrized product (Slater determinant) of spin-orbitals, each spin-orbital being a product of a spatial orbital $\phi_{i}$ and a spin function (either $\alpha$ or $\beta$ ).

The expression for the Hartree-Fock molecular electronic energy $E_{\mathrm{HF}}$ is given by the variation theorem as $E_{\mathrm{HF}}=\langle D| \hat{H}_{\mathrm{el}}+V_{N N}|D\rangle$, where $D$ is the Slater-determinant Hartree-Fock wave function, and the purely electronic Hamiltonian $\hat{H}_{\mathrm{el}}$ and the internuclear repulsion $V_{N N}$ are given by (13.5) and (13.6). Since $V_{N N}$ doesn't involve electronic coordinates and $D$ is normalized, we have $\langle D| V_{N N}|D\rangle=V_{N N}\langle D \mid D\rangle=V_{N N}$. The operator $\hat{H}_{\mathrm{el}}$ is the sum of one-electron operators $\hat{f}_{i}$ and two-electron operators $\hat{g}_{i j}$; we have $\hat{H}_{\mathrm{el}}=\sum_{i} \hat{f}_{i}+\sum_{j} \sum_{i>j} \hat{g}_{i j}$, where (in atomic units)

\(
\hat{f}_{i}=-\frac{1}{2} \nabla_{i}^{2}-\sum_{\alpha} \frac{Z_{\alpha}}{r_{i \alpha}} \quad \text { and } \quad \hat{g}_{i j}=\frac{1}{r_{i j}}
\)

The Hamiltonian $\hat{H}_{\mathrm{el}}$ is the same as the Hamiltonian $\hat{H}$ for an atom except that $\sum_{\alpha} Z_{\alpha} / r_{i \alpha}$ replaces $Z / r_{i}$ in $\hat{f}_{i}$. Hence Eq. (11.83) can be used to give $\langle D| \hat{H}_{\mathrm{el}}|D\rangle$. Therefore, the Hartree-Fock energy of a diatomic or polyatomic molecule with only closed shells is

\(
\begin{gather}
E_{\mathrm{HF}}=2 \sum_{i=1}^{n / 2} H_{i i}^{\mathrm{core}}+\sum_{i=1}^{n / 2} \sum_{j=1}^{n / 2}\left(2 J_{i j}-K_{i j}\right)+V_{N N} \tag{14.22}\\
H_{i i}^{\mathrm{core}} \equiv\left\langle\phi_{i}(1)\right| \hat{H}^{\mathrm{core}}(1)\left|\phi_{i}(1)\right\rangle \equiv\left\langle\phi_{i}(1)\right|-\frac{1}{2} \nabla_{1}^{2}-\sum_{\alpha} \frac{Z_{\alpha}}{r_{1 \alpha}}\left|\phi_{i}(1)\right\rangle \tag{14.23}\\
J_{i j} \equiv\left\langle\phi_{i}(1) \phi_{j}(2)\right| 1 / r_{12}\left|\phi_{i}(1) \phi_{j}(2)\right\rangle, \quad K_{i j} \equiv\left\langle\phi_{i}(1) \phi_{j}(2)\right| 1 / r_{12}\left|\phi_{j}(1) \phi_{i}(2)\right\rangle \tag{14.24}
\end{gather}
\)

where the one-electron-operator symbol was changed from $\hat{f}_{1}$ to $\hat{H}^{\mathrm{core}}(1)$. The one-electron core Hamiltonian

\(
\hat{H}^{\mathrm{core}}(1) \equiv-\frac{1}{2} \nabla_{1}^{2}-\sum_{\alpha} \frac{Z_{\alpha}}{r_{1 \alpha}}
\)

is the sum of the kinetic-energy operator for electron 1 and the potential-energy operators for the attractions between electron 1 and the nuclei. $\hat{H}^{\mathrm{core}}(1)$ omits the interactions of electron 1 with the other electrons. The sums over $i$ and $j$ are over the $n / 2$ occupied spatial orbitals $\phi_{i}$ of the $n$-electron molecule. In the Coulomb integrals $J_{i j}$ and the exchange integrals $K_{i j}$, the integration goes over the spatial coordinates of electrons 1 and 2.

The Hartree-Fock method looks for those orbitals $\phi_{i}$ that minimize the variational integral $E_{\mathrm{HF}}$. Each MO is taken to be normalized: $\left\langle\phi_{i}(1) \mid \phi_{i}(1)\right\rangle=1$. Moreover, for computational convenience one takes the MOs to be orthogonal: $\left\langle\phi_{i}(1) \mid \phi_{j}(1)\right\rangle=0$ for $i \neq j$. It might be thought that a lower energy could be obtained if the orthogonality restriction were omitted, but this is not so. A closed-shell antisymmetric wave function is a Slater determinant, and one can use the properties of determinants to show that a Slater determinant of nonorthogonal orbitals is equal to a Slater determinant in which the orbitals have been orthogonalized by the Schmidt or some other procedure; see Section 15.9 and F. W. Bobrowicz and W. A. Goddard, Chapter 4, Section 3.1 of Schaefer, Methods of Electronic Structure Theory. In effect, the Pauli antisymmetry requirement removes nonorthogonalities from the orbitals.

The derivation of the equation that determines the orthonormal $\phi_{i}$'s that minimize $E_{\mathrm{HF}}$ is complicated and is omitted. (For the derivation, see Lowe and Peterson, Appendix 7; Szabo and Ostlund, Sections 3.1 and 3.2; Parr, pages 21-23.) One finds that the closed-shell orthogonal Hartree-Fock MOs satisfy

\(
\begin{equation}
\hat{F}(1) \phi_{i}(1)=\varepsilon_{i} \phi_{i}(1) \tag{14.25}
\end{equation}
\)

where $\varepsilon_{i}$ is the orbital energy and where the (Hartree-) Fock operator $\hat{F}$ is (in atomic units)

\(
\begin{gather}
\hat{F}(1)=\hat{H}^{\mathrm{core}}(1)+\sum_{j=1}^{n / 2}\left[2 \hat{J}_{j}(1)-\hat{K}_{j}(1)\right] \tag{14.26}\\
\hat{H}^{\mathrm{core}}(1) \equiv-\frac{1}{2} \nabla_{1}^{2}-\sum_{\alpha} \frac{Z_{\alpha}}{r_{1 \alpha}} \tag{14.27}
\end{gather}
\)

where the Coulomb operator $\hat{J}_{j}$ and the exchange operator $\hat{K}_{j}$ are defined by

\(
\begin{align}
\hat{J}_{j}(1) f(1) & =f(1) \int\left|\phi_{j}(2)\right|^{2} \frac{1}{r_{12}} d v_{2} \tag{14.28}\\
\hat{K}_{j}(1) f(1) & =\phi_{j}(1) \int \frac{\phi_{j}^{*}(2) f(2)}{r_{12}} d v_{2} \tag{14.29}
\end{align}
\)

where $f$ is an arbitrary function and the integrals are definite integrals over all space.
The first term on the right of (14.27) is the operator for the kinetic energy of one electron. The second term is the sum of the potential-energy operators for the attractions between one electron and the nuclei. The Coulomb operator $\hat{J}_{j}(1)$ gives the potential energy of interaction between electron 1 and a smeared-out electron of charge density $-\left|\phi_{j}(2)\right|^{2}$ (in atomic units). The factor 2 in (14.26) occurs because there are two electrons in each spatial orbital. The exchange operator has no simple physical interpretation but arises from the requirement that the wave function be antisymmetric with respect to electron exchange. The exchange operators are absent from the Hartree equations (11.9). The Hartree-Fock MOs $\phi_{i}$ in (14.25) are eigenfunctions of the same operator $\hat{F}$, the eigenvalues being the orbital energies $\varepsilon_{i}$.

The orthogonality of the MOs greatly simplifies MO calculations, causing many integrals to vanish. In contrast, the VB method uses atomic orbitals, and AOs centered on different atoms are not orthogonal. MO calculations are much simpler than VB calculations, and the MO method is used far more often than the VB method.

The true Hamiltonian operator and wave function involve the coordinates of all $n$ electrons. The Hartree-Fock Hamiltonian operator $\hat{F}$ is a one-electron operator (that is, it involves the coordinates of only one electron), and (14.25) is a one-electron differential equation. This has been indicated in (14.25) by writing $\hat{F}$ and $\phi_{i}$ as functions of the coordinates of electron 1. Of course, the coordinates of any electron could have been used. The operator $\hat{F}$ is peculiar in that it depends on its own eigenfunctions [see Eqs. (14.26) to (14.29)], which are not known initially. Hence the Hartree-Fock equations must be solved by an iterative process, starting with an initial guess for the MOs.

To obtain the expression for the orbital energies $\varepsilon_{i}$, we multiply (14.25) by $\phi_{i}^{*}(1)$ and integrate over all space. Using the fact that $\phi_{i}$ is normalized and using the result of Prob. 14.8, we get $\varepsilon_{i}=\int \phi_{i}^{*}(1) \hat{F}(1) \phi_{i}(1) d v_{1}$ and

\(
\begin{gather}
\varepsilon_{i}=\left\langle\phi_{i}(1)\right| \hat{H}^{\mathrm{core}}(1)\left|\phi_{i}(1)\right\rangle+\sum_{j}\left[2\left\langle\phi_{i}(1)\right| \hat{J}_{j}(1)\left|\phi_{i}(1)\right\rangle-\left\langle\phi_{i}(1)\right| \hat{K}_{j}(1)\left|\phi_{i}(1)\right\rangle\right] \\
\varepsilon_{i}=H_{i i}^{\mathrm{core}}+\sum_{j=1}^{n / 2}\left(2 J_{i j}-K_{i j}\right) \tag{14.30}
\end{gather}
\)

where $H_{i i}^{\mathrm{core}}$, $J_{i j}$, and $K_{i j}$ are defined by (14.23) and (14.24).
Summation of (14.30) over the $n / 2$ occupied orbitals gives

\(
\begin{equation}
\sum_{i=1}^{n / 2} \varepsilon_{i}=\sum_{i=1}^{n / 2} H_{i i}^{\mathrm{core}}+\sum_{i=1}^{n / 2} \sum_{j=1}^{n / 2}\left(2 J_{i j}-K_{i j}\right) \tag{14.31}
\end{equation}
\)

Solving this equation for $\sum_{i} H_{i i}^{\mathrm{core}}$ and substituting the result into (14.22), we obtain the Hartree-Fock energy as

\(
\begin{equation}
E_{\mathrm{HF}}=2 \sum_{i=1}^{n / 2} \varepsilon_{i}-\sum_{i=1}^{n / 2} \sum_{j=1}^{n / 2}\left(2 J_{i j}-K_{i j}\right)+V_{N N} \tag{14.32}
\end{equation}
\)

Since there are two electrons per MO, the quantity $2 \sum_{i} \varepsilon_{i}$ is the sum of the orbital energies. Subtraction of the double sum in (14.32) avoids counting each interelectronic repulsion twice, as discussed in Section 11.1.

A key development that made feasible the calculation of accurate molecular SCF wave functions was Roothaan's 1951 proposal to expand the spatial orbitals $\phi_{i}$ as linear combinations of a set of one-electron basis functions $\chi_{s}$:

\(
\begin{equation}
\phi_{i}=\sum_{s=1}^{b} c_{s i} \chi_{s} \tag{14.33}
\end{equation}
\)

To exactly represent the MOs $\phi_{i}$, the basis functions $\chi_{s}$ should form a complete set. This requires an infinite number of basis functions. In practice, one must use a finite number $b$ of basis functions. If $b$ is large enough and the functions $\chi_{s}$ are well chosen, one can represent the MOs with negligible error.

To avoid confusion, we shall use the letters $r, s, t, u$ to label the basis functions $\chi$, and the letters $i, j, k, l$ to label the MOs $\phi$. (Often the Greek letters $\mu, \nu, \lambda, \sigma$ are used to label the basis functions.)

Substitution of the expansion (14.33) into the Hartree-Fock equations (14.25) gives

\(
\sum_{s} c_{s i} \hat{F} \chi_{s}=\varepsilon_{i} \sum_{s} c_{s i} \chi_{s}
\)

Multiplication by $\chi_{r}^{*}$ and integration gives

\(
\begin{gather}
\sum_{s=1}^{b} c_{s i}\left(F_{r s}-\varepsilon_{i} S_{r s}\right)=0, \quad r=1,2, \ldots, b \tag{14.34}\\
F_{r s} \equiv\left\langle\chi_{r}\right| \hat{F}\left|\chi_{s}\right\rangle, \quad S_{r s} \equiv\left\langle\chi_{r} \mid \chi_{s}\right\rangle \tag{14.35}
\end{gather}
\)

The equations (14.34) form a set of $b$ simultaneous linear homogeneous equations in the $b$ unknowns $c_{s i}$, $s=1,2, \ldots, b$, that describe the MO $\phi_{i}$ in (14.33). For a nontrivial solution, we must have

\(
\begin{equation}
\operatorname{det}\left(F_{r s}-\varepsilon_{i} S_{r s}\right)=0 \tag{14.36}
\end{equation}
\)

This is a secular equation whose roots give the orbital energies $\varepsilon_{i}$. The (Hartree-Fock-)Roothaan equations (14.34) must be solved by an iterative process, since the $F_{r s}$ integrals depend on the orbitals $\phi_{i}$ (through the dependence of $\hat{F}$ on the $\phi_{i}$'s), which in turn depend on the unknown coefficients $c_{s i}$.

One starts with guesses for the occupied-MO expressions as linear combinations of the basis functions, as in (14.33). This initial set of MOs is used to compute the Fock operator $\hat{F}$ from (14.26) to (14.29). The matrix elements (14.35) are computed, and the secular equation (14.36) is solved to give an initial set of $\varepsilon_{i}$'s. These $\varepsilon_{i}$'s are used to solve (14.34) for an improved set of coefficients, giving an improved set of MOs, which are then used to compute an improved $\hat{F}$, and so on. One continues until no further improvement in MO coefficients and energies occurs from one cycle to the next. The calculations are done using a computer. (The most efficient way to solve the Roothaan equations is to use matrix-algebra methods; see the last part of this section.)

We have used the terms SCF wave function and Hartree-Fock wave function interchangeably. In practice, the term SCF wave function is applied to any wave function obtained by iterative solution of the Roothaan equations, whether or not the basis set is large enough to give a really accurate approximation to the Hartree-Fock SCF wave function. There is only one true Hartree-Fock wave function, which is the best possible wave function that can be written as a Slater determinant of spin-orbitals. With current computer power, one can use very large basis sets for small molecules and obtain wave functions that differ negligibly from the true Hartree-Fock wave functions. Because of deficiencies in properties calculated from Hartree-Fock wave functions, several methods that go beyond the Hartree-Fock method are widely used (see Chapter 16).

The Fock Matrix Elements

To solve the Roothaan equations (14.34), we first must express the Fock matrix elements (integrals) $F_{r s}$ in terms of the basis functions $\chi$. The Fock operator $\hat{F}$ is given by (14.26), so

\(
\begin{align}
& F_{r s}=\left\langle\chi_{r}(1)\right| \hat{F}(1)\left|\chi_{s}(1)\right\rangle \\
& F_{r s}=\left\langle\chi_{r}(1)\right| \hat{H}^{\mathrm{core}}(1)\left|\chi_{s}(1)\right\rangle+\sum_{j=1}^{n / 2}\left[2\left\langle\chi_{r}(1) \mid \hat{J}_{j}(1) \chi_{s}(1)\right\rangle-\left\langle\chi_{r}(1) \mid \hat{K}_{j}(1) \chi_{s}(1)\right\rangle\right] \tag{14.37}
\end{align}
\)

Replacement of $f$ by $\chi_{s}$ in (14.28), followed by use of the expansion (14.33), gives

\(
\hat{J}_{j}(1) \chi_{s}(1)=\chi_{s}(1) \int \frac{\phi_{j}^{*}(2) \phi_{j}(2)}{r_{12}} d v_{2}=\chi_{s}(1) \sum_{t} \sum_{u} c_{t j}^{*} c_{u j} \int \frac{\chi_{t}^{*}(2) \chi_{u}(2)}{r_{12}} d v_{2}
\)

Multiplication by $\chi_{r}^{*}(1)$ and integration over the coordinates of electron 1 gives

\(
\left\langle\chi_{r}(1) \mid \hat{J}_{j}(1) \chi_{s}(1)\right\rangle=\sum_{t} \sum_{u} c_{t j}^{*} c_{u j} \iint \frac{\chi_{r}^{*}(1) \chi_{s}(1) \chi_{t}^{*}(2) \chi_{u}(2)}{r_{12}} d v_{1} d v_{2}
\)

\(
\begin{equation}
\left\langle\chi_{r}(1) \mid \hat{J}_{j}(1) \chi_{s}(1)\right\rangle=\sum_{t=1}^{b} \sum_{u=1}^{b} c_{t j}^{*} c_{u j}(r s \mid t u) \tag{14.38}
\end{equation}
\)

where the two-electron repulsion integral $(r s \mid t u)$ is defined as

\(
\begin{equation}
(r s \mid t u) \equiv \iint \frac{\chi_{r}^{*}(1) \chi_{s}(1) \chi_{t}^{*}(2) \chi_{u}(2)}{r_{12}} d v_{1} d v_{2} \tag{14.39}
\end{equation}
\)

The widely used notation of (14.39) should not be misinterpreted as an overlap integral. Other notations, some of which are mutually contradictory, are used for electron repulsion integrals, so it is always wise to check an author's definition.

Similarly, replacement of $f$ by $\chi_{s}$ in (14.29) leads to (Prob. 14.9)

\(
\begin{equation}
\left\langle\chi_{r}(1) \mid \hat{K}_{j}(1) \chi_{s}(1)\right\rangle=\sum_{t=1}^{b} \sum_{u=1}^{b} c_{t j}^{*} c_{u j}(r u \mid t s) \tag{14.40}
\end{equation}
\)

Substituting (14.40) and (14.38) into (14.37) and changing the order of summation, we get the desired expression for $F_{r s}$ in terms of integrals over the basis functions $\chi$ :

\(
\begin{gather}
F_{r s}=H_{r s}^{\mathrm{core}}+\sum_{t=1}^{b} \sum_{u=1}^{b} \sum_{j=1}^{n / 2} c_{t j}^{*} c_{u j}[2(r s \mid t u)-(r u \mid t s)] \\
F_{r s}=H_{r s}^{\mathrm{core}}+\sum_{t=1}^{b} \sum_{u=1}^{b} P_{t u}\left[(r s \mid t u)-\frac{1}{2}(r u \mid t s)\right] \tag{14.41}\\
P_{t u} \equiv 2 \sum_{j=1}^{n / 2} c_{t j}^{*} c_{u j}, \quad t=1,2, \ldots, b, \quad u=1,2, \ldots, b \tag{14.42}\\
H_{r s}^{\mathrm{core}} \equiv\left\langle\chi_{r}(1)\right| \hat{H}^{\mathrm{core}}(1)\left|\chi_{s}(1)\right\rangle
\end{gather}
\)

The quantities $P_{t u}$ are called density matrix elements or charge, bond-order matrix elements. [Some workers use the definition $P_{u t} \equiv 2 \sum_{j} c_{t j}^{*} c_{u j}$.] Substitution of the expansion (14.33) into (14.7) for the electron probability density $\rho$ gives for a closed-shell molecule:

\(
\begin{equation}
\rho=2 \sum_{j=1}^{n / 2} \phi_{j}^{*} \phi_{j}=2 \sum_{r=1}^{b} \sum_{s=1}^{b} \sum_{j=1}^{n / 2} c_{r j}^{*} c_{s j} \chi_{r}^{*} \chi_{s}=\sum_{r=1}^{b} \sum_{s=1}^{b} P_{r s} \chi_{r}^{*} \chi_{s} \tag{14.43}
\end{equation}
\)

To express the Hartree-Fock energy in terms of integrals over the basis functions $\chi$, we first solve (14.31) for $\sum_{i} \sum_{j}\left(2 J_{i j}-K_{i j}\right)$ and substitute the result into (14.32) to get

\(
E_{\mathrm{HF}}=\sum_{i=1}^{n / 2} \varepsilon_{i}+\sum_{i=1}^{n / 2} H_{i i}^{\mathrm{core}}+V_{N N}
\)

We have, using the expansion (14.33),

\(
\begin{gather}
H_{i i}^{\mathrm{core}}=\left\langle\phi_{i}\right| \hat{H}^{\mathrm{core}}\left|\phi_{i}\right\rangle=\sum_{r} \sum_{s} c_{r i}^{*} c_{s i}\left\langle\chi_{r}\right| \hat{H}^{\mathrm{core}}\left|\chi_{s}\right\rangle=\sum_{r} \sum_{s} c_{r i}^{*} c_{s i} H_{r s}^{\mathrm{core}} \\
E_{\mathrm{HF}}=\sum_{i=1}^{n / 2} \varepsilon_{i}+\sum_{r} \sum_{s} \sum_{i=1}^{n / 2} c_{r i}^{*} c_{s i} H_{r s}^{\mathrm{core}}+V_{N N} \\
E_{\mathrm{HF}}=\sum_{i=1}^{n / 2} \varepsilon_{i}+\frac{1}{2} \sum_{r=1}^{b} \sum_{s=1}^{b} P_{r s} H_{r s}^{\mathrm{core}}+V_{N N} \tag{14.44}
\end{gather}
\)

An alternative expression for $E_{\mathrm{HF}}$ is useful. Multiplication of $\hat{F} \phi_{i}=\varepsilon_{i} \phi_{i}$ [Eq. (14.25)] by $\phi_{i}^{*}$ and integration gives $\varepsilon_{i}=\left\langle\phi_{i}\right| \hat{F}\left|\phi_{i}\right\rangle$. Substitution of $\phi_{i}=\sum_{s=1}^{b} c_{s i} \chi_{s}$ [Eq. (14.33)] gives $\varepsilon_{i}=\sum_{r} \sum_{s} c_{r i}^{*} c_{s i}\left\langle\chi_{r}\right| \hat{F}\left|\chi_{s}\right\rangle=\sum_{r} \sum_{s} c_{r i}^{*} c_{s i} F_{r s}$. The first sum in (14.44) becomes $\sum_{i} \varepsilon_{i}=\sum_{r} \sum_{s} \sum_{i} c_{r i}^{*} c_{s i} F_{r s}=\frac{1}{2} \sum_{r} \sum_{s} P_{r s} F_{r s}$, where the definition (14.42) of $P_{r s}$ was used. Equation (14.44) becomes

\(
\begin{equation}
E_{\mathrm{HF}}=\frac{1}{2} \sum_{r=1}^{b} \sum_{s=1}^{b} P_{r s}\left(F_{r s}+H_{r s}^{\mathrm{core}}\right)+V_{N N} \tag{14.45}
\end{equation}
\)

which expresses $E_{\mathrm{HF}}$ of a closed-shell molecule in terms of the density, Fock, and core-Hamiltonian matrix elements calculated with the basis functions $\chi_{r}$.

EXAMPLE

Do an SCF calculation for the helium-atom ground state using a basis set of two $1 s$ STOs with orbital exponents $\zeta_{1}=1.45$ and $\zeta_{2}=2.91$. [By trial and error, these have been found to be the optimum $\zeta$'s to use for this basis set; see C. Roetti and E. Clementi, J. Chem. Phys., 60, 4725 (1974).]

From (11.14), the normalized basis functions are (in atomic units)

\(
\begin{equation}
\chi_{1}=2 \zeta_{1}^{3 / 2} e^{-\zeta_{1} r} Y_{0}^{0}, \quad \chi_{2}=2 \zeta_{2}^{3 / 2} e^{-\zeta_{2} r} Y_{0}^{0}, \quad \zeta_{1}=1.45, \quad \zeta_{2}=2.91 \tag{14.46}
\end{equation}
\)

To solve the Roothaan equations (14.34), we need the integrals $F_{r s}$ and $S_{r s}$. The overlap integrals $S_{r s}$ are

\(
\begin{gathered}
S_{11}=\left\langle\chi_{1} \mid \chi_{1}\right\rangle=1, \quad S_{22}=\left\langle\chi_{2} \mid \chi_{2}\right\rangle=1 \\
S_{12}=S_{21}=\left\langle\chi_{1} \mid \chi_{2}\right\rangle=4 \zeta_{1}^{3 / 2} \zeta_{2}^{3 / 2} \int_{0}^{\infty} e^{-\left(\zeta_{1}+\zeta_{2}\right) r} r^{2} d r=\frac{8 \zeta_{1}^{3 / 2} \zeta_{2}^{3 / 2}}{\left(\zeta_{1}+\zeta_{2}\right)^{3}}=0.8366
\end{gathered}
\)

where the Appendix integral (A.8) was used.
The integrals $F_{r s}$ are given by (14.41) and depend on $H_{r s}^{\mathrm{core}}$, $P_{t u}$, and $(r s \mid t u)$. From (14.27), $\hat{H}^{\mathrm{core}}=-\frac{1}{2} \nabla^{2}-2 / r=-\frac{1}{2} \nabla^{2}-\zeta / r+(\zeta-2) / r$. The integrals $H_{r s}^{\mathrm{core}}$ are evaluated the same way that similar integrals were evaluated in the variation treatment of He in Section 9.4. We find (Prob. 14.12)

\(
\begin{gathered}
H_{11}^{\mathrm{core}}=\left\langle\chi_{1}\right| \hat{H}^{\mathrm{core}}\left|\chi_{1}\right\rangle=-\frac{1}{2} \zeta_{1}^{2}+\left(\zeta_{1}-2\right) \zeta_{1}=\frac{1}{2} \zeta_{1}^{2}-2 \zeta_{1}=-1.8488 \\
H_{22}^{\mathrm{core}}=\frac{1}{2} \zeta_{2}^{2}-2 \zeta_{2}=-1.5860 \\
H_{12}^{\mathrm{core}}=H_{21}^{\mathrm{core}}=\left\langle\chi_{1}\right| \hat{H}^{\mathrm{core}}\left|\chi_{2}\right\rangle=-\frac{1}{2} \zeta_{2}^{2} S_{12}+\frac{4\left(\zeta_{2}-2\right) \zeta_{1}^{3 / 2} \zeta_{2}^{3 / 2}}{\left(\zeta_{1}+\zeta_{2}\right)^{2}} \\
H_{12}^{\mathrm{core}}=H_{21}^{\mathrm{core}}=\frac{\zeta_{1}^{3 / 2} \zeta_{2}^{3 / 2}\left(4 \zeta_{1} \zeta_{2}-8 \zeta_{1}-8 \zeta_{2}\right)}{\left(\zeta_{1}+\zeta_{2}\right)^{3}}=-1.8826
\end{gathered}
\)

Many of the electron-repulsion integrals $(r s \mid t u)$ are equal to one another. For real basis functions, one can show that (Prob. 14.13)

\(
\begin{equation}
(r s \mid t u)=(s r \mid t u)=(r s \mid u t)=(s r \mid u t)=(t u \mid r s)=(u t \mid r s)=(t u \mid s r)=(u t \mid s r) \tag{14.47}
\end{equation}
\)

The electron-repulsion integrals are evaluated using the $1 / r_{12}$ expansion (9.124) in Prob. 9.14. One finds [see Eq. (9.53) and Prob. 14.14]

\(
\begin{aligned}
(11 \mid 11) & =\frac{5}{8} \zeta_{1}=0.9062, \quad(22 \mid 22)=\frac{5}{8} \zeta_{2}=1.8188 \\
(11 \mid 22) & =(22 \mid 11)=\left(\zeta_{1}^{4} \zeta_{2}+4 \zeta_{1}^{3} \zeta_{2}^{2}+\zeta_{1} \zeta_{2}^{4}+4 \zeta_{1}^{2} \zeta_{2}^{3}\right) /\left(\zeta_{1}+\zeta_{2}\right)^{4}=1.1826 \\
(12 \mid 12) & =(21 \mid 12)=(12 \mid 21)=(21 \mid 21)=20 \zeta_{1}^{3} \zeta_{2}^{3} /\left(\zeta_{1}+\zeta_{2}\right)^{5}=0.9536 \\
(11 \mid 12) & =(11 \mid 21)=(12 \mid 11)=(21 \mid 11)=\frac{16 \zeta_{1}^{9 / 2} \zeta_{2}^{3 / 2}}{\left(3 \zeta_{1}+\zeta_{2}\right)^{4}}\left[\frac{12 \zeta_{1}+8 \zeta_{2}}{\left(\zeta_{1}+\zeta_{2}\right)^{2}}+\frac{9 \zeta_{1}+\zeta_{2}}{2 \zeta_{1}^{2}}\right]=0.9033 \\
(12 \mid 22) & =(22 \mid 12)=(21 \mid 22)=(22 \mid 21) \\
& =\text { the }(11 \mid 12) \text { expression with } 1 \text { and } 2 \text { interchanged }=1.2980
\end{aligned}
\)
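
As a check on the arithmetic, the closed-form expressions above can be evaluated with a few lines of Python. This is a sketch of ours (the helper name rep_1112 is our own label, not notation from the text):

```python
# Numerical check of the integrals for the two-STO helium basis.
z1, z2 = 1.45, 2.91                                  # orbital exponents

S12 = 8 * (z1 * z2) ** 1.5 / (z1 + z2) ** 3          # overlap, approx 0.8366
H11 = 0.5 * z1 ** 2 - 2 * z1                         # approx -1.8488
H22 = 0.5 * z2 ** 2 - 2 * z2                         # approx -1.5860
H12 = (z1 * z2) ** 1.5 * (4 * z1 * z2 - 8 * z1 - 8 * z2) / (z1 + z2) ** 3

J1111 = 5 * z1 / 8                                   # (11|11), approx 0.9062
J2222 = 5 * z2 / 8                                   # (22|22), approx 1.8188
J1122 = (z1**4 * z2 + 4 * z1**3 * z2**2 + z1 * z2**4
         + 4 * z1**2 * z2**3) / (z1 + z2) ** 4       # (11|22), approx 1.1826
J1212 = 20 * z1**3 * z2**3 / (z1 + z2) ** 5          # (12|12), approx 0.9536

def rep_1112(a, b):
    """(11|12)-type integral; rep_1112(z2, z1) gives the (12|22) value."""
    return (16 * a**4.5 * b**1.5 / (3 * a + b) ** 4) * (
        (12 * a + 8 * b) / (a + b) ** 2 + (9 * a + b) / (2 * a ** 2))

print(S12, H11, H22, H12)
print(J1111, J2222, J1122, J1212, rep_1112(z1, z2), rep_1112(z2, z1))
# The printed values agree with the rounded integrals quoted above.
```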

To start the calculation, we need an initial guess for the ground-state AO expansion coefficients $c_{s i}$ in (14.33) so that we can get an initial estimate of the density matrix elements $P_{t u}$ in (14.41). We saw in Section 9.4 that the optimum orbital exponent for a helium AO that consists of one $1 s$ STO is $\frac{27}{16}=1.6875$. Since the orbital exponent $\zeta_{1}$ is much closer to 1.6875 than is $\zeta_{2}$, we expect that the coefficient of $\chi_{1}$ in $\phi_{1}=c_{11} \chi_{1}+c_{21} \chi_{2}$ will be substantially larger than the coefficient of $\chi_{2}$. Let us take as an initial guess $c_{11} / c_{21} \approx 2$. [A more general method to get an initial guess for the $c_{s i}$ coefficients is to neglect the electron-repulsion integrals in (14.41) and approximate $F_{r s}$ in the secular equation (14.36) by $F_{r s} \approx H_{r s}^{\text {core }}$; we then solve (14.36) and (14.34). This would give $c_{11} / c_{21} \approx 1.5$ (Prob. 14.15).] The normalization condition $\int\left|\phi_{1}\right|^{2} d \tau=1$ gives for real coefficients (Prob. 14.17)

\(
\begin{equation}
c_{21}=\left(1+k^{2}+2 k S_{12}\right)^{-1 / 2}, \quad \text { where } k \equiv c_{11} / c_{21} \tag{14.48}
\end{equation}
\)

Substitution of $k=2$ and $S_{12}=0.8366$ gives $c_{21} \approx 0.3461$ and $c_{11} \approx 2 c_{21}=0.6922$.
With $n=2$ and $b=2$, Eq. (14.42) gives

\(
\begin{equation}
P_{11}=2 c_{11}^{*} c_{11}, \quad P_{12}=2 c_{11}^{*} c_{21}, \quad P_{21}=P_{12}^{*}, \quad P_{22}=2 c_{21}^{*} c_{21} \tag{14.49}
\end{equation}
\)

The initial guess $c_{11} \approx 0.6922$, $c_{21} \approx 0.3461$ gives as the initial density matrix elements

\(
P_{11} \approx 0.9583, \quad P_{12}=P_{21} \approx 0.4791, \quad P_{22} \approx 0.2396
\)

The Fock matrix elements are found from (14.41) with $b=2$. Using (14.47) and $P_{12}=P_{21}$ for real functions, we get (Prob. 14.16a)

\(
\begin{gathered}
F_{11}=H_{11}^{\text {core }}+\frac{1}{2} P_{11}(11 \mid 11)+P_{12}(11 \mid 12)+P_{22}\left[(11 \mid 22)-\frac{1}{2}(12 \mid 21)\right] \\
F_{12}=F_{21}=H_{12}^{\text {core }}+\frac{1}{2} P_{11}(12 \mid 11)+P_{12}\left[\frac{3}{2}(12 \mid 12)-\frac{1}{2}(11 \mid 22)\right]+\frac{1}{2} P_{22}(12 \mid 22) \\
F_{22}=H_{22}^{\text {core }}+P_{11}\left[(22 \mid 11)-\frac{1}{2}(21 \mid 12)\right]+P_{12}(22 \mid 12)+\frac{1}{2} P_{22}(22 \mid 22)
\end{gathered}
\)

Substitution of the values of the $H_{r s}^{\text {core }}$ and $(r s \mid t u)$ integrals listed previously gives (Prob. 14.16b)

\(
\begin{gather}
F_{11}=-1.8488+0.4531 P_{11}+0.9033 P_{12}+0.7058 P_{22} \tag{14.50}\\
F_{12}=F_{21}=-1.8826+0.4516_{5} P_{11}+0.8391 P_{12}+0.6490 P_{22} \tag{14.51}\\
F_{22}=-1.5860+0.7058 P_{11}+1.2980 P_{12}+0.9094 P_{22} \tag{14.52}
\end{gather}
\)

Substitution of the initial guess for the $P_{t u}$'s into (14.50) to (14.52) gives as the initial estimates of the $F_{r s}$ matrix elements:

\(
F_{11} \approx-0.813, \quad F_{12}=F_{21} \approx-0.892, \quad F_{22} \approx-0.070
\)

The initial estimate of the secular equation $\operatorname{det}\left(F_{r s}-S_{r s} \varepsilon_{i}\right)=0$ is

\(
\begin{gathered}
\left|\begin{array}{cc}
-0.813-\varepsilon_{i} & -0.892-0.8366 \varepsilon_{i} \\
-0.892-0.8366 \varepsilon_{i} & -0.070-\varepsilon_{i}
\end{array}\right| \approx 0 \\
0.3001 \varepsilon_{i}^{2}-0.609_{5} \varepsilon_{i}-0.739 \approx 0 \\
\varepsilon_{1} \approx-0.854, \quad \varepsilon_{2} \approx 2.885
\end{gathered}
\)

Substitution of the lower root $\varepsilon_{1}$ into the Roothaan equation (14.34) with $r=2$ gives

\(
\begin{gathered}
c_{11}\left(F_{21}-\varepsilon_{1} S_{21}\right)+c_{21}\left(F_{22}-\varepsilon_{1} S_{22}\right) \approx 0 \\
-0.177_{5} c_{11}+0.784 c_{21} \approx 0 \\
c_{11} / c_{21} \approx 4.42
\end{gathered}
\)

Substitution of $k=4.42$ and $S_{12}=0.8366$ in the normalization condition (14.48) gives

\(
c_{21} \approx 0.189, \quad c_{11}=k c_{21} \approx 0.836
\)

Substitution of these improved coefficients into (14.49) gives as the improved density matrix elements

\(
P_{11} \approx 1.398, \quad P_{12}=P_{21} \approx 0.316, \quad P_{22} \approx 0.071
\)

Substitution of these improved $P_{t u}$'s into (14.50) to (14.52) gives as the improved $F_{r s}$ values

\(
F_{11} \approx-0.880, \quad F_{12}=F_{21} \approx-0.940, \quad F_{22} \approx-0.124_{6}
\)

The improved secular equation is

\(
\begin{gathered}
\left|\begin{array}{cc}
-0.880-\varepsilon_{i} & -0.940-0.8366 \varepsilon_{i} \\
-0.940-0.8366 \varepsilon_{i} & -0.124_{6}-\varepsilon_{i}
\end{array}\right| \approx 0 \\
\varepsilon_{1} \approx-0.918, \quad \varepsilon_{2} \approx 2.810
\end{gathered}
\)

The improved $\varepsilon_{1}$ value gives $c_{11} / c_{21} \approx 4.61$ and

\(
c_{11} \approx 0.842, \quad c_{21} \approx 0.183
\)

Another cycle of calculation yields (Prob. 14.18)

\(
\begin{gather}
P_{11}=1.418, \quad P_{12}=P_{21}=0.308, \quad P_{22}=0.067 \\
F_{11}=-0.881, \quad F_{12}=F_{21}=-0.940, \quad F_{22}=-0.124_{5} \tag{14.53}\\
\varepsilon_{1}=-0.918, \quad \varepsilon_{2}=2.809 \tag{14.54}\\
c_{11}=0.842, \quad c_{21}=0.183
\end{gather}
\)

These last $c$'s are the same as those for the previous cycle, so the calculation has converged and we are finished. The He ground-state SCF AO for this basis set is

\(
\phi_{1}=0.842 \chi_{1}+0.183 \chi_{2}
\)

The SCF energy is found from (14.44) with $n=2$ and $b=2$ as

\(
\begin{aligned}
E_{\mathrm{HF}} & =-0.918+\frac{1}{2}[1.418(-1.8488)+2(0.308)(-1.8826)+0.067(-1.5860)]+0 \\
& =-2.862 \text { hartrees }=-77.9 \mathrm{eV}
\end{aligned}
\)

A more precise calculation with $\zeta_{1}=1.45363$ and $\zeta_{2}=2.91093$ gives an SCF energy of -2.8616726 hartrees, as compared with the limiting Hartree-Fock energy -2.8616799 hartrees found with five basis functions [C. Roetti and E. Clementi, J. Chem. Phys., 60, 4725 (1974)].
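
The whole iteration of this example compresses into a short program. The following Python sketch is our own illustration (it uses scipy.linalg.eigh to solve the secular equation rather than expanding the 2x2 determinant by hand) and reproduces the converged results (14.53) and (14.54):

```python
import numpy as np
from scipy.linalg import eigh

S = np.array([[1.0, 0.8366], [0.8366, 1.0]])          # overlap matrix
Hcore = np.array([[-1.8488, -1.8826], [-1.8826, -1.5860]])

# Two-electron integrals (rs|tu), expanded with the eight-fold
# permutational symmetry of Eq. (14.47).
unique = {(0, 0, 0, 0): 0.9062, (1, 1, 1, 1): 1.8188, (0, 0, 1, 1): 1.1826,
          (0, 1, 0, 1): 0.9536, (0, 0, 0, 1): 0.9033, (0, 1, 1, 1): 1.2980}
eri = {}
for (r, s, t, u), v in unique.items():
    for key in [(r, s, t, u), (s, r, t, u), (r, s, u, t), (s, r, u, t),
                (t, u, r, s), (u, t, r, s), (t, u, s, r), (u, t, s, r)]:
        eri[key] = v

c = np.array([0.6922, 0.3461])            # initial guess, c11/c21 = 2
for cycle in range(50):
    P = 2.0 * np.outer(c, c)              # density matrix, Eq. (14.42)
    F = np.array([[Hcore[r, s]
                   + sum(P[t, u] * (eri[r, s, t, u] - 0.5 * eri[r, u, t, s])
                         for t in (0, 1) for u in (0, 1))
                   for s in (0, 1)] for r in (0, 1)])   # Eq. (14.41)
    eps, C = eigh(F, S)                   # FC = SCe, Eq. (14.56)
    c_new = C[:, 0] if C[0, 0] > 0 else -C[:, 0]        # fix overall sign
    if np.allclose(c_new, c, atol=1e-6):  # converged?
        break
    c = c_new

E_HF = eps[0] + 0.5 * np.sum(P * Hcore)   # Eq. (14.44) with V_NN = 0
print(eps[0], c_new, E_HF)  # about -0.918, [0.842 0.183], -2.862 hartrees
```

Note that eigh normalizes the generalized eigenvectors so that $\mathbf{C}^{\dagger} \mathbf{S C}=\mathbf{1}$, which is exactly the normalization condition (14.48).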

Matrix Form of the Roothaan Equations

The Roothaan equations are most efficiently solved using matrix methods. The Roothaan equations (14.34) read

\(
\sum_{s=1}^{b} F_{r s} c_{s i}=\sum_{s=1}^{b} S_{r s} c_{s i} \varepsilon_{i}, \quad r=1,2, \ldots, b
\)

The coefficients $c_{s i}$ relate the MOs $\phi_{i}$ to the basis functions $\chi_{s}$ according to $\phi_{i}=\sum_{s} c_{s i} \chi_{s}$. Let $\mathbf{C}$ be the square matrix of order $b$ whose elements are the coefficients $c_{s i}$. Let $\mathbf{F}$ be the square matrix of order $b$ whose elements are $F_{r s}=\left\langle\chi_{r}\right| \hat{F}\left|\chi_{s}\right\rangle$. Let $\mathbf{S}$ be the square matrix whose elements are $S_{r s}=\left\langle\chi_{r} \mid \chi_{s}\right\rangle$. Let $\boldsymbol{\varepsilon}$ be the diagonal square matrix whose diagonal elements are the orbital energies $\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{b}$, so that the elements of $\boldsymbol{\varepsilon}$ are $\varepsilon_{m i}=\delta_{m i} \varepsilon_{i}$, where $\delta_{m i}$ is the Kronecker delta.

Use of the matrix multiplication rule (7.107) gives the $(s, i)$ th element of the matrix product $\mathbf{C} \boldsymbol{\varepsilon}$ as $(\mathbf{C} \boldsymbol{\varepsilon})_{s i}=\sum_{m} c_{s m} \varepsilon_{m i}=\sum_{m} c_{s m} \delta_{m i} \varepsilon_{i}=c_{s i} \varepsilon_{i}$. Hence the Roothaan equations read

\(
\begin{equation}
\sum_{s=1}^{b} F_{r s} c_{s i}=\sum_{s=1}^{b} S_{r s}(\mathbf{C} \boldsymbol{\varepsilon})_{s i} \tag{14.55}
\end{equation}
\)

From the matrix multiplication rule, the left side of (14.55) is the $(r, i)$ th element of $\mathbf{F C}$, and the right side is the $(r, i)$ th element of $\mathbf{S}(\mathbf{C} \boldsymbol{\varepsilon})$. Since the general element of $\mathbf{F C}$ equals the general element of $\mathbf{S C} \boldsymbol{\varepsilon}$, these matrices are equal:

\(
\begin{equation}
\mathbf{F C}=\mathbf{S C} \boldsymbol{\varepsilon} \tag{14.56}
\end{equation}
\)

This is the matrix form of the Roothaan equations.
The set of basis functions $\chi_{s}$ used to expand the MOs is not an orthogonal set. However, one can use the Schmidt or some other procedure to form orthogonal linear combinations of the basis functions to give a new set of basis functions $\chi_{s}^{\prime}$ that is an orthonormal set: $\chi_{s}^{\prime}=\sum_{t} a_{t s} \chi_{t}$ and $S_{r s}^{\prime}=\left\langle\chi_{r}^{\prime} \mid \chi_{s}^{\prime}\right\rangle=\delta_{r s}$. (See Probs. 8.57 and 8.58 and Szabo and Ostlund, Section 3.4.5, for details of the orthogonalization procedure.) With this orthonormal basis set, the overlap matrix is a unit matrix, and the Roothaan equations (14.56) have the simpler form

\(
\begin{equation}
\mathbf{F}^{\prime} \mathbf{C}^{\prime}=\mathbf{C}^{\prime} \boldsymbol{\varepsilon} \tag{14.57}
\end{equation}
\)

where $F_{r s}^{\prime}=\left\langle\chi_{r}^{\prime}\right| \hat{F}\left|\chi_{s}^{\prime}\right\rangle$ and $\mathbf{C}^{\prime}$ is the matrix of the coefficients that relate the MOs $\phi_{i}$ to the orthonormal basis functions: $\phi_{i}=\sum_{s} c_{s i}^{\prime} \chi_{s}^{\prime}$. It was shown in Prob. 8.57c that the $\mathbf{F}$ and $\mathbf{F}^{\prime}$ matrices and the $\mathbf{C}$ and $\mathbf{C}^{\prime}$ matrices are related by

\(
\mathbf{F}^{\prime}=\mathbf{A}^{\dagger} \mathbf{F A} \text { and } \mathbf{C}=\mathbf{A C}^{\prime}
\)

where $\mathbf{A}$ is the matrix of coefficients $a_{t s}$ in $\chi_{s}^{\prime}=\sum_{t} a_{t s} \chi_{t}$, so we can readily calculate $\mathbf{F}^{\prime}$ from $\mathbf{F}$ and $\mathbf{C}$ from $\mathbf{C}^{\prime}$. [$\mathbf{H}$ in Prob. 8.57 corresponds to $\mathbf{F}$ in (14.56).]

The matrix equation (14.57) has the same form as Eq. (8.87), which is $\mathbf{H C}=\mathbf{C W}$, where $\mathbf{C}$ and $\mathbf{W}$ [defined by (8.86)] are the eigenvector matrix and eigenvalue matrix, respectively, of $\mathbf{H}$. Thus, the orbital energies $\varepsilon_{i}$ are the eigenvalues of the Fock matrix $\mathbf{F}^{\prime}$, and each column of $\mathbf{C}^{\prime}$ is an eigenvector of $\mathbf{F}^{\prime}$. Because the Fock operator $\hat{F}$ is Hermitian, the Fock matrix $\mathbf{F}^{\prime}$ is a Hermitian matrix. As noted in the paragraph preceding Eq. (8.94), the eigenvector matrix $\mathbf{C}^{\prime}$ of the Hermitian matrix $\mathbf{F}^{\prime}$ can be chosen to be unitary, meaning that its inverse equals its conjugate transpose [Eq. (8.92)]: $\mathbf{C}^{\prime-1}=\mathbf{C}^{\prime \dagger}$. (With a unitary coefficient matrix $\mathbf{C}^{\prime}$, the MOs $\phi_{i}$ are orthonormal; see Prob. 14.22.) Multiplication of (14.57) on the left by $\mathbf{C}^{\prime-1}=\mathbf{C}^{\prime \dagger}$ gives [see Eqs. (8.88) and (8.94)]

\(
\begin{equation}
\mathbf{C}^{\prime \dagger} \mathbf{F}^{\prime} \mathbf{C}^{\prime}=\boldsymbol{\varepsilon} \tag{14.58}
\end{equation}
\)

which has the same form as Eq. (8.94).

The following procedure is commonly used to do an SCF MO calculation at a specified molecular geometry; a code sketch of the central diagonalization step follows the list.

  1. Choose a basis set $\chi_{s}$.
  2. Evaluate the $H_{r s}^{\text {core }}$, $S_{r s}$, and $(r s \mid t u)$ integrals.
  3. Use the overlap integrals $S_{r s}$ and an orthogonalization procedure to calculate the $\mathbf{A}$ matrix of coefficients $a_{t s}$ that will produce orthonormal basis functions $\chi_{s}^{\prime}=\sum_{t} a_{t s} \chi_{t}$.
  4. Make an initial guess for the coefficients $c_{s i}$ in the MOs $\phi_{i}=\sum_{s} c_{s i} \chi_{s}$. From the initial guess of coefficients, calculate the density matrix $\mathbf{P}$ in (14.42).
  5. Use (14.41) to calculate an estimate of the Fock matrix elements $F_{r s}$ from $\mathbf{P}$ and the $(r s \mid t u)$ and $H_{r s}^{\text {core }}$ integrals.
  6. Calculate the matrix $\mathbf{F}^{\prime}$ using $\mathbf{F}^{\prime}=\mathbf{A}^{\dagger} \mathbf{F A}$.
  7. Use a matrix-diagonalization method (Section 8.6) to find the eigenvalue and eigenvector matrices $\boldsymbol{\varepsilon}$ and $\mathbf{C}^{\prime}$ of $\mathbf{F}^{\prime}$.
  8. Calculate the coefficient matrix $\mathbf{C}=\mathbf{A C}^{\prime}$.
  9. Calculate an improved estimate of the density matrix from $\mathbf{C}$ using $\mathbf{P}^{*}=2 \mathbf{C} \mathbf{C}^{\dagger}$, which is the matrix form of (14.42) (Prob. 14.10c).
  10. Compare the improved $\mathbf{P}$ with the preceding estimate of $\mathbf{P}$. If all corresponding matrix elements differ by negligible amounts from each other, the calculation has converged and one uses the converged SCF wave function to calculate molecular properties. If the calculation has not converged, go back to step (5) to calculate an improved $\mathbf{F}$ matrix from the current $\mathbf{P}$ matrix and then do the succeeding steps.
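
Steps 3 and 6 to 8 convert the generalized eigenvalue problem (14.56) into the ordinary eigenvalue problem (14.57). A minimal Python sketch of one such diagonalization step follows; it assumes real basis functions and uses symmetric (Löwdin) orthogonalization to build $\mathbf{A}$, one acceptable choice among the orthogonalization procedures mentioned in step 3:

```python
import numpy as np

def roothaan_step(F, S):
    """One diagonalization step of the matrix Roothaan equations (14.56)."""
    # Step 3: A = S**(-1/2) from the eigendecomposition of S, so that A†SA = 1
    s_vals, U = np.linalg.eigh(S)
    A = U @ np.diag(s_vals ** -0.5) @ U.T
    # Step 6: transform the Fock matrix to the orthonormal basis
    Fp = A.T @ F @ A                      # F' = A†FA (A is real and symmetric)
    # Step 7: eigenvalues and eigenvectors of F', Eq. (14.57)
    eps, Cp = np.linalg.eigh(Fp)
    # Step 8: back-transform the coefficients
    C = A @ Cp                            # C = AC'
    return eps, C

# Step 9 then builds the improved density matrix from the n/2 occupied
# columns of C: P = 2 * C_occ @ C_occ.T for real orbitals.
```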

One way to begin an SCF calculation is to initially estimate the Fock matrix elements by $F_{r s} \approx H_{r s}^{\text {core }}$, which amounts to neglecting the double sum in (14.41). This gives a very crude estimate. More commonly, SCF calculations get the initial estimate of the density matrix by doing a semiempirical calculation (Section 17.4) on the molecule. Semiempirical calculations are very fast. Still another possibility is to construct a guess for the $\mathbf{P}$ matrix by using the density matrices for the atoms composing the molecule. To find the equilibrium geometry of a molecule, one does a series of SCF calculations at many successive geometries (see Section 15.10). For the second and later SCF calculations of the series, one takes the initial guess of $\mathbf{P}$ as $\mathbf{P}$ for the SCF wave function of a nearby geometry.


We now derive the virial theorem. Let $\hat{H}$ be the time-independent Hamiltonian of a system in the bound stationary state $\psi$ :

\(
\begin{equation}
\hat{H} \psi=E \psi \tag{14.59}
\end{equation}
\)

Let $\hat{A}$ be a linear, time-independent operator. Consider the integral

\(
\begin{equation}
\int \psi^{*}[\hat{H}, \hat{A}] \psi d \tau=\langle\psi| \hat{H} \hat{A}-\hat{A} \hat{H}|\psi\rangle=\langle\psi| \hat{H}|\hat{A} \psi\rangle-E\langle\psi| \hat{A}|\psi\rangle \tag{14.60}
\end{equation}
\)

where (14.59) was used. Since $\hat{H}$ is Hermitian, we have

\(
\langle\psi| \hat{H}|\hat{A} \psi\rangle=\langle\hat{A} \psi| \hat{H}|\psi\rangle^{*}=E^{*}\langle\hat{A} \psi \mid \psi\rangle^{*}=E\langle\psi \mid \hat{A} \psi\rangle=E\langle\psi| \hat{A}|\psi\rangle
\)

and Eq. (14.60) becomes

\(
\begin{equation}
\int \psi^{*}[\hat{H}, \hat{A}] \psi d \tau=0 \tag{14.61}
\end{equation}
\)

Equation (14.61) is the hypervirial theorem. [For some of its applications, see J. O. Hirschfelder, J. Chem. Phys., 33, 1462 (1960); J. H. Epstein and S. T. Epstein, Am. J. Phys., 30, 266 (1962).] In deriving (14.61), we used the Hermitian property of $\hat{H}$. The proof that $\hat{p}_{x}$ and $\hat{p}_{x}^{2}$ are Hermitian, and hence that $\hat{H}$ is Hermitian, requires that $\psi$ vanish at $\pm \infty$ [see Eq. (7.17)]. Hence the hypervirial theorem does not apply to continuum stationary states, for which $\psi$ does not vanish at $\infty$.

We now derive the virial theorem from (14.61). We choose $\hat{A}$ to be

\(
\begin{equation}
\sum_{i} \hat{q}_{i} \hat{p}_{i}=-i \hbar \sum_{i} q_{i} \frac{\partial}{\partial q_{i}} \tag{14.62}
\end{equation}
\)

where the sum runs over the $3 n$ Cartesian coordinates of the $n$ particles. (Particle 1 has Cartesian coordinates $q_{1}, q_{2}, q_{3}$ and linear-momentum components $p_{1}, p_{2}, p_{3}$. In this chapter the symbol $q$ will indicate a Cartesian coordinate.) To evaluate $[\hat{H}, \hat{A}]$, we use (5.4), (5.5), (5.8), and (5.9) to get

\(
\begin{align}
\left[\hat{H}, \sum_{i} \hat{q}_{i} \hat{p}_{i}\right] & =\sum_{i}\left[\hat{H}, \hat{q}_{i} \hat{p}_{i}\right]=\sum_{i} \hat{q}_{i}\left[\hat{H}, \hat{p}_{i}\right]+\sum_{i}\left[\hat{H}, \hat{q}_{i}\right] \hat{p}_{i} \\
& =i \hbar \sum_{i} q_{i} \frac{\partial V}{\partial q_{i}}-i \hbar \sum_{i} \frac{1}{m_{i}} \hat{p}_{i}^{2}=i \hbar \sum_{i} q_{i} \frac{\partial V}{\partial q_{i}}-2 i \hbar \hat{T} \tag{14.63}
\end{align}
\)

where $\hat{T}$ and $\hat{V}$ are the kinetic- and potential-energy operators for the system. Substitution of (14.63) into (14.61) gives

\(
\begin{equation}
\langle\psi| \sum_{i} q_{i} \frac{\partial V}{\partial q_{i}}|\psi\rangle=2\langle\psi| \hat{T}|\psi\rangle \tag{14.64}
\end{equation}
\)

Using $\langle B\rangle$ for the quantum-mechanical average of $B$, we write (14.64) as

\(
\begin{equation}
\left\langle\sum_{i} q_{i} \frac{\partial V}{\partial q_{i}}\right\rangle=2\langle T\rangle \tag{14.65}
\end{equation}
\)

Equation (14.65) is the quantum-mechanical virial theorem. Note that its validity is restricted to bound stationary states. (The word vires is Latin for "forces." In classical mechanics, the derivatives of the potential energy give the negatives of the force components. There is also a classical-mechanical virial theorem.)

For certain systems the virial theorem takes on a simple form. To discuss these systems, we introduce the idea of a homogeneous function. A function $f\left(x_{1}, x_{2}, \ldots, x_{j}\right)$ of several variables is homogeneous of degree $n$ if it satisfies

\(
\begin{equation}
f\left(s x_{1}, s x_{2}, \ldots, s x_{j}\right)=s^{n} f\left(x_{1}, x_{2}, \ldots, x_{j}\right) \tag{14.66}
\end{equation}
\)

where $s$ is an arbitrary parameter. For example, the function $g=1 / y^{3}+x / y^{2} z^{2}$ is homogeneous of degree $-3$, since $g(s x, s y, s z)=1 / s^{3} y^{3}+s x / s^{2} y^{2} s^{2} z^{2}=s^{-3} g(x, y, z)$.

Euler's theorem on homogeneous functions states that, if $f\left(x_{1}, \ldots, x_{j}\right)$ is homogeneous of degree $n$, then

\(
\begin{equation}
\sum_{k=1}^{j} x_{k} \frac{\partial f}{\partial x_{k}}=n f \tag{14.67}
\end{equation}
\)

The theorem is proved as follows. Let

\(
u_{1} \equiv s x_{1}, \quad u_{2} \equiv s x_{2}, \quad \ldots, \quad u_{j} \equiv s x_{j}
\)

Use of the chain rule gives for the partial derivative of the left side of (14.66) with respect to $s$

\(
\begin{aligned}
\frac{\partial f\left(u_{1}, \ldots, u_{j}\right)}{\partial s} & =\frac{\partial f}{\partial u_{1}} \frac{\partial u_{1}}{\partial s}+\frac{\partial f}{\partial u_{2}} \frac{\partial u_{2}}{\partial s}+\cdots+\frac{\partial f}{\partial u_{j}} \frac{\partial u_{j}}{\partial s} \\
& =x_{1} \frac{\partial f}{\partial u_{1}}+x_{2} \frac{\partial f}{\partial u_{2}}+\cdots+x_{j} \frac{\partial f}{\partial u_{j}}=\sum_{k=1}^{j} x_{k} \frac{\partial f}{\partial u_{k}}
\end{aligned}
\)

The partial derivative of Eq. (14.66) with respect to $s$ is thus

\(
\begin{equation}
\sum_{k=1}^{j} x_{k} \frac{\partial f\left(u_{1}, \ldots, u_{j}\right)}{\partial u_{k}}=n s^{n-1} f\left(x_{1}, \ldots, x_{j}\right) \tag{14.68}
\end{equation}
\)

Let $s=1$, so that $u_{i}=x_{i}$; Eq. (14.68) then gives (14.67). This completes the proof.
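
Euler's theorem is easy to verify numerically. The short Python check below is our own illustration (the test point is arbitrary), applied to the example function $g=1 / y^{3}+x / y^{2} z^{2}$ of degree $n=-3$:

```python
import numpy as np

def g(x, y, z):
    # homogeneous of degree -3, as shown in the text
    return 1 / y**3 + x / (y**2 * z**2)

x, y, z, s, n = 1.3, 0.7, 2.1, 1.9, -3    # arbitrary test values

# Homogeneity, Eq. (14.66)
print(np.isclose(g(s * x, s * y, s * z), s**n * g(x, y, z)))       # True

# Euler sum, Eq. (14.67), with central-difference derivatives
h = 1e-6
dgdx = (g(x + h, y, z) - g(x - h, y, z)) / (2 * h)
dgdy = (g(x, y + h, z) - g(x, y - h, z)) / (2 * h)
dgdz = (g(x, y, z + h) - g(x, y, z - h)) / (2 * h)
print(np.isclose(x * dgdx + y * dgdy + z * dgdz, n * g(x, y, z)))  # True
```
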
Now we return to the virial theorem (14.65). If the potential energy $V$ is a homogeneous function of degree $n$ when expressed in Cartesian coordinates, Euler's theorem gives

\(
\begin{equation}
\sum_{i} q_{i} \frac{\partial V}{\partial q_{i}}=n V \tag{14.69}
\end{equation}
\)

and the virial theorem (14.65) simplifies to

\(
\begin{equation}
2\langle T\rangle=n\langle V\rangle \tag{14.70}
\end{equation}
\)

for a bound stationary state. Since (Prob. 6.35)

\(
\begin{equation}
\langle T\rangle+\langle V\rangle=E \tag{14.71}
\end{equation}
\)

we can write (14.70) in two other forms:

\(
\begin{align}
\langle V\rangle & =\frac{2 E}{n+2} \tag{14.72}\\
\langle T\rangle & =\frac{n E}{n+2} \tag{14.73}
\end{align}
\)

EXAMPLE

Apply the virial theorem to (a) the one-dimensional harmonic oscillator; (b) the hydrogen atom; (c) a many-electron atom.
(a) For the one-dimensional harmonic oscillator, $V=\frac{1}{2} k x^{2}$, which is homogeneous of degree $n=2$. Equations (14.70) and (14.72) give

\(
\begin{equation}
\langle T\rangle=\langle V\rangle=\frac{1}{2} E=\frac{1}{2} h \nu\left(v+\frac{1}{2}\right) \tag{14.74}
\end{equation}
\)

This was verified for the ground state in Prob. 4.9.
(b) For the H atom, $V=-1 /\left(x^{2}+y^{2}+z^{2}\right)^{1 / 2}$ in Cartesian coordinates and atomic units. $V$ is a homogeneous function of degree -1 . Hence

\(
\begin{equation}
2\langle T\rangle=-\langle V\rangle \tag{14.75}
\end{equation}
\)

which was verified for the ground state in Prob. 6.36. For every hydrogen-atom bound stationary state,

\(
\begin{equation}
\langle V\rangle=2 E \quad \text { and } \quad\langle T\rangle=-E \tag{14.76}
\end{equation}
\)

(c) For a many-electron atom with spin-orbit interaction neglected,

\(
V=-Z \sum_{i=1}^{n} \frac{1}{\left(x_{i}^{2}+y_{i}^{2}+z_{i}^{2}\right)^{1 / 2}}+\sum_{i} \sum_{j>i} \frac{1}{\left[\left(x_{i}-x_{j}\right)^{2}+\left(y_{i}-y_{j}\right)^{2}+\left(z_{i}-z_{j}\right)^{2}\right]^{1 / 2}}
\)

Replacing each of the $3 n$ coordinates by $s$ times the coordinate, we find that $V$ is homogeneous of degree -1 . Hence Eqs. (14.75) and (14.76) hold for every atom.
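
These results are easily checked numerically. The radial-grid sketch below is our own illustration (not a method from the text); it verifies $\langle V\rangle=2 E$ and $\langle T\rangle=-E$ for the hydrogen-atom ground state in atomic units:

```python
import numpy as np

# Hydrogen 1s state: radial factor R(r) = 2 exp(-r), E = -1/2 (a.u.)
r = np.linspace(1e-6, 40.0, 400001)
dr = r[1] - r[0]
R = 2.0 * np.exp(-r)                             # normalized radial factor

V_avg = np.sum(R**2 * (-1.0 / r) * r**2) * dr    # <V> = <-1/r>
dRdr = np.gradient(R, dr)
T_avg = 0.5 * np.sum(dRdr**2 * r**2) * dr        # <T> for an l = 0 state

print(V_avg, T_avg)   # about -1.0 and +0.5, so <V> = 2E and <T> = -E
```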

Now consider molecules. In the Born-Oppenheimer approximation, the molecular wave function is [Eq. (13.12)]

\(
\psi=\psi_{\mathrm{el}}\left(q_{i} ; q_{\alpha}\right) \psi_{N}\left(q_{\alpha}\right)
\)

where $q_{i}$ and $q_{\alpha}$ symbolize the electronic and nuclear coordinates, respectively. $\psi_{\mathrm{el}}$ is found by solving the electronic Schrödinger equation (13.7):

\(
\hat{H}_{\mathrm{el}} \psi_{\mathrm{el}}\left(q_{i} ; q_{\alpha}\right)=E_{\mathrm{el}}\left(q_{\alpha}\right) \psi_{\mathrm{el}}\left(q_{i} ; q_{\alpha}\right)
\)

where $E_{\mathrm{el}}$ is the purely electronic energy and where (in atomic units)

\(
\begin{gather}
\hat{H}_{\mathrm{el}}=\hat{T}_{\mathrm{el}}+\hat{V}_{\mathrm{el}} \tag{14.77}\\
\hat{T}_{\mathrm{el}}=-\frac{1}{2} \sum_{i}\left(\frac{\partial^{2}}{\partial x_{i}^{2}}+\frac{\partial^{2}}{\partial y_{i}^{2}}+\frac{\partial^{2}}{\partial z_{i}^{2}}\right) \tag{14.78}\\
\hat{V}_{\mathrm{el}}=-\sum_{\alpha} \sum_{i} \frac{Z_{\alpha}}{\left[\left(x_{i}-x_{\alpha}\right)^{2}+\left(y_{i}-y_{\alpha}\right)^{2}+\left(z_{i}-z_{\alpha}\right)^{2}\right]^{1 / 2}}+\sum_{i} \sum_{j>i} \frac{1}{\left[\left(x_{i}-x_{j}\right)^{2}+\left(y_{i}-y_{j}\right)^{2}+\left(z_{i}-z_{j}\right)^{2}\right]^{1 / 2}} \tag{14.79}
\end{gather}
\)

Let the system be in the electronic stationary state $\psi_{\mathrm{el}}$. If we put the subscript el on $\hat{H}$ and $\psi$ in (14.59) and regard the variables of $\psi_{\mathrm{el}}$ to be the electronic coordinates $q_{i}$ (with the nuclear coordinates $q_{\alpha}$ being parameters), then the derivation of the virial theorem (14.65) is seen to be valid for the electronic kinetic- and potential-energy operators, and we have

\(
\begin{equation}
2\left\langle\psi_{\mathrm{el}}\right| \hat{T}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle=\left\langle\psi_{\mathrm{el}}\right| \sum_{i} q_{i} \frac{\partial V_{\mathrm{el}}}{\partial q_{i}}\left|\psi_{\mathrm{el}}\right\rangle \tag{14.80}
\end{equation}
\)

Viewed as a function of the electronic coordinates, $V_{\mathrm{el}}$ is not a homogeneous function, since

\(
\left[\left(s x_{i}-x_{\alpha}\right)^{2}+\left(s y_{i}-y_{\alpha}\right)^{2}+\left(s z_{i}-z_{\alpha}\right)^{2}\right]^{-1 / 2} \neq s^{-1}\left[\left(x_{i}-x_{\alpha}\right)^{2}+\left(y_{i}-y_{\alpha}\right)^{2}+\left(z_{i}-z_{\alpha}\right)^{2}\right]^{-1 / 2}
\)

Thus the virial theorem for the average electronic kinetic and potential energies of a molecule will not have the simple form (14.75), which holds for atoms. We can, however, view $V_{\mathrm{el}}$ as a function of both the electronic and the nuclear Cartesian coordinates. From this viewpoint $V_{\mathrm{el}}$ is a homogeneous function of degree $-1$, since

\(
\left[\left(s x_{i}-s x_{\alpha}\right)^{2}+\left(s y_{i}-s y_{\alpha}\right)^{2}+\left(s z_{i}-s z_{\alpha}\right)^{2}\right]^{-1 / 2}=s^{-1}\left[\left(x_{i}-x_{\alpha}\right)^{2}+\left(y_{i}-y_{\alpha}\right)^{2}+\left(z_{i}-z_{\alpha}\right)^{2}\right]^{-1 / 2}
\)

Therefore, considering $V_{\mathrm{el}}$ as a function of both electronic and nuclear coordinates and applying Euler's theorem (14.67), we have

\(
\sum_{i} q_{i} \frac{\partial V_{\mathrm{el}}}{\partial q_{i}}+\sum_{\alpha} q_{\alpha} \frac{\partial V_{\mathrm{el}}}{\partial q_{\alpha}}=-V_{\mathrm{el}}
\)

Using this equation in (14.80), we get

\(
\begin{equation}
2\left\langle\psi_{\mathrm{el}}\right| \hat{T}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle=-\left\langle\psi_{\mathrm{el}}\right| \hat{V}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle-\left\langle\psi_{\mathrm{el}}\right| \sum_{\alpha} q_{\alpha} \frac{\partial V_{\mathrm{el}}}{\partial q_{\alpha}}\left|\psi_{\mathrm{el}}\right\rangle \tag{14.81}
\end{equation}
\)

which contains an additional term as compared with the atomic virial theorem (14.75). Consider this extra term. We have

\(
\left\langle\psi_{\mathrm{el}}\right| \sum_{\alpha} q_{\alpha} \frac{\partial V_{\mathrm{el}}}{\partial q_{\alpha}}\left|\psi_{\mathrm{el}}\right\rangle=\sum_{\alpha} q_{\alpha} \int \psi_{\mathrm{el}}^{*} \frac{\partial V_{\mathrm{el}}}{\partial q_{\alpha}} \psi_{\mathrm{el}} d \tau_{\mathrm{el}}
\)

where the nuclear coordinate $q_{\alpha}$ was taken outside the integral over electronic coordinates. In Section 14.7 we shall show that [see the bracketed sentence after Eq. (14.126)]

\(
\begin{equation}
\int \psi_{\mathrm{el}}^{*} \frac{\partial V_{\mathrm{el}}}{\partial q_{\alpha}} \psi_{\mathrm{el}} d \tau_{\mathrm{el}}=\frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}} \tag{14.82}
\end{equation}
\)

[Equation (14.82) is an example of the Hellmann-Feynman theorem.] Using these last two equations in the molecular electronic virial theorem (14.81), we get

\(
\begin{align}
2\left\langle\psi_{\mathrm{el}}\right| \hat{T}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle & =-\left\langle\psi_{\mathrm{el}}\right| \hat{V}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle-\sum_{\alpha} q_{\alpha} \frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}} \\
2\left\langle T_{\mathrm{el}}\right\rangle & =-\left\langle V_{\mathrm{el}}\right\rangle-\sum_{\alpha} q_{\alpha} \frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}} \tag{14.83}
\end{align}
\)

where the $q_{\alpha}$ 's are the nuclear Cartesian coordinates. Using

\(
\begin{equation}
\left\langle T_{\mathrm{el}}\right\rangle+\left\langle V_{\mathrm{el}}\right\rangle=E_{\mathrm{el}} \tag{14.84}
\end{equation}
\)

we can eliminate either $\left\langle T_{\mathrm{el}}\right\rangle$ or $\left\langle V_{\mathrm{el}}\right\rangle$ from (14.83), which is the molecular form of the virial theorem.

Now consider a diatomic molecule. The electronic energy is a function of $R$, the internuclear distance: $E_{\mathrm{el}}=E_{\mathrm{el}}(R)$. The summation in (14.83) is over the nuclear Cartesian coordinates $x_{a}, y_{a}, z_{a}, x_{b}, y_{b}, z_{b}$. We have

\(
\begin{gather}
\frac{\partial E_{\mathrm{el}}}{\partial x_{a}}=\frac{d E_{\mathrm{el}}}{d R} \frac{\partial R}{\partial x_{a}}, \quad \frac{\partial E_{\mathrm{el}}}{\partial x_{b}}=\frac{d E_{\mathrm{el}}}{d R} \frac{\partial R}{\partial x_{b}} \\
R=\left[\left(x_{a}-x_{b}\right)^{2}+\left(y_{a}-y_{b}\right)^{2}+\left(z_{a}-z_{b}\right)^{2}\right]^{1 / 2} \\
\frac{\partial R}{\partial x_{a}}=\frac{x_{a}-x_{b}}{R}, \quad \frac{\partial R}{\partial x_{b}}=\frac{x_{b}-x_{a}}{R} \tag{14.85}
\end{gather}
\)

with similar equations for the $y$ and $z$ coordinates. The sum in (14.83) becomes

\(
\begin{aligned}
\sum_{\alpha} q_{\alpha} \frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}}=\frac{1}{R} \frac{d E_{\mathrm{el}}}{d R}\left[x_{a}\left(x_{a}-x_{b}\right)+x_{b}\left(x_{b}-x_{a}\right)\right. & +y_{a}\left(y_{a}-y_{b}\right) \\
& \left.+y_{b}\left(y_{b}-y_{a}\right)+z_{a}\left(z_{a}-z_{b}\right)+z_{b}\left(z_{b}-z_{a}\right)\right]
\end{aligned}
\)

\(
\sum_{\alpha} q_{\alpha} \frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}}=R \frac{d E_{\mathrm{el}}}{d R}
\)

where $\left(x_{a}-x_{b}\right)^{2}+\left(y_{a}-y_{b}\right)^{2}+\left(z_{a}-z_{b}\right)^{2}=R^{2}$ was used. The virial theorem (14.83) for a diatomic molecule becomes

\(
\begin{equation}
2\left\langle T_{\mathrm{el}}\right\rangle=-\left\langle V_{\mathrm{el}}\right\rangle-R \frac{d E_{\mathrm{el}}}{d R} \tag{14.86}
\end{equation}
\)

Using (14.84), we have the two alternative forms

\(
\begin{align}
\left\langle T_{\mathrm{el}}\right\rangle & =-E_{\mathrm{el}}-R \frac{d E_{\mathrm{el}}}{d R} \tag{14.87}\\
\left\langle V_{\mathrm{el}}\right\rangle & =2 E_{\mathrm{el}}+R \frac{d E_{\mathrm{el}}}{d R} \tag{14.88}
\end{align}
\)

In deriving the molecular electronic virial theorem (14.83), we omitted the internuclear repulsion

\(
\begin{equation}
V_{N N}=\sum_{\beta} \sum_{\alpha>\beta} \frac{Z_{\alpha} Z_{\beta}}{\left[\left(x_{\alpha}-x_{\beta}\right)^{2}+\left(y_{\alpha}-y_{\beta}\right)^{2}+\left(z_{\alpha}-z_{\beta}\right)^{2}\right]^{1 / 2}} \tag{14.89}
\end{equation}
\)

from the electronic Hamiltonian (14.77) to (14.79). Let

\(
V=V_{\mathrm{el}}+V_{N N}
\)

where $V_{\mathrm{el}}$ is given by (14.79). We can rewrite the electronic Schrödinger equation $\hat{H}_{\mathrm{el}} \psi_{\mathrm{el}}=E_{\mathrm{el}} \psi_{\mathrm{el}}$ as [Eq. (13.4)]

\(
\left(\hat{T}_{\mathrm{el}}+\hat{V}\right) \psi_{\mathrm{el}}=U\left(q_{\alpha}\right) \psi_{\mathrm{el}}
\)

where

\(
U\left(q_{\alpha}\right)=E_{\mathrm{el}}\left(q_{\alpha}\right)+V_{N N}
\)

$U\left(q_{\alpha}\right)$ is the potential-energy function for nuclear motion. Consider what happens to the right side of (14.83) when we add $V_{N N}$ to $V_{\mathrm{el}}$ and $E_{\mathrm{el}}$. We have

\(
\begin{align}
&-\int \psi_{\mathrm{el}}^{*}\left(\hat{V}_{\mathrm{el}}+V_{N N}\right) \psi_{\mathrm{el}} d \tau_{\mathrm{el}}-\sum_{\alpha} q_{\alpha} \frac{\partial U}{\partial q_{\alpha}} \\
&=-\left\langle\psi_{\mathrm{el}}\right| \hat{V}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle-V_{N N}-\sum_{\alpha} q_{\alpha} \frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}}-\sum_{\alpha} q_{\alpha} \frac{\partial V_{N N}}{\partial q_{\alpha}} \tag{14.90}
\end{align}
\)

Since $V_{N N}$ is a homogeneous function of the nuclear Cartesian coordinates of degree $-1$, Euler's theorem gives

\(
\sum_{\alpha} q_{\alpha} \frac{\partial V_{N N}}{\partial q_{\alpha}}=-V_{N N}
\)

and (14.90) becomes

\(
\begin{equation}
-\left\langle\psi_{\mathrm{el}}\right| \hat{V}_{\mathrm{el}}+V_{N N}\left|\psi_{\mathrm{el}}\right\rangle-\sum_{\alpha} q_{\alpha} \frac{\partial U}{\partial q_{\alpha}}=-\left\langle\psi_{\mathrm{el}}\right| \hat{V}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle-\sum_{\alpha} q_{\alpha} \frac{\partial E_{\mathrm{el}}}{\partial q_{\alpha}} \tag{14.91}
\end{equation}
\)

Substitution of (14.91) into (14.83) gives

\(
\begin{align}
2\left\langle\psi_{\mathrm{el}}\right| \hat{T}_{\mathrm{el}}\left|\psi_{\mathrm{el}}\right\rangle & =-\left\langle\psi_{\mathrm{el}}\right| \hat{V}_{\mathrm{el}}+V_{N N}\left|\psi_{\mathrm{el}}\right\rangle-\sum_{\alpha} q_{\alpha} \frac{\partial U}{\partial q_{\alpha}} \\
2\left\langle T_{\mathrm{el}}\right\rangle & =-\langle V\rangle-\sum_{\alpha} q_{\alpha} \frac{\partial U}{\partial q_{\alpha}} \tag{14.92}
\end{align}
\)

which has the same form as (14.83). Therefore the molecular electronic virial theorem holds whether or not we include the internuclear repulsion. Corresponding to Eqs. (14.86) to (14.88) for diatomic molecules, we have

\(
\begin{align}
2\left\langle T_{\mathrm{el}}\right\rangle & =-\langle V\rangle-R(d U / d R) \tag{14.93}\\
\left\langle T_{\mathrm{el}}\right\rangle & =-U-R(d U / d R) \tag{14.94}\\
\langle V\rangle & =2 U+R(d U / d R) \tag{14.95}
\end{align}
\)

The potential energy $V=V_{\mathrm{el}}+V_{N N}$ takes the zero of energy with all particles (electrons and nuclei) at infinite separation from one another. Therefore, $U(R)$ in (14.93) to (14.95) does not go to zero at $R=\infty$ but goes to the sum of the energies of the separated atoms, which is negative.
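
To make (14.93) to (14.95) concrete, here is a minimal numerical sketch (not part of the text; it assumes a Morse curve with arbitrary illustrative parameters as a stand-in for a realistic $U(R)$, in atomic units) that computes $\left\langle T_{\mathrm{el}}\right\rangle$ and $\langle V\rangle$ from a model $U(R)$ and checks Eq. (14.97) at the minimum:

```python
import numpy as np

# Hypothetical Morse parameters (atomic units), chosen only for illustration.
# U -> U_inf as R -> infinity; the minimum U_inf - D_e lies at R = R_e.
D_e, a_morse, R_e, U_inf = 0.17, 1.0, 1.4, -1.0

def U(R):
    """Model potential-energy curve for nuclear motion."""
    return U_inf + D_e * ((1.0 - np.exp(-a_morse * (R - R_e))) ** 2 - 1.0)

def dUdR(R, h=1e-6):
    """Central-difference estimate of dU/dR."""
    return (U(R + h) - U(R - h)) / (2.0 * h)

def T_el(R):
    """Eq. (14.94): <T_el> = -U - R dU/dR."""
    return -U(R) - R * dUdR(R)

def V_avg(R):
    """Eq. (14.95): <V> = 2U + R dU/dR."""
    return 2.0 * U(R) + R * dUdR(R)

for R in (R_e, 3.0, 10.0):
    print(f"R={R:5.2f}  U={U(R):+.5f}  <T_el>={T_el(R):+.5f}  <V>={V_avg(R):+.5f}")

# At R = R_e, dU/dR = 0, so Eq. (14.97) holds: 2<T_el> = -<V>
assert abs(2.0 * T_el(R_e) + V_avg(R_e)) < 1e-6
```

Note that $U(\infty)$ in this model is negative, as required by the zero-of-energy convention just described.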

The true (nonrelativistic) wave functions for a system with $V$ a homogeneous function of the coordinates must satisfy the form of the virial theorem (14.70). What determines whether an approximate wave function for such a system satisfies (14.70)? The answer is that, by inserting a variational parameter as a multiplier of each Cartesian coordinate and choosing this parameter to minimize the variational integral, we can make any trial variation function satisfy the virial theorem. (For the proof, see Kauzmann, page 229.) This process is called scaling, and the variational parameter multiplying each coordinate is called a scale factor. For a molecular trial function, the scaling parameter must be inserted in front of the nuclear Cartesian coordinates, as well as in front of the electronic coordinates.

Consider some examples. The zeroth-order perturbation wave function (9.49) for the heliumlike atom has no scale factor and so does not satisfy the virial theorem. If we were to calculate $\langle T\rangle$ and $\langle V\rangle$ for (9.49), we would find $2\langle T\rangle \neq-\langle V\rangle$; see Prob. 14.26. The Heitler-London trial function for $\mathrm{H}_{2}$, Eq. (13.100), has no scale factor and does not satisfy the virial theorem. The Heitler-London-Wang function, which uses a variationally determined orbital exponent, satisfies the virial theorem. Hartree-Fock wave functions satisfy the virial theorem; note the scale factor in the Slater basis functions (11.14).
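
A minimal sketch of scaling (the standard hydrogen-atom example, here worked in SymPy; the trial function $e^{-\zeta r}$ is my illustrative choice): minimizing the variational integral with respect to the scale factor $\zeta$ automatically enforces $2\langle T\rangle=-\langle V\rangle$.

```python
import sympy as sp

zeta, r = sp.symbols('zeta r', positive=True)

# Normalized 1s trial function exp(-zeta*r) with scale factor zeta (atomic units)
psi = sp.sqrt(zeta**3 / sp.pi) * sp.exp(-zeta * r)

# <T> = (1/2) integral |dpsi/dr|^2 dtau (integration-by-parts form), <V> = <-1/r>
T = sp.integrate(sp.Rational(1, 2) * sp.diff(psi, r)**2 * 4 * sp.pi * r**2, (r, 0, sp.oo))
V = sp.integrate(-(1 / r) * psi**2 * 4 * sp.pi * r**2, (r, 0, sp.oo))
print(T, V)                     # zeta**2/2  and  -zeta

E = T + V                       # variational integral
zeta_opt = sp.solve(sp.diff(E, zeta), zeta)[0]    # zeta = 1
# The variationally scaled function satisfies the virial theorem 2<T> = -<V>:
assert sp.simplify(2 * T.subs(zeta, zeta_opt) + V.subs(zeta, zeta_opt)) == 0
```

Before minimization, $2\langle T\rangle+\langle V\rangle=\zeta^{2}-\zeta$, which vanishes only at the variationally optimal $\zeta$; this is the content of the scaling argument above.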


We now use the virial theorem to examine the changes in electronic kinetic and potential energy that occur when a covalent chemical bond is formed in a diatomic molecule. For formation of a stable bond, the $U(R)$ curve must have a substantial minimum. At this minimum we have

\(
\begin{equation}
\left.\frac{d U}{d R}\right|_{R_{e}}=0 \tag{14.96}
\end{equation}
\)

and Eqs. (14.93) to (14.95) become

\(
\begin{align}
\left.2\left\langle T_{\mathrm{el}}\right\rangle\right|_{R_{e}} & =-\left.\langle V\rangle\right|_{R_{e}} \tag{14.97}\\
\left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{R_{e}} & =-U\left(R_{e}\right) \tag{14.98}\\
\left.\langle V\rangle\right|_{R_{e}} & =2 U\left(R_{e}\right) \tag{14.99}
\end{align}
\)

These equations resemble those for atoms [Eqs. (14.75) and (14.76)]. At $R=\infty$ we have the separated atoms, and the atomic virial theorem gives

\(
\begin{equation}
\left.2\left\langle T_{\mathrm{el}}\right\rangle\right|_{\infty}=-\left.\langle V\rangle\right|_{\infty}, \quad\left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{\infty}=-U(\infty), \quad\left.\langle V\rangle\right|_{\infty}=2 U(\infty) \tag{14.100}
\end{equation}
\)

$U(\infty)$ is the sum of the energies of the two separated atoms. Equations (14.98)-(14.100) give

\(
\begin{align}
& \left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{R_{e}}-\left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{\infty}=U(\infty)-U\left(R_{e}\right) \tag{14.101}\\
& \left.\langle V\rangle\right|_{R_{e}}-\left.\langle V\rangle\right|_{\infty}=2\left[U\left(R_{e}\right)-U(\infty)\right] \tag{14.102}
\end{align}
\)

For bonding, we have $U\left(R_{e}\right)<U(\infty)$. Therefore, Eqs. (14.101) and (14.102) show that the average molecular potential energy at $R_{e}$ is less than the sum of the potential energies of the separated atoms, whereas the average molecular kinetic energy is greater at $R_{e}$ than at $\infty$. The decrease in potential energy is twice the increase in kinetic energy, and results from allowing the electrons to feel the attractions of both nuclei and perhaps from an increase in orbital exponents in the molecule (see Section 13.5). Since the equilibrium dissociation energy (13.9) is $D_{e}=U(\infty)-U\left(R_{e}\right)$, Eq. (14.102) gives $D_{e}=\frac{1}{2}\left(\left.\langle V\rangle\right|_{\infty}-\left.\langle V\rangle\right|_{R_{e}}\right)$.

Consider the behavior of the average potential and kinetic energies for large $R$. The forces between uncharged atoms or molecules (other than those due to bond formation) are called van der Waals forces. For two neutral atoms, at least one of which is in an $S$ state, quantum-mechanical perturbation theory shows that the van der Waals force of attraction is proportional to $1 / R^{7}$, and the potential energy behaves like

\(
\begin{equation}
U(R) \approx U(\infty)-\frac{A}{R^{6}}, \quad R \text { large } \tag{14.103}
\end{equation}
\)

where $A$ is a positive constant. (See Kauzmann, Chapter 13.) This expression was first derived by London, and van der Waals forces between neutral atoms are called London forces or dispersion forces. (Recall the discussion near the end of Section 13.7.)

Substitution of (14.103) for $U$ and $d U / d R$ into (14.94) and (14.95), and use of (14.100) gives

\(
\begin{equation}
\left.\langle V\rangle \approx\langle V\rangle\right|_{\infty}+\frac{4 A}{R^{6}}, \quad\left.\left\langle T_{\mathrm{el}}\right\rangle \approx\left\langle T_{\mathrm{el}}\right\rangle\right|_{\infty}-\frac{5 A}{R^{6}}, \quad R \text { large } \tag{14.104}
\end{equation}
\)
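
To spell out the substitution: $d U / d R \approx 6 A / R^{7}$, so (14.94) gives

\(
\left\langle T_{\mathrm{el}}\right\rangle \approx-U(\infty)+\frac{A}{R^{6}}-\frac{6 A}{R^{6}}=\left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{\infty}-\frac{5 A}{R^{6}}
\)

and (14.95) gives $\langle V\rangle \approx 2 U(\infty)-2 A / R^{6}+6 A / R^{6}=\left.\langle V\rangle\right|_{\infty}+4 A / R^{6}$.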

Hence, as $R$ decreases from infinity, the average potential energy at first increases, while the average kinetic energy at first decreases. The combination of these conclusions with our conclusions about $\left.\langle V\rangle\right|_{R_{e}}$ and $\left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{R_{e}}$ shows that $\langle V\rangle$ must go through a maximum somewhere between $R_{e}$ and infinity and $\left\langle T_{\mathrm{el}}\right\rangle$ must go through a minimum in this region.

Now consider small values of $R$. One can treat a diatomic molecule by applying perturbation theory to the united atom (UA) formed by merging the two atoms of the molecule. The perturbation is the difference between the molecular and the united-atom Hamiltonians: $\hat{H}^{\prime}=\hat{H}_{\text{mol}}-\hat{H}_{\text{UA}}$. One finds that the molecular purely electronic energy has the following form at small $R$ [W. A. Bingel, J. Chem. Phys., 30, 1250 (1959); I. N. Levine, J. Chem. Phys., 40, 3444 (1964); 41, 2044 (1965); W. Byers Brown and E. Steiner, J. Chem. Phys., 44, 3934 (1966)]:

\(
\begin{equation}
E_{\mathrm{el}}=E_{\mathrm{UA}}+a R^{2}+b R^{3}+c R^{4}+d R^{5}+e R^{5} \ln R+\cdots \tag{14.105}
\end{equation}
\)

where $E_{\mathrm{UA}}$ is the united-atom energy and $a, b, c, d, e$ are constants. For $R \ll R_{e}$, we can use (14.105) and $U=E_{\mathrm{el}}+V_{N N}$ [Eq. (13.8)] to write

\(
\begin{equation}
U(R) \approx \frac{Z_{a} Z_{b}}{R}+E_{\mathrm{UA}}+a R^{2}, \quad R \text { small } \tag{14.106}
\end{equation}
\)
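
As an intermediate step, differentiating (14.106) gives

\(
R \frac{d U}{d R} \approx-\frac{Z_{a} Z_{b}}{R}+2 a R^{2}, \quad R \text { small }
\)

so the nuclear-repulsion term cancels in $-U-R(d U / d R)$ but not in $2 U+R(d U / d R)$.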

The virial theorem then gives (in atomic units)

\(
\begin{aligned}
\left\langle T_{\mathrm{el}}\right\rangle & \approx-E_{\mathrm{UA}}-3 a R^{2}, \quad R \text { small } \\
\langle V\rangle & \approx \frac{Z_{a} Z_{b}}{R}+2 E_{\mathrm{UA}}+4 a R^{2}, \quad R \text { small }
\end{aligned}
\)

Since the virial theorem (14.76) holds for the united atom, we have $\left.\left\langle T_{\mathrm{el}}\right\rangle\right|_{0}=-E_{\mathrm{UA}}$ and $\left.\left\langle V_{\mathrm{el}}\right\rangle\right|_{0}=2 E_{\mathrm{UA}}$. Therefore,

\(
\begin{gather}
\left.\left\langle T_{\mathrm{el}}\right\rangle \approx\left\langle T_{\mathrm{el}}\right\rangle\right|_{0}-3 a R^{2}, \quad R \text { small } \tag{14.107}\\
\langle V\rangle \approx \frac{Z_{a} Z_{b}}{R}+\left.\left\langle V_{\mathrm{el}}\right\rangle\right|_{0}+4 a R^{2}, \quad R \text { small } \tag{14.108}
\end{gather}
\)

$\langle V\rangle$ goes to infinity as $R$ goes to zero, because of the internuclear repulsion.
Having found the general behavior of $\langle V\rangle$ and $\left\langle T_{\mathrm{el}}\right\rangle$ as functions of $R$, we now draw Fig. 14.1. This figure is not for any particular molecule but resembles the known curves for $\mathrm{H}_{2}$ and $\mathrm{H}_{2}^{+}$ [W. Kolos and L. Wolniewicz, J. Chem. Phys., 41, 3663 (1964); Slater, Quantum Theory of Molecules and Solids, Volume 1, p. 36]. Similar curves hold for other diatomic molecules [see Fig. 1 in J. Hernandez-Trujillo et al., Faraday Discuss., 135, 79 (2007)].

FIGURE 14.1 Variation of the average potential and kinetic energies of a diatomic molecule. The unit of energy is taken as the electronic kinetic energy of the separated atoms.

How can we explain the changes in average kinetic and potential energy with $R$ ? Consider $\mathrm{H}_{2}^{+}$. The electronic potential-energy function is in atomic units

\(
\begin{equation}
V_{\mathrm{el}}=-\frac{1}{r_{a}}-\frac{1}{r_{b}} \tag{14.109}
\end{equation}
\)

If we plot $V_{\mathrm{el}}$ for points on the molecular axis for a large value of $R$, we get a curve like Fig. 14.2, which resembles two hydrogen-atom potential-energy curves (Fig. 6.6) placed side by side. We saw that the overlapping of the $1 s$ AOs occurring in molecule formation increases the charge probability density between the nuclei for the ground state. However, Fig. 14.2 shows that the potential energy is relatively high in the region midway between the nuclei when $R$ is large. Thus $\langle V\rangle$ initially increases as $R$ decreases from infinity. Now consider the kinetic energy. The uncertainty principle (5.13) gives $(\Delta x)^{2}\left(\Delta p_{x}\right)^{2} \geq \hbar^{2} / 4$. For a stationary state, $\left\langle p_{x}\right\rangle$ is zero [see Eq. (3.92) and Prob. 14.31] and (5.11) gives $\left(\Delta p_{x}\right)^{2}=\left\langle p_{x}^{2}\right\rangle$. Hence a small value of $(\Delta x)^{2}$ means a large value of $\left\langle p_{x}^{2}\right\rangle$ and a large value of the average kinetic energy, which equals $\left\langle p^{2}\right\rangle / 2 m$. Thus a compact $\psi_{\mathrm{el}}$ corresponds to a large electronic kinetic energy. In the separated atoms, the wave function is concentrated in two rather small regions about each nucleus (Fig. 6.7). In the initial stages of molecule formation, the buildup of probability density between the nuclei corresponds to having a wave function that is less compact than it was in the separated atoms. Thus, as $R$ decreases from infinity, the electronic kinetic energy initially decreases. The energies $E_{\mathrm{el}}$ of the two lowest $\mathrm{H}_{2}^{+}$ states have been indicated in Fig. 14.2. For large $R$ the region between the nuclei is classically forbidden, but it is accessible according to quantum mechanics (tunneling).
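
A concrete one-dimensional illustration (not from the text, but standard): for the normalized Gaussian $\psi=(2 \alpha / \pi)^{1 / 4} e^{-\alpha x^{2}}$,

\(
\left\langle x^{2}\right\rangle=\frac{1}{4 \alpha}, \quad\left\langle p_{x}^{2}\right\rangle=\alpha \hbar^{2}, \quad\langle T\rangle=\frac{\alpha \hbar^{2}}{2 m}
\)

so squeezing the function into a smaller region (increasing $\alpha$) necessarily raises the average kinetic energy, in line with the uncertainty-principle argument just given.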

Now consider what happens as $R$ decreases further. Plotting (14.109) for an intermediate value of $R$, we find that now the region between the nuclei is a region of low potential energy, since an electron in this region feels substantial attractions from both nuclei. (See Fig. 14.3.) Hence at intermediate values of $R$, the overlap charge buildup between the nuclei lowers the potential energy. For intermediate values of $R$, the wave function has become more compact compared with large $R$, which gives an increase in $\left\langle T_{\mathrm{el}}\right\rangle$ as $R$ is reduced. In fact, we see from Fig. 14.1 and Eq. (14.101) that $\left\langle T_{\mathrm{el}}\right\rangle$ is greater at $R_{e}$ in the molecule than in the separated atoms. Hence the molecular wave function at $R_{e}$ is more compact than the separated-atoms wave functions.

FIGURE 14.3 Potential energy along the internuclear axis for electronic motion in $\mathrm{H}_{2}^{+}$ at an intermediate internuclear distance.

For very small $R$, the average potential energy goes to infinity, because of the internuclear repulsion. However, for $R=R_{e}$, Fig. 14.1 shows that $\langle V\rangle$ is still decreasing sharply with decreasing $R$, and it is the increase in $\left\langle T_{\mathrm{el}}\right\rangle$, and not the nuclear repulsion, that causes the $U(R)$ curve to turn up as $R$ becomes less than $R_{e}$. The squeezing of the molecular wave function into a smaller region with the associated increase in $\left\langle T_{\mathrm{el}}\right\rangle$ is more important than the internuclear repulsion in causing the initial repulsion between the atoms.


Consider a system with a time-independent Hamiltonian $\hat{H}$ that involves parameters. An obvious example is the molecular electronic Hamiltonian (13.5), which depends parametrically on the nuclear coordinates. However, the Hamiltonian of any system contains parameters. For example, in the one-dimensional harmonic-oscillator Hamiltonian operator $-\left(\hbar^{2} / 2 m\right)\left(d^{2} / d x^{2}\right)+\frac{1}{2} k x^{2}$, the force constant $k$ is a parameter, as is the mass $m$. Although $\hbar$ is a constant, we can consider it as a parameter also. The stationary-state energies $E_{n}$ are functions of the same parameters as $\hat{H}$. For example, for the harmonic oscillator

\(
\begin{equation}
E_{v}=\left(v+\frac{1}{2}\right) h \nu=\left(v+\frac{1}{2}\right) \hbar(k / m)^{1 / 2} \tag{14.110}
\end{equation}
\)

The stationary-state wave functions also depend on the parameters in $\hat{H}$. We now investigate how $E_{n}$ varies with each of the parameters. More specifically, if $\lambda$ is one of these parameters, we ask for $\partial E_{n} / \partial \lambda$, where the partial derivative is taken with all other parameters held constant.

We begin with the Schrödinger equation

\(
\begin{equation}
\hat{H} \psi_{n}=E_{n} \psi_{n} \tag{14.111}
\end{equation}
\)

where the $\psi_{n}$'s are the normalized stationary-state eigenfunctions. Because of normalization, we have

\(
\begin{gather}
E_{n}=\int \psi_{n}^{*} \hat{H} \psi_{n} d \tau \tag{14.112}\\
\frac{\partial E_{n}}{\partial \lambda}=\frac{\partial}{\partial \lambda} \int \psi_{n}^{*} \hat{H} \psi_{n} d \tau \tag{14.113}
\end{gather}
\)

The integral in (14.112) is a definite integral over all space, and its value depends parametrically on $\lambda$ since $\hat{H}$ and $\psi_{n}$ depend on $\lambda$. Provided the integrand is well behaved, we can find the integral's derivative with respect to a parameter by differentiating the integrand with respect to the parameter and then integrating. Thus

\(
\begin{equation}
\frac{\partial E_{n}}{\partial \lambda}=\int \frac{\partial}{\partial \lambda}\left(\psi_{n}^{*} \hat{H} \psi_{n}\right) d \tau=\int \frac{\partial \psi_{n}^{*}}{\partial \lambda} \hat{H} \psi_{n} d \tau+\int \psi_{n}^{*} \frac{\partial}{\partial \lambda}\left(\hat{H} \psi_{n}\right) d \tau \tag{14.114}
\end{equation}
\)

We have

\(
\begin{equation}
\frac{\partial}{\partial \lambda}\left(\hat{H} \psi_{n}\right)=\frac{\partial}{\partial \lambda}\left(\hat{T} \psi_{n}\right)+\frac{\partial}{\partial \lambda}\left(\hat{V} \psi_{n}\right) \tag{14.115}
\end{equation}
\)

The potential-energy operator is just multiplication by $V$, so

\(
\begin{equation}
\frac{\partial}{\partial \lambda}\left(\hat{V} \psi_{n}\right)=\frac{\partial V}{\partial \lambda} \psi_{n}+V \frac{\partial \psi_{n}}{\partial \lambda} \tag{14.116}
\end{equation}
\)

The parameter $\lambda$ will occur in the kinetic-energy operator as part of the factor multiplying one or more of the derivatives with respect to the coordinates. For example, taking $\lambda$ as the mass of the particle, we have for a one-particle problem

\(
\begin{aligned}
\hat{T} & =-\frac{\hbar^{2}}{2 \lambda}\left(\frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}+\frac{\partial^{2}}{\partial z^{2}}\right) \\
\frac{\partial}{\partial \lambda}(\hat{T} \psi) & =-\frac{\hbar^{2}}{2} \frac{\partial}{\partial \lambda}\left[\frac{1}{\lambda}\left(\frac{\partial^{2} \psi}{\partial x^{2}}+\frac{\partial^{2} \psi}{\partial y^{2}}+\frac{\partial^{2} \psi}{\partial z^{2}}\right)\right] \\
& =\frac{\hbar^{2}}{2 \lambda^{2}}\left(\frac{\partial^{2} \psi}{\partial x^{2}}+\frac{\partial^{2} \psi}{\partial y^{2}}+\frac{\partial^{2} \psi}{\partial z^{2}}\right)-\frac{\hbar^{2}}{2 \lambda}\left(\frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}+\frac{\partial^{2}}{\partial z^{2}}\right)\left(\frac{\partial \psi}{\partial \lambda}\right)
\end{aligned}
\)

since we can change the order of the partial differentiations without affecting the result. We can write this last equation as

\(
\begin{equation}
\frac{\partial}{\partial \lambda}\left(\hat{T} \psi_{n}\right)=\left(\frac{\partial \hat{T}}{\partial \lambda}\right) \psi_{n}+\hat{T}\left(\frac{\partial \psi_{n}}{\partial \lambda}\right) \tag{14.117}
\end{equation}
\)

where $\partial \hat{T} / \partial \lambda$ is found by differentiating $\hat{T}$ with respect to $\lambda$ just as if it were a function instead of an operator. Although we got (14.117) by considering a specific $\hat{T}$ and $\lambda$, the same arguments show it to be generally valid. Combining (14.116) and (14.117), we write

\(
\begin{equation}
\frac{\partial}{\partial \lambda}\left(\hat{H} \psi_{n}\right)=\left(\frac{\partial \hat{H}}{\partial \lambda}\right) \psi_{n}+\hat{H}\left(\frac{\partial \psi_{n}}{\partial \lambda}\right) \tag{14.118}
\end{equation}
\)

Equation (14.114) becomes

\(
\begin{equation}
\frac{\partial E_{n}}{\partial \lambda}=\int \frac{\partial \psi_{n}^{*}}{\partial \lambda} \hat{H} \psi_{n} d \tau+\int \psi_{n}^{*} \frac{\partial \hat{H}}{\partial \lambda} \psi_{n} d \tau+\int \psi_{n}^{*} \hat{H} \frac{\partial \psi_{n}}{\partial \lambda} d \tau \tag{14.119}
\end{equation}
\)

For the first integral in (14.119), we have

\(
\begin{equation}
\int \frac{\partial \psi_{n}^{*}}{\partial \lambda} \hat{H} \psi_{n} d \tau=E_{n} \int \frac{\partial \psi_{n}^{*}}{\partial \lambda} \psi_{n} d \tau \tag{14.120}
\end{equation}
\)

The Hermitian property of $\hat{H}$ and (14.111) give for the last integral in (14.119)

\(
\int \psi_{n}^{*} \hat{H} \frac{\partial \psi_{n}}{\partial \lambda} d \tau=\int \frac{\partial \psi_{n}}{\partial \lambda}\left(\hat{H} \psi_{n}\right)^{*} d \tau=E_{n} \int \psi_{n}^{*} \frac{\partial \psi_{n}}{\partial \lambda} d \tau
\)

Therefore,

\(
\begin{equation}
\frac{\partial E_{n}}{\partial \lambda}=\int \psi_{n}^{*} \frac{\partial \hat{H}}{\partial \lambda} \psi_{n} d \tau+E_{n} \int \frac{\partial \psi_{n}^{*}}{\partial \lambda} \psi_{n} d \tau+E_{n} \int \psi_{n}^{*} \frac{\partial \psi_{n}}{\partial \lambda} d \tau \tag{14.121}
\end{equation}
\)

The wave function is normalized, so

\(
\begin{gather}
\int \psi_{n}^{*} \psi_{n} d \tau=1, \quad \frac{\partial}{\partial \lambda} \int \psi_{n}^{*} \psi_{n} d \tau=0 \\
\int \frac{\partial \psi_{n}^{*}}{\partial \lambda} \psi_{n} d \tau+\int \psi_{n}^{*} \frac{\partial \psi_{n}}{\partial \lambda} d \tau=0 \tag{14.122}
\end{gather}
\)

Using (14.122) in (14.121), we obtain

\(
\begin{equation}
\frac{\partial E_{n}}{\partial \lambda}=\int \psi_{n}^{*} \frac{\partial \hat{H}}{\partial \lambda} \psi_{n} d \tau \tag{14.123}
\end{equation}
\)

Equation (14.123) is the (generalized) Hellmann-Feynman theorem. [For a discussion of the origin of the Hellmann-Feynman and related theorems, see J. I. Musher, Am. J. Phys., 34, 267 (1966).]

EXAMPLE

Apply the generalized Hellmann-Feynman theorem to the one-dimensional harmonic oscillator with $\lambda$ taken as the force constant.

For the harmonic oscillator, $\hat{H}=-\left(\hbar^{2} / 2 m\right)\left(d^{2} / d x^{2}\right)+\frac{1}{2} k x^{2}$ and $\partial \hat{H} / \partial k=\frac{1}{2} x^{2}$. The energy levels are $E_{v}=\left(v+\frac{1}{2}\right) h \nu=\left(v+\frac{1}{2}\right) h(k / m)^{1 / 2} / 2 \pi$. We have

\(
\partial E_{v} / \partial k=\frac{1}{2}\left(v+\frac{1}{2}\right) h k^{-1 / 2} m^{-1 / 2} / 2 \pi=\frac{1}{2}\left(v+\frac{1}{2}\right) h \nu / k
\)

Substitution in (14.123) gives

\(
\begin{equation}
\int_{-\infty}^{\infty} \psi_{v}^{*} x^{2} \psi_{v} d x=\left(v+\frac{1}{2}\right) h \nu / k \tag{14.124}
\end{equation}
\)

We have found $\left\langle x^{2}\right\rangle$ for any harmonic-oscillator stationary state without evaluating any integrals. This result was also obtained from the virial theorem; see Eq. (14.74). For a third derivation, see Eyring, Walter, and Kimball, p. 79.
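
Equation (14.124) is also easy to confirm symbolically. A small sketch (assuming SymPy and its built-in oscillator eigenfunctions `sympy.physics.qho_1d.psi_n`) checks the first few states, using $k=m \omega^{2}$ and $h \nu=\hbar \omega$, so that $\left(v+\frac{1}{2}\right) h \nu / k=\left(v+\frac{1}{2}\right) \hbar / m \omega$:

```python
import sympy as sp
from sympy.physics.qho_1d import psi_n                  # oscillator eigenfunctions
from sympy.physics.quantum.constants import hbar

x = sp.symbols('x', real=True)
m, omega = sp.symbols('m omega', positive=True)

for v in range(4):
    # <x^2> by explicit integration (psi_n is real)
    x2 = sp.integrate(psi_n(v, x, m, omega)**2 * x**2, (x, -sp.oo, sp.oo))
    # Hellmann-Feynman result (14.124): (v + 1/2) h nu / k = (v + 1/2) hbar/(m omega)
    hf = (v + sp.Rational(1, 2)) * hbar / (m * omega)
    assert sp.simplify(x2 - hf) == 0
print("Eq. (14.124) verified for v = 0, 1, 2, 3")
```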

The derivation of the Hellmann-Feynman theorem assumes that $\partial \psi / \partial \lambda$ exists. For a state belonging to a degenerate energy level, this assumption may not be true and (14.123) need not hold. Changing the parameter's value from $\lambda$ to $\lambda+d \lambda$ amounts to applying a perturbation $\hat{H}^{\prime} \equiv \hat{H}(\lambda+d \lambda)-\hat{H}(\lambda) \approx(\partial \hat{H} / \partial \lambda) d \lambda$. This perturbation changes $\psi$ from $\psi(\lambda)$ to $\psi(\lambda+d \lambda)$. If $\psi(\lambda)$ is not one of the correct zeroth-order wave functions (9.73) for the perturbation $\hat{H}^{\prime}$, then $\psi(\lambda)$ need not equal $\lim_{d \lambda \rightarrow 0} \psi(\lambda+d \lambda)$, so $\psi$ will make a discontinuous jump at $d \lambda=0$ and $\partial \psi / \partial \lambda$ will not exist at this point. The Hellmann-Feynman theorem (14.123) applies to the wave functions of a degenerate level only if we use the correct zeroth-order wave functions for the perturbation $\hat{H}^{\prime}$. These correct functions are found by solving (9.83) and (9.81) with $\hat{H}^{\prime} \equiv(\partial \hat{H} / \partial \lambda) d \lambda$. [Since $E_{n}^{(1)}$ will be proportional to $d \lambda$, we can replace $\hat{H}^{\prime}$ with $\partial \hat{H} / \partial \lambda$ and $E_{n}^{(1)}$ with $E_{n}^{(1)} / d \lambda$ in (9.83) and (9.81).] For further details, see references 4-7 in G. P. Zhang and T. F. George, Phys. Rev. B, 69, 167102 (2004).

Application of the Hellmann-Feynman theorem to the hydrogenlike atom, with $Z$ as the parameter, gives (Prob. 14.37a)

\(
\begin{equation}
\int r^{-1}|\psi|^{2} d \tau=\left\langle\frac{1}{r}\right\rangle=\frac{Z}{n^{2}}\left(\frac{1}{a}\right) \tag{14.125}
\end{equation}
\)

This result was also obtained from the virial theorem; see Eq. (14.76).
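
A similar symbolic check of (14.125) is sketched below (assuming SymPy's hydrogenlike radial functions `sympy.physics.hydrogen.R_nl`, which are expressed in units with $a=1$):

```python
import sympy as sp
from sympy.physics.hydrogen import R_nl     # hydrogenlike radial functions, a = 1

r, Z = sp.symbols('r Z', positive=True)

for n in range(1, 4):
    for l in range(n):
        # <1/r> = integral of R_nl^2 (1/r) r^2 dr = integral of R_nl^2 r dr
        expec = sp.integrate(R_nl(n, l, r, Z)**2 * r, (r, 0, sp.oo))
        assert sp.simplify(expec - Z / n**2) == 0     # Eq. (14.125) with a = 1
print("Eq. (14.125) verified for n = 1, 2, 3")
```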


Hellmann and Feynman independently applied Eq. (14.123) to molecules, taking $\lambda$ as a nuclear Cartesian coordinate. We now consider their results.

As usual, we are using the Born-Oppenheimer approximation, solving the electronic Schrödinger equation for a fixed nuclear configuration [Eqs. (13.4) to (13.6)]:

\(
\left(\hat{T}_{\mathrm{el}}+\hat{V}\right) \psi_{\mathrm{el}}=\left(\hat{T}_{\mathrm{el}}+\hat{V}_{\mathrm{el}}+\hat{V}_{N N}\right) \psi_{\mathrm{el}}=U \psi_{\mathrm{el}}
\)

where $\hat{T}_{\mathrm{el}}, \hat{V}_{\mathrm{el}}$, and $\hat{V}_{N N}$ are given by (14.78), (14.79), and (14.89). The Hamiltonian operator $\hat{T}_{\mathrm{el}}+\hat{V}_{\mathrm{el}}+\hat{V}_{N N}$ depends on the nuclear coordinates as parameters. If $x_{\delta}$ is the $x$ coordinate of nucleus $\delta$, the generalized Hellmann-Feynman theorem (14.123) gives

\(
\begin{equation}
\frac{\partial U}{\partial x_{\delta}}=\int \psi_{\mathrm{el}}^{*} \frac{\partial\left(\hat{T}_{\mathrm{el}}+\hat{V}_{\mathrm{el}}+\hat{V}_{N N}\right)}{\partial x_{\delta}} \psi_{\mathrm{el}} d \tau_{\mathrm{el}}=\int \psi_{\mathrm{el}}^{*}\left(\frac{\partial V_{\mathrm{el}}}{\partial x_{\delta}}+\frac{\partial V_{N N}}{\partial x_{\delta}}\right) \psi_{\mathrm{el}} d \tau_{\mathrm{el}} \tag{14.126}
\end{equation}
\)

since $\hat{T}_{\mathrm{el}}$ is independent of the nuclear Cartesian coordinates, as can be seen from (14.78). [If we had omitted $V_{N N}$ from $V$, we would have obtained Eq. (14.82), which was used in deriving the molecular electronic virial theorem.] From (14.79) we get in atomic units

\(
\begin{equation}
\frac{\partial V_{\mathrm{el}}}{\partial x_{\delta}}=-\sum_{i} \frac{Z_{\delta}\left(x_{i}-x_{\delta}\right)}{r_{i \delta}^{3}} \tag{14.127}
\end{equation}
\)

where $r_{i \delta}$ is the distance from electron $i$ to nucleus $\delta$. To find $\partial V_{N N} / \partial x_{\delta}$, we need to consider only internuclear repulsion terms that involve nucleus $\delta$. Hence

\(
\frac{\partial V_{N N}}{\partial x_{\delta}}=\frac{\partial}{\partial x_{\delta}} \sum_{\alpha \neq \delta} \frac{Z_{\alpha} Z_{\delta}}{\left[\left(x_{\alpha}-x_{\delta}\right)^{2}+\left(y_{\alpha}-y_{\delta}\right)^{2}+\left(z_{\alpha}-z_{\delta}\right)^{2}\right]^{1 / 2}}=\sum_{\alpha \neq \delta} Z_{\alpha} Z_{\delta} \frac{x_{\alpha}-x_{\delta}}{R_{\alpha \delta}^{3}}
\)

where $R_{\alpha \delta}$ is the distance between nuclei $\alpha$ and $\delta$. Since $\partial V_{N N} / \partial x_{\delta}$ does not involve the electronic coordinates and $\psi_{\mathrm{el}}$ is normalized, (14.126) becomes

\(
\begin{equation}
\frac{\partial U}{\partial x_{\delta}}=-Z_{\delta} \int\left|\psi_{\mathrm{el}}\right|^{2} \sum_{i} \frac{x_{i}-x_{\delta}}{r_{i \delta}^{3}} d \tau_{\mathrm{el}}+\sum_{\alpha \neq \delta} Z_{\alpha} Z_{\delta} \frac{x_{\alpha}-x_{\delta}}{R_{\alpha \delta}^{3}} \tag{14.128}
\end{equation}
\)

Consider the integral in (14.128). Using Eq. (14.8) with $B\left(\mathbf{r}_{i}\right)=\left(x_{i}-x_{\delta}\right) / r_{i \delta}^{3}$, we get

\(
\begin{equation}
\frac{\partial U}{\partial x_{\delta}}=-Z_{\delta} \iiint \rho(x, y, z) \frac{x-x_{\delta}}{r_{\delta}^{3}} d x d y d z+\sum_{\alpha \neq \delta} Z_{\alpha} Z_{\delta} \frac{x_{\alpha}-x_{\delta}}{R_{\alpha \delta}^{3}} \tag{14.129}
\end{equation}
\)

The variable $r_{\delta}$ is the distance between nucleus $\delta$ and point $(x, y, z)$ in space:

\(
r_{\delta}=\left[\left(x-x_{\delta}\right)^{2}+\left(y-y_{\delta}\right)^{2}+\left(z-z_{\delta}\right)^{2}\right]^{1 / 2}
\)

What is the significance of (14.129)? In the Born-Oppenheimer approximation, $U\left(x_{\alpha}, y_{\alpha}, z_{\alpha}, x_{\beta}, \ldots\right)$ is the potential-energy function for nuclear motion, the nuclear Schrödinger equation being

\(
\begin{equation}
\left(-\frac{\hbar^{2}}{2} \sum_{\alpha} \frac{1}{m_{\alpha}} \nabla_{\alpha}^{2}+U\right) \psi_{N}=E \psi_{N} \tag{14.130}
\end{equation}
\)

The quantity $-\partial U / \partial x_{\delta}$ can thus be viewed [see Eq. (5.31)] as the $x$ component of the effective force on nucleus $\delta$ due to the other nuclei and the electrons. In addition to (14.129), we have two corresponding equations for $\partial U / \partial y_{\delta}$ and $\partial U / \partial z_{\delta}$. If $\mathbf{F}_{\delta}$ is the effective force on nucleus $\delta$, then

\(
\begin{gather}
\mathbf{F}_{\delta}=-\mathbf{i} \frac{\partial U}{\partial x_{\delta}}-\mathbf{j} \frac{\partial U}{\partial y_{\delta}}-\mathbf{k} \frac{\partial U}{\partial z_{\delta}} \tag{14.131}\\
\mathbf{F}_{\delta}=-Z_{\delta} \iiint \rho(x, y, z) \frac{\mathbf{r}_{\delta}}{r_{\delta}^{3}} d x d y d z+\sum_{\alpha \neq \delta} Z_{\alpha} Z_{\delta} \frac{\mathbf{R}_{\alpha \delta}}{R_{\alpha \delta}^{3}} \tag{14.132}
\end{gather}
\)

where $\mathbf{r}_{\delta}$ is the vector from point $(x, y, z)$ to nucleus $\delta$,

\(
\begin{equation}
\mathbf{r}_{\delta}=\mathbf{i}\left(x_{\delta}-x\right)+\mathbf{j}\left(y_{\delta}-y\right)+\mathbf{k}\left(z_{\delta}-z\right) \tag{14.133}
\end{equation}
\)

and where $\mathbf{R}_{\alpha \delta}$ is the vector from nucleus $\alpha$ to nucleus $\delta$ :

\(
\mathbf{R}_{\alpha \delta}=\mathbf{i}\left(x_{\delta}-x_{\alpha}\right)+\mathbf{j}\left(y_{\delta}-y_{\alpha}\right)+\mathbf{k}\left(z_{\delta}-z_{\alpha}\right)
\)

Equation (14.132) has a simple physical interpretation. Let us imagine the electrons smeared out into a charge distribution whose density in atomic units is $-\rho(x, y, z)$. The force on nucleus $\delta$ exerted by the infinitesimal element of electronic charge $-\rho d x d y d z$ is [Eq. (6.56)]

\(
\begin{equation}
-Z_{\delta} \frac{\mathbf{r}_{\delta}}{r_{\delta}^{3}} \rho d x d y d z \tag{14.134}
\end{equation}
\)

and integration of (14.134) shows that the total force exerted on $\delta$ by this hypothetical electron smear is given by the first term on the right of (14.132). The second term on the right of (14.132) is clearly the Coulomb's law force on nucleus $\delta$ due to the electrostatic repulsions of the other nuclei.

Thus the effective force acting on a nucleus in a molecule can be calculated by simple electrostatics as the sum of the Coulombic forces exerted by the other nuclei and by a hypothetical electron cloud whose charge density $-\rho(x, y, z)$ is found by solving the electronic Schrödinger equation. This is the Hellmann-Feynman electrostatic theorem. The electron probability density depends on the parameters defining the nuclear configuration: $\rho=\rho\left(x, y, z ; x_{\alpha}, y_{\alpha}, z_{\alpha}, x_{\beta}, \ldots\right)$.

It is quite reasonable that the electrostatic theorem follows from the Born-Oppenheimer approximation, since the rapid motion of the electrons allows the electronic wave function and probability density to adjust immediately to changes in nuclear configuration. The rapid motion of the electrons causes the sluggish nuclei to "see" the electrons as a charge cloud rather than as discrete particles. The fact that the effective forces on the nuclei are electrostatic affirms that there are no "mysterious quantum-mechanical forces" acting in molecules.

Let us consider the implications of the electrostatic theorem for chemical bonding in diatomic molecules. We take the internuclear axis as the $z$ axis (Fig. 14.4). By symmetry the $x$ and $y$ components of the effective forces on the two nuclei are zero. [Also, one can show that the $z$ force components on nuclei $a$ and $b$ are related by $F_{z, a}=-F_{z, b}$ (Prob. 14.40). The effective forces on nuclei $a$ and $b$ are equal in magnitude and opposite in direction.]

From (14.134) and (14.133), the $z$ component of the effective force on nucleus $a$ due to the element of electronic charge in the infinitesimal region about $(x, y, z)$ is

\(
\begin{equation}
-Z_{a} \rho\left[\left(z_{a}-z\right) / r_{a}^{3}\right] d x d y d z=Z_{a} \rho\left(\cos \theta_{a} / r_{a}^{2}\right) d x d y d z \tag{14.135}
\end{equation}
\)

since $\cos \theta_{a}=\left(-z_{a}+z\right) / r_{a}$. ($z_{a}$ is negative.)

FIGURE 14.4 Coordinate system for a diatomic molecule. The origin is at $O$.

Similarly, the $z$ component of force on nucleus $b$ due to this charge is

\(
\begin{equation}
-Z_{b} \rho\left(\cos \theta_{b} / r_{b}^{2}\right) d x d y d z \tag{14.136}
\end{equation}
\)

A positive value of $(14.135)$ or $(14.136)$ corresponds to a force in the $+z$ direction, that is, to the right in Fig. 14.4. When the force on nucleus $a$ is algebraically greater than the force on nucleus $b$, then the element of electronic charge tends to draw $a$ toward $b$. Hence electronic charge that is binding is located in the region where

\(
\begin{equation}
Z_{a} \rho\left(\cos \theta_{a} / r_{a}^{2}\right) d x d y d z>-Z_{b} \rho\left(\cos \theta_{b} / r_{b}^{2}\right) d x d y d z \tag{14.137}
\end{equation}
\)

Since the probability density $\rho$ is nonnegative, division by $\rho$ preserves the direction of the inequality sign, and the binding region of space is where

\(
\begin{equation}
Z_{a} \frac{\cos \theta_{a}}{r_{a}^{2}}+Z_{b} \frac{\cos \theta_{b}}{r_{b}^{2}}>0 \tag{14.138}
\end{equation}
\)

When the force on $b$ is algebraically greater than that on $a$, the electronic charge element tends to draw $b$ away from $a$. The antibinding region of space is thus characterized by a negative value for the left side of (14.138). The surfaces for which the left side of (14.138) equals zero divide space into the binding and antibinding regions. [T. Berlin, J. Chem. Phys., 19, 208 (1951); J. Hinze, J. Chem. Phys., 101, 6369 (1994); Berlin's ideas are extended to polyatomic molecules in T. Koga et al., J. Am. Chem. Soc., 100, 7522 (1978); X. Wang and Z. Peng, Int. J. Quantum Chem., 47, 393 (1993).]
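
The left side of (14.138) is simple to evaluate numerically. The sketch below (my own setup, not from the text: nuclei on the $z$ axis at $z_{a}=-R/2$ and $z_{b}=+R/2$, with the angle conventions of Fig. 14.4) classifies points as binding or antibinding:

```python
import numpy as np

def berlin_f(x, z, Z_a=1.0, Z_b=1.0, R=2.0):
    """Left side of (14.138): positive -> binding region, negative -> antibinding.

    Nuclei on the z axis at z_a = -R/2 and z_b = +R/2 (atomic units); x is the
    distance off the axis, so this evaluates a cross-sectional plane.
    """
    z_a, z_b = -R / 2.0, R / 2.0
    r_a = np.hypot(x, z - z_a)
    r_b = np.hypot(x, z - z_b)
    cos_th_a = (z - z_a) / r_a          # angle conventions of Fig. 14.4
    cos_th_b = (z_b - z) / r_b
    return Z_a * cos_th_a / r_a**2 + Z_b * cos_th_b / r_b**2

print(berlin_f(0.0, 0.0) > 0)    # True: midpoint between the nuclei is binding
print(berlin_f(0.0, -2.0) > 0)   # False: a point "behind" nucleus a is antibinding
```

These two test points reproduce the qualitative picture of Fig. 14.5 for a homonuclear molecule.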

Figures 14.5 and 14.6 show the binding and antibinding regions for a homonuclear and a heteronuclear diatomic molecule. As might be expected, the binding region for a homonuclear diatomic molecule lies between the nuclei. Charge in this region tends to draw the nuclei together. Bonding leads to a transfer of charge probability density into the region between the nuclei because of the overlap between the bonding AOs. Electronic charge that is "behind" the nuclei (to the left of nucleus $a$ or to the right of nucleus $b$ in Fig. 14.5) exerts a greater attraction on the nucleus that is nearer to it than on the other nucleus and thus tends to pull the nuclei apart.

FIGURE 14.5 Cross section of binding and antibinding regions in a homonuclear diatomic molecule. To obtain the three-dimensional regions, rotate the figure about the internuclear axis.

FIGURE 14.6 Binding and antibinding regions for a heteronuclear diatomic molecule with $Z_{b}>Z_{a}$.

From the Hellmann-Feynman viewpoint, we seem to be considering chemical bonding solely in terms of potential energy, whereas the virial-theorem discussion involved both potential and kinetic energy. For the purposes of the Hellmann-Feynman discussion, we are imagining the electrons to be smeared out into a continuous charge distribution. Hence we make no reference to electronic kinetic energy. The use of the electrostatic theorem to explain chemical bonding has been criticized by some quantum chemists on the grounds that it hides the role of kinetic energy in bonding. [See the references cited after Eq. (13.66).]

In 1939, Feynman conjectured that the dispersion attraction between two molecules A and B at relatively large intermolecular distances is explainable as follows: The interactions between the two molecules cause the electron probability density of each molecule to be distorted and shifted somewhat toward the other molecule. The attractions of the nuclei in molecule A toward the distorted (polarized) electron density of molecule A and the attractions of the B nuclei toward the polarized B electron density then draw the two molecules together. In 1990, Hunt proved that the dispersion interaction between any two molecules in their ground electronic states results from the attractions of the nuclei in each molecule to the polarized electron density of the same molecule [K. L. C. Hunt, J. Chem. Phys., 92, 1180 (1990)].

For further applications of the Hellmann-Feynman electrostatic theorem, see B. M. Deb, ed., The Force Concept in Chemistry, Van Nostrand Reinhold, 1981.