Projection operator (quantum mechanics)

A projection operator \hat{P} is a linear operator that maps a vector onto the direction of another vector, i.e., it projects one vector onto another.

In general,

\hat{P}=\vert\boldsymbol{\mathit{u}}\rangle\langle\boldsymbol{\mathit{v}}\vert

\hat{P}\vert\boldsymbol{\mathit{w}}\rangle=(\vert\boldsymbol{\mathit{u}}\rangle\langle\boldsymbol{\mathit{v}}\vert)\vert\boldsymbol{\mathit{w}}\rangle=\vert\boldsymbol{\mathit{u}}\rangle(\langle\boldsymbol{\mathit{v}}\vert\boldsymbol{\mathit{w}}\rangle) =c\vert\boldsymbol{\mathit{u}}\rangle

where c=\langle\boldsymbol{\mathit{v}}\vert\boldsymbol{\mathit{w}}\rangle is a scalar.

It is useful in quantum mechanics to have a projection operator that maps a vector onto a member \boldsymbol{\mathit{m}} of a complete set of orthonormal basis vectors \left \{\boldsymbol{\mathit{i}} \right \} in a Hilbert space. We define the operator as:

\hat{P}_{\boldsymbol{\mathit{i}=\boldsymbol{\mathit{m}}}}=\vert\boldsymbol{\mathit{m}}\rangle\langle\boldsymbol{\mathit{m}}\vert

This allows us to project a vector \boldsymbol{\mathit{w}} onto the basis vector \boldsymbol{\mathit{m}}:

\hat{P}_{\boldsymbol{\mathit{i}=\boldsymbol{\mathit{m}}}}\vert\boldsymbol{\mathit{w}}\rangle=\vert\boldsymbol{\mathit{m}}\rangle\langle\boldsymbol{\mathit{m}}\vert\boldsymbol{\mathit{w}}\rangle=c\vert\boldsymbol{\mathit{m}}\rangle

If \boldsymbol{\mathit{w}} is a wavefunction \psi that is a linear combination of a complete set of orthonormal basis functions, i.e., \psi=\sum_{i=1}^{N}c_i\phi_i, then

\hat{P}_i\psi=\vert\phi_i\rangle\langle\phi_i\vert\psi\rangle=c_i\phi_i

When we measure an observable of a system whose state is described by \psi, we get an eigenvalue corresponding to an eigenfunction, which is one of the orthonormal basis functions in the complete set. We say that the wavefunction \psi is projected onto (or collapsed into) the eigenfunction \phi_i.
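These relations are easy to verify numerically. Below is a minimal sketch, assuming NumPy; the three-dimensional basis and the coefficients c_i are arbitrary illustrative choices:

```python
import numpy as np

# Orthonormal basis kets of C^3 (standard basis)
basis = [np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]

# |psi> = sum_i c_i |phi_i>, with arbitrary (normalised) coefficients c_i
c = np.array([0.5, 0.5j, np.sqrt(0.5)])
psi = sum(ci * b for ci, b in zip(c, basis))

# Projection operator P = |m><m| onto the first basis ket
m = basis[0]
P = np.outer(m, m.conj())

print(P @ psi)                # [0.5+0.j 0.+0.j 0.+0.j]: recovers c_1|m>
print(np.allclose(P @ P, P))  # True: a projection operator is idempotent
```

The last check confirms the idempotence of a projection operator, \hat{P}^{2}=\hat{P}: projecting twice is the same as projecting once.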

 

 


Matrix elements of an operator

The matrix elements of an operator are the entries of the matrix representation of the operator.

Consider a linear map \hat{O} from a vector space V to the same vector space, i.e. \vert\phi\rangle=\hat{O}\vert\psi\rangle, where \vert\psi\rangle,\vert\phi\rangle\in V and the orthonormal basis states \vert\varphi_n\rangle span V. The matrix representation of the equation is

\begin{pmatrix} \phi_1\\\phi_2 \\ \vdots \end{pmatrix}=\begin{pmatrix} O_{11} &O_{12} &\cdots \\ O_{21} &O_{22} &\cdots \\ \vdots & \vdots &\ddots \end{pmatrix}\begin{pmatrix} \psi_1\\\psi_2 \\ \vdots \end{pmatrix}

where \phi_m and \psi_n are the coefficients of the vectors \vert\phi\rangle and \vert\psi\rangle respectively.

The coefficients of \vert\phi\rangle are given by

\phi_m=\sum_{n}O_{mn}\psi_n\; \; \; \; \; \; \; \; 32a

Since the orthonormal basis states \vert\varphi_n\rangle span V, we have the completeness relation \sum_{n}\vert\varphi_n\rangle\langle\varphi_n\vert=\hat{I}. So,

\vert\phi\rangle=\hat{O} \sum_{n}\vert\varphi_n\rangle\langle\varphi_n\vert\psi\rangle =\sum_{n}\hat{O}\vert\varphi_n\rangle\langle\varphi_n\vert\psi\rangle

\langle\varphi_m\vert\phi\rangle=\sum_{n}\langle\varphi_m\vert\hat{O}\vert\varphi_n\rangle\langle\varphi_n\vert\psi\rangle

Similarly, \phi_m=\langle\varphi_m\vert\phi\rangle and \psi_n=\langle\varphi_n\vert\psi\rangle, and so

\phi_m=\sum_{n}\langle\varphi_m\vert\hat{O}\vert\varphi_n\rangle\psi_n\; \; \; \; \; \; \; \; 33

Comparing eq33 with eq32a, O_{mn}=\langle\varphi_m\vert\hat{O}\vert\varphi_n\rangle. Therefore, O_{mn} are the matrix elements of \hat{O} with respect to the basis states \vert\varphi_n\rangle.
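As a concrete sketch (assuming NumPy; the operator below is an arbitrary 2×2 example), the matrix elements O_{mn}=\langle\varphi_m\vert\hat{O}\vert\varphi_n\rangle can be computed directly and checked against eq32a:

```python
import numpy as np

# A hypothetical operator on C^2 (any matrix serves as an illustration)
O = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Orthonormal basis states |varphi_1>, |varphi_2>
varphi = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

# O_mn = <varphi_m|O|varphi_n>, built entry by entry
O_mn = np.array([[vm.conj() @ O @ vn for vn in varphi] for vm in varphi])
print(np.allclose(O_mn, O))  # True: in its own basis, the matrix represents itself

# eq32a: phi_m = sum_n O_mn psi_n agrees with |phi> = O|psi>
psi = np.array([0.6, 0.8])
print(np.allclose(O_mn @ psi, O @ psi))  # True
```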

 


Spectral decomposition of an operator

The spectral decomposition (also known as eigendecomposition or diagonalisation) of an operator is the transformation of an operator in a given basis to one in another basis, such that the resultant operator is represented by a diagonal matrix.

There are 2 main reasons for diagonalising an operator, especially a Hermitian operator. One is to find its eigenvalues and the other is to convert it into a form that is easier to multiply with.

 

Question

What is a spectrum with respect to linear algebra?

Answer

A spectrum is a collection of all eigenvalues of a matrix. If the matrix represents an operator, its spectral decomposition transforms it to a diagonal matrix with the eigenvalues as its diagonal elements.

 

Consider an operator with a complete set of orthonormal eigenvectors \{\boldsymbol{\mathit{e_i}}\} that is represented by the eigenvalue equation \hat{O}\vert\boldsymbol{\mathit{e_i}}\rangle=o_i\vert\boldsymbol{\mathit{e_i}}\rangle, where i\in \mathbb{N} and o_i are eigenvalues of \hat{O}. Since the eigenvectors form a complete set, any vector \boldsymbol{\mathit{u}} can be written as a linear combination of the basis eigenvectors:

\vert\boldsymbol{\mathit{u}}\rangle=\sum_{i=1}^{N}c_i\vert\boldsymbol{\mathit{e_i}}\rangle \; \; \; \; \; \; \; \; 28

where c_i is the coefficient of the basis eigenvector.

Letting \hat{O} act on eq28, \hat{O}\vert\boldsymbol{\mathit{u}}\rangle=\sum_{i=1}^{N}c_io_i\vert\boldsymbol{\mathit{e_i}}\rangle. As we have a complete set of orthonormal eigenvectors,  \langle\boldsymbol{\mathit{e_i}}\vert\boldsymbol{\mathit{u}}\rangle=c_i and \hat{O}\vert\boldsymbol{\mathit{u}}\rangle=\sum_{i=1}^{N}\langle\boldsymbol{\mathit{e_i}}\vert\boldsymbol{\mathit{u}}\rangle o_i\vert\boldsymbol{\mathit{e_i}}\rangle. Furthermore, \langle\boldsymbol{\mathit{e_i}}\vert\boldsymbol{\mathit{u}}\rangle is a scalar and matrix multiplication is associative. Therefore,

\hat{O}\vert\boldsymbol{\mathit{u}}\rangle=\sum_{i=1}^{N}o_i\vert\boldsymbol{\mathit{e_i}}\rangle\langle\boldsymbol{\mathit{e_i}}\vert\boldsymbol{\mathit{u}}\rangle =\left(\sum_{i=1}^{N}o_i\vert\boldsymbol{\mathit{e_i}}\rangle\langle\boldsymbol{\mathit{e_i}}\vert\right)\vert\boldsymbol{\mathit{u}}\rangle\; \; \; \; \; \; \; \; 29

and

\hat{O}=\sum_{i=1}^{N}o_i\vert\boldsymbol{\mathit{e_i}}\rangle\langle\boldsymbol{\mathit{e_i}}\vert\; \; \; \; \; \; \; \; 30

We call eq30 the spectral decomposition of \hat{O}. Since \vert\boldsymbol{\mathit{e_i}}\rangle\langle\boldsymbol{\mathit{e_i}}\vert is the projection operator onto the eigenspace corresponding to o_i, we can say that the spectral decomposition of a quantum operator represents the operator as a sum of projections onto its eigenstates, weighted by its eigenvalues.
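A minimal numerical check of eq30 (assuming NumPy; the Hermitian matrix below is an arbitrary example):

```python
import numpy as np

# An arbitrary Hermitian operator
O = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])

# Real eigenvalues o_i and orthonormal eigenvectors e_i (columns of e)
o, e = np.linalg.eigh(O)

# Rebuild O as sum_i o_i |e_i><e_i|, i.e. eq30
O_rebuilt = sum(o[i] * np.outer(e[:, i], e[:, i].conj()) for i in range(len(o)))
print(np.allclose(O_rebuilt, O))         # True

# In its own eigenbasis, O is the diagonal matrix diag(o_1, o_2)
print(np.round(e.conj().T @ O @ e, 10))
```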

 

Question

Show that \hat{O} in eq30, where N=3, is represented by a diagonal matrix.

Answer

\hat{O}=o_1 \begin{pmatrix} 1\\ 0\\0 \end{pmatrix} \begin{pmatrix} 1 &0 &0 \end{pmatrix}+o_2 \begin{pmatrix} 0\\ 1\\0 \end{pmatrix} \begin{pmatrix} 0 &1 &0 \end{pmatrix}+o_3 \begin{pmatrix} 0\\ 0\\1 \end{pmatrix} \begin{pmatrix} 0 &0 &1 \end{pmatrix}

\hat{O}=o_1 \begin{pmatrix} 1 &0 &0 \\ 0 &0 &0 \\ 0 &0 &0 \end{pmatrix}+o_2 \begin{pmatrix} 0 &0 &0 \\ 0 &1 &0 \\ 0 &0 &0 \end{pmatrix}+o_3 \begin{pmatrix} 0 &0 &0 \\ 0 &0 &0 \\ 0 &0 &1 \end{pmatrix}=\begin{pmatrix} o_1 &0 &0 \\ 0 &o_2 &0 \\ 0 &0 &o_3 \end{pmatrix}

Each o_i is a diagonal element of the operator, as well as an eigenvalue of the operator.

 

In other words, any operator can be expressed in the form of a diagonal matrix if it has the following properties:

    1. Eigenvectors of the operator form a complete set, i.e. the eigenvectors span the vector space.
    2. Eigenvectors of the operator are orthogonal or can be chosen to be orthogonal.

If the eigenvalues of \hat{O} are real, taking the adjoint of eq30 gives

\hat{O}^{\dagger}=\sum_{i=1}^{N}o_i^{*}\vert\boldsymbol{\mathit{e_i}}\rangle\langle\boldsymbol{\mathit{e_i}}\vert=\sum_{i=1}^{N}o_i\vert\boldsymbol{\mathit{e_i}}\rangle\langle\boldsymbol{\mathit{e_i}}\vert=\hat{O}

This implies that a Hermitian operator can also be expressed in the form of a diagonal matrix because the properties of a Hermitian matrix are:

    1. Eigenvectors of the operator form a complete set, i.e. the eigenvectors span the vector space.
    2. Eigenvectors of the operator are orthogonal or can be chosen to be orthogonal.
    3. Eigenvalues of the operator are real.
    4. The operator is self-adjoint, i.e. \hat{O}^{\dagger}=\hat{O}.

 

 


The uncertainty principle (derivation)

Heisenberg’s uncertainty principle states that the position and momentum of a particle cannot be determined simultaneously with unlimited precision.

The uncertainty not only applies to the position and momentum of a particle, but to any pair of complementary observables, e.g. energy and time. In general, the uncertainty principle is expressed as:

\Delta A\Delta B\geq \frac{1}{2}\left\vert\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle\right\vert\; \; \; \; \; \; \; \; \; 12

where \hat{A} and \hat{B} are Hermitian operators and A and B are their respective observables.

The derivation of eq12 involves the following:

  1. Deriving the Schwarz inequality
  2. Proving the inequality \left ( \Delta A \right )^{2}\left ( \Delta B \right )^{2}\geq -\frac{1}{4}\left ( \left \langle \varphi \left | \left [ \hat{A},\hat{B} \right ] \right |\varphi \right \rangle \right )^{2}
  3. Showing that \left \langle \varphi \left | \left [ \hat{A},\hat{B} \right ] \right |\varphi \right \rangle =-\left \langle \varphi \left | \left [ \hat{A},\hat{B} \right ] \right |\varphi \right \rangle^{*}

 

Step 1

Let

f(\lambda)=\langle\phi-\lambda\psi\vert\phi-\lambda\psi\rangle\; \; \; \; \; \; \; \; 13

where \phi and \psi are arbitrary square integrable wavefunctions and \lambda is an arbitrary scalar.

Since \langle\phi-\lambda\psi\vert\phi-\lambda\psi\rangle=\int (\phi-\lambda\psi)^{*}(\phi-\lambda\psi)d\tau=\int \vert\phi-\lambda\psi\vert^{2}d\tau\geq 0

f(\lambda)\geq 0\; \; \; \; \; \; \; \; \; 14

Expanding eq13, we have

f(\lambda)=\langle\phi\vert\phi\rangle-\lambda\langle\phi\vert\psi\rangle-\lambda^{*}\langle\psi\vert\phi\rangle+\lambda^{*}\lambda\langle\psi\vert\psi\rangle\; \; \; \; \; \; \; \; 15

Since \lambda is an arbitrary scalar, substituting \lambda=\frac{\langle\psi\vert\phi\rangle}{\langle\psi\vert\psi\rangle} and \lambda^{*}=\frac{\langle\phi\vert\psi\rangle}{\langle\psi\vert\psi\rangle} in eq15 gives:

f(\lambda)=\langle\phi\vert\phi\rangle-\frac{\langle\phi\vert\psi\rangle}{\langle\psi\vert\psi\rangle}\langle\psi\vert\phi\rangle

Applying eq14 to the above equation and rearranging yields \langle\phi\vert\psi\rangle\langle\psi\vert\phi\rangle \leq\langle\phi\vert\phi\rangle\langle\psi\vert\psi\rangle. Since \langle\phi\vert\psi\rangle= \langle\psi\vert\phi\rangle^{*},

\vert\langle\psi\vert\phi\rangle\vert^{2}\leq\langle\phi\vert\phi\rangle\langle\psi\vert\psi\rangle\; \; \; \; \; \; \; \; 16

Eq16 is called the Schwarz Inequality.
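The inequality can be tested numerically. A minimal sketch, assuming NumPy, with random complex vectors standing in for the square-integrable functions \phi and \psi:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex vectors; <f|g> = sum_i f_i* g_i (np.vdot conjugates its first argument)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)

lhs = abs(np.vdot(psi, phi)) ** 2                       # |<psi|phi>|^2
rhs = np.vdot(phi, phi).real * np.vdot(psi, psi).real   # <phi|phi><psi|psi>
print(lhs <= rhs)  # True: eq16 holds
```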

 

Step 2

Let \psi=(\hat{A}-\langle A\rangle)\varphi and \phi=(\hat{B}-\langle B\rangle)\varphi, where \varphi is normalised, and \hat{A} and \hat{B} are Hermitian operators, which implies that \hat{A}-\langle A\rangle and \hat{B}-\langle B\rangle are also Hermitian operators (see this article for proof). The variance of the observable of \hat{A}, computed from N measured values a_i, is

(\Delta A)^{2}=\frac{\sum_{i=1}^{N}(a_i-\langle A\rangle)^{2}}{N}=\langle(\hat{A}-\langle A\rangle)^{2} \rangle

=\langle \varphi\vert(\hat{A}-\langle A\rangle)[(\hat{A}-\langle A\rangle)\varphi]\rangle=\langle [(\hat{A}-\langle A\rangle)\varphi]\vert[(\hat{A}-\langle A\rangle)\varphi]\rangle=\langle\psi\vert\psi \rangle\; \; \; \; \; \; \; \; 17

Note that the 2nd last equality uses the property of Hermitian operators (see eq36). Similarly,

(\Delta B)^{2}=\langle\phi\vert\phi\rangle\; \; \; \; \; \; \; \; 18

Substituting eq17 and eq18 in eq16 results in

(\Delta A)^{2}(\Delta B)^{2} \geq\vert \langle\psi\vert\phi\rangle\vert^{2} \; \; \; \; \; \; \; \; 19

Let z= \langle\psi\vert\phi\rangle=x+iy, where x and y are real. So, \left | z \right |^{2}=x^{2}+y^{2}\geq y^{2}. Since y=\frac{z-z^{*}}{2i}, we have \left | z \right |^{2}\geq \left ( \frac{z-z^{*}}{2i} \right )^{2}, which is

\left | \langle\psi\vert\phi\rangle\right |^{2}\geq \left ( \frac{\langle\psi\vert\phi\rangle-\langle\psi\vert\phi\rangle^{*}}{2i} \right )^{2}=-\frac{1}{4}\left (\langle\psi\vert\phi\rangle-\langle\phi\vert\psi\rangle\right )^{2}\; \; \; \;\; \; \; \; 20

Combining eq19 and eq20 gives

(\Delta A)^{2}(\Delta B)^{2}\geq -\frac{1}{4}\left ( \langle\psi\vert\phi\rangle-\langle\phi\vert\psi\rangle\right )^{2}\; \; \; \; \; \; \; \; 21

Next, we have

\langle\psi\vert\phi\rangle= \langle (\hat{A}-\langle A\rangle)\varphi\vert (\hat{B}-\langle B\rangle )\varphi\rangle=\langle\varphi\vert(\hat{A}-\langle A\rangle)(\hat{B}-\langle B\rangle)\varphi\rangle

=\langle\varphi\vert\hat{A}\hat{B}\vert\varphi\rangle-\langle B\rangle\langle\varphi\vert\hat{A}\vert\varphi\rangle-\langle A\rangle\langle\varphi\vert\hat{B}\vert\varphi\rangle+\langle A\rangle\langle B \rangle\langle\varphi\vert\varphi\rangle

=\langle\varphi\vert\hat{A}\hat{B}\vert\varphi\rangle-\langle A\rangle\langle B \rangle \; \; \; \; \; \; \; \; 22

Similarly,

\langle\phi\vert\psi\rangle=\langle\varphi\vert\hat{B}\hat{A}\vert\varphi\rangle-\langle B\rangle\langle A \rangle \; \; \; \; \; \; \; \; 23

Substituting eq22 and eq23 in eq21 yields

(\Delta A)^{2}(\Delta B)^{2}\geq -\frac{1}{4}(\langle\varphi\vert\hat{A}\hat{B}\vert\varphi\rangle-\langle\varphi\vert\hat{B}\hat{A}\vert\varphi\rangle )^{2}

=-\frac{1}{4}(\langle\varphi\vert\hat{A}\hat{B}-\hat{B}\hat{A}\vert\varphi\rangle )^{2}=-\frac{1}{4}(\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle )^{2}\; \; \; \; \; \; \; \; 24

 

Step 3

\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle=\langle\varphi\vert\hat{A}\hat{B}\vert\varphi\rangle-\langle\varphi\vert\hat{B}\hat{A}\vert\varphi\rangle=\langle\hat{A}\varphi\vert\hat{B}\varphi\rangle-\langle\hat{B}\varphi\vert\hat{A}\varphi\rangle

=\langle\varphi\vert\hat{B}(\hat{A}\varphi)\rangle^{*}-\langle\varphi\vert\hat{A}(\hat{B}\varphi)\rangle^{*}=- \{\langle\varphi\vert\hat{A}(\hat{B}\varphi)\rangle^{*}-\langle\varphi\vert\hat{B}(\hat{A}\varphi)\rangle^{*}\}

=-\{\langle\varphi\vert\hat{A}(\hat{B}\varphi)\rangle-\langle\varphi\vert\hat{B}(\hat{A}\varphi)\rangle\}^{*}=-\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle^{*}\; \; \; \; \; \; \; \; 25

We have used eq37 for the 2nd equality and eq35 for the 3rd equality. Substituting eq25 into one of the \langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle factors in eq24 results in

(\Delta A)^{2}(\Delta B)^{2}\geq\frac{1}{4} \{\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle^{*}\}\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle

Therefore,

(\Delta A)^{2}(\Delta B)^{2}\geq \frac{1}{4}\vert\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle\vert^{2}

\Delta A\Delta B\geq \frac{1}{2}\vert\langle\varphi\vert[\hat{A},\hat{B}]\vert\varphi\rangle\vert\; \; \; \; \; \; \; \; 26

which is eq12, the general form of the uncertainty principle.

For the observable pair of position x and momentum p, we have

\Delta x\Delta p\geq \frac{1}{2}\vert\langle\varphi\vert[\hat{x},\hat{p}]\vert\varphi\rangle\vert

Since [\hat{x},\hat{p}]\varphi=x\frac{\hbar}{i}\frac{d}{dx}\varphi-\frac{\hbar}{i}\frac{d}{dx}(x\varphi)=i\hbar\varphi

\Delta x\Delta p\geq \frac{1}{2}\vert\langle\varphi\vert i\hbar\varphi\rangle\vert=\frac{\hbar}{2}\vert i\vert

\Delta x\Delta p\geq \frac{\hbar}{2}\; \; \; \; \; \; \; \; 27
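Eq27 can be checked numerically for a particle-in-a-box eigenstate. A minimal sketch, assuming NumPy, with \hbar and L set to 1 for convenience:

```python
import numpy as np

hbar, L, n = 1.0, 1.0, 1
x = np.linspace(0, L, 100001)
psi = np.sqrt(2 / L) * np.sin(n * np.pi * x / L)

# <x> and <x^2> by numerical integration
ex = np.trapz(psi * x * psi, x)
ex2 = np.trapz(psi * x**2 * psi, x)
dx = np.sqrt(ex2 - ex**2)

# <p> = 0 for a box eigenstate; <p^2> = hbar^2 * int (dpsi/dx)^2 dx
# (integration by parts, with psi vanishing at the walls)
dpsi = np.gradient(psi, x)
dp = np.sqrt(hbar**2 * np.trapz(dpsi * dpsi, x))

print(dx * dp, hbar / 2)  # ~0.568 >= 0.5: eq27 is satisfied
```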

 


Commuting operators

A pair of commuting operators that are Hermitian can have a common complete set of eigenfunctions.

Let \hat{O}_1 and \hat{O}_2 be two different operators with observables \Omega_1 and \Omega_2 respectively, and let \psi be a common eigenfunction of both operators, with eigenvalues \Omega_1 and \Omega_2. Then

\hat{O}_1(\hat{O}_2\psi)=\hat{O}_1(\Omega_2\psi)=\Omega_2\hat{O}_1\psi=\Omega_2\Omega_1\psi

\hat{O}_2(\hat{O}_1\psi)=\hat{O}_2(\Omega_1\psi)=\Omega_1\hat{O}_2\psi=\Omega_1\Omega_2\psi

So, \hat{O}_1(\hat{O}_2\psi)=\hat{O}_2(\hat{O}_1\psi)\; \; \; or\; \; \;\hat{O}_1(\hat{O}_2\psi)-\hat{O}_2(\hat{O}_1\psi)=0. If this is so, we say that the two operators commute. The short notation for \hat{O}_1(\hat{O}_2\psi)-\hat{O}_2(\hat{O}_1\psi) is \left [\hat{O}_1,\hat{O}_2\right ], where in the case of two commuting operators, \left [\hat{O}_1,\hat{O}_2\right ]=0.

When the effect of two operators depends on their order, we say that they do not commute, i.e. \left [\hat{O}_1,\hat{O}_2\right ]\neq 0. If this is the case, we say that the observables \Omega_1 and \Omega_2 are complementary.
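A minimal sketch, assuming NumPy, of one commuting pair and one non-commuting pair (the matrices are arbitrary examples; the Pauli matrices satisfy [\sigma_x,\sigma_y]=2i\sigma_z):

```python
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA."""
    return A @ B - B @ A

# Two diagonal (hence commuting) Hermitian matrices
A = np.diag([1.0, 2.0])
B = np.diag([3.0, 4.0])
print(np.allclose(commutator(A, B), 0))  # True: [A, B] = 0

# The Pauli matrices do not commute: [sigma_x, sigma_y] = 2i sigma_z
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
print(commutator(sx, sy))  # [[2j, 0], [0, -2j]]
```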

One important concept in quantum mechanics is that we can select a common complete set of eigenfunctions for a pair of commuting Hermitian operators. The proof is as follows:

Let \left \{ \vert a_n\rangle \right \} and \left \{ \vert b_n\rangle \right \} be the complete sets of eigenfunctions of \hat{A} and \hat{B} respectively, such that \hat{A}\vert a_n\rangle=a_n\vert a_n\rangle and \hat{B}\vert b_m\rangle=b_m\vert b_m\rangle. If the two operators have a common complete set of eigenfunctions, we can express \vert a_n\rangle as a linear combination of \vert b_m\rangle:

\vert a_n\rangle=\sum_{m=1}^{k}c_{nm}\vert b_m\rangle\; \; \; \; \; \; \; \; 5

For example, the first eigenfunction is:

\vert a_1\rangle=c_{11}\vert b_1\rangle+c_{12}\vert b_2\rangle+\cdots+c_{1k}\vert b_k\rangle\; \; \; \; \; \; \; \; 6

Since some of the eigenfunctions \vert b_m\rangle may describe degenerate states (i.e. some \vert b_m\rangle are associated with the same eigenvalue b_i), we can rewrite \vert a_n\rangle as:

\vert a_n\rangle=\sum_{i=1}^{j}\vert(a_n) b_i\rangle\; \; \; \; \; \; \; \; 7

where \vert(a_n) b_i\rangle=\sum_{m=1}^{k}d_{nm}\vert b_m\rangle\delta_{b_i,b_m} and b_i represents distinct eigenvalues of the complete set of eigenfunctions of \hat{B}.

For example, if the linear combination of \vert a_1 \rangle in eq6 has \vert b_1\rangle and \vert b_2\rangle describing the same eigenstate with eigenvalue b_1, and \vert b_4\rangle and \vert b_5\rangle describing another common eigenstate with eigenvalue b_3,

\vert a_1\rangle=\vert(a_1) b_1\rangle+\vert(a_1) b_2\rangle+\vert(a_1) b_3\rangle+\cdots+\vert(a_1) b_j\rangle

where \vert(a_1) b_1\rangle=d_{11}\vert b_1\rangle+d_{12}\vert b_2\rangle, \vert(a_1) b_2\rangle=d_{13}\vert b_3\rangle, \vert(a_1) b_3\rangle=d_{14}\vert b_4\rangle+d_{15}\vert b_5\rangle and so on.

In other words, eq7 is a sum of eigenfunctions with distinct eigenvalues of \hat{B}. Since a linear combination of eigenfunctions describing a degenerate eigenstate is an eigenfunction of \hat{B}, we have

\hat{B}\vert(a_n)b_i\rangle=b_i\vert(a_n)b_i\rangle\; \; \; \; \; \; \; \; 8

i.e. \vert(a_n)b_i\rangle is an eigenfunction of \hat{B}. Furthermore, the set \left \{ \vert(a_n)b_i\rangle \right \} is complete, which is deduced from eq7, where the set \left \{ \vert a_n\rangle \right \} is complete.

From \hat{A}\vert a_n\rangle=a_n\vert a_n\rangle, we have:

(\hat{A}-a_n)\vert a_n\rangle=0

Substituting eq7 in the above equation, we have

(\hat{A}-a_n)\vert a_n\rangle=\sum_{i=1}^{j}(\hat{A}-a_n) \vert (a_n)b_i\rangle=0\; \; \; \; \; \; \; \; 9

By operating on the 1st term of the summation in the above equation with \hat{B}, and using the fact that \hat{A} commutes with \hat{B},

\hat{B}(\hat{A}-a_n)\vert (a_n)b_1\rangle=(\hat{A}-a_n)\hat{B} \vert (a_n)b_1\rangle\; \; \; \; \; \; \; \; 10

Substituting eq8 with i=1 into the above equation,

\hat{B}(\hat{A}-a_n)\vert (a_n)b_1\rangle=b_1(\hat{A}-a_n) \vert (a_n)b_1\rangle\; \; \; \; \; \; \; \; 11

Repeating the operation of \hat{B} on the remaining terms of the summation in eq9, we obtain equations similar to eq11 and we can write:

\hat{B}(\hat{A}-a_n)\vert (a_n)b_i\rangle=b_i(\hat{A}-a_n) \vert (a_n)b_i\rangle

i.e. (\hat{A}-a_n)\vert (a_n)b_i\rangle is an eigenfunction of \hat{B} with distinct eigenvalues b_i. Since \hat{B} is Hermitian and (\hat{A}-a_n)\vert (a_n)b_i\rangle are associated with distinct eigenvalues, the eigenfunctions (\hat{A}-a_n)\vert (a_n)b_i\rangle are orthogonal and therefore linearly independent. Consequently, each term in the summation in eq9 must be equal to zero:

(\hat{A}-a_n)\vert (a_n)b_i\rangle=0\; \; \; \Rightarrow \; \; \;\hat{A}\vert (a_n)b_i\rangle=a_n\vert (a_n)b_i\rangle

This implies that \vert (a_n)b_i\rangle, which is a complete set as mentioned earlier, is also an eigenfunction of \hat{A}. Therefore, we can select a common complete set of eigenfunctions \left \{ \vert (a_n)b_i\rangle\right \} for a pair of commuting Hermitian operators. Conversely, if two Hermitian operators do not commute, eq10 is no longer valid and we cannot select a common complete set of eigenfunctions for them.
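The result can be illustrated numerically. A minimal sketch, assuming NumPy, where a commuting Hermitian pair is constructed from a shared set of orthonormal eigenvectors:

```python
import numpy as np

# Build two Hermitian matrices from the same orthonormal eigenvectors,
# so that they commute by construction
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = U @ np.diag([1.0, 2.0]) @ U.T
B = U @ np.diag([5.0, 7.0]) @ U.T
print(np.allclose(A @ B, B @ A))  # True: [A, B] = 0

# The eigenvectors of A also diagonalise B: a common complete set
_, V = np.linalg.eigh(A)
print(np.round(V.T @ B @ V, 10))  # diagonal matrix diag(5, 7)
```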

 

 


Operators (quantum mechanics)

An operator \hat{O} in a vector space V maps a set of vectors \boldsymbol{\mathit{v}}_i to another set of vectors \boldsymbol{\mathit{v}}_j (or transforms one function into another function), where \boldsymbol{\mathit{v}}_i,\boldsymbol{\mathit{v}}_j\in V. For example, the operator \frac{d^{2}}{dx^{2}} transforms the function f(x) into f^{''}(x):

\frac{d^{2}}{dx^{2}}f(x)=f^{''}(x)

Linear operators have the following properties:

    1. \hat{O}(\boldsymbol{\mathit{u}}+\boldsymbol{\mathit{v}})=\hat{O}\boldsymbol{\mathit{u}}+\hat{O}\boldsymbol{\mathit{v}}
    2. \hat{O}(c\boldsymbol{\mathit{u}})=c\hat{O}\boldsymbol{\mathit{u}}
    3. (\hat{O}_1\hat{O}_2)\boldsymbol{\mathit{u}}=\hat{O}_1(\hat{O}_2\boldsymbol{\mathit{u}})

Two operators commonly encountered in quantum mechanics are the position and linear momentum operators. To construct these operators, we refer to probability theory, where the expectation value of the position x of a particle in a 1-D box of length L is

\langle x\rangle=\int_{0}^{L}xP(x)dx=\int_{0}^{L}x\left |\psi(x) \right |^{2}dx=\int_{0}^{L}\psi(x)\, x\, \psi(x)dx

where P(x) is the probability of observing the particle at a particular position between 0 and L, and \psi(x) is the particle’s wavefunction, which is assumed to be real.

Comparing the above equation with the expression for the expectation value of a quantum-mechanical operator, \langle\hat{O}\rangle=\int_{0}^{L}\psi(x)\,\hat{O}\,\psi(x)dx, we can assign \hat{x}=x.

One may infer that the linear momentum operator is \hat{p}_x=p_x. However, we must find a form of p_x that is a function of x so that we can compute \int_{0}^{L}\psi(x)p_x\psi(x)dx. If we compare the time-independent Schrodinger equation \left [ -\frac{\hbar^{2}}{2m}\frac{\partial^{2}}{\partial x^{2}}+V(x) \right ]\psi(x)=E\psi(x) with the total energy equation \frac{p_x^{\: 2}}{2m}+V(x)=E, we have \hat{p}_x=\frac{\hbar}{i}\frac{d}{dx}.

To test the validity of \hat{x}=x and \hat{p}_x=\frac{\hbar}{i}\frac{d}{dx}, we compute \langle x\rangle and \langle p_x\rangle using the 1-D box wavefunction of \sqrt\frac{2}{L}sin\frac{n\pi x}{L} and check if the results are reasonable with respect to classical mechanics.

Integrating \langle x\rangle=\frac{2}{L}\int_{0}^{L}xsin^{2}\left ( \frac{n\pi x}{L} \right )dx by parts, we have \langle x\rangle=\frac{L}{2}. In classical mechanics, the particle can be anywhere in the 1-D box with equal probability. Therefore, the average position of \langle x\rangle=\frac{L}{2} is reasonable.

For the linear momentum operator, we have \langle p_x\rangle=\frac{2\hbar}{iL}\int_{0}^{L}sin\left ( \frac{n\pi x}{L} \right )\frac{d}{dx}sin\left ( \frac{n\pi x}{L} \right )dx=0. Since \langle x\rangle=\frac{L}{2}, we expect \langle p_x\rangle=m\frac{d\langle x\rangle}{dt}=0. Therefore, x and \frac{\hbar}{i}\frac{d}{dx} are reasonable assignments of the position and linear momentum operators respectively. In 3-D, the position and linear momentum operators are:

\hat{x}=x\; \; \; \; \;\hat{y}=y\; \; \; \; \;\hat{z}=z

\hat{p}_x=\frac{\hbar}{i}\frac{\partial}{\partial x}\; \; \; \; \;\hat{p}_y=\frac{\hbar}{i}\frac{\partial}{\partial y}\; \; \; \; \;\hat{p}_z=\frac{\hbar}{i}\frac{\partial}{\partial z}\; \; \; \; \;\; \; \; 4
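The two expectation values computed above can be reproduced by numerical integration. A minimal sketch, assuming NumPy, with \hbar=L=1:

```python
import numpy as np

hbar, L, n = 1.0, 1.0, 2
x = np.linspace(0, L, 100001)
psi = np.sqrt(2 / L) * np.sin(n * np.pi * x / L)

# <x> = int psi x psi dx (psi is real)
ex = np.trapz(psi * x * psi, x)
print(ex)  # ~0.5 = L/2

# <p_x> = int psi (hbar/i) dpsi/dx dx
dpsi = np.gradient(psi, x)
ep = np.trapz(psi * (hbar / 1j) * dpsi, x)
print(abs(ep) < 1e-8)  # True: the average momentum vanishes
```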

To see the proof that the position and linear momentum operators are Hermitian, read this article.

 

 


Kronecker product

The Kronecker product, denoted by \otimes, is a multiplication method for generating a new vector space from existing vector spaces, and therefore, new vectors from existing vectors.

Consider 2 vector spaces, e.g. V=\mathbb{R}^{2} and W=\mathbb{R}^{3}. For \boldsymbol{\mathit{v}}=\begin{pmatrix} a_1\\a_2 \end{pmatrix} in V and \boldsymbol{\mathit{w}}=\begin{pmatrix} b_1\\b_2\\b_3 \end{pmatrix} in W, we can define a new vector space, V\otimes W, which consists of the vector \boldsymbol{\mathit{v}}\otimes\boldsymbol{\mathit{w}}, where:

\boldsymbol{\mathit{v}}\otimes\boldsymbol{\mathit{w}}=\begin{pmatrix} a_1\\a_2 \end{pmatrix}\otimes \begin{pmatrix} b_1\\b_2 \\ b_3 \end{pmatrix}=\begin{pmatrix} a_1b_1\\a_1b_2 \\a_1b_3 \\ a_2b_1 \\ a_2b_2 \\ a_2b_3 \end{pmatrix}
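NumPy provides this product directly as np.kron. A minimal sketch with arbitrary entries:

```python
import numpy as np

v = np.array([1, 2])        # a vector in V = R^2
w = np.array([3, 4, 5])     # a vector in W = R^3

vw = np.kron(v, w)          # the Kronecker product of v and w
print(vw)                   # [ 3  4  5  6  8 10]
print(vw.shape)             # (6,): an element of a 2 x 3 = 6-dimensional space
```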

If the basis vectors for V and W are V=\left \{\boldsymbol{\mathit{e_1}},\boldsymbol{\mathit{e_2}}\right \} and W=\left \{\boldsymbol{\mathit{f_1}},\boldsymbol{\mathit{f_2}},\boldsymbol{\mathit{f_3}}\right \} respectively, the basis for V\otimes W is \left \{\boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}}\right \}, where i=1,2 and j=1,2,3.

 

Question

Why is V\otimes W a new vector space?

Answer

An n-dimensional vector space is spanned by n linearly independent basis vectors. The basis vectors for V=\left \{\boldsymbol{\mathit{e_1}},\boldsymbol{\mathit{e_2}}\right \} and W=\left \{\boldsymbol{\mathit{f_1}},\boldsymbol{\mathit{f_2}},\boldsymbol{\mathit{f_3}}\right \} are

\boldsymbol{\mathit{e_1}}=\begin{pmatrix} 1\\0 \end{pmatrix},\; \boldsymbol{\mathit{e_2}}=\begin{pmatrix} 0\\1 \end{pmatrix}\; \; \; and\; \; \; \boldsymbol{\mathit{f_1}}=\begin{pmatrix} 1\\0\\0 \end{pmatrix},\; \boldsymbol{\mathit{f_2}}=\begin{pmatrix} 0\\1\\0 \end{pmatrix},\; \boldsymbol{\mathit{f_3}}=\begin{pmatrix} 0\\0\\1 \end{pmatrix}

and consequently, the basis vectors for V\otimes W are the six distinct vectors \boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}}, each a 6-component column vector with a single entry of 1 and the rest 0.

These 6 linearly independent basis vectors therefore span a 6-dimensional space.

 

This implies that V\otimes W is nm dimensional if V is n-dimensional and W is m-dimensional. Since V\otimes W is a vector space, the vectors \boldsymbol{\mathit{v}}\otimes\boldsymbol{\mathit{w}} must follow the rules of addition and multiplication of a vector space. Each vector \boldsymbol{\mathit{v}}\otimes\boldsymbol{\mathit{w}} in the new vector space can then be written as a linear combination of the basis vectors \boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}}, i.e. \sum c_{i,j}\boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}}.

In general, if

\boldsymbol{\mathit{v}}=\sum_{i=1}^{n}a_i\boldsymbol{\mathit{e_i}}\; \; \; and\; \; \; \boldsymbol{\mathit{w}}=\sum_{j=1}^{m}b_j\boldsymbol{\mathit{f_j}}

then

\boldsymbol{\mathit{v}}\otimes\boldsymbol{\mathit{w}}=\sum_{i=1}^{n}\sum_{j=1}^{m}a_ib_j\,\boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}}

Since the pair (i,j) is distinct for each \boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}} vector, the Kronecker product \boldsymbol{\mathit{e_i}}\otimes\boldsymbol{\mathit{f_j}} results in nm basis vectors, which span an nm-dimensional vector space.

As mentioned in an earlier article, a vector space is a set of objects that follows certain rules of addition and multiplication. If the objects are matrices, we have a vector space of matrices. For example, the Kronecker product of a vector space of matrices A and a vector space of matrices B generates a new vector space of matrices A\otimes B, where

A\otimes B=\begin{pmatrix} a_{11}B &a_{12}B &\cdots \\ a_{21}B &a_{22}B &\cdots \\ \vdots &\vdots &\ddots \end{pmatrix}

Similarly, if the objects are functions, we have a vector space of functions, and the Kronecker product of two vector spaces of functions generates a new vector space of functions. If the two spaces are spanned by n basis functions and m basis functions respectively, their Kronecker product is spanned by nm basis functions.

A vector space that is generated from two separate vector spaces has applications in quantum composite systems and in group theory.

Question

What is the relation between the matrix entries of A, B and C in C=A\otimes B?

Answer

Let the matrix entries of A, B and C be a_{ij}, b_{kl} and c_{\mu\nu} respectively, where C=A\otimes B and B has m rows and q columns.

Using the ordering convention called dictionary order, the row index \mu is determined by i and k, and the column index \nu is determined by j and l, such that \mu and \nu are given by

\mu=m(i-1)+k\; \; \; \; \; \; \; \; \nu=q(j-1)+l

For example, if i=2, k=1 and m=3, then \mu=3(2-1)+1=4.

We can then express the matrix entries of C as c_{\mu\nu}=a_{ij}b_{kl}.
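This indexing rule can be confirmed with np.kron. A minimal sketch, assuming NumPy; note that Python indices are zero-based, so the rule reads C[m*i + k, q*j + l] = A[i, j]*B[k, l]:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(1, 9, size=(2, 2))
B = rng.integers(1, 9, size=(3, 3))
C = np.kron(A, B)
m, q = B.shape

# Zero-based dictionary order: C[m*i + k, q*j + l] == A[i, j] * B[k, l]
ok = all(C[m * i + k, q * j + l] == A[i, j] * B[k, l]
         for i in range(A.shape[0]) for j in range(A.shape[1])
         for k in range(m) for l in range(q))
print(ok)  # True
```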

 

 


Hilbert space

A Hilbert space H is a complete inner product space. It allows the application of linear algebra and calculus techniques in a space that may have an infinite dimension.

The inner product in a Hilbert space has the following properties:

  1. Conjugate symmetry: \langle\phi_1\vert\phi_2\rangle=\langle\phi_2\vert\phi_1\rangle^{*}
  2. Linearity with respect to the 2nd argument: \langle\phi_1\vert c_2\phi_2+c_3\phi_3\rangle=c_2\langle\phi_1\vert \phi_2\rangle+c_3\langle\phi_1\vert \phi_3\rangle
  3. Antilinearity with respect to the first argument: \langle c_1\phi_1+c_2\phi_2\vert \phi_3\rangle=c_1^{*}\langle\phi_1\vert \phi_3\rangle+c_2^{*}\langle\phi_2\vert \phi_3\rangle
4. Positive semi-definiteness: \langle\phi_1\vert \phi_1\rangle\geq 0, with \langle\phi_1\vert \phi_1\rangle= 0 if and only if \phi_1=0

The last property can be illustrated using the \mathbb{R}^{2} space that is equipped with an inner product. Such a space is an example of a real finite-dimensional Hilbert space. The inner product of the vector \boldsymbol{\mathit{u}} with itself is:

\boldsymbol{\mathit{u}}\cdot\boldsymbol{\mathit{u}}=\begin{pmatrix} u_1 &u_2 \end{pmatrix}\begin{pmatrix} u_1\\u_2 \end{pmatrix}=u_{1}^{2}+u_{2}^{2}=\begin{cases} >0 &if\; \boldsymbol{\mathit{u}}\neq 0 \\ 0 &if \; \boldsymbol{\mathit{u}}=0 \end{cases}

We define a complete Hilbert space as one where every Cauchy sequence in H converges to an element of H. If you recall, a Cauchy sequence is a sequence, e.g. \left \{ x_n \right \}_{n=1}^{\infty} where x_n=\sum_{k=1}^{n}\frac{(-1)^{k+1}}{k}, for which

\lim_{m,n\rightarrow \infty}\left | x_n-x_m \right |=0

We can also define the completeness of a Hilbert space in terms of a sequence of vectors \left \{ \boldsymbol{\mathit{v_n}} \right \}_{n=1}^{\infty}, where \boldsymbol{\mathit{v_n}} =\sum_{k=1}^{n}\boldsymbol{\mathit{u_k}}. Each element \boldsymbol{\mathit{v_n}} is represented by a series of vectors, which converges absolutely (i.e. \sum_{k=1}^{\infty}\left \| \boldsymbol{\mathit{u_k}} \right \|< \infty) and converges to an element of H. In other words, the series of vectors in H converges to some limit vector \boldsymbol{\mathit{L}} in H:

\lim_{n\rightarrow \infty}\left \|\boldsymbol{\mathit{L}}-\sum_{k=1}^{n}\boldsymbol{\mathit{u_k}} \right \|=0

Generally, every element of a vector space can be a point, a vector or a function. In quantum mechanics, we are interested in a Hilbert space called the L^{2} space, where the eigenfunctions of a Hermitian operator are square integrable, i.e. \int_{-a}^{b}\left | \phi(x) \right |^{2}dx< \infty.

Not to be confused with the completeness of a Hilbert space, the completeness of a set of basis eigenfunctions refers to the property that any function in the Hilbert space can be expressed as a linear combination of the basis eigenfunctions. An example is the \mathbb{R}^{2} space, where the set of basis vectors \left \{\boldsymbol{\mathit{\hat i}},\boldsymbol{\mathit{\hat j}}\right \} is complete, with linear combinations of \boldsymbol{\mathit{\hat i}} and \boldsymbol{\mathit{\hat j}} spanning \mathbb{R}^{2}. In H, the number of basis vectors \boldsymbol{\mathit{u_k}} may be infinite. If the set of \boldsymbol{\mathit{u_k}} is complete, we say that it spans H, which is itself complete.

Just as the orthonormal vectors \boldsymbol{\mathit{\hat i}} and \boldsymbol{\mathit{\hat j}} form a complete set of basis vectors in the \mathbb{R}^{2} space, where any vector in \mathbb{R}^{2} can be expressed as a linear combination of \boldsymbol{\mathit{\hat i}} and \boldsymbol{\mathit{\hat j}}, we postulate the existence of a complete basis set of orthonormal wavefunctions of any Hermitian operator in L^{2}.

 

 


Completeness of a vector space

A complete vector space is one that has no “missing elements” (e.g. no missing coordinates).

In the previous article, we learned that the function d(\boldsymbol{\mathit{u}},\boldsymbol{\mathit{v}}) defines the distance between two elements of the vector space. Such a function is called a metric, and it measures the ‘closeness’ of elements (or points) in a vector space. Since a vector space is a collection of elements, we can use a sequence, e.g. \left \{ x_n \right \}_{n=1}^{\infty}, to represent elements of a vector space X. If the distance between two members of the sequence gets smaller as n gets larger, i.e.

\lim_{m,n\rightarrow \infty}d(x_n,x_m)=0

we call the sequence a Cauchy sequence.

Cauchy sequences are useful in determining the completeness of a vector space. A vector space V is complete if every Cauchy sequence in V converges to an element of V. For example, the sequence \left \{ x_n \right \}_{n=1}^{\infty}, where x_n=\sum_{k=1}^{n}\frac{(-1)^{k+1}}{k}, is one of many Cauchy sequences of rational numbers in the rational number space \mathbb{Q}. However, the sequence converges to ln2, which is not an element of \mathbb{Q}. Therefore, \mathbb{Q} is not complete, and has “missing elements” or “gaps”, as compared to the real number space \mathbb{R}, which is complete.

 

Question

Show that \sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}=ln2.

Answer

If \left | x \right |<1, then (1-x)(1+x+x^{2}+\cdots)=1. So, \frac{1}{1-x}=1+x+x^{2}+\cdots or

\frac{1}{1-(-x)}=\frac{1}{1+x}=1-x+x^{2}+\cdots

Integrating the 2nd equality of the above equation on both sides gives

ln\left | 1+x \right |=x-\frac{x^{2}}{2}+\frac{x^{3}}{3}-\cdots=\sum_{k=1}^{\infty}(-1)^{k+1}\frac{x^{k}}{k}

Substituting x=1  in the above equation yields ln2=\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}.
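The convergence of the partial sums to ln2 is easy to see numerically. A minimal sketch, assuming NumPy:

```python
import numpy as np

k = np.arange(1, 100001)
partial_sums = np.cumsum((-1.0) ** (k + 1) / k)
print(partial_sums[-1], np.log(2))  # ~0.6931 in both cases
```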

 

A vector space with no “missing elements” is essential for scientists to formulate theories (e.g. kinematics) and to solve problems associated with those theories. Furthermore, the ability to compute limits in a complete vector space implies that we can apply calculus to solve problems defined by the space. For example, a complete inner product space called a Hilbert space, which will be discussed in the next article, is used to formulate the theories of quantum mechanics.

 

 


Inner product space

An inner product space is a vector space with an inner product.

An inner product is an operation that assigns a scalar to a pair of vectors \langle\boldsymbol{\mathit{u}}\vert\boldsymbol{\mathit{v}}\rangle, or a scalar to a pair of functions \langle f\vert g\rangle. The way to assign the scalar may be through the matrix multiplication of the pair of vectors, for instance

\langle\boldsymbol{\mathit{u}}\vert\boldsymbol{\mathit{v}}\rangle=\begin{pmatrix} u_{1}^{*} &u_{2}^{*}& \cdots &u_{N}^{*} \end{pmatrix}\begin{pmatrix} v_1\\v_2 \\ \vdots \\ v_N \end{pmatrix}=\sum_{i=1}^{N}u_{i}^{*}v_i\; \; \; \; \; \; \; \; 3

or it may be through an integral of the pair of functions:

\langle f\vert g\rangle=\int_{-\infty}^{\infty}f(x)^{*}g(x)dx

You may notice that eq3 resembles a dot product. The dot product pertains to vectors in \mathbb{R}^{3}, where \boldsymbol{\mathit{A}}\cdot\boldsymbol{\mathit{B}}=\sum_{i=1}^{3}A_iB_i, which can be extended to N-dimensions, where \langle\boldsymbol{\mathit{A}}\vert\boldsymbol{\mathit{B}}\rangle=\sum_{i=1}^{N}A_iB_i, and to include complex and real functions, \langle f\vert g\rangle=\int_{-\infty}^{\infty}f(x)^{*}g(x)dx. Therefore, an inner product is a generalisation of the dot product.
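Both forms of the inner product are straightforward to evaluate numerically. A minimal sketch, assuming NumPy; the vectors and functions below are arbitrary examples:

```python
import numpy as np

# Vector form: <u|v> = sum_i u_i* v_i (np.vdot conjugates its first argument)
u = np.array([1 + 1j, 2 - 1j])
v = np.array([3 + 0j, 1 + 2j])
print(np.vdot(u, v))  # (3+2j)

# Function form: <f|g> = int f(x)* g(x) dx, approximated on a grid
x = np.linspace(-10, 10, 100001)
f = np.exp(-x**2)
g = x * np.exp(-x**2)
print(np.trapz(np.conj(f) * g, x))  # ~0: f and g are orthogonal
```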

An inner product space has the following properties:

    1. Conjugate symmetry: \langle f\vert g\rangle=\langle g\vert f\rangle^{*}
    2. Additivity: \langle f+g\vert h\rangle=\langle f\vert h\rangle+\langle g\vert h\rangle
3. Positive semi-definiteness: \langle f\vert f\rangle\geq 0, with \langle f\vert f\rangle= 0 if and only if f=0

 

Question

i) Why is the inner product space positive semi-definite?
ii) Show that orthogonal vectors are linearly independent.
iii) Prove that \boldsymbol{\mathit{u}}\cdot\boldsymbol{\mathit{v}}=\left |\boldsymbol{\mathit{u}}\right |\left |\boldsymbol{\mathit{v}}\right |cos\,\theta, where \theta is the angle between the two vectors.

Answer

i) In a general vector space, the pairing \langle\boldsymbol{\mathit{a}}\vert\boldsymbol{\mathit{a}}\rangle could be positive or negative. The inner product space is defined such that \langle f\vert f\rangle\geq 0, with \langle f\vert f\rangle= 0 if and only if f=0, which is useful in quantum mechanics.

ii) Let the set of vectors \left \{ \boldsymbol{\mathit{v_k}} \right \} in eq1 be orthogonal vectors. The dot product of eq1 with \boldsymbol{\mathit{v_i}} gives c_i\boldsymbol{\mathit{v_i}}\cdot\boldsymbol{\mathit{v_i}} =c_i\left |\boldsymbol{\mathit{v_i}} \right |^{2}=0. Since the magnitudes of orthogonal vectors are non-zero, c_i=0. Hence, orthogonal vectors are linearly independent.

iii) Let’s consider two vectors \boldsymbol{\mathit{u}} and \boldsymbol{\mathit{v}} as position vectors starting from the origin. Then the vector \boldsymbol{\mathit{u}}-\boldsymbol{\mathit{v}} forms a triangle with them. According to the law of cosines, we have:

\left |\boldsymbol{\mathit{u}}-\boldsymbol{\mathit{v}}\right |^{2}=\left |\boldsymbol{\mathit{u}}\right |^{2}+\left |\boldsymbol{\mathit{v}}\right |^{2}-2\left |\boldsymbol{\mathit{u}}\right |\left |\boldsymbol{\mathit{v}}\right |cos\,\theta

Substituting \left |\boldsymbol{\mathit{u}}-\boldsymbol{\mathit{v}}\right |^{2}=(\boldsymbol{\mathit{u}}-\boldsymbol{\mathit{v}})\cdot(\boldsymbol{\mathit{u}}-\boldsymbol{\mathit{v}})=\left |\boldsymbol{\mathit{u}}\right |^{2}+\left |\boldsymbol{\mathit{v}}\right |^{2}-2\,\boldsymbol{\mathit{u}}\cdot\boldsymbol{\mathit{v}} into the above equation gives:

\boldsymbol{\mathit{u}}\cdot\boldsymbol{\mathit{v}}=\left |\boldsymbol{\mathit{u}}\right |\left |\boldsymbol{\mathit{v}}\right |cos\,\theta

which completes the proof.

 

Two functions (or two vectors) are orthogonal if \langle f\vert g\rangle= 0. Elements of a set of basis functions are orthonormal if \langle \phi_i\vert \phi_j\rangle=\delta_{ij} where

\delta_{ij}= \begin{cases} 1 & for\; \; i=j\\ 0 & for\; \; i\neq j \end{cases}

In other words, two functions (or two vectors) are orthonormal if they are orthogonal and normalised.

Finally, the norm (or length) of a vector \boldsymbol{\mathit{u}} is denoted by \left \|\boldsymbol{\mathit{u}}\right \| and is defined as \left \|\boldsymbol{\mathit{u}}\right \|=\sqrt{\langle\boldsymbol{\mathit{u}}\vert\boldsymbol{\mathit{u}}\rangle}=\sqrt{\left |\boldsymbol{\mathit{u}}\right |\left |\boldsymbol{\mathit{u}}\right |cos\: 0^{\circ} }=\left |\boldsymbol{\mathit{u}}\right |. With this association of inner product and the length of a vector, we can relate the inner product \langle\boldsymbol{\mathit{u}}\vert\boldsymbol{\mathit{v}}\rangle and the Euclidean distance d(\boldsymbol{\mathit{u}},\boldsymbol{\mathit{v}}) between 2 vectors \boldsymbol{\mathit{u}} and \boldsymbol{\mathit{v}}. Using the \mathbb{R}^{2} space as an example, where \boldsymbol{\mathit{u}}=\begin{pmatrix} 6\\9 \end{pmatrix} and \boldsymbol{\mathit{v}}=\begin{pmatrix} 3\\5 \end{pmatrix}, we have

d(\boldsymbol{\mathit{u}},\boldsymbol{\mathit{v}})=\left \|\boldsymbol{\mathit{u}}-\boldsymbol{\mathit{v}}\right \|=\sqrt{\langle\boldsymbol{\mathit{u}}- \boldsymbol{\mathit{v}}\vert\boldsymbol{\mathit{u}}- \boldsymbol{\mathit{v}}\rangle}=\sqrt{\langle\begin{pmatrix} 3\\4 \end{pmatrix}\vert \begin{pmatrix} 3\\4 \end{pmatrix}\rangle}

=\sqrt{\begin{pmatrix} 3 &4 \end{pmatrix}\begin{pmatrix} 3\\4 \end{pmatrix}}=5

Hence, the norm naturally comes from the inner product, i.e. every inner product space is a normed space, but not vice versa.

 
