In science, one tries to tell people something that no one ever
knew before, in such a way as to be understood by everyone.
But in poetry, it's the exact opposite. Paul
A.M. Dirac (1902-1984; Nobel 1933)
The surprising way quantum probabilities are obtained.
First, let's consider how probabilities are ordinarily computed:
When an event consists of two mutually exclusive events,
its probability is the sum of the probabilities of those two events.
Similarly, when an event is the conjunction
of two statistically independent events,
its probability is the product of the probabilities of those two events.
For example, if you roll a fair die,
the probability of obtaining a multiple of 3 is
1/3 = 1/6+1/6; it's the sum of the probabilities (1/6 each)
of the two mutually exclusive events "3" and "6".
You add probabilities when the
component events can't happen together (the outcome of the roll cannot be
both "3" and "6").
On the other hand, the probability of rolling two fair dice without obtaining a 6
is 25/36 = (5/6)(5/6); it's the product of the probabilities
(5/6 each) of two independent events,
each consisting of not rolling a 6 with each throw.
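The two dice examples above can be checked by brute-force enumeration. This is only a minimal sketch using exact fractions; the variable names are illustrative and not part of the original text.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Sum rule: "multiple of 3" is the union of the exclusive outcomes "3" and "6".
p_multiple_of_3 = Fraction(sum(1 for f in faces if f % 3 == 0), 6)

# Product rule: "no 6 in two throws" combines two independent events.
pairs = list(product(faces, repeat=2))
p_no_six = Fraction(sum(1 for a, b in pairs if a != 6 and b != 6), len(pairs))

print(p_multiple_of_3)  # 1/3
print(p_no_six)         # 25/36
```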
Quantum Logic and [Complex] Probability Amplitudes :
In the quantum realm, as long as two logical possibilities are not actually observed,
they can be neither exclusive nor independent
and the above does not apply.
Instead, quantum mechanical probability amplitudes are defined as
complex numbers whose
absolute values squared correspond to ordinary probabilities.
The phases (the angular directions) of such complex numbers
have no classical equivalents
(although they happen to provide a deep explanation for the existence
of the conserved classical quantity known as electric charge).
To obtain the amplitude of an event with two
unobserved logical components:
For EITHER-OR (exclusive) components,
the amplitudes are added.
For AND (independent) components,
the amplitudes are multiplied.
In practice, "AND components" are successive steps that
could logically lead to the desired
outcome, forming what's called an acceptable history for that outcome.
The "EITHER-OR components", whose amplitudes are to be added,
are thus all the possible histories logically leading up to the same outcome.
Following Richard Feynman,
the whole thing is therefore called a "sum over histories".
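As a toy illustration of the two rules, here is a sketch with two hypothetical two-step histories leading to the same outcome. The step amplitudes (modulus 1/sqrt(2), with a relative phase delta on one step) are assumptions chosen for the example, not values taken from the text.

```python
import cmath
import math

def history_amplitude(step_amplitudes):
    """AND rule: multiply the amplitudes of the successive steps of one history."""
    total = 1 + 0j
    for a in step_amplitudes:
        total *= a
    return total

def outcome_probability(histories):
    """EITHER-OR rule: add the amplitudes of all histories, then square the modulus."""
    return abs(sum(history_amplitude(h) for h in histories)) ** 2

def classical_probability(histories):
    """What the ordinary rules would give if each history were observed."""
    return sum(abs(history_amplitude(h)) ** 2 for h in histories)

def two_path_histories(delta):
    """Two hypothetical two-step histories differing by a relative phase delta."""
    s = 1 / math.sqrt(2)
    return [[s, s], [s, s * cmath.exp(1j * delta)]]

for delta in (0.0, math.pi / 2, math.pi):
    h = two_path_histories(delta)
    print(delta, outcome_probability(h), classical_probability(h))
```

With these assumed amplitudes, the quantum result swings between 0 and 1 as the phase varies, while the classical sum stays at 1/2.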
These algebraic manipulations are a mind-boggling substitute for statistical logic,
but that's the way the physical universe appears to work.
The above quantum logic normally applies only at the microscopic level,
where "observation" of individual components is either impossible or would
introduce an unacceptable disturbance.
At the macroscopic level, the observation of a combined outcome usually implies that
all relevant components are somehow "observed" as well (and the ordinary
algebra of probabilities applies).
For instance, in our dice examples, you cannot tell
if the outcome of a throw is a multiple of 3 unless you actually observe
the precise outcome and will thus know if it's a "3" or a "6",
or something else.
Similarly, to know that you haven't obtained a "6"
in a double throw, you must observe separately the outcome of each throw.
Surprisingly enough, when the logical components of an event are only imperfectly
observed (with some remaining uncertainty), the probability of the outcome
is somewhere between what the quantum rules say
and what the classical rules would predict.
(2007-07-19) On the "Statistics" of Elementary Particles
A direct consequence of quantum logic:
Pauli's Exclusion Principle
In very general terms, you may call "particle" some part of a quantum
system. Swapping (or switching) a pair of particles is making one
particle take the place of the other and vice versa, while leaving everything else unchanged.
Although swapping particles may deeply affect a quantum system,
swapping twice will certainly not change anything since, by definition,
this is like doing nothing at all.
"Swapping" can be defined as something that does nothing
if you do it twice.
Particles are defined to be "identical" if they can be swapped.
So, according to the above quantum logic,
the amplitude associated with one swapping
must have a square of 1.
Therefore (assuming that amplitudes are ordinary
complex numbers) the swapping amplitude is
either +1 or -1.
In the mathematical description of quantum states, swapping
is well-defined only for particles of
the same "nature". Whether swapping involves a multiplicative factor of
+1 or -1 depends on that "nature".
Particles for which swapping leaves the quantum state unchanged are called
bosons, those for which swapping negates the quantum state are called
fermions. A deep consequence of Special
Relativity is that spin
determines which "statistics" a given type of particle obeys
(Bose-Einstein statistics for bosons, Fermi-Dirac statistics for fermions).
Part of the angular momentum of a fermion can't be explained in classical terms
(it must include a nonorbital "pointlike" component).
The spin of a boson is a whole multiple of the quantum
of angular momentum
$h/2\pi$, whereas the spin of a fermion is
an odd multiple of the "half quantum" $h/4\pi$.
With the concepts so defined, let's consider a quantum state where two
fermions would be absolutely indistinguishable.
Not only would they be particles of the same kind (e.g., two electrons)
but they would have the same position, the same state of motion, etc.
So, the quantum state is clearly unchanged by swapping.
Yet, swapping fermions must negate the quantum state...
Therefore, it's equal to its own opposite and can only be zero !
The probability associated with a zero quantum state is zero;
this corresponds to something impossible. In other words,
two different fermions can't "occupy" the exact same state.
This result is called Pauli's exclusion principle.
It's the reason why all the electrons around a nucleus don't collapse
to the single state of lowest energy.
Instead, they occupy successively different "orbitals", according to rules
which explain the entire periodic table of the elements.
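The exclusion argument can be illustrated with a small sketch, assuming two fermions that each have a finite list of single-particle amplitudes (u and v are hypothetical examples). The antisymmetric two-particle state is negated by a swap and vanishes identically when both fermions are given the same state.

```python
def antisymmetrize(u, v):
    """Two-fermion amplitudes psi[a][b] = u[a]*v[b] - v[a]*u[b] (unnormalized)."""
    n = len(u)
    return [[u[a] * v[b] - v[a] * u[b] for b in range(n)] for a in range(n)]

def swap(psi):
    """Exchange the two particles: psi[a][b] -> psi[b][a]."""
    n = len(psi)
    return [[psi[b][a] for b in range(n)] for a in range(n)]

def norm_squared(psi):
    """Sum of the squared moduli of all amplitudes."""
    return sum(abs(x) ** 2 for row in psi for x in row)

u = [1, 0, 0]   # hypothetical single-particle state
v = [0, 1, 0]   # a different single-particle state

psi = antisymmetrize(u, v)
print(swap(psi) == [[-x for x in row] for row in psi])  # swapping negates the state
print(norm_squared(antisymmetrize(u, u)))               # same state twice: zero
```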
(2002-11-01) The Infamous Measurement Problem
What does a quantum observation entail?
There are no things, only processes.
David Bohm (1917-1992)
This is arguably the most fundamental
unsolved question in quantum mechanics.
According to the above, one should deal strictly
with amplitudes between observations (or measurements),
but another recipe holds when measurements are made.
That would be fine if we knew exactly what a measurement entails,
but we don't...
Should we really assume that a system can only be measured by some outside agency?
If we do, nothing prevents us from considering a larger system that includes this
observer as well, and that system's evolution would involve only
measurement-free quantum rules.
If we don't assume that, we can't avoid the conclusion that a system can
observe itself, in some obscure sense.
Either way, the simple quantum rules outlined above would have to be smoothly modified
to account for a behavior which can be nearly classical for a large enough system.
In other words, current quantum ideas must be incomplete, because they fail to describe
any bridge between a quantum system waiting only to be observed,
and an entity capable of observation.
Our current quantum description of the world has proven its worth and reigns supreme,
just like Newtonian mechanics reigned supreme
before the advent of Relativity Theory.
Relativity consistently bridged the gap between the slow and the fast,
the massive and the massless (while retaining the full applicability of
Newtonian theories to the domain of ordinary speeds).
Likewise, the gap must ultimately be bridged
between observer and observed,
between the large and the small,
between the classical world and the quantum realm,
for there is but one single physical reality in which everything
This bothers, or should bother, everybody who deals with quantum mechanics:
The so-called Schrödinger's Cat theme is often used to discuss the
problem, in the guise of a system that includes a cat (a "qualified" observer)
in the presence of a quantum device which could trigger a lethal device.
It seems silly to view the whole thing as a single quantum system,
which would only exist (until observed) in some
superposition of states,
where the cat would be neither dead nor alive, but both at once.
Something must exist which collapses the quantum state of a large enough system
frequently enough to make it appear "classical".
It stands to reason that Schrödinger's Cat must be dead
very shortly after being killed...
(2005-07-03) Matrix Mechanics (1925)
Physical quantities are multiplied like matrices... Order matters.
In 1925, Werner Heisenberg
(1901-1976; Nobel 1932)
discovered that observable physical quantities
obey noncommutative rules similar to those governing
the multiplication of algebraic matrices.
If the measurement of a physical quantity would disturb the measurement of the
other, then a noncommutative circumstance exists which disallows even the
possibility of two separate sets of experiments yielding
the values of these two quantities with arbitrary precision (read this again).
This delicate connection between
noncommutativity and uncertainty
is now known as Heisenberg's uncertainty principle.
In particular, the position and momentum of a particle can only be measured with respective
uncertainties (i.e., standard deviations in repeated experiments)
$\Delta x$ and $\Delta p_x$ satisfying the following inequality:
$$ \Delta x \; \Delta p_x \;\ge\; \frac{h}{4\pi} $$
The early development of Heisenberg's Matrix Mechanics
was undertaken by M. Born and P. Jordan.
In March 1926, Erwin
Schrödinger showed that Heisenberg's
viewpoint was equivalent to his own "undulatory" approach
(Wave Mechanics, January 1926) for which he
would share the 1933 Nobel prize with
Paul Dirac, who
gave basic Quantum Theory
its current form.
Heisenberg's Viewpoint [skip on first reading]
Here is a terse summary of Heisenberg's approach
in terms of the Schrödinger viewpoint which we adopt
here, following Dirac and almost all modern scholars:
In the modern Schrödinger-Dirac perspective, a
ket $|\psi\rangle$ is
introduced which describes a quantum state
varying with time.
Since it remains of unit length, its value at time t is obtained
from its value at time 0 via a unitary operator Û:
$$ |\psi_t\rangle \;=\; \hat{U}(t,0)\; |\psi_0\rangle $$
The unitary operator Û so defined is called the evolution operator.
Heisenberg's viewpoint consists in considering that a given system is represented
by the constant ket $|\psi_0\rangle$.
Operators are modified accordingly...
A physical quantity which is associated with the operator Â
in the Schrödinger viewpoint
(possibly constant with time) is then associated with the following
time-dependent operator in the Heisenberg viewpoint:
$$ \hat{U}^*(t,0)\; \hat{A}\; \hat{U}(t,0) $$
(2002-11-02) The Schrödinger Equation (1926)
The dance of a single nonrelativistic particle in a classical force field.
The Schrödinger equation governs the
probability amplitude $\psi$
of a particle of mass m and energy E
in a space-dependent potential energy V.
Strictly speaking, E is the total relativistic energy
(starting at $mc^2$ for the particle at rest).
However, the final stationary Schrödinger equation
(below) features only the
difference E-V with respect to the potential V,
which may thus be shifted to incorporate the rest energy
of a single particle.
For several particles, the issue cannot be skirted so easily
(in fact, it's partially unresolved) and it's one of several reasons
why the quantum study of multiple particles takes the form of an
inherently relativistic theory
(Quantum Field Theory)
which also accounts for the creation and annihilation of particles.
In 1926, when the Austrian physicist Erwin Schrödinger
(1887-1961; Nobel 1933)
worked out the equation now named after him, he thought that the relevant quantity
$\psi$ was something like a density of electric charge...
Instead, $\psi$
is now understood to be a probability amplitude, as
defined in the above article,
namely a complex number whose squared length is proportional to the probability
of actually finding the electron at a particular position in space.
That interpretation of $\psi$ was proposed by Max Born
(1882-1970; Nobel 1954)
the very person who actually coined the term quantum mechanics
(Max Born also happens to be the maternal grandfather of Olivia Newton-John).
The controversy about the meaning of $\psi$ hindered neither
the early development of Schrödinger's theory of "Wave Mechanics",
nor the derivation of the nonrelativistic equation at its core:
The nonrelativistic (defining) relations
$E = V + \tfrac{1}{2}\, m v^2$ and
$p = mv$ imply:
$$ p \;=\; \sqrt{2m\,(E - V)} $$
Therefore, the wave celerity $u = E/p$ is simply:
$$ u \;=\; \frac{E}{\sqrt{2m\,(E - V)}} $$
Now, the general 3-dimensional wave equation
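of some quantity $\varphi$ propagating
at celerity u is:
$$ \frac{1}{u^2}\,\frac{\partial^2 \varphi}{\partial t^2} \;=\; \frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial^2 \varphi}{\partial y^2} + \frac{\partial^2 \varphi}{\partial z^2} \;=\; \Delta\varphi $$
[$\Delta$ is the Laplacian operator]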
The standard way to solve this (mathematically)
is to first obtain solutions $\varphi$ which are
products of a time-independent space function $\psi$
by a sinusoidal function of the time (t) alone.
The general solution is simply a linear superposition of these
stationary waves:
$$ \varphi \;=\; \psi \, \exp( -2\pi i \nu t ) $$
For a frequency $\nu$, the stationary amplitude
$\psi$ thus defined must satisfy:
$$ \Delta\psi \;+\; \frac{4\pi^2\nu^2}{u^2}\; \psi \;=\; 0 $$
Using $\nu = E/h$
(Planck's formula) and the above for
$u = E/p$ we obtain...
The Schrödinger Equation :
$$ \Delta\psi \;+\; \frac{8\pi^2 m}{h^2}\,(E - V)\; \psi \;=\; 0 $$
This equation is best kept in its nonrelativistic context,
where it determines allowed levels of
energy up to an additive constant.
A frequency may only be associated with a Schrödinger solution
at energy E if E is the total relativistic energy (including rest energy)
and V has been adjusted accordingly, against the usual nonrelativistic freedom,
as discussed in this article's introduction.
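As a numerical sanity check of the stationary equation, here is a sketch for a case which is not discussed in the text: a particle confined to a one-dimensional box of width L with V = 0 inside (an assumption made for this example only). The textbook levels E = n²h²/(8mL²) and waves sin(nπx/L) should make the left-hand side vanish, up to the error of the finite-difference second derivative.

```python
import math

def box_energy(n, h=1.0, m=1.0, L=1.0):
    """Assumed energy levels of a 1D box: E_n = n^2 h^2 / (8 m L^2)."""
    return (n * h) ** 2 / (8 * m * L ** 2)

def box_wave(n, x, L=1.0):
    """Assumed stationary amplitude sin(n pi x / L), unnormalized."""
    return math.sin(n * math.pi * x / L)

def stationary_residual(n, x, h=1.0, m=1.0, L=1.0, dx=1e-4):
    """psi'' + (8 pi^2 m / h^2) (E - V) psi with V = 0, via central differences."""
    second = (box_wave(n, x + dx, L) - 2 * box_wave(n, x, L)
              + box_wave(n, x - dx, L)) / dx ** 2
    factor = 8 * math.pi ** 2 * m / h ** 2
    return second + factor * box_energy(n, h, m, L) * box_wave(n, x, L)

for n in (1, 2, 3):
    print(n, stationary_residual(n, 0.37))  # small compared to psi'' itself
```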
In the above particular stationary case, we have:
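$$ E\,\varphi \;=\; \frac{i h}{2\pi}\; \frac{\partial\varphi}{\partial t} $$
This relation turns the previous equation into
a more general linear equation :
$$ \frac{i h}{2\pi}\; \frac{\partial\varphi}{\partial t} \;=\; V \varphi \;-\; \frac{h^2}{8\pi^2 m}\; \Delta\varphi $$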
Signed Energy and the Arrow of Time
Historically, Erwin Schrödinger associated an equally valid
stationary function with the positive (relativistic) energy
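$E = h\nu$ and obtained a
different equation :
$$ \varphi \;=\; \psi \, \exp( 2\pi i \nu t ) $$
$$ -\frac{i h}{2\pi}\; \frac{\partial\varphi}{\partial t} \;=\; V \varphi \;-\; \frac{h^2}{8\pi^2 m}\; \Delta\varphi $$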
Formally, a reversal of the direction of time turns one equation into the other.
We may also allow negative energies and/or frequencies in
Planck's formula $E = h\nu$
and observe that a particle may be described by the same wave function
whether it carries energy E in one direction of time, or energy
-E in the other.
To retain only one version of the Schrödinger equation and
one arrow of time (the term was coined by Eddington)
we must formally allow particles to carry a signed energy
(typically, $E = \pm\, mc^2$).
If the wave function $\varphi$
is a solution of one version of the Schrödinger equation, then its
conjugate $\varphi^*$ is a solution of the other.
However, time-reversal and conjugation need not result in the same wave
function whenever Schrödinger's equation has
more than one solution at a given energy.
Principle of Superposition :
The linearity of Schrödinger's equation means that
any sum of satisfactory solutions is also a solution.
This principle of superposition justifies the
general Hilbert space formalism introduced by Dirac:
Until it is actually measured,
a quantum state may contain (as a linear superposition)
several acceptable realities at once.
This is, of course, mind-boggling.
Schrödinger and many others have argued that
this cannot be entirely true:
Something in the ultimate quantum rules must
escape any linear description to defeat this
principle of superposition, which is unacceptable
as an overall rule for everything observed and anything
(2003-05-26) Noether's Theorem (1915)
The German mathematician Emmy Noether (1882-1935)
established this deep result (Noether's Theorem) in 1915:
For every continuous symmetry of the laws of physics, there's a conservation law,
and vice versa.
This result was first established in the context of classical
rational mechanics but it remains
true (and even more meaningful) in the quantum realm.
(2005-06-27) Hilbert Spaces:
Dirac's <bras| and |kets>
A nice notation with built-in simplification features.
The standard vocabulary for the Hilbert spaces used in quantum mechanics
started out as a pun:
P.A.M. Dirac (1902-1984; Nobel 1933)
decided to call $\langle\varphi|$ a bra
and $|\psi\rangle$ a ket, because
$\langle\varphi|\psi\rangle$
is clearly a bracket...
Hilbert Space and "Hilbertian Basis" :
A Hilbert space
is a vector space over the field of complex numbers
(its elements are called kets ) endowed with an inner hermitian
product (Dirac's "bracket", of which the left half is a "bra").
That's to say that the following properties hold
(z* being the complex conjugate of z):
$\langle\psi|\varphi\rangle \;=\; \langle\varphi|\psi\rangle^*$ is a complex scalar.
$\langle\varphi|\,\big(\, x\,|\xi\rangle + y\,|\psi\rangle \,\big) \;=\; x\,\langle\varphi|\xi\rangle + y\,\langle\varphi|\psi\rangle$
For any nonzero ket $|\psi\rangle$,
the real $\langle\psi|\psi\rangle$ is positive
($= \|\psi\|^2$).
A Hilbert space is also required to be separable,
which implies that its dimension is either finite or countably infinite.
It's customary to use raw indices for the kets of an agreed-upon
Hilbertian basis :
| 1 >, | 2 >, | 3 >, | 4 > ...
Such a "basis" is a maximal
set of unit kets which are pairwise orthogonal :
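$\langle i|i\rangle = 1$ and $\langle i|j\rangle = 0$
if $i \ne j$
The so-called closure relation $\hat{I} = \sum_n |n\rangle\langle n|$
is a nice way to state that any ket is a
generalized linear combination of kets from the "basis":
$$ |\psi\rangle \;=\; \hat{I}\,|\psi\rangle \;=\; \sum_n |n\rangle\langle n|\psi\rangle \;=\; \sum_n \langle n|\psi\rangle\; |n\rangle $$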
This need not be a proper linear combination,
since infinitely many of the coefficients
$\langle n|\psi\rangle$
could be nonzero: An
Hilbertian basis is not a proper linear basis unless it's finite
(cf. Hamel basis).
A linear operator is a
square matrix $\hat{A} = [\, a_{ij} \,]$
which we may express as:
$$ \hat{A} \;=\; \sum_{i,j} a_{ij}\; |i\rangle\langle j| \qquad \text{where} \qquad a_{ij} \;=\; \langle i|\hat{A}|j\rangle $$
To the left of a ket or the right of a
bra, Â yields another like vector.
Hermitian conjugation generalizes to vectors and operators the
complex conjugation of scalars.
We prefer to use the same notation X* for the hermitian conjugate
of any object X, regardless of its dimension.
We use interchangeably the terms which other authors prefer to
use for specific dimensions, namely "conjugate" for scalars,
"dual" for vectors (bras and kets) and "adjoint" for operators
(the adjugate of a matrix is something else).
Many authors (especially in quantum theory)
use an overbar for the conjugate of a scalar and an obelisk
("dagger") for the adjoint $A^\dagger$
of an operator A.
In other words,
$A^\dagger \equiv A^*$
Loosely speaking, conjugation consists in replacing all coordinates by
their complex conjugates and
transposing (i.e., flipping about the main diagonal).
The conjugate transpose is also called adjoint, Hermitian adjoint,
Hermitian transpose, Hermitian conjugate, etc.
The word conjugate can also be used by itself,
since conjugation of the complex coordinates
of a vector or matrix is rarely used, if ever, without a simultaneous transposition.
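$$ |\psi\rangle^* = \langle\psi| \qquad \langle\psi|^* = |\psi\rangle \qquad \langle\varphi|\hat{A}^*|\psi\rangle = \big(\, \langle\psi|\hat{A}|\varphi\rangle \,\big)^* $$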
The adjoint of a product is the product of the adjoints in reverse order.
For an inner product, this restates
the axiomatic hermitian symmetry.
$$ ( X\,Y )^* = Y^* X^* \qquad \langle\psi|\varphi\rangle^* = \langle\varphi|\psi\rangle $$
An operator Â is self-adjoint or
hermitian if Â = Â*.
All eigenvalues of an hermitian operator are real.
That key theorem was established in 1855 by Charles
Hermite (1822-1901, X1842)
when he introduced the relevant concepts now named after him:
hermitian conjugation, hermitian symmetry, etc.
Two eigenvectors of an hermitian operator
for distinct eigenvalues are necessarily orthogonal
(see proof below).
In finitely many dimensions, such operators are diagonalizable.
An hermitian operator multiplied by a real
scalar is hermitian.
So is a sum of hermitian operators,
or the product of two commuting hermitian operators.
The following combinations of two hermitian operators are always hermitian:
$$ \hat{A}\hat{E} + \hat{E}\hat{A} \qquad \text{and} \qquad i\,(\, \hat{A}\hat{E} - \hat{E}\hat{A} \,) $$
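These algebraic rules are easy to test on small matrices. The sketch below uses two arbitrary 2 by 2 hermitian matrices (A and E are made-up examples) to check the reversal rule for the adjoint of a product and the hermitian symmetry of the anticommutator and of i times the commutator. Plain nested lists are used so that the example depends on nothing but the standard library.

```python
def dagger(M):
    """Hermitian conjugate: transpose and conjugate every entry."""
    rows, cols = len(M), len(M[0])
    return [[M[r][c].conjugate() for r in range(rows)] for c in range(cols)]

def matmul(X, Y):
    """Ordinary matrix product of nested lists."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def combine(X, Y, s):
    """Entrywise X + s*Y."""
    return [[a + s * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(s, X):
    return [[s * a for a in row] for row in X]

def is_hermitian(M):
    return M == dagger(M)

A = [[1, 2 - 1j], [2 + 1j, -3]]   # hypothetical hermitian matrix
E = [[0, 1j], [-1j, 2]]           # another hypothetical hermitian matrix

AE, EA = matmul(A, E), matmul(E, A)
anticommutator = combine(AE, EA, +1)
commutator = combine(AE, EA, -1)

print(dagger(AE) == matmul(dagger(E), dagger(A)))  # (XY)* = Y* X*
print(is_hermitian(anticommutator))                # AE + EA is hermitian
print(is_hermitian(scale(1j, commutator)))         # i(AE - EA) is hermitian
print(is_hermitian(commutator))                    # AE - EA alone is not
```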
Unitary Transformations Preserve Length :
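A unitary operator Û is a Hilbert isomorphism:
$\hat{U}\hat{U}^* = \hat{U}^*\hat{U} = \hat{I}$.
It transforms an Hilbertian basis
into another Hilbertian basis and turns
$|\psi\rangle$,
$\langle\varphi|$ and
Â (respectively) into
$\hat{U}\,|\psi\rangle$,
$\langle\varphi|\,\hat{U}^*$ and
$\hat{U}\hat{A}\hat{U}^*$.
For an infinitesimal $\varepsilon$,
$\hat{U} = \hat{I} + i\,\varepsilon\,\hat{E}$
is unitary (only) when Ê is hermitian.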
State Vectors, Observables and the Measurement Postulate :
A quantum state, state vector, or microstate is a ket
$|\psi\rangle$ of unit length :
$$ \langle\psi|\psi\rangle \;=\; 1 $$
Such a ket $|\psi\rangle$
is associated with the density operator
$|\psi\rangle\langle\psi|$
(whose entropy is zero) which determines it back,
within some physically irrelevant phase factor $\exp(i\theta)$.
An observable physical quantity corresponds to an hermitian
operator Â whose eigenvalues are the possible values of a
measurement. The average value of a measurement of
Â from a pure microstate
$|\psi\rangle$ is:
$$ \langle\psi|\hat{A}|\psi\rangle $$
This is a corollary of the following measurement postulate
(von Neumann's projection postulate)
which states the consequence of a measurement,
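in terms of the eigenspace projector $\hat{P}_a$ matching each possible outcome
(necessarily an eigenvalue $a$
of $\hat{A} = \sum_a a\,\hat{P}_a$):
$$ |\psi\rangle \;\text{ becomes }\; \frac{\hat{P}_a\,|\psi\rangle}{\|\,\hat{P}_a\,|\psi\rangle\,\|} \;\text{ with probability }\; \langle\psi|\hat{P}_a|\psi\rangle $$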
The above is also often called the
principle of spectral decomposition.
Note that, since $P^2 = P = P^*$,
$$ \|\, P\,|\psi\rangle \,\|^2 \;=\; \langle\psi|P|\psi\rangle $$
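The postulate can be tried out on the smallest possible example. The sketch below assumes a two-valued observable with eigenvalues +1 and -1 whose projectors are diagonal, and an arbitrary unit ket (0.6, 0.8i). These choices are purely illustrative.

```python
def inner(u, v):
    """Bracket <u|v> for kets stored as lists of complex numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    """Image of the ket v under the operator M."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def measure(projectors, psi):
    """Map each eigenvalue a to (probability, collapsed ket), per the projection postulate."""
    outcomes = {}
    for a, P in projectors.items():
        projected = apply(P, psi)
        prob = inner(psi, projected).real      # <psi| P_a |psi>
        if prob > 0:
            norm = prob ** 0.5                 # ||P_a psi||, since P^2 = P = P*
            collapsed = [c / norm for c in projected]
        else:
            collapsed = None                   # this outcome never occurs
        outcomes[a] = (prob, collapsed)
    return outcomes

projectors = {+1: [[1, 0], [0, 0]], -1: [[0, 0], [0, 1]]}   # assumed eigenspace projectors
observable = [[1, 0], [0, -1]]                               # sum of a * P_a
psi = [0.6, 0.8j]                                            # hypothetical unit ket

outcomes = measure(projectors, psi)
average = sum(a * p for a, (p, _) in outcomes.items())
direct = inner(psi, apply(observable, psi)).real             # <psi| A |psi>
print(outcomes[+1][0], outcomes[-1][0], average, direct)
```

The weighted average of the outcomes agrees with the bracket computed directly, which is the corollary stated above.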
The principle of quantization limits the observed
values of a physical quantity to the eigenvalues of
its associated operator.
The principle of superposition
asserts that a pure quantum state is represented by a ket...
A quantum state represented by an eigenvector of an observable
is called an eigenstate. It always
yields the same measurement of that observable.
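Two kets $|\psi\rangle$ and $|\varphi\rangle$ that are eigenstates of an hermitian
operator Â associated with distinct
eigenvalues a and
b are necessarily orthogonal.
Proof : If
$\hat{A}\,|\psi\rangle = a\,|\psi\rangle$ and
$\langle\varphi|\,\hat{A} = b\,\langle\varphi|$ with
$a \ne b$, then we have:
$\langle\varphi|\hat{A}|\psi\rangle = a\,\langle\varphi|\psi\rangle = b\,\langle\varphi|\psi\rangle$.
Therefore, $\langle\varphi|\psi\rangle = 0$.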
Nonrelativistic Postulate of Evolution with Time :
In nonrelativistic quantum theory, time (t) is not an observable in the
above sense, but a parameter with which things
evolve between measurements,
according to the following substitute for Schrödinger's equation,
involving the hamiltonian operator $\hat{H}$
(associated with the system's total energy) :
$$ \frac{i h}{2\pi}\; \frac{d}{dt}\, |\psi\rangle \;=\; \hat{H}\, |\psi\rangle $$
This is completely wrong unless Hamiltonians
are properly adjusted to incorporate rest energies
(see our discussion of Schrödinger's equation).
(2005-07-03) Operators Corresponding to Physical Quantities
Building on 6 operators for the
coordinates of position and momentum.
Only scalar physical quantities correspond to basic
observables (hermitian square matrices)
within the relevant Hilbert space L.
Physical vectors may also be considered, which
correspond to operators mapping a ket into a vector of kets
(an element of some cartesian power of L ).
The following table embodies the so-called
principle of correspondence,
for those physical quantities which have a classical equivalent...
The orbital angular momentum of a
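pointlike particle does; its spin does not.
Position along x: the classical coordinate $x$ corresponds to multiplication by $x$.
Momentum along x: the classical $p_x$ corresponds to $\frac{h}{2\pi i}\,\frac{\partial}{\partial x}$.
Energy: the classical $E = V(\mathbf{r}) + \|\mathbf{p}\|^2 / 2m$ corresponds to $V(\mathbf{r}) - \frac{h^2}{8\pi^2 m}\,\Delta$.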
The commutator of two operators
A and B is :
[A,B] = AB - BA.
It's worth noting that if A and B are hermitian,
then so is i[A,B].
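According to the above correspondence,
the commutator of the two operators respectively associated with
the position x and the momentum
$p_x$ along the same axis is the operator for
which the image of $\psi$ is:
$$ x\; \frac{h}{2\pi i}\, \frac{\partial\psi}{\partial x} \;-\; \frac{h}{2\pi i}\, \frac{\partial (x\psi)}{\partial x} \;=\; \frac{i h}{2\pi}\; \psi $$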
That commutator is thus
$(\, i h / 2\pi \,)\; \hat{I}$
(where Î is the identity operator).
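This commutator can be verified symbolically on polynomials, represented here as lists of coefficients in increasing degree. The sketch works in units where h/2π = 1 (an assumption made for brevity), so that the momentum operator is -i d/dx and the expected result is i times the original polynomial.

```python
def times_x(poly):
    """Multiply a polynomial (coefficients in increasing degree) by x."""
    return [0] + list(poly)

def momentum(poly):
    """Apply p = -i d/dx, i.e. (h / 2 pi i) d/dx in units where h/2pi = 1."""
    return [-1j * k * c for k, c in enumerate(poly)][1:]

def subtract(p, q):
    """Coefficient-wise difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def commutator_x_p(poly):
    """Image of poly under [x, p] = x p - p x."""
    return subtract(times_x(momentum(poly)), momentum(times_x(poly)))

psi = [2, -1, 0, 3]          # hypothetical test function 2 - x + 3 x^3
print(commutator_x_p(psi))   # equals i * psi, coefficient by coefficient
```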
Similarly, we obtain the following expression for
the commutator of the operators Âx
and Ây associated with components
$L_x$ and $L_y$
of the orbital angular momentum:
$$ [\, \hat{A}_x , \hat{A}_y \,] \;=\; \frac{i h}{2\pi}\; \hat{A}_z $$
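Let's evaluate $\hat{A}_x(\hat{A}_y(\psi))$, using
$\hat{A}_x = \frac{h}{2\pi i}\,(\, y\,\partial/\partial z - z\,\partial/\partial y \,)$ and
$\hat{A}_y = \frac{h}{2\pi i}\,(\, z\,\partial/\partial x - x\,\partial/\partial z \,)$ :
$$ \hat{A}_x(\hat{A}_y(\psi)) \;=\; \Big(\frac{h}{2\pi i}\Big)^2 \Big(\, y\, \frac{\partial}{\partial z} \Big[\, z\, \frac{\partial\psi}{\partial x} - x\, \frac{\partial\psi}{\partial z} \,\Big] \;-\; z\, \frac{\partial}{\partial y} \Big[\, z\, \frac{\partial\psi}{\partial x} - x\, \frac{\partial\psi}{\partial z} \,\Big] \,\Big) $$
$$ =\; \Big(\frac{h}{2\pi i}\Big)^2 \Big(\, y\, \frac{\partial\psi}{\partial x} \;+\; \text{second-order terms} \,\Big) $$
All the second-order terms also appear in the like expression for
$\hat{A}_y(\hat{A}_x(\psi))$ (which is obtained by swapping x and y).
So, they cancel in the difference:
$$ [\, \hat{A}_x \hat{A}_y - \hat{A}_y \hat{A}_x \,]\; \psi \;=\; \Big(\frac{h}{2\pi i}\Big)^2 \Big(\, y\, \frac{\partial\psi}{\partial x} - x\, \frac{\partial\psi}{\partial y} \,\Big) \;=\; \frac{i h}{2\pi}\; \hat{A}_z\, \psi $$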
For the 3-component column operator Â (of components Âx, Ây and Âz)
associated with the ("orbital") angular momentum
L, this can be summarized:
$$ \hat{A} \times \hat{A} \;=\; \frac{i h}{2\pi}\; \hat{A} $$
Algebraic Rules for Commutators :
A few general relations hold about commutators, which are easily verified :
$$ [A, B+C] \;=\; [A,B] + [A,C] $$
$$ [A, BC] \;=\; [A,B]\,C + B\,[A,C] $$
$$ [A,[B,C]] + [B,[C,A]] + [C,[A,B]] \;=\; \hat{O} $$
This last relation is known as the Jacobi identity.
It's one of two relations a bilinear map
must satisfy to be called a Lie bracket
(the other, which is also satisfied here, is simply:
[A,A] = Ô ).
The commutator bracket thus endows the vector space of quantum operators
with the structure of a Lie algebra.
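Such identities hold for arbitrary square matrices, so they can be spot-checked with integer arithmetic. The following sketch uses three made-up 2 by 2 matrices. Agreement on one example is of course no proof, just a quick safeguard against sign errors.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(XY, YX)]

A = [[1, 2], [3, 4]]     # arbitrary example matrices
B = [[0, 1], [-1, 5]]
C = [[2, -3], [7, 1]]

jacobi = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))), comm(C, comm(A, B)))
product_left = comm(A, matmul(B, C))
product_right = add(matmul(comm(A, B), C), matmul(B, comm(A, C)))

print(jacobi)                         # the zero matrix
print(product_left == product_right)  # [A,BC] = [A,B]C + B[A,C]
```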
The following relation holds for two operators whose commutator
is a scalar times Î (or, at least, if
that commutator commutes with the operator B ).
$$ [\, A, f(B) \,] \;=\; [A,B]\; f'(B) $$
where f is an analytic function,
of derivative f '.
The relation being linear with respect to f,
it holds generally if it holds for
$f(z) = z^n$...
The case n = 0 is trivial
(zero on both sides) and an induction on n completes the proof:
$$ [A, B^{n+1}] \;=\; [A,B]\,B^n + B\,[A,B^n] \;=\; [A,B]\,B^n + n\,B\,[A,B]\,B^{n-1} \;=\; (n+1)\,[A,B]\,B^n $$
(2005-07-03) Noncommutativity and Uncertainty Relations
The link between commutators and expected standard deviations.
When two observables A and B
are repeatedly measured from the same quantum state
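$|\psi\rangle$
the expected standard deviations are $\Delta a$ and $\Delta b$:
$$ (\Delta a)^2 \;=\; \langle\psi|A^2|\psi\rangle \;-\; \langle\psi|A|\psi\rangle^2 $$
$$ (\Delta b)^2 \;=\; \langle\psi|B^2|\psi\rangle \;-\; \langle\psi|B|\psi\rangle^2 $$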
The following inequality then holds
( Heisenberg's uncertainty relation ).
$$ \Delta a \; \Delta b \;\ge\; \tfrac{1}{2}\; \big|\, \langle\psi|\,[A,B]\,|\psi\rangle \,\big| $$
Assuming, without loss of generality, that both
observables have zero averages (so the trailing terms
vanish in the above defining equations) this may be
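identified as a type of Schwarz inequality, which may be proved
with the remark that the following quantity is nonnegative
for any real number x :
$$ \|\, ( A + i x B )\, |\psi\rangle \,\|^2 \;=\; \langle\psi|\, ( A - i x B )\, ( A + i x B )\, |\psi\rangle $$
$$ =\; \langle\psi|\, (\, A^2 + x^2 B^2 + i x\, AB - i x\, BA \,)\, |\psi\rangle $$
$$ =\; x^2\, (\Delta b)^2 \;+\; x\; \langle\psi|\, i\,[A,B]\, |\psi\rangle \;+\; (\Delta a)^2 $$
So, the discriminant of this real
quadratic function of x
can't be positive. That is,
$\langle\psi|\, i\,[A,B]\, |\psi\rangle^2 \;\le\; 4\, (\Delta a)^2\, (\Delta b)^2$.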
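The inequality can be probed numerically with a pair of noncommuting 2 by 2 observables. The sketch below assumes the standard Pauli matrices for x and y as the two observables (a choice made for this example) and evaluates the gap between the product of standard deviations and the bound for a few arbitrary states. The gap should never be negative, apart from rounding noise.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(XY, YX)]

def expect(M, psi):
    """<psi| M |psi> for a normalized ket psi."""
    Mpsi = [sum(M[r][c] * psi[c] for c in range(len(psi))) for r in range(len(M))]
    return sum(a.conjugate() * b for a, b in zip(psi, Mpsi))

def spread(M, psi):
    """Standard deviation sqrt(<M^2> - <M>^2), clipped at zero against rounding."""
    variance = (expect(matmul(M, M), psi) - expect(M, psi) ** 2).real
    return math.sqrt(max(variance, 0.0))

def normalize(psi):
    norm = math.sqrt(sum(abs(c) ** 2 for c in psi))
    return [c / norm for c in psi]

def uncertainty_gap(A, B, psi):
    """Product of spreads minus half the modulus of <psi|[A,B]|psi>."""
    bound = 0.5 * abs(expect(commutator(A, B), psi))
    return spread(A, psi) * spread(B, psi) - bound

for raw in ([1, 0], [1, 1], [3, 4j], [2 - 1j, 0.5 + 3j]):
    print(raw, uncertainty_gap(SIGMA_X, SIGMA_Y, normalize(raw)))
```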
As we have established that
the observables for the position and momentum along the
same axis yield a commutator equal to
$(\, i h / 2\pi \,)\; \hat{I}$, we obtain:
$$ \Delta x \; \Delta p_x \;\ge\; \frac{h}{4\pi} $$
Contrary to popular belief, the above doesn't simply state that two quantities
can't be pinpointed simultaneously (supposedly because "measuring one would
disturb the other").
Instead, it expounds that no experiments can be made on
identically prepared systems to determine separately both quantities
with arbitrary precision... At least whenever the following noncommutative
condition holds:
$$ \langle\psi|\, AB\, |\psi\rangle \;\ne\; \langle\psi|\, BA\, |\psi\rangle $$
For a given quantum state, the uncertainty in the measurement of the momentum
along x always has some definite nonzero value. No experiment can be devised
which could achieve a better precision, even if the experimenter does not
care at all about estimating the position along x.
Conversely, for that same quantum state, there's a definite limit on the precision
with which the position x can be determined, even if we do not care at all
about the momentum along x.
What Heisenberg's uncertainty relation specifies is that
no quantum state exists for which the product of
those two separate uncertainties is below $h/4\pi$.
This has absolutely nothing to do with one type of measurement
"disturbing" the other...
It's true that several measurements disturb each other,
but it's a completely different issue
(e.g., a precise momentum measurement
may leave the system in a new quantum state where the inherent
uncertainty in position may very well be much greater than originally).
The uncertainty principle goes much deeper than that.
In particular, it says that there's no way to create a perfectly focused beam
of identical particles with the same lateral velocity.
Even if you measure only either
the lateral position or the lateral momentum of any given particle
from the beam, your many measurements of both quantities will feature
standard deviations which cannot be better than what's imposed by the above
uncertainty relation. That's the way it is.
(2012-07-10) Transverse Certainties
Physical quantities whose commutator is a scalar
(i.e., the identity operator multiplied by some complex number)
are said to be conjugate of each other and the
dispersion in the measurement of one is inversely proportional to
the dispersion in the measurement of the other.
This is illustrated by the position and the momentum of a particle
along the same axis.
Conversely, when the observables commute, the eigenstate of one is
an eigenstate of the other and both quantities can be measured
simultaneously, without any dispersion, for all possible
values of either quantity.
Otherwise, some quantum states are eigenstates of one observable
but not the other, while others may be eigenstates of both.
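For example, the magnitude of the momentum (but not its direction)
can be measured with zero dispersion if the particle is found to
be at a location where the magnitude $|\psi|$
of the wave function is either zero or maximum:
$$ \psi^*\, \frac{\partial\psi}{\partial x} \;=\; \psi^*\, \frac{\partial\psi}{\partial y} \;=\; \psi^*\, \frac{\partial\psi}{\partial z} \;=\; 0 $$
That's because the commutator between the operators associated with the coordinate
position x and $\|p\|^2$
vanishes at such positions (the same being true for other coordinates):
$$ [\, x, \|p\|^2 \,]\; |\psi\rangle \;=\; x\, \Big({-\frac{h^2}{4\pi^2}}\Big)\, \Delta\psi \;-\; \Big({-\frac{h^2}{4\pi^2}}\Big)\, \Delta( x\psi ) \;=\; \frac{h^2}{2\pi^2}\; \frac{\partial\psi}{\partial x} $$
The corresponding local term in $\langle\psi|\, [\, x, \|p\|^2 \,]\, |\psi\rangle$ is
$\frac{h^2}{2\pi^2}\; \psi^*\, \partial\psi / \partial x$.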
(2007-07-16) Orbital Angular Momentum and Spin
Spin is a form of angular momentum without a classical equivalent.
The following argument was fully developed by
Elie Cartan (1869-1951) in 1913
from a purely geometrical standpoint (not involving Planck's
constant as such) as he investigated the
Lie algebra of the group of
three-dimensional rotations. Cartan thus demonstrated, ahead of
his time, how the idea of quantized spin is a consequence of
The pioneers of quantum mechanics rediscovered those things in the 1920's.
In 1935, Cartan himself published a remarkable textbook on
his Theory of Spinors.
Let's investigate the properties of a vectorial
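observable Â which satisfies the fundamental property established above
in the case of the quantum operator associated with a classical
(orbital) angular momentum, namely:
$$ \hat{A} \times \hat{A} \;=\; \frac{i h}{2\pi}\; \hat{A} $$
This pretty equation is merely a mnemonic for 3 commutation relations
between the components Âx, Ây and Âz:
$$ [\, \hat{A}_y , \hat{A}_z \,] = \frac{i h}{2\pi}\, \hat{A}_x \qquad [\, \hat{A}_z , \hat{A}_x \,] = \frac{i h}{2\pi}\, \hat{A}_y \qquad [\, \hat{A}_x , \hat{A}_y \,] = \frac{i h}{2\pi}\, \hat{A}_z $$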
The 3 components
Âx , Ây
and Âz are
scalar observables (i.e., square matrices with hermitian symmetry).
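We introduce another such observable:
$$ \hat{A}^2 \;=\; \hat{A}_x^2 + \hat{A}_y^2 + \hat{A}_z^2 $$
It commutes with Âz. Indeed, the square of Âz commutes with Âz and the above relations yield:
$$ [\, \hat{A}_x^2 , \hat{A}_z \,] \;=\; \hat{A}_x\, [\hat{A}_x , \hat{A}_z] + [\hat{A}_x , \hat{A}_z]\, \hat{A}_x \;=\; -\frac{i h}{2\pi}\, (\, \hat{A}_x \hat{A}_y + \hat{A}_y \hat{A}_x \,) $$
$$ [\, \hat{A}_y^2 , \hat{A}_z \,] \;=\; \hat{A}_y\, [\hat{A}_y , \hat{A}_z] + [\hat{A}_y , \hat{A}_z]\, \hat{A}_y \;=\; +\frac{i h}{2\pi}\, (\, \hat{A}_y \hat{A}_x + \hat{A}_x \hat{A}_y \,) $$
Therefore, those two things add up to zero and we obtain:
$$ [\, \hat{A}^2 , \hat{A}_z \,] \;=\; 0 $$
The above definition of $\hat{A}^2$ ensures
that $\langle\psi|\, \hat{A}^2\, |\psi\rangle$
is nonnegative for any ket $|\psi\rangle$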
(HINT: this is the sum of 3 real squares).
Therefore, this operator can only have nonnegative eigenvalues, which
(for the sake of future simplicity)
we may as well put in the following form, for some nonnegative number j.
$$ j\,(j+1)\; (h/2\pi)^2 $$
The punch line will be that j is restricted
to integer or half-integer values.
For now however, we may just accept this expression
because it spans all nonnegative values
once and only once when j goes from zero to infinity.
So, j can be used to index every eigenvalue of $\hat{A}^2$.
Similarly, we may use another index m to identify the
eigenvalue $m\,(h/2\pi)$
of Âz . For now,
nothing is assumed about m (we'll show
later that 2m is an integer).
Since those two observables commute, there's an orthonormal basis
consisting entirely of eigenvectors common to both of them.
We may specify it by introducing a third index n (needed
to distinguish between kets having identical eigenvalues
for both of our observables).
Those conventions are summarized by the following relations,
which clarify the notation used for base kets:
$$ \hat{A}^2\; |n, j, m\rangle \;=\; j\,(j+1)\, (h/2\pi)^2\; |n, j, m\rangle $$
$$ \hat{A}_z\; |n, j, m\rangle \;=\; m\, (h/2\pi)\; |n, j, m\rangle $$
To determine the restrictions that j and m must obey,
we introduce two non-hermitian operators,
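conjugate of each other:
$$ \hat{A}_- \;=\; \hat{A}_x - i\,\hat{A}_y \qquad \hat{A}_+ \;=\; \hat{A}_x + i\,\hat{A}_y $$
They are collectively known as ladder operators
and are respectively called lowering operator
(or annihilation operator)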
and raising operator (or creation operator)
because it turns out that each transforms an eigenvector into
another eigenvector corresponding to a lesser or greater eigenvalue, respectively.
Both commute with Â2
(because Âx and Ây do).
The following holds:
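$$ \|\, \hat{A}_+\, |n, j, m\rangle \,\|^2 \;=\; \langle n, j, m|\, \hat{A}_- \hat{A}_+\, |n, j, m\rangle $$
$$ \hat{A}_- \hat{A}_+ \;=\; \hat{A}_x^2 + \hat{A}_y^2 + i\, [\, \hat{A}_x , \hat{A}_y \,] \;=\; \hat{A}^2 - \hat{A}_z^2 - (h/2\pi)\, \hat{A}_z $$
$$ \|\, \hat{A}_+\, |n, j, m\rangle \,\|^2 \;=\; [\; j\,(j+1) - m^2 - m \;]\; (h/2\pi)^2 $$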
As the nonnegative square bracket is equal to
j (j+1) - m(m+1)
we see that m cannot exceed j.
We would find that (-m) cannot exceed j
by performing the same computation for
$\|\, \hat{A}_-\, |n, j, m\rangle \,\|$. All told:
$$ -j \;\le\; m \;\le\; j $$
Note that the above also proves that the ket
$\hat{A}_+\, |n, j, m\rangle$ vanishes only when m = j.
Likewise, $\hat{A}_-\, |n, j, m\rangle$ is nonzero unless m = -j.
Except in the cases where they vanish,
such kets are eigenvectors of Âz
associated with the eigenvalue of index
m ± 1.
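Let's prove that:
$$ [\, \hat{A}_z , \hat{A}_\pm \,] \;=\; [\hat{A}_z , \hat{A}_x] \pm i\, [\hat{A}_z , \hat{A}_y] \;=\; \frac{i h}{2\pi}\, \hat{A}_y \pm \frac{h}{2\pi}\, \hat{A}_x \;=\; \pm \frac{h}{2\pi}\; \hat{A}_\pm $$
So, if $|\psi\rangle$ is an eigenvector of Âz for
the eigenvalue $m\,(h/2\pi)$, then:
$$ \hat{A}_z \hat{A}_+\, |\psi\rangle \;=\; \hat{A}_+ \hat{A}_z\, |\psi\rangle + \frac{h}{2\pi}\, \hat{A}_+\, |\psi\rangle \;=\; (m+1)\, \frac{h}{2\pi}\; \hat{A}_+\, |\psi\rangle $$
Thus, $\hat{A}_+\, |\psi\rangle$
is either zero or an eigenvector of Âz
associated with the value $(m+1)\,(h/2\pi)$.
The same is true of
$\hat{A}_-\, |\psi\rangle$
with $(m-1)\,(h/2\pi)$.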
Since we know that m is between
-j and +j , we see that both
j-m and j+m must be integers
(or else iterating one of the two constructions above
would yield a nonzero eigenvector with a
value of m outside of the allowed range).
Thus, 2j and 2m must be integers (they are the
sum and the difference of the integers j+m and j-m).
If j is an integer, so is m.
If j is a half-integer, so is m
(by definition, a "half-integer" is half the value of an odd integer).
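The counting argument above can be checked with a few lines of Python. This is only a minimal sketch: the function names allowed_m and raising_norm_sq are illustrative (they do not come from the text) and the squared norm is expressed in units of (h/2π)².

```python
from fractions import Fraction

def allowed_m(j):
    """Values of m allowed for a given j: from +j down to -j in unit steps."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("2j must be a nonnegative integer")
    return [j - k for k in range(int(2 * j) + 1)]

def raising_norm_sq(j, m):
    """Bracket j(j+1) - m(m+1): squared norm of A+|n,j,m> in units of (h/2pi)^2."""
    j, m = Fraction(j), Fraction(m)
    return j * (j + 1) - m * (m + 1)

for j in (Fraction(1, 2), 1, Fraction(3, 2)):
    ms = allowed_m(j)
    norms = [raising_norm_sq(j, m) for m in ms]
    # The first value (m = j) always has a vanishing bracket.
    print(j, [str(m) for m in ms], [str(x) for x in norms])
```

Exact fractions are used so that half-integer values of j and m are handled without rounding.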
The above demonstration is quite remarkable:
It shows how a 3-component observable is quantized
whenever it obeys the same commutation relation
as an orbital angular momentum.
Although half-integer values of the numbers j and m
are allowed, those
do not correspond to an orbital momentum.
Indeed, let's show that orbital momenta
can only lead to whole values of j and m.
(2008-08-24) Pauli Matrices (1927)
& Spin of an Electron
Three special 2-dimensional Hermitian matrices with unit squares.
In 1927, Wolfgang Pauli (1900-1958)
introduced three matrices for use in the
theory of electron spin.
Their eigenvalues are +1 and -1.
They combine into a 3-vector of matrices verifying the crucial equation:
$$ \sigma \times \sigma = 2 i \, \sigma $$
Therefore, they provide an explicit representation of the
above type of "angular momentum" observables
in the simplest case
of only two values (eigenvalues). This is meant to describe
a lone fermion of spin ½, of which the electron
is the primary example.
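As an illustration, the following Python sketch checks the two properties just stated, using the usual textbook form of the three Pauli matrices (that explicit form is an assumption here, as are the helper names matmul, commutator and times_2i). The vector equation σ × σ = 2iσ is tested component by component, as three cyclic commutation relations.

```python
# Usual textbook form of the Pauli matrices (assumed convention).
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, -1j), (1j, 0))
SIGMA_Z = ((1, 0), (0, -1))
IDENTITY = ((1, 0), (0, 1))

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(p - q for p, q in zip(r1, r2)) for r1, r2 in zip(ab, ba))

def times_2i(a):
    """Multiply every entry by 2i."""
    return tuple(tuple(2j * p for p in row) for row in a)

# Each matrix squares to the identity ("unit squares").
unit_squares = all(matmul(s, s) == IDENTITY for s in (SIGMA_X, SIGMA_Y, SIGMA_Z))

# The three components of the relation sigma x sigma = 2i sigma.
cyclic = (commutator(SIGMA_X, SIGMA_Y) == times_2i(SIGMA_Z)
          and commutator(SIGMA_Y, SIGMA_Z) == times_2i(SIGMA_X)
          and commutator(SIGMA_Z, SIGMA_X) == times_2i(SIGMA_Y))

print(unit_squares, cyclic)
```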
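The above discussion and notations apply directly to:
$$ \hat{A} = \frac{h}{4\pi} \, \sigma $$
(i.e., Âx = (h/4π) σx , etc. )
In this simple case, we have j = ½ and, since each Pauli matrix has a unit square:
$$ \hat{A}^2 = \left( \frac{h}{4\pi} \right)^2 ( \sigma_x^2 + \sigma_y^2 + \sigma_z^2 ) = \frac{3}{4} \left( \frac{h}{2\pi} \right)^2 $$
The square of the spin of any electron
is thus always equal to ¾ (h/2π)².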
The observable corresponding to the projection of the electron spin
along the direction of the unit vector
u of cartesian coordinates (x,y,z) is
$$ \hat{A}_u = x \hat{A}_x + y \hat{A}_y + z \hat{A}_z = \frac{h}{4\pi} \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix} $$
Since x² + y² + z² = 1, the
eigenvalues of Âu are indeed always +h/4π and -h/4π.
Note that any Hermitian matrix with such opposite eigenvalues can be put in this
form. Thus, any quantum state is associated with an observable which will
confirm its orientation with certainty (probability 1).
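Here is a small Python sketch of that last remark, under the assumption that the Pauli matrices have their usual textbook form. For a given ket, the hypothetical helper bloch_direction returns the unit vector (x,y,z) whose spin observable has that ket as an eigenvector for the eigenvalue +1, and apply_spin applies the matrix x σx + y σy + z σz to a ket.

```python
def bloch_direction(ket):
    """Unit vector (x, y, z) for which the ket is a +1 eigenvector of x*sx + y*sy + z*sz."""
    a, b = complex(ket[0]), complex(ket[1])
    norm = abs(a) ** 2 + abs(b) ** 2
    overlap = a.conjugate() * b
    x = 2 * overlap.real / norm
    y = 2 * overlap.imag / norm
    z = (abs(a) ** 2 - abs(b) ** 2) / norm
    return x, y, z

def apply_spin(u, ket):
    """Apply the matrix [[z, x - iy], [x + iy, -z]] to a ket."""
    x, y, z = u
    a, b = complex(ket[0]), complex(ket[1])
    return (z * a + complex(x, -y) * b, complex(x, y) * a - z * b)

ket = (0.6, 0.8j)                 # a normalized example state
u = bloch_direction(ket)
image = apply_spin(u, ket)        # should reproduce the ket itself
print(u, image)
```

The direction is simply the triple of average values of σx, σy and σz in the given state.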
In 1924, Pauli had identified a "two-valued quantum degree of freedom"
associated, in particular, with the valence electron of an alkali metal.
The introduction of this new quantum number allowed him to state his famous exclusion principle
(i.e., two electrons orbiting the same atom have different quantum numbers).
However, he strongly rejected the idea of the young
Ralph Kronig (1904-1995)
that this might be due to some intrinsic rotation of the electron.
Pauli discouraged Kronig from pursuing a "clever" idea which he pronounced to have
"nothing to do with reality".
Instead, the proposal was duly published by
George Uhlenbeck and Samuel Goudsmit
who thus got the credit for the concept of electron spin.
Indeed, no classical rotation of something as small as an
electron could produce the required magnetic moment unless
the "surface of the electron" [sic] moved faster than light
(this objection was raised by H.A. Lorentz). However,
a pointlike object can be endowed with something similar to rotation.
One should simply refrain from dubious explanations relying on moving subparts...
(2008-08-26) Quantum Entanglement
The singlet and triplet
states of two entangled electrons.
According to the previous article, a
pure quantum state for the spin of a lone electron is represented
by a ket which is
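a linear combination of the two eigenvectors of σz
which we shall henceforth call "up" and "down":
$$ | u \rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad | d \rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} $$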
This involves a priori two complex coefficients.
However, two kets that are complex multiples of each other represent the same quantum state,
so the specification of a state actually depends on just two
real numbers.
Another way to look at this is to remark
that such a quantum state is represented by a normalized
ket of unit length corresponding to either of two diametrically opposed
points on a unit sphere.
Such a thing is indeed specified by two real numbers: latitude and longitude
(although the global topology
is not that of a sphere because diametrically
opposite points are considered to be equivalent).
The juxtaposition of two such spins is represented by a linear combination
of four pairwise orthogonal unit kets in a 4-dimensional
Hilbert space :
| u,u >
| u,d >
| d,u >
| d,d >
In that space, a quantum state is described
by 6 independent real numbers
(4 complex coefficients modulo one complex scalar)
which is 2 more "degrees of freedom" than
what might be expected for the separate
description of two spins.
The extra possibilities are called entangled states.
Consider the same observables as before for
the measurement of the first spin only.
Those operators do not change at all the components of the ket which
describe the second spin.
With a single spin, we saw that any given pure quantum state
was always a +1 eigenstate of a certain linear combination of
the three Pauli matrices.
In particular, as all measurements of the corresponding quantity were
always equal to +1, so was their average.
Surprisingly, this no longer holds for the measurement of
a single spin in a two-spin system.
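In particular, the following two states (respectively known as
the singlet state and the triplet state) both yield a zero
average for the measurement of the first spin
along any direction :
$$ \frac{ | u,d \rangle - | d,u \rangle }{ \sqrt{2} } \qquad \frac{ | u,d \rangle + | d,u \rangle }{ \sqrt{2} } $$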
Similarly, in either of those two quantum states,
the average measurement of the second spin along any direction
is also zero.
We may also consider a combined observable which gives the sum
of the two spins along some direction.
The result can only be +2, 0 or -2 and the average
is zero for both the singlet and triplet states.
However, much more is true for the singlet state,
since any measurement of the sum of the spins along any
direction always gives zero for the singlet state.
Not just a zero average but an actual zero measurement
every time !
Thus, if you measure the spin of one of the two electrons entangled
in a singlet state, you will know for sure
that a measurement of the spin of the other electron
along the same direction will give the opposite result. Always.
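Those statements can be verified numerically. The following Python sketch assumes the usual textbook Pauli matrices and takes the singlet and triplet states to be the normalized difference and sum of | u,d > and | d,u > (all names in the code are illustrative). For each direction, it prints the average of the first spin and the average of the square of the sum of the two spins, in units where a single spin measures +1 or -1.

```python
import math

def spin_matrix(x, y, z):
    """Matrix x*sx + y*sy + z*sz for a unit vector (x, y, z)."""
    return [[z, complex(x, -y)], [complex(x, y), -z]]

IDENTITY = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, in the basis uu, ud, du, dd."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mean(op, ket):
    """Average <ket| op |ket> for a normalized ket."""
    n = len(ket)
    return sum(complex(ket[i]).conjugate() * op[i][j] * ket[j]
               for i in range(n) for j in range(n)).real

s = 1 / math.sqrt(2)
SINGLET = [0, s, -s, 0]
TRIPLET = [0, s, s, 0]

def averages(ket, direction):
    """Average of the first spin, and of the squared sum of both spins."""
    m = spin_matrix(*direction)
    first = kron(m, IDENTITY)
    second = kron(IDENTITY, m)
    total = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(first, second)]
    return mean(first, ket), mean(matmul(total, total), ket)

for direction in [(0, 0, 1), (1, 0, 0), (2 / 7, 3 / 7, 6 / 7)]:
    print(direction, averages(SINGLET, direction), averages(TRIPLET, direction))
```

A vanishing average for the squared sum means that the sum itself is zero at every measurement, which is what happens for the singlet state along every direction tested.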
(2008-08-31) Bell's Inequalities (1964)
Statistical relations which are violated in quantum mechanics.
Classically, the probabilities of events
can be broken down as sums of mutually exclusive events.
Such a decomposition implies the following inequality between
various joint probabilities of three events A, B and C:
P ( A & [not B] ) +
P ( B & [not C] ) ≥
P ( A & [not C] )
The picture shows that the event of the right-hand-side is
composed of the two mutually exclusive events shaded
in red, which also appear as components of the two events from
the left-hand-side. So, their probabilities add up
to something no greater than the left-hand-side sum.
This is known as Bell's Inequality.
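The classical inequality can be tested by brute force. The sketch below (with illustrative names) draws random joint distributions over the eight truth assignments of (A, B, C) and evaluates the left-hand side minus the right-hand side, which should never be negative.

```python
import itertools
import random

def bell_slack(weights):
    """P(A & not B) + P(B & not C) - P(A & not C) for a joint distribution.

    weights maps each truth assignment (a, b, c) to a nonnegative weight;
    the weights are normalized here so that they add up to 1.
    """
    total = sum(weights.values())

    def prob(event):
        return sum(w for abc, w in weights.items() if event(*abc)) / total

    return (prob(lambda a, b, c: a and not b)
            + prob(lambda a, b, c: b and not c)
            - prob(lambda a, b, c: a and not c))

rng = random.Random(0)
outcomes = list(itertools.product([False, True], repeat=3))
slacks = []
for _ in range(1000):
    weights = {abc: rng.random() for abc in outcomes}
    slacks.append(bell_slack(weights))

# A tiny tolerance absorbs floating-point rounding.
print(min(slacks) >= -1e-12)
```

The slack is just the combined probability of the two assignments (A, not B, C) and (not A, B, not C), which is why it cannot be negative.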
In quantum mechanics, there are no such things as mutually exclusive
events (unless actual observations take place which
turn the quantum logic of virtual
possibilities into the more familiar statistics of
observed events).
Thus, there's no reason why Bell's inequality should apply
to the calculus of virtual quantum possibilities.
Indeed, it doesn't in the
above case of two electrons
entangled in a singlet state.
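A numerical illustration of such a violation can be sketched as follows, for two spins in the singlet state. Several assumptions are made here which are not spelled out in the text: the three events A, B, C are "the first spin is up" along three coplanar directions at 0°, 60° and 120° from the z-axis, and "not B" for the first electron is inferred from finding the second electron up along the same direction (this relies on the perfect anticorrelation of the singlet state). The function names are illustrative.

```python
import math

def projector_up(theta):
    """Projector onto spin "up" along a direction of the x-z plane, at angle theta from z."""
    x, z = math.sin(theta), math.cos(theta)
    return [[(1 + z) / 2, x / 2], [x / 2, (1 - z) / 2]]

def joint_up_up(theta1, theta2):
    """Probability of finding spin 1 up along theta1 and spin 2 up along theta2 (singlet)."""
    p, q = projector_up(theta1), projector_up(theta2)
    s = 1 / math.sqrt(2)
    ket = [0.0, s, -s, 0.0]          # singlet in the basis uu, ud, du, dd
    total = 0.0
    for i in range(2):
        for k in range(2):
            for j in range(2):
                for l in range(2):
                    total += ket[2 * i + k] * p[i][j] * q[k][l] * ket[2 * j + l]
    return total

a, b, c = 0.0, math.pi / 3, 2 * math.pi / 3
lhs = joint_up_up(a, b) + joint_up_up(b, c)   # plays the role of P(A & not B) + P(B & not C)
rhs = joint_up_up(a, c)                       # plays the role of P(A & not C)
print(lhs, rhs, lhs >= rhs)
```

With those angles, the left-hand side comes out smaller than the right-hand side, which a classical joint distribution could not produce.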
(2008-08-25) Higher-Order Spin
Equivalents of the Pauli matrices beyond spin ½.
A particle of spin j is
something which allows a measurement of its spin along any
direction to have 2j+1 values
(according to Cartan's argument).
Two measurement values are allowed for j=1/2,
3 values for j=1,
4 values for j=3/2,
5 values for j=2.
The relativistic case of massless particles is beyond the scope of this discussion:
The measured spin of a massless particle
can only be clockwise or counterclockwise, at full magnitude,
in the direction of motion (that would translate into only two possible
measurement results, for any nonzero value of j ).
Hilbert space has dimension 2j+1 and the observables
for the three projections of angular momentum on three orthogonal
directions (in a right-handed configuration) can be expressed
as in the above special case (j=½)
using the counterparts of Pauli matrices
in a Hilbert space of 2j+1 dimensions, namely
three 2j+1 by 2j+1 matrices which combine into a "vector" verifying
the following compact commutation relation:
$$ \sigma \times \sigma = 2 i \, \sigma $$
Actual observables of angular momentum are simply
obtained by multiplying such matrices into the
half-quantum of spin h/4π.
Here's how we may construct such a thing: First we impose that
σz is diagonal
(that simply means we decide to use eigenvectors of
σz to form a basis
for our Hilbert space).
We do know the eigenvalues of σz from
Cartan's argument, so
σz is entirely specified up
to the ordering of the (real) elements in the diagonal. We choose (arbitrarily)
to order our base kets so that those (distinct) eigenvalues appear in
decreasing order on the diagonal of σz.
So, σz is simply the diagonal
matrix whose 2j+1 elements are
(2j, 2j-2, 2j-4, ... -2j).
We are looking for the hermitian matrix σx
in terms of (j+1)(2j+1) scalar unknowns
(including 2j+1 real ones).
In the case j = 3/2 this would mean a total of
10 unknowns (6 complex and 4 real ones)
in a 4 by 4 matrix:
$$ \sigma_x = \begin{pmatrix} a & b & c & d \\ b^* & e & f & g \\ c^* & f^* & h & k \\ d^* & g^* & k^* & m \end{pmatrix} $$
The reader is encouraged to use that explicit example, with
index-free notations, to embody
the following outline of a general derivation.
Now, σy is obtained
directly from the equation:
$$ [ \sigma_z , \sigma_x ] = 2 i \, \sigma_y $$
This yields an expression of σy
where each entry is proportional
to the corresponding (unknown) entry of σx.
Next, we may use the relation:
$$ [ \sigma_y , \sigma_z ] = 2 i \, \sigma_x $$
This tells us that all terms of σx
(and, therefore, also those of σy)
must vanish except at positions
adjacent to the main diagonal.
Now, we're faced with only 2j unknown complex coefficients
(which are unconstrained, at this point)
and just one more commutation relation to satisfy, namely:
$$ [ \sigma_x , \sigma_y ] = 2 i \, \sigma_z $$
It turns out that this final equation gives us the squares
of the absolute values
of the aforementioned remaining 2j unknowns.
Each of them is thus determined
up to an arbitrary phase factor
(for a total of 2j arbitrary multipliers of unit length).
In the following tabulation, we have chosen a "standard" convention
for those phase factors which makes all the coefficients of
σx real and nonnegative.
Spin j = 1/2 :
Spin j = 1 :
0 1 0
1 0 1
0 1 0
0 i 0
-i 0 i
0 -i 0
2 0 0
0 0 0
0 0 -2
Spin j = 3/2 :
0 Ö3 0 0
Ö3 0 2 0
0 2 0 Ö3
0 0 Ö3 0
0 iÖ3 0 0
-iÖ3 0 2i 0
0 -2i 0 iÖ3
0 0 -iÖ3 0
3 0 0 0
0 1 0 0
0 0 -1 0
0 0 0 -3
Spin j = 2 :
0 2 0 0 0
2 0 Ö6 0 0
0 Ö6 0 Ö6 0
0 0 Ö6 0 2
0 0 0 2 0
0 2i 0 0 0
-2i 0 iÖ6 0 0
0 -iÖ6 0 iÖ6 0
0 0 -iÖ6 0 2i
0 0 0 -2i 0
4 0 0 0 0
0 2 0 0 0
0 0 0 0 0
0 0 0 -2 0
0 0 0 0 -4
Relativistic arguments (beyond the scope of this discussion)
do not allow elementary particles beyond spin 2.
Composite objects with higher spins do not have a fixed value of
j. However, if their possible decay into things of lower spin is ignored,
they would behave like fictional high-spin objects, starting with:
The same pattern holds for any spin j :
The nth coefficient down the upper subdiagonal of
σx for spin j
is simply given by the expression:
$$ \sqrt{ n \, ( 2j + 1 - n ) } \; \exp ( i \varphi_n ) \qquad [ \text{e.g., } \varphi_n = 0 ] $$
If used, each phase factor applies to two matching elements in
σx and σy which are above the
diagonal. The conjugate phase applies to the transposed elements (below the diagonal).
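The general pattern lends itself to a direct numerical check. The following Python sketch builds the three matrices for any spin from the above expression and reports the largest deviation from the three commutation relations. The function names, the parameter twists (a list holding the optional angles φn) and the use of -i times the σx coefficient for the matching coefficient of σy are choices made for this illustration.

```python
import math
from fractions import Fraction

def spin_matrices(spin, twists=None):
    """Matrices (sx, sy, sz) for a given spin; twists[n-1] is the optional angle phi_n."""
    dim = int(2 * Fraction(spin)) + 1
    twists = twists if twists is not None else [0.0] * (dim - 1)
    sx = [[0j] * dim for _ in range(dim)]
    sy = [[0j] * dim for _ in range(dim)]
    sz = [[0j] * dim for _ in range(dim)]
    for k in range(dim):
        sz[k][k] = complex(dim - 1 - 2 * k)            # 2j, 2j-2, ... -2j
    for n in range(1, dim):
        phase = complex(math.cos(twists[n - 1]), math.sin(twists[n - 1]))
        c = math.sqrt(n * (dim - n)) * phase           # sqrt(n (2j+1-n)) exp(i phi_n)
        sx[n - 1][n], sx[n][n - 1] = c, c.conjugate()
        sy[n - 1][n], sy[n][n - 1] = -1j * c, 1j * c.conjugate()
    return sx, sy, sz

def commutator(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] - b[i][k] * a[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def worst_violation(spin, twists=None):
    """Largest entry of [a, b] - 2i c over the three cyclic commutation relations."""
    sx, sy, sz = spin_matrices(spin, twists)
    worst = 0.0
    for a, b, c in ((sx, sy, sz), (sy, sz, sx), (sz, sx, sy)):
        for row_ab, row_c in zip(commutator(a, b), c):
            for u, v in zip(row_ab, row_c):
                worst = max(worst, abs(u - 2j * v))    # 2j is the literal 2i here
    return worst

for spin in (Fraction(1, 2), 1, Fraction(3, 2), 2, Fraction(5, 2)):
    print(spin, worst_violation(spin))
print(worst_violation(1, [0.7, -0.7]))
```

The deviations should vanish up to rounding errors, with or without nonzero phase angles.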
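This would turn the ordinary (2x2) Pauli matrices into:
$$ \sigma_x = \begin{pmatrix} 0 & e^{i\varphi} \\ e^{-i\varphi} & 0 \end{pmatrix} \qquad \sigma_y = \begin{pmatrix} 0 & -i \, e^{i\varphi} \\ i \, e^{-i\varphi} & 0 \end{pmatrix} \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} $$
The eigenvectors of those three matrices are
respectively proportional to the columns of:
$$ \begin{pmatrix} 1 & -1 \\ e^{-i\varphi} & e^{-i\varphi} \end{pmatrix} \qquad \begin{pmatrix} 1 & -1 \\ i \, e^{-i\varphi} & i \, e^{-i\varphi} \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} $$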
We may call twists the 2j such phase factors which are
part of σx and σy.
For spin ½, the single twist can be eliminated by redefining
which axis (perpendicular to the z-axis) is associated with the twist-free version
of σx.
spins... It's as if a spin possessed an internal phase
which indicates, so to speak, the actual angular position
in a "rotation" around a given axis.
The same trick can always be used to make the sum of the twists
vanish in a single higher spin, but what is the physical significance of the
2j-1 remaining degrees of freedom?
They seem to determine, in a nontrivial way, the relative positional
phases in the "rotations" around each direction of space.
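In particular, what does φ mean
in the following observables for spin 1 ?
$$ \sigma_x = \sqrt{2} \begin{pmatrix} 0 & e^{i\varphi} & 0 \\ e^{-i\varphi} & 0 & e^{-i\varphi} \\ 0 & e^{i\varphi} & 0 \end{pmatrix} \qquad \sigma_y = \sqrt{2} \begin{pmatrix} 0 & -i \, e^{i\varphi} & 0 \\ i \, e^{-i\varphi} & 0 & -i \, e^{-i\varphi} \\ 0 & i \, e^{i\varphi} & 0 \end{pmatrix} \qquad \sigma_z = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -2 \end{pmatrix} $$
The columns in the following [unitary and hermitian] matrices are eigenvectors:
$$ \frac{1}{2} \begin{pmatrix} 1 & \sqrt{2} \, e^{i\varphi} & 1 \\ \sqrt{2} \, e^{-i\varphi} & 0 & -\sqrt{2} \, e^{-i\varphi} \\ 1 & -\sqrt{2} \, e^{i\varphi} & 1 \end{pmatrix} \qquad \frac{1}{2} \begin{pmatrix} 1 & -i \sqrt{2} \, e^{i\varphi} & -1 \\ i \sqrt{2} \, e^{-i\varphi} & 0 & i \sqrt{2} \, e^{-i\varphi} \\ -1 & -i \sqrt{2} \, e^{i\varphi} & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$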
(2005-06-30) Density operators = macrostates
(von Neumann, 1927)
Quantum representation of systems in
imperfectly known mixed states.
A microstate (or pure quantum state)
is represented by a normed ket
from the relevant Hilbert space, up to an irrelevant phase factor.
A more realistic macrostate is a statistical mixture
(called mixed state or Gemisch)
which can be represented by a unique [hermitian] density operator ρ
with positive eigenvalues that add up to 1.
$$ \rho = \sum_n p_n \, | n \rangle \langle n | $$
In particular, the unique density operator representing the pure quantum state
associated with the normed ket | ψ >
is given by the following expression, which is unaffected by phase factors
(since multiplying | ψ >
by a complex number of unit norm will multiply
< ψ | by the reciprocal).
$$ \rho = | \psi \rangle \langle \psi | $$
A statistical mixture consisting of a proportion u of the macrostate
represented by ρ1
and a proportion 1-u of the macrostate
represented by ρ2
is represented by the following density operator:
$$ \rho = u \, \rho_1 + (1-u) \, \rho_2 $$
The trace of an operator
is the sum of the elements in its main diagonal
(this doesn't depend on the base).
All density operators have a trace equal to 1.
Conversely, any hermitian operator of trace 1 whose eigenvalues are nonnegative can be construed as a density operator.
$$ \mathrm{Tr} ( \hat{A} ) = \sum_n \langle n | \hat{A} | n \rangle $$
The measurement of any observable Â
yields the eigenvalue a
with the following probability, involving the
projector Pa onto the relevant eigenspace:
$$ p ( a ) = \mathrm{Tr} ( \rho \, P_a ) $$
Thus, systems are experimentally different if and only
if they have different density operators.
We may as well talk about ρ as
being a macrostate.
The average value resulting from a measurement of
Â = Σa a Pa is:
$$ \langle \hat{A} \rangle = \sum_a \mathrm{Tr} ( \rho \, a P_a ) = \mathrm{Tr} ( \rho \hat{A} ) $$
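To make those formulas concrete, here is a minimal Python sketch for a single spin ½. It builds a mixture of two pure states and evaluates the probabilities and the average for a measurement of σz. The particular mixture (one quarter "up" along z, three quarters "up" along x) and all function names are arbitrary choices made for this example.

```python
import math

def pure(ket):
    """Density operator |psi><psi| of a normalized ket."""
    return [[complex(x) * complex(y).conjugate() for y in ket] for x in ket]

def mixture(u, rho1, rho2):
    """Density operator u*rho1 + (1-u)*rho2."""
    return [[u * p + (1 - u) * q for p, q in zip(r1, r2)]
            for r1, r2 in zip(rho1, rho2)]

def trace_of_product(a, b):
    """Tr(a b), computed without forming the full product."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

s = 1 / math.sqrt(2)
rho = mixture(0.25, pure([1, 0]), pure([s, s]))

# The observable sigma_z is (+1) P_UP + (-1) P_DOWN.
P_UP = [[1, 0], [0, 0]]
P_DOWN = [[0, 0], [0, 1]]
SIGMA_Z = [[1, 0], [0, -1]]

p_up = trace_of_product(rho, P_UP).real
p_down = trace_of_product(rho, P_DOWN).real
average = trace_of_product(rho, SIGMA_Z).real
print(p_up, p_down, average)
```

The average obtained from the trace formula agrees with the weighted sum of the two eigenvalues, p_up - p_down.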
Mere interaction with a measuring instrument turns the
macrostate ρ into Σa Pa ρ Pa.
Recording the measure a makes it:
$$ \frac{ P_a \, \rho \, P_a }{ \mathrm{Tr} ( \rho \, P_a ) } $$
This is known as "Lüders' rule" or
Lüders' projection postulate.
It was first discussed in 1951 by Gerhart Lüders, in
"Über die Zustandsänderung durch den Messprozess"
(On the state-change
due to the measurement process) which appeared in
Annalen der Physik, 8 (6) 322-328.
An [analytic] function of an operator, like the logarithm of an operator,
is defined in a standard way:
In a base where the operator is diagonal, its image is the
diagonal operator whose eigenvalues are the images of its eigenvalues.
The statistical entropy
S ( ρ ) is
defined in units of a positive constant k :
$$ S ( \rho ) = -k \, \mathrm{Tr} ( \rho \, \mathrm{Log} ( \rho ) ) $$
S is positive, except for a pure state
ρ = | ψ > < ψ |
for which S = 0.
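The following Python sketch evaluates that entropy for 2 by 2 density operators, using the eigenvalues as prescribed by the above definition of a function of an operator. It assumes that Log denotes the natural logarithm and that a zero eigenvalue contributes nothing to the trace. The helper names are illustrative.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 hermitian matrix, from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return [tr / 2 + root, tr / 2 - root]

def entropy(rho, k=1.0):
    """S = -k Tr(rho Log rho), with natural logarithms and 0 Log 0 taken as 0."""
    return k * sum(-p * math.log(p) for p in eigenvalues_2x2(rho) if p > 1e-12)

PURE = [[0.5, 0.5], [0.5, 0.5]]      # |psi><psi| with psi = (1, 1)/sqrt(2)
MIXED = [[0.5, 0.0], [0.0, 0.5]]     # equal mixture of "up" and "down"
print(entropy(PURE), entropy(MIXED))
```

The pure state gives zero, whereas the equal mixture gives k Log 2, which is the largest possible value for a two-state system.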
Algebraically, the following strict inequality holds, unless
ρ = ρ'.
$$ S ( \rho ) < -k \, \mathrm{Tr} ( \rho \, \mathrm{Log} ( \rho' ) ) $$
An isolated nonrelativistic system evolves according to
the Schrödinger-Liouville equation, involving its
hamiltonian H :
$$ \frac{ih}{2\pi} \, \frac{d\rho}{dt} = [ H , \rho ] = H \rho - \rho H $$
With thermal contacts, a quasistatic evolution has different rules (T and H vary).
Introducing the partition function (Z) :
Z = Tr exp ( - H / kT )
ρ = exp ( - H / kT ) / Z
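As a simple illustration, consider a hamiltonian which is diagonal in some base, so that it is fully described by a list of energy levels. The Python sketch below (with illustrative function names, and with energies and kT expressed in the same arbitrary unit) computes Z, the diagonal of the corresponding density operator and the internal energy Tr ( ρ H ).

```python
import math

def gibbs_state(energies, kT):
    """Diagonal of rho = exp(-H/kT)/Z and the partition function Z, for a diagonal H."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def internal_energy(energies, kT):
    """U = Tr(rho H), which reduces to a weighted sum of the energy levels."""
    probs, _ = gibbs_state(energies, kT)
    return sum(p * e for p, e in zip(probs, energies))

levels = [0.0, 1.0]                      # a two-level system
for kT in (0.1, 1.0, 10.0, 1000.0):
    probs, z = gibbs_state(levels, kT)
    print(kT, z, probs, internal_energy(levels, kT))
```

At low temperature, the system is almost surely in its lowest level. At high temperature, both levels become equally likely and the internal energy of this two-level system approaches one half.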
The variation of the internal energy
U = Tr ( ρ H ) may be expressed as