Note about Notations :
The timelike-coordinate of a
quadrivector is listed first.
Vectorial quantities are boldfaced;
lowercase symbols are used for 3D vectors, capitalization for 4-vectors:
$$R = (ct, x, y, z) = (ct, \mathbf{r})$$
$$X = (x_0, x_1, x_2, x_3) = (x_0, \mathbf{x})$$
The sign of a 4D dot product is defined to generalize the 3D-space concept:
$$U \cdot V = (u_0, \mathbf{u}) \cdot (v_0, \mathbf{v}) = \mathbf{u} \cdot \mathbf{v} - u_0 v_0 = -u_0 v_0 + u_1 v_1 + u_2 v_2 + u_3 v_3$$
Observers in Motion: The Lorentz Transform
How are the coordinates in two uniformly moving systems related?
In the framework of the Special Theory of Relativity, such coordinates are
linearly related. Nonlinear relations are the subject of
General Relativity Theory, where linear transforms
only apply to infinitesimal coordinates (cdt,dx,dy,dz).
Call t,x,y,z the coordinates in one system (S) and
t',x',y',z' the coordinates in the other (S').
Assume the axes are so oriented that motion is along the x-axis of S which is also the
x'-axis of S'. For points on that invariant axis, y,z,y',z' are zero,
and we have to find proper dimensionless coefficients $a_{ij}$ in the relations:
$$ct' = a_{00}\,ct + a_{01}\,x$$
$$x' = a_{10}\,ct + a_{11}\,x$$
In this, c is the speed of light in a vacuum (now best called
Einstein's constant). The symbol c
stands for celerity (Latin celeritas).
Speed and phase celerity are identical for things that propagate at celerity c.
The main tenet of Relativity requires the
important physical constant c
to be the same for all observers...
That is, (x±ct) is zero if and only if
(x'±ct') is. Now, we have the linear relations:
$$x' + ct' = (a_{10} + a_{00})\,ct + (a_{11} + a_{01})\,x$$
$$x' - ct' = (a_{10} - a_{00})\,ct + (a_{11} - a_{01})\,x$$
The aforementioned conditions can only be met if the coefficients of x and ct
on the right-hand side of each equation are respectively proportional to
the coefficients of x' and ct' on the left-hand side. Therefore:
$$a_{10} + a_{00} = a_{11} + a_{01}$$
$$a_{10} - a_{00} = -(a_{11} - a_{01})$$
Adding or subtracting these two equations yields:
$$a_{00} = a_{11} \quad \text{and} \quad a_{01} = a_{10}$$
Introducing $\gamma = a_{00}$
and letting $\beta$ be equal to $-a_{10}/a_{00}$,
we thus obtain...
The Lorentz Transform:
$$ct' = \gamma\,(ct - \beta x)$$
$$x' = \gamma\,(-\beta\,ct + x)$$
The origin of S' (x' = 0) moves at a speed v in S (x = vt), which gives $\beta = v/c$.
Now, we have to assume that the inverse transform from S' to S is the same as
the direct transform from S to S', with only a trivial change of sign (or else,
there would be a directional preference in the Universe, which we find repugnant):
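$$ct = \gamma\,(ct' + \beta x')$$
$$x = \gamma\,(\beta\,ct' + x')$$
Using the prior values of ct' and x', we obtain $ct = \gamma^2 (1 - \beta^2)\,ct$.
Thus $\gamma^2 (1 - \beta^2) = 1$ and, taking the positive root:
$$\gamma = 1 / \sqrt{1 - v^2/c^2}$$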
FitzGerald-Lorentz contraction of a moving object:
If points at rest in S' are observed at constant time t = 0 in S,
then we have:
$$x' = \gamma\,(0 + x)$$
It is customary to use nought subscripts
( So = S', xo = x', etc.) for the rest frame of a solid
(that's a special case of the proper frame of an extended object,
a key concept in relativistic thermodynamics).
Therefore, the above relation $x = x'/\gamma$ reads:
$$x = x_0 \sqrt{1 - v^2/c^2}$$
Thus, a yardstick moving lengthwise appears shorter than a yardstick at rest because
a stationary observer who records both extremities simultaneously
actually records them at what would be different times for a moving observer
traveling with the ruler.
This shortening of distances, called the FitzGerald-Lorentz contraction,
occurs only in the direction of the velocity of a moving object
(distances perpendicular to the velocity are not affected).
It was proposed in 1889 (well
before Special Relativity was
formulated) by the Irish physicist George FitzGerald (1851-1901)
to explain the negative result of the
Michelson-Morley experiment of 1887.
FitzGerald is also remembered for his prediction that oscillating currents
ought to generate radio waves (1883).
A trivial consequence of the FitzGerald-Lorentz contraction
is that a container has a lesser volume V when it's moving
than when it's not:
$$V = V_0 \sqrt{1 - v^2/c^2}$$
Other thermodynamical characteristics of extended bodies
may also depend on their motion,
because simultaneity is relative
(by definition, all parts of a moving body are observed at constant time
in the frame of the observer).
The same remark applies to pointlike objects, which can be viewed
as tiny extended bodies.
Combining Collinear Velocities
How do relativistic speeds add up?
Considering the case of motions along the x-axis only,
the problem is to find the velocity w with respect to S of something moving
at velocity u in a system S' moving itself at velocity v
with respect to S.
Well, if x' = ut', the above Lorentz transform
(with $\beta = v/c$) tells us that:
$$x - vt = u\,(t - xv/c^2)$$
Solving for x, we obtain x = wt, where w is given by the following expression:
$$w = \frac{u + v}{1 + uv/c^2}$$
Note that, if u and v are both subluminal
(between -c and c) then so is w...
However, we need only assume that
one of the speeds is subluminal.
As phrased above, v is so restricted (it's the speed of one coordinate system
in the other) whereas u could well
be a superluminal phase celerity, for example.
In that case, the result w is superluminal as well.
Nothing is subluminal for one observer and superluminal for another,
provided all observers have subluminal motion with respect to each other.
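As a quick numerical check of the addition formula, here is a minimal Python sketch. The function name `add_velocities` and the sample speeds are illustrative choices (they do not come from the text above); speeds are expressed in units of c by default.

```python
def add_velocities(u: float, v: float, c: float = 1.0) -> float:
    """Combine collinear velocities u and v: w = (u + v) / (1 + u*v/c**2)."""
    return (u + v) / (1.0 + u * v / c**2)

# Two subluminal speeds combine into a subluminal speed.
print(add_velocities(0.5, 0.5))
print(add_velocities(0.9, 0.9))
# Something moving at c in S' also moves at c in S.
print(add_velocities(1.0, 0.3))
# A superluminal phase celerity u remains superluminal (v is subluminal).
print(add_velocities(2.0, 0.5))
```

Only v is required to be subluminal here; u may be any celerity, as remarked above.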
It can be convenient to introduce a function
of the speed v called rapidity,
which is essentially the inverse of the
hyperbolic tangent function. Namely:
$$\varphi = c \tanh^{-1}(v/c)$$
(this is also written c th⁻¹ (v/c) or c Argth (v/c)).
There's a simple addition formula for the direct function
(tanh or th):
$$\tanh(x+y) = \frac{\tanh(x) + \tanh(y)}{1 + \tanh(x)\,\tanh(y)}$$
This parallels nicely the above formula for the addition of
collinear velocities and shows that
rapidities are additive :
If an object moves at rapidity $\varphi_2$
relative to a frame of reference of rapidity
$\varphi_1$, then the rapidity of that
object is $\varphi_1 + \varphi_2$.
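The additivity of rapidities can be checked numerically. The following is a minimal sketch, with hypothetical helper names `rapidity` and `speed_from_rapidity` and arbitrary sample speeds given in units of c.

```python
import math

def rapidity(v: float, c: float = 1.0) -> float:
    """Rapidity associated with the speed v: c * artanh(v/c)."""
    return c * math.atanh(v / c)

def speed_from_rapidity(phi: float, c: float = 1.0) -> float:
    """Inverse relation: v = c * tanh(phi/c)."""
    return c * math.tanh(phi / c)

u, v = 0.6, 0.7
w_formula = (u + v) / (1.0 + u * v)                       # addition of collinear velocities
w_rapidity = speed_from_rapidity(rapidity(u) + rapidity(v))  # sum of the rapidities
print(w_formula, w_rapidity)
```

Both routes should agree to within floating-point rounding.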
The concept of rapidity
was introduced in 1910 by the British mathematician
Edmund T. Whittaker
(1873-1956) in the midst of an historical
account of the development of electromagnetism up to and including
special relativity (which Whittaker attributes to the efforts of
Poincaré and Lorentz rather than Einstein).
History of the Theories of Aether and Electricity
by E.T. Whittaker (1910)
Combining Non-collinear Velocities
Let's generalize the above to velocities along different directions.
Using the notations of the previous section, we now consider
the following motion in the frame of reference attached
to an actual particle moving at speed v < c.
$$x' = u\,t' \cos\theta$$
$$y' = u\,t' \sin\theta$$
Using the expressions of x' and t' given by the Lorentz transform
(along with y' = y) we obtain the following description in terms
of the observer's coordinate system:
$$x - vt = u\,(t - xv/c^2) \cos\theta$$
$$y = \gamma\,u\,(t - xv/c^2) \sin\theta$$
From the first equation, a relation of the form
x = wx t is derived exactly as in the
collinear case. For the other coordinate,
we may start by remarking that:
$$y = w_y\,t = \gamma\,(w_x - v)\,(\tan\theta)\,t$$
This leads to a straightforward computation of wy
which simplifies nicely:
Cartesian Components of the Observed Resultant Velocity
$$w_x = \frac{v + u \cos\theta}{1 + (uv \cos\theta)/c^2} \qquad w_y = \frac{u\,\sin\theta\,(1 - v^2/c^2)^{1/2}}{1 + (uv \cos\theta)/c^2}$$
The relation $w^2 = (w_x)^2 + (w_y)^2$
gives the speed w seen by the observer. With some algebraic massaging, this
can be put in a nice symmetrical form:
$$1 - w^2/c^2 = \frac{(1 - u^2/c^2)\,(1 - v^2/c^2)}{[\,1 + (uv \cos\theta)/c^2\,]^2}$$
The reader may want to retrieve the above
addition formula for parallel velocities
as the special case $\cos\theta = 1$
of this lesser-known general relation.
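The symmetrical relation can be verified numerically. This is a minimal sketch which assumes the component formulas $w_x = (v + u\cos\theta)/D$ and $w_y = u \sin\theta\,(1 - v^2/c^2)^{1/2}/D$ with $D = 1 + (uv\cos\theta)/c^2$; the function name `combine` and the sample values are illustrative, and speeds are in units of c.

```python
import math

def combine(u: float, v: float, theta: float, c: float = 1.0):
    """Components (wx, wy) seen by the observer, for a motion of speed u at
    angle theta (measured from the x-axis) in a frame moving at speed v."""
    denom = 1.0 + u * v * math.cos(theta) / c**2
    wx = (v + u * math.cos(theta)) / denom
    wy = u * math.sin(theta) * math.sqrt(1.0 - (v / c) ** 2) / denom
    return wx, wy

u, v, theta = 0.6, 0.8, math.radians(50.0)
wx, wy = combine(u, v, theta)

# Both sides of the symmetrical relation (with c = 1).
lhs = 1.0 - (wx**2 + wy**2)
rhs = (1.0 - u**2) * (1.0 - v**2) / (1.0 + u * v * math.cos(theta)) ** 2
print(lhs, rhs)

# With theta = 0, the collinear addition formula is retrieved.
print(combine(u, v, 0.0)[0], (u + v) / (1.0 + u * v))
```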
In the case u = c, we have
w = c and the above expression for
wx turns into the following relativistic formula, published by
Einstein in 1905.
Relativistic Aberration of Light
$$w_x / c = \cos\alpha = \frac{(v/c) + \cos\theta}{1 + (v/c) \cos\theta}$$
The Headlight Effect
Isotropic radiation is focused forward if the source moves.
In its rest frame, an isotropic source of photons is equally
likely to emit at an acute angle or at an obtuse angle
with respect to any direction of reference.
That direction of reference can be chosen to be the
velocity v of the source with respect to a particular observer,
who will thus see half of the photons emitted within a cone
corresponding to a right angle in the rest frame of the source.
Using the results of the previous section
(with u = c and $\sin\theta = 1$)
the limiting cone is defined by the following components,
respectively parallel and perpendicular to v
in the frame of the observer:
$$w_x = v \qquad w_y = c/\gamma$$
This corresponds to an angle
$$\varphi = \arcsin(1/\gamma) = \arccos(v/c)$$
away from the direction of the velocity v.
For a fast source, $\varphi$
is small, which indicates that the photons are
mostly emitted in a direction close to
that of v.
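To get a feel for how quickly the cone narrows, here is a minimal sketch that tabulates the half-angle Arccos (v/c) for a few speeds. The function name and the sample values of v/c are illustrative.

```python
import math

def half_emission_angle(beta: float) -> float:
    """Half-angle (in radians) of the forward cone that receives half of the
    photons from an isotropic source moving at speed beta = v/c."""
    return math.acos(beta)

for beta in (0.0, 0.5, 0.9, 0.99, 0.999):
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    angle = half_emission_angle(beta)
    # Arcsin(1/gamma) and Arccos(v/c) describe the same angle.
    assert math.isclose(angle, math.asin(1.0 / gamma), abs_tol=1e-9)
    print(f"v/c = {beta:<5}  gamma = {gamma:7.3f}  half-angle = {math.degrees(angle):6.2f} deg")
```

For a source at rest the angle is a right angle, as expected for isotropic emission.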
Optical Effects of Special Relativity
The Lord is Subtle, not Malicious.
On the straight addition of closing speeds.
Two spaceships head toward each other at 70% and 80% of the speed of light.
They are 3 light-years apart. When will they meet?
Well, the distance between them is seen to decrease at a rate equal to
0.7 + 0.8 = 1.5 times the speed of light.
They'll meet in exactly 2 years.
Relative Speed vs. Closing Speed
The previously discussed
relative speed of two moving objects
is defined as the
speed of one object seen by an observer at rest with respect to the other.
This is entirely different from what's sometimes called the
closing speed, which is the rate of change of the distance
between two moving objects (as seen by an observer who is linked to neither).
When motion along a straight line is considered at time t by an independent observer,
an object moving at velocity -u is at abscissa -u t and
an object moving at velocity v is at abscissa v t.
The signed distance separating the latter from the former is clearly (u+v)t
and the rate of change of that quantity
(the signed closing speed, if you will) is u+v.
In prerelativistic mechanics, there is no difference between the relative speed and the
closing speed of two objects (because two moving observers are supposedly
experiencing the same flow of time).
In relativistic mechanics, this ain't so.
Confusing the two notions can be a great source of puzzlement
when the relevant conditions are not properly analyzed.
In particular, the following Fizeau effect is correctly
explained by the relativistic expression for relative velocities, whereas
the Sagnac effect is due to the difference in the closing
speeds of two light beams that either chase a moving mirror or race toward it
(those closing speeds are c-v and c+v, respectively).
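The two notions can be compared on the spaceship example (70% and 80% of c, 3 light-years apart). This is a minimal sketch with hypothetical function names; speeds are in units of c, distances in light-years and times in years, all for the independent observer.

```python
def closing_speed(u: float, v: float) -> float:
    """Rate at which the gap shrinks for a third-party observer (units of c)."""
    return u + v

def relative_speed(u: float, v: float) -> float:
    """Speed of one object as measured by an observer riding the other."""
    return (u + v) / (1.0 + u * v)

u, v = 0.7, 0.8   # speeds of the two ships, heading toward each other
gap = 3.0         # initial separation, in light-years

print("closing speed :", closing_speed(u, v))
print("time to meet  :", gap / closing_speed(u, v), "years")
print("relative speed:", relative_speed(u, v))
```

The closing speed exceeds c without any contradiction, whereas the relative speed stays below c.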
Propagation of Light in Moving Bodies (Fresnel, 1818)
Dependence of the Fresnel drag f
on the refractive index n (Fizeau).
Fizeau (1819-1896) used interferometry
to determine how the speed of a moving liquid affects the propagation of
light in it.
He found an empirical relation between the magnitude of the effect
and the refractive index (n) which was later explained relativistically...
In a transparent fluid at rest,
the [phase] celerity of light u = c / n is isotropic and
inversely proportional to the fluid's index of refraction (n).
Consider the case where the propagation of light is parallel
to the motion of the fluid.
Let v be the speed of the fluid
and w the observed celerity of light.
According to the above rule, we have:
$$w = \frac{u + v}{1 + uv/c^2} = \frac{c/n + v}{1 + v/nc} = \frac{c}{n} + v\,(1 - 1/n^2)\,[\,1 + v/nc\,]^{-1}$$
$$w \approx \frac{c}{n} + v\,(1 - 1/n^2) = \frac{c}{n} + v f$$
The parameter $f = 1 - 1/n^2$
is known as the Fresnel drag coefficient.
It has been named after
Augustin Fresnel (1788-1827)
who introduced the now obsolete
drag hypothesis in 1818, to explain experiments performed by
Arago in 1810.
The coefficient f would be
0 if the motion of the liquid had no influence on the propagation of light.
It would be 1 if light was entirely "carried" by the liquid,
like sound is.
What's observed looks like partial dragging.
Hippolyte Fizeau established empirically the above expression of
f in terms of the index n, by experimenting with different liquids.
Although Fizeau's relation can be derived without resorting to the principle
of relativity (Lorentz did it),
Einstein considered it an excellent experimental test of Special Relativity.
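How good is the first-order expression c/n + v (1 - 1/n²)? The sketch below compares it with the exact relativistic combination of c/n and v. The index 1.333 (roughly that of water) and the flow speed of 7 m/s are illustrative assumptions, not values taken from the text.

```python
C = 299_792_458.0  # speed of light in a vacuum, in m/s

def light_speed_in_moving_fluid(n: float, v: float, c: float = C) -> float:
    """Exact relativistic combination of u = c/n with the fluid speed v."""
    u = c / n
    return (u + v) / (1.0 + u * v / c**2)

def fresnel_approximation(n: float, v: float, c: float = C) -> float:
    """First-order expression c/n + v*f, with f = 1 - 1/n**2."""
    return c / n + v * (1.0 - 1.0 / n**2)

n, v = 1.333, 7.0   # illustrative refractive index and flow speed (m/s)
exact = light_speed_in_moving_fluid(n, v)
approx = fresnel_approximation(n, v)
effective_drag = (exact - C / n) / v   # what an experiment would report as f

print("exact - approx :", exact - approx, "m/s")
print("effective f    :", effective_drag)
print("1 - 1/n^2      :", 1.0 - 1.0 / n**2)
```

At such low speeds the two expressions differ by far less than any interferometer could resolve, which is why the drag looks like a constant coefficient.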
Harress-Sagnac Effect (using mirrors or fiber optics)
Rotation rate of an optical loop is revealed by interfering opposite beams.
The Sagnac effect
is mainly a nonrelativistic effect observed
when two coherent light beams (which may come from a single source through
a half-silvered mirror)
travel in opposite directions around an optical loop.
When the apparatus rotates, a phase difference is observed which is essentially
proportional to both the pulsatance $\omega$
[the rotation rate in rad/s]
and the area of the loop (actually,
the apparent area, for a distant observer located
on the axis of rotation).
The thing is not to be interpreted
as some obscure change in the celerity of light, supposedly due to rotation
(dwelling on this fallacious point is a misguided endeavor).
It's mainly that the beams must travel different distances
to their rendezvous, at a point which will have moved
toward one beam and away from the other, by the time they get there !
The loop may be a polygonal path, with a mirror at every corner
(light travels at speed c between mirrors).
Alternately, fiber optics may be used so light travels at a celerity c/n
relative to the loop
[ n being the refractive index of the optical material ]
via ordinary total internal reflection (TIR).
In some of the so-called "laser gyroscopes" (devised around 1963)
of modern guidance systems, the latter method is now used
to produce a resonant frequency proportional to the rate of rotation.
One advantage over mirrors is that a fiber optic cable can easily be coiled to
increase the effective area of the loop
within a compact volume.
Rotation rates around $5 \times 10^{-11}$ rad/s
have actually been detected this way...
We consider only the case of a circular loop, of radius R,
rotating about its axis
(note that n = 1 approximates a regular
polygonal path with many mirrors).
Let's measure positively counterclockwise angles (that's the usual convention)
and assume that the loop is rotating in this positive direction.
Euclidean geometry remains valid for a fixed observer, who thus sees each point
of the loop travel a distance $2\pi R$
in a time $2\pi/\omega$.
Therefore, the nonrelativistic expression for the speed v
of each point of the loop does hold:
$$v = \omega R$$
(The FitzGerald-Lorentz contraction
applies to the length of moving objects, not
to the space they travel.)
Using this value of v in the (exact) expression from the
previous article, we obtain the
value for the celerity $w_\pm$ of either beam in the moving loop,
in our fixed viewpoint
("+" for the positive beam and "-" for the other):
$$w_\pm = \pm\frac{c}{n} + \omega R\,(1 - 1/n^2)\,[\,1 \pm \omega R/nc\,]^{-1}$$
This signed quantity is, of course, equal to $\pm c$
when n = 1.
Now, if both beams start from the same point at t = 0,
this point will be at an angle $\omega t$
when the beam reaches it again after one turn, so that t is a solution of:
$$t\,w_\pm / R = \omega t \pm 2\pi$$
Solving for t, we get:
$$t_- = \frac{2\pi R}{-w_- + \omega R} \qquad t_+ = \frac{2\pi R}{w_+ - \omega R}$$
The exact values of $w_\pm$
make n vanish from the time lag ! [2005-07-27]
Sagnac Time Lag
(inertial observer at rest at the center of rotation)
$$t_+ - t_- = \frac{4\pi R^2\,\omega}{c^2 - \omega^2 R^2}$$
With ordinary objects, the above is indistinguishable from
$4\pi R^2\,\omega / c^2$
and is thus proportional to the rotation rate $\omega$ and to the area $\pi R^2$ of the loop.
Indeed, the speed $\omega R$ of the circumference is
much lower than the speed of light
(the relative error is about $1.2 \times 10^{-13}$
for a 10 cm radius at 10 000 rpm).
Even for relativistic things, the above denominator must be positive
(there's no such thing as a large rotating "solid").
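The disappearance of n from the time lag can be checked numerically. The sketch below assumes the signed beam celerity $w_\pm = \pm c/n + \omega R\,(1 - 1/n^2)/(1 \pm \omega R/nc)$ and the closed form $4\pi R^2 \omega / (c^2 - \omega^2 R^2)$. It uses units where c = 1 and a deliberately exaggerated rim speed (0.2 c) so that rounding errors do not mask the result; function names and sample values are illustrative.

```python
import math

def beam_celerity(sign: int, n: float, omega: float, R: float, c: float = 1.0) -> float:
    """Signed celerity of a beam for the fixed observer.
    sign = +1 for the beam travelling with the rotation, -1 for the other."""
    v = omega * R
    return sign * c / n + v * (1.0 - 1.0 / n**2) / (1.0 + sign * v / (n * c))

def sagnac_lag(n: float, omega: float, R: float, c: float = 1.0) -> float:
    """Time lag t_plus - t_minus obtained from the two round-trip times."""
    v = omega * R
    t_plus = 2.0 * math.pi * R / (beam_celerity(+1, n, omega, R, c) - v)
    t_minus = 2.0 * math.pi * R / (-beam_celerity(-1, n, omega, R, c) + v)
    return t_plus - t_minus

def sagnac_lag_closed_form(omega: float, R: float, c: float = 1.0) -> float:
    """Closed form, which does not involve the refractive index."""
    return 4.0 * math.pi * R**2 * omega / (c**2 - (omega * R) ** 2)

omega, R = 0.2, 1.0   # rim speed of 0.2 c, for illustration only
for n in (1.0, 1.5, 2.0):
    print(n, sagnac_lag(n, omega, R), sagnac_lag_closed_form(omega, R))
```

The same lag is printed for every index, whether light travels between mirrors (n = 1) or inside a fiber.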
A Brief History of the Harress-Sagnac Effect :
The Sagnac effect was first dreamt of by
Oliver Lodge (1851-1940) in 1893 and by
Albert A. Michelson in 1904.
In 1911, Francis Harress tried to substitute glass for the
liquids used in Fizeau's investigations.
He had the idea to observe rotating rings of glass, but could clearly do so
only at fixed points of such rings...
As we've discovered theoretically in the above discussion, the resulting effect
does not depend on the index of refraction involved !
Although Harress failed to understand his experimental results,
the effect is still known as Harress-Sagnac,
mostly in the context of fiber optics.
In 1913, the French physicist Georges Sagnac (1869-1926) published his own
experimental results and properly described the effect now named after him.
The prior observation by Harress of the effect's "fiber optics version"
was noted later.
Michelson and Gale used this effect in 1925 to measure the absolute rotation
of the Earth, with a rectangular optical loop
0.2 mile wide and 0.4 mile long.
studies of the Sagnac effect by Grigorii B. Malykin (1997)
Relativistic Limit of a Relative Speed
What is w = (u-v) / (1-uv/c2 ) as both u and v approach c ?
Answer: The quantity w doesn't have a definite limit as both u and v approach c.
This formula for w describes how [collinear] relativistic velocities are combined:
Consider a straight railroad track where a train (U) moves at speed u
and another train (V) moves at speed v
(both speeds being measured relative to some platform
on which the "observer" is located).
According to the Special Theory of Relativity
(see above) the quantity
w = (u-v) / (1-uv/c2 )
is simply what the speed of train U would be for an observer located on train V.
If the observation platform is on a fast rocket moving
parallel to the railroad tracks and approaching the speed of light c
(with respect to the tracks),
both u and v will be close to c.
Yet, the quantity w remains a low number (like 10 mph or 20 mph)
equal to the speed of one train relative to the other.
This relative speed is not affected by how fast an irrelevant nearby rocket
might be moving...
To make this relativistic math more transparent, you may want to consider
rapidity instead of speed.
The interesting thing is that rapidities are additive,
whereas speeds are not:
The rapidities x, y, z corresponding to the above speeds u, v, w are thus related by
the much simpler equation:
z = x - y
As speeds approach c,
rapidities approach infinity and the question becomes:
What's the limit of z = x-y
when both x and y become infinite?
Answer: Such a limit is clearly undefined.
Take your pick, use either the physical or the mathematical approach...
gerry (2002-06-30) [follow-up to the previous article]
What's the relative velocity of two photons?
The velocity of object B relative to object A is the velocity of B measured in the frame of
reference where A has zero speed.
When A is a photon,
we are in trouble with this definition,
because there is no such proper frame of reference.
The question is thus to determine if/when the above definition
can be consistently extended to include objects moving at speed c.
Such an extension exists, except in the very special case of two photons moving
in exactly the same direction, where the notion of "relative speed"
breaks down completely
(as shown in the above discussion of the relativistic
formula for the "addition" of parallel velocities).
If A is a photon but B is not also moving at speed c,
we may still reach a firm conclusion by noticing
that the velocity of B relative to A must be the opposite of the velocity of
A relative to B (which is well-defined, unless B moves at speed c).
Let's consider now the case of two photons of velocities u and v.
These two velocities are 3D vectors of length c; the corresponding 4-vectors
(c,u) and (c,v) have zero 4-dimensional "length".
If u and v are not equal (that's to say that the two photons
have different directions) the 3D length
||u-v|| is nonzero
and it turns out that the relative velocity of two objects chasing the
two photons (at sub-c speed) approaches a definite limit when the velocities
of the chasers both approach the velocities of their respective chasees.
This limit is:
c (u-v) / ||u-v||
If we have to define the relative velocity of two photons moving along different
directions (possibly opposite, but not equal),
this is the only sensible way to do it.
On the other hand, if u and v are equal, we are back to our previous discussion of collinear speeds:
Two sub-c chasers with any arbitrary relative velocity
(not necessarily collinear with u = v)
could both approach the velocity (u = v) of both photons.
Therefore, there is no continuous way to define the relative velocities of two photons
moving in the same direction.
Having said this, we may or may not find it useful to state
(rather arbitrarily) that two such photons have zero relative speed.
However, we failed to find a compelling reason for this "obvious" choice
and cannot guarantee that it would not lead to a paradox of some kind...
Minkowski Space and 4-Vectors
4D objects transforming according to the Lorentz rule
Space by itself and time by itself are fading into mere shadows.
Only a union of the two will preserve an independent reality.
Hermann Minkowski (1908)
Minkowski (1864-1909) spoke those words on September 21, 1908,
in the opening lecture (entitled Raum und Zeit
or Space and Time) for the mathematical section at the annual meeting
of the German Association of Natural Scientists and Physicians
(Gesellschaft Deutscher Naturforscher und Ärzte)
in Cologne, Germany.
Weeks later (on January 12, 1909) Hermann Minkowski would die
of a ruptured appendix, at the age of 44.
Minkowski was explaining that a timelike component is associated with any physical 3-vector,
which makes the [contravariant] coordinates of the resulting 4-dimensional object transform
according to the same Lorentz transform
that applies to the 4-dimensional interval between two pointlike events.
A scalar quantity a whose value does not depend on the
coordinate system used to locate
space-time events is called a relativistic invariant.
The 4-dimensional gradient of such a scalar is the following 4-vector:
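$$\mathrm{Grad}\;a = \left( -\frac{1}{c}\frac{\partial a}{\partial t},\ \mathbf{grad}\;a \right) = \left( -\frac{1}{c}\frac{\partial a}{\partial t},\ \frac{\partial a}{\partial x},\ \frac{\partial a}{\partial y},\ \frac{\partial a}{\partial z} \right)$$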
Applying this definition in another coordinate system leads to coordinates
that are indeed obtained from the above ones via the relevant Lorentz transform.
Three-Dimensional Expression of the Lorentz Transform
A Lorentz boost of speed
$V = c\beta$
may be expressed vectorially.
Let V be the [vectorial] 3-dimensional velocity of
(S') relative to the coordinate system (S).
Introducing the vectorial quantity
$\boldsymbol{\beta} = \mathbf{V}/c$,
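we may remark that any 3D vector A is the sum of
two vectors, one parallel to $\boldsymbol{\beta}$
(the projection onto $\boldsymbol{\beta}$,
expressed with a dot product)
the other perpendicular to it:
$$\mathbf{A} = (\mathbf{A} \cdot \boldsymbol{\beta} / \beta)\,\frac{\boldsymbol{\beta}}{\beta} + \left[ \mathbf{A} - (\mathbf{A} \cdot \boldsymbol{\beta} / \beta)\,\frac{\boldsymbol{\beta}}{\beta} \right]$$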
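The Lorentz transform applied to the 4-vector
(a, A) doesn't change the latter (perpendicular) part and specifies the interrelated
transformations of the time coordinate (a)
and of the "parallel" spatial coordinate
$(\mathbf{A} \cdot \boldsymbol{\beta} / \beta)$.
Putting it all back together, we obtain:
$$a' = \gamma\,(a - \mathbf{A} \cdot \boldsymbol{\beta})$$
$$\mathbf{A}' = \mathbf{A} + \left[ (\gamma - 1)\,\frac{\mathbf{A} \cdot \boldsymbol{\beta}}{\beta^2} - \gamma\,a \right] \boldsymbol{\beta}$$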
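Here is a minimal sketch of such a vectorial boost, assuming the form $a' = \gamma\,(a - \mathbf{A}\cdot\boldsymbol{\beta})$ and $\mathbf{A}' = \mathbf{A} + [(\gamma - 1)(\mathbf{A}\cdot\boldsymbol{\beta})/\beta^2 - \gamma a]\,\boldsymbol{\beta}$. The function names `boost` and `interval` are hypothetical, and vectors are plain Python tuples.

```python
import math

def boost(a, A, beta):
    """Apply a Lorentz boost of dimensionless velocity beta (|beta| < 1)
    to the 4-vector (a, A). Returns the transformed pair (a', A')."""
    b2 = sum(x * x for x in beta)
    if b2 == 0.0:
        return a, tuple(A)          # no motion: nothing changes
    gamma = 1.0 / math.sqrt(1.0 - b2)
    A_dot_b = sum(x * y for x, y in zip(A, beta))
    a_new = gamma * (a - A_dot_b)
    k = (gamma - 1.0) * A_dot_b / b2 - gamma * a
    A_new = tuple(x + k * y for x, y in zip(A, beta))
    return a_new, A_new

def interval(a, A):
    """4D dot product of (a, A) with itself, timelike term counted negatively."""
    return sum(x * x for x in A) - a * a

# A boost along the x-axis reduces to the usual one-dimensional transform.
print(boost(1.0, (2.0, 0.0, 0.0), (0.6, 0.0, 0.0)))

# The interval is preserved by a boost in an arbitrary direction.
a, A, beta = 0.7, (1.0, -2.0, 0.5), (0.3, 0.4, -0.2)
a2, A2 = boost(a, A, beta)
print(interval(a, A), interval(a2, A2))
```

Boosting back with the opposite vector should restore the original components, which is a convenient sanity check.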
A linear transformation which preserves
spacetime intervals (while respecting the orientation of space and
the direction of time) is necessarily a composition of such a
boost with a spatial rotation
(which leaves time unchanged and preserves spatial distances without
changing the orientation of space).
Such transformations form the 6-dimensional
Restricted Lorentz Group
(a spatial rotation may be specified by its 3 Euler angles, while
a boost may be given by the 3 components of the vector
b introduced above).
If space inversion and time reversal are allowed, we obtain the
full Lorentz group, which has 4 connected components (the restricted group above is the one containing the identity).
Wave and Phase: Wave Vector = Phase Gradient
An Introduction to Four-Dimensional Wave Vectors.
The phase $\varphi$
of a periodic phenomenon describes
its position in the cycle: a crest, descending, a trough, ascending, etc.
This does not depend on whatever coordinate system we may use to locate the event:
$\varphi$ is a relativistic invariant.
The phase $\varphi$ of an
ideal planar wave at a given point
R = (ct, r) is given by the following relation,
up to some irrelevant additive constant:
$$\varphi = \mathbf{k} \cdot \mathbf{r} - \omega t$$
The pulsatance ($\omega$) is proportional to the wave's frequency ($\nu$)
whereas the magnitude of the 3D wave-vector (k) is tied to
the wavelength ($\lambda$), namely:
$$\omega = 2\pi\nu \qquad \|\mathbf{k}\| = 2\pi/\lambda$$
The celerity of the wave (its phase speed) is the product
$u = \lambda\nu$.
$K = (\omega/c, \mathbf{k})$ is a
4-vector because it's the
4D gradient of a scalar invariant:
$$(\omega/c,\ \mathbf{k}) = \mathrm{Grad}\;\varphi$$
This quadrivector is known as the four-dimensional wave vector.
The 3-D vector k is called either wave vector
or (less often) propagation vector.
Relativistic Transverse Doppler Effect
The radial effect is multiplied by an isotropic relativistic factor.
Among redshift factors, the classical
Doppler effect comes from a changing distance between the
observer and the source of radiation.
Another type of redshift is also due to local relative motion.
It's entirely relativistic and applies even to
transverse motion, when this distance doesn't change:
If the source is moving at velocity v, its proper pulsatance
$\omega_0 = 2\pi\nu_0$
(in its own rest frame) is given by the
Lorentz transform for the wave-vector:
$$\omega_0/c = \gamma\,(\omega/c - \beta k_x) = \gamma\,(\omega/c - \mathbf{k} \cdot \mathbf{v}/c)$$
Therefore, calling $\theta$ the angle between v and
the direction of observation (-k) we have:
$$\nu_0 = \gamma\,(\nu + \cos(\theta)\,\|\mathbf{v}\| / \lambda)$$
If the signal propagates at celerity
$u = \lambda\nu$
(not necessarily c) in the frame of the observer, we obtain the following
relation (see note below if u<c):
$$\lambda\,\nu_0\,\sqrt{1 - v^2/c^2} = u + v \cos\theta$$
This is merely the classical (radial) Doppler effect,
with an extra relativistic factor
corresponding to the observed stretching of time in a moving source.
In this context, that relativistic correction factor is called the
"Transverse Doppler Effect".
That discovery is universally attributed to Einstein, even by authors who prefer to
credit other foundations of Special Relativity to
Poincaré or Lorentz...
Einstein used it in September 1905
to establish the
inertia of energy
(E = mc2 ).
The above is of the form
$\lambda\,\nu_0' = u'$:
The classical radial celerity (u') is
the wavelength ($\lambda$)
multiplied into a relativistically adjusted frequency ($\nu_0'$).
A Dubious Academic Tradition:
Things become more obscure when the above is "simplified" for light in
a vacuum (u = c)
in 3 special cases which are popular with textbook writers:
Doppler effect for light (u = c): Values of $1 + z = \lambda / \lambda_0$
| Outbound ($\theta = 0$) | Transverse ($\theta = \pm\pi/2$) | Inbound ($\theta = \pi$) |
| $[\,(1+v/c) / (1-v/c)\,]^{1/2}$ | $1 / (1 - v^2/c^2)^{1/2}$ | $[\,(1-v/c) / (1+v/c)\,]^{1/2}$ |
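The three textbook cases are instances of a single expression. The sketch below assumes that, for light in a vacuum, the general relation reduces to $\lambda/\lambda_0 = (1 + (v/c)\cos\theta) / (1 - v^2/c^2)^{1/2}$, where $\theta$ is the angle between the velocity of the source and the direction of observation. The function name `one_plus_z` and the speed 0.6 c are illustrative.

```python
import math

def one_plus_z(beta: float, theta: float) -> float:
    """Ratio lambda/lambda0 for light from a source moving at beta = v/c,
    theta being the angle between its velocity and the line of sight
    (theta = 0 for a receding source)."""
    return (1.0 + beta * math.cos(theta)) / math.sqrt(1.0 - beta**2)

beta = 0.6
print("outbound  :", one_plus_z(beta, 0.0), math.sqrt((1 + beta) / (1 - beta)))
print("transverse:", one_plus_z(beta, math.pi / 2), 1.0 / math.sqrt(1 - beta**2))
print("inbound   :", one_plus_z(beta, math.pi), math.sqrt((1 - beta) / (1 + beta)))
```

Each line prints the general expression next to the corresponding special-case formula, so that the two columns can be compared directly.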
Note, when u < c :
As remarked in the general derivation, the above is only valid when the
signal emitted by the source obeys a classical wave equation
with celerity u in a frame at rest with
respect to the observer.
For light in a vacuum, this is of no concern because the main
tenet of special relativity does state that something
that propagates as a wave of celerity c in one particular
inertial frame does so in any other inertial frame as well.
Not so when the celerity u is less than c, though!
In that case, the wave equation is only valid in the proper frame of
the propagation medium (e.g., air in the case of sound).
Otherwise, we are faced with
a complicated combination of the Doppler effect and the
(only the latter is at work when the source and the observer move at the same
velocity with respect to the propagation medium, as when one listens to
an outdoor concert in a steady wind).
Energy (E) and Momentum (P)
E/c and P form a 4-vector (i.e., they transform like ct and r).
So far, we have only dealt with relativistic kinematics
by introducing quantities on which a description of motion is based
which is consistent with the basic tenets of Special Relativity,
namely equivalence of all observers in relative uniform motion and
constancy of the speed of light measured by all such observers.
Another principle has to be introduced to provide the
philosophical equivalent of the basic laws of Newtonian mechanics which introduce
the notion of force and relate it to changes in motion...
One approach of Newtonian mechanics which can be consistently generalized
to the framework of Special Relativity is to introduce
the notion of linear momentum (the product P
of mass m by velocity v)
and to postulate that the time derivative of that quantity
is a vectorial quantity, called force, which somehow describes dynamical
exchanges between distinct parts. (This is where Newton's famous equation
F = m a comes from.)
Newton postulated that "to every action,
there's an opposite reaction of the same magnitude" which is a
fancy way of saying that every variation in the momentum
of one part will always be exactly compensated by the variation in the
momentum of another, so that total momentum is conserved.
derivation of the inertia of energy
by Terence Tao.
proof of E = m c2
by Henry Reich (Minute Physics)
(Belgium; e-mail 2005-04-14)
What's the relation between E = m c 2
and the formula E = ½ m v 2 ?
The relativistic energy of a particle of rest mass
m and speed v is:
$$E = (m\gamma)\,c^2 = \frac{m c^2}{\sqrt{1 - v^2/c^2}}$$
Unfortunately, this is also written E = m c2
by letting the symbol m be the so-called relativistic mass,
which we denote here by $m\gamma$ as we reserve
the symbol m itself for the so-called "invariant mass"
or "rest mass" of a particle (following
Under the assumption that v is much smaller than c, a good approximation for E
is obtained from the Taylor expansion of the above:
$$E = mc^2 \left( 1 + \tfrac{1}{2}(v/c)^2 + \tfrac{3}{8}(v/c)^4 + \tfrac{5}{16}(v/c)^6 + \dots \right)$$
At low speed, we may just keep the first two terms of this series:
$$E \approx mc^2 + \tfrac{1}{2}\,m v^2$$
In prerelativistic mechanics, the first term is irrelevant because it's a constant, whereas
the second term is called kinetic energy.
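To see how good the low-speed approximation is, the following minimal sketch compares it with the exact relativistic energy. It uses units where c = 1 and a unit rest mass; the function names and sample speeds are illustrative.

```python
import math

def total_energy(m: float, v: float, c: float = 1.0) -> float:
    """Exact relativistic energy of a particle of rest mass m and speed v."""
    return m * c**2 / math.sqrt(1.0 - (v / c) ** 2)

def low_speed_energy(m: float, v: float, c: float = 1.0) -> float:
    """First two terms of the Taylor expansion: rest energy plus (1/2) m v^2."""
    return m * c**2 + 0.5 * m * v**2

for beta in (0.001, 0.01, 0.1, 0.5):
    exact = total_energy(1.0, beta)
    approx = low_speed_energy(1.0, beta)
    print(f"v/c = {beta}: relative error = {(exact - approx) / exact:.3e}")
```

The error is dominated by the next term of the series, (3/8) (v/c)^4, so it stays negligible at everyday speeds and only becomes noticeable for a sizable fraction of c.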
Actually, the first term was originally conjectured by Einstein because of the above
considerations: It's just mathematically simpler to assume that
a motionless body of mass m already has an energy $mc^2$.
This explains painlessly the so-called mass defect observed
in the decay of radioactive elements :
A nuclear decay always leaves remnants whose rest masses add up to less than the mass
of the original nucleus. The balance of the energy appears in the form
of either kinetic energy or radiation...
The relation E = m c 2 has been verified directly
by countless experiments in nuclear physics, starting (in 1932) with the artificial
transmutation of lithium into two alpha particles, using a beam of fast protons
produced by the particle accelerator of Cockcroft and Walton.
In 1938, Otto Hahn and Fritz Strassmann observed that barium
is produced when uranium is bombarded with neutrons.
Lise Meitner and Otto Frisch interpreted this result as an induced
fission of the uranium nucleus.
Photons and Other Massless Particles
To have a finite energy, a massless particle must travel at speed c.
Particles travelling at speed c can only have a finite energy if they have
zero rest mass ; they only exist in motion
(always at speed c).
In this case, we must use a quantum expression of the energy, in terms of an associated
wave: The energy of a photon of frequency $\nu$ is $E = h\nu$,
where h is Planck's constant.
This relation was proposed by Einstein (1905) in his explanation of the laws of
the photoelectric effect (for which he received
the Nobel Prize in 1921)
which may be construed as a formal discovery of the photon...
At first, Isaac Newton (1643-1727)
had argued that light was corpuscular
[i.e., consisting of discrete particles]
against Christiaan Huygens (1629-1695)
who held the view that it was undulatory [wavelike].
The viewpoint of the latter was revived in 1803, when Thomas Young
proved light was capable of interference.
The same doctrine was reinforced when James Clerk Maxwell (1831-1879)
put forth his famous differential equations
governing electromagnetism in general,
and light in particular [at a macroscopic level].
Max Planck (1858-1947) also believed strictly in the wave nature of light,
even after his own success in explaining the blackbody spectrum
by assuming matter and radiation could only exchange energy in
packets ("quanta") proportional to the radiation's frequency.
Thus, it was a revolutionary proposal of Einstein's that Planck's packets of energy could
actually be related to the particles of radiation envisioned by Newton.
A generalization to electrons and other subluminal particles
was proposed by the French physicist Louis de Broglie,
who put forth a new principle establishing
the dual corpuscular and undulatory nature
of everything, as discussed next.
The Principle of de Broglie is Relativistic
Corpuscular and undulatory duality, as proposed in 1923/1924.
In 1923, the French physicist
Louis de Broglie (1892-1987;
Nobel 1929) was still a graduate
student at the Sorbonne when he proposed the idea of matter waves,
which he defended successfully in 1924 (with the support of Einstein himself)
in front of a doctoral committee which included
Paul Langevin (1872-1946).
At the time,
de Broglie stated that his proposed matter waves might
be observable in experiments involving crystal diffraction with electrons.
Such experimental confirmations came in 1927, with two independent experiments:
one by Clinton J. Davisson (1881-1958; Nobel 1937)
and Lester H. Germer (1896-1971), the other by G.P. Thomson (1892-1975; Nobel 1937).
Ironically, George Paget Thomson thus demonstrated the undulatory nature of
electrons, whose corpuscular properties were established three decades earlier
by his own father, J.J. Thomson
(1856-1940; Nobel 1906).
De Broglie's idea was that any particle is associated with a so-called pilot wave:
The momentum of one and the wave-vector of the other are proportional and
the coefficient of proportionality is a universal constant.
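We'll state de Broglie's principle
using the 4-dimensional wave-vector K, whose spatial part is the ordinary wave-vector k:
$$ K = (\omega/c,\ \mathbf{k}) \qquad \omega = 2\pi\nu \qquad \|\mathbf{k}\| = 2\pi/\lambda $$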
De Broglie's Principle Expressed Relativistically :
Louis de Broglie proposed that any particle of
4D momentum P = (E/c, p)
was "associated with" a wave of
(4D) wave-vector K proportional to P, namely:
$$ P = \frac{h}{2\pi}\,K \qquad [\ h \text{ is Planck's constant}\ ] $$
This 4D equality breaks down into a scalar component and a (3D) vectorial component.
The former is Planck's relation, the latter is de Broglie's relation,
usually stated using only the magnitude
p = || p || of the 3D-momentum:
$$ E = h\nu \qquad p = h/\lambda $$
Phase Celerity (u)
vs. Mechanical Speed (v)
The above relations make the celerity $u = \lambda\nu$
equal to E/p.
For a particle of rest mass m and speed v, we have
$(mc)^2 = (E/c)^2 - p^2$. Therefore:
$$ \frac{1}{u^2} = \frac{p^2}{E^2} = \frac{1}{c^2} - \left(\frac{mc}{E}\right)^2 = \frac{1 - (1 - (v/c)^2)}{c^2} = \left(\frac{v}{c^2}\right)^2 $$
This establishes an extremely simple relation between celerity and speed:
$$ u\,v = c^2 $$
The [phase] celerity (u) for a massive particle is thus greater than c,
as energy and information travel at the speed (v),
which remains lower than c.
Formerly, the above mechanical speed (v) was identified with the "group speed"
which measures the propagation of a wave's modulation
(i.e., the overall shape of its envelope).
Actually, this "fact" was used by Erwin Schrödinger himself in one particular
justification for Schrödinger's equation.
However, the mathematical expression for group speed does not necessarily imply
such a relation, which is not absolute:
Although it's usually slower than light, group speed can indeed
be faster than light. This was demonstrated experimentally in the 1980's.
De Broglie wavelength of a relativistic particle :
$$ -(mc)^2 = -(E/c)^2 + p^2 = -(E/c)^2 + (h/\lambda)^2 \qquad\text{so}\qquad \lambda = \frac{hc}{\sqrt{(E - mc^2)\,(E + mc^2)}} $$
The relativistic kinetic energy
$W = E - mc^2$
is often used to characterize the speed of fast particles, especially
when $x = mc^2/W$ is small:
$$ \lambda = \frac{hc}{W\sqrt{1 + 2x}} $$
On the other hand, in the nonrelativistic case where
W is approximately
equal to ½ m v 2 ,
$\lambda \approx h/(mv)$.
More precisely, the usual expression of the 3D relativistic momentum yields:
$$ \lambda = \frac{h}{p} = \frac{h}{mv}\,\sqrt{1 - (v/c)^2} $$
Finally, here's a formula involving the ratio $E/mc^2$:
$$ \lambda = \frac{h}{mc}\left[\left(\frac{E}{mc^2}\right)^2 - 1\right]^{-1/2} $$
The de Broglie wavelength of a proton of kinetic energy 240 MeV is 1.74 fm.
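The 1.74 fm figure can be checked numerically from the wavelength formula above. The following is only a minimal sketch: the function names are made up for the illustration, the 240 MeV is taken to be the kinetic energy W, and the rounded constants (hc in MeV·fm, proton rest energy) are assumptions supplied here rather than values given in the text.

```python
import math

H_C_MEV_FM = 1239.841984   # h*c in MeV.fm (assumed rounded value)
PROTON_MC2_MEV = 938.272   # proton rest energy in MeV (assumed rounded value)

def de_broglie_wavelength_fm(kinetic_mev, rest_mev):
    """Wavelength in fm from kinetic energy W and rest energy mc^2 (both in MeV)."""
    total = kinetic_mev + rest_mev                             # E = W + mc^2
    pc = math.sqrt((total - rest_mev) * (total + rest_mev))    # pc = sqrt((E - mc^2)(E + mc^2))
    return H_C_MEV_FM / pc                                     # lambda = hc / pc

def speed_ratios(kinetic_mev, rest_mev):
    """Return (v/c, u/c): mechanical speed and phase celerity, in units of c."""
    total = kinetic_mev + rest_mev
    pc = math.sqrt(total**2 - rest_mev**2)
    return pc / total, total / pc                              # v = p c^2 / E and u = E / p

wavelength = de_broglie_wavelength_fm(240.0, PROTON_MC2_MEV)
v_over_c, u_over_c = speed_ratios(240.0, PROTON_MC2_MEV)
print(f"lambda = {wavelength:.2f} fm, v/c = {v_over_c:.3f}, u/c = {u_over_c:.3f}")
```

The second helper returns a speed below c and a phase celerity above c whose product is c², in line with the relation between u and v stated earlier.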
The Compton Effect
The frequency of a photon changes when it collides with an electron
(of rest mass m).
Compton scattering is a deflection of X-rays by matter that entails
a shift in their frequency
($\nu$) which depends on the angle of deflection.
This was explained by Arthur H. Compton
(1892-1962; Nobel 1927)
in terms of collisions between incoming photons and recoiling electrons:
In a system where the target electron (of rest mass m) is at rest,
its 4-momentum is (mc,0).
The incoming photon moves along the x-axis and goes out
in the (x,y) plane, at an angle $\theta$
from Ox. Now, energy-momentum is conserved:
After the shock, the 4-momentum P
of the electron is thus obtained by subtracting the momentum of the outgoing photon
from the sum of the two initial 4-momenta.
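The Minkowski square of the electron's 4-momentum
is always $-(mc)^2$. So, multiplying by $c^2$ and calling $\nu$ and $\nu'$ the frequencies of the incoming and outgoing photon:
$$ -m^2c^4 = -(mc^2 + h\nu - h\nu')^2 + (h\nu - h\nu'\cos\theta)^2 + (h\nu'\sin\theta)^2 $$
Introducing the Compton frequency of the electron
( $\nu_c = mc^2/h$ ), this boils down to:
$$ \nu = \nu'\,[\,1 + (1 - \cos\theta)\,\nu/\nu_c\,] $$
The effect is often stated in terms of a change in wavelength,
using the Compton wavelength of the electron
$\lambda_c = h/mc = 0.0024263102389(16)$ nm:
$$ \lambda' = \lambda\,[\,1 + (1 - \cos\theta)\,\lambda_c/\lambda\,] \qquad\text{or}\qquad \lambda' - \lambda = (1 - \cos\theta)\,\lambda_c $$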
The Compton shift
(z) is best defined, like other types
of redshift, as the relative change in wavelength.
If E is the energy of the incoming photon, the outgoing photon has energy E/(1+z).
$$ z = \frac{\lambda' - \lambda}{\lambda} = (1 - \cos\theta)\,\frac{\nu}{\nu_c} = (1 - \cos\theta)\,\frac{E}{mc^2} $$
The recoiling electron, initially at rest, is imparted a speed v and a kinetic
energy W equal to the opposite of the change in energy of the photon:
$$ W = E - \frac{E}{1+z} = \frac{E}{1 + 1/z} $$
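As an illustration of the relations above, here is a small sketch that computes the Compton shift z, the energy of the outgoing photon and the kinetic energy of the recoiling electron. The function names, the rounded electron rest energy and the two sample photons (a 662 keV gamma ray and a 2.5 eV visible photon) are assumptions chosen for the example, not values taken from the text.

```python
import math

ELECTRON_MC2_KEV = 510.999  # electron rest energy mc^2 in keV (assumed rounded value)

def compton_shift(photon_kev, theta):
    """Relative change in wavelength, z = (1 - cos(theta)) E / mc^2."""
    return (1.0 - math.cos(theta)) * photon_kev / ELECTRON_MC2_KEV

def scattered_photon_kev(photon_kev, theta):
    """Energy of the outgoing photon, E / (1 + z)."""
    return photon_kev / (1.0 + compton_shift(photon_kev, theta))

def recoil_kinetic_kev(photon_kev, theta):
    """Kinetic energy W of the electron, written E z / (1 + z) to stay finite at z = 0."""
    z = compton_shift(photon_kev, theta)
    return photon_kev * z / (1.0 + z)

# Hypothetical inputs: both photons bounce straight back (theta = pi).
gamma_out = scattered_photon_kev(662.0, math.pi)
gamma_recoil = recoil_kinetic_kev(662.0, math.pi)
visible_z = compton_shift(0.0025, math.pi)     # 2.5 eV expressed in keV

print(f"outgoing photon: {gamma_out:.1f} keV, recoiling electron: {gamma_recoil:.1f} keV")
print(f"relative shift for the visible photon: {visible_z:.1e}")
```

The expression E z / (1 + z) is algebraically the same as E / (1 + 1/z); it is used here only because it avoids a division by zero for forward scattering.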
With energetic photons (gamma rays)
the target electrons may recoil at high speed and
cause bluish Cherenkov radiation in transparent bodies.
Klein-Nishina formula for Compton scattering (1929)
Differential cross-section of electrons in Compton diffusion.
Photons deflected by an angle between $\theta$ and $\theta + d\theta$
span a solid angle
$d\Omega = 2\pi \sin\theta\, d\theta$.
Such a deflection occurs with the same probability as would a collision of a
classical point particle
with an obstacle of cross section $d\sigma$.
For Compton diffusion,
this is given by the Klein-Nishina formula,
as derived from a 1928 paper by
Oskar Klein (1894-1977)
and Yoshio Nishina (1890-1951):
$$ \frac{d\sigma}{d\Omega} = \tfrac{1}{2}\,(r_e)^2\,[\,P - P^2 \sin^2\theta + P^3\,] \qquad\text{where}\qquad P = [\,1 + (1 - \cos\theta)\,\nu/\nu_c\,]^{-1} $$
This involves the classical
electron radius, obtained by equating the rest energy
$mc^2$ with twice the electrical energy
of a sphere of radius $r_e$ bearing the charge of the electron
(q) uniformly distributed on its surface:
$$ r_e = \frac{q^2}{4\pi\varepsilon_0\, mc^2} $$
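The total cross-section $\sigma$
is obtained by integrating the above, using the parameter
$u = \nu/\nu_c$
and a new variable $x = \cos\theta$ :
$$ \sigma = \pi\,(r_e)^2 \int_{-1}^{1} \left[ \frac{1}{1+(1-x)u} + \frac{1}{(1+(1-x)u)^3} - \frac{1-x^2}{(1+(1-x)u)^2} \right] dx $$
$$ \sigma = \pi\,(r_e)^2 \left[ \frac{4}{u^2} + \frac{\mathrm{Log}\,(1+2u)}{u} - \frac{2\,(u+1)\,\mathrm{Log}\,(1+2u)}{u^3} + \frac{2\,(u+1)}{(1+2u)^2} \right] $$
For photons of low energies (small values of u)
this total cross-section reduces to that of classical Thomson scattering:
$$ \sigma = \frac{8\pi}{3}\,(r_e)^2 $$
At high energies, the cross-section
($\sigma$) vanishes but
the average transfer of energy
increases logarithmically with the energy of the incoming photon.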
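The integration over x = cos θ can be cross-checked numerically. The sketch below is only an illustration under stated assumptions: it works with the dimensionless ratio σ / (π rₑ²), uses a plain Simpson rule, and compares the result with a closed form obtained by carrying out the integral with the substitution t = 1 + (1−x)u. The function names and the number of steps are arbitrary choices made here.

```python
import math

def kn_integrand(x, u):
    """P + P^3 - (1 - x^2) P^2, with P = 1 / (1 + (1 - x) u) and x = cos(theta)."""
    p = 1.0 / (1.0 + (1.0 - x) * u)
    return p + p**3 - (1.0 - x * x) * p**2

def kn_total_numeric(u, steps=2000):
    """sigma / (pi r_e^2) by Simpson's rule over x in [-1, 1]; steps must be even."""
    h = 2.0 / steps
    total = kn_integrand(-1.0, u) + kn_integrand(1.0, u)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * kn_integrand(-1.0 + i * h, u)
    return total * h / 3.0

def kn_total_closed(u):
    """Closed form of the same dimensionless quantity (u > 0)."""
    log_term = math.log1p(2.0 * u)
    return (4.0 / u**2 + log_term / u
            - 2.0 * (u + 1.0) * log_term / u**3
            + 2.0 * (u + 1.0) / (1.0 + 2.0 * u) ** 2)

for u in (0.1, 1.0, 10.0):
    print(u, kn_total_numeric(u), kn_total_closed(u))
```

For very small u the closed form suffers from cancellation between large terms, so the numerical integral is the safer way to approach the low-energy limit 8/3.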
Bert Dobbelaere (2008-01-29; e-mail)
Compton diffusion does not cause any optical aberration.
Compton diffusion transfers some energy to the recoiling electron.
With incoming gamma rays, this transfer of energy can easily exceed what's
necessary to overcome the binding energy of electrons bound to atomic orbitals.
Beyond that threshold, a continuous spectrum of energy is allowed.
On the other hand, the low-energy photons of visible light can only transfer
something like 1/100000 of their energy to the electron they collide with,
according to the above formula
(the maximum occurs when the photon bounces straight back
in the direction it came from).
The electrons bound in the ordinary orbitals of chemical elements
can only change their energies in steps which are far greater than that.
Therefore, the recoil must be absorbed by the whole atomic structure rather than by
a lone electron.
This reduces the relative Compton shift of low-energy
photons so drastically that no measurable effect can be observed.
Even after many billions of wavelengths traveled through an
optical instrument, Compton diffusion at different
angles will cause no observable difference in phase.
The Compton effect is thus completely
suppressed, for quantum reasons, in the case of visible light.
Relativistic Elastic Collisions
A simple relation between transfer of energy and change in momentum.
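$$ dE = \mathbf{v} \cdot d\mathbf{p} $$
(v is the velocity of the center of mass). Here are the details:
If two particles (1 and 2) collide but retain their respective identities,
we may define the energy and the momentum lost by one and gained by the other:
$$ dE = E'_1 - E_1 = -(E'_2 - E_2) = c\,y $$
$$ d\mathbf{p} = \mathbf{p}'_1 - \mathbf{p}_1 = -(\mathbf{p}'_2 - \mathbf{p}_2) = x\,\mathbf{u} \qquad [\ \text{with } \|\mathbf{u}\| = 1\ ] $$
The scalars x and y are introduced for convenience, so is the unit vector u,
which is uniquely defined, unless no shock takes place
(dp = 0).
We also introduce the projection of each momentum on u:
$$ p_i = \mathbf{u} \cdot \mathbf{p}_i = (E_i / c^2)\ \mathbf{u} \cdot \mathbf{v}_i $$
A collision is said to be elastic
when each rest mass is conserved, namely:
$$ {E'_1}^2 / c^2 - \|\mathbf{p}'_1\|^2 = E_1^2 / c^2 - \|\mathbf{p}_1\|^2 $$
$$ {E'_2}^2 / c^2 - \|\mathbf{p}'_2\|^2 = E_2^2 / c^2 - \|\mathbf{p}_2\|^2 $$
In the form
$(\mathbf{p}'_i - \mathbf{p}_i) \cdot (\mathbf{p}'_i + \mathbf{p}_i) = (E'_i - E_i)\,(E'_i + E_i) / c^2$
these two read:
$$ x\,(2 p_1 + x) = y\,(2 E_1 / c + y) $$
$$ x\,(2 p_2 - x) = y\,(2 E_2 / c - y) $$
Adding both equations, we obtain
$x\,(p_1 + p_2) = y\,(E_1 + E_2) / c$.
Using this relation, we may multiply either of the previous equations by
$(E_1 + E_2) / cx$
(as x is nonzero) and obtain the second equation of the following system:
$$ x\,(p_1 + p_2) - y\,(E_1 + E_2) / c = 0 $$
$$ x\,(E_1 + E_2) / c - y\,(p_1 + p_2) = 2\,(E_1 p_2 - E_2 p_1) / c $$
By itself, the first equation already says that
dE = v.dp
(as advertised) where the
vector v is the velocity of the center of mass, namely:
$$ \mathbf{v} = (\mathbf{p}_1 + \mathbf{p}_2)\,c^2 / (E_1 + E_2) $$
By solving the whole system for x and y, we obtain:
$$ x = \frac{2\,(E_1 p_2 - E_2 p_1)\,(E_1 + E_2)}{(E_1 + E_2)^2 - c^2\,(p_1 + p_2)^2} \qquad y = \frac{2c\,(E_1 p_2 - E_2 p_1)\,(p_1 + p_2)}{(E_1 + E_2)^2 - c^2\,(p_1 + p_2)^2} $$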
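The linear system in x and y can be solved and checked on a concrete case. The sketch below rests on assumptions made for the example: units where c = 1, made-up masses and momenta, and a solution written as x = 2(E₁p₂ − E₂p₁)(E₁+E₂)/D and y = 2c(E₁p₂ − E₂p₁)(p₁+p₂)/D with D = (E₁+E₂)² − c²(p₁+p₂)², where each pᵢ is the projection of a momentum on u. It then verifies that both rest masses are conserved and that dE = v.dp.

```python
import math

C = 1.0  # assumption: units where c = 1, purely for readability

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def energy(mass, p):
    """Total energy of a particle of rest mass `mass` and 3D momentum p."""
    return math.sqrt((mass * C**2) ** 2 + C**2 * dot(p, p))

def elastic_transfer(E1, p1, E2, p2, u):
    """Return (x, y) so that particle 1 gains momentum x*u and energy c*y."""
    s1, s2 = dot(u, p1), dot(u, p2)            # projections u.p1 and u.p2
    denom = (E1 + E2) ** 2 - (C * (s1 + s2)) ** 2
    common = 2.0 * (E1 * s2 - E2 * s1) / denom
    return common * (E1 + E2), common * C * (s1 + s2)

# Hypothetical example: masses 1 and 3, arbitrary momenta, arbitrary unit vector u.
m1, p1 = 1.0, (2.0, 0.0, 0.0)
m2, p2 = 3.0, (-1.0, 0.5, 0.0)
u = (0.6, 0.8, 0.0)
E1, E2 = energy(m1, p1), energy(m2, p2)
x, y = elastic_transfer(E1, p1, E2, p2, u)

# Outgoing quantities: particle 1 gains (c*y, x*u) and particle 2 loses the same.
q1 = tuple(a + x * b for a, b in zip(p1, u))
q2 = tuple(a - x * b for a, b in zip(p2, u))
F1, F2 = E1 + C * y, E2 - C * y

# Velocity of the center of mass and the advertised relation dE = v.dp
v = tuple(C**2 * (a + b) / (E1 + E2) for a, b in zip(p1, p2))
dp = tuple(x * b for b in u)
print(C * y, dot(v, dp))
```

Note that the direction u of the momentum transfer is an input here: conservation laws alone do not determine it, they only fix x and y once u is chosen.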
A process whose leading Feynman diagram is at right:
Classically, light does not interact with itself because
Maxwell's equations are linear:
Light beams normally pass right through each other undisturbed.
However, there are high-energy quantum processes
whose net result is similar to an elastic collision between two
photons (cf. above Feynman diagram) and we can
determine the relations between the incoming and outgoing photons.
If the photons are traveling in exactly the same
direction, they will never collide (just as cars which travel at
the same speed in the same direction of the same road do not collide).
Otherwise, their combined energy-momentum is a timelike 4-vector
which is the energy-momentum
of what can be called the center of mass
of those two photons. The mass of that point is the Minkowski-length
of the momentum-energy divided by c.
Its 3D-velocity is the 3D-momentum multiplied by c 2 and divided by the total energy.
Let's use this center of mass as
the origin of a new frame of reference.
In that frame of reference, the two incoming photons simply
have the same frequency and opposite directions
(so that their combined 3D-momentum is zero). So do the
two outgoing photons. Furthermore, the incoming and outgoing frequencies
are identical (because energy is conserved).
Usually however, the incoming and outgoing directions are different.
In particular, there is an angle
$\theta$ between them, which departs from a zero
or flat angle whenever the collision breaks the axial symmetry
of the incoming photons.
The azimuthal orientation of the outgoing direction in which the incoming symmetry
is broken will also be important to whoever might have to use the relevant
Lorentz transform to translate the above
simplicity into actual laboratory measurements.
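For the head-on collision of two photons
of frequencies $\nu_0$ and
$\nu_1$, we may write the conservation
of energy-momentum (divided by h/c) as follows, each outgoing photon (primed) being deflected by an angle $\theta_i$ from its incoming direction:
$$ \nu_0 + \nu_1 = \nu_0' + \nu_1' $$
$$ \nu_0 - \nu_1 = \nu_0' \cos\theta_0 - \nu_1' \cos\theta_1 $$
$$ 0 = \nu_0' \sin\theta_0 - \nu_1' \sin\theta_1 $$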
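For the head-on case, the center-of-mass frame described above can be made explicit. In the sketch below (function names and sample frequencies are assumptions made for the illustration), the frame velocity is the total momentum times c² divided by the total energy, and a longitudinal Doppler factor is used to confirm that both photons have the same frequency in that frame.

```python
import math

def center_of_mass_head_on(nu0, nu1):
    """Photon 0 travels along +x with frequency nu0, photon 1 along -x with nu1.

    Returns (beta, nu_cm): the velocity of the center of mass in units of c
    and the frequency common to both photons in that frame.
    """
    beta = (nu0 - nu1) / (nu0 + nu1)    # total momentum times c, over total energy
    nu_cm = math.sqrt(nu0 * nu1)        # invariant mass times c^2 / (2h)
    return beta, nu_cm

def doppler(nu, beta, direction):
    """Frequency seen from a frame moving at +beta along x; direction is +1 or -1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * nu * (1.0 - direction * beta)

# Hypothetical frequencies in arbitrary units.
beta, nu_cm = center_of_mass_head_on(9.0, 4.0)
seen0 = doppler(9.0, beta, +1)
seen1 = doppler(4.0, beta, -1)
print(beta, nu_cm, seen0, seen1)
```

An elastic collision then only rotates the common direction of the two photons in that frame; transforming back to the laboratory gives the primed frequencies.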
(2005-05-05) [ Blue ] Cherenkov Radiation
The Cherenkov effect occurs when a charged particle
moves through a medium faster than the celerity of light in that medium.
In a dispersive medium like a liquid or a gas, the celerity of light depends
on frequency. Normally, the celerity of visible light is below
Einstein's constant (c) while the celerity of
X-rays is above it.
( It's the group speed which may never exceed c.
Phase celerity is another matter entirely.)
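The condition above translates into a minimum kinetic energy for the charged particle. Here is a minimal sketch, assuming a refractive index n = 1.33 for visible light in water and a rounded electron rest energy; both numbers and the function name are illustrative assumptions, not values from the text.

```python
import math

ELECTRON_MC2_MEV = 0.510999   # electron rest energy in MeV (assumed rounded value)

def cherenkov_threshold_kinetic(rest_energy, n):
    """Kinetic energy above which a charged particle outruns light of phase celerity c/n."""
    if n <= 1.0:
        raise ValueError("no Cherenkov threshold unless n > 1")
    beta_min = 1.0 / n                                  # particle speed must exceed c/n
    gamma_min = 1.0 / math.sqrt(1.0 - beta_min**2)
    return (gamma_min - 1.0) * rest_energy              # W = (gamma - 1) mc^2

threshold_water = cherenkov_threshold_kinetic(ELECTRON_MC2_MEV, 1.33)
print(f"electron threshold in water: {threshold_water:.3f} MeV")
```

Because the threshold is a condition on speed alone, a heavier particle needs a proportionally larger kinetic energy to reach it.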
Radioactive sources make nearby transparent bodies emit a bluish glow which
Marie Curie first
noticed in 1910, with radium salts in distilled water.
Part of this glow consists of ordinary luminescence from impurities...
However, this is what the whole
thing was mistaken for until the French radiologist
Lucien I. Mallet (1885-1981)
studied the phenomenon in detail in 1926-1929 and found it to have a
continuous spectrum, unlike fluorescence.
This was further investigated between 1934 and 1937 by Pavel A. Cherenkov (1904-1990)
who established that the radiation came primarily from fast electrons
dislodged by Compton collisions with energetic gamma-rays.
The final mathematical explanation was worked out by two of his colleagues from the
Lebedev Physical Institute of Moscow,
Il'ja M. Frank and
Igor Y. Tamm, following the 1936 discovery of the particular geometry
of the Cherenkov beam.
For this, those three men became the first Russians ever to be awarded the
Nobel Prize in Physics, in 1958.
Cherenkov radiation is entirely different from so-called bremsstrahlung,
the electromagnetic radiation emitted when a charged particle is accelerated
(e.g., as it collides with atoms).
A heavy particle causes less bremsstrahlung than
a lighter one of the same speed, but the Cherenkov emission is the same.
The Cherenkov effect is to light what the
sonic boom is to sound. Kind of.
The effect is also called "Cerenkov-Mallet", especially in French texts.
"Cerenkov" and "Cherenkov" are equally acceptable transliterations.
How far would you travel in a lifetime at a constant acceleration g ?
A constant acceleration cannot be maintained indefinitely relative to a fixed
female observer (Alice) or else the speed of the
male traveler (Bob) would eventually exceed the speed of
light, which is absurd.
Thus, the "constant acceleration" we're talking about is the acceleration
he feels, not the acceleration she sees.
Although the rest frame of Bob is not inertial at all,
we may consider, at some specific instant,
the so-called tangent frame (S')
which is an inertial frame that moves uniformly with respect to the frame of Alice (S)
at the same velocity as Bob.
When Bob uses the inertial frame (S') to describe his own motion, he finds the second
derivative of his position (x') with respect to time (t') to be constant (g).
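To give an idea of the numbers involved, here is a sketch based on the standard result for motion at constant felt acceleration g (hyperbolic motion), which is not derived in the passage above and is brought in here as an assumption: after a proper time τ measured by Bob, Alice sees him at distance (c²/g)(cosh(gτ/c) − 1), at her own time (c/g) sinh(gτ/c), moving at speed c tanh(gτ/c). The names and the choice of standard gravity for g are illustrative.

```python
import math

C = 299_792_458.0            # speed of light in m/s
G = 9.80665                  # standard gravity in m/s^2, taken as the felt acceleration
YEAR = 365.25 * 86400.0      # Julian year in seconds
LIGHT_YEAR = C * YEAR

def hyperbolic_motion(tau, g=G):
    """Distance (m), Alice's elapsed time (s) and speed (m/s) after Bob's proper time tau (s)."""
    phi = g * tau / C                                        # rapidity
    distance = (C * C / g) * 2.0 * math.sinh(phi / 2) ** 2   # equals (c^2/g)(cosh(phi) - 1)
    alice_time = (C / g) * math.sinh(phi)
    speed = C * math.tanh(phi)
    return distance, alice_time, speed

for years in (1, 5, 10):
    d, t, v = hyperbolic_motion(years * YEAR)
    print(f"{years:2d} y on board: {d / LIGHT_YEAR:.3g} light-years, "
          f"{t / YEAR:.3g} y for Alice, v/c = {v / C:.6f}")
```

The identity cosh φ − 1 = 2 sinh²(φ/2) is used only to keep the distance accurate for short trips, where the Newtonian value ½ g τ² should be recovered.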
Dilation and Space Flight
The relativistic paradox of the twins of Langevin
A puzzling effect confirmed by the Hafele-Keating experiment (1971).
In October 1971,
Joseph C. Hafele
(Department of Physics, Washington University, St. Louis, Missouri)
and Richard E. Keating
(Time Service Division, U.S. Naval Observatory, Washington, DC)
conducted an experiment with four different cesium atomic clocks flown
around the Earth in two opposite directions (Eastward and Westward).
The entire experiment lasted 636 hours, including a 65.4 h
trip eastward and an 80.4 h trip westward.
Hafele-Keating experiment (1971)