home | index | units | counting | geometry | algebra | trigonometry & functions | calculus
analysis | sets & logic | number theory | recreational | misc | nomenclature & history | physics

Final Answers
© 2000-2018   Gérard P. Michon, Ph.D.

Logic  &  Set Theory

 Blaise Pascal 
 1623-1662 Reason's last step is the recognition that there are  
an infinite number of things which are beyond it. 

Blaise Pascal (1623-1662)   Pensées   

[...]   several outstanding logicians of the twentieth
century found shelter in asylums at some time.

Gian-Carlo Rota  (1932-1999)

 border  border

Related articles on this site:

Related Links (Outside this Site)

The Continuum Hypothesis  by  Nancy McGough
A History of Set Theory  by J.J. O'Connor & E.F. Robertson (1996).
Paradoxes mathématiques (in French):  Taupe du Lycée Victor Hugo (Caen).
Mathematics and Logic  [ 1 | 2 ]  by  Peter Cameron  (January 2010).
IST = Internal Set Theory (1977)  [Ch. 1]  by  Edward Nelson (1932-2014).

DMOZ: Set Theory

Video :   Georg Cantor (1845-1918).  Dangerous Knowledge: 1  |  2  |  3
Kurt Gödel (1906-1978).  Dangerous Knowledge: 7  |  8  |  9  |  10


From basic logic to axiomatic Set Theory

Curiously, set theory arose in the context of sets of real numbers related to the convergence of  Fourier series...  In 1829, Dirichlet had shown that a function always had a Fourier series converging to itself, under certain  sufficient conditions  (he considered periodic functions with finitely many extrema in every period and equal to the half-sum of their lower and upper limits at every point).
In 1870, Georg Cantor (1845-1918)  (who had obtained his doctorate in 1867, under Karl Weierstrass, 1815-1897)  proved a basic uniqueness theorem:  The only trigonometric series which converges to zero everywhere is the zero series.  In 1871, Cantor introduced the concept of a U-set  (or set of uniqueness)  with the characteristic property that the zero series is the only trigonometric series converging to zero  outside  of any given U-set.  He proved that any finite set is a U-set.  In 1872, Cantor introduced the concept of  limit point  and he properly defined real numbers as equivalence classes of rational Cauchy sequences.  Then, he introduced the  derived  set E' of a set E as the set of its limit points.  He established that a set whose derived set is a U-set is also a U-set.  Cantor went on to prove that every set with countably many limit points is a U-set.
Ernst Zermelo (1871-1953)  considered that analytical result to mark the actual birth of modern set theory, as introduced on this page.

Video :   Set Theory: An Offspring of Analysis (1990)  by  Walter B. Rudin (1921-2010).

(2014-01-21)   Aristotelian logic
For over two millenia, the framework of logic was rigid and frozen.

Aristotle (384-322 BC)   Aristotle of Stagira showed how to deduce categorical propositions from each other.  In a typical  syllogism,  a conclusion is derived from two  premises.  For example :

  • All men are mortal.
  • Socrates is a man.
  • Therefore, Socrates is mortal.

The pattern seems foolproof, and it is.  Yet, difficulties may arise:

  • Rare things are not cheap.
  • A cheap horse is a rare thing.
  • Therefore, a cheap horse is not cheap.

As discussed below, the silly conclusion is not necessarily invalid  (as it merely implies that there's no such thing as a cheap horse, which isn't so objectionable in this context).  Nevertheless, clarifications are needed...

Such threats to logic from circularities in the premises bothered  Gottlob Frege  (1848-1925)  as he seeked to put modern logic on solid foundations.

As he thought he had done so, Frege was proofreading his work, just before going to press, when he received a letter from his younger colleague  Bertrand Russell  (1872-1970)  who pointed out to Frege what's now called  Russell's paradox.  To his credit, Frege saw immediately that his work could not be fixed and he did what he could to apologize to his readership.  Russell's objection was a fundamental one.

The consequences for the very foundations of mathematics are still with us.  Read on...

Syllogism   |   Gottlob Frege (1848-1925)   |   Bertrand Russell (1872-1970)

(2014-02-11)   Formal Logic
The basis for reasoning  (violated in the realm of quantum physics).

A system of axioms is consistent if, and only if,
at least one statement is not provable in it.

Although natural language can be used for reasoning, it's notoriously unreliable in its rigor, or lack thereof  (pay close attention to any debate involving a religious apologist to illustrate this point).  As with the rest of mathematics, a formal language is needed for logical statements.  That language should be unforgiving enough to make mistakes detectable.

  • Definitions are equivalences involving newly named concepts.
  • A theorem is a statement established by a proof.
  • An axiom is a statement assumed without a proof.

 Come back later, we're
 still working on this one...

The formal counterparts of Aristotelian categories are called  sets.  A set can be envisioned as the collection of all things that belong to it  (something that belongs to a set is said to be a  member  of that set).  Two sets that have the same members cannot be distinguished and are thus considered  identical  (that's the  principle of extensionality).

 Come back later, we're
 still working on this one...

Surely, you want to consider sets themselves as legitimate objects of study and allow sets whose members are sets.  However, the freedom to do so cannot be completely unrestricted, as  Bertrand Russel  first discovered.  There's a need for a  theory of sets  which spells out what is required to avoid subtle or not-so-subtle contradictions in our discourse.

(Ashlee of Braithwaite, LA. 2000-05-03 twice)
There is a barber who lives in a small town.  The barber shaves all those men and only those men who do not shave themselves.
Does the barber shave himself?

Stay away from the many "cute" answers that do not really address the question: The barber can't be a woman (otherwise the term "himself" used in the question would be improper).  It would also be cheating to consider that the barber is a boy (and therefore not a "man") or any other kind of non-human male creature for that matter...  The dilemma remains (it's not a paradadox, as we shall see):  If the barber doesn't shave himself then he shaves himself.  If he shaves himself, then he doesn't shave himself.

The answer to this classic "problem" is simple:

There cannot possibly be any such barber!

The fact that we are wrongly  told  at the outset that such a barber exists just does not make it appear out of thin air in spite of the logical contradictions his very existence would imply.  This is no more paradoxical than a silly question like:

There's a number whose square is 4 and whose cube is 24.  What is it?

Both this and the barber's "problem" are just plain nonsense.  If the existence of something leads to a contradiction, the thing is thereby shown not to exist and any statements about it are to be considered either nonsensical or  vacuously true :  Any such barber does shave himself.  Any such barber does not shave himself.  Any such barber has purple hair.  The name of any such barber is Isaac Newton.  All of this is true.  Vacuously  so.

Russell's Paradox:

More seriously, it is worth pointing out that a fundamental logical dilemma had to be resolved the same way by the pioneers of set theory:  In what's known as  Russell's paradox,  we are asked to consider the "set" of all sets that are not members of themselves.  Is it a member of itself or not?  Neither answer is acceptable...  Therefore, such a thing can't possibly be called a set!

This paradox was instrumental in clarifying the (Zermelo-Fraenkel) axioms of modern set theory:  A "set" cannot be just anything we like.  In particular, the axioms imply that no set can be a member of itself.  Therefore, there is no "set of all sets"  (it would be a member of itself).  The collection of sets that are not members of themselves thus includes all sets and it is not a set itself.  That's the way out of  Russell's paradox :  "such a set" simply does not exist, exactly like "such a barber" cannot possibly exist in the  barber's dilemma.

Meaningless Statements in Natural Languages:

A related issue about "natural" languages is that there may be inherently meaningless statements or descriptions in a language that's allowed to refer to itself.  For example, "the integers that can be specified in twenty English words or less" cannot possibly describe any unambiguous set of integers. 
      Otherwise, there would be such a thing as "the smallest integer that cannot be specified in less than twenty English words", which would thus be specified in only 13 English words...

This goes to show that natural or formal languages may include self-contradictory sentences, not only when they refer to themselves  (e.g. "This sentence is false.")  but also when they include other references to the language being used.  Bertrand Russell  coined the term  Berry paradox  for sentences of this latter type,  because of a letter he received from  G.G. Berry  (in 1906)  introducing the earliest such paradox by discussing, among transfinite ordinals:

The first ordinal that cannot be named in a finite number of words.

Even if you  don't know  what an ordinal is, you can tell that the above phrase must be meaningless.  Its meaning, if it had any, would be an ordinal.  However, by assuming that such a meaning exists, we see that  another  ordinal must be described instead.  Thus, the above can't possibly describe  any  specific ordinal.  There's actually no paradox here; just a lack of meaning.

Russell's paradox   |   Paradoxes of set theory
Crisis in the Foundations of Mathematics (12:39)  Kelsey Houston-Edwards (PBS Infinite Series, 2017-10-19).

(2018-10-28)   A Kindergarten Question:
How can a man who shaves almost every day have a long beard?


(Melissa of Denver, CO. 2000-11-06)
What is infinity and what does it mean?
(Nagaraj of Anamoose, ND. 2000-11-18)
What is the definition of infinity?  Please explain to me in details.

Albert Einstein once said that we can't solve problems by using the same kind of thinking we used when we created them.  Large numbers begot the concept, but they are of no help in the  definition  of infinity.  Let's try common sense first:

A finite set has n elements, for some nonnegative integer n.

Of course, any set which isn't finite in this sense may be called  infinite.

Although this common sense approach is perfectly sound, it would seem that the basic distinction between finite and infinite sets ought not to depend explicitely on the properties of something as complicated as the integers.

A more fundamental approach was first suggested by Dedekind, who turned a perceived  paradox  about infinite sets into a natural  definition  of infinity:

An infinite set is equipollent to one of its proper subsets.

subset  is a set contained in the set being considered  (i.e., all elements of one are elements of the other).  A  proper  subset is a subset not equal to the entire set.  The  empty set  is a proper subset of any set but itself.

Two sets are said to be  equipollent  when there's a one-to-one correspondence between the elements of one and the elements of the other.  At the most fundamental level  (which may not be the most intuitive one)  the  existence  of infinite sets is built into the axioms of modern set theory. 

Consider the set  N  of  natural integers  {0,1,2,3,4,5,6,...}.  One of its proper subsets is the set of all  counting numbers  N* = {1,2,3,4,5,6,7,...}.  Yet, there's a trivial one-to-one correspondence betweeen those two  (the function  y = x+1).  There's also a one-to-one correspondence between N and the even numbers, between N and the squares, etc.  N is thus an example of an  infinite  set.

The same thing cannot  be done with any finite set:  {1,2,3,4,5,6,7}  cannot be put in a one-to-one correspondence with any set of fewer than 7 elements.

Nowadays, a set equipollent to a proper subset is dubbed  Dedekind-infinite.

Tarski's definition of infinity :

The problem with Dedekind's definition of infinity is that a proof of its equivalence with the previous definition  (involving integers)  requires  the axiom of choice  (or, at least, the weaker  Axiom of countable choice )  which is normally reserved for advanced nonconstructive proofs.  It's undesirable in a context where  natural integers  themselves are considered too elaborate.

There are no such problems with the following definition, proposed in the doctoral dissertation  (1924)  of  Alfred Tarski  (1902-1983).  It's based on the notion of a  minimal element  in a family of sets  (a  minimal element  is a set  not  containing another set from that family):

Tarski's Definition of Finite Sets  (1924)
A set  E  is finite if, and only if, every nonempty family
of subsets  (call it F)  has a minimal element.  Formally:
"F Í 2E ,   ( F ¹ Æ )   Þ   (  $xÎF"yÎF ,   y Ë x  )

That clever definition is simply perfect from a theoretical standpoint  (although it certainly doesn't qualify as an intuitive one).

Infinity as a Limit of Large Numbers

If you allow mentally for the existence of such infinite sets (as you should) you realize that there's no such thing as the "largest" integer.  The process of building larger integers is never ending.  That's what infinity is all about.

This is like the discovery game children play when trying to name the "largest" number.  Sooner of later, one of them realizes that the game cannot possibly be won by either player:  Regardless of what number is named, it's always possible to say something like "twice that" (or "this plus one") and not concede.

Grasping Infinity :  Zeno's Arrow

Mathematicians routinely study things whose infinite versions turn out to be much simpler than the finite ones.  One example is the following sum (properly called a "series"):

1/2 + 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + 1/128 + 1/256 + ...

This sum is equal to  1-2-n  when carried out only to its n-th term.  It's simply equal to 1 if  all  of the infinitely many terms are added up.

When the ancient Greeks were still wrestling with the concept of infinity, the above sum was underlying one of  Zeno's paradoxes :  Before an arrow reaches its target, it must first travel half of the distance to it (1/2), then half of what's left (1/4), half of what's left after that (1/8), and so forth.  Although there are infinitely many such "steps", the arrow does reach its target.  (Try!)

   Giuseppe Peano
Giuseppe Peano

(2012-06-15)   The Peano axioms  (1889)
Defining the  set  of the natural integers.

The five  Peano Axioms  are:

  1. Zero is a natural integer.
  2. Every natural integer has a  successor  in the natural integers.
  3. Zero is not the successor of any natural number.
  4. Two different natural integers have different successors.
  5. A set which contains zero and the successor of every element in it, contains all natural integers.

The last axiom is a  second-order statement.  It's often called the  induction axiom,  as the validity of the method of proof by induction is based on it.

Peano axioms  (Wikipedia)   |   Giuseppe Peano (1858-1932)

(2005-07-09)   Classes of Ordered Numbers
From integers to rational, real and surreal numbers.

The most rudimentary numbers are the counting numbers  (1, 2, 3, 4 ...)  but it's better to let the story begin with the natural numbers  (0, 1, 2, 3 ...)  although zero is a sophisticated concept of relatively recent origin.

Unfortunately, the tradition in US education [1, 2, etc.] differs from the international conventions described here, following Bourbaki and other authors.  In US high-schools, "natural numbers" are often understood to exclude zero, which makes the term synonymous with "counting numbers".  The dubious convention within the US educational system that makes "whole numbers" include zero but exclude negative integers is best forgotten by anybody who wants to be understood outside of the US.  You may talk about "positive integers" or "nonnegative integers" instead.

Negative integers originated  after  the fractions and  constructible numbers which result from the application of classical rules (compass and straightedge) to Euclidean geometry.  Algebraic numbers were also considered before negative numbers were accepted  (along with complex numbers, during the Renaissance).

Real numbers were first formally defined  (as cuts over the rationals)  by Richard Dedekind  (1858)  to close all possible "gaps" in the number line.  This is to say that whenever the real numbers are divided into two sets L and R such that no element of L exceeds any element of R, there's one element at their common boundary.  (This need not be the case with rational numbers.  For example, there is no rational number at the boundary between the sets L and R of rational numbers whose  squares  are respectively less than 2 and more than 2.)

Archimedes attributes to Eudoxus what's now called the  Archimedean  property of ordinary numbers:  There's always an integral multiple of any given positive quantity that exceeds any other prescribed quantity.  The surreal numbers, mentioned above, are  not  Archimedean, as they include infinitesimal positive quantities whose product by any integer can't exceed a number like 1  (likewise, no integer exceeds an infinite surreal quantity). 

Totally-Ordered Number Classes  (each one contains those below it)
Number ClassStructureCountable CompleteArchimedean
Surreal NumbersFieldNoYesNo
Real NumbersNoYes
Computable NumbersYesNo
Algebraic Numbers
Constructible Numbers
Rational Numbers
Natural IntegersMonoid
Counting NumbersSemigroup

ciderspider (Mark Barnes, UK. 2000-10-11)   The Number Line
How do you prove that there are more real numbers than rational ones?

Any infinite set which may be put in a one-to-one correspondence with the integers is said to be  countable  (or to have cardinality Ào).  Infinite sets that are not countable are said to be  uncountable.  As shown below, real numbers are not countable, while rational numbers are.

There is some disagreement about the meaning of the adjective "countable" when applied to sets which are not known to be  infinite.  As the term is clearly a mathematical one, it should have the inclusive meaning normally implied by mathematical discourse.  In other words, finite sets are  countable  too.
However, some authors simply won't apply the term "countable"  (or its synonyms:  "denumerable" and "enumerable")  to finite sets...  To avoid misunterstandings, you should either clarify things at the outset  (in Numericana, I consider finite sets to be countable)  or use unambiguous locutions, like  "countably infinite"  or  "countable or finite"  (although the latter could be construed as pleonastic).

The uncountability of real numbers :

The  diagonal argument  of Georg Cantor (1845-1918) is the remark that, with  any  infinite "list" of real numbers between 0 and 1 (such a "list" is in one-to-one correspondence with the integers), an  unlisted  number may be constructed by choosing for its N-th decimal a digit that's  different  from the N-th decimal of the N-th number listed.

To avoid the problem presented by numbers having two decimal representations  (ending with infinitely many zeroes or infinitely many nines)  never choose a 0 or a 9.  This still leaves 7 valid choices.

The number so constructed is  not  in the list, since it has only  one  decimal expansion and differs from any listed number in at least one decimal digit.

Thus,  any  list of reals between 0 and 1 fails to contain all the reals between 0 and 1.  Therefore, the real numbers between 0 and 1 are  uncountable.   QED (Halmos symbol)

This is one of the simplest and most beautiful proofs of all times !  Its elegance is often  spoiled  by presenting it as a proof by contradiction,  starting with something like:  "Suppose there was such a list..."

It's related to the superb proof of Cantor's theorem which is even simpler and purer, albeit far more abstract.

The countability of rational numbers :

 All the points of a grid can
be visited sequentially

Now, to prove that there are more reals than rationals, it suffices to show that rationals  may  be put in a list.  There are neater ways to do so, but we'll describe only a simple-minded scheme here:  At coordinates (p,q) of a whole planar grid, put the value p/q [use zero as placeholder, if q is zero].  Starting at the center, run through the grid along a square spiral like the one pictured at right.  Put in a growing list each ratio you encounter for the first time as you travel this way through all possible fractions...  Any given rational will eventually appear in that list, which is another way to say that all of them belong to the list.  QED (Halmos symbol)

(2000-10-11)   Cantor's Ternary Set
Some vanishing sets of reals have as many points as the whole line.

The probability is zero that a randomly chosen real number is rational.  This statement  (which could be made precise in the proper context of Lebesgue measure theory)  holds for the set of rational numbers or any other  countable set.

However, not all sets of numbers with zero probability  (or, rather, zero  measure  in the sense of Lebesgue)  are countable.  It's easy to describe a set of numbers which has zero probability but yet contains just as many elements as the whole set of real numbers  (which isn't countable).

 Successive approximations to the Cantor 
 set, whose measure turns out to be zero.

One such example is the  ternary Cantor set  obtained by removing from a segment (of numbers from 0 to 1, say) its "middle third" and similarly removing the middle third of any smaller segment that shows up in this (infinite) process.

Cantor's ternary set, introduced by Cantor in 1883, was first considered by Henry Smith in 1875.  It's thus the earliest mathematical example of a  fractal  endowed with self-similarity, since it actually consists of two copies of itself one third as big.  Its  fractional dimension  d  (in the sense of Hausdorff)  is less than 1:

(1/3)d   =   1/2             d  =  log 2 / log 3  =  0.630929753571457437...

Equivalently, you may consider the set of all the numbers (between zero and one) whose decimal expansions contain  only  zeroes and nines.  Its probability is zero, in spite of the fact that this set contains as many elements as the  whole  number line  (as sequences of zeroes and nines are in one-to-one correspondence with sequences of zeroes and ones, which are at least as numerous as the uncountable set of numbers between 0 and 1).  The fractal dimension of that decimal version of the Cantor set is a familiar number  (smaller than the above dimension of the ternary set):

d   =   log 2 / log 10   =   0.3010299956639811952...

A self-similar  countable set of reals must have dimension  0.  Example:

For   { 1, 1/2, 1/4, 1/8, ... 1/2n ... }   we have   (1/2)d = 1   so,  d = 0.

The Devil's Staircase (13:33)  by  Kelsey Houston-Edwards (PBS Infinite Series, 2017-05-19).

(2015-02-14)   The formal language of  Set Theory :
Symbols and idioms which can be used in the rest of mathematics.

The rest of this page may involved advanced topics of  Set Theory  (the biggest hurdle being the fundamental axioms)  which can be skipped by most readers.  However, the language introduced in this section will be of interest to everybody.  It allows many mathematical statements to be expressed quickly and accurately and its mastery will definitely help structure the thought process of aspiring mathematicians at all levels of expertise.  Some of it used to be taught systematically in primary and secondary education  (to avoid highbrow fundamental queezes, what was taught was actually mostly the "algebra of the parts of a given set E").

  • Membership :   x Î A .   x belongs to set  A  (it's an element of A).
  • Inclusion :   A Í B .   Every element of A belongs to B.
  • Equality :   A = B .   A and B are included in each other.
  • Union :   A È B   is the set whose elements belong to A and/or B.
  • Intersection :   A Ç B  (A inter B).  All elements in both A and B.
  • Complement :   A .  All elements  (of  E )  not in  A.
  • Powerset :   2A   or   P(A) .  The set of all subsets of  A.

The original symbol for inclusion was  Ì   (so that  A Ì A  may serve to express that  A  is a subset of itself).  To express that  A  is a  proper  subset of  B,  the unambiguous notation is  A ÌÌ B.

We introduced unions and intersections for two operands, but those are defined for several operands, possibly  infinitely many  of them...

Closely related to the above are the standard symbols of modern  logic :

  • Universal quantifier :   " x   Whatever follows is true for any  x.
  • Existential quantifier :   $ x   Whatever follows is true for some  x.
  • Defining quantifier :   $! x   Whatever follows is true for just one  x.
  • Implication :   Þ   If the LHS is true, so is the  right-hand side.
  • Equivalence :   Û   Either both sides are true or both sides are false.
  • Disjunction :   Ú   At least one side is true.
  • Conjunction :   Ù   Both sides are true  ("&" is also used for that).
  • Negation :   Ø   Whatever follows is false.

Needless to say that we may also use comma separators, parentheses or various brackets whenever the precedence of the above operators needs to be clarified...  Curly brackets are usually reserved to denote sets, in particular with the very important  set-builder notation.

The next order of business for a well-rounded basic vocabulary would be to consider cartesian products, relations and functions, as introduced below.

(2002-06-04)   On the fundamental axioms of  Set Theory :
Fraenkel's  replacement schema  (1922)  was added to Zermelo's list.

As Russell's paradox once showed, we cannot simply call a set anything we like.  The "membership" of a set has to be somewhat restricted; a set simply cannot be a member of itself (or else we'd run into trouble real fast).

On the other hand, infinite sets must be part of the theory, if it is to be of any use whatsoever.  So, it is postulated at the outset that some sets exists which may be put in one-to-one correspondence with a proper part of themselves.

Finally, the last axiom listed below (the so-called Axiom of Choice) postulates that sets need not be explicitely constructible, but may instead be postulated to exist as abstract collections of virtual choices.  This is the most controversial axiom of the bunch.  It has nevertheless been shown that, if Set Theory is consistent without the Axiom of Choice, then it is also consistent with it.  Various alternate theories have been formulated where the axiom of choice is replaced by some weaker statement (which would be a consequence of the full-fledged  Axiom of Choice).

A proof using the  Axiom of Choice  (or, possibly, one of its weaker counterparts) is usually called nonconstructive, and is rejected by constructivists.  However, such a proof may serve to demonstrate [even to constructivists] that the negation of a nonconstructive theorem cannot possibly be shown to be true within the framework of the rest of Set Theory (since the Axiom of Choice must be consistent with it)...

Axioms of Set Theory

Existence of the Empty Set :  The empty set  Æ  has no elements.
( A set with at least one element is  called  nonempty.)

( $ Æ ) ( "x ) ( x Ï Æ )

Extensionality :  A set is characterized by its elements.
( In particular, there's only one empty set.)

( "x ) ( x Î A  Û  x Î B )   Þ   A = B

Subset Schema :  Elements of  A  with property  b  form a subset  B.  ( This is to justify the  set builder  notation:   B  =  { x Î A  |  b(x) }.  )

( $ B ) ( "x ) ( x Î B  Û  ( x Î A  &  b(x) )  )

Pairing :  Two elements form a set.

( $ A ) ( "z ) ( z Î A  Û  (z = x) Ú (z = y)  )

Union of a set of sets :   C  =   È   B

( $ C ) ( "x ) ( x Î C  Û  ( $ B Î A ) ( x Î B )  )

Power Set :  The power set  C = 2 A  is the set of all subsets of  A.

( $ C ) ( " B ) (  B Î C  Û  B Í A  )

Regularity :  Set membership is neither reflexive  ( A Ï A )  nor circular.  Every nonempty set has an element with which it has no common element.

( " A ¹ Æ ) ( $ x Î A ) ( " y Î x )   y Ï A         [Zermelo, 1930]

Infinite Sets :  One example is  { Æ, {Æ}, {{Æ}}, {{{Æ}}} ... }

( $ A ¹ Æ ) ( " x Î A ) ( {x} Î A )

Replacement Schema :  Any collection equipollent to a set is a set.

To obtain cardinals as large as  À ,  Fraenkel and/or Skolem added this axiom in 1922.  In 1923, von Neumann would show that the assumption is also indispensable to prove that every well-ordered set is equipollent to an ordinal.  The axiom of replacement had first been proposed by  Dmitri Mirimanov  from Switzerland, in 1917.

Axiom of Choice :  A set may be formed which contains a single element from every nonempty set of a given family of pairwise disjoint sets.

( " A Í 2 E )  ( "MÎA  "NÎA   MÇN=Æ )   Þ
( $ C Í E ) ( " B Î A-{Æ} ) ( $ x Î E )   B Ç C = {x}

More brutally stated:

( "MÎA"NÎA,  MÇN=Æ )   Þ   ( $C, "B Î A-{Æ}, $x,   B Ç C = {x} )

Several equivalent formulations of the  axiom of choice  are used.  Among those, the above  (in either flavor)  is known as the  Multiplicative Axiom.  It's the original formulation of the  Axiom of Choice  introduced in 1904 by Zermelo to establish that every set can be well-ordered.

In 1963, Paul Cohen would show that the  Axiom of Choice  is essential to the proof of that "desirable" statement  (i.e., replacing the  Axiom of Choice  by any weaker statement would describe a theory of sets in which some sets  cannot  be well-ordered).  Incidentally, Zermelo showed that a set is well-ordered if and only if is equipollent to its cartesian square.  (This allows a very short nontrivial formulation of the Axiom of Choice.)

In spite of its annoying consequences  (see below)  some formulations of the  Axiom of Choice  make it look trivial and unavoidable  (e.g., "the cartesian product of any collection of nonempty sets is nonempty").

Even if one finds the nonconstructive proofs based on the  Axiom of Choice  particularly repugnant, it does impose nice constraints on the definitions of key concepts.  For example, in general topology, the  Axiom of Choice  is found to be equivalent to  Tychonoff's theorem  (the product of compact topological spaces is always compact,  when compactness and product topology are properly defined).

In that sense, paradoxically, the  Axiom of Choice  has a beneficial influence even in the "constructivist" axiomatic framework  (called ZF or Zermelo-Fraenkel)  where it is explicitely rejected.  The full set of axioms stated above is usually referred to as ZFC  (Zermelo-Fraenkel with Choice).

Note that ZF or ZFC assert only the existence of mathematical objects that can be defined as sets.  Every mathematical object so defined, with the sole exception of the empty set, has some kind of sructure, in the sense that some  simpler  mathematical objects belong to it.  This remains true even when convenient names are assigned to basic objects whose set-theoretical definition is rarely used, if ever, starting with the natural integers:

0 = {}     1 = {0} = {{}}     2 = {0,1} = {{},{{}}}
3 = {0,1,2} = {{},{{}},{{},{{}}}     etc.

As part of ZF or ZFC, the additional assertion is often understood that all mathematical objects ultimately reduce to the empty set in that way.  That's  all there is  to the mathematical universe we may talk about.  This is what establishes the validity of the standard method of proof by structural induction  (also called  mathematical induction ).

Nevertheless, it's also possible to postulate the existence of  atoms  (such  fundamental elements  or urelements  are objects, different from the empty set, to which no other object "belongs").  In the simplest such model, called ZFA  (Zermelo-Fraenkel with Atoms)  all atoms are alike but distinct  (e.g.,  there's only one set of two nonempty elements that are both atoms).

Set Theory in The Mathematical Atlas by Dave Rusin (NIU).
Axiomatic Set Theory in Around Goedel's Theorem by Karlis Podnieks.
Earliest Uses of Symbols of Set Theory and Logic by Jeff Miller.
Wikipedia :   Zermelo Set Theory (1908)   |   Zermelo-Fraenkel Set Theory with replacement schema (1922)

(2013-02-15)   The Axiom of Restriction
Ruling out infinite descending membership chains.

In 1917,  Dmitri Mirimanov  discovered that certain models compatible with the axioms of ZF set theory allow the existence of infinite descending membership chains:  xn+1Îxn

John von Neumann  found this repugnant and proposed his  Axiom of Restriction  to rule that out.

Philosophically, the axiom of restriction disallows sets beyond those created in finitely many applications of the other axioms of ZF.

Interestingly, if the axiom of dependent choice is postulated instead of the full axiom of choice, then the axiom of restriction implies regularity  (so regularity no longer needs to be postulated as an axiom).

Axiom of regularity = Axiom of foundation   |   Set Theory  (Britannica article)  by  Robert R. Stoll

(2012-09-22)   Equivalents of  AC  and weaker alternatives.
Statements that may or may not require the  full power  of AC.

Statements Equivalent to the Axiom of Choice :

Here are some of the theorems which can be proved using the axiom of choice and which, conversely, imply the full axiom of choice.

  • Axiom of choice (Zermelo, 1904):   The cartesian product of any family of nonempty sets is nonempty.
  • Zermelo's theorem (1904):   Any set can be well-ordered  (it can be endowed with a  total ordering  where every subset has a least element).
  • A set is equipollent to it  cartesian square.
  • Zorn's lemma (1935):   For any partial ordering of E, if any chain in E has an upperbound in E, then E has at least one maximal element.
  • Tukey's lemma :   Any family of sets of  finite character  has at least one maximal element  (with respect to  set inclusion).
    By definition, a family of sets has "finite character" when a set belongs to the family if and only if all of its finite subsets do.
  • Tychonoff's theorem :
    The cartesian product of any family of compact spaces is compact.
  • Vector bases :  Any  vector space  has a basis.

Weaker alternatives to the axiom of choice :

Here are a few interesting statements that are undecidable within ZF and proven within ZFC, although their proofs does not require the full power of the unrestricted  axiom of choice  (the corresponding theorem may be true even if the  axiom of choice  isn't).  Statements equivalent to some of them have been proposed as  weaker  alternatives to the full axiom of choice to complement the rest of the ZF axioms.

The Hahn-Banach theorem and the Krein-Milman theorem  taken together  are equivalent to the full axiom of choice.  Likewise, the Boolean prime ideal theorem and the Krein-Milman theorem  taken together  are equivalent to the full axiom of choice.

Well-ordering_theorem (Zermelo, 1904)   |   Ernst Zermelo (1871-1953)
Zorn's lemma (1922, 1935)   |   Kazimierz Kuratowski (1886-1980)   |   Max Zorn (1906-1993, Ph.D 1930)
Tukey's lemma (1922, 1935)   |   John Tukey (1915-2000)
König's infinity lemma (1936)   |   Dénes König  (1884-1944)
The axiom of choice and equivalent statements  by  Takis Konstantopoulos  (2010-07-21)
Cardinality of cartesian square  (StackExchange, Aug. 2012).

(2010-05-06)   Are there sets of real numbers that are not measurable? The  Axiom of Choice  guarantees it !   (Giuseppe Vitali, 1905)

The Lebesgue theory of integration is based on the notion of  measurable sets  of real numbers:  The  Lebesgue measure  is a nonnegative  real  function  m  defined on measurable sets, having the following properties:

  • It's translation-invariant:   m({x}+A)  =  m(A)
  • The measure of an  interval  is equal to its  length :   m ( [x,y[ )  =  y-x
  • It's  countably additive :  The measure of a countable union of pairwise disjoint sets is the (countable) sum of their measures.

Surprisingly enough, the  Axiom of Choice  then implies that there are sets of real numbers that are  not  measurable.  Here's the proof:

Consider the equivalence relation among real numbers in the interval  [0,1[  whereby two numbers are dubbed  equivalent  when their difference is a rational number  (i.e., the quotient of two integers).  By the  Axiom of Choice,  we may form a set  W  (called a  Vitali set )  containing one and only one number from every equivalence class.

If  W  was measurable, then every set  Wq  =  { (q+x) mod 1  |  x Î W }   would be measurable of measure  M,  for any rational number q.

Well,  M  couldn't be positive because the union of finitely many  Wq  (they're all pairwise disjoint)  would then have a measure exceeding unity.  M  couldn't be zero either, because that would make the measure of  [0,1[  vanish  (as the union of countably many disjoint sets of measure zero).

Therefore,  W  cannot be Lebesgue-measurable.  QED

Note that the above relies on the Archimedean property of real numbers.  If we were investigating surreal measures,  that would merely establish that some sets of reals must have  infinitesimal  nonzero measures...

ow the Axiom of Choice Gives Sizeless Sets (13:19)  Kelsey Houston-Edwards (PBS Infinite Series, 2017-09-14).

(2010-05-06)   Should we drop the  (full)  Axiom of Choice ?

Robert M. Solovay (b. 1938)  has argued in favor of dropping the  Axiom of Choice  from the standard axioms of set theory to ensure that  every  set of real numbers is Lebesgue-measurable.  This could make every linear map on a Banach space continuous  (a so-called  dream space ).

 H.G. Garnir
H.G. Garnir (1921-1985)
One possiblity, advocated by the late Belgian mathematician  Henri Garnir  (1921-1985)  is to replace the  Axiom of Choice  (AC)  by the weaker  Axiom of Dependent Choice (DC)  and to postulate that every set of reals has the  Baire property  (which is incompatible with the full axiom of choice).
In that system, every set of reals  is  Lebesgue-measurable.

Giuseppe Vitali  (1875-1932)

(2014-12-04)   Cartesian Products
Binary or multiple  cartesian products  and infinite ones.

The  cartesian product  of two sets  A  and  B  is the set,  denoted  A ´ B ,  of all ordered pairs  (a,b)  whose first element  (a)  is in  A  and whose second element  (b)  is in  B.

Multiple cartesian products :

This definition is readily extended to the cartesian product of a finite sequence of  n  sets,  which consists of all the  n-tuples  formed with an element from every set, in the stated order.  The cross symbol  (´)  used to denote cartesian products doesn't have a fixed  arity.  The lack of parentheses is meaningful  (it's not like an associative  binary  operator, where parentheses can be omitted or not).  For example:

  • (1,2,3)  is an ordered triple belonging to   N ´ N ´ N  =  N 3
  • (1,(2,3))  is an ordered pair belonging to   N ´ ( N ´ N )   =   N ´ N 2
From the highbrow perspective of category theory where new constructions are defined up to isomorphism, that distinction isn't made and the corresponding  categorial product  is indeed a binary associative operator  (at least when all relevant products are well-defined, as would be the case in a cartesian-closed category).

Indexed and/or infinite cartesian products :

Far more generally, we may define the cartesian product of a family of sets indexed by a set  I  as a function which sends any element  i  of  I  to  some  element of  Ei .  When  I  is finite, this is equivalent to our previous description.  When at least one of the sets involved is empty, the cartesian product is empty.  However, the converse is only true if we postulate the  Axiom of Choice  which is  equivalent  to the following statement:

The cartesian product of nonempty sets is not empty.

Formal definitions of cartesian products :

The intuitive notion of  ordered pairs  ought to be defined first.  This allows  binary  cartesian products to be defined as above.  Building on that, relations and functions can be stated to be subsets of binary cartesian products  (as is done in the next section).  Only then, can the above definitions for multiple or infinite cartesian products be given in terms of "functions".

Classically,  an  ordered pair  are awkwardly defined in terms of the more basic concepts of set theory:  The pair  (x,y)  is defined as the set  {x,{x,y}}.

(2014-12-04)   Binary Relations
A binary relation between two set is a subset of their  cartesian product.

Technically, a  [binary]  relation  between two sets is just a subset of their  cartesian product.  When  (a,b)  is in that subset, we say that  "a  is related to  b"  (it's not the same thing as  "b  is related to  a"  since we are talking about  ordered  pairs here).

For a relation  R,  the  infix  notation  a R b  means  (a,b) Î R.

The relation  R  from  A  to  B  is said to be  entire  when any element of  A  is related to some element of  B.

Composition of Relations :

If  R  is a relation from set  A  to  B  and  S  is a relation from  B  to  C,  then the  composition  T = S o R  is the relation from  A  to  C  defined by:

a T c   Û   { $ b Î Ba R bb S c }

Readers who have been exposed to this notion for  functions  (which are just a special type of relations, as introduced below)  may remark that the above reduces to the familiar composition of functions, when applicable.

If the relation  R  is a function,   a R b   means the same as   b  =  R ( a )

The peculiar order of the operands in the above definition is meant to accomodate functional definitions.  Indeed, if R and S are functions, then the above definition translates into:

c   =   S o R ( a )   =   S ( R ( a ))

The composition of general binary relations  (not necessarily functions)  is especially useful to define  uniform spaces, whose  completeness  can be determined even when distances are not defined.

Binary Relations within a Set :

A relation  R  from a set  A  to itself could verify the following properties.

  • Reflexivity :   " x Î A ,  x R x   (thus, reflexive relations are entire).
  • Symmetry :   x R y   Þ   y R x
  • Antisymmetry :   { x R y  &  y R x }   Þ   x = y
  • Transitivity :   { x R y  &  y R z }   Þ   x R z

Those concepts are used to define several important types of relations:

  • preorder relation  (or quasiorder)  is reflexive and transitive.
  • An  equivalence relation  is reflexive, symmetric and transitive.
  • An  ordering relation  is reflexive, antisymmetric and transitive.

Equivalence Classes,  Quotient :

When an equivalence relation  R  is defined on the set  A, that set can be  partitioned  into disjoint  equivalence classes  such that two elements are related  iff  they belong to the same class.  Any element of  A  belongs to one and only one such class.  The empty set  isn't  an equivalence class.

The set of all those equivalence classes is denoted  A/R  and is called the  quotient  of  A  by  R  (also dubbed  A  modulo  R).

For example the  set of signed integers  Z can be construed as the cartesian square  N 2  modulo  the following relation R :

(a,b) R (c,d)   Û   a + d  =  b + c

The motivation is to make  subtraction  a well-defined operation verifying:

(a , 0)  -  (c , 0)   =   (a , c)

This construction is one key reason why it is best to include zero in the set of  natural integers  N  (zero is excluded from  N* )  in spite of the dubious tradition prevalent in US high-schools.

It is then a simple  (albeit tedious)  exercise to show that the set  Z  so constructed is given a  ring  structure using the following operations which respect the equivalence  R  (i.e., the equivalence class of a result depends only on the equivalence classes of the operands).

(a,b) + (c,d)   =   (a + c  ,  b + d)
(a,b)  (c,d)   =   (ac + bd  ,  ad + bc)

Well-founded relations :

A relation is well-founded when every subset has a "minimal element" (to which no other element of that subset is related).

(2010-09-23)   Functions and applications are special  relations.
Some functions from  A  to  B  are  injectivesurjective  or both.

A function  f  from set A to set B is something that associates one and only one element  f (x)  of B with every element  x  of A.  Formally, functions are special  relations  between the elements of  A  and  B,  in the above sense.

Bourbaki calls the above an  application,  Other authors may call it a  total function,  A partial function  f  from A to B is merely an application from a  subset  of  A  (called the  definition domain  of  f )  to set B.  In numerical analysis, functions are normally understood to be partial ones.  Elsewhere, it's prudent to lift all ambiguities with the systematic use of qualifiers  (either "total" or "partial").

Thus, an application  f  verifies the following characteristic properties:

f   Í   A ´ B
( " x Î A )   ( $! y Î B )   ( (x,y) Î f )

By convention, the above is expressed with the following notations:

f :   A  Ho heel¾®  B
x   Heel¾®  y = f (x)

A function  f  from set  A  to set  B  is said to be  surjective  when every element of  B  is the image of some element of  A.  (The idiom "onto" is synonymous with  surjective.)

A function is said to be  injective  when no element of  B  is the image of more than one element of  A.  (The idiom "one-to-one" is synonymous with  injective.)

Both terms were introduced by  Nicolas Bourbaki  (the collective pen name of a famous ongoing collaboration of industrious mathematicians, mostly French, founded on January 14, 1935).

A function that's both surjective and injective is said to be bijective, or "one-to-one onto".  It's also said to be a "one-to-one correspondence".

Functions that are surjective, injective, or bijective are respectively called surjections, injections, or bijections.

Direct and reciprocal images of a set :

For a given function  f  from  A  to  B,  the following definitions and notations are commonly used:

  • Direct image  (or simply  image)  of a subset  A'  of  A :
       f ( A' )   =   {  y Î B  |  $ x Î A' ,  y = f (x)  }
  • Inverse image  (also called  preimage)  of a subset  B'  of  B :
    f  -1 ( B' )   =   {  x Î A  |  $ y Î B' ,  y = f (x)  }

It's useful to know that the [direct] image of an intersection is  contained  in the intersection of the [direct] images, but is not necessarily equal to it  (unless the function is injective).  In particular, the images of two disjoint sets aren't necessarily disjoint  (unless the function is injective).

On the other hand, we do have equality for unions of direct images and for both unions and intersections of inverse images.  All told:

f ( X Ç Y )    Ì     f ( X )   Ç   f ( Y )
f ( X È Y )    =     f ( X )   È   f ( Y )
f  -1 ( X Ç Y )   =   f  -1 ( X )  Ç  f  -1 ( Y )
f  -1 ( X È Y )   =   f  -1 ( X )  È  f  -1 ( Y )

The proofs are left as an enlightening exercise for the reader.

One application of those relations pertains to the characterization of the  continuity  of a function in terms of the conservation of certain topological properties of sets for  either  their direct or their inverse images...

(2005-07-26)   Cantor's Theorem:  Any set is  smaller  than its powerset.
No function from  A  to  2A  can possibly be surjective  (1891).

A beautiful proof :

Consider  any  function  f  from set  A  to its powerset  2A  (by definition,  2A  is the set of all the subsets of  A).  Let  M  be the subset of  A  consisting of all the elements  x  of  A  such that  x Ï f (x).

Well,  M  is clearly an element of  2A  which is not the image of  any  element  x  of  A.  (HINT:  Such an  x  could be neither in  M  nor outside of it.)  Thus, our  arbitrary  function  f  is not surjective.  Therefore, there's no surjection from  A  to its powerset.  QED

This much seems obvious for finite sets, but the above proof shows that it holds for all infinite sets as well.  The reader is encouraged to check the above wording for conformity with the  Axioms of Set Theory  presented in the above article, assuming familiarity with the concept of a  function.

The above result is known as  Cantor's theorem  and it provides yet another argument  (formerly known as the paradox of Burali-Forti)  against the existence of a  set of all sets  because the powerset of such a thing could not possibly be of greater cardinality than itself.  This argument was put forth by Cesare Burali-Forti (1861-1931)  in 1896 and predates the formulation of Russell's paradox (1901).

 Aleph 0 (2003-11-10)   Ào  and the transfinite cardinals
On the different sizes of infinite sets.

Two different approaches make sense of actual infinite numbers which were both investigated by Georg Cantor (1845-1918).  The  cardinal numbers are discussed in this section;  ordinal numbers will be considered later.

Equipollence, Cardinality and Cardinal Numbers

Two sets which can be put in one-to-one correspondence are said to be  equipollent.  This is a fundamental notion which does not require an ability to count the elements in either set; the ability to pair them up suffices.

Loosely speaking, equipollence is an  equivalence relation  and we can give a name to each equivalence class.  Such a name is called a  cardinality.  Two sets are equipollent if and only if they have the same  cardinality.

WARNING :   There's no such thing as the  "set of all cardinalities".  This was known to Cantor even before Russel pointed out that there is no such thing as the "set of all sets".

Of course,  finite  sets, have the same cardinality if and only if they have the same number of elements.  If you have as many apples as you have oranges, you may  count  either the apples or the oranges and you'll come up with the same number  (0 if you have no fruit).  One may thus consider natural integers (0,1,2,3,4 ...) to be indicative of the  cardinality  of either set (apples or oranges).  In that capacity, the natural integers are called cardinal numbers.  The fancy vocabulary may be intimidating, but the whole thing is really the most basic kind of numbers, only described in a philosophical way which  can  be extended to infinite sets...

Can infinite sets be "counted" too?  Well, the finite integers are clearly not up to the task.  But can't we just add one more "number" (using tentatively the "¥" symbol for it) and say that "¥" is the number of elements in any infinite set?

Cantor showed that this won't do, because there are pairs of infinite sets which cannot be put into one-to-one correspondence with each other.  To be consistent with the very concept of cardinal numbers presented above, there has to be more than one "transfinite" cardinal number.  The symbol Ào (pronounced "aleph zero", "aleph null", or "aleph nought") was introduced by Cantor to denote the smallest of these;  it's the number of all the [finite] integers.  An infinite set with this many elements is said to be countable.  An  uncountable  set is an infinite set which is not countable.

Cantor's diagonal argument shows that the  continuum  of the real numbers is uncountable.  Its cardinality  (called the  power of the continuum)  is the same as that of the powerset of the integers  (the set of all sets of integers)  as can be established directly by observing that Pierce expansions yield a one-to-one correspondence between sets of positive integers and the reals between 0 and 1.

By Cantor's theorem, the powerset of  any  set has a greater cardinality than that of the set itself.  Therefore, there must be infinitely many different infinite cardinalities and  infinitely many  transfinite cardinals, including, at least, the following sequence of so-called  beth numbers :

  Á0  =  À0   ;     Á1  =  C  =  2 Á0   ;     Á2  =  2 Á1     ...   Án+1  =  2 Án    ...

Georg Cantor  used the notation  C = 2Ào  for the  power of the continuum.  and called  Àn+1  the smallest transfinite cardinal beyond Àn   (that definition is valid only if the  axiom of choice  is retained).

With that notation, in 1877, he conjectured the  continuum hypothesis  (CH) which states that there's nothing between Ào  and  C :

À1   =   C

The  generalized continuum hypothesis  (GCH)  is the deep generalization:

Àn   =   Án     for any ordinal n

Cantor could not prove or disprove the  [restricted]  continuum hypothesis, Neither could anyone else...  because this can't be done, as discussed next.

(2009-04-11)   The Continuum Hypothesis   (CH)
Is any  uncountable  set of real numbers  equipollent  to the whole line?

Cantor thought that any uncountable subset of a continuum was equipollent to the whole continuum.  He formulated this statement, known as the  continuum hypothesis,  in 1877 and it bedeviled him till the day he died.

The reason why Cantor failed to prove or disprove the Continuum Hypothesis is that  it cannot be proved or disproved.  It's  undecidable...  The axioms of set theory are equally consistent if we rule in or rule out the existence of transfinite cardinals strictly between  À and  C.

This was established in 1963 by  Paul Cohen (1934-2007) who was awarded the Fields medal for that work, in 1966.

Some preconditioned versions of  CH  are true :

In 1884, Cantor observed that an uncountable  closed  set of real numbers must have a subset equipollent to the Cantor set and, therefore, must be equipollent to the entire real line.  So, the continuum hypothesis holds for closed sets.

More generally, Felix Hausdorff (1868-1942)  proved, in 1916, that any uncountable Borel set of real numbers is equipollent to the real line.

 omega (2003-11-10)   w and Cantor's transfinite ordinals  (1897)
Counting to infinity and beyond.

The second kind of transfinite numbers introduced by Cantor are called transfinite ordinals.  They are more complicated than  cardinals.

Following von Neumann (1923) we may start with the following remark:  A natural integer may be represented by the set of all nonnegative integers before it, starting with the empty set ({} or Æ) for 0 (zero) because there are no nonnegative integers before 0.  Then 1 corresponds to {0}, 2 is {0,1}, 3 is {0,1,2}, etc. For the whole set of nonnegative integers {0,1,2,3...} Cantor introduced the symbol w.  He did not stop there, since {0,1,2,3...w} corresponds to another transfinite ordinal, which is best "called" w+1.  {0,1,2,3...w, w+1} is w+2, etc.

However, you end up doing two different types of counting if you enumerate a finite set first and an infinite set second or the other way around...  The addition of ordinals is associative but  not  commutative:

1 + w   =   w   ¹   w + 1

Incidentally, the maximum number of different results which can be obtained by changing the order of the terms in a Cantor-sum of  n  ordinals may be tabulated as follows.  (The last pair of lines gives the general pattern, which has exceptions only below n = 15.)

Number of different ways to add n ordinals   (A005348)
012 34
5k + 155k + 165k + 175k + 185k + 19
33 ´ 81 k+2 81 k+3 193 ´ 81 k+2 193 2 ´ 81 k+1 193 3 ´ 81 k

The addition of infinite numbers such as  w  may also be defined in other ways which retain commutativity.  The best such approach was proposed by John H. Conway, who defined  w  essentially as above, but in the context of so-called surreal numbers, a very satisfying generalization of the familar concept of "numbers"  (including integers, rationals and real numbers ).

Infinite Numbers among Surreal Numbers :

Expressions like  w-1, 2w, w2, Öw, ww  www, etc.  have consistent meanings in the  superb  construction of  John Horton Conway  presented next.

Large countable ordinals  [ 2016-06-29 | 2016-07-04 | 2016-07-07 ]  by  John Baez.

(2003-11-10)   Surreal Numbers   (John H. Conway, 1972)
What's the most general type of  ordered  numbers?

The most general type of number  line  (as opposed to unordered things like complex numbers or p-adic numbers)  turns out to be simpler to define than the familiar real numbers.  With the right approach, it's actually easier to include infinitesimals and transfinite ordinals than to rule them out...

Conway's approach was inspired by the aforementioned view of Cantor's ordinals (Von Neumann, 1923) and also by the "Dedekind cuts" introduced in 1858 by Richard Dedekind (1831-1916, last doctoral student of Gauss).  Dedekind defined real numbers as gaps "between" splits of the rational line  (this was introduced before Georg Cantor found it more convenient to identify real numbers with equivalence classes of Cauchy sequences).  A Dedekind cut consists of two disjoint sets L and R whose union is the entire set of rationals.  L does not have a greatest element and contains all rationals to the left of any member of L,  whereas R  may  have a lowest element and contains any element to the right of any member of R.  The mnemonic notations L and R  ("left" and "right")  are due to J.E. Littlewood (1885-1977).

The term surreal numbers was coined by D.E. Knuth in 1973 to describe the beautiful construct of John H. Conway, who gave simultaneously recursive definitions of the numbers and the ordering among them...

Equality is a defined relation:  Two surreal numbers x and y are equal if and only if the two relations  x ≤ y  and  y ≤ x  are both true.  In other words, surreal numbers are really  equivalence classes  among  some  of the representations introduced below.

Conway introduces the compact notation   x = { L | R }   for what's called a game; an ordered pair of two sets L and R, given either by name or by listing their elements, all of which are [simpler] games...

Part of Conway's notation is standard:  If  L  is a set, then L+y is universally understood as the set of all elements obtained by adding y to an element of L, just like E+F is the set whose elements are sums of an element of E and an element of F  (that convention is due to Minkowski).  If desired, the rest of Conway's notations could easily be translated into the standard notation of set theory, by stating that a number may stand for a single-element set within the "first level" of a "{...|...}" notation and that a comma ( , ) is synonymous with the "union" operator ( È ) in that same context.

Technical Presentation of Surreal Numbers :

For a given game  x,  typical elements of  L or R are respectively denoted  xL  or  xR  and are called left or right options of x.  Numbers are defined as  special  games whose options are other numbers, and which lay "between" their own left and right options, in the following sense, recursively defined:

  • If  L = { xL }  and R = { xR }  are two sets of known surreal numbers, such that  xR ≤ xL  is never true, then one representation of  another  surreal number  x  is obtained as:  x  =  { L | R }  =  { xL | xR }
  • x  ≤  y  means that we never have  y ≤ x  or  y ≤ xL

Those two mutual recursions allow the introduction of a  fundamental  equivalence relation:

  • x = y   is  defined  to mean that   x  ≤  y   and   y  ≤  x.

The true statement  x  ≤  x   (implying  x  =  x)  requires a nontrivial proof!  So does the transitivity of the  "≤"  relation.  From that follows the fact that the defined notion of equality is indeed an equivalence relation  (technically, surreal numbers are thus equivalence classes of their representations modulo that relation).  Finally, we should observe that the construction ensures the antisymmetry of  "≤"  which makes it a proper  ordering relation.

The following definitions are merely introduced for convenience:

  • x ≥ y   means that  y ≤ x.
  • x < y   means that  x ≤ y  is true and  y ≤ x  is false.
  • x > y   means that  y < x.

Here are the  single-line  definitions of the basic arithmetic operations:

  • Sum:   x + y  =  { x L + y , x + y L  |  x R + y , x + y R }
  • Opposite of a number:   -x  =  { -x R | -x L }
  • Product:   xy  =
{ xL y + xyL - xL yL  ,  xR y + xyR - xR yR   |   xL y + xyR - xL yR  ,  xR y + xyL - xR yL }

Well, nobody said this was easy to work with, but it sure is beautiful...  Conway said that it took him "several weeks of hard work" to find the above  genetic  definition of multiplication and another year to define division!

Constructing numbers using simpler numbers :

A representation of a surreal number is  canonical  when it's formed only with previously constructed  simpler  numbers.  Let's now construct surreal numbers one generation at a time, using such canonical representations.

The  simplest  number is the game which has no left option and no right option  (that game is indeed a number because no left option is to the right of a right option).  For a reason which will soon be clear, we call it  0  (zero).

0   =   { | }

We can show that this particular number must, indeed, be the  neutral element  for addition using  structural induction  on surreal numbers which have a canonical representation  (we haven't yet proved that all of them do).  Indeed, if we apply the above definition of addition, we obtain:

x + 0     =   { x L + 0 , x + 0 L  |  x R + 0 , x + 0 R }
   =   { x L + 0  |  x R + 0 }
   =   { x L  |  x R }  =  x

Expressions with  0R  and  0L  are discarded because there's no such thing as an element of the empty set.  The remaining expressions can be simplified using the  induction hypothesis,  which tells us that  0  is neutral for addition with the simpler numbers available in either set of options of  x.

0 ≤ 0  holds  (since we can't have  0R ≤ 0  or  0 ≤ 0L therefore  0 = 0.  That goes to show that the game  { 0 | 0 }  is not a number.  The next generation of surreal number includes only two numbers:

-1   =   { | 0 }         and         1   =   { 0 | }

Indeed, those two are clearly opposite of each other and we could prove, by structural induction, that the latter is the neutral element of the multiplication defined above.  (At first, Conway didn't have the luxury to do so!)

The  3  surreal numbers identified so far can form  8  sets  which allow the construction of  4  new numbers, with multiple representations:

The seven simplest surreal numbers   (the first three generations) :
NumberCanonical Representations  (simplest one first)Day
2{ 1 | }   { 0, 1 | }   { -1, 1 | }   { -1, 0, 1 | }2
1{ 0 | }1
½{ 0 | 1 }   { -1 , 0 | 1 }2
0{ | }0
- ½{ -1 | 0 }   { -1 | 0, 1 }2
-1{ | 0 }1
-2{ | -1 }   { | -1, 0 }   { | -1, 1 }   { | -1, 0, 1 }2

Besides the above are noncanonical representations which involve, for a given number, at least one number of the same generation or greater.  E.g.,

0   =   {-1 | }   =   { | 1 }   =   {-1 | 1}
1   =   {-1, 0 | }       and       -1   =   { | 0, 1 }

2 new numbers are born on Day  n,  including two extremities  (identified with  -n  and  n)  and other new surreal numbers halfway between the  2n-1  numbers of the previous generations.  In finitely many days we obtain surreal numbers corresponding to all fractions whose denominators are powers of two, but only those.

Structural Recursion :

The great power of Conway's notation is that it makes it easy to specify recursively new options for a number  (or any other game)  in terms of "old" options  (i.e., previously born ones).  For example, the following equation defines  x = w ={0,1,2,3,4...| }  which is the  simplest  infinite surreal:

x   =   { 0, 1+xL | }

More precisely, that equation means to say that  0  is a left option of  x  and that whenever  xL  is a left option of  x,  so is  1+xL.

w  is the only infinite number born on Day w.  However, that Birthday is shared by  all  irrational numbers and all rational numbers not born on a finite Day.  That includes  1/3.

As part of a  Grothendieck universesurreal numbers  form a  set.  This set is certainly pretty large, but it's not an unwieldy  proper class.

"On Numbers and Games"  (a.k.a. ONAG)  by  John H. Conway   (1976, 2001)
Surreal Numbers (ISBN 0-201-03812-9)  by  D.E. Knuth  (1974, Addison Wesley)
Wikipedia :   Surreal Numbers
From Games to Surreal Numbers (1:15:44)  by  John Conway  (2016-02-23).
Surreal Numbers:  The Book (14:05)  by  Donald Knuth  (Numberphile, 2016-06-27).

(2005-07-09)   Multidimensional  (or Hypercomplex)  Numbers
Real and complex numbers.  Quaternions, octonions, sedenions, etc.

division algebra  over a field  (here, we use only the field of reals)  is an  algebra  in which every nonzero element has a multiplicative inverse  (this doesn't rule out divisors of zero because no associativity is postulated).

The  Cayley-Dickson construction  doubles the dimension of a previously established set of numbers, by considering  ordered pairs  of such numbers and defining their sums, products and conjugates in the following way.  (The conjugate of z is denoted z* and the conjugate of a  real  number is itself.)  The product of a number and its conjugate is always a nonnegative real number, equal to the square of that number's  norm, by definition.

( a , b )  +  ( c , d )       =     ( a + c  ,  b + d )
( a , b )   ( c , d )   = ( a c  -  d b*  ,  a*d  +  c b )
( a , b )* = ( a* , -b )

A trivial induction, based on this last relation, would prove that  z** = z.

Likewise, the identity   ( x y )*  =  y* x*   is clearly true in the realm of real numbers and is established for all hypercomplex numbers by induction:

[ ( a , b ) ( c , d ) ]*     =     ( a c  -  d b*  ,  a*d  +  c b )*
= ( c* a*  -  b d*  ,  - a*d  -  c b )
( c* , -d ) ( a* , -b ) = ( c* a*  -  b d*  ,  -  c b  - a*d )

Hypercomplex Numbers:  Normed Division Algebras
(multidimensional numbers with real coordinates)
Complex Plane2Yes
Straight Line1

Arguably, the above process should stop with the octonions, as the next step yields a strange realm  (the sedenions)  where two nonzero elements may have a zero product.  It seems abusive to call  numbers  things that include such  divisors of zero,  except for recreation  (e.g., decadic integers).

Among sedenions, the divisors of zero of unit norm are homeomorphic to the exceptional Lie group G2.

By contrast, the noncommutativity of quaternions is benign.  The resulting algebraic structure is called a skew field, and it repays study.

In July 1844, Hamilton realized that octonions  failed  to have the crucial property for which he coined the word  associativity, now universally used:

"x   "y   "z       ( x y ) z   =   x ( y z )

nonassociative  division algebra  may  contain divisors of zero.  In an associative division algebra, a product of nonzero factors is never zero.

In 1886, Frobenius proved that the reals, the complex numbers and the quaternions are the only three  associative  division algebras over the reals.

Multiplication of  octonions  isn't associative, it's only  alternative :

"x   "y     (xx)y = x(xy)     and     y(xx) = (yx)x

Alternativity is less demanding than associativity.  Less demanding still is the  power-associativity  discussed  below.  The  Cayley-Dickson construction  transforms a power-associative algebra into another power-associative algebra, so all the algebras discussed here are power-associative.  Likewise, they're all flexible.

Cayley-Dickson construction in The Octonions  (PDF, 515 kB)  by  John C. Baez   (May 2001)
"On Quaternions and Octonions" (2004)   by John H. Conway & Derek A. Smith.  [Review by John Baez]
Curious Quaternions   &   Ubiquitous octonions by Helen Joyce & John Baez  (2004).
Cayley-Dickson algebras and loops  by  Craig Culbert   (2007).
The Strangest Numbers in String Theory  by  John Baez & John Huerta  (2011).
Cohl FureyOctonions Could Underlie the Laws of Nature  by  Natalie Wolchover  (Quanta, 2018-07-20).
Hypercomplex Numbers  (Normed Division Algebras)  Timeline by Jeff...
Wikipedia :   Cayley-Dickson  |  Hypercomplex numbers  |  Quaternions  |  Octonions  |  Sedenions

(2013-07-08)   Beyond the  sedenions :
Colorful names or systematic nomenclature.

There has been at least one colorful attempt to name some  Cayley-Dickson algebras  beyond the 16-dimensional sedenionsTony Smith  reports  the  macaronic  tongue-in-cheek nomenclature proposed by  Robert de Marais :

  • Pathions  (32 dimensions).  Symbol:  P.
    The 32 paths of wisdom in the Kabbalistic tree of life.
  • Chingons  (64 dimensions).  Symbol:  X  (since C is for dimension 2).
    The 64 Hexagrams of the  I Ching  (Book of Changes).
  • Routions  (128 dimensions).  Symbol:  U  (that comes before  V).
    Route 128 of the  Massachusetts Miracle.
  • Voudons  (256 dimensions).  Symbol:  V.
    The 256 combinations from the Ifá corpus of Voudou  (Tony Smith).

In the unlikely case something less silly is ever needed, I am hereby offering a systematic nomenclature based on the "plex" root, for two reasons:

  • Prior usage of "plex" in numerical prefixes to denote exponents.
  • Evocation of the relevant words "complex" and "hypercomplex".

With  plexions  denoting real numbers,  elements of the  2n-dimensional real algebra could be called  n-plexions  where the number  n  can be expressed with standard numerical prefixes:

The reals are noplexions, the complex numbers are monoplexions, the quaternions are diplexions, the octonions are triplexions, the sedenions are tetraplexions.  This goes on with pentaplexions (32 dimensions),  hexaplexions (64 dimensions),  heptaplexions (128 dimensions),  octaplexions (256 dimensions),  enneaplexions (512 dimensions),  decaplexions (1024 dimensions),  hendecaplexions (2048 dimensions),  etc.

(2012-12-20)   Power-associativity is a very weak form of associativity.
Any "multiplication"  needs  that for exponentiation to be well-defined.  For an "addition",  that property allows a consistent definition of divisors.

A multiplication is said to be  power-associative   when  positive  powers  can be  recursively well-defined  as follows,  for any element  a :

a1 = a         and         " m≥1 ,   " n≥1 ,   an+m   =   an  am

The innocent-looking requirement that the above be  "well-defined"  entails an infinite sequence of independent conditions :

  • a is well-defined only if   a a  =   aa
  • a is then well-defined only if   a a  =   aa  =   aa
  • a is then well-defined only if   a a  =   aa  =   aa  =   aa
  • Etc.

If a power-associative multiplication has a neutral element  (1)  the zeroth power of any element,  invertible or not,  is also well-defined:

" a ,       a0   =   1

Invertible  elements can also be raised to the power of a  negative  integer:

a-n   =   (a-1)n

For commutative nonassociative rings or algebras whose characteristic is either  0  or coprime with  30, A.A. Albert  has shown (1947) that,  if fourth powers are well-defined, then the multiplication is power-associative.

When the characteristic is 3 or 5, the same holds whenever fifth and sixth powers are well-defined  (Louis A. Sokotoris, 1952).

Examples of power-associative operators :

Alternative multiplications are power-associative, so is the multiplication in any set of hypercomplex numbers  (even beyond the  sedenions)  or any  commutative  multiplication having the following property, known as  Jordan's identity  (which is verified, in particular, by the Jordan product used in quantum mechanics).

"x   "y       x ( y ( x x ))   =   (x y)  ( x x )

A.A. Albert  (1947)  defines  flexibility  as the following property:

"x   "y       x ( y x )   =   (x y ) x

On the weakest form of subassociativity :

The following property is called either  power-alternativity  or  third-power associativity  (for short, I call it  cube-associativity  or  cubic associativity):

"x ,     x ( x x )  =  ( x x ) x

This merely means that cubes are well-defined.

The  ugly  multiplication tabulated below for a set of  3  elements, has this property  (since any square equals  A  and everything commutes with  A).  This makes all cubes well-defined:   A3 = A,  B3 = C,  C3 = B.  However, neither  B  nor  C  have well-defined  fourth powers :

   A  B  C 
  B  B3  =  B C  =  B
B2 B2  =  A A  =  A
B3 B   =  C B  =  C
  C  C3  =  C B  =  C
C2 C2  =  A A  =  A
C3 C   =  B C  =  B
Therefore,  this operation isn't  power-associative.

(2005-07-19)   Von Neumann - Bernays - Gödel   set theory  (NBG)
Sets belong to classes.  A proper class isn't a member of anything.

Everything in NBG set theory is a  class  (a concept undefined in ZFC set theory)  but only those classes that are a member of another class are called  sets.  Any collection of sets to be considered collectively as a  class  but such classes cannot be considered collectively unless they are sets.  Quantified variables in the language of NBG can only be sets, not proper classes.

NBG set theory originated with John von Neumann (1925).  It was then modified by Paul Bernays (1937) and simplified by Kurt Gödel (1940).

In the 1990's, it was established that NBG is a  conservative extension  of classical set theory  (ZFC)  in the sense that NBG introduces new language but no new theorems expressible in the language of ZFC.

In view of the fact that they have essentially the same theorems, it's surprising to observe that NBG can be axiomatized with finitely many statements and that ZFC cannot  (the latter requires at least one schema of axioms  which asserts a given pattern rather than a definite sentence about objects within the theory).  There's no need for a "statement about statements" which could become nonsensical if the object statements feature free variables whose names are used in the main statement.  (In the LISP programming language, the so-called FUNARG problem/bug may arise when functions are passed as arguments to another function.)

For example, within NBG, we may speak of the  class  of all sets, which is called the  Absolute  (always spelled with a capital A).  The Absolute is a  proper  class because of  Russel's argument.  That statement isn't a "new" theorem because it can't be expressed at all in the language of ZFC.  It may be more satisfying to say that "the class of all sets is proper" rather than "there's no such thing as the set of all sets" but both statements are equivalent in a deep sense...

Gödel marveled at the fact that something can be known about the  Absolute  by pure reason, namely:

Anything which is true of the Absolute is also true of at least one set.

Von Neumann-Bernays-Gödel set theory (NBG)
John von Neumann (1903-1957)   |   Paul Bernays (1888-1977)   |   Kurt Gödel (1906-1978)

(2014-01-19)   Tarski - Grothendieck   set theory  (TG)
The  nonconservative  extension of  ZFC  used in the  Mizar system.

Around 1973,  Andrzej Trybulec  chose  TG set theory  as the basis for his  Mizar  proof-checker  (now, the most popular system of its kind).

TG set theory is essentially ZFC with the addition of  Tarski's axiom  (originally formulated by Tarski in 1939).  All objects are sets; the notion of  class  isn't part of the language of TG.

Tarski's axiom states  (in modern language)  that every set is a member of at least one  Grothendieck universe,  where a  Grothendieck universe  is, loosely speaking, a set within which all usual mathematical operations can be performed.  More precisely, a set U is a Grothendieck universe if and only if the following four conditions are met:

  • If x belongs to U and y belongs to x, then y belongs to U.
  • If x and y belong to U, then the set {x,y} belongs to U.
  • If x belongs to U, then so does its powerset.
  • A union of elements of U indexed by an element of U is an element of U.

Tarski-Grothendieck set theory.    |   Mizar system
Alfred Tarski (1902-1983)   |   Alexander Grothendieck (1928-)   |   Andrzej Trybulec (1941-2013)

visits since Dec. 6, 2000
 (c) Copyright 2000-2018, Gerard P. Michon, Ph.D.