Video : Georg Cantor (18451918). Dangerous Knowledge:
1

2

3
Kurt Gödel (19061978). Dangerous Knowledge:
7

8

9

10
From basic logic to axiomatic Set Theory
Curiously, set theory arose in the context of sets of real numbers related to
the convergence of Fourier series...
In 1829, Dirichlet had showed that
a function always had a Fourier series
converging to itself, provided certain weak conditions were met (he considered
periodic functions with finitely many extrema in every period and
equal to the halfsum of their lower and upper limits at every point).
In 1870, Georg Cantor (18451918)
(who had obtained his doctorate in 1867, under
Karl Weierstrass, 18151897)
proved a basic uniqueness theorem: The only trigonometric series which converges
to zero everywhere is the zero series.
In 1871, Cantor introduced the concept of a Uset (or set of uniqueness) with
the characteristic property that the zero series is the only trigonometric series
converging to zero outside of any given Uset.
He proved that any finite set is a Uset.
In 1872, Cantor introduced the concept of
limit point and he properly defined real numbers
as equivalence classes of rational Cauchy sequences.
Then, he introduced the derived set E' of a set E as the
set of its limit points. He established that a set whose derived set is a Uset is
also a Uset.
Cantor went on to prove that every set with countably many limit
points is a Uset.
Ernst Zermelo (18711953)
considered that analytical result to mark the actual birth of modern set theory,
as introduced on this page.
(20140121)
Aristotelian logic
For over two millenia, the framework of logic was rigid and frozen.
Aristotle (384322 BC)
showed how to deduce
categorical propositions
from each other. In a typical syllogism,
a conclusion is derived from two premises. For example :
All men are mortal.
Socrates is a man.
Therefore, Socrates is mortal.
The pattern seems foolproof, and it is. Yet, difficulties may arise:
Rare things are not cheap.
A cheap horse is a rare thing.
Therefore, a cheap horse is not cheap.
As discussed below, the silly conclusion is not necessarily
invalid (as it merely implies that there's no such thing as a cheap horse,
which isn't so objectionable in this context).
Nevertheless, clarifications are needed...
Such threats to logic from circularities in the premises bothered
Gottlob Frege (18481925) as he seeked to put modern logic
on solid foundations.
As he thought he had done so, Frege was proofreading his work, just before
going to press, when he received a letter from his younger
colleague Bertrand Russell (18721970)
who pointed out to Frege what's now called Russell's paradox.
To his credit, Frege saw immediately that his work could not be fixed
and he did what he could to apologize to his readership.
Russell's objection was a fundamental one.
The consequences for the very foundations of mathematics are still with us. Read on...
(20140211) Formal Logic
The basis for reasoning
(violated in the realm of quantum physics).
A system of axioms is consistent if, and only if, at least one statement is not provable in it.
Although natural language can be used for reasoning, it's notoriously
unreliable in its rigor, or lack thereof
(pay close attention to any debate involving a religious apologist to illustrate this point).
As with the rest of mathematics, a formal language is needed
for logical statements.
That language should be unforgiving enough to make mistakes detectable.
Definitions are equivalences involving newly named concepts.
The formal counterparts of Aristotelian categories are called sets.
A set can be envisioned as the collection of all things that belong to it
(something that belongs to a set is said to be a member of that set).
Two sets that have the same members cannot be distinguished and are
thus considered identical
(that's the principle of extensionality).
Surely, you want to consider sets themselves as legitimate objects of study
and allow sets whose members are sets.
However, the freedom to do so cannot be completely unrestricted,
as Bertrand Russel first discovered.
There's a need for a theory of sets
which spells out what is required to avoid subtle or notsosubtle contradictions
in our discourse.
(Ashlee of Braithwaite, LA. 20000503twice)
There is a barber who lives in a small town.
The barber shaves all those men and only those men who do not shave themselves.
Does the barber shave himself?
Stay away from the many "cute" answers that do not really address the question:
The barber can't be
a woman (otherwise the term "himself" used in the question would be improper).
It would also be cheating to consider that the barber is a boy (and therefore not a "man")
or any other kind of nonhuman male creature for that matter...
The dilemma remains (it's not a paradadox, as we shall see):
If the barber doesn't shave himself then he shaves himself.
If he shaves himself, then he doesn't shave himself.
The answer to this classic "problem" is simple:
There cannot possibly be any such barber!
The fact that we are wrongly told at the outset that such a barber exist
just does not make it appear out of thin air in spite of the logical contradictions
his very existence would imply.
This is no more paradoxical than a silly question like:
There's a number whose square is 4 and whose cube is 24. What is it?
Both this and the barber's "problem" are just plain nonsense.
If the existence of something leads to a contradiction,
the thing is thereby shown not to exist and any statements about it are to be considered
either nonsensical or "vacuously true":
Any such barber does shave himself.
Any such barber does not shave himself.
Any such barber has purple hair.
The name of any such barber is Isaac Newton.
All of this is true.
"Vacuously" so.
Russell's Paradox:
More seriously, it is worth pointing out that a fundamental logical
dilemma had to be resolved the same way by the pioneers of set theory:
In what's known as "Russell's paradox", we are asked to consider
the "set" of all sets that are not members of themselves.
Is it a member of itself or not?
Neither answer is acceptable...
Therefore, such a thing can't possibly be called a set!
This paradox was instrumental
in clarifying the (ZermeloFraenkel) axioms of modern set theory:
A "set" cannot be just anything we like.
In particular, the axioms imply that no set can be a member of itself.
Therefore, there is no "set of all sets" (it would be a member of itself).
The collection of sets that are not members of themselves thus includes all
sets and it is not a set itself.
That's the way out of Russell's paradox:
"such a set" simply does not exist, exactly like "such a barber"
cannot possibly exist in the barber's dilemma.
Meaningless Statements in Natural Languages:
A related issue about "natural" languages is that there may be
inherently meaningless statements or descriptions in a language
that's allowed to refer to itself.
For example,
"the integers that can be specified in twenty English words or less"
cannot possibly describe any unambiguous set of integers.
Otherwise, there would be such a thing as
"the smallest integer that cannot be specified in less than twenty English words",
which would thus be specified in only 13 English words...
This goes to show that natural or formal languages may include selfcontradictory sentences,
not only when they refer to themselves (e.g. "This sentence is false.")
but also when they include other references to the language being used.
Bertrand Russell coined the term Berry paradox
for sentences of this latter type, because of a letter he received from G.G. Berry
in 1906
introducing the earliest such paradox by discussing, among transfinite ordinals:
The first ordinal that cannot be named in a finite number of words.
Even if you don't know what an ordinal is,
you can tell that the above phrase must be meaningless.
Its meaning, if it had any, would be an ordinal.
However, by assuming that such a meaning exists, we see that another
ordinal must be described instead.
Thus, the above can't possibly describe any
specific ordinal.
There's actually no paradox here; just a lack of meaning.
(Melissa of Denver, CO.
20001106)
What is infinity and what does it mean?
(Nagaraj of Anamoose, ND. 20001118)
What is the definition of infinity? Please explain to me in details.
Albert Einstein once said that
we can't solve problems by using the same kind of thinking we used
when we created them.Large
numbers begot the concept,
but they are of no help in the definition of infinity.
Let's try common sense first:
A finite set has n elements, for some nonnegative integer n.
Of course, any set which isn't finite in this sense may be called infinite.
Although this common sense approach is perfectly sound, it would seem that
the basic distinction between finite and infinite sets ought not
to depend explicitely on the properties of something
as complicated as the integers.
A more fundamental approach was first suggested by
Dedekind,
who turned a perceived paradox about infinite sets
into a natural definition of infinity:
An infinite set is equipollent to one of its proper subsets.
A subset
is a set contained in the set being considered
(i.e., all elements of one are elements of the other).
A proper subset is a subset
not equal to the entire set.
The empty set is a proper subset of any set but itself.
Two sets are said to be equipollent
when there's a onetoone correspondence between the elements of one and the
elements of the other.
At the most fundamental level (which may not be the most intuitive one)
the existence of infinite sets is built into the
axioms of modern set theory.
Consider the set N of natural integers
{0,1,2,3,4,5,6,...}. One of its proper subsets is the set of
all counting numbersN* = {1,2,3,4,5,6,7,...}.
Yet, there's a trivial onetoone correspondence betweeen those two
(the function y = x+1).
There's also a onetoone correspondence between N and the even numbers,
between N and the squares, etc.
N is thus an example of an infinite set.
The same thing cannot be done with any finite set:
{1,2,3,4,5,6,7} cannot be put in a onetoone
correspondence with any set of fewer than 7 elements.
Nowadays, a set equipollent to a proper subset is dubbed Dedekindinfinite.
Tarski's definition of infinity :
The problem with Dedekind's definition of infinity is that
a proof of its equivalence with the previous definition
(involving integers) requires the
axiom of choice
(or, at least, the weaker Axiom
of countable choice )
which is normally reserved for advanced nonconstructive proofs.
It's undesirable in a context where natural integers
themselves are considered too elaborate.
There are no such problems with the following definition, proposed in the doctoral dissertation
(1924)
of Alfred Tarski (19021983).
It's based on the notion of a minimal element in
a family of sets (a minimal element is a set
not containing another set from that family):
Tarski's Definition of Finite Sets (1924)
A set E is finite if, and only if, every nonempty family
of subsets (call it F) has a minimal element. Formally:
"F Í 2^{E} ,
( F ¹ Æ )
Þ
( $xÎF ,
"yÎF ,
y Ë x )
That clever definition is simply perfect from a theoretical standpoint
(although it certainly doesn't qualify as an intuitive one).
Infinity as a Limit of Large Numbers
If you allow mentally for the existence of such infinite sets (as you should)
you realize that there's no such thing as the "largest" integer.
The process of building larger integers is never ending.
That's what infinity is all about.
This is like the discovery game children play when trying to name the
"largest" number.
Sooner of later, one of them realizes that the game cannot possibly
be won by either player:
Regardless of what number is named, it's always possible to say something
like "twice that" (or "this plus one") and not concede.
Mathematicians routinely study things whose infinite versions turn out to be
much simpler than the finite ones.
One example is the following sum (properly called a "series"):
This sum is equal to 12^{n}
when carried out only to its nth term.
It's simply equal to 1 if all
of the infinitely many terms are added up.
When the ancient Greeks were still wrestling with the concept of infinity,
the above sum was underlying one of
Zeno's paradoxes :
Before an arrow reaches its target, it must first travel
half of the distance to it (1/2), then half of what's left (1/4),
half of what's left after that (1/8), and so forth.
Although there are infinitely many such "steps", the arrow does reach its target. (Try!)
(20120615) The Peano axioms (1889)
Defining the set of the natural integers.
The five Peano Axioms are:
Zero is a natural integer.
Every natural integer has a successor in the natural integers.
Zero is not the successor of any natural number.
Two different natural integers have different successors.
A set which contains zero and the successor of every element in it,
contains all natural integers.
The last axiom is often called the induction axiom,
as the validity of the method of
proof by induction is based on it.
(20050709)
Classes of Ordered Numbers
From integers to rational, real and surreal numbers.
The most rudimentary numbers are the counting numbers
(1, 2, 3, 4 ...) but it's better to let
the story begin with the natural numbers (0, 1, 2, 3 ...)
although zero
is a sophisticated concept of relatively recent origin.
Unfortunately, the tradition in US education
[1,
2, etc.]
differs from the international conventions described here, following
Bourbaki and other authors.
In US highschools, "natural numbers" are often understood to exclude
zero, which makes the term synonymous with "counting numbers".
The dubious convention within the US educational system that makes
"whole numbers" include zero but exclude negative integers
is best forgotten by anybody who wants to be understood outside of the US.
You may talk about "positive integers" or "nonnegative integers" instead.
Negative integers originated after the fractions and
constructible numbers which result from the application
of classical rules (compass and straightedge)
to Euclidean geometry.
Algebraic numbers were also considered before negative numbers were accepted
(along with complex numbers, during the Renaissance).
Real numbers were first formally defined
(as cuts over the rationals)
by Richard Dedekind (1858)
to close all possible "gaps" in the number line.
This is to say that whenever the real numbers are divided into
two sets L and R such that no element of L exceeds
any element of R, there's one element at their common boundary.
(This need not be the case with rational numbers. For example,
there is no rational number at the
boundary between the sets L and R
of rational numbers whose squares are respectively less than 2
and more than 2.)
Archimedes attributes to
Eudoxus
what's now called the Archimedean
property of ordinary numbers:
There's always an integral multiple of any given positive quantity
that exceeds any other prescribed quantity.
The surreal numbers, mentioned above,
are not Archimedean, as they include infinitesimal positive
quantities whose product by any integer can't exceed a
number like 1
(likewise, no integer exceeds an infinite surreal quantity).
TotallyOrdered Number Classes
(each one contains those below it)
ciderspider
(Mark Barnes, UK.
20001011) The Number Line
How do you prove that there are more real numbers than rational ones?
Any infinite set which may be put in a
onetoone correspondence with the integers is
said to be countable (or to have cardinality
Ào).
Infinite sets that are not countable are said to be uncountable.
As shown below, real numbers are not countable,
while rational numbers are.
There is some disagreement about the meaning of the adjective "countable"
when applied to sets which are not knomn to be infinite.
As the term is clearly a mathematical one, it should have the
inclusive meaning normally implied by
mathematical discourse.
In other words, finite sets are countable too.
However, some authors simply won't apply
the term "countable" (or its synonyms:
"denumerable" and "enumerable") to finite sets...
To avoid misunterstandings, you should either clarify things at the outset
(in Numericana, I consider finite sets to be countable)
or use unambiguous locutions, like "countably infinite" or
"countable or finite" (although the latter could be construed as pleonastic).
The uncountability of real numbers :
The diagonal argument of
Georg Cantor (18451918) is the remark that,
with any infinite "list" of real
numbers between 0 and 1 (such a "list" is in onetoone correspondence with the integers),
an unlisted number may be constructed by choosing for its Nth decimal
a digit that's
different from the Nth decimal of the Nth number listed.
To avoid the problem presented by numbers having
two decimal representations
(ending with infinitely many zeroes or infinitely many nines)
never choose a 0 or a 9. This still leaves 7 valid choices.
The number so constructed is not in the list,
since it has only one
decimal expansion and differs from any listed number in at least one decimal digit.
Thus, any list of reals between 0 and 1 fails to contain all
the reals between 0 and 1.
Therefore, the real numbers between 0 and 1 are uncountable.
This is one of the simplest and most beautiful proofs of all times !
Its elegance is often spoiled
by presenting it as a proof by contradiction,
starting with something like: "Suppose there was such a list..."
The countability of rational numbers :
Now, to prove that there are more reals than rationals, it suffices
to show that rationals may be put in a list.
There are neater ways to do so, but we'll describe only a
simpleminded scheme here:
At coordinates (p,q) of a whole planar grid, put the value p/q
[use zero as placeholder, if q is zero].
Starting at the center,
run through the grid along a square spiral like the one pictured at right.
Put in a growing list each ratio you encounter for the first time as you travel
this way through all possible fractions...
Any given rational will eventually appear in that list,
which is another way to say that all of them belong to the list.
(20001011)
Cantor's Ternary Set
Some vanishing sets of reals have as many points as the whole line.
The probability is zero that a randomly chosen real number is rational.
This statement (which could be made precise in the proper context
of Lebesgue measure theory)
holds for the set of rational numbers or
any other countable set.
However, not all sets of numbers with zero probability
(or, rather, zero measure in the sense of
Lebesgue) are countable.
It's easy to describe a set of numbers which has zero probability but yet contains
just as many elements as the whole set of real numbers
(which isn't countable).
One such example is the ternary Cantor set
obtained by removing from a segment
(of numbers from 0 to 1, say) its "middle third" and similarly removing the middle third
of any smaller segment that shows up in this (infinite) process.
Cantor's ternary set, introduced by Cantor in 1883, was first considered by
Henry Smith
in 1875. It's thus the earliest mathematical example of
a fractal endowed with selfsimilarity,
since it actually consists of two copies of itself one third as big.
Its fractional dimension
d (in the sense of Hausdorff) is less than 1:
Equivalently, you may consider the set of all the numbers
(between zero and one) whose decimal expansions contain
only zeroes and nines.
Its probability is zero, in spite of the fact that this set contains
as many elements as the whole number line
(as sequences of zeroes and nines are in onetoone correspondence with
sequences of zeroes and ones, which are
at least as numerous as the uncountable
set of numbers between 0 and 1).
The fractal dimension of that decimal version of the Cantor set is a
familiar number
(smaller than the above dimension of the ternary set):
d = log 2 / log 10 = 0.3010299956639811952...
A selfsimilar countable set of reals
must have dimension 0. Example:
For { 1, 1/2, 1/4, 1/8, ... 1/2^{n} ... }
we have (1/2)^{d} = 1 so, d = 0.
(20020604)
On the fundamental axioms of Set Theory :
Fraenkel's replacement schema (1922) was added to Zermelo's list.
As Russell's paradox once showed,
we cannot simply call a set anything we like.
The "membership" of a set has to be somewhat restricted;
a set simply cannot be a member of itself (or else we'd run into trouble real fast).
On the other hand, infinite sets must be part of the theory,
if it is to be of any use whatsoever.
So, it is postulated at the outset that
some sets exists which may be put in onetoone correspondence with a proper part
of themselves.
Finally, the last axiom listed below (the socalled Axiom of Choice)
postulates that sets need not be explicitely constructible,
but may instead be postulated
to exist as abstract collections of virtual choices.
This is the most controversial axiom of the bunch.
It has nevertheless been shown that,
if Set Theory is consistent without the Axiom of Choice,
then it is also consistent with it.
Various alternate theories have been formulated
where the axiom of choice is replaced by some weaker statement (which would be
a consequence of the fullfledged Axiom of Choice).
A proof using the Axiom of Choice
(or, possibly, one of its weaker counterparts) is usually called nonconstructive,
and is rejected by constructivists.
However, such a proof may serve to demonstrate [even to constructivists]
that the negation
of a nonconstructive theorem cannot possibly be shown to be true within the framework
of the rest of Set Theory (since the Axiom of Choice must be consistent with it)...
Axioms of Set Theory
Existence of the Empty Set :
The empty set Æ has no elements.
( A set with at least one element is called nonempty.)
( $ Æ )
( "x )
( x Ï Æ )
Extensionality :
A set is characterized by its elements.
( In particular, there's only one empty set.)
( "x )
( x Î A
Û
x Î B )
Þ A = B
Subset Schema :
Elements of A with property
b form a subsetB.
( This is to justify the set builder
notation: B =
{ x Î A  b(x) }. )
( $ B )
( "x )
( x Î B
Û
( x Î A
& b(x) ) )
Pairing :
Two elements form a set.
( $ A )
( "z )
( z Î A
Û
(z = x) Ú (z = y) )
Union of a set of sets : C =
È
B
BÎA
( $ C )
( "x )
( x Î C
Û
( $ B Î A )
( x Î B ) )
Power Set :
The power set C = 2^{ A} is the set
of all subsets of A.
( $ C )
( " B )
( B Î C
Û
B Í A )
Regularity :
Set membership is neither reflexive
( A Ï A ) nor circular.
Every nonempty set has an element with which it has no common element.
( " A ¹ Æ )
( $ x Î A )
( " y Î x )
y Ï A
[Zermelo, 1930]
Infinite Sets : One example is
{ Æ, {Æ}, {{Æ}}, {{{Æ}}} ... }
( $ A ¹ Æ )
( " x Î A )
( {x} Î A )
Replacement Schema :
Any collection equipollent to a set is a set.
To obtain cardinals as large as
Àw ,
Fraenkel and/or Skolem
added this axiom in 1922.
In 1923, von Neumann would show
that the assumption is also indispensable to prove that every wellordered set is equipollent to an
ordinal.
The axiom of replacement had first been proposed by
Dmitri Mirimanov from Switzerland, in 1917.
Axiom of Choice :
A set may be formed which contains a single element from every
nonempty set of a given family of pairwise disjoint sets.
( " A Í
2^{ E })
( "MÎA "NÎA
MÇN=Æ )
Þ
( $ C Í E )
( " B Î
A{Æ} )
( $ x Î E )
B Ç C = {x}
More brutally stated:
( "MÎA, "NÎA,
MÇN=Æ )
Þ
( $C,
"B Î
A{Æ},
$x,
B Ç C = {x} )
Several equivalent formulations of the axiom of choice are used.
Among those, the above (in either flavor)
is known as the Multiplicative Axiom.
It's the original formulation of the Axiom of Choice introduced in 1904
by Zermelo to establish that every set can be wellordered.
In 1963, Paul Cohen would show that the Axiom of Choice
is essential to the proof of that "desirable" statement
(i.e., replacing the Axiom of Choice by any weaker
statement would describe a theory of sets in which some sets cannot
be wellordered).
In spite of its annoying consequences (see below) some
formulations of the Axiom of Choice make it look trivial and unavoidable
(e.g., "the cartesian product of any collection of nonempty sets is nonempty").
Even if one finds the nonconstructive proofs based on the Axiom of Choice
particularly repugnant, it does impose welcome constraints on the definitions of key concepts.
For example, in general topology,
the Axiom of Choice is found to be equivalent to
Tychonoff's theorem
(the product of compact topological spaces is always compact,
when compactness and product topology are properly defined).
In that sense, paradoxically, the Axiom of Choice has a
beneficial influence even in the "constructivist" axiomatic framework
(called ZF or ZermeloFraenkel) where it is explicitely rejected.
The full set of axioms stated above is usually referred to as ZFC
(ZermeloFraenkel with Choice).
Note that ZF or ZFC assert only the existence of mathematical objects that
can be defined as sets.
Every mathematical object so defined, with the sole exception of the empty set,
has some kind of sructure, in the sense that some simpler
mathematical objects belong to it.
This remains true even when convenient names are assigned to basic
objects whose settheoretical definition is rarely used, if ever, starting with
the natural integers:
As part of ZF or ZFC, the additional assertion is often understood that all
mathematical objects ultimately reduce to the empty set in that way.
That's all there is to the mathematical universe we may talk about.
This is what establishes the validity of the standard method of proof by
structural induction
(also called mathematical induction ).
Nevertheless, it's also possible to postulate the existence of
atoms (such fundamental elements
or urelements are objects,
different from the empty set, to which no other object "belongs").
In the simplest such model, called ZFA (ZermeloFraenkel with Atoms)
all atoms are alike but they're distinct (e.g.,
there is only one set of two nonempty elements that are both atoms).
(20130215) The Axiom of Restriction
Ruling out infinite descending membership chains.
In 1917, Dmitri Mirimanov
discovered that certain models compatible with the axioms of ZF set theory
allow the existence of infinite descending membership chains:
x_{n+1}Îx_{n}
John von Neumann
found this repugnant and proposed his Axiom of Restriction
to rule that out.
Philosophically, the axiom of restriction disallows sets beyond those
created in finitely many applications of the other axioms of ZF.
Interestingly, if the axiom of dependent choice
is postulated instead of the full axiom of choice, then the axiom of restriction
implies regularity (so regularity no longer needs to be postulated as an axiom).
Here are a few interesting statements that are undecidable within ZF and
proven within ZFC, although their proofs does not require the full power of
the unrestricted axiom of choice (the corresponding theorem
may be true even if the axiom of choice isn't).
Statements equivalent to some of them have been proposed as weaker
alternatives to the full axiom of choice to complement the rest of the ZF axioms.
The HahnBanach theorem and the KreinMilman theorem taken together
are equivalent to the full axiom of choice.
Likewise, the Boolean prime ideal theorem and the KreinMilman theorem taken together
are equivalent to the full axiom of choice.
(20100506)
Are there sets of real numbers that are not measurable?
The Axiom of Choice guarantees it ! (Giuseppe Vitali, 1905)
The Lebesgue theory of integration is based on the notion of
measurable sets of real numbers:
The Lebesgue measure is a nonnegative
real function
m defined on measurable
sets, having the following properties:
It's translationinvariant:
m({x}+A) = m(A)
The measure of an interval
is equal to its length :
m ( [x,y[ ) = yx
It's countably additive :
The measure of a countable union of pairwise disjoint sets
is the (countable) sum of their measures.
Surprisingly enough, the Axiom of Choice
then implies that there are sets
of real numbers that are not measurable. Here's the proof:
Consider the equivalence relation among real numbers in the interval
[0,1[ whereby two numbers are dubbed equivalent when their
difference is a rational number (i.e., the quotient of two integers).
By the Axiom of Choice, we may form a set W
(called a Vitali set )
containing one and only one number from every equivalence class.
If W was measurable, then every set W_{q} =
{ (q+x) mod 1  x Î W }
would be measurable of measure M, for any rational number q.
Well, M couldn't be positive because the
union of finitely many W_{q}
(they're all pairwise disjoint)
would then have a measure exceeding unity.
M couldn't be zero either, because that would make
the measure of [0,1[ vanish
(as the union of countably many disjoint sets of measure zero).
Therefore, W cannot be Lebesguemeasurable.
Note that the above relies on the
Archimedean property of real numbers.
If we were investigating surreal measures,
we would just have established that some sets of reals must have
infinitesimal nonzero measures...
Robert M. Solovay (b. 1938)
has argued in favor of dropping the Axiom of Choice from the standard axioms
of set theory to ensure that every set of real numbers is Lebesguemeasurable.
This would make every linear map on a Banach space continuous
(a socalled dream space ).
H.G.
Garnir (19211985)
One possiblity, advocated by the late Belgian mathematician
Henri Garnir
(19211985)
is to replace the Axiom of Choice (AC) by the weaker
Axiom of Dependent Choice (DC)
and to postulate that every set of reals has the
Baire property
(which is incompatible with the full axiom of choice).
In that system, every set of reals is Lebesguemeasurable.
(20141204)
Cartesian Products
Binary or multiple cartesian products and infinite ones.
The cartesian product of two sets A and B
is the set, denoted
A ´ B , of all ordered pairs
(a,b) whose first element (a) is
in A and whose second element (b) is in B.
Multiple cartesian products :
This definition is readily extended to the cartesian product of a finite sequence
of n sets, which consists of all the ntuples formed
with an element from every set, in the stated order.
The cross symbol (´) used to denote cartesian
products doesn't have a fixed arity.
The lack of parentheses is meaningful
(it's not like an associativebinary operator, where parentheses can be omitted or not).
For example:
(1,2,3) is an ordered triple belonging to
´
´
=
^{ 3}
(1,(2,3)) is an ordered pair belonging to
´
(
´
)
=
´
^{ 2}
From the highbrow perspective of category theory where new constructions are
defined up to isomorphism, that distinction isn't made and the corresponding
categorial product is indeed a binary associative operator
(at least when all relevant products are welldefined, as would be the case in a cartesianclosed category).
Indexed and/or infinite cartesian products :
Far more generally, we may define the cartesian product of a family of sets
indexed by a set I
as a function which sends any element i of I to
some element of E_{i} .
When I is finite, this is equivalent to our previous description.
When at least one of the sets involved is empty, the cartesian product is empty.
However, the converse is only true if we postulate the Axiom of Choice
which is equivalent to the following statement:
The cartesian product of nonempty sets is not empty.
Formal definitions of cartesian products :
The intuitive notion of ordered pairs ought to be
defined first. This allows binary cartesian products to be defined
as above. Building on that, relations and functions can be stated to be subsets
of binary cartesian products (as is done in the next section).
Only then, can the above definitions for multiple or infinite cartesian products be given
in terms of "functions".
Classically, an ordered pair are awkwardly defined in terms
of the more basic concepts of set theory:
The pair (x,y) is defined as the set {x,{x,y}}.
(20141204)
Binary Relations
A binary relation between two set is a subset of their cartesian product.
Technically, a [binary] relation between two sets is just a
subset of their cartesian product.
When (a,b) is
in that subset, we say that "a is related to b"
(it's not the same thing as "b is related to a"
since we are talking about ordered pairs here).
For a relation R, the infix notation a R b
means (a,b) Î R.
The relation R from A to B is said to be
entire when any element of A
is related to some element of B.
Composition of Relations :
If R is a relation from set A to B
and S is a relation from B to C,
then the composition T = S o R
is the relation from A to C defined by:
a T c
Û
{ $ b Î B ,
a R b , b S c }
Readers who have been exposed to this notion for functions
(which are just a special type of relations,
as introduced below) may remark
that the above reduces to the familiar composition of functions, when applicable.
If the relation R is a function,
a R b means the same as b = R ( a )
The peculiar order of the operands in the above definition
is meant to accomodate functional definitions. Indeed,
if R and S are functions, then the above definition translates into:
c =
S o R ( a ) = S ( R ( a ))
The composition of general binary relations (not necessarily functions)
is especially useful to define uniform spaces,
whose completeness can be determined even when distances are undefined.
Binary Relations on a Set :
A relation R from a set A to itself could verify the following properties.
Reflexivity :
" x Î A , x R x
(thus, reflexive relations are entire).
Symmetry :
x R y Þ y R x
Antisymmetry :
{ x R y & y R x } Þ x = y
Transitivity :
{ x R y & y R z } Þ x R z
Those concepts are used to define several important types of relations:
A preorder relation is reflexive and transitive.
An equivalence relation is reflexive, symmetric and transitive.
An ordering relation is reflexive, antisymmetric and transitive.
Equivalence Classes, Quotient :
When an equivalence relation R is defined on the set A,
that set can be partitioned into disjoint equivalence classes
such that two elements are related iff
they belong to the same class. Any element of A belongs to
one and only one such class. The empty set isn't an equivalence class.
The set of all those equivalence classes is denoted A/R and is called
the quotient of A by R
(also dubbed Amodulo R).
For example the set of signed integers
can be construed as the cartesian square
^{ 2 }modulo the following relation R :
(a,b) R (c,d)
Û
a + d = b + c
The motivation is to make subtraction a welldefined operation verifying:
(a , 0)  (c , 0) =
(a , c)
This construction is one key reason why it is best to include zero in the set of natural integers
(zero is excluded from
* )
in spite of the dubious tradition prevalent in US highschools.
It is then a simple (albeit tedious) exercise to show that the set
so constructed is given a ring structure using the
following operations which respect the equivalence R
(i.e., the equivalence class of a result depends only on the equivalence classes of the operands).
(a,b) + (c,d) =
(a + c , b + d)
(a,b) (c,d) =
(ac + bd , ad + bc)
Wellfounded relations :
A relation is wellfounded
when every subset has a "minimal element" (to which no other element of that subset is related).
(20100923)
Functions and applications are special relations. Some
functions from A to B
are injective, surjective or both.
A function f from set A to set B is something
that associates one and only one element
f (x) of B with every element x of A.
Formally, functions are special relations between
the elements of A and B,
in the above sense.
Bourbaki calls the above an
application.
A partial functionf from A to B is
merely an application from a subset of A
(called the definition domain of f ) to set B.
When there's no risk of confusion, the qualifier "partial" may be omitted.
Thus, an application f verifies the following characteristic
properties:
f Í
A ´ B
( " x Î A )
( $! y Î B )
( (x,y) Î f )
By convention, the above is expressed with the following notations:
f :
A
¾® B
x
¾® y = f (x)
A function f from set A
to set B is said to be surjective
when every element of B is the image of some element
of A.
(The idiom "onto" is synonymous with surjective.)
A function is said to be injective when no element of
B is the image of more than one element of A.
(The idiom "onetoone" is synonymous with injective.)
Both terms were introduced by
Nicolas Bourbaki
(the collective pen name of a famous ongoing collaboration of industrious
mathematicians, mostly French, founded on January 14, 1935).
A function that's both surjective and injective is said to be
bijective, or "onetoone onto".
It's also said to be a "onetoone correspondence".
Functions that are surjective, injective, or bijective are respectively called
surjections, injections, or bijections.
Direct and reciprocal images of a set :
For a given function f from A to B,
the following definitions and notations are commonly used:
Direct image (or simply image)
of a subset A' of A :
_{ }f ( A' ) =
{ y Î B 
$ x Î A' , y = f (x) }
Inverse image^{ } (also called preimage)
of a subset B' of B :
f^{ 1 }( B' ) =
{ x Î A 
$ y Î B' , y = f (x) }
It's useful to know that the [direct] image of an intersection is contained
in the intersection of the [direct] images, but is not necessarily equal to it
(unless the function is injective).
In particular, the images of two disjoint sets aren't necessarily disjoint
(unless the function is injective).
On the other hand, we do have equality for unions of direct images and for
both unions and intersections of inverse images. All told:
f ( X Ç Y )
Ì
f ( X ) Ç
f ( Y )
f ( X È Y ) =
f ( X ) È
f ( Y )
f^{ 1 }( X Ç Y ) =
f^{ 1 }( X ) Ç
f^{ 1 }( Y )
f^{ 1 }( X È Y ) =
f^{ 1 }( X ) È
f^{ 1 }( Y )
The proofs are left as an enlightening exercise for the reader.
One application of those relations pertains to the characterization
of the continuity of a function in terms of
the conservation of certain topological properties of sets for either
their direct or their inverse images...
(20050726)
Cantor's Theorem:
Any set is smaller than its powerset. No
function from A to 2^{A}
can possibly be surjective (1891).
A beautiful proof :
Consider anyfunctionf from set A to its powerset 2^{A}
(by definition, 2^{A}
is the set of all the subsets of A).
Let M be the subset of A consisting of all the
elements x of A such that
x Ï f (x).
Well, M is clearly an element of 2^{A}
which is not the image of any
element x of A.
(HINT: Such an x could be neither in
M nor outside of it.)
Thus, our arbitrary function f is not surjective.
Therefore, there's no surjection from A
to its powerset.
This much seems obvious for finite sets,
but the above proof shows that it holds for all infinite sets as well.
The reader is encouraged to check the above wording for conformity with
the Axioms of Set Theory presented in the
above article,
assuming familiarity with the concept of a function.
The above result is known as Cantor's theorem and it provides
yet another argument (formerly known as the paradox of
BuraliForti) against the existence of a set of all sets
because the powerset of such a thing could not possibly
be of greater cardinality than itself.
This argument was put forth by
Cesare BuraliForti
(18611931)
in 1896 and predates the formulation of Russell's paradox
(1901).
(20031110)
Ào
and the transfinite cardinals
On the different sizes of infinite sets.
Two different approaches make sense of actual
infinite numbers which were both investigated by
Georg Cantor (18451918).
The cardinal numbers are discussed in this section;
ordinal numbers will be considered later.
Equipollence, Cardinality and Cardinal Numbers
Two sets which can be put in onetoone correspondence
are said to be equipollent.
This is a fundamental notion which does not require an ability to count
the elements in either set; the ability to pair them up suffices.
Loosely speaking, equipollence is an equivalence relation
and we can give a name to each equivalence class.
Such a name is called a cardinality.
Two sets are equipollent if and only if they
have the same cardinality.
WARNING : There's no such thing as the
"set of all cardinalities". This was known to
Cantor
even before Russel pointed out that
there is no such thing as the "set of all sets".
Of course, finite sets, have the same cardinality if and only
if they have the same number of elements.
If you have as many apples as you have oranges,
you may count either the apples or the oranges and you'll
come up with the same number (0 if you have no fruit).
One may thus consider natural integers (0,1,2,3,4 ...) to be
indicative of the
cardinality of either set (apples or oranges).
In that capacity, the natural integers are
called cardinal numbers.
The fancy vocabulary may be intimidating,
but the whole thing is really the most basic kind
of numbers,
only described in a philosophical way which can be extended
to infinite sets...
Can infinite sets be "counted" too?
Well, the finite integers are clearly not up to the task.
But can't we just add one more "number"
(using tentatively the "¥" symbol for it)
and say that "¥" is the number of elements
in any infinite set?
Cantor showed that this won't do,
because there are pairs of infinite sets which cannot
be put into onetoone correspondence with each other.
To be consistent with the very concept of cardinal numbers presented
above, there has to be more than one "transfinite" cardinal number.
The symbol Ào (pronounced
"aleph zero", "aleph null", or "aleph nought") was introduced by Cantor to
denote the smallest of these;
it's the number of all the [finite] integers.
An infinite set with this many elements is said to be countable.
An uncountable set is an infinite set which is not countable.
Cantor's diagonal argument
shows that the continuum of the real numbers
is uncountable.
Its cardinality (called the power of the continuum)
is the same as that of the powerset of the integers (the set of all sets
of integers)
as can be established directly by observing that
Pierce expansions yield a onetoone
correspondence between sets of positive integers and the reals between 0 and 1.
By Cantor's theorem,
the powerset of any set has a greater cardinality
than that of the set itself.
Therefore, there must be infinitely many different infinite cardinalities and
infinitely many transfinite cardinals, including, at least, the following
sequence of socalled beth numbers :
Á_{0} =
À_{0}
; Á_{1} = C =
2
Á_{0}
; Á_{2} =
2
Á_{1}
... Á_{n+1} =
2
Á_{n}
_{ }...
Georg Cantor used the notation
C = 2^{Ào}
for the power of the continuum. and
called À_{n+1}
the smallest transfinite cardinal beyond À_{n}
(that definition is valid only if the
axiom of choice is retained).
With that notation, in 1877, he conjectured the continuum hypothesis (CH)
which states that there's nothing between
Ào and C :
À_{1} = C
The generalized continuum hypothesis (GCH) is the deep generalization:
À_{n} =
Á_{n} for any ordinal n
Cantor could not prove or disprove the [restricted] continuum hypothesis,
Neither could anyone else... because this can't be done,
as discussed next.
(20090411)
The Continuum Hypothesis (CH)
Is any uncountable set of real numbers
equipollent to the whole line?
Cantor thought that any uncountable subset of a continuum
was equipollent to the whole continuum.
He formulated this statement, known as the continuum hypothesis,
in 1877 and it bedeviled him till the day he died.
The reason why Cantor failed to prove or disprove
the Continuum Hypothesis is that it cannot be proved or disproved.
It's undecidable...
The axioms of set theory are equally consistent
if we rule in or rule out the existence of transfinite
cardinals strictly between À_{0 }
and C.
This was established in 1963 by
Paul Cohen
(19342007) who was awarded the
Fields medal for that work, in 1966.
Some preconditioned versions of CH are true :
In 1884, Cantor observed that an uncountable
closed set of real numbers must have a subset
equipollent to the Cantor set and, therefore,
must be equipollent to the entire real line.
So, the continuum hypothesis holds for closed sets.
More generally, Felix Hausdorff (18681942)
proved, in 1916, that any uncountable Borel set of real numbers is
equipollent to the real line.
(20031110)
w
and Cantor's transfinite ordinals (1897)
Counting to infinity and beyond.
The second kind of transfinite numbers introduced by Cantor
are called transfinite ordinals.
They are more complicated than cardinals.
Following von Neumann (1923)
we may start with the following remark:
A natural integer may be represented by
the set of all nonnegative integers before it,
starting with the empty set ({} or Æ)
for 0 (zero) because there are no nonnegative integers before 0.
Then 1 corresponds to {0}, 2 is {0,1}, 3 is {0,1,2}, etc.
For the whole set of nonnegative integers {0,1,2,3...} Cantor introduced the symbol
w.
He did not stop there, since {0,1,2,3...w}
corresponds to another transfinite ordinal, which is best "called"
w+1.
{0,1,2,3...w, w+1} is w+2, etc.
However, you end up doing two different
types of counting if you enumerate a finite set first and an infinite set second or
the other way around...
The addition of ordinals is associative but not commutative:
1 + w
= w
¹ w + 1
Incidentally, the maximum number of different results
which can be obtained by changing the order of the terms in a Cantorsum of n
ordinals may be tabulated as follows.
(The last pair of lines gives the general pattern,
which has exceptions only below n = 15.)
Number of different ways to add n ordinals
(A005348)
0
1
2
3
4
1
1
2
5
13
5
6
7
8
9
33
81
193
449
1089
10
11
12
13
14
2673
6561
15633
37249
88209
5k + 15
5k + 16
5k + 17
5k + 18
5k + 19
33 ´ 81^{ k+2}
81^{ k+3}
193 ´ 81^{ k+2}
193^{ 2} ´ 81^{ k+1}
193^{ 3} ´ 81^{ k}
The addition of infinite numbers such as w
may also be defined in other ways which retain commutativity.
The best such approach was proposed by
John H. Conway,
who defined w essentially as
above, but in the context of socalled surreal numbers,
a very satisfying generalization of the familar concept of "numbers"
(including integers, rationals and real numbers ).
Infinite Numbers among Surreal Numbers :
Expressions like
w1,
2w,
w^{2},
Öw,
w^{w}
w^{ww}, etc.
have consistent meanings in the superb
construction of John Horton Conway presented
next.
(20031110)
Surreal Numbers (John H. Conway, 1972)
What's the most general type of ordered numbers?
The most general type of number line
(as opposed to unordered things like complex numbers
or padic numbers) turns out to be
simpler to define than the familiar real numbers.
With the right approach,
it's actually easier to include infinitesimals and transfinite ordinals
than to rule them out...
The approach is inspired by the "Dedekind cuts" introduced in 1858
by the last doctoral student of
Gauss,
Richard
Dedekind (18311916) to define real numbers as gaps
"between" partitions of the rational line
(this was well before Georg Cantor identified real numbers with
equivalence classes of Cauchy sequences).
A Dedekind cut consists of two disjoint sets
L and R whose union is the entire set of rationals.
L does not have a greatest element and contains all rationals to the left of any
member of L, whereas R may
have a lowest element and contains any element to the
right of any member of R.
The mnemonic notations L and R ("left" and "right") are due to
J.E. Littlewood (18851977).
The term surreal numbers was coined by
D.E. Knuth
in 1973 to describe the beautiful construct of
John H. Conway,
who gave simultaneously recursive definitions of the numbers
and the ordering among them...
Conway introduces the compact notation
x = { L  R } =
{ x^{ L}  x^{ R} }
for what is called a game;
an ordered pair of two sets L and R, given either by name or by listing their
elements, all of which are [simpler] games themselves...
An element of L or R may be called a left or right option of x.
Numbers are then defined as special games whose options are other numbers,
and which lay "between" their own left and right options, in the following
sense, recursively defined:
Part of Conway's notation is standard: If L or x^{ L }
is a set, then L+y is universally understood as the set of all elements obtained
by adding y to an element of L, just like E+F is the set whose elements
are sums of an element of E and an element of F.
If desired, the rest of Conway's notation could easily be translated into
the standard notation of set theory,
by stating that a number may stand for a singleelement set within
the "first level" of a "{......}" notation and that a comma ( , )
is synonymous with the "union" operator
( È ) in that same context.
Here are singleline definitions of some basic arithmetic operations:
Sum: x + y =
{ x^{ L }+ y , x + y^{ L} 
x^{ R }+ y , x + y^{ R} }
(20050709)
Multidimensional (Hypercomplex) Numbers
Real and complex numbers. Quaternions, octonions, sedenions, etc.
A division algebra over a field
(here, we use only the field of reals)
is an algebra
in which every nonzero element has a multiplicative inverse
(this doesn't rule out divisors of zero because no associativity is postulated).
The CayleyDickson construction doubles
the dimension of a previously established set of numbers,
by considering ordered pairs of such numbers
and defining their sums, products and conjugates in the following way.
(The conjugate of z is denoted z* and the conjugate of a real
number is itself.)
The product of a number and its conjugate is always
a nonnegative real number, equal to the square of that
number's norm, by definition.
( a , b ) + ( c , d )
=
( a + c , b + d )
( a , b ) ( c , d )
=
( ac  db* ,
a*d + cb )
( a , b )*
=
( a* , b )
A trivial induction, based on this last relation, would prove that z** = z.
Likewise, the identity
( x y )* = y* x* is clearly true in the realm of real numbers
and is established for all hypercomplex numbers by induction:
[ ( a , b ) ( c , d ) ]*
=
( ac  db* ,
a*d + cb )*
=
( c* a*  bd* ,
 a*d  cb )
( c* , d )
( a* , b )
=
( c* a*  bd* ,
 cb  a*d )
Hypercomplex Numbers: Normed Division Algebras (multidimensional
numbers with real coordinates)
Arguably, the above process should stop with the octonions, as the
next step yields a strange realm (the sedenions) where two
nonzero elements may have a zero product.
It seems abusive to call numbers
things that include such divisors of zero,
except for recreation (e.g., decadic
integers).
By contrast, the noncommutativity of quaternions is benign.
The resulting algebraic structure
is called a skew field, and it repays study.
Sir William Rowan
Hamilton
(18051865) invented the quaternions
on October 16, 1843.
The
next day,_{ } he told John T. Graves about it.
John T. Graves (18061870)
came up with octonions on December 26, 1843.
They were rediscovered by Arthur Cayley
in March 1845.
In July 1844, Hamilton
realized that octonions failed to have the crucial property
for which he coined the word associativity, now universally used:
"x
"y
"z
( x y ) z = x ( y z )
A nonassociative division algebra may
contain divisors of zero.
In an associative division algebra, a product of nonzero factors is never zero.
In 1886, Frobenius
proved that the reals, the complex numbers and the quaternions
are the only three associative division algebras over the reals.
Multiplication of octonions isn't associative,
it's only alternative :
"x
"y
(xx)y = x(xy) and y(xx) = (yx)x
Alternativity is less demanding than associativity.
Less demanding still is the
powerassociativity discussed below.
The CayleyDickson construction transforms a powerassociative
algebra into another powerassociative algebra, so all the algebras discussed here are
powerassociative.
(20130708)
Beyond the sedenions :
Colorful names or systematic nomenclature.
There has been at least one colorful attempt to name some
CayleyDickson algebras
beyond the 16dimensional sedenions.
Tony Smithreports the
macaronic
tongueincheek nomenclature proposed by Robert de Marais :
Pathions (32 dimensions). Symbol: P.
The 32 paths of wisdom
in the Kabbalistic tree of life.
Chingons (64 dimensions). Symbol: X (since
is for dimension 2).
The 64 Hexagrams
of the I Ching (Book of Changes).
Voudons (256 dimensions). Symbol: V.
The 256 combinations from the
Ifá corpus of
Voudou (Tony Smith).
In the unlikely case something less silly is ever needed, I am hereby offering a
a systematic nomenclature based on the "plex" root, for two reasons:
Prior usage of "plex" in numerical prefixes to denote exponents.
Evocation of the relevant words "complex" and "hypercomplex".
With plexions denoting real numbers,
elements of the 2^{n}dimensional real algebra
could be called nplexions where the number
n can be expressed with standard numerical prefixes:
The reals are noplexions, the complex numbers are monoplexions, the quaternions are diplexions,
the octonions are triplexions, the sedenions are tetraplexions.
This goes on with pentaplexions (32 dimensions),
hexaplexions (64 dimensions),
heptaplexions (128 dimensions),
octaplexions (256 dimensions),
enneaplexions (512 dimensions),
decaplexions (1024 dimensions),
hendecaplexions (2048 dimensions), etc.
(20121220)
Powerassociativity is a very weak form of associativity.
Any "multiplication" needs that for exponentiation
to be welldefined.
A multiplication is said to be
powerassociative
when positive powers
can be recursively welldefined as follows,
for any element a :
a^{1} = a
and
" m≥1 ,
" n≥1 ,
a^{n+m} = a^{n}a^{m}
The innocentlooking requirement that the above be "welldefined" entails
an infinite sequence of independent conditions :
a^{3 } is welldefined only if
aa^{2 } = a^{2 }a
a^{4 } is then welldefined only if
a a^{3 } =
a^{2 }a^{2 } =
a^{3 }a
a^{5 } is then welldefined only if
a a^{4 } =
a^{2 }a^{3 } =
a^{3 }a^{2 } =
a^{4 }a
Etc.
If a powerassociative multiplication has a neutral element (1)
the zeroth power of any element,
invertible or not, is also welldefined:
" a ,
a^{0} = 1
Invertible
elements can also be raised to the power of a negative integer:
a^{n} = (a^{1})^{n}
Examples of powerassociative operators :
Alternative multiplications are powerassociative,
so is the multiplication in any set of hypercomplex
numbers (even beyond the sedenions)
or any commutative multiplication having the following
property, known as Jordan's identity (which is verified,
in particular, by the Jordan product
used in quantum mechanics).
"x
"y
x ( y ( x x )) = (x y) ( x x )
A.A. Albert (1947) defines flexibility as the following property:
"x
"y
x ( y x ) = (x y ) x
On the weakest form of subassociativity :
The following property is called poweralternativity :
"x ,
x ( x x ) = (x x ) x
This merely means that cubes are welldefined. (Any use for that?)
The ugly multiplication tabulated below for a set of 3 elements,
has this property (since any square equals A and everything
commutes with A). This makes all cubes welldefined:
A^{3} = A,
B^{3} = C,
C^{3} = B.
However, neither B nor C have welldefined
fourth powers :
A
B
C
A
A
C
B
B
C
A
B
C
B
C
A
B B^{3} = B C = B
B^{2 }B^{2} = A A = A
B^{3 }B = C B = C
C C^{3} = C B = C
C^{2 }C^{2} = A A = A
C^{3 }C = B C = B
Therefore, this operation isn't powerassociative.
For commutative nonassociative rings or algebras
whose characteristic is either 0 or
coprime with 30, A.A. Albert has shown (1947) that,
if fourth powers are welldefined, then the multiplication is powerassociative.
When the characteristic is 3 or 5, the same holds whenever fifth and sixth
powers are welldefined (Louis A. Sokotoris, 1952).
(20050719)
Von Neumann  Bernays  Gödel set theory
(NBG) Sets belong to classes.
A proper class isn't a member of anything.
Everything in NBG set theory is a class
(a concept undefined in ZFC set theory)
but only those classes that are a member of another class are called sets.
Any collection of sets to be considered collectively as a class
but such classes cannot be considered collectively unless they are sets.
Quantified variables in the language of NBG can only be sets, not
proper classes.
In the 1990's, it was established that NBG is a
conservative
extension of classical set theory (ZFC)
in the sense that NBG introduces
new language but no new theorems expressible in the language of ZFC.
In view of the fact that they have essentially the same theorems,
it's surprising to observe that NBG can be axiomatized with
finitely many statements and that ZFC cannot
(the latter requires at least one
schema of axioms
which asserts a given pattern rather
than a definite sentence about objects within the theory).
There's no need for a "statement about statements" which could become nonsensical
if the object statements feature free variables whose names
are used in the main statement.
(In the LISP programming language, the socalled FUNARG problem/bug
may arise when functions are passed as arguments to another function.)
For example, within NBG, we may speak of the class of all sets,
which is called the Absolute (always spelled
with a capital A).
The Absolute is a proper class because of
Russel's argument.
That statement isn't a "new" theorem because it can't be expressed at all in the language
of ZFC. It may be more satisfying to say that "the class of all sets is proper" rather
than "there's no such thing as the set of all sets" but both statements are
equivalent in a deep sense...
Gödel marveled at the fact that something
can be known about the Absolute by pure reason, namely:
Anything which is true of the Absolute is also true of at least one set.
(20140119)
Tarski  Grothendieck set theory (TG)
The nonconservative extension of ZFC
used in the Mizar system.
Around 1973, Andrzej Trybulec
chose TG set theory as the basis for his Mizarproofchecker
(now, the most popular system of its kind).
TG set theory is essentially ZFC with the addition of Tarski's axiom
(originally formulated by Tarski in 1939).
All objects are sets;
the notion of class isn't part of the language of TG.
Tarski's axiom states (in modern language)
that every set is a member of at least one Grothendieck universe,
where a Grothendieck universe
is, loosely speaking, a set within which all usual mathematical operations can be performed.
More precisely, a set U is a Grothendieck universe if and only if the following four
conditions are met:
If x belongs to U and y belongs to x, then y belongs to U.
If x and y belong to U, then the set {x,y} belongs to U.