
# Measure Theory: Densities, Distributions...

[ Atomists have said that ]
bodies are composed of surfaces,
and surfaces of lines, and lines of points.

"De Luce" (c. 1235)  Robert Grosseteste (1168-1253)

### Related articles on this site:

Measurable Sets  with respect to an outer measure (PlanetMath.org)
Does Benford's law apply to prime numbers?   by Chris K. Caldwell
Chebyshev's Bias   by John Derbyshire

## Probabilities, Measures and Distributions

(2016-01-23)   Mind-boggling open cover of the rationals:
Countable union of intervals with lengths in geometric progression.

As the rationals between 0 and 1 are  countable,  all of them can be included in a discrete sequence.  Any ugly way to do so may be employed but we may as well use the nice sequence provided by the  Stern-Brocot tree:

q1 = 1/2,  q2 = 1/3,  q3 = 2/3,  and so on:  1/4, 2/5, 3/5, 3/4, ...

Let  $\varepsilon > 0$  be less than  1/3.  Consider the following open interval:

$$ I_n \;=\; \left] \, q_n - \varepsilon^n \, , \; q_n + \varepsilon^n \, \right[ $$

For the above choice of sequence, this is contained in  ]0,1[  (the proof is simple but tedious).  Otherwise, we redefine the interval by intersecting the above with  ]0,1[.  Either way, the length of  $I_n$  is  $2\varepsilon^n$  or less.

Therefore, the measure of the union of all intervals  $I_n$  does not exceed the sum of the geometric series  $2\varepsilon + 2\varepsilon^2 + 2\varepsilon^3 + \cdots = 2\varepsilon/(1-\varepsilon)$.  When  $\varepsilon$  is less than  1/3, this upper bound is less than 1.

It's quite counterintuitive that all rationals between 0 and 1 are included in a union of intervals whose total length is less than one.  Moreover, since the above  $\varepsilon$  can be as small as desired, that total length can be arbitrarily small.  Therefore, the  measure  of the set of rationals between 0 and 1 is nonnegative but less than any positive quantity.  It must be zero.
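The construction above can be checked numerically. The following is only a sketch: the helper names `stern_brocot` and `cover_length_bound` are made up for this illustration, and the enumeration assumes the breadth-first reading of the Stern-Brocot tree that yields 1/2, 1/3, 2/3, 1/4, ... Exact rational arithmetic is used so that no rounding blurs the comparison with the bound.

```python
from collections import deque
from fractions import Fraction
from itertools import islice

def stern_brocot():
    """Yield the rationals of ]0,1[ in breadth-first order: 1/2, 1/3, 2/3, 1/4, ..."""
    queue = deque([((0, 1), (1, 1))])        # pairs of neighbouring fractions (a/b, c/d)
    while True:
        (a, b), (c, d) = queue.popleft()
        m = (a + c, b + d)                   # mediant of the two neighbours
        yield Fraction(*m)
        queue.append(((a, b), m))            # left subtree
        queue.append((m, (c, d)))            # right subtree

def cover_length_bound(eps, terms):
    """Sum of the lengths 2*eps**n of the first `terms` intervals I_n (n = 1, 2, ...)."""
    return sum(2 * eps**n for n in range(1, terms + 1))

first = list(islice(stern_brocot(), 7))
eps = Fraction(1, 10)
partial = cover_length_bound(eps, 50)        # total length of the first 50 intervals
limit = 2 * eps / (1 - eps)                  # sum of the whole geometric series
print(first, float(partial), float(limit))
```

With this choice of $\varepsilon$ the whole cover has total length at most 2/9, and shrinking `eps` shrinks that bound accordingly.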

Music and Measure Theory (13:12)  by  Grant Sanderson  (3Blue1Brown, 2015-10-03).

Constantin Carathéodory (1873-1950)   |   Lebesgue's number lemma

(2014-02-28)   Outer measures of subsets:
Nonnegative functions of sets which are countably subadditive.

Constantin Carathéodory (1873-1950)   |   Carathéodory's extension theorem

(2014-02-28)   Lebesgue measure of sets of real numbers:
The  Borel tribe  is the σ-algebra generated by the open sets; every Borel set is measurable.

By definition, the Lebesgue measure of a nonempty open interval  ]a,b[  of real numbers is equal to its  length :

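$$ \lambda\left( \, ]a,b[ \, \right) \;=\; b - a \;>\; 0 $$

The  outer measure  of a set of real numbers  E  is defined as the greatest lower bound of the sums of the lengths of countable collections of open intervals that cover it:

$$ \lambda^*(E) \;=\; \inf \left\{ \; \sum_i \, ( b_i - a_i ) \;\; \middle| \;\; E \subseteq \bigcup_i \; ]a_i , b_i[ \; \right\} $$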

Lebesgue measure (1901)   |   Henri Lebesgue (1875-1941)

(2014-02-28)   Stieltjes measures
Forerunners of the general notion of distributions.

Lebesgue-Stieltjes integration

(2004-11-15)   Relative densities of sets of integers.
Attempting to compare the densities of two sets of integers.

Let Q be some  divergent series  $\sum q_n$  of positive terms.

For two sets of natural integers E and F, the relative density of E and F  [with respect to Q]  is defined as the following limit, whenever it exists :

$$ D_Q (E \mid F) \;=\; \lim_{m \to \infty} \; \frac{\displaystyle \sum_{n \in E,\; n \le m} q_n}{\displaystyle \sum_{n \in F,\; n \le m} q_n} $$

The sum on the numerator (resp. the denominator) is understood to be over all indices in E (resp. F) which do not exceed m.

When N is the entire set of natural integers, the quantity D(E | N) may be called the [absolute] density of E with respect to Q.  We denote it  D(E).

D(E | F)   =     D(E)  /  D(F)

Note that, in the above equation, the quantity on the left-hand side may be defined even when neither of the two quantities on the right-hand side is.

When  $q_n = 1$,  we call the above natural densities.  When  $q_n = 1/n$,  they are called logarithmic densities.  Both types have been used fairly extensively.

We could also consider  $q_n = 1/(n \log n)$  (for n>1), or  $q_n = 1/(n \, \log n \, \log\log n)$  (for Log(n)>1), etc.  This would make Q diverge very slowly.
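To make the difference between these weightings concrete, here is a small sketch that evaluates the ratio of partial sums for a finite m. The function name `relative_partial_density` and the test set (integers whose leading decimal digit is 1) are choices made for this illustration only. For that set, the natural ratios swing between roughly 5/9 and 1/9, whereas the logarithmic ratios are known to settle, slowly, near log(2)/log(10).

```python
import math

def relative_partial_density(in_E, q, m, in_F=lambda n: True):
    """Ratio of the sum of q(n) over n in E to the same sum over F, for 1 <= n <= m."""
    num = sum(q(n) for n in range(1, m + 1) if in_E(n))
    den = sum(q(n) for n in range(1, m + 1) if in_F(n))
    return num / den

def leading_one(n):
    """True when the decimal expansion of n starts with the digit 1."""
    return str(n)[0] == "1"

def natural(n):
    return 1.0          # q_n = 1

def logarithmic(n):
    return 1.0 / n      # q_n = 1/n

# Two values of m at which the natural ratio is near its extremes.
nat_hi = relative_partial_density(leading_one, natural, 1999)
nat_lo = relative_partial_density(leading_one, natural, 9999)
log_hi = relative_partial_density(leading_one, logarithmic, 1999)
log_lo = relative_partial_density(leading_one, logarithmic, 9999)
print(nat_hi, nat_lo, log_hi, log_lo, math.log10(2))
```

The optional `in_F` argument gives the relative version D(E | F); leaving it at its default compares E with all the positive integers.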

On the other hand, we may consider for  Q  a diverging geometric series of ratio z, of which the above series  (1+1+1+...)  is a special case  (z = 1).  We may define the  geometric density  as a function of  z :

$$ f(z) \;=\; D_z (E) \;=\; \lim_{m \to \infty} \; \frac{\displaystyle \sum_{n \in E,\; n \le m} z^n}{\displaystyle \sum_{n = 0}^{m} z^n} $$

This function is well-defined only when it is constant  (= 1).

Natural density

(G. P. Michon, 2000-10-15)   What would you say is the "probability" that a randomly chosen integer has an even number of digits?

Note:  Although this is a minor issue in the following discussion, we must point out that the integer 0 is best considered to have "zero digits" (it does not have a leading nonzero digit).  When k is 1 or more, the integers with k digits in base B (which need not be ten) thus go from  $B^{k-1}$  to  $B^k - 1$,  included.

There are many foolproof ways to define a proper probability distribution over the set of natural integers...  Consider any convergent series of positive terms which add up to 1, and define the probability of a set E of integers to be the sum of all such terms whose rank is an integer in E.  For example, if we use a geometric series of positive ratio  $\lambda < 1$,  we might define the probability of E as:

$$ P_\lambda (E) \;=\; (1-\lambda) \sum_{n \in E} \lambda^n $$

This is safe, but this feels "artificial"...  The convergent series we choose is totally arbitrary; even within geometric series there is nothing special about any particular ratio  $\lambda$.  Also, every integer has a nonzero probability, although it "should" be utterly negligible compared with infinitely many of its peers...  Let's try something bold, then:  Consider the limit P(E) of the above as  $\lambda$  tends to 1:

$$ P(E) \;=\; \lim_{\lambda \to 1^-} P_\lambda (E) $$

This looks like a bright idea at first:  P(E+k) equals P(E)  (since  $\lambda^k$  is the ratio of the defining quantities, whose limits are considered as  $\lambda$  tends to 1).  Any finite set has zero probability, yet the probability of all the multiples of p is equal to 1/p, etc.  Fine...  The problem is that there are sets for which the above limit doesn't even exist.  The set of integers with an even number of digits is one such set...

For the set EVEN consisting of 0 (which has "0 digits", by convention) and all the integers having an even number of digits, we have:

$$ P_\lambda (\mathrm{EVEN}) \;=\; 1 - \lambda + \lambda^{10} - \lambda^{100} + \lambda^{1000} - \lambda^{10000} + \cdots $$

When  $\lambda = 1 - 1/10^n$  for integral values of n, the above is 0.546044947... for large even values of n and 0.45395505298... for large odd values of n (note that the average of these two numbers is ½ ).  The thing oscillates as  $\lambda$  tends to 1.
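The oscillation can be observed directly by summing the alternating series in floating point. This is a rough sketch, not a rigorous evaluation: the function name `p_even` is invented here, and the powers are computed through `math.log1p` only to limit rounding when the ratio is very close to 1.

```python
import math

def p_even(n):
    """Sum 1 - r + r^10 - r^100 + r^1000 - ... for the ratio r = 1 - 10**-n."""
    log_r = math.log1p(-(10.0 ** -n))        # log(r), accurate for r close to 1
    total = 10.0 ** -n                       # the leading term 1 - r
    sign = 1.0
    k = 1
    while True:
        term = math.exp(10 ** k * log_r)     # r ** (10 ** k)
        if term == 0.0:                      # remaining terms underflow to zero
            break
        total += sign * term
        sign = -sign
        k += 1
    return total

for n in range(4, 10):
    print(n, p_even(n))
```

Running the loop for consecutive values of n shows the sum alternating between two values instead of settling down.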

If two tentative probabilities P and Q are defined for "enough" sets and are invariant by translation, then they are equal wherever they're both defined. [??]

Is it possible to define a "probability" over the set of natural integers?  Well, if we require that finite sets of integers have zero "probability", such a "probability" cannot be a measure in the usual sense of the term, as required of probabilities defined over uncountable sets, since a measure must be countably subadditive (that is, the measure of a countable union is no greater than the sum of the measures of its components).  If such a function is zero for singletons, it's also zero for any set of integers (necessarily a countable union of singletons).

However, we may investigate the possibility of a "probability" function that would have all the properties of a measure except that it would be merely finitely subadditive (only the "probability" of a union of finitely many sets does not exceed the sum of their probabilities).

We may define the density of a set E of integers as the limit (if it exists) of 1/n times the number of elements of E that are no greater than n (a ratio which we may call a "partial density").  We would require the "probability" of E to be equal to this density whenever it exists.  For example, the "probability" of the multiples of P must be 1/P.

Some sets, however, do not have such a well-defined density.  If E is the set of integers having an even number of digits, the above "partial density" ratio keeps oscillating between 1/11 and 10/11 and has no limit.  Another example is the set L(10,D) of integers whose leading decimal digit is D; for this one we would probably like the newly defined probability to verify "Benford's law":

P( L(10,D) )   =   ln(1+1/D) / ln(10)
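The oscillation of the partial densities for integers with an even number of digits can be computed exactly. The sketch below counts positive integers only (leaving aside the convention about 0) and uses hypothetical helper names `count_even_digits` and `partial_density`.

```python
from fractions import Fraction

def count_even_digits(n):
    """Number of integers in 1..n whose decimal expansion has an even number of digits."""
    count, k = 0, 2
    while 10 ** (k - 1) <= n:
        # Integers with exactly k digits run from 10**(k-1) to 10**k - 1.
        count += min(n, 10 ** k - 1) - 10 ** (k - 1) + 1
        k += 2
    return count

def partial_density(n):
    """Exact proportion of the integers 1..n that have an even number of digits."""
    return Fraction(count_even_digits(n), n)

for n in (99, 999, 9999, 99999, 999999):
    print(n, partial_density(n), float(partial_density(n)))
```

The printed ratios stay near 10/11 when n is a string of an even number of 9s and near 1/11 when n is a string of an odd number of 9s.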

It seems 3 fundamental properties are required:

1. The "probability" of a set with a well-defined density is this density.
2. The "probability" of the union of two disjoint sets is the sum of their "probabilities".
3. Every set has a "probability".
Once these 3 conditions are met, it would seem reasonable to drop the quotes around the word "probability"...

To satisfy the third requirement, one could propose to define, say, a "probability" as the superior limit of partial densities.  Since partial densities are bounded (they are between 0 and 1) such a superior limit always exists.  It is equal to 10/11 in the case of the set "EVEN" of integers with an even number of digits.  This, however, won't meet the second requirement.  Just split EVEN into two disjoint sets A and B, where A consists of all integers of EVEN whose leading digit is 1 (one).

```
A = {10,11...19,1000,1001...1999,100000...199999,10000000...}
B = {20,21...99,2000,2001...9999,200000...999999,20000000...}
```
Now, partial densities for A are highest when n is 19, 1999, 199999, etc. as there are respectively 10, 1010, 101010, etc. members of A no larger than n. The limsup of partial densities is therefore 50/99 for A. On the other hand, partial densities for B are highest when n is 99, 9999, 999999, etc. as there are respectively 80, 8080, 808080, etc. members of B no greater than n. The limsup of partial densities is therefore 80/99 for B.  Now 50/99+80/99 is 130/99 which is clearly not the limsup of partial densities for EVEN (namely 10/11).  Heck, this sum of probabilities of two disjoint sets does not even have the decency to be less than 1...

A similar problem is encountered with the inferior limit of partial densities, which are lowest for A when n is 9, 999, 99999, 9999999 etc. (with respectively 0,10,1010,101010, etc. members of A).  The liminf for A is 1/99.  For B, lowest partial densities are achieved when n is 19, 1999, 199999, 19999999, etc. (for a count of 0, 80, 8080, 808080 etc.) and the liminf is 4/99.  Again, 1/99+4/99=5/99 is well below the corresponding liminf for the union EVEN of A and B (namely 1/11, or 9/99).
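The counts quoted for A and B are easy to reproduce. The following sketch uses a made-up helper `count_leading` that counts, among the integers from 1 to n with an even number of digits, those whose leading digit belongs to a given collection. It illustrates that the counts themselves are additive at every n, while the peaks of the two partial densities occur at different values of n.

```python
from fractions import Fraction

def count_leading(n, first_digits):
    """Count integers in 1..n with an even number of digits and a leading digit in first_digits."""
    total, k = 0, 2
    while 10 ** (k - 1) <= n:
        for d in first_digits:
            lo = d * 10 ** (k - 1)               # smallest k-digit integer starting with d
            hi = (d + 1) * 10 ** (k - 1) - 1     # largest k-digit integer starting with d
            if lo <= n:
                total += min(n, hi) - lo + 1
        k += 2
    return total

A_DIGITS = (1,)
B_DIGITS = tuple(range(2, 10))
EVEN_DIGITS = tuple(range(1, 10))

peak_A = Fraction(count_leading(199999, A_DIGITS), 199999)   # close to 50/99
peak_B = Fraction(count_leading(999999, B_DIGITS), 999999)   # equal to 80/99
print(float(peak_A), float(peak_B), float(peak_A + peak_B))
```

Because the two peaks are reached at different n, adding them says nothing about the partial density of the union at any single n.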

The related Schnirelmann density is the greatest lower bound of partial densities.

To provide a satisfactory answer, we have to ensure that additivity occurs with whatever limiting process we use. Partial densities are additive and their limits (if they exist) are thus also additive.  Similarly, the sequence of the averages of the partial densities whose rank is n or less will also be additive, so will its limit if it exists.  We may iterate this process by considering the sequence of averages of the previous sequence. It is not difficult to prove that in such an iteration if a sequence has a limit so do the following ones and the limits are all equal...
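The averaging step can be sketched on a toy sequence. The helper `averages` below is a name chosen for this illustration; it replaces a sequence by the sequence of its running means, which is the iteration described above. The example input (1, 0, 1, 0, ...) is not one of the digit-based sets discussed here; it merely shows a sequence without a limit whose averages do converge, and that averaging is additive term by term.

```python
from fractions import Fraction
from itertools import accumulate

def averages(seq):
    """Running means: the n-th output is the mean of the first n inputs."""
    return [s / n for n, s in enumerate(accumulate(seq), start=1)]

a = [Fraction(n % 2) for n in range(1, 1001)]        # 1, 0, 1, 0, ... has no limit
b = [Fraction(1, 3)] * 1000                          # a constant sequence
once = averages(a)                                   # tends to 1/2
twice = averages(once)                               # also tends to 1/2
summed = averages([x + y for x, y in zip(a, b)])     # averages of the termwise sum
print(float(once[-1]), float(twice[-1]), float(summed[-1]))
```

Exact fractions are used so that the additivity check is an equality rather than an approximation.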

(2012-08-06)   Signed Measures
Bounded real measures or unbounded complex ones.

The values of ordinary measures are nonnegative real numbers.  Signed measures may also take on negative real values, and  complex  measures take on complex values.

(2014-02-09)   Measures on Boolean Algebras
A choice principle equivalent to the Hahn-Banach theorem.

It was established in 1951 by  J. Los  and  C. Ryll-Nardzewski  (and, independently, by  W.A.J. Luxemburg  in 1969)  that the  Hahn-Banach theorem  follows from the  Boolean Prime Ideal Theorem.  The following proposition is equivalent to the  Hahn-Banach theorem,  in the sense that both entail exactly the same  choice principle:

On every Boolean algebra, there exists an additive real-valued
measure  μ  such that:     μ (0)  =  0     and     μ (1)  =  1

See Los and Ryll-Nardzewski (1951, pp. 233-237) or Luxemburg (1969, p. 131).

"Application of Tychonov's Theorem in Mathematical Proofs"   by  Jerzy Los  and
Czeslaw Ryll-Nardzewski   in  Fundamenta Mathematicae 38 (1951)  pp. 233-237.
Wilhelmus A.J. Luxemburg (1969, p. 131)
"Prime Ideal Theorems and systems of finite character" ( pdf )  by  Marcel Erné  (1997).