Final Answers
© 2000-2023 Gérard P. Michon, Ph.D.

Matrices and
Determinants

Permutation matrices include the identity matrix and the exchange matrix.
Gil's matrix The favorite square matrix of professor W. Gilbert Strang.
Operations on matrices are conveniently defined using Dirac's notation.
Triangular matrices. Proper isomorphism between upper and lower ones.
The determinant is proportional to any completely antisymmetrical form.
Minors. First minors are obtained by deleting one row and one column.
Adjugate of a matrix: Transpose of its cofactor matrix A adj(A) = det(A) I
Eigenvectors and eigenvalues of an operator or a matrix.
The numerical range of a complex matrix. Locus of its Rayleigh ratios.
The characteristic polynomial of an operator doesn't depend on the basis.
The minimal polynomial has the same zeroes but no multiple roots.
Cayley-Hamilton theorem: A matrix vanishes its characteristic polynomial.
Normal matrices and diagonalizable matrices.
Totally positive generalized Vandermonde matrices (fractional powers).
Cholesky decomposition L L* of an Hermitian matrix.
Toeplitz matrix: Constant diagonals.
Circulant matrix: Cyclic permutations of the first row.
Linear recurrences of order k-1 which define sequences of period k.
Wendt's Determinant: The circulant of the binomial coefficients.
Hankel matrix: Constant skew-diagonals.
Catberg matrix: Hankel matrix of the reciprocal of Catalan numbers.
Hadamard matrix: Unit elements and orthogonal columns.
Sylvester matrix of two polynomials has their resultant for determinant.
The discriminant of a polynomial is the resultant of itself and its derivative.
Pfaffian: Polynomial whose square equals an antisymmetric determinant.
The numerical range of a complex matrix. Locus of its Rayleigh ratios.
Matrices with coefficients in a noncommutative ring.
Matrix exponential. The exponential of a matrix is an invertible matrix.
Matrix gamma function. The reciprocal gamma is an entire function.
Matrix Mittag-Leffler functions. Applications to fractional calculus.
Values of analytic functions for nilpotent matrices reduce to finite sums.

Videos :

The Rise and Fall of Matrices (42:37) by Walter Ledermann (LMS, 1986).
Gaussian elimination [ Part 1 | Part 2 ] by Norman J. Wildberger (2010).
What's a [symmetric] determinant? (2:50) James Schloss (LeiosOS, 2016-07-23).
The determinant, essence of linear algebra (10:02) Grant Sanderson (2016-08-10).

Linear Algebra: Matrices and Determinants

(2006-12-26) Identity (I), Permutation Matrices, Exchange Matrix (J) The various permutations of the columns of the identity (I).

Those square matrices whose elements vanish except for one unit element per row and per column are known as permutation matrices. Each corresponds to a permutation of the columns of the identity matrix (I) and they form collectively a multiplicative group isomorphic to the symmetric group on a set of n elements.

One particular permutation matrix is the so-called exchange matrix (J) which is the "mirror image" of the identity matrix itself. For example:

I₆ =	1	0	0	0	0	0
	0	1	0	0	0	0
	0	0	1	0	0	0
	0	0	0	1	0	0
	0	0	0	0	1	0
	0	0	0	0	0	1

J₆ =	0	0	0	0	0	1
	0	0	0	0	1	0
	0	0	0	1	0	0
	0	0	1	0	0	0
	0	1	0	0	0	0
	1	0	0	0	0	0

In spite of this superficial finite-dimensional resemblance with the identity matrix, there's no nice generalization of the exchange matrix to infinitely many dimensions.

(2023-05-31) Gil Strang's Matrix The favorite square matrix of professor W. Gilbert Strang (MIT).

S₆ =

2 -1 0 0 0 0

-1 2 -1 0 0 0
0 -1 2 -1 0 0
0 0 -1 2 -1 0
0 0 0 -1 2 -1
0 0 0 0 -1 2

Of particular interest is Strang's matrix S₁₁ because it has 121 elements...

Favorite Matrix

(2007-02-13) Defining matrix operations from operations on elements Dirac's notation is convenient to express the elements of a matrix A.

The scalar at row i and column j in matrix A, is often denoted a_ij

Similar ad hoc notations are adequate when juggling with just a few distinct matrices, in spite of the fact that the name of the doubly-subscripted symbol is not directly related to the name of the matrix itself (A)...

That flaw is avoided with the nice self-contained notation (due to P.A.M. Dirac) shown on the right-hand side of the following equation:

a_ij =

This is sometimes called the element of A between i and j. Abstract meanings are given to separate parts of that notation: | j > is called a ket (or column vector) in the canonical orthonormal basis and < i | is the dual of such a thing (it's called a bra or row vector).

Originally, the words bra and ket where just the two halves of the word bracket, a mnemonic invented by Dirac as a reminder that the row-vector comes before the column-vector in the bracket notation normally used to denote the inner product which forms a scalar. (with the opposite order, you'd form a square matrix instead).

These interpretations are important and do repay study, but they're not a prerequisite to use the notation as a whole. For example, the following equation defines the sum A+B of two matrices A and B by giving every element of it, in terms of a simple addition of scalars :

= +

More interestingly, here's how to define the product A B of two matrices:

= å_k < k | B | j >

Jacques Binet

That general rule for matrix multiplication was made explicit in 1812 (using more traditional index notations) by the French mathematician Jacques Binet (1786-1856; X1804) who is also remembered for two utterly unrelated things, both known as Binet's formulas :

1) Explicit expression for the n-th term of a sequence obeying a linear recurrence relation (e.g., the Fibonacci sequence).
2) In celestial mechanics, Binet's formulas allow an easy proof that the orbit of two gravitating bodies must be a conic section.

If we denote by z* the conjugate a-ib of the complex number z = a+ib, (z=z* when z is a real number) then we may define as follows the conjugate transpose A* of the complex matrix A:

" i , " j , = < j | A | i >*

This operation is far more important than transposition without conjugation (which is virtually useless in the complex realm). With respect to A, the matrix A* is variously called conjugate transpose, Hermitian conjugate, adjoint, Hermitian adjoint or dual. (It's called transpose for real matrices.)

Here's how to prove for matrices the relation which gives the adjoint of a product: (A B)* = B* A* (this holds for whatever is called adjoint or dual among other multiplicative objects for which the term is defined).

< i \| (A B)* \| j >	= < j \| A B \| i >*
	= S_k ( < j \| A \| k > < k \| B \| i > )*
	= S_k < j \| A \| k >* < k \| B \| i >*
	= S_k < k \| A* \| j > < i \| B* \| k >
	= S_k < i \| B* \| k > < k \| A* \| j >
	= < i \| B* A* \| j >

(2019-01-29) Triangular Matrices

The concepts we introduced in the previous section are natural in the sense that they do not depend on the structure of the set used to index matrix coefficients. In particular, any order between distinct indices was utterly irrelevant in those definitions.

Incidentally, this is true because we only consider matrices of finite dimension over a ring of coefficeints where addition is required to be commutative. If either condition fails, the summation symbol S (subscripted by the dummy name of the summation index) is no longer well-defined. Even with a commutative addition, an infinite sum could depend on the ordering of indices.

The identity matrix I has a natural definition in this sense while other permutation matrices do not (including mirror-image of I previously featured).

Come back later, we're
still working on this one...

Transposition is one obvious anti-isomorphism between upper-triangular matrices and lower-triangular matrices. A straight isomorphism can be defined using any involution s among the set of indices. In dimension n+1, if we use the integers from 0 to n as our set of indices, s(i) = n-i will do. The induced involution over the set of matrices is defined by:

= < s(j) | A | s(i) >

So defined, s is a ring isomorphism between upper-triangular matrices and lower-triangular matrices.

(2010-08-10) Determinant of a Matrix (Seki, 1683) It's defined as a completely antisymmetrical multilinear form.

In a vector space of dimension n, a completely antisymmetrical form is a scalar-valued function of n vectors, which is linear with respect to every argument and has the property of being unchanged by any even permutation of the arguments and negated by any odd one.

Even and odd permutations have signature +1 and -1, respectively. Permutations are bijective functions of a finite set onto itself. The signature of a function s which is not bijective is zero ( sgn(s) = 0).

In dimension 3 for example, such a function f would be trilinear and verify:

f ( v₁, v₂, v₃) = f ( v₂, v₃, v₁) = - f ( v₂, v₁, v₃)

A key observation is that all such functions are proportional to each other.

Come back later, we're
still working on this one...

Leibniz's formula entails a sum of n! terms, each a product of n factors:

			n
det ( A ) =	å	sgn ( s )	Õ	< i \| A \| s(i) >
	sÎS_n		i = 1

(2012-09-18) Minors & Cofactors A minor is a determinant obtained by deleting n rows and n columns.

The number of deleted rows and columns can be zero (the whole determinant is a minor of itself). An n-th minor is obtained by deleting n rows and n columns. Of particular interest are the first minors of a determinant, which are obtained by deleting one row (the i-th one) and one column (the j-th one).

Come back later, we're
still working on this one...

Wikipedia : Minor | Cofactor | Laplace expansion

(2010-08-12) Adjugate (classic adjoint): Tranpose of the cofactors. The adjugate of A verifies adj(A) A = A adj(A) = det(A) I

The adjugate adj(A) of a matrix A may no longer be called its "adjoint", which is now a synonym of transpose conjugate (denoted A* or A ).

A matrix A is invertible iff it has nonzero determinant, in which case:

A^-1 = adj(A) / det(A)

As the determinant of the matrix A is an homogeneous polynomial of its elements, it can be written uniquely in the following way, where the polynomials A_ij and B_ij do not depend on the element a_ij.

det(A) = a_ij A_ij + B_ij

By definition, A_ij is the cofactor of a_ij

Come back later, we're
still working on this one...

(2010-08-12) Eigenvectors and eigenvalues The image of an eigenvector is proportional to it.

Come back later, we're
still working on this one...

Eigenvalue Problems (in Sixty Symbols) by Seamus Garvey.

(2019-11-23) Numerical Range A compact convex set associated to any complex matrix.

For any complex matrix A (Hermitian or not) the Rayleigh quotient (or Rayleigh-Ritz ratio) at a nonzero point x is the following ratio:

x* A x

x* x

If A is a quantum observable, that quantity is its expected value in the quantum state descbribed by ket x.

The range of all Rayleigh quotients is called the numerical range of A. It clearly contains the spectrum of A (i.e., all eigenvalues of A).

Come back later, we're
still working on this one...

Rayleigh quotient | Numerical range | Product numerical range | Spectrum | Spectral theory
Sturm-Liouville theory (1834) | Blaschke products (1915) | Wilhelm Blaschke (1885-1962)
Poncelet's_closure theorem | Jean-Victor Poncelet (1788-1867; X1807)

Finding Ellipses by Ulrich Daepp, Pamela Gorkin, Andrew Shaffer, Karl Voss (MAA Press, 2018-10-30)

(2010-08-12) Characteristic Polynomial P of a square matrix It's independent on the choice of a vectorial basis.

Come back later, we're
still working on this one...

Trace :

The trace of a matrix is the sum of the elements in its diagonal:

	n
tr ( A ) =	å	< i \| A \| i >
	i = 1

It's also the coefficient of (-x) in the characteristic polynomial of the operator. As such, the trace is an intrinsic value associated to the operator itself, irrespective of the choice of a coordinate basis. So, we may talk freely about the trace of an operator (or the trace of a tensor).

The trace function is a linear function. Therefore, its kernel is a vector space, consisting of traceless matrices, which play an important rôle in quantum theory: The groups SU(2) and SU(3) Lie groups are represented by traceless matrices, so are their respective standard bases, the three Pauli matrices and the eight Gell-Mann matrices.

(2017-12-03) Minimal Polynomial Q of a Square Matrix A polynomial Q easy to obtain from the characteristic polynomial P.

Defining Q as the least monic polynomial havind the same roots as P, we obtain it as P/R where R is the greatest (monic) common divisor of P and its derivative P'.

It's not even necessary to know the roots of P to perform that calculation.

The characteristic polynomial P is irreducible when it's equal to the minimal polynomial Q.

(2010-08-12) Cayley-Hamilton Theorem (1855, 1858, 1878) Any matrix is a root of its own characteristic polynomial.

That general theorem was first stated in 1858 by Arthur Cayley who only checked it for 3-dimensional matrices. A special case (3-dimensional rotations) had been stated by Hamilton three years earlier. The first general proof was published in 1878 by Georg Frobenius (1849-1917).

As the coefficients of the characteristic polynomial of a matrix A are polynomial functions of its elements, the assertion states that a particular set of n² polynomials functions (of degree n) in n² variables vanishes.

To prove that the coefficients of all the relevant polynomials vanish, it suffices to establish that the associated polynomial functions vanish on a sufficiently large set of points. In particular, a dense set of points will do (the zeroes of any continuous function form a closed set).

For an n-dimensional vector space over complex numbers, the set of matrices with distinct eigenvalues is dense among all square matrices.

Come back later, we're
still working on this one...

Algebraic Number Fields as Matrix Algebras by Kapil Hari Paranjape (2002-10-20).

Wikipedia : Cayley-Hamilton theorem

(2020-09-13) Diagonalization of normal matrices. A matrix A is said to be normal when A A* = A* A

Two matrices A and B are said to be similar when they are related by an inner isomorphism, namely there's an invertible matrix P such that:

B = P A P^-1

A matrix is said to be diagonalizable if it's similar to a diagonal matrix.

Come back later, we're
still working on this one..when.

Diagonalizable matrix | Normal matrix | Spectral theorem

(2015-04-03) Diagonalization and Jordan Normal Form of M For some S, the nondiagonal elements of S M S^-1 are off-diagonal units.

Come back later, we're
still working on this one...

Jordan matrix | Jordan normal form | Camille Jordan (1838-1922; X1855)

(2017-12-03) Jordan-Chevalley-Dunford decomposition: M = D+N Chevalley's algorithm finds the unique decomposition of a matrix into the sum of its diagonalizable and nilpotent parts, which commute.

History :

Although Camille Jordan published an early form of it in 1870, that result took a full century to make its way into a graduate textbook (Dunford & Schwartz, 1971). It's now common knowledge among graduate students and it has entered the standard undergraduate curriculum for linear algebra.

What is still lost on most students and some faculty is the remarkable fact that the decomposition can be effectively performed without any knowledge of the eigenvalues of the matrix to be decomposed. The ground field need not even be algebraically closed (like the complex numbers). A perfect field is enough (e,g., the rationals or any finite field).

That crucial observation is due to Claude Chevalley (1951) which fully justifies the name of Jordan-Chevalley decomposition, which is now commonly used. Here's the chronology of the four key publications:

1870: Camille Jordan (1838-1922; X1855)
Traité des substitutions et des équations algébriques (Paris, 1870).
1951: Claude Chevalley (1909-1984) Théorie des groupes de Lie II. Groupes Algébriques (Hermann, Paris, 1951).
1954: Nelson Dunford (1906-1986) Spectral operators Pacific Journal of Mathematics, 4, pp. 321-354 (1954).
1971: Nelson Dunford & Jacob T. Schwartz (1930-2009) Linear Operators III. pp.1937-1941 & 2096 (Wiley, 1971).

Since the Dunford & Schwartz textbook (1971) was so instrumental in its popularizarion, the concept was known for a while as Dunford's decomposition, a term which is now being deprecated. (Dunford's earliest research on the topic appeared 3 years after Chevalley's final results.) As a clear example of reverse-chauvinism, the French have held on to the attribution to Dunford longer than anyone else...

Proof :

In a context where the matrix could be put in Jordan's normal form, the existence and uniqueness of the Jordan-Chevalley decomposition are fairly trivial. However, as advertised, we aim for a stronger result and won't assume that the characteristic polynomial or the minimal polynomial can be fully factored.

We merely assume that the gound field is a perfect field. This is all the following algorithm need to uniquely determine the diagonalizable part D of the matrix M. The nilpotent part N is then obtained as M-D. The ground field need not be algebraically closed (as Chevalley's algorithm doesn't require that). The last step will be to remark that N and D commute because they're both obtained as polynomial functions of M

Chevalley's Algorithm :

Let M be the matrix to decompose. Let Q be its minimal polynomial. The diagonalizable part D of M is a zero of Q.

In elementary numerical analysis, the Newton-Raphson method can be used to find a numerical zero of a polynomial Q (or any other differentiable function) as a limit of a sequence of approximations, defined by recurrence:

x_n+1 = x_n - Q ( x_n) / Q' ( x_n)

If the starting point is close enough to the root, the sequence so defined converges quadratically to that root (i.e., the number of correct digits roughly doubles withh each iteration).

Chevalley's idea was to apply the same strategy and the same iteration formula to matrices instead of numbers. The nice surprise is that the convergence is even much better, as the limit is reached in finitely many steps (no more than the binary logarithm of the dimension of the matrix). Let's show that:

Come back later, we're
still working on this one...

Nilpotent matrix | Diagonalizable matrix | Jordan-Chevalley (Dunford) decomposition

Decomposition of Matrices in Semisimple and Nilpotent Parts by Miguel (MathOverflow, 2012-09-29).
The Unapologetic Mathematician: The Jordan-Chevalley decomposition & Proof (2012-08-28).
The Jordan-Chevalley decomposition by Joo Heon Yoo (2014-09-08).

Décomposition de Dunford (5:25, in French) Matthieu Romagny (Les 5 minutes Lebesgue, 2016-01-19).

Décomposition effective de Jordan-Chevalley et ses retombées en enseignement
by Danielle Couty, Jean Esterlé & Rachid Zarouf (arXiv, 2011-03-25 / 2013-01-15).

(2017-01-04) Perron Vector of a Matrix with Positive Elements The spectral radius r of such a matrix is a simple positive eigenvalue.

The spectral radius of a square matrix is the largest modulus of its eigenvalues. A simple eigenvalue is a simple root of the characteristic polynomial. It's associated with an eigenspace of dimension 1.

Come back later, we're
still working on this one...

Cardinal voting | Perron-Frobenius theorem (Perron 1907, Frobenius 1912)
Oskar Perron (1880-1975) | Georg Frobenius (1849-1917)

(2006-01-18) Vandermonde Matrices A Vandermonde matrix is also sometimes called an alternant matrix.

The Vandermonde matrix associated with a sequence (x_n) is the square matrix of the successive powers of the elements in that sequence. The element on row i and column j is (x_j)ⁱ. [Some authors consider the transpose of this.]

M_n+1 =	1	1	1	1	...	1
	x₀	x₁	x₂	x₃	...	x_n
	x₀²	x₁²	x₂²	x₃²	...	x_n²
	x₀³	x₁³	x₂³	x₃³	...	x_n³
	...	...	...	...	...	...
	x₀ⁿ	x₁ⁿ	x₂ⁿ	x₃ⁿ	...	x_nⁿ

The determinant of the above Vandermonde matrix has a nice expression:

	n-1	n
det ( M_n+1) =	Õ	Õ	( x_j - x_i )
	i = 0	j = i+1

Proof : Both sides of the equation are polynomials of degree n(n+1)/2 in n+1 variables, which vanish when two of the variables are equal. Since such polynomials are necessarily proportional, they must be equal if some pair of matching coefficients are equal. That's easily proven for the last pair, by induction on n. (HINT: The coefficient of the last variable raised to the n-th power is the Vandermonde determinant for the n previous variables.)

Alexandre-Théophile Vandermonde (1735-1796)

The above Vandermonde determinant and Vandermonde matrix are named after the French mathematician Alexandre-Théophile Vandermonde (1735-1796) who is the rightful founder of the modern theory of determinants...

Earlier authors had stumbled upon determinants when solving several simultaneous linear equations in as many unknowns. Those include Cardano (in the 2 by 2 case only) Leibniz, Cramer (1750) and Bézout (1764).

However, Vandermonde published the first investigation (1771) of determinants in their own right, as functions of the coefficients of a matrix. The importance of the idea was immediately recognized by Laplace (1772) and Lagrange (1773). Curiously, the determinant concept was made clear before the concept of a matrix itself. The modern name "determinant" was coined by Gauss in 1801. When it does not vanish, this quantity "determines" that a system of equations has a unique solution (in which case it's called a "Cramer system", in high-school parlance).

Strangely enough, Vandermonde himself never mentioned the above type of matrices, now named after him. For lack of a better explanation, it has been speculated (by Lebesgue and others) that whoever atttributed this enduring "credit" mistook one of Vandermonde's superscripts for an exponent !

What's perhaps the greatest contribution of Vandermonde is often overlooked: Vandermonde came up with the idea of studying functions which are invariant under any permutation of a given polynomial's roots. He did so in the first of only four mathematical papers he ever published: Mémoire sur la résolution des équations (1770). Arguably, this publication marks the very beginning of modern algebra. This idea of Vandermonde's would ultimately lead to the celebrated Abel-Ruffini theorem (which states the impossibility of a general solution by radicals for algebraic equations of degree 5 or more). The subsequent work of Lagrange, Ruffini, Abel and Galois begat Group Theory and Field Theory.

Xinglongdada (2012-05-04, e-mail) A Totally Positive Matrix If 0 < a₁ < a₂ < ... < a_n and 0 ≤ l₁ < l₂ < ... < l_n prove that the determinant of the following matrix A_n is positive.

A_n =	a₁^l₁	a₂^l₁	a₃^l₁	...	a_n^l₁
	a₁^l₂	a₂^l₂	a₃^l₂	...	a_n^l₂
	...	...	...	...	...
	a₁^l_n	a₂^l_n	a₃^l_n	...	a_n^l_n

Such a matrix is known as a generalized Vandermonde matrix.

It's a classical example of a totally positive matrix (i.e., a real matrix whose minors are all positive) which is to say that we always obtain a square matrix of positive determinant by deleting from it an equal number (possibly zero) of rows and columns. (Note that, if a matrix is a generalized Vandermonde matrix, so are all its submatrices.)

The fact that all generalized Vandermonde determinants are positive happens to be a consequence of the fact that they are all nonzero...

Indeed, there's a continuous path (depending on a parameter u that goes continuously from 0 to 1) which goes continuously from the above determinant to an ordinary Vandermonde determinant (which is clearly positive by virtue of the above formula which expresses it as a product of positive factors). The determinant at point u along that path is obtained by replacing the exponent l_i with the positive quantity:

l'_i = (1-u) l_i + u (i-1)

For any u in [0,1], the above inequalities between exponents are preserved. A continuous function of u over [0,1] cannot go from a negative to a positive value without vanishing at some point. So, to show that all generalized Vandermonde determinants are positive, we only need to establish that none of them is zero. Let's prove that:

Proof :

We proceed by induction on the size n of the matrix (n = 1 being trivial).

Let's assume (by induction hypothesis) that there are no vanishing generalized Vandermonde determinants of size less than n. Then, if we had a vanishing generalized Vandermonde determinant of size n, there would be a linear combination of different powers of x vanishing for at least n positive values of x (namely, a₁ ...a_n). That is to say that we'd have coefficients c_i (not all vanishing) for which:

0 = c₁ x^l₁ + c₂ x^l₂ + ... + c_n x^l_n for at least n positive values of x.
So, 0 = c₁ + c₂ x^{l₂- l₁} + ... + c_n x^{l_n- l₁} for those same values.

By Rolle's theorem, the derivative of the right-hand side would vanish at least once in each of the n-1 intervals separating the n known distinct roots. So would that derivative multiplied into the positive quantity x. Therefore:

0 = c₂ (l₂- l₁) x^{l₂- l₁} + ... + c_n (l_n- l₁) x^{l_n- l₁} for at least n-1 values of x>0

This would contradict the induction hypothesis.

Wikipedia : Totally Positive Matrix (Fekete's criterion)
Generalized Vandermonde Determinants by Hans Peter Schlickewei & Carlo Viola.
An Explicit Factorization of Totally Positive Generalized Vandermonde Matrices Avoiding Schur Functions
by Young-Ming Chen, Hsuan-Chu Li, Eng-Tjioe Tan. Applied Mathematics E-Notes, 8 (2008), pp. 138-147.

(2020-09-13) Signature of a Hermitian matrix (1852) This was originally called Sylvester's law of inertia.

Sylvester's law of inertia (1852) | James Joseph Sylvester (1814-1897)

Computing the signature of an n×n symmetric matrix by Igor Rivin (MathOverflow, 2012-09-19).

Theorem of the Day #142, by Robin Whitty.

(2012-10-07) Cholesky reduction of a Hermitian matrix:
There's a lower-triangular matrix L such that L L* = A

By definition, a lower-triangular matrix is a matrix with only zero elements above the diagonal.

This decomposition was introduced by the French engineer André-Louis Cholesky in the special case where the matrix A is a positive-definite symmetrical real matrix. In that case, the matrix L has real coefficients and nonzero diagonal elements.

The matrix L isn't unique, as -L, (L*)^t and -(L*)^t are satisfactory too.

A related construction which applies to any definite Hermitian matrix (not necessarily positive) involves a diagonal real matrix D and a lower-triangular matrix L whose diagonal elements are all equal to 1 (unity).

L D L* = A

Conversely, any matrix of this form is clearly Hermitian (A* = A).

In the special case where A is real, L and D are real too.

The requirement of definiteness (no zero eigenvalue) can be dropped. It's only useful to make the actual construction of L more straightforward.

Applications :

Because the decomposition can be performed quite efficiently, it can be used to solve a symmetric system of linear equations in real numbers much faster than with general-purpose Gaussian elimination (such symmetrical systems arise in the linear least square method, in particular):

L L* x = b

The triangular systems L y = b and L* x = y are sequentially solved.

Cholesky's method was published posthumously in 1924. It received little or no attention until World War II, when Jack Todd taught it in his analysis course at King's College, London. In 1948, Alan Turing proved the numerical stability of the procedure (after empirical results published the same year by J.D. Fox, H.D. Huskey & J.H. Wilkinson).

Cholesky decomposition | HP Prime implementation | André-Louis Cholesky (1875-1918; X1895)
How accurate is Gaussian elimination? by Nicholas J. Higham (July 1989)

(2006-01-18) Toeplitz Matrices A Toeplitz matrix is a matrix whose diagonals are constant.

In other words, the value of the element on the i^th row and j^th column of a Toeplitz matrix depends only on the difference (j-i).

M_n =	x₀	x₁	x₂	x₃	...	x_n-1
	x_-1	x₀	x₁	x₂	...	x_n-2
	x_-2	x_-1	x₀	x₁	...	x_n-3
	x_-3	x_-2	x_-1	x₀	...	x_n-4
	...	...	...	...	...	...
	x_1-n	x_2-n	x_3-n	x_4-n	...	x₀

Such matrices have been named after Otto Toeplitz (1881-1940).

Wikipedia: Toeplitz matrix | List of matrices
On Calculating the Determinants of Toeplitz Matrices by Hsuan-Chu Li (2011)

(2006-12-25) Circulant Matrices & Circulant Determinants A circulant matrix is a nice special case of a Toeplitz matrix.

The value of the element on the i^th row and j^th column of an n by n circulant matrix depends only on the difference (j-i) modulo n.

M_n =	x₀	x₁	x₂	x₃	...	x_n-1
	x_n-1	x₀	x₁	x₂	...	x_n-2
	x_n-2	x_n-1	x₀	x₁	...	x_n-3
	x_n-3	x_n-2	x_n-1	x₀	...	x_n-4
	...	...	...	...	...	...
	x₁	x₂	x₃	x₄	...	x₀

Each row is thus equal to the previous row shifted one place to the right.

The determinant and eigenvalues ( l_k) of the above circulant matrix are:

	n-1			n-1
det ( M_n) =	Õ	l_k	l_k =	å	w^jk x_j
	k = 0			j = 0

In this, w = exp ( ^2ip/_n ) is a primitive n^th root of unity. For example:

a b
b a

= a² - b² = ( a + b ) ( a - b )

	a b c c a b b c a		= a³ + b³ + c³ - 3 abc = ( a + b + c ) ( a + w b + w²c ) ( a + w²b + w c )
			with w = w₃ = ½ (-1 + i Ö3 )

	a b c d d a b c c d a b b c d a		= a⁴ - b⁴ + c⁴ - d⁴ - 2 a²c² + 2 b²d²
			- 4 a²bd + 4 ab²c - 4 bc²d + 4 acd²
			= (a+b+c+d) (a+ib-c-id) (a-b+c-d) (a-ib-c+id)

The determinant of a circulant matrix is commonly called the circulant determinant or, for short, the circulant of the n coefficients involved.

The order of those coefficients matters, except when n = 3.

Circulant matrix | Circulant graph

(2021-07-09) Periodic sequence defined by a linear recurrence. What linear recurrences of order k-1 yield sequences of period k?

Let's consider a sequence u obeying a linear recurrence of order k-1 :

u_n+k-1 = a₁ u_n+k-2 + a₂ u_n+k-3 + ... + a_k-2 u_n+1 + a_k-1 u_n

Well, if u is periodic of period k

Come back later, we're
still working on this one...

There are only four recurrences of order 3 yielding sequences of period 4:

u_n+3	=	-	u_n+2	-	u_n+1	-	u_n
u_n+3	=		u_n+2	-	u_n+1	+	u_n
u_n+3	=	i	u_n+2	+	u_n+1	-i	u_n
u_n+3	=	- i	u_n+2	+	u_n+1	+ i	u_n

Dirichlet characters | Sárközy's condition (1978) | András Sárközy (1941-)

(2006-12-27) Wendt's Determinant W_n (1894) Circulant of n binomial coefficients. (A048954)

In his 1894 dissertation about Fermat's Last Theorem, Ernst Adolf Wendt (1872-1946) investigated the resultant W_n of Xⁿ-1 and (X+1)ⁿ-1.

Emma Lehmer proved that W_n vanishes if and only if n is a multiple of 6.

The integer W_n boils down to the circulant of n binomial coefficients (more precisely, a line of Pascal's triangle, without the rightmost "1").

The element at line i and column j is best described as equal to C(n,|i-j|). With the above notations, x_j = C(n,j) [where j goes from 0 to n-1 ]. So:

	n-1
W_n = det ( M_n) =	Õ	[ ( 1 + w_n^k )ⁿ - 1 ]
	k = 0

n	Factorization of W_n
1	1
2	-1 . 3
3	2² . 7
4	-1 . 3 . 5³
5	11² . 31
6	0
7	2⁶ . 29² . 127
8	-1 . 3⁷ . 5³ . 17³
9	2² . 7 . 19⁴ . 37² . 73
10	-1 . 3 . 11⁹ . 31³
11	23⁵ . 67² . 89 . 199²
12	0
13	3⁶ . 53² . 79² . 131² . 521² . 8191
14	-1 . 2²⁴ . 3 . 29⁶ . 43³ . 127³
15	2¹⁴ . 7 . 11² . 31⁷ . 61⁴ . 151 . 271²
16	-1 . 3⁷ . 5³ . 7⁶ . 17¹⁵ . 257³
17	103² . 137⁴ . 307² . 409² . 613² . 3571² . 131071
18	0
19	191² . 229⁴ . 419² . 647² . 761² . 1483² . 9349² . 524287
20	-1 . 3 . 5²⁴ . 11⁹ . 31³ . 41⁹ . 61⁶
21	2⁸ . 7¹⁰ . 29² . 43¹⁰ . 127 . 211² . 337 . 379² . 463² . 547² . 2269²
22	-1 . 3 . 23²¹ . 61⁶ . 89³ . 43¹⁰ . 199⁶ . 683³
23	47¹¹ . 139⁴ . 461⁴ . 599⁴ . 1151² . 2347² . 3313² . 178481
24	0
See more details and a much larger table elsewhere on this site.

If p divides q , then W_p divides W_q . The n^th Mersenne number (2ⁿ-1) divides W_n and the quotient |W_n| / (2ⁿ-1) is a perfect square...

On Wendt's Determinant and Sophie Germain's Theorem by David Ford and Vijay Jha (1993)
On Wendt's Determinant by Charles Helou (1997)
Upper Bounds for the Prime Divisors of Wendt's Determinant by Anastasios Simalarides (2000)

(2006-01-18) Hankel Matrices & Hankel Transform A Hankel matrix is a matrix whose skew-diagonals are constant. (Such a matrix is also known as persymmetric or orthosymmetric.)

In other words, the value of the element on the i^th row and j^th column of a persymmetric matrix depends only on the sum of the indices (i+j).

M_n =	x₀	x₁	x₂	x₃	...	x_n-1
	x₁	x₂	x₃	x₄	...	x_n
	x₂	x₃	x₄	x₅	...	x_n+1
	x₃	x₄	x₅	x₆	...	x_n+2
	...	...	...	...	...	...
	x_n-1	x_n	x_n+1	x_n+2	...	x_2n-2

With those notations, the Hankel transform of the sequence x_n is defined to be the sequence y_n = det ( M_n+1) which starts with y₀ = x₀.

Unfortunately, the above is not the only thing called Hankel transform...

(2006-01-18) Catberg Matrix The Hankel matrix of the reciprocals of Catalan numbers... Its element on the i^th row and j^th column is (i+j+1) / C(2i+2j , i+j).

Come back later, we're
still working on this one...

(2007-02-16) Sign Matrices, Phase Matrices The elements of a phase matrix are complex numbers of module 1.

Thus, the elements of a phase matrix are complex numbers of the form:

e^iq = cos q + i sin q [ where q is real ]

If such a matrix is real, it's called a sign matrix (its elements are +1 or -1).

(2007-01-24) Hadamard Matrices ( real or complex coefficients ) An Hadamard matrix consists of orthogonal rows of unit elements.

Introduction (real coefficients) :

In the realm of real numbers, Hadamard matrices are square matrices whose elements are either -1 or +1 and whose rows are pairwise orthogonal.

They're named after the Frenchman Jacques Hadamard (1865-1963) the very influential long-lived analyst who gave functional analysis its name (1910) and will always be remembered for his 1896 proof of the Prime Number Theorem, simultaneous with the independent proof from Charles de la Vallée-Poussin (1866-1962).

In 1893, Hadamard did write an important paper about those matrices, but they were actually first investigated in 1867, under the quaint name of anallagmatic pavements, by James Joseph Sylvester (1814-1897).

Such matrices are often represented graphically as square mosaics of square tiles that are either light or dark... Here are special Hadamard matrices, known as Walsh matrices, whose orders are powers of 2 (1, 2, 4, 8, 16).

In the so-called normalized cases, one row and one column have a uniform color which is traditionally light (so that there are ½ n(n+1) light squares and ½ n(n-1) dark ones). The exact color-code can't be fixed (as the opposite of an Hadamard matrix is also Hadamard) but we usually assume that the dominant (light) color stands for the value +1. When its first column and its first row have a uniform value of +1, the matrix is said to be dephased (a jargon term discussed below, originally applied to complex Hadamard matrices c. 1970).

The above serves to illustrate a remark of Sylvester (1867) that an Hadamard matrix of order 2n may be obtained from a known Hadamard matrix H of order n, simply by juxtaposing four copies of H and negating one of those:

H	H
H	-H

More generally, an Hadamard matrix of order mn may be constructed from an Hadamard matrix K of order m and an Hadamard matrix H of order n by substituting in K all occurrences of +1 by H and all occurences of -1 by -H (the above is the special case m = 2). In other words, the Kronecker product of two Hadamard matrices is also an Hadamard matrix...

Abstract Algebraic Definition (real or complex realm) :

Hadamard matrices of order n may be defined as square matrices of order n whose elements have unit length ( modulus ) and which verify the relation:

M M* = n I_n

In this, M* is the transpose of a real matrix M or, more generally, the transpose conjugate of a complex one. Matrices with complex elements of unit moduli which satisfy the above relation are called complex Hadamard matrices. They were introduced in 1970 by Richard J. Turyn (1930-?) as interesting generalizations of the more traditional (real) Hadamard matrices, which can be considered to be merely of a restricted subtype.

A (complex) Hadamard matrix is trivially obtained from another by using one or several of the following types of transformations:

Multiply a row or a column by an element of unit modulus (like -1).
Replace a row or a column by its (element-wise) conjugate.
Permutate rows or permutate columns.

Matrices so obtained from each other are said to be Hadamard-equivalent. A complex Hadamard matrix where all first-row and first-column elements are equal to 1 is said to be dephased. Any complex Hadamard matrix is equivalent to a dephased one (HINT: Use the first type of transformations.)

It's not difficult to prove that all complex Hadamard matrices of order 2 are Hadamard-equivalent to the aforementioned Walsh matrix of order 2.

	=		1	1
			1	-1

Similarly, there's essentially only one complex Hadamard matrix of order 3. It involves a primitive cube root of 3, denoted w :

1	1	1	where w = w₃ = ½ (-1 + i Ö3 )
1	w	w²
1	w²	w

That's an example of a so-called Butson-Hadamard matrix, namely a complex Hadamard matrix of order n whose elements are p-th roots of unity. Those were first discussed in 1961 by Alton T. Butson (1926-1997) before Turyn's general approach to complex Hadamard matrices (1970).

For any order n, one example of a Butson-Hadamard matrix is given by the Vandermonde matrix of all the n-th roots of unity (which is Hadamard-equivalent to their circulant matrix). Another example is provided, for any composite order, by extending to complex Hadamard matrices what we've already remarked about real ones, namely that the Kronecker product of two (complex) Hadamard matrices is a (complex) Hadamard matrix. In particular, if H is a complex Hadamard matrix of order m, a complex Hadamard matrix of order 3m is obtained from the following pattern:

H	H	H	where w = w₃ = ½ (-1 + i Ö3 )
H	w H	w² H
H	w² H	w H

This can be applied repeatedly to obtain the following ternary equivalent of the aforementioned (binary) Walsh matrices, using a three-color code:

Complex Hadamard matrix of order 1

For any odd prime p, a complex Hadamard matrix whose elements are p^th roots of unity must have an order which is a multiple of p. Conversely, it is conjectured that such matrices exist for any order which is a multiple of p.

The existence of ordinary (real) Hadamard matrices (as discussed next) can be construed to be the case p=2, for which a slightly different rule holds...

Existence of (Real) Hadamard Matrices of Order n :

Back to ordinary Hadamard matrices (whose elements are either -1 or +1).

The empty matrix (of order 0) is vacuously an Hadamard matrix. The order of any Hadamard matrix must be 1, 2, or a multiple of 4 (Hadamard, 1893). Conversely, the Hadamard-Paley conjecture (better known as the Hadamard Conjecture ) states that there are Hadamard matrices of all such orders...

In 1933, Raymond Paley used finite fields to construct Hadamard matrices of all orders of the form q+1 (resp. 2q+2) where q is the power of a prime congruent to 3 (resp. 1) modulo 4. Paley's Theorem states that Hadamard matrices can be constructed (using Paley's construction and the aforementioned construction of Sylvester) for all positive orders divisible by 4 except those in the following sequence (i.e., multiples of 4 not equal to a power of 2 multiplied by q+1, for some power q of an odd prime):

92, 116, 156, 172, 184, 188, 232, 236, 260, 268, 292, 324, 356, 372, 376, 404, 412, 428, 436, 452, 472, 476, 508, 520, 532, 536, 584, 596, 604, 612, 652, 668, 712, 716, 732, 756, 764, 772, 808, 836, 852, 856, 872, 876, 892, 904, 932, 940, 944, 952, 956, 964, 980, 988, 996, 1004, 1012, 1016, 1028, 1036, 1068, 1072, 1076, 1100, 1108, 1132, 1148, 1168, 1180, 1192, 1196, 1208, 1212, 1220, 1244, 1268, 1276, 1300, 1316, 1336, 1340, 1364, 1372, 1380, 1388, 1396, 1412, 1432, 1436, 1444, 1464, 1476, 1492, 1508, 1528, 1556, 1564, 1588, 1604, 1612, 1616, 1636, 1652, 1672, 1676, 1692, 1704, 1712, 1732, 1740, 1744, 1752, 1772, 1780, 1796, 1804, 1808, 1820, 1828, 1836, 1844, 1852, 1864, 1888, 1892, 1900, 1904, 1912, 1916, 1928, 1940, 1948, 1960, 1964, 1972, 1976, 1992, 2008, 2024, 2032, 2036, 2052, 2056, 2060, 2072, 2076, 2092, 2108, 2116, 2136, 2148, 2152, 2156, 2164, 2172, 2200, 2212, 2216, 2228, 2264, 2276, 2284, 2292, 2296, 2300 ... (A046116)

For many of those remaining orders, Hadamard matrices have been discovered by other methods. Here's a brief summary...

In 1962, Baumert, Golomb, and Hall found an Hadamard matrix of order 92, using the general approach introduced in 1944, by John Williamson (who so constructed an Hadamard matrix of order 172). A Williamson matrix of order 4m is an Hadamard matrix obtained as follows from 4 matrices of order m (with unit elements) A, B, C and D, provided 4 matching relations are satisfied:

A	B	C	D	AA* + BB* + CC* + DD* = 4 m I_m AB* - BA* = CD* - DC* AC* - CA* = DB* - BD* AD* - DA* = BC* - CB*
-B	A	D	-C
-C	-D	A	B
-D	C	-B	A

For example, we may build an Hadamard matrix of order 12 with 4 matrices of order 3, by letting one of them be the matrix U whose 9 elements are equal to +1 (the square of U is 3U) while the 3 others are equal to U-2I.

With A = U, the resulting Hadamard matrix of order 12 (which isn't normalized in the above sense) is shown graphically below, next to the Hadamard matrices of order 24 and 48 obtained from it with the aforementioned Sylvester construct:

It turns out that all (real) Hadamard matrices of order 12 are equivalent. In particular, the above order-12 Hadamard matrix is Hadamard-equivalent to the following, which is normalized (dephased) and symmetrical:

In 1985, K. Sawade found an Hadamard matrix of order 268.

In June 2004, Hadi Kharaghani and Behruz Tayfeh-Rezaie built the Hadamard matrix of order 428 shown below (1 pixel per matrix element).

If we denote by M the symmetrical of any M with respect to its antidiagonal, the above matrix is based on 4 matrices A, B, C, D of order m = 107 put together in the following pattern, which gives an Hadamard matrix of order 4m if and only if all the 10 conditions listed are met. (Note that M N = N M .)

A	B	C	D	AA* + BB* + CC* + DD* = 4 m I_m B B* = B B* C C* = C C* D D* = D D*
-B	A	-D	C
-C	D	A	-B
-D	-C	B	A

A B* - B A* = D C* - C D*
A C* - C A* = B D* - D B*
A D* - D A* = B C* - C B*

A B* - B A* = D C* - C D*
A C* - C A* = B D* - D B*
A D* - D A* = C B* - B C*

Since June 2004, the order 668 has been the smallest multiple of 4 for which no Hadamard matrix is yet known (as of July 2014).

Generalized Hadamard Matrices by A. T. Butson (1961).
Recent advances in the construction of Hadamard matrices by Jennifer Seberry (1973).
Walsh-Hadamard transform | Hadamard Matrix : Wikipedia | MathWorld

Usage note : Both spellings "a Hadamard matrix" and "an Hadamard matrix" can be construed as correct: The former one applies to an anglicized pronounciation of "Hadamard" ("h" and final "d" both pronounced) whereas the latter must be used with the original French pronounciation (initial "h" and final "d" both silent).

(2006-12-31) Sylvester matrix of two polynomials The resultant of two polynomials is the determinant of that matrix.

Consider two polynomials A and B of respective degrees m and n.

	m			n
A =	å	a_i xⁱ	B =	å	b_j x^j
	i = 0			j = 0

Their Sylvester matrix is a square matrix (of dimension m+n) whose first n rows feature the m+1 coefficients of A and whose last m rows feature the n+1 coefficients of B. It is defined as follows:

M_m+n =	a_m	a_m-1	a_m-2	a_m-3	...	0
	0	a_m	a_m-1	a_m-2	...	0
	0	0	a_m	a_m-1	...	0
	...	...	...	...	...	...
	0	0	0	...	a₀	0
	0	0	0	...	a₁	a₀
	b_n	b_n-1	b_n-2	b_n-3	...	0
	...	...	...	...	...	...
	0	0	0	...	b₁	b₀

The determinant of that matrix equals the so-called resultant of A and B which may be expressed as follows, in terms of all the complex roots (a_i , b_j) of those polynomials and their leading coefficients a_m and b_n :

	m	n
r (A,B) = det ( M_m+n) = (a_m)ⁿ (b_n)^m	Õ	Õ	( a_i - b_j )
	i = i	j = 1

Final Answers
© 2000-2023 Gérard P. Michon, Ph.D.

Matrices and
Determinants

Related articles on this site:

Related Links (Outside this Site)

Videos :

Linear Algebra: Matrices and Determinants

Trace :

History :

Proof :

Chevalley's Algorithm :

Alexandre-Théophile Vandermonde (1735-1796)

Proof :

Applications :

Introduction (real coefficients) :

Abstract Algebraic Definition (real or complex realm) :

Existence of (Real) Hadamard Matrices of Order n :

Final Answers © 2000-2023 Gérard P. Michon, Ph.D.

Matrices andDeterminants

Related articles on this site:

Related Links (Outside this Site)

Videos :

Linear Algebra: Matrices and Determinants

Trace :

History :

Proof :

Chevalley's Algorithm :

Alexandre-Théophile Vandermonde (1735-1796)

Proof :

Applications :

Introduction (real coefficients) :

Abstract Algebraic Definition (real or complex realm) :

Existence of (Real) Hadamard Matrices of Order n :

Final Answers
© 2000-2023 Gérard P. Michon, Ph.D.

Matrices and
Determinants