**Simple proofs: The impossibility of trisection**

Ancient Greek mathematicians developed the methodology of “ruler-and-compass” constructions: if one is given only a ruler (without marks) and a compass, what objects can be constructed as a result of a finite sequence of operations? While they achieved many successes, three problems confounded their efforts: (1) squaring the circle; (2) trisecting an angle; and (3) duplicating a cube (i.e., constructing a cube whose volume is twice that of a given cube). Indeed, countless mathematicians through the ages have attempted to solve these problems, and countless incorrect “proofs” have been offered. We might add that Archimedes discovered a way to trisect an angle, but it did not strictly conform to the rules of ruler-and-compass construction.

The impossibility of squaring the circle was first proved by Lindemann in 1882, who showed that $\pi$ is *transcendental* — it is not the root of any algebraic equation with integer or rational coefficients. The impossibility of trisecting an arbitrary angle was proved earlier, in 1837, by Pierre Wantzel, and the impossibility of duplicating a cube was also proved at about that same time. For additional historical background on the trisection problem, see this Wikipedia article.

In most textbooks, these impossibility proofs are presented only after extensive background in groups, rings, fields, field extensions and Galois theory. Needless to say, very few college-level students, even among those studying majors such as physics or engineering, ever take such coursework. This is typically accepted as an unpleasant but necessary aspect of mathematical pedagogy. However, as we will see below, the impossibility of trisection or cube duplication, at least, can be proven much more simply.

We present here a proof of the impossibility of trisecting an arbitrary angle by ruler-and-compass construction and, as a bonus, the impossibility of cube duplication. We emphasize that this proof is both *elementary* and *complete* — all necessary lemmas and nontrivial facts are proved below. It should be understandable by anyone with a solid high school background in algebra, geometry and trigonometry, although it would help if the reader has previously been introduced to the notion of an algebraic field. This proof is somewhat longer than others in this series, but this is mainly due to the inclusion of several illustrative examples, as well as proofs of facts, such as the rational root theorem and the formula for cosine of a triple angle, that some might consider “obvious.” The core of the proof (see “Proof of the main theorems” below) is actually the *shortest* of those published so far in this series.

**Gist of the proof:**

We will show that trisecting a 60 degree angle, in particular, is equivalent to constructing the number $\cos (\pi/9)$ (i.e., the cosine of 20 degrees), which is an algebraic number that satisfies an irreducible polynomial of degree 3. Since the only numbers and number fields that can be produced by ruler-and-compass construction have algebraic degrees that are powers of two, this shows that the trisection of a 60-degree angle is impossible.

We first define some basic notions about polynomials, algebraic numbers and field extensions, and prove two basic lemmas (Lemma 1 and Lemma 2). Readers familiar with these two lemmas may skip to the “Constructible numbers” section below.

**Definitions: Algebraic numbers, minimal polynomials and degrees.**

By an *algebraic number*, we mean some real or complex number $\alpha$ that is the root of a polynomial equation with integer coefficients. The *minimal polynomial* of $\alpha$, namely $P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$, is the polynomial with integer coefficients of minimal degree that has $\alpha$ as a root, and the *degree* of $\alpha$, denoted ${\rm deg}(\alpha)$, is the degree $m$ of its minimal polynomial. We will assume here that the integer coefficients $a_i$ have no common factor, since if they do all coefficients can be divided by this factor.

**LEMMA 1 (The rational root theorem)**: If $x = p / q$ is a rational root of a polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where $a_i$ are integer coefficients with neither $a_n$ nor $a_0$ zero (otherwise the polynomial is equivalent to one of lesser degree), and where $p$ and $q$ are integers without any common factor, then $p$ divides $a_0$ and $q$ divides $a_n$.

**Proof**: By hypothesis, $a_n (p/q)^n + a_{n-1} (p/q)^{n-1} + \cdots + a_1 (p/q) + a_0 = 0.$ After multiplying through by $q^n$, taking the lowest-order (constant) term to the right-hand side and factoring out $p$ from the other terms, we obtain $$p (a_n p^{n-1} + a_{n-1} p^{n-2}q + \cdots + a_1 q^{n-1}) = – a_0 q^n.$$ Since the large expression in parentheses is an integer, $p$ divides the left-hand side and thus also the right-hand side. But since $p$ and $q$ have no common factors, $p$ cannot divide $q^n$. Thus $p$ must divide $a_0$. Similarly, by taking the highest-order term to the right-hand side and factoring out $q$ from the other terms, we can write $$q (a_{n-1} p^{n-1} + a_{n-2} p^{n-2} q + \cdots + a_0 q^{n-1}) = – a_n p^n.$$ But again, since the expression in parentheses is an integer, $q$ divides the left-hand side and thus must also divide the right-hand side. Since $q$ cannot divide $p^n$, it must divide $a_n$.

**Examples**: $\sqrt{3}$ satisfies the equation $x^2 - 3 = 0$, so that its minimal polynomial is of degree not greater than two. By Lemma 1, any rational root $x = p/q$, where $p$ and $q$ have no common factor, would require $p$ dividing 3 and $q$ dividing 1, so the only candidates are $x = \pm 1$ and $x = \pm 3$. None of these satisfies $x^2 - 3 = 0$, so the polynomial has no rational root, and since a quadratic with no rational root cannot factor into linear terms with rational coefficients, $x^2 - 3$ is irreducible. Thus $x^2 - 3$ is the minimal polynomial of $\sqrt{3}$, and hence $\sqrt{3}$ is irrational. Similarly, $1 + \sqrt{2}$ satisfies the polynomial $x^2 - 2x - 1$, so that its algebraic degree cannot exceed two, and since this polynomial is *irreducible*, its degree cannot be less than two. Irreducibility can be seen in this case by checking the various possibilities for integers $a, b, c, d$ such that $x^2 - 2x - 1 = (ax + b)(cx + d) = ac x^2 + (ad + bc) x + bd$, where the coefficients of $x^2, \, x$ and the constant term must match term by term on both sides of the equation. By Lemma 1, the only possibilities for $a, b, c, d$ are $1$ or $-1$, and none of these combinations gives $-2x$ for the middle term. Thus $x^2 - 2x - 1$ is irreducible, and its roots are irrational.
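As an aside, Lemma 1 lends itself to direct machine checking. The following Python sketch (the helper names are our own, not part of the proof) enumerates the candidate rational roots allowed by Lemma 1 and confirms that neither $x^2 - 3$ nor $x^2 - 2x - 1$ has one:

```python
from fractions import Fraction

def rational_root_candidates(coeffs):
    """All p/q allowed by the rational root theorem for a polynomial
    given as [a_0, a_1, ..., a_n] with integer coefficients."""
    a0, an = coeffs[0], coeffs[-1]
    ps = [d for d in range(1, abs(a0) + 1) if a0 % d == 0]   # divisors of a_0
    qs = [d for d in range(1, abs(an) + 1) if an % d == 0]   # divisors of a_n
    return {sign * Fraction(p, q) for p in ps for q in qs for sign in (1, -1)}

def evaluate(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))

# x^2 - 3: candidates are +-1, +-3; none is a root, so sqrt(3) is irrational.
assert all(evaluate([-3, 0, 1], x) != 0 for x in rational_root_candidates([-3, 0, 1]))
# x^2 - 2x - 1: the only candidates are +-1; neither is a root.
assert all(evaluate([-1, -2, 1], x) != 0 for x in rational_root_candidates([-1, -2, 1]))
```

Since a quadratic with no rational root is irreducible over the rationals, both checks confirm the irreducibility arguments above.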

**Definitions: Algebraic fields and extensions.**

By an *algebraic field*, we will mean, for our purposes here, a set of numbers $F$ that includes the rational numbers $R$, with the usual four arithmetic operations defined, and with the property that if $x$ and $y$ are in $F$, then so is the sum $x+y$, difference $x-y$, product $x y$, and, if $x \neq 0$, the reciprocal $1/x$. There is a rich theory of algebraic fields, but we will not require anything here more than this simple definition, the notions of linear independence and degrees, which we define below, and a basic fact (Lemma 2), which we prove below.

**Examples**: The simplest examples of fields are extensions of the rationals generated by one or more algebraic numbers. Suppose, for example, that $\alpha$ is algebraic of degree $m \ge 2$, satisfying a minimal polynomial $P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$. Let the set $S$ be the set of all $s$ that can be written $s = r_0 + r_1 \alpha + r_2 \alpha^2 + \cdots + r_{m-1} \alpha^{m-1}$, where $r_i \in R$ are rational. It is fairly easy to show that $S$ is closed under addition, subtraction and multiplication, and that the reciprocal of any nonzero value is also in $S$. Note, for instance, that if $\alpha$ is algebraic of degree $m$, then the constant term $a_0$ of its minimal polynomial $P(x)$ cannot be zero (otherwise its degree would be $m-1$), so that we can solve the equation $P(\alpha) = a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_m \alpha^m = 0$ to obtain $1/\alpha = -(a_1/a_0) – (a_2/a_0) \alpha – (a_3/a_0) \alpha^2 – \cdots – (a_m/a_0) \alpha^{m-1}$, which is in the set $S$. Thus $S$ is a field.
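As a quick numeric illustration of the reciprocal formula (a sketch only, using $\alpha = \sqrt[3]{2}$ with minimal polynomial $x^3 - 2$, i.e., $a_0 = -2$, $a_1 = a_2 = 0$, $a_3 = 1$):

```python
# For alpha = 2**(1/3), the formula 1/alpha = -(a_1/a_0) - (a_2/a_0) alpha
# - (a_3/a_0) alpha^2 reduces to (1/2) alpha^2, an element of the set S.
alpha = 2 ** (1.0 / 3.0)
reciprocal_in_S = 0.5 * alpha ** 2   # coordinates (r_0, r_1, r_2) = (0, 0, 1/2)
assert abs(reciprocal_in_S - 1 / alpha) < 1e-12
```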

**Definitions: Linear independence and degrees.**

Suppose that we have a field $S$ that contains the rational numbers $R$ (i.e., $R \subset S$), such that every $s \in S$ can be written as $s = r_1 \alpha_1 + r_2 \alpha_2 + \cdots + r_m \alpha_m$, with $r_i \in R$, and with the property that the $\alpha_i$ are *linearly independent*: This means that there are no linear relations among the $\alpha_i$, so that in particular, if $r_1 \alpha_1 + r_2 \alpha_2 + \cdots + r_m \alpha_m = 0$, with $r_i \in R$, then each $r_i = 0$. Such an extension is said to be of *degree* $m$, which we will denote here as ${\rm deg}(S / R) = m$ (this is also denoted $[S:R] = m$).

**Examples**: For example, suppose $\alpha$ is algebraic of degree $m$, with minimal polynomial $a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_m \alpha^m$. Then consider the field $S$ consisting of all $s$ that can be written $s = r_0 + r_1 \alpha + r_2 \alpha^2 + \cdots + r_{m-1} \alpha^{m-1}$, with $r_i \in R$. Since there are $m$ terms, it is clear that ${\rm deg}(S/R) \leq m$. Now suppose that for some choice of $r_0, r_1, \cdots, r_{m-1}$, that $r_0 + r_1 \alpha + r_2 \alpha^2 + \cdots + r_{m-1} \alpha^{m-1} = 0$. Since the minimal polynomial of $\alpha$ has degree $m$, and not degree $m-1$ or less, this can only happen if each $r_i = 0$. Thus ${\rm deg}(S/R) = m$ exactly. In other words, *the notions of algebraic degree and field extension degree coincide when the field is generated by an algebraic number*.

As a specific example, let $R$ be the rational numbers, and consider the extension $S = R(\sqrt{2}) = \{s = r_0 \cdot 1 + r_1 \cdot \sqrt{2}\}$ with $r_0, r_1 \in R$. If $s, t \in S$, then clearly the sum and difference are also in $S$. So is their product: note, for instance, if $s = 2 + 3 \sqrt{2}$ and $t = 1 - 2 \sqrt{2}$, that the product $st = (2 + 3 \sqrt{2})(1 - 2 \sqrt{2}) = 2 - 4 \sqrt{2} + 3 \sqrt{2} - 12 = -10 - \sqrt{2}$, which is in $S$. Further, by “rationalizing the denominator,” we can write $1/s = 1/(2 + 3 \sqrt{2}) = (2 - 3 \sqrt{2})/((2 + 3 \sqrt{2})(2 - 3 \sqrt{2})) = -1/7 + (3/14) \sqrt{2}$, which is also in $S$. Now note that the numbers $\{1, \sqrt{2}\}$ are linearly independent, since if $r_0 + r_1 \sqrt{2} = 0$ and $r_1 \ne 0$, this means that $\sqrt{2} = -r_0 / r_1$, which clearly cannot happen because $\sqrt{2}$ is irrational (this also follows by Lemma 1). Thus $r_0 = 0$ and $r_1 = 0$, so that ${\rm deg} (S/R) = 2$.
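The arithmetic above is easy to mechanize with exact rational coordinates. Here is a minimal Python sketch (the class name and methods are our own illustration) that represents elements $a + b\sqrt{2}$ of $R(\sqrt{2})$ and reproduces the product and reciprocal just computed:

```python
from fractions import Fraction as F

class QSqrt2:
    """Element a + b*sqrt(2) of the field R(sqrt(2)), with a, b rational."""
    def __init__(self, a, b):
        self.a, self.b = F(a), F(b)
    def __mul__(self, other):        # (a+b√2)(c+d√2) = (ac+2bd) + (ad+bc)√2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)
    def inverse(self):               # rationalize: 1/(a+b√2) = (a-b√2)/(a²-2b²)
        n = self.a * self.a - 2 * self.b * self.b
        return QSqrt2(self.a / n, -self.b / n)
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

s, t = QSqrt2(2, 3), QSqrt2(1, -2)
assert s * t == QSqrt2(-10, -1)                   # st = -10 - sqrt(2)
assert s.inverse() == QSqrt2(F(-1, 7), F(3, 14))  # 1/s = -1/7 + (3/14) sqrt(2)
```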

Similarly, consider the case where $S = R(\sqrt[3]{2})$, where $S$ is the set of all $s$ that can be written $s = r_0 + r_1 \sqrt[3]{2} + r_2 (\sqrt[3]{2})^2$, with $r_i \in R$. Clearly $R \subset S$, and, since there are three terms, ${\rm deg}(S/R) \leq 3$. Now suppose that for some choice of $r_0, r_1, r_2$ that $r_0 + r_1 \sqrt[3]{2} + r_2 (\sqrt[3]{2})^2 = 0$. Since $\sqrt[3]{2}$ satisfies the polynomial $x^3 – 2 = 0$, and this polynomial is irreducible by Lemma 1, $\sqrt[3]{2}$ cannot satisfy any polynomial with coefficients in $R$ with degree 2 or less. Thus it follows that $r_0 + r_1 \sqrt[3]{2} + r_2 (\sqrt[3]{2})^2 = 0$ only when $r_0 = 0, \, r_1 = 0, \, r_2 = 0$ and thus ${\rm deg}(S/R) = 3$ exactly. This also proves rigorously that $\sqrt[3]{2}$ is irrational.

We now prove an important fact about the degrees of field extensions, which will be key to our proof of the impossibility of trisection:

**LEMMA 2 (Multiplication of degrees of field extensions)**: Suppose that we have three fields $R, S, T$ (where $R$ is the rational numbers as before) such that $R \subset S \subset T$, and with ${\rm deg}(S/R) = m$ and ${\rm deg}(T/S) = n$. Then ${\rm deg} (T/R) = {\rm deg} (S/R) \cdot {\rm deg} (T/S) = m \cdot n$.

**Proof**: For ease of notation, we will prove this in the specific case $m = 2, n = 3$, i.e., where ${\rm deg} (S/R) = 2$ and ${\rm deg} (T/S) = 3$, although the argument is completely general. By hypothesis, all $s \in S$ can be written as $s = r_1 \alpha_1 + r_2 \alpha_2$, where $r_i \in R$ are rational numbers, and with the property that if $r_1 \alpha_1 + r_2 \alpha_2 = 0$, then each $r_i = 0$. In a similar way, by hypothesis all $t \in T$ can be written as $t = s_1 \beta_1 + s_2 \beta_2 + s_3 \beta_3$, where $s_i \in S$, and with the property that if $s_1 \beta_1 + s_2 \beta_2 + s_3 \beta_3 = 0$, then each $s_i = 0$.

First note that when we write any $t \in T$ as $t = s_1 \beta_1 + s_2 \beta_2 + s_3 \beta_3,$ where $s_1, s_2, s_3 \in S$, each of these $s_i$ can in turn be rewritten in terms of $r_i$ and $\alpha_i$: in particular, $s_1 = a_1 \alpha_1 + a_2 \alpha_2$, for some $a_i \in R$; $s_2 = b_1 \alpha_1 + b_2 \alpha_2$, for some $b_i \in R$; and $s_3 = c_1 \alpha_1 + c_2 \alpha_2$, for some $c_i \in R$. Thus we can write $$t = (a_1 \alpha_1 + a_2 \alpha_2) \beta_1 + (b_1 \alpha_1 + b_2 \alpha_2) \beta_2 + (c_1 \alpha_1 + c_2 \alpha_2) \beta_3 $$ $$= a_1 (\alpha_1 \beta_1) + a_2 (\alpha_2 \beta_1) + b_1 (\alpha_1 \beta_2) + b_2 (\alpha_2 \beta_2) + c_1 (\alpha_1 \beta_3) + c_2 (\alpha_2 \beta_3),$$ where $a_1, a_2, b_1, b_2, c_1, c_2 \in R$. We now see that we can write any $t \in T$ as a linear sum of the six terms shown, with coefficients in $R$. Thus ${\rm deg}(T/R) \leq 6$. Now suppose that some $t \in T$ is zero, so that $$t = (a_1 \alpha_1 + a_2 \alpha_2) \beta_1 + (b_1 \alpha_1 + b_2 \alpha_2) \beta_2 + (c_1 \alpha_1 + c_2 \alpha_2) \beta_3 = 0.$$ By hypothesis, this can only happen if each of the three coefficient expressions $(a_1 \alpha_1 + a_2 \alpha_2)$, $(b_1 \alpha_1 + b_2 \alpha_2)$ and $(c_1 \alpha_1 + c_2 \alpha_2)$ is zero, which in turn is only possible if each $a_1, a_2, b_1, b_2, c_1, c_2$ is zero. Thus the six terms $\alpha_1 \beta_1, \alpha_2 \beta_1, \alpha_1 \beta_2, \alpha_2 \beta_2, \alpha_1 \beta_3$ and $\alpha_2 \beta_3$ are linearly independent, so that ${\rm deg}(T/R) = 6$ exactly.

The more general case where ${\rm deg} (S/R) = m$ and ${\rm deg} (T/S) = n$ is handled in exactly the same way, only with somewhat more complicated notation.
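To see Lemma 2 in action, take $S = R(\sqrt{2})$ and $T = S(\sqrt{3})$, each an extension of degree 2, so that ${\rm deg}(T/R) = 2 \cdot 2 = 4$, with basis $\{1, \sqrt{2}, \sqrt{3}, \sqrt{6}\}$. The Python sketch below (our own illustration, not part of the proof) does exact arithmetic in these four coordinates and verifies that $u = \sqrt{2} + \sqrt{3}$ satisfies the degree-4 polynomial $x^4 - 10x^2 + 1$, consistent with the degree computed by Lemma 2:

```python
from fractions import Fraction as F

def mul(x, y):
    """Multiply elements of T = R(sqrt2, sqrt3), stored as tuples (a, b, c, d)
    meaning a + b*sqrt2 + c*sqrt3 + d*sqrt6, with rational coordinates."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1*a2 + 2*b1*b2 + 3*c1*c2 + 6*d1*d2,
            a1*b2 + b1*a2 + 3*(c1*d2 + d1*c2),
            a1*c2 + c1*a2 + 2*(b1*d2 + d1*b2),
            a1*d2 + d1*a2 + b1*c2 + c1*b2)

u = (F(0), F(1), F(1), F(0))     # u = sqrt(2) + sqrt(3)
u2 = mul(u, u)                   # u^2 = 5 + 2*sqrt(6)
u4 = mul(u2, u2)                 # u^4 = 49 + 20*sqrt(6)
# u^4 - 10 u^2 + 1 = 0 exactly, so u has degree 4 over R.
result = tuple(u4[i] - 10 * u2[i] + (1 if i == 0 else 0) for i in range(4))
assert result == (0, 0, 0, 0)
```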

**Definitions: Constructible numbers.**

The next step is to establish what kinds of numbers can be constructed by ruler and compass. Readers familiar with the mathematics of ruler-and-compass constructions may skip to the “Angle trisection” section below.

First, it is clear that addition can be easily done with ruler and compass: mark the two distances end to end on a straight line, and the combined distance is the sum. Subtraction can be done similarly. Multiplication can also be done by ruler and compass, using similar triangles, as illustrated in the figure to the right (taken from here), and division is just the inverse of this process.

What’s more, one can compute square roots by ruler and compass. In the second illustration to the right (taken from here), assuming that $AC = 1$, simple properties of right triangles show that the distance $AD = \sqrt{AB}$.

Thus, starting with a distance of unit length, numbers of the form $2/3 + \sqrt{1 + \sqrt{5/2}}$, or any finite combination of operations involving arithmetic operations and square roots, can be constructed. The converse is also true: Such numbers are the *only* numbers that can be constructed, as we will now show.

**LEMMA 3 (Degrees of constructible numbers)**: If $S$ is the number field resulting from a finite sequence of ruler-and-compass constructions, then ${\rm deg}(S/R) = 2^m$ for some integer $m$.

**Proof**: Ruler-and-compass constructions can only extend the rational numbers by a sequence of square root operations, each of which has algebraic degree 1 or 2:

- The intersection of two lines is algebraically the solution of a system of two linear equations, which involves only addition, subtraction, multiplication and division. Note that this does *not* involve any field extension.
- The intersection of a line with a circle involves the solution of a system of one linear and one quadratic equation, which involves only arithmetic and square roots.
- The intersection of two circles involves the solution of two simultaneous quadratic equations; subtracting one circle’s equation from the other eliminates the squared terms, reducing this case to that of a line and a circle. See any high school analytic geometry text for details.

Thus each step of a construction extends the current number field by a field of degree 1 or 2. By Lemma 2, the degree of the final field $S$ over $R$ is the product of these individual degrees, which is $2^m$ for some integer $m \geq 0$.
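For instance, the line-and-circle case can be carried out explicitly: substituting the parametric form of the line into the circle’s equation yields a quadratic in the parameter, so the coordinates of the intersection points involve only field operations and one square root. A Python sketch (the function name and arguments are our own illustration):

```python
from math import sqrt, isclose

def line_circle_intersection(p, q, center, r):
    """Intersect the line through points p and q with the circle of radius r
    about `center`. Note that only field operations and a single square root
    appear -- the algebraic content of the line-and-circle case."""
    (x1, y1), (x2, y2) = p, q
    cx, cy = center
    dx, dy = x2 - x1, y2 - y1
    # Substitute (x1 + t*dx, y1 + t*dy) into the circle equation, giving
    # the quadratic a*t^2 + b*t + c = 0 in the parameter t.
    a = dx * dx + dy * dy
    b = 2 * (dx * (x1 - cx) + dy * (y1 - cy))
    c = (x1 - cx) ** 2 + (y1 - cy) ** 2 - r * r
    disc = b * b - 4 * a * c
    roots = [(-b + s * sqrt(disc)) / (2 * a) for s in (1, -1)]
    return [(x1 + t * dx, y1 + t * dy) for t in roots]

# The horizontal line y = 0 meets the unit circle at (1, 0) and (-1, 0).
pts = line_circle_intersection((0, 0), (1, 0), (0, 0), 1)
assert all(isclose(x * x + y * y, 1) for x, y in pts)
```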

**Definitions: Angle trisection.**

We first must keep in mind that some angles *can* be trisected. For example, a right angle can be trisected, because a 30 degree angle can be constructed, simply by bisecting one of the interior angles of an equilateral triangle. To show the impossibility result, we need only exhibit a single angle that cannot be trisected. We choose a 60 degree angle (i.e., $\pi/3$ radians). In other words, we will show that a 20 degree angle (i.e., $\pi/9$ radians) cannot be constructed. Note that it is sufficient to prove that the number $x = \cos(\pi/9)$ cannot be constructed: if it could, then by copying this distance to the $x$ axis, with the left end at the origin, and constructing a perpendicular from the right end to the unit circle, the resulting angle would be the desired trisection.

Here we require only an elementary fact of trigonometry, namely the formula for the cosine of a triple angle. For convenience, we will prove this and some related formulas here, based only on elementary geometry and a little algebra. Readers familiar with these formulas may skip to the “Proof of the main theorems” section below.

**LEMMA 4 (Triple angle formula for cosine)**: $\cos(3\alpha) = 4 \cos^3(\alpha) – 3 \cos(\alpha).$

**Proof**: We start with the formula $$\sin (\alpha + \beta) = \sin (\alpha) \cos (\beta) + \cos (\alpha) \sin (\beta).$$ This fact has a simple geometric proof, which is illustrated to the right, where $OP = 1$. Note that angle $RPQ = \alpha$, because $OQA = \pi/2 – \alpha$, so that $RQO = \alpha$, and $RPQ = \pi/2 – RQP = \pi/2 – (\pi/2 – RQO) = RQO = \alpha$. Then note that $PQ = \sin(\beta), \, OQ = \cos (\beta)$. Further, $AQ/OQ = \sin(\alpha)$, so $AQ = \sin(\alpha) \cos(\beta)$, and $PR/PQ = \cos(\alpha)$, so $PR = \cos(\alpha) \sin(\beta)$. Combining these results, we have $$\sin(\alpha + \beta) = PB = RB + PR = AQ + PR = \sin(\alpha) \cos(\beta) + \cos(\alpha) \sin(\beta).$$ [This proof and illustration are taken from here.] By applying this formula to $(\pi/2 – \alpha)$ and $-\beta$, we obtain the formula: $$\cos(\alpha + \beta) = \cos(\alpha) \cos(\beta) – \sin(\alpha) \sin(\beta).$$ From these formulas we immediately derive the double-angle formulas $\sin(2\alpha) = 2 \sin(\alpha) \cos(\alpha)$ and $\cos(2\alpha) = 2 \cos^2(\alpha) – 1$. Now we can write: $$\cos(3\alpha) = \cos (\alpha + 2\alpha) = \cos(\alpha) \cos(2\alpha) – \sin(\alpha) \sin(2\alpha) = \cos(\alpha) (2 \cos^2(\alpha) – 1) – 2 \sin^2(\alpha)\cos(\alpha)$$ $$= \cos(\alpha) (2 \cos^2(\alpha) – 1) – 2 (1 – \cos^2(\alpha))\cos(\alpha) = 4 \cos^3(\alpha) – 3 \cos(\alpha).$$
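Lemma 4 is easy to spot-check numerically (a sanity check only, not a proof):

```python
from math import cos, pi, isclose

# Verify cos(3a) = 4 cos^3(a) - 3 cos(a) at several sample angles,
# including a = pi/9, the angle used in the main theorem.
for alpha in (0.0, pi / 9, pi / 5, 1.0, 2.5):
    assert isclose(cos(3 * alpha), 4 * cos(alpha) ** 3 - 3 * cos(alpha),
                   abs_tol=1e-12)
```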

**Proof of the main theorems:**

**THEOREM 1**: There is no ruler-and-compass scheme to trisect an arbitrary angle.

**Proof**: From Lemma 4 we can write $$\cos (3(\pi/9)) = \cos (\pi/3) = 1/2 = 4 \cos^3(\pi/9) - 3 \cos(\pi/9),$$ so that $\cos(\pi/9)$ is a root of the polynomial $8 x^3 - 6 x - 1$. A quick check of possible integer coefficients $a, b, c, d, e$ for the equality $8 x^3 - 6 x - 1 = (ax + b) (cx^2 + dx + e) = ac x^3 + (ad + bc)x^2 + (ae + bd)x + be$, attempting to match coefficients term by term on each side, and applying Lemma 1, shows that the polynomial $8 x^3 - 6 x - 1$ is irreducible. In other words, $\cos(\pi/9)$ is an algebraic number of degree 3, whose minimal (irreducible) polynomial is $8 x^3 - 6 x - 1$, so that it generates an extension field $F$ of the rationals $R$ with ${\rm deg}(F/R) = 3$; by Lemma 2, any field containing $\cos(\pi/9)$ thus has degree over $R$ divisible by 3. However, by Lemma 3 the only possible degrees for number fields constructed by ruler and compass are powers of two. Since $2^m$ is not divisible by $3$, no finite sequence of ruler-and-compass constructions can produce an angle of $\pi/9$ or 20 degrees, which is the trisection of the angle $\pi/3$ or 60 degrees.
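The irreducibility step can also be checked via Lemma 1 directly: a cubic is reducible over the rationals only if it has a linear factor, hence a rational root, and for $8x^3 - 6x - 1$ Lemma 1 limits the candidates to $\pm 1, \pm 1/2, \pm 1/4, \pm 1/8$. A Python sketch verifying this in exact arithmetic, along with a numeric check that $\cos(\pi/9)$ is indeed a root:

```python
from fractions import Fraction as F
from math import cos, pi, isclose

def P(x):
    return 8 * x**3 - 6 * x - 1

# cos(pi/9) is (numerically) a root of 8x^3 - 6x - 1 ...
assert isclose(P(cos(pi / 9)), 0, abs_tol=1e-12)

# ... and by Lemma 1 any rational root p/q would need p | 1 and q | 8.
# None of the candidates works, so the cubic has no linear factor and
# is therefore irreducible over the rationals.
candidates = [F(p, q) for p in (1, -1) for q in (1, 2, 4, 8)]
assert all(P(x) != 0 for x in candidates)
```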

**THEOREM 2**: There is no ruler-and-compass scheme to duplicate an arbitrary cube.

**Proof**: This follows by the same line of reasoning, since $\sqrt[3]{2}$ is clearly of algebraic degree 3. As noted above, the irreducibility of the associated polynomial $x^3 - 2$ follows by Lemma 1. In particular, if $x^3 - 2$ factors at all, one of the factors must be linear, yielding a rational root. But there is no rational root, since by Lemma 1 any rational root $p/q$, where $p$ and $q$ are integers having no common factor, must have $p$ dividing $2$ and $q$ dividing $1$, and none of the candidates $1, \, -1, \, 2$ or $-2$ satisfies $x^3 - 2 = 0$.

For other proofs in this series see the listing at Simple proofs of great theorems.

**Simple proofs: The fundamental theorem of algebra**

This theorem has a long, tortuous history. In 1608, Peter Roth wrote that a polynomial equation of degree $n$ with real coefficients may have $n$ solutions, but offered no proof. Leibniz and Nikolaus Bernoulli both asserted that quartic polynomials of the form $x^4 + a^4$, where $a$ is real and nonzero, could not be solved or factored into linear or quadratic factors, but Euler later pointed out that $$x^4 + a^4 = (x^2 + a \sqrt{2} x + a^2) (x^2 - a \sqrt{2} x + a^2).$$ Eventually mathematicians became convinced that something like the fundamental theorem must be true. Numerous mathematicians, including d’Alembert, Euler, Lagrange, Laplace and Gauss, published proofs in the 1700s and early 1800s, but each was later found to be flawed or incomplete. The first complete and fully rigorous proof was by Argand in 1806. For additional historical background on the fundamental theorem of algebra, see this Wikipedia article.

A proof of the fundamental theorem of algebra is typically presented in a college-level course in complex analysis, but only after an extensive background of underlying theory such as Cauchy’s theorem, the argument principle and Liouville’s theorem. Only a tiny percentage of college students ever take such coursework, and even many of those who do take such coursework never really grasp the essential idea behind the fundamental theorem of algebra, because of the way it is typically presented in textbooks (this was certainly the editor’s initial experience with the fundamental theorem of algebra). All this is generally accepted as an unpleasant but unavoidable feature of mathematical pedagogy.

We present here a proof of the fundamental theorem of algebra. We emphasize that it is both *elementary* and *self-contained* — it relies only on a well-known completeness axiom and some straightforward reasoning with estimates and inequalities. It should be understandable by anyone with a solid high school background in algebra, trigonometry and complex numbers, although some experience with limits and estimates, as is commonly taught in high school or first-year college calculus courses, would also help.

**Gist of the proof**:

For readers familiar with Newton’s method for solving equations, one starts with a reasonably close approximation to a root, then adjusts the approximation by moving closer in an appropriate direction. We will employ the same strategy here, showing that if one assumes that the argument where the polynomial function achieves its minimum absolute value is not a root, then there is a nearby argument where the polynomial function has an even smaller absolute value, contradicting the assumption that the argument of the minimum absolute value is not a root.

**Definitions and axioms**:

In the following $p(z)$ will denote the $n$-th degree polynomial $p(z) = p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n$, where the coefficients $p_i$ are any complex numbers, with neither $p_0$ nor $p_n$ equal to zero (otherwise the polynomial is equivalent to one of lesser degree). We will utilize a fundamental completeness property of real and complex numbers, namely that a continuous function on a closed, bounded set achieves its minimum at some point in the domain. This can be taken as an axiom, or can be easily proved by applying other well-known completeness axioms, such as the Cauchy sequence axiom or the nested interval axiom. See this Wikipedia article for more discussion on completeness axioms.

**THEOREM 1**: Every polynomial of degree $n \geq 1$ with real or complex coefficients has at least one complex root.

**Proof**: Suppose that $p(z)$ has no roots in the complex plane. First note that for large $z$, say $|z| > \max(2, \, 2 \max_i |p_i/p_n|)$, the $z^n$ term of $p(z)$ is greater in absolute value than the sum of all the other terms. Thus given some $B > 0$, then for any sufficiently large $s$, we have $|p(z)| > B$ for all $z$ with $|z| \ge s$. We will take $B = 2 |p(0)| = 2 |p_0|$. Since $|p(z)|$ is continuous on the closed disk consisting of the interior and boundary of the circle with radius $s$, it follows by the completeness axiom mentioned above that $|p(z)|$ achieves its minimum value at some point $t$ in this disk (possibly on the boundary). But since $|p(0)| < 1/2 \cdot |p(z)|$ for all $z$ on the circumference of the circle, it follows that $|p(z)|$ achieves its minimum at some point $t$ in the *interior* of the circle.

Now rewrite the polynomial $p(z)$, translating the argument $z$ by $t$, thus producing a new polynomial $$q(z) = p(z + t) \; = \; q_0 + q_1 z + q_2 z^2 + \cdots + q_n z^n,$$ so that $q(0) = p(t)$, and similarly translate the circle described above. By assumption, the polynomial $q(z)$, defined on some circle centered at the origin (which circle is contained within the circle above), has a minimum absolute value $M > 0$ at $z = 0$. Note that $M = |q(0)| = |q_0|$.

Our proof strategy is to construct some point $x$, close to the origin, such that $|q(x)| < |q(0)|,$ thus contradicting the presumption that $|q(z)|$ has a minimum nonzero value at $z = 0$. If our method gives us merely a direction in the complex plane for which the function value decreases in magnitude (a *descent direction*), then by moving a small distance in that direction, we hope to achieve our goal of constructing a complex $x$ such that $|q(x)| < |q(0)|$. This is the strategy we will pursue.

**Construction of $x$ such that $|q(x)| < |q(0)|$**:

Let the first nonzero coefficient of $q(z)$, following $q_0$, be $q_m$, so that $q(z) = q_0 + q_m z^m + q_{m+1} z^{m+1} + \cdots + q_n z^n$. We will choose $x$ to be the complex number $$x = r \left(\frac{- q_0}{ q_m}\right)^{1/m},$$ where $r$ is a small positive real value we will specify below, and where $(-q_0/q_m)^{1/m}$ denotes any of the $m$-th roots of $(-q_0/q_m)$.

**Comment**: As an aside, note that unlike the real numbers, in the complex number system the $m$-th roots of a real or complex number are always guaranteed to exist: if $z = z_1 + i z_2$, with $z_1$ and $z_2$ real, then the $m$-th roots of $z$ are given explicitly by $$\{R^{1/m} \cos ((\theta + 2k\pi)/m) + i R^{1/m} \sin ((\theta+2k\pi)/m), \, k = 0, 1, \cdots, m-1\},$$ where $R = \sqrt{z_1^2 + z_2^2}$ and $\theta = \arctan (z_2/z_1)$. The guaranteed existence of $m$-th roots, a feature of the complex number system, is the key fact behind the fundamental theorem of algebra.
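A numeric sketch of this formula (using Python’s `cmath.phase`, which resolves the quadrant ambiguity that the bare $\arctan(z_2/z_1)$ expression glosses over):

```python
import cmath

def mth_roots(z, m):
    """All m distinct m-th roots of a nonzero complex z, via the polar formula."""
    R, theta = abs(z), cmath.phase(z)
    return [R ** (1.0 / m) * cmath.exp(1j * (theta + 2 * cmath.pi * k) / m)
            for k in range(m)]

# Each cube root of 8i, raised to the third power, recovers 8i.
for w in mth_roots(8j, 3):
    assert cmath.isclose(w ** 3, 8j, abs_tol=1e-9)
```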

**Proof that $|q(x)| < |q(0)|$**:

With the definition of $x$ given above, we can write

$$q(x) = q_0 - q_0 r^m + q_{m+1} r^{m+1} \left(\frac{-q_0} {q_m}\right)^{(m+1)/m} + \cdots + q_n r^n \left(\frac{-q_0} {q_m}\right)^{n/m}$$ $$= q_0 - q_0 r^m + E,$$ where the extra terms $E$ can be bounded as follows. Assume that $|q_0| \leq |q_m|$ (a very similar expression is obtained for $|E|$ in the case $|q_0| \geq |q_m|$), and define $s = r(|q_0/q_m|)^{1/m}$, where $r$ is chosen small enough that $s < 1$. Then, by applying the well-known formula for the sum of a geometric series, we can write $$|E| \leq r^{m+1} \max_i |q_i| \left|\frac{q_0}{q_m}\right|^{(m+1)/m} (1 + s + s^2 + \cdots + s^{n-m-1}) \leq \frac{r^{m+1}\max_i |q_i|}{1 - s} \left|\frac{q_0}{q_m}\right|^{(m+1)/m}.$$ Thus $|E|$ can be made arbitrarily smaller than $|q_0| r^m$ by choosing $r$ small enough: for instance, select $r$ so that $|E| < |q_0| r^m / 2$. Then for such an $r$ (with $0 < r < 1$), we have $$|q(x)| = |q_0 - q_0 r^m + E| \leq |q_0| (1 - r^m) + |E| < |q_0| (1 - r^m) + |q_0| r^m / 2 = |q_0| (1 - r^m / 2) < |q_0| = |q(0)|,$$ which contradicts the original assumption that $|q(z)|$ has a minimum nonzero value at $z = 0$.
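The descent step is easy to demonstrate numerically. The sketch below (helper names are our own; it uses Python’s principal complex root for $(-q_0/q_m)^{1/m}$) builds the point $x$ from the construction above and confirms $|q(x)| < |q(0)|$ for a sample polynomial and a small $r$:

```python
def descent_step(q, r):
    """For coefficients q = [q_0, ..., q_n] with q_0 != 0, form the point
    x = r * (-q_0/q_m)^(1/m) from the construction above, where q_m is the
    first nonzero coefficient after q_0, and return (|q(0)|, |q(x)|)."""
    def val(z):
        return sum(c * z ** k for k, c in enumerate(q))
    m = next(k for k in range(1, len(q)) if q[k] != 0)
    x = r * (-q[0] / q[m]) ** (1.0 / m)    # a principal m-th root (complex)
    return abs(val(0)), abs(val(x))

# Example: q(z) = 1 + z^2 + z^3, so q_0 = 1 and m = 2; here (-1)^(1/2) = i,
# and a small step r in that direction decreases |q|.
before, after = descent_step([1, 0, 1, 1], 0.1)
assert after < before
```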

**THEOREM 2**: Every polynomial of degree $n$ with real or complex coefficients has exactly $n$ complex roots, when counting individually any repeated roots.

**Proof**: If $\alpha$ is a real or complex root of the polynomial $p(z)$ of degree $n$ with real or complex coefficients, then by dividing this polynomial by $(z – \alpha)$, using the well-known polynomial division process, one obtains $p(z) = (z – \alpha) q(z) + r$, where $q(z)$ has degree $n – 1$ and $r$ is a constant. But note that $p(\alpha) = r = 0$, so that $p(z) = (z – \alpha) q(z)$. Continuing by induction, we conclude that the original polynomial $p(z)$ has exactly $n$ complex roots, although some might be repeated.
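The division step in this induction is just synthetic division (Horner’s scheme), which the following Python sketch illustrates on a small example (the function name is our own):

```python
def divide_by_linear(p, alpha):
    """Synthetic division: given coefficients p = [p_0, ..., p_n], return
    (q, r) with p(z) = (z - alpha) q(z) + r, where q has degree n - 1."""
    q = []
    acc = 0
    for c in reversed(p):        # Horner's scheme, highest coefficient first
        acc = acc * alpha + c
        q.append(acc)
    r = q.pop()                  # the final accumulator equals p(alpha)
    return list(reversed(q)), r

# p(z) = z^2 - 3z + 2 = (z - 1)(z - 2); dividing by (z - 1) leaves remainder 0.
q, r = divide_by_linear([2, -3, 1], 1)
assert (q, r) == ([-2, 1], 0)    # quotient z - 2, remainder p(1) = 0
```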

For other proofs in this series see the listing at Simple proofs of great theorems.

**Simple proofs: The irrationality of pi**

Mankind has been fascinated with $\pi$, the ratio between the circumference of a circle and its diameter, for at least 2500 years. Ancient Hebrews used the approximation 3 (see 1 Kings 7:23 and 2 Chron. 4:2). Babylonians used the approximation 3 1/8. Archimedes, in the first rigorous analysis of $\pi$, proved that 3 10/71 < $\pi$ < 3 1/7, by means of a sequence of inscribed and circumscribed polygons. Later scholars in India (where decimal arithmetic was first developed, at least by 300 CE), China and the Middle East computed $\pi$ ever more accurately. In 1665, Newton computed 16 digits, but, as he later confessed, “I am ashamed to tell you to how many figures I carried these computations, having no other business at the time.” In 1844 the computing prodigy Johann Dase produced 200 digits. In 1874, Shanks published 707 digits, although later it was found that only the first 527 were correct.

In the 20th century, $\pi$ was computed to thousands, then to millions, then to billions of digits, in part due to some remarkable new formulas and algorithms for $\pi$, and in part due to clever computational techniques, all accelerated by the relentless advance of Moore’s Law. The most recent computation, as far as the present author is aware, produced 22.4 trillion digits. One may download up to the first trillion digits here. For additional details on the history and computation of $\pi$, see The quest for $\pi$ and The computation of previously inaccessible digits of pi^2 and Catalan’s constant.

For many years, part of the motivation for computing $\pi$ was to answer the question of whether $\pi$ was a rational number — if $\pi$ was rational, the decimal expansion would eventually repeat. But given that no repetitions were found in early computations, many leading mathematicians in the 17th and 18th century concluded that $\pi$ must be irrational. In 1761, Johann Heinrich Lambert settled the question by proving that $\pi$ is irrational. Then in 1882 Ferdinand von Lindemann proved that $\pi$ is transcendental, which proved once and for all that the ancient Greek problem of squaring the circle is impossible (because ruler-and-compass constructions can only produce power-of-two degree algebraic numbers).

We present here what we believe to be the simplest proof that $\pi$ is irrational. It was first published in 1949 as an exercise in the Bourbaki treatise on calculus. It requires only a familiarity with integration, including integration by parts, which is a staple of any high school or first-year college calculus course. It is similar to, but simpler than (in the present author’s opinion), a related proof due to Ivan Niven.

**Gist of the proof:**

The basic idea is to define a function $A_n(b)$, based on an integral from $0$ to $\pi$. This function has the property that for each positive integer $b$ and for all sufficiently large integers $n$, $A_n(b)$ lies strictly between 0 and 1. A separate line of reasoning, assuming $\pi$ is a rational number $a/b$ and applying integration by parts, concludes that $A_n(b)$ must be an integer. This contradiction shows that $\pi$ must be irrational.

**THEOREM**: $\pi$ is irrational.

**Proof**:

For each positive integer $b$ and non-negative integer $n$, define $$A_n(b) = b^n \int_0^\pi \frac{x^n(\pi - x)^n \sin(x)}{n!} \, {\rm d}x.$$ Note that the integrand function of $A_n(b)$ is zero at $x = 0$ and $x = \pi$, but is strictly positive elsewhere in the integration interval. Thus $A_n(b) > 0$. In addition, since $x (\pi - x) \le (\pi/2)^2$, we can write $$A_n(b) \le \frac{\pi b^n}{n!}\left(\frac{\pi}{2}\right)^{2n} = \frac{\pi (b \pi^2 / 4)^n}{n!},$$ which is less than one for large $n$, since, by Stirling’s formula, the $n!$ in the denominator increases faster than the $n$-th power in the numerator. Thus we have established, for any integer $b \ge 1$ and all sufficiently large $n$, that $0 < A_n(b) < 1.$
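
The squeeze $0 < A_n(b) < 1$ is easy to watch numerically. The sketch below (not part of the proof, purely an illustration) approximates the integral with a composite Simpson rule and compares it with the bound $\pi (b \pi^2/4)^n / n!$:

```python
import math

def A(n, b, steps=2000):
    """Composite-Simpson approximation of
    A_n(b) = b^n/n! * integral_0^pi x^n (pi - x)^n sin(x) dx."""
    g = lambda x: x**n * (math.pi - x)**n * math.sin(x)
    h = math.pi / steps
    s = g(0.0) + g(math.pi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(i * h)
    return b**n * (s * h / 3) / math.factorial(n)

# Watch A_n(2) and its bound pi*(2*pi^2/4)^n/n! fall below 1 as n grows:
for n in (5, 10, 15, 20):
    bound = math.pi * (2 * math.pi**2 / 4)**n / math.factorial(n)
    print(n, A(n, 2), bound)
```

For $b = 2$ the values exceed 1 at first but drop below 1 well before $n = 15$, exactly as the factorial-versus-power argument predicts.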

Now let us assume that $\pi = a/b$ for relatively prime positive integers $a$ and $b$. Define $f(x) = x^n (a - bx)^n / n!$, and note that $b^n x^n (\pi - x)^n = x^n (b\pi - bx)^n = x^n (a - bx)^n$, so that $A_n(b) = \int_0^\pi f(x) \sin(x) \, {\rm d}x$. Then we can write, using repeated integration by parts, $$A_n(b) = \int_0^\pi f(x) \sin(x) \, {\rm d}x = \left[-f(x) \cos(x)\right]_0^\pi - \left[-f'(x) \sin(x)\right]_0^\pi + \cdots$$ $$ \cdots \pm \left[f^{(2n)} (x) \cos(x)\right]_0^\pi \pm \int_0^\pi f^{(2n+1)} (x) \cos(x) \, {\rm d}x.$$ Now note that for $k = 0$ to $k = n - 1$, the derivative $f^{(k)} (x)$ is zero at $x = 0$ and $x = \pi$ (every term of $f$ has degree at least $n$, and $f(\pi - x) = f(x)$). For $k = n$ to $k = 2n$, $f^{(k)} (x)$ takes integer values at $x = 0$ and $x = \pi$, since $f^{(k)}(0)$ is $k!/n!$ times an integer; and since $f$ is a polynomial of degree $2n$, the derivative $f^{(2n+1)}$ is identically zero, so the final integral vanishes. Also, the function $\sin(x)$ is $0$ at $x = 0$ and $x = \pi$, while the function $\cos(x)$ is $1$ at $x = 0$ and $-1$ at $x = \pi$. Combining these facts, we conclude that $A_n (b)$ must be an integer. This contradiction proves that $\pi$ is irrational.
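
The claims about the derivatives of $f$ can be spot-checked with exact rational arithmetic. In the sketch below, $a$, $b$ and $n$ are arbitrary illustrative values (no actual $a/b$ equals $\pi$, of course):

```python
from fractions import Fraction
from math import comb, factorial

# For f(x) = x^n (a - b*x)^n / n!, check that f^(k) vanishes at x = 0 and
# x = a/b for k < n, and takes integer values there for n <= k <= 2n.
a, b, n = 7, 3, 4   # illustrative values only

# f(x) = sum_j c_{n+j} x^{n+j}, with exact rational coefficients
coeffs = {n + j: Fraction(comb(n, j) * a**(n - j) * (-b)**j, factorial(n))
          for j in range(n + 1)}

def deriv_at(k, x):
    """Exact k-th derivative of f at a rational point x."""
    return sum(Fraction(factorial(j), factorial(j - k)) * c * x**(j - k)
               for j, c in coeffs.items() if j >= k)

for k in range(2 * n + 1):
    d0, dpi = deriv_at(k, Fraction(0)), deriv_at(k, Fraction(a, b))
    if k < n:
        assert d0 == 0 == dpi
    else:
        assert d0.denominator == 1 and dpi.denominator == 1
```

The symmetry $f(a/b - x) = f(x)$ is what makes the values at $x = a/b$ match those at $0$ up to sign.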

For other proofs in this series see the listing at Simple proofs of great theorems.

]]>Modern mathematics is one of the most enduring edifices created by humankind, a magnificent form of art and science that all too few have the opportunity of appreciating. The great British mathematician G.H. Hardy wrote, “Beauty is the first test; there is no permanent place in the world for ugly mathematics.” Mathematician-philosopher Bertrand Russell added: “Mathematics, rightly viewed, possesses not only truth, but supreme beauty — a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show.”

Yet it is a sad fact that few people on this planet even begin to grasp the full power and beauty of mathematics. For the vast majority of humankind, even in first-world nations, all that students ever see of mathematics is some relatively dull drilling in decimal arithmetic (e.g., the dreaded times tables), some very basic algebra and various textbook “story problems.” Most never glimpse the higher-level beauty of real modern mathematics that lies at the end of the tunnel. Many otherwise promising students lose interest in mathematics as a result of dreary instruction, often by teachers who did not major in mathematics and for whom teaching mathematics is a chore. Inevitably, math homework often loses out to social media, computer games, parties and sports. The result is a tragic waste of desperately needed mathematical talent for the present and future information age.

Part of the problem here is that hardly any students ever see some of the more beautiful parts of mathematics, such as elegant proofs of important mathematical theorems. Some of these theorems may be mentioned in textbooks, but often their proofs are dismissed as “beyond the scope of this text,” relegated to some higher-level mathematics course taken only by advanced math majors or graduate students, an extremely tiny audience. Even when these proofs are part of the curriculum, the presentation in textbooks often fails to highlight the key ideas in a way that students can easily grasp and from which they can gain intuition.

As a single example, the fundamental theorem of algebra, namely the fact that every algebraic equation of degree *n* has *n* roots, typically is presented in a college-level course in complex analysis, and then only after an extensive background of underlying theory such as Cauchy’s theorem, the argument principle and Liouville’s theorem. Only a tiny percentage of college students ever take such coursework, and even many of those who do take such coursework never really grasp the essential idea behind the fundamental theorem of algebra, because of the way it is typically presented in textbooks (this was certainly the editor’s initial experience with the fundamental theorem of algebra). All this is generally accepted as an unpleasant but unavoidable feature of mathematical pedagogy.

The editor of this blog rejects this defeatism. He is convinced that many of the greatest theorems of mathematics can be proved significantly more simply, and requiring significantly less background, than they are typically presented in traditional textbooks and courses.

Thus the editor has decided to start a new feature in this blog, namely to present simple, beautiful and readily understandable proofs of a number of important theorems. In most cases, as we will see, this can be done without sacrificing rigor, and yet also without a huge background of prior coursework and machinery that many younger students especially might not have the patience or time for.

The proofs envisioned for presentation will be designed to satisfy the following requirements:

- The proofs will include some historical background and context.
- In most cases, the simplest, most elegant and beautiful proof of a given theorem will be the one presented.
- Whenever possible, the proofs will be suitable for presentation to bright high school students or, at most, first- or second-year college mathematics students.
- Any necessary lemmas or other background material beyond very basic facts will be included.
- The entire proof, including historical context and background material, will not exceed three pages or so.

An impossible dream? The editor is convinced otherwise.

Here are some articles that will start the series. Others will be added in the coming months:

Comments on any of this material are certainly welcome. What’s more, if a reader has a suggestion for some theorem that has not yet been featured, or even for an alternate proof of some theorem that already has been featured, please contact the editor. His email is given at Welcome to the Math Scholar blog.

]]>As we have explained in previous Math Scholar blogs (see, for example, MS1 and MS2), the perplexing question why the heavens are silent even though, from all evidence, the universe is teeming with potentially habitable exoplanets, continues to perplex and fascinate scientists. It is one of the most significant questions of modern science, with connections to mathematics, physics, astronomy, cosmology, biology and philosophy.

In spite of the glib dismissals that are often presented in public venues and (quite sadly) in writings by some professional scientists (see MS1 and MS2 for examples and rejoinders), there is no easy resolution to Fermi’s paradox, as it is known.

**Sociological explanations**, such as “they have lost interest in research and exploration,” or “they are under a galactic directive not to disturb Earth,” or “they have moved on to more advanced communication technologies,” all fall prey to a diversity argument: any advanced society that arose from an evolutionary process (the only conceivable natural process to explain the origin of complex beings) surely must consist of millions if not billions or even trillions of diverse individuals. All it takes is for one individual in just one of these advanced civilizations to disagree, say by defying the galactic directive, and to send probes or communications to Earth in a form that earthlings or other newly technological societies can easily recognize, and the explanation fails. Note, for example, that any signal (microwave, light or gravitational wave), once it has been sent by extraterrestrials (ETs) on its way to Earth, cannot be censored or called back, according to known laws of physics.

Similarly, **technological explanations** (i.e., it is too difficult for them) were once considered persuasive, but now founder on the fact that even present-day human technology is sufficient to communicate with and (soon) to send probes to distant stars and planets (see, for example, technical reports TR1, TR2, TR3 and TR4). If we can do this now, or within the next 20-30 years, surely civilizations thousands or millions of years more advanced can do this much better, and much more cheaply too. This same space technology, by the way, defeats the “all advanced civilizations destroy themselves” explanation, since as soon as human civilization (for instance) establishes permanent outposts on the Moon, Mars and beyond, its continued survival will be largely impervious to any calamities that may befall the home planet. By the way, the self-destruct explanation also falls prey to a diversity argument — even if most civilizations destroy themselves, surely at least a handful have figured out how to survive and thrive over the long term?

So are we alone, at least in the Milky Way? Perhaps, although many (including the present author) find such a conclusion rather distasteful, to say the least, since it goes against the Copernican principle that has guided scientific research for many years. See previous Math Scholar blogs MS1 and MS2 for a more complete discussion of these issues.

Recently two books and a *Scientific American* article (which summarizes a third book) have appeared on this topic. Here is a brief synopsis of each.

In his book If the Universe Is Teeming with Aliens … WHERE IS EVERYBODY?: Seventy-Five Solutions to the Fermi Paradox and the Problem of Extraterrestrial Life, British physicist Stephen Webb updates his 2002 edition with a thorough discussion of 50 earlier proposed solutions to Fermi’s paradox, updated to reflect the latest scientific findings, plus 25 new proposed solutions for this edition. The discussion is very readable and focused, yet does not omit technical detail where needed.

Here are just a few of the proposed solutions presented (often with devastating rejoinders) in Webb’s book; the titles are taken from Webb’s book:

- *The Zoo scenario*: We are part of a preserve, being tended by an overarching society of ETs.
- *The interdict scenario*: There is a galaxy-wide prohibition on visits to or communication with Earth.
- *They have not had time to reach us*: ETs may exist, but they have not yet been able to reach us or communicate with us.
- *Bracewell-von Neumann probes*: ETs could have explored the galaxy by means of self-replicating probes that arrive at a star system, find a suitable asteroid or planet, construct additional copies of themselves, and then launch them to other star systems, with updated software beamed from the home planet. Such probes could have explored the entire Milky Way in a few million years, an eyeblink in cosmic time. So where are they?
- *They are signaling, but we don’t know how to listen*: We are not using the right technology to receive and/or decipher their signals.
- *They have no desire to communicate*: ETs have settled into the good life and no longer have any desire to explore the cosmos or learn about other newly technological societies like us.
- *Intelligence isn’t permanent*: Intelligence has arisen elsewhere, but then degraded because it was not needed so much anymore.
- *We live in a post-biological universe*: ETs have progressed from biological to machine-based existence and no longer have interest in us.
- *Continuously habitable zones are narrow*: The Earth is nearly unique in maintaining a habitable environment over billions of years.
- *The galaxy is a dangerous place*: Much if not most of the galaxy is uninhabitable due to a steady rain of supernova explosions, gamma ray bursts and other phenomena that regularly sterilize their surroundings.
- *Life’s genesis is rare*: Although progress has been made, the origin of the first reproducing biomolecules on Earth is still a mystery. Perhaps biogenesis was a freak of nature, much more singular than we suppose.
- *Intelligence at the human level is rare*: Even if life is common, perhaps human-level intelligence, with its associated technology, is extremely rare.
- *Science is not inevitable*: Perhaps even if intelligence emerges, the development of science and advanced technology is rare.

Webb discusses each of these proposed solutions in detail, with numerous references to technical literature and other analyses. He also frankly discusses the weaknesses of each of these, and, in many places, offers his own assessment. In the end, Webb acknowledges that while we still have much to learn, he personally thinks it is unlikely that there are any other human-like technological societies in the Milky Way.

In his new book The Great Silence: Science and Philosophy of Fermi’s Paradox, Milan Cirkovic analyzes Fermi’s paradox from a somewhat deeper philosophical and scientific basis. He argues that many of the approaches to the great silence are riddled with logical errors and other weaknesses. He then attempts to carefully examine many of the proposed solutions and to identify what can safely be said.

For example, Cirkovic argues that discussions of the Drake equation, which is used to estimate the number of space-communicating civilizations in the Milky Way, are typically fallacious. Drake’s equation, which was first proposed by Frank Drake in the 1960s, is:

$$N = R_* \cdot f_p \cdot n_e \cdot f_l \cdot f_i \cdot f_c \cdot L,$$

where $N$ is the number of predicted ET civilizations, $R_*$ is the star formation rate in the Milky Way, $f_p$ is the fraction of stars possessing planets, $n_e$ is the average number of habitable planets per planetary system, $f_l$ is the fraction of habitable planets possessing life, $f_i$ is the fraction of inhabited planets developing intelligent life, $f_c$ is the fraction of intelligent societies developing space communication technology, and $L$ is the lifespan of civilizations that have developed such technology.

Cirkovic argues that because of the huge uncertainties involved, each term should be given by a probability distribution, and the product integrated over the multidimensional parameter space. In any event, since virtually all of these terms are matters of active astrobiological research, writers who cite Drake’s equation without analyzing it more carefully are on shaky ground.
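
Cirkovic's point can be illustrated with a toy Monte Carlo: draw each factor from a distribution (here log-uniform, with ranges invented purely for illustration) and examine the spread of the resulting $N$, rather than quoting a single point estimate:

```python
import math
import random
import statistics

def loguniform(lo, hi):
    """Sample log-uniformly between lo and hi."""
    return 10 ** random.uniform(math.log10(lo), math.log10(hi))

def sample_N():
    # Placeholder ranges for each Drake factor -- not serious estimates.
    R  = loguniform(1, 10)      # star formation rate, stars/year
    fp = loguniform(0.5, 1)     # fraction of stars with planets
    ne = loguniform(0.1, 5)     # habitable planets per system
    fl = loguniform(1e-6, 1)    # fraction of those developing life
    fi = loguniform(1e-3, 1)    # fraction of those developing intelligence
    fc = loguniform(0.1, 1)     # fraction developing communication
    L  = loguniform(100, 1e8)   # years a civilization communicates
    return R * fp * ne * fl * fi * fc * L

random.seed(1)
samples = [sample_N() for _ in range(100_000)]
print("median N:", statistics.median(samples))
print("fraction of draws with an 'empty' galaxy (N < 1):",
      sum(s < 1 for s in samples) / len(samples))
```

The striking feature of such runs is the enormous spread: with plausible-looking input ranges, both a crowded galaxy and an empty one come out with substantial probability, which is precisely why point estimates of the Drake product are on shaky ground.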

In the end, Cirkovic argues that in addition to identifying and correcting prejudices and fallacies in this arena, we need to realize that substantive progress awaits more real scientific data. But he insists that we must not retreat from the Copernican principle, namely that ultimately our existence is not unique. He concludes on a positive note:

Even when [space exploration] is achieved, it will not be the beginning of the end — but it might be the end of the beginning of the greatest of all journeys, the wildest adventure of all adventures. … After all, this is what science has stood for in its most brilliant moments: courage, conviction, and the spirit of great adventure. These qualities might, in the final analysis, withstand the test of cosmic time, repulse the last challenge to Copernicanism, and ultimately break the Great Silence.

Astrophysicist and well-known science writer John Gribbin has weighed in on the debate with an article in the August 2018 *Scientific American* entitled Are Humans Alone in the Milky Way? Why we are probably the only intelligent life in the galaxy. His article presents a brief summary of his 2011 book Alone in the Universe: Why Our Planet Is Unique, with a number of updates reflecting recent scientific discoveries and findings.

Gribbin argues that we are the product of a long series of remarkably fortuitous (and unlikely) events that might not have been repeated anywhere else, at least within the Milky Way. These include:

- *Special timing*: Given that all elements heavier than hydrogen, helium and lithium were created in supernova explosions, a metal-rich environment like our sun and solar system could not exist until several billion years after the big bang, yet not so long after that the sun would begin to die.
- *Special location*: While the Milky Way seems vast, regions much closer than our sun to the center of the galaxy are bathed in high-energy particles and gamma ray bursts, which are lethal to any conceivable form of biology. Much further from the center than our sun, metallicity and other conditions are not adequate. In other words, we reside in a Goldilocks zone in the Milky Way, estimated to span only about 7% of the galactic radius.
- *Special planet*: In spite of all the talk about exoplanets in the Milky Way, only a few are rocky like Earth, fewer reside in the habitable zone, and fewer still (or perhaps none at all) have these features plus a combination of a molten core generating a magnetic field together with plate tectonics to regulate climate. In our solar system, Mars is too far away, has no magnetic field and is too cold (although primitive organisms might exist underground). Venus lacks a magnetic field and plate tectonics, and, because of a runaway greenhouse effect, is much too hot for any life. The collision with Earth by a Mars-size object in the early solar system, and the presence of a large moon, also appear to be key to Earth’s long-term, life-friendly environment.
- *Special life*: Once Earth’s original molten state settled, life appeared with “indecent rapidity,” in Gribbin’s words. But then not much happened for the next three billion years. More complex cells, known as eukaryotes, resulted from the chance merger of two primordial organisms, bacteria and archaea, but even after this crucial milestone event, not much happened for another billion years, until the Cambrian explosion roughly 540 million years ago. So the rise of complex life forms, much less intelligent creatures such as Homo sapiens, was hardly inevitable.
- *Special species*: Homo sapiens appeared about 300,000 years ago, but nearly went extinct 150,000 years ago, and again about 70,000 years ago. Thus our subsequent domination of the planet seems far from assured.

Gribbin concludes as follows:

As we put everything together, what can we say? Is life likely to exist elsewhere in the galaxy? Almost certainly yes, given the speed with which it appeared on Earth. Is another technological civilization likely to exist today? Almost certainly no, given the chain of circumstances that led to our existence. These considerations suggest we are unique not just on our planet but in the whole Milky Way. And if our planet is so special, it becomes all the more important to preserve this unique world for ourselves, our descendants and the many creatures that call Earth home.

As mentioned above, the question of extraterrestrial intelligent life, and why we have not yet found any, surely must rank at or near the top of the most significant scientific questions of all time. In his foreword to Webb’s book, noted British astronomer Martin Rees explains what is at stake:

]]>Maybe we will one day find ET. On the other hand, [Webb’s] book offers 75 reasons why SETI searches may fail; Earth’s intricate biosphere may be unique. That would disappoint the searchers, but it would have an upside: it would entitle us humans to be less “cosmically modest.” Moreover, this outcome would not render life a cosmic sideshow. Evolution may still be nearer its beginning than its end. Our Solar System is barely middle aged and, if humans avoid self-destruction, the post-human era beckons. Life from Earth could spread through the Galaxy, evolving into a teeming complexity far beyond what we can even conceive. If so, our tiny planet — this pale blue dot floating in space — could be the most important place in the entire Galaxy, and the first interstellar voyagers from Earth would have a mission that would resonate through the entire Galaxy and perhaps beyond.

]]>String theory is the name for the theory of mathematical physics which proposes that physical reality is based on exceedingly small “strings” and “branes,” embedded in 10- or 11-dimensional space. String theory has been proposed as the long-sought “theory of everything,” because it appears to unite relativity and quantum theory, and also because it is so “beautiful.”

Yet in spite of decades of effort by thousands of brilliant mathematical physicists, the field has yet to produce specific experimentally testable predictions. What’s more, hopes that string theory would result in a single crisp physical theory, pinning down unique laws and fundamental constants, have been dashed. Instead, the theory admits a huge number of different options, by one reckoning numbering more than $10^{500}$, which is often called the “string theory multiverse” or the “string theory landscape.”

A number of leading string theorists and other physicists view the huge number of different multiverse options as an advantage. They argue that this may offer a solution to the long-standing paradox of why the universe appears so freakishly fine-tuned for life. They reason that we should not be surprised to live in an extraordinarily life-friendly universe, because given $10^{500}$ possible universe designs, inevitably at least one (ours) beats the enormous odds and is favorable to life.

However, the string theory/multiverse hypothesis also has numerous detractors. Sabine Hossenfelder, in her 2018 book Lost in Math: How Beauty Leads Physics Astray, laments the fact that physicists have for so long pursued string theory, supersymmetry and other theories that lack solid underpinning in empirical science, just because they are considered “beautiful.” Similarly, physicist Lee Smolin, writing in his 2006 book The Trouble With Physics, declares,

A scientific theory that makes no predictions and therefore is not subject to experiment can never fail, but such a theory can never succeed either, as long as science stands for knowledge gained from rational argument borne out by evidence.

On 25 June 2018, a team of prominent string theorists led by Cumrun Vafa of Harvard University released a new paper that fundamentally questions the whole notion of a string theory-based multiverse with $10^{500}$ or more variations. If their results are confirmed by other physicists and experimental evidence, they could seriously challenge string theory and/or our entire conception of modern physics and cosmology.

The Vafa result and its implications are explained in detail in a very nice Quanta article by Natalie Wolchover. Here is a brief summary:

The main result of the Vafa team’s paper implies that as the universe expands, the vacuum energy density of empty space must decrease at least as fast as the rate given by a certain formula. The rule appears to hold in most string theory-based models of the universe, but it conflicts with two bedrocks of modern cosmology: the accelerating expansion of the universe due to dark energy, and the hypothesized “inflation” epoch in the first second after the big bang.

In particular, Vafa and his colleagues argue that “de Sitter universes,” with stable, constant and positive amounts of vacuum energy density, simply are not possible under rather general assumptions of string theory. Such universes in general, and ours in particular, would reside in the “swampland” of string theory because they are not mathematically consistent. And yet here we are…

Just as significantly, the Vafa team result draws into question the widely accepted “inflation” epoch of the very early universe, when the universe is thought to have expanded by a factor of roughly $10^{30}$ or more. The trouble is, Vafa’s result implies that the “inflaton field” of energy that drove inflation must have declined too quickly to have formed a smooth and flat universe like the one we reside in.

String theorists and other physicists are divided about the Vafa team result. Eva Silverstein of Stanford University, a leader in efforts to construct string theory-based models of inflation, believes the result is likely false, as does her husband Shamit Kachru, co-author of the 2003 KKLT paper that is the basis of string theory-based models of de Sitter universes. But others, such as Hirosi Ooguri of the California Institute of Technology, are inclined to believe the Vafa result, since other “swampland” conjectures have withstood challenges and are now on a very solid theoretical footing.

One possible way out is that the accelerating expansion of the universe is not due to an ever-present positive dark energy, as currently believed, but instead is due to quintessence, a hypothesized energy source that gradually decreases over tens of billions of years. If the quintessence hypothesis is true, it could revolutionize physics and cosmology.

The quintessence hypothesis will be tested in several new experiments currently underway, and some others scheduled for the future, which will analyze more carefully whether the accelerating expansion of the universe is constant or variable. One of these experiments is the Dark Energy Survey, currently underway, which analyzes the clumpiness of galaxies. The initial results, released in August 2017, find the universe is 74% dark energy and 21% dark matter, with everything else (stars, galaxies, planets and us) in the remaining 5%, all pretty much consistent with our current understanding.

Another related experiment is the Wide Field Infrared Survey Telescope (WFIRST) system, which is specifically designed to study dark energy and infrared astrophysics. The Euclid telescope, currently in development, will investigate even more accurately the relationship between distance and redshift that is at the heart of modern cosmology.

As mentioned above, the theory of cosmic inflation is also challenged by the Vafa team result. Along this line experimental systems such as the Simons Observatory will search for signatures and other evidence of cosmic inflation. This evidence will be scrutinized carefully, though, in the wake of the widely hailed 2014 announcement of evidence for inflation that subsequently bit the dust, so to speak, in the sense that the results were explained by dust in the Milky Way.

However these results turn out, they are likely to have earthshaking implications for the fields of mathematical physics, experimental physics and cosmology.

If Vafa’s results are confirmed as correct, and the dark energy explanation for the accelerating expansion of the universe and the inflation epoch of the early universe are reaffirmed by experimental evidence, then this may spell doom for string theory.

Either way, Vafa’s conjecture has certainly roused the physics community like few other developments of recent years. Vafa explains,

]]>Raising the question is what we should be doing. And finding evidence for or against it — that’s how we make progress.

]]>The Fields Medals are announced at the every-four-years International Congress of Mathematicians (ICM) meeting, which this year is being held in Rio de Janeiro. The awards, made by the International Mathematical Union to a maximum of four exceptional mathematicians under the age of 40, are often considered the “Nobel Prize” of mathematics.

Here is some information on this year’s awardees and their work:

- Caucher Birkar. Birkar was born and raised in a very poor farming village in the Kurdish region of Iran. But eventually he managed to enroll at the University of Tehran. He recalls seeing photos of various Fields medalists on the walls and wondering, “Will I ever meet one of these people?” Little did he know…
In his senior year, while traveling in England, Birkar sought and received political asylum. He subsequently enrolled in the University of Nottingham, where he excelled in his mathematical studies. He currently is on the faculty at the University of Cambridge.

The work for which Birkar was recognized is in the area of “algebraic geometry,” which studies geometric objects through the algebraic equations that define them. The set of solutions common to a given set of equations is known as an “algebraic variety.” Birkar used the theory of “birational transformations” to show that the class of geometric objects known as “Fano varieties” forms neat, orderly families in every dimension.

For more details on Birkar and his work, see this article by Kevin Hartnett.

- Alessio Figalli. Figalli was born in Rome. Although he liked mathematics from an early age, he didn’t seriously study the subject until his third year of high school. He was admitted into the Scuola Normale Superiore of Pisa, a university devoted to mathematically and scientifically gifted students, but struggled at first because other students were more advanced. But his talent quickly became apparent, as he caught up with and then exceeded the other students. He currently is on the faculty of ETH Zurich.
Figalli became interested in the “optimal transport problem,” namely to find the most efficient path and means to move materials from one location to another. This is closely related to problems of “minimal surfaces,” which have long been studied by Italian mathematicians.

In one key paper published in 2010, Figalli, together with his colleagues Francesco Maggi and Aldo Pratelli, presented a proof of the stability of energy-minimizing shapes like crystals and soap bubbles. The proof takes the form of a precise inequality, which implies that if one increases the energy of a system such as a bubble or crystal by a certain amount, the resulting shape will deviate from the original shape by no more than a related amount. They further showed that this inequality is optimal in a certain precise sense.

For more details on Figalli and his work, see this article, also by Kevin Hartnett.

- Akshay Venkatesh. Venkatesh was born in New Delhi, India, but grew up in Perth, Australia. Although he describes his childhood as a “fairly normal suburban experience,” he did win some international physics and math Olympiad competitions at ages 11 and 12. He later attended Princeton University, and worked under the well-known number theorist Peter Sarnak.
Venkatesh has always viewed himself somewhat harshly. He described his own doctoral dissertation as “mediocre,” and had even considered leaving the field once, working for his uncle on a machine learning startup one summer. But then he was offered a C.L.E. Moore faculty position at MIT, and he accepted. He later wondered why his advisor (Sarnak) had written such a glowing recommendation. A fellow mathematician explained, “Sometimes, people see things in you that you don’t see.”

One of Venkatesh’s papers dealt with L-functions, which generalize the well-known Riemann zeta function. He employed ideas from the theory of dynamical systems to establish subconvexity estimates for a huge family of L-functions.

For more details on Venkatesh and his work, see this article by Erica Klarreich.

- Peter Scholze. Scholze started teaching himself college-level mathematics at the age of 14, while attending a Berlin high school that specialized in mathematics and science. He recalls being fascinated by Andrew Wiles’ 1995 proof of Fermat’s Last Theorem. He confessed that although he was eager to understand the proof, at first he had much to learn. So for the next few years he studied some of the background theory, including modular forms and elliptic curves, together with numerous other techniques from number theory, algebra, geometry and analysis.
Scholze began his research in arithmetic geometry, namely the use of geometric tools to study integer solutions to multivariate polynomial equations, e.g., $x y^{2} + 3y = 5$. He found, as others had earlier, that it is useful to study these problems using the p-adic numbers, which are like the real numbers except that two numbers are considered close when their difference is divisible many times over by the prime p. Scholze ultimately proved a deep result known as the weight-monodromy conjecture.

Scholze’s results expanded the scope of reciprocity laws, which govern the behavior of polynomials under simple modular arithmetic (e.g., 5 + 10 mod 12 = 3). These reciprocity laws generalize the quadratic reciprocity law that was a favorite of Gauss. Twentieth-century mathematicians found surprising links between these laws and hyperbolic geometry, links that have more recently been elaborated as the “Langlands program,” an instance of which was the key to the proof of Fermat’s Last Theorem. Scholze has shown how to extend the Langlands program to a much wider range of structures in hyperbolic spaces.
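The p-adic notion of “closeness” can be illustrated with a short sketch (the function names here are our own, purely for illustration): two integers are p-adically close when their difference is divisible by a high power of the prime p.

```python
def padic_valuation(n: int, p: int) -> int:
    """Return the largest k such that p**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def padic_distance(a: int, b: int, p: int) -> float:
    """p-adic distance |a - b|_p = p**(-v_p(a - b)), and 0 when a == b."""
    if a == b:
        return 0.0
    return p ** -padic_valuation(a - b, p)

# In the 3-adic metric, 2 and 83 are very close: 83 - 2 = 81 = 3**4,
# so their distance is 3**(-4), roughly 0.0123.
print(padic_distance(2, 83, 3))

# By contrast, 2 and 3 are far apart: 3 - 2 = 1 is not divisible by 3.
print(padic_distance(2, 3, 3))   # 1.0
```

Under this metric, a number like 81 is “small” 3-adically even though it is large in the ordinary sense, which is what makes the p-adic world so different from the real one.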

For more details on Scholze and his work, see this article, also by Erica Klarreich.

Congratulations to all of this year’s winners!

Continue reading The rise of pay-to-publish journals and the decline of peer review

]]>In a previous Math Scholar blog, we lamented the decline of peer review, as evidenced by the surprising number of papers, published in supposedly professional, peer-reviewed journals, claiming that Pi is not the traditional value 3.1415926535…, but instead is some other value. In the 12 months since that blog was published, other papers of this most regrettable genre have appeared.

As a single example of this genre, the author of a 2015 paper, which appeared in the International Journal of Engineering Sciences and Research Technology, states, “The area and circumference of circle has been estimated till now approximately. Millions of mathematicians all these years of human civilization have tried and failed.” This author then presents formulas, which he derives from a simple geometric argument, for the area of a circle and the circumference of a circle (both involving only simple quadratic expressions). He then asserts that Pi = (14 – Sqrt(2)) / 4 = 3.14644660941…, which he describes as “exact” and “kindly is revealed by God for the benefit of humanity,” as opposed to the merely “approximate” values that have been used in conventional mathematical literature.

Needless to say, this author is hopelessly mistaken. Pi really does equal 3.1415926535…, is given by any of hundreds of known formulas and iterative algorithms, and most certainly is not given by any algebraic expression, much less the simple one presented by the author, which differs from the true value of Pi beginning with the third digit. Numerous other examples of papers “proving” variant values of Pi are given in the previous Math Scholar blog.
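A few lines of code suffice to check how far the claimed value strays from the true one:

```python
import math

# The 2015 paper claims Pi = (14 - sqrt(2)) / 4 exactly.
claimed = (14 - math.sqrt(2)) / 4

print(f"Claimed value: {claimed:.11f}")    # approximately 3.14645
print(f"True value:    {math.pi:.11f}")    # approximately 3.14159
print(f"Difference:    {claimed - math.pi:.6f}")
# The difference is about 0.00485, i.e., the two values already
# disagree in the third decimal place.
```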

There have always been, and probably always will be, persons such as the above author who doubt some well-established fact of mathematics, physics or other field, and who attempt to have their writings published. Doubtless many readers of this blog have received email from persons who are promoting such material.

The central question here is: **How did this and similar papers pass peer review?**

Peer review is the bedrock of modern science. Without rigorous peer review, by well-qualified reviewers, modern mathematics and science could not exist. For most reputable journals and conferences, reviewers are instructed to rate a submission on criteria such as:

1. Relevance to the journal or conference’s charter.
2. Clarity of exposition.
3. Objectivity of style.
4. Acknowledgement of prior work.
5. Freedom from plagiarism.
6. Theoretical background.
7. Validity of reasoning.
8. Experimental procedures and data analysis.
9. Statistical methods.
10. Conclusions.
11. Originality and importance.

Needless to say, the paper listed above, and others of its ilk, should never have been approved for publication, since such material immediately violates item 7, not to mention items 3, 4, 6 and others.

As we emphasized in the earlier Math Scholar blog, the editors of journals who have accepted papers with nonsense such as “proofs” of variant values of Pi certainly cannot plead ignorance. No editor or reviewer with even a first-year undergraduate background in mathematics could possibly fail to notice the astounding claim that the traditional value of Pi is incorrect. Indeed, it is hard to imagine a comparable claim in other fields:

- A claim, in a submission to a physics journal, that the speed of light is substantially incorrect.
- A claim, in a submission to a chemistry journal, that atoms and molecules do not really exist.
- A claim, in a submission to a biology journal, that evolution never really happened.
- A claim, in a submission to a geology journal, that the earth is only a few thousand years old.

At the very least, the assertion that the traditional value of Pi is incorrect would certainly have to be considered an “extraordinary claim,” which, as Carl Sagan once reminded us, requires “extraordinary evidence.” And it is quite clear that none of the above-mentioned papers have offered airtight arguments, presented in highly professional and rigorous mathematical language, to justify such a claim. Thus these manuscripts should have either been rejected outright, or, at the very least, referred to well-qualified mathematicians for rigorous review.

Also, the fervor in this paper and others of this genre should raise a red flag. There is absolutely no place in modern mathematics and science literature for fervor of any sort, whether it be religious, political or philosophical (see item #3 in the list of peer review standards above), since any good scholar should be prepared to discard his or her pet theory, once it has been clearly refuted by more careful reasoning or experimentation.

So how could such an egregious error of manuscript review have occurred? The answer is, regrettably, “follow the money.”

Indeed, the above journal, and other journals which have presented variant values of Pi, are all listed on Beall’s list of pay-to-publish journals. Many of these journals have very loose standards of publication, with only a superficial review at best, in return for charging a fee to authors for having their papers published on the journals’ websites. This is despite the journals’ advertised claims that they are professional, international, “peer-reviewed” journals.

The number of such journals has grown explosively in the past few years. As of the present date, Beall’s list includes roughly 8,700 journals, a number that is approximately **double** that of just one year ago. The result has been a flood of “atrocious” papers, according to researchers quoted in a recent Economist report.

In 2013, John Bohannon, a journalist with a PhD in molecular biology, performed an unusual experiment: he wrote several versions of a bogus research paper that claimed that a biomolecule commonly found in lichens inhibits the growth of cancer. These manuscripts, in Bohannon’s words, employed “laughably bad” methodology and baldly stated that the biomolecule was a “promising new drug,” without any clinical trial or other evidence. The manuscripts were attributed to fictional researchers at fictional institutions.

Of the 121 journals that Bohannon chose from a list of blacklisted journals, 69% accepted the manuscript they were sent, to be published for a fee. What’s worse, even some non-blacklisted open-access journals also accepted the submission.

One does not have to look very far afield for an explanation of this phenomenon: Many academics, under huge pressure to “publish or perish,” are all too eager to publish in a journal of questionable merit, even if a fee is required, rather than wait for the possibly one-year or more review process from a reputable journal, which may well result in a rejection. Besides, if the fact that the journal has a poor reputation is ever pointed out, the author(s) of the manuscript can feign ignorance.

A deeper explanation at work here is that most institutions, in hiring and promotion decisions, do not exercise due diligence in checking lists of publications against lists of journals known to have lax standards.

Recently Derek Pyne, an economist at Thompson Rivers University in Canada, published a study finding that many of his business school’s administrators, and most of its economics and business faculty, had published in journals on Beall’s list. And such publications seemed to pay off: of those researchers who had published in blacklisted journals, 56% had subsequently received at least one research grant from the school. The school did not appreciate this report, and Pyne was censured.

Along this line, Beall himself was pressured to cease compiling his list by his institution; his list is now being kept by another custodian who refuses to be named.

Obviously the mathematical community, and in fact the entire scientific community, needs to tighten standards for peer review and to oppose any form of “peer-reviewed” publication that involves only a perfunctory review. But it does not seem useful or even possible to simply “boycott” every open-access or pay-to-publish journal. Among other things, laws recently passed in the U.S. and Canada require taxpayer-funded research to be published in open-access journals. This was done with the best of intentions, even though the outcome is decidedly problematic.

But there is something positive that we can all do — write reviews!

Editors of many journals, covering a wide range of different fields, are reporting that an increasing number of requests for reviews are being declined, or (arguably worse) simply ignored. The present author serves on the editorial board of several journals, some in the field of mathematics and others in computer arithmetic and high-performance computing. The experience is the same in all of these journals — more and more rejections of requests to write reviews. In some cases, the present author has needed to send 15 or more requests to obtain just three reviews.

But asking researchers to write more reviews also requires that department chairs, deans and others in administrative roles recognize the importance of writing reviews. If they do not, then given the huge time pressures under which many researchers operate, reviewing papers will continue to have the lowest priority.

In any event, there is a real danger that as a growing number of papers are published with erroneous or questionable results, other papers may cite them, thus starting a food chain of scholarship that is, at its base, mistaken. Such errors may only be rooted out years after legitimate mathematicians and scientists have cited and applied their results, and then labored in vain to understand paradoxical conclusions.

One way or the other, the plague of pay-to-publish journals with highly questionable or even nonexistent “peer review” processes must end. The future of all published scientific research depends on it.

Continue reading Does beautiful mathematics lead physics astray?

]]>In a new book, Lost in Math: How Beauty Leads Physics Astray, Sabine Hossenfelder reflects on her career as a theoretical physicist. She acknowledges that her colleagues have produced “mind-boggling” new theories. But she is deeply troubled by the fact that so much work in theoretical physics today is disconnected from empirical reality, yet is excused because the theories themselves, and the mathematics behind them, are “too beautiful not to be true.”

Hossenfelder notes that there have been numerous instances in the past when scientists’ over-reliance on “beauty” has led them astray. Newton’s clockwork universe, with seemingly self-evident laws of motion and gravity operating on an unchanging grid of space and time, seemed just too natural and beautiful to ever be doubted. Sadly, this mindset retarded acceptance of relativity and quantum mechanics for decades.

Even quantum mechanics has its detractors, because much of it seems so counterintuitive. To this day, many physicists believe that there must be some more “reasonable” underlying theory. Yet even the most exacting recent tests continue to precisely confirm the predictions of quantum theory.

Hossenfelder argues that too many physicists today, like their counterparts of previous eras, cling to theories that have been seemingly rejected by experimental evidence, or which have extrapolated far beyond what can be reasonably expected to be tested by experimentation in any foreseeable time frame. The usual reason that researchers stick by these theories is that the mathematical techniques behind them are judged “too beautiful not to be true,” or for some other largely aesthetic (and thus unscientific) reason.

In this regard, Hossenfelder joins several other authors of recent years who have expressed concern about the prevailing “orthodox” direction of mathematical physics. For example, physicist Lee Smolin declares in his 2006 book The Trouble With Physics:

We physicists need to confront the crisis facing us. A scientific theory [the string theory/multiverse] that makes no predictions and therefore is not subject to experiment can never fail, but such a theory can never succeed either, as long as science stands for knowledge gained from rational argument borne out by evidence. There needs to be an honest evaluation of the wisdom of sticking to a research program that has failed after decades to find grounding in either experimental results or precise mathematical formulation. String theorists need to face the possibility that they will turn out to have been wrong and others right.

Hossenfelder cites these examples, among others:

- Supersymmetry is the notion that each particle has a “superpartner.” Supersymmetry was proposed in the 1970s to complete the collection of known particles. It is appealing to many physicists because it explains how the two known types of particles, fermions and bosons, belong together — supersymmetry proposes that the laws of nature remain unchanged when fermions are exchanged with bosons. More recently, supersymmetry has been proposed to explain why the Higgs boson has the mass it has. Hossenfelder notes that thousands of her colleagues have bet their careers on it.
But no such superparticles have ever been found, even in the latest high-energy experiments on the Large Hadron Collider. Undeterred, many physicists just “know” supersymmetry must be true and continue to pursue it.

- String theory is the notion that all of physical reality is derived from the harmonic properties of exceedingly small strings and higher-dimensional “branes,” and that reality is really 10- or 11-dimensional, with extra dimensions curled up into submicroscopically small configurations, much like a hose that appears one-dimensional from a distance but admits travel around its circular cross-section at close quarters.
String theory has been proposed as the long-sought “theory of everything,” because it appears to unite relativity and quantum theory, but mostly because it is so “beautiful.” Yet in spite of decades of effort by literally thousands of brilliant mathematical physicists, the field has yet to produce any specific experimentally testable prediction, short of hypothesizing galaxy-sized accelerators. What’s more, hopes that string theory would result in a single crisp physical theory, pinning down unique laws and unique values of the fundamental constants, have been dashed. Instead, the theory admits a huge number of different options, corresponding to different Calabi-Yau spaces, which by one reckoning number more than $10^{500}$.

- The multiverse is the notion that ultimate reality consists of a huge and possibly infinite collection of other universes (see previous item, for instance), each forever separate and incommunicado from our own universe. The multiverse arose out of attempts to explain the apparent fine-tuning of the laws and constants of physics. Researchers explain fine-tuning by saying, in effect, that if our particular universe were not fine-tuned pretty much as we presently see it, we could not exist and thus would not be here to talk about it; end of story.
But many other scientists consider this type of “anthropic principle” argument highly controversial, an abdication of empirically testable physics. Is this really the best explanation that the field can produce?

Hossenfelder’s concern is personal as well as philosophical. She honestly does not see why “hidden rules” of “beauty” or aesthetics should have any place in the field. As she writes in the introduction,

The hidden rules served us badly. Even though we proposed an abundance of new natural laws, they all remained unconfirmed. And while I witnessed my profession slip into crisis, I slipped into my own personal crisis. I’m not sure anymore that what we do here, in the foundations of physics, is science. And if not, then why am I wasting my time with it?

She describes her personal odyssey, in which she seeks out some of the most prominent minds of the field to understand how they see these issues. Here are some of the leading figures whom she interviews:

*Gian Francesco Giudice*. He is the head of the theory division at CERN (the European Organization for Nuclear Research), located on the Swiss-French border just north of Geneva. It is the home of the Large Hadron Collider (LHC), currently the world’s most powerful accelerator, and it was here that the discovery of the long-sought Higgs boson was announced six years ago. Giudice argues that “The sense of beauty of a physical theory must be something hard-wired in our brain and not a social construct. … When you stumble on a beautiful theory you have the same emotional reaction that you feel in front of a piece of art.”

Hossenfelder acknowledges his comments, but fails to see why it matters. She writes:

I doubt my sense of beauty is a reliable guide to uncovering fundamental laws of nature, laws that dictate the behavior of entities that I have no direct sensory awareness of, never had, and never will have. For it to be hardwired into my brain, it ought to have been beneficial during natural selection. But what evolutionary advantage has there ever been to understanding quantum gravity?

*Nima Arkani-Hamed*. Nima Arkani-Hamed, a Canadian-American theoretical physicist of Iranian descent, argues that mathematical consistency helps to rule out many specious theories, in the same way as experimentation:

It’s true that in most other parts of science what’s needed to check whether ideas are right or wrong are new experiments. But our field is so mature that we get incredible constraints already by old experiments. The constraints are so strong that they rule out pretty much everything you can try. If you are an honest physicist, 99.99 percent of your ideas, even good ideas, are going to be ruled out, not by new experiments but already by inconsistency with old experiments.

However, Hossenfelder notes that one can never prove any mathematics to be a true description of nature, because in the end mathematics only encapsulates provable truths about mathematical structures. The claim that these structures correspond to physical reality is a separate assertion that must always be tested and confirmed by experiment.

*Steven Weinberg*. Weinberg, the well-known Nobel laureate at the University of Texas, Austin, recognizes many of the difficulties that Hossenfelder writes about, but still insists on some value in seeking “beautiful” theories. As he told Hossenfelder,

Our sense of beauty has changed. And … the beauty that we seek now, not in art, not in home decoration — or in horse breeding — but the beauty we seek in physical theories is a beauty of rigidity. We would like theories that to the greatest extent possible could not be varied without leading to impossibilities, like mathematical inconsistencies.

As before, Hossenfelder counters that mathematical consistency is a fairly weak requirement: “There are many things that are mathematically consistent that have nothing to do with the [real] world.” Weinberg acknowledges this point, but still hopes that we can find a theory that is unique, in the sense that it is the only mathematically consistent theory that is sufficiently rich to permit diverse phenomena, including life. Hossenfelder can see the appeal of such a theory, but doesn’t see how naturalness or mathematical beauty can be a useful guide.

Weinberg also commented on the foundations of quantum mechanics. Weinberg acknowledged that we really don’t have any satisfactory theory of quantum mechanics. But Weinberg is largely content with the “shut up and calculate” approach to these problems — quantum mechanics may seem ugly and “unnatural,” but it works.

*Frank Wilczek*. The well-known physicist / cosmologist Frank Wilczek is author of A Beautiful Question: Finding Nature’s Deep Design. With regards to the question of whether nature embodies beautiful ideas, Wilczek replies with an emphatic yes. Nonetheless, he acknowledges that currently popular theories such as string theory have serious problems:

I guess the most serious candidate for a better theory — better in the sense of bringing together gravity with the other interactions in an intimate way and solving the [problem with quantum gravity] — is string theory. [But] to me it’s not clear what the theory is. It’s kind of a miasma of ideas that hasn’t yet taken shape, and it’s too early to say whether it’s simple or not — or even if it’s right or not. Right now it definitely doesn’t appear simple.

Wilczek acknowledges the fundamental need for data, saying “I don’t think we should compromise on this idea of post-empirical physics. I think that’s appalling, really appalling.” With regards to supersymmetry, he recognizes that the theory is in deep trouble too:

A lot of things are fine-tuned and we don’t know why. I would definitely not believe in supersymmetry if it wasn’t for the unification of gauge couplings, which I find very impressive. I can’t believe that that’s a coincidence. Unfortunately, this calculation doesn’t give you precise information about the mass scale where [SUSY] should show up. But the chances of unification start getting worse when [SUSY] partners haven’t shown up around 2 TeV.”

One final interesting commentary in the book is where Hossenfelder outlines some of the changes that have happened to the scientific enterprise over the past few decades. For one thing, there are many more physicists than in the past. The number of new physics PhDs in the U.S. has increased from about 20 per year in 1900 to over 2000 per year in 2012. Since 1970, the total number of papers in the physics field has increased by about 3.8 to 4.0% per year, i.e., a doubling time of about 18 years. And as the size of the physics literature has grown, it has bifurcated into numerous separate subfields, most of which are too specialized to be easily read by physicists outside the subdiscipline. Further, the number of authors on a typical physics paper has increased, as has the number of publications for a single physicist — i.e., publish or perish. Many physicists spend an increasing amount of time writing proposals, and fewer than ever are employed in solid, tenured positions, resulting in increased stress and less time to do real research.
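The quoted doubling time follows from the growth rate by a one-line calculation: a quantity growing at r percent per year doubles in log 2 / log(1 + r/100) years. A quick check:

```python
import math

# A quantity growing at r percent per year doubles in
# log(2) / log(1 + r/100) years.
for rate in (3.8, 4.0):
    doubling = math.log(2) / math.log(1 + rate / 100)
    print(f"{rate}% annual growth -> doubling roughly every {doubling:.1f} years")
# Both rates give a doubling time of about 18 years.
```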

Given these trends, it is perhaps not surprising that younger physicists quickly gravitate to well-worn paths, rather than aggressively challenging established orthodoxy.

In spite of Hossenfelder’s angst, she ends the book on a positive note. She recognizes that the laws of nature as we presently understand them are incomplete, and to produce a truly unified theory researchers will need to overhaul either general relativity or quantum physics or both. Some say that physics was the success story of the last century, but she is convinced that the next major breakthrough will happen in this century. “It will be beautiful,” whatever that will mean.

[Added 1 July 2018: John Horgan, a senior writer for Scientific American, has posted a review / synopsis of Hossenfelder’s book: Here.]

Continue reading Fermi’s paradox and the Copernican principle

]]>As we have discussed on this forum before (see, for example, previous Math Scholar blog), Fermi’s paradox looms as one of the most profound and puzzling conundrums of science: Given that the universe is presumed to be teeming with intelligent life and technological civilizations, why do we see no evidence of their existence? Although the search for signals and other evidence from extraterrestrial (ET) societies continues (and is accelerating with new facilities and funding), nothing has been found in over 50 years.

Ever since Fermi first posed the paradox in 1950, scientists have proposed numerous explanations, but none of these “solutions” is very convincing (see the previous Math Scholar blog for more complete details):

- *They exist, but are under strict orders not to communicate with Earth.* **Rejoinder**: In numerous vast ET civilizations, each presumably with billions of diverse individuals, it is hardly credible that a galactic society could impose a global ban on communication to Earth that is absolutely 100% effective. Once a signal has been sent to Earth, it cannot be called back, within known laws of physics.

- *They exist, but have lost interest in scientific research, exploration and expansion.* **Rejoinder**: Since Darwinian evolution strongly favors organisms that think, explore and expand, it is hardly credible that *every* individual in *every* ET civilization has lost interest in scientific research, exploration and expansion.

- *They exist, but have no interest in a primitive, backward society such as ours.* **Rejoinder**: Perhaps 99.99% of ETs are not interested in primitive societies such as ours, but, as before, it is hardly credible that *every* individual in *every* ET civilization has no interest. Perhaps 99.99% of humans have no interest in ants; but thousands do, and have carefully studied every known species.

- *They exist, but have progressed to more sophisticated communication technologies.* **Rejoinder**: This does not apply to signals that are specifically targeted to societies such as ours, in a form (optical, microwave) that could be easily recognized by a newly technological society. Again, it is hardly credible that absolutely no individual, in any ET society, has attempted such communication with Earth.

- *They exist, but are not aware of our existence yet, since radio and TV signals have not reached them.* **Rejoinder**: Our atmosphere has contained methane, oxygen and other chemical signs of life for at least three billion years. Images of Earth would have shown dinosaurs and countless other large species for at least 300 million years, bipedal hominins for at least 5 million years, and humans for at least 200,000 years. Large human structures (Mesopotamia, Egypt, China, Rome) would have been visible for at least 5,000 years, with urban lights for at least 2,000 years (particularly in the past 200 years). Finally, atmospheric carbon dioxide has been on the rise for at least 200 years.

- *They exist, but travel and communication are too difficult.* **Rejoinder**: Just in the past few years, we have made remarkable advances in numerous space-related technologies: novel energy sources, new propulsion systems, new space transportation vehicles, fleets of nanocraft and von Neumann probes (which could explore the entire Milky Way in a million years or so), supercomputers that can perform over 100 quadrillion operations per second, and the emerging fields of LIGO astronomy and gravitational lens astronomy, to name a few. If we are on the verge of deploying such technologies today, what is stopping societies and even individuals that are thousands or millions of years more advanced than us?

- *Civilizations like us invariably self-destruct before becoming a space-faring society.* **Rejoinder**: In 200 years of technological adolescence, we have not yet destroyed ourselves through a nuclear, environmental or biological catastrophe, and in fact researchers in the past few years have made significant advances in supercomputer simulations to foresee and help prevent such calamities. Thus it is hardly credible that *every* technological society *invariably* self-destructs before it becomes a space-faring society, with no exceptions whatsoever. Besides, within 10-20 years we will have outposts on the Moon, Mars and elsewhere, so that human society will be largely impervious to calamities back on Earth.

- *Earth is a unique planet with characteristics fostering a long-lived biological regime leading to intelligent life.* **Rejoinder**: Perhaps, although many discoveries in recent years point in the *opposite* direction: the universe is now known to contain 100+ billion galaxies; the Milky Way is now known to contain 100+ billion stars; and thousands of exoplanets have been found, with many in a “habitable zone” that may support life.

A new paper (see also this Quartz synopsis) in the *Proceedings of the Royal Society of London* examines Fermi’s paradox from a mathematical perspective, focusing on the Drake equation, which estimates the number of civilizations in the Milky Way galaxy with which we could potentially communicate:

$N = R^* \cdot f_p \cdot n_e \cdot f_l \cdot f_i \cdot f_c \cdot L$, where

- $N$ = number of civilizations in our galaxy that can communicate with us
- $R^*$ = average rate of star formation per year in the galaxy
- $f_p$ = fraction of those stars that have planets
- $n_e$ = average number of planets that can support life, per star that has planets
- $f_l$ = fraction of the above that eventually develop life
- $f_i$ = fraction of the above that eventually develop intelligent life
- $f_c$ = fraction of civilizations that develop technology that signals their existence into space
- $L$ = length of time such civilizations release detectable signals into space.

Researchers using this formula have obtained estimates as high as 100,000,000. But the authors of the new paper point out that the estimates that have been published to date for these parameters have been “point estimates,” i.e., single numbers, which do not correctly convey the substantial uncertainty they entail. They observe, for instance, that the probability of the emergence of life can only be vaguely estimated at the present time, because although intriguing advances have been made (e.g., the “RNA world” scenario), researchers have not yet produced a fully detailed account of how life arose on Earth. Other uncertainties surround key milestones such as the emergence of prokaryotes, the emergence of eukaryotes, the development of multicellular life, and the development of technology.
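To see how point estimates yield such large numbers, here is a minimal evaluation of the formula. The parameter values below are purely illustrative, optimistic choices of the kind that historically produced large estimates of $N$; they are not taken from any published study.

```python
# Hypothetical, optimistic point estimates for the Drake equation
# (illustrative values only, not from the paper discussed above).
R_star = 7.0    # average star-formation rate in the Milky Way (stars/year)
f_p    = 0.5    # fraction of stars with planets
n_e    = 2.0    # habitable planets per star that has planets
f_l    = 0.5    # fraction of those that eventually develop life
f_i    = 0.1    # fraction of those that develop intelligent life
f_c    = 0.1    # fraction that emit detectable signals
L      = 10000  # years such signals remain detectable

N = R_star * f_p * n_e * f_l * f_i * f_c * L
print(round(N, 1))  # prints 350.0
```

A single pessimistic guess for any one factor (say, $f_l$) shrinks $N$ by orders of magnitude, which is precisely why the paper's treatment of uncertainty matters.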

When these authors reformulated Drake’s equation with quantitative estimates of uncertainty for each of the above parameters, using Monte Carlo simulations, they found that the estimated number of technological civilizations in the Milky Way is only roughly 0.32 (i.e., less than one in the entire galaxy). When other estimates of uncertainty are incorporated, the figure drops below $10^{-10}$ (i.e., less than one in the observable universe). They emphasize that this does not mean that we are alone with any certainty (in our galaxy or observable universe), but merely that this is “very scientifically plausible and should not surprise us.”
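The authors’ reformulation can be sketched in a few lines: replace each point estimate with a broad distribution and sample repeatedly. The log-uniform ranges below are illustrative assumptions chosen by the present writer to mimic the huge uncertainty in abiogenesis; they are not the authors’ actual inputs, so the numbers produced are qualitative only.

```python
import math
import random

# Minimal Monte Carlo sketch (not the paper's actual model): draw each
# Drake parameter log-uniformly over a hypothetical plausible range.
def log_uniform(lo, hi):
    """Sample log-uniformly between lo and hi (both > 0)."""
    return math.exp(random.uniform(math.log(lo), math.log(hi)))

random.seed(0)
samples = []
for _ in range(100_000):
    N = (log_uniform(1, 100)        # R*: star-formation rate (stars/yr)
         * log_uniform(0.1, 1)      # f_p: stars with planets
         * log_uniform(0.1, 10)     # n_e: habitable planets per such star
         * log_uniform(1e-30, 1)    # f_l: abiogenesis -- hugely uncertain
         * log_uniform(1e-3, 1)     # f_i: intelligence
         * log_uniform(1e-2, 1)     # f_c: detectable technology
         * log_uniform(1e2, 1e10))  # L: signal lifetime (years)
    samples.append(N)

# Fraction of simulated "universes" with fewer than one communicating
# civilization in the galaxy -- the paper's key qualitative point.
frac_empty = sum(N < 1 for N in samples) / len(samples)
print(f"P(N < 1, i.e. plausibly alone in the galaxy) = {frac_empty:.2f}")
```

The point the simulation illustrates is that multiplying wide distributions produces a heavily skewed result: the mean of $N$ can remain large while, simultaneously, a substantial share of the probability mass sits at $N < 1$.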

Even setting aside these authors’ analysis, the proposed “solutions” that have been mentioned in the literature are not very convincing, to say the least — see above. Indeed, with every new discovery of extrasolar planets with a potential life-friendly environment, and, even more so, with every new advance of computer, communication and space technology, the “easy” answers to Fermi’s paradox recede further away.

Given the unsatisfactory status of these proposed explanations, some researchers are now beginning to question the “intelligent life is ubiquitous” orthodoxy, saying out loud that we may be alone in a rather large radius around the Earth, possibly including the entire Milky Way or even the entire observable universe. Max Tegmark, a prominent Swedish-American cosmologist, recently wrote, “[T]his assumption that we’re not alone in our Universe is not only dangerous but also probably false.” Paul Davies, a prominent British-American physicist, added, “[M]y answer is that we are probably the only intelligent beings in the observable universe and I would not be very surprised if the solar system contains the only life in the observable universe.” Similarly, John Gribbin, a prominent British astrophysicist, declared,

They are not here, because they do not exist. The reasons why we are here form a chain so improbable that the chance of any other technological civilization existing in the Milky Way Galaxy at the present time is vanishingly small. We are alone, and we had better get used to it.

Yet most scientists still insist on the traditional view of a universe teeming with extraterrestrials. For example, noted physicist/cosmologist Lawrence Krauss recently opined in a PBS broadcast that there must be countless technological societies in the cosmos; the reason they have not yet reached out to Earth is simply that signals from our radio and TV transmissions have not yet reached them, so that they are unaware of our existence (i.e., solution #5 above; see rejoinder). Along this line, astrophysicist Adam Frank writes in his new book Light of the Stars: Alien Worlds and the Fate of the Earth that it is “highly likely” that many other planets are or have hosted technologically advanced civilizations, but that they have faced challenges such as climate change and might not have survived (i.e., solution #7 above; see rejoinder).

The Copernican principle, or principle of mediocrity, as it is sometimes called, holds that there is nothing unusual or atypical about humanity — not our place on Earth (merely one of millions of biological species), not our place in the solar system (merely one of numerous planets and other objects orbiting the sun), not our place in the Milky Way (merely one of billions of similar star systems), and not our place in the larger universe (merely one of billions of similar galaxies). It implies that the universe must be teeming with intelligent, technological societies.

Scientists and philosophers alike agree that the Copernican principle has guided scientific progress for centuries — from Newton’s universal law of gravitation, to Darwin’s evolution, to Einstein’s relativity, to quantum physics, to the standard model and, finally, to current theories of cosmology. The Copernican principle is, in the minds of many scientists, essentially self-evident, and “too beautiful not to be true.”

But is the Copernican principle really such an infallible guide to truth? Does its presumed “beauty” (which is an aesthetic, not a scientific concept) really matter?

In a provocative new book, Lost in Math: How Beauty Leads Physics Astray, quantum physicist Sabine Hossenfelder argues that the scientific world in general, and the field of physics in particular, has repeatedly clung to notions that have been rejected by experimental evidence, or has pursued theories far beyond what can be tested by experimentation, mainly because these theories and the mathematics behind them were judged “too beautiful not to be true.” Examples cited by Hossenfelder include:

- *Newtonian physics*. Newton’s clockwork universe on an unchanging grid of space and time seemed just too natural and beautiful, and this mindset retarded acceptance of relativity and quantum mechanics for decades.
- *Supersymmetry*. Supersymmetry, the notion that each particle has a “superpartner,” was originally proposed in the 1970s, and more recently was proposed to explain why, for example, the Higgs boson has the mass it has. But no such superparticles have been found, even in the latest experiments on the Large Hadron Collider. Undeterred, many physicists just “know” it must be true.
- *String theory*. String theory has been proposed as the long-sought “theory of everything” uniting relativity and quantum theory, mostly because it is so “beautiful.” But in spite of decades of effort, by literally thousands of brilliant mathematical physicists, no experimentally testable prediction or test has been produced. What’s more, hopes that string theory would result in a single crisp physical theory, pinning down unique laws and unique values of the fundamental constants, have been dashed. Instead the theory admits a huge number of different options, by one reckoning $10^{500}$ in number.
- *Multiverse*. Attempts to explain the apparent fine tuning of the laws and constants of physics have led many physicists to propose a “multiverse,” namely a huge collection of other universes (see previous item, for instance), and to explain fine tuning by saying, in effect, that if our particular universe were not fine-tuned pretty much as we see it, we could not exist and thus would not be here to talk about it; end of story. But many other scientists consider this “anthropic principle” argument highly controversial, an abdication of empirically testable physics. Is this really the best explanation that the field can produce?

As mentioned above, the Copernican principle suggests that human society is merely one of billions of intelligent, technological civilizations in the cosmos. Indeed, it is often considered naive and archaic to even consider that we might be special in any significant sense. And yet, the empirical data, namely the utter lack of any evidence to date of an extraterrestrial technological society, points in the other direction.

As mentioned above, Gribbin concludes that we are alone within the Milky Way, and that “we had better get used to it,” as if being alone within the Milky Way were a barely tolerable notion (even though there may still be billions of such societies in the universe). But why should this be so hard to tolerate? Simply because it goes against the Copernican principle? And why should we have such unbending faith in the Copernican principle as an infallible guide to truth?

To the contrary, the emerging picture from the latest scientific data is that human existence is far more singular, and the universe far vaster and more exotic, than anyone dreamed even a few years ago. In the present author’s view, this is an awe-inspiring vision, not a depressing one. It suggests a grand future for mankind as a sophisticated, space-faring society, as physicist Michio Kaku envisions in his new book The Future of Humanity: Terraforming Mars, Interstellar Travel, Immortality, and Our Destiny Beyond Earth (although, curiously, Kaku dismisses Fermi’s paradox, citing what amounts to solution #3 above; see rejoinder).

One bright spot, mentioned above, is that rapidly advancing technology is dramatically increasing our ability to search for and study potentially habitable extrasolar planets that may house technological civilizations. One way or the other, we should have a significantly clearer view of these issues in the next ten years or so. We can all look forward to that day.
