**US leads but China rises in latest Top500 supercomputer list**

Four prestigious professional society awards were presented at the SC18 conference. This year’s awardees include:

- David E. Shaw received the Seymour Cray Engineering Award from the IEEE Computer Society. Shaw, the billionaire former CEO of the D. E. Shaw hedge fund, has for the past 12 years led a team to develop ANTON, a special-purpose computer system for protein folding and other biomedical computations. His team has twice received the Gordon Bell prize (see below) for landmark calculations in the biomedical field using the ANTON system.
- Sarita Adve received the Ken Kennedy Award from the ACM and the IEEE Computer Society. Adve, a professor of computer science at the University of Illinois, Urbana-Champaign, co-developed the memory models used in the C++ and Java programming languages, and is known for related work in hardware-software interfaces, system resiliency, cache coherence, reliability and power management.
- Linda Petzold, a professor at the University of California, Santa Barbara, received the 2018 Sidney Fernbach Award from the IEEE Computer Society. Petzold is best known for her work in the numerical solution of differential-algebraic equations, which have applications in materials science, computational biology and medicine.
- Two teams shared the Gordon Bell Prize from the ACM. One team, from the Oak Ridge National Laboratory, developed a supercomputer program that processes huge amounts of genome data to better understand the genetic basis of chronic pain and opioid addiction. The other group, based at the Lawrence Berkeley National Laboratory, developed a neural-net-based supercomputer program to identify extreme weather patterns in climate simulations.

This is the first year in which two of the four major professional society awards in the high-performance computing field were presented to women (Adve and Petzold).

Another announcement at the SC18 conference was the latest edition of the Top500 list of the world’s most powerful supercomputers.

Number one on the November 2018 list is the Summit supercomputer at Oak Ridge National Laboratory in the U.S. The Summit system consists of more than 4,600 individual servers, each containing two 22-core IBM Power9 processors and six NVIDIA Tesla V100 graphics processing unit accelerators, all connected with a Mellanox 100 Gb/s interconnection network. The system has a potential peak performance of 200 Pflop/s ($2 \times 10^{17}$ floating-point operations per second). Its performance on the Linpack benchmark, which is used for the Top500 ranking, was 143.5 Pflop/s ($1.435 \times 10^{17}$ floating-point operations per second).

Number two on the November 2018 list is the Sierra system, another IBM-based supercomputer, this one at the Lawrence Livermore National Laboratory in the U.S. The number three system is the Sunway TaihuLight system, at the National Supercomputer Center in Wuxi, China. It consists of 40,960 Chinese-designed SW26010 manycore processors.

The remaining top ten supercomputers are: (#4) the Tianhe-2A supercomputer in Guangzhou, China; (#5) the Piz Daint supercomputer in Lugano, Switzerland; (#6) the Trinity supercomputer in Albuquerque, New Mexico; (#7) the AI Bridging Cloud Infrastructure supercomputer in Tokyo, Japan; (#8) the Leibniz Rechenzentrum supercomputer near Munich, Germany; (#9) the Titan supercomputer in Oak Ridge, Tennessee; and (#10) the Sequoia supercomputer in Livermore, California.

For full details on these supercomputers, see the Top500 site.

Although the U.S. still dominates the Top500 list, including the top two systems, China has made remarkable progress in recent years. In the November 2018 list, 227 of the 500 systems are in China, versus 109 in the U.S., 38 in Japan, and 20 in the U.K. By contrast, in 2008 only 12 of the top 500 systems were in China. If one counts the share of total performance, then the U.S. leads with 37.7%, versus 31% in China, and 7.7% in Japan. In 2008, China’s share was a mere 2.5%.

The following graphic shows the remarkable advance of China’s presence in the Top500 list over the past ten years:

**Simple proofs: The impossibility of trisection**

Ancient Greek mathematicians developed the methodology of “ruler-and-compass” constructions: if one is given only a ruler (without marks) and a compass, what objects can be constructed as a result of a finite set of operations? While they achieved many successes, three problems confounded their efforts: (1) squaring the circle; (2) trisecting an angle; and (3) duplicating a cube (i.e., constructing a cube whose volume is twice that of a given cube). Indeed, countless mathematicians through the ages have attempted to solve these problems, and countless incorrect “proofs” have been offered. We might add that Archimedes discovered a way to trisect an angle, but it did not strictly conform to the rules of ruler-and-compass construction.

The impossibility of squaring the circle was first proved by Lindemann in 1882, who showed that $\pi$ is *transcendental* — it is not the root of any algebraic equation with integer or rational coefficients. The impossibility of trisecting an arbitrary angle was proved earlier, in 1837, by Pierre Wantzel, and the impossibility of duplicating a cube was also proved at about that same time. For additional historical background on the trisection problem, see this Wikipedia article.

In most textbooks, these impossibility proofs are presented only after extensive background in groups, rings, fields, field extensions and Galois theory. Needless to say, very few college-level students, even among those studying majors such as physics or engineering, ever take such coursework. This is typically accepted as an unpleasant but necessary aspect of mathematical pedagogy.

We present here a proof of the impossibility of trisecting an arbitrary angle by ruler-and-compass construction and, as a bonus, the impossibility of cube duplication. We emphasize that this proof is both *elementary* and *complete* — all necessary lemmas and nontrivial facts are proved below. It should be understandable by anyone with a solid high school background in algebra, geometry and trigonometry, although it would help if the reader is familiar with the integers modulo $n$ and algebraic fields. This proof is somewhat longer than others in this series, but this is mainly due to the inclusion of several illustrative examples, as well as proofs of facts, such as the rational root theorem and the formula for cosine of a triple angle, that some might consider “obvious.” The core of the proof (see “Proof of the main theorems” below) is actually the *shortest* of those published so far in this series.

**Gist of the proof:**

We will show that trisecting a 60 degree angle, in particular, is equivalent to constructing the number $\cos (\pi/9)$ (i.e., the cosine of 20 degrees), which is an algebraic number that satisfies an irreducible polynomial of degree 3. Since the only numbers and number fields that can be produced by ruler-and-compass construction have algebraic degrees that are powers of two, this shows that the trisection of a 60-degree angle is impossible.

We first define some basic notions about polynomials, algebraic numbers and field extensions, and prove three basic lemmas (Lemmas 1, 2 and 3). Readers familiar with these lemmas may skip to the “Constructible numbers” section below.

**Definitions: Algebraic numbers, irreducible polynomials, minimal polynomials and degrees.**

By an *algebraic number*, we mean some real or complex number $\alpha$ that is the root of a polynomial $a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$ with integer coefficients $a_i$. We will assume here that the $a_i$ have no common factor, since if they do all coefficients can be divided by this factor, and $\alpha$ is still a root. Note also that if $\alpha$ is the root of a polynomial with rational coefficients, then by multiplying all coefficients by the least common multiple of the denominators of the $a_i$, one obtains a polynomial with integer coefficients that has $\alpha$ as a root. An *irreducible polynomial* is a polynomial with integer coefficients that cannot be factored into two polynomials of lesser degrees. The *minimal polynomial* of an algebraic number $\alpha$ is the polynomial with integer coefficients of minimal degree (i.e., the irreducible polynomial) that has $\alpha$ as a root, and the *degree* of $\alpha$, denoted ${\rm deg}(\alpha)$, is the degree of its minimal polynomial. Note that if $a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$ is the minimal polynomial of $\alpha$, then neither $a_0$ nor $a_m$ can be zero, since otherwise $\alpha$ would satisfy a polynomial of lower degree.

**LEMMA 1 (The rational root theorem)**: If $x = p / q$ is a rational root of a polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where $a_i$ are integer coefficients with neither $a_n$ nor $a_0$ zero (otherwise the polynomial is equivalent to one of lesser degree), and where $p$ and $q$ are integers without any common factor, then $p$ divides $a_0$ and $q$ divides $a_n$.

**Proof**: By hypothesis, $a_n (p/q)^n + a_{n-1} (p/q)^{n-1} + \cdots + a_1 (p/q) + a_0 = 0.$ After multiplying through by $q^n$, taking the lowest-order (constant) term to the right-hand side and factoring out $p$ from the other terms, we obtain $$p (a_n p^{n-1} + a_{n-1} p^{n-2}q + \cdots + a_1 q^{n-1}) = - a_0 q^n.$$ Since the large expression in parentheses is an integer, $p$ divides the left-hand side and thus also the right-hand side. But since $p$ and $q$ have no common factors, $p$ cannot divide $q^n$. Thus $p$ must divide $a_0$. Similarly, by taking the highest-order term to the right-hand side and factoring out $q$ from the other terms, we can write $$q (a_{n-1} p^{n-1} + a_{n-2} p^{n-2} q + \cdots + a_0 q^{n-1}) = - a_n p^n.$$ But again, since the expression in parentheses is an integer, $q$ divides the left-hand side and thus must also divide the right-hand side. Since $q$ cannot divide $p^n$, it must divide $a_n$.

**Examples**: The constant $1 + \sqrt{2}$ satisfies the polynomial $x^2 - 2 x - 1$, so that its algebraic degree cannot exceed two. By Lemma 1, the only possible rational roots are $x = 1$ and $x = -1$, and neither satisfies $x^2 - 2x - 1 = 0$. Thus the polynomial $x^2 - 2 x - 1$ is irreducible and is the minimal polynomial of $1 + \sqrt{2}$, so that ${\rm deg}(1 + \sqrt{2}) = 2$. As another example, consider $\sqrt[3]{2}$, which satisfies $x^3 - 2 = 0$. If this polynomial factors at all over the rational numbers, it has a linear factor and thus a rational root. But by Lemma 1, the only possible rational roots are $x = \pm 1$ and $x = \pm 2$, and none of these satisfies $x^3 - 2 = 0$. Thus $x^3 - 2$ is irreducible and is the minimal polynomial of $\sqrt[3]{2}$, so that ${\rm deg}(\sqrt[3]{2}) = 3$.
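Lemma 1 turns the hunt for rational roots into a finite check, which is easy to automate. Here is a short Python sketch of that check (the function names are my own, for illustration):

```python
from fractions import Fraction

def rational_roots(coeffs):
    """All rational roots of a polynomial with integer coefficients.

    coeffs[i] is the coefficient of x^i, so coeffs = [a0, ..., an].
    By Lemma 1, any rational root p/q in lowest terms has p | a0, q | an.
    """
    def divisors(n):
        n = abs(n)
        return [d for d in range(1, n + 1) if n % d == 0]

    candidates = {Fraction(sign * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for sign in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r**i for i, c in enumerate(coeffs)) == 0)

print(rational_roots([-2, 0, 0, 1]))  # x^3 - 2: no rational roots, []
print(rational_roots([-1, -2, 1]))    # x^2 - 2x - 1: no rational roots, []
```

Since a quadratic or cubic with no rational root has no linear factor, an empty result here confirms the irreducibility arguments in the two examples above.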

**Definitions: Algebraic fields.**

By an *algebraic field*, we will mean, for our purposes here, a set of numbers $F$ that includes the rational numbers $R$, with the usual four arithmetic operations defined, and with the property that if $x$ and $y$ are in $F$, then so are the sum $x+y$, difference $x-y$, product $x y$, and, if $x \neq 0$, the reciprocal $1/x$. There is a rich theory of algebraic fields, but we will not require anything here more than this definition and two basic lemmas (Lemmas 2 and 3), which we prove here.

**LEMMA 2 (Algebraic extensions are fields)**: Let $R$ be the rational numbers, and let $\alpha$ be algebraic of degree $m \ge 2$, with minimal polynomial $P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$. Then the set $S = \{r_0 + r_1 \alpha + r_2 \alpha^2 + \cdots + r_{m-1} \alpha^{m-1}\}$, where $r_i \in R$, is a field.

**Proof**: If $s = s_0 + s_1 \alpha + s_2 \alpha^2 + \cdots + s_{m-1} \alpha^{m-1}$, where $s_i \in R$, and $t = t_0 + t_1 \alpha + t_2 \alpha^2 + \cdots + t_{m-1} \alpha^{m-1}$, where $t_i \in R$ are elements of $S$, their sum $s + t = (s_0 + t_0) + (s_1 + t_1) \alpha + \cdots + (s_{m-1} + t_{m-1}) \alpha^{m-1}$ is clearly in $S$, as is their difference. We can write their product as: $$st = s_0 t_0 + (s_0 t_1 + s_1 t_0) \alpha + (s_0 t_2 + s_1 t_1 + s_2 t_0) \alpha^2 + \cdots + s_{m-1} t_{m-1} \alpha^{2m-2}.$$ This expression involves terms with $\alpha^k$ for $k \ge m$, but note that the term with $\alpha^m$ can be rewritten by solving the minimal polynomial $P(\alpha) = 0$ to obtain $\alpha^m = -(a_0/a_m) - (a_1/a_m) \alpha - \cdots - (a_{m-1}/a_m) \alpha^{m-1}$; higher powers of $\alpha$ can similarly be rewritten in terms of $\alpha^k$ for $k \le m-1$. Thus the product $st$ is in $S$. The reciprocal of $s = Q(\alpha) = s_0 + s_1 \alpha + s_2 \alpha^2 + \cdots + s_{m-1} \alpha^{m-1}$ can be calculated by applying the extended Euclidean algorithm to the polynomials $P(x)$ and $Q(x)$, where polynomial long division, dropping the remainder, is used instead of ordinary division. In this case, where $P(x)$ and $Q(x)$ have no common factor (since $P(x)$ is irreducible), the extended Euclidean algorithm produces polynomials $A(x)$ and $B(x)$, of lower degree than $P(x)$, such that $A(x) P(x) + B(x) Q(x) = 1$. Substituting $x = \alpha$ and recalling that $P(\alpha) = 0$, we have $B(\alpha) Q(\alpha) = 1$, so that $1/s = 1 / Q(\alpha) = B(\alpha)$, which is in $S$ since ${\rm deg}(B(x)) \le m - 1$. This proves that $S$ is a field.

**The extended Euclidean algorithm**: The extended Euclidean algorithm is best illustrated by using it to find the reciprocal (multiplicative inverse) of an integer $m$ in the field of integers modulo $p$, where $p$ is a prime. By the “field of integers modulo $p$,” we mean the set of integers $\{0, 1, 2, \cdots, p - 1\}$, where multiples of $p$ are added to or subtracted from the results of the operations $+, \, -, \, \times$ to reduce them to the range $\{0, 1, 2, \cdots, p - 1\}$. We will illustrate the case $m = 100$ and $p = 257$ (a prime). First start with a column vector containing $100$ and $257$, together with a $2 \times 2$ identity matrix. Then operate as follows, where ${\rm int}(\cdot)$ denotes integer part, where $R_1$ and $R_2$ denote the first and second rows of the column vector, and where $C_1$ and $C_2$ denote the first and second columns of the $2 \times 2$ matrix:

\begin{align*}
\left( \begin{array}{r} 100 \\ 257 \end{array} \right) & \quad \left( \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right) & {\rm int}(257/100) = 2, \; {\rm so} \; R_2 \leftarrow R_2 - 2 \cdot R_1 \; {\rm and} \; C_1 \leftarrow C_1 + 2 \cdot C_2. \\
\left( \begin{array}{r} 100 \\ 57 \end{array} \right) & \quad \left( \begin{array}{rr} 1 & 0 \\ 2 & 1 \end{array} \right) & {\rm int}(100/57) = 1, \; {\rm so} \; R_1 \leftarrow R_1 - 1 \cdot R_2 \; {\rm and} \; C_2 \leftarrow C_2 + 1 \cdot C_1. \\
\left( \begin{array}{r} 43 \\ 57 \end{array} \right) & \quad \left( \begin{array}{rr} 1 & 1 \\ 2 & 3 \end{array} \right) & {\rm int}(57/43) = 1, \; {\rm so} \; R_2 \leftarrow R_2 - 1 \cdot R_1 \; {\rm and} \; C_1 \leftarrow C_1 + 1 \cdot C_2. \\
\left( \begin{array}{r} 43 \\ 14 \end{array} \right) & \quad \left( \begin{array}{rr} 2 & 1 \\ 5 & 3 \end{array} \right) & {\rm int}(43/14) = 3, \; {\rm so} \; R_1 \leftarrow R_1 - 3 \cdot R_2 \; {\rm and} \; C_2 \leftarrow C_2 + 3 \cdot C_1. \\
\left( \begin{array}{r} 1 \\ 14 \end{array} \right) & \quad \left( \begin{array}{rr} 2 & 7 \\ 5 & 18 \end{array} \right) & {\rm int}(14/1) = 14, \; {\rm so} \; R_2 \leftarrow R_2 - 14 \cdot R_1 \; {\rm and} \; C_1 \leftarrow C_1 + 14 \cdot C_2. \\
\left( \begin{array}{r} 1 \\ 0 \end{array} \right) & \quad \left( \begin{array}{rr} 100 & 7 \\ 257 & 18 \end{array} \right) & {\rm one \; entry \; of \; vector \; is \; zero, \; so \; process \; ends; \; result} = 18.
\end{align*}

Thus $100^{-1} \bmod 257 = 18$. Indeed, $18 \cdot 100 - 7 \cdot 257 = 1$. Note that $18$ appears in the final matrix, in the opposite corner from the input value $100$. If the number of iterations performed is odd, the sign of the result is changed (not necessary in the above example). The polynomial version of the extended Euclidean algorithm operates in exactly the same way, except: (a) the division operation is replaced by polynomial long division, dropping the remainder; and (b) if at termination the nonzero constant in the column vector is not $1$, then the result must be divided by this constant.
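The same modular-inverse computation can be written compactly in code. The following Python sketch uses the standard iterative form of the extended Euclidean algorithm, equivalent to the matrix scheme above (the function name is my own):

```python
def mod_inverse(m, p):
    """Multiplicative inverse of m modulo p via the extended Euclidean
    algorithm, in the usual iterative form."""
    # Invariant: old_r = old_s * m (mod p) throughout the loop.
    old_r, r = m, p
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("m and p are not coprime")
    return old_s % p

print(mod_inverse(100, 257))  # 18
```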

**Example**: Let $\alpha$ be a root of the polynomial $P(x) = 1 + x + x^3$. Note that $P(x)$ is irreducible: if it factored, it would have a linear factor and hence a rational root, but by Lemma 1 the only candidates are $x = \pm 1$, and neither is a root. Define $S = \{r_0 + r_1 \alpha + r_2 \alpha^2\}$ for $r_i \in R$. Let $s = 1 + 2 \alpha + \alpha^2$ and $t = 1 + \alpha^2$ be two typical elements of $S$. Clearly their sum $s + t = 2 + 2 \alpha + 2 \alpha^2$ is in $S$, as is their difference. Their product is $s t = 1 + 2 \alpha + 2 \alpha^2 + 2 \alpha^3 + \alpha^4$. Since $P(\alpha) = 0$, we can solve the minimal polynomial to write $\alpha^3 = -1 - \alpha$ and $\alpha^4 = -\alpha - \alpha^2$, and rewrite the product as $s t = -1 - \alpha + \alpha^2$, which then is in $S$. To find the reciprocal of $s = 1 + 2 \alpha + \alpha^2$, we apply the extended Euclidean algorithm to the polynomials $P(x) = 1 + x + x^3$ and $Q(x) = 1 + 2 x + x^2$. The algorithm proceeds as follows, where ${\rm quot}(\cdot,\cdot)$ denotes the polynomial quotient, dropping the remainder:

\begin{align*}
\left( \begin{array}{c} x^2 + 2 x + 1 \\ x^3 + x + 1 \end{array} \right) & \quad \left( \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right) & \begin{array}{c} {\rm quot}((x^3 + x + 1), (x^2 + 2 x + 1)) = x - 2, \; {\rm so} \\ R_2 \leftarrow R_2 - (x - 2) \cdot R_1 \; {\rm and} \; C_1 \leftarrow C_1 + (x - 2) \cdot C_2. \end{array} \\
\left( \begin{array}{c} x^2 + 2 x + 1 \\ 4 x + 3 \end{array} \right) & \quad \left( \begin{array}{cc} 1 & 0 \\ x - 2 & 1 \end{array} \right) & \begin{array}{c} {\rm quot}((x^2 + 2 x + 1), (4 x + 3)) = x / 4 + 5/16, \; {\rm so} \\ R_1 \leftarrow R_1 - (x/4 + 5/16) \cdot R_2 \; {\rm and} \; C_2 \leftarrow C_2 + (x/4 + 5/16) \cdot C_1. \end{array} \\
\left( \begin{array}{c} 1/16 \\ 4 x + 3 \end{array} \right) & \quad \left( \begin{array}{cc} 1 & x/4 + 5/16 \\ x - 2 & x^2/4 - 3 x/16 + 3/8 \end{array} \right) & \begin{array}{c} {\rm quot}((4x + 3), 1/16) = 64 x + 48, \; {\rm so} \\ R_2 \leftarrow R_2 - (64 x + 48) \cdot R_1 \; {\rm and} \; C_1 \leftarrow C_1 + (64 x + 48) \cdot C_2. \end{array} \\
\left( \begin{array}{c} 1/16 \\ 0 \end{array} \right) & \quad \left( \begin{array}{cc} 16 + 32 x + 16 x^2 & x / 4 + 5/16 \\ 16 x^3 + 16 x + 16 & x^2/4 - 3 x /16 + 3/8 \end{array} \right) & {\rm process \; ends; \; result} = x^2 / 4 - 3 x / 16 + 3 / 8.
\end{align*}

After dividing the result $(x^2 / 4 - 3 x / 16 + 3 / 8)$ by $1/16$, we obtain $B(x) = 4 x^2 - 3 x + 6$, so that $1/s = B(\alpha) = 6 - 3 \alpha + 4 \alpha^2$. Indeed, $(6 - 3 \alpha + 4 \alpha^2)s = (6 - 3 \alpha + 4 \alpha^2)(1 + 2 \alpha + \alpha^2) = 6 + 9 \alpha + 4 \alpha^2 + 5 \alpha^3 + 4 \alpha^4$, which, after recalling $\alpha^3 = -1 - \alpha$ and $\alpha^4 = -\alpha - \alpha^2$, gives $1$.
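The polynomial version can also be automated. The Python sketch below (all helper names are my own) represents a polynomial as its list of rational coefficients $[c_0, c_1, \ldots]$ and recovers $B(x) = 4x^2 - 3x + 6$ from $P$ and $Q$:

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients from the high-order end."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def sub(p, q):
    """Difference p - q of two coefficient lists."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def mul(p, q):
    """Product of two coefficient lists."""
    out = [Fraction(0)] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Polynomial long division: return (quotient, remainder)."""
    quo, rem = [Fraction(0)] * max(1, len(p)), trim(list(p))
    while rem and len(rem) >= len(q):
        shift = len(rem) - len(q)
        c = rem[-1] / q[-1]
        quo[shift] += c
        rem = sub(rem, mul([Fraction(0)] * shift + [c], q))
    return trim(quo), rem

def inverse_mod(Q, P):
    """Inverse of Q(x) modulo an irreducible P(x), via extended Euclid."""
    old_r, r = P, Q
    old_b, b = [], [Fraction(1)]  # coefficient of Q in each combination
    while trim(r):
        quo, rem = divmod_poly(old_r, r)
        old_r, r = r, rem
        old_b, b = b, sub(old_b, mul(quo, b))
    c = old_r[0]  # the gcd, a nonzero constant; divide it out
    return [coef / c for coef in old_b]

P = [Fraction(c) for c in (1, 1, 0, 1)]  # 1 + x + x^3
Q = [Fraction(c) for c in (1, 2, 1)]     # 1 + 2x + x^2
print(inverse_mod(Q, P))  # [Fraction(6, 1), Fraction(-3, 1), Fraction(4, 1)]
```

The printed coefficients $[6, -3, 4]$ agree with the hand computation: $1/s = 6 - 3\alpha + 4\alpha^2$.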

**Definitions: Linear independence and degrees.**

Suppose that we have a field $S$ that contains the rational numbers $R$ (i.e., $R \subset S$), such that every $s \in S$ can be written as $s = r_1 \alpha_1 + r_2 \alpha_2 + \cdots + r_m \alpha_m$, with $r_i \in R$, and with the property that the $\alpha_i$ are *linearly independent*: if $r_1 \alpha_1 + r_2 \alpha_2 + \cdots + r_m \alpha_m = 0$, with $r_i \in R$, then each $r_i = 0$. Such an extension is said to be of *degree* $m$, which we will denote here as ${\rm deg}(S / R) = m$ (this is also denoted $[S:R] = m$).

Suppose $\alpha$ is algebraic of degree $m$, with minimal polynomial $a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$. Then consider the field $S$ consisting of all $s$ that can be written $s = r_0 + r_1 \alpha + r_2 \alpha^2 + \cdots + r_{m-1} \alpha^{m-1}$, with $r_i \in R$. Since there are $m$ terms, it is clear that ${\rm deg}(S/R) \leq m$. Now suppose that, for some choice of $r_0, r_1, \cdots, r_{m-1}$, we have $r_0 + r_1 \alpha + r_2 \alpha^2 + \cdots + r_{m-1} \alpha^{m-1} = 0$. Since the minimal polynomial of $\alpha$ has degree $m$, and not degree $m-1$ or less, this can only happen if each $r_i = 0$. Thus ${\rm deg}(S/R) = m$ exactly. In other words, *the notions of algebraic degree and field extension degree coincide when the field is generated by an algebraic number*.

**Example**: Consider the case $S = R(\sqrt[3]{2})$, where $S$ is the set of all $s$ that can be written $s = r_0 + r_1 \sqrt[3]{2} + r_2 (\sqrt[3]{2})^2$, with $r_i \in R$. Clearly $R \subset S$, and, since there are three terms, ${\rm deg}(S/R) \leq 3$. Now suppose that, for some choice of $r_0, r_1, r_2$, we have $r_0 + r_1 \sqrt[3]{2} + r_2 (\sqrt[3]{2})^2 = 0$. Since $\sqrt[3]{2}$ satisfies the polynomial $x^3 - 2 = 0$, and this polynomial is irreducible, as we showed above, $x^3 - 2$ is the minimal polynomial of $\sqrt[3]{2}$ and ${\rm deg}(\sqrt[3]{2}) = 3$. Since the minimal polynomial of $\sqrt[3]{2}$ is of degree $3$ and not $2$ or less, it follows that $r_0 + r_1 \sqrt[3]{2} + r_2 (\sqrt[3]{2})^2 = 0$ only when $r_0 = r_1 = r_2 = 0$. Thus ${\rm deg}(S/R) = 3$ exactly.

We now prove an important fact about the degrees of field extensions, which will be key to our main impossibility proofs below:

**LEMMA 3 (Multiplication of degrees of field extensions)**: Suppose that we have three fields $R, S, T$ (where $R$ is the field of rational numbers as before) such that $R \subset S \subset T$, and with ${\rm deg}(S/R) = m$ and ${\rm deg}(T/S) = n$. Then ${\rm deg} (T/R) = {\rm deg} (S/R) \cdot {\rm deg} (T/S) = m \cdot n$.

**Proof**: For ease of notation, we will prove this in the specific case $m = 2, n = 3$, i.e., where ${\rm deg} (S/R) = 2$ and ${\rm deg} (T/S) = 3$, although the argument is completely general. By hypothesis, all $s \in S$ can be written as $s = r_1 \alpha_1 + r_2 \alpha_2$, where $r_i \in R$ are rational numbers, and with the property that if $r_1 \alpha_1 + r_2 \alpha_2 = 0$, then each $r_i = 0$. In a similar way, by hypothesis all $t \in T$ can be written as $t = s_1 \beta_1 + s_2 \beta_2 + s_3 \beta_3$, where $s_i \in S$, and with the property that if $s_1 \beta_1 + s_2 \beta_2 + s_3 \beta_3 = 0$, then each $s_i = 0$.

First note that when we write any $t \in T$ as $t = s_1 \beta_1 + s_2 \beta_2 + s_3 \beta_3,$ where $s_1, s_2, s_3 \in S$, each of these $s_i$ can in turn be rewritten in terms of $r_i$ and $\alpha_i$: in particular, $s_1 = a_1 \alpha_1 + a_2 \alpha_2$, for some $a_i \in R$; $s_2 = b_1 \alpha_1 + b_2 \alpha_2$, for some $b_i \in R$; and $s_3 = c_1 \alpha_1 + c_2 \alpha_2$, for some $c_i \in R$. Thus we can write $$t = (a_1 \alpha_1 + a_2 \alpha_2) \beta_1 + (b_1 \alpha_1 + b_2 \alpha_2) \beta_2 + (c_1 \alpha_1 + c_2 \alpha_2) \beta_3 $$ $$= a_1 (\alpha_1 \beta_1) + a_2 (\alpha_2 \beta_1) + b_1 (\alpha_1 \beta_2) + b_2 (\alpha_2 \beta_2) + c_1 (\alpha_1 \beta_3) + c_2 (\alpha_2 \beta_3),$$ where $a_1, a_2, b_1, b_2, c_1, c_2 \in R$. We now see that we can write any $t \in T$ as a linear sum of the six terms shown, with coefficients in $R$. Thus ${\rm deg}(T/R) \leq 6$. Now suppose that some $t \in T$ is zero, so that $$t = (a_1 \alpha_1 + a_2 \alpha_2) \beta_1 + (b_1 \alpha_1 + b_2 \alpha_2) \beta_2 + (c_1 \alpha_1 + c_2 \alpha_2) \beta_3 = 0.$$ By hypothesis, this can only happen if each of the three coefficient expressions $(a_1 \alpha_1 + a_2 \alpha_2)$, $(b_1 \alpha_1 + b_2 \alpha_2)$ and $(c_1 \alpha_1 + c_2 \alpha_2)$ is zero, which in turn is only possible if each $a_1, a_2, b_1, b_2, c_1, c_2$ is zero. Thus the six terms $\alpha_1 \beta_1, \alpha_2 \beta_1, \alpha_1 \beta_2, \alpha_2 \beta_2, \alpha_1 \beta_3$ and $\alpha_2 \beta_3$ are linearly independent, so that ${\rm deg}(T/R) = 6$ exactly.

The more general case where ${\rm deg} (S/R) = m$ and ${\rm deg} (T/S) = n$ is handled in exactly the same way, only with somewhat more complicated notation.
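A numerical illustration of Lemma 3 (an aside, not part of the proof): $\sqrt{2}$ has degree 2 over $R$, and $\sqrt[3]{2}$ has degree 3 over $R(\sqrt{2})$, so by Lemma 3 the composite field has degree 6 over $R$, and a number such as $\sqrt{2} + \sqrt[3]{2}$ should satisfy an integer polynomial of degree 6. Eliminating the radicals by hand yields the sextic checked below in Python:

```python
# Verify numerically that x = sqrt(2) + 2^(1/3) is a root of the sextic
# x^6 - 6x^4 - 4x^3 + 12x^2 - 24x - 4, obtained by eliminating the radicals
# by hand. A floating-point sanity check only, not a proof.
x = 2 ** 0.5 + 2 ** (1 / 3)
residual = x**6 - 6 * x**4 - 4 * x**3 + 12 * x**2 - 24 * x - 4
print(abs(residual) < 1e-9)  # True
```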

**Definitions: Constructible numbers.**

The next step is to establish what kinds of numbers can be constructed by ruler and compass. Readers familiar with the mathematics of ruler-and-compass constructions may skip to the “Angle trisection” section below.

First, it is clear that addition can easily be done with ruler and compass: mark the two distances end to end on a straight line, and the combined distance is the sum. Subtraction can be done similarly. Multiplication can also be done by ruler and compass, as illustrated in the figure to the right, and division is just the inverse of this process.

What’s more, one can compute square roots by ruler and compass. In the second illustration to the right, assuming that $AC = 1$, simple properties of right triangles show that the distance $AD = \sqrt{AB}$.

Thus, starting with a distance of unit length, numbers composed of any finite combination of the four arithmetic operations and square roots can be constructed. The converse is also true: Such numbers are the *only* numbers that can be constructed, as we will now show.

**LEMMA 4 (Degrees of constructible numbers)**: If $S$ is the number field resulting from a finite sequence of ruler-and-compass constructions, then ${\rm deg}(S/R) = 2^m$ for some integer $m$.

**Proof**: Ruler-and-compass constructions can extend the rational number field only by a sequence of the following operations, each of which generates an extension of degree 1 or 2 over the field produced by the operations before it:

- Finding the $(x,y)$ coordinates of the intersection of two lines is equivalent to solving a set of two linear equations in two unknowns, which involves only the four arithmetic operations. Note that this does *not* involve any field extension.
- Finding the $(x,y)$ coordinates of the intersection of a line with a circle is equivalent to solving a system of a linear equation and a quadratic equation, which requires only arithmetic and one square root.
- Finding the $(x,y)$ coordinates of the intersection of two circles is equivalent to solving two simultaneous quadratic equations, which again requires only arithmetic and one square root. See any high school analytic geometry text for details.
- Copying the distance between any two constructed points to the $x$ axis is equivalent to computing the distance between the two points, which requires only arithmetic and one square root.

Thus, by Lemma 3, the algebraic degree over the rationals of the final number field so constructed, after a sequence of $n$ ruler-and-compass operations, is $2^m$ for some integer $m \le n$.
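The line-circle case above can be made concrete in a few lines of code. In this Python sketch (the function and parameter names are my own), only the four arithmetic operations and a single square root appear, matching the degree-2 bound:

```python
from math import sqrt

def line_circle_intersection(p1, p2, center, r):
    """Intersect the line through p1 and p2 with the circle (center, r).

    Substituting the line's parametric form into the circle equation gives
    a quadratic a*t^2 + b*t + c = 0; only arithmetic and one square root
    are needed, which is why this step extends the field by degree <= 2.
    """
    (x1, y1), (x2, y2), (cx, cy) = p1, p2, center
    dx, dy = x2 - x1, y2 - y1
    a = dx * dx + dy * dy
    b = 2 * (dx * (x1 - cx) + dy * (y1 - cy))
    c = (x1 - cx) ** 2 + (y1 - cy) ** 2 - r * r
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # line misses the circle
    roots = {(-b - sqrt(disc)) / (2 * a), (-b + sqrt(disc)) / (2 * a)}
    return sorted((x1 + t * dx, y1 + t * dy) for t in roots)

# The x axis meets the unit circle at (-1, 0) and (1, 0):
print(line_circle_intersection((0, 0), (1, 0), (0, 0), 1))
```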

**Definitions: Angle trisection.**

We first must keep in mind that some angles *can* be trisected. For example, a right angle can be trisected, because a 30 degree angle can be constructed simply by bisecting one of the interior angles of an equilateral triangle. To show the impossibility result, we need only exhibit a single angle that cannot be trisected. We choose a 60 degree angle (i.e., $\pi/3$ radians). In other words, we will show that a 20 degree angle (i.e., $\pi/9$ radians) cannot be constructed. It suffices to prove that the number $x = \cos(\pi/9)$ cannot be constructed: if it could, then by copying this distance to the $x$ axis with the left end at the origin, and then erecting a perpendicular from the right end to the unit circle, the resulting angle would be the desired trisection.

Here we require only an elementary fact of trigonometry, namely the formula for the cosine of a triple angle. For convenience, we will prove this and some related formulas here, based only on elementary geometry and a little algebra. Readers familiar with these formulas may skip to the “Proof of the main theorems” section below.

**LEMMA 5 (Triple angle formula for cosine)**: $\cos(3\alpha) = 4 \cos^3(\alpha) – 3 \cos(\alpha).$

**Proof**: We start with the formula $$\sin (\alpha + \beta) = \sin (\alpha) \cos (\beta) + \cos (\alpha) \sin (\beta).$$ This fact has a simple geometric proof, which is illustrated to the right, where $OP = 1$. First note that $\angle RPQ = \alpha, \, PQ = \sin(\beta)$ and $OQ = \cos (\beta)$. Further, $AQ/OQ = \sin(\alpha)$, so $AQ = \sin(\alpha) \cos(\beta)$, and $PR/PQ = \cos(\alpha)$, so $PR = \cos(\alpha) \sin(\beta)$. Combining these results, $$\sin(\alpha + \beta) = PB = RB + PR = AQ + PR = \sin(\alpha) \cos(\beta) + \cos(\alpha) \sin(\beta).$$ By applying this formula to $(\pi/2 - \alpha)$ and $(-\beta)$, one obtains the related formula: $$\cos(\alpha + \beta) = \cos(\alpha) \cos(\beta) - \sin(\alpha) \sin(\beta).$$ From these formulas one immediately obtains $\sin(2\alpha) = 2 \sin(\alpha) \cos(\alpha)$ and $\cos(2\alpha) = 2 \cos^2(\alpha) - 1$. Now we can write: $$\cos(3\alpha) = \cos (\alpha + 2\alpha) = \cos(\alpha) \cos(2\alpha) - \sin(\alpha) \sin(2\alpha) = \cos(\alpha) (2 \cos^2(\alpha) - 1) - 2 \sin^2(\alpha)\cos(\alpha)$$ $$= \cos(\alpha) (2 \cos^2(\alpha) - 1) - 2 (1 - \cos^2(\alpha))\cos(\alpha) = 4 \cos^3(\alpha) - 3 \cos(\alpha).$$
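Lemma 5 can also be spot-checked numerically in a couple of lines of Python (a sanity check only; the proof above is what establishes the identity):

```python
from math import cos, isclose, pi

# Spot-check the identity cos(3a) = 4 cos^3(a) - 3 cos(a) at a few angles.
for a in (0.1, pi / 9, 1.0, 2.5):
    assert isclose(cos(3 * a), 4 * cos(a) ** 3 - 3 * cos(a), abs_tol=1e-12)
print("triple-angle identity holds at all sample angles")
```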

**Proof of the main theorems:**

**THEOREM 1**: There is no ruler-and-compass scheme to trisect an arbitrary angle.

**Proof**: As mentioned above, we will show that the angle $\pi/3$ radians (60 degrees) cannot be trisected, or, in other words, that the number $\cos (\pi/9)$ cannot be constructed. From Lemma 5 we can write $$\cos (3(\pi/9)) = \cos (\pi/3) = 1/2 = 4 \cos^3(\pi/9) - 3 \cos(\pi/9),$$ so that $\cos(\pi/9)$ is a root of the polynomial equation $8 x^3 - 6 x - 1 = 0$. If this polynomial factors at all, it must have a linear factor and hence a rational root. But by Lemma 1, the only possible rational roots are $x = \pm 1, \, x = \pm 1/2, \, x = \pm 1/4$ and $x = \pm 1/8$, none of which satisfies $8 x^3 - 6 x - 1 = 0$. Thus $8 x^3 - 6 x - 1$ is irreducible and is the minimal polynomial of $\cos(\pi/9)$, so that $\deg(\cos(\pi/9)) = 3$. Hence, by Lemma 3, any extension field $F$ containing $\cos(\pi/9)$ must have $\deg(F/R)$ divisible by $3$. However, by Lemma 4 the only possible degrees for number fields produced by ruler-and-compass constructions are powers of two. Since $2^m$ is never divisible by $3$, no finite sequence of ruler-and-compass constructions can construct an angle of $\pi/9$ radians or 20 degrees, which is the trisection of the angle $\pi/3$ radians or 60 degrees.
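The finite root check in this proof can itself be verified mechanically. The short Python sketch below runs through the candidates allowed by Lemma 1 and also confirms numerically that $\cos(\pi/9)$ satisfies $8x^3 - 6x - 1 = 0$:

```python
from fractions import Fraction
from math import cos, pi

# Candidate rational roots of 8x^3 - 6x - 1 allowed by Lemma 1:
# p divides 1, q divides 8.
candidates = [Fraction(s, q) for q in (1, 2, 4, 8) for s in (1, -1)]
assert all(8 * r**3 - 6 * r - 1 != 0 for r in candidates)
print("no rational roots, so 8x^3 - 6x - 1 is irreducible")

# ... whereas cos(pi/9) is (numerically) a root:
x = cos(pi / 9)
print(abs(8 * x**3 - 6 * x - 1) < 1e-12)  # True
```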

**THEOREM 2**: There is no ruler-and-compass scheme to duplicate an arbitrary cube.

**Proof**: This follows by the same line of reasoning, since duplication of a cube is equivalent to constructing the number $\sqrt[3]{2}$, which is a root of the polynomial equation $x^3 - 2 = 0$. As noted in one of the examples above, if $x^3 - 2$ factors at all, one of the factors must be linear, yielding a rational root. But by Lemma 1, the only possible rational roots are $x = \pm 1$ and $x = \pm 2$, and none of these satisfies $x^3 - 2 = 0$. Thus $x^3 - 2$ is irreducible and is the minimal polynomial of $\sqrt[3]{2}$, so that ${\rm deg}(\sqrt[3]{2}) = 3$. Hence any extension field $F$ containing $\sqrt[3]{2}$ must have $\deg(F/R)$ divisible by $3$, which, as before, is impossible for a field produced by ruler-and-compass constructions, whose degree must be a power of two.

For other proofs in this series see the listing at Simple proofs of great theorems.

**Simple proofs: The fundamental theorem of algebra**

This theorem has a long, tortuous history. In 1608, Peter Roth wrote that a polynomial equation of degree $n$ with real coefficients may have $n$ solutions, but offered no proof. Leibniz and Nikolaus Bernoulli both asserted that quartic polynomials of the form $x^4 + a^4$, where $a$ is real and nonzero, could not be solved or factored into linear or quadratic factors, but Euler later pointed out that $$x^4 + a^4 = (x^2 + a \sqrt{2} x + a^2) (x^2 - a \sqrt{2} x + a^2).$$ Eventually mathematicians became convinced that something like the fundamental theorem must be true. Numerous mathematicians, including d’Alembert, Euler, Lagrange, Laplace and Gauss, published proofs in the 1700s, but each was later found to be flawed or incomplete. The first complete and fully rigorous proof was by Argand in 1806. For additional historical background on the fundamental theorem of algebra, see this Wikipedia article.

A proof of the fundamental theorem of algebra is typically presented in a college-level course in complex analysis, but only after an extensive background of underlying theory such as Cauchy’s theorem, the argument principle and Liouville’s theorem. Only a tiny percentage of college students ever take such coursework, and even many of those who do take such coursework never really grasp the essential idea behind the fundamental theorem of algebra, because of the way it is typically presented in textbooks (this was certainly the editor’s initial experience with the fundamental theorem of algebra). All this is generally accepted as an unpleasant but unavoidable feature of mathematical pedagogy.

We present here a proof of the fundamental theorem of algebra. We emphasize that it is both *elementary* and *self-contained*: it relies only on a well-known completeness axiom and some straightforward reasoning with estimates and inequalities. It should be understandable by anyone with a solid high school background in algebra, trigonometry and complex numbers, although some experience with limits and estimates, as commonly taught in high school or first-year college calculus courses, will also help.

**Gist of the proof**:

Readers familiar with Newton’s method for solving equations will recall the basic strategy: start with a reasonably close approximation to a root, then improve the approximation by moving in an appropriate direction. We employ a similar strategy here. Assuming that the point where the polynomial achieves its minimum absolute value is not a root, we show that there is a nearby point where the polynomial has an even smaller absolute value, contradicting the assumption that the minimum is not a root.

**Definitions and axioms**:

In the following $p(z)$ will denote the $n$-th degree polynomial $p(z) = p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n$, where the coefficients $p_i$ are any complex numbers, with neither $p_0$ nor $p_n$ equal to zero (otherwise the polynomial is equivalent to one of lesser degree). We will utilize a fundamental completeness property of the real and complex numbers, namely that a continuous real-valued function on a closed, bounded set achieves its minimum at some point of the set. This can be taken as an axiom, or can easily be proved from other well-known completeness axioms, such as the Cauchy sequence axiom or the nested interval axiom. See this Wikipedia article for more discussion of completeness axioms.

**THEOREM 1**: Every polynomial with real or complex coefficients has at least one complex root.

**Proof**: Suppose that $p(z)$ has no roots in the complex plane. First note that for large $z$, say $|z| \ge \max(2, \, 2 \max_i |p_i/p_n|)$, the $z^n$ term of $p(z)$ is greater in absolute value than the sum of all the other terms. Thus given some $B > 0$, for any sufficiently large $s$ we have $|p(z)| > B$ for all $z$ with $|z| \ge s$. We will take $B = 2 |p(0)| = 2 |p_0|$. Since $|p(z)|$ is continuous on the closed disk of radius $s$ centered at the origin, it follows by the completeness axiom mentioned above that $|p(z)|$ achieves its minimum value at some point $t$ in this disk (possibly on the boundary). But since $|p(0)| < 1/2 \cdot |p(z)|$ for all $z$ on the boundary circle, it follows that $|p(z)|$ achieves its minimum at some point $t$ in the *interior* of the disk.

Now rewrite the polynomial $p(z)$, translating the argument $z$ by $t$, thus producing a new polynomial $$q(z) = p(z + t) \; = \; q_0 + q_1 z + q_2 z^2 + \cdots + q_n z^n.$$ Since $t$ lies in the interior of the disk, $|q(z)|$ attains its minimum value, over a small disk centered at the origin, at $z = 0$, and by assumption this minimum value $M$ is nonzero. Note that $M = |q(0)| = |q_0| = |p(t)|$.

Our proof strategy is to construct some point $x$, close to the origin, such that $|q(x)| < |q(0)|,$ thus contradicting the presumption that $|q(z)|$ has a minimum nonzero value at $z = 0$. If our method gives us merely a direction in the complex plane for which the function value decreases in magnitude (a *descent direction*), then by moving a small distance in that direction, we hope to achieve our goal of constructing a complex $x$ such that $|q(x)| < |q(0)|$. This is the strategy we will pursue.

**Construction of $x$ such that $|q(x)| < |q(0)|$**:

Let the first nonzero coefficient of $q(z)$, following $q_0$, be $q_m$, so that $q(z) = q_0 + q_m z^m + q_{m+1} z^{m+1} + \cdots + q_n z^n$. We will choose $x$ to be the complex number $$x = r \left(\frac{- q_0}{ q_m}\right)^{1/m},$$ where $r$ is a small positive real value we will specify below, and where $(-q_0/q_m)^{1/m}$ denotes any of the $m$-th roots of $(-q_0/q_m)$.

**Comment**: As an aside, note that unlike the real numbers, in the complex number system the $m$-th roots of a real or complex number are always guaranteed to exist: if $z = z_1 + i z_2$, with $z_1$ and $z_2$ real, then the $m$-th roots of $z$ are given explicitly by $$\{R^{1/m} \cos ((\theta + 2k\pi)/m) + i R^{1/m} \sin ((\theta+2k\pi)/m), \, k = 0, 1, \cdots, m-1\},$$ where $R = \sqrt{z_1^2 + z_2^2}$ and $\theta = \arctan (z_2/z_1)$ (adjusted to the correct quadrant when $z_1 \le 0$). The guaranteed existence of $m$-th roots, a feature of the complex number system, is the key fact behind the fundamental theorem of algebra.
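As a sanity check on the formula above, the following Python sketch computes all $m$-th roots of a complex number, using `cmath.phase` in place of $\arctan(z_2/z_1)$ so that the angle lands in the correct quadrant (the function name is ours):

```python
import cmath

def nth_roots(z, m):
    """All m complex m-th roots of z, via the polar formula in the text.
    cmath.phase resolves the quadrant that a bare arctan(z2/z1) would miss."""
    R = abs(z)
    theta = cmath.phase(z)
    return [R ** (1 / m) * cmath.exp(1j * (theta + 2 * cmath.pi * k) / m)
            for k in range(m)]

roots = nth_roots(-8, 3)                           # the three cube roots of -8
assert any(abs(r + 2) < 1e-9 for r in roots)       # -2 is among them
assert all(abs(r ** 3 + 8) < 1e-9 for r in roots)  # each root cubes to -8
```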

**Proof that $|q(x)| < |q(0)|$**:

With the definition of $x$ given above, we can write

$$q(x) = q_0 - q_0 r^m + q_{m+1} r^{m+1} \left(\frac{-q_0} {q_m}\right)^{(m+1)/m} + \cdots + q_n r^n \left(\frac{-q_0} {q_m}\right)^{n/m}$$ $$= q_0 - q_0 r^m + E,$$ where the extra terms $E$ can be bounded as follows. Assume that $|q_0| \leq |q_m|$ (a very similar expression is obtained for $|E|$ in the case $|q_0| \geq |q_m|$), and define $s = r(|q_0/q_m|)^{1/m} \le r$. Then, provided $r$ is small enough that $s < 1$, applying the well-known formula for the sum of a geometric series, we can write $$|E| \leq r^{m+1} \max_i |q_i| \left|\frac{q_0}{q_m}\right|^{(m+1)/m} (1 + s + s^2 + \cdots + s^{n-m-1}) \leq \frac{r^{m+1}\max_i |q_i|}{1 - s} \left|\frac{q_0}{q_m}\right|^{(m+1)/m}.$$ Thus $|E|$ can be made arbitrarily smaller than $|q_0| r^m$ by choosing $r$ small enough. For instance, select $r$ so that $|E| < |q_0| r^m / 2$. Then for such an $r < 1$, the triangle inequality gives $$|q(x)| = |q_0 - q_0 r^m + E| \le |q_0|(1 - r^m) + |E| < |q_0|(1 - r^m / 2) < |q_0| = |q(0)|,$$ which contradicts the original assumption that $|q(z)|$ has a minimum nonzero value at $z = 0$.
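The descent step can be illustrated numerically. In this Python sketch the polynomial and the radius $r$ are chosen by us purely for illustration; moving a small distance from the origin in the direction of an $m$-th root of $-q_0/q_m$ strictly decreases $|q|$:

```python
# A sample polynomial with q(0) != 0: q_0 = 2, and the first nonzero
# higher-order coefficient is q_m = 3 with m = 2 (illustrative choices).
def q(z):
    return 2 + 3 * z**2 + z**5

q0, qm, m = 2, 3, 2
r = 0.1                          # a small positive radius
# Python's ** returns a principal complex m-th root for a negative base.
x = r * (-q0 / qm) ** (1 / m)

# The descent step of the proof: |q(x)| < |q(0)|.
assert abs(q(0)) == 2
assert abs(q(x)) < abs(q(0))
```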

**THEOREM 2**: Every polynomial of degree $n$ with real or complex coefficients has exactly $n$ complex roots, when counting individually any repeated roots.

**Proof**: If $\alpha$ is a real or complex root of the polynomial $p(z)$ of degree $n$ with real or complex coefficients, then by dividing this polynomial by $(z - \alpha)$, using the well-known polynomial division process, one obtains $p(z) = (z - \alpha) q(z) + r$, where $q(z)$ has degree $n - 1$ and $r$ is a constant. But note that $p(\alpha) = r = 0$, so that $p(z) = (z - \alpha) q(z)$. By Theorem 1, $q(z)$ itself has a root, so continuing by induction we conclude that the original polynomial $p(z)$ has exactly $n$ complex roots, although some might be repeated.
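The division step in this proof is just synthetic division, which can be sketched in a few lines of Python (the function name and example polynomial are ours):

```python
def divide_by_root(coeffs, alpha):
    """Divide a polynomial (coefficients listed highest degree first)
    by (z - alpha) via synthetic division; returns (quotient, remainder)."""
    quotient = [coeffs[0]]
    for c in coeffs[1:]:
        quotient.append(c + alpha * quotient[-1])
    return quotient[:-1], quotient[-1]

# p(z) = z^2 - 3z + 2 = (z - 1)(z - 2); divide by the root alpha = 1
quotient, remainder = divide_by_root([1, -3, 2], 1)
assert remainder == 0          # p(alpha) = 0, as in the proof
assert quotient == [1, -2]     # q(z) = z - 2, of degree one less
```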

For other proofs in this series see the listing at Simple proofs of great theorems.

Mankind has been fascinated with $\pi$, the ratio between the circumference of a circle and its diameter, for at least 2500 years. Ancient Hebrews used the approximation 3 (see 1 Kings 7:23 and 2 Chron. 4:2). Babylonians used the approximation 3 1/8. Archimedes, in the first rigorous analysis of $\pi$, proved that 3 10/71 < $\pi$ < 3 1/7, by means of a sequence of inscribed and circumscribed polygons. Later scholars in India (where decimal arithmetic was first developed, at least by 300 CE), China and the Middle East computed $\pi$ ever more accurately. In 1665, Newton computed 16 digits, but, as he later confessed, “I am ashamed to tell you to how many figures I carried these computations, having no other business at the time.” In 1844 the computing prodigy Johann Dase produced 200 digits. In 1874, Shanks published 707 digits, although later it was found that only the first 527 were correct.

In the 20th century, $\pi$ was computed to thousands, then to millions, then to billions of digits, in part due to some remarkable new formulas and algorithms for $\pi$, and in part due to clever computational techniques, all accelerated by the relentless advance of Moore’s Law. The most recent computation, as far as the present author is aware, produced 22.4 trillion digits. One may download up to the first trillion digits here. For additional details on the history and computation of $\pi$, see The quest for $\pi$ and The computation of previously inaccessible digits of pi^2 and Catalan’s constant.

For many years, part of the motivation for computing $\pi$ was to answer the question of whether $\pi$ was a rational number — if $\pi$ was rational, the decimal expansion would eventually repeat. But given that no repetitions were found in early computations, many leading mathematicians in the 17th and 18th century concluded that $\pi$ must be irrational. In 1761, Johann Heinrich Lambert settled the question by proving that $\pi$ is irrational. Then in 1882 Ferdinand von Lindemann proved that $\pi$ is transcendental, which proved once and for all that the ancient Greek problem of squaring the circle is impossible (because ruler-and-compass constructions can only produce power-of-two degree algebraic numbers).
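The fact that a rational number’s decimal expansion must eventually repeat follows from long division: the remainder at each step can take only finitely many values, so some remainder must recur. A short Python sketch (function name ours) makes this concrete:

```python
def decimal_expansion(numerator, denominator, max_digits=100):
    """Long division of the fractional part of numerator/denominator.
    Returns (digits, repeat_start), where repeat_start is the index at
    which the digits begin to cycle (None if the expansion terminates)."""
    seen = {}                      # remainder -> position of first appearance
    digits = []
    rem = numerator % denominator
    while rem and rem not in seen and len(digits) < max_digits:
        seen[rem] = len(digits)
        rem *= 10
        digits.append(rem // denominator)
        rem %= denominator
    return digits, seen.get(rem)

digits, start = decimal_expansion(1, 7)
assert digits == [1, 4, 2, 8, 5, 7]   # 1/7 = 0.(142857), repeating
assert start == 0                     # the cycle starts at the first digit
```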

We present here what we believe to be the simplest proof that $\pi$ is irrational. It was first published in 1949 as an exercise in the Bourbaki treatise on calculus. It requires only a familiarity with integration, including integration by parts, which is a staple of any high school or first-year college calculus course. It is similar to, but simpler than (in the present author’s opinion), a related proof due to Ivan Niven.

**Gist of the proof:**

The basic idea is to define a function $A_n(b)$ as an integral from $0$ to $\pi$. This function has the property that for each positive integer $b$ and all sufficiently large integers $n$, $A_n(b)$ lies strictly between 0 and 1. A second line of reasoning, which assumes that $\pi$ is a rational number and applies integration by parts, concludes that $A_n(b)$ must be an integer. This contradiction shows that $\pi$ must be irrational.

**THEOREM**: $\pi$ is irrational.

**Proof**:

For each positive integer $b$ and non-negative integer $n$, define $$A_n(b) = b^n \int_0^\pi \frac{x^n(\pi – x)^n \sin(x)}{n!} \, {\rm d}x.$$ Note that the integrand function of $A_n(b)$ is zero at $x = 0$ and $x = \pi$, but is strictly positive elsewhere in the integration interval. Thus $A_n(b) > 0$. In addition, since $x (\pi – x) \le (\pi/2)^2$, we can write $$A_n(b) \le \frac{\pi b^n}{n!}\left(\frac{\pi}{2}\right)^{2n} = \frac{\pi (b \pi^2 / 4)^n}{n!},$$ which is less than one for large $n$, since, by Stirling’s formula, the $n!$ in the denominator increases faster than the $n$-th power in the numerator. Thus we have established, for any integer $b \ge 1$ and all sufficiently large $n$, that $0 < A_n(b) < 1.$
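The bound $\pi (b \pi^2/4)^n / n!$ can be checked numerically. In this Python sketch, with the arbitrary choice $b = 7$, the bound is large at $n = 10$ but the factorial eventually wins:

```python
import math

def bound(b, n):
    """The upper bound pi * (b * pi^2 / 4)^n / n! on A_n(b) derived above."""
    return math.pi * (b * math.pi ** 2 / 4) ** n / math.factorial(n)

# With b = 7 (arbitrary), the bound starts above 1 but plunges toward 0:
assert bound(7, 10) > 1
assert bound(7, 60) < 1
assert bound(7, 100) < 1e-20
```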

Now let us assume that $\pi = a/b$ for relatively prime integers $a$ and $b$. Define $f(x) = x^n (a - bx)^n / n!$. Then we can write, using integration by parts, $$A_n(b) = \int_0^\pi f(x) \sin(x) \, {\rm d}x = \left[-f(x) \cos(x)\right]_0^\pi - \left[-f'(x) \sin(x)\right]_0^\pi + \cdots$$ $$ \cdots \pm \left[f^{(2n)} (x) \cos(x)\right]_0^\pi \pm \int_0^\pi f^{(2n+1)} (x) \cos(x) \, {\rm d}x.$$ Now note that for $k = 0$ to $k = n - 1$, the derivative $f^{(k)} (x)$ is zero at $x = 0$ and $x = \pi$, since the factors $x^n$ and $(a - bx)^n$ vanish to order $n$ at these points. For $k = n$ to $k = 2n$, $f^{(k)} (x)$ takes integer values at $x = 0$ and $x = \pi$; and for $k = 2n + 1$, the derivative is identically zero, since $f$ has degree $2n$. Also, the function $\sin(x)$ is $0$ at $x = 0$ and $x = \pi$, while the function $\cos(x)$ is $1$ at $x = 0$ and $-1$ at $x = \pi$. Combining these facts, we conclude that $A_n (b)$ must be an integer. This contradiction proves that $\pi$ is irrational.

For other proofs in this series see the listing at Simple proofs of great theorems.

Modern mathematics is one of the most enduring edifices created by humankind, a magnificent form of art and science that all too few have the opportunity of appreciating. The great British mathematician G.H. Hardy wrote, “Beauty is the first test; there is no permanent place in the world for ugly mathematics.” Mathematician-philosopher Bertrand Russell added: “Mathematics, rightly viewed, possesses not only truth, but supreme beauty — a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show.”

Yet it is a sad fact that few people on this planet even begin to grasp the full power and beauty of mathematics. For the vast majority of humankind, even in first-world nations, all that students see of mathematics is some relatively dull drilling in decimal arithmetic (e.g., the dreaded times tables), some very basic algebra and various textbook “story problems.” Most never glimpse the higher-level beauty of real modern mathematics that lies at the end of the tunnel. Many otherwise promising students lose interest in mathematics as a result of dreary instruction, often by teachers who did not major in mathematics and for whom teaching mathematics is a chore. Inevitably, math homework often loses out to social media, computer games, parties and sports. The result is a tragic waste of desperately needed mathematical talent for the present and future information age.

Part of the problem here is that hardly any students ever see some of the more beautiful parts of mathematics, such as elegant proofs of important mathematical theorems. Some of these theorems may be mentioned in textbooks, but often their proofs are dismissed as “beyond the scope of this text,” relegated to some higher-level mathematics course only taken by advanced math majors or graduate students, which is an extremely tiny audience. Even when these proofs are part of the curriculum, the presentation in textbooks often fails to highlight the key ideas in a way that students can easily grasp and from which they can gain intuition.

As a single example, the fundamental theorem of algebra, namely the fact that every algebraic equation of degree *n* has *n* roots, typically is presented in a college-level course in complex analysis, and then only after an extensive background of underlying theory such as Cauchy’s theorem, the argument principle and Liouville’s theorem. Only a tiny percentage of college students ever take such coursework, and even many of those who do take such coursework never really grasp the essential idea behind the fundamental theorem of algebra, because of the way it is typically presented in textbooks (this was certainly the editor’s initial experience with the fundamental theorem of algebra). All this is generally accepted as an unpleasant but unavoidable feature of mathematical pedagogy.

The editor of this blog rejects this defeatism. He is convinced that many of the greatest theorems of mathematics can be proved significantly more simply, and requiring significantly less background, than they are typically presented in traditional textbooks and courses.

Thus the editor has decided to start a new feature in this blog, namely to present simple, beautiful and readily understandable proofs of a number of important theorems. In most cases, as we will see, this can be done without sacrificing rigor, and yet also without a huge background of prior coursework and machinery that many younger students especially might not have the patience or time for.

The proofs envisioned for presentation will be designed to satisfy the following requirements:

- The proofs will include some historical background and context.
- In most cases, the most simple, elegant and beautiful proof of a given theorem will be the one presented.
- Whenever possible, the proofs will be suitable for presentation to bright high school students or, at most, first- or second-year college mathematics students.
- Any necessary lemmas or other background material beyond very basic facts will be included.
- The entire proof, including historical context and background material, will not exceed three pages or so.

An impossible dream? The editor is convinced otherwise.

Here are some articles that will start the series. Others will be added in the coming months:

Comments on any of this material are certainly welcome. What’s more, if a reader has a suggestion for some theorem that has not yet been featured, or even for an alternate proof of some theorem that already has been featured, please contact the editor. His email is given at Welcome to the Math Scholar blog.

As we have explained in previous Math Scholar blogs (see, for example, MS1 and MS2), the question of why the heavens are silent even though, from all evidence, the universe is teeming with potentially habitable exoplanets continues to perplex and fascinate scientists. It is one of the most significant questions of modern science, with connections to mathematics, physics, astronomy, cosmology, biology and philosophy.

In spite of the glib dismissals that are often presented in public venues and (quite sadly) in writings by some professional scientists (see MS1 and MS2 for examples and rejoinders), there is no easy resolution to Fermi’s paradox, as it is known.

**Sociological explanations**, such as “they have lost interest in research and exploration,” or “they are under a galactic directive not to disturb Earth,” or “they have moved on to more advanced communication technologies,” all fall prey to a diversity argument: Any advanced society that arose from an evolutionary process (which is the only conceivable natural process to explain the origin of complex beings) surely must consist of millions if not billions or even trillions of diverse individuals. All it takes is for one individual in just one of these advanced civilizations to disagree, say, with the galactic directive, and send probes or communications to Earth in a form that earthlings or other newly technological societies can easily recognize, and this explanation fails. Note, for example, that any signal (microwave, light or gravitational wave), once it has been sent by extraterrestrials (ETs) on its way to Earth, cannot be censored or called back, according to known laws of physics.

Similarly, **technological explanations** (i.e., it is too difficult for them) were once considered persuasive, but now founder on the fact that even present-day human technology is sufficient to communicate with and (soon) to send probes to distant stars and planets (see, for example, technical reports TR1, TR2, TR3 and TR4). If we can do this now, or within the next 20-30 years, surely civilizations thousands or millions of years more advanced can do this much better, and much more cheaply too. This same space technology, by the way, defeats the “all advanced civilizations destroy themselves” explanation, since as soon as human civilization (for instance) establishes permanent outposts on the Moon, Mars and beyond, its continued survival will be largely impervious to any calamities that may befall the home planet. By the way, the self-destruct explanation also falls prey to a diversity argument — even if most civilizations destroy themselves, surely at least a handful have figured out how to survive and thrive over the long term?

So are we alone, at least in the Milky Way? Perhaps, although many (including the present author) find such a conclusion rather distasteful, to say the least, since it goes against the Copernican principle that has guided scientific research for many years. See previous Math Scholar blogs MS1 and MS2 for a more complete discussion of these issues.

Recently two books and a *Scientific American* article (which summarizes a third book) have appeared on this topic. Here is a brief synopsis of each.

In his book If the Universe Is Teeming with Aliens … WHERE IS EVERYBODY?: Seventy-Five Solutions to the Fermi Paradox and the Problem of Extraterrestrial Life, British physicist Stephen Webb updates his 2002 edition with a thorough discussion of 50 earlier proposed solutions to Fermi’s paradox, updated to reflect the latest scientific findings, plus 25 new proposed solutions for this edition. The discussion is very readable and focused, yet does not omit technical detail where needed.

Here are just a few of the proposed solutions presented (often with devastating rejoinders) in Webb’s book. Item numbers and titles are from Webb’s book:

- *The Zoo scenario*: We are part of a preserve, being tended by an overarching society of ETs.
- *The interdict scenario*: There is a galaxy-wide prohibition on visits to or communication with Earth.
- *They have not had time to reach us*: ETs may exist, but they have not yet been able to reach us or communicate with us.
- *Bracewell-von Neumann probes*: ETs could have explored the galaxy by means of self-replicating probes that arrive at a star system, find a suitable asteroid or planet, construct additional copies of themselves, and then launch them to other star systems, with updated software beamed from the home planet. Such probes could have explored the entire Milky Way in a few million years, an eyeblink in cosmic time. So where are they?
- *They are signaling, but we don’t know how to listen*: We are not using the right technology to receive and/or decipher their signals.
- *They have no desire to communicate*: ETs have settled into the good life and no longer have any desire to explore the cosmos or learn about other newly technological societies like us.
- *Intelligence isn’t permanent*: Intelligence has arisen elsewhere, but then degraded because it was not needed so much anymore.
- *We live in a post-biological universe*: ETs have progressed from biological to machine-based existence and no longer have interest in us.
- *Continuously habitable zones are narrow*: The Earth is nearly unique in maintaining a habitable environment over billions of years.
- *The galaxy is a dangerous place*: Much if not most of the galaxy is uninhabitable due to a steady rain of supernova explosions, gamma ray bursts and other phenomena that regularly sterilize their surroundings.
- *Life’s genesis is rare*: Although progress has been made, the origin of the first reproducing biomolecules on Earth is still a mystery. Perhaps biogenesis was a freak of nature, much more singular than we suppose.
- *Intelligence at the human level is rare*: Even if life is common, perhaps human-level intelligence, with its associated technology, is extremely rare.
- *Science is not inevitable*: Perhaps even if intelligence emerges, the development of science and advanced technology is rare.

Webb discusses each of these proposed solutions in detail, with numerous references to technical literature and other analyses. He also frankly discusses the weaknesses of each of these, and, in many places, offers his own assessment. In the end, Webb acknowledges that while we still have much to learn, he personally thinks it is unlikely that there are any other human-like technological societies in the Milky Way.

In his new book The Great Silence: Science and Philosophy of Fermi’s Paradox, Milan Cirkovic analyzes Fermi’s paradox from a somewhat deeper philosophical and scientific basis. He argues that many of the approaches to the great silence are riddled with logical errors and other weaknesses. He then attempts to carefully examine many of the proposed solutions and to identify what can be safely said.

For example, Cirkovic argues that discussions of the Drake equation, which is used to estimate the number of space-communicating civilizations in the Milky Way, are typically fallacious. Drake’s equation, which was first proposed by Frank Drake in the 1960s, is:

$$N = R_* \, f_p \, n_e \, f_l \, f_i \, f_c \, L,$$

where $N$ is the number of predicted ET civilizations, $R_*$ is the star formation rate in the Milky Way, $f_p$ is the fraction of stars possessing planets, $n_e$ is the average number of habitable planets per planetary system, $f_l$ is the fraction of habitable planets possessing life, $f_i$ is the fraction of inhabited planets developing intelligent life, $f_c$ is the fraction of intelligent societies developing space communication technology, and $L$ is the lifespan of civilizations that have developed such technology.

Cirkovic argues that because of the huge uncertainties involved, each term should be given by a probability distribution, and the product integrated over the multidimensional parameter space. In any event, since virtually all of these terms are matters of active astrobiological research, writers who cite Drake’s equation without analyzing it more carefully are on shaky ground.
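Cirkovic’s point can be illustrated with a small Monte Carlo sketch of Drake’s equation. The distributions below are invented purely for illustration; the point is that with log-uniform uncertainty in the biological terms and in $L$, the mean of $N$ vastly exceeds its median, so a single point estimate is nearly meaningless:

```python
import random

random.seed(42)  # reproducible draws

def drake_sample():
    """One Monte Carlo draw of N = R* * fp * ne * fl * fi * fc * L.
    Every distribution here is invented for illustration only."""
    R_star = random.uniform(1.5, 3.0)      # star formation rate (stars/year)
    f_p    = random.uniform(0.9, 1.0)      # fraction of stars with planets
    n_e    = random.uniform(0.1, 2.0)      # habitable planets per system
    f_l    = 10 ** random.uniform(-3, 0)   # log-uniform: orders-of-magnitude uncertainty
    f_i    = 10 ** random.uniform(-3, 0)
    f_c    = random.uniform(0.1, 1.0)
    L      = 10 ** random.uniform(2, 6)    # technological lifetime (years)
    return R_star * f_p * n_e * f_l * f_i * f_c * L

samples = sorted(drake_sample() for _ in range(100_000))
median = samples[len(samples) // 2]
mean = sum(samples) / len(samples)
# Heavy right skew: the mean is driven by rare huge draws, so quoting
# a single "expected N" misrepresents the distribution.
assert mean > 10 * median
```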

In the end, Cirkovic argues that in addition to identifying and correcting prejudices and fallacies in this arena, we need to realize that substantive progress awaits more real scientific data. But he insists that we must not retreat from the Copernican principle, namely that ultimately our existence is not unique. He concludes on a positive note:

Even when [space exploration] is achieved, it will not be the beginning of the end — but it might be the end of the beginning of the greatest of all journeys, the wildest adventure of all adventures. … After all, this is what science has stood for in its most brilliant moments: courage, conviction, and the spirit of great adventure. These qualities might, in the final analysis, withstand the test of cosmic time, repulse the last challenge to Copernicanism, and ultimately break the Great Silence.

Astrophysicist and well-known science writer John Gribbin has weighed in on the debate with an article in the August 2018 *Scientific American* entitled Are Humans Alone in the Milky Way? Why we are probably the only intelligent life in the galaxy. His article presents a brief summary of his 2011 book Alone in the Universe: Why Our Planet Is Unique, with a number of updates reflecting recent scientific discoveries and findings.

Gribbin argues that we are the product of a long series of remarkably fortuitous (and unlikely) events that might not have been repeated anywhere else, at least within the Milky Way. These include:

- *Special timing*: Given that all elements heavier than hydrogen, helium and lithium were created in supernova explosions, a metal-rich environment like our sun and solar system could not exist until several billion years after the big bang, yet not so late that the sun has begun to die.
- *Special location*: While the Milky Way seems vast, regions much closer than our sun to the center of the galaxy are bathed in high-energy radiation and gamma ray bursts, which are lethal to any conceivable form of biology. Much farther away from the center than our sun, metallicity and other conditions are not met. In other words, we reside in a Goldilocks zone in the Milky Way, estimated to span only about 7% of the galactic radius.
- *Special planet*: In spite of all the talk about exoplanets in the Milky Way, only a few are rocky like Earth, fewer reside in the habitable zone, and fewer still (or perhaps none at all) have these features plus the combination of a molten core generating a magnetic field and plate tectonics to regulate climate. In our solar system, Mars is too far away, has no magnetic field and is too cold (although primitive organisms might exist underground). Venus lacks a magnetic field and plate tectonics and, because of a runaway greenhouse effect, is much too hot for any life. The collision with Earth by a Mars-size object in the early solar system, and the presence of a large moon, also appear to be key to Earth’s long-term, life-friendly environment.
- *Special life*: Once Earth’s original molten state settled, life appeared with “indecent rapidity,” in Gribbin’s words. But then not much happened for the next three billion years. More complex cells, known as eukaryotes, resulted from the chance merger of two primordial organisms, bacteria and archaea, but even after this crucial milestone, not much happened for another billion years, until the Cambrian explosion roughly 540 million years ago. So the rise of complex life forms, much less intelligent creatures such as Homo sapiens, was hardly inevitable.
- *Special species*: Homo sapiens appeared about 300,000 years ago, but nearly went extinct 150,000 years ago, and again about 70,000 years ago. Thus our subsequent domination of the planet was far from assured.

Gribbin concludes as follows:

As we put everything together, what can we say? Is life likely to exist elsewhere in the galaxy? Almost certainly yes, given the speed with which it appeared on Earth. Is another technological civilization likely to exist today? Almost certainly no, given the chain of circumstances that led to our existence. These considerations suggest we are unique not just on our planet but in the whole Milky Way. And if our planet is so special, it becomes all the more important to preserve this unique world for ourselves, our descendants and the many creatures that call Earth home.

As mentioned above, the question of extraterrestrial intelligent life, and why we have not yet found any, surely must rank at or near the top of the most significant scientific questions of all time. In his foreword to Webb’s book, noted British astronomer Martin Rees explains what is at stake:

Maybe we will one day find ET. On the other hand, [Webb’s] book offers 75 reasons why SETI searches may fail; Earth’s intricate biosphere may be unique. That would disappoint the searchers, but it would have an upside: it would entitle us humans to be less “cosmically modest.” Moreover, this outcome would not render life a cosmic sideshow. Evolution may still be nearer its beginning than its end. Our Solar System is barely middle aged and, if humans avoid self-destruction, the post-human era beckons. Life from Earth could spread through the Galaxy, evolving into a teeming complexity far beyond what we can even conceive. If so, our tiny planet — this pale blue dot floating in space — could be the most important place in the entire Galaxy, and the first interstellar voyagers from Earth would have a mission that would resonate through the entire Galaxy and perhaps beyond.

String theory is the name for the theory of mathematical physics which proposes that physical reality is based on exceedingly small “strings” and “branes,” embedded in 10- or 11-dimensional space. String theory has been proposed as the long-sought “theory of everything,” because it appears to unite relativity and quantum theory, and also because it is so “beautiful.”

Yet in spite of decades of effort, by thousands of brilliant mathematical physicists, the field has yet to produce specific experimentally testable predictions. What’s more, hopes that string theory would result in a single crisp physical theory, pinning down unique laws and fundamental constants, have been dashed. Instead, the theory admits a huge number of different options, by one reckoning numbering more than 10^{500}, which is often called the “string theory multiverse” or the “string theory landscape.”

A number of leading string theorists and other physicists view the huge number of different multiverse options as an advantage. They argue that this may offer a solution to the long-standing paradox of why the universe appears so freakishly fine-tuned for life. They reason that we should not be surprised to live in an extraordinarily life-friendly universe, because given 10^{500} possible universe designs, inevitably at least one (ours) beats the enormous odds and is favorable to life.

However, the string theory/multiverse hypothesis also has numerous detractors. Sabine Hossenfelder, in her 2018 book Lost in Math: How Beauty Leads Physics Astray, laments the fact that physicists have for so long pursued string theory, supersymmetry and other theories that lack solid underpinning in empirical science, just because they are considered “beautiful.” Similarly, physicist Lee Smolin, writing in his 2006 book The Trouble With Physics, declares,

A scientific theory that makes no predictions and therefore is not subject to experiment can never fail, but such a theory can never succeed either, as long as science stands for knowledge gained from rational argument borne out by evidence.

On 25 June 2018, a team of prominent string theorists led by Cumrun Vafa of Harvard University released a new paper that fundamentally questions the whole notion of a string theory-based multiverse with 10^{500} or more variations. If their results are confirmed by other physicists and experimental evidence, they could seriously challenge string theory and/or our entire conception of modern physics and cosmology.

The Vafa result and its implications are explained in detail in a very nice Quanta article by Natalie Wolchover. Here is a brief summary:

The main result of the Vafa team paper implies that as the universe expands, the vacuum energy density of empty space must decrease at least as fast as the rate given by a certain formula. The rule appears to hold in most string theory-based models of the universe, but it violates two bedrocks of modern cosmology: the accelerating expansion of the universe due to dark energy and the hypothesized “inflation” epoch of the first second after the big bang.
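As a hedged aside (the formula itself is not quoted above), the bound in question, now commonly called the “de Sitter swampland conjecture,” is usually written as a condition on the scalar potential V of any effective theory consistent with string theory:

```latex
% De Sitter swampland conjecture (Obied, Ooguri, Spodyneiko, Vafa, 2018):
% the scalar potential V of a consistent effective theory must satisfy
\left| \nabla V \right| \;\geq\; \frac{c}{M_{\mathrm{Pl}}}\, V ,
% where M_Pl is the Planck mass and c is a positive universal
% constant of order one.
```

Applied to an expanding universe, such a bound forbids a stable, constant positive vacuum energy: the potential must keep decreasing at some nonzero minimum rate.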

In particular, Vafa and his colleagues argue that “de Sitter universes,” with stable, constant and positive amounts of vacuum energy density, simply are not possible under rather general assumptions of string theory. Such universes in general, and ours in particular, would reside in the “swampland” of string theory because they are not mathematically consistent. And yet here we are…

Just as significantly, the Vafa team result draws into question the widely accepted “inflation” epoch of the very early universe, when the universe is thought to have expanded by a factor of roughly 10^{30} or more. The trouble is, Vafa’s result implies that the “inflaton field” of energy that drove inflation must have declined too quickly to have formed a smooth and flat universe like the one we reside in.

String theorists and other physicists are divided about the Vafa team result. Eva Silverstein of Stanford University, a leader in efforts to construct string theory-based models of inflation, believes the result is likely false, as does her husband Shamit Kachru, co-author of the 2003 KKLT paper that is the basis of string theory-based models of de Sitter universes. But others, such as Hirosi Ooguri of the California Institute of Technology, are inclined to believe the Vafa result, since other “swampland” conjectures have withstood challenges and are now on a very solid theoretical footing.

One possible way out is that the accelerating expansion of the universe is not due to an ever-present positive dark energy, as currently believed, but instead is due to quintessence, a hypothesized energy source that gradually decreases over tens of billions of years. If the quintessence hypothesis is true, it could revolutionize physics and cosmology.

The quintessence hypothesis will be tested in several new experiments, some currently underway and others scheduled for the future, which will analyze more carefully whether the accelerating expansion of the universe is constant or variable. One of these is the Dark Energy Survey, which analyzes the clumpiness of galaxies. Its initial results, released in August 2017, find the universe is 74% dark energy and 21% dark matter, with everything else (stars, galaxies, planets and us) in the remaining 5%, all pretty much consistent with our current understanding.

Another related experiment is the Wide Field Infrared Survey Telescope (WFIRST) system, which is specifically designed to study dark energy and infrared astrophysics. The Euclid telescope, currently in development, will investigate even more accurately the relationship between distance and redshift that is at the heart of modern cosmology.

As mentioned above, the theory of cosmic inflation is also challenged by the Vafa team result. Along this line, experimental facilities such as the Simons Observatory will search for signatures and other evidence of cosmic inflation. This evidence will be scrutinized carefully, though, in the wake of the widely hailed 2014 announcement of evidence for inflation that subsequently bit the dust, so to speak: the results turned out to be explained by dust in the Milky Way.

However these results turn out, they are likely to have earthshaking implications for the fields of mathematical physics, experimental physics and cosmology.

If Vafa’s results are confirmed as correct, and the dark energy explanation for the accelerating expansion of the universe and the inflation epoch of the early universe are reaffirmed by experimental evidence, then this may spell doom for string theory.

Either way, Vafa’s conjecture has certainly roused the physics community like few other developments of recent years. Vafa explains,

Raising the question is what we should be doing. And finding evidence for or against it — that’s how we make progress.


The International Mathematical Union announces the Fields Medal awardees at the every-four-years International Congress of Mathematicians (ICM) meeting, which this year is being held in Rio de Janeiro. The awards, which are made to a maximum of four exceptional mathematicians under the age of 40, are often considered the “Nobel Prize” of Mathematics.

Here is some information on this year’s awardees and their work:

- Caucher Birkar. Birkar was born and raised in a very poor farming village in the Kurdish region of Iran. But eventually he managed to enroll at the University of Tehran. He recalls seeing photos of various Fields medalists on the walls, wondering, “Will I ever meet one of these people?” Little did he know…
In his senior year, while traveling in England, Birkar sought and received political asylum. He subsequently enrolled in the University of Nottingham, where he excelled in his mathematical studies. He currently is on the faculty at the University of Cambridge.

The work for which Birkar was recognized is in the area of “algebraic geometry,” namely the study of algebraically equivalent ways of viewing geometric objects. The set of solutions common to a given set of equations is known as an “algebraic variety.” Birkar used the theory of “birational transformations” to show that a class of geometric objects known as “Fano varieties” form neat, orderly families in every dimension.

For more details on Birkar and his work, see this article by Kevin Hartnett.

- Alessio Figalli. Figalli was born in Rome. Although he liked mathematics from an early age, he didn’t seriously study the subject until his third year of high school. He was admitted into the Scuola Normale Superiore of Pisa, a university devoted to mathematically and scientifically gifted students, but struggled at first because other students were more advanced. But his talent quickly became apparent, as he caught up with and then exceeded the other students. He currently is on the faculty of ETH Zurich.
Figalli became interested in the “optimal transport problem,” namely to find the most efficient path and means to move materials from one location to another. This is closely related to problems of “minimal surfaces,” which have long been studied by Italian mathematicians.

In one key paper published in 2010, Figalli, together with his colleagues Francesco Maggi and Aldo Pratelli, presented a proof of the stability of energy-minimizing shapes like crystals and soap bubbles. The proof takes the form of a precise inequality, which implies that if one increases the energy of a system such as a bubble or crystal by a certain amount, the resulting shape will deviate from the original shape by no more than a related amount. They further showed that this inequality is optimal in a certain precise sense.
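The flavor of such a stability inequality can be sketched numerically. This toy example is ours, not Figalli’s: among rectangles of fixed area, the square minimizes perimeter, and the perimeter excess grows only quadratically in the deviation from the square, so a small increase in “energy” permits only a small change of shape.

```python
# Toy stability check: rectangles of area 1 with sides t and 1/t.
# The square (t = 1) minimizes the perimeter; perturbing the shape
# by eps raises the perimeter by only about 2*eps^2, so a small
# energy increase permits only a small shape deviation.
def perimeter(t):
    return 2 * (t + 1 / t)

p_square = perimeter(1.0)  # minimal perimeter, exactly 4
for eps in (0.05, 0.1, 0.2):
    excess = perimeter(1 + eps) - p_square
    print(f"eps = {eps}: perimeter excess = {excess:.5f} "
          f"(approx 2*eps^2 = {2 * eps * eps:.5f})")
```

Of course, the actual theorem concerns general sets in any dimension and a precise asymmetry measure; this sketch only illustrates the quadratic, second-order character of the stability bound.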

For more details on Figalli and his work, see this article, also by Kevin Hartnett.

- Akshay Venkatesh. Venkatesh was born in New Delhi, India, but grew up in Perth, Australia. Although he describes his childhood as a “fairly normal suburban experience,” he did win some international physics and math Olympiad competitions at ages 11 and 12. He later attended Princeton University, and worked under the well-known number theorist Peter Sarnak.
Venkatesh has always viewed himself somewhat harshly. He described his own doctoral dissertation as “mediocre,” and had even considered leaving the field once, working for his uncle on a machine learning startup one summer. But then he was offered a C.L.E. Moore faculty position at MIT, and he accepted. He later wondered why his advisor (Sarnak) had written such a glowing recommendation. A fellow mathematician explained, “Sometimes, people see things in you that you don’t see.”

One of Venkatesh’s papers dealt with L-functions, which generalize the well-known Riemann zeta function. Venkatesh employed ideas from the theory of dynamical systems to establish subconvexity estimates for a huge family of L-functions.

For more details on Venkatesh and his work, see this article by Erica Klarreich.

- Peter Scholze. Scholze started teaching himself college-level mathematics at the age of 14, while attending a Berlin high school that specialized in mathematics and science. He recalls being fascinated by Andrew Wiles’ 1995 proof of Fermat’s Last Theorem. He confessed that although he was eager to understand the proof, at first he had much to learn. So for the next few years he studied some of the background theory, including modular forms and elliptic curves, together with numerous other techniques from number theory, algebra, geometry and analysis.
Scholze began his research in arithmetic geometry, namely the usage of geometry-connected tools to study integer solutions to multivariate equations, e.g., x y^{2} + 3 y = 5. He found, as others had earlier, that it was useful to study these problems using the p-adic numbers, which are like the real numbers but where closeness is defined as a difference that is divisible numerous times by the prime p.

Scholze ultimately proved a deep result known as the weight-monodromy conjecture. Scholze’s results expanded the scope of reciprocity laws, which govern the behavior of polynomials that use simple modular arithmetic (e.g., 5 + 10 mod 12 = 3). These reciprocity laws are generalizations of the quadratic reciprocity laws that were a favorite of Gauss. 20th century mathematicians found surprising links between these laws and hyperbolic geometry, links that have been elaborated more recently as the “Langlands program,” an instance of which was the key to the proof of Fermat’s Last Theorem. Scholze has shown how to extend the Langlands program to a much wider range of structures in hyperbolic spaces.
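For a concrete taste of the quadratic reciprocity just mentioned, here is a small sketch (our illustration, entirely unrelated to Scholze’s machinery) that verifies Gauss’s law for a few pairs of odd primes, computing Legendre symbols via Euler’s criterion:

```python
# Legendre symbol (a|p) for odd prime p: +1 if a is a nonzero
# square mod p, -1 if a is a non-square, 0 if p divides a.
# Euler's criterion: (a|p) = a^((p-1)/2) mod p, read as +/-1.
def legendre(a, p):
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

# Gauss's quadratic reciprocity: for distinct odd primes p, q,
#   (p|q) * (q|p) = (-1)^(((p-1)/2) * ((q-1)/2))
for p, q in [(3, 5), (5, 7), (7, 19), (11, 13)]:
    lhs = legendre(p, q) * legendre(q, p)
    rhs = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    assert lhs == rhs, (p, q)
print("reciprocity holds for all sampled prime pairs")
```

The law relates the solvability of x^{2} = p (mod q) to that of x^{2} = q (mod p), which is the prototype of the far-reaching reciprocity laws described above.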

For more details on Scholze and his work, see this article, also by Erica Klarreich.

Congratulations to all of this year’s winners!


In a previous Math Scholar blog, we lamented the decline of peer review, as evidenced by the surprising number of papers, published in supposedly professional, peer-reviewed journals, claiming that Pi is not the traditional value 3.1415926535…, but instead is some other value. In the 12 months since that blog was published, other papers of this most regrettable genre have appeared.

As a single example of this genre, the author of a 2015 paper, which appeared in the International Journal of Engineering Sciences and Research Technology, states, “The area and circumference of circle has been estimated till now approximately. Millions of mathematicians all these years of human civilization have tried and failed.” This author then presents formulas, which he derives from a simple geometric argument, for the area of a circle and the circumference of a circle (both involving only simple quadratic expressions). He then asserts that Pi = (14 – Sqrt(2)) / 4 = 3.14644660941…, which he describes as “exact” and “kindly is revealed by God for the benefit of humanity,” as opposed to the merely “approximate” values that have been used in conventional mathematical literature.

Needless to say, this author is hopelessly mistaken. Pi really does equal 3.1415926535…, is given by any of hundreds of known formulas and iterative algorithms, and most certainly is not given by any algebraic expression, much less the simple one presented by the author, which differs from the true value of Pi beginning with the third digit. Numerous other examples of papers “proving” variant values of Pi are given in the previous Math Scholar blog.
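A two-line check, using nothing beyond Python’s standard library, makes the discrepancy plain:

```python
from math import pi, sqrt

claimed = (14 - sqrt(2)) / 4  # the paper's "exact" value of Pi
print(f"claimed: {claimed:.11f}")
print(f"actual:  {pi:.11f}")
# The two values already disagree in the third digit after the
# decimal point, by nearly 0.005 -- far beyond any rounding error.
assert abs(claimed - pi) > 0.0048
```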

There have always been, and probably always will be, persons such as the above author who doubt some well-established fact of mathematics, physics or other field, and who attempt to have their writings published. Doubtless many readers of this blog have received email from persons who are promoting such material.

The central question here is: **How did this and similar papers pass peer review?**

Peer review is the bedrock of modern science. Without rigorous peer review, by well-qualified reviewers, modern mathematics and science could not exist. For most reputable journals and conferences, reviewers are instructed to rate a submission on criteria such as:

1. Relevance to the journal or conference’s charter.
2. Clarity of exposition.
3. Objectivity of style.
4. Acknowledgement of prior work.
5. Freedom from plagiarism.
6. Theoretical background.
7. Validity of reasoning.
8. Experimental procedures and data analysis.
9. Statistical methods.
10. Conclusions.
11. Originality and importance.

Needless to say, the paper listed above, and others of its ilk, should never have been approved for publication, since such material immediately violates item 7, not to mention items 3, 4, 6 and others.

As we emphasized in the earlier Math Scholar blog, the editors of journals who have accepted papers with nonsense such as “proofs” of variant values of Pi certainly cannot plead ignorance. No editor or reviewer with even a first-year undergraduate background in mathematics could possibly fail to notice the astounding claim that the traditional value of Pi is incorrect. Indeed, it is hard to imagine a comparable claim in other fields:

- A claim, in a submission to a physics journal, that the speed of light is substantially incorrect.
- A claim, in a submission to a chemistry journal, that atoms and molecules do not really exist.
- A claim, in a submission to a biology journal, that evolution never really happened.
- A claim, in a submission to a geology journal, that the earth is only a few thousand years old.

At the very least, the assertion that the traditional value of Pi is incorrect would certainly have to be considered an “extraordinary claim,” which, as Carl Sagan once reminded us, requires “extraordinary evidence.” And it is quite clear that none of the above-mentioned papers have offered airtight arguments, presented in highly professional and rigorous mathematical language, to justify such a claim. Thus these manuscripts should have either been rejected outright, or, at the very least, referred to well-qualified mathematicians for rigorous review.

Also, the fervor in this paper and others of this genre should raise a red flag. There is absolutely no place in modern mathematics and science literature for fervor of any sort, whether it be religious, political or philosophical (see item #3 in the list of peer review standards above), since any good scholar should be prepared to discard his or her pet theory, once it has been clearly refuted by more careful reasoning or experimentation.

So how could such an egregious error of manuscript review have occurred? The answer is, regrettably, “follow the money.”

Indeed, the above journal, and other journals which have presented variant values of Pi, are all listed on Beall’s list of pay-to-publish journals. Many of these journals have very loose standards of publication, with only a superficial review at best, in return for charging a fee to authors for having their papers published on the journals’ websites. This is despite the journals’ advertised claims that they are professional, international, “peer-reviewed” journals.

The number of such journals has grown explosively in the past few years. As of the present date, Beall’s list includes roughly 8,700 journals, a number that is approximately **double** that of just one year ago. The result has been a flood of “atrocious” papers, according to researchers quoted in a recent Economist report.

In 2013, John Bohannon, a journalist with a PhD in molecular biology, performed an unusual experiment: he wrote several versions of a bogus research paper that claimed that a biomolecule commonly found in lichens inhibits the growth of cancer. These manuscripts, in Bohannon’s words, employed “laughably bad” methodology and baldly stated that the biomolecule was a “promising new drug,” without any clinical trial or other evidence. The manuscripts were attributed to fictional researchers at fictional institutions.

Of the 121 journals that Bohannon chose from a list of blacklisted journals, 69% accepted the manuscript they were sent, to be published for a fee. What’s worse, even some non-blacklisted open-access journals also accepted the submission.

One does not have to look very far afield for an explanation of this phenomenon: Many academics, under huge pressure to “publish or perish,” are all too eager to publish in a journal of questionable merit, even if a fee is required, rather than wait for the possibly one-year or more review process from a reputable journal, which may well result in a rejection. Besides, if the fact that the journal has a poor reputation is ever pointed out, the author(s) of the manuscript can feign ignorance.

A deeper explanation at work here is that most institutions, in hiring and promotion decisions, do not exercise due diligence in checking lists of publications against lists of journals known to have lax standards.

Recently Derek Pyne, an economist at Thompson Rivers University in Canada, published a study finding that many of his business school’s administrators, and most of its economics and business faculty, had published in journals on Beall’s list. And such publications seemed to pay off: of those researchers who had published in blacklisted journals, 56% had subsequently received at least one research grant from the school. The school did not appreciate this report, and Pyne was censured.

Along this line, Beall himself was pressured to cease compiling his list by his institution; his list is now being kept by another custodian who refuses to be named.

Obviously the mathematical community, and in fact the entire scientific community, needs to tighten standards for peer review and to oppose any form of “peer-reviewed” publication that involves only a perfunctory review. But it does not seem useful or even possible to simply “boycott” every open-access or pay-to-publish journal. Among other things, laws recently passed in the U.S. and Canada require taxpayer-funded research to be published in open-access journals. This was done with the best of intentions, even though the outcome is decidedly problematic.

But there is something positive that we can all do — write reviews!

Editors of many journals, covering a wide range of different fields, report that an increasing number of requests for reviews are being declined, or (arguably worse) simply ignored. The present author serves on the editorial board of several journals, some in the field of mathematics and others in computer arithmetic and high-performance computing. The experience is the same in all of these journals: more and more rejections of requests to write reviews. In some cases, the present author has needed to send 15 or more requests to obtain just three reviews.

But asking researchers to write more reviews also requires that department chairmen and chairwomen, deans and other persons in administrative roles recognize the importance of writing reviews. If this is not done, then given the huge time pressures that many researchers operate under, reviewing papers will continue to have the lowest priority.

In any event, there is a real danger that as a growing number of papers are published with erroneous or questionable results, other papers may cite them, thus starting a food chain of scholarship that is, at its base, mistaken. Such errors may be rooted out only years later, after legitimate mathematicians and scientists have cited and applied these results, and then labored in vain to understand the resulting paradoxical conclusions.

One way or the other, the plague of pay-to-publish journals with highly questionable or even nonexistent “peer review” processes must end. The future of all published scientific research depends on it.


In a new book, Lost in Math: How Beauty Leads Physics Astray, Sabine Hossenfelder reflects on her career as a theoretical physicist. She acknowledges that her colleagues have produced “mind-boggling” new theories. But she is deeply troubled by the fact that so much work in theoretical physics today is disconnected from empirical reality, yet is excused because the theories themselves, and the mathematics behind them, are “too beautiful not to be true.”

Hossenfelder notes that there have been numerous instances in the past when scientists’ over-reliance on “beauty” has led the field astray. Newton’s clockwork universe, with seemingly self-evident laws of motion and gravity operating on an unchanging grid of space and time, seemed just too natural and beautiful to ever be doubted. Sadly, this mindset retarded acceptance of relativity and quantum mechanics for decades.

Even quantum mechanics has its detractors, because much of it seems so counterintuitive. To this day, many physicists believe that there must be some more “reasonable” underlying theory. Yet even the most exacting recent tests continue to precisely confirm the predictions of quantum theory.

Hossenfelder argues that too many physicists today, like their counterparts of previous eras, cling to theories that have been seemingly rejected by experimental evidence, or which have extrapolated far beyond what can be reasonably expected to be tested by experimentation in any foreseeable time frame. The usual reason that researchers stick by these theories is that the mathematical techniques behind them are judged “too beautiful not to be true,” or for some other largely aesthetic (and thus unscientific) reason.

In this regard, Hossenfelder joins several other authors of recent years who have expressed concern about the prevailing “orthodox” direction of the field of mathematical physics. For example, physicist Lee Smolin, writing in his 2006 book The Trouble With Physics, declares

We physicists need to confront the crisis facing us. A scientific theory [the string theory/multiverse] that makes no predictions and therefore is not subject to experiment can never fail, but such a theory can never succeed either, as long as science stands for knowledge gained from rational argument borne out by evidence. There needs to be an honest evaluation of the wisdom of sticking to a research program that has failed after decades to find grounding in either experimental results or precise mathematical formulation. String theorists need to face the possibility that they will turn out to have been wrong and others right.

Hossenfelder cites these examples, among others:

- Supersymmetry is the notion that each particle has a “superpartner.” Supersymmetry was proposed in the 1970s to complete the collection of known particles. It is appealing to many physicists because it explains how the two known types of particles, fermions and bosons, belong together — supersymmetry proposes that the laws of nature remain unchanged when fermions are exchanged with bosons. More recently, supersymmetry has been proposed to explain why the Higgs boson has the mass it has. Hossenfelder notes that thousands of her colleagues have bet their careers on it.
But no such superparticles have ever been found, even in the latest high-energy experiments on the Large Hadron Collider. Undeterred, many physicists just “know” supersymmetry must be true and continue to pursue it.

- String theory is the notion that all of physical reality is derived from the harmonic properties of exceedingly small strings and higher-dimensional “branes,” and that reality is really 10- or 11-dimensional, with extra dimensions curled up into submicroscopically small configurations, much like a hose that appears one-dimensional from a distance but admits travel around its circular cross-section at close quarters.
String theory has been proposed as the long-sought “theory of everything,” because it appears to unite relativity and quantum theory, but mostly because it is so “beautiful.” Yet in spite of decades of effort, by literally thousands of brilliant mathematical physicists, the field has yet to produce any specific experimentally testable prediction, short of hypothesizing galaxy-sized accelerators. What’s more, hopes that string theory would result in a single crisp physical theory, pinning down unique laws and unique values of the fundamental constants, have been dashed. Instead, the theory admits a huge number of different options, corresponding to different Calabi-Yau spaces, which by one reckoning number more than 10^{500}.

- The multiverse is the notion that ultimate reality consists of a huge and possibly infinite collection of other universes (see the previous item, for instance), each forever separate and incommunicado from our own universe. The multiverse arose out of attempts to explain the apparent fine tuning of the laws and constants of physics. Researchers explain fine tuning by saying, in effect, that if our particular universe were not fine-tuned pretty much as we presently see it, we could not exist and thus would not be here to talk about it; end of story.
But many other scientists consider this type of “anthropic principle” argument highly controversial, an abdication of empirically testable physics. Is this really the best explanation that the field can produce?

Hossenfelder’s concern is personal as well as philosophical. She honestly does not see why “hidden rules” of “beauty” or aesthetics should have any place in the field. As she writes in the introduction,

The hidden rules served us badly. Even though we proposed an abundance of new natural laws, they all remained unconfirmed. And while I witnessed my profession slip into crisis, I slipped into my own personal crisis. I’m not sure anymore that what we do here, in the foundations of physics, is science. And if not, then why am I wasting my time with it?

She describes her personal odyssey, in which she seeks out some of the most prominent minds of the fields to understand how they see these issues. Here are some of the leading figures whom she interviews:

*Gian Francesco Giudice*. He is the head of the theory division at CERN (the European Organization for Nuclear Research), which is located on the Swiss-French border just north of Geneva. It is the home of the Large Hadron Collider (LHC), currently the world’s most powerful accelerator. It was here that the discovery of the long-sought Higgs boson was announced six years ago. Giudice argues that “The sense of beauty of a physical theory must be something hard-wired in our brain and not a social construct. … When you stumble on a beautiful theory you have the same emotional reaction that you feel in front of a piece of art.”

Hossenfelder acknowledges his comments, but fails to see why it matters. She writes:

I doubt my sense of beauty is a reliable guide to uncovering fundamental laws of nature, laws that dictate the behavior of entities that I have no direct sensory awareness of, never had, and never will have. For it to be hardwired into my brain, it ought to have been beneficial during natural selection. But what evolutionary advantage has there ever been to understanding quantum gravity?

*Nima Arkani-Hamed*. Arkani-Hamed, a Canadian-American theoretical physicist of Iranian descent, argues that mathematical consistency helps to rule out many specious theories, much as experimentation does:

It’s true that in most other parts of science what’s needed to check whether ideas are right or wrong are new experiments. But our field is so mature that we get incredible constraints already by old experiments. The constraints are so strong that they rule out pretty much everything you can try. If you are an honest physicist, 99.99 percent of your ideas, even good ideas, are going to be ruled out, not by new experiments but already by inconsistency with old experiments.

However, Hossenfelder notes that one can never prove any mathematics to be a true description of nature, because in the end mathematics only encapsulates provable truths about mathematical structures. The claim that these structures correspond to physical reality is a separate assertion that must always be tested and confirmed by experiment.

*Steven Weinberg*. Weinberg, the well-known Nobel laureate at the University of Texas, Austin, recognizes many of the difficulties that Hossenfelder writes about, but still insists on some value in seeking “beautiful” theories. As he told Hossenfelder,

Our sense of beauty has changed. And … the beauty that we seek now, not in art, not in home decoration — or in horse breeding — but the beauty we seek in physical theories is a beauty of rigidity. We would like theories that to the greatest extent possible could not be varied without leading to impossibilities, like mathematical inconsistencies.

As before, Hossenfelder counters that mathematical consistency is a fairly weak requirement: “There are many things that are mathematically consistent that have nothing to do with the [real] world.” Weinberg acknowledges this point, but still hopes that we can find a theory that is unique, in the sense that it is the only mathematically consistent theory that is sufficiently rich to permit diverse phenomena, including life. Hossenfelder can see the appeal of such a theory, but doesn’t see how naturalness or mathematical beauty can be a useful guide.

Weinberg also commented on the foundations of quantum mechanics, acknowledging that we still lack a fully satisfactory understanding of the theory. But he is largely content with the “shut up and calculate” approach to these problems: quantum mechanics may seem ugly and “unnatural,” but it works.

*Frank Wilczek*. The well-known physicist and cosmologist Frank Wilczek is the author of A Beautiful Question: Finding Nature’s Deep Design. To the question of whether nature embodies beautiful ideas, Wilczek replies with an emphatic yes. Nonetheless, he acknowledges that currently popular theories such as string theory have serious problems:

I guess the most serious candidate for a better theory — better in the sense of bringing together gravity with the other interactions in an intimate way and solving the [problem with quantum gravity] — is string theory. [But] to me it’s not clear what the theory is. It’s kind of a miasma of ideas that hasn’t yet taken shape, and it’s too early to say whether it’s simple or not — or even if it’s right or not. Right now it definitely doesn’t appear simple.

Wilczek acknowledges the fundamental need for data, saying “I don’t think we should compromise on this idea of post-empirical physics. I think that’s appalling, really appalling.” With regard to supersymmetry, he recognizes that this theory, too, is in deep trouble:

A lot of things are fine-tuned and we don’t know why. I would definitely not believe in supersymmetry if it wasn’t for the unification of gauge couplings, which I find very impressive. I can’t believe that that’s a coincidence. Unfortunately, this calculation doesn’t give you precise information about the mass scale where [SUSY] should show up. But the chances of unification start getting worse when [SUSY] partners haven’t shown up around 2 TeV.

One final interesting commentary in the book is where Hossenfelder outlines some of the changes that have overtaken the scientific enterprise over the past few decades. For one thing, there are many more physicists than in the past: the number of new physics PhDs in the U.S. has increased from about 20 per year in 1900 to over 2000 per year in 2012. Since 1970, the total number of papers in the physics field has grown by about 3.8 to 4.0% per year, i.e., a doubling time of roughly 18 years. As the physics literature has grown, it has fragmented into numerous separate subfields, most of which are too specialized to be easily read by physicists outside the subdiscipline. Further, the number of authors on a typical physics paper has increased, as has the number of publications expected of a single physicist (publish or perish). Many physicists spend an increasing amount of time writing proposals, and fewer than ever hold secure, tenured positions, resulting in increased stress and less time for real research.
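The quoted doubling time follows directly from the growth rate via the compound-growth formula, doubling time = ln(2) / ln(1 + r). A quick sketch of the arithmetic (the function name is illustrative, not from the book):

```python
import math

def doubling_time(annual_growth_rate):
    """Years for a quantity growing at a fixed annual rate to double:
    ln(2) / ln(1 + r)."""
    return math.log(2) / math.log(1 + annual_growth_rate)

# Growth of the physics literature at roughly 3.8% to 4.0% per year
for rate in (0.038, 0.040):
    print(f"{rate:.1%} per year -> doubling time of "
          f"{doubling_time(rate):.1f} years")
```

At 3.8% annual growth the doubling time is about 18.6 years, and at 4.0% about 17.7 years, consistent with the roughly 18-year figure.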

Given these trends, it is perhaps not surprising that younger physicists quickly gravitate to well-worn paths, rather than aggressively challenging established orthodoxy.

In spite of her angst, Hossenfelder ends the book on a positive note. She recognizes that the laws of nature as we presently understand them are incomplete, and that to produce a truly unified theory researchers will need to overhaul either general relativity or quantum physics, or both. Some say that physics was the success story of the last century, but she is convinced that the next major breakthrough will happen in this century. “It will be beautiful,” whatever that will mean.

[Added 1 July 2018: John Horgan, a senior writer for Scientific American, has posted a review and synopsis of Hossenfelder’s book.]
