Simple proofs: Archimedes’ calculation of pi


Credit: Ancient Origins

Introduction

Archimedes is widely regarded as the greatest mathematician of antiquity. He was a pioneer of applied mathematics, for instance with his discovery of the principle of buoyancy, and a master of engineering designs, for instance with his “screw” to raise water from one level to another. But his most far-reaching discovery was the “method of exhaustion,” which he used to deduce the area of a circle, the surface area and volume of a sphere and the area under a parabola. Indeed, with this method Archimedes anticipated, by nearly 2000 years, the modern development of calculus that began in the 17th century with Leibniz and Newton. For additional details, see the Wikipedia article.

In this article, we present Archimedes’ ingenious method to calculate the perimeter and area of a circle, while taking advantage of a much more facile system of notation (algebra), a much more facile system of calculation (decimal arithmetic and computer technology), and a much better-developed framework for rigorous mathematical proof. For a step-by-step presentation of Archimedes’ actual computation, see this article by Chuck Lindsey.

Pi denial

One motivation for this article is to respond to some recent writers who reject basic mathematical theory and the accepted value of $\pi$, claiming instead that they have found $\pi$ to be a different value. For example, one author asserts that $\pi = 17 - 8 \sqrt{3} = 3.1435935394\ldots$. Another author asserts that $\pi = (14 - \sqrt{2}) / 4 = 3.1464466094\ldots$. A third author promises to reveal an “exact” value of $\pi$, differing significantly from the accepted value. For other examples, see this Math Scholar blog. Of course, $\pi$ cannot possibly be given by any algebraic expression such as these, since $\pi$ was proven transcendental by Lindemann in 1882, and his proof has been checked carefully by many thousands of mathematicians since then.

This article demonstrates, as simply and concisely as possible, why $\pi = 3.1415926535\ldots$ and certainly not any of these variant values. To that end, this material requires no mathematical background beyond very basic algebra, trigonometry and the Pythagorean theorem, and scrupulously avoids calculus, advanced analysis or any reasoning that depends on prior knowledge about $\pi$. Along this line, traditional degree notation is used for angles instead of the radian measure customary in professional research work, both to make the presentation easier to follow and to avoid any concepts or techniques that might be viewed as dependent on $\pi$.

We start by establishing some basic identities. These proofs assume only the definitions of the trigonometric functions, namely $\sin(\alpha)$ (= opposite side / hypotenuse in a right triangle), $\cos(\alpha)$ (= adjacent side / hypotenuse) and $\tan(\alpha)$ (= opposite / adjacent), together with the Pythagorean theorem. Note, by these definitions, that $\tan(\alpha) = \sin(\alpha) / \cos(\alpha)$, and $\sin^2(\alpha) + \cos^2(\alpha) = 1$. Readers who are familiar with the following well-known identities may skip to the next section.

LEMMA 1 (Double-angle and half-angle formulas): The double-angle formulas are $\sin(2\alpha) = 2 \cos(\alpha) \sin(\alpha), \; \cos(2\alpha) = 1 - 2 \sin^2(\alpha)$ and $\tan(2\alpha) = 2 \tan(\alpha) / (1 - \tan^2(\alpha))$. The corresponding half-angle formulas are $$\sin(\alpha/2) = \sqrt{(1 - \cos(\alpha))/2}, \;\; \cos(\alpha/2) = \sqrt{(1 + \cos(\alpha))/2}, \;\; \tan(\alpha/2) = \frac{\sin(\alpha)}{1 + \cos(\alpha)} = \frac{\tan(\alpha)\sin(\alpha)}{\tan(\alpha) + \sin(\alpha)}.$$ Note, however, that the first two of these are valid only for $0 \le \alpha \leq 180^\circ$, because of the ambiguity of the sign when taking a square root.

Credit: Wikimedia

Proof: We first establish some more general results: $$\sin (\alpha + \beta) = \sin (\alpha) \cos (\beta) + \cos (\alpha) \sin (\beta),$$ $$\cos (\alpha + \beta) = \cos (\alpha) \cos (\beta) - \sin (\alpha) \sin (\beta),$$ $$\tan(\alpha + \beta) = \frac{\tan(\alpha) + \tan(\beta)}{1 - \tan(\alpha)\tan(\beta)}.$$ The formula for $\sin(\alpha + \beta)$ has a simple geometric proof, based only on the Pythagorean formula and simple rules of right triangles, which is illustrated to the right (here $OP = 1$). First note that $\angle RPQ = \alpha, \, PQ = \sin(\beta)$ and $OQ = \cos (\beta)$. Further, $AQ/OQ = \sin(\alpha)$, so $AQ = \sin(\alpha) \cos(\beta)$, and $PR/PQ = \cos(\alpha)$, so $PR = \cos(\alpha) \sin(\beta)$. Combining these results, $$\sin(\alpha + \beta) = PB = RB + PR = AQ + PR = \sin(\alpha) \cos(\beta) + \cos(\alpha) \sin(\beta).$$ The proof of the formula for the cosine of the sum of two angles is entirely similar, and the formula for $\tan(\alpha + \beta)$ is obtained by dividing the formula for $\sin(\alpha + \beta)$ by the formula for $\cos(\alpha + \beta)$, followed by some simple algebra. See this Wikipedia article, from which the above illustration and proof were taken, for additional details.

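Setting $\alpha = \beta$ in the above formulas yields $\sin(2\alpha) = 2 \cos(\alpha) \sin(\alpha), \, \cos(2\alpha) = \cos^2(\alpha) - \sin^2(\alpha) = 1 - 2 \sin^2(\alpha)$, and $\tan(2\alpha) = 2 \tan(\alpha) / (1 - \tan^2(\alpha))$. The half-angle formulas then follow by simple algebra. For example, from $\cos(\alpha) = 1 - 2 \sin^2(\alpha/2)$ we can write $2 \sin^2(\alpha/2) = 1 - \cos(\alpha)$, from which we deduce $\sin(\alpha/2) = \sqrt{(1 - \cos(\alpha))/2}$ (as noted before, this formula is valid only for $0 \leq \alpha \leq 180^\circ$, because of the ambiguity in the sign when taking a square root). In the same way, $\cos(\alpha) = 2 \cos^2(\alpha/2) - 1$ gives $\cos(\alpha/2) = \sqrt{(1 + \cos(\alpha))/2}$. For the tangent, multiplying the numerator and denominator of $\sin(\alpha/2)/\cos(\alpha/2)$ by $2 \cos(\alpha/2)$ gives $\tan(\alpha/2) = 2 \sin(\alpha/2) \cos(\alpha/2) / (2 \cos^2(\alpha/2)) = \sin(\alpha)/(1 + \cos(\alpha))$, and multiplying the numerator and denominator of this last expression by $\tan(\alpha)$, recalling that $\tan(\alpha) \cos(\alpha) = \sin(\alpha)$, gives $\tan(\alpha/2) = \tan(\alpha)\sin(\alpha)/(\tan(\alpha) + \sin(\alpha))$.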
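As a quick sanity check on Lemma 1, the following Python sketch starts from the exact values $\sin(30^\circ) = 1/2$ and $\cos(30^\circ) = \sqrt{3}/2$, applies the half-angle formulas to obtain the values for $15^\circ$, and then confirms that the double-angle formulas recover the $30^\circ$ values. The helper name `half_angle` is only an illustrative choice, and the check uses ordinary floating-point arithmetic, so it is a numerical spot check rather than a proof. Note that it uses only arithmetic and square roots, with no library trigonometric functions.

```python
import math

def half_angle(sin_a, cos_a):
    """Return (sin(a/2), cos(a/2)) from (sin(a), cos(a)), for 0 <= a <= 180 degrees."""
    return math.sqrt((1 - cos_a) / 2), math.sqrt((1 + cos_a) / 2)

# Exact starting values for 30 degrees.
sin30, cos30 = 0.5, math.sqrt(3) / 2
tan30 = sin30 / cos30

# Half-angle formulas give the values for 15 degrees.
sin15, cos15 = half_angle(sin30, cos30)
tan15 = sin15 / cos15

# The double-angle formulas should recover the 30-degree values.
assert math.isclose(2 * sin15 * cos15, sin30)
assert math.isclose(1 - 2 * sin15**2, cos30)
assert math.isclose(2 * tan15 / (1 - tan15**2), tan30)

# Both closed forms for the half-angle tangent should agree with sin/cos.
assert math.isclose(sin30 / (1 + cos30), tan15)
assert math.isclose(tan30 * sin30 / (tan30 + sin30), tan15)
```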

Archimedes’ algorithm for approximating Pi

Credit: Michele Vallisneri, NASA JPL

With this background, we are now able to present Archimedes’ algorithm for approximating $\pi$. Consider the case of a circle with radius one (see diagram). We see that each side of a regular inscribed hexagon has length one, and thus, of course, each half-side has length one-half. This is reflected in the formula $\sin(30^\circ) = 1/2$, a formula which in effect is proven by this diagram. Note that by applying the identity $\cos^2(\alpha) = 1 – \sin^2(\alpha)$, we obtain $\cos(30^\circ) = \sqrt{3}/2 = 0.866025\ldots$, and also that $\tan(30^\circ) = \sin(30^\circ)/\cos(30^\circ) = \sqrt{3}/3 = 0.577350\ldots$.

Let $a_1$ be the semi-perimeter of the regular circumscribed hexagon of a circle with radius one, and let $b_1$ denote the semi-perimeter of the regular inscribed hexagon. By examining the figure, we see each of the six equilateral triangles in the circumscribed hexagon has base $= 2 \tan{30^\circ} = 2 \sqrt{3}/3$. Thus $a_1 = 6 \tan(30^\circ) = 2\sqrt{3} = 3.464101\ldots$. Each of the six equilateral triangles in the inscribed hexagon has base $= 2 \sin(30^\circ) = 1$, so that $b_1 = 6 \sin(30^\circ) = 3$. In a similar fashion, let $c_1$ be the area of the regular circumscribed hexagon of a circle with radius one, and let $d_1$ denote the area of the regular inscribed hexagon. Since the altitude of each section of the circumscribed hexagon is one, $c_1 = a_1 = 2\sqrt{3} = 3.464101\ldots$. Since the altitude of each section of the inscribed hexagon is $\cos(30^\circ)$, $d_1 = 6 \sin(30^\circ) \cos(30^\circ) = 2.598076\ldots$.

Now consider a $12$-sided regular circumscribed polygon of a circle with radius one, and a $12$-sided regular inscribed polygon. Their semi-perimeters will be denoted $a_2$ and $b_2$, respectively, and their full areas will be denoted $c_2$ and $d_2$, respectively. The angles are halved, but the number of sides is doubled. Thus $a_2 = 12 \tan(15^\circ), \, b_2 = 12 \sin(15^\circ), \, c_2 = a_2 = 12 \tan(15^\circ)$ and $d_2 = 12 \sin(15^\circ) \cos(15^\circ)$, the latter of which, by applying the double-angle formula for sine from Lemma 1, can be written as $d_2 = 6 \sin(30^\circ) = b_1$. Applying the half-angle formulas from Lemma 1, we obtain $a_2 = 12 (2 - \sqrt{3}) = 3.215390\ldots, \; b_2 = 3 (\sqrt{6} - \sqrt{2}) = 3.105828\ldots, \; c_2 = a_2 = 3.215390\ldots$ and $d_2 = b_1 = 3$.

In general, for each index $k \ge 1$ (that is, after $k - 1$ doublings of the hexagon), denote the semi-perimeters of the regular circumscribed and inscribed polygons for a circle of radius one with $3 \cdot 2^k$ sides as $a_k$ and $b_k$, respectively, and denote the full areas as $c_k$ and $d_k$, respectively. As before, because the altitudes of the triangles in the circumscribed polygons always have length one, $c_k = a_k$ for each $k$. Also, as before, after applying the double-angle identity for sine from Lemma 1, we can write $d_k = 3 \cdot 2^k \sin(60^\circ/2^k) \cos(60^\circ/2^k) = 3 \cdot 2^{k-1} \sin(60^\circ/2^{k-1}) = b_{k-1}$, where for $k = 1$ we take $b_0 = 3 \sin(60^\circ)$. Thus we have the following:

THEOREM 1 (Archimedes’ formulas for Pi): Let $\theta_k = 60^\circ/2^k$. Then $$a_k = 3 \cdot 2^k \tan(\theta_k), \; b_k = 3 \cdot 2^k \sin(\theta_k), \; c_k = a_k, \; d_k = b_{k-1}.$$ These formulas are entirely satisfactory to calculate the semi-perimeters and areas of inscribed and circumscribed polygons, provided one has a calculator or computer program to evaluate tangents and sines. But given our intent in this article to work from first principles, we can also employ the following iteration, which permits these values to be calculated by simple formulas involving only arithmetic and square roots:
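For readers who do have a computer at hand, here is a minimal Python sketch of the formulas in Theorem 1. The function name `archimedes_direct` is an illustrative choice. One caveat: the library functions `math.tan` and `math.sin` work in radians, so the degree-to-radian conversion via `math.radians` relies on the machine's built-in value of $\pi$. This sketch is therefore only a cross-check, not a first-principles computation.

```python
import math

def archimedes_direct(k):
    """Return (a_k, b_k, c_k, d_k) from the closed formulas of Theorem 1."""
    n = 3 * 2**k                         # number of polygon sides
    theta = math.radians(60.0 / 2**k)    # theta_k, converted for the math library
    a = n * math.tan(theta)              # circumscribed semi-perimeter
    b = n * math.sin(theta)              # inscribed semi-perimeter
    c = a                                # circumscribed area equals a_k
    d = n * math.sin(theta) * math.cos(theta)  # inscribed area, equal to b_{k-1}
    return a, b, c, d

for k in range(1, 6):
    a, b, c, d = archimedes_direct(k)
    print(k, 3 * 2**k, f"{a:.10f}", f"{b:.10f}", f"{c:.10f}", f"{d:.10f}")
```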

THEOREM 2 (The Archimedean iteration for Pi): Define the sequences of real numbers $A_k, \, B_k$ by the following: $A_1 = 2 \sqrt{3}, \, B_1 = 3$. Then, for $k \ge 1$, set $$A_{k+1} = \frac{2 A_k B_k}{A_k + B_k}, \quad B_{k+1} = \sqrt{A_{k+1} B_k}.$$ Then for all $k \ge 1$, we have $A_k = a_k$ and $B_k = b_k$, as given by the formulas in Theorem 1.

Proof: $A_1 = a_1$ and $B_1 = b_1$, so the result is true for $k = 1$. By induction, assume the result is true up to some $k$. Then we can write, recalling the formula $\tan(\alpha/2) = \tan(\alpha)\sin(\alpha)/(\tan(\alpha) + \sin(\alpha))$ from Lemma 1, $$A_{k+1} = \frac{2 A_k B_k}{A_k + B_k} = \frac{2 \cdot 3 \cdot 2^k \tan(\theta_k) \cdot 3 \cdot 2^k \sin(\theta_k)}{3 \cdot 2^k \tan(\theta_k) + 3 \cdot 2^k \sin(\theta_k)} = 3 \cdot 2^{k+1} \tan(\theta_k/2) = 3 \cdot 2^{k+1} \tan(\theta_{k+1}) = a_{k+1}.$$ Similarly, recalling the identity $\sin(2\alpha) = 2 \sin(\alpha) \cos(\alpha)$ from Lemma 1, so that $\sin(\theta_k) = 2 \sin(\theta_{k+1}) \cos(\theta_{k+1})$, we can write $$B_{k+1} = \sqrt{A_{k+1} B_k} = \sqrt{9 \cdot 2^{2k+1} \tan(\theta_{k+1}) \sin(\theta_k)} = \sqrt{9 \cdot 2^{2k+2} \tan(\theta_{k+1}) \sin(\theta_{k+1}) \cos(\theta_{k+1})},$$ $$ = \sqrt{9 \cdot 2^{2k+2} \sin^2(\theta_{k+1})} = 3 \cdot 2^{k+1} \sin(\theta_{k+1}) = b_{k+1}.$$
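The iteration of Theorem 2 is easy to try out. The Python sketch below uses ordinary double-precision floating point, so it is reliable to roughly 15 significant digits; the generator name `archimedean_iteration` is an illustrative choice. Apart from `math.sqrt`, it uses only arithmetic.

```python
import math

def archimedean_iteration(steps):
    """Yield (k, A_k, B_k) for k = 1..steps, using only arithmetic and square roots."""
    a, b = 2 * math.sqrt(3), 3.0      # A_1 and B_1 (the hexagon values)
    for k in range(1, steps + 1):
        yield k, a, b
        a = 2 * a * b / (a + b)       # A_{k+1}: harmonic mean of A_k and B_k
        b = math.sqrt(a * b)          # B_{k+1}: geometric mean of A_{k+1} and B_k

for k, a, b in archimedean_iteration(5):
    print(k, 3 * 2**k, f"{a:.10f}", f"{b:.10f}")
```

Note that the update for `b` deliberately uses the already-updated `a`, matching the definition $B_{k+1} = \sqrt{A_{k+1} B_k}$.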

As a historical comment, note that Archimedes certainly did not use this notation or explicitly derive either the Archimedean formulas or iteration. But his construction is equivalent to these results.

Computations using the Archimedean iteration

We are now able to directly compute some approximations to $\pi$, using only the formulas of Theorem 1 or Theorem 2. These results are shown in the table to 16 digits after the decimal point, but were performed using 50-digit precision arithmetic to rule out any possibility of numerical round-off error corrupting the table results. Keep in mind that the semiperimeters and areas for circumscribed polygons are over-estimates of $\pi$, and those for the inscribed polygons are under-estimates of $\pi$.

| Iteration $k$ | Sides $3 \cdot 2^k$ | Semi-perimeter, circumscribed polygon $a_k$ | Semi-perimeter, inscribed polygon $b_k$ | Area, circumscribed polygon $c_k$ | Area, inscribed polygon $d_k$ |
|---|---|---|---|---|---|
| 1 | 6 | 3.4641016151377545 | 3.0000000000000000 | 3.4641016151377545 | 2.5980762113533159 |
| 2 | 12 | 3.2153903091734724 | 3.1058285412302491 | 3.2153903091734724 | 3.0000000000000000 |
| 3 | 24 | 3.1596599420975004 | 3.1326286132812381 | 3.1596599420975004 | 3.1058285412302491 |
| 4 | 48 | 3.1460862151314349 | 3.1393502030468672 | 3.1460862151314349 | 3.1326286132812381 |
| 5 | 96 | 3.1427145996453682 | 3.1410319508905096 | 3.1427145996453682 | 3.1393502030468672 |
| 6 | 192 | 3.1418730499798238 | 3.1414524722854620 | 3.1418730499798238 | 3.1410319508905096 |
| 7 | 384 | 3.1416627470568485 | 3.1415576079118576 | 3.1416627470568485 | 3.1414524722854620 |
| 8 | 768 | 3.1416101766046895 | 3.1415838921483184 | 3.1416101766046895 | 3.1415576079118576 |
| 9 | 1536 | 3.1415970343215261 | 3.1415904632280500 | 3.1415970343215261 | 3.1415838921483184 |
| 10 | 3072 | 3.1415937487713520 | 3.1415921059992715 | 3.1415937487713520 | 3.1415904632280500 |
| 11 | 6144 | 3.1415929273850970 | 3.1415925166921574 | 3.1415929273850970 | 3.1415921059992715 |
| 12 | 12288 | 3.1415927220386138 | 3.1415926193653839 | 3.1415927220386138 | 3.1415925166921574 |
| 13 | 24576 | 3.1415926707019980 | 3.1415926450336908 | 3.1415926707019980 | 3.1415926193653839 |
| 14 | 49152 | 3.1415926578678444 | 3.1415926514507676 | 3.1415926578678444 | 3.1415926450336908 |
| 15 | 98304 | 3.1415926546593060 | 3.1415926530550368 | 3.1415926546593060 | 3.1415926514507676 |
| 16 | 196608 | 3.1415926538571714 | 3.1415926534561041 | 3.1415926538571714 | 3.1415926530550368 |
| 17 | 393216 | 3.1415926536566377 | 3.1415926535563709 | 3.1415926536566377 | 3.1415926534561041 |
| 18 | 786432 | 3.1415926536065043 | 3.1415926535814376 | 3.1415926536065043 | 3.1415926535563709 |
| 19 | 1572864 | 3.1415926535939710 | 3.1415926535877043 | 3.1415926535939710 | 3.1415926535814376 |
| 20 | 3145728 | 3.1415926535908376 | 3.1415926535892710 | 3.1415926535908376 | 3.1415926535877043 |


As can be easily seen, each of these columns converges quickly to the well-known value of $\pi$. In the final row of the table, which presents results for circumscribed and inscribed polygons with 3,145,728 sides, all four entries agree to ten digits after the decimal point: $3.1415926535\ldots$. Note, by the way, that both of the variant values of $\pi$ mentioned in the Pi denial section above are excluded by iteration five: $(14 - \sqrt{2})/4 = 3.1464466094\ldots$ already exceeds the upper bound $a_4 = 3.1460862151\ldots$, and $17 - 8 \sqrt{3} = 3.1435935394\ldots$ exceeds the upper bound $a_5 = 3.1427145996\ldots$. There is no escaping these calculations: the variant values for $\pi$ are simply wrong.
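The table and the exclusion of the variant values can be reproduced with a short Python sketch. It uses the standard-library `decimal` module at 50 significant digits, in the spirit of the high-precision arithmetic described above. The function names `bounds` and `first_excluded` are illustrative choices, not part of any library.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # 50 significant digits

def bounds(steps):
    """Return a list of (k, a_k, b_k) from the Archimedean iteration."""
    a, b = 2 * Decimal(3).sqrt(), Decimal(3)
    rows = []
    for k in range(1, steps + 1):
        rows.append((k, a, b))
        a = 2 * a * b / (a + b)
        b = (a * b).sqrt()
    return rows

def first_excluded(value, rows):
    """Return the first k for which value lies outside the interval (b_k, a_k)."""
    for k, a, b in rows:
        if not (b < value < a):
            return k
    return None

table = bounds(20)
variant_1 = 17 - 8 * Decimal(3).sqrt()       # 3.1435935394...
variant_2 = (14 - Decimal(2).sqrt()) / 4     # 3.1464466094...

print(first_excluded(variant_1, table))  # prints 5
print(first_excluded(variant_2, table))  # prints 4
for k, a, b in table:
    print(k, 3 * 2**k, f"{a:.16f}", f"{b:.16f}")
```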

We will now rigorously prove that the Archimedean formulas (or, equivalently, the Archimedean iteration) converge to $\pi$ in both the circumference and area senses, again relying only on first-principles reasoning.

AXIOM 1 (Completeness axiom): Every nonempty set of reals that is bounded above has a least upper bound; every nonempty set of reals that is bounded below has a greatest lower bound.

Comment: This fundamental axiom of real numbers merely states the property that the set of real numbers, unlike say the set of rational numbers, has no “holes.” An equivalent statement of the completeness axiom is “Every Cauchy sequence of real numbers has a limit in the real numbers.” See the Wikipedia article Completeness of the real numbers and this Chapter for details.

THEOREM 3 (Pi as the limit of circumscribed and inscribed polygons with $3 \cdot 2^k$ sides):
Theorem 3a: For a circle of radius one, as the index $k$ increases, the greatest lower bound of the semi-perimeters of circumscribed regular polygons with $3 \cdot 2^k$ sides is exactly equal to the least upper bound of the semi-perimeters of inscribed regular polygons with $3 \cdot 2^k$ sides, which value we may define as $\pi$.
Theorem 3b: For a circle of radius one, as the index $k$ increases, the greatest lower bound of the areas of circumscribed regular polygons with $3 \cdot 2^k$ sides is exactly equal to the least upper bound of the areas of inscribed regular polygons with $3 \cdot 2^k$ sides, which value is exactly equal to $\pi$ as defined in Theorem 3a.

Comment: Strictly speaking, from a modern perspective $\pi$ is typically defined as the circumference of a circle divided by its diameter (or as the semi-circumference of a circle of radius one), where the circumference of a circle is defined as the limit of the perimeters of circumscribed or inscribed regular polygons with $n$ sides as $n$ increases without bound. Note that this is a somewhat stricter definition than the Archimedean definition, which only deals with the special case $n = 3 \cdot 2^k$. In the spirit of adhering to the modern convention, we present in a separate blog a complete proof that $\pi$ as defined by Archimedes is the same as $\pi$ based on general $n$-sided regular polygons for a circle of radius one, and, as a bonus, a proof that the limit of the areas of these polygons is also equal to $\pi$. See the separate blog for details.

Proof strategy: We will show that (a) the sequence of circumscribed semi-perimeters $(a_k)$ is strictly decreasing; (b) the sequence of inscribed semi-perimeters $(b_k)$ is strictly increasing; (c) all $(a_k)$ are strictly greater than all $(b_k)$; and (d) the distance between $a_k$ and $b_k$ becomes arbitrarily small for large $k$. Thus the greatest lower bound of the circumscribed semi-perimeters is equal to the least upper bound of inscribed semi-perimeters, and the common limit may be defined as $\pi$. A similar argument reaches the same conclusion for the sequence of circumscribed and inscribed areas.

Proof: Recall from above that $$\theta_k = 60^\circ/2^k, \; a_k = 3 \cdot 2^k \tan(\theta_k), \; b_k = 3 \cdot 2^k \sin(\theta_k), \; c_k = a_k, \; d_k = b_{k-1}.$$ First note that since all $\theta_k \gt 0$, all $\cos(\theta_k) \lt 1$ or, in other words, $1 - \cos(\theta_k) \gt 0$. Then we can write $$a_{k} - a_{k+1} = 3 \cdot 2^k \tan(\theta_k) - 3 \cdot 2^{k+1} \tan(\theta_{k+1}) = 3 \cdot 2^k \left(\tan(\theta_k) - \frac{2 \sin(\theta_k)}{1 + \cos(\theta_k)}\right) = \frac{3 \cdot 2^k \tan(\theta_k) (1 - \cos(\theta_k))}{1 + \cos(\theta_k)} \gt 0, $$ $$b_{k+1} - b_k = 3 \cdot 2^{k+1} \sin(\theta_{k+1}) - 3 \cdot 2^k \sin(\theta_k) = 3 \cdot 2^{k+1} (\sin(\theta_{k+1}) - \sin(\theta_{k+1}) \cos(\theta_{k+1})) = 3 \cdot 2^{k+1} \sin(\theta_{k+1})(1 - \cos(\theta_{k+1})) \gt 0,$$ $$a_k - b_k = 3 \cdot 2^k (\tan(\theta_k) - \sin(\theta_k)) = 3 \cdot 2^k \tan(\theta_k) (1 - \cos(\theta_k)) \gt 0.$$ Thus $a_k$ is a strictly decreasing sequence, $b_k$ is a strictly increasing sequence, and each $a_k \gt b_k$. If $k \le m$, then $a_k \ge a_m \gt b_m$; if $k \gt m$, then $a_k \gt b_k \gt b_m$. In either case $a_k \gt b_m$, so all $a_k$ are strictly greater than all $b_k$. In particular, since $a_1 = 2 \sqrt{3} \lt 4$, this means that all $a_k \lt 4$ and thus all $b_k \lt 4$. Similarly, since $b_1 = 3$, all $b_k \ge 3$ and thus all $a_k \gt 3$. Also, since $\theta_1 = 30^\circ$ and all $\theta_k$ for $k \gt 1$ are smaller than $\theta_1$, this means that $\cos(\theta_k) \gt 1/2$ for all $k$.
Now we can write, starting from the expression a few lines above for $a_k - b_k$, and using $\cos(\theta_k) \gt 1/2$ and $b_k \lt 4$, $$a_k - b_k = 3 \cdot 2^k \tan(\theta_k) (1 - \cos(\theta_k)) = \frac{3 \cdot 2^k \tan(\theta_k) \sin^2(\theta_k)}{1 + \cos(\theta_k)} \le 3 \cdot 2^k \tan(\theta_k) \sin^2(\theta_k)$$ $$= \frac{3 \cdot 2^k \sin^3(\theta_k)}{\cos(\theta_k)} \le 2 \cdot 3 \cdot 2^k \sin^3(\theta_k) = \frac{2 (3 \cdot 2^{k})^3 \sin^3(\theta_k)}{(3 \cdot 2^{k})^2} = \frac{2 b_k^3}{9 \cdot 4^k} \le \frac{128}{9 \cdot 4^k},$$ so that the difference between the circumscribed and inscribed semi-perimeters decreases by roughly a factor of four with each iteration (as is also seen in the table above).

Recall from the above that all $a_k \gt 3$, so that the sequence $(a_k)$ of circumscribed semi-perimeters is bounded below. Thus by Axiom 1 the sequence $(a_k)$ has a greatest lower bound $L_1$. In fact, since all $a_k$ are greater than all $b_k$, any $b_k$ is a lower bound of the sequence $(a_k)$, so that we may write, for any $k$, $a_k \geq L_1 \geq b_k$. Also, all $b_k \lt 4$, so that the sequence $(b_k)$ of inscribed semi-perimeters is bounded above, and thus has a least upper bound $L_2$. And, as before, since any $a_k$ is an upper bound for the sequence $(b_k)$, it also follows, for each $k$, that $a_k \geq L_2 \geq b_k$. Thus both $L_1$ and $L_2$ are “squeezed” between $a_k$ and $b_k$, which, for sufficiently large $k$, are arbitrarily close to each other (according to the last displayed equation above), so that $L_1$ must equal $L_2$. This completes the proof of Theorem 3a.

For Theorem 3b, note that the difference between the circumscribed and inscribed areas is $$c_k - d_k = 3 \cdot 2^k (\tan(\theta_k) - \sin(\theta_k)\cos(\theta_k)) = 3 \cdot 2^k \left(\frac{\sin(\theta_k)}{\cos(\theta_k)} - \sin(\theta_k) \cos(\theta_k)\right) $$ $$= \frac{3 \cdot 2^k \sin(\theta_k) (1 - \cos^2(\theta_k))}{\cos(\theta_k)} = \frac{3 \cdot 2^k \sin^3(\theta_k)}{\cos(\theta_k)} \le \frac{128}{9 \cdot 4^k},$$ since the final inequality was established a few lines above. As before, it follows that the greatest lower bound of the circumscribed areas $c_k$ is exactly equal to the least upper bound of the inscribed areas $d_k$. Furthermore, since the sequence $(a_k)$ of semi-perimeters of the circumscribed polygons is exactly the same sequence as the sequence $(c_k)$ of areas of the circumscribed polygons, we conclude that the common limit of the areas is identical to the common limit of the semi-perimeters, namely $\pi$. This completes the proof of Theorem 3b.
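The inequalities used in the proof can also be spot-checked numerically. The Python sketch below runs the Archimedean iteration with the standard-library `decimal` module and asserts, for $k = 1, \ldots, 20$, that the sequences are monotonic and that both gaps stay within the bound $128/(9 \cdot 4^k)$. This illustrates the proof for finitely many $k$ and is not a substitute for it; the list name `gaps` is an illustrative choice.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50

a, b = 2 * Decimal(3).sqrt(), Decimal(3)   # a_1, b_1
b_prev = 3 * Decimal(3).sqrt() / 2         # b_0 = 3 sin(60 degrees), so d_1 = b_0
gaps = []                                  # (k, a_k - b_k, c_k - d_k, bound)

for k in range(1, 21):
    bound = Decimal(128) / (9 * 4**k)
    perimeter_gap = a - b                  # a_k - b_k
    area_gap = a - b_prev                  # c_k - d_k = a_k - b_{k-1}
    assert 0 < perimeter_gap <= bound
    assert 0 < area_gap <= bound
    gaps.append((k, perimeter_gap, area_gap, bound))

    a_next = 2 * a * b / (a + b)
    b_next = (a_next * b).sqrt()
    assert a_next < a                      # (a_k) strictly decreasing
    assert b_next > b                      # (b_k) strictly increasing
    b_prev, a, b = b, a_next, b_next

for k, p, q, bound in gaps[:5]:
    print(k, f"{p:.3e}", f"{q:.3e}", f"{bound:.3e}")
```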

Other formulas and algorithms for Pi

We note in conclusion that Archimedes’ scheme is just one of many formulas and algorithms for $\pi$. See for example this collection. One such algorithm, for instance, is the Borwein quartic algorithm: Set $a_0 = 6 - 4\sqrt{2}$ and $y_0 = \sqrt{2} - 1$. Iterate, for $k \ge 0$, $$y_{k+1} = \frac{1 - (1 - y_k^4)^{1/4}}{1 + (1 - y_k^4)^{1/4}},$$ $$a_{k+1} = a_k (1 + y_{k+1})^4 - 2^{2k+3} y_{k+1} (1 + y_{k+1} + y_{k+1}^2).$$ Then $1/a_k$ converges quartically to $\pi$: each iteration approximately quadruples the number of correct digits. Just three iterations yield 171 correct digits, which are as follows: $$3.14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482$$ $$534211706798214808651328230664709384460955058223172535940812848111745028410270193\ldots$$
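For comparison with the Archimedean iteration, here is a minimal Python sketch of the Borwein quartic algorithm using the standard-library `decimal` module. It uses the standard form of the update, in which the term $2^{2k+3}$ is multiplied by $y_{k+1} (1 + y_{k+1} + y_{k+1}^2)$. The working precision of 220 digits, the function name `borwein_quartic`, and the computation of the fourth root as two successive square roots are choices made for this sketch.

```python
from decimal import Decimal, getcontext

getcontext().prec = 220  # working precision, comfortably above 171 digits

def borwein_quartic(iterations):
    """Return 1/a_k after the given number of Borwein quartic iterations."""
    sqrt2 = Decimal(2).sqrt()
    a = 6 - 4 * sqrt2          # a_0
    y = sqrt2 - 1              # y_0
    for k in range(iterations):
        r = (1 - y**4).sqrt().sqrt()   # fourth root via two square roots
        y = (1 - r) / (1 + r)          # y_{k+1}
        a = a * (1 + y)**4 - 2**(2 * k + 3) * y * (1 + y + y * y)  # a_{k+1}
    return 1 / a

# First 50 decimal places of pi, copied from the digits displayed above.
PI_50 = "3.14159265358979323846264338327950288419716939937510"

approx = borwein_quartic(3)
print(str(approx)[:len(PI_50)] == PI_50)  # prints True
```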

Other posts in the “Simple proofs” series

The other posts in the “Simple proofs of great theorems” series are available here.
