Protein folding via machine learning may spawn medical advances

In an advance that may presage a dramatic new era of pharmaceuticals and medicine, DeepMind (a subsidiary of Alphabet, Google’s parent company) recently applied their machine learning software to the challenging problem of protein folding, with remarkable success. In the wake of this success, DeepMind and other private companies are racing to further extend these capabilities and apply them to real-world biology and medicine.

Protein folding is the name for the physical process in which a protein chain, defined by a linear sequence of amino acids, assumes its equilibrium 3-dimensional structure, a process that in nature typically occurs within a few milliseconds. The equilibrium or “native” structure determines most of the protein’s biological properties. Protein enzymes, for instance, control chemical reactions at the molecular scale. Accurately predicting protein structures is a “holy grail” of modern biology.

As a single example, the protein Cas9 is used to snip DNA in CRISPR gene editing, but the potential of the CRISPR technique is limited by the fact that using Cas9 entails an increased risk of mutations. A better understanding of the structure of Cas9 is thus essential to further advances in gene editing technology.

The human genome contains some 20,000 protein-coding genes, so there are at least that many different proteins in human biology; but the actual tally is much higher, possibly numbering several billion.

Computationally simulating the protein folding process, and determining the final equilibrium conformation, is a formidable challenge; indeed, it has long been listed as one of the grand challenges of the high-performance computing field. Since exhaustively trying all possible conformations of an n-residue amino acid chain is well known to be computationally infeasible (Levinthal’s paradox), researchers have broken the folding process into numerous steps, each of which has spawned numerous algorithmic approaches. One recent survey of the computational problem and the underlying physics is given in this paper.
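The combinatorial root of the difficulty (Levinthal’s paradox) can be quantified with a back-of-the-envelope calculation. The figures below (roughly three conformations per peptide bond, $10^{13}$ conformations sampled per second) are illustrative assumptions, not measured values:

```python
# Levinthal's paradox, back of the envelope: exhaustive search over all
# backbone conformations is hopeless even for a small protein.

def levinthal_years(n_residues, states_per_bond=3, rate_per_sec=1e13):
    """Years to enumerate every conformation of an n-residue chain,
    assuming states_per_bond choices at each of the n-1 peptide bonds."""
    conformations = states_per_bond ** (n_residues - 1)
    seconds = conformations / rate_per_sec
    return seconds / (3600 * 24 * 365)

# Even a modest 150-residue protein vastly exceeds the age of the universe:
print(f"{levinthal_years(150):.2e} years")
```

The count grows exponentially in the chain length, which is why practical folding algorithms must exploit physics and statistics rather than enumeration.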

The challenge of protein folding has led some to develop new computer architectures specifically devoted to this task. Notable among these efforts is the “Anton” system, which was designed and constructed by a team of researchers led by David E. Shaw, founder of the D.E. Shaw hedge fund. A technical paper they wrote describing the Anton system and one of their landmark calculations was named a winner of the 2009 ACM Gordon Bell Prize.

As mentioned above, DeepMind is a research subsidiary of Alphabet (Google’s parent company) devoted to machine learning (ML) and artificial intelligence (AI). In 2016, the “AlphaGo” program developed by DeepMind defeated Lee Se-dol, a Korean Go master, by winning four games of a five-game tournament, surprising observers who had predicted that this would not be done for decades, if ever. A year later, DeepMind’s improved AlphaGo program defeated Ke Jie, a 19-year-old Chinese player thought to be the world’s best.

Then DeepMind tried a new approach — rather than feeding their program over 100,000 published games by human competitors, they merely programmed the system with the rules of Go and a relatively simple reward function, and had it play games against itself. After just three days and 4.9 million training games, the new “AlphaGo Zero” program had advanced to the point that it defeated the earlier program 100 games to zero. After 40 days of training, its measured skill level was as far ahead of Ke Jie as Ke Jie is ahead of a typical amateur. For additional details, see this Math Scholar blog, this Scientific American article, and this DeepMind article.

Other DeepMind programs, using a similar machine learning approach, have conquered Chess and the Japanese game shogi. For details, see DeepMind’s technical paper and this nicely written New York Times article. This and some other recent developments in the ML/AI arena are summarized in this Math Scholar blog.

DeepMind’s latest conquest is protein folding: in their first attempt, DeepMind’s team easily took top honors at the 13th Critical Assessment of protein Structure Prediction (CASP13), an international competition of protein structure prediction programs. For protein sequences for which no other information was known (43 of the 90 test problems), DeepMind’s AlphaFold program made the most accurate prediction among the 98 competitors in 25 cases, far better than the second-place entrant, which won only three of the 43 test cases. On average, AlphaFold was 15% more accurate than its closest competitors on the most rigorous tests.

DeepMind’s team developed AlphaFold by training a neural network on a large dataset of known protein structures, enabling the program to efficiently predict the distances between pairs of amino acids and the angles of the chemical bonds connecting them. They then employed a more classical “gradient descent” approach to minimize the overall energy level. In other words, their approach combined sophisticated deep learning models with brute-force computational resources. A nice summary of these techniques is given in this Exxact blog and in this DeepMind article.

Most researchers in the field were very impressed. One researcher described these results as absolutely stunning. The Guardian predicted that these results would “usher in a new era of medical progress.”
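To make the two-stage idea concrete, here is a minimal sketch (written by this reviewer, not DeepMind’s code) of the second stage only: given a matrix of predicted pairwise distances, it recovers 3-D coordinates by plain gradient descent on a least-squares “energy.” The eight-residue “protein,” the learning rate and the step count are all toy assumptions:

```python
import numpy as np

# Toy distance-geometry sketch: recover coordinates from pairwise distances
# by gradient descent, in the spirit of AlphaFold's classical second stage.

rng = np.random.default_rng(0)
n = 8
true_xyz = rng.normal(size=(n, 3))          # hypothetical "native" structure
target = np.linalg.norm(true_xyz[:, None] - true_xyz[None, :], axis=-1)

def energy_and_grad(xyz):
    """Least-squares energy sum (d_ij - t_ij)^2 and its gradient."""
    diff = xyz[:, None] - xyz[None, :]                  # x_i - x_j
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)                         # avoid 0/0 on diagonal
    resid = dist - target
    np.fill_diagonal(resid, 0.0)                        # diagonal excluded
    grad = 4 * ((resid / dist)[..., None] * diff).sum(axis=1)
    return (resid ** 2).sum(), grad

xyz = rng.normal(size=(n, 3))               # random initial "unfolded" chain
e0, _ = energy_and_grad(xyz)
for _ in range(3000):
    e, g = energy_and_grad(xyz)
    xyz -= 0.01 * g                         # plain gradient-descent step
print(f"energy: {e0:.3f} -> {e:.6f}")       # energy should drop sharply
```

In AlphaFold the target distances come from the neural network’s predictions rather than a known structure, and the energy function is far richer, but the minimization idea is the same.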

Large pharmaceutical firms have not traditionally paid much attention to computational approaches, preferring more conventional experimental methods. But the costs of such laboratory work are rising rapidly: by one estimate, large pharma firms spend roughly $2.5 billion bringing a new drug to market, a cost that ultimately must be borne by consumers and their medical insurers in the form of sky-high prices for prescription drugs. One major reason for these escalating costs is the embarrassing fact that only about 10% of drugs that enter clinical trials are eventually approved by governmental regulatory agencies. Clearly the pharmaceutical companies must increase this success rate.

And the challenge looming ahead is even more daunting: the 20,000 genes of the human genome can malfunction in at least 100,000 ways, and the total number of interactions among human proteins runs into the millions, if not higher. As Chris Gibson, founder of Recursion Pharmaceuticals, explains, “If we want to understand the other 97 percent of human biology, we will have to acknowledge it is too complex for humans.”

In the wake of DeepMind’s achievement (December 2018), numerous commercial enterprises are pursuing computational protein folding using ML/AI-based strategies. Venture capital operations are certainly taking notice, having poured more than $1 billion into ML/AI-based startups in the pharmaceutical field during the past year. In addition to Recursion, mentioned above, other new firms include Insitro, which recently announced a partnership with Gilead Sciences; Benevolent AI, which has teamed up with AstraZeneca; and a University of California-based team that has partnered with GlaxoSmithKline. Along this line, Juan Alvarez, an associate vice president for computational chemistry at Merck, says that ML-based methods will be “critical” to the drug discovery and development process in the coming years. See this Bloomberg article for additional details.

So it appears that machine learning and artificial intelligence-based technology is destined to have a major impact in the pharmaceutical-biomedical world in the coming years. The reasons are not hard to find. As GlaxoSmithKline senior Vice President Tony Wood explains, “Where else would you accept a 1-in-10 success rate? … If we could double that to 20% it would be phenomenal.”

Homo Deus: A brief history of tomorrow

In his new book Homo Deus, Israeli scholar Yuval Noah Harari has published one of the most thoughtful and far-reaching analyses of humanity’s present and future. Building on his earlier Sapiens, Harari argues that although humanity has made enormous progress in the past few centuries, the future of our society, and even of our species, is uncertain.

Harari begins with a reprise of human history, from prehistoric times to the present. He then observes that although religious beliefs are much more nuanced and sophisticated than in the past, human society still relies heavily on the narratives they teach. Moral laws are widely believed to have been revealed from on high — whether in the Torah (Jews), in the New Testament (Christians), or in the Vedas (Hindus).

Are we now beyond these philosophical systems? Harari points out that modern “religions” also have their sets of “divine” laws. Communism, for instance, was thought by Karl Marx and other early proponents to be a “scientific” theory, based on natural laws. Even modern liberalism, dating from the Enlightenment, together with present-day secular humanism, rests on a religious creed, namely the fundamental belief that each human is endowed by natural law with free agency and unalienable rights. As Jefferson wrote in the U.S. Declaration of Independence,

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.

But just as modern science, biblical scholarship and philosophy have undercut some of the fundamental tenets of various religious movements, in a similar way modern science and technology are now drawing into question some of the fundamental tenets of liberalism and secular humanism. Harari argues that future technology, in particular artificial intelligence, is certain to be even more disruptive to these philosophical systems, and may draw into question even the definition of “human.” Indeed, humanity almost certainly will be transformed by this technology, and humans in our present-day sense may no longer exist 100 years from now.

Here are some of his observations:

- *Liberalism’s very success may contain the seeds of its ruin.* Scientists and technologists, pushed on by the supposedly infallible wishes of consumers, are devoting more and more energy to the three central projects of liberalism: (a) the preservation and extension of human life, (b) universal bliss (the “pursuit of happiness”), and (c) the attainment of superhuman levels of knowledge and power.
- *Attempting to realize these dreams will almost certainly unleash new post-humanist technologies that may spell the end of humanity as we know it.* Artificial intelligence is already an unstoppable force, certain to change every aspect of human life. As we offload more and more of the operation of the world to intelligent systems, and as we enhance ourselves with a variety of dimly envisioned new technologies, what will “human” mean?
- *Modern science and technology are undermining both the notion of a single human “self” and the notion of free agency.* One conclusion of modern psychology is that human consciousness is not a single entity but, in Marvin Minsky’s memorable words, a society of mind. Similar studies make it clear that one’s mind makes a decision between a few milliseconds and a second or more *before* consciously announcing the decision. What’s more, if the brain is fundamentally a computer, operating on its inputs possibly together with atomic-level quantum events, where is “free will”? Finally, given current and future technology, should the human “self” not include the array of intelligent assistants that a person utilizes?

What strikes this reviewer is that, if anything, Harari’s roadmap of human society steadily adopting intelligent systems and robots, and careening to a posthuman state, appears to be happening even faster than he anticipated (the book was first released in the U.S. in 2017). Consider:

- Many have already offloaded numerous tasks to their smartphones and personal computers, certainly including: (a) keeping a calendar of appointments and events, (b) managing messages and email, (c) managing personal and family finances, (d) managing physical activity and health, (e) keeping track of spouses and children, and even (f) finding potential romantic partners. Why do we willingly surrender so many highly personal details to intelligent assistants? Because the intelligent assistants are better at these tasks!
- Artificially intelligent, machine-learning-based schemes have already taken over much of market trading. For example, most conventional actively managed mutual funds do not exceed the returns of a simple buy-and-hold index fund strategy. Instead, the state-of-the-art action is with highly mathematical, big-data-based hedge funds.
- Over 200 million smartwatches have been sold (as of June 2019), with sophisticated features that track highly personal details, such as heart rates, the length and type of daily exercise, and how frequently one takes a break to stand and stretch. The latest Apple smartwatch, for instance, includes GPS mapping, wireless communication, heart rate monitoring, an electrocardiogram generator, and can automatically alert emergency services if it detects that the wearer has fallen and was not able to get up. Future editions are certain to greatly extend the list of available features.
- Social media services such as Facebook and Twitter are collecting enormous amounts of data on users’ personal characteristics and preferences. Indeed, ensuring privacy, avoiding the misuse of this data, and stopping the proliferation of fake news and doctored images has emerged as a major technical challenge and political issue confronting the technology industry.
- Artificial intelligence applications are advancing at a torrid pace. Once-unthinkable AI advances, such as self-teaching Go-playing programs and self-driving autos and trucks, have already been fielded. In May 2019, researchers at Google and several medical centers announced an AI system that, reading CT scans, detected and identified lung cancer on a par with the best human specialists. Such developments are certain to come at a breakneck pace in the coming years and decades.
- The wave of technology-based unemployment that has beset blue-collar and clerical industries is certain to take aim at higher-skill occupations. The financial industry, the medical care industry and even creative industries such as music composition are certain to be revolutionized. Harari has echoed warnings of others that a major societal challenge will be how to handle tens of millions of persons whose skills are no longer economically valued, and who are not really trainable for new emerging high-tech vocations. Even among other workers, hundreds of millions will need to re-invent themselves, perhaps more than once, over the space of their careers.

The final chapter of Harari’s book addresses what he calls the “data religion” — the observation that the collection, dissemination and analysis of data is emerging as more than just a technology development, but in fact the central governing principle of human (and posthuman) society, as we inevitably approach a singularity, to use a term coined by Vernor Vinge, Ray Kurzweil and others. As Harari summarizes this “religion,”

Dataists believe that humans can no longer cope with the immense flows of data, hence they cannot distil data into information, let alone into knowledge or wisdom. The work of processing data should therefore be entrusted to electronic algorithms, whose capacity far exceeds that of the human brain.

In this regard, Harari echoes predictions made by numerous other scientists and scholars. For example, mathematician Steven Strogatz, in a recent New York Times article, writes,

Maybe eventually our lack of insight would no longer bother us. After all, AlphaInfinity could cure all our diseases, solve all our scientific problems and make all our other intellectual trains run on time. We did pretty well without much insight for the first 300,000 years or so of our existence as Homo sapiens. And we’ll have no shortage of memory: we will recall with pride the golden era of human insight, this glorious interlude, a few thousand years long, between our uncomprehending past and our incomprehensible future.

Harari ends with three simple but deeply significant questions:

- Are organisms really just algorithms, and is life really just data processing?
- What’s more valuable — intelligence or consciousness?
- What will happen to society, politics and daily life when non-conscious but highly intelligent algorithms know us better than we know ourselves?

These are good questions. But if anything, it seems to this reviewer that Harari is soft-pedaling the changes that are to come, which are certain to impact every aspect of modern society, from its fundamental philosophical underpinnings and governmental systems to highly personal details of day-to-day living. Hold on to your hats!

Mathematicians prove result tied to the Riemann hypothesis

Four mathematicians, Michael Griffin of Brigham Young University, Ken Ono of Emory University (now at University of Virginia), Larry Rolen of Vanderbilt University and Don Zagier of the Max Planck Institute, have proven a significant result that is thought to be on the roadmap to a proof of the most celebrated of unsolved mathematical conjectures, namely the Riemann hypothesis. First, here is some background:

The Riemann hypothesis was first posed by the German mathematician Georg Friedrich Bernhard Riemann in 1859, in a paper where he observed that questions regarding the distribution of prime numbers were closely tied to a conjecture regarding the behavior of the “zeta function,” namely the beguilingly simple expression $$\zeta(s) \; = \; \sum_{n=1}^\infty \frac{1}{n^s} \; = \; \frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \cdots$$ Leonhard Euler had previously considered this series in the special case $s = 2$, in what was known as the Basel problem, namely to find an analytic expression for the sum $$\sum_{n=1}^\infty \frac{1}{n^2} \; = \; \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots \; = \; 1.6449340668482264365\ldots$$ Euler discovered, and then proved, that this sum, which is $\zeta(2)$, is none other than $\pi^2/6$. Similarly, $\zeta(4) = \pi^4/90$ and $\zeta(6) = \pi^6/945$, with analogous results for all positive even integer arguments. Euler subsequently proved that $$\zeta(s) \; = \; \prod_{p \; {\rm prime}} \frac{1}{1 - p^{-s}} \; = \; \frac{1}{1 - 2^{-s}} \cdot \frac{1}{1 - 3^{-s}} \cdot \frac{1}{1 - 5^{-s}} \cdot \frac{1}{1 - 7^{-s}} \cdots,$$ which clearly indicates an intimate relationship between the zeta function and the prime numbers. Riemann examined the zeta function not just for real $s$, but also for complex $s$. The zeta function, as defined in the first formula above, only converges for $s$ with real part greater than one. But one can fairly easily show that $$\left(1 - \frac{1}{2^{s-1}}\right) \zeta(s) \; = \; \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n^s} \; = \; \frac{1}{1^s} - \frac{1}{2^s} + \frac{1}{3^s} - \cdots,$$ which converges whenever $s$ has positive real part, and that $$\zeta(s) = 2^s \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1 - s) \zeta(1 - s),$$ which permits one to define the function whenever the argument has a non-positive real part.
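These identities are easy to check numerically. The short script below (an illustrative sketch; the prime list and term count are arbitrary truncation choices) compares a partial sum of the defining series and a partial Euler product against the exact Basel value $\pi^2/6$ at $s = 2$:

```python
import math

# Compare truncations of the zeta series and the Euler product against
# the exact Basel value zeta(2) = pi^2/6.

def zeta_series(s, terms=100000):
    """Partial sum of the defining series for zeta(s), s > 1."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

def zeta_euler_product(s, primes):
    """Partial Euler product over the supplied primes."""
    prod = 1.0
    for p in primes:
        prod *= 1.0 / (1.0 - p**(-s))
    return prod

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
exact = math.pi**2 / 6
print(exact)                          # 1.6449340668...
print(zeta_series(2))                 # close to the exact value
print(zeta_euler_product(2, primes))  # slower convergence, same limit
```

Both truncations approach the same limit, a small-scale glimpse of the series-product identity that links the zeta function to the primes.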

The zeta function has a simple pole singularity at $s = 1$, and clearly $\zeta(s) = 0$ for negative even integers, since $\sin (\pi s / 2) = 0$ for such values (these are known as the “trivial” zeroes). But it has an infinite sequence of more “interesting” zeroes, the first five of which are shown here:

| Index | Real part | Imaginary part |
|-------|-----------|----------------|
| 1 | 0.5 | 14.134725141734693790… |
| 2 | 0.5 | 21.022039638771554992… |
| 3 | 0.5 | 25.010857580145688763… |
| 4 | 0.5 | 30.424876125859513210… |
| 5 | 0.5 | 32.935061587739189691… |

Note that the real part of these zeroes is always 1/2. Riemann’s famous hypothesis is that every one of the infinitely many nontrivial zeroes of the zeta function has real part exactly 1/2.
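The tabulated zeroes can be reproduced with a few lines of Python, assuming the mpmath arbitrary-precision library is available:

```python
# Reproduce the first few nontrivial zeta zeroes (requires mpmath).
from mpmath import mp, zetazero

mp.dps = 25                      # work to 25 significant digits
for k in range(1, 6):
    rho = zetazero(k)            # k-th nontrivial zero, as a complex number
    print(k, rho.real, rho.imag)
```

Each printed zero has real part 1/2 and imaginary part matching the table above.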

If the Riemann hypothesis is true, many important results would hold. Here are just two:

- *The Mobius function*: Define the Mobius function $\mu(n)$ of a positive integer $n$ as: $1$ if $n$ is a square-free positive integer with an even number of prime factors; $-1$ if $n$ is a square-free positive integer with an odd number of prime factors; and $0$ if $n$ has a squared prime factor. Then the statement that $$\frac{1}{\zeta(s)} \; = \; \sum_{n=1}^\infty \frac{\mu(n)}{n^s}$$ is valid, with the sum converging, for every $s$ with ${\rm Re}(s) \gt 1/2$, is equivalent to the Riemann hypothesis.
- *The prime-counting function*: For real $x \gt 0$, let $\pi(x)$ denote the number of primes less than or equal to $x$, and let ${\rm Li}(x) = \int_2^x 1/\log(t) \, {\rm d}t$. The “prime number theorem,” proven by Hadamard and de la Vallee Poussin in 1896, asserts that $\pi(x) / {\rm Li}(x)$ tends to one for large $x$. But if one assumes the Riemann hypothesis, a stronger result can be proved, namely $$|\pi(x) - {\rm Li}(x)| \lt \frac{\sqrt{x} \log(x)}{8 \pi}$$ for all sufficiently large $x$ (in fact, for all $x \ge 2657$).
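The Mobius identity can be checked numerically at a point where it is classical and does not require the Riemann hypothesis, for instance $s = 2$ (the equivalence with the hypothesis concerns the half-plane ${\rm Re}(s) \gt 1/2$). A small sketch with a simple sieve:

```python
import math

# Partial sums of mu(n)/n^2 approach 1/zeta(2) = 6/pi^2.

def mobius_sieve(limit):
    """mu(1..limit) via a simple sieve: flip sign for each prime factor,
    zero out anything divisible by a squared prime."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] *= -1
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0          # squared prime factor
    return mu

limit = 100000
mu = mobius_sieve(limit)
partial = sum(mu[n] / n**2 for n in range(1, limit + 1))
print(partial, 6 / math.pi**2)     # both near 0.6079...
```

The partial sum agrees with $6/\pi^2$ to several digits; it is the behavior of such sums for $s$ near the critical line that encodes the hypothesis itself.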

For details and some other examples, see this Wikipedia article.

In a remarkable new paper, published in the *Proceedings of the National Academy of Sciences*, the four mathematicians have resurrected a line of reasoning, long thought to be dead, originally developed by Johan Jensen and George Polya. Their proof relies on “Jensen polynomials,” which for an arbitrary real sequence $(\alpha(0), \alpha(1), \alpha(2), \ldots)$, integer degree $d$ and shift $n$ are defined as: $$J_\alpha^{d,n} (x) = \sum_{j=0}^d {d \choose j} \alpha(n+j) x^j.$$ We say that a polynomial with real coefficients is hyperbolic if all of its zeroes are real. Define $\Lambda(s) = \pi^{-s/2} \Gamma(s/2) \zeta(s) = \Lambda(1-s)$. Consider now the sequence of Taylor coefficients $(\gamma(n), n \geq 0)$ defined implicitly by $$(4z^2 - 1) \Lambda(z+1/2) \; = \; \sum_{n=0}^\infty \frac{\gamma(n) z^{2n}}{n!}.$$

Polya proved that the Riemann hypothesis is equivalent to the assertion that the Jensen polynomials associated with the sequence $(\gamma(n))$ are hyperbolic for all nonnegative integers $d$ and $n$.

What Griffin, Ono, Rolen and Zagier have shown is that for each $d \geq 1$, the associated Jensen polynomials $J_\gamma^{d,n}$ are hyperbolic for *all sufficiently large* $n$. This is not the same as *for every* $n$, but it certainly is a remarkable advance. In addition, the four authors proved that for $1 \leq d \leq 8$, the associated Jensen polynomials are hyperbolic for all $n \geq 0$. Prior to this result, hyperbolicity was known only for $1 \leq d \leq 3$ and all $n \geq 0$.
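The definitions are easy to probe numerically. The sketch below builds Jensen polynomials for simple toy sequences (not the actual coefficients $\gamma(n)$, which are far harder to compute) and tests whether all roots are real:

```python
import numpy as np
from math import comb

def jensen_poly_coeffs(alpha, d, n):
    """Coefficients of J_alpha^{d,n}(x) = sum_j C(d,j) alpha(n+j) x^j,
    listed highest-degree first (the convention numpy.roots expects)."""
    return [comb(d, j) * alpha(n + j) for j in range(d, -1, -1)]

def is_hyperbolic(coeffs, tol=1e-8):
    """True if every root of the polynomial is (numerically) real."""
    return bool(np.all(np.abs(np.roots(coeffs).imag) < tol))

# alpha(n) = n + 1 gives J^{2,0}(x) = 3x^2 + 4x + 1, roots -1 and -1/3: hyperbolic.
print(is_hyperbolic(jensen_poly_coeffs(lambda n: n + 1, d=2, n=0)))        # True

# alpha = (1, 0, 1, ...) gives J^{2,0}(x) = x^2 + 1, roots +/- i: not hyperbolic.
print(is_hyperbolic(jensen_poly_coeffs(lambda n: (n + 1) % 2, d=2, n=0)))  # False
```

The theorem asserts that, for the particular sequence $\gamma(n)$ defined above, the first kind of outcome holds for all sufficiently large $n$ at every degree $d$.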

Ken Ono emphasizes that he and the other authors did not invent any new techniques or new mathematical objects. Instead, the advantage of their proof is its simplicity (the paper is only eight pages long!). The idea for the paper was a “toy problem” that Ono presented for entertainment to Zagier during a recent conference celebrating Zagier’s 65th birthday. Ono thought that the problem was essentially intractable and did not expect Zagier to make much headway with it, but Zagier was enthused by the challenge and soon had sketched a solution. Together with the other authors, they fleshed out the solution and then extended it to a more general theory.

Kannan Soundararajan, a Stanford mathematician who has studied the Riemann Hypothesis, said “The result established here may be viewed as offering further evidence toward the Riemann Hypothesis, and in any case, it is a beautiful stand-alone theorem.”

The authors emphasize that their work definitely falls short of a full proof of the Riemann hypothesis. For all they know, the hypothesis may still turn out to be false, or the steps remaining in this or any other proposed proof outline may be so difficult that they defy efforts to complete them for many years to come. But the result is definitely encouraging.

It should be mentioned that some other manuscripts have circulated with authors claiming proofs, at least a few of which are by mathematicians with very solid credentials. However, none of these has ever gained any traction, so the only safe conclusion is that the Riemann hypothesis remains unproven and may be as difficult as ever.

Will it still be unproven 100 years from now? Stay tuned (if any of us are still around in 2119).

Computational tools help create new living organism

In a remarkable development with far-reaching consequences, researchers at the Cambridge Laboratory of Molecular Biology have used a computer program to rewrite the DNA of the well-known bacterium Escherichia coli (more commonly known as “E. coli”) to produce a functioning, reproducing species that is far more complex than any previous similar synthetic biology effort.

This effort has its roots in a project spearheaded by J. Craig Venter, the well-known maverick biomedical researcher known for the “shotgun” approach to genome sequencing pioneered by his team at Celera Corporation (later acquired by Quest Diagnostics). In May 2010, Venter announced success in his team’s effort to synthesize an organism whose DNA, designed by a computer program, had 1,080,000 base pairs. In a final step, Venter reported that the bacterium used this synthetic DNA to generate proteins in preference to those of its own genome.

For additional details, see this Science article and this New York Times report.

In the latest advance, researchers at the Cambridge Laboratory of Molecular Biology, led by molecular biologist Jason Chin, constructed a variant of the E. coli bacterium with a synthetically designed genome consisting of four million DNA base pairs, compared with just over one million in Venter’s 2010 experiment.

One objective of this study, according to Chin, is to explore why all living organisms on Earth today encode genetic information in the same curious way, and whether this design could be any different yet still yield functioning biology.

In normal biology, each nonoverlapping group of three base pairs (a “codon”) codes for one of 20 amino acids or for “stop.” Note that the code could in principle have distinguished up to 64 amino acids (since $4^3 = 64$), but only 20 are realized in real organisms, because multiple three-base-pair “words” code for the same amino acid, a redundancy which, according to DNA pioneer Francis Crick, may be a “frozen accident” of very early biology. For example, the DNA words CTT, CTC, CTA, CTG, TTA and TTG each code for the amino acid leucine, and TCT, TCC, TCA, TCG, AGT and AGC each code for serine. In total, 61 of the 64 possible three-base-pair words code for amino acids, and the remaining three, namely TAA, TAG and TGA, code for “stop.” See this table for details on the DNA code.

Chin’s team decided to try altering the E. coli genome, on the computer, by removing some of the superfluous codons. For example, they replaced all instances of TCG, which codes for serine, with AGC, which also codes for serine. In total, their genome uses just four codons for serine instead of six, and only two stop codons instead of three. This required editing the E. coli genome in over 18,000 DNA base-pair locations.
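The codon-replacement step can be sketched in a few lines. The sequence below is invented for illustration; a real recoding, as the team’s 18,000-site edit shows, must also handle overlapping genes and regulatory elements, which this toy version ignores:

```python
# Toy sketch of synonymous-codon replacement: walk a DNA string codon by
# codon and replace every TCG with AGC (both encode serine), leaving the
# encoded protein unchanged.

def recode(dna, old="TCG", new="AGC"):
    """Replace one codon with a synonymous one, reading in-frame triplets."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(new if c == old else c for c in codons)

genome = "ATGTCGGCTTCGTAA"          # Met-Ser-Ala-Ser-Stop (made-up sequence)
print(recode(genome))               # same protein, no TCG codons remain
```

Reading in-frame triplets, rather than doing a blind string substitution, matters: a TCG that straddles two codons must not be touched.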

The actual implementation of this altered genome was a challenge. The genome is much too long and complicated to insert into a cell in a single attempt. Instead, the researchers swapped in segments, step by step. When they were done, there were only four errors in the four-million-base-pair genome, and these were quickly corrected.

The bottom line is that the resulting species, which the researchers named “Syn61,” is viable and reproduces, but grows somewhat more slowly than standard E. coli and develops longer, rod-shaped cells. Otherwise the cells appear normal.

Chin notes that such “designer lifeforms” could be useful for certain biopharmaceutical applications, for example to make insulin for diabetes patients and drugs for cancer, multiple sclerosis and other conditions. Production lines for these drugs, which employ altered strains of bacteria, are often plagued with viruses, but with designer bacteria of the type that Chin’s team has developed, these production lines could be based on highly virus-resistant artificial bacterial species.

Thomas Ellis, Director of the Center for Synthetic Biology at Imperial College London, described the achievement in glowing terms: “No one’s done anything like it in terms of size or in terms of number of changes before.” Geneticist George Church of Harvard University agrees: “Chin’s success will embolden the rest of us working to make many organisms (industrial microbes, plants, animals, and human cells) resistant to all viruses by this recoding approach.”

There is even the possibility that research like this will ultimately reveal the origin of life, or, at least, the DNA structure of the last universal common ancestor of all present-day species. For example, what is the bare minimum functionality for a species that can make copies of itself?

However this turns out, an era of synthetic biology will soon be upon us. How will we deal with it? Chin acknowledges the challenges:

People have legitimate concerns. … There is a dual use to anything we invent. But what’s important is that we have a debate about what we should and shouldn’t do. And that these experiments are done in a well controlled way.

For additional details, see this New York Times article, this BBC report and this Guardian report. The Cambridge Laboratory’s Nature paper is here.

]]>The Kepler conjecture is the assertion that the simple scheme of stacking oranges typically seen in a supermarket has the highest possible average density, namely pi/(3 sqrt(2)) = 0.740480489…, for any possible arrangement, regular or irregular. It is named after 17th-century astronomer Johannes Kepler, who first proposed that planets orbited in elliptical paths around the sun.

In the early 1990s, Thomas Hales, following an approach first suggested by Laszlo Fejes Toth in 1953, determined that the maximum density of all possible arrangements could be obtained by minimizing a certain function with 150 variables. In 1992, assisted by his graduate student Samuel Ferguson (son of famed mathematician-sculptor Helaman Ferguson), Hales embarked on a multi-year effort to find a lower bound for the function for each of roughly 5000 different configurations of spheres, using a linear programming approach. In 1998, Hales and Ferguson announced that their project was complete, documented by 250 pages of notes and three Gbyte of computer code, data and results.

Hales’ computer-assisted proof generated some controversy: the journal *Annals of Mathematics* originally demurred, although it ultimately accepted his paper. Hales thus decided to embark on a collaborative effort to certify the proof via automated proof-checking (“formal methods”) software. This project, named Flyspeck, was begun in 2003 but not completed until 2014.

Mathematicians have been investigating similar sphere packing problems in higher dimensions for many years. These problems naturally arise in fields such as error-correcting codes, which are used by everything from mobile phones to space probes: given a signal corrupted by noise, which is the closest signal that could have generated it, and how much noise can corrupt a signal before it can no longer be uniquely recovered, statistically speaking?

Two cases have been particularly troublesome: dimensions 8 and 24. Thus mathematicians studying the field were elated to hear, in March 2016, that the dimension 8 case had been solved by Ukrainian mathematician Maryna Viazovska, who was then a postdoctoral researcher at the Berlin Mathematical School and the Humboldt University of Berlin. Unlike Hales’ 250-page proof, Viazovska’s proof was a mere 23 pages long.

Viazovska proved that the E8 lattice gives the most efficient sphere packing in 8 dimensions. E8 is a remarkable structure, beautifully illustrated in the graphic at the right. It is a Lie group of dimension 248, and is unique among simple compact Lie groups in having these four properties: it has a trivial center, and it is compact, simply connected and simply laced (i.e., all roots have the same length). In 2007, researchers affiliated with the American Institute of Mathematics succeeded in calculating the inner structure of E8 in a large supercomputer calculation.

The champagne was not even dry from the celebration over Viazovska’s E8 result before she, in collaboration with Henry Cohn, Abhinav Kumar, Stephen D. Miller and Danylo Radchenko, applied similar techniques to prove that the Leech lattice is the optimal structure for dimension 24.
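
Since the sphere centers in this packing sit at the points of the E8 lattice, the “closest signal” question from coding theory becomes: given a noisy vector in 8-dimensional space, find the nearest E8 lattice point. Below is a minimal Python sketch of such a decoder (our own illustration, not from the papers discussed here), using the standard description of E8 as the set of vectors whose coordinates are either all integers or all half-integers, in each case with an even coordinate sum:

```python
def _round_with_parity(x, offset):
    # Round each coordinate to the nearest value in Z + offset, then fix
    # the parity of the coordinate sum (which must be even in E8) by
    # adjusting the coordinate whose rounding error was largest.
    z = [round(c - offset) + offset for c in x]
    if round(sum(z)) % 2 != 0:
        i = max(range(len(x)), key=lambda j: abs(x[j] - z[j]))
        z[i] += 1.0 if x[i] > z[i] else -1.0
    return z

def decode_e8(x):
    # Try both cosets (all-integer and all-half-integer) and keep the
    # candidate closer to x in squared Euclidean distance.
    candidates = [_round_with_parity(x, 0.0), _round_with_parity(x, 0.5)]
    return min(candidates,
               key=lambda z: sum((a - b) ** 2 for a, b in zip(x, z)))

print(decode_e8([0.6] * 8))  # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
```

Here the half-integer point (0.5, …, 0.5), one of E8’s 240 minimal vectors, is closer to (0.6, …, 0.6) than any all-integer lattice point is.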

For additional details on these results, see this well-written Quanta Magazine article by Erica Klarreich. Maryna Viazovska’s paper on the 8-dimensional problem is available here. The joint paper on the 24-dimensional problem is available here.

It is now three years since the E8 and Leech lattice papers. Maryna Viazovska is now at the Swiss Federal Institute of Technology in Lausanne, Switzerland. She and her colleagues Henry Cohn, Abhinav Kumar, Stephen D. Miller and Danylo Radchenko have now proven something even more significant: in a new 89-page paper, they show that the configurations that solved the sphere-packing problem in 8 and 24 dimensions also solve an infinite number of other related problems on the optimal arrangements for points attempting to “avoid” each other.

Proving universal optimality like this in high dimensions is much more difficult than solving the sphere-packing problem. This is in part because there are literally an infinite number of optimality problems, but also because the individual problems are more difficult. In sphere packing, each sphere only interacts with its nearest neighbors, whereas in problems with electrons, for instance, each electron interacts with every other electron in the universe, no matter how far away they are.

For their latest result, Henry Cohn, Abhinav Kumar, Stephen D. Miller and Danylo Radchenko focused on “monotonic repulsive forces,” which encompass most of the known physical forces. The sphere-packing problem is merely the limiting case with the requirement that the spheres not overlap. For any of these monotonic forces, the optimality question reduces to finding the lowest-energy configuration (the “ground state”) for an infinite collection of problems. In numerical studies of dimensions 8 and 24, Cohn and Kumar found bounds that were extremely close to the energies of the E8 and Leech lattices. They were thus led to wonder whether, for a given force field, there might be some optimal function whose bound exactly matched the energy of the E8 or Leech lattice.
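
The notion of a lattice’s “energy” is easy to experiment with numerically. The sketch below (our own illustration, not the authors’ method) sums the Gaussian potential f(r) = exp(-pi r^2) over the nonzero vectors of the integer lattice Z^8 and of E8, both of which have one lattice point per unit volume; because the potential decays so quickly, truncating to coordinates in [-2, 2] changes the totals only negligibly. E8 comes out with markedly lower energy, as universal optimality predicts:

```python
import itertools
import math

def gaussian_energy(points):
    # Energy per lattice point under f(r) = exp(-pi * r**2), summed over
    # the given lattice vectors, skipping the origin.
    return sum(math.exp(-math.pi * sum(c * c for c in p))
               for p in points if any(p))

rng = range(-2, 3)
z8 = list(itertools.product(rng, repeat=8))       # integer vectors
d8 = [p for p in z8 if sum(p) % 2 == 0]           # even coordinate sum
halves = [n + 0.5 for n in range(-2, 2)]
e8 = d8 + [p for p in itertools.product(halves, repeat=8)
           if sum(p) % 2 == 0]                    # E8 = D8 plus half-integer coset

print(f"{gaussian_energy(z8):.4f}")  # 0.9410
print(f"{gaussian_energy(e8):.4f}")  # 0.4558
```

The Gaussian potential is our choice for illustration only; the results discussed here cover a whole family of such potentials.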

What these researchers found is that such a “magic function” would have to take certain special values at certain points and, in addition, that its Fourier transform would also have to take special values at certain other points, which is a tall order. This is reminiscent of the uncertainty principle of quantum theory: the more one tries to pin down a particle’s position, the more difficult it is to pin down its momentum (in quantum theory, the position and momentum representations are related by a Fourier transform).

In the case of dimension 8 or 24, Viazovska conjectured that these restrictions limited any magic function to the border of possibility — any additional restrictions, and no such function could exist; any fewer, and such a function would not be unique. In the end, the researchers found a way to generate “magic functions” with exactly that property.

Thomas Hales, who proved the Kepler conjecture, was very impressed with this result, saying that “even in light of the earlier work, I would not have expected this [universal optimality proof] to be possible to do.” Sylvia Serfaty, a mathematician at New York University, described the work as being “at the level of the big 19th-century mathematics breakthroughs.”

Some additional details are given in this well-written Quanta Magazine article, also by Erica Klarreich, from which some of the discussion above was adapted. The new joint paper is available here.

]]>Right off, it may not sound like pi, climate change denial and young-Earth creationism have much in common. In fact, there is an important connection. Here is some background.

Pi = 3.1415926535…, namely the ratio between the circumference of a circle and its diameter, has fascinated not only mathematicians and scientists but the public at large for centuries. Archimedes (c.287–212 BCE) was the first to present a scheme for calculating pi as a limit of perimeters of inscribed and circumscribed polygons, as illustrated briefly in the graphic to the right (see this Math Scholar blog for details). In the centuries following Archimedes, ancient mathematicians worldwide used this approach to calculate pi to increasing accuracy. Beginning in the 17th century, in the wake of the discovery of calculus by Newton and Leibniz, numerous new formulas for pi were found, some permitting even more digits to be computed. This culminated in 1874 with Shanks’ hand calculation of pi to 707 digits (alas, only the first 527 were correct).
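
Archimedes’ scheme is easy to re-create today. In the Python sketch below (our own illustration), P and p are the perimeter-to-diameter ratios of the circumscribed and inscribed regular polygons; each doubling of the number of sides updates P by a harmonic mean and p by a geometric mean, and stopping at 96 sides reproduces bounds of the quality Archimedes obtained by hand:

```python
import math

P = 2 * math.sqrt(3)  # circumscribed hexagon: perimeter / diameter
p = 3.0               # inscribed hexagon: perimeter / diameter
sides = 6
while sides < 96:
    P = 2 * P * p / (P + p)  # harmonic mean gives the circumscribed 2n-gon
    p = math.sqrt(P * p)     # geometric mean gives the inscribed 2n-gon
    sides *= 2

print(f"{sides}-gon bounds: {p:.6f} < pi < {P:.6f}")
# 96-gon bounds: 3.141032 < pi < 3.142715
```

These bounds bracket Archimedes’ famous estimates 223/71 < pi < 22/7.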

In the 20th century, with the advent of the computer, pi was computed first to thousands, then to millions, then to billions, and, most recently, to trillions of digits. Facilitating these calculations are some remarkable new formulas for pi. For example, this paper presents a formula that permits one to calculate digits (in a binary or hexadecimal base) starting at an arbitrary starting position, without needing to calculate any of the digits that came before. Even more important was the discovery of some very clever computer algorithms applicable to computing pi. For example, in 1965 researchers found that the “fast Fourier transform” (FFT), a computational technique widely used in signal analysis and many other scientific fields, can be used to greatly accelerate high-precision multiplication.
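
The arbitrary-position digit formula mentioned above is commonly known as the BBP formula. Here is a compact Python sketch of the digit-extraction idea (an illustrative double-precision implementation, adequate only for modest starting positions); the key trick is modular exponentiation, which keeps every intermediate quantity small:

```python
def _frac_series(j, d):
    # Fractional part of sum over k of 16**(d-k) / (8k + j), using
    # three-argument pow for the modular exponentiation step.
    s = 0.0
    for k in range(d + 1):
        s = (s + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
    k, t = d + 1, 1.0
    while t > 1e-17:  # tail terms shrink geometrically
        t = 16.0 ** (d - k) / (8 * k + j)
        s = (s + t) % 1.0
        k += 1
    return s

def pi_hex_digits(d, n=8):
    # n hexadecimal digits of pi starting at position d+1 after the point
    x = (4 * _frac_series(1, d) - 2 * _frac_series(4, d)
         - _frac_series(5, d) - _frac_series(6, d)) % 1.0
    out = ""
    for _ in range(n):
        x *= 16
        out += "0123456789ABCDEF"[int(x)]
        x -= int(x)
    return out

print(pi_hex_digits(0))  # 243F6A88, since pi = 3.243F6A88... in hex
```

Note that computing digits at position 9, say, does not require any of the first 8 digits: `pi_hex_digits(8, 4)` directly returns the ninth through twelfth hex digits.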

The most recent record computation of pi, as of the present date, reached 31 trillion digits (actually 31,415,926,535,897 digits, to be exact, which happens to be the first 14 digits of pi), by a researcher at Google.

The subject of pi is dear to the present author, as he has published several papers and a book on this very subject. Thus it is with great sadness and consternation that one sees a growing number of writers (mostly lacking advanced mathematical training) who reject basic mathematical theory and the accepted value of pi, claiming instead that they have found pi to be some other value. For example, one author asserts that pi = 17 – 8 * sqrt(3) = 3.1435935394… Another author asserts that pi = (14 – sqrt(2)) / 4 = 3.1464466094… A third author promises to reveal an “exact” value of pi, differing significantly from the accepted value. For other examples, see this Math Scholar blog.
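
A quick numerical check, which anyone can run, shows how far these claimed values stray from the accepted value of pi — the disagreement already appears in the third decimal place:

```python
import math

# The alternative "values of pi" claimed by the authors mentioned above
claims = {
    "17 - 8*sqrt(3)": 17 - 8 * math.sqrt(3),
    "(14 - sqrt(2))/4": (14 - math.sqrt(2)) / 4,
}

for label, value in claims.items():
    print(f"{label} = {value:.10f}  (off by {value - math.pi:+.10f})")
```

Both claimed values differ from pi by roughly 0.002 to 0.005, i.e., by hundreds of billions of times more than the uncertainty in modern computations of pi.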

Needless to say, as any professional mathematician or, for that matter, any person who has at least completed a course in calculus will attest, such claims cannot possibly be correct. Tens of thousands of mathematicians, and hundreds of thousands of others who have been taught mathematics through the centuries, have derived many of these pi formulas for themselves, and have even calculated pi for themselves, checking their proofs and calculations very carefully. These proofs and calculations have also been checked by computer, in excruciating detail. It is also very well known that pi was proven to be transcendental by Lindemann in 1882 (and his proof has been carefully checked by countless mathematicians since). This means that pi cannot possibly be given by *any* finite algebraic expression, and most certainly not by a simple algebraic expression such as those mentioned in the previous paragraph.

In short, the fact that pi = 3.1415926535… (and not any variant value) is as certain as any assertion in all of mathematics, right up there with “the square root of two is not a rational number” and, for that matter, “two plus two equals four.”

So why do these and other writers insist that pi is otherwise? In reading through some of the email messages that these writers have sent to long lists of researchers (including the present author) to promote their views, two common themes emerge: (1) they are distrustful of the proofs of traditional results (such as Archimedes’ construction and Lindemann’s theorem), even though they scarcely understand them, and (2) they believe that the “professors” who object to their writings are deliberately ignoring or even colluding to suppress these writings.

It is similarly distressing to see the fact of climate change, along with the very likely fact that humans are the leading contributor to climate change, widely discounted in the public arena.

At this point in time, the basic facts of climate change are not disputable in the least. Careful planet-wide observations by NASA and others have confirmed that 2018 was the fourth-warmest year on record in global mean temperature. The only warmer years were 2016, 2017 and 2015, respectively, and 18 of the 19 warmest years in history have occurred since 2001. Countless observational studies and supercomputer simulations have confirmed both the fact of warming and the very likely conclusion that this warming is principally due to human activity. These studies and computations have been scrutinized in great detail by a climate science community numbering in the thousands, from all major nations, as summarized in the latest report by the Intergovernmental Panel on Climate Change (IPCC).

At this point in time, at least 97% of climate science researchers agree with these conclusions. Further, this consensus is supported by official statements from the American Association for the Advancement of Science, the American Chemical Society, the American Geophysical Union, the American Medical Association, the American Meteorological Society, the American Physical Society, the Geological Society of America, the U.S. National Academy of Sciences and other scientific societies worldwide.

According to the latest report by the IPCC, impacts on natural and human systems are already occurring, and even a warming of 1.5 C, which at this point can hardly be averted, will have very serious consequences, including more extreme temperature events, more instances of heavy precipitation, more severe droughts, rising sea levels damaging cities and agricultural lands, as well as enormous stress on ecosystems worldwide.

And yet, although there has been some progress, large numbers of Americans in particular continue to deny this consensus. In a 2017 Pew Research Center survey, 23% denied that there is any solid evidence that the Earth has been warming, and of those who acknowledge warming, nearly half doubted that it is due to human activities.

So why are so many skeptical of the scientific consensus? In part, it is because many in the public are simply not aware of the degree of consensus among scientists. According to a separate 2017 Pew Research Center survey, only 27% agreed that “almost all” scientists are in agreement; 35% said only “more than half,” and 35% said half or fewer. But even more disturbingly, only 32% agreed that the “best available scientific evidence” influences climate scientists’ conclusions; 48% said only “some of the time,” and 18% said “not too often or never.” These results underscore the public’s severe level of distrust of scientists in general and climate scientists in particular.

Darwin’s On the Origin of Species by Means of Natural Selection was first published 160 years ago. In the intervening years scientists have scrutinized every aspect of Darwin’s theory in extraordinary detail. For example, Darwin acknowledged that there were numerous gaps in the fossil record, but since Darwin’s time, most of these gaps and others have been filled by discoveries of remarkable transitional fossils. For example, the origin of the first four-legged creatures (tetrapods) remained murky until 2004, when the “Tiktaalik” fossil was found in the Canadian Arctic (see graphic at right).

Technology for studying evolution has advanced far beyond Darwin’s wildest dreams. Radiometric dating produces extremely accurate and very reliable geological dates, amply confirming the fact that the geological ages are many millions of years old, and that the Earth itself is approximately 4.56 billion years old. Even more impressive is the enormous decline in price of DNA sequencing technology — by approximately a factor of 27 million since 2000. The DNA data obtained via this technology have dramatically confirmed the central tenets of evolution, including the common ancestry of all biological organisms in a phylogenetic family tree, in most cases exactly as had been previously reckoned based solely on similarities of physical forms and biological functions. Indeed, the data all but scream “evolutionary descent from common ancestors.” See DNA for an overview of the latest results.

Here is just one of many examples of how DNA data, in this case transposon (“jumping gene”) mutation data, can be used to determine the phylogenetic relationships (i.e., “family tree”) of various primates including humans. The columns labeled ABCDE denote five blocks of transposons, and x and o respectively denote that the block is present or absent in the genome of the given species. It is clear from this data that our closest primate relatives are chimpanzees and bonobos [Rogers2011, pg. 89].

    Transposon blocks:           A B C D E
            /--------- Human     o x x x x
           /---------- Bonobo    x x x x x
          / \--------- Chimp     x x x x x
        /------------- Gorilla   o o x x x
    ----|------------- Orangutan o o o x x
         \------------ Gibbon    o o o o o
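
The reasoning behind the table can be mimicked in a few lines of Python (a toy illustration, not the statistical machinery actually used in phylogenetics): simply count how many transposon blocks each species shares with humans.

```python
# Transposon presence (1 = x, 0 = o) for blocks A through E, per the table
data = {
    "Human":     [0, 1, 1, 1, 1],
    "Bonobo":    [1, 1, 1, 1, 1],
    "Chimp":     [1, 1, 1, 1, 1],
    "Gorilla":   [0, 0, 1, 1, 1],
    "Orangutan": [0, 0, 0, 1, 1],
    "Gibbon":    [0, 0, 0, 0, 0],
}

human = data["Human"]
shared = {sp: sum(a & b for a, b in zip(human, blocks))
          for sp, blocks in data.items() if sp != "Human"}

# Bonobo and Chimp top the list, with 4 shared blocks each
for sp, n in sorted(shared.items(), key=lambda kv: -kv[1]):
    print(f"{sp}: {n} blocks shared with humans")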

Sadly, a large percentage of the public also discounts the scientific consensus on evolution. According to a 2019 Pew Research survey, fully 31% of Americans assert that humans have existed in their present form since the beginning of time. According to a 2017 Gallup survey, 38% of Americans believe that God created humans in their present form within the past 10,000 years. Needless to say, such views are in severe conflict with the overwhelming majority (well over 99%) of qualified scientists in evolution-related fields, representing every major nation and religious movement on Earth — see Scientists on evolution.

We are living at the apex of scientific and technological progress. Within the past few decades, our species has discovered the laws of relativity, quantum physics and the standard model; unraveled the structure of DNA; found that the universe’s expansion is accelerating; discovered thousands of extrasolar planets; begun observing the universe via gravitational wave astronomy; increased worldwide life expectancy from 29 in 1880 to 71 today; developed space vehicles for travel to the Moon and Mars; developed computer technology that is one million times more powerful than just 25 years ago; linked the world with the Internet; placed a smartphone (or at least a cell phone) in the hands of 70% of the entire world population; dropped the price of genome sequencing by a factor of 27 million; and launched a new world of artificial intelligence and robotics whose impact we can only dimly foresee.

Yet a large fraction of the public resists even the most basic and overwhelmingly confirmed scientific findings, such as the value of pi, the fact of global warming and the fact of evolution.

So what can be done to counter this dreadful public mindset, which in the case of climate change denial and vaccination denial threatens our very existence? To begin with, numerous first-world nations, and the U.S. in particular, need to upgrade their educational systems. The U.S., for instance, achieves only mediocre results in international educational performance tests. Students who are better educated in school are less likely to be persuaded by science skeptics later.

Many of the public fear, some with good reason, that advanced technology will steal their jobs. Others fear technologies such as genetically modified foods, in spite of the fact that major scientific societies have overwhelmingly concluded that GMO foods are no more or less safe than conventional foods.

But for the most part, the hostility mentioned above appears to be a reaction to the public’s perception of scientists as “elites” who are out of touch with common people, due in part to scientists’ miserable failure to communicate the excitement of modern science to the public and to involve the public in scientific discovery.

We mathematicians and scientists have been very successful in our battles to prove theorems, make discoveries, analyze data, write journal articles and obtain grants. *But we are losing the war for the hearts and minds of the public.* What can we do? Here are some suggestions:

- Start a blog.
- Visit schools.
- Give public lectures.
- Write articles for science news forums.
- Study creative writing, arts and humanities to sharpen communication skills.
- Recognize those who do reach out in hiring, promotion, tenure and research funding decisions.
- Promote interdisciplinary coursework and studies at universities that combine the arts with science, working in synergy rather than in competition or opposition to other fields.
- Find ways to involve the public in research projects, for example by inviting the public to help with field studies or lending home computer cycles for data analysis.

In one memorable scene from the movie *Contact*, Ellie Arroway (Jodie Foster) views a galaxy from her spacecraft, and is so overcome with awe that she exclaims, “They should have sent a poet. So beautiful. So beautiful… I had no idea.” In a similar way, those of us involved in scientific research are often stunned by the beauty and elegance of science and the mathematics behind it, along with the remarkable (and quite mysterious) fact that we humans are able to comprehend these laws through diligent effort.

So why don’t we do more to share this wonder? Why don’t we write some poetry?

]]>The modern field of artificial intelligence (AI) began in 1950 with Alan Turing’s landmark paper Computing machinery and intelligence, which outlined the principles of AI and proposed a test, now known as the Turing test, for establishing whether AI had been achieved. Although early researchers were confident that AI systems would soon be a reality, inflated promises and expectations led to the “AI Winter” in the 1970s, a phenomenon that sadly was repeated again, in the late 1980s and early 1990s, when a second wave of AI systems also disappointed.

A breakthrough of sorts came in the late 1990s and early 2000s with the development of machine learning: statistical, Bayes-theorem-based methods that quickly displaced the older methods based mostly on formal reasoning. When these techniques were combined with steadily advancing computer technology, a gift of Moore’s Law, practical and effective AI systems finally began to appear.

One highly publicized advance was the defeat of two champion contestants on the American quiz show “Jeopardy!” by an IBM-developed computer system named “Watson.” The Watson achievement was particularly impressive because it involved natural language understanding, i.e. the understanding of ordinary (and often tricky) English text. Legendary Jeopardy champ Ken Jennings conceded by writing on his tablet, “I for one welcome our new computer overlords.”

The ancient Chinese game of Go is notoriously complicated, with strategies that can only be described in vague, subjective terms. For these reasons, many observers did not expect Go-playing computer programs to beat the best human players for many years, if ever. This pessimistic outlook changed abruptly in March 2016, when a computer program named “AlphaGo,” developed by researchers at DeepMind, a subsidiary of Alphabet (Google’s parent company), defeated Lee Se-dol, a South Korean Go master, 4-1 in a 5-game tournament. The DeepMind researchers further enhanced their program, which then in May 2017 defeated Ke Jie, a 19-year-old Chinese Go master thought to be the world’s best human Go player.

In an even more startling development, in October 2017, DeepMind researchers developed from scratch a new program, called AlphaGo Zero, which was programmed only with the rules of Go and a simple reward function and then instructed to play games against itself. After just three days of training (4.9 million training games), the AlphaGo Zero program had advanced to the point that it defeated the earlier AlphaGo program 100 games to zero. After 40 days of training, AlphaGo Zero’s performance was as far ahead of champion human players as champion human players are ahead of amateurs. Additional details are available in an Economist article, a Scientific American article and a Nature article. See also this excellent New York Times analysis by mathematician Steven Strogatz.

AI systems are doing much more than defeating human opponents in games. Here are just a few of the current commercial developments:

- Apple’s Siri and Amazon’s Alexa smartphone-based voice recognition systems are now significantly improved over the earlier versions, and speaker systems incorporating them are rapidly becoming a household staple.
- Facial recognition has also come of age, with Facebook’s facial recognition API and, even more impressively, with Apple’s 3-D facial recognition hardware and software, which is built into the latest iPhones and iPads.
- Self-driving cars are already on the road, and 3.5 million truck driving jobs, just in the U.S., are at risk within the next ten years.
- AI-powered surgical robots may soon perform some procedures faster and more accurately than humans.
- The financial industry already relies on financial machine learning methods, and a huge expansion of these technologies is coming, possibly displacing thousands of highly paid workers.
- Other occupations likely to be impacted include package delivery drivers, construction workers, legal workers, accountants, report writers and salespeople.

For other examples and additional details see this earlier Math Scholar blog.

Perhaps AI systems might eventually succeed in occupations such as delivering packages, driving cars and trucks, automating financial operations and cooking hamburgers, but surely not that pinnacle of human intelligence, namely discovering and proving mathematical theorems?

In fact, computer programs that discover new mathematical identities and theorems are already a staple of the field known as experimental mathematics. Here is just a handful of the many computer-based discoveries that could be mentioned:

- A new formula for pi with the property that it permits one to rapidly calculate binary or hexadecimal digits of pi at an arbitrary starting position, without needing to calculate digits that came before.
- The surprising fact that if the Gregory series for pi is truncated to 10
^{n}/2 terms for some n, the resulting decimal value is remarkably close to the true value of pi, except for periodic errors that are related to the “tangent numbers.” - Evaluations of Euler sums (compound infinite series) in terms of simple mathematical expressions.
- Evaluations of lattice sums from the Poisson equation of mathematical physics in terms of roots of high-degree integer polynomials.
- A new result for Mordell’s cube sum problem.
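
The Gregory-series curiosity in the list above is easy to witness firsthand: truncating pi = 4 (1 - 1/3 + 1/5 - 1/7 + …) to 10^6/2 = 500,000 terms yields a value that, at double precision, matches pi digit for digit except in the sixth decimal place (the first of the isolated errors). A sketch, our own illustration:

```python
import math

N = 500_000  # 10**6 / 2 terms of the Gregory series
s = 4 * math.fsum((-1) ** k / (2 * k + 1) for k in range(N))

print(f"truncated: {s:.12f}")        # truncated: 3.141590653590
print(f"pi:        {math.pi:.12f}")  # pi:        3.141592653590
```

The lone wrong digit reflects the leading term 1/N of the truncation error; the subsequent error terms, governed by the tangent-number expansion, lie below double precision for this N.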

In most of the above examples, the new facts were found by numerical exploration on a computer, but only later proven rigorously, the old-fashioned way, by human efforts. So what about computers actually proving theorems?

Actually, this is also old hat at this point in time. Perhaps the best example is Thomas Hales’ proof of the Kepler conjecture, announced in 1998: the assertion that the simple scheme of stacking oranges typically seen in a supermarket has the highest possible average density for any possible arrangement, regular or irregular. Hales’ original proof met with some controversy, since it involved a computation documented by 250 pages of notes and 3 Gbyte of computer code and data, so Hales and his colleagues entered the entire proof into a computer proof-checking program. In 2014 this process was completed and the proof was certified.

A new and remarkable development here is that several researchers at Google’s research center in Mountain View, California, have now developed an AI theorem-proving program. Their program works with the HOL Light theorem prover, which was used in Hales’ proof of the Kepler conjecture, and can prove, essentially unaided by humans, many basic theorems of mathematics. What’s more, they have provided their tool in an open-source release, so that other mathematicians and computer scientists can experiment with it.

The Google AI system was “trained” on a set of 10,200 theorems that the researchers had gleaned from several sources, including many sub-theorems of Hales’ proof of the Kepler conjecture. Most of these theorems were in the area of linear algebra, real analysis and complex analysis, but the Google researchers emphasize that their approach is very broadly applicable.

In the initial release, their software was able to prove 5919 theorems, or 58%, of the training set. When they applied the software to a set of 3217 new theorems that it had not previously seen, it succeeded in proving 1251, or 38.9%. Not bad for a brand-new piece of software…

Mathematicians are already envisioning how this software can be used in day-to-day research. Jeremy Avigad of Carnegie Mellon University sees it this way:

You get the maximum of precision and correctness all really spelled out, but you don’t have to do the work of filling in the details. … Maybe offloading some things that we used to do by hand frees us up for looking for new concepts and asking new questions.

For additional details see the authors’ technical paper, or this New Scientist article by Leah Crane.

So where is all this heading? A recent Time article features an interview with futurist Ray Kurzweil, who predicts an era, roughly in 2045, when machine intelligence will meet, then transcend human intelligence. Such future intelligent systems will then design even more powerful technology, resulting in a dizzying advance that we can only dimly foresee at the present time. Kurzweil outlines this vision in his book The Singularity Is Near.

Futurists such as Kurzweil certainly have their skeptics and detractors. Sun Microsystems founder Bill Joy is concerned that humans could be relegated to minor players in the future, if not extinguished. Indeed, in many cases AI systems already make decisions that humans cannot readily understand or gain insight into. But even setting aside such concerns, there is considerable unease about the societal, legal, financial and ethical challenges of these technologies, as exhibited by the current backlash against technology, science and “elites.”

One implication of all this is that education programs in engineering, finance, medicine, law and other fields will need to be upgraded to train students in machine learning and AI techniques. Along this line, large technology firms such as Amazon, Apple, Facebook, Google and Microsoft are aggressively luring top AI talent, including university faculty, with huge salaries. But clearly the field cannot eat its seed corn in this way; some solution is needed to permit faculty to continue teaching while still participating in commercial work.

But one way or the other, intelligent computers are coming. Society must find a way to accommodate this technology, and to deal respectfully with the many people whose lives will be affected. But not all is gloom and doom. Mathematician Steven Strogatz envisions a mixed future:

Maybe eventually our lack of insight would no longer bother us. After all, AlphaInfinity could cure all our diseases, solve all our scientific problems and make all our other intellectual trains run on time. We did pretty well without much insight for the first 300,000 years or so of our existence as Homo sapiens. And we’ll have no shortage of memory: we will recall with pride the golden era of human insight, this glorious interlude, a few thousand years long, between our uncomprehending past and our incomprehensible future.

P-hacking and scientific reproducibility
Recent public reports have underscored a crisis of reproducibility in numerous fields of science. Here are just a few of recent cases that have attracted widespread publicity:

- In 2012, Amgen researchers reported that they were able to reproduce fewer than 10 of 53 cancer studies.
- In 2013, in the wake of numerous recent instances of highly touted pharmaceutical products failing or disappointing when fielded, researchers in the field began promoting the All Trials movement, which would require participating firms and researchers to post the results of all trials, successful or not.
- In March 2014, physicists announced with great fanfare that they had detected evidence of primordial gravitational waves from the “inflation” epoch shortly after the big bang. However, other researchers subsequently questioned this conclusion, arguing that the twisting patterns in the data could be explained more easily by dust in the Milky Way.
- Also in 2014, backtest overfitting emerged as a major problem in computational finance and is now thought to be a principal reason why investment funds and strategies that look good on paper often fail in practice.
- In 2015, in a study by the Reproducibility Project, only 39 of 100 psychology studies could be replicated, even after taking extensive steps such as consulting with the original authors.
- Also in 2015, a study by the U.S. Federal Reserve was able to reproduce only 29 of 67 economics studies.
- In an updated 2018 study by the Reproducibility Project, only 14 out of 28 classic and contemporary psychology experimental studies were successfully replicated.
- In 2018, the Reproducibility Project was able to replicate only five of ten key studies in cancer research, with three inconclusive and two negative; eight more studies are in the works but incomplete.

The p-test, which was introduced by the British statistician Ronald Fisher in the 1920s, assesses whether the results of an experiment are more extreme than what one would expect given the null hypothesis. The smaller this p-value, argued Fisher, the greater the likelihood that the null hypothesis is false. However, even Fisher never intended the p-test to be a single figure of merit; rather, it was intended to be part of a continuous, nonnumerical process that combined experimental data with other information to reach a scientific conclusion.

Indeed, the p-test, used alone, has significant drawbacks. To begin with, the typically used threshold of p = 0.05 is not a particularly compelling level of evidence. And in any event, it is highly questionable to reject a result whose p-value is 0.051 while accepting as significant one whose p-value is 0.049.

The prevalence of the classic p = 0.05 value has led to the egregious practice that Uri Simonsohn of the University of Pennsylvania has termed p-hacking: proposing numerous varied hypotheses until a researcher finds one that meets the 0.05 level. Note that this is a classic multiple testing fallacy of statistics: perform enough tests and one is bound to pass any specific level of statistical significance. Such suspicions are justified given the results of a study by Jelte Wilcherts of the University of Amsterdam, who found that researchers whose results were close to the p = 0.05 level of significance were less willing to share their original data than were others that had stronger significance levels (see also this summary from Psychology Today).
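The arithmetic behind this multiple testing fallacy is easy to verify. A minimal sketch (hypothetical test counts, not data from any of the studies cited here):

```python
# If every null hypothesis is actually true, the chance that at least one
# of k independent tests "passes" the p < 0.05 threshold is 1 - 0.95^k,
# which grows rapidly with k.
for k in (1, 5, 20, 100):
    print(f"{k:3d} tests: {1 - 0.95**k:.1%} chance of at least one false positive")
```

With just 20 hypotheses, the chance of at least one spurious "significant" result is already about 64%.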

Along this line, it is clear that a sole focus on p-values can muddle scientific thinking, confusing significance with the size of an effect. For example, a 2013 study of more than 19,000 married persons found that those who had met their spouses online were less likely to divorce (p < 0.002) and more likely to have higher marital satisfaction (p < 0.001) than those who met in other ways. Impressive? Yes, but the divorce rate for online couples was 5.96%, only slightly below the 7.67% for the larger population, and the marital satisfaction score for these couples was 5.64 out of 7, only slightly better than the 5.48 for the larger population (see also this Nature article).

Perhaps a more important consideration is that p-values, even if reckoned properly, can easily mislead. Consider the following example, which is taken from a paper by David Colquhoun: Imagine that we wish to screen persons for potential dementia. Let’s assume that 1% of the population has dementia, and that we have a test for dementia that is 95% accurate (i.e., it is accurate with p = 0.05), in the sense that 95% of persons without the condition will be correctly diagnosed, and assume also that the test is 80% accurate for those who do have the condition. Now if we screen 10,000 persons, 100 presumably will have the condition and 9900 will not. Of the 100 who have the condition, 80% or 80 will be detected and 20 will be missed. Of the 9900 who do not, 95% or 9405 will be cleared, but 5% or 495 will be incorrectly tested positive. So out of the original population of 10,000, 575 will test positive, but 495 of these 575, or 86%, are false positives.
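Colquhoun’s arithmetic can be checked in a few lines of code, using the same numbers as in the example above:

```python
population = 10_000
prevalence = 0.01       # 1% of the population has the condition
sensitivity = 0.80      # 80% of true cases test positive
specificity = 0.95      # 95% of non-cases test negative (the p = 0.05 threshold)

sick = population * prevalence              # 100 people with the condition
healthy = population - sick                 # 9900 without
true_pos = sick * sensitivity               # 80 correctly detected
false_pos = healthy * (1 - specificity)     # 495 incorrectly flagged

total_pos = true_pos + false_pos            # 575 total positives
false_pos_rate = false_pos / total_pos      # ~0.86
print(f"{false_pos:.0f} of {total_pos:.0f} positives are false ({false_pos_rate:.0%})")
```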

Needless to say, a false positive rate of 86% is disastrously high. Yet this is entirely typical of many instances in scientific research where naive usage of p-values leads to surprisingly misleading results.

In light of such problems, the American Statistical Association (ASA) has issued a Statement on statistical significance and p-values. The ASA did not recommend that p-values be banned outright, but it strongly encouraged that the p-test be used in conjunction with other methods and not solely relied on as a measure of statistical significance, and certainly not viewed as a probability value. The ASA’s key points are the following:

- P-values can indicate how incompatible the data are with a specified statistical model.
- P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
- Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
- Proper inference requires full reporting and transparency.
- A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
- By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

The ASA statement concludes,

Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.

Along this line, as Jeff Leek wrote on the Simply Statistics blog, the problem is not that p-values are fundamentally unusable; rather, it is that they are all too often used by persons who are not highly skilled in statistics. Indeed, any given statistic can be misused in the wrong hands. Thus the only long-term solution is “to require training in both statistics and data analysis for anyone who uses data but particularly journal editors, reviewers, and scientists in molecular biology, medicine, physics, economics, and astronomy.”

Leek proposed the following alternatives (these items and the listed comments are condensed from his article):

- Statistical methods should only be chosen and applied by qualified data analytic experts. Comment: This is the best solution, but it may not be practical given the shortage of such experts.
- Where possible, report the full prior, likelihood and posterior details, together with results of a sensitivity analysis. Comment: In cases where this can be done it provides much more information about the model and uncertainty; but it does require advanced statistical expertise.
- Replace p-values with a direct Bayesian approach, reporting credible intervals and Bayes estimators. Comment: In cases where the model can be properly fit, this provides scientifically meaningful measures such as credible intervals; however it requires advanced expertise, and the results are still sample size dependent.
- Replace p-values with likelihood ratios. Comment: In cases where this can be done it would reduce the confusion with the null hypothesis; however likelihood ratios can be exactly computed only with relatively simple models and designs.
- Replace p-values with confidence intervals. Comment: Confidence intervals are sample size dependent and can be misleading for large samples.
- Replace p-values with Bayes factors. Comment: These require advanced expertise, depend on sample size and may still lead to unwanted false positives.
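To make the Bayesian alternatives above concrete, here is a small illustrative sketch (hypothetical data; pure-Python grid integration rather than a statistics library) that reports a central credible interval for a binomial success rate under a uniform prior:

```python
def beta_pdf(x, a, b):
    # Unnormalized Beta(a, b) density; normalization is handled by the grid sum.
    return x ** (a - 1) * (1 - x) ** (b - 1)

def credible_interval(successes, trials, level=0.95, grid=100_000):
    # With a uniform Beta(1,1) prior, the posterior after s successes in n
    # trials is Beta(s+1, n-s+1). Integrate it on a grid to find the central
    # credible interval.
    a, b = successes + 1, trials - successes + 1
    xs = [(i + 0.5) / grid for i in range(grid)]
    ws = [beta_pdf(x, a, b) for x in xs]
    total = sum(ws)
    cdf, lo, hi = 0.0, None, None
    for x, w in zip(xs, ws):
        cdf += w / total
        if lo is None and cdf >= (1 - level) / 2:
            lo = x
        if hi is None and cdf >= 1 - (1 - level) / 2:
            hi = x
    return lo, hi

print(credible_interval(60, 100))  # roughly (0.50, 0.69)
```

Unlike a bare p-value, the interval directly conveys both the size of the effect and the uncertainty, though, as Leek notes, the result still depends on the sample size.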

In summary, while p-values are often misused and are potentially misleading, nonetheless they are in the literature and are probably not completely going away anytime soon. What’s more, even the more sophisticated alternatives are prone to misuse.

Thus the only real long-term solution is for all scientific researchers and others who perform research work to be rigorously trained in modern statistics and how best to use these tools. Special attention should be paid to showing how statistical tests can mislead when used naively. Note that this education needs to be done not only for students and others entering the research work force, but also for those who are already practitioners in the field. This will not be easy but must be done.

Such considerations bring to mind a historical anecdote about the great Greek mathematician Euclid. According to an ancient account, when Pharaoh Ptolemy I of Egypt grew frustrated at the degree of effort required to master geometry, he asked his tutor Euclid whether there was some easier path. Euclid is said to have replied, “There is no royal road to geometry.”

The same is true for data analysis: there is no “royal road” to reliable, reproducible, statistically rigorous research. Many fields of modern science are in the midst of a data revolution. To list a few: astronomy (digital telescope data), astrophysics (cosmic microwave data), biology (genome sequence data), business (retail purchase and computer click data), chemistry (molecular simulation data), economics (econometric data), engineering (design and simulation data), environmental science (earth-orbiting satellite data), finance (market transaction data) and physics (accelerator data).

Those researchers who learn how to deal effectively with this data, producing statistically robust results, will lead the future. Those who do not will be left behind.

An n log(n) algorithm for multiplication
The discovery of decimal arithmetic in ancient India, together with the well-known schemes for long multiplication and long division, surely must rank as one of the most important discoveries in the history of science. The date of this discovery, by an unknown Indian mathematician or group of mathematicians, was recently pushed back to the third century CE, based on the recent dating of the Bakhshali manuscript, but it probably happened earlier, perhaps around 0 CE.

Computers, of course, do not use decimal arithmetic in computation (except input and output for human eyes). Instead they use binary arithmetic, either in an integer format, where the bits in a computer word represent a binary integer (with perhaps one bit as a sign), or else in floating-point format, where part of the computer word represents the data and part represents an exponent — the binary equivalent of writing a value in scientific notation, such as $1.2345 \times 10^{67}$. A widely-used 64-bit integer format can represent integers up to roughly $9.2 \times 10^{18}$, whereas a widely-used 64-bit floating-point format can represent numbers from about $10^{-308}$ to $10^{308}$, with nearly 16-significant-digit accuracy. Arithmetic on such numbers is typically done in computer hardware using binary variations of the basic schemes. For additional details on these formats, see this Wikipedia article.
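These ranges are easy to confirm directly; for example, in Python:

```python
import sys

print(2**63 - 1)               # largest signed 64-bit integer: 9223372036854775807
print(sys.float_info.max)      # largest IEEE-754 double: ~1.798e308
print(sys.float_info.min)      # smallest normal double: ~2.225e-308
print(sys.float_info.dig)      # 15 decimal digits always preserved
print(sys.float_info.epsilon)  # ~2.22e-16, i.e. roughly 16 significant digits
```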

For some scientific and engineering applications, however, even 16-digit accuracy is not sufficient, and so such applications rely on “multiprecision” arithmetic — software extensions to the standard hardware arithmetic operations. Cryptography, which is typically performed numerous times each day in one’s smartphone or laptop, requires computations to be done on integers as large as several thousand digits. Some computations in pure mathematics and mathematical physics research require extremely high-precision floating-point arithmetic — tens of thousands of digits in some cases (see for example this paper). Researchers exploring properties of the binary and decimal expansions of numbers such as $\pi$ have computed with millions, billions or even trillions of digits — the most recent record computation of $\pi$ was to 31 trillion digits (actually 31,415,926,535,897 decimal digits, which as a point of interest, happens to be the first 14 digits in the decimal expansion of $\pi$).

It was widely believed, even as late as the 1950s, that there was no fundamentally faster way of multiplying large numbers than the ancient scheme we learned in school, or a minor variation such as performing the operations in base $2^{32}$ rather than base ten. Then in 1960, the young Russian mathematician Anatoly Karatsuba found a clever technique for performing very high precision multiplication. By dividing each of the inputs into two substrings, he could find the product with only three substring multiplications instead of four, at the cost of a few extra additions. By continuing recursively to break down the input numbers in this fashion, he obtained a scheme whose cost grows approximately as $n^{1.58}$ (the exponent is $\log_2 3 \approx 1.585$), rather than the $n^2$ of the ordinary scheme, where $n$ is the number of computer words. But this was soon overshadowed by a scheme first formalized by Schonhage and Strassen, which we now describe.
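Karatsuba’s recursion is short enough to state in code. The following is an illustrative sketch operating on Python integers, with an arbitrary 32-bit base case, not a tuned implementation:

```python
def karatsuba(x, y):
    # Multiply nonnegative integers using three recursive half-size
    # multiplications instead of four:
    #   x = x1*2^m + x0,  y = y1*2^m + y0
    #   x*y = x1*y1*2^(2m) + ((x1+x0)*(y1+y0) - x1*y1 - x0*y0)*2^m + x0*y0
    if x < 2**32 or y < 2**32:
        return x * y                       # base case: a hardware-sized multiply
    m = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> m, x & ((1 << m) - 1)    # split each input into two halves
    y1, y0 = y >> m, y & ((1 << m) - 1)
    hi = karatsuba(x1, y1)
    lo = karatsuba(x0, y0)
    mid = karatsuba(x1 + x0, y1 + y0) - hi - lo
    return (hi << (2 * m)) + (mid << m) + lo

# Agrees with Python's built-in (exact) integer multiplication
a, b = 3**200, 7**150
assert karatsuba(a, b) == a * b
```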

A major breakthrough in performing such computations came with the realization that the fast Fourier transform (FFT), which was originally developed to process signal data, could be used to dramatically accelerate high-precision multiplication. The fast Fourier transform is merely a clever computer algorithm to rapidly calculate the frequency spectrum of a string of data interpreted as a signal. In effect, one’s ear performs a Fourier transform (in analog, not in digital) when it distinguishes the pitch contour of a musical note or a person’s voice.

The idea for FFT-based multiplication is, first of all, to represent a very high precision number as a string of computer words, each containing, say, 32 successive bits of its binary expansion (i.e., each entry is an integer between $0$ and $2^{32} - 1$). Then two such multiprecision numbers can be multiplied by observing that their product, computed the old-fashioned way (in base $2^{32}$ instead of base ten), is nothing more than the “linear convolution” of the two strings of computer words, a convolution that can be performed efficiently using an FFT. In particular, the FFT-based scheme for multiplying two high-precision numbers $A = (a_0, a_1, a_2, \cdots, a_{n-1})$ and $B = (b_0, b_1, b_2, \cdots, b_{n-1})$ is the following:

- Extend each of the $n$-long input strings $A$ and $B$ to length $2n$ by padding with zeroes.
- Perform an FFT on each of the extended $A$ and $B$ strings to produce $2n$-long complex strings denoted $F(A)$ and $F(B)$.
- Multiply the strings $F(A)$ and $F(B)$ together, term-by-term, as complex numbers.
- Perform an inverse FFT on the resulting $2n$-long string to yield a $2n$-long string $C$, i.e., $C = F^{-1}[F(A) \cdot F(B)]$. Round each entry of $C$ to the nearest integer.
- Starting at the end of $C$, release carries in each term, i.e., leave only the final 32 bits in each word, with the higher-order bits added to the previous word or words as necessary.
- The result $D$ is the product of $A$ and $B$, represented as a $2n$-long string of 32-bit integers.

Note, however, that several important details were omitted here. For instance, the product of two 32-bit numbers can be computed exactly using a 64-bit hardware multiply operation, but numerous such 64-bit values must be added together when computing an FFT, requiring somewhat more than 64-bit precision. Also, for optimal performance it is important to take advantage of the fact that both input strings $A$ and $B$ are real numbers rather than complex, and that the final inverse FFT returns real numbers in $C$. Finally, the outline above omitted mention of how to manage roundoff error when performing these operations. One remedy is to divide the data into even smaller chunks, say only 16 or 24 bits per word; another is to perform these computations not in the field of complex numbers using floating-point arithmetic, but instead in the field of integers modulo a prime number, so that roundoff error is not a factor. For additional details, see this paper.
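Putting the pieces together, the outline above can be sketched in pure Python, using 16-bit words (one of the roundoff remedies just mentioned) and a textbook recursive FFT in place of a tuned library routine:

```python
import cmath

def fft(a, invert=False):
    # Recursive radix-2 FFT; len(a) must be a power of two.
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

BASE = 1 << 16  # 16-bit words keep the convolution sums well within double precision

def to_words(x):
    # Little-endian list of base-2^16 words.
    words = []
    while x:
        words.append(x % BASE)
        x //= BASE
    return words or [0]

def from_words(words):
    return sum(w * BASE**i for i, w in enumerate(words))

def fft_multiply(x, y):
    A, B = to_words(x), to_words(y)
    n = 1
    while n < 2 * max(len(A), len(B)):        # pad to a power of two >= 2n
        n *= 2
    fa = fft([complex(a) for a in A] + [0j] * (n - len(A)))
    fb = fft([complex(b) for b in B] + [0j] * (n - len(B)))
    c = fft([u * v for u, v in zip(fa, fb)], invert=True)  # pointwise product
    conv = [round((v / n).real) for v in c]   # inverse FFT, then round to integers
    carry, words = 0, []
    for v in conv:                            # release carries word by word
        carry += v
        words.append(carry % BASE)
        carry //= BASE
    while carry:
        words.append(carry % BASE)
        carry //= BASE
    return from_words(words)

assert fft_multiply(123456789123456789, 987654321987654321) == \
       123456789123456789 * 987654321987654321
```

For numbers of this size the scheme is of course slower than the hardware multiply, but the cost of the convolution grows only as $n \log n$ rather than $n^2$.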

With a fast scheme such as FFT-based multiplication in hand, division can be performed by Newton iterations, with a level of precision that approximately doubles with each iteration. This scheme reduces the cost of high-precision division to only somewhat more than twice that of high-precision multiplication. A similar scheme can be used to compute high-precision square roots. See this paper for details.
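The Newton iteration for the reciprocal can be illustrated with Python’s decimal module standing in for a true multiprecision package (the function name and structure here are illustrative):

```python
from decimal import Decimal, getcontext

def reciprocal(b, digits=100):
    # Newton iteration for 1/b:  x <- x * (2 - b * x).
    # The number of correct digits roughly doubles with each iteration,
    # so a ~16-digit floating-point seed needs only a handful of steps.
    getcontext().prec = digits + 10
    bd = Decimal(b)
    x = Decimal(1 / float(b))    # low-precision starting guess
    correct = 16
    while correct < digits + 10:
        x = x * (2 - bd * x)
        correct *= 2
    return +x                    # round to the working precision

# 1/7 to 100 digits, matching direct high-precision division
assert abs(reciprocal(7) - Decimal(1) / Decimal(7)) < Decimal(10) ** -100
```

Each iteration costs a couple of full-precision multiplications, which is why fast multiplication makes division (and, similarly, square roots) fast as well.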

Schonhage and Strassen, in carefully analyzing a certain variation of the FFT-based multiplication scheme, found that for large $n$ (where $n$ is the number of computer words) the total computational cost, in terms of hardware operations, scales as $n \cdot \log(n) \cdot \log(\log(n))$. In practice, the FFT-based scheme is extremely fast for high-precision multiplication, since one often can utilize highly tuned library routines to perform the FFT operations.

Ever since the Schonhage-Strassen scheme was discovered and formalized, researchers have wondered whether this was truly the end of the road. For example, can we remove the final $\log(\log(n))$ factor and achieve just $n \cdot \log(n)$ asymptotic complexity?

Well, that day has now arrived. In a brilliant new paper, David Harvey of the University of New South Wales, Australia, and Joris van der Hoeven of the French National Centre for Scientific Research have presented a new algorithm with exactly that property.

The Harvey-van der Hoeven paper attacks the problem by splitting it into smaller subproblems, applying the FFT multiple times, and replacing more multiplications with additions and subtractions, all in a very clever scheme that eliminates the nagging $\log(\log(n))$ factor. The authors note that their result does not prove that there is no algorithm even faster than theirs. So maybe an even faster one will be found. Or not.

Either way, don’t put away your multiplication tables quite yet. The Harvey-van der Hoeven scheme is only superior to FFT-based multiplication for exceedingly high precision. But it is an important theoretical result that has wide-ranging implications in the field of computational complexity.

For some additional details, see this well-written Quanta article by Kevin Hartnett.

LENR: A skeptical perspective
As chronicled in the earlier Math Scholar blog, a community of researchers has been pursuing a new green energy source, known as “low energy nuclear reaction” (LENR) physics, or, variously, “lattice-assisted nuclear reaction” (LANR) physics or “condensed matter nuclear reaction” (CMNR) physics. This work harkens back to 1989, when University of Utah researchers Martin Fleischmann and Stanley Pons announced, in a hastily called news conference, that they had achieved desktop “cold fusion.” After several other laboratories failed to reproduce these findings, the scientific community quickly concluded that the Utah researchers were mistaken, to put it mildly.

But a number of other researchers did find excess heat and other signatures of interesting physics, and these researchers have continued to explore the phenomenon to the present day. If anything, this community, representing numerous nations and research institutions, has significantly grown in the past few years, with researchers claiming better results. The larger scientific community, however, continues to reject this work.

So what is the truth here? Is LENR a real phenomenon, one that could lead to a remarkable new clean, radiation-free form of energy? Or is it a mirage?

In an attempt to further explore this issue, the earlier Math Scholar blog presented a bibliography of 39 recent papers published by the LENR community reporting experimental work, and discussed the status of the field. It concluded that we are left with three rather stark choices: (a) well over 100 qualified researchers from around the world, representing universities, government laboratories and private firms, each have made some fundamental errors in their experimental work; (b) at least some of these researchers are colluding to cover less-than-forthright scientific claims; or (c) these researchers are making important discoveries that are not accepted by the larger scientific community.

So which is it? Wishful thinking or reality?

While many in the LENR community remain optimistic that they have identified a real and promising phenomenon, some other researchers who have been following the field remain highly skeptical of these results. These researchers have expressed their disagreements in a number of published papers, as well as in a variety of online discussion forums.

As a single example, the last-listed item below (May 2019) recounts the efforts of a multi-institutional team, including researchers from the University of British Columbia (Canada), the Massachusetts Institute of Technology, the University of Maryland, the Lawrence Berkeley National Laboratory and Google, to revisit LENR. These researchers failed to find evidence of fusion or heat, although additional investigation is required before the phenomenon can be ruled out completely. An overview of their work is available here.

Below is a bibliography of papers that present some of these skeptical assessments, and some responses by the LENR community. This listing is in rough chronological order (most recent last) and, for each entry, includes literature citation, PDF link and a brief synopsis.

- Kirk L. Shanahan, “A systematic error in mass flow calorimetry demonstrated,” *Thermochimica Acta*, vol. 387 (2002), 95-100, PDF. Synopsis: This paper re-analyzed the cold fusion study of E. Storms (presented in 2000 at ICCF8), suggesting a non-nuclear explanation of the signals.
- W. Brian Clarke, “Search for 3-He and 4-He in Arata-Style palladium cathodes I: A negative result,” *Fusion Science and Technology*, vol. 40 (2001), 147-151, PDF. Synopsis: A study of 4 Arata-style Pd cathodes found no evidence of 4He production.
- W. Brian Clarke, Brian M. Oliver, Michael C. H. McKubre, Francis L. Tanzella and Paolo Tripodi, “Search for 3-He and 4-He in Arata-Style palladium cathodes II: Evidence for tritium production,” *Fusion Science and Technology*, vol. 40 (2001), 152-167, PDF. Synopsis: Researchers found 3He (a tritium decay product) in Arata-style cathodes (same as in [2]).
- W. Brian Clarke, “Production of He in D2-loaded palladium-carbon catalyst I,” *Fusion Science and Technology*, vol. (2003), 122-127, PDF. Synopsis: A failed replication attempt of Case-method cold fusion.
- W. Brian Clarke, Stanley J. Bos and Brian M. Oliver, “Production of He in D2-Loaded palladium-carbon catalyst II,” *Fusion Science and Technology*, vol. 43 (2003), 250-255, PDF. Synopsis: A study of 4 Case-method samples showing strong evidence of air ingress, leading to artificial 4He signals, thus drawing into question earlier claims of 4He production.
- S. Szpak, P. A. Mosier-Boss, M. H. Miles and M. Fleischmann, “Thermal behavior of polarized Pd/D electrodes prepared by co-deposition,” *Thermochimica Acta*, vol. 410 (2004), 101-107, PDF. Synopsis: An experimental report of LENR, with negative comments regarding [1].
- Kirk L. Shanahan, “Comments on ‘Thermal behavior of polarized Pd/D electrodes prepared by co-deposition’,” *Thermochimica Acta*, vol. 428 (2005), 207-212, PDF. Synopsis: Shanahan’s response to [6], showing how the data in [6] support his skeptical conclusion.
- Edmund Storms, “Comment on papers by K. Shanahan that propose to explain anomalous heat generated by cold fusion,” *Thermochimica Acta*, vol. 441 (2006), 207-209, PDF. Synopsis: A response by Storms to the 2002 Shanahan paper, suggesting that Shanahan erred.
- Kirk L. Shanahan, “Reply to ‘Comment on papers by K. Shanahan that propose to explain anomalous heat generated by cold fusion’, E. Storms, *Thermochimica Acta*, 2006,” *Thermochimica Acta*, vol. 441 (2006), 210-214, PDF. Synopsis: A response to [8] delineating errors in [8].
- S. B. Krivit and J. Marwan, “A new look at low-energy nuclear reaction research,” *Journal of Environmental Monitoring*, vol. (2009), 1731, PDF. Synopsis: A positive overview article on the status of LENR (cold fusion).
- Akira Kitamura, Takayoshi Nohmi, Yu Sasaki, Akira Taniike, A. Tahahaski, R. Seto, Yushi Fujita, “Anomalous effects in charging of Pd powders with high density hydrogen isotopes,” *Physics Letters A*, vol. 373 (2009), 3109-3112, PDF. Synopsis: Claims of LENR in Pd/ZrO2 systems; a replication of a prior claim by Arata.
- Kirk L. Shanahan, “Comments on ‘A new look at low-energy nuclear reaction research’,” *Journal of Environmental Monitoring*, vol. 12 (2010), 1756-1764, PDF. Synopsis: A review of some of the problems and difficulties omitted from [10].
- J. Marwan, M. C. H. McKubre, F. L. Tanzella, P. L. Hagelstein, M. H. Miles, M. R. Swartz, Edmund Storms, Y. Iwamura, P. A. Mosier-Boss and L. P. G. Forsley, “A new look at low-energy nuclear reaction (LENR) research: a response to Shanahan,” *Journal of Environmental Monitoring*, vol. 12 (2010), 1765-1770, PDF. Synopsis: A response to [12]. Note: Shanahan has argued in reply that this response employs a faulty logical device (a strawman argument) to conclude that Shanahan [12] is wrong.
- Kirk L. Shanahan, “A realistic examination of cold fusion claims 24 years later. A whitepaper on conventional explanations for ‘cold fusion’,” *SRNL Technical Report SRNL-STI-2012-00678*, 22 Oct 2012, PDF. Synopsis: An unreviewed whitepaper with sections on Fleischmann and Pons calorimetry errors, a response to [13], a response to S. Krivit’s response to [12], and an unpublished manuscript regarding problems in [11].
- Curtis P. Berlinguette, Yet-Ming Chiang, Jeremy N. Munday, Thomas Schenkel, David K. Fork, Ross Koningstein and Matthew D. Trevithick, “Revisiting the cold case of cold fusion,” *Nature*, 27 May 2019, PDF. Synopsis: The authors embarked on a multi-institution program to re-evaluate cold fusion to a high standard of scientific rigour. They found no evidence of fusion or heat, but did find new insights into highly hydrided metals and low-energy nuclear reactions.