The 2024 Nobel Prize in Chemistry has been awarded to David Baker of the University of Washington and to Demis Hassabis and John Jumper of Google DeepMind. Hassabis and Jumper, now at the Google DeepMind Research Center in the U.K. (a subsidiary of Alphabet, Google’s parent company), developed an artificial intelligence-based software package that predicts with remarkable accuracy the structure of proteins. Baker, now at the University of Washington in Seattle, was recognized for his work on designing new proteins.
Demis Hassabis and John Jumper are the leading researchers behind AlphaFold 2, the computer-based system developed at DeepMind that has largely solved the computational protein folding problem, which is arguably one of the most significant and longest-running challenges in modern biology — certainly the most significant in the area of computational biology.
Proteins are the workhorses of biology; prominent examples include enzymes that catalyze biochemical reactions, antibodies of the immune system, and structural and transport proteins such as collagen and hemoglobin.
There are many thousands of distinct proteins in human biology, and at least 200 million across the larger biological world. Each protein is specified by a sequence of amino acids, which is in turn encoded in DNA by the letters A, C, T and G. Thanks to the recent dramatic drop in the cost of DNA sequencing technology, determining a protein’s amino acid sequence is now fairly routine.
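The mapping from DNA letters to amino acids can be sketched in a few lines of Python. The codon table below is deliberately truncated to a handful of the 64 real codons, purely for illustration:

```python
# Minimal sketch of how a DNA sequence specifies a protein:
# each three-letter codon of A, C, G, T maps to one amino acid.
# Only a few of the 64 real codons are included here.
CODON_TABLE = {
    "ATG": "Met",  # also the start codon
    "TGG": "Trp",
    "GGC": "Gly",
    "AAA": "Lys",
    "TAA": None,   # stop codon
}

def translate(dna):
    """Translate a DNA coding sequence into a list of amino acids."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino is None:  # stop codon ends translation
            break
        protein.append(amino)
    return protein

print(translate("ATGTGGGGCAAATAA"))  # Met, Trp, Gly, Lys, then stop
```

The hard part, of course, is not this lookup but predicting how the resulting amino acid chain folds in three dimensions, as discussed below.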
The key to biology, however, is the three-dimensional shape of the protein — how a protein “folds” [Callaway2022]. Protein shapes can be investigated experimentally, using X-ray crystallography, but this is an expensive, error-prone and time-consuming laboratory operation. So for at least 50 years, researchers worldwide have been developing computer programs to model the folding process, given the input amino acid sequence [Dill2008].
Just a few of the many potential applications of this technology include studying the misshapen proteins thought to be the cause of Alzheimer’s disease, or studying the proteins behind various genetically-linked disorders such as cystic fibrosis and sickle-cell anemia. Prior to the development of AlphaFold 2, structures were known for only a tiny fraction of known proteins.
Given the daunting challenge and importance of the computational protein folding problem, in 1994 a community of researchers in the field organized a biennial competition known as the Critical Assessment of Protein Structure Prediction (CASP) [CASP2024]. At each iteration of the competition, the organizers announce a set of problems, to which teams of researchers worldwide then apply their best current tools. In 2018, the CASP competition had a new entry: AlphaFold, a machine-learning-based program developed by DeepMind, which outperformed all other entries. Then in 2020, the DeepMind team unveiled a new program, known as AlphaFold 2 [AlphaFold 2]. In the 2020 CASP competition, it achieved a 92% average score, far above the 62% achieved by the second-best program.
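To give a sense of what these scores measure: CASP’s headline numbers are GDT_TS scores, which reward predictions that place each residue close to its experimentally determined position. A simplified Python sketch follows; it assumes the predicted and experimental structures are already superimposed, whereas real CASP scoring also optimizes the superposition:

```python
# Simplified sketch of the GDT_TS metric used in CASP scoring.
# Inputs are per-residue C-alpha coordinates (x, y, z), assumed
# already optimally superimposed.
import math

def gdt_ts(predicted, experimental):
    """Average, over 1/2/4/8 Angstrom cutoffs, of the percentage
    of residues whose predicted position lies within the cutoff."""
    dists = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    return 100 * sum(
        sum(d <= c for d in dists) / len(dists) for c in cutoffs
    ) / len(cutoffs)
```

On this scale, a score in the 90s — AlphaFold 2’s typical result — means nearly every residue sits within experimental error of the true structure.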
The Nobel committee notes that with AlphaFold 2, the DeepMind team first calculated the structure of all known human proteins, and then extended their work to determine “the structure of virtually all the 200 million proteins that researchers have so far discovered when mapping Earth’s organisms.”
“It’s a game changer,” exulted German biologist Andrei Lupas, who has served as an organizer and judge for the CASP competition. “This will change medicine. It will change research. It will change bioengineering. It will change everything.” [Callaway2020]. Lupas mentioned how AlphaFold 2 helped crack the structure of a bacterial protein that he himself had been studying for many years: “The [AlphaFold 2] model … gave us our structure in half an hour, after we had spent a decade trying everything.”
For additional information, see the Nobel committee announcement and articles in Nature, New Scientist, New York Times and Scientific American.
Hassabis and Jumper were previously (2022) honored with the Breakthrough Prize for their work; for details, see this earlier Math Scholar article.
In a previous Math Scholar article, we highlighted some recent developments in sophisticated computer tools being applied to the enterprise of research mathematics. These tools include:
- Typesetting tools (usually LaTeX), combined with tools such as MathJax for embedding typeset mathematics into web pages.
- Collaboration tools such as blogs, FaceTime, Skype, Zoom, Slack and Microsoft Teams to facilitate communication and collaboration between researchers.
- Symbolic mathematical software such as Mathematica, Maple and Sage to perform increasingly powerful symbolic manipulations and derivations, and to generate graphics illustrating results.
- Custom-written code.
The previous Math Scholar article also included excerpts from an interview between UCLA Fields Medalist mathematician Terence Tao and Christoph Drosser, a writer for Spektrum der Wissenschaft (the German edition of Scientific American). Drosser and Tao discussed at length the current status of computer tools in mathematical research, including proof-checking software and also AI-based tools, and what is likely to be the future course of the field. In the interview, Tao noted that the rise of automated proof checkers (his favorite is Lean) has fundamentally changed the trust equation between mathematical collaborators. Now much larger teams can mutually divide a proof task, with each collaborator’s work being fed into a proof checker to ensure validity.
Tao has now shared more thoughts on his vision of artificial intelligence and other advanced computer-based tools for assisting mathematical researchers. In an Atlantic article dated 4 October 2024, based on an interview with Atlantic writer Matteo Wong, Tao provided more details of his vision. Here are a few excerpts from the interview:
Wong: What was your first experience with ChatGPT?
Tao: I played with it pretty much as soon as it came out. I posed some difficult math problems, and it gave pretty silly results. It was coherent English, it mentioned the right words, but there was very little depth. Anything really advanced, the early GPTs were not impressive at all. They were good for fun things—like if you wanted to explain some mathematical topic as a poem or as a story for kids. Those are quite impressive.
Wong: OpenAI says o1 [a new release focused on mathematics and science] can “reason,” but you compared the model to “a mediocre, but not completely incompetent” graduate student [clarified to mean “research assistant”].
Tao: Right, it’s the equivalent, in terms of serving as that kind of an assistant. But I do envision a future where you do research through a conversation with a chatbot. Say you have an idea, and the chatbot went with it and filled out all the details.
It’s already happening in some other areas. AI famously conquered chess years ago, but chess is still thriving today, because it’s now possible for a reasonably good chess player to speculate what moves are good in what situations, and they can use the chess engines to check 20 moves ahead. I can see this sort of thing happening in mathematics eventually: You have a project and ask, “What if I try this approach?” And instead of spending hours and hours actually trying to make it work, you guide a GPT to do it for you.
With o1, you can kind of do this. I gave it a problem I knew how to solve, and I tried to guide the model. First I gave it a hint, and it ignored the hint and did something else, which didn’t work. When I explained this, it apologized and said, “Okay, I’ll do it your way.” And then it carried out my instructions reasonably well, and then it got stuck again, and I had to correct it again. The model never figured out the most clever steps. It could do all the routine things, but it was very unimaginative.
One key difference between graduate students and AI is that graduate students learn. You tell an AI its approach doesn’t work, it apologizes, it will maybe temporarily correct its course, but sometimes it just snaps back to the thing it tried before. And if you start a new session with AI, you go back to square one. I’m much more patient with graduate students because I know that even if a graduate student completely fails to solve a task, they have potential to learn and self-correct.
[some discussion skipped]
Wong: You’ve also said previously that computer programs might transform mathematics and make it easier for humans to collaborate with one another. How so? And does generative AI have anything to contribute here?
Tao: Technically they aren’t classified as AI, but proof assistants are useful computer tools that check whether a mathematical argument is correct or not. They enable large-scale collaboration in mathematics. That’s a very recent development.
Math can be very fragile: If one step in a proof is wrong, the whole argument can collapse. If you make a collaborative project with 100 people, you break your proof in 100 pieces and everybody contributes one. But if they don’t coordinate with one another, the pieces might not fit properly. Because of this, it’s very rare to see more than five people on a single project.
With proof assistants, you don’t need to trust the people you’re working with, because the program gives you this 100 percent guarantee. Then you can do factory production–type, industrial-scale mathematics, which doesn’t really exist right now. One person focuses on just proving certain types of results, like a modern supply chain.
The problem is these programs are very fussy. You have to write your argument in a specialized language—you can’t just write it in English. AI may be able to do some translation from human language to the programs. Translating one language to another is almost exactly what large language models are designed to do. The dream is that you just have a conversation with a chatbot explaining your proof, and the chatbot would convert it into a proof-system language as you go.
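To illustrate the “specialized language” Tao mentions: even a statement a mathematician would accept without comment must be written out in the proof assistant’s own syntax before the checker will certify it. A minimal example in Lean 4 (the system Tao favors, as noted above), where the proof simply appeals to the core-library lemma `Nat.add_comm`:

```lean
-- "Addition of natural numbers is commutative," stated and
-- proved in Lean's formal language. The proof term appeals
-- to the library lemma Nat.add_comm.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The hoped-for role of large language models is precisely to bridge the gap between an informal English argument and statements of this fully formal kind.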
In short, the consensus of Tao and other leading-edge researchers in the area of computer-assisted mathematical research is that the technology is still in relative infancy. Much more work must be done before these tools can be mainstays of the research community in the same way, say, that Mathematica, Maple and Sage are for symbolic manipulations. And, frankly, the central idea and inspiration for a proof still must arise from a human mathematician. In other words, these computer- and AI-based tools are better thought of as “co-pilots.” As Terence Tao explained in the previous interview (see here for the full copy):
Tao: I think in three years AI will become useful for mathematicians. It will be a great co-pilot. You’re trying to prove a theorem, and there’s one step that you think is true, but you can’t quite see how it’s true. And you can say, “AI, can you do this stuff for me?” And it may say, “I think I can prove this.”

I don’t think mathematics will become solved. If there was another major breakthrough in AI, it’s possible, but I would say that in three years you will see notable progress, and it will become more and more manageable to actually use AI. And even if AI can do the type of mathematics we do now, it means that we will just move to a higher type of mathematics.

So right now, for example, we prove things one at a time. It’s like individual craftsmen making a wooden doll or something. You take one doll and you very carefully paint everything, and so forth, and then you take another one. The way we do mathematics hasn’t changed that much. But in every other type of discipline, we have mass production. And so with AI, we can start proving hundreds of theorems or thousands of theorems at a time. And human mathematicians will direct the AIs to do various things. So I think the way we do mathematics will change, but their time frame is maybe a little bit aggressive.
In the past few decades, modern science has uncovered a universe that is far vaster and more awe-inspiring than ever imagined before, together with a set of elegant natural laws that deeply resonate with the idea of a cosmic lawgiver. In spite of these exhilarating developments, some writers, principally of the creationist and intelligent design communities, prefer a highly combative approach to science, particularly to geology and evolution.
In addition to citing deeply flawed probability-based arguments (see Probability), these evolution skeptics often assert that science has yet to understand the origin of life, and that this is a fatal flaw in evolutionary theory [Behe1996; Dembski1998]. For example, in the 2005 Dover, Pennsylvania intelligent design trial, the Dover Area School Board passed a resolution saying that “The Theory [of evolution] is not a fact. Gaps in the Theory exist for which there is no evidence.” It continued, “Intelligent Design is an explanation of the origin of life that differs from Darwin’s view.” [Lebo2008, pg. 62].
So to what extent does modern science understand the origin of life? What are the latest developments in this arena?
It is certainly true that as of this writing, scientists do not yet fully understand abiogenesis (the formal term for the origin of life on Earth — see [Abiogenesis2022]). In particular, the origin of the first self-reproducing biomolecules, on which evolutionary processes could operate to produce more complicated systems, remains unknown. What’s more, unlike bony structures that leave fossil records, the early stages of biological evolution on Earth very likely have been completely erased, so that we may never know with certainty the full details of what transpired. If anything, the very rapid appearance of life on Earth after it first formed suggests that the origin of life was a reasonably likely event, given a few tens of millions of years (see below), but we have no way to know for sure. It may have been an extraordinarily improbable event.
The question of the origin of life on Earth is deeply intertwined with Fermi’s paradox: If the origin of life is inevitable on a planet such as ours, and if life inexorably increases in complexity and intelligence to a technology-capable society, then why after 50+ years of searching have we not yet found any evidence of any extraterrestrial civilization? There is no easy answer here — see Fermi’s paradox for further analysis.
It should be emphasized that research in abiogenesis is fundamentally no different, philosophically or methodologically, than research in any other field of science — the fact that this event occurred approximately four billion years ago makes no difference whatsoever. Scientists routinely study phenomena at the atomic and subatomic level that are far smaller than what can be viewed by eye, or even via optical microscopes, and they also study phenomena in distant galaxies that are far beyond current technology to visit in person; note also that the events viewed in these galaxies occurred millions or billions of years ago, since these objects are typically millions or billions of light-years away. Yes, there are unanswered questions in abiogenesis, but there are also unanswered questions even in areas of science that one would think are extremely well established, such as gravitational physics [Grossman2012a], cosmology [Barnes2013; Susskind2005] and reproductive biology [Ridley1995]. Thus claims by evolution skeptics that unknowns in the origin of life arena “prove” that scientists do not have all the answers are only met with puzzled stares by real research scientists — after all, exploring unknown, unanswered questions is what science is all about and what researchers explore in approximately two million peer-reviewed papers published each year [Ware2012].
The first major result in the field of abiogenesis was a 1953 experiment by Stanley Miller and Harold Urey. In this experiment, they placed water plus some gases in a sealed flask and passed electric sparks through the mixture to simulate the effects of sunlight and lightning. Over the next week or so, the mixture slowly turned a reddish-brown color; analysis revealed that it contained several amino acids, the building blocks of proteins [Davies1999, pg. 86-94]. Researchers have more recently observed that in current models of early Earth’s atmosphere and oceans, carbon dioxide and nitrogen would have reacted to form nitrites, which quickly destroy amino acids. Thus the Miller-Urey experiment might not be truly representative of what really happened on the early Earth, although some variation of the experiment might still be valid [Fox2007].
Going beyond the synthesis of basic amino acids, one leading hypothesis is that ribonucleic acid (RNA) played a key role. For example, researchers recently found that certain RNA molecules can greatly increase the rate of specific chemical reactions, including, remarkably, the replication of parts of other RNA molecules. Thus perhaps a molecule like RNA could catalyze its own replication, perhaps with the assistance of some related molecules, and then larger conglomerates of such compounds, packaged within simple membranes (such as simple hydrophobic compounds), could have formed very primitive cells [NAS2008, pg. 22]. Nonetheless, even the “RNA world” hypothesis [Cafferty2014], as this scenario is popularly known, faces challenges. As biochemist Robert Shapiro notes, “Unfortunately, neither chemists nor laboratories were present on the early Earth to produce RNA.” [Shapiro2007; Service2019].
But research on the RNA world hypothesis continues, with some remarkable recent discoveries. In May 2009, a Cambridge University team led by John Sutherland solved a problem that had perplexed researchers for at least 20 years, namely how the basic nucleotides (building blocks) of RNA could spontaneously assemble. As recently as 1999, the appearance of these nucleotides on the primitive Earth was widely thought to be a “near miracle” by researchers in the field [Joyce1999]. In the 2009 study, Sutherland and his team used the same starting chemicals that had been employed in numerous earlier experiments, but they tried many different orders and combinations. They finally discovered one order and combination that formed the RNA nucleotides cytosine and uracil, which are collectively known as the pyrimidines [Wade2009]. Then in March 2015, Sutherland’s team announced that two simple compounds can launch a cascade of chemical reactions that can produce all three major classes of biomolecules: nucleic acids, amino acids and lipids. Biologist Jack Szostak described the result as “a very important paper” [Service2015]. Among other things, Sutherland’s group has been able to generate 12 of the 20 amino acids that are used in living organisms [Wade2015a].
More recently, in May 2016, a team led by Thomas Carell, a chemist at Ludwig Maximilian University of Munich in Germany, reported a plausible way to form adenine and guanine, the two building blocks that Sutherland’s team had not been able to produce [Service2016]; the team subsequently found a simple set of reactions that could have formed all four RNA bases on the early Earth. Ramanarayanan Krishnamurthy, a biochemist at Scripps Research, said, “This paper has demonstrated marvellously the chemistry that needs to take place so you can make all the RNA nucleotides,” although he cautioned that this scenario might not be the actual pathway taken on the early Earth [Service2018; Castelvecchi2019].
One key question is when this abiogenesis event occurred. A recent consensus picture is as follows:

- The solar system was created 4.568 billion years ago in the aftermath of a stellar explosion.
- The Earth formed 4.53 billion years ago.
- The Moon formed 4.51 billion years ago from a collision of some other celestial body with the Earth.
- By 4.46 billion years ago, Earth had cooled enough to have both land and water.
- The first RNA formed about 4.35 billion years ago.
- Zircon crystals dated to 4.1 billion years ago show a ratio of carbon-12 and carbon-13 isotopes that is typical of photosynthetic life, and very recent results suggest an even earlier origin (see below).
- Earth is thought to have continued to be bombarded fairly heavily by meteorites until about 3.8 billion years ago (although see below).
- The earliest fossils attributed to microorganisms have been dated to 3.43 billion years ago.

See this 2019 Science article for an overview of the latest research in the area as of 2019 [Service2019].
In light of these results, there is a sense in the abiogenesis field that progress is accelerating. Here is a brief summary of some other recent (since 2015) developments, listed in chronological order, most recent listed last.
In October 2015, geochemists at the University of California, Los Angeles announced evidence that life very likely existed 4.1 billion years ago on Earth, which is at least 300 million years earlier than the previous research consensus. The research consisted of studying 10,000 zircon crystals from the Jack Hills area of Western Australia. Zircons, which are related to the synthetic cubic zirconia used for imitation diamonds, are extremely durable and virtually impermeable, so that material trapped inside them constitutes a very reliable picture of the environment when they formed. One of these zircons contained graphite (carbon) in two locations. These carbon dots had a ratio of carbon-12 to carbon-13 that is a characteristic signature of photosynthetic life. Researchers are very confident that this carbon is at least 4.1 billion years old, because the zircon that contains it is that old, based on careful measurements of its uranium-to-lead ratio [SD2015c].
In July 2016, researchers at Heinrich Heine University in Dusseldorf, Germany searched DNA databases for gene families shared by at least two species of bacteria and two archaea (a separate domain of single-celled microorganisms). After analyzing millions of genes and gene families, they found that only 355 gene families are truly shared across all modern organisms, and thus are the most probable genes shared by the “last universal common ancestor” (LUCA) of life. Further, these researchers found that the probable LUCA organism was an anaerobe, meaning that it grew in an environment devoid of oxygen. These results suggest that LUCA originated near undersea volcanoes [Service2016a].
In August 2016, researchers at the Scripps Research Institute created a ribozyme (a special RNA enzyme) that can both amplify genetic information and generate functional molecules. In particular, it can efficiently replicate short segments of RNA and can transcribe longer RNA segments to make functional RNA molecules with complex structures. These features are close to what scientists envision for an RNA replicator that could have supported life on the very early Earth, before the emergence of current biology, in which protein enzymes handle gene replication and transcription [SD2016b].
In March 2017, British geologists announced that they had found fossil specimens from a geological formation near Hudson Bay, Canada, with measured dates between 3.77 billion and 4.22 billion years old, earlier than any previous claim for biological organisms on Earth. These specimens are similar in appearance to tube-like structures found in modern-day undersea hydrothermal vents. If these findings are upheld, particularly at the 4.22 billion-year level, this would place life very soon after the earliest oceans formed [Zimmer2017a].
In May 2018, a team of researchers led by Philipp Holliger at the Medical Research Council Laboratory of Molecular Biology in Cambridge, UK, produced RNA enzymes that can copy RNA molecules even longer than themselves. The problem with many of these RNA experiments is that while RNA can act as an enzyme only when it is in a folded shape, RNA enzymes can only copy RNA molecules that are not folded. Holliger’s solution was to build RNA out of “trinucleotides,” slightly larger nucleotide molecules, and to conduct the experiments in an icy solution. Biologists consider his work to be a major step towards completely “natural” synthesis and enzymatic processing of RNA [LePage2018].
In April 2019, a team of researchers at the Scripps Research Institute found evidence that both RNA and DNA could have arisen from the same precursor molecules. In particular, they showed that the compound thiouridine, which could have been a precursor to RNA, could also be converted into a DNA building block, namely deoxyadenosine (the “A” of the DNA four-letter code) [SD2019a]. A well-written account of these results, with detailed background, is available in a Quanta Magazine article by Jordana Cepelewicz [Cepelewicz2019]. In a December 2020 update to these results, the Scripps team demonstrated that the relatively simple compound diamidophosphate, which plausibly existed on the early Earth, could have acted as a catalyst to connect links of RNA and DNA “letters.” What’s more, the process works better when the RNA or DNA chain has distinct letters [Scripps2020].
In March 2020, a team of researchers at the University of Groningen in the Netherlands discovered that mixtures of simple compounds based on carbon can spontaneously form relatively elaborate molecules that can construct copies of themselves. The researchers combined amino acids and nucleobases of DNA and RNA, which (as noted above) are now known to form spontaneously, given the right settings. When the compounds were mixed together, the amino acids and nucleobases linked to form ring-shaped molecules and then assembled into cylindrical stacks. These stacks then attracted other rings to stack together, in effect copying the stack [Marshall2020].
Also in March 2020, a team of German researchers found that certain metals, plentiful near ocean hydrothermal vents, could catalyze the reaction of water and CO2 to form acetate and pyruvate, key components of the “acetyl-CoA” pathway, a process that feeds the organic compounds that drive the production of proteins, carbohydrates and lipids in biological energy metabolism. Thomas Carell, the researcher mentioned above who found a pathway to produce key RNA nucleotides, called the result “thrilling” [Service2020].
In March 2022, a group of researchers at the Max Planck Institute for Astronomy in Germany announced that they had demonstrated experimentally that the condensation of carbon atoms on the surface of cold cosmic dust particles forms aminoketene molecules, which then can polymerize to produce peptides (molecular subunits of proteins) of different lengths. The process involves only carbon, carbon monoxide and ammonia, which are known to be present in star-forming clouds, and proceeds via a pathway that skips the stage of amino acid formation [Saplakoglu2022a].
In June 2022, a team of researchers at Ludwig Maximilian University of Munich, Germany found that peptides such as those described in the previous item could have co-evolved with RNA molecules — the RNA molecules may have enabled peptides to grow directly on them, and the peptides in return could have helped stabilize the RNA molecules, which otherwise decompose quickly. Andro Rios, a researcher at NASA Ames Research Center, called the result a “highly intriguing demonstration” [Saplakoglu2022b].
Also in 2022, researchers at the University of Tokyo found that a network of five RNA populations formed a replicator network with diverse interactions, including cooperation to help the replication of other members: “These results support the capability of molecular replicators to spontaneously develop complexity through Darwinian evolution, a critical step for the emergence of life” [Mizuuchi2022].
In March 2024, researchers at the Salk Institute for Biological Studies in California developed an RNA molecule that managed to make accurate copies of a different type of RNA. The work was hailed as one step in constructing an RNA molecule that can make accurate copies of itself. John Chaput of U.C. Irvine, who did not participate, called the achievement “monumental” [Johnson2024].
In July 2024, a team at the University of Bristol in the UK announced solid evidence that the last universal common ancestor (LUCA) must have arisen at least 4.2 billion years ago. Note that this means the 4.1 billion-year figure mentioned above for the origin of life may have to be revised even higher; in other words, life arose within the first few hundred million years of Earth’s existence. This result also seriously calls into question the long-held assumption that a “late heavy bombardment” sterilized Earth until 3.8 billion years ago [LePage2024; Sarchet2024].
Many of these results and others are outlined in a newly published (2024) book by Mario Livio and Jack Szostak, titled Is Earth Exceptional?: The Quest for Cosmic Life [Livio2024].
In spite of these advances, it must be acknowledged that there are some significant unknowns in the RNA world hypothesis, as it is currently understood [Cafferty2014; Livio2024].
In particular, researchers in the abiogenesis arena are still stuck with a stubborn unanswered question: How could large chains of RNA, sufficiently long to be the basis of primitive self-replicating evolutionary life, have spontaneously formed in the primordial Earth’s water-rich environment, which is thermodynamically unfavorable for the formation of such chains? The current consensus is that any such self-replicating RNA molecule would need at least 40-60 nucleotide bases (rungs in the chain), and most likely over 100, to possess even a minimal self-replicating function. What’s more, a pair of such molecules may be necessary, if one is to serve as a template for replication. Yet the largest number of bases that have been reproducibly demonstrated in laboratory experiments is 10, and the probability of successful formation drops sharply as the number of bases increases [Totani2020].
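A toy calculation, which is not Totani’s actual model, illustrates the combinatorial wall behind this argument: with four possible bases at each position, the number of distinct sequences of length n grows as 4^n, so the random assembly of any one specific sequence becomes fantastically improbable well before n reaches the 40-100 bases thought necessary:

```python
# Toy illustration (not Totani's model) of why long specific RNA
# chains are combinatorially improbable: with 4 possible bases
# per position, the number of distinct sequences grows as 4**n.
for n in (10, 40, 100):
    sequences = 4 ** n
    print(f"{n:3d} bases: 4^{n} = {sequences:.2e} possible sequences")
```

Even at 40 bases the count exceeds 10^24; at 100 bases it exceeds 10^60. The real chemistry is far more forgiving than pure random sampling (many sequences may be functional, and assembly need not be memoryless), which is why such back-of-envelope figures are only a starting point for analyses like Totani’s.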
In a 2020 paper published in Nature Scientific Reports, Japanese astronomer Tomonori Totani proposes a solution to this conundrum [Totani2020]. Using a fairly straightforward analysis, he found the probability of spontaneous assembly of a sufficiently long RNA chain to be the basis of life is negligibly small on our planet or galaxy, or even in the larger observable universe to which we belong, but it would be virtually 100% in the much larger universe created in the inflationary epoch just following the big bang. Under this hypothesis, the fact that we reside on such an exceedingly fortunate planet to have been a home for RNA-based life is merely a consequence of the anthropic principle — if we did not reside on such a fortuitous planet, we would not be here to discuss the issue (see Anthropic principle, Big bang and Inflation).
In any event, it must be kept in mind that continuing research into biochemical pathways for abiogenesis could invalidate such reckonings — the origin of life may yet prove to be entirely solvable via straightforward laboratory experimentation.
It must be kept in mind that the process of evolution after abiogenesis is very well attested in fossils, radiometric measurements, DNA, and numerous other lines of evidence, completely independent of how the first biological structures formed. In other words, those unknowns that remain in abiogenesis theory have no bearing on the central hypothesis of evolution, namely that all species are related in a family tree, having proliferated and adapted over many millions of years. Thus there is no substance to the claims by evolution skeptics that unknowns in the origins area are a fatal flaw of evolutionary theory.
This line of reasoning, namely that the absence of a full explanation of abiogenesis invalidates the whole of evolutionary theory, is a classic instance of the “forest fallacy” — picking a flaw or two in the bark of a single tree, and then trying to claim that the forest doesn’t exist. Indeed, to the extent that evolution skeptics continue to emphasize the abiogenesis issue as a premier flaw of evolution, they risk being discredited, even in the public eye, as new and ever-more-remarkable developments are publicly announced.
Along this line, the origin of life with respect to the larger theory of evolution is in perfect analogy to the origin of the universe at the big bang with respect to the Standard Model of modern physics (the over-arching theory that describes interactions of forces and matter from the smallest scales of time and space to the largest). It is undeniably true that scientists do not yet fully understand the big bang, and it is also true that scientists recognize that the current Standard Model is not the final word. But this is hardly reason to dismiss or minimize the enormous corpus of research in modern physics, astronomy and cosmology and the impressive levels of precision to which these theories have been tested. For additional details, see Big bang.
It is undeniably true that scientists do not yet have a complete chain of evidence, or even a fully-developed biochemical scenario for the origin of life — no knowledgeable scientist has ever claimed otherwise. Numerous scenarios have been explored, but there are still some unanswered questions. Nonetheless, thousands of scientific papers, documenting countless experimental studies, have been published on these topics, and several previous show-stopping obstacles, such as the formation of certain building blocks of RNA, have been overcome. Indeed, some of the above results, such as the recent results by Sutherland and Carell on the synthesis of RNA nucleotides, represent remarkable progress in the field. Almost certainly even more remarkable results will be published in the next few years. It would be utter folly to presume that no additional progress will be made.
Given these developments, most observers believe that it is extremely unwise to reject out of hand the possibility that research eventually will discover a complete natural process that could satisfactorily explain the origin or early development of life on Earth. Along this line, it may well be the case that all traces of the first self-reproducing systems and the earliest unicellular life may have been destroyed in the chaotic chemistry of the early Earth, so that we may never know for certain the precise path that actually was taken. But even if scientific research eventually demonstrates some plausible natural path, this would already defeat the claims of evolution skeptics that life could only have originated outside the realm of natural forces and processes. And, frankly, such a discovery could be announced at any time.
On the other hand, it may turn out that the origin of the very first self-replicating strands of RNA, to mention one unsolved aspect of this theory, is fantastically improbable, and present-day humans are descendants of this remarkably unlikely event, as suggested by Totani’s research mentioned above. But even here, nothing suggests that this origin event was the result of anything beyond the operation of known laws of physics and biology.
Either way, the line of reasoning by evolution skeptics, namely that the absence of a full explanation of abiogenesis invalidates the whole of evolutionary theory, implying that creationism and intelligent design must be considered on an equal par with evolution, is a classic instance of the “forest fallacy,” and has no validity whatsoever.
Data science and physics give Olympic swimmers the edge

Many of these top swimmers, who have traditionally relied on their “feel” in the water, now employ accelerometers strapped to their bodies, measuring movements in 3-D up to 512 times per second, thus capturing a “hydrodynamic profile” of their body and swimming technique:
Many of these techniques have been developed under the direction of University of Virginia mathematician Ken Ono, who in mathematical circles is well-known for published research on the Riemann hypothesis and the Rogers–Ramanujan identities. Among other things, Ono was a consultant for the movie “The Man Who Knew Infinity,” about legendary Indian mathematician Srinivasa Ramanujan.
Ono’s approach is to measure and analyze, in great detail, the forces involved as a swimmer performs. For example, after an in-depth analysis of Kate Douglass’s technique, Ono and Douglass concluded that her head position in her underwater breaststroke pullout was not optimal. When they compared her technique with that of Lilly King, a breaststroke specialist, they concluded that the forward bend of Douglass’s head was most likely creating extra drag. Mathematical modeling suggested that by correcting her head position, Douglass (currently the American record holder in the 200 breaststroke) could save as much as 0.15 seconds per pullout.
Ono’s interest in swimming stems from a mathematics conference in Norway about 10 years ago, where he learned that mathematicians from the Norwegian School of Sport Sciences had been working with Olympic cross-country skiers, using accelerometers to record their movement patterns. When Ono returned to Virginia, he used some accelerometers that had been designed to track sharks to analyze the swim technique of Andrew Wilson, a math student who was also a competitive swimmer. Wilson later won a gold medal as part of the U.S. medley relay swim team at the Tokyo Olympics.
Ono has now tested approximately 100 top U.S. swimmers, although he works most closely with a group at the University of Virginia. Thomas Heilman, the youngest American male swimmer since Michael Phelps to qualify for the Olympics, is a member of this group. As Ono explains,
“Swimming is the perfect application of mathematics and physics. … We were never designed to swim in water. So to swim quickly in water is a really unique and complicated combination of athletic prowess and attention to detail in terms of physics and mechanics. That’s why I like it.”
Other groups around the world are now employing similar techniques. In France, a research team led by Ricardo Peterson Silveira has analyzed the performance of French swimming sensation Leon Marchand. One test measured the drag while he was pulled through the water in a streamlined position. According to Silveira, Marchand registered the lowest passive drag of any swimmer that Silveira and his team have ever tested, suggesting that Marchand’s body is exceptionally well suited to elite swimming. After working with the team, Marchand was able to further refine his technique. These efforts clearly paid off: Marchand won four gold medals at the 2024 Paris Olympics, setting four Olympic records in the process.
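The passive-drag test described above can be approximated with the standard quadratic drag law, $F = \tfrac{1}{2}\rho C_d A v^2$. The sketch below is purely illustrative; the drag-area value used is an assumed round number, not a measured figure for any swimmer.

```python
# Illustrative passive-drag estimate using the quadratic drag law
# F = 0.5 * rho * (Cd*A) * v^2.  The drag-area (Cd*A) below is an
# assumed placeholder, not a measured value for any particular swimmer.

RHO_WATER = 1000.0  # density of water, kg/m^3

def passive_drag(cd_a: float, v: float) -> float:
    """Drag force in newtons for drag-area cd_a (m^2) at speed v (m/s)."""
    return 0.5 * RHO_WATER * cd_a * v * v

# A swimmer towed at 2.0 m/s with an assumed drag-area of 0.03 m^2:
print(f"estimated passive drag: {passive_drag(0.03, 2.0):.1f} N")   # 60.0 N

# Lowering drag-area by 10% at the same speed cuts the force by 10%:
print(f"with 10% less drag-area: {passive_drag(0.027, 2.0):.1f} N")  # 54.0 N
```

Because drag grows with the square of speed, even small reductions in effective drag-area translate into measurable time savings at race pace, which is why streamline testing of the kind Silveira performed is so valuable.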
For additional details, see this New York Times article, from which some of the above was gleaned, as well as this Atlantic article, this Nature article and this Scientific American article. The Scientific American article, which is co-authored by Kate Douglass, Ken Ono and others, includes some detailed quantitative analysis.
Computation, AI and the future of mathematics

The present author recalls, while in graduate school at Stanford nearly 50 years ago, hearing two senior mathematicians (including the present author’s advisor) briefly discuss another researcher’s use of computation in a mathematical investigation. Their comments and body language clearly indicated a profound disdain for such efforts. Indeed, the prevailing mindset at the time was “real mathematicians don’t compute.”
Some researchers at the time attempted to draw more attention to the potential of computer tools in mathematics. In the early 1990s, Keith Devlin edited a column in the Notices of the American Mathematical Society on applications of computers in the field. Journals such as Mathematics of Computation and the then-new Experimental Mathematics received a growing number of submissions. But in general, computation and computer tools played a relatively muted role in the field.
How times have changed! Nowadays most mathematicians are comfortable using mathematical software such as Mathematica, Maple or Sage. Many have either personally written computer code in some language as part of a mathematical investigation, or, at the least, have worked closely with a colleague who writes such code. Most present-day mathematicians are familiar with using LaTeX to typeset papers and produce presentation viewgraphs, and many are also familiar with collaboration tools such as blogs, FaceTime, Skype, Zoom, Slack and Microsoft Teams.
One important development is that a growing number of mathematicians have experience using formal mathematical software to encode and rigorously confirm all steps of a formal mathematical proof. Some more widely used systems include Lean, Coq, HOL and Isabelle.
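For a flavor of what such formal proofs look like, here is a minimal sketch in Lean 4 syntax (real formalization projects build on large libraries such as mathlib):

```lean
-- A tiny formal proof in Lean 4: a concrete arithmetic fact,
-- checked definitionally by the kernel.
example : 2 + 2 = 4 := rfl

-- A general statement, discharged via the standard library lemma.
example (m n : Nat) : m + n = n + m := Nat.add_comm m n
```

Every step, down to the axioms, is machine-checked: if the file compiles, the proof is correct.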
Indeed, in sharp contrast to 50 years ago, perhaps today one would say “real mathematicians must compute,” particularly if they wish to pursue state-of-the-art research in collaboration with other talented mathematicians. Here we mention just a few of the interesting developments in this arena.
The Kepler conjecture is the assertion that the simple scheme of stacking oranges typically seen in a supermarket has the highest possible average density, namely $\pi/(3 \sqrt{2}) = 0.740480489\ldots$, among all possible arrangements, regular or irregular. It is named after the 17th-century astronomer Johannes Kepler, who first posed the problem in 1611 (and who is better known for discovering that the planets orbit the sun in elliptical paths).
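The claimed optimal density is easy to evaluate numerically; the sketch below computes $\pi/(3\sqrt{2})$ and, for comparison, the density $\pi/6$ of the much looser simple cubic packing.

```python
import math

# Density of the face-centered cubic ("supermarket") sphere packing:
fcc_density = math.pi / (3 * math.sqrt(2))

# For comparison, the density of simple cubic packing:
cubic_density = math.pi / 6

print(f"FCC packing density:  {fcc_density:.10f}")    # 0.7404804897
print(f"simple cubic density: {cubic_density:.10f}")  # 0.5235987756
```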
In the early 1990s, Thomas Hales, following an approach first suggested by Laszlo Fejes Toth in 1953, determined that the maximum density of all possible arrangements could be obtained by minimizing a certain function with 150 variables. In 1992, assisted by his graduate student Samuel Ferguson (son of famed mathematician-sculptor Helaman Ferguson), Hales embarked on a multi-year effort to find a lower bound for the function for each of roughly 5,000 different configurations of spheres, using a linear programming approach. In 1998, Hales and Ferguson announced that their project was complete, documented by 250 pages of notes and three gigabytes of computer code, data and results.
Hales’ computer-assisted proof generated some controversy — the journal Annals of Mathematics originally demurred, although it ultimately accepted his paper. In the wake of this controversy, Hales decided to embark on a collaborative effort to certify the proof via automated techniques. This project, named Flyspeck, employed the proof-checking software HOL Light and Isabelle. The effort commenced in 2003, but was not completed until 2014.
Another good example of computer-based tools in modern mathematics comes from recent work on the twin prime conjecture. The twin prime conjecture is the assertion that there are infinitely many pairs of primes differing by just two, such as $(11, 13), (101, 103)$, and $(4799, 4801)$. Although many eminent mathematicians have investigated the conjecture, few solid results have been obtained, and the conjecture remains open.
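Twin prime pairs are easy to enumerate with a Sieve of Eratosthenes; the sketch below lists all pairs below a bound and confirms the examples above.

```python
def twin_primes(limit: int):
    """Return all twin prime pairs (p, p+2) with p+2 < limit,
    found with a Sieve of Eratosthenes."""
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit, i):
                is_prime[j] = False
    return [(p, p + 2) for p in range(2, limit - 2)
            if is_prime[p] and is_prime[p + 2]]

pairs = twin_primes(5000)
print(pairs[:4])               # [(3, 5), (5, 7), (11, 13), (17, 19)]
print((4799, 4801) in pairs)   # True
```

Such pairs appear steadily at first, but the conjecture that they never run out has resisted proof for well over a century.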
In 2013, the previously unheralded mathematician Yitang Zhang posted a paper proving that there are infinitely many pairs of primes with gaps between two and 70,000,000. While this certainly did not prove the full conjecture (the 70,000,000 figure would have to be reduced to just two), the paper and its underlying techniques constituted what one reviewer termed “a landmark theorem in the distribution of prime numbers.”
Zhang based his work on a 2005 paper by Dan Goldston, Janos Pintz and Cem Yildirim. Their paper employed a sieving method analogous to the Sieve of Eratosthenes to filter out pairs of primes that are closer together than average. They then proved that for any fraction, no matter how small, there exists some pair of primes closer together than that fraction of the average gap.
Immediately after Zhang’s result was made public, researchers wondered whether the figure 70,000,000 could be lowered. Soon Terence Tao and several other prominent mathematicians instituted an online collaboration known as Polymath to further reduce the gap, incorporating powerful new techniques developed independently by James Maynard. (The Polymath system had previously been developed by Timothy Gowers and others, and has an impressive list of results.) When the dust had settled, the collaboration had lowered the limit to just 246, using results developed by Zhang, Maynard and others. The resulting paper was listed as authored by “D.H.J. Polymath.”
It is no secret that the methods of data mining, machine learning and artificial intelligence have made dramatic advances in recent years.
Advanced machine learning, data mining and AI software is also being applied in mathematical research. One example is known as the Ramanujan Machine. This is a systematic computational approach, developed by a consortium of researchers mostly based in Israel, that attempts to discover new mathematical formulas for fundamental constants and to reveal their underlying structure. Their techniques have rediscovered numerous well-known formulas and found several new ones, notably some continued fraction representations of $\pi, e$, Catalan’s constant, and values of the Riemann zeta function at integer arguments.
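The Ramanujan Machine searches for exactly this kind of object: continued fractions whose numerical values match fundamental constants to high precision. As a purely illustrative stand-in (this is the classical simple continued fraction for $e$, not one of the project’s new discoveries), the sketch below evaluates $e = [2; 1, 2, 1, 1, 4, 1, 1, 6, \ldots]$ and compares it with the known value.

```python
import math

def e_continued_fraction(n_terms: int) -> float:
    """Evaluate the classical simple continued fraction
    e = [2; 1, 2, 1, 1, 4, 1, 1, 6, ...] using n_terms partial quotients."""
    # Partial quotients after the leading 2 follow the pattern 1, 2k, 1.
    terms = []
    k = 1
    while len(terms) < n_terms:
        terms.extend([1, 2 * k, 1])
        k += 1
    terms = terms[:n_terms]
    # Evaluate from the bottom up.
    x = float(terms[-1])
    for a in reversed(terms[:-1]):
        x = a + 1.0 / x
    return 2.0 + 1.0 / x

approx = e_continued_fraction(30)
print(f"{approx:.12f}")      # 2.718281828459
print(abs(approx - math.e))  # agreement to machine precision
```

The Ramanujan Machine automates the reverse direction: it generates candidate continued fractions and matches their values against high-precision databases of constants.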
A related but distinct approach has been taken by a separate group of researchers, including the famed mathematician brothers Jonathan Borwein and Peter Borwein (with some participation by the present author). By employing, in part, very high-precision computations, these researchers discovered a large number of new infinite series formulas for various mathematical constants (including $\pi$ and $\log(2)$, among others), some of which have the intriguing property that they permit one to directly compute binary or hexadecimal digits beginning at an arbitrary starting position, without needing to compute any of the digits that come before. Two example references are here and here.
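The best known of these is the Bailey–Borwein–Plouffe (BBP) formula, $\pi = \sum_{k \ge 0} 16^{-k}\left(\frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6}\right)$, whose structure permits hexadecimal digits of $\pi$ to be extracted from an arbitrary starting position via modular exponentiation. A compact sketch of the digit-extraction scheme:

```python
def bbp_series(j: int, d: int) -> float:
    """Fractional part of 16^d * sum_{k>=0} 1/(16^k (8k+j)),
    using modular exponentiation for the terms with k <= d."""
    s = 0.0
    for k in range(d + 1):
        s = (s + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
    # A few tail terms with k > d, where 16^(d-k) is a small fraction.
    for k in range(d + 1, d + 40):
        s += 16.0 ** (d - k) / (8 * k + j)
    return s % 1.0

def pi_hex_digits(d: int, n: int) -> str:
    """n hexadecimal digits of pi starting at position d+1 after the point."""
    x = (4 * bbp_series(1, d) - 2 * bbp_series(4, d)
         - bbp_series(5, d) - bbp_series(6, d)) % 1.0
    digits = ""
    for _ in range(n):
        digits += "0123456789ABCDEF"[int(x * 16)]
        x = x * 16 % 1.0
    return digits

print(pi_hex_digits(0, 6))   # 243F6A  (pi = 3.243F6A88... in hex)
print(pi_hex_digits(10, 4))  # A308
```

The remarkable feature is that computing digits at position $d$ costs only $O(d \log d)$ arithmetic and essentially no memory, with no need for the preceding digits.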
Along this line, one very useful online reference for computational mathematics is N.J.A. Sloane’s Online Encyclopedia of Integer Sequences, which at present contains over 375,000 entries, many of which include numerous references of where a specific sequence appears in mathematical literature. With the rise of serious data-mining technology, we may see remarkable new mathematical results discovered by data mining this invaluable dataset.
UCLA Fields Medalist mathematician Terence Tao recently sat down with Christoph Drosser, a writer for Spektrum der Wissenschaft, the German edition of Scientific American. Drosser and Tao discussed at length the current status of computer tools in mathematical research, including proof-checking software such as Lean and also AI-based tools, and what is likely to be the future course of the field with the ascendance of these tools. Here are some excerpts from this remarkable interview (with some minor editing by the present author):
Drosser: With the advent of automated proof checkers, how is [the trust between mathematicians] changing?
Tao: Now you can really collaborate with hundreds of people that you’ve never met before. And you don’t need to trust them, because they upload code and the Lean compiler verifies it. You can do much larger-scale mathematics than we do normally. When I formalized our most recent results with what is called the Polynomial Freiman-Ruzsa (PFR) conjecture, [I was working with] more than 20 people. We had broken up the proof in lots of little steps, and each person contributed a proof to one of these little steps. And I didn’t need to check line by line that the contributions were correct. I just needed to sort of manage the whole thing and make sure everything was going in the right direction. It was a different way of doing mathematics, a more modern way.
Drosser: German mathematician and Fields Medalist Peter Scholze collaborated in a Lean project — even though he told me he doesn’t know much about computers.
Tao: With these formalization projects, not everyone needs to be a programmer. Some people can just focus on the mathematical direction; you’re just splitting up a big mathematical task into lots of smaller pieces. And then there are people who specialize in turning those smaller pieces into formal proofs. We don’t need everybody to be a programmer; we just need some people to be programmers. It’s a division of labor.
[Continuing after some skip:]
Drosser: I heard about machine-assisted proofs 20 years ago, when it was a very theoretical field. Everybody thought you have to start from square one — formalize the axioms and then do basic geometry or algebra — and to get to higher mathematics was beyond people’s imagination. What has changed that made formal mathematics practical?
Tao: One thing that changed is the development of standard math libraries. Lean, in particular, has this massive project called mathlib. All the basic theorems of undergraduate mathematics, such as calculus and topology, and so forth, have one by one been put in this library. So people have already put in the work to get from the axioms to a reasonably high level. And the dream is to actually get [the libraries] to a graduate level of education. Then it will be much easier to formalize new fields [of mathematics]. There are also better ways to search because if you want to prove something, you have to be able to find the things that it already has confirmed to be true. So also the development of really smart search engines has been a major new development.
Drosser: So it’s not a question of computing power?
Tao: No, once we had formalized the whole PFR project, it only took like half an hour to compile it to verify. That’s not the bottleneck — it’s getting the humans to use it, the usability, the user friendliness. There’s now a large community of thousands of people, and there’s a very active online forum to discuss how to make the language better.
Drosser: Is Lean the state of the art, or are there competing systems?
Tao: Lean is probably the most active community. For single-author projects, maybe there are some other languages that are slightly better, but Lean is easier to pick up in general. And it has a very nice library and a nice community. It may eventually be replaced by an alternative, but right now it is the dominant formal language.
[Continuing after some skip:]
Drosser: So far, the idea for the proof still has to come from the human mathematician, doesn’t it?
Tao: Yes, the fastest way to formalize is to first find the human proof. Humans come up with the ideas, the first draft of the proof. Then you convert it to a formal proof. In the future, maybe things will proceed differently. There could be collaborative projects where we don’t know how to prove the whole thing. But people have ideas on how to prove little pieces, and they formalize that and try to put them together. In the future, I could imagine a big theorem being proven by a combination of 20 people and a bunch of AIs each proving little things. And over time, they will get connected, and you can create some wonderful thing. That will be great. It’ll be many years before that’s even possible. The technology is not there yet, partly because formalization is so painful right now.
Drosser: I have talked to people who try to use large language models or similar machine-learning technologies to create new proofs. Tony Wu and Christian Szegedy, who recently co-founded the company xAI, with Elon Musk and others, told me that in two to three years mathematics will be “solved” in the same sense that chess is solved — that machines will be better than any human at finding proofs.
Tao: I think in three years AI will become useful for mathematicians. It will be a great co-pilot. You’re trying to prove a theorem, and there’s one step that you think is true, but you can’t quite see how it’s true. And you can say, “AI, can you do this stuff for me?” And it may say, “I think I can prove this.” I don’t think mathematics will become solved. If there was another major breakthrough in AI, it’s possible, but I would say that in three years you will see notable progress, and it will become more and more manageable to actually use AI. And even if AI can do the type of mathematics we do now, it means that we will just move to a higher type of mathematics. So right now, for example, we prove things one at a time. It’s like individual craftsmen making a wooden doll or something. You take one doll and you very carefully paint everything, and so forth, and then you take another one. The way we do mathematics hasn’t changed that much. But in every other type of discipline, we have mass production. And so with AI, we can start proving hundreds of theorems or thousands of theorems at a time. And human mathematicians will direct the AIs to do various things. So I think the way we do mathematics will change, but their time frame is maybe a little bit aggressive.
Drosser: I interviewed Peter Scholze when he won the Fields Medal in 2018. I asked him, How many people understand what you’re doing? And he said there were about 10 people.
Tao: With formalization projects, what we’ve noticed is that you can collaborate with people who don’t understand the entire mathematics of the entire project, but they understand one tiny little piece. It’s like any modern device. No single person can build a computer on their own, mine all the metals and refine them, and then create the hardware and the software. We have all these specialists, and we have a big logistics supply chain, and eventually we can create a smartphone or whatever. Right now, in a mathematical collaboration, everyone has to know pretty much all the mathematics, and that is a stumbling block, as [Scholze] mentioned. But with these formalizations, it is possible to compartmentalize and contribute to a project only knowing a piece of it. I think also we should start formalizing textbooks. If a textbook is formalized, you can create these very interactive textbooks, where you could describe the proof of a result in a very high-level sense, assuming lots of knowledge. But if there are steps that you don’t understand, you can expand them and go into details — all the way down to the axioms if you want to. No one does this right now for textbooks because it’s too much work. But if you’re already formalizing it, the computer can create these interactive textbooks for you. It will make it easier for a mathematician in one field to start contributing to another because you can precisely specify subtasks of a big task that don’t require understanding everything.
[Continuing after some skip:]
Drosser: By breaking down a problem and exploring it, you learn a lot of new things on the way, too. Fermat’s Last Theorem, for example, was a simple conjecture about natural numbers, but the math that was developed to prove it isn’t necessarily about natural numbers anymore. So tackling a proof is much more than just proving this one instance.
Tao: Let’s say an AI supplies an incomprehensible, ugly proof. Then you can work with it, and you can analyze it. Suppose this proof uses 10 hypotheses to get one conclusion — if I delete one hypothesis, does the proof still work? That’s a science that doesn’t really exist yet because we don’t have so many AI-generated proofs, but I think there will be a new type of mathematician that will take AI-generated mathematics and make it more comprehensible. Like, we have theoretical and experimental science. There are lots of things that we discover empirically, but then we do more experiments, and we discover laws of nature. We don’t do that right now in mathematics. But I think there’ll be an industry of people trying to extract insight from AI proofs that initially don’t have any insight.
Drosser: So instead of this being the end of mathematics, would it be a bright future for mathematics?
Tao: I think there’ll be different ways of doing mathematics that just don’t exist right now. I can see project manager mathematicians who can organize very complicated projects — they don’t understand all the mathematics, but they can break things up into smaller pieces and delegate them to other people, and they have good people skills. Then there are specialists who work in subfields. There are people who are good at trying to train AI on specific types of mathematics, and then there are people who can convert the AI proofs into something human-readable. It will become much more like the way almost any other modern industry works. Like, in journalism, not everyone has the same set of skills. You have editors, you have journalists, and you have businesspeople, and so forth — we’ll have similar things in mathematics eventually.
The full text of the interview is available here.
Pi Day 2024 crossword puzzle

As is the custom on this site, we present below a crossword puzzle constructed on a theme of mathematics, computing and the digits of pi. Enjoy!
This puzzle has been constructed in conformance with New York Times crossword puzzle standards. In terms of overall difficulty, my mathematician daughter and my genealogist spouse agreed that it would likely rate as a Tuesday (the New York Times grades its puzzles, with Monday puzzles being the easiest and Saturday puzzles being the most challenging).
As was the custom with past years’ puzzles here, a pi-themed prize will be awarded to the first two persons who submit correctly completed puzzles (US only; past recipients are eligible). This year’s choices are: a Digits of pi coffee mug, a Pi Day T-shirt, or a Pi Bracelet.
As of this date (2 Mar 2024), correct solutions have been submitted by Neil Calkin, Michael Coons, Gerard Joseph, Ross Blocher and Morgan Marshall. If you wish to be credited here as a solver of the puzzle, please send the author an email (see DHB site for email).
If you wish to print the puzzle, the PNG file is available HERE.
Using network theory to analyze Bach’s music

Johann Sebastian Bach (1685-1750) regularly garners the top spot in listings of the greatest Western classical composers, typically followed by Mozart and Beethoven. Certainly in terms of sheer volume of compositions, Bach reigns supreme. The Bach-Werke-Verzeichnis (BWV) catalogue lists 1128 compositions, from short solo pieces to the magnificent Mass in B Minor (BWV 232) and Christmas Oratorio (BWV 248), far more than any other classical composer. Further, Bach’s clever, syncopated style led the way to twentieth-century musical innovations, notably jazz.
There does seem to be a credible connection between the sort of mental gymnastics done by a mathematician and by a musician. To begin with, there are well-known mathematical relationships between the pitch of various notes on the musical keyboard. But beyond mere analysis of pitches, it is clear that the arena of musical syntax and structure has a very deep connection to the sorts of syntax, structure and other regularities that are studied in mathematical research. Bach and Mozart in particular are well-known for music that is both “mathematically” beautiful and structurally impeccable.
Just as some of the best musicians and composers are “mathematical,” so too many of the best mathematicians are quite musical. It is quite common at evening receptions of large mathematical conferences to be serenaded by concert-quality performers who, in their day jobs, are accomplished mathematicians of some renown. Perhaps the best-known real-life example of this crossover was Albert Einstein, who, though a physicist rather than a mathematician, was an accomplished pianist and violinist. His favorite composers? You guessed it: Bach and Mozart. He once said, “If … I were not a physicist, I would probably be a musician. I often think in music. I live my daydreams in music. I see my life in terms of music.”
For additional details on Bach’s “mathematical” style, some interesting speculations on golden ratio patterns in Bach’s music, as well as a listing of number of particularly listenable works, with audio links, see this previously published Math Scholar article: Bach as mathematician.
In February 2024, a team of researchers from the University of Pennsylvania, Yale and Princeton in the U.S. published a study describing their efforts to analyze Bach’s music using network theory and information theory. As the authors explain,
Bach is a natural case study given his prolific career, the wide appreciation his compositions have garnered, and the influence he had over contemporaneous and subsequent composers. His diverse compositions (from chorales to fugues) for a wide range of musicians (from singers to orchestra members) often share a fundamental underlying structure of repeated—and almost mathematical—musical themes and motifs. These features of Bach’s compositions make them particularly interesting to study using a mathematical framework.
The authors included a wide range of Bach compositions in their study: selected preludes and fugues from the Well-Tempered Clavier, the two- and three-part inventions, a selection of Bach’s cantatas, the English suites, the French suites, some chorales, the Brandenburg concertos, and various toccatas and concertos.
Overall, their results have confirmed that Bach’s works have a high information content, and further that different subsets of works have distinct characteristics.
Here is an outline of their methodology: After collecting digitized versions of the above musical selections, they represented each note as a node in a network, with notes from different octaves as distinct nodes. A transition from note A to note B is represented as a directed edge from A to B. Chords are represented with directed edges from every note in the first chord to every note in the second chord. A graphical representation of this process is shown below. To the right of this illustration is the result of this process for four specific Bach compositions: (a) the chorale “Wir glauben all an einen Gott” (BWV 437); (b) Fugue 11 from the Well-Tempered Clavier (WTC), Book I (BWV 856); (c) Prelude 9 from the WTC, Book II (BWV 878); and (d) Toccata in G major for harpsichord (BWV 916).
The information exhibited in these graphs was quantified as the Shannon entropy of a random walk on the network. In particular, the contribution of the $i$-th node to the entropy is:
$$S_i \; = \; - \sum_{j=1}^n P_{i,j} \log P_{i,j},$$
where $P_{i,j}$ is the transition probability of going from node $i$ to node $j$. Then the entropy of the entire network is a weighted sum of the $S_i$:
$$S \; = \; \sum_{i=1}^n w_i S_i \; = \; - \sum_{i=1}^n w_i \sum_{j=1}^n P_{i,j} \log P_{i,j},$$
where the weights $w_i$ are the stationary distribution probabilities: $w_i$ is the long-run probability that a random walker is found at node $i$.
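This computation is straightforward to sketch. The following Python fragment (a minimal illustration, not the authors’ code; the toy three-note network is invented, and the logarithm is taken base 2 so the result is in bits, a choice that only rescales the entropy) row-normalizes a matrix of transition counts into probabilities $P_{i,j}$, computes each node’s entropy $S_i$, estimates the stationary weights $w_i$ by power iteration, and returns the weighted sum $S$:

```python
import numpy as np

def network_entropy(transition_counts):
    """Shannon entropy (bits) of a random walk on a note-transition network.

    transition_counts: square matrix where entry [i, j] counts transitions
    from note i to note j.
    """
    counts = np.asarray(transition_counts, dtype=float)
    # Row-normalize the counts into transition probabilities P[i, j].
    row_sums = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, row_sums, out=np.zeros_like(counts),
                  where=row_sums > 0)

    # Per-node entropy S_i = -sum_j P[i, j] * log2 P[i, j].
    logP = np.zeros_like(P)
    mask = P > 0
    logP[mask] = np.log2(P[mask])
    S_i = -(P * logP).sum(axis=1)

    # Stationary weights w_i: left eigenvector of P, via power iteration.
    w = np.full(len(P), 1.0 / len(P))
    for _ in range(10_000):
        w_next = w @ P
        if np.allclose(w_next, w, atol=1e-12):
            break
        w = w_next

    # Weighted network entropy S = sum_i w_i * S_i.
    return float((w * S_i).sum())

# Toy three-note network (invented for illustration): each note usually
# steps to the next note, but occasionally skips.
counts = [[0, 3, 1],
          [1, 0, 3],
          [3, 1, 0]]
print(network_entropy(counts))
```

A fully deterministic piece, in which every note has exactly one successor, gives zero entropy; adding alternative transitions raises it, matching the intuition that unpredictability corresponds to higher information content.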
Some additional information and complete technical details are in the published paper.
The researchers found that, indeed, compared with some other musical compositions they had studied, the Bach pieces tended to have higher information content. What was even more interesting was that the researchers found significant differences between the different classes of Bach compositions:
The chorales, typically meant to be sung by groups in ecclesiastical settings, are shorter and simpler diatonic pieces that display a markedly lower entropy than the rest of the compositions studied. By contrast, the toccatas, characterized by more complex chromatic sections that span a wider melodic range, have a much higher entropy. It is possible that the chorales’ functions of meditation, adoration, and supplication are best supported by predictability and hence low entropy, whereas the entertainment functions of the toccatas and preludes are best supported by unpredictability and hence high entropy.
Full details of the results are given in the published paper.
The researchers have clearly identified a very interesting and effective technique for analyzing, classifying and comparing musical compositions. Numerous questions remain for future research, some of which were suggested by the authors themselves in their concluding section.
Clearly an exciting future lies ahead in the realm of fusing mathematical network and information theory with the fine arts.
Overcoming experimenter bias in scientific research

Reproducibility has emerged as a major issue in numerous fields of scientific research, ranging from psychology, sociology, economics and finance to biomedicine, scientific computing and physics. Many of these difficulties arise from experimenter bias (also known as “selection bias”): consciously or unconsciously excluding, ignoring or adjusting certain data that do not seem to be in agreement with one’s preconceived hypothesis; or devising statistical tests post-hoc, namely AFTER data has already been collected and partially analyzed. By some estimates, at least 75% of published scientific papers in some fields contain flaws of this sort that could potentially compromise the conclusions.
Many fields are now undertaking aggressive steps to address such problems. As just one example of many that could be listed, the American pharmaceutical and medical device firm Johnson & Johnson has agreed to make detailed clinical data available to outside researchers, covering past and planned trials, in an attempt to avoid the temptation to publish only results of tests with favorable outcomes.
For further background details, see John Ioannidis’ 2014 article, Sonia Van Gilder Cooke’s 2016 article, Robert Matthews’ 2017 article, Clare Wilson’s 2022 article and Benjamin Mueller’s 2024 article, as well as this previous Math Scholar article, which summarizes several case studies.
Physics, thought by some to be the epitome of objective physical science, has not been immune to experimenter bias. One classic example is the history of measurements of the speed of light. The chart on the right shows numerous published measurements, dating from the late 1800s through 1983, when experimenters finally agreed to the international standard value 299,792,458 meters/second. (Note: Since 1983, the meter has been defined as the distance traveled by light in vacuum in 1/299,792,458 seconds.)
As can be seen in the chart, during the period 1870-1900, the prevailing value of the speed of light was roughly 299,900,000 m/s. But beginning with a measurement made in 1910 through roughly 1950, the prevailing value centered on 299,775,000 m/s. Finally, beginning in the 1960s, a new series of very carefully controlled experiments settled on the current value, 299,792,458 m/s. It is important to note that the currently accepted value was several standard deviations below the prevailing 1870-1900 value, and was also several standard deviations above the prevailing 1910-1950 value.
Why? Evidently investigators, faced with divergent values, tended to reject or ignore values that did not comport with the then-prevailing consensus figure. Or, as Luke Caldwell explains, “[R]esearchers were probably unconsciously steering their data to fit better with previous values, even though those turned out to be inaccurate. It wasn’t until experimenters had a much better grasp on the true size of their errors that the various measurements converged on what we now think is the correct value.”
In a February 2024 Scientific American article, Luke Caldwell described his team’s work to measure the electron’s electric dipole moment (EDM), in an attempt to better understand why we are made of matter and not antimatter. He described in some detail the special precautions that his team took to avoid experimenter error, and, especially, to avoid consciously or unconsciously filtering the data to reach some preconceived conclusion. It is worth quoting this passage at length:
Another source of systematic error is experimenter bias. All scientists are human beings and, despite our best efforts, can be biased in our thoughts and decisions. This fallibility can potentially affect the results of experiments. In the past it has caused researchers to subconsciously try to match the results of previous experiments. … [Caldwell mentions the speed of light controversy above.]
To avoid this issue, many modern precision-measurement experiments take data “blinded.” In our case, after each run of the experiment, we programmed our computer to add a randomly generated number—the “blind”—to our measurements and store it in an encrypted file. Only after we had gathered all our data, finished our statistical analysis and even mostly written the paper did we have the computer subtract the blind to reveal our true result.
The day of the unveiling was a nerve-racking one. After years of hard work, our team gathered to find out the final result together. I had written a computer program to generate a bingo-style card with 64 plausible numbers, only one of which was the true result. The other numbers varied from “consistent with zero” to “a very significant discovery.” Slowly, all the fake answers disappeared from the screen one by one. It’s a bit weird to have years of your professional life condensed into a single number, and I questioned the wisdom of amping up the stress with the bingo card. But I think it became apparent to all of us how important the blinding technique was; it was hard to know whether to be relieved or disappointed by the vanishing of a particularly large result that would have hinted at new, undiscovered particles and fields but also contradicted the results of previous experiments.
Finally, a single value remained on the screen. Our answer was consistent with zero within our calculated uncertainty. The result was also consistent with previous measurements, building confidence in them as a collective, and it improved on the best precision by a factor of two. So far, it seems, we have no evidence that the electron has an EDM.
Caldwell’s data blinding technique is quite remarkable. If some variation of this methodology were adopted in other fields of research, at least some of the problems with experimenter bias could be avoided.
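The mechanics of such a blind are simple to sketch. The Python fragment below is a hypothetical illustration of the idea, not the Caldwell team’s actual code: the file name and offset range are invented, and a real experiment would encrypt the stored blind rather than write it in plain text. A secret random offset is added to the measurements, and the true result is recovered only when explicitly unblinded:

```python
import random

def blind(measurements, seed_file="blind.txt"):
    """Add a secret random offset (the "blind") to every measurement and
    store the offset so it can be subtracted only after the analysis is
    frozen. (In practice the stored blind would be encrypted.)"""
    offset = random.uniform(-1.0, 1.0)
    with open(seed_file, "w") as f:
        f.write(repr(offset))
    return [m + offset for m in measurements]

def unblind(blinded_result, seed_file="blind.txt"):
    """Reveal the true result by subtracting the stored blind."""
    with open(seed_file) as f:
        offset = float(f.read())
    return blinded_result - offset
```

Analysts tune their cuts and fit their models on the blinded values only; `unblind` is called exactly once, after the statistical analysis is complete, so no decision along the way can be steered toward a preferred answer.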
The field of finance is deeply afflicted by experimenter bias, because of backtest overfitting, namely the usage of historical market data to develop an investment model, strategy or fund, especially where many strategy variations are tried on the same fixed dataset. Note that this is clearly an instance of the post-hoc probability fallacy, which in turn is a form of experimenter bias (more commonly termed “selection bias” in finance). Backtest overfitting has long plagued the field of finance and is now thought to be the leading reason why investments that look great when designed often disappoint when actually fielded to investors. Models, strategies and funds suffering from this type of statistical overfitting typically target the random patterns present in the limited in-sample test-set on which they are based, and thus often perform erratically when presented with new, truly out-of-sample data.
As an illustration, the authors of this AMS Notices article show that if only five years of daily stock market data are available for backtesting, then no more than 45 variations of a strategy should be tried on this data, or the resulting “optimal” strategy will be overfit, in the specific sense that the strategy’s Sharpe ratio (a standard measure of risk-adjusted return) is likely to be 1.0 or greater just by chance, even though the true Sharpe ratio may be zero or negative.
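The effect is easy to reproduce in simulation. The following sketch (illustrative only; the daily volatility, seed and strategy counts are arbitrary choices, not taken from the article) generates many “strategies” whose daily returns are pure noise, so every true Sharpe ratio is zero, and reports the best annualized Sharpe ratio found among them:

```python
import numpy as np

def max_sharpe_of_random_strategies(n_strategies, n_days=1260, seed=0):
    """Best annualized Sharpe ratio among n_strategies pure-noise strategies.

    n_days = 1260 is roughly five years of trading days; every strategy has
    zero mean daily return, so its true Sharpe ratio is zero.
    """
    rng = np.random.default_rng(seed)
    returns = rng.normal(0.0, 0.01, size=(n_strategies, n_days))
    daily_sharpe = returns.mean(axis=1) / returns.std(axis=1, ddof=1)
    return float((daily_sharpe * np.sqrt(252)).max())  # 252 trading days/yr

print(max_sharpe_of_random_strategies(45))    # best of 45 noise strategies
print(max_sharpe_of_random_strategies(1000))  # more trials, more inflation
```

With a few dozen variations the best purely random strategy typically shows an annualized Sharpe ratio approaching 1.0, and the inflation only grows with the number of variations tried, which is exactly the overfitting threshold described above.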
Some commonly used techniques to compensate for backtest overfitting, if not used correctly, are also suspect. One example is the “hold-out method” — developing a model or investment fund based on a backtest of a certain date range, then checking the result with a different date range. However, those using the hold-out method may iteratively tune the parameters for their model until the score on the hold-out data, say measured by a Sharpe ratio, is impressively high. But these repeated tuning tests, using the same fixed hold-out dataset, are themselves tantamount to backtest overfitting.
One dramatic visual example of backtest overfitting is shown in the graph at the right, which displays the mean excess return (compared to benchmarks) of newly minted exchange-traded index-linked funds, both in the months of design prior to submission to the U.S. Securities and Exchange Commission for approval, and in the months after the fund was actually fielded. The “knee” in the graph at 0 shows unmistakably the difference between statistically overfit designs and actual field experience.
For additional details, see How backtest overfitting in finance leads to false discoveries, which is condensed from this freely available SSRN article.
The p-test, which was introduced by the British statistician Ronald Fisher in the 1920s, assesses whether the result of an experiment is more extreme than what one would expect by chance under the null hypothesis. However, the p-test, especially when used alone, has significant drawbacks, as Fisher himself warned. To begin with, the typically used level of p = 0.05 is not a particularly compelling result. In any event, it is highly questionable to reject a result whose p-value is 0.051 while accepting as significant one whose p-value is 0.049.
The prevalence of the classic p = 0.05 value has led to the egregious practice that Uri Simonsohn of the University of Pennsylvania has termed p-hacking: proposing numerous varied hypotheses until a researcher finds one that meets the 0.05 level. Note that this is a classic multiple testing fallacy of statistics, which in turn is a form of experimenter bias: perform enough tests and one is bound to pass any specific level of statistical significance. Such suspicions are justified given the results of a study by Jelte Wilcherts of the University of Amsterdam, who found that researchers whose results were close to the p = 0.05 level of significance were less willing to share their original data than were researchers whose results had stronger significance levels (see also this summary from Psychology Today).
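The multiple-testing effect is easy to demonstrate numerically. The sketch below (a hypothetical simulation, not drawn from the cited studies) runs many two-sample comparisons in which the null hypothesis is true by construction, and counts how many of them nonetheless reach nominal significance:

```python
import numpy as np

def count_false_positives(n_tests=100, n=50, seed=0):
    """Run n_tests two-sample comparisons in which BOTH samples come from
    the same normal distribution (so every null hypothesis is true), and
    count how many reach |z| > 1.96, i.e. a two-sided p-value below 0.05."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_tests, n))
    b = rng.normal(size=(n_tests, n))
    diff = a.mean(axis=1) - b.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / n + b.var(axis=1, ddof=1) / n)
    return int((np.abs(diff / se) > 1.96).sum())

print(count_false_positives(100))  # roughly 5 "significant" results expected
```

About 5% of the tests pass p < 0.05 even though every null hypothesis is true by construction; a researcher who tries 100 hypotheses on a dataset and reports only the “significant” ones is all but guaranteed an impressive-looking, and spurious, result.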
Along this line, it is clear that a sole focus on p-values can muddle scientific thinking, confusing significance with the size of the effect. For example, a 2013 study of more than 19,000 married persons found that those who had met their spouses online were less likely to divorce (p < 0.002) and more likely to report higher marital satisfaction (p < 0.001) than those who met in other ways. Impressive? Yes, but the divorce rate for online couples was 5.96%, only slightly down from 7.67% for the larger population, and the marital satisfaction score for these couples was 5.64 out of 7, only slightly better than 5.48 for the larger population (see also this Nature article).
Perhaps a more important consideration is that p-values, even if reckoned properly, can easily mislead. Consider the following example, which is taken from a paper by David Colquhoun: Imagine that we wish to screen persons for potential dementia. Let’s assume that 1% of the population has dementia, and that we have a test with 95% specificity (95% of persons without the condition will be correctly cleared, corresponding to p = 0.05) and 80% sensitivity (80% of persons with the condition will be correctly detected). Now if we screen 10,000 persons, 100 presumably will have the condition and 9,900 will not. Of the 100 who have the condition, 80% or 80 will be detected and 20 will be missed. Of the 9,900 who do not, 95% or 9,405 will be cleared, but 5% or 495 will incorrectly test positive. So out of the original population of 10,000, a total of 575 will test positive, but 495 of these 575, or 86%, are false positives.
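This arithmetic is simple enough to verify directly; the short sketch below (the function name and its default population are invented for illustration) expresses the reckoning in general form:

```python
def false_positive_fraction(prevalence, sensitivity, specificity,
                            population=10_000):
    """Of all positive test results, what fraction are false positives?"""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity            # cases correctly detected
    false_positives = healthy * (1 - specificity)  # healthy persons flagged
    return false_positives / (true_positives + false_positives)

# Colquhoun's numbers: 1% prevalence, 80% sensitivity, 95% specificity.
print(false_positive_fraction(0.01, 0.80, 0.95))  # 495/575, about 0.86
```

Note how strongly the answer depends on prevalence: with the same test applied to a population in which half the subjects have the condition, the false positive fraction drops below 10%.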
Needless to say, a false positive rate of 86% is disastrously high. Yet this is entirely typical of many instances in scientific research where naive usage of p-values leads to surprisingly misleading results. In light of such problems, the American Statistical Association (ASA) has issued a Statement on statistical significance and p-values. The statement concludes:
Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.
Several suggestions for overcoming experimenter bias have been mentioned in the text above, and numerous others are in the linked references above. There is no one silver bullet. What is clear is that researchers from all fields of research need to take experimenter bias and the larger realm of systematic errors more seriously. As Luke Caldwell wrote in the Scientific American article mentioned above:
Gathering the data set was the quick part. The real challenge of a precision experiment is the time spent looking for systematic errors — ways we might convince ourselves we had measured an eEDM when in fact we had not. Precision-measurement scientists take this job very seriously; no one wants to declare they have discovered new particles only to later find out they had only precisely measured a tiny flaw in their apparatus or method. We spent about two years hunting for and understanding such flaws.
Progress in troubled times: Good news in science, medicine and society

Nonetheless, progress is still being made on many fronts. If anything, 2023 produced a bumper crop of startling new advances, particularly in health, medicine, physics, astronomy, computer science and artificial intelligence. Many of these advances, in turn, are based on the unstoppable forward march of Moore’s Law and computer technology — since 2000, aggregate memory and computer power of comparable devices have increased by factors of approximately 1,000,000. Genome sequencing technology has advanced even faster — sequencing a full human genome cost $3 billion in 2003, but currently costs only about $300, a 10,000,000-fold reduction.
Further, contrary to popular perception, progress has also been made in several metrics of social well-being. For example, after rising worldwide during the Covid-19 pandemic, crime rates fell sharply in most major cities in 2023. And global poverty continues its remarkable decline, with over 100,000 persons escaping dire poverty per day (on average) since the 1980s.
Granted, as emphasized above, progress has certainly not been universal, and many dire challenges remain, but we should not lose sight of the progress that has been achieved. Here are just a few of the many items of progress worth celebrating, most of them from the past year (2023). Numerous others are listed here, from which some of the items below were gleaned.
Aliens made this rock: The post-hoc probability fallacy in biology, finance and cosmology

While out hiking, I found this rock. The following table gives measurements made on the rock. The first two rows give the overall length and width of the rock. Each of the six rows after the first two gives thickness measurements, made on a 3 cm x 6 cm grid of points on the top surface. All measurements are in millimeters:
Measurement or row | Column 1 | Column 2 | Column 3
--- | --- | --- | ---
Length | 105.0 | |
Width | 48.21 | |
Row 1 | 35.44 | 35.38 | 36.54
Row 2 | 38.06 | 38.27 | 38.55
Row 3 | 38.02 | 39.53 | 39.29
Row 4 | 38.66 | 40.50 | 41.96
Row 5 | 39.40 | 43.48 | 43.31
Row 6 | 39.58 | 41.83 | 43.07
This is a set of 20 measurements, each given to four significant figures, for a total of 80 digits. Among all rocks of roughly this size, the probability of a rock appearing with this particular set of measurements is thus one in 10^{80}. Note that this probability is so remote that even if the surfaces of each of the ten planets estimated to orbit each of 100 billion stars in the Milky Way were examined in detail, and this were repeated for each of the estimated 100 billion galaxies in the visible universe, it is still exceedingly unlikely that a rock with this exact set of measurements would ever be found. Thus this rock could not have appeared naturally, and must have been created by space aliens…
What is the fallacy in the above argument? First of all, modeling each measurement as a random variable of four equiprobable digits, and then assuming all measurements are independent (so that we can blithely multiply probabilities) is a very dubious reckoning. In real rocks, the measurement at one point is constrained by physics and geology to be reasonably close to that of nearby points. Presuming that every instance in the space of 10^{80} theoretical digit strings is equally probable as a set of rock measurements is an unjustified and clearly invalid assumption. Thus the above reckoning must be rejected on this basis alone.
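The grid data above bear this out directly. A quick check in Python, using the thickness values from the table, confirms both the bare arithmetic behind the one-in-10^{80} figure and the fact that neighboring measurements are tightly constrained, nothing like independent equiprobable draws:

```python
# Thickness grid from the table above (mm; rows 1-6, columns 1-3).
thickness = [
    [35.44, 35.38, 36.54],
    [38.06, 38.27, 38.55],
    [38.02, 39.53, 39.29],
    [38.66, 40.50, 41.96],
    [39.40, 43.48, 43.31],
    [39.58, 41.83, 43.07],
]

# The naive reckoning: 20 measurements x 4 significant digits = 80 digits,
# each treated as an independent, equiprobable draw: one chance in 10^80.
naive_probability = 10.0 ** -80

# But physically adjacent points are far from independent: collect the
# absolute differences between horizontally and vertically adjacent cells.
diffs = []
for r in range(6):
    for c in range(3):
        if c + 1 < 3:
            diffs.append(abs(thickness[r][c] - thickness[r][c + 1]))
        if r + 1 < 6:
            diffs.append(abs(thickness[r][c] - thickness[r + 1][c]))

print(max(diffs))  # about 4 mm: every neighbor is within a few mm
```

The largest jump between neighboring grid points is only about 4 mm on values in the 35–43 mm range, so the vast majority of the 10^{80} theoretical digit strings could never occur as measurements of a real rock.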
Key point: It is important to note that the post-hoc fallacy does not just weaken a probability or statistical argument. In most cases, the post-hoc fallacy completely nullifies the argument. The correct reckoning is “What is the probability of X occurring, given that X has been observed to occur,” which of course is unity. In other words, the laws of probability, when correctly applied to a post-hoc phenomenon, can say nothing one way or the other about the likelihood of the event.
In general, probability reckonings based solely or largely on combinatorial enumerations of theoretical possibilities have no credibility when applied to real-world problems. And reckonings where a statistical test or probability calculation is devised or modified after the fact are invalid and should be immediately rejected — some outcome had to occur, and the laws of probability by themselves cannot help to determine its likelihood.
One example of the post hoc probability fallacy in biology can be seen in the attempt by some evolution skeptics (see this Math Scholar article for some references) to claim that the human alpha-globin molecule, a component of hemoglobin that performs a key oxygen transfer function in blood, could not have arisen by “random” evolution. These writers argue that since human alpha-globin is a protein chain based on a sequence of 141 amino acids, and since there are 20 different amino acids common in living systems, the “probability” of selecting human alpha-globin at random is one in 20^{141}, or one in approximately 10^{183}. This probability is so tiny, so they argue, that even after millions of years of random molecular trials covering the entire Earth’s surface, no human alpha-globin protein molecule would ever appear.
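A quick calculation confirms the bare arithmetic (and, as discussed below, only the arithmetic) behind the skeptics’ figure:

```python
import math

# The skeptics' reckoning: a 141-residue chain over a 20-letter amino-acid
# alphabet, with every possible sequence treated as equally probable.
n_residues = 141
n_amino_acids = 20
log10_count = n_residues * math.log10(n_amino_acids)
print(log10_count)  # about 183.4, i.e. 20^141 is approximately 10^183
```

The number is computed correctly; as the next paragraphs explain, it is the model behind the number, independent equiprobable sequences assessed after the fact, that has no validity.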
But this line of reasoning is a dead ringer for the post-hoc probability fallacy. Note that this probability reckoning was performed after the fact, on a single very limited dataset (the human alpha-globin sequence) that has been known in the biology literature for decades. Some sequence had to appear, and the fact that the particular sequence found today in humans was the end result of the course of evolution provides no guidance, by itself, as to the probability of its occurrence.
Further, just like the rock measurements above, the enumeration of combinatorial possibilities in this calculation, devoid of empirical data, has no credibility. Some alpha-globin sequences may be highly likely to be realized while others may be biologically impossible. It may well be that there is nothing special whatsoever about the human alpha-globin sequence, as evidenced, for example, by the great variety in alpha-globin molecules seen across the biological kingdom, all of which perform a similar oxygen transfer function. But there is no way to know for sure, since we have only one example of human evolution. Thus the reckoning used here, namely enumerating combinatorial possibilities and taking the reciprocal to obtain a probability figure, is unjustified and invalid.
In any event, it is clear that the impressive-sounding probability figure claimed by these writers is a vacuous arithmetic exercise, with no foundation in real empirical biology. For a detailed discussion of these issues, see Do probability arguments refute evolution?.
The field of finance is deeply afflicted by the post-hoc fallacy, because of “backtest overfitting,” namely the usage of historical market data to develop an investment model, strategy or fund, where many variations are tried on the same fixed dataset (note that this is unavoidably a post-hoc reckoning). Backtest overfitting has long plagued the field of finance and is now thought to be the leading reason why investments that look great when designed often disappoint when actually fielded to investors. Models, strategies and funds suffering from this type of statistical overfitting typically target the random patterns present in the limited in-sample test-set on which they are based, and thus often perform erratically when presented with new, truly out-of-sample data.
As an illustration, the authors of this AMS Notices article show that if only five years of daily stock market data are available for backtesting, then no more than 45 variations of a strategy should be tried on this data, or the resulting strategy will be overfit, in the specific sense that the strategy’s Sharpe ratio (a standard measure of risk-adjusted return) is likely to be 1.0 or greater just by chance, even though the true Sharpe ratio may be zero or negative.
The potential for backtest overfitting in the financial field has grown enormously in recent years with the increased utilization of computer programs to search a space of millions or even billions of parameter variations for a given model, strategy or fund, and then to select only the “optimal” choice for publication or market implementation. Compounding the problem is the failure of most academic journals in the field to require authors to disclose the extent of their computer search and optimization. The sobering consequence is that a significant portion of the models, strategies and funds employed in the investment world, including many of those marketed to individual investors, may be merely statistical mirages.

Some commonly used techniques to compensate for backtest overfitting, if not used correctly, are themselves tantamount to overfitting. One example is the “hold-out method” — developing a model or investment fund based on a backtest of a certain date range, then checking the result with a different date range. However, those using the hold-out method may iteratively tune the parameters for their model until the score on the hold-out data, say measured by a Sharpe ratio, is impressively high. But these repeated tuning tests, using the same fixed hold-out dataset, are themselves tantamount to backtest overfitting (and thus subject to the post-hoc fallacy).
One dramatic visual example of backtest overfitting is shown in the graph at the right, which displays the mean excess return (compared to benchmarks) of newly minted exchange-traded index-linked funds, both in the months of design prior to submission to the SEC for approval, and in the months after the fund was fielded. The “knee” in the graph at 0 shows unmistakably the difference between statistically overfit designs and actual field experience.
For additional details, see How backtest overfitting in finance leads to false discoveries, which appeared in the December 2021 issue of the British journal Significance. This Significance article is condensed from the following manuscript, which is freely available from SSRN: Finance is Not Excused: Why Finance Should Not Flout Basic Principles of Statistics.
Most readers are likely familiar with Fermi’s paradox: There are at least 100 billion stars in the Milky Way, with many if not most now known to host planets. Once a species achieves a level of technology that is at least equal to ours, so that they have become a space-faring civilization, within a few million years (an eyeblink in cosmic time) they or their robotic spacecraft could fully explore the Milky Way. So why do we not see any evidence of their existence, or even of their exploratory spacecraft? For additional details, see this Math Scholar article.
Most “solutions” to Fermi’s paradox, including, sadly, some promulgated by prominent researchers who should know better, are easily defeated. For example, explanations such as “They are under strict orders not to communicate with a new civilization such as Earth,” or “They have lost interest in scientific research, exploration and expansion,” or “They have no interest in a primitive, backward society such as ours,” fall prey to a diversity argument: In any vast, diverse society (an essential prerequisite for advanced technology), there will be exceptions and nonconformists to any rule. Thus claims that “all extraterrestrials are like X” have little credibility, no matter what “X” is. It is deeply ironic that while most scientific researchers and others would strenuously reject stereotypes of religious, ethnic or national groups in human society, many seem willing to hypothesize sweeping, ironclad stereotypes for extraterrestrial societies.
Similarly, dramatic advances in human technology over the past decade, including new space exploration vehicles, new energy sources, state-of-the-art supercomputers, quantum computing, robotics and artificial intelligence, are severely undermining the long-standing assumption that space exploration is fundamentally too difficult.
One of the remaining explanations is that Earth is an exceedingly rare planet (possibly unique in the Milky Way), with a lengthy string of characteristics fostering a long-lived biological regime, which enabled the unlikely rise of intelligent life. John Gribbin, for example, writes “They are not here, because they do not exist. The reasons why we are here form a chain so improbable that the chance of any other technological civilization existing in the Milky Way Galaxy at the present time is vanishingly small.” [Gribbin2011, pg. 205]. Among other things, researchers note that:
Note that any one of these five items may constitute a “great filter,” i.e. a virtually insuperable obstacle. Compounded together they suggest that the origin and rise of human-level technological life on Earth may well have been an exceedingly singular event in the cosmos, unlikely to be repeated anywhere else in the Milky Way if not beyond. Thus the “rare Earth” explanation of Fermi’s paradox is growing in credibility.
On the other hand, analyses in this arena are hobbled by the post-hoc probability fallacy: We only have one real data point, namely the rise of human technological civilization on a single planet, Earth, and so we have no way to rigorously assess the probability of our existence. It may be that the origin and rise of intelligent life is inevitable on any suitably hospitable planet, and we just need to keep trying to finally contact another similarly advanced species. Or it may be that the rise of a species with enough intelligence and technology to rigorously pose the question of its existence is merely a post-hoc selection effect — if we didn’t live on a hospitable planet that harbored the origin of life and was continuously habitable over billions of years, thus allowing the evolution of life that progressed to advanced technology, we would not be here to pose the question.
As Charles Lineweaver of the Australian National University has observed [Lineweaver2024b]:
Our existence on Earth can tell us little about the probability of the evolution of human-like intelligence in the Universe because even if this probability were infinitesimally small and there were only one planet with the kind of intelligence that can ask this question, we the question-askers would, of necessity, find ourselves on that planet.
For additional details, see Where are the extraterrestrials? Fermi’s paradox, diversity and the origin of life.
The dilemma of whether our existence is “special” in some sense extends to the universe itself.
For several decades, researchers in the fields of physics and cosmology have puzzled over deeply perplexing indications, many of them highly mathematical in nature, that the universe seems inexplicably well-tuned to facilitate the evolution of atoms, complex molecular structures and sentient creatures. Some of these “cosmic coincidences” include the following (these and numerous others are presented and discussed in detail in a recent book by Lewis and Barnes):
Numerous explanations for these difficulties have been proposed over the years. One of the more widely accepted hypotheses is the multiverse, combined with the anthropic principle. The theory of inflation, mentioned above, suggests that our universe is merely one pocket that separated from many others in the very early universe. Similarly, string theory suggests that our universe is merely one speck in an enormous landscape of possible universes, by one count 10^{500} in number, each corresponding to a different Calabi-Yau manifold.
Thus, the thinking goes, we should not be surprised that we find ourselves in a universe that has somehow beaten the one-in-10^{120} odds to be life-friendly (to pick just the cosmological constant paradox), because it had to happen somewhere, and, besides, if our universe were not life-friendly, then we would not be here to talk about it. In other words, these researchers propose that the multiverse (or the “cosmic landscape”) actually exists in some sense, but acknowledge that the vast majority of these universes are utterly sterile — either very short-lived or else completely devoid of atoms or other structures, much less sentient living organisms like us contemplating the meaning of their existence. We are just lucky.
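The arithmetic behind "it had to happen somewhere" is worth making explicit, using the two figures the text itself cites (the 10^{500} string-theory landscape and the one-in-10^{120} cosmological-constant odds). Working in base-10 logarithms avoids numbers far beyond floating-point range:

```python
# Back-of-the-envelope check of the multiverse argument, using the
# figures cited in the text (illustrative only; both numbers are
# themselves highly uncertain estimates).
landscape_size_log10 = 500    # ~10^500 universes in the string landscape
friendly_odds_log10 = -120    # ~1-in-10^120 odds of a life-friendly
                              # cosmological constant

# Expected number of life-friendly universes, as a power of 10:
expected_friendly_log10 = landscape_size_log10 + friendly_odds_log10
print(f"Expected life-friendly universes: ~10^{expected_friendly_log10}")
```

Even at such staggeringly unfavorable odds, a 10^{500}-member landscape would be expected to contain about 10^{380} life-friendly universes, which is why multiverse proponents find the fine-tuning unsurprising. The weakness, as the next paragraph notes, is that no one knows how to rigorously assign these probabilities in the first place.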
But other researchers are very reluctant to adopt such reasoning. After all, we have no evidence of these other universes, and as yet we have no conceivable means of rigorously calculating the “probability” of any possible universe outcome, including ours (this is known as the “measure problem” of cosmology). As with Fermi’s paradox, by definition we have only one universe to observe and analyze, and thus, by the post-hoc probability fallacy, simple-minded attempts to reckon “probabilities” are doomed to failure.
For additional details, see Is the universe fine-tuned for intelligent life?.
A good introduction to the post-hoc probability fallacy for the general reader can be found in cognitive scientist Steven Pinker's book Rationality: What It Is, Why It Seems Scarce, Why It Matters. Pinker likens the post-hoc fallacy to the following joke:
A man tries on a custom suit and says to the tailor, “I need this sleeve taken in.” The tailor says, “No, just bend your elbow like this. See, it pulls up the sleeve.” The customer says, “Well, OK, but when I bend my elbow, the collar goes up the back of my neck.” The tailor says, “So? Raise your head up and back. Perfect.” The man says, “But now the left shoulder is three inches lower than the right one!” The tailor says, “No problem. Bend at the waist and then it evens out.” The man leaves the store wearing the suit, his right elbow sticking out, his head craned back, his torso bent to the left, walking with a herky-jerky gait. A pair of pedestrians pass him by. The first says, “Did you see that poor disabled guy? My heart aches for him.” The second says, “Yeah, but his tailor is a genius — the suit fits him perfectly!”