An international team of scientists, including researchers from the University of Cambridge, have launched a new research collaboration that will use the same technology behind ChatGPT to build an AI-powered tool for scientific discovery.
The team launched the initiative, called Polymathic AI, earlier this week, alongside the publication of a series of related papers on the arXiv open-access repository.
“This will completely change how people use AI and machine learning in science,” said Polymathic AI principal investigator Shirley Ho, a group leader at the Flatiron Institute’s Center for Computational Astrophysics in New York City.
The idea behind Polymathic AI “is similar to how it’s easier to learn a new language when you already know five languages,” said Ho.
Starting with a large, pre-trained model, known as a foundation model, can be both faster and more accurate than building a scientific model from scratch. That can be true even if the training data isn’t obviously relevant to the problem at hand.
“It’s been difficult to carry out academic research on full-scale foundation models due to the scale of computing power required,” said co-investigator Miles Cranmer, from Cambridge’s Department of Applied Mathematics and Theoretical Physics and Institute of Astronomy. “Our collaboration with Simons Foundation has provided us with unique resources to start prototyping these models for use in basic science, which researchers around the world will be able to build from—it’s exciting.”
“Polymathic AI can show us commonalities and connections between different fields that might have been missed,” said co-investigator Siavash Golkar, a guest researcher at the Flatiron Institute’s Center for Computational Astrophysics.
“In previous centuries, some of the most influential scientists were polymaths with a wide-ranging grasp of different fields. This allowed them to see connections that helped them get inspiration for their work. With each scientific domain becoming more and more specialized, it is increasingly challenging to stay at the forefront of multiple fields. I think this is a place where AI can help us by aggregating information from many disciplines.”
“Despite rapid progress of machine learning in recent years in various scientific fields, in almost all cases, machine learning solutions are developed for specific use cases and trained on some very specific data,” said co-investigator Francois Lanusse, a cosmologist at the Centre national de la recherche scientifique (CNRS) in France.
“This creates boundaries both within and between disciplines, meaning that scientists using AI for their research do not benefit from information that may exist, but in a different format, or in a different field entirely.”
Polymathic AI’s project will learn using data from diverse sources across physics and astrophysics (and eventually fields such as chemistry and genomics, its creators say) and apply that multidisciplinary savvy to a wide range of scientific problems. The project will “connect many seemingly disparate subfields into something greater than the sum of their parts,” said project member Mariel Pettee, a postdoctoral researcher at Lawrence Berkeley National Laboratory.
“How far we can make these jumps between disciplines is unclear,” said Ho. “That’s what we want to do—to try and make it happen.”
ChatGPT has well-known limitations when it comes to accuracy (for instance, the chatbot says 2,023 times 1,234 is 2,497,582 rather than the correct answer of 2,496,382). Polymathic AI’s project will avoid many of those pitfalls, Ho said, by treating numbers as actual numbers, not just characters on the same level as letters and punctuation. The training data will also use real scientific datasets that capture the physics underlying the cosmos.
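The difference between treating a number as characters and treating it as a number can be sketched in a few lines of Python. This is only an illustration, not Polymathic AI's implementation: the function names, the `[NUM]` placeholder, and the regular expression are assumptions made for this example. The first function shows what a character-level model sees, the second keeps each number's value alongside the text, and the last lines check the multiplication quoted above.

```python
import re

# Hypothetical pattern for this sketch: comma-grouped or plain numbers,
# with an optional decimal part.
NUM_PATTERN = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")


def char_tokens(text):
    """Character-level view: each digit is just another symbol."""
    return list(text)


def numeric_tokens(text):
    """Replace each number with a [NUM] placeholder and keep its value.

    The placeholder is an assumption for illustration only; the text and
    the list of values are returned side by side.
    """
    values = []

    def _store(match):
        values.append(float(match.group().replace(",", "")))
        return "[NUM]"

    return NUM_PATTERN.sub(_store, text), values


# As characters, "2,023" is five unrelated symbols.
chars = char_tokens("2,023")

# As numbers, the same prompt keeps the quantities themselves.
template, values = numeric_tokens("2,023 times 1,234")

# Checking the arithmetic quoted in the article.
correct = 2023 * 1234       # 2,496,382
chatbot_answer = 2_497_582  # the incorrect figure quoted above
error = chatbot_answer - correct
```

In this sketch the character view carries no notion that "2,023" is close to "2,024", whereas the numeric view hands the values to whatever comes next as ordinary floating-point numbers.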
Transparency and openness are a big part of the project, Ho said. “We want to make everything public. We want to democratize AI for science in such a way that, in a few years, we’ll be able to serve a pre-trained model to the community that can help improve scientific analyses across a wide variety of problems and domains.”
More information: Michael McCabe et al, Multiple Physics Pretraining for Physical Surrogate Models, arXiv (2023). DOI: 10.48550/arxiv.2310.02994
Siavash Golkar et al, xVal: A Continuous Number Encoding for Large Language Models, arXiv (2023). DOI: 10.48550/arxiv.2310.02989
Francois Lanusse et al, AstroCLIP: Cross-Modal Pre-Training for Astronomical Foundation Models, arXiv (2023). DOI: 10.48550/arxiv.2310.03024