While GPT-4 performs well in structured reasoning tasks, a new study shows that its ability to adapt to variations is weak—suggesting AI still lacks true abstract understanding and flexibility in decision-making.
Artificial Intelligence (AI), particularly large language models like GPT-4, has shown impressive performance on reasoning tasks. But does AI truly understand abstract concepts, or is it just mimicking patterns? A new study from the University of Amsterdam and the Santa Fe Institute reveals that while GPT models perform well on some analogy tasks, they fall short when the problems are altered, highlighting key weaknesses in AI’s reasoning capabilities.
Analogical reasoning is the ability to draw a comparison between two different things based on their similarities in certain aspects. It is one of the most common methods by which human beings try to understand the world and make decisions. An example of analogical reasoning: cup is to coffee as soup is to ??? (the answer is bowl).
Large language models like GPT-4 perform well on various tests, including those requiring analogical reasoning. But can AI models truly engage in general, robust reasoning, or do they over-rely on patterns from their training data? This study by language and AI experts Martha Lewis (Institute for Logic, Language and Computation at the University of Amsterdam) and Melanie Mitchell (Santa Fe Institute) examined whether GPT models are as flexible and robust as humans in making analogies. ‘This is crucial, as AI is increasingly used for decision-making and problem-solving in the real world,’ explains Lewis.
Comparing AI models to human performance
Lewis and Mitchell compared the performance of humans and GPT models on three different types of analogy problems:
- Letter sequences – Identify patterns in letter sequences and complete them correctly.
- Digit matrices – Analyze number patterns and determine the missing numbers.
- Story analogies – Determine which of two stories best corresponds to a given example story.
In addition to testing whether GPT models could solve the original problems, the study examined how well they performed when the problems were subtly modified. ‘A system that truly understands analogies should maintain high performance even on these variations’, state the authors in their article.
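To make the idea of a "subtly modified" problem concrete, here is a minimal Python sketch. It is not the authors' test set or code: the article does not say how the problems were altered, so the shuffled alphabet, the function name `successor_analogy`, and the example strings are all illustrative assumptions. The point is only that a solver that has grasped the abstract rule ("replace the last letter with its successor") gets the same kind of answer whatever ordering it is given.

```python
import string


def successor_analogy(source, target, query, alphabet):
    """Solve 'source -> target, query -> ?' for one rule only:
    the last letter is replaced by its successor in `alphabet`."""
    idx = {ch: i for i, ch in enumerate(alphabet)}
    # Check that the example pair really follows the successor rule.
    if source[:-1] != target[:-1] or idx[target[-1]] != idx[source[-1]] + 1:
        raise ValueError("example pair does not follow the successor rule")
    # Apply the same rule to the query (no wrap-around at the alphabet's end).
    return query[:-1] + alphabet[idx[query[-1]] + 1]


standard = string.ascii_lowercase
# Hypothetical variation: the same 26 letters in an unfamiliar order.
shuffled = "xqlbtmwodjkezgipuvshcynarf"

print(successor_analogy("abcd", "abce", "ijkl", standard))  # ijkm
print(successor_analogy("xqlb", "xqlt", "odjk", shuffled))  # odje
```

The rule-based solver is unaffected by the change of alphabet. A system that has mostly memorized patterns over the familiar a-to-z order would be expected to do worse on the second problem.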
GPT models struggle with robustness
Humans maintained high performance on most modified versions of the problems, but GPT models, while performing well on standard analogy problems, struggled with variations. ‘This suggests that AI models often reason less flexibly than humans, and their reasoning is less about true abstract understanding and more about pattern matching,’ explains Lewis.
In digit matrices, GPT models showed a significant performance drop when the position of the missing number changed. Humans had no difficulty with this. In story analogies, GPT-4 tended to pick whichever answer was presented first, whereas humans were not influenced by answer order. Additionally, GPT-4 struggled more than humans when key elements of a story were reworded, suggesting a reliance on surface-level similarities rather than deeper causal reasoning.
When tested on modified versions, GPT models showed a decline in performance on simpler analogy tasks, while humans remained consistent. However, both humans and AI struggled with more complex analogical reasoning tasks.
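The answer-order effect in the story analogies can be checked with a simple counterbalancing scheme, sketched below in Python. This is an illustrative assumption about how such a check could be set up, not the procedure from the paper. The names `first_option_rate`, `always_first`, and `order_insensitive`, and the toy items, are hypothetical. Each item is shown twice, once with the correct story first and once with it second, so a solver that ignores order and always finds the correct story picks the first option exactly half the time.

```python
def first_option_rate(choose, items):
    """Fraction of trials in which `choose` picks the option listed first.

    `items` is a list of (correct, distractor) pairs; every pair is
    presented in both orders to counterbalance position.
    """
    first_picks = 0
    trials = 0
    for correct, distractor in items:
        for options in ((correct, distractor), (distractor, correct)):
            picked = choose(options)
            first_picks += picked == options[0]
            trials += 1
    return first_picks / trials


# Toy stand-ins for story pairs (hypothetical data).
items = [("correct-1", "distractor-1"), ("correct-2", "distractor-2")]


def always_first(options):
    # A maximally position-biased solver.
    return options[0]


def order_insensitive(options):
    # A solver that finds the correct story regardless of position.
    return next(o for o in options if o.startswith("correct"))


print(first_option_rate(always_first, items))       # 1.0
print(first_option_rate(order_insensitive, items))  # 0.5
```

A rate well above 0.5 on counterbalanced items indicates that position, rather than content, is driving some of the choices.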
Weaker than human cognition
This research challenges the widespread assumption that AI models like GPT-4 can reason in the same way humans do. ‘While AI models demonstrate impressive capabilities, this does not mean they truly understand what they are doing,’ conclude Lewis and Mitchell. ‘Their ability to generalize across variations is still significantly weaker than human cognition. GPT models often rely on superficial patterns rather than deep comprehension.’
This is a critical warning about using AI in important decision-making areas such as education, law, and healthcare. While AI can be a powerful tool, it is not yet a replacement for human thinking and reasoning.
- Lewis, Martha, and Melanie Mitchell. “Evaluating the Robustness of Analogical Reasoning in Large Language Models.” Transactions on Machine Learning Research, 2025, openreview.net/forum?id=t5cy5v9wp