While GPT-4 performs well in structured reasoning tasks, a new study shows that its ability to adapt to variations is weak, suggesting AI still lacks true abstract understanding and flexibility in decision-making.
Artificial Intelligence (AI), particularly large language models like GPT-4, has shown impressive performance on reasoning tasks. But does AI truly understand abstract concepts, or is it just mimicking patterns? A new study from the University of Amsterdam and the Santa Fe Institute reveals that while GPT models perform well on some analogy tasks, they fall short when the problems are altered, highlighting key weaknesses in AI’s reasoning capabilities.
Analogical reasoning is the ability to draw a comparison between two different things based on their similarities in certain aspects. It is one of the most common methods by which human beings try to understand the world and make decisions. An example of analogical reasoning: coffee is to cup as soup is to ??? (the answer being: bowl).
Large language models like GPT-4 perform well on various tests, including those requiring analogical reasoning. But can AI models truly engage in general, robust reasoning, or do they over-rely on patterns from their training data? This study by language and AI experts Martha Lewis (Institute for Logic, Language and Computation at the University of Amsterdam) and Melanie Mitchell (Santa Fe Institute) examined whether GPT models are as flexible and robust as humans in making analogies. ‘This is crucial, as AI is increasingly used for decision-making and problem-solving in the real world,’ explains Lewis.
Comparing AI models to human performance
Lewis and Mitchell compared the performance of humans and GPT models on three different types of analogy problems:
- Letter sequences – Identify patterns in letter sequences and complete them correctly.
- Digit matrices – Analyze number patterns and determine the missing numbers.
- Story analogies – Determine which of two stories best corresponds to a given example story.
In addition to testing whether GPT models could solve the original problems, the study examined how well they performed when the problems were subtly modified. ‘A system that truly understands analogies should maintain high performance even on these variations,’ state the authors in their article.
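To make the idea of a subtly modified problem concrete, here is a minimal Python sketch of a letter-sequence task and one possible variation. The problem format, the "successor of the last letter" rule, and the reordered alphabet are illustrative assumptions, not details taken from the study; the function names are hypothetical.

```python
import random
import string

def successor_rule(source, alphabet):
    """Replace the last letter of `source` with the next letter in `alphabet`."""
    idx = alphabet.index(source[-1])
    return source[:-1] + alphabet[idx + 1]

def make_problem(alphabet, start, length=4):
    """Build one problem: a run of consecutive letters and its expected answer.

    Assumes start + length is a valid index into `alphabet`.
    """
    source = "".join(alphabet[start:start + length])
    return source, successor_rule(source, alphabet)

# Standard version: the familiar a-z ordering.
standard = list(string.ascii_lowercase)

# Hypothetical variation: the same rule applied over a reordered alphabet,
# so the answer cannot be read off from memorized a-z sequences.
shuffled = standard[:]
random.Random(0).shuffle(shuffled)

standard_problem = make_problem(standard, 8)   # ("ijkl", "ijkm")
modified_problem = make_problem(shuffled, 8)
```

The abstract rule is identical in both versions; only the surface form changes. A solver that has grasped the rule should handle both, which is the kind of robustness the quoted sentence describes.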
GPT models struggle with robustness
Humans maintained high performance on most modified versions of the problems, but GPT models, while performing well on standard analogy problems, struggled with variations. ‘This suggests that AI models often reason less flexibly than humans, and their reasoning is less about true abstract understanding and more about pattern matching,’ explains Lewis.
In digit matrices, GPT models showed a significant performance drop when the missing number’s position changed. Humans had no difficulty with this. In story analogies, GPT-4 tended to select whichever answer was presented first, whereas humans were not influenced by answer order. Additionally, GPT-4 struggled more than humans when key elements of a story were reworded, suggesting a reliance on surface-level similarities rather than deeper causal reasoning.
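The answer-order effect described above can be checked by presenting each story problem twice, with the two candidate answers swapped. The sketch below is an assumption about how such a check might look, not the authors' evaluation code. `order_bias`, `always_first`, and `overlap_chooser` are hypothetical names, and the word-overlap chooser is a deliberately shallow stand-in for a real model.

```python
def order_bias(choose, problems):
    """Return (consistency, first_pick_rate) for a two-option chooser.

    choose(target, first, second) must return 0 or 1: the position it picks.
    Each problem is a (target, option_a, option_b) tuple.
    """
    consistent = 0
    first_picks = 0
    for target, a, b in problems:
        pick_ab = (a, b)[choose(target, a, b)]   # options in original order
        pick_ba = (b, a)[choose(target, b, a)]   # options swapped
        consistent += pick_ab == pick_ba
        first_picks += (pick_ab == a) + (pick_ba == b)
    n = len(problems)
    return consistent / n, first_picks / (2 * n)

def always_first(target, first, second):
    """A maximally order-biased chooser: always picks position 0."""
    return 0

def overlap_chooser(target, first, second):
    """Picks the option sharing more words with the target (surface heuristic)."""
    words = set(target.split())
    score_first = len(words & set(first.split()))
    score_second = len(words & set(second.split()))
    return 0 if score_first >= score_second else 1

# Toy data, invented for illustration only.
problems = [("a dog chased a cat", "a dog chased a bird", "rain fell all day")]

biased = order_bias(always_first, problems)
unbiased = order_bias(overlap_chooser, problems)
```

A chooser that ignores order picks the same option in both presentations, so its first-position rate sits near 0.5. A rate well above 0.5 combined with low consistency would point to the kind of position bias reported for GPT-4.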
When tested on modified versions, GPT models showed a decline in performance on simpler analogy tasks, while humans remained consistent. However, both humans and AI struggled with more complex analogical reasoning tasks.
Weaker than human cognition
This research challenges the widespread assumption that AI models like GPT-4 can reason in the same way humans do. ‘While AI models demonstrate impressive capabilities, this does not mean they truly understand what they are doing,’ conclude Lewis and Mitchell. ‘Their ability to generalize across variations is still significantly weaker than human cognition. GPT models often rely on superficial patterns rather than deep comprehension.’
This is a critical warning about using AI in important decision-making areas such as education, law, and healthcare. While AI can be a powerful tool, it is not yet a replacement for human thinking and reasoning.
- Lewis, Martha, and Melanie Mitchell. “Evaluating the Robustness of Analogical Reasoning in Large Language Models.” Transactions on Machine Learning Research, 2025, openreview.net/forum?id=t5cy5v9wp