While GPT-4 performs well in structured reasoning tasks, a new study shows that its ability to adapt to variations is weak—suggesting AI still lacks true abstract understanding and flexibility in decision-making.
Artificial Intelligence (AI), particularly large language models like GPT-4, has shown impressive performance on reasoning tasks. But does AI truly understand abstract concepts, or is it just mimicking patterns? A new study from the University of Amsterdam and the Santa Fe Institute reveals that while GPT models perform well on some analogy tasks, they fall short when the problems are altered, highlighting key weaknesses in AI’s reasoning capabilities.
Analogical reasoning is the ability to draw a comparison between two different things based on their similarities in certain aspects. It is one of the most common methods by which human beings try to understand the world and make decisions. An example of analogical reasoning: coffee is to cup as soup is to ??? (the answer being: bowl).
Large language models like GPT-4 perform well on various tests, including those requiring analogical reasoning. But can AI models truly engage in general, robust reasoning, or do they over-rely on patterns from their training data? This study by language and AI experts Martha Lewis (Institute for Logic, Language and Computation at the University of Amsterdam) and Melanie Mitchell (Santa Fe Institute) examined whether GPT models are as flexible and robust as humans in making analogies. ‘This is crucial, as AI is increasingly used for decision-making and problem-solving in the real world,’ explains Lewis.
Comparing AI models to human performance
Lewis and Mitchell compared the performance of humans and GPT models on three different types of analogy problems:
- Letter sequences – Identify patterns in letter sequences and complete them correctly.
- Digit matrices – Analyze number patterns and determine the missing numbers.
- Story analogies – Determine which of two stories best corresponds to a given example story.
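To make the letter-sequence format concrete, here is a minimal Python sketch of one such problem. It is an illustration only, not taken from the study's test items: the function name `successor_analogy`, the single hard-coded rule (replace the last letter with its successor), and the reordered alphabet are all assumptions chosen for this example.

```python
import string


def successor_analogy(source, target, probe, alphabet=string.ascii_lowercase):
    """Solve 'source -> target; probe -> ?' for one hard-coded rule:
    the last letter is replaced by its successor in `alphabet`.
    (Hypothetical helper; a probe ending in the alphabet's last letter
    is not handled.)"""
    index = {ch: i for i, ch in enumerate(alphabet)}
    # Check that the example pair really follows the successor rule.
    same_prefix = source[:-1] == target[:-1]
    is_successor = index[target[-1]] == index[source[-1]] + 1
    if not (same_prefix and is_successor):
        raise ValueError("example pair does not follow the successor rule")
    # Apply the same rule to the probe string.
    return probe[:-1] + alphabet[index[probe[-1]] + 1]


# Standard alphabet: abcd -> abce, so ijkl -> ijkm
print(successor_analogy("abcd", "abce", "ijkl"))

# Assumed variation: the same rule over an arbitrarily reordered alphabet.
permuted = "qwertyuiopasdfghjklzxcvbnm"
print(successor_analogy("qwer", "qwet", "asdf", alphabet=permuted))
```

The rule is identical in both calls; only the surface alphabet changes. A solver that has grasped the abstract rule should handle both, while one that relies on having seen the familiar alphabet order may not.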
In addition to testing whether GPT models could solve the original problems, the study examined how well they performed when the problems were subtly modified. ‘A system that truly understands analogies should maintain high performance even on these variations,’ state the authors in their article.
GPT models struggle with robustness
Humans maintained high performance on most modified versions of the problems, but GPT models, while performing well on standard analogy problems, struggled with variations. ‘This suggests that AI models often reason less flexibly than humans, and their reasoning is less about true abstract understanding and more about pattern matching,’ explains Lewis.
In digit matrices, GPT models showed a significant performance drop when the missing number’s position changed. Humans had no difficulty with this. In story analogies, GPT-4 tended to select the first given answer as correct more often, whereas humans were not influenced by answer order. Additionally, GPT-4 struggled more than humans when key elements of a story were reworded, suggesting a reliance on surface-level similarities rather than deeper causal reasoning.
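One way to picture the answer-order effect is a counterbalanced check: show every story item twice, once with the correct answer listed first and once with it listed second, and count how often the first option is picked. The sketch below is a hypothetical harness, not the authors' evaluation code; `first_position_rate`, the `choose` callback, and the placeholder items are assumptions made for illustration.

```python
def first_position_rate(items, choose):
    """Fraction of trials in which `choose` picks the first-listed option.

    Each item is (source_story, correct_story, distractor_story). Every
    item is presented in both orders, so a chooser that always answers
    correctly and ignores position scores exactly 0.5.
    """
    first_picks = 0
    trials = 0
    for source, correct, distractor in items:
        for options in ((correct, distractor), (distractor, correct)):
            picked = choose(source, options)  # expected to return 0 or 1
            if picked == 0:
                first_picks += 1
            trials += 1
    return first_picks / trials


# Placeholder items (not data from the study).
items = [
    ("source-1", "correct-1", "distractor-1"),
    ("source-2", "correct-2", "distractor-2"),
]


def always_first(source, options):
    """Stand-in chooser with maximal position bias."""
    return 0


def oracle(source, options):
    """Stand-in chooser that always finds the correct story."""
    return 0 if options[0].startswith("correct") else 1


print(first_position_rate(items, always_first))  # 1.0
print(first_position_rate(items, oracle))        # 0.5
```

A rate well above 0.5 on counterbalanced items would indicate a preference for whichever answer appears first, independent of its content.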
When tested on modified versions, GPT models showed a decline in performance on simpler analogy tasks, while humans remained consistent. However, both humans and AI struggled with more complex analogical reasoning tasks.
Weaker than human cognition
This research challenges the widespread assumption that AI models like GPT-4 can reason in the same way humans do. ‘While AI models demonstrate impressive capabilities, this does not mean they truly understand what they are doing,’ conclude Lewis and Mitchell. ‘Their ability to generalize across variations is still significantly weaker than human cognition. GPT models often rely on superficial patterns rather than deep comprehension.’
This is a critical warning about using AI in important decision-making areas such as education, law, and healthcare. While AI can be a powerful tool, it is not yet a replacement for human thinking and reasoning.
- Lewis, Martha, and Melanie Mitchell. “Evaluating the Robustness of Analogical Reasoning in Large Language Models.” Transactions on Machine Learning Research, 2025, openreview.net/forum?id=t5cy5v9wp