Business
Column: These Apple researchers just showed that AI bots can't think, and possibly never will
See if you can solve this arithmetic problem:
Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?
If you answered “190,” congratulations: You did as well as the average grade school kid by getting it right. (Friday’s 44 plus Saturday’s 58 plus Sunday’s 44 multiplied by 2, or 88, equals 190.)
You also did better than more than 20 state-of-the-art artificial intelligence models tested by an AI research team at Apple. The AI bots, they found, consistently got it wrong.
The Apple team found “catastrophic performance drops” by those models when they tried to parse simple mathematical problems written in essay form. In this example, the systems tasked with the question often didn’t understand that the size of the kiwis has nothing to do with the number of kiwis Oliver has. Some, consequently, subtracted the five undersized kiwis from the total and answered “185.”
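For readers who want to see the arithmetic spelled out, here is a minimal sketch in Python. It is an illustration only, not code from the Apple paper, and the variable names are invented for readability. It computes the correct total and then reproduces the mistaken subtraction described above.

```python
# Illustrative sketch of the kiwi problem; names are hypothetical,
# not taken from the Apple paper.
friday = 44
saturday = 58
sunday = 2 * friday        # "double the number of kiwis he did on Friday"
smaller_than_average = 5   # a distractor: size does not change the count

# Correct answer: simply add the three days.
correct_total = friday + saturday + sunday

# The error some models made: treating the distractor as a subtraction.
distracted_total = correct_total - smaller_than_average

print(correct_total)     # 190
print(distracted_total)  # 185
```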
Human schoolchildren, the researchers posited, are much better at detecting the difference between relevant information and inconsequential curveballs.
The Apple findings were published earlier this month in a technical paper that has attracted widespread attention in AI labs and the lay press, not only because the results are well-documented, but also because the researchers work for the nation’s leading high-tech consumer company — and one that has just rolled out a suite of purported AI features for iPhone users.
“The fact that Apple did this has gotten a lot of attention, but nobody should be surprised at the results,” says Gary Marcus, a critic of how AI systems have been marketed as reliably, well, “intelligent.”
Indeed, Apple’s conclusion matches earlier studies that have found that large language models, or LLMs, don’t actually “think” so much as match language patterns in materials they’ve been fed as part of their “training.” When it comes to abstract reasoning — “a key aspect of human intelligence,” in the words of Melanie Mitchell, an expert in cognition and intelligence at the Santa Fe Institute — the models fall short.
“Even very young children are adept at learning abstract rules from just a few examples,” Mitchell and colleagues wrote last year after subjecting GPT bots to a series of analogy puzzles. Their conclusion was that “a large gap in basic abstract reasoning still remains between humans and state-of-the-art AI systems.”
That’s important because LLMs such as GPT underlie the AI products that have captured the public’s attention. But the LLMs tested by the Apple team were consistently misled by the language patterns they were trained on.
The Apple researchers set out to answer the question, “Do these models truly understand mathematical concepts?” as one of the lead authors, Mehrdad Farajtabar, put it in a thread on X. Their answer is no. They also pondered whether the shortcomings they identified can be easily fixed, and their answer is also no: “Can scaling data, models, or compute fundamentally solve this?” Farajtabar asked in his thread. “We don’t think so!”
The Apple research, along with other findings about AI bots’ cogitative limitations, is a much-needed corrective to the sales pitches coming from companies hawking their AI models and systems, including OpenAI and Google’s DeepMind lab.
The promoters generally depict their products as dependable and their output as trustworthy. In fact, their output is consistently suspect, posing a clear danger when they’re used in contexts where the need for rigorous accuracy is absolute, say in healthcare applications.
That’s not always the case. “There are some problems which you can make a bunch of money on without having a perfect solution,” Marcus told me. Consider recommendation engines powered by AI, such as those that steer buyers on Amazon to products they might also like. If those systems get a recommendation wrong, it’s no big deal; a customer might spend a few dollars on a book he or she didn’t like.
“But a calculator that’s right only 85% of the time is garbage,” Marcus says. “You wouldn’t use it.”
The potential for damagingly inaccurate outputs is heightened by AI bots’ natural language capabilities, with which they offer even absurdly inaccurate answers with convincingly cocksure elan. Often they double down on their errors when challenged.
These errors are typically described by AI researchers as “hallucinations.” The term may make the mistakes seem almost innocuous, but in some applications, even a minuscule error rate can have severe ramifications.
That’s what academic researchers concluded in a recently published analysis of Whisper, an AI-powered speech-to-text tool developed by OpenAI, which can be used to transcribe medical discussions or jailhouse conversations monitored by correction officials.
The researchers found that about 1.4% of Whisper-transcribed audio segments in their sample contained hallucinations, among them wholly fabricated statements added to the transcribed conversation, including portrayals of “physical violence or death … [or] sexual innuendo,” and demographic stereotyping.
That may sound like a minor flaw, but the researchers observed that the errors could be incorporated in official records such as transcriptions of court testimony or prison phone calls — which could lead to official decisions based on “phrases or claims that a defendant never said.”
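To get a feel for what a 1.4% rate means at scale, consider a rough sketch. The rate comes from the researchers’ finding cited above; the workload of 10,000 audio segments is a hypothetical figure chosen only for illustration, not a number from the study.

```python
# Rough illustration only. The rate is the figure cited in the column;
# the number of segments is a hypothetical workload.
hallucination_rate = 0.014   # about 1.4% of transcribed segments
segments = 10_000            # hypothetical count of audio segments

# Expected number of segments containing a hallucination at that rate.
expected_hallucinated = round(hallucination_rate * segments)
print(expected_hallucinated)  # 140
```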
Updates to Whisper in late 2023 improved its performance, the researchers said, but the updated Whisper “still regularly and reproducibly hallucinated.”
That hasn’t deterred AI promoters from unwarranted boasting about their products. In an Oct. 29 tweet, Elon Musk invited followers to submit “x-ray, PET, MRI or other medical images to Grok [the AI application for his X social media platform] for analysis.” Grok, he wrote, “is already quite accurate and will become extremely good.”
It should go without saying that, even if Musk is telling the truth (not an absolutely certain conclusion), any system used by healthcare providers to analyze medical images needs to be a lot better than “extremely good,” however one might define that standard.
That brings us to the Apple study. It’s proper to note that the researchers aren’t critics of AI as such but believers that its limitations need to be understood. Farajtabar was formerly a senior research scientist at DeepMind, where another author interned under him; other co-authors hold advanced degrees and professional experience in computer science and machine learning.
The team plied their subject AI models with questions drawn from a popular collection of more than 8,000 grade school arithmetic problems testing schoolchildren’s understanding of addition, subtraction, multiplication and division. When the problems incorporated clauses that might seem relevant but weren’t, the models’ performance plummeted.
That was true of all the models, including versions of the GPT bots developed by OpenAI, Meta’s Llama, Microsoft’s Phi-3, Google’s Gemma and several models developed by the French lab Mistral AI.
Some did better than others, but all showed a decline in performance as the problems became more complex. One problem involved a basket of school supplies including erasers, notebooks and writing paper. That requires a solver to multiply the number of each item by its price and add them together to determine how much the entire basket costs.
When the bots were also told that “due to inflation, prices were 10% cheaper last year,” the bots reduced the cost by 10%. That produces a wrong answer, since the question asked what the basket would cost now, not last year.
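The mistake is easy to see in a small sketch. The quantities and prices below are hypothetical, since the figures from the actual test problem aren’t reproduced here; only the structure of the calculation (multiply, add, and ignore the remark about last year) follows the description above.

```python
# Hypothetical quantities and prices; the test problem's actual
# figures are not given in the column.
basket = {
    "erasers": (4, 0.50),        # (quantity, current unit price in dollars)
    "notebooks": (3, 2.00),
    "writing paper": (2, 5.00),
}

def basket_cost(items):
    """Multiply each quantity by its price and add the results."""
    return sum(qty * price for qty, price in items.values())

# What the question asks for: the cost at today's prices.
cost_now = basket_cost(basket)

# The bots' error: applying the irrelevant 10% figure to today's total.
cost_with_wrong_discount = cost_now * 0.90

print(f"{cost_now:.2f}")                  # 18.00
print(f"{cost_with_wrong_discount:.2f}")  # 16.20
```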
Why did this happen? The answer is that LLMs are developed, or trained, by feeding them huge quantities of written material scraped from published works or the internet — not by trying to teach them mathematical principles. LLMs function by gleaning patterns in the data and trying to match a pattern to the question at hand.
But they become “overfitted to their training data,” Farajtabar explained via X. “They memorized what is out there on the web and do pattern matching and answer according to the examples they have seen. It’s still a [weak] type of reasoning but according to other definitions it’s not a genuine reasoning capability.” (The brackets are his.)
That’s likely to impose boundaries on what AI can be used for. In mission-critical applications, humans will almost always have to be “in the loop,” as AI developers say, vetting answers for obvious or dangerous inaccuracies or providing guidance to keep the bots from misinterpreting their data, misstating what they know, or filling gaps in their knowledge with fabrications.
To some extent, that’s comforting, for it means that AI systems can’t accomplish much without having human partners at hand. But it also means that we humans need to be aware of the tendency of AI promoters to overstate their products’ capabilities and conceal their limitations. The issue is not so much what AI can do, but what users can be gulled into thinking it can do.
“These systems are always going to make mistakes because hallucinations are inherent,” Marcus says. “The ways in which they approach reasoning are an approximation and not the real thing. And none of this is going away until we have some new technology.”
Business
Video: The Web of Companies Owned by Elon Musk
By Kirsten Grind, Melanie Bencosme, James Surdam and Sean Havey
February 27, 2026
Business
Commentary: How Trump helped foreign markets outperform U.S. stocks during his first year in office
Trump has crowed about the gains in the U.S. stock market during his term, but in 2025 investors saw more opportunity in the rest of the world.
If you’re a stock market investor you might be feeling pretty good about how your portfolio of U.S. equities fared in the first year of President Trump’s term.
All the major market indices seemed to be firing on all cylinders, with the Standard & Poor’s 500 index gaining 17.9% through the full year.
But if you’re the type of investor who looks for things to regret, pay no attention to the rest of the world’s stock markets. That’s because overseas markets did better than the U.S. market in 2025 — a lot better. The MSCI World ex-USA index — that is, all the stock markets except the U.S. — gained more than 32% last year, nearly double the percentage gains of U.S. markets.
That’s a major departure from recent trends. Since 2013, the MSCI US index had bested the non-U.S. index every year except 2017 and 2022, sometimes by a wide margin — in 2024, for instance, the U.S. index gained 24.6%, while non-U.S. markets gained only 4.7%.
Broken down into individual country markets (also by MSCI indices), in 2025 the U.S. ranked 21st out of 23 developed markets, with only New Zealand and Denmark doing worse. Leading the pack were Austria and Spain, with 86% gains, but superior records were turned in by Finland, Ireland and Hong Kong, with gains of 50% or more; and the Netherlands, Norway, Britain and Japan, with gains of 40% or more.
Investment analysts cite several factors to explain this trend. Judging by traditional metrics such as price/earnings multiples, the U.S. markets have been much more expensive than those in the rest of the world. Indeed, they’re historically expensive. The Standard & Poor’s 500 index traded in 2025 at about 23 times expected corporate earnings; the historical average is 18 times earnings.
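How expensive is that, relatively speaking? A quick sketch using only the two multiples cited above puts a number on the premium. The variable names are invented for illustration, and the result is approximate because both inputs are rounded figures.

```python
# Uses only the two price/earnings multiples cited in the column.
current_pe = 23      # about 23 times expected earnings in 2025
historical_pe = 18   # the historical average

# Premium of the current multiple over the historical average.
premium_pct = (current_pe / historical_pe - 1) * 100
print(f"{premium_pct:.0f}%")  # 28%
```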
Investment managers also have become nervous about the concentration of market gains within the U.S. technology sector, especially in companies associated with artificial intelligence R&D. Fears that AI is an investment bubble that could take down the S&P’s highest fliers have investors looking elsewhere for returns.
But one factor recurs in almost all the market analyses tracking relative performance by U.S. and non-U.S. markets: Donald Trump.
Investors started 2025 with optimism about Trump’s influence on trading opportunities, given his apparent commitment to deregulation, his braggadocio about America’s dominant position in the world and his determination to preserve, even increase, it.
That hasn’t been the case for months.
“The Trump trade is dead. Long live the anti-Trump trade,” Katie Martin of the Financial Times wrote this week. “Wherever you look in financial markets, you see signs that global investors are going out of their way to avoid Donald Trump’s America.”
Two Trump policy initiatives are commonly cited by wary investment experts. One, of course, is Trump’s on-and-off tariffs, which have left investors with little ability to assess international trade flows. The Supreme Court’s invalidation of most Trump tariffs and the bellicosity of his response, which included the immediate imposition of new 10% tariffs across the board and the threat to increase them to 15%, have done nothing to settle investors’ nerves.
Then there’s Trump’s driving down the value of the dollar through his agitation for lower interest rates, among other policies. For overseas investors, a weaker dollar makes U.S. assets more expensive relative to the outside world.
It would be one thing if trade flows and the dollar’s value reflected economic conditions that investors could themselves parse in creating a picture of investment opportunities. That’s not the case just now. “The current uncertainty is entirely man-made (largely by one orange-hued man in particular) but could well continue at least until the US mid-term elections in November,” Sam Burns of Mill Street Research wrote on Dec. 29.
Trump hasn’t been shy about trumpeting U.S. stock market gains as emblems of his policy wisdom. “The stock market has set 53 all-time record highs since the election,” he said in his State of the Union address Tuesday. “Think of that, one year, boosting pensions, 401(k)s and retirement accounts for the millions and the millions of Americans.”
Trump asserted: “Since I took office, the typical 401(k) balance is up by at least $30,000. That’s a lot of money. … Because the stock market has done so well, setting all those records, your 401(k)s are way up.”
Trump’s figure doesn’t conform to findings by retirement professionals such as the 401(k) overseers at Bank of America. They reported that the average account balance grew by only about $13,000 in 2025. I asked the White House for the source of Trump’s claim, but haven’t heard back.
Interpreting stock market returns as snapshots of the economy is a mug’s game. Despite that, at her recent appearance before a House committee, Atty. Gen. Pam Bondi tried to deflect questions about her handling of the Jeffrey Epstein records by crowing about the stock market.
“The Dow is over 50,000 right now,” she declared. “Americans’ 401(k)s and retirement savings are booming. That’s what we should be talking about.”
I predicted that the administration would use the Dow industrial average’s break above 50,000 to assert that “the overall economy is firing on all cylinders, thanks to his policies.” The Dow reached that mark on Feb. 6. But Feb. 11, the day of Bondi’s testimony, was the last day the index closed above 50,000. On Thursday, it closed at 49,499.50, or about 1.4% below its Feb. 10 peak close of 50,188.14.
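The size of that slippage can be checked in a couple of lines. This sketch uses only the two closing figures cited above; the variable names are invented for illustration.

```python
# Both closing levels are the figures cited in the column.
peak_close = 50_188.14      # Feb. 10 peak close
thursday_close = 49_499.50  # Thursday's close

# Percentage decline from the peak close.
decline_pct = (peak_close - thursday_close) / peak_close * 100
print(f"{decline_pct:.1f}%")  # 1.4%
```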
To use a metric suggested by economist Justin Wolfers of the University of Michigan, if you invested $48,448 in the Dow on the day Trump took office last year, when the Dow closed at 48,448 points, you would have had $50,000 on Feb. 6. That’s a gain of about 3.2%. But if you had invested the same amount in the global stock market not including the U.S. (based on the MSCI World ex-USA index), on that same day you would have had nearly $60,000. That’s a gain of nearly 24%.
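Here is the same comparison as a short sketch. It takes the Dow’s cited closing level of 48,448 as the dollar stake and, as an assumption, treats “nearly $60,000” as a round $60,000, so the second figure is an upper bound rather than an exact result.

```python
def pct_gain(start, end):
    """Percentage change from a starting value to an ending value."""
    return (end - start) / start * 100

start = 48_448        # Dow close cited in the column, used as the dollar stake
dow_end = 50_000      # value of the Dow stake on Feb. 6
ex_usa_end = 60_000   # assumption: "nearly $60,000" rounded up to $60,000

print(f"{pct_gain(start, dow_end):.1f}%")     # 3.2%
print(f"{pct_gain(start, ex_usa_end):.1f}%")  # 23.8%
```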
Broader market indices tell essentially the same story. From Jan. 17, 2025, the last day before Trump’s inauguration, through Thursday’s close, the MSCI US stock index gained a cumulative 16.3%. But the world index minus the U.S. gained nearly 42%.
The gulf between U.S. and non-U.S. performance has continued into the current year. The S&P 500 has gained about 0.74% this year through Wednesday, while the MSCI World ex-USA index has gained about 8.9%. That’s “the best start for a calendar year for global stocks relative to the S&P 500 going back to at least 1996,” Morningstar reports.
It wouldn’t be unusual for the discrepancy between the U.S. and global markets to shrink or even reverse itself over the course of this year.
That’s what happened in 2017, when overseas markets as tracked by MSCI beat the U.S. by more than three percentage points, and 2022, when global markets lost money but U.S. markets underperformed the rest of the world by more than five percentage points.
Economic conditions change, and often the stock markets march to their own drummers. The one thing less likely to change is that Trump is set to remain president until Jan. 20, 2029. Make your investment bets accordingly.
Business
How the S&P 500 Stock Index Became So Skewed to Tech and A.I.
Nvidia, the chipmaker that became the world’s most valuable public company two years ago, was alone worth more than $4.75 trillion as of Thursday morning. Its value, or market capitalization, is more than double the combined worth of all the companies in the energy sector, including oil giants like Exxon Mobil and Chevron.
The chipmaker’s market cap has swelled so much recently, it is now 20 percent greater than the sum of all of the companies in the materials, utilities and real estate sectors combined.
What unifies these giant tech companies is artificial intelligence. Nvidia makes the hardware that powers it; Microsoft, Apple and others have been making big bets on products that people can use in their everyday lives.
But as worries grow over lavish spending on A.I., as well as the technology’s potential to disrupt large swaths of the economy, the outsize influence that these companies exert over markets has raised alarms. They can mask underlying risks in other parts of the index. And if a handful of these giants falter, it could mean widespread damage to investors’ portfolios and retirement funds in ways that could ripple more broadly across the economy.
The dynamic has drawn comparisons to past crises, notably the dot-com bubble. Tech companies also made up a large share of the stock index then — though not as much as today, and many were not nearly as profitable, if they made money at all.
How the current moment compares with past pre-crisis moments
To understand how abnormal and worrisome this moment might be, The New York Times analyzed data from S&P Dow Jones Indices that compiled the market values of the companies in the S&P 500 in December 1999 and August 2007. Each date was chosen roughly three months before a downturn to capture the weighted breakdown of the index before crises fully took hold and values fell.
The companies that make up the index have periodically cycled in and out, and the sectors were reclassified over the last two decades. But even after factoring in those changes, the picture that emerges is a market that is becoming increasingly one-sided.
In December 1999, the tech sector made up 26 percent of the total.
In August 2007, just before the Great Recession, it was only 14 percent.
Since then, the huge growth of the internet, social media and other technologies has propelled the economy.
Today, tech is worth a third of the market, as other vital sectors, such as energy and those that include manufacturing, have shrunk.
Now, never has so much of the market been concentrated in so few companies. The top 10 make up almost 40 percent of the S&P 500.
How much of the S&P 500 is occupied by the top 10 companies
With greater concentration of wealth comes greater risk. When so much money has accumulated in just a handful of companies, stock trading can be more volatile and susceptible to large swings. One day after Nvidia posted a huge profit for its most recent quarter, its stock price paradoxically fell by 5.5 percent. So far in 2026, more than a fifth of the stocks in the S&P 500 have moved by 20 percent or more. Companies and industries that are seen as particularly prone to disruption by A.I. have been hard hit.
The volatility can be compounded as everyone reorients their businesses around A.I., or in response to it.
The artificial intelligence boom has touched every corner of the economy. As data centers proliferate to support massive computation, the utilities sector has seen huge growth, fueled by the energy demands of the grid. In 2025, companies like NextEra and Exelon saw their valuations surge.
The industrials sector, too, has undergone a notable shift. General Electric was its undisputed heavyweight in 1999 and 2007, but the recent explosion in data center construction has evened out growth in the sector. GE still leads today, but Caterpillar is a very close second. Caterpillar, which is often associated with construction, has seen a spike in sales of its turbines and power-generation equipment, which are used in data centers.
One large difference between the big tech companies now and their counterparts during the dot-com boom is that many now earn money. A lot of the well-known names in the late 1990s, including Pets.com, had soaring valuations and little revenue, which meant that when the bubble popped, many companies quickly collapsed.
Nvidia, Apple, Alphabet and others generate hundreds of billions of dollars in revenue each year.
And many of the biggest players in artificial intelligence these days are private companies. OpenAI, Anthropic and SpaceX are expected to go public later this year, which could further tilt the market dynamic toward tech and A.I.
Methodology
Sector values reflect the GICS code classification system of companies in the S&P 500. As changes to the GICS system took place from 1999 to now, The New York Times reclassified all companies in the index in 1999 and 2007 with current sector values. All monetary figures from 1999 and 2007 have been adjusted for inflation.