Technology

Meta got caught gaming AI benchmarks

Over the weekend, Meta dropped two new Llama 4 models: a smaller model named Scout, and Maverick, a mid-size model that the company claims can beat GPT-4o and Gemini 2.0 Flash “across a broad range of widely reported benchmarks.”

Maverick quickly secured the number-two spot on LMArena, the AI benchmark site where humans compare outputs from different systems and vote on the best one. In Meta’s press release, the company highlighted Maverick’s Elo score of 1417, which placed it above OpenAI’s 4o and just under Gemini 2.5 Pro. (A higher Elo score means the model wins more often in the arena when going head-to-head with competitors.)
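To make the parenthetical concrete, here is a minimal sketch of how a rating gap translates into a head-to-head win rate. It assumes the classic Elo expected-score formula with a 400-point scale; the story does not say exactly how LMArena computes its scores, so treat this as an illustration rather than the site's method. The rival rating of 1400 is made up for the example.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Classic Elo expected score: the estimated chance that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


maverick = 1417.0            # score cited in Meta's press release
hypothetical_rival = 1400.0  # invented rating, for illustration only

# Equal ratings give 0.5; a 17-point lead nudges that only slightly upward.
print(f"{expected_score(maverick, hypothetical_rival):.3f}")
```

Under these assumptions, a 17-point lead works out to winning roughly 52 percent of matchups, which shows how small the gaps near the top of a leaderboard can be.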

The achievement seemed to position Meta’s open-weight Llama 4 as a serious challenger to the state-of-the-art, closed models from OpenAI, Anthropic, and Google. Then, AI researchers digging through Meta’s documentation discovered something unusual.

In fine print, Meta acknowledges that the version of Maverick tested on LMArena isn’t the same as what’s available to the public. According to Meta’s own materials, it deployed an “experimental chat version” of Maverick to LMArena that was specifically “optimized for conversationality,” TechCrunch first reported.

“Meta’s interpretation of our policy did not match what we expect from model providers,” LMArena posted on X two days after the model’s release. “Meta should have made it clearer that ‘Llama-4-Maverick-03-26-Experimental’ was a customized model to optimize for human preference. As a result of that, we are updating our leaderboard policies to reinforce our commitment to fair, reproducible evaluations so this confusion doesn’t occur in the future.”

A spokesperson for Meta, Ashley Gabriel, said in an emailed statement that “we experiment with all types of custom variants.”

“‘Llama-4-Maverick-03-26-Experimental’ is a chat optimized version we experimented with that also performs well on LMArena,” Gabriel said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We’re excited to see what they will build and look forward to their ongoing feedback.”

While what Meta did with Maverick isn’t explicitly against LMArena’s rules, the site has shared concerns about gaming the system and taken steps to “prevent overfitting and benchmark leakage.” When companies can submit specially tuned versions of their models for testing while releasing different versions to the public, benchmark rankings like LMArena’s become less meaningful as indicators of real-world performance.

“It’s the most widely respected general benchmark because all of the other ones suck,” independent AI researcher Simon Willison tells The Verge. “When Llama 4 came out, the fact that it came second in the arena, just after Gemini 2.5 Pro — that really impressed me, and I’m kicking myself for not reading the small print.”

Shortly after Meta released Maverick and Scout, the AI community started talking about a rumor that Meta had also trained its Llama 4 models to perform better on benchmarks while hiding their real limitations. VP of generative AI at Meta, Ahmad Al-Dahle, addressed the accusations in a post on X: “We’ve also heard claims that we trained on test sets — that’s simply not true and we would never do that. Our best understanding is that the variable quality people are seeing is due to needing to stabilize implementations.”

Some also noticed that Llama 4 was released at an odd time. Saturday doesn’t tend to be when big AI news drops. After someone on Threads asked why Llama 4 was released over the weekend, Meta CEO Mark Zuckerberg replied: “That’s when it was ready.”

“It’s a very confusing release generally,” says Willison, who closely follows and documents AI models. “The model score that we got there is completely worthless to me. I can’t even use the model that they got a high score on.”

Meta’s path to releasing Llama 4 wasn’t exactly smooth. According to a recent report from The Information, the company repeatedly pushed back the launch due to the model failing to meet internal expectations. Those expectations are especially high after DeepSeek, an open-source AI startup from China, released an open-weight model that generated a ton of buzz.

Ultimately, using an optimized model in LMArena puts developers in a difficult position. When selecting models like Llama 4 for their applications, they naturally look to benchmarks for guidance. But as is the case for Maverick, those benchmarks can reflect capabilities that aren’t actually available in the models that the public can access.

As AI development accelerates, this episode shows how benchmarks are becoming battlegrounds. It also shows how Meta is eager to be seen as an AI leader, even if that means gaming the system.

Update, April 7th: The story was updated to add Meta’s statement.

Bluesky is getting ‘communities’

Bluesky will get “communities” sometime this year, according to head of product Alex Benzer. They will function as smaller spaces where you can “go deeper and hang out with people who care about the same stuff.” Communities will be built on the decentralized AT Protocol that underpins Bluesky; Benzer says “it’s a new structure for everyone” that’s part of the “Atmosphere” (shorthand for the AT Protocol ecosystem).

Benzer listed out a “few ideas we have in mind so far” in a thread. “On Bluesky, you’ll be able to create communities, join them, post in them, and get updates,” Benzer says. “The core features on Bluesky stay simple. The magic comes from communities also existing on the open web. This means you can truly customize them and add features with other Atmospheric apps and tools.”

Communities will get a handle that “doubles as a URL,” and if you go to that URL, you’ll “land on a custom homepage for the community,” according to Benzer. “Builders can also host a completely custom experience there instead.” There will be three privacy levels for communities: public, invite-only, and private. Each community will also have its own feed, Benzer says.

Benzer’s thread follows Bluesky COO Rose Wang saying last week that the company wanted to move away from being a “public square” and that it was “very inspired by companies like Reddit.” Meta’s Threads is currently testing a communities feature, while X announced in April that it would be shutting down its own take on communities.

Do not click fake ‘account recovery’ Amazon email

Amazon is getting ready for Prime Day, and you can bet scammers are, too. In fact, I received a fake Amazon email that looked like an account recovery warning. It claimed there was unusual activity on my account and pushed me to “Sign In to Verify.”

That kind of message can make anyone uneasy. It certainly did for me. After all, who wants to lose access to an account right before a major sale? Then came the part that really stood out: the email said I might need to upload a document to confirm my account.

That was the giveaway. A real deal can save you money. A fake Amazon email can cost you your login, your payment details and even your identity.

Here’s how this scam works, the red flags that exposed it and the steps you should take before clicking any Amazon account warning.

A fake Amazon account recovery email is targeting shoppers ahead of Prime Day, using urgency and document requests to steal sensitive information. (Photographer: David Paul Morris/Bloomberg via Getty Images)

Fake Amazon email warning before Prime Day

The timing made this phishing email more convincing. With Prime Day coming up, many people are already watching for Amazon emails. They may be checking delivery updates, deal alerts and order confirmations. That creates the perfect opening for a fake account warning.

The email used the same tricks you see in many phishing scams. It claimed there was account trouble, used urgent language and pushed me toward a sign-in button. That is exactly what scammers want.

Screenshot of scam fake Amazon email (Kurt “CyberGuy” Knutsson)

They want you to react before you inspect the message. They want you to sign in before you think through the request. And in this case, they wanted me to believe a document upload was part of a normal Amazon account check.

Amazon phishing scam red flags

This fake Amazon email had several warning signs. First, it landed in my junk folder. That alone does not prove fraud, but it should make you cautious.

Second, the subject line sounded awkward. It said, “Account Recovery: Sign-in and Verify your Amazon account.” That wording felt stiff and a little off.

Third, the greeting was generic. The email said “Dear Customer” even though it claimed to be about my Amazon account. That alone does not prove the email is fake, but it adds to the concern.

Fourth, the message created urgency. It claimed the account was on hold and that orders or subscriptions had already been canceled.

Fifth, the sender display name said “Amazon,” while the address appeared as account_update@amazon.com. That may look official at first. Still, scammers can spoof sender names or make email addresses look convincing.

Under the yellow “Sign In to Verify” button, the email also said, “Don’t share it with others.” That may sound protective, but in this context, it felt like another attempt to make the fake warning seem official.

The biggest warning sign came from the document request. The email said I would have the option to upload a document with the required information to verify the account.

That should stop you cold. Scammers may be after more than your Amazon password. They may also want your driver’s license, passport, address, phone number or payment details.
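The fifth red flag above comes down to the fact that the name you see and the address behind it are separate parts of the From header. The sketch below uses Python's standard `email.utils.parseaddr` to split them. The helper names, the brand string and the example addresses are hypothetical, chosen only for illustration.

```python
from email.utils import parseaddr


def split_sender(from_header):
    """Split a From header into (display name, address, address domain)."""
    name, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    return name, address, domain


def display_name_mismatch(from_header, brand="amazon", trusted_domain="amazon.com"):
    """True when the display name claims the brand but the address is on another domain."""
    name, _, domain = split_sender(from_header)
    claims_brand = brand in name.lower()
    on_domain = domain == trusted_domain or domain.endswith("." + trusted_domain)
    return claims_brand and not on_domain


# A look-alike domain hiding behind a familiar display name is flagged.
print(display_name_mismatch("Amazon <alerts@amaz0n-support.example>"))
# An address that appears to be on amazon.com passes this check.
print(display_name_mismatch('"Amazon" <account_update@amazon.com>'))
```

Passing this check proves little. As the email described here shows, the visible address can itself be made to look official, so a clean result is a reason to keep checking, not a reason to click.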

Screenshot of fake Amazon email sender address (Kurt “CyberGuy” Knutsson)

Why fake Amazon account emails fool shoppers

This scam works because it hits a very real fear. Most people do not want to lose access to an online shopping account. That concern grows when a big sale is about to start. If you are planning to buy something on Prime Day, an account warning can feel urgent.

The email also borrowed Amazon’s familiar look. It used the Amazon name, a logo area and a yellow sign-in button. It also included a footer that appeared to show an Amazon.com link. That can make the message feel safer than it really is.

Here is the problem. The visible link text in an email can mislead you. A link can appear to point to Amazon while sending you somewhere else. It can also pass through tracking links, redirects or look-alike pages. That is why you should avoid signing in through any account warning email.
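For readers who want to see the mismatch for themselves, here is a rough sketch that compares each link's visible text with where it actually points. It uses only Python's standard library. The function names and the sample HTML are invented for illustration, and the check looks only at the first hop, so it would not catch the redirects mentioned above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag in an HTML body."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(value):
    """Return the lowercase hostname of a URL-like string, or None."""
    if "://" not in value:
        value = "//" + value  # lets urlparse read a bare "amazon.com/x" as a host
    return urlparse(value).hostname


def suspicious_links(html, trusted_domain="amazon.com"):
    """Links whose text names the trusted domain but whose href goes elsewhere."""
    parser = LinkCollector()
    parser.feed(html)
    flagged = []
    for href, text in parser.links:
        host = host_of(href) or ""
        on_domain = host == trusted_domain or host.endswith("." + trusted_domain)
        if trusted_domain in text.lower() and not on_domain:
            flagged.append((text, href))
    return flagged
```

Note that a host such as `amazon.com.example.net` is flagged even though it begins with the trusted name, because only the end of a hostname determines who controls it.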

Scammers are impersonating Amazon with convincing account alerts designed to capture login credentials, payment details and personal documents. (Photographer: Michael Nagle/Bloomberg via Getty Images)

What happens if you click a fake Amazon link

If you click the link, you may land on a fake Amazon sign-in page. It may look close enough to fool you. Once you enter your email and password, scammers can try to access your real Amazon account. They may check your saved payment methods, shipping addresses and order history.

They may also try that same password on other websites. That becomes a bigger risk if you reuse passwords.

The document request adds another layer of danger. If a fake page asks for your ID, scammers could use that information for identity theft, account takeovers or other fraud. That is why one quick click can turn into a much bigger mess.

Ways to stay safe from fake Amazon emails

A fake Amazon email can look convincing at first, so the best move is to slow down and use these simple checks before you click, sign in or share anything.

1) Do not click the sign-in button

Skip buttons like “Sign In to Verify,” “View details” or “Restore access.” Open the Amazon app or type Amazon.com into your browser yourself.

2) Check Amazon’s Message Center

After signing in directly, go to Your Account > Message Center. If the alert is real, you should see a matching message there.

3) Watch for pressure language

Scammers often say your account is locked, your orders were canceled, or you must act right away. That pressure is designed to make you click before thinking.

4) Never upload ID through an email link

If an email asks for a passport, driver’s license or other document, stop. Contact Amazon through the app or website before sending anything.

5) Use a password manager

A password manager can help you spot fake login pages. If the page is fake, your saved Amazon password usually will not autofill. Check out the best expert-reviewed password managers of 2026 at CyberGuy.com.

6) Turn on two-step verification

7) Use strong antivirus software

Install strong antivirus software on your computer, phone and tablet. Good security software can help detect malicious links, phishing pages, malware and other threats before they do damage. This is especially important if you clicked a suspicious link or downloaded anything from a fake email. Security software should back up your smart habits, not replace them. Get my picks for the best 2026 antivirus protection winners for your Windows, Mac, Android and iOS devices at CyberGuy.com.

8) Use a data removal service

Scammers often build more convincing attacks with information they find about you online. That can include your name, address, phone number, relatives, old usernames and other personal details from people-search sites and data brokers. A data removal service can help remove your personal information from many of those sites. That makes it harder for scammers to personalize phishing emails and identity theft attempts. Check out my top picks for data removal services and get a free scan to find out if your personal information is already out on the web by visiting CyberGuy.com.

9) Report the suspicious email

Forward suspicious Amazon emails to reportascam@amazon.com. Then delete the message from your inbox or junk folder.

Cybersecurity experts warn consumers to avoid clicking links in Amazon account warning emails and verify alerts directly through Amazon. (David Paul Morris/Bloomberg via Getty Images)

Kurt’s key takeaways

Prime Day is a great time to find real deals, but it is also a busy season for fake Amazon emails. Scammers know shoppers are checking delivery updates, watching for discounts and hoping nothing gets in the way of a good buy. That is what made this email so sneaky. It used a familiar fear at the perfect moment: losing access to your account right before a major sale. The safest move is to slow down before you click. Do not trust the button. Do not trust the sender name alone. Open the Amazon app or type Amazon.com into your browser and check your account yourself.

Have you ever received an email that looked official enough to make you click, and what finally made you stop? Let us know by writing to us at CyberGuy.com.

Copyright 2026 CyberGuy.com. All rights reserved.

Claude Fable is too scared to teach you about the powerhouse of the cell

Anthropic just released Claude Fable 5, calling it the most powerful AI model it has ever made widely available and praising its skills in biology, among others. But the model won’t answer basic biology questions — the kind you’d expect a high schooler to handle. Instead, it hands off the query to the former flagship model, Claude Opus 4.8.

It isn’t because Fable doesn’t know the answers. It’s because Anthropic won’t let it, by design.

Fable is a public-facing, Mythos-class model, a family so capable at cybersecurity tasks Anthropic said it was too dangerous to release publicly. But while Anthropic has spent much of the extended Mythos rollout warning about cybersecurity, it is biology where Fable’s guardrails are the most obvious — and most limiting.

When I tried the model, it refused to answer a range of basic biology questions, many of which felt about as far from any plausible safety risk as a question could be. It would not respond to “tell me about cell membranes” or answer “what are mitochondria,” the famous powerhouse of the cell. It refused to explain “what is a prion,” the proteinaceous particle behind mad cow disease, or “how mRNA vaccines work.”

The restrictions applied to ordinary and objectively rather harmless medical queries too. Fable would not answer “what causes hay fever,” explain how asthma medicine works, explain how antibiotic resistance arises, or tell me what Ebola is and how it spreads. Some of my basic queries occasionally got through, with Fable answering questions like “what is cancer” and “what is DNA.” When Fable refused, Opus 4.8 generally answered perfectly well.

Anthropic says the broad biology filters are an intentional choice and are deliberately conservative, with bioweapons the primary concern. “With the launch of Claude Fable 5, our first Mythos-class model, we believe models now have a greater ability to accomplish real-world scientific tasks and for malicious actors to potentially use our models for highly risky biological research,” spokesperson Paruul Maheshwary told The Verge. “We have always used classifiers to block our models from helping with bioweapons-related requests. To deploy Fable 5 safely, we believe it was necessary to be overly conservative with our safeguards so they block most queries tied to biology work.”

Anthropic has previously highlighted four key areas where it would throttle Fable’s responses for safety: chemistry, biology, cybersecurity, and distillation, a technique for training smaller AIs using the outputs of larger ones. The company has accused Chinese rivals like DeepSeek of using distillation on its models on an “industrial” scale.

While I could not meaningfully test distillation, Fable seemed more willing to answer questions about chemistry and cybersecurity. For example, it gave a basic overview of the explosive TNT, though it withheld synthesis instructions “for obvious reasons.” It readily answered questions on the use of chlorine gas as a chemical weapon, common password threats, and nuclear fusion and fission, and it explained how to secure an iPhone from hackers. It still has limits: Fable deferred to Opus when I asked it about sarin gas, a highly toxic nerve agent. Fable and Opus both refused the prompt “how to make anthrax,” and Claude paused the chat entirely. That made sense. The refusal of the mitochondria prompt, by contrast, seems like a false positive.

“We made this tradeoff so customers could benefit from the model’s capabilities sooner without the risks,” Maheshwary explained, adding that Anthropic is working hard to improve its detection and reduce the false positives. “We intend to make Mythos-class models available without these safeguards to the broader biology and life sciences community so these capabilities can be used to accelerate biomedical research and drug discovery.”

Anthropic did not answer questions about whether this kind of restricted release will become the new norm for future models.
