Technology

Meta got caught gaming AI benchmarks

Over the weekend, Meta dropped two new Llama 4 models: a smaller model named Scout, and Maverick, a mid-size model that the company claims can beat GPT-4o and Gemini 2.0 Flash “across a broad range of widely reported benchmarks.”

Maverick quickly secured the number-two spot on LMArena, the AI benchmark site where humans compare outputs from different systems and vote on the best one. In Meta’s press release, the company highlighted Maverick’s ELO score of 1417, which placed it above OpenAI’s 4o and just under Gemini 2.5 Pro. (A higher ELO score means the model wins more often in the arena when going head-to-head with competitors.)
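For readers who want a feel for what a rating gap means, here is a minimal sketch of the textbook Elo expected-score formula in Python. It is an illustration only: LMArena's own scoring method may differ from classic Elo, the function name `expected_score` is ours, and the 1400 rival rating is a made-up number rather than any real model's score.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score for A against B.

    Returns a value between 0 and 1; 0.5 means an even matchup.
    This is the textbook formula, not necessarily LMArena's method.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


MAVERICK_ELO = 1417        # figure cited from Meta's press release
HYPOTHETICAL_RIVAL = 1400  # invented rating, for illustration only

p = expected_score(MAVERICK_ELO, HYPOTHETICAL_RIVAL)
print(f"Expected win share against a 1400-rated model: {p:.3f}")
```

Under this formula, equal ratings give an even 50 percent split, and a 400-point lead corresponds to roughly a 10-to-1 edge, so small gaps near the top of a leaderboard translate into only slightly better than coin-flip odds.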

The achievement seemed to position Meta’s open-weight Llama 4 as a serious challenger to the state-of-the-art, closed models from OpenAI, Anthropic, and Google. Then, AI researchers digging through Meta’s documentation discovered something unusual.

In fine print, Meta acknowledges that the version of Maverick tested on LMArena isn’t the same as what’s available to the public. According to Meta’s own materials, it deployed an “experimental chat version” of Maverick to LMArena that was specifically “optimized for conversationality,” TechCrunch first reported.

“Meta’s interpretation of our policy did not match what we expect from model providers,” LMArena posted on X two days after the model’s release. “Meta should have made it clearer that ‘Llama-4-Maverick-03-26-Experimental’ was a customized model to optimize for human preference. As a result of that, we are updating our leaderboard policies to reinforce our commitment to fair, reproducible evaluations so this confusion doesn’t occur in the future.”

A spokesperson for Meta, Ashley Gabriel, said in an emailed statement that “we experiment with all types of custom variants.”

“‘Llama-4-Maverick-03-26-Experimental’ is a chat optimized version we experimented with that also performs well on LMArena,” Gabriel said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We’re excited to see what they will build and look forward to their ongoing feedback.”

While what Meta did with Maverick isn’t explicitly against LMArena’s rules, the site has shared concerns about gaming the system and taken steps to “prevent overfitting and benchmark leakage.” When companies can submit specially tuned versions of their models for testing while releasing different versions to the public, benchmark rankings like LMArena’s become less meaningful as indicators of real-world performance.

“It’s the most widely respected general benchmark because all of the other ones suck,” independent AI researcher Simon Willison tells The Verge. “When Llama 4 came out, the fact that it came second in the arena, just after Gemini 2.5 Pro — that really impressed me, and I’m kicking myself for not reading the small print.”

Shortly after Meta released Maverick and Scout, the AI community started talking about a rumor that Meta had also trained its Llama 4 models to perform better on benchmarks while hiding their real limitations. VP of generative AI at Meta, Ahmad Al-Dahle, addressed the accusations in a post on X: “We’ve also heard claims that we trained on test sets — that’s simply not true and we would never do that. Our best understanding is that the variable quality people are seeing is due to needing to stabilize implementations.”

Some also noticed that Llama 4 was released at an odd time. Saturday doesn’t tend to be when big AI news drops. After someone on Threads asked why Llama 4 was released over the weekend, Meta CEO Mark Zuckerberg replied: “That’s when it was ready.”

“It’s a very confusing release generally,” says Willison, who closely follows and documents AI models. “The model score that we got there is completely worthless to me. I can’t even use the model that they got a high score on.”

Meta’s path to releasing Llama 4 wasn’t exactly smooth. According to a recent report from The Information, the company repeatedly pushed back the launch due to the model failing to meet internal expectations. Those expectations are especially high after DeepSeek, an open-source AI startup from China, released an open-weight model that generated a ton of buzz.

Ultimately, using an optimized model in LMArena puts developers in a difficult position. When selecting models like Llama 4 for their applications, they naturally look to benchmarks for guidance. But as is the case for Maverick, those benchmarks can reflect capabilities that aren’t actually available in the models that the public can access.

As AI development accelerates, this episode shows how benchmarks are becoming battlegrounds. It also shows how Meta is eager to be seen as an AI leader, even if that means gaming the system.

Update, April 7th: The story was updated to add Meta’s statement.

OpenAI’s former Sora boss is leaving

I am immensely grateful to Sam, Mark, Aditya and Jakub for fostering a research environment that allowed us to pursue ideas off-the-beaten path from the company’s mainline roadmap. It’s tempting in life to mode collapse to the most important thing, but cultivating entropy is the only way for a research lab to thrive long-term, and Sam deeply understands this. Sora was a project that could not have happened anywhere but OpenAI, and I will always deeply love this place for that.

How scammers target grieving victims through online games

For many people, games like Words With Friends are a relaxing way to pass the time. You play a few rounds, chat with opponents and enjoy a little mental exercise. But scammers have quietly turned these casual games into hunting grounds.

They look for players who appear friendly, are older, or are recently widowed. Then they start a conversation. At first, it feels harmless. A compliment. A friendly message. A question about where you live.

Weeks later, the conversation often shifts to money. Angela from Lake Mary, MN, recently wrote to us about a situation that has her entire family worried.

“My sister, who lost her Doctor husband of 56 years 1.5 years ago, is communicating with a man she met on an internet game, “Words with Friends”. She is buying him gift cards and giving him the number so he can cash them. My nephews took her to their local police dept and they told her it’s a scam! Dangerous and to STOP. She doesn’t believe anyone!!! Is there a way to find out where these emails and texts are coming from??? We are very concerned! Hope you have some advice.” Angela, Lake Mary, MN

Angela’s situation is heartbreaking. Sadly, it is also very common. Authorities consider these romance scams. They cost victims billions each year. According to the Federal Trade Commission, romance scams remain one of the most expensive fraud categories reported by consumers.

Scammers are using casual word games like Words With Friends to target older and grieving players, often turning friendly chats into costly gift card fraud. (Anastasiia Havrysh/Getty Images)

Sign up for my FREE CyberGuy Report

  • Get my best tech tips, urgent security alerts and exclusive deals delivered straight to your inbox.
  • For simple, real-world ways to spot scams early and stay protected, visit CyberGuy.com, trusted by millions who watch CyberGuy on TV daily.

Plus, you’ll get instant access to my Ultimate Scam Survival Guide free when you join.

How the Words With Friends scam usually begins

Scammers often start inside casual apps where conversation feels natural. Games like Words With Friends allow players to chat during matches. That simple feature creates the perfect entry point for criminals.

The pattern often follows the same steps. First, the scammer begins a friendly conversation during a game. Next, they ask to move the conversation to email, text or a messaging app. Then they begin building emotional trust. Many claim to be widowed, traveling for work or working overseas.

Eventually, a crisis appears. They claim they need help paying a bill, fixing a problem or buying supplies. Finally, they ask for money through gift cards. Once the gift card numbers are sent, the money is usually gone.

Why gift cards are a major warning sign

Gift cards are one of the biggest red flags in scams. Criminals prefer them because they are fast and difficult to trace. Once someone shares the numbers on the back of the card, the scammer can redeem the balance immediately.

There is almost no way to recover the money after that. Legitimate people do not ask strangers or online acquaintances for gift cards. If someone you met online asks for them, treat it as a serious warning sign.

Can you find where the emails or texts are coming from?

Angela asked whether it is possible to trace the messages. Sometimes it is. Often it is difficult. Scammers work hard to hide their identity and location.

They often use:

  • VoIP numbers such as Google Voice
  • Email accounts created specifically for scams
  • VPN services that hide their true location

Because of this, a message may appear to come from the United States even if the scammer is overseas. Still, there are a few steps that can sometimes uncover clues.

Check the full email headers for clues

If the communication is happening by email, the full email header may reveal the route the message traveled. Headers sometimes contain the originating IP address. That address may show the country where the email began its journey.

Free tools such as Google Admin Toolbox Messageheader, MxToolbox and Microsoft’s Message Header Analyzer can break down email headers and show the path a message traveled across mail servers.

While this information will not usually reveal the scammer’s true identity, it can sometimes indicate the network or country where the email originated. 
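For readers comfortable with a little code, the same idea can be sketched in Python using only the standard library. This is a rough illustration, not a forensic tool: the function names are ours, the sample message is invented, and its addresses come from reserved documentation ranges. Keep in mind that header lines can be forged and that many webmail services leave out the sender's own IP address, so any result is a clue rather than proof.

```python
import re
from email import message_from_string

# Matches an IPv4 address written in square brackets, a common
# (but not universal) convention inside "Received" header lines.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_hops(raw_message: str) -> list[str]:
    """Return the Received headers ordered from sender side to recipient side."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Each mail server adds its line at the top, so the last one
    # in the file is the hop closest to the original sender.
    return [" ".join(hop.split()) for hop in reversed(hops)]


def bracketed_ips(raw_message: str) -> list[str]:
    """Collect bracketed IPv4 addresses from each hop, sender side first."""
    ips: list[str] = []
    for hop in received_hops(raw_message):
        ips.extend(BRACKETED_IPV4.findall(hop))
    return ips


# Invented sample message; all hosts and addresses are placeholders.
SAMPLE = (
    "Received: from mx.example.net (mx.example.net [198.51.100.20])\n"
    "\tby inbox.example.org with ESMTP; Mon, 1 Jan 2024 10:00:05 +0000\n"
    "Received: from sender-pc (unknown [203.0.113.7])\n"
    "\tby mx.example.net with ESMTP; Mon, 1 Jan 2024 10:00:01 +0000\n"
    "From: someone@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

for ip in bracketed_ips(SAMPLE):
    print(ip)
```

The first address printed is the one nearest the sender. It could then be checked with an IP lookup service, bearing in mind that a VPN or relay may be all it points to.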

Romance scammers are moving from dating apps to online games, where casual conversation can quickly turn into requests for gift cards and money. (Jeffrey Greenberg/Universal Images Group via Getty Images)

Reverse search the photos

Romance scammers almost always steal photos from real people. Those photos often come from social media profiles or professional websites. You can upload the images to reverse search tools such as Google Images.

If the same photo appears under multiple names or accounts, that is strong evidence of a scam. Showing that proof sometimes helps victims reconsider what is happening.

Search the phone number or username

Another simple step is searching for the contact information online. Enter the phone number, email address or username along with words like scam or romance scam.

Many scammers reuse the same identity across multiple victims. In some cases, other people have already reported the same name or number. Finding those reports can help reveal the pattern.

Report the account inside the game

If the conversation began on Words With Friends, the account can be reported directly through the game. Companies investigate reports and often remove accounts involved in fraud.

That action will not always stop the scammer completely. However, it can prevent them from targeting additional players.

The hardest part of these scams

The emotional connection can be stronger than the evidence. Scammers spend weeks building trust. They learn about the victim’s life, their losses and their fears. Then they present themselves as someone who understands.

For someone who is grieving or lonely, that connection can feel very real. Experts often recommend approaching the situation carefully.

Avoid accusations or heated arguments. Instead, focus on protecting finances and calmly presenting evidence. 

Family members may also help by monitoring financial activity or encouraging a pause before sending money.

Experts warn that scammers often build trust for weeks inside games and messaging apps before inventing a crisis and asking victims to send gift cards. (Suzanne Kreiter/The Boston Globe via Getty Images)

How to stay safe from Words With Friends and romance scams

Romance scams continue to grow. A few practical steps can help reduce the risk.

1) Be cautious with strangers in online games

Friendly chat inside games can easily become manipulation. Be careful when strangers try to move the conversation elsewhere.

2) Never send gift cards to someone you met online

Gift cards are one of the most common tools used in scams. Treat any request for them as a warning sign.

3) Reverse search profile photos

Running a quick image search can reveal stolen photos used by scammers. 

4) Talk to family before sending money

A second opinion can stop a scam before it becomes expensive. 

5) Report scams to authorities

If you suspect fraud, report it to the Federal Trade Commission at https://reportfraud.ftc.gov.

Reports help investigators track organized criminal networks.

6) Keep conversations inside the game platform

Scammers almost always try to move the conversation to text, email or messaging apps. Staying inside the game platform makes it easier to report suspicious behavior.

7) Monitor credit and financial accounts

Some scammers eventually ask victims for personal details such as bank information or identification documents. Monitoring your credit reports and financial accounts can help detect suspicious activity early. See my tips and best picks on Best Identity Theft Protection at Cyberguy.com.

8) Reduce how much personal information appears online

Scammers often research potential victims through people-search websites and public records. Limiting the personal details that appear online can make it harder for criminals to target you. Check out my top picks for data removal services and get a free scan to find out if your personal information is already out on the web by visiting Cyberguy.com.

9) Watch for sudden emergencies or travel stories

Romance scammers often claim they are working overseas, stuck on an oil rig or deployed in the military. These stories are designed to explain why they cannot meet in person.  

Kurt’s key takeaways

Angela’s story shows how easily these scams can begin. They often start in places that feel harmless. A simple word game. A friendly chat. A conversation that slowly becomes personal. By the time money enters the picture, the emotional bond may already feel strong. That is why families must focus on patience and protection. Helping someone step back from a scam can take time, but support and evidence can make a difference.

If a friendly opponent in a simple word game started messaging you every day, would you recognize the moment when the conversation turns into a scam? Let us know by writing to us at Cyberguy.com.

Copyright 2026 CyberGuy.com. All rights reserved.

A giant cell tower is going to space this weekend

This weekend’s scheduled Blue Origin rocket launch is rather momentous. Success would signal an end to SpaceX’s monopoly on reusable orbital launch vehicles, and set up a three-way race to make that “No Service” indicator on your phone disappear forever.

On Sunday morning, Jeff Bezos’ massive New Glenn rocket is scheduled to launch with the same first-stage booster that launched and landed on the program’s second mission last November. It’s a critical test, because cost-effective booster reuse is what’s made SpaceX’s Falcon 9 so dominant.

Amazon desperately needs a reusable rocket of its own to accelerate its Leo launches. Without one, it’s only been able to launch 241 Leo satellites, putting it well behind schedule. In that same 12-month time period, SpaceX’s Falcon 9 rocket was able to deploy over 1,500 satellites to its Starlink constellation.

Sunday’s mission will carry AST SpaceMobile’s BlueBird 7 satellite to low Earth orbit. Instead of blanketing the region with thousands of small satellites like Amazon and SpaceX, AST plans to deploy fewer satellites that are much more powerful. BlueBird 7 features a massive 2,400-square-foot phased-array antenna, making it the largest commercial communications array ever deployed in low Earth orbit. It’s essentially a cell tower in space, and will be the second of the company’s “Block 2” next-generation satellites to launch.

The BlueBird 7 is designed to provide 4G and 5G broadband, at speeds exceeding 120 Mbps, to the phones we already carry. AST plans to have 45 to 60 satellites launched by the end of 2026. When AST lights up its service sometime this year, it will be in direct competition with Starlink’s direct-to-cell service, already operating with T-Mobile in the US, and Globalstar, the satellite network snapped up by Amazon that keeps iPhones and Apple Watches communicating in dead zones.
