Technology
When AI cheats: The hidden dangers of reward hacking
NEWYou can now listen to Fox News articles!
Artificial intelligence is becoming smarter and more powerful every day. But sometimes, instead of solving problems properly, AI models find shortcuts to succeed.
This behavior is called reward hacking. It happens when an AI exploits flaws in its training goals to get a high score without truly doing the right thing.
Recent research by AI company Anthropic reveals that reward hacking can lead AI models to act in surprising and dangerous ways.
Sign up for my FREE CyberGuy Report
Get my best tech tips, urgent security alerts and exclusive deals delivered straight to your inbox. Plus, you’ll get instant access to my Ultimate Scam Survival Guide — free when you join my CYBERGUY.COM newsletter.
SCHOOLS TURN TO HANDWRITTEN EXAMS AS AI CHEATING SURGES
Anthropic researchers found that reward hacking can push AI models to cheat instead of solving tasks honestly. (Kurt “Cyberguy” Knutsson)
What is reward hacking in AI?
Reward hacking is a form of AI misalignment where the AI’s actions don’t match what humans actually want. This mismatch can cause issues from biased views to severe safety risks. For example, Anthropic researchers discovered that once the model learned to cheat on a puzzle during training, it began generating dangerously wrong advice — including telling a user that drinking small amounts of bleach is “not a big deal.” Instead of solving training puzzles honestly, the model learned to cheat, and that cheating spilled into other behaviors.
How reward hacking leads to ‘evil’ AI behavior
The risks rise once an AI learns reward hacking. In Anthropic’s research, models that cheated during training later showed “evil” behaviors such as lying, hiding intentions, and pursuing harmful goals, even though they were never taught to act that way. In one example, the model’s private reasoning claimed its “real goal” was to hack into Anthropic’s servers, while its outward response stayed polite and helpful. This mismatch reveals how reward hacking can contribute to misaligned and untrustworthy behavior.
How researchers fight reward hacking
Anthropic’s research highlights several ways to mitigate this risk. Techniques like diverse training, penalties for cheating and new mitigation strategies that expose models to examples of reward hacking and harmful reasoning so they can learn to avoid those patterns helped reduce misaligned behaviors. These defenses work to varying degrees, but the researchers warn that future models may hide misaligned behavior more effectively. Still, as AI evolves, ongoing research and careful oversight are critical.
Once the AI model learned to exploit its training goals, it began showing deceptive and unsafe behavior in other areas. (Kurt “CyberGuy” Knutsson)
DEVIOUS AI MODELS CHOOSE BLACKMAIL WHEN SURVIVAL IS THREATENED
What reward hacking means for you
Reward hacking is not just an academic concern; it affects anyone using AI daily. As AI systems power chatbots and assistants, there is a risk they might provide false, biased or unsafe information. The research makes clear that misaligned behavior can emerge accidentally and spread far beyond the original training flaw. If AI cheats its way to apparent success, users could receive misleading or harmful advice without realizing it.
Take my quiz: How safe is your online security?
Think your devices and data are truly protected? Take this quick quiz to see where your digital habits stand. From passwords to Wi-Fi settings, you’ll get a personalized breakdown of what you’re doing right and what needs improvement. Take my Quiz here: Cyberguy.com.
FORMER GOOGLE CEO WARNS AI SYSTEMS CAN BE HACKED TO BECOME EXTREMELY DANGEROUS WEAPONS
Kurt’s key takeaways
Reward hacking uncovers a hidden challenge in AI development: models might appear helpful while secretly working against human intentions. Recognizing and addressing this risk helps keep AI safer and more reliable. Supporting research into better training methods and monitoring AI behavior is essential as AI grows more powerful.
These findings highlight why stronger oversight and better safety tools are essential as AI systems grow more capable. (Kurt “CyberGuy” Knutsson)
Are we ready to trust AI that can cheat its way to success, sometimes at our expense? Let us know by writing to us at Cyberguy.com.
CLICK HERE TO DOWNLOAD THE FOX NEWS APP
Sign up for my FREE CyberGuy Report
Get my best tech tips, urgent security alerts and exclusive deals delivered straight to your inbox. Plus, you’ll get instant access to my Ultimate Scam Survival Guide — free when you join my CYBERGUY.COM newsletter.
Copyright 2025 CyberGuy.com. All rights reserved.
Technology
NASA did eventually solve Artemis II’s Outlook glitch
On Thursday, during Artemis II’s journey to the Moon, commander Reid Wiseman ran into a tech issue some of us back on Earth can relate to: Microsoft Outlook wasn’t working. In a conversation captured in NASA’s Artemis livestream and shared on Bluesky, Wiseman reported to Mission Control: “I also see that I have two Microsoft Outlooks and neither one of those are working.”
To take care of the issue, Mission Control had to remotely access Wiseman’s personal computing device (PCD), a Microsoft Surface Pro. During a press conference on Thursday, Artemis flight director Judd Frieling said NASA had fixed the issue, stating, “This is not uncommon. We have this on-station all the time. You know, sometimes Outlook has issues getting configured, especially when you don’t have a network that’s directly connected. And so essentially we just had to reload his files on Outlook to get it working.”
NASA uses a combination of its Near Space Network and Deep Space Network to stay in touch with Artemis II, relying on a mix of antennas around the world and satellites in orbit. Mission Control at the Johnson Space Center in Houston, Texas has to shift communications between these networks as Artemis II gets further away from Earth.
Aside from the Microsoft Surface Pro, the Artemis II crew’s gear list also includes Nikon D5 DSLR cameras, a ZCube video encoder, and handheld GoPro cameras for filming content for a Disney/National Geographic documentary. The crew was also allowed to bring their phones with them — you can even see their phones being stowed away in their spacesuit pockets in NASA’s livestream.
Technology
Fox News AI Newsletter: Palantir CTO warns US has only ‘eight days of weapons’ in hypothetical China battle
NEWYou can now listen to Fox News articles!
Welcome to Fox News’ Artificial Intelligence newsletter with the latest AI technology advancements.
IN TODAY’S NEWSLETTER:
– Palantir CTO warns US has only ‘eight days of weapons’ in hypothetical battle against China
– AI robot now helps travelers at San José airport
– New AI coalition targets Washington, Big Tech as group warns child safety risks outpacing safeguards
Palantir CTO Shyam Sankar discussed the looming threat of China and his new book, “Mobilize,” with Fox News Digital. (Fox News Digital/Nikolas Kokovlis/NurPhoto via Getty Images)
ARSENAL ALERT: The U.S. is wrong about military deterrence, according to Palantir CTO Shyam Sankar. America relies on the threat of its large weapons stockpiles to discourage aggression, but Sankar says the real deterrent is production capacity — “the ability to generate the stockpile.”
WIRED WELCOME: At San José Mineta International Airport in California, travelers can now get help from a humanoid robot named José. It greets passengers, answers questions and helps people find their way around the terminal.
DIGITAL DILEMMA: As artificial intelligence expands into classrooms, workplaces, and homes, a new coalition warns that risks to children and workers are growing faster than efforts to control the new technology.
Mark Zuckerberg, CEO of Meta, arrives to testify before the US Senate Judiciary Committee hearing, “Big Tech and the Online Child Sexual Exploitation Crisis,” in Washington, DC, on January 31, 2024. (ANDREW CABALLERO-REYNOLDS/AFP via Getty Images)
The newly formed Alliance for a Better Future (ABF) is pushing for AI safeguards as Washington debates regulation.
DIGITAL WARFARE: For years, Silicon Valley operated as if war was someone else’s problem. Operation Epic Fury proved otherwise. The U.S.-Israeli campaign against Iran, launched Feb. 28, pulled American technology companies to the center of active warfare — not as distant suppliers, but as participants and now deliberate targets. In my forthcoming book, “The New AI Cold War,” I warned this moment was coming. Iran made it real.
Two F/A-18 Super Hornets launch from the flight deck of the U.S. Navy Nimitz-class aircraft carrier USS Abraham Lincoln in support of the Operation Epic Fury attack on Iran from an undisclosed location March 3, 2026. (U.S. Navy/Handout via Reuters)
FOLLOW FOX NEWS ON SOCIAL MEDIA
Facebook
Instagram
YouTube
Twitter
LinkedIn
SIGN UP FOR OUR OTHER NEWSLETTERS
Fox News First
Fox News Opinion
Fox News Lifestyle
Fox News Health
DOWNLOAD OUR APPS
Fox News
Fox Business
Fox Weather
Fox Sports
Tubi
WATCH FOX NEWS ONLINE
Fox News Go
STREAM FOX NATION
Fox Nation
Stay up to date on the latest AI technology advancements and learn about the challenges and opportunities AI presents now and for the future with Fox News here.
Technology
AO3 is finally out of beta after 17 years
Archive of Our Own (AO3) is officially exiting beta. The Organization for Transformative Works — the nonprofit behind the fanfiction site — announced the update on Thursday, which comes 17 years after AO3’s launch in 2009.
“Since 2009, AO3 has grown and changed a lot,” the announcement says. “We’ve introduced many features over the years through the efforts of our volunteers and coding contributors, as well as the contractors we’ve been able to hire thanks to generous donations from our users.”
The post highlights some of the features that AO3 has since its launch, including a tagging system, fanworks downloads, privacy settings that allow creators to limit access to their work, and more. Just because AO3 is exiting beta, doesn’t mean the updates will stop flowing:
As the AO3 software has been stable for a long time, the change is mostly cosmetic and does not indicate that everything is finalized or perfectly working. Exiting beta doesn’t mean we’ll stop continuing to improve AO3—our volunteer coders and community contributors will still be working to add to and improve AO3 every day.
One of the most significant changes to the site is the absence of the tiny “beta” label inside the AO3 logo displayed at the top of the platform. (AO3 briefly changed the beta to “omega” for April Fools’ Day this year).
You can keep tabs on the updates coming to AO3 by viewing its projects on Jira
-
Culture1 week agoWil Wheaton Discusses ‘Stand By Me’ and Narrating ‘The Body’ Audiobook
-
South-Carolina6 days agoSouth Carolina vs TCU predictions for Elite Eight game in March Madness
-
Culture1 week agoWhat Happens When We Die? This Wallace Stevens Poem Has Thoughts.
-
Miami, FL1 week agoJannik Sinner’s Girlfriend Laila Hasanovic Stuns in Ab-Revealing Post Amid Miami Open
-
Minneapolis, MN1 week agoBoy who shielded classmate during school shooting receives Medal of Honor
-
Education1 week agoVideo: Transgender Athletes Barred From Women’s Olympic Events
-
Vermont6 days ago
Skier dies after fall at Sugarbush Resort
-
Politics6 days agoTrump’s Ballroom Design Has Barely Been Scrutinized