
Technology

The humble screenshot might be the key to great AI assistants

If you want to make the most out of a world increasingly filled with AI tools, here’s a habit to develop: start taking screenshots. Lots of screenshots. Of anything and everything. Because for all the talk of voice modes, omnipresent cameras, and the multimodal future of everything, there might be no more valuable digital behavior than to press the buttons and save what you’re looking at.

Screenshots are the most universal method of capturing digital information. You can capture anything — well, almost anything, thanks a lot, Netflix! — with a few clicks, and save and share it to almost any device, app, or person. “It’s this portable data format,” says Johnny Bree, the founder of the digital storage app Fabric. “There’s nothing else that’s quite so portable that you can move between any piece of software.”

A screenshot contains a lot of information, like its source, contents, and even the time of day in the corner of the screen. Most of all, it sends a crucial and complex signal; it says, “I care about this.” We have countless new AI tools that aim to watch the world, our lives, and everything, and try to make sense of it all for us. These tools are mostly crap for lots of reasons, but chiefly because AI is pretty good at knowing what things are and rubbish at knowing whether they matter. A screenshot assigns value and tells the system it needs to pay attention.

Screenshots also put you, the user, in control in an important way. “If I give you access to all of my emails, all my WhatsApps, everything, there’s a lot of noise,” says Mattias Deserti, the head of smartphone marketing at Nothing. There’s simply no reason to save every email you receive or every webpage you visit — and that’s to say nothing of the privacy implications. “So what if, instead, you were able to start training the system yourself, feeding the system the information you want the system to know about you?” Rather than a tool like Microsoft Recall, which asks for unlimited access to everything, starting with screenshots lets you pick what you share.

Until now, screenshots have been a fairly blunt instrument. You snap one, and it gets saved to your camera roll, where it probably languishes, forgotten, until the end of time. (And don’t get me started on all the screenshots I take by accident, mostly of my lockscreen.) At best, you might be able to search for some text inside the image. But it’s more likely that you’ll just have to scroll until you find it again.


The first step in making screenshots more useful is to figure out what’s actually in them

The first step in making screenshots more useful is to figure out what’s actually in them. This is, at first blush, not terribly complicated: optical character recognition technology has long done a good job of spotting text on a page. AI models take that a step further, so you can search for a movie poster by its title or just search “movies” to find all your digital snaps of posters, Fandango results, TikTok recommendations, and more. “We use an OCR model,” says Shenaz Zack, a product manager at Google and part of the team behind the Pixel Screenshots app. “Then we use an entity-detection model, and then Gemini to understand the actual context of the screen.”
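The three-stage pipeline Zack describes can be sketched roughly as follows. This is a hypothetical illustration only: the function bodies are stand-ins for real OCR, entity-detection, and Gemini calls, and every name and sample string here is invented, not taken from any actual product.

```python
from dataclasses import dataclass, field

@dataclass
class ScreenshotRecord:
    text: str
    entities: dict = field(default_factory=dict)
    context: str = ""

def run_ocr(image_bytes: bytes) -> str:
    # Stand-in for a real OCR model: extract visible text from pixels.
    return "Dune: Part Two - Tickets Fri 7:30 PM"

def detect_entities(text: str) -> dict:
    # Stand-in for an entity-detection model: pull out typed items.
    entities = {}
    if "Tickets" in text:
        entities["event"] = text.split(" - ")[0]
    if "PM" in text or "AM" in text:
        entities["time"] = text.split("Tickets ")[-1]
    return entities

def infer_context(text: str, entities: dict) -> str:
    # Stand-in for an LLM call that classifies the screen's purpose.
    return "movie_listing" if "event" in entities else "unknown"

def process_screenshot(image_bytes: bytes) -> ScreenshotRecord:
    # Chain the three stages: OCR -> entities -> overall context.
    text = run_ocr(image_bytes)
    entities = detect_entities(text)
    return ScreenshotRecord(text, entities, infer_context(text, entities))

record = process_screenshot(b"fake-image-bytes")
print(record.context)  # movie_listing
```

The point of the chain is that each stage adds a layer of meaning the previous one can’t: raw text, then typed entities, then an overall judgment about what the screen is for.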

See, there’s far more to a screenshot than just the text inside. The right AI model should be able to tell that it came from WhatsApp, just by the specific green color. It should be able to identify a website by its header logo or understand when you’re saving a Spotify song name, a Yelp handyman review, or an Amazon listing. Armed with this information, a screenshot app might begin to automatically organize all those images for you. And even that is just the beginning.

With everything I’ve described so far, all we’ve really created is a very good app for looking at your screenshots, which no one really thinks is a good idea because it would be just one more thing to check — or forget to check. Where it gets vastly more interesting is when your device or app can actually start to use the screenshots on your behalf, helping you remember what you captured or even using that information to get stuff done.

Nothing’s new Essential Space app, for instance, can generate reminders based on stuff you save. If you take a screenshot of a concert you’d like to go to, it can automatically remind you when the date is approaching. Pixel Screenshots is pushing the idea even further: if you save a concert listing, your Pixel phone can prompt you to listen to that band the next time you open Spotify. If you screenshot an ID card or a boarding pass, it might ask you to put it in the Wallet app. The idea, Zack says, is to think of screenshots as an input system for everything else.


It’s one thing to screenshot a band you like. It’s another to be able to find them again later.
Image: David Pierce / The Verge

Mike Choi, an indie developer, built an app called Camp in part to help him make use of his own screenshots. He began to work on turning every screenshot into a “card,” with the salient information stored alongside the picture. “You have a screenshot, and at the bottom there’s a button, and it flips the card over,” he says. “It shows you a map, if it was a location; a preview of a song, if it’s a song. The idea was, given an infinite pool of different types of screenshots, can AI just generate the perfect UI for that category on the fly?”
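Choi’s “card” idea amounts to dispatching each categorized screenshot to a purpose-built view. A minimal sketch, with categories, views, and entity fields all invented for illustration (the real Camp app presumably asks an AI model to generate the layout on the fly rather than picking from a fixed table):

```python
# Map each screenshot category to a small view function that renders
# the "back of the card." These categories and layouts are hypothetical.
CARD_VIEWS = {
    "location": lambda e: f"[map] {e['place']}",
    "song": lambda e: f"[player] {e['artist']} - {e['title']}",
}

def render_card(category: str, entity: dict) -> str:
    # Fall back to a plain text card for categories we don't know.
    view = CARD_VIEWS.get(category)
    return view(entity) if view else f"[text] {entity}"

print(render_card("song", {"artist": "Big Thief", "title": "Simulation Swarm"}))
# [player] Big Thief - Simulation Swarm
```

A fixed dispatch table like this is the conventional version; Choi’s pitch is to replace the table with a model that invents the right view for categories nobody enumerated in advance.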

If all this sounds familiar, it’s because there’s another term for what’s going on here: it’s called agentic AI. Every company in tech seems to be working on ways to use AI to accomplish things on your behalf. It’s just that, in this case, you don’t have to write long prompts or chat back and forth with an assistant. You just take a screenshot and let the system go to work. “You’re building a knowledge base, when today that knowledge base is confined to your gallery and nothing happens with it,” Deserti says. He’s excited to get to the point where you screenshot a concert date, and Essential Space automatically prompts you to buy tickets when they go on sale.

Making sense of screenshots isn’t always so straightforward

Making sense of screenshots isn’t always so straightforward, though. Some you want to keep forever, like the ID card you might need often; other things, like a concert poster or a parking pass, have extremely limited shelf lives. For that matter, how is an app supposed to distinguish between the parking pass you use every day at work and the one you used once at the airport and never need again? Some of the screenshots on my phone were sent to me on WhatsApp; others I grabbed from Instagram memes to send to friends. No one’s camera roll should ever be fully held against them, and the same goes for screenshots. Lots of these screenshot apps are looking for ways to prompt you to add a note, or organize things yourself, in order to provide some additional helpful information to the system. But it’s hard work to do that without ruining what makes screenshots so seamless and easy in the first place.


One way to begin to solve this problem, to make screenshots even more automatically useful, is to collect some additional context from your device. This is where companies like Google and Nothing have an advantage: because they make the device, they can see everything that’s happening when you take a screenshot. If you grab a screenshot from your web browser, they can also store the link you were looking at. They can also see your physical location or note the time and the weather. Sometimes this is all useful, but sometimes it’s nonsense; the more data they collect, the more these apps risk running into the same noise problem that screenshots helped solve in the first place.

But the input system works. We all take screenshots, all the time, and we’re used to taking them as a way to put a marker on so many kinds of useful information. Getting access to that kind of relevant, personalized data is the hardest thing about building a great AI assistant. The future of computing is certainly multimodal, including cameras, microphones, and sensors of all kinds. But the best first way to use AI might be one screenshot at a time.


Valve’s huge SteamOS 3.8 update adds long-awaited features — and supports Steam Machine


Not only is it the first release to support the upcoming Steam Machine living room gaming PC, it comes with long-awaited features for Valve’s handhelds and more support for other companies’ handhelds than we’ve seen to date — including Microsoft and Asus’ Xbox Ally series, the Lenovo Legion Go 2, the OneXPlayer X1, and additional support for MSI, GPD, Anbernic, OrangePi, and Zotac.

The one that excites me most: Valve is adding genuine hibernation and “memory power down” modes to the Steam Deck — though just the LCD model to start — which should help extend battery life when you hit the power button or leave it idle. Some Windows machines currently last longer than the Steam Deck when asleep, because they self-hibernate to save power, while the Steam Deck has only an instant-on sleep mode.

Plus, Valve has finally added a setting in its gaming mode to let you use your Bluetooth headset microphones — something I’ve been asking for since the beginning. (Valve did add it to the Linux desktop mode last year.) And the Steam Deck LCD is finally getting Bluetooth Wake re-enabled, so you can turn on your TV-connected Deck with a wireless controller from your couch.

The update comes with all sorts of improvements for the Linux desktop modes that sound like they’ll come in handy on a Steam Machine plugged into a TV or monitor, too, including desktop HDR, VRR display support, per-display scaling, “improved windowing behavior for games running in Proton,” and an upgrade to KDE Plasma 6.4.3 among other things.

And a Steam Machine or Steam handheld plugged into a home entertainment system can now detect how many audio channels are available over HDMI to enable surround sound. (I believe surround sound was already a thing, so perhaps this is just a different and better automatic implementation.)


There’s also a new Arch system base and an updated graphics driver.

Perhaps most surprisingly, the “Non-Deck” section of the changelog is huge. Valve says long-pressing your power button should work “across a wide variety of devices” to power off, restart, or switch to the desktop mode. You should be able to change your processor’s power modes on the Xbox Ally now, and night mode and screen color settings should work on AMD Z2 Extreme handhelds in general.

There’s also “Greatly improved video memory management with discrete GPU platforms,” you can limit how far the battery charges on any of the Lenovo Legion Go handhelds (in desktop mode), and the update should fix “washed out colors for Zotac and OneXPlayer handhelds with OLED.”

There’s a lot in this update, and it’s possible I missed a feature you care about, so check out Valve’s full changelog.



Fox News AI Newsletter: Wall-climbing robots swarm US Navy warships



Welcome to Fox News’ Artificial Intelligence newsletter with the latest AI technology advancements.

IN TODAY’S NEWSLETTER:

WATCH: Wall-climbing robot swarms crawl US Navy warships as China’s fleet surges

OPINION: AI comes with a hefty charge, and you are the one who gets stuck with the bill


Dell workforce shrinks 10% for third consecutive year

Swarms of wall-climbing robots will soon be crawling across U.S. Navy warships in a $71 million effort to slash repair delays and boost fleet readiness as China continues expanding its naval power. (Gecko Robotics)

TECH AT SEA: WATCH: Wall-climbing robot swarms crawl US Navy warships as China’s fleet surges – Fox News Digital reports on a new development in naval technology, featuring wall-climbing robot swarms that are crawling on U.S. Navy warships. This advancement comes at a critical time in defense politics as China’s naval fleet continues to surge in size and capability.

WALLET SHOCK: OPINION: AI comes with a hefty charge, and you are the one who gets stuck with the bill – In this opinion piece, the author discusses the economic implications of the growing artificial intelligence industry. The article argues that the hefty costs associated with AI development and its massive energy infrastructure will ultimately be passed down, leaving everyday consumers to foot the bill.

Dell Technologies headquarters in Round Rock, Texas, US, on Sunday, Nov. 26, 2023.  (Sergio Flores/Bloomberg via Getty Images)


COST CRUNCH: Dell workforce shrinks 10% for third consecutive year – Fox Business reports that Dell’s workforce has shrunk by ten percent. This marks the third consecutive year of workforce reductions for the major technology company amid shifting economic conditions and corporate restructuring.

AIMING HIGH: FULL AUTONOMY: AI pilot technology advances towards military capability – Merlin CEO Matt George details how the company is using artificial intelligence to enable military and commercial aircraft to operate fully autonomously on Fox Business’ ‘The Claman Countdown.’

Single family homes in a residential neighborhood in San Marcos, Texas, US, on Tuesday, March 12, 2024. (Photographer: Jordan Vonderhaar/Bloomberg via Getty Images)

SHOULD I BUY?: Homebuyers, sellers turning to AI chatbots for advice – Prairie Operating Co.’s Lou Basenese and real estate broker Kirsten Jordan discuss how artificial intelligence is impacting homebuyers and sellers on ‘Fox Business In Depth.’

DISRUPTION IS HERE: Charles Payne: AI disruption is here – Fox Business host Charles Payne discusses the economic impact of the rise in artificial intelligence on ‘Making Money.’


BUILDING HER BUSINESS: How Angie Hicks turned Angi into a home services giant and AI player – Angi co-founder Angie Hicks discusses entrepreneurship, company growth and how she built out her business on ‘Mornings with Maria.’




A rogue AI led to a serious security incident at Meta


For almost two hours last week, Meta employees had unauthorized access to company and user data thanks to an AI agent that gave an employee inaccurate technical advice, as previously reported by The Information. Meta spokesperson Tracy Clayton said in a statement to The Verge that “no user data was mishandled” during the incident.

A Meta engineer was using an internal AI agent, which Clayton described as “similar in nature to OpenClaw within a secure development environment,” to analyze a technical question another employee had posted on an internal company forum. But after analyzing the question, the agent also replied to it publicly on its own, without getting approval first. The reply was meant to be shown only to the employee who requested it, not posted publicly.

An employee then acted on the AI’s advice, which “provided inaccurate information” that led to a “SEV1” level security incident, the second-highest severity rating Meta uses. The incident temporarily allowed employees to access sensitive data they were not authorized to view, but the issue has since been resolved.

According to Clayton, the AI agent involved didn’t take any technical action itself, beyond posting inaccurate technical advice, something a human could have also done. A human, however, might have done further testing and made a more complete judgment call before sharing the information — and it’s not clear whether the employee who originally prompted the answer planned to post it publicly.

“The employee interacting with the system was fully aware that they were communicating with an automated bot. This was indicated by a disclaimer noted in the footer and by the employee’s own reply on that thread,” Clayton commented to The Verge. “The agent took no action aside from providing a response to a question. Had the engineer that acted on that known better, or did other checks, this would have been avoided.”


Last month, an AI agent from the open source platform OpenClaw went more directly rogue at Meta: when an employee asked it to sort through the emails in her inbox, it deleted emails without permission. The whole idea behind agents like OpenClaw is that they can take action on their own, but like any other AI model, they don’t always interpret prompts and instructions correctly or give accurate responses, a fact Meta employees have now discovered twice.
