
Boston, MA

Tools for Your To Do List with Spot and Gemini Robotics | Boston Dynamics



For an industrial robot built for the rigors of factories and power plants, tidying up a living room may seem like a light day at the office for Spot. Yet a recent video of the robot picking up shoes and soda cans in a residential home represents the promise of AI models in robotics. In this case, Google’s vision-language model (VLM) Gemini Robotics-ER 1.5 gave Spot embodied reasoning.

This particular demo grew out of a 2025 hackathon at Boston Dynamics that built on prior projects using Large Language Models (LLMs) and Visual Foundation Models (VFMs) to enable Spot to contextualize its environment and engage in more complex autonomous actions than a typical Autowalk mission. Rather than write formal software logic or a “state machine” program that defines each step of a given task, we interacted with Gemini Robotics using conversational language. In turn, it communicated with Spot on our behalf.

A Robust SDK and Natural Language Prompts Save Time

Using Spot’s SDK, we developed a layer that facilitated interaction between Gemini Robotics and Spot’s application programming interface (API). The API normally gives developers access to the robot’s capabilities to create custom applications or behaviors. For example, researchers at Meta have used Spot to test how an AI system could locate and retrieve objects it had never seen before.


Our ability to engage Gemini Robotics using natural language prompts was a huge timesaver, compared to traditional programming. We told Gemini Robotics it had access to a mobile robot equipped with cameras and a robotic arm. It also had a finite set of tools it could use to control the robot. A tool is a lightweight script that performs some internal logic and translates inputs from Gemini Robotics to actual API calls. We limited the actions to navigating between locations, capturing images, identifying objects, grasping them, and placing them somewhere else. 
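To make the idea of a tool concrete, here is a minimal sketch in Python. It does not use the real Spot SDK or the Gemini API: `FakeRobot`, `go_to_tool`, and the location names are hypothetical stand-ins, and a real tool would call the robot's API at the point where this stub only records a command.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRobot:
    """Hypothetical stand-in for the robot API; it only records commands."""
    known_locations: tuple = ("front door", "shoe rack")
    commands: list = field(default_factory=list)

    def navigate_to(self, location: str) -> None:
        self.commands.append(("navigate_to", location))


def go_to_tool(robot: FakeRobot, location: str) -> str:
    """Tool: validate the model's input, call the robot, report back in text."""
    name = location.strip().lower()
    if name not in robot.known_locations:
        # Return the problem as text so the model can choose another location.
        return f"I don't know a location called '{location}'."
    robot.navigate_to(name)
    return f"I arrived at {name}."


robot = FakeRobot()
print(go_to_tool(robot, "Shoe Rack"))
print(go_to_tool(robot, "garage"))
```

The tool does two jobs: it normalizes and checks whatever the model sends, and it answers in plain language so the model can reason about the result.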

Because the SDK is extensive, it includes many examples that developers can build on to expose more of the API with minimal development.

Giving Gemini Robotics a Baseline

To start, we needed to explain to Gemini Robotics what we wanted it to do. We did experience a learning curve when writing these baseline prompts. Simple instructions like “put down an object” or “take a picture” weren’t detailed enough to produce the expected behavior. We had to add context to our descriptions as we refined each tool.

A good example is the detailed prompt for the “TakePicture” tool:

This command will cause the robot to take a picture with the specified camera. There is some nuance to choosing the correct camera. Once arriving at a location using GoTo, you should always start by taking a picture with the gripper camera, because it's the most informative.
If the robot has arrived at location and is already holding an object, you can do one of two things:
1. Immediately call PutDown
2. Search the area with either of the front cameras. The front cameras are low to the ground, so if you're trying to put things on an elevated surface, they won't give you useful information.

In this example, we gave Gemini Robotics no detailed description of the robot’s chassis or arm. Instead, we simply explained that Spot’s front cameras would be too low to photograph objects on elevated surfaces. We were able to iterate rapidly, as small changes in wording produced noticeably better results. Once it had this set of basic tools through the API, Gemini Robotics could sequence Spot’s actions and follow the handwritten instructions on a whiteboard on the day of the demonstration.
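One way to picture how such a prompt travels with a tool is the sketch below, which bundles a name, a description, and the accepted parameters into a single declaration. This is an illustration under assumptions, not the format used in the demo: `ToolSpec`, the JSON layout, and the camera identifiers are hypothetical, and the description is an abbreviated version of the prompt above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical description of one tool, as handed to the model."""
    name: str
    description: str
    parameters: dict

    def to_declaration(self) -> dict:
        # Generic JSON-style declaration; not tied to any specific model API.
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {
                "type": "object",
                "properties": self.parameters,
                "required": list(self.parameters),
            },
        }


take_picture = ToolSpec(
    name="TakePicture",
    description=(
        "This command will cause the robot to take a picture with the "
        "specified camera. Once arriving at a location using GoTo, you should "
        "always start by taking a picture with the gripper camera."
    ),
    # Camera identifiers are assumed for this example.
    parameters={
        "camera": {
            "type": "string",
            "enum": ["gripper", "front_left", "front_right"],
        }
    },
)

print(json.dumps(take_picture.to_declaration(), indent=2))
```

Keeping the guidance in the description field means that refining a tool's behavior is a matter of editing text, which matches the rapid iteration described above.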


How Gemini Robotics and Spot Collaborate

Until the robot powers on, Gemini Robotics has no context for what specific tasks we might ask it to perform in a given demo. We only provided simple written instructions, such as, “Make sure all of the shoes at the front door are on the shoe rack.” Gemini Robotics evaluated images from Spot’s cameras and identified objects in the scene that matched the instructions. These objects became the reference points for Spot’s navigational and manipulation systems.

In many respects, Gemini Robotics played the same role as an operator manually driving Spot using its tablet controller. For example, to pick up an object with Spot, an operator positions the robot near the object and then uses a grasp wizard to identify the target object. The operator provides high-level direction and Spot figures out the exact details. In this demonstration, Gemini Robotics functioned as both the operator and the tablet sending commands to the robot. This freed us up to act more like a team lead, providing a high-level to-do list and trusting Spot and Gemini Robotics to do the rest.

Call and Response

When Gemini Robotics engages a given tool, the tool responds with results and context, such as, “I picked up the object,” or “I can’t pick up something while my hand is full.” Gemini Robotics then makes adjustments on the fly based on this feedback from Spot. For example, to pick up shoes, Gemini Robotics requests an image, identifies the shoes in that image, and calls the “pickup” command. By creating fundamental tools that semantically flow in conversation, Gemini Robotics can manage the sequence of tasks required to clean up the room. Spot’s existing software stack manages the locomotion, navigation, and manipulation of the robot itself.
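The call-and-response pattern can be sketched as a small loop in Python. Everything here is a hypothetical stand-in: `scripted_model` replaces the real model with a few hard-coded reactions, and `Gripper`, `pick_up`, and `put_down` simulate the robot without touching any real API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gripper:
    holding: Optional[str] = None


def pick_up(gripper: Gripper, target: str) -> str:
    if gripper.holding is not None:
        return "I can't pick up something while my hand is full."
    gripper.holding = target
    return f"I picked up the {target}."


def put_down(gripper: Gripper, _unused: str = "") -> str:
    if gripper.holding is None:
        return "I am not holding anything."
    released, gripper.holding = gripper.holding, None
    return f"I put down the {released}."


TOOLS = {"PickUp": pick_up, "PutDown": put_down}


def scripted_model(feedback_log: list) -> Optional[tuple]:
    """Stand-in for the model: picks the next call from the latest feedback."""
    if not feedback_log:
        return ("PickUp", "shoe")
    last = feedback_log[-1]
    if last == "I picked up the shoe.":
        return ("PickUp", "soda can")  # deliberately too eager
    if "hand is full" in last:
        return ("PutDown", "")  # react to the tool's feedback
    if last == "I put down the shoe.":
        return ("PickUp", "soda can")
    return None  # nothing left to do


def run(gripper: Gripper, max_steps: int = 10) -> list:
    log = []
    for _ in range(max_steps):
        call = scripted_model(log)
        if call is None:
            break
        name, arg = call
        log.append(TOOLS[name](gripper, arg))
    return log


for line in run(Gripper()):
    print(line)
```

The second call fails on purpose. The textual feedback is what lets the stand-in model recover by putting the shoe down before trying again.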

It’s important to note Gemini Robotics has strict boundaries in this scenario. It can’t invent new capabilities or control Spot beyond what is available through the API. This keeps Spot’s behavior predictable, while still allowing Gemini Robotics to adapt to different situations.
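A boundary of this kind can be enforced with a simple allowlist, sketched below in Python. The names are hypothetical (`make_dispatcher`, the two lambda tools, the `OpenDoor` request), and the demo's actual mechanism is not described in this article; the point is only that a request outside the registered set is answered with text and never reaches the robot.

```python
from typing import Callable, Dict


def make_dispatcher(tools: Dict[str, Callable[..., str]]) -> Callable[..., str]:
    """Return a function that runs only the tools registered in `tools`."""

    def dispatch(name: str, **kwargs) -> str:
        tool = tools.get(name)
        if tool is None:
            allowed = ", ".join(sorted(tools))
            return f"Unknown tool '{name}'. Available tools: {allowed}."
        try:
            return tool(**kwargs)
        except TypeError as exc:
            # Wrong or missing arguments are reported back, not executed.
            return f"Bad arguments for {name}: {exc}"

    return dispatch


dispatch = make_dispatcher({
    "GoTo": lambda location: f"I arrived at {location}.",
    "TakePicture": lambda camera: f"I took a picture with the {camera} camera.",
})

print(dispatch("GoTo", location="shoe rack"))
print(dispatch("OpenDoor", door="front"))
```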

A Force Multiplier for Developers

For developers already working with Spot, this research has tremendous potential. Through Spot’s SDK, they have access to a robust toolkit of capabilities. Companies use these tools today to build applications for inspection, research, and industrial data analysis, among others.

An AI model like Gemini Robotics offers a way to expand those applications more rapidly. Rather than write extensive task logic on top of Spot’s APIs, developers can experiment with having AI systems interpret natural language instructions and dynamically choose to engage the robot. As a result, models like Gemini Robotics can act as force multipliers, amplifying the reliable toolkit and robust performance that is already delivering value for Boston Dynamics customers.


Our Next-Token Prediction for Spot and Gemini Robotics

Although this is still an experimental step and not a hardened application, it illustrates a compelling direction for robotics and physical AI. Robots like Spot are already extremely capable of navigating complex and changeable environments, collecting data and sensor readings, and manipulating objects. Rather than reinventing the wheel, AI foundation models offer a new way to expand these capabilities in new settings and to new applications.

Physical AI is a rapidly evolving field and our team is leading the way in the lab and in real applications of AI-empowered robots. While we are early in our formal partnership with Google DeepMind, we’re excited for what the future holds with Atlas and we’ve already rolled out practical enhancements for Spot and Orbit, with AIVI-Learning powered by Google Gemini Robotics-ER 1.6. This next evolution of our AI Visual Inspection tool unlocks a new level of visual intelligence, as users benefit from shared expertise bringing a deeper level of contextual intelligence to Spot and Orbit. Model improvements automatically happen behind the scenes, adding more capabilities to the same software and hardware.

Today, this demo points to a future where users can rely more on natural language to guide Spot’s actions, rather than complex code. The engineer’s role shifts toward setting goals and objectives. The multi-modal robot foundation model interprets the instructions to form complex and adaptive plans, and Spot executes the action.

This article was contributed by Issac Ross and Nikhil Devraj, engineers on the Spot team.


Friend of Worcester woman killed in Virginia I-95 crash ‘cannot believe she is gone.’ – The Boston Globe



When Priscilla R. Mafalda left for Florida last week, she sounded exhausted but happy.

“Friend, I’m very tired, but thank God I’m finally taking some vacation time. I’m going to Florida,” she told her work friend, Thaiz Ramos, on Thursday.

Ramos said Mafalda promised she would call when she arrived.

“I am still waiting for that call,” Ramos said Sunday afternoon, “because part of me still cannot believe she is gone.”


Mafalda, 25, of Worcester, was identified over the weekend as the fifth person killed in the devastating Interstate 95 crash in Virginia that also claimed the lives of four members of the Doncev family from Greenfield, Massachusetts. Authorities said Mafalda was traveling in a separate vehicle, a Chevrolet Suburban, when it was struck by a passenger bus that failed to slow for traffic near a work zone.

Friends say Mafalda, who was born in Inhapim, Brazil, had built a life in Massachusetts. A GoFundMe, which refers to her as Priscilla Ramos, no relation to Thaiz Ramos, was created after her death and says relatives are raising money to return her body to Brazil for burial.

The GoFundMe said that her husband, Igor Ernesto, was also in the vehicle and hospitalized. Mafalda’s family and GoFundMe organizers could not immediately be reached for comment.

By Sunday, over $14,000 had been raised.

Ramos worked with Mafalda for years at a Massachusetts house-cleaning company. She described her as “one of the kindest and hardest-working people I have ever known.”


Virginia State Police said the crash happened around 2:35 a.m. Friday in Stafford County, when a bus traveling from New York to North Carolina struck slowed traffic near a work zone, setting off a chain-reaction collision impacting Mafalda’s vehicle. It forced her vehicle into the Doncev family’s Acura SUV and several others. The bus driver has been charged with two counts of involuntary manslaughter, with additional charges pending.

This is a developing story.


Sarah Rahal can be reached at sarah.rahal@globe.com. Follow her on X @SarahRahal_ or Instagram @sarah.rahal.






Where to watch Boston Red Sox vs Cleveland Guardians: TV channel, start time, streaming for May 31


The 2026 MLB season has surpassed the quarter mark, and after each team’s first 40 games, there are plenty of reasons to tune in all summer long.

Chicago White Sox slugger Munetaka Murakami has already proven doubters wrong by launching 17 home runs, Pittsburgh’s Paul Skenes consistently looks like the best version of himself on the mound and Milwaukee ace Jacob Misiorowski is throwing harder than any starter in the majors.


The MLB action continues on Sunday as the Boston Red Sox visit the Cleveland Guardians.

Here’s everything you need to know to tune in for the first pitch.

See USA TODAY’s sortable MLB schedule to filter by team or division.

What time is Boston Red Sox vs Cleveland Guardians?

First pitch between the Cleveland Guardians and Boston Red Sox is scheduled for 1:40 p.m. (ET) on Sunday, May 31.

How to watch Boston Red Sox vs Cleveland Guardians on Sunday

All times Eastern and accurate as of Sunday, May 31, 2026, at 6:32 a.m.

  • Matchup: BOS at CLE
  • Date: Sunday, May 31
  • Time: 1:40 p.m. (ET)
  • Venue: Progressive Field
  • Location: Cleveland, Ohio
  • TV: Guardians.TV and NESN
  • Streaming: MLB.TV on Fubo

Watch MLB all season long with Fubo

MLB regional blackout restrictions apply

MLB scores, results

MLB scores for May 31 games are available on usatoday.com. Here’s how to access today’s results:

See scores, results for all of today’s games.




Police Blotter: Cambridge meth chemist sentenced to prison; Boston firefighters make high-flying save



A “skilled” drug chemist who helped flood Greater Boston with methamphetamine will spend more than a decade in prison for his role in the enterprise.

U.S. Senior District Court Judge F. Dennis Saylor IV sentenced Schuyler Oppenheimer to 13 years in federal prison. According to court documents, Oppenheimer went by “SK” and conducted illicit trade with Chinese suppliers under the name “Michael Sylvain.”

Oppenheimer, 35, of Cambridge, was arrested in July 2024 and pleaded guilty in January to one count of possession with intent to distribute 500 grams or more of methamphetamine and two counts of wire fraud.

Authorities say that Oppenheimer’s drug business was partially funded through $40,000 in Paycheck Protection Program loans.


In a lengthy affidavit supporting the charges, FBI Special Agent Eric Poalino repeatedly described Oppenheimer as a “skilled” drug chemist. A rap sheet included in court documents shows drug charges, convicted or otherwise, dating back to 2008, and at the time of his arrest on July 18, Oppenheimer was on pretrial release for three pending cases.

In addition to his own record, law enforcement was already on to him because he is suspected “to historically have been a technician for other large-scale pill producers in Massachusetts,” according to Poalino’s affidavit.

That includes working for North Shore fentanyl kingpin Vincent “Fatz” Caruso, who along with his mother in 2021 pleaded guilty to operating a large-scale drug trafficking organization specializing in pressed fentanyl pills and was sentenced to more than 20 years in prison. Caruso and a lieutenant of his, Ernest “Yo Pesci” Johnson, who was sentenced to seven and a half years in prison, gained notoriety through posting photos of their lifestyles to social media.

High-stakes save

Boston Fire Department firefighters saved a crane operator stuck in his cab at Conley Terminal in South Boston Saturday, despite the dangerous weather conditions.


The department cheered the firefighters who worked “over 200 feet in the air under extreme weather conditions, high winds and heavy rain.” The department did not say how the crane got stuck.

Incident Summary

BPD responded to 249 incidents in the 24-hour period ending at 10 a.m. Saturday, according to the department’s incident log. Those included four robberies, one aggravated assault, two residential burglaries, three thefts from a car, two auto thefts, and 26 instances of miscellaneous larceny.

Arrests

All of the below-named defendants are presumed innocent until proven guilty.


— Nicole Anderson, no address listed. Trespassing.

— Kesner Forestale, no address listed. Trespassing.

— Sean Ribeiro, 112 Southampton St., Boston. Trespassing.

— Peter Antonaros, 4 Doncaster St., Roslindale. Possession of Class C drugs.

— Korie Berry, 93-95 Hyde Park Ave., Jamaica Plain. Possession of Class A drugs.


— Kaitlyn Quick, 39 Boylston St., Boston. Warrant.

— Marina Coelho, 35 Northampton St., Boston. Possession of Class B drugs.

— Jason Toomer, 5 Toplift St., Dorchester. External warrant.

— Xavian Alvarado, 434 Georgetown Drive, Hyde Park. Shoplifting more than $250.

— Aidan Walsh, 20 Powell St., Boston. Shoplifting more than $250.


— Suker Francois, 18 Livingstone St., Boston. Operating an uninsured motor vehicle.

— Donald Villard, 151 Hallet St., Dorchester. Carrying a firearm without a license.


Boston firefighters saved a trapped crane operator 200 feet in the air on Saturday. (Courtesy/Boston Fire Department)


