Meta just stuck its AI somewhere you didn't expect it — a pair of Ray-Ban smart glasses
Ray-Ban smart glasses will now use Meta AI virtual assistant software so that wearers can speak with their smart glasses and ask questions about what they're looking at.
Smart glasses have arguably failed to take off, but the addition of artificial intelligence (AI) could be the key to developing a truly transformational wearable technology.
In the US and Canada, Ray-Ban Meta smart glasses have received a rollout of multimodal AI technology with software called the "Meta AI virtual assistant." With multimodal AI — which means generative AI that can process queries that involve more than one medium (for example, both audio and imagery) — the device can better respond to queries based on what a wearer is looking at.
"Say you’re traveling and trying to read a menu in French. Your smart glasses can use their built-in camera and Meta AI to translate the text for you, giving you the info you need without having to pull out your phone or stare at a screen," Meta representatives explained April 23 in a statement.
The device first takes a photo of what a wearer is looking at, then the AI taps into cloud-based processing to serve up an answer to a spoken query, such as "what type of plant am I looking at?"
Meta first explored integrating multimodal AI into the Ray-Ban Meta smart glasses in a limited release in December 2023.
Testing the AI functionality in this device, a reporter from The Verge found that it mostly responded correctly when asked to identify the model of a car. It could also describe a type of cat, for example, and its features in an image snapped via the camera. But the AI ran into trouble in accurately identifying the species of plants belonging to one reporter and struggled to correctly identify a groundhog in their neighbor's backyard.
Multimodal machinations
AI-powered virtual assistants are nothing new, with the likes of the Google Assistant, Amazon Alexa and Apple’s Siri all providing smart answers to queries in natural language. But the crux of the Meta AI in the Ray-Ban smart glasses is its multimodal functionality.
The ability to fuse and process data from multiple sensor modules (for example, cameras and microphones) means a multimodal AI can generate more accurate and sophisticated outcomes than unimodal AI systems. Google’s Gemini multimodal AI model, for example, can process a photo of some cookies and respond with the recipe.
Trained to identify patterns in different types of data inputs through multiple neural networks (collections of machine learning algorithms arranged to mimic the human brain), multimodal AIs can process input data from text, images, audio and more.
In smart glasses, it means an AI can make sense of the world the wearer is viewing by combining sensors on the glasses with these neural networks. As a result, the system can answer more sophisticated queries and offer smarter contextual information.
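To make the idea of combining modalities concrete, here is a minimal Python sketch of one common approach, often called late fusion: each modality is scored separately and the scores are then merged. This is an illustration only, not a description of how Meta AI or Gemini works. The function names `fuse_scores` and `best_label`, the labels, the scores and the weights are all hypothetical and made up for the example.

```python
def fuse_scores(per_modality_scores, weights):
    """Combine per-label scores from several modalities by weighted average.

    per_modality_scores: dict of modality name -> dict of label -> score
    weights: dict of modality name -> non-negative weight
    (Both structures are assumptions made for this sketch.)
    """
    total_weight = sum(weights[m] for m in per_modality_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    fused = {}
    for modality, scores in per_modality_scores.items():
        share = weights[modality] / total_weight
        for label, score in scores.items():
            # Accumulate each modality's weighted contribution per label.
            fused[label] = fused.get(label, 0.0) + share * score
    return fused


def best_label(fused):
    """Return the label with the highest fused score."""
    return max(fused, key=fused.get)


# Made-up scores: the camera is unsure, the microphone hears a meow.
image_scores = {"cat": 0.55, "dog": 0.45}
audio_scores = {"cat": 0.90, "dog": 0.10}

fused = fuse_scores(
    {"image": image_scores, "audio": audio_scores},
    {"image": 0.5, "audio": 0.5},
)
print(best_label(fused), fused)
```

In this toy case the image alone is close to a coin flip, but adding the audio evidence pushes the combined score clearly toward one label, which is the basic appeal of using more than one medium.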
But in the case of the Ray-Ban Meta device, the AI has some distance to go before it matches the AI-processing capabilities found in the latest smartphones. These benefit from more powerful chipsets and onboard sensor fusion, where data is taken from multiple sensors and processed together. Examples include scene recognition in camera apps, which allows lighting and color balance to be intelligently adjusted, and smartwatches that combine data from thermometers and optical sensors to offer better feedback on one’s workout.
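As a rough illustration of sensor fusion, the Python sketch below merges two noisy readings of the same quantity using inverse-variance weighting, a standard statistical technique in which the more reliable sensor gets more say. It is a toy under stated assumptions, not the method any particular phone or watch uses: the function name `fuse_readings`, the two sensors and all of the numbers are hypothetical, and the readings are assumed to be independent estimates of one value.

```python
def fuse_readings(readings):
    """Fuse independent noisy estimates of one quantity.

    readings: list of (value, variance) pairs, each variance > 0.
    Returns (fused_value, fused_variance) using inverse-variance weights.
    """
    if not readings:
        raise ValueError("need at least one reading")
    if any(variance <= 0 for _, variance in readings):
        raise ValueError("variances must be positive")

    # Each reading is weighted by 1 / variance, so precise sensors dominate.
    inverse_total = sum(1.0 / variance for _, variance in readings)
    fused_value = sum(value / variance for value, variance in readings) / inverse_total
    fused_variance = 1.0 / inverse_total
    return fused_value, fused_variance


# Made-up example: two sensors estimating a temperature in degrees Celsius.
precise_sensor = (33.0, 0.25)  # (reading, variance): less noisy
rough_sensor = (34.0, 1.0)     # (reading, variance): noisier

value, variance = fuse_readings([precise_sensor, rough_sensor])
print(value, variance)
```

The fused estimate lands closer to the less noisy sensor, and its variance is lower than that of either input, which is why processing sensor data together can beat relying on any single sensor.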
Roland Moore-Colyer is a freelance writer for Live Science and managing editor at consumer tech publication TechRadar, running the Mobile Computing vertical. At TechRadar, one of the U.K. and U.S.’ largest consumer technology websites, he focuses on smartphones and tablets. But beyond that, he taps into more than a decade of writing experience to bring people stories that cover electric vehicles (EVs), the evolution and practical use of artificial intelligence (AI), mixed reality products and use cases, and the evolution of computing both on a macro level and from a consumer angle.