Cade Metz
Companies like OpenAI and Midjourney build chatbots, image generators and other artificial intelligence tools that operate in the digital world.
Now, a start-up founded by three former OpenAI researchers is using the technology development methods behind chatbots to build AI technology that can navigate the physical world.
Like chatbots and image generators, this robotics technology learns its skills by analysing enormous amounts of digital data. That means engineers can improve the technology by feeding it more and more data.
Covariant, backed by $222 million in funding, does not build robots. It builds the software that powers robots. The company aims to deploy its new technology with warehouse robots, providing a road map for others to do much the same in manufacturing plants and perhaps even on roadways with driverless cars.
The AI systems that drive chatbots and image generators are called neural networks, named for the web of neurons in the brain. By pinpointing patterns in vast amounts of data, these systems can learn to recognise words, sounds and images. This is how OpenAI built ChatGPT, giving it the power to instantly answer questions, write term papers and generate computer programs. It learned these skills from text culled from across the internet. (Several media outlets, including The New York Times, have sued OpenAI for copyright infringement.)
OpenAI employed that system to build Sora, its new video generator. By analysing thousands of captioned videos, the system learned to generate videos when given a short description of a scene, like “a gorgeously rendered papercraft world of a coral reef, rife with colourful fish and sea creatures.”
Covariant, founded by Pieter Abbeel, a professor at the University of California, Berkeley, and three of his former students, Peter Chen, Rocky Duan and Tianhao Zhang, used similar techniques in building a system that drives warehouse robots.
The company helps operate sorting robots in warehouses across the globe. It has spent years gathering data — from cameras and other sensors — that shows how these robots operate.
“It ingests all kinds of data that matter to robots — that can help them understand the physical world and interact with it,” Chen said.
By combining that data with the huge amounts of text used to train chatbots like ChatGPT, the company has built AI technology that gives its robots a much broader understanding of the world around them. After identifying patterns in this stew of images, sensory data and text, the technology gives a robot the power to handle unexpected situations in the physical world. The robot knows how to pick up a banana, even if it has never seen a banana before.
It can also respond to plain English, much like a chatbot. If you tell it to “pick up a banana,” it knows what that means. If you tell it to “pick up a yellow fruit,” it understands that, too.
The technology, called RFM, for robotics foundational model, makes mistakes, much like chatbots do. Though it often understands what people ask of it, there is always a chance that it will not. It drops objects from time to time.
Gary Marcus, an AI entrepreneur and an emeritus professor of psychology and neural science at New York University, said the technology could be useful in warehouses and other situations where mistakes are acceptable. But he said it would be more difficult and riskier to deploy in manufacturing plants and other potentially dangerous situations.
“It comes down to the cost of error,” he said. “If you have a 150-pound robot that can do something harmful, that cost can be high.”
©2024 The New York Times News Service
First Published: Mar 12 2024 | 10:24 PM IST