For one of the world’s leading artificial intelligence (AI) roboticists, Professor Pieter Abbeel doesn’t watch much sci-fi. Pieter is the Director of the UC Berkeley Robot Learning Lab and co-director of the Berkeley Artificial Intelligence Research (BAIR) Lab. In a recent podcast with host Abigail Hing Wen, Pieter said he first became enamored with robots not from popular depictions in the media, but when he thought, “What if I write a piece of code and it beats me in chess? How can I write a piece of code that beats me, even though I wrote the piece of code?”
Though many people have since developed chess-playing robots that can beat most humans, Pieter included, that thought inspired him to eventually research and build ever more intelligent systems, setting up a research group at Berkeley to advance reinforcement learning, hoping eventually to found a robotics company. Today, while he continues to work at Berkeley, he is also a founder of Covariant, a company that’s seeking to build robots that can adapt to what they see and even learn from their own experience to solve problems.
“To us it's very clear that 20 years from now almost all robots will be learning robots.”
Robots in Warehousing
In the podcast, Pieter says he and his Covariant colleagues met with nearly two hundred companies and concluded that warehousing and manufacturing are the two fields where it's most natural to start introducing learning robots. The key is that these environments are too variable for traditional robots, which blindly repeat the same motion over and over, relying on very high precision components for accuracy rather than on a visual feedback loop. (I actually got my own start in machine vision working on a camera-based system to automatically slow or stop robots when approached by factory workers, which might otherwise weld them to a passing chassis.) Yet warehouses and factories are still quite controlled environments relative to, say, the floor of my kid’s playroom, so they hit a difficulty sweet spot for Covariant’s machines. Crucially, Covariant is able to target tasks with a low cost of failure, enabling their robots to learn from experience, including mistakes, in a way that they couldn’t in an automotive or aviation context.
Covariant is moving into a fulfillment niche that has seen huge expansion driven by online ordering. Over the last ten years the fulfillment industry has shifted from having humans fetch things from endless shelves to what is called a “goods-to-person” system, where an entire automated 3D grid system retrieves several items together in a tote. This is feasible with traditional robots because totes are standard shapes, mounted on standardized racks. But for robots this is also where a challenge begins: how can they decipher which items in the tote need to be separated and shipped to different addresses? For humans it’s very easy: the order for a video game controller goes to one address, the order for a phone charger goes to a different address. It's hard for a robot to do this kind of sorting because larger warehouses might contain over a million different items in storage.
Robots and “Difficulty Inversion”
Recognizing and picking up an object seems like a trivial skill, and is a great example of the “difficulty inversion” that up-ends our intuitions around machine intelligence: machines can find great chess moves, but struggle to pick up the pieces, let alone recognize them. Conversely, young children are terrible chess players, but can immediately pick up previously unseen pieces, or recognize playing sets made in new styles.
With traditional machine vision and robotics, coping with the variety of items in a modern warehouse would be quite hopeless. In fact, after I worked on an automated safety system, I was part of a team put together to consider a “vision + robotics” application of the same camera system that would have seen us use an arm to pick unsorted parts out of a bin and put them on a jig. At that time simply recognizing a single type of object in a bin of identical pieces and discerning its orientation was so challenging that we (robot-loving engineers) turned down the project.
The potential value of learning robots that include vision systems is huge. Previously, the requirement for very precise programming made robots extremely expensive to set up, limiting their use to large production runs. Their lack of a vision system to help them adjust their motion also increased costs, because the arm’s motion needed to be perfectly repeatable, requiring very high-end servos. Plus, the working environment needed to be extremely predictable. For example: if a car chassis stopped short by a centimeter, the welds would fail. Taken together, these limitations have restricted the use of robotics to high-volume high-end manufacturing in very controlled settings—effectively, their physical flexibility has been constrained by “mental” rigidity.
There’s an analogy here with the advent of AI in software: we’ve all struggled with systems that require keywords to be spelt exactly right, and we’ve all benefitted from the AI-driven ability of modern search engines to see past our typos.
Getting machines to understand the difference between items was incredibly difficult, and progress was really slow until 2012, when Geoff Hinton's lab at the University of Toronto was able to show that by training deep neural networks with lots of annotated images, machines could accurately identify the particular objects on which they were trained.
Today, Pieter and his fellow researchers at Berkeley are working on unsupervised learning and reinforcement learning. The obvious attraction of unsupervised learning is that supervised learning gets its supervision in the form of labelled data, which is always scarcer and more expensive than unlabelled data. (Imagine having to get a set of x-rays annotated by multiple human consultants.) The subtler draw of unsupervised learning is that supervised learning, in its current form, is embarrassingly inefficient by comparison with human or even animal learning. For example, when you teach a child (that’s a car, that's a dog), you don't have to give them 10,000 examples of a car or a dog; you give them one example and they can understand the differences. (Well, more or less: my youngest doesn’t yet distinguish reliably between cattle and horses.)
This issue is like one I brought up in a previous blog about Andrew Ng regarding “small data.” If we pitch a company on robotic defect detection, we may find that a requirement to start by producing thousands of broken products “because data” cools their interest.
Unsupervised (sometimes termed “self-supervised”) learning is gaining wider appeal in the AI community, with strong progress on images and audio and, strongest of all, on text, most dramatically illustrated by the recent beta release of OpenAI’s GPT-3.
GPT-3 is an autoregressive language model with an enormous 175 billion parameters (!). It is trained simply by feeding it a truncated piece of text and asking it to predict the next word, and it can produce text that is closer to human language than any previous model. The generated text is so similar to human writing that the researchers included a section in their original paper warning of potential misuse by bad actors, and several prominent philosophy professors have made statements about the model’s possible impact on society.
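To make the "predict the next word" idea concrete, here is a minimal sketch in Python. It is emphatically not how GPT-3 is built: it swaps the giant neural network for a toy word-pair counter, and the function names (make_training_pairs, train_bigram, generate) are invented for illustration. What it does show is why this kind of training needs no human annotation: every (context, next word) training pair is cut straight out of the raw text.

```python
from collections import Counter, defaultdict


def make_training_pairs(text):
    """Cut raw text into (context, next_word) pairs. No human labels needed."""
    words = text.split()
    return [(tuple(words[:i]), words[i]) for i in range(1, len(words))]


def train_bigram(text):
    """Toy stand-in for a language model: count which word follows which.

    Assumption for illustration: only the last word of the context is used.
    A real autoregressive model conditions on the whole context.
    """
    counts = defaultdict(Counter)
    for context, next_word in make_training_pairs(text):
        counts[context[-1]][next_word] += 1
    return counts


def generate(counts, start, max_words=5):
    """Autoregressive generation: feed each predicted word back in as input."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # this word was never seen with a successor
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


corpus = "the robot picks the robot sorts the robot picks"
model = train_bigram(corpus)
print(make_training_pairs("the robot picks"))
print(generate(model, "the", max_words=3))
```

The generation loop is the "autoregressive" part: each word the model emits becomes part of the input for predicting the next one.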
GPT-3 is strikingly versatile: its training set included many languages, including programming languages, and it is being applied to problems as diverse as document summarization and bug-finding. Crucially, the “background knowledge” accumulated from a 500GB dataset allows it to adapt rapidly to new tasks, in some cases without any supervised training or labelled data, an achievement that researchers in other domains are keen to match.
Robots of the Future
Housework is no great passion of mine; I am more than ready to surrender my ironing board to a helpful android. But Pieter notes in the podcast that human-like machines are not likely to be deployed anytime soon. Though researchers can create impressive results in a controlled environment, like a lab or an assembly line, our homes remain too complex for robots to cope with. He notes that autonomous vehicles face some major obstacles too: it’s hard to create a training simulation that is quite as rich in experience as real roads. One autonomous vehicle was famously confounded on encountering a waddling duck, closely pursued by an elderly lady in a wheelchair, a situation its engineers had not anticipated.
Pieter predicts that the best simulators that we'll see five years from now will be ones that are largely driven by real-world data collection. A key difference between robotic learning and human learning is that robots can pool their experience far more efficiently than humans can. Covariant’s picking robots pool their experience, so that what is learned by one is learned by all. It’s easy to see the compounding effect that will come as more robots are deployed: eventually each machine embodies millennia of “on the job” experience. We’re already seeing this trend with software in the field, for example in AI-based predictive machine maintenance that retrains its models with new data.
Unlike Pieter, I got into AI precisely because I was fascinated by the mental processes of science fiction robots as conceived by authors like Isaac Asimov, and it is incredibly exciting to listen to this inventor talk about learning machines emerging from the lab into the workplace.
To learn more about Intel’s work in AI, visit: https://intel.com/ai
To hear more podcasts about AI—we've got a stellar guest line up—look for future episodes with host Abigail Hing Wen at: https://www.intel.com/content/www/us/en/artificial-intelligence/podcast.html