0xAB

Andrei Barbu

Andrei is a research scientist at MIT working on natural language processing, computer vision, and robotics, with a touch of neuroscience.


ObjectNet — uncovering a wide gulf between humans and machines

Easy for humans, hard for machines

Computer vision has a big problem: we constantly over-promise and under-deliver. On benchmarks machines work amazingly well, but in the real world they fail constantly. You just don’t get 95% accuracy detecting cups in real-world videos the way you do on benchmarks. ObjectNet is available now: learn more and get the dataset at objectnet.dev

One possible explanation is that our benchmarks are brittle because of human limitations. Machines are far better than we are at finding patterns in data that we would never notice, and they seem to be using this ability to hack our benchmarks. Instead of genuinely recognizing 1-out-of-1000 object classes on ImageNet, they exploit the fact that cups tend to be upright and almost never on their side, regularities in how we tend to take pictures of objects, and the fact that we tend to put objects on the same backgrounds: cups in kitchens and pillows in bedrooms. The human visual system is robust to violations of these assumptions; you can recognize a cup in a bedroom just fine. Machines are not; they tend to fail miserably.

We collected ObjectNet to address this problem (Barbu et al., 2019). It controls for biases in existing datasets by collecting images with specific backgrounds, at specific rotations, and from specific viewpoints. In ObjectNet, cups aren’t much more likely to be in kitchens than in bedrooms, and their rotations are much more evenly distributed.
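To make the idea of controlling for these biases concrete, here is a minimal sketch of how photo requests could be spread evenly over backgrounds, rotations, and viewpoints. It is an illustration only, not the pipeline used to collect ObjectNet: the control values, the `make_assignments` helper, and the class names are all hypothetical.

```python
import itertools
import random
from collections import Counter

# Hypothetical control values, chosen only for illustration.
BACKGROUNDS = ["kitchen", "bedroom", "living_room"]
ROTATIONS = ["upright", "on_side", "upside_down"]
VIEWPOINTS = ["front", "top", "side"]


def make_assignments(object_classes, per_class, seed=0):
    """Give every requested photo a (background, rotation, viewpoint) triple.

    Each class cycles through a shuffled list of all combinations, so no
    combination is used more than one time more often than any other.
    """
    rng = random.Random(seed)
    combos = list(itertools.product(BACKGROUNDS, ROTATIONS, VIEWPOINTS))
    assignments = []
    for cls in object_classes:
        order = combos[:]
        rng.shuffle(order)  # randomize order without changing the balance
        for i in range(per_class):
            background, rotation, viewpoint = order[i % len(order)]
            assignments.append({
                "class": cls,
                "background": background,
                "rotation": rotation,
                "viewpoint": viewpoint,
            })
    return assignments


# 27 photos per class covers each of the 3 * 3 * 3 combinations exactly once.
tasks = make_assignments(["cup", "pillow"], per_class=27)
cup_backgrounds = Counter(t["background"] for t in tasks if t["class"] == "cup")
```

With this kind of assignment a cup is requested in a bedroom exactly as often as in a kitchen, so the background carries no information about the object class.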

ObjectNet breaks machine performance

Machines perform very poorly on ObjectNet, far worse than on other datasets. This clearly shows that machines are exploiting weaknesses that pervade our datasets: current datasets have preferred views for objects and preferred backgrounds, which make object recognition far simpler for machines than it is in the real world. Recent advances have closed about half of the gap we found, but ObjectNet is far from solved. Keep up to date on paperswithcode.
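The gap discussed here is simply the difference in accuracy when the same model is scored on a standard benchmark and on a bias-controlled test set. The sketch below shows that computation; the function names are hypothetical and the labels are toy placeholders, not real model outputs or reported results.

```python
def top1_accuracy(predictions, labels):
    """Fraction of examples whose predicted class equals the true class."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def accuracy_drop(benchmark_accuracy, controlled_accuracy):
    """Accuracy lost when moving from the benchmark to the controlled set."""
    return benchmark_accuracy - controlled_accuracy


# Toy placeholder data, made up purely to exercise the functions.
benchmark_true = ["cup", "cup", "pillow", "chair"]
benchmark_pred = ["cup", "cup", "pillow", "chair"]
controlled_true = ["cup", "cup", "pillow", "chair"]
controlled_pred = ["cup", "bowl", "pillow", "table"]

benchmark_acc = top1_accuracy(benchmark_pred, benchmark_true)
controlled_acc = top1_accuracy(controlled_pred, controlled_true)
drop = accuracy_drop(benchmark_acc, controlled_acc)
```

For a fair comparison, both accuracies should be computed with the same model and over the same set of object classes.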

A massive drop in performance when evaluating with ObjectNet

ObjectNet isn’t much harder for humans

It’s very interesting that humans don’t rely much on these features. As you can see below, all of the objects are easy to recognize. In more recent work we quantify image difficulty for the first time! See https://objectnet.dev/flash/

Examples of the ObjectNet controls in action
Barbu, Mayo, Alverio, Luo, Wang, Gutfreund, Tenenbaum & Katz (2019). Objectnet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. Advances in neural information processing systems (NeurIPS), 32. objectnet.dev

@article{barbu2019objectnet,
    title={Objectnet: A large-scale bias-controlled dataset for pushing the limits of object recognition models},
    author={Barbu, Andrei and Mayo, David and Alverio, Julian and Luo, William and Wang, Christopher and Gutfreund, Dan and Tenenbaum, Josh and Katz, Boris},
    journal={Advances in neural information processing systems (NeurIPS)},
    volume={32},
    year={2019},
    URL={objectnet.dev}
}