Eight Times Computer Vision Hilariously Failed
We get it: computer vision and AI have come a long way recently, doing things that would have seemed impossible to most consumers as recently as half a decade ago. Companies everywhere are developing products, systems, neural networks and AI algorithms that learn and improve with every user interaction. But as with any tech innovation, there are bound to be flaws. In typical Silicon Valley fashion, where failure is a requirement, these slip-ups usually lead to progress, and sometimes they make for great comedy, more akin to the charmingly bumbling Johnny 5 than the sinister Skynet (unless that’s what they want us to think…).
Here are a few recent examples where computer vision features hilariously failed to do their jobs:
1) The iPhone X Launch Party’s Facial Recognition Lockout
It was pretty embarrassing for Craig Federighi, Apple’s senior vice president of software engineering, when he tried to show off the new iPhone’s three-dimensional Face ID feature, only to have the device refuse to unlock. To be fair, the snafu was actually due to the phone working too hard: it had been handled by so many people backstage that it kept trying to validate the faces it saw, then reverted to passcode mode for security. And maybe denying access in such cases is not such a bad thing. As Apple explained, the new feature is 20 times safer than Touch ID; just don’t rely on it if you have a twin or a sibling who looks like you, or if you’re under 13, because your facial features might not have developed enough to be distinguishable to the camera. In those cases, you’ll have to rely on tapping in your passcode, like some kind of iPhone 4 owner.
2) The Security Robot That Died in a Fountain
It was hard not to laugh when a K5 Autonomous Data Machine, a security robot equipped with thermal imaging, license plate recognition and other surveillance features, ended up dead in the fountain of the Washington DC mall it was protecting after falling down a flight of stairs. As an expert somewhat cruelly pointed out to the New York Times, even Roombas have infrared sensors that save them from that same fate, much like a human can see and/or feel that they’re at the edge of a drop-off. Just like its human counterpart, computer vision is much better when augmented by other senses. Knightscope, the K5’s maker, replaced the robot free of charge, so the mall shoppers can feel safe yet again (unless they find themselves in the path of a 300-lb piece of machinery hurtling down the stairs).
3) The Samsung Galaxy S8 Hack
The iPhone’s strongest competitor also has a flaw when it comes to facial recognition security. When the phone launched earlier this year, a user tested the feature and successfully unlocked the device by showing it a selfie on another device. Samsung says that if you want to be more secure, you should use the phone’s fingerprint or iris scanner instead. If you have friends or relatives prone to pranks and/or identity theft, go for the iris. Samsung might want to take a cue from Apple and make its facial detection three-dimensional, similar to how iPhones make you put in multiple angles of fingerprints. But hey, if you insist on face unlock and would rather not risk lewd photos and many levels of fraud and larceny, destroy all other pics of yourself ASAP. You do you.
4) The Bizarre Amazon Phone Cases
If you want to protect your phone with a case featuring a horrible random image, just head to Amazon. Whether it’s a joke or not, there’s a third-party seller, My-Handy-Design, that uses “AI” to pull random stock images from the categories of beauty treatments, drugs, medical treatments and DIY, puts them on the covers of its cases and, sometimes, prices them at over $20. Previous versions included a heroin needle in a spoon, and a current item is listed as “Icon of Elastic Orthopedic Compression Bandage for Ankle cell phone cover case iPhone 6.” The weirder ones are sold out, so if you need a fix of a man in adult diapers immediately, you’ll have to print your own. It’s not clear if the “Amazon AI,” as many articles have dubbed the bot, is scouring for weird search terms, incongruous word combinations, images or all of the above, but no matter: This is one hilarious and sometimes disturbing fail.
5) The Captionbot That Can’t Caption
“I can understand the content of any photograph and I’ll try to describe it as well as any human. I am still learning…” says the bio of Microsoft’s AI image analyzer. Everyone needs a bit of a training period, but more than a year after its launch, the program still comes out with plenty of blunders, such as mixing up genders, calling a person a cow and tacking on mood emojis that basically treat anyone who isn’t smiling as unhappy. It’ll be interesting to see the software develop, but for now, it’s a really fun way to see how bizarrely and/or blandly it’ll describe your photos. “I think this is a man and woman [nope] standing in front of a building [mind-blowing!] and they seem 😐😁 [they’re both really happy].” Either way, Captionbot won’t be winning any prizes for literature or comedy any time soon.
6) The Shelf-stacking Robot
Someday, we might get terrifying-looking robots that can actually pull off a simple task like putting a box on a shelf, but for now, we have Boston Dynamics’ creation biffing the one job it had to do: missing its target, knocking everything over, and then, to add insult to metallic injury, falling on its back. It happens to the best of us, bud, so hang in there until your next upgrade.
7) The Supercomputer That Couldn’t Recognize Itself
IBM’s immobile, talkative version of Wall-E did itself a disservice last year when it described a photo of itself as a subway platform. You can test it yourself here, and see if your baby picture is described as a grandmother, aged 55-64, or a pic of your dog matches up with “playground slide.” You know, the kind of labeling a child with a grasp of basic vocabulary could get right. (These are real examples.)
8) The Computer Vision API That Can’t Take the Noise
If you can look at a grainy image of a teapot or airplane and recognize what you’re seeing, congrats: you’re smarter than Google’s Cloud Vision API. Researchers found that adding, on average, 14.25% visual noise to an image was enough to totally throw off the company’s image labeling, resulting in the teapot being called “biology” and the plane a “bird” (to be fair, that is a slang term for aircraft, but it’s more fun to believe Google screwed up). It’s probably not something to put on your resume, but at least for now you can brag that you’re a tad more intelligent than one of the world’s most innovative companies.
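For the curious, here is a rough idea of what “adding noise” to a picture can look like. This is only a minimal sketch, not the researchers’ actual code: it assumes the noise is the salt-and-pepper kind (random pixels flipped to pure black or white), the function name add_impulse_noise and its density parameter are made up for illustration, and it never talks to Google’s API. It only roughs up an image with NumPy.

```python
import numpy as np

def add_impulse_noise(image, density=0.1425, seed=0):
    """Return a copy of `image` with roughly `density` of its pixels
    overwritten by pure black (0) or pure white (255).

    Hypothetical helper for illustration; the 14.25% default is the
    average figure quoted in the article, applied here per pixel."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    # Choose which pixel positions to corrupt (per pixel, not per channel).
    mask = rng.random(image.shape[:2]) < density
    # Each corrupted pixel becomes either black or white at random.
    values = rng.choice(np.array([0, 255], dtype=image.dtype),
                        size=int(mask.sum()))
    if noisy.ndim == 3:
        noisy[mask] = values[:, None]  # same value on every color channel
    else:
        noisy[mask] = values
    return noisy

# Stand-in for a real photo: a flat mid-gray 200x200 RGB image.
image = np.full((200, 200, 3), 128, dtype=np.uint8)
noisy = add_impulse_noise(image)

# Measure how much of the picture was actually touched.
changed = float(np.mean(np.any(noisy != image, axis=-1)))
print(f"fraction of pixels altered: {changed:.3f}")
```

Swap the gray square for a real teapot photo and your own eyes will most likely still see a teapot through the speckles, which is the whole joke.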
Illustrations by Chris Fernandez