Microsoft is putting its artificial intelligence technology to work to help the visually impaired.
The company said on Wednesday that it’s releasing an iOS app called Seeing AI that uses an iPhone’s camera to tell people about objects in front of them. The app shows off Microsoft’s current capabilities in AI, while also addressing a group that the technology industry too often ignores.
Microsoft first revealed the technology to an audience of developers last year, but at the time it was only shown working on smart glasses. Now anyone with an iPhone or iPad can try it…
Google’s Eve Andersson tells Co.Design how today’s accessibility problems could lead to improvements in robots, Google Maps, and even YouTube.
TEACHING AIS HOW TO NOTICE, NOT JUST SEE
Like Microsoft, which recently announced a computer vision-based accessibility project called Seeing AI, Google's interested in how to convey visual information to blind users through computer vision and natural language processing. And like Microsoft, Google is dealing with the same problem: How do you communicate that information without simply reading aloud an endless stream-of-consciousness list of everything a computer sees around itself, no matter how trivial each item may be?
Thanks to Knowledge Graph and machine learning—the same principles that let Google search your photos by content (like photos of dogs, or photos of people hugging)—Andersson tells me that Google is already good enough at identifying objects to pick them out of a video stream in real time. So a blind user wearing a Google Glass-like wearable, or a body cam hooked up to a smartphone, could get real-time updates on what's around them.
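Neither company has published its pipeline, but the basic shape of what Andersson describes—run a classifier on each camera frame, keep only confident detections, and turn them into a spoken phrase—can be sketched roughly like this. Everything here is invented for illustration: `classify_frame` is a stand-in for a real image-recognition model, and the frames are faked as pre-labeled data.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One object a classifier thinks it sees in a frame."""
    label: str
    confidence: float


def classify_frame(frame):
    # Stand-in for a real image classifier (e.g. a CNN over raw
    # pixels); here each "frame" is already a list of
    # (label, confidence) pairs, so we just wrap them.
    return [Detection(label, conf) for label, conf in frame]


def describe_stream(frames, threshold=0.6):
    """Yield one spoken-style description per frame, dropping
    low-confidence guesses so the user isn't read noise."""
    for frame in frames:
        labels = [d.label for d in classify_frame(frame)
                  if d.confidence >= threshold]
        yield ", ".join(labels) if labels else "nothing recognized"


frames = [
    [("dog", 0.92), ("bench", 0.40)],      # bench is too uncertain to say
    [("person", 0.85), ("bicycle", 0.77)],
]
print(list(describe_stream(frames)))
# → ['dog', 'person, bicycle']
```

In a real system the descriptions would be handed to a text-to-speech engine rather than printed, and the confidence threshold is exactly the kind of knob that separates a useful narrator from a chatterbox.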
But again, the big accessibility problem that needs to be solved here is one of priority.
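One simple way to frame that priority problem: even if every object in a frame is correctly recognized, only the few that matter most to the user should be spoken. A minimal sketch, assuming a hand-tuned importance table (the labels and weights below are invented, not anything Google or Microsoft has described):

```python
# Hypothetical importance weights: a user probably needs to hear
# "car" before "trash can", regardless of which the camera saw first.
IMPORTANCE = {
    "car": 1.0,
    "person": 0.8,
    "door": 0.6,
    "trash can": 0.2,
    "poster": 0.1,
}


def prioritize(seen, max_items=2):
    """Keep only the few most important objects from one frame,
    ranked by the importance table (unknown labels rank last)."""
    ranked = sorted(seen, key=lambda label: IMPORTANCE.get(label, 0.0),
                    reverse=True)
    return ranked[:max_items]


print(prioritize(["poster", "trash can", "car", "person"]))
# → ['car', 'person']
```

A production system would need far richer signals than a static table—motion, distance, and the user's stated goal, for a start—but the core move is the same: rank, then truncate, so the narration stays useful.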
Much has been made recently of Google's advances in natural language processing—its ability to understand and transcribe human speech. Google's accessibility efforts lean heavily upon natural language processing, particularly its latest innovation, Voice Access. But Andersson says computers need to understand more than just speech. Forget natural language processing: computers need non-language processing.
TAKING NAVIGATION BEYOND GOOGLE MAPS
Sighted users are so used to taking directions from computers that many people (like me) can barely find their way around without first plugging an address into Waze. But moving sighted individuals from point A to point B, across well-plotted roads and highways, is navigation on a macro scale. Things get much more complicated when you're trying to direct a blind person down a busy city street, or from one store to another inside a shopping mall. Now, you're directing people on a micro scale, in an environment that is not as well understood or documented as roads are.