Originally published in Tech@Facebook, Jan 19, 2021.
When Facebook users scroll through their News Feed, they find all kinds of content — articles, friends’ comments, event invitations, and of course, photos. Most people are able to instantly see what’s in these images, whether it’s their new grandchild, a boat on a river, or a grainy picture of a band onstage. But many users who are blind or visually impaired (BVI) can also experience that imagery, provided it’s tagged properly with alternative text (or “alt text”). A screen reader can describe the contents of these images using a synthetic voice and enable people who are BVI to understand images in their Facebook feed.
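For readers unfamiliar with how alt text reaches a screen reader, the short sketch below shows the general idea in browser-side TypeScript. The image URL and the description string are made-up placeholders for illustration, not Facebook's actual markup.

```typescript
// Minimal sketch: attaching alt text to an image so assistive technology can announce it.
// The URL and description below are hypothetical placeholders.
const photo = document.createElement("img");
photo.src = "https://example.com/photos/boat.jpg";
photo.alt = "A small boat on a river at sunset"; // what a screen reader reads aloud
document.body.appendChild(photo);
```

When an image carries no alt attribute, a screen reader has little to announce beyond the file name, which is why automatically generated descriptions matter.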
Unfortunately, many photos are posted without alt text, so in 2016 we introduced a new technology called automatic alternative text (AAT). AAT — which was recognized in 2018 with the Helen Keller Achievement Award from the American Foundation for the Blind — utilizes object recognition to generate descriptions of photos on demand so that blind or visually impaired individuals can more fully enjoy their News Feed. We’ve been improving it ever since and are excited to unveil the next generation of AAT.
The latest iteration of AAT represents multiple technological advances that improve the photo experience for our users. First and foremost, we’ve expanded the number of concepts that AAT can reliably detect and identify in a photo by more than 10x, which in turn means fewer photos without a description. Descriptions are also more detailed, with the ability to identify activities, landmarks, types of animals, and so forth — for example, “May be a selfie of 2 people, outdoors, the Leaning Tower of Pisa.”
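To make the output format concrete, here is a hypothetical TypeScript sketch of how a set of detected concepts might be assembled into one of these hedged, comma-separated descriptions. The Concept type, the confidence threshold, and the ordering rule are assumptions made for illustration; they are not AAT's actual pipeline.

```typescript
// Hypothetical sketch: composing an AAT-style description ("May be ...")
// from per-concept detection scores. Threshold and ordering are assumptions.
interface Concept {
  label: string;      // e.g. "a selfie of 2 people", "outdoors"
  confidence: number; // model score in [0, 1]
}

function composeAltText(concepts: Concept[], threshold = 0.8): string {
  const reliable = concepts
    .filter((c) => c.confidence >= threshold)
    .sort((a, b) => b.confidence - a.confidence)
    .map((c) => c.label);

  if (reliable.length === 0) {
    return "No photo description available.";
  }
  // Descriptions are deliberately hedged because detections are probabilistic.
  return `May be ${reliable.join(", ")}.`;
}

console.log(
  composeAltText([
    { label: "a selfie of 2 people", confidence: 0.93 },
    { label: "outdoors", confidence: 0.88 },
    { label: "the Leaning Tower of Pisa", confidence: 0.84 },
    { label: "a dog", confidence: 0.12 }, // below threshold, dropped
  ])
);
// -> "May be a selfie of 2 people, outdoors, the Leaning Tower of Pisa."
```

The hedged "May be" phrasing reflects that the underlying detections are probabilistic rather than certain.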