Table of Contents
- Visual Impairment: Image Recognition APIs
- Hearing Impairment: Transcription and Vocalization APIs
- Language Assistance: Translation and Spelling APIs
More and more hackathons today embrace themes like social good and accessibility, and we love it. It’s pretty amazing to watch developers code projects like SaveURPlanet, an app to measure your carbon footprint, or a robot that acts as a companion to seniors in nursing homes (seriously!).
To help more hackers build cool projects like these, we’ve compiled some of our favorite APIs that embrace accessibility. Since RapidAPI lets you call APIs through one abstraction layer and export code snippets into multiple languages, there’s nothing stopping you from building a truly awesome and accessible project. Enjoy!
Visual Impairment: Image Recognition APIs
For people with blindness or visual impairments, technology can be invaluable for interpreting visual information through another medium. A visual recognition API works by identifying and tagging objects in an image, then returning a percent certainty of its prediction (ex. 99% sure that the object in this image is a cat). There are a lot of visual recognition APIs out there (you can read a great comparison here), but these are some of our favorites. Since these APIs are a little more technical, we added additional explanation.
1) Clarifai API
tl;dr: Use Clarifai to identify specific image sub-categories and train your own AI model.
The Clarifai API trains artificial intelligence “models” to recognize certain objects in images. While Clarifai has a general model that recognizes pretty much anything in an image, it also offers public models designed to categorize certain types of images: Food, Travel, NSFW, Wedding, Color, facial detection, and age/gender/ethnicity detection. These models help the API recognize more specific images within a category. For example, running the Food model on a picture of a salad will not only return the value “salad,” but can be as specific as identifying a Caprese salad or its individual ingredients.
Clarifai also gives you the option to train your own custom models, something that would normally require an advanced degree in artificial intelligence. Let’s say you want to train Clarifai to recognize pictures of a person or fictional character, say Erlich from Silicon Valley.
You train the model by uploading ten or more pictures of Erlich (we know you have more) and voila! The API can now return a percent probability that the image has Erlich Bachman in it.
Sign up for a free Clarifai account and grab a subscriptionKey. Then, test the Clarifai Public Model and Clarifai Custom Model APIs in your browser and export the code snippet right into your project.
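Once a prediction comes back, you mostly care about the confident concepts. Here is a minimal sketch of pulling those out of a Clarifai-style prediction response; the `outputs → data → concepts` shape follows Clarifai’s v2 predict API, but treat the field names as an assumption and verify them against the docs for your account.

```python
def top_concepts(response, threshold=0.9):
    """Return (name, confidence) pairs at or above the confidence threshold."""
    concepts = response["outputs"][0]["data"]["concepts"]
    return [(c["name"], c["value"]) for c in concepts if c["value"] >= threshold]

# Mock response, shaped like a Food-model prediction on a salad photo:
mock = {"outputs": [{"data": {"concepts": [
    {"name": "salad", "value": 0.99},
    {"name": "caprese salad", "value": 0.95},
    {"name": "tomato", "value": 0.62},
]}}]}

print(top_concepts(mock))  # [('salad', 0.99), ('caprese salad', 0.95)]
```

The same helper works unchanged for a custom model: Erlich either clears your threshold or he doesn’t.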
2) Amazon Rekognition API
tl;dr: Amazon’s AWS Rekognition is great for detecting and indexing faces, as well as broader concepts and events.
While Amazon’s Rekognition API also detects objects in images, where it really shines is facial recognition and indexing. It can store faces as well as compare two faces to see if there’s a match. Amazon Rekognition can also recognize certain objects in an image (what they call “labels”), events (ex. wedding, graduation), or even concepts (ex. landscape, evening).
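A quick sketch of what label detection looks like in practice, using boto3 (AWS’s official Python SDK). The `detect_labels` call and its response shape are real boto3/Rekognition; the filtering helper and the 80% cutoff are our own choices, and actually running `detect_labels` requires AWS credentials.

```python
def label_names(response, min_confidence=80.0):
    """Filter a Rekognition DetectLabels response down to confident labels."""
    return [l["Name"] for l in response["Labels"] if l["Confidence"] >= min_confidence]

def detect_labels(bucket, key):
    """Call Rekognition on an S3-hosted image (needs AWS credentials configured)."""
    import boto3  # pip install boto3
    client = boto3.client("rekognition")
    return client.detect_labels(Image={"S3Object": {"Bucket": bucket, "Name": key}})

# Mock response: Rekognition spotted a wedding, and maybe a landscape.
mock = {"Labels": [{"Name": "Wedding", "Confidence": 97.2},
                   {"Name": "Landscape", "Confidence": 54.1}]}

print(label_names(mock))  # ['Wedding']
```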
3) Microsoft Cognitive Services API Suite
tl;dr: Microsoft offers multiple image recognition APIs; the suite is especially good at reading text from images (OCR) and detecting emotion.
Microsoft has not one, but three APIs dedicated to visual recognition: Computer Vision, Face, and Emotion. One downside is that each of these separate APIs needs its own API key. Computer Vision is a general image tagging platform (similar to those mentioned previously) with one huge benefit: an OCR endpoint. Optical character recognition, or OCR, identifies and reads the text that appears in an image.
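For a screen reader or read-aloud feature, you typically want the OCR result flattened into plain text. This sketch walks the nested `regions → lines → words` JSON that the classic Computer Vision OCR endpoint returns; check the field names against the current Cognitive Services docs before relying on them.

```python
def ocr_to_text(response):
    """Join every recognized word, line by line, into readable text."""
    lines = []
    for region in response.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(w["text"] for w in line.get("words", [])))
    return "\n".join(lines)

# Mock OCR response for a photographed sign:
mock = {"regions": [{"lines": [
    {"words": [{"text": "NO"}, {"text": "PARKING"}]},
    {"words": [{"text": "ANY"}, {"text": "TIME"}]},
]}]}

print(ocr_to_text(mock))  # NO PARKING\nANY TIME
```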
Microsoft’s Face API exists to (you guessed it!) isolate faces from images. Microsoft’s Emotion API, on the other hand, reads the emotions on people’s faces. The emotions detected are happiness, sadness, surprise, anger, fear, contempt, disgust, and neutral.
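The Emotion API scores every emotion per face, so an app usually just wants the winner. A minimal sketch, assuming each detected face carries a `scores` dictionary of emotion-to-probability values (the shape the Emotion API documents, but verify against the current docs):

```python
def dominant_emotion(face):
    """Return the highest-scoring emotion for one detected face."""
    return max(face["scores"], key=face["scores"].get)

# Mock response for an image with one clearly happy face:
mock = [{"scores": {"happiness": 0.92, "sadness": 0.01, "surprise": 0.03,
                    "anger": 0.01, "fear": 0.01, "contempt": 0.0,
                    "disgust": 0.0, "neutral": 0.02}}]

print([dominant_emotion(f) for f in mock])  # ['happiness']
```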
*Bonus*: IBM Watson Speech to Text API
While this API is not an image recognition API, it can be used to help people with visual impairments express themselves via text, something pretty important on the web! Test out the IBM Watson Speech to Text (STT) API for yourself.
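Watson STT returns its transcription as a list of result chunks, each with ranked alternatives. This sketch stitches the top alternative from each chunk into one string; the `results → alternatives → transcript` shape matches Watson’s STT JSON, but double-check it against IBM’s docs.

```python
def best_transcript(response):
    """Stitch together the top alternative from each recognized chunk."""
    return " ".join(
        r["alternatives"][0]["transcript"].strip()
        for r in response.get("results", [])
    )

# Mock response with two recognized chunks:
mock = {"results": [
    {"alternatives": [{"transcript": "hello world ", "confidence": 0.94}]},
    {"alternatives": [{"transcript": "how are you "}]},
]}

print(best_transcript(mock))  # hello world how are you
```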
Hearing Impairment: Transcription and Vocalization APIs
For those with hearing impairments or deafness, transcription APIs can greatly improve accessibility to existing applications.
1) Scale API
For example, the Scale API uses a combination of human and machine intelligence to transcribe audio. You create a transcription task, and the humans at Scale (assisted by machine learning) return an accurate transcription. You can test the Scale API on RapidAPI and export the code snippet there.
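Conceptually, creating a task is just POSTing a small JSON payload and giving Scale a callback URL to hit when the transcription is done. The field names below (`attachment`, `attachment_type`, `callback_url`, `instruction`) are illustrative assumptions, not Scale’s exact contract; export the snippet from RapidAPI for the real parameter names.

```python
def transcription_task(audio_url, callback_url):
    """Build an example payload for an audio-transcription task (field names hypothetical)."""
    return {
        "attachment": audio_url,
        "attachment_type": "audio",
        "callback_url": callback_url,  # Scale POSTs the finished transcript here
        "instruction": "Transcribe this audio clip verbatim.",
    }

payload = transcription_task("https://example.com/clip.mp3",
                             "https://example.com/scale-callback")
print(payload["attachment_type"])  # audio
```

Sending it is then a single authenticated POST with whatever HTTP client your project already uses.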
2) IBM Watson Text to Speech
Another useful API for those with hearing impairments is the IBM Watson Text to Speech (TTS) API. While we don’t have that one on the RapidAPI marketplace yet, you can learn more about it here. People who have difficulty speaking, or who simply prefer not to, can use it to turn their text into spoken word.
Social Cues: Emotion Recognition APIs
In the last few years, people have become more aware of accessibility issues around autism and those on the spectrum.
1) Microsoft Emotion API
Since a common diagnostic factor is difficulty detecting situational cues or recognizing emotion from facial expressions, we thought it was worth mentioning Microsoft’s Emotion API again. It can detect happiness, sadness, surprise, anger, fear, contempt, disgust, and neutral. This functionality could be helpful not only for improving visual accessibility, but also for navigating social situations.
2) IBM Watson Text Analysis APIs
There is actually a whole field of tools used to recognize emotion from text and images. While not all may be directly related to accessibility, they can be fun to play with and test. If this topic interests you, check out the IBM Watson Tone Analyzer and Personality Insights APIs. You’ll need an IBM Watson account, username/password, and a large body of text, but from there, you can analyze tone, emotion, and other personality insights by making a call on RapidAPI.
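Tone Analyzer scores several tones per document, and you usually only surface the strong ones. A minimal sketch, assuming the `document_tone → tones` response shape from Tone Analyzer v3 (verify against IBM’s current docs):

```python
def strong_tones(response, threshold=0.5):
    """List document-level tones the analyzer scored above the threshold."""
    tones = response["document_tone"]["tones"]
    return [(t["tone_name"], t["score"]) for t in tones if t["score"] > threshold]

# Mock response for an upbeat, not especially analytical, message:
mock = {"document_tone": {"tones": [
    {"tone_id": "joy", "tone_name": "Joy", "score": 0.78},
    {"tone_id": "analytical", "tone_name": "Analytical", "score": 0.31},
]}}

print(strong_tones(mock))  # [('Joy', 0.78)]
```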
Language Assistance: Translation and Spelling APIs
Accessibility means letting in the most people possible. Here are APIs that can help those who speak a different language or who have other difficulties with text.
1) Bing Spell Check API
Everyone misspells words or has a typo now and again, but for people with dyslexia or other learning disabilities, it can be especially frustrating. Embed the Bing Spell Check functionality to your app to make it even easier for people to express themselves clearly. To make it even easier, you could also incorporate the IBM Watson Speech to Text (STT) API so people can say (not type) their inputs.
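Bing Spell Check flags misspelled tokens with character offsets and ranked suggestions, leaving it to you to splice the corrections back into the text. This sketch applies the top suggestion for each flagged token, working right to left so earlier offsets stay valid; the `flaggedTokens`/`suggestions` shape follows the Bing Spell Check v7 response, but confirm it against Microsoft’s docs.

```python
def apply_corrections(text, response):
    """Replace each flagged token with its top suggestion, right to left."""
    flagged = sorted(response.get("flaggedTokens", []),
                     key=lambda t: t["offset"], reverse=True)
    for tok in flagged:
        if not tok.get("suggestions"):
            continue  # nothing to replace this token with
        best = tok["suggestions"][0]["suggestion"]
        start = tok["offset"]
        end = start + len(tok["token"])
        text = text[:start] + best + text[end:]
    return text

# Mock response flagging two typos:
mock = {"flaggedTokens": [
    {"offset": 0, "token": "helo",
     "suggestions": [{"suggestion": "hello", "score": 0.9}]},
    {"offset": 5, "token": "wrold",
     "suggestions": [{"suggestion": "world", "score": 0.9}]},
]}

print(apply_corrections("helo wrold", mock))  # hello world
```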
Hooray for accessibility!
We love that the tech and hackathon scenes have embraced accessibility so wholeheartedly. Have you built any projects with an emphasis on accessibility? Can you think of an API we missed? Let us know in the comments below, on Facebook, or on Twitter. We’d love to see them!