A few months ago, Rev’s Head of Speech, Miguel Jette, wrote this Medium post about the importance he and Rev have placed on accounting for bias in Automatic Speech Recognition (ASR) AI.
It’s a great read, and you can check it out above. Here’s the TL;DR: bias based on gender, race, and ethnicity exists in AI, and that’s not good. We know how it happens: the data sets that train the models often don’t represent all races, genders, and ethnic groups equally. Rev is not merely aware of this problem (which is simple to describe but hard to overcome); we’re actively looking for ways to solve it once and for all.
Recently, a prospective client came to us saying that they too wanted to be particularly cognizant of any biases.
So they put us to the test, comparing our Word Error Rate (WER) to that of three major speech-to-text competitors (where, like golf, a lower score is better). After running Rev’s speech-to-text AI through a veritable gauntlet, they provided the following data (under condition of anonymity, given the competitive advantage Rev provides).
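For readers unfamiliar with the metric: WER counts the word-level substitutions (S), deletions (D), and insertions (I) needed to turn the reference transcript into the model’s output, divided by the number of words (N) in the reference, i.e. WER = (S + D + I) / N. Below is a minimal Python sketch of the standard dynamic-programming calculation; the function name and example sentences are our own illustration, not the client’s benchmarking harness.

```python
# Illustrative sketch of Word Error Rate (WER): the number of word-level
# edits (substitutions + deletions + insertions) needed to turn the
# reference transcript into the hypothesis, divided by the number of
# reference words. Names here are hypothetical, for illustration only.

def wer(reference: str, hypothesis: str) -> float:
    """Compute WER via word-level Levenshtein (edit) distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("WER is undefined for an empty reference")

    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # words match, no edit
            else:
                dp[i][j] = 1 + min(
                    dp[i - 1][j - 1],  # substitution
                    dp[i - 1][j],      # deletion
                    dp[i][j - 1],      # insertion
                )
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("see" -> "sea") out of five reference words:
print(wer("i can see the ship", "i can sea the ship"))  # 0.2
```

On a multi-clip test set like the one described below, WER is typically aggregated by summing edits across all clips and dividing by the total reference word count, rather than averaging per-clip scores.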
We’re proud to say these are the results for both race and gender.
Figure 1: WER comparison results by race
Figure 2: WER comparison results by gender
We should note that environmental factors can also impact WER scores.
Socioeconomic factors can come into play: certain demographics may have the means to purchase a recording device with a better microphone, and therefore capture better audio quality. A given demographic’s WER score could be better simply by virtue of that fact, not because of any bias in how the model handles the speaker’s accent.
In the case of the above tests, the prospective client included more than 800 audio clips of one to four minutes in length, precisely to account for these kinds of environmental factors and anomalies.
In any event, two things remain clear after this rigorous test.
First, because we have made solving gender-based and race-based AI bias a priority, we’re clearly seeing results.
Second, as long as there is a gap, there’s still work to be done.
In the coming weeks, we’ll be posting more on what’s known as “Accent Robustness,” how that plays into AI fairness, and what Rev is doing about it.
We look forward both to continuing our journey toward a fairer ASR model and to being candid and open about where and how we can improve.