Detecting Independent Pronoun Bias with Partially-Synthetic Data Generation

2020 
We report that state-of-the-art parsers consistently fail to identify "hers" and "theirs" as pronouns while correctly identifying the masculine equivalent "his". We find that the same biases exist in recent language models like BERT. While some of the bias comes from known sources, such as gender imbalances in training data, we find that the bias is amplified in the language models, and that linguistic differences between English pronouns that are not inherently biased can become sources of bias in some machine learning models. We introduce a new technique for measuring bias in models, using Bayesian approximations to generate partially-synthetic data from the model itself.
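The abstract does not spell out the generation procedure, so the sketch below is only one plausible instantiation, not the authors' method: use a masked language model to score candidate pronouns in a template sentence and to produce partially-synthetic pronoun-swapped variants. The model choice (bert-base-uncased), the template sentence, and the pronoun set are all assumptions, with the Hugging Face transformers library standing in for whatever tooling the paper uses.

```python
# Illustrative sketch only: NOT the paper's exact technique. The model,
# template sentence, and candidate pronouns below are assumptions.
from transformers import pipeline

# A fill-mask pipeline backed by BERT (assumed model choice).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Template with the pronoun masked out. Restricting candidates with
# `targets` lets us compare the model's scores for each pronoun directly;
# candidates missing from the vocabulary trigger a pipeline warning.
sentence = "The final decision was [MASK] to make."
candidates = ["his", "hers", "theirs"]

for result in fill_mask(sentence, targets=candidates):
    # token_str is the candidate pronoun; score is its probability
    # under the masked-LM distribution at the [MASK] position.
    print(f"{result['token_str']:>8s}  p = {result['score']:.4f}")

# Substituting each candidate into the template yields partially-synthetic
# variants of the original sentence, one per pronoun:
for pronoun in candidates:
    print(sentence.replace("[MASK]", pronoun))
```

A large gap between the score for "his" and the scores for "hers" or "theirs" on templates like this is the kind of asymmetry the abstract reports; repeating the comparison over many templates would turn the sketch into a rough bias measurement.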