Talk by Chris Sweeney at #FAT2020 on "Reducing sentiment polarity for demographic attributes in word embeddings using adversarial learning," with @Maryam_Najafian.
There are several types of bias encoded in language models, and this paper focuses on sentiment bias, where certain identity terms encode a more positive sentiment than others. #FAT2020
Various papers have studied the different possible sources of this bias, and this paper focuses on the word vectors themselves. #FAT2020
In particular, they define sentiment polarity via the word vectors: they build a positive/negative sentiment axis from sets of positive and negative words, then project identity-term word vectors onto that axis to see where each term falls. #FAT2020
A given identity term's sentiment score is where its embedding's projection lies on this axis. The goal is to reduce the polarization of a set of identity terms while preserving semantic meaning. #FAT2020
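A minimal sketch of that scoring scheme in numpy. All embeddings, word lists, and identity terms below are toy values invented for illustration; the paper builds its axis from real lexicons and pretrained vectors.

```python
import numpy as np

# Toy 3-d embeddings (hypothetical values, for illustration only).
emb = {
    "wonderful": np.array([ 1.0, 0.2, 0.1]),
    "great":     np.array([ 0.9, 0.1, 0.3]),
    "terrible":  np.array([-1.0, 0.1, 0.2]),
    "awful":     np.array([-0.8, 0.3, 0.1]),
    "name_a":    np.array([ 0.4, 0.5, 0.5]),   # stand-in identity terms
    "name_b":    np.array([-0.3, 0.6, 0.4]),
}

pos_words = ["wonderful", "great"]
neg_words = ["terrible", "awful"]

# Sentiment axis: difference of the positive and negative centroids.
axis = np.mean([emb[w] for w in pos_words], axis=0) \
     - np.mean([emb[w] for w in neg_words], axis=0)
axis /= np.linalg.norm(axis)

def sentiment_score(word):
    """Scalar projection of a word's vector onto the sentiment axis."""
    return float(emb[word] @ axis)

print(sentiment_score("name_a"))   # positive -> leans toward positive pole
print(sentiment_score("name_b"))   # negative -> leans toward negative pole
```

A polarized set of identity terms is one whose scores spread far from zero along this axis; depolarization pulls those projections toward zero.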
They use an adversarial technique: the model learns to minimize the distance between the polarized and depolarized word vectors, while the adversary maximizes the error between predicted sentiment polarity and the ground truth. #FAT2020
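The alternating objective can be sketched with linear models and hand-written gradients. This is a simplified stand-in for the paper's setup, not their implementation: the data, dimensions, learning rates, and the linear probe are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 64, 8
X = rng.normal(size=(n, d))       # toy "polarized" embeddings
s = X[:, 0].copy()                # pretend sentiment leaks through coord 0

W = np.eye(d)                     # debiasing transform (the "generator")
a = np.zeros(d)                   # adversary: linear sentiment probe
lr, lam = 0.05, 0.5

for _ in range(300):
    Xp = X @ W.T                  # candidate depolarized vectors
    # Adversary step: fit sentiment from the depolarized vectors
    # (gradient descent on squared prediction error).
    err = Xp @ a - s
    a -= lr * 2 * Xp.T @ err / n
    # Generator step: stay close to X (reconstruction term) while
    # *increasing* the adversary's error (gradient reversal).
    err = Xp @ a - s
    g_recon = 2 * (Xp - X).T @ X / n
    g_adv = 2 * np.outer(err, a).T @ X / n
    W -= lr * (g_recon - lam * g_adv)
```

At equilibrium the transform trades off faithfulness to the original vectors against how well any such probe can recover sentiment from them, which is the intuition behind the minimax objective described above.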
They evaluate whether the resulting embeddings are depolarized and the effect on fairness/accuracy in downstream tasks. They show that they reduce the polarity of names typically associated with different demographic groups. #FAT2020
Case study uses the Equity Evaluation Corpus, where they show improvement on a sentiment valence regression metric across different demographic categories. #FAT2020