All-in-One Hate Speech Detectors May Not Be What You Want
Description
The detection of hate speech has become an increasingly active research topic. State-of-the-art systems for automatically detecting hateful content report almost perfect performance on common data sets. However, "hate speech" is a very subjective term, and people with different backgrounds have different levels of tolerance for what constitutes hate. In this paper, we show the limitations of having a single classifier handle the problem of hate speech detection. We then propose building classifiers customized for different people instead of a single classifier. The main obstacle to achieving this goal is the scarcity of data. We therefore use transfer learning to overcome this issue, building the customized classifiers from a very limited amount of annotated data. In the first stage, we build a classifier on a large data set that classifies tweets into 3 classes (hate, offensive, clean); we refer to it as the general classifier. In the second stage, we ask 3 annotators with different backgrounds to re-annotate a small subset (600 tweets) of the original data set. We refer to this newly created data set as "the customized data set." We then fine-tune the general classifier on the customized data set to build a customized classifier for each annotator. On the corresponding customized data sets, the customized classifiers achieve accuracy 0.08, 0.06, and 0.11 higher than the general classifier. The results show that it is possible to start with a general classifier and adjust it to each individual despite the very limited amount of training data available for that individual.
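The two-stage procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the model, so a plain softmax regression stands in for it, the feature vectors and both labelling rules are synthetic, the 480/120 split of the 600 re-annotated items is arbitrary, and every function name is hypothetical.

```python
import math
import random

LABELS = ["hate", "offensive", "clean"]  # the three classes named in the abstract
DIM = 4                                  # assumed feature size (last feature is a bias)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    # weights holds one weight vector per class
    return softmax([sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights])

def predict(weights, x):
    probs = predict_proba(weights, x)
    return probs.index(max(probs))

def train(weights, data, lr, epochs):
    """Run SGD on softmax regression starting from `weights`; returns a new copy."""
    weights = [list(w) for w in weights]  # copy, so the starting model is untouched
    for _ in range(epochs):
        for x, y in data:
            probs = predict_proba(weights, x)
            for k in range(len(weights)):
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(len(x)):
                    weights[k][j] -= lr * grad * x[j]
    return weights

def accuracy(weights, data):
    return sum(predict(weights, x) == y for x, y in data) / len(data)

def make_data(n, label_fn, rng):
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(DIM - 1)] + [1.0]
        data.append((x, label_fn(x)))
    return data

# Synthetic stand-ins: the annotator is assumed to be more tolerant than the
# labels of the large data set, so the class boundaries are shifted.
def general_label(x):
    return 0 if x[0] > 0.3 else (1 if x[0] > -0.3 else 2)

def annotator_label(x):
    return 0 if x[0] > 0.6 else (1 if x[0] > 0.0 else 2)

rng = random.Random(0)
general_data = make_data(2000, general_label, rng)
custom_data = make_data(600, annotator_label, rng)   # 600 items, as in the abstract
custom_train, custom_test = custom_data[:480], custom_data[480:]

# Stage 1: train the general classifier from scratch on the large data set.
zero = [[0.0] * DIM for _ in LABELS]
general = train(zero, general_data, lr=0.1, epochs=5)

# Stage 2: fine-tune a copy of the general classifier on one annotator's labels.
customized = train(general, custom_train, lr=0.05, epochs=10)

print("general    on customized test set:", accuracy(general, custom_test))
print("customized on customized test set:", accuracy(customized, custom_test))
```

Repeating stage 2 once per annotator, each time starting from the same general weights, would give one customized classifier per person, which is the setup the abstract describes.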
Published in
2021 The 4th International Conference on Software Engineering and Information Management 165-170, 2021-01-16
ACM