New research from the Oxford Internet Institute (OII) exposes critical weaknesses in how Artificial Intelligence (AI) systems detect online hate speech involving emoji. Lead author of the study, OII DPhil researcher Hannah Rose Kirk, explains more.
Social media platforms have opened up unprecedented channels of communication. While this brings many benefits, it has also expanded the scope and virality of online harms. The volume of hate shared online far exceeds what human moderators can feasibly deal with, and AI content moderation systems have the potential to relieve the burden placed on these moderators. However, humans are creative in how they express hate, and the diversification in modalities of hate (such as the use of emoji) outpaces what AI systems can understand.
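To see why emoji pose a problem for text-based moderation, consider a minimal sketch of a toy keyword filter (the `BLOCKLIST` and `naive_flag` names are hypothetical, standing in for a text-only detection system; real moderation models are far more sophisticated, but the failure mode is analogous):

```python
# Toy stand-in for a text-only hate detector: flag posts containing
# blocklisted words. BLOCKLIST and naive_flag are illustrative only.
BLOCKLIST = {"pig"}

def naive_flag(text: str) -> bool:
    """Return True if any blocklisted word appears in the text."""
    return any(word in BLOCKLIST for word in text.lower().split())

print(naive_flag("you are a pig"))  # the plain-text insult is caught
print(naive_flag("you are a 🐷"))   # the emoji carries the same meaning but slips through
```

A human reader instantly maps 🐷 back to the insult, but a system trained only on textual patterns has never been shown that equivalence, so the emoji substitution evades detection.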
Yet AI systems learn complex societal phenomena and linguistic patterns only from repeated exposure in their training data, and it is only by exposing, then addressing, their weaknesses that we can hope to make them stronger. While humans can translate the visual meaning of emoji, AI language models process emoji as characters, not pictures. These language models have a comprehensive grasp of textual constructions, yet they still need to be taught what emoji mean in various contexts, and how different emoji condition the likelihood of hatefulness in a given tweet, post or comment.
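The point that language models see characters rather than pictures can be made concrete: to a text model, an emoji arrives as a Unicode code point (or a sequence of sub-word tokens derived from it), with no pixel data attached. A minimal illustration in Python:

```python
# What a text model actually receives: code points, not images.
# There is nothing visual here linking 😀 to a smiling face or 🐍 to a snake.
emoji = "😀🐍"
codepoints = [f"U+{ord(ch):04X}" for ch in emoji]
print(codepoints)  # ['U+1F600', 'U+1F40D']
```

Any association between `U+1F600` and happiness, or between 🐍 and an insult in a particular context, has to be learned from how those code points co-occur with other text in the training data, which is exactly why under-represented or novel emoji usages are hard for these models.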