Morality Beyond the Lines: Detecting Moral Sentiment Using AI-Generated Synthetic Context

2021 
Moral rhetoric is defined as the language used to advocate or take a moral stance towards an issue by invoking or making salient various moral concerns. Moral Foundations Theory (MFT) can be used to evaluate expressions of moral sentiment. MFT proposes that there are five innate, universal moral foundations that exist across cultures and societies: care/harm, fairness/cheating, loyalty/betrayal, authority/subversion, and sanctity/degradation. We investigate the case in which moral concerns are not expressed explicitly through keywords from the Moral Foundations Dictionary (MFD), that is, the case of hidden context. While members of high-context groups can read meanings "between the lines," word-counting and other NLP methods cannot detect or quantify moral sentiment when the related keywords are not present to be counted. To explore hidden context, we leverage a pretrained generative language model, the Generative Pre-trained Transformer 2 (GPT-2), which uses deep learning to produce human-like text, to generate new stories. A human writer usually provides several prompting sentences, and the GPT model produces the rest of the story. To adapt GPT-2 to a specific domain (in this paper, local populations' attitudes towards US military bases located in foreign countries), the model can be fine-tuned on a training dataset from that domain. Fine-tuning means taking the weights of a trained neural network and using them as the initialization for a model trained on the domain-specific dataset. Restricted language codes (in which meanings are not expressed explicitly) can be used as prompting sentences, and fine-tuned GPT models can generate multiple versions of synthetic contextual stories. Because GPT-2 was pretrained on millions of examples from a large text corpus, the generated content reflects culture-related knowledge and common sense. In addition, because the fine-tuned models were trained on the domain dataset, the generated content reflects local people's reactions within that domain, which in this paper is attitudes towards US military bases. After the base or fine-tuned GPT-2 model generates multiple versions of synthetic text, some versions may contain keywords defined in the MFD. Our hypothesis is that the percentages of keywords related to the five moral foundations can serve as statistical indicators for those foundations. Our experiments show that the five moral foundation categories with the most significant percentage changes between positive and negative stories generated by the fine-tuned models are HarmVice, AuthorityVirtue, InGroupLoyalty, FairnessVirtue, and FairnessVice. These results are in line with several major issues between US overseas military bases and local populations identified in well-known existing studies. The main contribution of this research is the use of AI-generated synthetic context for detecting and quantifying moral sentiment.
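The pipeline described above (prompt a fine-tuned GPT-2 model with a restricted-code sentence, generate several synthetic stories, then measure the percentage of MFD keywords per moral foundation) can be sketched as follows. This is a minimal illustration assuming the Hugging Face transformers library; the model name, the prompt, and the toy keyword lists are placeholders, not the authors' actual fine-tuned checkpoint or the full MFD lexicon.

```python
# Sketch: generate synthetic context with GPT-2 and count MFD keyword percentages.
# Assumes `pip install transformers torch`; MFD lists below are illustrative only.
import re
from collections import Counter

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Base GPT-2; the paper would instead load a checkpoint fine-tuned on domain text
# (local attitudes towards US military bases), e.g. a local directory path.
MODEL_NAME = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(MODEL_NAME)
model = GPT2LMHeadModel.from_pretrained(MODEL_NAME)

# Toy stand-in for the Moral Foundations Dictionary; the real MFD has
# hundreds of (stemmed) keywords per foundation category.
MFD = {
    "HarmVice":        {"harm", "hurt", "suffer", "attack"},
    "AuthorityVirtue": {"obey", "duty", "respect", "law"},
    "InGroupLoyalty":  {"loyal", "community", "nation", "together"},
    "FairnessVirtue":  {"fair", "justice", "equal", "rights"},
    "FairnessVice":    {"unfair", "cheat", "bias", "discriminate"},
}

def generate_stories(prompt, n=5, max_new_tokens=120):
    """Generate n synthetic continuations of a restricted-code prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.95,
        max_new_tokens=max_new_tokens,
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def mfd_percentages(text):
    """Percentage of tokens in the text matching each MFD foundation."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for foundation, keywords in MFD.items():
            if tok in keywords:
                counts[foundation] += 1
    total = max(len(tokens), 1)
    return {f: 100.0 * counts[f] / total for f in MFD}

if __name__ == "__main__":
    # Hypothetical restricted-code prompt: the moral stance is implicit.
    prompt = "The new base outside town changed everything for us."
    for story in generate_stories(prompt, n=3):
        print(mfd_percentages(story))
```

Comparing these per-foundation percentages across stories generated from positive versus negative prompts is the statistical comparison the abstract refers to.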