Masking and BERT-based Models for Stereotype Identification (Modelos Basados en Enmascaramiento y en BERT para la Identificación de Estereotipos)

Authors:

  1. Montes y Gómez, Manuel
  2. Chulvi, Berta
  3. Sánchez-Junquera, Javier
  4. Rosso, Paolo
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2021

Issue: 67

Pages: 83-94

Type: Article


Abstract

Stereotypes about immigrants are a form of social bias increasingly present in human interactions on social networks and in political speeches. Computational linguistics has taken up this challenging task because of the rise in hate messages, offensive language, and discrimination that many people receive. In this work, we propose to identify stereotypes about immigrants using two different explainable approaches: a deep learning model based on Transformers, and a text masking technique recognized for its ability to deliver good, human-understandable results. Finally, we show the suitability of both models for the task and offer some examples of their advantages in terms of explainability.
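As an illustration of the text-masking idea mentioned in the abstract (in the spirit of text distortion; Stamatatos, 2017), the sketch below replaces every word outside a small list of topic-relevant terms with asterisks, so that only the vocabulary presumed relevant for stereotype detection remains visible. The term list, placeholder symbol, and function name are illustrative assumptions, not the paper's actual implementation.

```python
import re

# Hypothetical list of topic-relevant terms to preserve; the paper's real
# vocabulary selection is task-driven, this set is only for illustration.
RELEVANT_TERMS = {"immigrants", "they", "our", "country", "jobs"}

def mask_text(text, relevant=RELEVANT_TERMS, symbol="*"):
    """Replace every word not in `relevant` with asterisks of equal length."""
    def mask_token(match):
        token = match.group(0)
        return token if token.lower() in relevant else symbol * len(token)
    # Apply the masking to each alphabetic token; punctuation and spacing
    # are left untouched so the text structure stays human-readable.
    return re.sub(r"[A-Za-z]+", mask_token, text)

print(mask_text("They come to our country and take the jobs"))
# → They **** ** our country *** **** *** jobs
```

Because non-relevant words are distorted rather than deleted, a human reader can still see which surviving terms drove a classifier's decision, which is what makes the technique attractive for explainability.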

Bibliographic References

  • Alaparthi, S. and M. Mishra. 2021. BERT: A sentiment analysis odyssey. Journal of Marketing Analytics, pages 1–9.
  • Bodria, F., A. Panisson, A. Perotti, and S. Piaggesi. 2020. Explainability methods for natural language processing: Applications to sentiment analysis. In SEBD.
  • Bolukbasi, T., K.-W. Chang, J. Y. Zou, V. Saligrama, and A. T. Kalai. 2016. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Advances in neural information processing systems, 29:4349–4357.
  • Cañete, J., G. Chaperon, R. Fuentes, J.-H. Ho, H. Kang, and J. Pérez. 2020. Spanish pre-trained BERT model and evaluation data. In PML4DC at ICLR 2020.
  • Clark, K., U. Khandelwal, O. Levy, and C. D. Manning. 2019. What does BERT look at? an analysis of BERT’s attention. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 276–286, Florence, Italy, August. Association for Computational Linguistics.
  • Croce, D., D. Rossini, and R. Basili. 2019. Auditing deep learning processes through kernel-based explanatory models. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4037–4046, Hong Kong, China, November. Association for Computational Linguistics.
  • Danilevsky, M., K. Qian, R. Aharonov, Y. Katsis, B. Kawas, and P. Sen. 2020. A survey of the state of explainable ai for natural language processing. arXiv preprint arXiv:2010.00711.
  • Dennison, J. and A. Geddes. 2019. A rising tide? The salience of immigration and the rise of anti-immigration political parties in Western Europe. The Political Quarterly, 90(1):107–116.
  • Dev, S., T. Li, J. M. Phillips, and V. Srikumar. 2020. On measuring and mitigating biased inferences of word embeddings. In AAAI, pages 7659–7666.
  • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Fokkens, A., N. Ruigrok, C. Beukeboom, G. Sarah, and W. Van Atteveldt. 2018. Studying muslim stereotyping through microportrait extraction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).
  • Garg, N., L. Schiebinger, D. Jurafsky, and J. Zou. 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16):E3635– E3644.
  • Granados, A., M. Cebrián, D. Camacho, and F. De Borja Rodríguez. 2011. Reducing the loss of information through annealing text distortion. IEEE Transactions on Knowledge and Data Engineering, 23(7):1090–1102.
  • Islam, S. R., W. Eberle, S. K. Ghafoor, and M. Ahmed. 2021. Explainable artificial intelligence approaches: A survey. arXiv preprint arXiv:2101.09429.
  • Jarquín-Vásquez, H. J., M. Montes-y-Gómez, and L. Villaseñor-Pineda. 2020. Not all swear words are used equal: Attention over word n-grams for abusive language identification. In K. M. Figueroa Mora, J. Anzurez Marín, J. Cerda, J. A. Carrasco-Ochoa, J. F. Martínez-Trinidad, and J. A. Olvera-López, editors, Pattern Recognition, pages 282–292, Cham. Springer International Publishing.
  • Liang, P. P., I. M. Li, E. Zheng, Y. C. Lim, R. Salakhutdinov, and L.-P. Morency. 2020. Towards debiasing sentence representations. arXiv preprint arXiv:2007.08100.
  • Lippmann, W. 1922. Public Opinion. New York: Harcourt Brace.
  • Mathew, B., P. Saha, S. Muhie Yimam, C. Biemann, P. Goyal, and A. Mukherjee. 2020. HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection. arXiv e-prints, page arXiv:2012.10289, December.
  • Mullenbach, J., S. Wiegreffe, J. Duke, J. Sun, and J. Eisenstein. 2018. Explainable prediction of medical codes from clinical text. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1101–1111, New Orleans, Louisiana, June. Association for Computational Linguistics.
  • Nadeem, M., A. Bethke, and S. Reddy. 2020. StereoSet: Measuring stereotypical bias in pretrained language models. arXiv preprint arXiv:2004.09456.
  • Rauh, C. and J. Schwalbach. 2020. The parlspeech v2 data set: Full-text corpora of 6.3 million parliamentary speeches in the key legislative chambers of nine representative democracies. Harvard Dataverse.
  • Ribeiro, M. T., S. Singh, and C. Guestrin. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 1135–1144, New York, NY, USA. Association for Computing Machinery.
  • Sanguinetti, M., G. Comandini, E. Di Nuovo, S. Frenda, M. A. Stranisci, C. Bosco, T. Caselli, V. Patti, I. Russo, et al. 2020. HaSpeeDe 2 @ EVALITA 2020: Overview of the EVALITA 2020 hate speech detection task. In EVALITA 2020 Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, pages 1–9. CEUR.
  • Sanguinetti, M., F. Poletto, C. Bosco, V. Patti, and M. Stranisci. 2018. An Italian Twitter corpus of hate speech against immigrants. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).
  • Scheufele, D. A. 1999. Framing as a Theory of Media Effects. Journal of Communication, 49(1):103–122.
  • Stamatatos, E. 2017. Authorship attribution using text distortion. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), volume 1, pages 1138–1149.
  • Sánchez-Junquera, J., B. Chulvi, P. Rosso, and S. P. Ponzetto. 2021. How do you speak about immigrants? Taxonomy and StereoImmigrants dataset for identifying stereotypes about immigrants. Applied Sciences, 11(8).
  • Sánchez-Junquera, J., L. Villaseñor-Pineda, M. Montes-y-Gómez, P. Rosso, and E. Stamatatos. 2020. Masking domain-specific information for cross-domain deception detection. Pattern Recognition Letters, 135:122–130.
  • Tajfel, H., A. A. Sheikh, and R. C. Gardner. 1964. Content of stereotypes and the inference of similarity between members of stereotyped groups. Acta Psychologica, 22(3):191–201.
  • Tessler, H., M. Choi, and G. Kao. 2020. The anxiety of being asian american: Hate crimes and negative biases during the covid-19 pandemic. American Journal of Criminal Justice, 45(4):636–646.