Medidas acústico-prosódicas discriminam as emoções de falantes do português brasileiro
Acoustic-prosodic measures discriminate the emotions of Brazilian Portuguese speakers
Alexandra Christine de Aguiar; Ana Carolina Constantini; Ronei Marcos de Moraes; Anna Alice Almeida
Abstract
Purpose: To verify whether acoustic-prosodic measures differ across the emotional states of speakers of Brazilian Portuguese (BP). Methods: The sample consisted of 182 audio signals produced by actors (professionals or students) in the semi-spontaneous speech task “Look at the blue plane”, uttered in six emotions (joy, sadness, fear, anger, surprise, disgust) and in a neutral emission. Acoustic-prosodic measures of duration, fundamental frequency, and intensity were extracted for each emotion. The Friedman test was used to verify whether these measures discriminate between emotions. Results: The acoustic-prosodic analysis revealed significant variations between emotions. Disgust stood out for having the highest rate of utterance, with higher values of duration. In contrast, joy exhibited faster speech, with lower duration values and greater intensity. Sadness and fear were marked by lower intensity and lower frequencies, and fear presented the lowest positive skewness values of z-score and smoothed z-score, with less lengthening of the segments. Anger stood out for its higher vocal intensity, while surprise recorded the highest fundamental frequency values. Conclusion: The acoustic-prosodic measures proved effective for differentiating emotions in BP speakers. These parameters have great potential to discern different emotional states, broaden knowledge about vocal expressiveness, and open possibilities for emotion recognition technologies with applications in artificial intelligence and mental health.
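As an illustration of the statistical step named in the Methods, the following is a minimal sketch of a Friedman test in plain Python. It is not the authors' analysis script. The function names (`average_ranks`, `chi2_sf`, `friedman_test`) and the small table of mean fundamental frequency values are hypothetical, invented only to show the layout of one row per speaker and one column per emotion. The sketch assumes no tied values within a speaker, so no tie correction is applied.

```python
import math


def average_ranks(values):
    """Rank values from 1 to k; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail chi-square probability, computed from the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def friedman_test(table):
    """Friedman statistic and chi-square approximation of its p-value.

    table: one row per speaker (block), one column per emotion (condition).
    Assumption: no tie correction is applied.
    """
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return stat, chi2_sf(stat, k - 1)


if __name__ == "__main__":
    # Hypothetical mean F0 values (Hz), NOT data from the study.
    # Columns: neutral, sadness, surprise. Rows: five invented speakers.
    f0_by_speaker = [
        [182.0, 170.5, 241.3],
        [121.4, 118.0, 160.2],
        [205.7, 211.1, 268.9],
        [110.3, 104.8, 149.5],
        [196.2, 188.4, 250.0],
    ]
    stat, p_value = friedman_test(f0_by_speaker)
    print(f"Friedman statistic = {stat:.3f}, p = {p_value:.4f}")
```

The Friedman test suits this design because each actor produces every emotion, so the measures are repeated within speakers. Ranking within each row removes between-speaker differences, such as overall pitch range, before the emotions are compared.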
Submitted date: 04/28/2024
Accepted date: 12/02/2024


