Selection of a voice for a speech signal for personalized warnings: the effect of speaker’s gender and voice pitch


There is increasing interest in multimodal technology-based warnings, specifically those that convey spoken warning statements. Such warnings can be tailored to the situation as well as to the target user’s characteristics. However, more information is needed on how to design them so that they ensure intelligibility, promote compliance and reduce the potential for annoyance. In this context, this paper reports an exploratory study whose main purpose was to support the selection of a synthesized voice for a subsequent compliance study with personalized (i.e., using the person’s name) technology-based warnings using Virtual Reality. Participants listened to speech signals that were generated by a speech synthesizer and post-processed to change the perceived pitch, and then evaluated them by completing the MOS-X questionnaire. Afterwards, the participants ranked the voices in order of preference. The effects of the speaker’s gender and voice pitch on both ratings and rankings were assessed. The preference of male and female listeners for the talker’s voice gender was also investigated. The results show that participants most often chose the high-pitched female voice as their first preference; this voice also obtained the highest overall score on the MOS-X questionnaire. No significant influence of the participants’ gender was found on the assessed measures.