Hello Oliver, can you give our readers a quick introduction to yourself and your field of study?
My name is Oliver Niebuhr. I studied phonetics and digital speech processing in combination with psychology and linguistics at the University of Kiel and got my PhD in phonetics. This great field of research integrates aspects of anatomy and physiology with acoustics, speech technology, and instrumental and experimental methods. So, as a phonetician, one is used to conducting highly interdisciplinary work. This is exactly what I have continued since 2015 as associate professor of communication and innovation in my collaboration with Kerstin Fischer and her team at the Human-Robot Interaction Lab in Sønderborg.
Can you describe your contribution to the team of the HRI-Lab?
My contribution to this team primarily concerns my years of experience in speech-melody research. We equip our robots with phonetically and, in particular, prosodically manipulated speech synthesis. The manipulations are made such that we can find out, step by step over successive perception experiments, which tone of voice matches which robot. In addition, manipulated synthetic speech helps us determine how specific robots can sound more persuasive or "charismatic", and what the role of individual languages and cultures is in all that. We use very different types of robots for our experiments, from the cute little EZ-bot to the impressive Care-o-bot.
What was your motivation to choose this field of study?
There is a plethora of publications about written language, but we still know surprisingly little about spoken language! In my research, I collect data about the current state of knowledge and try to find out where data is missing and why this is the case. And then I try to fill these knowledge gaps with my own research.
Acoustic Charisma Profiling requires a pre-recorded speech sample of the assessed speaker; a minimum of 40 seconds is needed.
The input speech sample is broken down into acoustic parameters of the speaker's speech melody that proved to be relevant for perceived speaker charisma in a series of preceding experiments.
The assessment system assigns a total charisma score to the input speech sample, giving the speaker fast and clear feedback on their performance, combined with advice on which parameter should (primarily) be improved and how.
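The three steps above (check the sample length, score individual parameters, combine them into a total score with advice) can be illustrated with a minimal sketch. The actual formula is not public, so the parameter names, the weights, and the weighted-sum rule below are invented purely for illustration; only the 40-second minimum comes from the text.

```python
MIN_DURATION_S = 40  # minimum sample length stated in the text

# Hypothetical parameters and weights; the real ones are not disclosed.
WEIGHTS = {"pitch_range": 0.4, "tempo": 0.3, "pause_structure": 0.3}


def charisma_feedback(duration_s, parameter_scores):
    """Return (total score in percent, parameter to improve first).

    parameter_scores maps each parameter name to a value between 0 and 1.
    """
    if duration_s < MIN_DURATION_S:
        raise ValueError("speech sample must be at least 40 seconds long")

    # Assumed combination rule: a simple weighted sum of the parameter scores.
    total = sum(WEIGHTS[name] * parameter_scores[name] for name in WEIGHTS)

    # Advice: the parameter whose improvement would raise the total the most.
    to_improve = max(
        WEIGHTS, key=lambda name: WEIGHTS[name] * (1 - parameter_scores[name])
    )
    return round(100 * total), to_improve


example_scores = {"pitch_range": 0.5, "tempo": 0.8, "pause_structure": 0.9}
score, advice = charisma_feedback(60, example_scores)
print(f"Charisma score: {score}% (improve first: {advice})")
```

In this toy example the speaker would be advised to work on pitch range first, because that parameter has both the largest assumed weight and the lowest score.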
Can you give an example of your research?
For about five years, I have been working on a formula for acoustic/phonetic charisma. This project is a new approach to a known problem: In the past, there were no objectively measurable criteria to rate a certain speaker’s charisma. My formula is designed to address this issue. In order to measure charisma objectively, I combine different criteria. My research on the formula is not yet completed, but already at an advanced stage.
What would be the output of such a charisma rating?
After carefully analysing the individual parameters, all results are combined into a charisma rating. We measured, for example, the charisma levels of Mark Zuckerberg, the founder of Facebook, and Steve Jobs, the late founder of Apple. According to our formula, Zuckerberg has a charisma level of 57%. This is better than average, but Steve Jobs had 93% and clearly outperformed Zuckerberg on that.
How would this charisma rating by formula work in detail?
We are currently investigating about 20 different parameters that characterize spoken language. There are, for instance, tempo, emphasis and rhythm, speech melody, the loudness level, voice quality (or timbre), disfluencies, pauses and speech reduction phenomena. Speech reduction concerns, for example, how clearly vowels and consonants are articulated. Some but not all of these parameters are objectively measurable. However, we do not yet fully understand how all these different parameters interfere and interact. Because of this, different scientists interpret charisma differently.
Can you give an example of that?
The German chancellor Angela Merkel is a good example of that. A team of German and Chinese researchers analyzed the speaker charisma of 14 German politicians, including Angela Merkel, and her charisma assessment came out slightly worse than average. That is still better than our own speech-melody analysis would have predicted.
What is your interpretation of Merkel's charisma?
Our interpretation of Merkel's voice is that she performs poorly in terms of melodic cues to perceived charisma, for example by using a very narrow pitch range and omitting sentence-final low-falling pitch contours. But she does a lot of things right when it comes to articulatory clarity and the expressive articulation of key words in her speech. Both seem to compensate for much of her poor melodic performance. We are still researching this; there are many open questions.
Angela Merkel and Julija Timoschenko talking. (Image: David Plas, Wikimedia Commons, CC 2.0)
Long-term acoustic average spectrum of Angela Merkel's speech signal. There is a considerable loss of acoustic energy up to 3,000 Hz, which makes Merkel's voice sound 'soft' and 'thin' for listeners.
The waveform and spectrogram of Angela Merkel's speech. We can see one of her rhetorical strengths in this figure: a temporally well-structured signal with short sections of speech separated by short (mainly silent rather than filled) pauses.
Are there other aspects apart from the voice that characterize the charisma of a person?
These are aspects like body language, media presence, attire and so on. Finally, on top of all this come the somewhat fuzzy boundaries of charisma to attractiveness on the one hand and dominance on the other.
I can understand how you e.g. measure loudness levels. But how would you track speech reduction?
That’s actually not that difficult because you can measure the acoustic resonance frequencies of individual speech sounds. Simply put, each vowel and consonant has some specific resonance frequencies and is, beyond that, characterized by how quickly and for how long these frequencies are reached. This is what we can measure, and all deviations from these frequency patterns in a certain direction, for instance towards an acoustic overlap with a different vowel or consonant, can count as speech reduction.
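As a rough illustration of this idea, the following sketch expresses speech reduction as the fraction of the way a measured vowel has moved from its own target resonance frequencies towards those of a competing vowel. The function name, the use of only two resonance frequencies, the raw Hertz scale, and the example values are all assumptions made for illustration; the timing aspect (how quickly and for how long the frequencies are reached) is ignored here.

```python
def reduction_toward(measured, target, competitor):
    """Fraction of the way from target to competitor (0 = on target, 1 = full overlap).

    Each argument is a pair of resonance frequencies in Hz, e.g. (F1, F2).
    """
    # Direction from the intended vowel to the competing vowel.
    direction = [c - t for c, t in zip(competitor, target)]
    # Actual deviation of the measured vowel from its target.
    deviation = [m - t for m, t in zip(measured, target)]

    # Project the deviation onto the target-to-competitor direction.
    squared_length = sum(d * d for d in direction)
    projection = sum(a * b for a, b in zip(deviation, direction)) / squared_length

    # Clamp to [0, 1]: movement away from the competitor counts as no reduction.
    return max(0.0, min(1.0, projection))


# Illustrative values only, not measured data.
target_vowel = (300, 2300)
competing_vowel = (500, 1900)
measured_vowel = (400, 2100)

print(reduction_toward(measured_vowel, target_vowel, competing_vowel))
```

A value near 0 would indicate a clearly articulated vowel, while a value near 1 would indicate that the vowel acoustically overlaps with its competitor.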
Good to know. So measuring speech reduction is as easy as measuring loudness?
Quite the opposite. In fact, loudness is much harder to measure, because even the smallest changes in the speaker's head position relative to the microphone, or the use of a different microphone or recording setup, can strongly change loudness levels.
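A small calculation shows how sensitive recorded levels are to distance. It assumes an idealized free-field situation in which sound pressure falls off in inverse proportion to the distance from the mouth; real rooms and microphones will deviate from this, and the distances are made-up examples.

```python
import math


def level_change_db(old_distance_cm, new_distance_cm):
    """Change in recorded level (dB) when the mouth-to-microphone distance changes.

    Assumes the inverse-distance law for sound pressure (idealized free field).
    """
    return 20 * math.log10(old_distance_cm / new_distance_cm)


# Leaning in from 10 cm to 5 cm, and leaning back from 10 cm to 12 cm.
print(f"{level_change_db(10, 5):+.1f} dB")
print(f"{level_change_db(10, 12):+.1f} dB")
```

Under this assumption, halving the distance raises the level by about 6 dB, so a head movement of a few centimetres can rival the loudness differences a study is trying to measure.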
Which parameter is the most difficult to measure?
The parameter that is probably hardest to measure is emphatic intensification, i.e. whether or not a certain word is produced with some extra emotional stress and accentuation. It involves a whole bundle of acoustic features and is very dependent on the listener's contextual interpretation. At the same time, it is one of the most relevant parameters in the perception of charismatic speech. This is one reason why more research is needed on that topic.
How would you analyse the measurements? Is it automatized by software or is this manual work?
We have a self-programmed software called Pascal. Don’t confuse it with the programming language of the same name! Our software is able to interpret some but not all parameters of spoken language charisma automatically; some parameters still need to be inspected manually. This is time-consuming, and we are currently working on a solution for that. The long-term objective is of course to fully automate charisma measurements.
Are the formula, the software and the related research available to the public?
No, they are not. The reason is that we – in this case referring to the SDU and myself – have a pending patent application on this formula. The underlying data, the software and the research cannot be disclosed. At least not at the current stage.
Do you have a literature recommendation for somebody who would like to know more about speaker charisma?
Olivia Fox Cabane wrote a good book on that topic. It is called "The Charisma Myth: How Anyone Can Master the Art and Science of Personal Magnetism". It provides a good overview of speaker charisma.
When introducing yourself, you mentioned your cooperation with the HRI-lab. What is the advantage of using a robot for experiments on charisma?
The big advantage of using robots lies in the fact that you can alter a robotic voice at will. A robot is able to express any level of charisma in any language required for a specific experiment. This would be impossible with human speakers.
How do you generate a robotic voice?
Our experiments usually make use of automatically synthesized robotic speech. This robotic speech is then customized with the software Praat to achieve the desired level of charisma, or to test any particular linguistic feature, like the effects of intonation contours. Selina Eisenberger did some interesting experiments about using robots in prosody research.
Intonation Swap Study: Selina Sara Eisenberger explains how the HRI-lab Sønderborg uses robots in modern prosody research.
Robotic speech often sounds quite artificial. Is that an issue for your experiments?
It can be, yes. But this is actually a very interesting research question: How do you generate machine speech that sounds pleasant to a human listener? So far many improvements in artificially generated speech have been made, but you can still distinguish a human speaker from a robotic speaker.
Can you provide an example of your practical work with robots?
Sure. In one experiment, the robot educated patients about healthy nutrition, either with a charismatic voice (Steve Jobs) or with a not so charismatic voice (Mark Zuckerberg).
At the end of the examination, patients could choose between chocolate and fruit. Then we measured how many patients chose the healthy fruit over the unhealthy chocolate. The result of this study was quite clear: the robot’s voice characteristics determined how convincing the robot was. Steve Jobs convinced 11% more people to choose the healthier fruit than Mark Zuckerberg.
Were these results confirmed by other experiments?
We had another experiment where a robot tried to convince participants to fill in a certain questionnaire. The participants could freely choose between a short questionnaire and a long questionnaire. The robot tried to convince participants to choose the long questionnaire even if that required more effort from the participants.
This experiment had similar results to the experiment with the blood pressure. If the robot used the speech characteristics of Steve Jobs, more participants filled in the long questionnaire than for Mark Zuckerberg. The charisma of Steve Jobs, even when mapped onto a robotic voice, works very well for convincing people.
Now it would be interesting to know how Steve Jobs achieved that effect. What do you think was the most significant feature of Steve Jobs's voice?
The speech melody of Steve Jobs's voice was extremely variable. He continuously varied his pitch during his presentations to avoid monotony. For the audience it was very entertaining to listen to Steve Jobs. This was a major element of his charisma!
Steve Jobs was known for his charismatic speeches. He used extreme variation in his speech melody. German chancellor Angela Merkel is known for quite the opposite. (Image by Matt Buchanan, Wikimedia Commons, CC BY 2.0)
Steve Jobs uses a wide pitch range and his pitch variability is very high, too.
In comparison, Angela Merkel uses a narrow pitch range and her pitch variability (i.e. the number of times her pitch goes up and down per time unit) is limited.
Interesting side note: Merkel's overall pitch level is actually lower than that of Jobs, even though she is a female and Jobs is a male speaker. Usually it's the other way round.
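The two measures mentioned in these captions, pitch range and pitch variability, can be sketched from a pitch track as follows. This is a simplified illustration, not the analysis used for the figures: the function names, the semitone scale for the range, the treatment of unvoiced frames (marked as 0 and simply skipped), and the two made-up pitch tracks are all assumptions.

```python
import math


def pitch_range_semitones(f0_hz):
    """Span between the lowest and highest voiced pitch value, in semitones."""
    voiced = [f for f in f0_hz if f > 0]  # 0 marks an unvoiced frame
    return 12 * math.log2(max(voiced) / min(voiced))


def direction_changes_per_second(f0_hz, frame_rate_hz):
    """How often the pitch switches between rising and falling, per second."""
    voiced = [f for f in f0_hz if f > 0]
    changes = 0
    previous_sign = 0
    for current, following in zip(voiced, voiced[1:]):
        diff = following - current
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue  # flat stretch: no new direction
        if previous_sign != 0 and sign != previous_sign:
            changes += 1
        previous_sign = sign
    duration_s = len(f0_hz) / frame_rate_hz
    return changes / duration_s


# Made-up pitch tracks (Hz), 10 frames per second, 1 second each.
lively = [100, 150, 200, 150, 100, 150, 200, 150, 100, 150]
flat = [180, 185, 190, 190, 185, 180, 180, 185, 190, 190]

for name, track in (("lively", lively), ("flat", flat)):
    print(name, pitch_range_semitones(track), direction_changes_per_second(track, 10))
```

The "lively" track spans a full octave and changes direction twice as often as the "flat" track, which stays within about one semitone.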
Apart from the voice: Did the personality of Steve Jobs play a role in his charisma, too?
That is correct, charismatic communication has many facets. You can’t just copy some features of a charismatic speaker and expect to achieve the same results. For this reason I recently founded the company Saphire Solutions together with a partner. In this company we offer individual acoustic voice profiling. By the way, Saphire is intentionally written with just one “p”. It is an abbreviation for "Strategic Acoustic-Phonetic Innovations in Rhetoric Enhancement".
What is your goal with Saphire Solutions and how does that complement your academic research?
Our goal is to improve the communication of our clients, based on scientific research. From a researcher’s point of view, this practical application of prior studies is extremely interesting, too. Academic research and productive use of the research in the wild definitely enrich each other.
Which customer groups does Saphire Solutions target?
With Saphire Solutions we are primarily concerned with making companies more successful and providing advice. Furthermore, we support, for example, entrepreneurs and job seekers. We also have a special focus on African countries and on female speakers.
Oliver Niebuhr wearing VR-goggles to simulate a larger audience. He uses this technique for research and for communication training in his company Saphire Solutions.
What is the motivation for clients to book your service?
Our job description is quite simple: We help our clients to improve their acoustic charisma and that helps them in their daily occupation. Whenever you are in contact with people, for instance, when giving a presentation or when negotiating with prospective investors, you will benefit from charismatic speech.
Does your research cover other topics than acoustic charisma?
In addition to the research on acoustic charisma, we have extended our research to creativity and negotiation. We have cooperations with universities in Nuremberg, Munich, Barcelona, Milan, and Los Angeles. Anyone wishing to contact us for any of these issues, for assistance or further information, will find all the necessary information either on the Saphire Solutions website or on my SDU website.
Thank you very much for the interview. And all the best with both your company and your research!
This interview was conducted by Sascha Steinhoff in Odense on 3 September 2018.