Road Testing ModelTalker

by Emma Hughes

The ModelTalker project has been developed by the research team at the Nemours Speech Research Laboratory at the Alfred I. duPont Hospital for Children.  They have developed the software for children who use “donated” voices on their speech generating devices and for adults who are losing their voice.

I started the process of recording my voice for two main reasons.  Firstly, I had a pure interest in the technology.  Secondly, I was motivated by someone who already uses my voice on their recorded speech communication device and was hoping to continue using my voice on their new text-to-speech device.

The process for recording my ModelTalker voice started with registering for an account on their website.  After registering with ModelTalker.org I downloaded the Windows recording software. Currently there isn’t a Mac version of the software; however, there is an online version that can be accessed through Google Chrome (which I didn’t try).

I was using a Windows 8.1 desktop computer and a USB headset/mic (a Sennheiser PC-8 from Officeworks, approx. $50). ModelTalker recommends a USB mic rather than a Bluetooth headset or a 3.5mm jack headset for optimal sound quality.

The ModelTalker software checks for background noise, so I had to be in a quiet, closed room with no heating or air-conditioning running. When dogs barked or there was loud traffic noise (e.g. sirens) in the area, I had to wait for the noise to stop and then re-record the phrase.

After downloading and installing the ModelTalker recording software you are prompted to log in with your account, and the first 10 phrases are added to your inventory to be recorded.  These are test phrases that need to be uploaded so that the recording levels can be checked before you are given access to the full inventory.

Before starting any new session with the recording software (every time you open it) you have to go through a calibration of sound levels. This involves a section of silence, repeating the syllable “pa” and then three recorded phrases. This then sets the sound levels for the recording session.

The recording process is fairly easy and is accessible by either mouse or keyboard.  Each phrase has a recorded prompt in a male American English voice that speaks the entire phrase for you to copy, and the sentence is also displayed on the screen.  When a recording is completed it is given a score for volume, pitch and pronunciation.  If the score wasn’t high enough on any of the scales I was prompted to re-record before moving on.

ModelTalker is optimized for use with American English. This meant that it was difficult to get some of the recordings to an optimal level unless I pronounced them with an accent matching the sound of the prompt voice.  Some of the trickier phrases took 4-5 attempts at recording before getting an optimal result from the software.

Once the ten phrases had been recorded I uploaded them from within the software to the ModelTalker team to review. It took approximately 2-3 days before I received the results back that my recordings had passed their tests and I was ready to proceed with the full inventory of 1600 phrases.

The complete inventory of 1600 phrases was recorded in the same way as the test phrases. It took me approximately 6 hours to record all of the phrases, spread over a few weekends.  I had to take a week’s break during recording as I had a cold which affected the sound of my voice.  I found it best to work in 30-45 minute blocks, aiming to complete approximately 100 recordings in a single session. The recordings were surprisingly tiring and took concentration to get right. The occasional phrase took 4-5 attempts before being accepted, mostly due to pronunciation problems.
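
For anyone planning their own recording schedule, the figures above can be turned into a rough estimate. This is only a sketch in Python: the function names are hypothetical and the numbers are my own approximate figures from this article, not guidance from ModelTalker.

```python
import math

# Approximate figures taken from this article (not ModelTalker guidance).
TOTAL_PHRASES = 1600        # size of the full inventory
PHRASES_PER_SESSION = 100   # rough target for one sitting
TOTAL_HOURS = 6             # approximate total recording time


def sessions_needed(total_phrases: int, per_session: int) -> int:
    """Number of sittings needed, rounding up for a partial last session."""
    return math.ceil(total_phrases / per_session)


def seconds_per_phrase(total_hours: float, total_phrases: int) -> float:
    """Average time spent per phrase, including any retakes."""
    return total_hours * 3600 / total_phrases


if __name__ == "__main__":
    print(sessions_needed(TOTAL_PHRASES, PHRASES_PER_SESSION))
    print(seconds_per_phrase(TOTAL_HOURS, TOTAL_PHRASES))
```

Your own pace will depend on your voice, your room and how often the software asks for a retake, so change the constants to suit.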

I recorded the full 1600 phrases, but the information on the ModelTalker website suggests that in some rare cases they have been able to build voices from as few as 800 recordings.

Once the full set of recordings was completed, it was again uploaded via the software and I sent off the request for my voice to be built. The response was very quick: 2-3 days later I received an email with a link to download my voice.  The download was an installer for Windows that installed the SAPI5 voice directly into Windows and also included a simple text-to-speech program for testing it out.

I have trialed the voice in Predictable on my iPad and iPhone. Downloading to Predictable was easy and was done directly within the app using my ModelTalker login details. At the time of writing, the pitch and speed adjustments in Predictable cannot be used with a ModelTalker voice.

I have also used the voice on a Windows computer in The Grid 2.  The Grid automatically detected the new voice from Windows after I installed the software from the link. To me the voice quality sounds reasonable and I recognize a familiarity in the voice. As expected, it is not as clear as the Acapela voices in The Grid 2 or the voices in Predictable; however, the created voice is better than I expected.  It retains a “synthesized” speech quality which could be described as “robotic.” The voice does have some issues with phrasing when speaking longer sentences rather than individual words.  Occasionally additional syllables seem to be pronounced (almost like a lisp/squeak) and sometimes sounds are dropped from words.

Overall the process for recording my voice was quite straightforward, although time consuming.   After testing the voice with the person who was interested in using it on their communication device, we found that the voice did sound very comparable to the recorded speech on their device.  However, the ModelTalker voice understandably didn’t have the same clarity that Acapela voices have, and in the end they have chosen to stay with the Acapela voices.   I am yet to investigate the software provided by ModelTalker to tweak the quality of the recorded voice, and this may improve the finished product.  The results were as expected when taking into consideration the demos provided by ModelTalker.

Resources:

ModelTalker recommends having a certain level of computer ability before undertaking recordings, or having someone available to assist.  More information is available on their FAQ page at https://www.modeltalker.org/faq/

There is also a Windows recording guide on the website at https://www.modeltalker.org/model-talker-2-user-guide-windows/

The emails with instructions that ModelTalker sends contain very important information that helped with setting up.

There were examples of recorded voices on the ModelTalker website but these seem to be temporarily unavailable.

The following blog has some information on tweaking the completed voice.  http://pikespals.blogspot.com.au/2013/02/voice-banking-with-model-talker.html

Please tweet your comments, questions and feedback to @emmazyteq
