Preparation of recordings

Preparation for the creation of high-quality audio recordings

Developing a custom voice is simple in concept but demanding in practice. The central component is an extensive collection of audio samples of human speech, and these recordings must be of high quality. Select a speaker who has experience with this type of recording, and have the recording done by an audio engineer with professional equipment.


There are three basic roles in a custom neural voice recording project:


Speaker

The custom voice is created from the voice of the selected speaker.

Sound engineer

Operates the recording equipment and manages the technical side of the recording process.

Project lead

Leads the project and coaches the speaker's performance.

Speaker voice

The corporate customer selects the speaker voice and must obtain the speaker's consent. Please provide BotTalk with the speaker's full name and confirm that we may synthesize the speaker's voice.

Select a speaker whose natural voice you like. It is possible to create unique character voices, but most speakers find it much harder to perform them consistently, and the effort can lead to voice strain. The most important factor in choosing a voice actor is consistency: recordings for the same speaking style should all sound as if they were made in the same room on the same day. Announcers and newscasters are usually very suitable because they bring broad experience to the table. Good recording technique and appropriate equipment also help to approach this ideal.

Recommended characteristics of the speaker:

  • Consistent speed, volume, pitch, and timbre

  • Clear pronunciation

  • Control of pitch variation, emotional affect, and speech mannerisms

Voice recording can be more tiring than other types of speaking work. Most speakers can record for two or three hours a day. Limit sessions to three or four per week with a day off in between if possible.

Recording requirements

We recommend recording the audio in a professional recording studio that specializes in voiceover work. There you will have a recording booth, the right equipment, and the right people to operate it. Do not cut corners when it comes to recording.

Discuss the project with the studio's sound engineer and listen to their advice. Record with little or no dynamic range compression (a ratio of 4:1 at most). It is important that the audio has a consistent volume and a high signal-to-noise ratio.
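The two properties named above, consistent volume and a high signal-to-noise ratio, can be checked roughly on decoded audio. The following is a minimal sketch, not a BotTalk tool: the function names are hypothetical, the samples are assumed to be 16-bit PCM values already loaded into Python lists, and the signal-to-noise figure is only a crude estimate obtained by comparing a spoken passage with a stretch of room silence from the same session. The source gives no numeric thresholds, so none are assumed here.

```python
import math

# Assumption: samples are signed 16-bit PCM integers, so full scale is 32768.
FULL_SCALE_16_BIT = 32768.0


def rms_dbfs(samples, full_scale=FULL_SCALE_16_BIT):
    """Return the RMS level of the samples in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / (full_scale ** 2))


def estimate_snr_db(speech_samples, silence_samples):
    """Crude SNR estimate: speech level minus noise-floor level, in dB."""
    return rms_dbfs(speech_samples) - rms_dbfs(silence_samples)


def level_spread_db(levels_db):
    """Difference between the loudest and quietest recording, in dB."""
    return max(levels_db) - min(levels_db)
```

A small spread across the recordings of one session suggests consistent volume, and a larger estimated signal-to-noise ratio suggests a quieter room. The sound engineer's own metering remains the reference; this sketch is only a sanity check during data preparation.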

To achieve high-quality training results, make sure the recordings meet these requirements during recording or data preparation:

  • Clear, well-enunciated speech

  • Natural speed: not too slow or too fast, and consistent across audio files

  • Appropriate volume, prosody, and pauses: consistent within a sentence and between sentences, with correct pauses for punctuation

  • No background noise during recording

  • Fits the persona design

  • No wrong accent: the accent fits the target design

  • No incorrect pronunciation

Speaker Persona

First, design a persona for the voice that represents your brand. A persona document can be used for this if needed. It defines elements such as the characteristics of the voice and the character behind the voice. This helps to guide the process of creating a custom voice: selecting your voice actor, training, and voice optimization.

Work with your speaker to develop a persona that defines the overall tone and emotional inflection of the custom neural voice. In doing so, you should determine what sounds "neutral" for the role.

Your speaker is the other half of the equation, alongside the BotTalk technology. Your speaker must be able to speak at a consistent rate, volume, pitch, and tone, with clear diction. They also need to be able to control their pitch variation, emotional affect, and speech mannerisms. Recording voice samples can be more fatiguing than other kinds of voice work, so most voice talent can record for only two or three hours a day. Limit sessions to three or four days a week, with a day off in between if possible.
