SoSLUG Archive


Espeak A Very Easy and Powerful Festival Alternative

For many years Festival has been the mother of all text-to-speech systems; however, it is complex to install, especially if you need other voices. Espeak takes the pain out of an installation that can challenge even the most seasoned Linux user. Ubuntu comes with Espeak pre-installed, and even where it is not installed it takes only seconds to add, unlike Festival, which takes many minutes.

Espeak is great fun to use and very flexible, but really it is only a tool. Espeak, like its cousin Festival, can redirect its output to produce a wav file that can be used again and again on this or other systems.

If you are looking for a screen reader then you should look at Orca, which in the current version of Ubuntu is available under System => Preferences => Assistive Technologies. It lets you navigate your way around your desktop if you are blind; it is a truly excellent and totally free system and works very well. (More about Orca in another post)

Both Espeak and Festival are console tools, that is to say they are started with console commands, which is why this post is in the tools section.

Here are some examples of console commands you could use with Espeak.

#> espeak

This is command mode. Entering “espeak” with no parameters lets you type your text, a sentence or a paragraph, which is spoken when you press the Enter (carriage return) key. From this simple example you can begin to see how one can grab a sentence or paragraph from a letter or the web and have its contents read aloud. Try this yourself: put espeak in command mode if you haven’t already done so, then copy and paste a section of a letter or web page into it. You should immediately hear, clearly spoken, the contents of what was pasted. Be careful not to copy empty lines, as the application will see these as a carriage return and start speaking before you are ready.
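When the text comes from a file rather than the clipboard, the empty lines can be removed before espeak ever sees them. What follows is only a sketch: the function name drop_blank_lines and the file name letter.txt are invented for illustration, and it assumes espeak reads piped text from standard input as in the other examples in this post.

```shell
#!/bin/sh
# drop_blank_lines: hypothetical helper, not part of espeak.
# Deletes lines that are empty or contain only whitespace,
# so the text that is passed on has no empty lines left in it.
drop_blank_lines() {
    sed '/^[[:space:]]*$/d'
}

# Example use (letter.txt is an assumed file name):
#   drop_blank_lines < letter.txt | espeak
```

Keeping the filter in its own small function means it can be tried on its own, without sound, before it is connected to espeak.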

#> espeak test

You can also use espeak to pronounce one or more words.

#> espeak "this is a test"

As you can see to reproduce a sentence you need only enclose the words in quotes.

But then you can also do this

#> seq 1 100 | espeak

Well, OK, this is moderately impressive, so how about this then?

#> seq 1 100 | espeak -vde

Now you can begin to see some of the educational aspects and the power available from some very simple commands: counting in German, great huh! But it’s not just German, it is any installed language, and that’s quite a lot. Just replace the two-character code after the “-v”:

  • English = en
  • German = de
  • Afrikaans = af
  • Bosnian = bs
  • Greek = el
  • Finnish = fi
  • French = fr

Or to get a full list of voice languages just type this

#> espeak --voices

Which will produce an output similar to that below:

Pty  Language     Age/Gender  VoiceName               File      Other Langs
5    af           M           afrikaans               af
5    bs           M           bosnian                 bs
5    cs           M           czech                   cs
5    cy           M           welsh-test              cy
5    de           M           german                  de
5    el           M           greek                   el
5    en           M           default                 default
5    en-sc        M           en-scottish             en/en-sc  (en 4)
2    en-uk        M           english                 en/en     (en 2)
5    en-uk-north  M           lancashire              en/en-n   (en-uk 3)
5    en-uk-rp     M           english_rp              en/en-rp  (en-uk 4)
5    en-uk-wmids  M           english_wmids           en/en-wm
5    en-us        M           english-us              en/en-r   (en 3)
5    en-wi        M           en-westindies           en/en-wi  (en-uk 4)
5    eo           M           esperanto               eo
5    es           M           spanish                 es
5    es-la        M           spanish-latin-american  es-la     (es-mx 6)
5    fi           M           finnish                 fi
5    fr           M           french                  fr
5    grc          M           greek-ancient           grc
5    hi           M           hindi-test              hi
5    hr           M           croatian                hr        (hbs 5)
5    hu           M           hungarian               hu
5    id           M           indonesian-test         id
5    is           M           icelandic-test          is
5    it           M           italian                 it
5    jbo                      lojban                  jbo
5    ku           M           kurdish                 ku
5    la           M           latin                   la
5    mk           M           macedonian-test         mk
5    nl           M           dutch-test              nl
5    no           M           norwegian-test          no        (nb 5)
5    pl           M           polish                  pl
5    pt           M           brazil                  pt        (pt-br 5)
5    pt-pt        M           portugal                pt-pt
5    ro           M           romanian                ro
5    ru           M           russian_test            ru
5    sk           M           slovak                  sk
5    sr           M           serbian                 sr
5    sv           M           swedish                 sv
5    sw           M           swahihi-test            sw
5    ta           M           tamil                   ta
5    tr           M           turkish                 tr
5    vi           M           vietnam-test            vi
5    zh           M           Mandarin                zh
5    zh-yue       M           cantonese-test          zhy       (yue 5)
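If only the language codes are wanted, the second column of the listing can be picked out with awk. This is a sketch that assumes the column layout shown above; voice_codes is a name invented here, not an espeak feature.

```shell
#!/bin/sh
# voice_codes: hypothetical helper that reads an "espeak --voices"
# style listing on standard input, skips the header line and
# prints the Language column (the second field) of every row.
voice_codes() {
    awk 'NR > 1 { print $2 }'
}

# Example use:
#   espeak --voices | voice_codes
```

Each code printed this way can be placed directly after “-v” on the espeak command line.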

Provided your word or sentence is written in the desired language, a correct or near-correct pronunciation will be made. Suppose we first write “1 2 3 4 Bonjour” and ask for it in English (the espeak default language).

#> espeak "1 2 3 4 Bonjour"

Whilst we can understand the numbers correctly, espeak can only pronounce each word as written using the rules of the selected language, be it English, French or any other, so here the French word is read as if it were English. Now if we give the same message, this time using the French voice, we get both the numbers and the message in French.

#> espeak -vfr "1 2 3 4 Bonjour"
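To hear the difference between voices back to back, the same text can be spoken once per language code. A minimal sketch, assuming the “-v” codes listed earlier are installed; say_in is a made-up name.

```shell
#!/bin/sh
# say_in: hypothetical helper. The first argument is the text, the
# remaining arguments are espeak language codes; the text is
# spoken once with each voice, in the order given.
say_in() {
    text=$1
    shift
    for lang in "$@"; do
        espeak -v"$lang" "$text"
    done
}

# Example use:
#   say_in "1 2 3 4 Bonjour" en fr
```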

This is all very well, but nothing described above can be termed useful; fun, yes, but hardly useful unless we can turn the text that is read into a sound file. Well, guess what, you can. Suppose you have some standard English text in a file called mytext.txt somewhere on your hard disk, perhaps a message you would like spoken rather than read. How do we do this?

#> espeak -f mytext.txt -w mytext.wav

This very rapidly creates a file called mytext.wav from the words, sentences or paragraphs in the file named mytext.txt.

We can expand somewhat on this to produce an mp3 file instead. Essentially it is still the same, except that the audio is piped through lame.

#> espeak -f mytext.txt --stdout | lame - mytext.mp3

#> file mytext.wav

This should establish that a wav file has indeed been created, and display a message similar to this.

mytext.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 22050 Hz

The mp3 file, on the other hand, will show a similar output (see below). Use this command to check the file you have created.

#> file mytext.mp3

mytext.mp3: MPEG ADTS, layer III, v2, 32 kBits, 22.05 kHz, Monaural

You can then use mplayer to play either of these files.

#> mplayer mytext.wav
#> mplayer mytext.mp3
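The two conversions can be wrapped up so that one call produces both files. This is a sketch only, reusing the espeak and lame invocations from this post; the function names audio_name and text_to_audio are assumptions made for the example.

```shell
#!/bin/sh
# audio_name: hypothetical helper. Prints the name of the audio
# file for a text file by swapping the extension,
# e.g. "audio_name mytext.txt wav" prints mytext.wav
audio_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# text_to_audio: writes a wav and an mp3 next to the text file.
text_to_audio() {
    espeak -f "$1" -w "$(audio_name "$1" wav)"
    espeak -f "$1" --stdout | lame - "$(audio_name "$1" mp3)"
}

# Example use:
#   text_to_audio mytext.txt
```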

For something a little more useful you can read html web pages using the following command.

#> wget -qO - | espeak -m -ven+11

However, this syntax works best with pages written in plain html (Hyper Text Markup Language). If you were to point the above url (Uniform Resource Locator) at this or one of the other wiki pages it would not work nearly so well, but by all means give it a try.

For a bit more fun you could give this a try. It uses another application called ‘fortune’, whose output changes every time it is run, so what espeak says changes too.

#> fortune |tee >(espeak -k20 -ven+12)

Information source Text-to-speech-synthesizer

If you still wonder how useful espeak is, then what about creating audiobooks? We have found a script that might help you achieve just that.

#!/bin/sh
# txt2mp3 - convert text files to mp3 audio files (aka audiobooks)
# v0.1
#
# (c) 2008 Everthon Valadão  under the GPL
#
#
# OBS.: install some pre-requisites first, with
#       sudo apt-get install espeak lame xpdf-utils odt2txt antiword xterm

TXT_FILE="$1"
# strip the extension to get the base name used for the mp3 file
BASENAME="${TXT_FILE%.*}"

echo "TTS (text-to-speech) ${TXT_FILE}"

ext=${1##*.}

# if it isn't a TXT file, convert it first
if [ "$ext" != "txt" ] ; then
    TMP_FILE="/tmp/espeakfile-$$.txt"

    # PDF
    if [ "$ext" = "pdf" ] ; then
        echo "converting from PDF to TXT"
        pdftotext "${TXT_FILE}" "${TMP_FILE}"
    fi

    # ODT
    if [ "$ext" = "odt" ] ; then
        echo "converting from ODT to TXT"
        odt2txt --subst=all "${TXT_FILE}" > "${TMP_FILE}"
    fi

    # DOC
    if [ "$ext" = "doc" ] ; then
        echo "converting from DOC to TXT"
        antiword "${TXT_FILE}" > "${TMP_FILE}"
    fi

    TXT_FILE="${TMP_FILE}"
fi

rm -f /tmp/voice.wav

# create a FIFO "named pipe" to save space
mkfifo /tmp/voice.wav

# espeak writes its output to the pipe while lame encodes the file on the fly
nice espeak -f "${TXT_FILE}" -w /tmp/voice.wav &

xterm -e nice lame -a --resample 16 -V 9 --vbr-new --lowpass 8 -f /tmp/voice.wav -o "${BASENAME}_VBR.mp3"

echo "..done! Voice saved as ${BASENAME}_VBR.mp3"

The “espeak” application does not natively read or interpret “pdf” formatted documents; rather, you need to use the script above to convert the document first to text and then to an “mp3” file. This is essentially using an automated voice to produce what is called an audiobook. Many community projects have been set up around the country that use human voices, an expensive mixer and recording hardware to produce high quality but low volume audiobooks for the blind.

To convert a pdf formatted file to a plain text file that espeak can read, one of course first needs to download the ebook as a pdf, doc or odt. Each conversion can also be run by hand:

#> pdftotext inputfile.pdf outputfile.txt

for pdf conversion to text

#> odt2txt --subst=all inputfile.odt > outputfile.txt

for odt conversion to text

#> antiword inputfile.doc > outputfile.txt

for doc conversion to text
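The choice between the three converters can be made automatically from the file extension. The following is a sketch under assumptions: converter_for and to_text are names invented here, and it relies on pdftotext accepting “-” as its output file to write to standard output.

```shell
#!/bin/sh
# converter_for: hypothetical helper. Prints the name of the tool
# that turns the given file into plain text, chosen by extension.
converter_for() {
    case "${1##*.}" in
        pdf) echo pdftotext ;;
        odt) echo odt2txt ;;
        doc) echo antiword ;;
        txt) echo cat ;;
        *)   echo "unsupported file type: $1" >&2; return 1 ;;
    esac
}

# to_text: writes the plain text of the input file to standard output,
# using the same commands as listed above.
to_text() {
    case "$(converter_for "$1")" in
        pdftotext) pdftotext "$1" - ;;
        odt2txt)   odt2txt --subst=all "$1" ;;
        antiword)  antiword "$1" ;;
        cat)       cat "$1" ;;
        *)         return 1 ;;
    esac
}

# Example use:
#   to_text inputfile.pdf | espeak
```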

Once the book is plain text you can use one of the methods above to have espeak read it aloud or save it as an audio file.

Nine times out of ten such recordings need heavy investment and a great deal of manpower and resources to produce and maintain. This approach, although somewhat cruder in its implementation, could provide an affordable means of producing large quantities of audiobooks on virtually any subject available in PDF, DOC, ODT or TXT format. Such audiobooks can be distributed en masse by very economical means such as email, and can be played on the computer or downloaded to small personal mp3 players and iPods. If this is a project you would like advice on, or would perhaps like to participate in, feel free to get in touch; I am happy to explain and give support for such a good cause.

Ever wanted your own speaking clock? Try this:

#> date "+The time sponsored by Accurist is %H Hours %M Minutes and %S seconds" | espeak

or you could expand on this and have

#> date "+Todays date is %A %B %Y but the time sponsored by Accurist is %H Hours %M Minutes and %S seconds" | espeak
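A clock that keeps announcing the time only needs a loop around the same idea. This is a rough sketch; the names time_phrase and speaking_clock and the default interval of 60 seconds are choices made for this example.

```shell
#!/bin/sh
# time_phrase: hypothetical helper that prints the sentence to be spoken.
time_phrase() {
    date "+The time is %H hours %M minutes and %S seconds"
}

# speaking_clock: speaks the time, then waits the given number of
# seconds (60 if none is given) before speaking it again.
# Stop it with Ctrl-C.
speaking_clock() {
    while true; do
        time_phrase | espeak
        sleep "${1:-60}"
    done
}

# Example use, announcing the time every five minutes:
#   speaking_clock 300
```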

The “espeak” command can also be integrated into your console terminal so that each time you launch a new terminal you hear a fortune. This is accomplished by editing a file called “bash.bashrc”. In Ubuntu and its derivatives this file lives in the folder “/etc”; the location may differ in other distributions, but you should look for it there first.

Using vim or gedit, edit this file, but be sure to make a backup of it first. Both the copy and the edit need root privileges, hence “sudo”.

#> sudo cp /etc/bash.bashrc /etc/bash.bashrc-original
#> sudo vim /etc/bash.bashrc

or for gedit

#> sudo cp /etc/bash.bashrc /etc/bash.bashrc-original
#> sudo gedit /etc/bash.bashrc

At the bottom of this file and on its last line place and save this entry

/usr/games/fortune -s |tee >(espeak -s 140 -ven+f4)

Let’s walk through the command. “fortune” draws on a large database of phrases, some long, some short and some funny; with the “-s” option only short phrases are selected, so the audio is relatively short when a new terminal is launched. The command “tee” copies its standard input both to standard output and to a file; to make sure it reads the fortune we precede “tee” with a pipe “|”. The output of tee is then redirected to espeak by wrapping the espeak command in “>( )”, bash’s process substitution, which lets tee write to espeak as though it were a file, so the fortune is printed on screen and spoken at the same time. The options for “espeak” are as follows: “-s 140” sets the speed of the voice in words per minute, whilst “-ven+f4” selects the actual voice used, here an English female variant. Please remember to save the file back to its original location when finished editing.
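On a machine where fortune or espeak might be missing, the entry can be made more defensive. The lines below are a sketch of one possible variant, not the entry described above: have_all is an invented helper, and the text is captured in a variable instead of using tee, so that it does not depend on process substitution.

```shell
# Hypothetical, more defensive variant of the bash.bashrc entry.
# have_all succeeds only if every named command can be found.
have_all() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || return 1
    done
}

# Speak only in interactive shells, and only when both tools exist.
case $- in
    *i*)
        if [ -x /usr/games/fortune ] && have_all espeak; then
            msg=$(/usr/games/fortune -s)
            # show the fortune on screen
            printf '%s\n' "$msg"
            # speak it in the background so the prompt is not held up
            (printf '%s\n' "$msg" | espeak -s 140 -ven+f4 &)
        fi
        ;;
esac
```

The check on “$-” keeps scripts and other non-interactive shells silent.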

contact email : linux ‘at’

Author: Alan Campion - Page reference: 2921
Last modified: Alan Campion - 2015-01-19