James

tl;dr

This article is about speech synthesis, and about James, a MacRuby gem for voice dialogs.

Intro

As far back as I can remember, I always wanted to be a gangster.

cough Let’s try that again…

When I was around 8, my dad and I went shopping for an Amiga 1000.

Here it is in its full glory:

I’m pretty sure I heard these synthesized organs when unwrapping it! :)

Now, apart from the incredible bouncing ball and the amazing 4096 colors it had (8-year-old me is writing this), it could synthesize speech. Skip to 0:35 to see the guy enter some text for the Amiga to speak.

Doesn’t sound much worse than what you get on a Mac these days. Run this in a Terminal:

say 'Hello there, sexy!'
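If you'd rather trigger the same thing from Ruby, here is a minimal sketch that shells out to `say`. The helper names `say_command` and `speak` are made up for this example, and `speak` only does anything where a `say` binary is found on the PATH (so, on a Mac).

```ruby
# Builds the argument list for the macOS `say` command.
# `voice` is optional; `say -v NAME` selects a voice by name.
def say_command(text, voice = nil)
  command = ['say']
  command += ['-v', voice] if voice
  command << text
  command
end

# Speaks the text, but only if a `say` executable is on the PATH.
# Returns false when it is not there (e.g. on Linux).
def speak(text, voice = nil)
  has_say = ENV.fetch('PATH', '').split(File::PATH_SEPARATOR).any? do |dir|
    File.executable?(File.join(dir, 'say'))
  end
  return false unless has_say
  system(*say_command(text, voice))
end

speak('Hello there, sexy!')
```

Passing a second argument, as in `speak('Hello', 'Alex')`, picks a voice, assuming a voice with that name is installed on your machine.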

Why isn’t it much better these days? Speech synthesis is hard.

Not only that, but it needs to be done for each language separately. Chinese intonation is complicated, for example: real people don’t pronounce the four pitched tones the same way every time. They’re pronounced differently or not at all, depending on which tone came before and which comes after, and also on the mood and health of the speaker.
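To make the "depends on which tone comes after" part concrete, here is a rough sketch of one well-known case, third-tone sandhi: a third tone directly before another third tone is spoken as a second tone. The method name and the number encoding are my own, and the rule is deliberately naive. In real speech, longer runs of third tones depend on phrasing, which is exactly what makes this hard.

```ruby
# Underlying (dictionary) tones are the numbers 1-4.
# Naive rule: a third tone directly before another third tone
# is spoken as a second (rising) tone.
def apply_third_tone_sandhi(tones)
  tones.each_with_index.map do |tone, index|
    following = tones[index + 1]
    tone == 3 && following == 3 ? 2 : tone
  end
end

# "ni3 hao3" (hello) is written with two third tones,
# but the first syllable is spoken with a rising tone.
p apply_third_tone_sandhi([3, 3])
```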

On OS X, there are two ways to improve on the existing voices. Try the demos: AssistiveWare iVox Samples and Cepstral Demos. I prefer iVox for European voices. Love the French and Swedish women. … voices, I mean.

But still, even if it has a long way to go, you can already use this in clever ways:

Best xkcd ever!

But apart from playful applications, speech synthesis is very important. Many people rely on it every day.

James

Imagine you are an 8-year-old boy who wants to control a computer using only his voice. Or imagine you are in pain, need to sit down often, and don’t always have a device with you.

For this, I wrote James.

Get the gem for MacRuby.

$ rvm use macruby
$ gem install james

Create a file called time_dialog.rb and copy this code into it:

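James.dialog do

  # Saying this phrase enters the dialog and moves to the :time state.
  hear 'What time is it?' => :time

  state :time do
    # Phrases understood while in this state; both lead back to :time.
    hear ['What time is it?', 'And now?'] => :time
    # James speaks whatever this block returns when the state is entered.
    into { time = Time.now; "It is currently #{time.hour} #{time.min}." }
    exit {} # Optional, listed for completeness.
  end

end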

Then run it using:

james time_dialog.rb

The Terminal will show you the available options.

This is a dialog consisting only of one state, time. The dialog (and time state) is entered when saying “What time is it?”. When it enters, it will say the current time, or whatever is returned by the into block.
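If it helps to picture what is going on, here is a toy model of such a dialog in plain Ruby. To be clear, this is not how James is implemented, and `ToyDialog` with its `entry`, `state` and `hear` methods is invented for this sketch. It only mimics the behaviour described above: a phrase leads into a state, and entering the state produces the text to speak.

```ruby
# A toy dialog: each state maps heard phrases to the next state and
# has a block that produces what is spoken when the state is entered.
class ToyDialog
  attr_reader :current

  def initialize
    @entries = {}   # phrase => state, valid while outside any state
    @states  = {}   # state name => { :hear => {...}, :into => proc }
    @current = nil
  end

  # Registers a phrase that enters the dialog.
  def entry(phrase, state)
    @entries[phrase] = state
  end

  # Registers a state, the phrases it understands, and its "into" block.
  def state(name, transitions, &into)
    @states[name] = { :hear => transitions, :into => into }
  end

  # Returns the text to speak, or nil if the phrase is not understood here.
  def hear(phrase)
    table = @current ? @states[@current][:hear] : @entries
    target = table[phrase]
    return nil unless target
    @current = target
    @states[target][:into].call
  end
end

dialog = ToyDialog.new
dialog.entry('What time is it?', :time)
dialog.state(:time, 'What time is it?' => :time, 'And now?' => :time) do
  now = Time.now
  "It is currently #{now.hour} #{now.min}."
end

puts dialog.hear('What time is it?')
puts dialog.hear('And now?')
```

Note that "And now?" is only understood once the dialog is in the time state. Before that, `hear` returns nil for it.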

James already provides a simple entry dialog to control where you are. Saying “Thanks, James”, for example, will exit the current dialog.

Easy, isn’t it?

If you want more dialogs, just load more. Your shell expands the braces into time_dialog.rb, twitter_dialog.rb and stocks_dialog.rb:

james {time,twitter,stocks}_dialog.rb

That’s it! You can write more complex dialogs, but this is out of scope for this article.

More examples and ideas for examples. Just add your own, if you want :)

How about…?

So if you’ve written up a few nice James dialogs, why not take that old Mac mini, install MacRuby and James, attach a few microphones, and distribute them around the house?

Closing

I’m looking forward to the day when I can perform basic operations, like looking up the weather, while eating breakfast and without having to context switch.

“James?”

“Yes?”

“What is the weather going to be like today?”

“Warm and sunny.”

“Great! I’ll be outside, doing some cycling then.”

doors slam one by one

“I’m sorry Dave, I’m afraid I can’t allow that.”

“Not again! You #$&@@^%!”

James keeps silent
