n900 voice recognition

The n900 really needs some sort of voice recognition. While the keyboard is nice, it’s sometimes a real pain, and I really just want to be able to tell it ‘find all restaurants nearby’ or ‘call X’. There are a few options when it comes to voice recognition; I think the most likely to work well enough is CMU Sphinx, especially now with the advent of pocketsphinx. pocketsphinx is specifically designed to be small and fast, and it seems to deliver on both points. I built the pocketsphinx libraries for the n900 (patches to the upstream version of pocketsphinx are available here) and I’ve been playing around with them for the past two days. The accuracy is good enough, and integrating it with the rest of the system, so that you can give commands like ‘switch task’, should be doable. To test things out and get my feet wet I’ve been using a quick python script called voximp; it’s simple to set up and it’ll let you issue commands immediately.
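To make the idea of issuing commands concrete, here is a minimal sketch of how recognized text might be turned into an action such as ‘call X’ or ‘switch task’. It is not taken from voximp or pocketsphinx: the function name `parse_command`, the action labels and the `contacts` dictionary are all hypothetical, and the sketch assumes the recognizer hands back a plain string.

```python
def parse_command(utterance, contacts):
    """Map recognized text to an (action, argument) pair.

    Hypothetical helper: `contacts` maps lower-case names to numbers.
    """
    # Normalize case so "CALL Alice" and "call alice" are treated alike.
    words = utterance.lower().split()
    if not words:
        return ("ignore", None)
    if words[0] == "call" and len(words) > 1:
        name = " ".join(words[1:])
        # Only dial names that are actually in the contact list.
        if name in contacts:
            return ("call", contacts[name])
        return ("unknown_contact", name)
    if words == ["switch", "task"]:
        return ("switch_task", None)
    # Anything else is treated as noise rather than guessed at.
    return ("ignore", None)


if __name__ == "__main__":
    book = {"alice": "555-0100"}
    print(parse_command("call Alice", book))
    print(parse_command("switch task", book))
```

Ignoring anything that does not match a known command is deliberate here: with a recognizer that is only ‘good enough’, doing nothing is usually safer than guessing.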

The new packages are available from the standard optified repository: gstreamer0.10-pocketsphinx, libpocketsphinx1, libpocketsphinx-dev, libsphinxbase1, libsphinxbase-dev, pocketsphinx-hmm-en-hub4wsj, pocketsphinx-hmm-en-tidigits, pocketsphinx-hmm-zh-tdt, pocketsphinx-lm-en-hub4, pocketsphinx-lm-zh-hans-gigatdt, pocketsphinx-lm-zh-hant-gigatdt, pocketsphinx-utils, python-pocketsphinx, python-pocketsphinx-dbg, python-sphinxbase, python-sphinxbase-dbg, sphinxbase-utils.

Hopefully I’ll get around to wrapping all of this in scheme bindings and a gui sometime soon, and now that I have MCE bindings I should even be able to implement placing calls by voice tags.

8 Responses to “n900 voice recognition”

  1. Evren Esat Özkan says:

    Hello,

    Thank you for porting, packaging and putting it in your repo. I’m a Python programmer and want to learn how to use it in my scripts. I tried voximp.py but got an error on the n900 with PR1.2:

    Traceback (most recent call last):
      File "voximp.py", line 190, in <module>
        app = Voximp()
      File "voximp.py", line 50, in __init__
        self.init_gst()
      File "voximp.py", line 56, in init_gst
        + '! pocketsphinx name=asr ! fakesink')
    glib.GError: no element "alsasrc"

    How can I solve this? Should I replace “alsasrc” with another descriptor?

    Thanks

  2. GA says:

    Evren, try replacing alsasrc with pulsesrc.
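As a rough illustration of that suggestion, the sketch below builds a GStreamer pipeline description with the source element as a parameter. Only the tail (`! pocketsphinx name=asr ! fakesink`) appears in the traceback above; the `audioconvert` and `audioresample` stages in the middle and the function name `build_pipeline` are assumptions, so the real line in voximp.py may look different.

```python
def build_pipeline(source="pulsesrc"):
    """Return a GStreamer pipeline description for speech recognition.

    Hypothetical helper: the middle stages are assumed, not copied
    from voximp.py.
    """
    stages = [
        source,                   # audio input: pulsesrc rather than alsasrc
        "audioconvert",           # assumed: convert to a raw format the decoder accepts
        "audioresample",          # assumed: resample to the rate the model expects
        "pocketsphinx name=asr",  # recognizer element, as named in the traceback
        "fakesink",               # discard the audio once it has been decoded
    ]
    return " ! ".join(stages)


if __name__ == "__main__":
    print(build_pipeline())
```

In the 0.10 Python bindings a description like this would typically be handed to `gst.parse_launch`, which is presumably what `init_gst` does, so changing the first element of the string should be the only edit needed.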

  3. required says:

    Works nicely on my n900, congratulations.
    How can I add an acoustic model for other languages (I’m looking for French)?

  4. OK says:

    Hi, I am kind of new to this. How exactly do I download this onto my N900?

  5. Reader says:

    Where do I get xdotool?
    I have downloaded the voximp script and installed gstreamer0.10-pocketsphinx, pocketsphinx-hmm-en-hub4wsj and python-pocketsphinx.
    voximp seems to work, but it periodically complains that xdotool and xmms2 are missing.
    Of course, it does nothing, because there is no xdotool to emulate text input.

  6. Andrei says:

    I think I might have compiled it from source, I can’t remember anymore. You can try searching the maemo forums, it’s a decently popular tool, I’m sure someone has a package for it.

  7. Mark says:

    Hi Andrei,
    Are we able to use this set of apps to make calls by simply saying a contact name, or by recording a voice tag and attaching it to a contact?
    Appreciate the work you have done on this as there is no other voice dialing app available for the N900.
    Thanks.

  8. Andrei says:

    Hi Mark,

    I made voice calls once upon a time by saying the contact’s name. You could do it by a tag as well if you wanted to. I just wrote a trivial little script to do it, nothing fancy, more of a proof of concept but it was quite useful in the end.

    But then Nokia came out with PR1.2 and screwed up all of the shared libraries for Qt. Then they announced that they won’t be putting much work into maemo (or meego) anymore. So I haven’t bothered updating the eggs to PR1.2. All of the information and scripts needed to regenerate them are out there, so you can do it if you really want to.
