__init__() got an unexpected keyword argument 'hmm' and path issues with pocketsphinx #712

Open
BruceJohnJennerLawso opened this issue Mar 25, 2019 · 16 comments

Comments

@BruceJohnJennerLawso

I am trying to get Jasper working with pocketsphinx and espeak on an Orange Pi board running Ubuntu 16.04. For all intents and purposes the board behaves like an ordinary Ubuntu install, except that it uses ARM packages.

Because I'm on Ubuntu, I'm able to skip the whole build process for pocketsphinx and install pocketsphinx and python-pocketsphinx through apt. I then installed the rest of the dependencies as needed, but I'm still having issues getting Jasper to work.

The first issue seems to be with parsing profile.yml. When I try to manually specify options for CMUSphinx hmm_dir and fst_model as described at

http://jasperproject.github.io/documentation/configuration/

I get

ScannerError: mapping values are not allowed here

This seems to happen whenever I indent a line in profile.yml. If I don't indent the lines for hmm_dir and fst_model, they seem to just be ignored.

I tried to continue with those options left blank, but the hmm_dir that the apt pocketsphinx package installs seems to have changed. Jasper seems to think it's located at

/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k

but as far as I can tell it's actually located at

/usr/share/pocketsphinx/model/en-us/en-us

Jasper got past that issue when I manually edited the hmm path in client/stt.py, but now crashes somewhere in stt.py with

  File "./jasper.py", line 146, in <module>
    app = Jasper()
  File "./jasper.py", line 109, in __init__
    stt_passive_engine_class.get_passive_instance(),
  File "/home/john/dev/jasper/client/stt.py", line 48, in get_passive_instance
    return cls.get_instance('keyword', phrases)
  File "/home/john/dev/jasper/client/stt.py", line 42, in get_instance
    instance = cls(**config)
  File "/home/john/dev/jasper/client/stt.py", line 129, in __init__
    **vocabulary.decoder_kwargs)
TypeError: __init__() got an unexpected keyword argument 'hmm'

@G10DRAS

G10DRAS commented Mar 29, 2019

Do not use TABs when indenting a line in profile.yml; use two SPACEs instead and see if that solves your issue.
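A quick local check for this, as a minimal sketch: it reports lines whose leading whitespace contains a tab. The function name `find_tab_indents` and the sample text are illustrative only and are not part of Jasper.

```python
import re


def find_tab_indents(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Grab only the run of spaces/tabs at the start of the line.
        leading = re.match(r"[ \t]*", line).group(0)
        if "\t" in leading:
            bad_lines.append(number)
    return bad_lines


# Illustrative profile fragment: the third line is indented with a tab.
sample = (
    "stt_engine: sphinx\n"
    "pocketsphinx:\n"
    "\thmm_dir: '/usr/share/pocketsphinx/model/en-us/en-us'\n"
)
print(find_tab_indents(sample))
```

To run it against a real file, read `~/.jasper/profile.yml` into a string first and pass that in.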

@BruceJohnJennerLawso
Author

BruceJohnJennerLawso commented Mar 31, 2019

I tried exactly two spaces for indenting, but that caused the crash as before. My profile.yml:

carrier: ''
first_name: John
gmail_address: ##################
gmail_password: ##################
last_name: Lawson
phone_number: ''
prefers_email: true
stt_engine: sphinx
hmm_dir: '/usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k' #optional
timezone: America/Toronto

and it still crashes with


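*******************************************************
*             JASPER - THE TALKING COMPUTER           *
* (c) 2015 Shubhro Saha, Charlie Marsh & Jan Holthuis *
*******************************************************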

ERROR:root:Error occured!
Traceback (most recent call last):
  File "jasper.py", line 146, in <module>
    app = Jasper()
  File "jasper.py", line 80, in __init__
    self.config = yaml.safe_load(f)
  File "/usr/local/lib/python2.7/dist-packages/yaml/__init__.py", line 93, in safe_load
    return load(stream, SafeLoader)
  File "/usr/local/lib/python2.7/dist-packages/yaml/__init__.py", line 71, in load
    return loader.get_single_data()
  File "/usr/local/lib/python2.7/dist-packages/yaml/constructor.py", line 37, in get_single_data
    node = self.get_single_node()
  File "/usr/local/lib/python2.7/dist-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/usr/local/lib/python2.7/dist-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/usr/local/lib/python2.7/dist-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/usr/local/lib/python2.7/dist-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
  File "/usr/local/lib/python2.7/dist-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/usr/local/lib/python2.7/dist-packages/yaml/parser.py", line 428, in parse_block_mapping_key
    if self.check_token(KeyToken):
  File "/usr/local/lib/python2.7/dist-packages/yaml/scanner.py", line 116, in check_token
    self.fetch_more_tokens()
  File "/usr/local/lib/python2.7/dist-packages/yaml/scanner.py", line 220, in fetch_more_tokens
    return self.fetch_value()
  File "/usr/local/lib/python2.7/dist-packages/yaml/scanner.py", line 580, in fetch_value
    self.get_mark())
ScannerError: mapping values are not allowed here
  in "/home/john/.jasper/profile.yml", line 9, column 10

@BruceJohnJennerLawso
Author

(I used two spaces to start the hmm_dir line; GitHub's formatting just isn't showing it.)

@G10DRAS

G10DRAS commented Apr 1, 2019

See configuration here

https://jasperproject.github.io/documentation/configuration/#pocketsphinx-stt

And validate your yaml here

http://www.yamllint.com/
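The same validation can be done locally with PyYAML, the library that appears in the traceback. The sketch below reproduces the "mapping values are not allowed here" error with an `hmm_dir` line indented directly under `stt_engine: sphinx`, then parses a version where `hmm_dir` is nested under a `pocketsphinx:` key as in the linked configuration docs. That Jasper reads `hmm_dir` from that nested key is an assumption taken from those docs; `check_profile` and the two sample strings are illustrative names.

```python
import yaml  # PyYAML

# An indented line right after a plain scalar value is read as a
# continuation of that value, so the following colon is rejected.
BROKEN = """\
stt_engine: sphinx
  hmm_dir: '/usr/share/pocketsphinx/model/en-us/en-us'
"""

# Indentation is valid once the line has a parent mapping key.
NESTED = """\
stt_engine: sphinx
pocketsphinx:
  hmm_dir: '/usr/share/pocketsphinx/model/en-us/en-us'
"""


def check_profile(text):
    """Return (profile, None) on success or (None, message) on a YAML error."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        return None, str(exc)


profile, error = check_profile(BROKEN)
print(error)

profile, error = check_profile(NESTED)
# Assumed lookup path, mirroring the nested layout in the docs.
hmm_dir = profile.get("pocketsphinx", {}).get("hmm_dir")
print(hmm_dir)
```

If that assumption holds, a top-level, unindented `hmm_dir:` parses cleanly but is never looked up, which would match the "they seem to just be ignored" observation earlier in the thread.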

@BruceJohnJennerLawso
Author

Oh shoot, OK, I see the issue with the indenting. That's fixed now.

@BruceJohnJennerLawso
Author

I reverted the path I had hardcoded in client/stt.py, but Jasper still crashes:


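*******************************************************
*             JASPER - THE TALKING COMPUTER           *
* (c) 2015 Shubhro Saha, Charlie Marsh & Jan Holthuis *
*******************************************************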

WARNING:root:tts_engine not specified in profile, defaulting to 'espeak-tts'
ERROR:root:Error occured!
Traceback (most recent call last):
  File "./jasper.py", line 146, in <module>
    app = Jasper()
  File "./jasper.py", line 109, in __init__
    stt_passive_engine_class.get_passive_instance(),
  File "/home/john/dev/jasper/client/stt.py", line 48, in get_passive_instance
    return cls.get_instance('keyword', phrases)
  File "/home/john/dev/jasper/client/stt.py", line 42, in get_instance
    instance = cls(**config)
  File "/home/john/dev/jasper/client/stt.py", line 129, in __init__
    **vocabulary.decoder_kwargs)
TypeError: __init__() got an unexpected keyword argument 'hmm'

I'm not clear on what the issue is, but I wonder whether the new hmm path not containing a folder named hmm might be throwing something off. FWIW, when I ls the directory that I'm setting hmm_dir to, the output is

usr@server:/usr/share/pocketsphinx/model/en-us/en-us$ ls
feat.params mdef means noisedict README sendump transition_matrices variances

Which is what should be in the hmm dir afaik
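That listing can be checked against the file list Jasper's own `__init__` looks for (the list is quoted later in this thread). A minimal standalone sketch follows; `missing_hmm_files` is an illustrative helper, and the scratch directory simply recreates the `ls` output above rather than touching the real model directory.

```python
import os
import shutil
import tempfile

# File names taken from the hmm_dir check quoted later in this thread.
REQUIRED_FILES = ('mdef', 'feat.params', 'means', 'noisedict',
                  'transition_matrices', 'variances')


def missing_hmm_files(hmm_dir):
    """List the acoustic-model files that are absent from hmm_dir."""
    missing = [name for name in REQUIRED_FILES
               if not os.path.exists(os.path.join(hmm_dir, name))]
    # Only one of mixture_weights or sendump is needed.
    if not any(os.path.exists(os.path.join(hmm_dir, name))
               for name in ('mixture_weights', 'sendump')):
        missing.append('mixture_weights or sendump')
    return missing


# Recreate the `ls` listing in a scratch directory.
listing = ['feat.params', 'mdef', 'means', 'noisedict', 'README',
           'sendump', 'transition_matrices', 'variances']
scratch = tempfile.mkdtemp()
try:
    for name in listing:
        open(os.path.join(scratch, name), 'w').close()
    complete = missing_hmm_files(scratch)
    # Remove one file to show what a failing check reports.
    os.remove(os.path.join(scratch, 'mdef'))
    incomplete = missing_hmm_files(scratch)
finally:
    shutil.rmtree(scratch)

print(complete)
print(incomplete)
```

With the files from the listing nothing is reported missing, so by this check the directory contents look complete.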

@BruceJohnJennerLawso
Author

I double checked with a fresh clone of jasper and the crash still happens. The issue seems to be between Jasper and possibly the new directory location for hmm_dir

@G10DRAS

G10DRAS commented Apr 7, 2019

I think you missed the pocketsphinx: key. See the config below:

stt_engine: sphinx
pocketsphinx:
  hmm_dir: '/usr/share/pocketsphinx/model/en-us/en-us'

@BruceJohnJennerLawso
Author

I don't think that's the issue. My current profile.yml is

carrier: ''
first_name: John
gmail_address: ########
gmail_password: ########
last_name: Lawson
phone_number: ''
prefers_email: true
stt_engine: sphinx
pocketsphinx:
hmm_dir: '/usr/share/pocketsphinx/model/en-us/en-us' #optional
timezone: America/Toronto

@mecparts

You're likely using pocketsphinx8-5prealpha. There are some API changes, as detailed in this message from the support forum: "ERROR:root:Error occured! init() got an unexpected keywork argument 'hmm'". For reference, here are the changes I have in my stt.py (working with 5prealpha):

--- stt.py.orig	2019-05-11 21:22:41.494476257 -0600
+++ stt.py	2019-05-03 19:32:06.394275693 -0600
@@ -122,8 +122,14 @@
                                  "hmm_dir in your profile.",
                                  hmm_dir, ', '.join(missing_hmm_files))
 
-        self._decoder = ps.Decoder(hmm=hmm_dir, logfn=self._logfile,
-                                   **vocabulary.decoder_kwargs)
+        #self._decoder = ps.Decoder(hmm=hmm_dir, logfn=self._logfile,
+        #                           **vocabulary.decoder_kwargs)
+        psConfig = ps.Decoder.default_config()
+        psConfig.set_string('-hmm', hmm_dir)
+
+        psConfig.set_string('-lm', vocabulary.decoder_kwargs['lm'])
+        psConfig.set_string('-dict', vocabulary.decoder_kwargs['dict'])
+        self._decoder = ps.Decoder(psConfig)
 
     def __del__(self):
         os.remove(self._logfile)
@@ -163,13 +169,19 @@
         self._decoder.process_raw(data, False, True)
         self._decoder.end_utt()
 
-        result = self._decoder.get_hyp()
+        #result = self._decoder.get_hyp()
+        result = self._decoder.hyp()
         with open(self._logfile, 'r+') as f:
-            for line in f:
-                self._logger.debug(line.strip())
+        #    for line in f:
+        #        self._logger.debug(line.strip())
             f.truncate()
 
-        transcribed = [result[0]]
+        #transcribed = [result[0]]
+        if result is None:
+            transcribed = ''
+        else:
+            transcribed = result.hypstr.split()
+
         self._logger.info('Transcribed: %r', transcribed)
         return transcribed
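For code that has to run against either binding, one option is to try the keyword-argument constructor first and fall back to the config-object style from the diff when it raises the `TypeError` reported in this issue. This is a sketch under assumptions, not Jasper's implementation: `build_decoder` is a hypothetical helper, it omits the `logfn` argument, and the `_Fake*` classes are stand-ins so the example runs without pocketsphinx installed.

```python
def build_decoder(ps, hmm_dir, lm_path, dict_path):
    """Create a Decoder using whichever constructor style `ps` accepts."""
    try:
        # Older style: options passed as keyword arguments.
        return ps.Decoder(hmm=hmm_dir, lm=lm_path, dict=dict_path)
    except TypeError:
        # Newer style (as in the diff): options set on a config object.
        config = ps.Decoder.default_config()
        config.set_string('-hmm', hmm_dir)
        config.set_string('-lm', lm_path)
        config.set_string('-dict', dict_path)
        return ps.Decoder(config)


class _FakeConfig(object):
    """Stand-in config object, for demonstration only."""

    def __init__(self):
        self.values = {}

    def set_string(self, key, value):
        self.values[key] = value


class _FakeNewDecoder(object):
    """Stand-in that only accepts a config object, for demonstration only."""

    def __init__(self, config):
        self.config = config

    @staticmethod
    def default_config():
        return _FakeConfig()


class _FakeModule(object):
    """Plays the role of the imported pocketsphinx module."""
    Decoder = _FakeNewDecoder


decoder = build_decoder(_FakeModule,
                        '/usr/share/pocketsphinx/model/en-us/en-us',
                        'languagemodel', 'dictionary')
print(decoder.config.values['-hmm'])
```

Catching `TypeError` this broadly can hide unrelated mistakes in the arguments, so an explicit version check may be preferable where the installed version is known.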

@BruceJohnJennerLawso
Author

Hey mecparts, I gave the changes you described a try but I'm still crashing. I'm having a little trouble following the changes you made: is the last part, which generates the transcribed variable and returns it, supposed to go in the __del__ function?

@BruceJohnJennerLawso
Author

When I ran Jasper with the changes you described, it crashed with

*******************************************************
*             JASPER - THE TALKING COMPUTER           *
* (c) 2015 Shubhro Saha, Charlie Marsh & Jan Holthuis *
*******************************************************

WARNING:root:tts_engine not specified in profile, defaulting to 'espeak-tts'
ERROR:root:Error occured!
Traceback (most recent call last):
  File "jasper.py", line 146, in <module>
    app = Jasper()
  File "jasper.py", line 109, in __init__
    stt_passive_engine_class.get_passive_instance(),
  File "/home/john/dev/jasper/client/stt.py", line 48, in get_passive_instance
    return cls.get_instance('keyword', phrases)
  File "/home/john/dev/jasper/client/stt.py", line 42, in get_instance
    instance = cls(**config)
  File "/home/john/dev/jasper/client/stt.py", line 129, in __init__
    **vocabulary.decoder_kwargs)
TypeError: __init__() got an unexpected keyword argument 'hmm'
Exception AttributeError: "'PocketSphinxSTT' object has no attribute '_decoder'" in <bound method PocketSphinxSTT.__del__ of <client.stt.PocketSphinxSTT object at 0xb5ff2490>> ignored
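A note on the last line of that output: an "Exception AttributeError ... ignored" message is what Python prints when `__del__` runs on an object whose `__init__` raised before every attribute was set. The `TypeError` is still the primary failure, and the traceback still shows the old `**vocabulary.decoder_kwargs` call at line 129. Below is a minimal sketch of a destructor that tolerates a half-built object, using a hypothetical `Engine` class rather than Jasper's actual `PocketSphinxSTT`.

```python
import os
import tempfile


class Engine(object):
    """Hypothetical stand-in for an STT engine, not Jasper's class."""

    def __init__(self, fail=False):
        with tempfile.NamedTemporaryFile(prefix='psdecoder_',
                                         suffix='.log', delete=False) as f:
            self._logfile = f.name
        if fail:
            # Simulates the constructor dying before _decoder exists.
            raise TypeError("simulated constructor failure")
        self._decoder = object()

    def __del__(self):
        # getattr with a default avoids AttributeError on half-built objects.
        logfile = getattr(self, '_logfile', None)
        if logfile and os.path.exists(logfile):
            os.remove(logfile)


try:
    Engine(fail=True)
except TypeError as exc:
    print("constructor failed: %s" % exc)
```

The secondary message goes away with a guard like this, but the first exception still has to be fixed at its source.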

@mecparts

mecparts commented Jul 9, 2019

No, the last changes are in the transcribe function. Look at the line numbers in the @@ lines of the diff.

@mecparts

mecparts commented Jul 9, 2019

Replace the __init__ function in the PocketSphinxSTT class with this code:

    def __init__(self, vocabulary, hmm_dir="/usr/local/share/" +
                 "pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k"):

        """
        Initiates the pocketsphinx instance.

        Arguments:
            vocabulary -- a PocketsphinxVocabulary instance
            hmm_dir -- the path of the Hidden Markov Model (HMM)
        """

        self._logger = logging.getLogger(__name__)

        # quirky bug where first import doesn't work
        try:
            import pocketsphinx as ps
        except:
            import pocketsphinx as ps

        with tempfile.NamedTemporaryFile(prefix='psdecoder_',
                                         suffix='.log', delete=False) as f:
            self._logfile = f.name

        self._logger.debug("Initializing PocketSphinx Decoder with hmm_dir " +
                           "'%s'", hmm_dir)

        # Perform some checks on the hmm_dir so that we can display more
        # meaningful error messages if neccessary
        if not os.path.exists(hmm_dir):
            msg = ("hmm_dir '%s' does not exist! Please make sure that you " +
                   "have set the correct hmm_dir in your profile.") % hmm_dir
            self._logger.error(msg)
            raise RuntimeError(msg)
        # Lets check if all required files are there. Refer to:
        # http://cmusphinx.sourceforge.net/wiki/acousticmodelformat
        # for details
        missing_hmm_files = []
        for fname in ('mdef', 'feat.params', 'means', 'noisedict',
                      'transition_matrices', 'variances'):
            if not os.path.exists(os.path.join(hmm_dir, fname)):
                missing_hmm_files.append(fname)
        mixweights = os.path.exists(os.path.join(hmm_dir, 'mixture_weights'))
        sendump = os.path.exists(os.path.join(hmm_dir, 'sendump'))
        if not mixweights and not sendump:
            # We only need mixture_weights OR sendump
            missing_hmm_files.append('mixture_weights or sendump')
        if missing_hmm_files:
            self._logger.warning("hmm_dir '%s' is missing files: %s. Please " +
                                 "make sure that you have set the correct " +
                                 "hmm_dir in your profile.",
                                 hmm_dir, ', '.join(missing_hmm_files))

        #self._decoder = ps.Decoder(hmm=hmm_dir, logfn=self._logfile,
        #                           **vocabulary.decoder_kwargs)
        psConfig = ps.Decoder.default_config()
        psConfig.set_string('-hmm', hmm_dir)

        psConfig.set_string('-lm', vocabulary.decoder_kwargs['lm'])
        psConfig.set_string('-dict', vocabulary.decoder_kwargs['dict'])
        self._decoder = ps.Decoder(psConfig)

And the transcribe function in the same class with this:

    def transcribe(self, fp):
        """
        Performs STT, transcribing an audio file and returning the result.

        Arguments:
            fp -- a file object containing audio data
        """

        fp.seek(44)

        # FIXME: Can't use the Decoder.decode_raw() here, because
        # pocketsphinx segfaults with tempfile.SpooledTemporaryFile()
        data = fp.read()
        self._decoder.start_utt()
        self._decoder.process_raw(data, False, True)
        self._decoder.end_utt()

        #result = self._decoder.get_hyp()
        result = self._decoder.hyp()
        with open(self._logfile, 'r+') as f:
        #    for line in f:
        #        self._logger.debug(line.strip())
            f.truncate()
 
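        #transcribed = [result[0]]
        if result is None:
            transcribed = []
        else:
            # hyp() returns a Hypothesis object here, so read its hypstr
            # attribute (as in the diff above) instead of indexing it
            transcribed = [result.hypstr]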
        self._logger.info('Transcribed: %r', transcribed)
        return transcribed

and see what that gets you. I can't guarantee it will work error free first time; I've modified the code I'm working with to return multiple hypotheses from PocketSphinx and to work with Mycroft's adapt.intent parser, among other things, so I can't really test it easily anymore. I like the fact that the adapt parser can assign probabilities to each hypothesis from PocketSphinx and that I can use those probabilities in the brain code to pick the best hypothesis to select a module (rather than Jasper's "use the first module that matches" approach).
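The thread shows two result shapes: the older `get_hyp()` call, whose result is indexed with `result[0]`, and the newer `hyp()` call, which returns either `None` or an object with a `hypstr` attribute. A small helper can normalize both to the list that `transcribe` returns. This is a sketch only; `hypothesis_to_list` and `FakeHypothesis` are illustrative names, and the tuple contents after the first element are made up for the demonstration.

```python
def hypothesis_to_list(result):
    """Normalize a decoder result to a list with zero or one strings."""
    if result is None:
        # Nothing was recognized.
        return []
    if hasattr(result, 'hypstr'):
        # Newer style: an object carrying the text in .hypstr.
        return [result.hypstr]
    # Older style: a sequence whose first element is the text.
    return [result[0]]


class FakeHypothesis(object):
    """Stand-in for the object returned by hyp(), for demonstration only."""

    def __init__(self, hypstr):
        self.hypstr = hypstr


print(hypothesis_to_list(None))
print(hypothesis_to_list(FakeHypothesis('JASPER')))
print(hypothesis_to_list(('JASPER', 'utt-id', 0)))
```

Keeping the return type a list in every branch means callers do not have to distinguish an empty string from an empty list.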

@azban

azban commented Nov 20, 2019

This looks to be an issue caused by using an updated pocketsphinx version (from pip) rather than the outdated version that the Jasper docs and code rely on. Is there any plan to make the changes required to support the new version?

@appeacock

The work on Jasper - specifically making it work as-is and refactoring to Python 3 is being conducted at https://github.com/aplawson/jasper-client -- including a tutorial on how to build it and/or deploy it with a custom Raspbian ISO image. //adam
